From 09d4326a8a853f9a5f05acb434d92c06156ec107 Mon Sep 17 00:00:00 2001
From: Peter Amstutz
Date: Fri, 15 Apr 2022 16:54:41 -0400
Subject: [PATCH] 18894: Add section "estimating manifest size"

Arvados-DCO-1.1-Signed-off-by: Peter Amstutz
---
 .../manifest-format.html.textile.liquid | 24 ++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/doc/architecture/manifest-format.html.textile.liquid b/doc/architecture/manifest-format.html.textile.liquid
index 1780768bc3..9efbef788f 100644
--- a/doc/architecture/manifest-format.html.textile.liquid
+++ b/doc/architecture/manifest-format.html.textile.liquid
@@ -60,6 +60,28 @@ A normalized manifest is a manifest that meets the following additional restrict
 * Blocks within a stream are ordered based on order of file tokens of the stream. A given block is listed at most once in a stream.
 * Filename must not contain @"/"@ (the stream name represents the path prefix)

+h3. Estimating manifest size
+
+Here's a formula for estimating manifest size as stored in the database, assuming efficiently packed blocks.
+
+
+manifest_size =
+   + (total data size / 64 MB) * 40
+   + sum(number of files * 20)
+   + sum(size of all directory paths)
+   + sum(size of all file names)
+
+
+Here is the size when including signatures. The signatures authorize access to fetch the block from a Keep server, as described below. The signed manifest is what is actually transferred to/from the API server and stored in RAM by @arv-mount@.
+
+
+manifest_size =
+   + (total data size / 64 MB) * 94
+   + sum(number of files * 20)
+   + sum(size of all directory paths)
+   + sum(size of all file names)
+
+
 h3. Example manifests

 A manifest with four files in two directories:
@@ -122,7 +144,7 @@ table(table table-bordered table-condensed).
 |@d41d8cd98f00b204e9800998ecf8427e+0+z@|Hint does not start with uppercase letter|
 |@d41d8cd98f00b204e9800998ecf8427e+0+Zfoo*bar@|Hint contains invalid character @*@|

-h3. Token signatures
+h3(#token_signatures). Token signatures

 A token signature (sign-hint) provides proof-of-access for a data block. It is computed by taking a SHA1 HMAC of the blob signing token (a shared secret between the API server and keep servers), block digest, current API token, expiration timestamp, and blob signature TTL.

-- 
2.30.2
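
The two formulas added by this patch can be sketched in Python as a quick sanity check. This is a minimal illustration, not part of the patch or of any Arvados API: the function name @estimate_manifest_size@ and its parameter names are hypothetical, "64 MB" is assumed to mean 64 MiB (2**26 bytes), the result is rounded to whole bytes, and the input figures in the usage lines are made up.

```python
# Assumption: "64 MB" in the formula is read as 64 MiB (2**26 bytes).
BLOCK_SIZE_BYTES = 64 * 1024 * 1024

# Per-block and per-file constants taken directly from the formulas in the patch.
UNSIGNED_BYTES_PER_BLOCK = 40
SIGNED_BYTES_PER_BLOCK = 94
BYTES_PER_FILE = 20


def estimate_manifest_size(total_data_bytes, number_of_files,
                           directory_path_bytes, file_name_bytes,
                           signed=False):
    """Estimate manifest size in bytes, assuming efficiently packed blocks.

    Hypothetical helper: the four inputs are the aggregate quantities
    named in the formula (total data size, number of files, summed
    length of directory paths, summed length of file names).
    """
    per_block = SIGNED_BYTES_PER_BLOCK if signed else UNSIGNED_BYTES_PER_BLOCK
    estimate = (
        (total_data_bytes / BLOCK_SIZE_BYTES) * per_block
        + number_of_files * BYTES_PER_FILE
        + directory_path_bytes
        + file_name_bytes
    )
    # Rounding to whole bytes is a presentation choice, not part of the formula.
    return round(estimate)


# Made-up collection: 1 TiB of data in 10,000 files, with 2,000 bytes of
# directory paths and 150,000 bytes of file names in total.
unsigned_size = estimate_manifest_size(2**40, 10_000, 2_000, 150_000)
signed_size = estimate_manifest_size(2**40, 10_000, 2_000, 150_000, signed=True)
print(unsigned_size, signed_size)
```

Under these assumptions the only difference between the two estimates is the per-block term (40 versus 94 bytes), so the gap between the stored and signed sizes grows with the amount of data rather than with the number of files.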