- install/install-shell-server.html.textile.liquid
- install/create-standard-objects.html.textile.liquid
- install/install-keepstore.html.textile.liquid
+ - install/configure-fs-storage.html.textile.liquid
- install/configure-s3-object-storage.html.textile.liquid
- install/configure-azure-blob-storage.html.textile.liquid
- install/install-keepproxy.html.textile.liquid
SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}
-Storage classes (sometimes also called "storage tiers") allow you to control which volumes should be used to store particular collection data blocks. This can be used to implement data storage policies such as moving data to archival storage.
+Storage classes (alternately known as "storage tiers") allow you to control which volumes should be used to store particular collection data blocks. This can be used to implement data storage policies such as moving data to archival storage.
+
+h3. Configuring storage classes
+
+The storage classes for each volume are set in the per-volume "keepstore configuration":{{site.baseurl}}/install/install-keepstore.html
+
+<pre>
+Volumes:
+ - ... Volume configuration ...
+ #
+ # If no storage classes are specified, will use [default]
+ #
+ StorageClasses: null
+
+ - ... Volume configuration ...
+ #
+ # Specify this volume is in the "archival" storage class.
+ #
+ StorageClasses: [archival]
+
+</pre>
+
+Names of storage classes are internal to the cluster and decided by the administrator. Aside from "default", Arvados currently does not define any standard storage class names.
+
+h3. Using storage classes
+
+Desired storage classes are specified by setting the "storage_classes_desired" field of a Collection. For example, at the command line:
+
+<pre>
+$ arv collection update --uuid zzzzz-4zz18-dhhm0ay8k8cqkvg --collection '{"storage_classes_desired": ["archival"]}'
+</pre>
+
+By setting "storage_classes_desired" to "archival", the blocks that make up the collection will be preferentially moved to keepstore volumes which are configured with the "archival" storage class.
+
+You may also specify the desired storage class when using @arv-put@:
+
+<pre>
+$ arv-put --storage-classes=hot myfile.txt
+</pre>
+
+Collection blocks will be in the "default" storage class if not otherwise specified.
+
+A collection may have only specify one desired storage class.
+
+A user with write access to a collection may set any storage class.
+
+h3. Storage management notes
+
+The "keep-balance":{{site.baseurl}}/install/install-keep-balance.html service is responsible for deciding which blocks should be placed on which keepstore volumes. As part of the rebalancing behavior, it will determine where a block should go in order to satisfy the desired storage classes, and issue pull requests to copy the block from its original volume to the desired volume. The block will subsequently be moved to trash on the original volume.
+
+If a block appears in multiple collections with different storage classes, the block will be stored in separate volumes for each storage class, even if that results in overreplication, unless there is a volume which has all the desired storage classes.
+
+If a collection has a desired storage class which is not available in any keepstore volume, the collection's blocks will remain in place, and an error will appear in the @keep-balance@ logs.
+
+This feature does not provide a hard guarantee on where data will be stored. Data may be written to default storage and moved to the desired storage class later. If controlling data locality is a hard requirement (such as legal restrictions on the location of data) we recommend setting up multiple Arvados clusters.
# This is used to inform replication decisions at the Keep layer.
S3Replication: 2
- # Storage classes to associate with this volume. See "Configuring
- # storage classes" in the "Admin" section of doc.arvados.org.
+ # Storage classes to associate with this volume. See
+ # "Storage classes" in the "Admin" section of doc.arvados.org.
StorageClasses: null
# Enable deletion (garbage collection) even when TrashLifetime is
See previous section for documentation of configuration fields.
-</pre>
+<pre>
Volumes:
- # Example configuration using alternate storage provider
# Configuration for Google cloud storage
If keep-balance instructs keepstore to trash a block which is older than @BlobSignatureTTL@, and @EnableDelete@ is true, the block will be moved to "trash". A block which is in the trash is no longer accessible by read requests, but has not yet been permanently deleted. Blocks which are in the trash may be recovered using the "untrash" API endpoint. Blocks are permanently deleted after they have been in the trash for the duration in @TrashLifetime@.
-Keep-balance is also responsible for balancing the distribution of blocks across keepstore servers by asking servers to pull blocks from other servers (as determined by their storage class and "rendezvous hashing order":{{site.baseurl}}/api/storage.html). Pulling a block makes a copy. If a block is overreplicated (i.e. there are excess copies) after pulling, it will be subsequently trashed on the original server.
+Keep-balance is also responsible for balancing the distribution of blocks across keepstore servers by asking servers to pull blocks from other servers (as determined by their "storage class":{{site.baseurl}}/admin/storage-classes.html and "rendezvous hashing order":{{site.baseurl}}/api/storage.html). Pulling a block makes a copy. If a block is overreplicated (i.e. there are excess copies) after pulling, it will be subsequently trashed on the original server.
h3. Configure storage volumes