X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/bd1aa20c5878436505b31aa987473ac3fbb6395c..3b321249456939079404973d40ae7e999872c963:/doc/install/install-keepstore.html.textile.liquid diff --git a/doc/install/install-keepstore.html.textile.liquid b/doc/install/install-keepstore.html.textile.liquid index d1633f31c2..869ca15d9e 100644 --- a/doc/install/install-keepstore.html.textile.liquid +++ b/doc/install/install-keepstore.html.textile.liquid @@ -9,249 +9,101 @@ Copyright (C) The Arvados Authors. All rights reserved. SPDX-License-Identifier: CC-BY-SA-3.0 {% endcomment %} -Keepstore provides access to underlying storage for reading and writing content-addressed blocks, with enforcement of Arvados permissions. Keepstore supports a variety of cloud object storage and POSIX filesystems for its backing store. +# "Introduction":#introduction +# "Update config.yml":#update-config +# "Install keepstore package":#install-packages +# "Restart the API server and controller":#restart-api +# "Confirm working installation":#confirm-working +# "Note on storage management":#note -We recommend installing at least two Keepstore servers. By convention, we use the following hostname pattern: +h2. Introduction -
~$ sudo apt-get install keepstore
-
-~$ sudo yum install keepstore
-
-~$ keepstore --version
-
--# Duration for which new permission signatures (returned in PUT -# responses) will be valid. This should be equal to the API -# server's blob_signature_ttl configuration entry. -BlobSignatureTTL: 336h0m0s - -# Local file containing the secret blob signing key (used to generate -# and verify blob signatures). The contents of the key file must be -# identical to the API server's blob_signing_key configuration entry. -BlobSigningKeyFile: "" - -# Print extra debug logging -Debug: false - -# Maximum number of concurrent block deletion operations (per -# volume) when emptying trash. Default is 1. -EmptyTrashWorkers: 1 - -# Enable trash and delete features. If false, trash lists will be -# accepted but blocks will not be trashed or deleted. -# Keepstore does not delete data on its own. The keep-balance -# service determines which blocks are candidates for deletion -# and instructs the keepstore to move those blocks to the trash. -EnableDelete: true - -# Local port to listen on. Can be 'address:port' or ':port', where -# 'address' is a host IP address or name and 'port' is a port number -# or name. -Listen: :25107 - -# Format of request/response and error logs: "json" or "text". -LogFormat: json -ManagementToken: "" - -# Maximum RAM to use for data buffers, given in multiples of block -# size (64 MiB). When this limit is reached, HTTP requests requiring -# buffers (like GET and PUT) will wait for buffer space to be -# released. -# -# It should be set such that MaxBuffers * 64MiB + 10% fits -# comfortably in memory. On a host dedicated to running keepstore, -# divide total memory by 88MiB to suggest a suitable value. For example, -# if grep MemTotal /proc/meminfo reports MemTotal: 7125440 kB, -# compute 7125440 / (88 * 1024)=79 and configure MaxBuffers: 79 -MaxBuffers: 128 - -# Maximum concurrent requests. When this limit is reached, new -# requests will receive 503 responses. 
Note: this limit does not -# include idle connections from clients using HTTP keepalive, so it -# does not strictly limit the number of concurrent connections. If -# omitted or zero, the default is 2 * MaxBuffers. -MaxRequests: 0 - -# Path to write PID file during startup. This file is kept open and -# locked with LOCK_EX until keepstore exits, so "fuser -k pidfile" is -# one way to shut down. Exit immediately if there is an error -# opening, locking, or writing the PID file. -PIDFile: "" - -# Maximum number of concurrent pull operations. Default is 1, i.e., -# pull lists are processed serially. -PullWorkers: 0 - -# Honor read requests only if a valid signature is provided. This -# should be true, except for development use and when migrating from -# a very old version. -RequireSignatures: true - -# Local file containing the Arvados API token used by keep-balance -# or data manager. Delete, trash, and index requests are honored -# only for this token. -SystemAuthTokenFile: "" - -# Path to server certificate file in X509 format. Enables TLS mode. -# -# Example: /var/lib/acme/live/keep0.example.com/fullchain -TLSCertificateFile: "" - -# Path to server key file in X509 format. Enables TLS mode. -# -# The key pair is read from disk during startup, and whenever SIGHUP -# is received. -# -# Example: /var/lib/acme/live/keep0.example.com/privkey -TLSKeyFile: "" - -# How often to check for (and delete) trashed blocks whose -# TrashLifetime has expired. -TrashCheckInterval: 24h0m0s - -# Time duration after a block is trashed during which it can be -# recovered using an /untrash request. -TrashLifetime: 336h0m0s - -# Maximum number of concurrent trash operations. Default is 1, i.e., -# trash lists are processed serially. -TrashWorkers: 1 -- -h3. Notes on storage management - -On its own, a keepstore server never deletes data. 
The "keep-balance":install-keep-balance.html service service determines which blocks are candidates for deletion and instructs the keepstore to move those blocks to the trash. - -When a block is newly written, it is protected from deletion for the duration in @BlobSignatureTTL@. During this time, it cannot be trashed. - -If keep-balance instructs keepstore to trash a block which is older than @BlobSignatureTTL@, and @EnableDelete@ is true, the block will be moved to "trash". +h2(#update-config). Update cluster config h3. Configure storage volumes -Available storage volume types include cloud object storage and POSIX filesystems. +Fill in the @Volumes@ section of @config.yml@ for each storage volume. Available storage volume types include POSIX filesystems and cloud object storage. It is possible to have different volume types in the same cluster. -If you are using S3-compatible object storage (including Amazon S3, Google Cloud Storage, and Ceph RADOS), follow the setup instructions "S3 Object Storage":configure-s3-object-storage.html page instead and then "Run keepstore as a supervised service.":#keepstoreservice +* To use a POSIX filesystem, including both local filesystems (ext4, xfs) and network filesystems such as GPFS or Lustre, follow the setup instructions on "Filesystem storage":configure-fs-storage.html +* If you are using S3-compatible object storage (including Amazon S3, Google Cloud Storage, and Ceph RADOS), follow the setup instructions on "S3 Object Storage":configure-s3-object-storage.html +* If you are using Azure Blob Storage, follow the setup instructions on "Azure Blob Storage":configure-azure-blob-storage.html -If you are using Azure Blob Storage, follow the setup instructions "Azure Blob Storage":configure-azure-blob-storage.html and then proceed to "Run keepstore as a supervised service.":#keepstoreservice +h3. 
List services -To use a POSIX filesystem, including both local filesystems (ext4, xfs) and network file system such as GPFS or Lustre, continue reading this section. +Add each keepstore server to the @Services.Keepstore@ section of @/etc/arvados/config.yml@ . -h4. Setting up filesystem mounts +
Services:
+ Keepstore:
+ # No ExternalURL because they are only accessed by the internal subnet.
+ InternalURLs:
+ "http://keep0.ClusterID.example.com:25107": {}
+ "http://keep1.ClusterID.example.com:25107": {}
+ # and so forth
+
+-Volumes: -- # The volume type, indicates this is a filesystem directory. - Type: Directory +{% include 'install_packages' %} - # The actual directory that will be used as the backing store. - Root: /mnt/local-disk +{% include 'start_service' %} - # How much replication is performed by the underlying filesystem. - # (for example, a network filesystem may provide its own replication). - # This is used to inform replication decisions at the Keep layer. - DirectoryReplication: 1 +{% include 'restart_api' %} - # If true, do not accept write or trash operations, only reads. - ReadOnly: false +h2(#confirm-working). Confirm working installation - # When true, read and write operations (for whole 64MiB blocks) on - # an individual volume will queued and issued sequentially. When - # false, read and write operations will be issued concurrently as - # they come in. - # - # When using spinning disks where storage partitions map 1:1 to - # physical disks that are dedicated to Keepstore, enabling this may - # reduce contention and improve throughput by minimizing seeks. - # - # When using SSDs, RAID, or a parallel network filesystem, you probably - # don't want this. - Serialize: true +Log into a host that is on your private Arvados network. The host should be able to contact your keepstore servers (e.g. keep[0-9].ClusterID.example.com). - # Storage classes to associate with this volume. See "Configuring - # storage classes" in the "Admin" section of doc.arvados.org. - StorageClasses: null +@ARVADOS_API_HOST@ and @ARVADOS_API_TOKEN@ must be set in the environment. - # Example of a second volume section -- DirectoryReplication: 2 - ReadOnly: false - Root: /mnt/network-disk - Serialize: false - StorageClasses: null - Type: Directory -+@ARVADOS_API_HOST@ should be the hostname of the API server. -h3(#keepstoreservice). Run keepstore as a supervised service +@ARVADOS_API_TOKEN@ should be the system root token. -Install runit to supervise the keepstore daemon. 
{% include 'install_runit' %} +Install the "Command line SDK":{{site.baseurl}}/sdk/cli/install.html -Install this script as the run script @/etc/sv/keepstore/run@ for the keepstore service: +Check that the keepstore server is in the @keep_service@ "accessible" list:
#!/bin/sh
-
-exec 2>&1
-GOGC=10 exec keepstore -config /etc/arvados/keepstore/keepstore.yml
+
+$ arv keep_service accessible
+[...]
~$ prefix=`arv --format=uuid user current | cut -d- -f1`
-~$ echo "Site prefix is '$prefix'"
-~$ read -rd $'\000' keepservice <<EOF; arv keep_service create --keep-service "$keepservice"
-{
- "service_host":"keep0.$prefix.your.domain",
- "service_port":25107,
- "service_ssl_flag":false,
- "service_type":"disk"
-}
-EOF
-
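
Editorial sketch (not part of the patch): the @Services.Keepstore@ listing added by this change follows a regular @keepN.ClusterID.example.com:25107@ pattern, so for clusters with many keepstore hosts the entries could be generated rather than typed by hand. The helper names below (@keepstore_internal_urls@, @render_services_yaml@), the @example.com@ domain default, and the two-space YAML indentation are assumptions made for illustration; only the URL pattern and port come from the example in the diff.

```python
# Hypothetical helpers, not part of Arvados: they only build text that
# mirrors the Services.Keepstore example shown in the diff above.

def keepstore_internal_urls(cluster_id, count, domain="example.com", port=25107):
    """Return an InternalURLs mapping for hosts keep0 .. keep{count-1}."""
    return {
        f"http://keep{i}.{cluster_id}.{domain}:{port}": {}
        for i in range(count)
    }


def render_services_yaml(urls):
    """Render the mapping as a Services.Keepstore YAML fragment (2-space indent assumed)."""
    lines = ["Services:", "  Keepstore:", "    InternalURLs:"]
    # Each URL maps to an empty dict, as in the documented example.
    lines += [f'      "{url}": {{}}' for url in urls]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print a fragment for two keepstore servers in a cluster named "ClusterID".
    print(render_services_yaml(keepstore_internal_urls("ClusterID", 2)))
```

The printed fragment is meant to be reviewed and pasted into @/etc/arvados/config.yml@ by hand; the sketch does not read or modify the cluster configuration itself.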