X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/5bcba288077488791daa43a15d5fd5fb0c6e653c..58fa1d8438cb613c6bf7bece8702146f3eed5205:/doc/install/install-keepstore.html.textile.liquid

diff --git a/doc/install/install-keepstore.html.textile.liquid b/doc/install/install-keepstore.html.textile.liquid
index 7fb810d841..d1633f31c2 100644
--- a/doc/install/install-keepstore.html.textile.liquid
+++ b/doc/install/install-keepstore.html.textile.liquid
@@ -3,10 +3,15 @@ layout: default
 navsection: installguide
 title: Install Keepstore servers
 ...
+{% comment %}
+Copyright (C) The Arvados Authors. All rights reserved.

-This installation guide assumes you are on a 64 bit Debian or Ubuntu system.
+SPDX-License-Identifier: CC-BY-SA-3.0
+{% endcomment %}

-We are going to install two Keepstore servers. By convention, we use the following hostname pattern:
+Keepstore provides access to underlying storage for reading and writing content-addressed blocks, with enforcement of Arvados permissions. Keepstore supports a variety of cloud object storage and POSIX filesystems for its backing store.
+
+We recommend installing at least two Keepstore servers. By convention, we use the following hostname pattern:

 table(table table-bordered table-condensed).
@@ -15,56 +20,222 @@ table(table table-bordered table-condensed).
 |keep1.@uuid_prefix@.your.domain|

-Because the Keepstore servers are not directly accessible from the internet, these hostnames only need to resolve on the local network.
+Keepstore servers should not be directly accessible from the Internet (they are accessed via "keepproxy":install-keepproxy.html), so the hostnames only need to resolve on the private network.

 h2. Install Keepstore

-First add the Arvados apt repository, and then install the Keepstore package.
+On Debian-based systems:

-~$ echo "deb http://apt.arvados.org/ wheezy main" | sudo tee /etc/apt/sources.list.d/apt.arvados.org.list
-~$ sudo /usr/bin/apt-key adv --keyserver pool.sks-keyservers.net --recv 1078ECD7
-~$ sudo /usr/bin/apt-get update
-~$ sudo /usr/bin/apt-get install keepstore
+~$ sudo apt-get install keepstore
+
+
+
+On Red Hat-based systems:
+
+
+~$ sudo yum install keepstore

 Verify that Keepstore is functional:

-~$ keepstore -h
-2014/10/29 14:23:38 Keep started: pid 6848
-Usage of keepstore:
-  -data-manager-token-file="": File with the API token used by the Data Manager. All DELETE requests or GET /index requests must carry this token.
-  -enforce-permissions=false: Enforce permission signatures on requests.
-  -listen=":25107": Interface on which to listen for requests, in the format ipaddr:port. e.g. -listen=10.0.1.24:8000. Use -listen=:port to listen on all network interfaces.
-  -never-delete=false: If set, nothing will be deleted. HTTP 405 will be returned for valid DELETE requests.
-  -permission-key-file="": File containing the secret key for generating and verifying permission signatures.
-  -permission-ttl=1209600: Expiration time (in seconds) for newly generated permission signatures.
-  -pid="": Path to write pid file
-  -serialize=false: If set, all read and write operations on local Keep volumes will be serialized.
-  -volumes="": Comma-separated list of directories to use for Keep volumes, e.g. -volumes=/var/keep1,/var/keep2. If empty or not supplied, Keep will scan mounted filesystems for volumes with a /keep top-level directory.
+~$ keepstore --version
 
-If you want access control on your Keepstore server(s), you should provide a permission key. The @-permission-key-file@ argument should contain the path to a file that contains a single line with a long random alphanumeric string. It should be the same as the @blob_signing_key@ that can be set in the "API server":install-api-server.html config/application.yml file.
+h3. Create config file
+
+By default, keepstore will look for its configuration file at @/etc/arvados/keepstore/keepstore.yml@
+
+You can override the configuration file location using the @-config@ command line option to keepstore.
+
+The following is a sample configuration file:
+
+
+# Duration for which new permission signatures (returned in PUT
+# responses) will be valid.  This should be equal to the API
+# server's blob_signature_ttl configuration entry.
+BlobSignatureTTL: 336h0m0s
+
+# Local file containing the secret blob signing key (used to generate
+# and verify blob signatures).  The contents of the key file must be
+# identical to the API server's blob_signing_key configuration entry.
+BlobSigningKeyFile: ""
+
+# Print extra debug logging
+Debug: false
+
+# Maximum number of concurrent block deletion operations (per
+# volume) when emptying trash. Default is 1.
+EmptyTrashWorkers: 1
+
+# Enable trash and delete features. If false, trash lists will be
+# accepted but blocks will not be trashed or deleted.
+# Keepstore does not delete data on its own.  The keep-balance
+# service determines which blocks are candidates for deletion
+# and instructs the keepstore to move those blocks to the trash.
+EnableDelete: true
+
+# Local port to listen on. Can be 'address:port' or ':port', where
+# 'address' is a host IP address or name and 'port' is a port number
+# or name.
+Listen: :25107
+
+# Format of request/response and error logs: "json" or "text".
+LogFormat: json
+ManagementToken: ""
+
+# Maximum RAM to use for data buffers, given in multiples of block
+# size (64 MiB). When this limit is reached, HTTP requests requiring
+# buffers (like GET and PUT) will wait for buffer space to be
+# released.
+#
+# It should be set such that MaxBuffers * 64MiB + 10% fits
+# comfortably in memory. On a host dedicated to running keepstore,
+# divide total memory by 88MiB to suggest a suitable value. For example,
+# if grep MemTotal /proc/meminfo reports MemTotal: 7125440 kB,
+# compute 7125440 / (88 * 1024)=79 and configure MaxBuffers: 79
+MaxBuffers: 128
+
+# Maximum concurrent requests. When this limit is reached, new
+# requests will receive 503 responses. Note: this limit does not
+# include idle connections from clients using HTTP keepalive, so it
+# does not strictly limit the number of concurrent connections. If
+# omitted or zero, the default is 2 * MaxBuffers.
+MaxRequests: 0
+
+# Path to write PID file during startup. This file is kept open and
+# locked with LOCK_EX until keepstore exits, so "fuser -k pidfile" is
+# one way to shut down. Exit immediately if there is an error
+# opening, locking, or writing the PID file.
+PIDFile: ""
+
+# Maximum number of concurrent pull operations. Default is 1, i.e.,
+# pull lists are processed serially.
+PullWorkers: 0
+
+# Honor read requests only if a valid signature is provided.  This
+# should be true, except for development use and when migrating from
+# a very old version.
+RequireSignatures: true
+
+# Local file containing the Arvados API token used by keep-balance
+# or data manager.  Delete, trash, and index requests are honored
+# only for this token.
+SystemAuthTokenFile: ""
+
+# Path to server certificate file in X509 format. Enables TLS mode.
+#
+# Example: /var/lib/acme/live/keep0.example.com/fullchain
+TLSCertificateFile: ""
+
+# Path to server key file in X509 format. Enables TLS mode.
+#
+# The key pair is read from disk during startup, and whenever SIGHUP
+# is received.
+#
+# Example: /var/lib/acme/live/keep0.example.com/privkey
+TLSKeyFile: ""
+
+# How often to check for (and delete) trashed blocks whose
+# TrashLifetime has expired.
+TrashCheckInterval: 24h0m0s
 
-Prepare one or more volumes for Keepstore to use. Simply create a /keep directory on all the partitions you would like Keepstore to use, and then start Keepstore. For example, using 2 tmpfs volumes:
+# Time duration after a block is trashed during which it can be
+# recovered using an /untrash request.
+TrashLifetime: 336h0m0s
+
+# Maximum number of concurrent trash operations. Default is 1, i.e.,
+# trash lists are processed serially.
+TrashWorkers: 1
+
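The @MaxBuffers@ guidance in the sample configuration can be checked with a short shell calculation. This is a minimal sketch and not part of the change above: @suggest_max_buffers@ is a hypothetical helper name, and the script assumes a Linux host dedicated to keepstore, as the configuration comment does.

```shell
#!/bin/sh
# Sketch only: suggest a MaxBuffers value from total memory given in kB,
# following the "divide total memory by 88MiB" rule in the sample config.
# suggest_max_buffers is a hypothetical helper, not part of keepstore.
suggest_max_buffers() {
    mem_total_kb=$1
    # 88 MiB expressed in kB is 88 * 1024; shell arithmetic truncates.
    echo $(( mem_total_kb / (88 * 1024) ))
}

# On Linux, read MemTotal from /proc/meminfo when it is available.
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    if [ -n "$mem_kb" ]; then
        echo "MaxBuffers: $(suggest_max_buffers "$mem_kb")"
    fi
fi
```

Called with the 7125440 kB figure from the configuration comment, the helper prints 79, which matches the worked example there.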
+
+h3. Notes on storage management
+
+On its own, a keepstore server never deletes data. The "keep-balance":install-keep-balance.html service determines which blocks are candidates for deletion and instructs the keepstore to move those blocks to the trash.
+
+When a block is newly written, it is protected from deletion for the duration given by @BlobSignatureTTL@. During this time, it cannot be trashed.
+
+If keep-balance instructs keepstore to trash a block that is older than @BlobSignatureTTL@, and @EnableDelete@ is true, the block will be moved to "trash".
+
+h3. Configure storage volumes
+
+Available storage volume types include cloud object storage and POSIX filesystems.
+
+If you are using S3-compatible object storage (including Amazon S3, Google Cloud Storage, and Ceph RADOS), follow the setup instructions on the "S3 Object Storage":configure-s3-object-storage.html page instead and then "Run keepstore as a supervised service.":#keepstoreservice
+
+If you are using Azure Blob Storage, follow the setup instructions on the "Azure Blob Storage":configure-azure-blob-storage.html page and then proceed to "Run keepstore as a supervised service.":#keepstoreservice
+
+To use a POSIX filesystem, including both local filesystems (ext4, xfs) and network filesystems such as GPFS or Lustre, continue reading this section.
+
+h4. Setting up filesystem mounts
+
+Volumes are configured in the @Volumes@ section of the configuration
+file. You may provide multiple volumes for a single keepstore process
+to manage multiple disks. Keepstore distributes blocks among volumes
+in round-robin fashion.
+
+
+Volumes:
+- # The volume type. "Directory" indicates a filesystem directory.
+  Type: Directory
+
+  # The actual directory that will be used as the backing store.
+  Root: /mnt/local-disk
+
+  # How much replication is performed by the underlying filesystem.
+  # (for example, a network filesystem may provide its own replication).
+  # This is used to inform replication decisions at the Keep layer.
+  DirectoryReplication: 1
+
+  # If true, do not accept write or trash operations, only reads.
+  ReadOnly: false
+
+  # When true, read and write operations (for whole 64MiB blocks) on
+  # an individual volume will be queued and issued sequentially.  When
+  # false, read and write operations will be issued concurrently as
+  # they come in.
+  #
+  # When using spinning disks where storage partitions map 1:1 to
+  # physical disks that are dedicated to Keepstore, enabling this may
+  # reduce contention and improve throughput by minimizing seeks.
+  #
+  # When using SSDs, RAID, or a parallel network filesystem, you probably
+  # don't want this.
+  Serialize: true
+
+  # Storage classes to associate with this volume.  See "Configuring
+  # storage classes" in the "Admin" section of doc.arvados.org.
+  StorageClasses: null
+
+  # Example of a second volume section
+- DirectoryReplication: 2
+  ReadOnly: false
+  Root: /mnt/network-disk
+  Serialize: false
+  StorageClasses: null
+  Type: Directory
+
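Before starting keepstore, it may help to confirm that each @Root@ directory exists and is writable. The following is a minimal sketch under stated assumptions, not part of the change above: @check_volume_root@ is a hypothetical helper name, the two paths are the @Root@ values from the sample @Volumes@ section, and the check is only meaningful when run as the user that will run keepstore.

```shell
#!/bin/sh
# Sketch only: report whether a directory looks usable as a volume Root.
# check_volume_root is a hypothetical helper, not part of keepstore.
check_volume_root() {
    root=$1
    if [ ! -d "$root" ]; then
        echo "$root: missing"
        return 1
    fi
    if [ ! -w "$root" ]; then
        echo "$root: not writable"
        return 1
    fi
    echo "$root: ok"
}

# The paths below are the Root values from the sample Volumes section.
for dir in /mnt/local-disk /mnt/network-disk; do
    check_volume_root "$dir" || true
done
```

The helper returns a non-zero status for an unusable directory, so it can also gate a deployment script.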
+
+h3(#keepstoreservice). Run keepstore as a supervised service
+
+Install runit to supervise the keepstore daemon. {% include 'install_runit' %}
+
+Install this script as the run script @/etc/sv/keepstore/run@ for the keepstore service:

-~$ keepstore
-2014/10/29 11:41:37 Keep started: pid 20736
-2014/10/29 11:41:37 adding Keep volume: /tmp/tmp.vwSCtUCyeH/keep
-2014/10/29 11:41:37 adding Keep volume: /tmp/tmp.Lsn4w8N3Xv/keep
-2014/10/29 11:41:37 Running without a PermissionSecret. Block locators returned by this server will not be signed, and will be rejected by a server that enforces permissions.
-2014/10/29 11:41:37 To fix this, run Keep with --permission-key-file= to define the location of a file containing the permission key.
+#!/bin/sh
 
+exec 2>&1
+GOGC=10 exec keepstore -config /etc/arvados/keepstore/keepstore.yml
 
-It's recommended to run Keepstore under "runit":https://packages.debian.org/search?keywords=runit or something similar.
+h3. Set up additional servers

-Repeat this section for each Keepstore server you are setting up.
+Repeat the above sections to prepare volumes and bring up supervised services on each Keepstore server you are setting up.

 h3. Tell the API server about the Keepstore servers

@@ -77,13 +248,10 @@ Make sure to update the @service_host@ value to match each of your Keepstore ser
 ~$ echo "Site prefix is '$prefix'"
 ~$ read -rd $'\000' keepservice <<EOF; arv keep_service create --keep-service "$keepservice"
 {
- "service_host":"keep0.$prefix.your.domain",
+ "service_host":"keep0.$prefix.your.domain",
 "service_port":25107,
 "service_ssl_flag":false,
 "service_type":"disk"
 }
 EOF
-
-
-
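Because the registration step is repeated once per Keepstore server, it can be expressed as a loop. This is a minimal sketch, not part of the change above: @keep_service_json@ is a hypothetical helper name, @zzzzz@ is a placeholder prefix, and the loop only prints the @arv keep_service create@ commands (a dry run) instead of contacting the API server. The field values mirror the JSON in the hunk above.

```shell
#!/bin/sh
# Sketch only: build the keep_service JSON for one host.
# keep_service_json is a hypothetical helper; the field values mirror the
# example in the guide (port 25107, no TLS, service type "disk").
keep_service_json() {
    printf '{"service_host":"%s","service_port":25107,"service_ssl_flag":false,"service_type":"disk"}\n' "$1"
}

# Assumed placeholder: replace zzzzz with your own uuid prefix.
prefix=${prefix:-zzzzz}

for n in 0 1; do
    host="keep$n.$prefix.your.domain"
    # Dry run: print the command instead of calling the API server.
    printf "arv keep_service create --keep-service '%s'\n" "$(keep_service_json "$host")"
done
```

Once the printed commands look right, they can be run on a host where the @arv@ client is configured.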