navsection: installguide
title: Multi host Arvados

Copyright (C) The Arvados Authors. All rights reserved.

SPDX-License-Identifier: CC-BY-SA-3.0
# "Introduction":#introduction
# "Prerequisites and planning":#prerequisites
# "Required hosts":#hosts
# "Download the installer":#download
# "Initialize the installer":#copy_config
# "Edit local.params":#localparams
# "Configure Keep storage":#keep
# "Choose the SSL configuration":#certificates
## "Using self-signed certificates":#self-signed
## "Using Let's Encrypt certificates":#lets-encrypt
## "Bring your own certificates":#bring-your-own
# "Create a compute image":#create_a_compute_image
# "Further customization of the installation":#further_customization
# "Begin installation":#installation
# "Confirm the cluster is working":#test-install
## "Debugging issues":#debugging
## "Iterating on config changes":#iterating
## "Common problems and solutions":#common-problems
# "Install the CA root certificate":#ca_root_certificate
# "Initial user and login":#initial_user
# "After the installation":#post_install
h2(#introduction). Introduction

This multi host installer is the recommended way to set up a production Arvados cluster. These instructions include specific details for installing on Amazon Web Services (AWS), which are marked as "AWS specific". However, with additional customization the installer can be used as a template for deployment on other cloud providers or HPC systems.
h2(#prerequisites). Prerequisites and planning

h3. Cluster ID and base domain

Choose a 5-character cluster identifier that will represent the cluster. Here are "guidelines on choosing a cluster identifier":../architecture/federation.html#cluster_id . Only lowercase letters and digits 0-9 are allowed. Examples will use @xarv1@ or @${CLUSTER}@; you should substitute the cluster ID you have selected.

Determine the base domain for the cluster. This will be referred to as @${DOMAIN}@.

For example, if CLUSTER is "xarv1" and DOMAIN is "example.com", then "controller.${CLUSTER}.${DOMAIN}" means "controller.xarv1.example.com".
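
The naming rule above can be checked mechanically. The following is a minimal sketch, not part of the installer: the function names @valid_cluster_id@ and @service_hostname@ are hypothetical, and it only enforces the length and character rules stated in this section.

```shell
#!/bin/sh
# Use the C locale so that [a-z] matches only ASCII lowercase letters.
LC_ALL=C
export LC_ALL

# Hypothetical helper: succeed only if $1 is exactly five characters,
# each a lowercase letter or a digit.
valid_cluster_id() {
    case "$1" in
        [a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: expand a service name into <service>.<CLUSTER>.<DOMAIN>.
service_hostname() {
    printf '%s.%s.%s\n' "$1" "$CLUSTER" "$DOMAIN"
}

CLUSTER=xarv1
DOMAIN=example.com

if valid_cluster_id "$CLUSTER"; then
    service_hostname controller   # prints controller.xarv1.example.com
else
    echo "invalid cluster ID: $CLUSTER" >&2
fi
```

Substitute your own values for @CLUSTER@ and @DOMAIN@ before relying on the output.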
h3. Virtual Private Cloud (AWS specific)

We recommend setting Arvados up in a "Virtual Private Cloud (VPC)":https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html .

When you do so, you need to configure a couple of additional things:

# "Create a subnet for the compute nodes":https://docs.aws.amazon.com/vpc/latest/userguide/configure-subnets.html
# You should set up a "security group which allows SSH access (port 22)":https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html
# Make sure to add a "VPC S3 endpoint":https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html
h3(#keep-bucket). S3 Bucket (AWS specific)

We recommend "creating an S3 bucket":https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html for data storage named @${CLUSTER}-nyw5e-000000000000000-volume@ .

Then create an IAM role called @${CLUSTER}-keepstore-00-iam-role@ which has "permission to read and write the bucket":https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create.html . Here is an example policy:
<pre><code>    "Id": "arvados-keepstore policy",
            "Resource": "arn:aws:s3:::xarv1-nyw5e-000000000000000-volume"
</code></pre>
h2(#hosts). Required hosts

You will need to allocate several hosts (physical or virtual machines) for the fixed infrastructure of the Arvados cluster. These machines should have at least 2 cores and 8 GiB of RAM, running a supported Linux distribution.

{% include 'supportedlinux' %}

Allocate the following hosts as appropriate for your site. On AWS you may choose to do it manually with the AWS console, or using a DevOps tool such as CloudFormation or Terraform.

The installer will set up the Arvados services on your machines. Here is the default assignment of services to machines:
## arvados controller (recommended hostname @controller.${CLUSTER}.${DOMAIN}@)
## arvados websocket (recommended hostname @ws.${CLUSTER}.${DOMAIN}@)
## arvados cloud dispatcher
## arvados keepbalance
# KEEPSTORE nodes (at least 2)
## arvados keepstore (recommended hostnames @keep0.${CLUSTER}.${DOMAIN}@ and @keep1.${CLUSTER}.${DOMAIN}@)
## arvados keepproxy (recommended hostname @keep.${CLUSTER}.${DOMAIN}@)
## arvados keepweb (recommended hostnames @download.${CLUSTER}.${DOMAIN}@ and @*.collections.${CLUSTER}.${DOMAIN}@)
## arvados workbench (recommended hostname @workbench.${CLUSTER}.${DOMAIN}@)
## arvados workbench2 (recommended hostname @workbench2.${CLUSTER}.${DOMAIN}@)
## arvados webshell (recommended hostname @webshell.${CLUSTER}.${DOMAIN}@)
# SHELL node (optional)
## arvados shell (recommended hostname @shell.${CLUSTER}.${DOMAIN}@)
Additional prerequisites when preparing machines to run the installer:

# root or passwordless sudo access
# from the account where you are performing the install, passwordless @ssh@ to each machine (meaning the client's public key is added to @~/.ssh/authorized_keys@ on each node)
# @git@ installed on each machine
# port 443 reachable by clients
# DNS hostnames for each service
## @controller.${CLUSTER}.${DOMAIN}@
## @ws.${CLUSTER}.${DOMAIN}@
## @keep0.${CLUSTER}.${DOMAIN}@
## @keep1.${CLUSTER}.${DOMAIN}@
## @keep.${CLUSTER}.${DOMAIN}@
## @download.${CLUSTER}.${DOMAIN}@
## @*.collections.${CLUSTER}.${DOMAIN}@ -- important: this must be a wildcard DNS record that resolves to the keepweb service
## @workbench.${CLUSTER}.${DOMAIN}@
## @workbench2.${CLUSTER}.${DOMAIN}@
## @webshell.${CLUSTER}.${DOMAIN}@
## @shell.${CLUSTER}.${DOMAIN}@
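
Before running the installer, it can help to confirm that every name in the list above resolves. The following is a rough sketch under a few assumptions: the function names are hypothetical, @getent@ is available (as on glibc-based Linux systems), and the wildcard record is probed with an arbitrary sample name, @test.collections@.

```shell
#!/bin/sh
# Hypothetical helper: print the DNS names this guide expects for
# cluster $1 and base domain $2. The wildcard record is represented by
# the arbitrary sample name "test.collections".
arvados_hostnames() {
    cluster=$1
    domain=$2
    for svc in controller ws keep0 keep1 keep download test.collections \
               workbench workbench2 webshell shell; do
        printf '%s.%s.%s\n' "$svc" "$cluster" "$domain"
    done
}

# Hypothetical helper: report every expected name that does not resolve
# from the machine where this is run.
check_dns() {
    arvados_hostnames "$1" "$2" | while read -r host; do
        if ! getent hosts "$host" >/dev/null 2>&1; then
            echo "unresolved: $host"
        fi
    done
}
```

For example, @check_dns xarv1 example.com@ prints one line per name that fails to resolve and nothing if all of them resolve. Resolution only shows that a record exists; it does not confirm that the record points at the right host.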
(AWS specific) The machine that runs the arvados cloud dispatcher will need an IAM role that allows it to create EC2 instances. "See here for details":{{site.baseurl}}/install/crunch2-cloud/install-dispatch-cloud.html .

If your infrastructure differs from the setup proposed above (i.e., different hostnames, or using an external DB server such as AWS RDS), you can still use the installer, but "additional customization may be necessary":#further_customization .
h2(#download). Download the installer

{% assign local_params_src = 'multiple_hosts' %}
{% assign config_examples_src = 'multi_host/aws'%}
{% include 'download_installer' %}

h2(#localparams). Edit @local.params@

This can be found wherever you choose to initialize the install files (@~/setup-arvados-xarv1@ in these examples).
# Set @CLUSTER@ to the 5-character cluster identifier (e.g. "xarv1")
# Set @DOMAIN@ to the base DNS domain of the environment, e.g. "example.com"
# Edit Internal IP settings. Since services share hosts, some hosts are the same.
# Edit @CLUSTER_INT_CIDR@. This should be the CIDR of the private network that Arvados is running on, e.g. the VPC.
CIDR stands for "Classless Inter-Domain Routing" and describes which portion of the IP address refers to the network. For example, 192.168.3.0/24 means that the first 24 bits are the network (192.168.3) and the last 8 bits are a specific host on that network.
_AWS Specific: Go to the AWS console and into the VPC service. The table view of the VPCs has a column (IPv4 CIDR) that gives the CIDR for each VPC._
# Set @INITIAL_USER_EMAIL@ to your email address, as you will be the first admin user of the system.
# Set each @KEY@ / @TOKEN@ to a random string
Here's an easy way to create five random tokens:
<pre><code>for i in 1 2 3 4 5; do
  tr -dc A-Za-z0-9 </dev/urandom | head -c 32 ; echo ''
done
</code></pre>
# Set @DATABASE_PASSWORD@ to a random string
Important! If this contains any non-alphanumeric characters, in particular ampersand ('&'), it is necessary to add backslash quoting.
For example, if the password is `Cq&WU<A']p?j`
then, with the special characters backslash-quoted, it should appear like this in @local.params@:
<pre><code>DATABASE_PASSWORD="Cq\&WU\<A\'\]p\?j"</code></pre>
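
The backslash quoting described above can be produced mechanically rather than by hand. This is a minimal sketch, assuming an ASCII password and that every non-alphanumeric character should be escaped, as in the example; the function name @escape_for_local_params@ is hypothetical and not part of the installer.

```shell
#!/bin/sh
# Hypothetical helper: prefix every character of $1 that is not an ASCII
# letter or digit with a backslash. In the sed replacement, "\\" is a
# literal backslash and "&" stands for the matched character.
escape_for_local_params() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^A-Za-z0-9]/\\&/g'
}

# Reproduces the quoting of the example password used in this section.
escape_for_local_params "Cq&WU<A']p?j"
```

Paste the printed value between the double quotes of @DATABASE_PASSWORD@ in @local.params@.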
h2(#keep). Configure Keep storage

The @multi_host/aws@ template uses S3 for storage. Arvados also supports "filesystem storage":configure-fs-storage.html and "Azure blob storage":configure-azure-blob-storage.html . Keep storage configuration can be found in the @arvados.cluster.Volumes@ section of @local_config_dir/pillars/arvados.sls@.

h3. Object storage in S3 (AWS Specific)

Open @local_config_dir/pillars/arvados.sls@ and edit as follows:

# In the @arvados.cluster.Volumes@ section, set @Region@ to the appropriate AWS region (e.g. 'us-east-1')
# Set @Bucket@ to the name of the "bucket you created earlier":#keep-bucket
# Set @IAMRole@ to the "keepstore role you created earlier":#keep-bucket

{% include 'ssl_config_multi' %}
h2(#create_a_compute_image). Create a compute image

{% include 'branchname' %}

On cloud installations, containers are dispatched in Docker daemons running in the _compute instances_, which need some additional setup.

*Start by following "the instructions to build a cloud compute node image":{{site.baseurl}}/install/crunch2-cloud/install-compute-node.html using the "compute image builder script":https://github.com/arvados/arvados/tree/{{ branchname }}/tools/compute-images* .

Once you have created that image, open @local_config_dir/pillars/arvados.sls@ and edit as follows (AWS specific settings are described here; configuration for Azure is similar):

# In the @arvados.cluster.Containers.CloudVMs@ section:
## Set @ImageID@ to the AMI produced by Packer
## Set @Region@ to the appropriate AWS region
## Set @AdminUsername@ to the admin user account on the image
## Set the @SecurityGroupIDs@ list to the VPC security group which you set up to allow SSH connections to these nodes
## Set @SubnetID@ to the value of SubnetId of your VPC
# Update @arvados.cluster.Containers.DispatchPrivateKey@ and paste the contents of the @~/.ssh/id_dispatcher@ file you generated in an earlier step.
# Update @arvados.cluster.InstanceTypes@ as necessary. If m5/c5 node types are not available, replace them with m4/c4. You'll need to double check the values for Price and IncludedScratch/AddedScratch for each type that is changed.
h2(#further_customization). Further customization of the installation (optional)

If you are installing on AWS and following the naming conventions recommended in this guide, then likely no further configuration is necessary and you can begin installation.

A couple of common customizations are described here. Other changes may require editing the Saltstack pillars and states files found in @local_config_dir@. In particular, @local_config_dir/pillars/arvados.sls@ has the template used to produce the Arvados configuration file that is distributed to all the nodes.

Any extra salt _state_ files you add under @local_config_dir/states@ will be added to the salt run and applied to the hosts.

h3(#authentication). Using a different authentication provider

By default, the installer will use the "Test" provider, which is a list of usernames and cleartext passwords stored in the Arvados config file. *This is a low-security configuration and you are strongly advised to configure one of the other "supported authentication methods":setup-login.html* .
h3(#ext-database). Using an external database (optional)

Arvados requires a database that is compatible with PostgreSQL 9.5 or later.

For example, Arvados is known to work with Amazon Aurora (note: even idle, Arvados constantly accesses the database, so we strongly advise using "provisioned" mode).
# In @local.params@, remove 'database' from the list of roles assigned to the controller node:
<pre><code>[controller.${CLUSTER}.${DOMAIN}]=api,controller,websocket,dispatcher,keepbalance</code></pre>
# In @local.params@, set @DATABASE_INT_IP@ to the database endpoint (can be a hostname, does not have to be an IP address).
<pre><code>DATABASE_INT_IP=...
</code></pre>
# In @local.params@, set @DATABASE_PASSWORD@ to the correct value. "See the previous section describing correct quoting":#localparams .
# In @local_config_dir/pillars/arvados.sls@ you may need to adjust the database name and user. This can be found in the section @arvados.cluster.database@.
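
Once @DATABASE_INT_IP@ is set, a quick reachability check from the controller node can catch a wrong endpoint before deployment. The sketch below assumes that the PostgreSQL client tools, which provide @pg_isready@, are installed on the machine where it is run; the function name @db_reachable@, the default port 5432, and the 5-second timeout are illustrative choices, not installer settings.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a PostgreSQL server accepts connections
# at host $1, port $2 (default 5432). Assumes pg_isready is installed.
# pg_isready does not validate credentials, so this only checks that the
# endpoint is reachable and the server is accepting connections.
db_reachable() {
    pg_isready -h "$1" -p "${2:-5432}" -t 5 >/dev/null 2>&1
}
```

For example: @db_reachable "$DATABASE_INT_IP" && echo "database endpoint is accepting connections"@. A failure here may also mean that a security group or firewall is blocking the port.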
h2(#installation). Begin installation

At this point, you are ready to run the installer script in deploy mode, which will perform the entire Arvados installation.
Run this in the directory where you initialized the installer (@~/setup-arvados-xarv1@ in these examples):

<pre><code>./installer.sh deploy</code></pre>
This will deploy all the nodes. It will take a while and produce a lot of logging. If it runs into an error, it will stop.

{% include 'install_ca_cert' %}
h2(#test-install). Confirm the cluster is working

When everything has finished, you can run the diagnostics.

Depending on where you are running the installer, you need to provide @-internal-client@ or @-external-client@.

If you are running the diagnostics from one of the Arvados machines inside the VPC, you want @-internal-client@ .

You are an "external client" if you are running the diagnostics from your workstation outside of the VPC.

<pre><code>./installer.sh diagnostics (-internal-client|-external-client)</code></pre>
h3(#debugging). Debugging issues

Most service logs go to @/var/log/syslog@.

The logs for the Rails API server and for Workbench can be found in

@/var/www/arvados-api/current/log/production.log@

@/var/www/arvados-workbench/current/log/production.log@

on the appropriate instances.
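
When scanning these logs, filtering for recent error lines is often a quicker first pass than reading them in full. A minimal sketch follows; the helper name @recent_errors@ and the default of 20 lines are arbitrary choices, and matching on the word "error" is only a heuristic.

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines (default 20) of log file $1
# that mention "error", ignoring case.
recent_errors() {
    log_file=$1
    count=${2:-20}
    grep -i 'error' "$log_file" | tail -n "$count"
}
```

For example, @recent_errors /var/log/syslog@ or @recent_errors /var/www/arvados-api/current/log/production.log 50@, run on the relevant node. Reading these files may require root access.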
Workbench 2 is a client-side JavaScript application. If you are having trouble loading Workbench 2, check the browser's developer console (this can be found in "Tools → Developer Tools").

h3(#iterating). Iterating on config changes

You can iterate on the config and maintain the cluster by making changes to @local.params@ and @local_config_dir@ and running @installer.sh deploy@ again.

If you are debugging a configuration issue on a specific node, you can speed up the cycle a bit by deploying just one node:

<pre><code>./installer.sh deploy keep0.xarv1.example.com</code></pre>

However, once you have a final configuration, you should run a full deploy to ensure that the configuration has been synchronized on all the nodes.
h3(#common-problems). Common problems and solutions

h4. PG::UndefinedTable: ERROR: relation \"api_clients\" does not exist

The arvados-api-server package sets up the database as a post-install script. If the database host or password wasn't set correctly (or quoted correctly) at the time that package is installed, it won't be able to set up the database.

This will manifest as an error like this:

<pre><code>#<ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR: relation \"api_clients\" does not exist</code></pre>

If this happens, you need to:
1. Correct the database information
2. Run @./installer.sh deploy controller.xarv1.example.com@ to update the configuration on the API/controller node
3. On the API/controller server node, run this command to re-run the post-install script, which will set up the database:

<pre><code>dpkg-reconfigure arvados-api-server</code></pre>

4. Run @./installer.sh deploy@ again to synchronize everything, so that the install steps that need to contact the API server run successfully.
h4. Missing ENA support (AWS Specific)

If the AMI wasn't built with ENA (Elastic Network Adapter, used for enhanced networking) support and the instance type requires it, the instance will fail to start. You'll see an error in syslog on the node that runs @arvados-dispatch-cloud@. The solution is to build a new AMI with @--aws-ena-support true@.
h2(#initial_user). Initial user and login

At this point you should be able to log into the Arvados cluster. The initial URL will be:

<pre><code>https://workbench.${CLUSTER}.${DOMAIN}</code></pre>

If you did not "configure a different authentication provider":#authentication you will be using the "Test" provider, and the provision script creates an initial user for testing purposes. This user is configured as administrator of the newly created cluster. It uses the values of @INITIAL_USER@ and @INITIAL_USER_PASSWORD@ from the @local.params@ file.

If you did configure a different authentication provider, the first user to log in will automatically be given Arvados admin privileges.
h2(#post_install). After the installation

As part of its operation, @installer.sh@ automatically creates a @git@ repository with your configuration templates. You should retain this repository, but be aware that it contains sensitive information (passwords and tokens used by the Arvados services).

As described in "Iterating on config changes":#iterating you may use @installer.sh deploy@ to re-run Salt to deploy configuration changes and upgrades. However, be aware that the configuration templates created for you by @installer.sh@ are a snapshot that is not automatically kept up to date.

When deploying upgrades, consult the "Arvados upgrade notes":{{site.baseurl}}/admin/upgrading.html to see if changes need to be made to the configuration file template in @local_config_dir/pillars/arvados.sls@.

See also "Maintenance and upgrading":{{site.baseurl}}/admin/maintenance-and-upgrading.html for more information.