X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/ed4d8462e763eb1d8c8f1548912495563cd9288f..ae92d144610446849eb568247a44f02ae985c281:/doc/install/install-manual-prerequisites.html.textile.liquid

diff --git a/doc/install/install-manual-prerequisites.html.textile.liquid b/doc/install/install-manual-prerequisites.html.textile.liquid
index 73b54c462e..21b3871e01 100644
--- a/doc/install/install-manual-prerequisites.html.textile.liquid
+++ b/doc/install/install-manual-prerequisites.html.textile.liquid
@@ -27,8 +27,9 @@ The Arvados storage subsystem is called "keep". The compute subsystem is called
 h2(#supportedlinux). Supported GNU/Linux distributions
 table(table table-bordered table-condensed).
-|_. Distribution|_. State|_. Last supported version|
+|_. Distribution|_. State|_. Last supported Arvados version|
 |CentOS 7|Supported|Latest|
+|Debian 11 ("bullseye")|Supported|Latest|
 |Debian 10 ("buster")|Supported|Latest|
 |Ubuntu 20.04 ("focal")|Supported|Latest|
 |Ubuntu 18.04 ("bionic")|Supported|Latest|
@@ -48,8 +49,8 @@ Arvados consists of many components, some of which may be omitted (at the cost o
 table(table table-bordered table-condensed).
 |\3=. *Core*|
-|"Postgres database":install-postgresql.html |Stores data for the API server.|Required.|
-|"API server":install-api-server.html |Core Arvados logic for managing users, groups, collections, containers, and enforcing permissions.|Required.|
+|"PostgreSQL database":install-postgresql.html |Stores data for the API server.|Required.|
+|"API server + Controller":install-api-server.html |Core Arvados logic for managing users, groups, collections, containers, and enforcing permissions.|Required.|
 |\3=. *Keep (storage)*|
 |"Keepstore":install-keepstore.html |Stores content-addressed blocks in a variety of backends (local filesystem, cloud object storage).|Required.|
 |"Keepproxy":install-keepproxy.html |Gateway service to access keep servers from external networks.|Required to be able to use arv-put, arv-get, or arv-mount outside the private Arvados network.|
@@ -57,14 +58,15 @@ table(table table-bordered table-condensed).
 |"Keep-balance":install-keep-balance.html |Storage cluster maintenance daemon responsible for moving blocks to their optimal server location, adjusting block replication levels, and trashing unreferenced blocks.|Required to free deleted data from underlying storage, and to ensure proper replication and block distribution (including support for storage classes).|
 |\3=. *User interface*|
 |"Workbench":install-workbench-app.html, "Workbench2":install-workbench2-app.html |Primary graphical user interface for working with file collections and running containers.|Optional. Depends on API server, keep-web, websockets server.|
-|"Workflow Composer":install-composer.html |Graphical user interface for editing Common Workflow Language workflows.|Optional. Depends on git server (arv-git-httpd).|
+|"Workflow Composer":install-composer.html |Graphical user interface for editing Common Workflow Language workflows.|Optional. Depends on git server (arvados-git-httpd).|
 |\3=. *Additional services*|
 |"Websockets server":install-ws.html |Event distribution server.|Required to view streaming container logs in Workbench.|
 |"Shell server":install-shell-server.html |Synchronize (create/delete/configure) Unix shell accounts with Arvados users.|Optional.|
 |"Git server":install-arv-git-httpd.html |Arvados-hosted git repositories, with Arvados-token based authentication.|Optional, but required by Workflow Composer.|
 |\3=. *Crunch (running containers)*|
 |"arvados-dispatch-cloud":crunch2-cloud/install-dispatch-cloud.html |Allocate and free cloud VM instances on demand based on workload.|Optional, not needed for a static Slurm cluster such as on-premises HPC.|
-|"crunch-dispatch-slurm":crunch2-slurm/install-dispatch.html |Run analysis workflows using Docker containers distributed across a Slurm cluster.|Optional, not needed for a Cloud installation, or if you wish to use Arvados for data management only.|
+|"crunch-dispatch-slurm":crunch2-slurm/install-dispatch.html |Run analysis workflows using Docker or Singularity containers distributed across a Slurm cluster.|Optional, not needed for a Cloud installation, or if you wish to use Arvados for data management only.|
+|"crunch-dispatch-lsf":crunch2-lsf/install-dispatch.html |Run analysis workflows using Docker or Singularity containers distributed across an LSF cluster.|Optional, not needed for a Cloud installation, or if you wish to use Arvados for data management only.|
 h2(#identity). Identity provider
@@ -75,6 +77,10 @@ Choose which backend you will use to authenticate users.
 * LDAP login to authenticate users by username/password using the LDAP protocol, supported by many services such as OpenLDAP and Active Directory.
 * PAM login to authenticate users by username/password according to the PAM configuration on the controller node.
+h2(#postgresql). PostgreSQL
+
+Arvados works well with a standalone PostgreSQL installation. When deploying on AWS, Aurora RDS also works but Aurora Serverless is not recommended.
+
 h2(#storage). Storage backend
 Choose which backend you will use for storing and retrieving content-addressed Keep blocks.
@@ -92,7 +98,8 @@ h2(#scheduler). Container compute scheduler
 Choose which backend you will use to schedule computation.
 * On AWS EC2 and Azure, you probably want to use @arvados-dispatch-cloud@ to manage the full lifecycle of cloud compute nodes: starting up nodes sized to the container request, executing containers on those nodes, and shutting nodes down when no longer needed.
-* For on-premise HPC clusters using "slurm":https://slurm.schedmd.com/ use @crunch-dispatch-slurm@ to execute containers with slurm job submissions.
+* For on-premises HPC clusters using "slurm":https://slurm.schedmd.com/ use @crunch-dispatch-slurm@ to execute containers with slurm job submissions.
+* For on-premises HPC clusters using "LSF":https://www.ibm.com/products/hpc-workload-management/ use @crunch-dispatch-lsf@ to execute containers with slurm job submissions.
 * For single node demos, use @crunch-dispatch-local@ to execute containers directly.
 h2(#machines). Hardware (or virtual machines)
@@ -104,7 +111,7 @@ For a production installation, this is a reasonable starting point:
 table(table table-bordered table-condensed).
 |_. Function|_. Number of nodes|_. Recommended specs|
-|Postgres database, Arvados API server, Arvados controller, Git, Websockets, Container dispatcher|1|16+ GiB RAM, 4+ cores, fast disk for database|
+|PostgreSQL database, Arvados API server, Arvados controller, Git, Websockets, Container dispatcher|1|16+ GiB RAM, 4+ cores, fast disk for database|
 |Workbench, Keepproxy, Keep-web, Keep-balance|1|8 GiB RAM, 2+ cores|
 |Keepstore servers ^1^|2+|4 GiB RAM|
 |Compute worker nodes ^1^|0+ |Depends on workload; scaled dynamically in the cloud|
@@ -112,7 +119,7 @@ table(table table-bordered table-condensed).
 ^1^ Should be scaled up as needed
-^2^ Refers to shell nodes managed by Arvados, that provide ssh access for users to interact with Arvados at the command line. Optional.
+^2^ Refers to shell nodes managed by Arvados that provide ssh access for users to interact with Arvados at the command line. Optional.
 {% include 'notebox_begin' %}
 For a small demo installation, it is possible to run all the Arvados services on a single node. Special considerations for single-node installs will be noted in boxes like this.
@@ -137,26 +144,74 @@ You may also use a different method to pick the cluster identifier. The cluster
 h2(#dnstls). DNS entries and TLS certificates
-The following services are normally public-facing and require DNS entries and corresponding TLS certificates. Get certificates from your preferred TLS certificate provider. We recommend using "Let's Encrypt":https://letsencrypt.org/. You can run several services on same node, but each distinct hostname requires its own TLS certificate.
+The following services are normally public-facing and require DNS entries and corresponding TLS certificates. Get certificates from your preferred TLS certificate provider. We recommend using "Let's Encrypt":https://letsencrypt.org/. You can run several services on the same node, but each distinct DNS name requires a valid, matching TLS certificate.
-This guide uses the following hostname conventions. A later part of this guide will describe how to set up Nginx virtual hosts.
+This guide uses the following DNS name conventions. A later part of this guide will describe how to set up Nginx virtual hosts.
+It is possible to use custom DNS names for the Arvados services.
 table(table table-bordered table-condensed).
-|_. Function|_. Hostname|
+|_. Function|_. DNS name|
 |Arvados API|@ClusterID.example.com@|
 |Arvados Git server|git.@ClusterID.example.com@|
+|Arvados Webshell|webshell.@ClusterID.example.com@|
 |Arvados Websockets endpoint|ws.@ClusterID.example.com@|
 |Arvados Workbench|workbench.@ClusterID.example.com@|
 |Arvados Workbench 2|workbench2.@ClusterID.example.com@|
 |Arvados Keepproxy server|keep.@ClusterID.example.com@|
 |Arvados Keep-web server|download.@ClusterID.example.com@ _and_
-*.collections.@ClusterID.example.com@ or
-*--collections.@ClusterID.example.com@ or
+*.collections.@ClusterID.example.com@ _or_
+*--collections.@ClusterID.example.com@ _or_
 collections.@ClusterID.example.com@ (see the "keep-web install docs":install-keep-web.html)|
+Setting up Arvados is easiest when Wildcard TLS and wildcard DNS are available. It is also possible to set up Arvados without wildcard TLS and DNS, but not having a wildcard for @keep-web@ (i.e. not having *.collections.@ClusterID.example.com@) comes with a tradeoff: it will disable some features that allow users to view Arvados-hosted data in their browsers. More information on this tradeoff caused by the CORS rules applied by modern browsers is available in the "keep-web URL pattern guide":../api/keep-web-urls.html.
+
+The table below lists the required TLS certificates and DNS names in each scenario.
+
+table(table table-bordered table-condensed).
+||_. Wildcard TLS and DNS available|_. Wildcard TLS available|_. Other|
+|TLS|*.@ClusterID.example.com@
+@ClusterID.example.com@
+*.collections.@ClusterID.example.com@|*.@ClusterID.example.com@
+@ClusterID.example.com@|@ClusterID.example.com@
+git.@ClusterID.example.com@
+webshell.@ClusterID.example.com@
+ws.@ClusterID.example.com@
+workbench.@ClusterID.example.com@
+workbench2.@ClusterID.example.com@
+keep.@ClusterID.example.com@
+download.@ClusterID.example.com@
+collections.@ClusterID.example.com@|
+|DNS|@ClusterID.example.com@
+git.@ClusterID.example.com@
+webshell.@ClusterID.example.com@
+ws.@ClusterID.example.com@
+workbench.@ClusterID.example.com@
+workbench2.@ClusterID.example.com@
+keep.@ClusterID.example.com@
+download.@ClusterID.example.com@
+*.collections.@ClusterID.example.com@|@ClusterID.example.com@
+git.@ClusterID.example.com@
+webshell.@ClusterID.example.com@
+ws.@ClusterID.example.com@
+workbench.@ClusterID.example.com@
+workbench2.@ClusterID.example.com@
+keep.@ClusterID.example.com@
+download.@ClusterID.example.com@
+collections.@ClusterID.example.com@|@ClusterID.example.com@
+git.@ClusterID.example.com@
+webshell.@ClusterID.example.com@
+ws.@ClusterID.example.com@
+workbench.@ClusterID.example.com@
+workbench2.@ClusterID.example.com@
+keep.@ClusterID.example.com@
+download.@ClusterID.example.com@
+collections.@ClusterID.example.com@|
+
+
 {% include 'notebox_begin' %}
 It is also possible to create your own certificate authority, issue server certificates, and install a custom root certificate in the browser. This is out of scope for this guide.
 {% include 'notebox_end' %}