- Manual installation:
- install/install-manual-prerequisites.html.textile.liquid
- install/packages.html.textile.liquid
- - admin/upgrading.html.textile.liquid
- Configuration:
- install/config.html.textile.liquid
- - admin/config-migration.html.textile.liquid
- - admin/config.html.textile.liquid
- admin/config-urls.html.textile.liquid
+ - admin/config.html.textile.liquid
+ - Maintenance and upgrading:
+ - admin/upgrading.html.textile.liquid
+ - admin/maintenance-and-upgrading.html.textile.liquid
- Core:
- install/install-api-server.html.textile.liquid
- Keep:
+++ /dev/null
----
-layout: default
-navsection: installguide
-title: Migrating Configuration from v1.4 to v2.0
-...
-
-{% comment %}
-Copyright (C) The Arvados Authors. All rights reserved.
-
-SPDX-License-Identifier: CC-BY-SA-3.0
-{% endcomment %}
-
-{% include 'notebox_begin_warning' %}
-_New installations of Arvados 2.0+ can skip this section_
-{% include 'notebox_end' %}
-
-Arvados 2.0 migrates to a centralized configuration file for all components. The centralized Arvados configuration is @/etc/arvados/config.yml@. Components that support the new centralized configuration are listed below. During the migration period, legacy configuration files are still loaded and take precedence over the centralized configuration file.
-
-h2. API server
-
-The legacy API server configuration is stored in @config/application.yml@ and @config/database.yml@. After migration to @/etc/arvados/config.yml@, both of these files should be moved out of the way and/or deleted.
-
-Change to the API server directory and use the following commands:
-
-<pre>
-$ RAILS_ENV=production bin/rake config:migrate > config.yml
-$ cp config.yml /etc/arvados/config.yml
-</pre>
-
-This will print the contents of @config.yml@ after merging the legacy @application.yml@ and @database.yml@ into the existing systemwide @config.yml@. It may be redirected to a file and copied to @/etc/arvados/config.yml@ (it is safe to copy over, all configuration items from the existing @/etc/arvados/config.yml@ will be included in the migrated output).
-
-If you wish to update @config.yml@ configuration by hand, or check that everything has been migrated, use @config:diff@ to print configuration items that differ between @application.yml@ and the system @config.yml@.
-
-<pre>
-$ RAILS_ENV=production bin/rake config:diff
-</pre>
-
-This command will also report if no migrations are required.
-
-h2. Workbench
-
-The legacy workbench configuration is stored in @config/application.yml@. After migration to @/etc/arvados/config.yml@, this file should be moved out of the way and/or deleted.
-
-Change to the workbench server directory and use the following commands:
-
-<pre>
-$ RAILS_ENV=production bin/rake config:migrate > config.yml
-$ cp config.yml /etc/arvados/config.yml
-</pre>
-
-This will print the contents of @config.yml@ after merging the legacy @application.yml@ into the existing systemwide @config.yml@. It may be redirected to a file and copied to @/etc/arvados/config.yml@ (it is safe to copy over, all configuration items from the existing @/etc/arvados/config.yml@ will be included in the migrated output).
-
-If you wish to update @config.yml@ configuration by hand, or check that everything has been migrated, use @config:diff@ to print configuration items that differ between @application.yml@ and the system @config.yml@.
-
-<pre>
-$ RAILS_ENV=production bin/rake config:diff
-</pre>
-
-This command will also report if no migrations are required.
-
-h2. keepstore, keep-web, crunch-dispatch-slurm, arvados-ws, keepproxy, arv-git-httpd, keep-balance
-
-The legacy config for each component (loaded from @/etc/arvados/component/component.yml@ or a different location specified via the -legacy-component-config command line argument) takes precedence over the centralized config. After you migrate everything from the legacy config to the centralized config, you should delete @/etc/arvados/component/component.yml@ and/or stop using the corresponding -legacy-component-config argument.
-
-To migrate a component configuration, do this on each node that runs an Arvados service:
-
-# Ensure that the latest @config.yml@ is installed on the current node
-# Install @arvados-server@ using @apt-get@ or @yum@.
-# Run @arvados-server config-check@, review and apply the recommended changes to @/etc/arvados/config.yml@
-# After applying changes, re-run @arvados-server config-check@ again to check for additional warnings and recommendations.
-# When you are satisfied, delete the legacy config file, restart the service, and check its startup logs.
-# Copy the updated @config.yml@ file to your next node, and repeat the process there.
-# When you have a @config.yml@ file that includes all volumes on all keepstores, it is important to add a 'Rendezvous' parameter to the InternalURLs entries to make sure the old volume identifiers line up with the new config. If you don't do this, @keep-balance@ will want to shuffle all the existing data around to match the new volume order. The 'Rendezvous' value should be the last 15 characters of the keepstore's UUID in the old configuration. Here's an example:
-
-<notextile>
-<pre><code>Clusters:
- xxxxx:
- Services:
- Keepstore:
- InternalURLs:
- "http://keep1.xxxxx.arvadosapi.com:25107": {Rendezvous: "eim6eefaibesh3i"}
- "http://keep2.xxxxx.arvadosapi.com:25107": {Rendezvous: "yequoodalai7ahg"}
- "http://keep3.xxxxx.arvadosapi.com:25107": {Rendezvous: "eipheho6re1shou"}
- "http://keep4.xxxxx.arvadosapi.com:25107": {Rendezvous: "ahk7chahthae3oo"}
-</code></pre>
-</notextile>
-
-In this example, the keepstore with the name `keep1` had the uuid `xxxxx-bi6l4-eim6eefaibesh3i` in the old configuration.
-
-After migrating and removing all legacy config files, make sure the @/etc/arvados/config.yml@ file is identical across all system nodes -- API server, keepstore, etc. -- and restart all services to make sure they are using the latest configuration.
-
-h2. Cloud installations only: node manager
-
-Node manager is deprecated and replaced by @arvados-dispatch-cloud@. No automated config migration is available. Follow the instructions to "install the cloud dispatcher":../install/crunch2-cloud/install-dispatch-cloud.html
-
-*Only one dispatch process should be running at a time.* If you are migrating a system that currently runs Node manager and @crunch-dispatch-slurm@, it is safest to remove the @crunch-dispatch-slurm@ service entirely before installing @arvados-dispatch-cloud@.
-
-<notextile>
-<pre><code>~$ <span class="userinput">sudo systemctl --now disable crunch-dispatch-slurm</span>
-~$ <span class="userinput">sudo apt-get remove crunch-dispatch-slurm</span>
-</code></pre>
-</notextile>
-
-h2. arvados-controller, arvados-dispatch-cloud
-
-Already uses centralized config exclusively. No migration needed.
The Arvados configuration is stored at @/etc/arvados/config.yml@
-See "Migrating Configuration":config-migration.html for information about migrating from legacy component-specific configuration files.
-
{% codeblock as yaml %}
{% include 'config_default_yml' %}
{% endcodeblock %}
--- /dev/null
+---
+layout: default
+navsection: installguide
+title: Maintenance and upgrading
+...
+{% comment %}
+Copyright (C) The Arvados Authors. All rights reserved.
+
+SPDX-License-Identifier: CC-BY-SA-3.0
+{% endcomment %}
+
+# "Commercial support":#commercial_support
+# "Maintaining Arvados":#maintaining
+## "Modification of the config.yml file":#configuration
+## "Distributing the configuration file":#distribution
+## "Restart the services affected by the change":#restart
+# "Upgrading Arvados":#upgrading
+
+h2(#commercial_support). Commercial support
+
+Arvados is "100% open source software":{{site.baseurl}}/user/copying/copying.html. Anyone can download, install, maintain and upgrade it. However, if this is not something you want to spend your time and energy doing, "Curii Corporation":https://curii.com provides managed Arvados installations as well as commercial support for Arvados. Please contact "info@curii.com":mailto:info@curii.com for more information.
+
+If you'd prefer to do things yourself, a few starting points for maintaining and upgrading Arvados can be found below.
+
+h2(#maintaining). Maintaining Arvados
+
+After Arvados is installed, periodic configuration changes may be required to adapt the software to your needs. Arvados uses a unified configuration file, which is normally found at @/etc/arvados/config.yml@.
+
+Making a configuration change to Arvados typically involves three steps:
+
+* modification of the @config.yml@ file
+* distribution of the modified file to the machines in the cluster
+* restarting of the services affected by the change
+
+h3(#configchange). Modification of the @config.yml@ file
+
+Consult the "configuration reference":{{site.baseurl}}/admin/config.html or another part of the documentation to identify the change to be made.
+
+Preserve a copy of your existing configuration file as a backup, and make the desired modification.
+
+Run @arvados-server config-check@ to make sure the configuration file has no errors and no warnings.
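+
+As a minimal sketch, assuming @arvados-server@ is installed on the node where you edited the file (run as a user that can read @/etc/arvados/config.yml@, e.g. via @sudo@), the check is:
+
+<notextile>
+<pre><code>~$ <span class="userinput">arvados-server config-check</span>
+</code></pre>
+</notextile>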
+
+h3(#distribution). Distributing the configuration file
+
+We recommend keeping the @config.yml@ file in sync across all Arvados system nodes, to avoid issues caused by services running with different versions of the configuration.
+
+The configuration file can be distributed in many ways, e.g. with @scp@, configuration management software, etc.
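+
+For example, a minimal @scp@-based approach (the hostname below is a placeholder for one of your cluster nodes) might look like:
+
+<notextile>
+<pre><code>~$ <span class="userinput">scp /etc/arvados/config.yml root@keep1.ClusterID.example.com:/etc/arvados/config.yml</span>
+</code></pre>
+</notextile>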
+
+h3(#restart). Restart the services affected by the change
+
+If you know which Arvados services use the configuration values that were modified, restart those services. When in doubt, restart all Arvados system services.
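+
+As an illustration, assuming systemd-based packages (the unit names shown are examples; which units exist depends on the services running on that node):
+
+<notextile>
+<pre><code>~$ <span class="userinput">sudo systemctl restart arvados-controller</span>
+~$ <span class="userinput">sudo systemctl restart keepstore</span>
+</code></pre>
+</notextile>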
+
+h2(#upgrading). Upgrading Arvados
+
+Upgrading Arvados typically involves the following steps:
+
+# consult the "upgrade notes":{{site.baseurl}}/admin/upgrading.html and the "release notes":https://arvados.org/releases/ for the release you want to upgrade to
+# Wait for the cluster to be idle and stop Arvados services.
+# Make a backup of your database, as a precaution.
+# update the configuration file for the new release, if necessary (see "Maintaining Arvados":#maintaining above)
+# rebuild and deploy the "compute node image":{{site.baseurl}}/install/crunch2-cloud/install-compute-node.html (cloud only)
+# Install new packages using @apt-get upgrade@ or @yum upgrade@.
+# Wait for package installation scripts as they perform any necessary data migrations.
+# Verify that the Arvados services were restarted as part of the package upgrades.
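+
+As an illustration of the backup step above, a straightforward approach on the database host is a @pg_dump@ of the Arvados database. The database name and role below are common defaults and may differ in your installation:
+
+<notextile>
+<pre><code>~$ <span class="userinput">sudo -u postgres pg_dump arvados_production > arvados_production.$(date -I).sql</span>
+</code></pre>
+</notextile>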
---
layout: default
navsection: installguide
-title: "Upgrading Arvados and Release notes"
+title: "Arvados upgrade notes"
...
{% comment %}
SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}
-For Arvados administrators, this page will cover what you need to know and do in order to ensure a smooth upgrade of your Arvados installation. For general release notes covering features added and bugs fixed, see "Arvados releases":https://arvados.org/releases .
+For Arvados administrators, this page will cover what you need to know and do in order to ensure a smooth upgrade of your Arvados installation. For general release notes covering features added and bugs fixed, see "Arvados releases":https://arvados.org/releases.
-h2. General process
-
-# Consult upgrade notes below to see if any manual configuration updates are necessary.
-# Wait for the cluster to be idle and stop Arvados services.
-# Make a backup of your database, as a precaution.
-# Install new packages using @apt-get upgrade@ or @yum upgrade@.
-# Wait for package installation scripts as they perform any necessary data migrations.
-# Restart Arvados services.
+Upgrade instructions can be found at "Maintenance and upgrading":{{site.baseurl}}/admin/maintenance-and-upgrading.html#upgrading.
h2. Upgrade notes
-Some versions introduce changes that require special attention when upgrading: e.g., there is a new service to install, or there is a change to the default configuration that you might need to override in order to preserve the old behavior.
+Some versions introduce changes that require special attention when upgrading: e.g., there is a new service to install, or there is a change to the default configuration that you might need to override in order to preserve the old behavior. These notes are listed below, organized by release version. Scroll down to the version number you are upgrading to.
{% comment %}
Note to developers: Add new items at the top. Include the date, issue number, commit, and considerations/instructions for those about to upgrade.
h2(#main). development main (as of 2022-02-10)
-"previous: Upgrading from 2.3.0":#v2_3_0
+"previous: Upgrading to 2.3.0":#v2_3_0
h3. Anonymous token changes
-The anonymous token configured in @Users.AnonymousUserToken@ must now be 32 characters or longer. This was already the suggestion in the documentation, now it is enforced. The @script/get_anonymous_user_token.rb@ script that was needed to register the anonymous user token in the database has been removed. Registration of the anonymous token is no longer necessary.
+The anonymous token configured in @Users.AnonymousUserToken@ must now be 32 characters or longer. This was already recommended in the documentation; it is now enforced. The @script/get_anonymous_user_token.rb@ script that was needed to register the anonymous user token in the database has been removed. Registration of the anonymous token is no longer necessary. If the anonymous token in @config.yml@ is specified as a full V2 token, a warning is now logged; the value should be updated to contain just the secret (i.e. the part after the last forward slash).
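+
+As a sketch, a fresh secret of acceptable length can be generated with standard shell tools, for example:
+
+<notextile>
+<pre><code>~$ <span class="userinput">tr -dc A-Za-z0-9 < /dev/urandom | head -c 50; echo</span>
+</code></pre>
+</notextile>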
h3. Preemptible instance types are used automatically, if any are configured
h3. Migrating to centralized config.yml
-See "Migrating Configuration":config-migration.html for notes on migrating legacy per-component configuration files to the new centralized @/etc/arvados/config.yml@.
+See "Migrating Configuration":https://doc.arvados.org/v2.1/admin/config-migration.html for notes on migrating legacy per-component configuration files to the new centralized @/etc/arvados/config.yml@.
To ensure a smooth transition, the per-component config files continue to be read, and take precedence over the centralized configuration. Your cluster should continue to function after upgrade but before doing the full configuration migration. However, several services (keepstore, keep-web, keepproxy) require a minimal `/etc/arvados/config.yml` to start:
You can no longer specify individual keep services to balance via the @config.KeepServiceList@ command line option or @KeepServiceList@ legacy config option. Instead, keep-balance will operate on all keepstore servers with @service_type:disk@ as reported by the @arv keep_service list@ command. If you are still using the legacy config, @KeepServiceList@ should be removed or keep-balance will produce an error.
-Please see the "config migration guide":{{site.baseurl}}/admin/config-migration.html and "keep-balance install guide":{{site.baseurl}}/install/install-keep-balance.html for more details.
+Please see the "config migration guide":https://doc.arvados.org/v2.1/admin/config-migration.html and "keep-balance install guide":{{site.baseurl}}/install/install-keep-balance.html for more details.
h3. Arv-git-httpd configuration migration
-(feature "#14712":https://dev.arvados.org/issues/14712 ) The arv-git-httpd package can now be configured using the centralized configuration file at @/etc/arvados/config.yml@. Configuration via individual command line arguments is no longer available. Please see "arv-git-httpd's config migration guide":{{site.baseurl}}/admin/config-migration.html#arv-git-httpd for more details.
+(feature "#14712":https://dev.arvados.org/issues/14712 ) The arv-git-httpd package can now be configured using the centralized configuration file at @/etc/arvados/config.yml@. Configuration via individual command line arguments is no longer available. Please see "arv-git-httpd's config migration guide":https://doc.arvados.org/v2.1/admin/config-migration.html#arv-git-httpd for more details.
h3. Keepstore and keep-web configuration migration
keep-web now supports the legacy @keep-web.yml@ config format (used by Arvados 1.4) and the new cluster config file format. Please check "keep-web's install guide":{{site.baseurl}}/install/install-keep-web.html for more details.
-keepstore now supports the legacy @keepstore.yml@ config format (used by Arvados 1.4) and the new cluster config file format. Please check the "keepstore config migration notes":{{site.baseurl}}/admin/config-migration.html#keepstore and "keepstore install guide":{{site.baseurl}}/install/install-keepstore.html for more details.
+keepstore now supports the legacy @keepstore.yml@ config format (used by Arvados 1.4) and the new cluster config file format. Please check the "keepstore config migration notes":https://doc.arvados.org/v2.1/admin/config-migration.html#keepstore and "keepstore install guide":{{site.baseurl}}/install/install-keepstore.html for more details.
h3. Keepproxy configuration migration
-(feature "#14715":https://dev.arvados.org/issues/14715 ) Keepproxy can now be configured using the centralized config at @/etc/arvados/config.yml@. Configuration via individual command line arguments is no longer available and the @DisableGet@, @DisablePut@, and @PIDFile@ configuration options are no longer supported. If you are still using the legacy config and @DisableGet@ or @DisablePut@ are set to true or @PIDFile@ has a value, keepproxy will produce an error and fail to start. Please see "keepproxy's config migration guide":{{site.baseurl}}/admin/config-migration.html#keepproxy for more details.
+(feature "#14715":https://dev.arvados.org/issues/14715 ) Keepproxy can now be configured using the centralized config at @/etc/arvados/config.yml@. Configuration via individual command line arguments is no longer available and the @DisableGet@, @DisablePut@, and @PIDFile@ configuration options are no longer supported. If you are still using the legacy config and @DisableGet@ or @DisablePut@ are set to true or @PIDFile@ has a value, keepproxy will produce an error and fail to start. Please see "keepproxy's config migration guide":https://doc.arvados.org/v2.1/admin/config-migration.html#keepproxy for more details.
h3. Delete "keep_services" records
h3. New configuration
-Arvados is migrating to a centralized configuration file for all components. During the migration, legacy configuration files will continue to be loaded. See "Migrating Configuration":config-migration.html for details.
+Arvados is migrating to a centralized configuration file for all components. During the migration, legacy configuration files will continue to be loaded. See "Migrating Configuration":https://doc.arvados.org/v2.1/admin/config-migration.html for details.
h2(#v1_3_3). v1.3.3 (2019-05-14)
# "Create an SSH keypair":#sshkeypair
# "Compute image requirements":#requirements
# "The build script":#building
+# "DNS resolution":#dns-resolution
+# "NVIDIA GPU support":#nvidia
# "Singularity mksquashfs configuration":#singularity_mksquashfs_configuration
# "Build an AWS image":#aws
+## "Autoscaling compute node scratch space":#aws-ebs-autoscaler
# "Build an Azure image":#azure
h2(#introduction). Introduction
</code></pre>
</notextile>
-{% assign show_docker_warning = true %}
-
-{% include 'singularity_mksquashfs_configuration' %}
-
-The desired amount of memory to make available for @mksquashfs@ can be configured in an argument to "the build script":#building. It defaults to @256M@.
-
h2(#requirements). Compute image requirements
Arvados comes with a build script to automate the creation of a suitable compute node image (see "The build script":#building below). It is provided as a convenience. It is also possible to create a compute node image via other means. These are the requirements:
VPC id for AWS, otherwise packer will pick the default one
--aws-subnet-id
Subnet id for AWS otherwise packer will pick the default one for the VPC
+ --aws-ebs-autoscale (default: false)
+ Install the AWS EBS autoscaler daemon.
--gcp-project-id (default: false, required if building for GCP)
GCP project id
--gcp-account-file (default: false, required if building for GCP)
Output debug information
</code></pre></notextile>
-h2(#building). NVIDIA GPU support
+h2(#dns-resolution). DNS resolution
+
+Compute nodes must be able to resolve the hostnames of the API server and any keepstore servers to your internal IP addresses. You can do this by running an internal DNS resolver. The IP address of the resolver should be passed as the value for the @--resolver@ argument to "the build script":#building.
+
+Alternatively, the services could be hardcoded into an @/etc/hosts@ file. For example:
+
+<notextile><pre><code>10.20.30.40 <span class="userinput">ClusterID.example.com</span>
+10.20.30.41 <span class="userinput">keep1.ClusterID.example.com</span>
+10.20.30.42 <span class="userinput">keep2.ClusterID.example.com</span>
+</code></pre></notextile>
+
+Adding these lines to the @/etc/hosts@ file in the compute node image could be done with a small change to the Packer template and the @scripts/base.sh@ script, which will be left as an exercise for the reader.
+
+h2(#nvidia). NVIDIA GPU support
If you plan on using instance types with NVIDIA GPUs, add @--nvidia-gpu-support@ to the build command line. Arvados uses the same compute image for both GPU and non-GPU instance types. The GPU tooling is ignored when using the image with a non-GPU instance type.
+{% assign show_docker_warning = true %}
+
+{% include 'singularity_mksquashfs_configuration' %}
+
+The desired amount of memory to make available for @mksquashfs@ can be configured in an argument to "the build script":#building. It defaults to @256M@.
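+
+As a sketch, the options discussed above can be combined on a single build command line. The resolver address and memory value are placeholders, and other required options (such as the cluster id and public key file) are omitted here:
+
+<notextile><pre><code>~$ <span class="userinput">./build.sh --json-file arvados-images-aws.json \
+           --resolver 10.20.30.1 \
+           --nvidia-gpu-support \
+           --mksquashfs-mem 2G
+</span></code></pre></notextile>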
+
h2(#aws). Build an AWS image
<notextile><pre><code>~$ <span class="userinput">./build.sh --json-file arvados-images-aws.json \
@ArvadosDispatchCloudPublicKeyPath@ should be replaced with the path to the ssh *public* key file generated in "Create an SSH keypair":#sshkeypair, above.
-Compute nodes must be able to resolve the hostnames of the API server and any keepstore servers to your internal IP addresses. You can do this by running an internal DNS resolver. The IP address of the resolver should replace the string @ResolverIP@ in the command above.
+h3(#aws-ebs-autoscaler). Autoscaling compute node scratch space
+
+If you want to add the "AWS EBS autoscaler":https://github.com/awslabs/amazon-ebs-autoscale daemon to your images, add the @--aws-ebs-autoscale@ flag to "the build script":#building. Doing so will make the compute image scratch space scale automatically as needed.
+
+The AWS EBS autoscaler daemon will be installed with this configuration:
+
+<notextile><pre><code>{
+ "mountpoint": "/tmp",
+ "filesystem": "lvm.ext4",
+ "lvm": {
+ "volume_group": "autoscale_vg",
+ "logical_volume": "autoscale_lv"
+ },
+ "volume": {
+ "type": "gp3",
+ "iops": 3000,
+ "encrypted": 1
+ },
+ "detection_interval": 2,
+ "limits": {
+ "max_ebs_volume_size": 1500,
+ "max_logical_volume_size": 8000,
+ "max_ebs_volume_count": 16
+ },
+ "logging": {
+ "log_file": "/var/log/ebs-autoscale.log",
+ "log_interval": 300
+ }
+}
+</code></pre></notextile>
-Alternatively, the services could be hardcoded into an @/etc/hosts@ file. For example:
+Changing the configuration is left as an exercise for the reader.
+
+Using this feature also requires a few Arvados configuration changes in @config.yml@:
+
+* The @Containers/InstanceTypes@ list should be modified so that all @AddedScratch@ lines are removed, and the @IncludedScratch@ value should be set to a (fictional) high number. This way, the scratch space requirements will be met by all the defined instance types. For example:
+
+<notextile><pre><code> InstanceTypes:
+ c5large:
+ ProviderType: c5.large
+ VCPUs: 2
+ RAM: 4GiB
+ IncludedScratch: 16TB
+ Price: 0.085
+ m5large:
+ ProviderType: m5.large
+ VCPUs: 2
+ RAM: 8GiB
+ IncludedScratch: 16TB
+ Price: 0.096
+...
+</code></pre></notextile>
-<notextile><pre><code>10.20.30.40 <span class="userinput">ClusterID.example.com</span>
-10.20.30.41 <span class="userinput">keep1.ClusterID.example.com</span>
-10.20.30.42 <span class="userinput">keep2.ClusterID.example.com</span>
+* You will also need to create an IAM role in AWS with these permissions:
+
+<notextile><pre><code>{
+ "Version": "2012-10-17",
+ "Statement": [
+ {
+ "Effect": "Allow",
+ "Action": [
+ "ec2:AttachVolume",
+ "ec2:DescribeVolumeStatus",
+ "ec2:DescribeVolumes",
+ "ec2:DescribeTags",
+ "ec2:ModifyInstanceAttribute",
+ "ec2:DescribeVolumeAttribute",
+ "ec2:CreateVolume",
+ "ec2:DeleteVolume",
+ "ec2:CreateTags"
+ ],
+ "Resource": "*"
+ }
+ ]
+}
</code></pre></notextile>
-Adding these lines to the @/etc/hosts@ file in the compute node image could be done with a small change to the Packer template and the @scripts/base.sh@ script, which will be left as an exercise for the reader.
+Then, in @config.yml@ set @Containers/CloudVMs/DriverParameters/IAMInstanceProfile@ to the name of the IAM role. This will make @arvados-dispatch-cloud@ pass an IAMInstanceProfile to the compute nodes as they start up, giving them sufficient permissions to attach and grow EBS volumes.
h2(#azure). Build an Azure image
</code></pre></notextile>
@ArvadosDispatchCloudPublicKeyPath@ should be replaced with the path to the ssh *public* key file generated in "Create an SSH keypair":#sshkeypair, above.
-
-Compute nodes must be able to resolve the hostnames of the API server and any keepstore servers to your internal IP addresses. You can do this by running an internal DNS resolver. The IP address of the resolver should replace the string @ResolverIP@ in the command above.
-
-Alternatively, the services could be hardcoded into an @/etc/hosts@ file. For example:
-
-<notextile><pre><code>10.20.30.40 <span class="userinput">ClusterID.example.com</span>
-10.20.30.41 <span class="userinput">keep1.ClusterID.example.com</span>
-10.20.30.42 <span class="userinput">keep2.ClusterID.example.com</span>
-</code></pre></notextile>
-
-Adding these lines to the @/etc/hosts@ file in the compute node image could be done with a small change to the Packer template and the @scripts/base.sh@ script, which will be left as an exercise for the reader.
</code></pre>
</notextile>
+Example policy for the IAM role used by the cloud dispatcher:
+
+<notextile>
+<pre>
+{
+ "Version": "2012-10-17",
+ "Id": "arvados-dispatch-cloud policy",
+ "Statement": [
+ {
+ "Effect": "Allow",
+ "Action": [
+ "iam:PassRole",
+ "ec2:DescribeKeyPairs",
+ "ec2:ImportKeyPair",
+ "ec2:RunInstances",
+ "ec2:DescribeInstances",
+ "ec2:CreateTags",
+ "ec2:TerminateInstances"
+ ],
+ "Resource": "*"
+ }
+ ]
+}
+</pre>
+</notextile>
+
h4. Minimal configuration example for Azure
Using managed disks:
# "Run the provision.sh script":#run_provision_script
# "Initial user and login":#initial_user
# "Test the installed cluster running a simple workflow":#test_install
-
-
+# "After the installation":#post_install
h2(#introduction). Introduction
INFO Final process status is success
</code></pre>
</notextile>
+
+h2(#post_install). After the installation
+
+Once the installation is complete, it is recommended to keep a copy of your local configuration files. Committing them to version control is a good idea.
+
+Re-running the Salt-based installer is not recommended for maintaining and upgrading Arvados. Please see "Maintenance and upgrading":{{site.baseurl}}/admin/maintenance-and-upgrading.html for more information.
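+
+For example, a minimal way to put the installer configuration under version control (the file name here is a placeholder for whatever you customized) might be:
+
+<notextile>
+<pre><code>~$ <span class="userinput">git init arvados-deploy-config</span>
+~$ <span class="userinput">cp local.params arvados-deploy-config/</span>
+~$ <span class="userinput">cd arvados-deploy-config && git add . && git commit -m 'Initial Arvados install configuration'</span>
+</code></pre>
+</notextile>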
## "DNS configuration (single host / multiple hostnames)":#single_host_multiple_hostnames_dns_configuration
# "Initial user and login":#initial_user
# "Test the installed cluster running a simple workflow":#test_install
+# "After the installation":#post_install
h2(#single_host). Single host install using the provision.sh script
INFO Final process status is success
</code></pre>
</notextile>
+
+h2(#post_install). After the installation
+
+Once the installation is complete, it is recommended to keep a copy of your local configuration files. Committing them to version control is a good idea.
+
+Re-running the Salt-based installer is not recommended for maintaining and upgrading Arvados. Please see "Maintenance and upgrading":{{site.baseurl}}/admin/maintenance-and-upgrading.html for more information.
|==--submit-runner-ram== SUBMIT_RUNNER_RAM|RAM (in MiB) required for the workflow runner (default 1024)|
|==--submit-runner-image== SUBMIT_RUNNER_IMAGE|Docker image for workflow runner|
|==--always-submit-runner==|When invoked with --submit --wait, always submit a runner to manage the workflow, even when only running a single CommandLineTool|
+|==--match-submitter-images==|Where Arvados has more than one Docker image of the same name, use the image from the Docker instance on the submitting node.|
|==--submit-request-uuid== UUID|Update and commit to supplied container request instead of creating a new one.|
|==--submit-runner-cluster== CLUSTER_ID|Submit workflow runner to a remote cluster|
|==--name NAME==|Name to use for workflow execution instance.|
Workflows submitted with @arvados-cwl-runner@ will take advantage of Arvados job reuse. If you submit a workflow which is identical to one that has run before, it will short cut the execution and return the result of the previous run. This also applies to individual workflow steps. For example, a two step workflow where the first step has run before will reuse results for first step and only execute the new second step. You can disable this behavior with @--disable-reuse@.
+h3(#docker). Docker images
+
+Docker images referenced by the workflow must be uploaded to Arvados. This requires @docker@ to be installed and usable by the user running @arvados-cwl-runner@. If the image is not present in the local Docker instance, @arvados-cwl-runner@ will first attempt to pull the image using @docker pull@, then upload it.
+
+If there is already a Docker image in Arvados with the same name, it will use the existing image. In this case, the submitter will not use Docker.
+
+The @--match-submitter-images@ option will check the id of the image in the local Docker instance and compare it to the id of the image already in Arvados with the same name and tag. If they are different, it will choose the image matching the local image id, uploading it first if necessary. This is helpful for development: if you locally rebuild the image with the 'latest' tag, @--match-submitter-images@ will ensure that the newer version is used.
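+
+For example, to submit a workflow and prefer the locally built copy of any image that exists under the same name in Arvados (the workflow and input file names here are placeholders):
+
+<notextile>
+<pre><code>~$ <span class="userinput">arvados-cwl-runner --submit --match-submitter-images my-workflow.cwl my-inputs.yml</span>
+</code></pre>
+</notextile>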
+
h3. Command line options
See "arvados-cwl-runner options":{{site.baseurl}}/user/cwl/cwl-run-options.html
)
type ec2InstanceSetConfig struct {
- AccessKeyID string
- SecretAccessKey string
- Region string
- SecurityGroupIDs arvados.StringSet
- SubnetID string
- AdminUsername string
- EBSVolumeType string
+ AccessKeyID string
+ SecretAccessKey string
+ Region string
+ SecurityGroupIDs arvados.StringSet
+ SubnetID string
+ AdminUsername string
+ EBSVolumeType string
+ IAMInstanceProfile string
}
type ec2Interface interface {
}}
}
+ if instanceSet.ec2config.IAMInstanceProfile != "" {
+ rii.IamInstanceProfile = &ec2.IamInstanceProfileSpecification{
+ Name: aws.String(instanceSet.ec2config.IAMInstanceProfile),
+ }
+ }
+
rsv, err := instanceSet.client.RunInstances(&rii)
err = wrapError(err, &instanceSet.throttleDelayCreate)
if err != nil {
Region: ""
EBSVolumeType: gp2
AdminUsername: debian
+ # (ec2) name of the IAMInstanceProfile for instances started by
+ # the cloud dispatcher. Leave blank when not needed.
+ IAMInstanceProfile: ""
# (azure) Credentials.
SubscriptionID: ""
for _, err = range []error{
ldr.checkClusterID(fmt.Sprintf("Clusters.%s", id), id, false),
ldr.checkClusterID(fmt.Sprintf("Clusters.%s.Login.LoginCluster", id), cc.Login.LoginCluster, true),
- ldr.checkToken(fmt.Sprintf("Clusters.%s.ManagementToken", id), cc.ManagementToken, true),
- ldr.checkToken(fmt.Sprintf("Clusters.%s.SystemRootToken", id), cc.SystemRootToken, true),
- ldr.checkToken(fmt.Sprintf("Clusters.%s.Users.AnonymousUserToken", id), cc.Users.AnonymousUserToken, false),
- ldr.checkToken(fmt.Sprintf("Clusters.%s.Collections.BlobSigningKey", id), cc.Collections.BlobSigningKey, true),
+ ldr.checkToken(fmt.Sprintf("Clusters.%s.ManagementToken", id), cc.ManagementToken, true, false),
+ ldr.checkToken(fmt.Sprintf("Clusters.%s.SystemRootToken", id), cc.SystemRootToken, true, false),
+ ldr.checkToken(fmt.Sprintf("Clusters.%s.Users.AnonymousUserToken", id), cc.Users.AnonymousUserToken, false, true),
+ ldr.checkToken(fmt.Sprintf("Clusters.%s.Collections.BlobSigningKey", id), cc.Collections.BlobSigningKey, true, false),
checkKeyConflict(fmt.Sprintf("Clusters.%s.PostgreSQL.Connection", id), cc.PostgreSQL.Connection),
ldr.checkEnum("Containers.LocalKeepLogsToContainerLog", cc.Containers.LocalKeepLogsToContainerLog, "none", "all", "errors"),
ldr.checkEmptyKeepstores(cc),
return nil, err
}
}
+ if strings.Count(cc.Users.AnonymousUserToken, "/") == 3 {
+ // V2 token, strip it to just a secret
+ tmp := strings.Split(cc.Users.AnonymousUserToken, "/")
+ cc.Users.AnonymousUserToken = tmp[2]
+ }
}
return &cfg, nil
}
var acceptableTokenRe = regexp.MustCompile(`^[a-zA-Z0-9]+$`)
var acceptableTokenLength = 32
-func (ldr *Loader) checkToken(label, token string, mandatory bool) error {
+func (ldr *Loader) checkToken(label, token string, mandatory bool, acceptV2 bool) error {
if len(token) == 0 {
if !mandatory {
// when a token is not mandatory, the acceptable length and content is only checked if its length is non-zero
}
}
} else if !acceptableTokenRe.MatchString(token) {
- return fmt.Errorf("%s: unacceptable characters in token (only a-z, A-Z, 0-9 are acceptable)", label)
+ if !acceptV2 {
+ return fmt.Errorf("%s: unacceptable characters in token (only a-z, A-Z, 0-9 are acceptable)", label)
+ }
+ // Test for a proper V2 token
+ tmp := strings.SplitN(token, "/", 3)
+ if len(tmp) != 3 {
+ return fmt.Errorf("%s: unacceptable characters in token (only a-z, A-Z, 0-9 are acceptable)", label)
+ }
+ if !strings.HasPrefix(token, "v2/") {
+ return fmt.Errorf("%s: unacceptable characters in token (only a-z, A-Z, 0-9 are acceptable)", label)
+ }
+ ldr.Logger.Warnf("%s: token is a full V2 token, should just be a secret (remove everything up to and including the last forward slash)", label)
+ if !acceptableTokenRe.MatchString(tmp[2]) {
+ return fmt.Errorf("%s: unacceptable characters in V2 token secret (only a-z, A-Z, 0-9 are acceptable)", label)
+ }
} else if len(token) < acceptableTokenLength {
if ldr.Logger != nil {
ldr.Logger.Warnf("%s: token is too short (should be at least %d characters)", label, acceptableTokenLength)
"github.com/coreos/go-oidc"
lru "github.com/hashicorp/golang-lru"
"github.com/jmoiron/sqlx"
+ "github.com/lib/pq"
"github.com/sirupsen/logrus"
"golang.org/x/oauth2"
"google.golang.org/api/option"
tokenCacheNegativeTTL = time.Minute * 5
tokenCacheTTL = time.Minute * 10
tokenCacheRaceWindow = time.Minute
+ pqCodeUniqueViolation = pq.ErrorCode("23505")
)
type oidcLoginController struct {
// it's expiring.
exp := time.Now().UTC().Add(tokenCacheTTL + tokenCacheRaceWindow)
- var aca arvados.APIClientAuthorization
if updating {
_, err = tx.ExecContext(ctx, `update api_client_authorizations set expires_at=$1 where api_token=$2`, exp, hmac)
if err != nil {
}
ctxlog.FromContext(ctx).WithField("HMAC", hmac).Debug("(*oidcTokenAuthorizer)registerToken: updated api_client_authorizations row")
} else {
- aca, err = ta.ctrl.Parent.CreateAPIClientAuthorization(ctx, ta.ctrl.Cluster.SystemRootToken, *authinfo)
+ aca, err := ta.ctrl.Parent.CreateAPIClientAuthorization(ctx, ta.ctrl.Cluster.SystemRootToken, *authinfo)
if err != nil {
return err
}
- _, err = tx.ExecContext(ctx, `update api_client_authorizations set api_token=$1, expires_at=$2 where uuid=$3`, hmac, exp, aca.UUID)
+ _, err = tx.ExecContext(ctx, `savepoint upd`)
if err != nil {
+ return err
+ }
+ _, err = tx.ExecContext(ctx, `update api_client_authorizations set api_token=$1, expires_at=$2 where uuid=$3`, hmac, exp, aca.UUID)
+ if e, ok := err.(*pq.Error); ok && e.Code == pqCodeUniqueViolation {
+ // unique_violation, given that the above
+ // query did not find a row with matching
+ // api_token, means another thread/process
+ // also received this same token and won the
+ // race to insert it -- in which case this
+ // thread doesn't need to update the database.
+ // Discard the redundant row.
+ _, err = tx.ExecContext(ctx, `rollback to savepoint upd`)
+ if err != nil {
+ return err
+ }
+ _, err = tx.ExecContext(ctx, `delete from api_client_authorizations where uuid=$1`, aca.UUID)
+ if err != nil {
+ return err
+ }
+ ctxlog.FromContext(ctx).WithField("HMAC", hmac).Debug("(*oidcTokenAuthorizer)registerToken: api_client_authorizations row inserted by another thread")
+ } else if err != nil {
+ ctxlog.FromContext(ctx).Errorf("%#v", err)
return fmt.Errorf("error adding OIDC access token to database: %w", err)
+ } else {
+ ctxlog.FromContext(ctx).WithFields(logrus.Fields{"UUID": aca.UUID, "HMAC": hmac}).Debug("(*oidcTokenAuthorizer)registerToken: inserted api_client_authorizations row")
}
- aca.APIToken = hmac
- ctxlog.FromContext(ctx).WithFields(logrus.Fields{"UUID": aca.UUID, "HMAC": hmac}).Debug("(*oidcTokenAuthorizer)registerToken: inserted api_client_authorizations row")
}
err = tx.Commit()
if err != nil {
return err
}
- aca.ExpiresAt = exp
- ta.cache.Add(tok, aca)
+ ta.cache.Add(tok, arvados.APIClientAuthorization{ExpiresAt: exp})
return nil
}
"net/url"
"sort"
"strings"
+ "sync"
"testing"
"time"
ctx := auth.NewContext(context.Background(), &auth.Credentials{Tokens: []string{accessToken}})
var exp1 time.Time
- oidcAuthorizer.WrapCalls(func(ctx context.Context, opts interface{}) (interface{}, error) {
- creds, ok := auth.FromContext(ctx)
- c.Assert(ok, check.Equals, true)
- c.Assert(creds.Tokens, check.HasLen, 1)
- c.Check(creds.Tokens[0], check.Equals, accessToken)
- err := db.QueryRowContext(ctx, `select expires_at at time zone 'UTC' from api_client_authorizations where api_token=$1`, apiToken).Scan(&exp1)
- c.Check(err, check.IsNil)
- c.Check(exp1.Sub(time.Now()) > -time.Second, check.Equals, true)
- c.Check(exp1.Sub(time.Now()) < time.Second, check.Equals, true)
- return nil, nil
- })(ctx, nil)
+ concurrent := 4
+ s.fakeProvider.HoldUserInfo = make(chan *http.Request)
+ s.fakeProvider.ReleaseUserInfo = make(chan struct{})
+ go func() {
+ for i := 0; ; i++ {
+ if i == concurrent {
+ close(s.fakeProvider.ReleaseUserInfo)
+ }
+ <-s.fakeProvider.HoldUserInfo
+ }
+ }()
+ var wg sync.WaitGroup
+ for i := 0; i < concurrent; i++ {
+ i := i
+ wg.Add(1)
+ go func() {
+ defer wg.Done()
+ _, err := oidcAuthorizer.WrapCalls(func(ctx context.Context, opts interface{}) (interface{}, error) {
+ c.Logf("concurrent req %d/%d", i, concurrent)
+ var exp time.Time
+
+ creds, ok := auth.FromContext(ctx)
+ c.Assert(ok, check.Equals, true)
+ c.Assert(creds.Tokens, check.HasLen, 1)
+ c.Check(creds.Tokens[0], check.Equals, accessToken)
+
+ err := db.QueryRowContext(ctx, `select expires_at at time zone 'UTC' from api_client_authorizations where api_token=$1`, apiToken).Scan(&exp)
+ c.Check(err, check.IsNil)
+ c.Check(exp.Sub(time.Now()) > -time.Second, check.Equals, true)
+ c.Check(exp.Sub(time.Now()) < time.Second, check.Equals, true)
+ if i == 0 {
+ exp1 = exp
+ }
+ return nil, nil
+ })(ctx, nil)
+ c.Check(err, check.IsNil)
+ }()
+ }
+ wg.Wait()
+ if c.Failed() {
+ c.Fatal("giving up")
+ }
// If the token is used again after the in-memory cache
// expires, oidcAuthorizer must re-check the token and update
var exp time.Time
err := db.QueryRowContext(ctx, `select expires_at at time zone 'UTC' from api_client_authorizations where api_token=$1`, apiToken).Scan(&exp)
c.Check(err, check.IsNil)
- c.Check(exp.Sub(exp1) > 0, check.Equals, true)
- c.Check(exp.Sub(exp1) < time.Second, check.Equals, true)
+ c.Check(exp.Sub(exp1) > 0, check.Equals, true, check.Commentf("expect %v > 0", exp.Sub(exp1)))
+ c.Check(exp.Sub(exp1) < time.Second, check.Equals, true, check.Commentf("expect %v < 1s", exp.Sub(exp1)))
return nil, nil
})(ctx, nil)
//
// SPDX-License-Identifier: AGPL-3.0
+//go:build !static
+
package localdb
import (
--- /dev/null
+// Copyright (C) The Arvados Authors. All rights reserved.
+//
+// SPDX-License-Identifier: AGPL-3.0
+
+//go:build static
+
+package localdb
+
+import (
+ "context"
+ "errors"
+
+ "git.arvados.org/arvados.git/sdk/go/arvados"
+)
+
+type pamLoginController struct {
+ Cluster *arvados.Cluster
+ Parent *Conn
+}
+
+func (ctrl *pamLoginController) Logout(ctx context.Context, opts arvados.LogoutOptions) (arvados.LogoutResponse, error) {
+ return logout(ctx, ctrl.Cluster, opts)
+}
+
+func (ctrl *pamLoginController) Login(ctx context.Context, opts arvados.LoginOptions) (arvados.LoginResponse, error) {
+ return arvados.LoginResponse{}, errors.New("interactive login is not available")
+}
+
+func (ctrl *pamLoginController) UserAuthenticate(ctx context.Context, opts arvados.UserAuthenticateOptions) (arvados.APIClientAuthorization, error) {
+ return arvados.APIClientAuthorization{}, errors.New("support not available due to static compilation")
+}
"time"
)
-// This magically allows us to look up userHz via _SC_CLK_TCK:
-
-/*
-#include <unistd.h>
-#include <sys/types.h>
-#include <pwd.h>
-#include <stdlib.h>
-*/
-import "C"
-
// A Reporter gathers statistics for a cgroup and writes them to a
// log.Logger.
type Reporter struct {
var userTicks, sysTicks int64
fmt.Sscanf(string(b), "user %d\nsystem %d", &userTicks, &sysTicks)
- userHz := float64(C.sysconf(C._SC_CLK_TCK))
+	// USER_HZ is 100 on Linux, so hardcode it here rather than
+	// looking it up via cgo and sysconf(_SC_CLK_TCK).
+	userHz := float64(100)
nextSample := cpuSample{
hasData: true,
sampleTime: time.Now(),
help="When invoked with --submit --wait, always submit a runner to manage the workflow, even when only running a single CommandLineTool",
default=False)
+ parser.add_argument("--match-submitter-images", action="store_true",
+ default=False, dest="match_local_docker",
+                    help="Where Arvados has more than one Docker image of the same name, use the image from the Docker instance on the submitting node.")
+
exgroup = parser.add_mutually_exclusive_group()
exgroup.add_argument("--submit-request-uuid",
default=None,
runtimeContext.pull_image,
runtimeContext.project_uuid,
runtimeContext.force_docker_pull,
- runtimeContext.tmp_outdir_prefix)
+ runtimeContext.tmp_outdir_prefix,
+ runtimeContext.match_local_docker)
network_req, _ = self.get_requirement("NetworkAccess")
if network_req:
import sys
import threading
import copy
+import re
+import subprocess
from schema_salad.sourceline import SourceLine
cached_lookups = {}
cached_lookups_lock = threading.Lock()
+def determine_image_id(dockerImageId):
+ for line in (
+ subprocess.check_output( # nosec
+ ["docker", "images", "--no-trunc", "--all"]
+ )
+ .decode("utf-8")
+ .splitlines()
+ ):
+ try:
+ match = re.match(r"^([^ ]+)\s+([^ ]+)\s+([^ ]+)", line)
+ split = dockerImageId.split(":")
+ if len(split) == 1:
+ split.append("latest")
+ elif len(split) == 2:
+ # if split[1] doesn't match valid tag names, it is a part of repository
+ if not re.match(r"[\w][\w.-]{0,127}", split[1]):
+ split[0] = split[0] + ":" + split[1]
+ split[1] = "latest"
+ elif len(split) == 3:
+ if re.match(r"[\w][\w.-]{0,127}", split[2]):
+ split[0] = split[0] + ":" + split[1]
+ split[1] = split[2]
+ del split[2]
+
+ # check for repository:tag match or image id match
+ if match and (
+ (split[0] == match.group(1) and split[1] == match.group(2))
+ or dockerImageId == match.group(3)
+ ):
+ return match.group(3)
+ except ValueError:
+ pass
+
+ return None
+
+
def arv_docker_get_image(api_client, dockerRequirement, pull_image, project_uuid,
- force_pull, tmp_outdir_prefix):
+ force_pull, tmp_outdir_prefix, match_local_docker):
"""Check if a Docker image is available in Keep, if not, upload it using arv-keepdocker."""
if "http://arvados.org/cwl#dockerCollectionPDH" in dockerRequirement:
image_name=image_name,
image_tag=image_tag)
+ if images and match_local_docker:
+ local_image_id = determine_image_id(dockerRequirement["dockerImageId"])
+ if local_image_id:
+ # find it in the list
+ found = False
+ for i in images:
+ if i[1]["dockerhash"] == local_image_id:
+ found = True
+ images = [i]
+ break
+ if not found:
+ # force re-upload.
+ images = []
+
if not images:
# Fetch Docker image if necessary.
try:
self.cluster_target_id = 0
self.always_submit_runner = False
self.collection_cache_size = 256
+ self.match_local_docker = False
super(ArvRuntimeContext, self).__init__(kwargs)
"Option 'dockerOutputDirectory' of DockerRequirement not supported.")
arvados_cwl.arvdocker.arv_docker_get_image(arvrunner.api, docker_req, True, arvrunner.project_uuid,
arvrunner.runtimeContext.force_docker_pull,
- arvrunner.runtimeContext.tmp_outdir_prefix)
+ arvrunner.runtimeContext.tmp_outdir_prefix,
+ arvrunner.runtimeContext.match_local_docker)
else:
arvados_cwl.arvdocker.arv_docker_get_image(arvrunner.api, {"dockerPull": "arvados/jobs:"+__version__},
True, arvrunner.project_uuid,
arvrunner.runtimeContext.force_docker_pull,
- arvrunner.runtimeContext.tmp_outdir_prefix)
+ arvrunner.runtimeContext.tmp_outdir_prefix,
+ arvrunner.runtimeContext.match_local_docker)
elif isinstance(tool, cwltool.workflow.Workflow):
for s in tool.steps:
upload_docker(arvrunner, s.embedded_tool)
v["http://arvados.org/cwl#dockerCollectionPDH"] = arvados_cwl.arvdocker.arv_docker_get_image(arvrunner.api, v, True,
arvrunner.project_uuid,
arvrunner.runtimeContext.force_docker_pull,
- arvrunner.runtimeContext.tmp_outdir_prefix)
+ arvrunner.runtimeContext.tmp_outdir_prefix,
+ arvrunner.runtimeContext.match_local_docker)
for l in v:
visit(v[l], cur_id)
if isinstance(v, list):
try:
return arvados_cwl.arvdocker.arv_docker_get_image(arvrunner.api, {"dockerPull": img}, True, arvrunner.project_uuid,
arvrunner.runtimeContext.force_docker_pull,
- arvrunner.runtimeContext.tmp_outdir_prefix)
+ arvrunner.runtimeContext.tmp_outdir_prefix,
+ arvrunner.runtimeContext.match_local_docker)
except Exception as e:
raise Exception("Docker image %s is not available\n%s" % (img, e) )
# file to determine what version of cwltool and schema-salad to
# build.
install_requires=[
- 'cwltool==3.1.20220210171524',
+ 'cwltool==3.1.20220217222804',
'schema-salad==8.2.20211116214159',
'arvados-python-client{}'.format(pysdk_dep),
'setuptools',
}))
+ # The test passes no builder.resources
+ # Hence the default resources will apply: {'cores': 1, 'ram': 1024, 'outdirSize': 1024, 'tmpdirSize': 1024}
+ @mock.patch("arvados_cwl.arvdocker.determine_image_id")
+ @mock.patch("arvados.commands.keepdocker.list_images_in_arv")
+ def test_match_local_docker(self, keepdocker, determine_image_id):
+ arvados_cwl.add_arv_hints()
+ arv_docker_clear_cache()
+
+ runner = mock.MagicMock()
+ runner.ignore_docker_for_reuse = False
+ runner.intermediate_output_ttl = 0
+ runner.secret_store = cwltool.secrets.SecretStore()
+ runner.api._rootDesc = {"revision": "20210628"}
+
+ keepdocker.return_value = [("zzzzz-4zz18-zzzzzzzzzzzzzz4", {"dockerhash": "456"}),
+ ("zzzzz-4zz18-zzzzzzzzzzzzzz3", {"dockerhash": "123"})]
+ determine_image_id.side_effect = lambda x: "123"
+ def execute(uuid):
+ ex = mock.MagicMock()
+ lookup = {"zzzzz-4zz18-zzzzzzzzzzzzzz4": {"portable_data_hash": "99999999999999999999999999999994+99"},
+ "zzzzz-4zz18-zzzzzzzzzzzzzz3": {"portable_data_hash": "99999999999999999999999999999993+99"}}
+ ex.execute.return_value = lookup[uuid]
+ return ex
+ runner.api.collections().get.side_effect = execute
+
+ tool = cmap({
+ "inputs": [],
+ "outputs": [],
+ "baseCommand": "echo",
+ "arguments": [],
+ "id": "",
+ "cwlVersion": "v1.2",
+ "class": "CommandLineTool"
+ })
+
+ loadingContext, runtimeContext = self.helper(runner, True)
+
+ arvtool = cwltool.load_tool.load_tool(tool, loadingContext)
+ arvtool.formatgraph = None
+
+ container_request = {
+ 'environment': {
+ 'HOME': '/var/spool/cwl',
+ 'TMPDIR': '/tmp'
+ },
+ 'name': 'test_run_True',
+ 'runtime_constraints': {
+ 'vcpus': 1,
+ 'ram': 268435456
+ },
+ 'use_existing': True,
+ 'priority': 500,
+ 'mounts': {
+ '/tmp': {'kind': 'tmp',
+ "capacity": 1073741824
+ },
+ '/var/spool/cwl': {'kind': 'tmp',
+ "capacity": 1073741824 }
+ },
+ 'state': 'Committed',
+ 'output_name': 'Output for step test_run_True',
+ 'owner_uuid': 'zzzzz-8i9sb-zzzzzzzzzzzzzzz',
+ 'output_path': '/var/spool/cwl',
+ 'output_ttl': 0,
+ 'container_image': '99999999999999999999999999999994+99',
+ 'command': ['echo'],
+ 'cwd': '/var/spool/cwl',
+ 'scheduling_parameters': {},
+ 'properties': {},
+ 'secret_mounts': {},
+ 'output_storage_classes': ["default"]
+ }
+
+ runtimeContext.match_local_docker = False
+ for j in arvtool.job({}, mock.MagicMock(), runtimeContext):
+ j.run(runtimeContext)
+ runner.api.container_requests().create.assert_called_with(
+ body=JsonDiffMatcher(container_request))
+
+ arv_docker_clear_cache()
+ runtimeContext.match_local_docker = True
+ container_request['container_image'] = '99999999999999999999999999999993+99'
+ container_request['name'] = 'test_run_True_2'
+ container_request['output_name'] = 'Output for step test_run_True_2'
+ for j in arvtool.job({}, mock.MagicMock(), runtimeContext):
+ j.run(runtimeContext)
+ runner.api.container_requests().create.assert_called_with(
+ body=JsonDiffMatcher(container_request))
+
+
+
class TestWorkflow(unittest.TestCase):
def setUp(self):
cwltool.process._names = set()
stubs.keep_client = keep_client2
stubs.docker_images = {
- "arvados/jobs:"+arvados_cwl.__version__: [("zzzzz-4zz18-zzzzzzzzzzzzzd3", "")],
- "debian:buster-slim": [("zzzzz-4zz18-zzzzzzzzzzzzzd4", "")],
- "arvados/jobs:123": [("zzzzz-4zz18-zzzzzzzzzzzzzd5", "")],
- "arvados/jobs:latest": [("zzzzz-4zz18-zzzzzzzzzzzzzd6", "")],
+ "arvados/jobs:"+arvados_cwl.__version__: [("zzzzz-4zz18-zzzzzzzzzzzzzd3", {})],
+ "debian:buster-slim": [("zzzzz-4zz18-zzzzzzzzzzzzzd4", {})],
+ "arvados/jobs:123": [("zzzzz-4zz18-zzzzzzzzzzzzzd5", {})],
+ "arvados/jobs:latest": [("zzzzz-4zz18-zzzzzzzzzzzzzd6", {})],
}
def kd(a, b, image_name=None, image_tag=None):
return stubs.docker_images.get("%s:%s" % (image_name, image_tag), [])
arvrunner.project_uuid = ""
api.return_value = mock.MagicMock()
arvrunner.api = api.return_value
+ arvrunner.runtimeContext.match_local_docker = False
arvrunner.api.links().list().execute.side_effect = ({"items": [{"created_at": "",
"head_uuid": "zzzzz-4zz18-zzzzzzzzzzzzzzb",
"link_class": "docker_image_repo+tag",
PeopleAPIResponse map[string]interface{}
+ // send incoming /userinfo requests to HoldUserInfo (if not
+ // nil), then receive from ReleaseUserInfo (if not nil),
+ // before responding (these are used to set up races)
+ HoldUserInfo chan *http.Request
+ ReleaseUserInfo chan struct{}
+
key *rsa.PrivateKey
Issuer *httptest.Server
PeopleAPI *httptest.Server
case "/auth":
w.WriteHeader(http.StatusInternalServerError)
case "/userinfo":
+ if p.HoldUserInfo != nil {
+ p.HoldUserInfo <- req
+ }
+ if p.ReleaseUserInfo != nil {
+ <-p.ReleaseUserInfo
+ }
authhdr := req.Header.Get("Authorization")
if _, err := jwt.ParseSigned(strings.TrimPrefix(authhdr, "Bearer ")); err != nil {
p.c.Logf("OIDCProvider: bad auth %q", authhdr)
UNLOGGED_CHANGES = ['last_used_at', 'last_used_by_ip_address', 'updated_at']
def assign_random_api_token
- self.api_token ||= rand(2**256).to_s(36)
+ begin
+ self.api_token ||= rand(2**256).to_s(36)
+ rescue ActiveModel::MissingAttributeError
+ # Ignore the case where self.api_token doesn't exist, which happens when
+ # the select=[...] is used.
+ end
end
def owner_uuid
return ApiClientAuthorization.new(user: User.find_by_uuid(anonymous_user_uuid),
uuid: Rails.configuration.ClusterID+"-gj3su-anonymouspublic",
api_token: token,
- api_client: anonymous_user_token_api_client)
+ api_client: anonymous_user_token_api_client,
+ scopes: ['GET /'])
else
return nil
end
uniqueness: true,
allow_nil: true)
validate :must_unsetup_to_deactivate
+ validate :identity_url_nil_if_empty
before_update :prevent_privilege_escalation
before_update :prevent_inactive_admin
before_update :verify_repositories_empty, :if => Proc.new {
repo.save!
end
end
+
+ def identity_url_nil_if_empty
+ if identity_url == ""
+ self.identity_url = nil
+ end
+ end
end
get :current
assert_response 401
end
+
+ # Tests regression #18801
+ test "select param is respected in 'show' response" do
+ authorize_with :active
+ get :show, params: {
+ id: api_client_authorizations(:active).uuid,
+ select: ["uuid"],
+ }
+ assert_response :success
+ assert_raises ActiveModel::MissingAttributeError do
+ assigns(:object).api_token
+ end
+ assert_nil json_response["expires_at"]
+ assert_nil json_response["api_token"]
+ assert_equal api_client_authorizations(:active).uuid, json_response["uuid"]
+ end
end
assert user.save
end
+ test "empty identity_url saves as null" do
+ set_user_from_auth :admin
+ user = users(:active)
+ assert user.update_attributes(identity_url: '')
+ user.reload
+ assert_nil user.identity_url
+ end
+
end
"--volume=$PG_DATA:/var/lib/postgresql:rw" \
"--volume=$VAR_DATA:$ARVADOS_CONTAINER_PATH:rw" \
"--volume=$PASSENGER:/var/lib/passenger:rw" \
- "--volume=$GEMS:/var/lib/arvados/lib/ruby/gems:rw" \
+ "--volume=$GEMS:/var/lib/arvados-arvbox/.gem:rw" \
"--volume=$PIPCACHE:/var/lib/pip:rw" \
"--volume=$NPMCACHE:/var/lib/npm:rw" \
"--volume=$GOSTUFF:/var/lib/gopath:rw" \
export R_LIBS=/var/lib/Rlibs
export HOME=$(getent passwd arvbox | cut -d: -f6)
export ARVADOS_CONTAINER_PATH=/var/lib/arvados-arvbox
-GEMLOCK=/var/lib/arvados/lib/ruby/gems/gems.lock
+export GEM_HOME=$HOME/.gem
+GEMLOCK=$HOME/gems.lock
defaultdev=$(/sbin/ip route|awk '/default/ { print $5 }')
dockerip=$(/sbin/ip route | grep default | awk '{ print $3 }')
fi
run_bundler() {
- flock $GEMLOCK /var/lib/arvados/bin/gem install --no-document bundler:$BUNDLER_VERSION
+ flock $GEMLOCK /var/lib/arvados/bin/gem install --no-document --user bundler:$BUNDLER_VERSION
if test -f Gemfile.lock ; then
frozen=--frozen
else
if ! [[ -z "$waiting" ]] ; then
if ps x | grep -v grep | grep "bundle install" > /dev/null; then
- gemcount=$(ls /var/lib/arvados/lib/ruby/gems/*/gems 2>/dev/null | wc -l)
+ gemcount=$(ls /var/lib/arvados/lib/ruby/gems/*/gems /var/lib/arvados-arvbox/.gem/ruby/*/gems 2>/dev/null | wc -l)
gemlockcount=0
for l in /usr/src/arvados/services/api/Gemfile.lock \
"aws_profile": "",
"aws_secret_key": "",
"aws_source_ami": "ami-031283ff8a43b021c",
+ "aws_ebs_autoscale": "",
"build_environment": "aws",
"public_key_file": "",
"mksquashfs_mem": "",
"type": "file",
"source": "scripts/usr-local-bin-ensure-encrypted-partitions.sh",
"destination": "/tmp/usr-local-bin-ensure-encrypted-partitions.sh"
+ },{
+ "type": "file",
+ "source": "scripts/usr-local-bin-ensure-encrypted-partitions-aws-ebs-autoscale.sh",
+ "destination": "/tmp/usr-local-bin-ensure-encrypted-partitions-aws-ebs-autoscale.sh"
+ },{
+ "type": "file",
+ "source": "scripts/create-ebs-volume-nvme.patch",
+ "destination": "/tmp/create-ebs-volume-nvme.patch"
},{
"type": "file",
"source": "{{user `public_key_file`}}",
"type": "shell",
"execute_command": "sudo -S env {{ .Vars }} /bin/bash '{{ .Path }}'",
"script": "scripts/base.sh",
- "environment_vars": ["RESOLVER={{user `resolver`}}","REPOSUFFIX={{user `reposuffix`}}","MKSQUASHFS_MEM={{user `mksquashfs_mem`}}","NVIDIA_GPU_SUPPORT={{user `nvidia_gpu_support`}}","CLOUD=aws"]
+ "environment_vars": ["RESOLVER={{user `resolver`}}","REPOSUFFIX={{user `reposuffix`}}","MKSQUASHFS_MEM={{user `mksquashfs_mem`}}","NVIDIA_GPU_SUPPORT={{user `nvidia_gpu_support`}}","CLOUD=aws","AWS_EBS_AUTOSCALE={{user `aws_ebs_autoscale`}}"]
}]
}
VPC id for AWS, otherwise packer will pick the default one
--aws-subnet-id
Subnet id for AWS otherwise packer will pick the default one for the VPC
+ --aws-ebs-autoscale (default: false)
+ Install the AWS EBS autoscaler daemon.
--gcp-project-id (default: false, required if building for GCP)
GCP project id
--gcp-account-file (default: false, required if building for GCP)
--debug (default: false)
Output debug information
+For more information, see the Arvados documentation at https://doc.arvados.org/install/crunch2-cloud/install-compute-node.html
+
EOF
JSON_FILE=
AWS_SOURCE_AMI=
AWS_VPC_ID=
AWS_SUBNET_ID=
+AWS_EBS_AUTOSCALE=
GCP_PROJECT_ID=
GCP_ACCOUNT_FILE=
GCP_ZONE=
NVIDIA_GPU_SUPPORT=
PARSEDOPTS=$(getopt --name "$0" --longoptions \
- help,json-file:,arvados-cluster-id:,aws-source-ami:,aws-profile:,aws-secrets-file:,aws-region:,aws-vpc-id:,aws-subnet-id:,gcp-project-id:,gcp-account-file:,gcp-zone:,azure-secrets-file:,azure-resource-group:,azure-location:,azure-sku:,azure-cloud-environment:,ssh_user:,resolver:,reposuffix:,public-key-file:,mksquashfs-mem:,nvidia-gpu-support,debug \
+ help,json-file:,arvados-cluster-id:,aws-source-ami:,aws-profile:,aws-secrets-file:,aws-region:,aws-vpc-id:,aws-subnet-id:,aws-ebs-autoscale,gcp-project-id:,gcp-account-file:,gcp-zone:,azure-secrets-file:,azure-resource-group:,azure-location:,azure-sku:,azure-cloud-environment:,ssh_user:,resolver:,reposuffix:,public-key-file:,mksquashfs-mem:,nvidia-gpu-support,debug \
-- "" "$@")
if [ $? -ne 0 ]; then
exit 1
--aws-subnet-id)
AWS_SUBNET_ID="$2"; shift
;;
+ --aws-ebs-autoscale)
+ AWS_EBS_AUTOSCALE=1
+ ;;
--gcp-project-id)
GCP_PROJECT_ID="$2"; shift
;;
done
-if [[ "$JSON_FILE" == "" ]] || [[ ! -f "$JSON_FILE" ]]; then
+if [[ -z "$JSON_FILE" ]] || [[ ! -f "$JSON_FILE" ]]; then
echo >&2 "$helpmessage"
echo >&2
echo >&2 "ERROR: packer json file not found"
exit 1
fi
-if [[ "$PUBLIC_KEY_FILE" == "" ]] || [[ ! -f "$PUBLIC_KEY_FILE" ]]; then
+if [[ -z "$PUBLIC_KEY_FILE" ]] || [[ ! -f "$PUBLIC_KEY_FILE" ]]; then
echo >&2 "$helpmessage"
echo >&2
echo >&2 "ERROR: public key file file not found"
EXTRA2=""
-if [[ "$AWS_SOURCE_AMI" != "" ]]; then
+if [[ -n "$AWS_SOURCE_AMI" ]]; then
EXTRA2+=" -var aws_source_ami=$AWS_SOURCE_AMI"
fi
-if [[ "$AWS_PROFILE" != "" ]]; then
+if [[ -n "$AWS_PROFILE" ]]; then
EXTRA2+=" -var aws_profile=$AWS_PROFILE"
fi
-if [[ "$AWS_VPC_ID" != "" ]]; then
+if [[ -n "$AWS_VPC_ID" ]]; then
EXTRA2+=" -var vpc_id=$AWS_VPC_ID -var associate_public_ip_address=true "
fi
-if [[ "$AWS_SUBNET_ID" != "" ]]; then
+if [[ -n "$AWS_SUBNET_ID" ]]; then
EXTRA2+=" -var subnet_id=$AWS_SUBNET_ID -var associate_public_ip_address=true "
fi
-if [[ "$AWS_DEFAULT_REGION" != "" ]]; then
+if [[ -n "$AWS_DEFAULT_REGION" ]]; then
EXTRA2+=" -var aws_default_region=$AWS_DEFAULT_REGION"
fi
-if [[ "$GCP_PROJECT_ID" != "" ]]; then
+if [[ -n "$AWS_EBS_AUTOSCALE" ]]; then
+ EXTRA2+=" -var aws_ebs_autoscale=$AWS_EBS_AUTOSCALE"
+fi
+if [[ -n "$GCP_PROJECT_ID" ]]; then
EXTRA2+=" -var project_id=$GCP_PROJECT_ID"
fi
-if [[ "$GCP_ACCOUNT_FILE" != "" ]]; then
+if [[ -n "$GCP_ACCOUNT_FILE" ]]; then
EXTRA2+=" -var account_file=$GCP_ACCOUNT_FILE"
fi
-if [[ "$GCP_ZONE" != "" ]]; then
+if [[ -n "$GCP_ZONE" ]]; then
EXTRA2+=" -var zone=$GCP_ZONE"
fi
-if [[ "$AZURE_RESOURCE_GROUP" != "" ]]; then
+if [[ -n "$AZURE_RESOURCE_GROUP" ]]; then
EXTRA2+=" -var resource_group=$AZURE_RESOURCE_GROUP"
fi
-if [[ "$AZURE_LOCATION" != "" ]]; then
+if [[ -n "$AZURE_LOCATION" ]]; then
EXTRA2+=" -var location=$AZURE_LOCATION"
fi
-if [[ "$AZURE_SKU" != "" ]]; then
+if [[ -n "$AZURE_SKU" ]]; then
EXTRA2+=" -var image_sku=$AZURE_SKU"
fi
-if [[ "$AZURE_CLOUD_ENVIRONMENT" != "" ]]; then
+if [[ -n "$AZURE_CLOUD_ENVIRONMENT" ]]; then
EXTRA2+=" -var cloud_environment_name=$AZURE_CLOUD_ENVIRONMENT"
fi
-if [[ "$SSH_USER" != "" ]]; then
+if [[ -n "$SSH_USER" ]]; then
EXTRA2+=" -var ssh_user=$SSH_USER"
fi
-if [[ "$RESOLVER" != "" ]]; then
+if [[ -n "$RESOLVER" ]]; then
EXTRA2+=" -var resolver=$RESOLVER"
fi
-if [[ "$REPOSUFFIX" != "" ]]; then
+if [[ -n "$REPOSUFFIX" ]]; then
EXTRA2+=" -var reposuffix=$REPOSUFFIX"
fi
-if [[ "$PUBLIC_KEY_FILE" != "" ]]; then
+if [[ -n "$PUBLIC_KEY_FILE" ]]; then
EXTRA2+=" -var public_key_file=$PUBLIC_KEY_FILE"
fi
-if [[ "$MKSQUASHFS_MEM" != "" ]]; then
+if [[ -n "$MKSQUASHFS_MEM" ]]; then
EXTRA2+=" -var mksquashfs_mem=$MKSQUASHFS_MEM"
fi
-if [[ "$NVIDIA_GPU_SUPPORT" != "" ]]; then
+if [[ -n "$NVIDIA_GPU_SUPPORT" ]]; then
EXTRA2+=" -var nvidia_gpu_support=$NVIDIA_GPU_SUPPORT"
fi
-
-
echo
packer version
echo
#
# SPDX-License-Identifier: Apache-2.0
+set -eu -o pipefail
+
SUDO=sudo
wait_for_apt_locks() {
if [ "x$RESOLVER" != "x" ]; then
$SUDO sed -i "s/#prepend domain-name-servers 127.0.0.1;/prepend domain-name-servers ${RESOLVER};/" /etc/dhcp/dhclient.conf
fi
-# Set up the cloud-init script that will ensure encrypted disks
-$SUDO mv /tmp/usr-local-bin-ensure-encrypted-partitions.sh /usr/local/bin/ensure-encrypted-partitions.sh
+
+if [ "$AWS_EBS_AUTOSCALE" != "1" ]; then
+ # Set up the cloud-init script that will ensure encrypted disks
+ $SUDO mv /tmp/usr-local-bin-ensure-encrypted-partitions.sh /usr/local/bin/ensure-encrypted-partitions.sh
+else
+ wait_for_apt_locks && $SUDO DEBIAN_FRONTEND=noninteractive apt-get -qq --yes install jq unzip
+
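+  # The autoscaler's helper scripts shell out to the AWS CLI, so install v2 from the official bundle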
+ curl -s "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "/tmp/awscliv2.zip"
+ unzip -q /tmp/awscliv2.zip -d /tmp && $SUDO /tmp/aws/install
+ # Pinned to v2.4.5 because we apply a patch below
+ #export EBS_AUTOSCALE_VERSION=$(curl --silent "https://api.github.com/repos/awslabs/amazon-ebs-autoscale/releases/latest" | jq -r .tag_name)
+ export EBS_AUTOSCALE_VERSION="v2.4.5"
+ cd /opt && $SUDO git clone https://github.com/awslabs/amazon-ebs-autoscale.git
+ cd /opt/amazon-ebs-autoscale && $SUDO git checkout $EBS_AUTOSCALE_VERSION
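+  # Adapt the upstream create-ebs-volume helper to NVMe device naming (patch shipped alongside this script)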
+ $SUDO patch -p1 < /tmp/create-ebs-volume-nvme.patch
+
+  # This script requires bash, but its shebang says /bin/sh; fix it so its bashisms work
+ $SUDO sed -i 's|^#!/bin/sh|#!/bin/bash|' /opt/amazon-ebs-autoscale/bin/ebs-autoscale
+
+ # Set up the cloud-init script that makes use of the AWS EBS autoscaler
+ $SUDO mv /tmp/usr-local-bin-ensure-encrypted-partitions-aws-ebs-autoscale.sh /usr/local/bin/ensure-encrypted-partitions.sh
+fi
+
$SUDO chmod 755 /usr/local/bin/ensure-encrypted-partitions.sh
$SUDO chown root:root /usr/local/bin/ensure-encrypted-partitions.sh
$SUDO mv /tmp/etc-cloud-cloud.cfg.d-07_compute_arvados_dispatch_cloud.cfg /etc/cloud/cloud.cfg.d/07_compute_arvados_dispatch_cloud.cfg
--- /dev/null
+# Copyright (C) The Arvados Authors. All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+Make the create-ebs-volume script work with nvme devices.
+
+diff --git a/bin/create-ebs-volume b/bin/create-ebs-volume
+index 6857564..e3122fa 100755
+--- a/bin/create-ebs-volume
++++ b/bin/create-ebs-volume
+@@ -149,10 +149,11 @@ function get_next_logical_device() {
+ for letter in ${alphabet[@]}; do
+ # use /dev/xvdb* device names to avoid contention for /dev/sd* and /dev/xvda names
+ # only supported by HVM instances
+- if [ ! -b "/dev/xvdb${letter}" ]; then
++ if [[ $created_volumes =~ .*/dev/xvdb${letter}.* ]]; then
++ continue
++ fi
+ echo "/dev/xvdb${letter}"
+ break
+- fi
+ done
+ }
+
+@@ -323,8 +324,13 @@ function create_and_attach_volume() {
+
+ logthis "waiting for volume $volume_id on filesystem"
+ while true; do
+- if [ -e "$device" ]; then
+- logthis "volume $volume_id on filesystem as $device"
++ # AWS returns e.g. vol-00338247831716a7b4, the kernel changes that to vol00338247831716a7b
++ valid_volume_id=`echo $volume_id |sed -e 's/[^a-zA-Z0-9]//'`
++ # example lsblk output:
++ # nvme4n1 259:7 0 150G 0 disk vol00338247831716a7b
++ if LSBLK=`lsblk -o NAME,SERIAL |grep $valid_volume_id`; then
++ nvme_device=/dev/`echo $LSBLK|cut -f1 -d' '`
++ logthis "volume $volume_id on filesystem as $nvme_device (aws device $device)"
+ break
+ fi
+ sleep 1
+@@ -338,7 +344,7 @@ function create_and_attach_volume() {
+ > /dev/null
+ logthis "volume $volume_id DeleteOnTermination ENABLED"
+
+- echo $device
++ echo "$nvme_device"
+ }
+
+ create_and_attach_volume
--- /dev/null
+#!/bin/bash
+
+# Copyright (C) The Arvados Authors. All rights reserved.
+#
+# SPDX-License-Identifier: Apache-2.0
+
+set -e
+set -x
+
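+# The autoscaled EBS filesystem is mounted here; Docker's data-root is pointed at it below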
+MOUNTPATH=/tmp
+
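+# Quiet findmnt: succeeds only if the given path is currently mounted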
+findmntq() {
+ findmnt "$@" >/dev/null
+}
+
+ensure_umount() {
+ if findmntq "$1"; then
+ umount "$1"
+ fi
+}
+
+# Stop Docker so it is not using /tmp, then unmount the Docker storage directory under it.
+if [ -d /etc/sv/docker.io ]
+then
+  sv stop docker.io || service docker.io stop || true
+else
+ service docker stop || true
+fi
+
+ensure_umount "$MOUNTPATH/docker/aufs"
+
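+# Install the autoscaler: it manages an LVM/ext4 filesystem on EBS volumes mounted at $MOUNTPATH and grows it on demand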
+/bin/bash /opt/amazon-ebs-autoscale/install.sh -f lvm.ext4 -m "$MOUNTPATH" > /var/log/ebs-autoscale-install.log 2>&1
+
+# Make sure docker uses the big partition
+cat <<EOF > /etc/docker/daemon.json
+{
+ "data-root": "$MOUNTPATH/docker-data"
+}
+EOF
+
+# restart docker
+if [ -d /etc/sv/docker.io ]
+then
+ ## runit
+ sv up docker.io
+else
+ service docker start
+fi
+
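+# Give the Docker daemon up to a minute to come back before failing the boot script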
+end=$((SECONDS+60))
+
+while [ $SECONDS -lt $end ]; do
+ if /usr/bin/docker ps -q >/dev/null; then
+ exit 0
+ fi
+ sleep 1
+done
+
+# Docker didn't start within a minute, abort
+exit 1