---
layout: default
navsection: installguide
title: Install the cloud dispatcher
...
{% comment %}
Copyright (C) The Arvados Authors. All rights reserved.
SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}
{% include 'notebox_begin_warning' %}
@arvados-dispatch-cloud@ is only relevant for cloud installations. Skip this section if you are installing an on-premises cluster that will spool jobs to Slurm or LSF.
{% include 'notebox_end' %}
# "Introduction":#introduction
# "Create compute node VM image":#create-image
# "Update config.yml":#update-config
# "Install arvados-dispatch-cloud":#install-packages
# "Start the service":#start-service
# "Restart the API server and controller":#restart-api
# "Confirm working installation":#confirm-working
h2(#introduction). Introduction
The cloud dispatch service is for running containers on cloud VMs. It works with Microsoft Azure and Amazon EC2; future versions will also support Google Compute Engine.
The cloud dispatch service can run on any node that can connect to the Arvados API service, the cloud provider's API, and the SSH service on cloud VMs. It is not resource-intensive, so you can run it on the API server node.
More detail about the internal operation of the dispatcher can be found in the "architecture section":{{site.baseurl}}/architecture/dispatchcloud.html.
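The connectivity requirements described above (the Arvados API service, the cloud provider's API, and SSH on the cloud VMs) can be checked from the candidate dispatcher node before installing anything. The following is a minimal sketch, not part of Arvados: the helper names @can_connect@ and @check_endpoints@ are hypothetical, and it only tests whether a TCP connection can be opened, not whether authentication works.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_endpoints(endpoints: dict, timeout: float = 3.0) -> dict:
    """Map each endpoint name to True/False depending on TCP reachability.

    `endpoints` maps a label to a (host, port) tuple; the labels and hosts
    are whatever you choose to pass in (nothing here is read from config.yml).
    """
    return {
        name: can_connect(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }
```

For example, calling @check_endpoints@ with entries for your controller host on port 443, your cloud provider's API hostname on port 443, and a test VM on port 22 returns a dictionary showing which of the three are reachable from the node.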
h2(#update-config). Update config.yml
h3. Configure CloudVMs
Add or update the following portions of your cluster configuration file, @config.yml@. Refer to "config.defaults.yml":{{site.baseurl}}/admin/config.html for information about additional configuration options. The @DispatchPrivateKey@ should be the *private* key generated in "Create a SSH keypair":install-compute-node.html#sshkeypair .
    Services:
      DispatchCloud:
        InternalURLs:
          "http://localhost:9006": {}
    Containers:
      CloudVMs:
        # BootProbeCommand is a shell command that succeeds when an instance is ready for service
        BootProbeCommand: "sudo systemctl status docker"

        # --- driver-specific configuration goes here --- see Amazon and Azure examples below ---

      DispatchPrivateKey: |
        -----BEGIN RSA PRIVATE KEY-----
        MIIEpQIBAAKCAQEAqXoCzcOBkFQ7w4dvXf9B++1ctgZRqEbgRYL3SstuMV4oawks
        ttUuxJycDdsPmeYcHsKo8vsEZpN6iYsX6ZZzhkO5nEayUTU8sBjmg1ZCTo4QqKXr
        FJ+amZ7oYMDof6QEdwl6KNDfIddL+NfBCLQTVInOAaNss7GRrxLTuTV7HcRaIUUI
        jYg0Ibg8ZZTzQxCvFXXnjseTgmOcTv7CuuGdt91OVdoq8czG/w8TwOhymEb7mQlt
        lXuucwQvYgfoUgcnTgpJr7j+hafp75g2wlPozp8gJ6WQ2yBWcfqL2aw7m7Ll88Nd
        [...]
        oFyAjVoexx0RBcH6BveTfQtJKbktP1qBO4mXo2dP0cacuZEtlAqW9Eb06Pvaw/D9
        foktmqOY8MyctzFgXBpGTxPliGjqo8OkrOyQP2g+FL7v+Km31Xs61P8=
        -----END RSA PRIVATE KEY-----
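Because @DispatchPrivateKey@ must hold the private half of the keypair, a quick check before pasting key material into @config.yml@ can catch the common mistake of using the public key. The sketch below is illustrative only: @key_kind@ is a hypothetical helper that looks at the PEM header or the OpenSSH public-key prefix, and it does not validate the key itself.

```python
import re

# Matches a PEM armor header such as "-----BEGIN RSA PRIVATE KEY-----".
_PEM_HEADER = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")

# Prefixes used by single-line OpenSSH public keys (e.g. id_rsa.pub).
_OPENSSH_PUBLIC_PREFIXES = ("ssh-rsa ", "ssh-ed25519 ", "ecdsa-sha2-")


def key_kind(text: str) -> str:
    """Classify key material as 'private', 'public', or 'unknown'."""
    stripped = text.strip()
    match = _PEM_HEADER.search(stripped)
    if match:
        label = match.group(1)
        if label.endswith("PRIVATE KEY"):
            return "private"
        if label.endswith("PUBLIC KEY"):
            return "public"
        return "unknown"
    if stripped.startswith(_OPENSSH_PUBLIC_PREFIXES):
        return "public"
    return "unknown"
```

Reading the key file and asserting that @key_kind(contents) == "private"@ is enough to confirm that the right file is about to be copied into the configuration.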
    InstanceTypes:
      x1md:
        ProviderType: x1.medium
        VCPUs: 8
        RAM: 64GiB
        IncludedScratch: 64GB
        Price: 0.62
      x1lg:
        ProviderType: x1.large
        VCPUs: 16
        RAM: 128GiB
        IncludedScratch: 128GB
        Price: 1.23
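To illustrate how the fields of an @InstanceTypes@ entry relate to a container's resource request, here is a rough sketch that picks the lowest-priced type satisfying a request. It is an assumption-laden simplification, not the dispatcher's actual selection code: @parse_size@ and @cheapest_fit@ are hypothetical helpers, and the sketch ignores memory reserved for system overhead, added scratch volumes, preemptible instances, and GPUs. Note that @GiB@ and @GB@ are different units (2^30^ versus 10^9^ bytes), which is why @RAM@ and @IncludedScratch@ are parsed rather than compared as strings.

```python
import re

# Decimal (GB) and binary (GiB) multipliers.
_UNITS = {
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40,
}

# The two instance types from the example configuration above.
INSTANCE_TYPES = {
    "x1md": {"ProviderType": "x1.medium", "VCPUs": 8, "RAM": "64GiB",
             "IncludedScratch": "64GB", "Price": 0.62},
    "x1lg": {"ProviderType": "x1.large", "VCPUs": 16, "RAM": "128GiB",
             "IncludedScratch": "128GB", "Price": 1.23},
}


def parse_size(value: str) -> int:
    """Convert a size string such as '64GiB' or '5TB' to a byte count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([KMGT]i?B)", value.strip())
    if not match:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def cheapest_fit(instance_types, vcpus, ram_bytes, scratch_bytes):
    """Return the name of the lowest-priced type meeting all three minimums,
    or None if no configured type is large enough."""
    candidates = [
        (spec["Price"], name)
        for name, spec in instance_types.items()
        if spec["VCPUs"] >= vcpus
        and parse_size(spec["RAM"]) >= ram_bytes
        and parse_size(spec["IncludedScratch"]) >= scratch_bytes
    ]
    return min(candidates)[1] if candidates else None
```

With the two example types, a request for 4 cores, 32 GiB of RAM, and 1 GB of scratch fits @x1md@, while a request for 12 cores only fits @x1lg@.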

    InstanceTypes:
      g4dn:
        ProviderType: g4dn.xlarge
        VCPUs: 4
        RAM: 16GiB
        IncludedScratch: 125GB
        Price: 0.56
        CUDA:
          DriverVersion: "11.4"
          HardwareCapability: "7.5"
          DeviceCount: 1

    InstanceTypes:
      c5large:
        ProviderType: c5.large
        VCPUs: 2
        RAM: 4GiB
        IncludedScratch: 5TB
        Price: 0.085
      m5large:
        ProviderType: m5.large
        VCPUs: 2
        RAM: 8GiB
        IncludedScratch: 5TB
        Price: 0.096
...
    {
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:AttachVolume",
                    "ec2:DescribeVolumeStatus",
                    "ec2:DescribeVolumes",
                    "ec2:DescribeTags",
                    "ec2:ModifyInstanceAttribute",
                    "ec2:DescribeVolumeAttribute",
                    "ec2:CreateVolume",
                    "ec2:DeleteVolume",
                    "ec2:CreateTags"
                ],
                "Resource": "*"
            }
        ]
    }

    Containers:
      CloudVMs:
        ImageID: ami-01234567890abcdef
        Driver: ec2
        DriverParameters:
          # If you are not using an IAM role for authentication, specify access
          # credentials here. Otherwise, omit or set AccessKeyID and
          # SecretAccessKey to an empty value.
          AccessKeyID: XXXXXXXXXXXXXXXXXXXX
          SecretAccessKey: YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

          SecurityGroupIDs:
          - sg-0123abcd
          SubnetID: subnet-0123abcd
          Region: us-east-1
          EBSVolumeType: gp2
          AdminUsername: arvados

    {
        "Id": "arvados-dispatch-cloud policy",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateTags",
                    "ec2:Describe*",
                    "ec2:CreateImage",
                    "ec2:CreateKeyPair",
                    "ec2:ImportKeyPair",
                    "ec2:DeleteKeyPair",
                    "ec2:RunInstances",
                    "ec2:StopInstances",
                    "ec2:TerminateInstances",
                    "ec2:ModifyInstanceAttribute",
                    "ec2:CreateSecurityGroup",
                    "ec2:DeleteSecurityGroup",
                    "iam:PassRole"
                ],
                "Resource": "*"
            }
        ]
    }
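When adapting a policy like the one above, it can help to confirm that the actions you expect are still covered, including by wildcard entries such as @ec2:Describe*@. The following sketch is a simplified local check under stated assumptions: @missing_actions@ is a hypothetical helper, it only looks at @Allow@ statements, and it ignores @Deny@ statements, @Resource@ restrictions, and conditions, so it is not a substitute for AWS's own policy evaluation.

```python
from fnmatch import fnmatchcase


def missing_actions(policy: dict, required: list) -> list:
    """Return the required actions not matched by any Allow statement.

    Action patterns may contain '*' wildcards; matching is done
    case-insensitively by lowercasing both sides.
    """
    patterns = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        patterns.extend(action.lower() for action in actions)
    return [
        action for action in required
        if not any(fnmatchcase(action.lower(), pattern) for pattern in patterns)
    ]
```

For instance, loading the policy with @json.load@ and passing a list such as @["ec2:DescribeInstances", "ec2:RunInstances"]@ returns an empty list when every action is covered.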

    Containers:
      CloudVMs:
        ImageID: "zzzzz-compute-v1597349873"
        Driver: azure
        # (azure) managed disks: set MaxConcurrentInstanceCreateOps to 20 to avoid timeouts, cf
        # https://docs.microsoft.com/en-us/azure/virtual-machines/linux/capture-image
        MaxConcurrentInstanceCreateOps: 20
        DriverParameters:
          # Credentials.
          SubscriptionID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
          ClientID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
          ClientSecret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
          TenantID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

          # Data center where VMs will be allocated
          Location: centralus

          # The resource group where the VM and virtual NIC will be
          # created.
          ResourceGroup: zzzzz
          NetworkResourceGroup: yyyyy   # only if different from ResourceGroup
          Network: xxxxx
          Subnet: xxxxx-subnet-private

          # The resource group where the disk image is stored, only needs to
          # be specified if it is different from ResourceGroup
          ImageResourceGroup: aaaaa

    Containers:
      CloudVMs:
        ImageID: "shared_image_gallery_image_definition_name"
        Driver: azure
        DriverParameters:
          # Credentials.
          SubscriptionID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
          ClientID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
          ClientSecret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
          TenantID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

          # Data center where VMs will be allocated
          Location: centralus

          # The resource group where the VM and virtual NIC will be
          # created.
          ResourceGroup: zzzzz
          NetworkResourceGroup: yyyyy   # only if different from ResourceGroup
          Network: xxxxx
          Subnet: xxxxx-subnet-private

          # The resource group where the disk image is stored, only needs to
          # be specified if it is different from ResourceGroup
          ImageResourceGroup: aaaaa

          # (azure) shared image gallery: the name of the gallery
          SharedImageGalleryName: "shared_image_gallery_1"
          # (azure) shared image gallery: the version of the image definition
          SharedImageGalleryImageVersion: "0.0.1"

    Containers:
      CloudVMs:
        ImageID: "https://zzzzzzzz.blob.core.windows.net/system/Microsoft.Compute/Images/images/zzzzz-compute-osDisk.55555555-5555-5555-5555-555555555555.vhd"
        Driver: azure
        DriverParameters:
          # Credentials.
          SubscriptionID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
          ClientID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
          ClientSecret: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
          TenantID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

          # Data center where VMs will be allocated
          Location: centralus

          # The resource group where the VM and virtual NIC will be
          # created.
          ResourceGroup: zzzzz
          NetworkResourceGroup: yyyyy   # only if different from ResourceGroup
          Network: xxxxx
          Subnet: xxxxx-subnet-private

          # Where to store the VM VHD blobs
          StorageAccount: example
          BlobContainer: vhds

    $ az account list
    [
      {
        "cloudName": "AzureCloud",
        "id": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX",
        "isDefault": true,
        "name": "Your Subscription",
        "state": "Enabled",
        "tenantId": "YYYYYYYY-YYYY-YYYY-YYYYYYYY",
        "user": {
          "name": "you@example.com",
          "type": "user"
        }
      }
    ]

You will need to create a "service principal" to use as a delegated authority for API access.

$ az ad app create --display-name "Arvados Dispatch Cloud (ClusterID)" --homepage "https://arvados.org" --identifier-uris "https://ClusterID.example.com" --end-date 2299-12-31 --password Your_Password
$ az ad sp create "appId"
(appId is part of the response of the previous command)
$ az role assignment create --assignee "objectId" --role Owner --scope /subscriptions/{subscriptionId}/
(objectId is part of the response of the previous command)
~$ arvados-server cloudtest && echo "OK!"
# journalctl -o cat -fu arvados-dispatch-cloud.service
# arvados-client sudo diagnostics
INFO 5: running health check (same as `arvados-server check`)
INFO 10: getting discovery document from https://zzzzz.arvadosapi.com/discovery/v1/apis/arvados/v1/rest
...
INFO 160: running a container
INFO ... container request submitted, waiting up to 10m for container to run
~$ curl -sH "Authorization: Bearer $token" http://localhost:9006/arvados/v1/dispatch/containers
    {
      "items": [
        {
          "container": {
            "uuid": "zzzzz-dz642-hdp2vpu9nq14tx0",
            ...
            "state": "Running",
            "scheduling_parameters": {
              "partitions": null,
              "preemptible": false,
              "max_run_time": 0
            },
            "exit_code": 0,
            "runtime_status": null,
            "started_at": null,
            "finished_at": null
          },
          "instance_type": {
            "Name": "Standard_D2s_v3",
            "ProviderType": "Standard_D2s_v3",
            "VCPUs": 2,
            "RAM": 8589934592,
            "Scratch": 16000000000,
            "IncludedScratch": 16000000000,
            "AddedScratch": 0,
            "Price": 0.11,
            "Preemptible": false
          }
        }
      ]
    }
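The response shown above can also be summarized programmatically, which is convenient when many containers are queued. The sketch below is a minimal illustration that assumes only the fields visible in the example output (@items@, @container.uuid@, @container.state@, and @instance_type.Name@); @containers_by_state@ and @SAMPLE_RESPONSE@ are hypothetical names, and the sample is a trimmed stand-in for the text that the @curl@ command prints.

```python
import json

# Trimmed stand-in for the dispatcher's response; a real response has more fields.
SAMPLE_RESPONSE = """
{
  "items": [
    {
      "container": {"uuid": "zzzzz-dz642-hdp2vpu9nq14tx0", "state": "Running"},
      "instance_type": {"Name": "Standard_D2s_v3", "VCPUs": 2, "RAM": 8589934592}
    }
  ]
}
"""


def containers_by_state(response_text: str) -> dict:
    """Group (container UUID, instance type name) pairs by container state."""
    document = json.loads(response_text)
    grouped = {}
    for item in document.get("items", []):
        container = item["container"]
        pair = (container["uuid"], item["instance_type"]["Name"])
        grouped.setdefault(container["state"], []).append(pair)
    return grouped
```

Applied to the sample, the function returns a single @Running@ entry listing the container UUID together with the @Standard_D2s_v3@ instance type it was assigned.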
shell:~$ arv get zzzzz-dz642-hdp2vpu9nq14tx0
    {
     ...
     "exit_code":0,
     "log":"a01df2f7e5bc1c2ad59c60a837e90dc6+166",
     "output":"d41d8cd98f00b204e9800998ecf8427e+0",
     "state":"Complete",
     ...
    }
~$ arv keep ls a01df2f7e5bc1c2ad59c60a837e90dc6+166
./crunch-run.txt
./stderr.txt
./stdout.txt
~$ arv-get a01df2f7e5bc1c2ad59c60a837e90dc6+166/stdout.txt
2016-08-05T13:53:06.201011Z Hello, Crunch!