- architecture/index.html.textile.liquid
- Storage in Keep:
- architecture/storage.html.textile.liquid
+ - architecture/keep-components-overview.html.textile.liquid
- architecture/keep-clients.html.textile.liquid
- architecture/keep-data-lifecycle.html.textile.liquid
- architecture/manifest-format.html.textile.liquid
This page describes how to enable preemptible instances. Preemptible instances typically offer lower cost computation with a tradeoff of lower service guarantees. If a compute node is preempted, Arvados will restart the computation on a new instance.
-Currently Arvados supports preemptible instances using AWS spot instances.
+Currently Arvados supports preemptible instances using AWS and Azure spot instances.
h2. Configuration
-To use preemptible instances, set @UsePreemptibleInstances: true@ and add entries to @InstanceTypes@ with @Preemptible: true@ to @config.yml@. Typically you want to add both preemptible and non-preemptible entries for each cloud provider VM type. The @Price@ for preemptible instances is the maximum bid price, the actual price paid is dynamic and may be lower. For example:
+To use preemptible instances, set @UsePreemptibleInstances: true@ and add entries to @InstanceTypes@ with @Preemptible: true@ to @config.yml@. Typically you want to add both preemptible and non-preemptible entries for each cloud provider VM type. The @Price@ for preemptible instances is the maximum bid price, the actual price paid is dynamic and will likely be lower. For example:
<pre>
Clusters:
When @UsePreemptibleInstances@ is enabled, child containers (workflow steps) will automatically be made preemptible. Note that because preempting the workflow runner would cancel the entire workflow, the workflow runner runs in a reserved (non-preemptible) instance.
-If you are using "arvados-dispatch-cloud":{{site.baseurl}}/install/crunch2-cloud/install-dispatch-cloud.html no additional configuration is required.
+No additional configuration is required; "arvados-dispatch-cloud":{{site.baseurl}}/install/crunch2-cloud/install-dispatch-cloud.html will now start preemptible instances where appropriate.
+
+h3. Cost Tracking
+
+Preemptible instance prices are declared at instance request time and defined by the maximum price that the user is willing to pay per hour. By default, this price is the same as the on-demand price of each instance type, and that is the setting @arvados-dispatch-cloud@ currently uses, since it does not attach any pricing data to the spot instance request.
+
+For AWS, the real price that a spot instance has at any point in time is discovered at the end of each usage hour, depending on instance demand. For this reason, AWS provides a data feed subscription to get hourly logs, as described on "Amazon's User Guide":https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-data-feeds.html.
h2. Preemptible instances on AWS
-For general information, see "using Amazon EC2 spot instances":https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html .
+For general information, see "using Amazon EC2 spot instances":https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html.
h3. Permissions
The account needs to have a service-linked role created. This can be done by logging into the AWS account, going to _IAM Management_ → _Roles_, and creating the @AWSServiceRoleForEC2Spot@ role: click the @Create@ button, then select the @EC2@ service and the @EC2 - Spot Instances@ use case.
-h3. Cost Tracking
+h2. Preemptible instances on Azure
+
+For general information, see "Use Spot VMs in Azure":https://docs.microsoft.com/en-us/azure/virtual-machines/spot-vms.
-Amazon's Spot instances prices are declared at instance request time and defined by the maximum price that the user is willing to pay per hour. By default, this price is the same amount as the on-demand version of each instance type, and this setting is the one that @arvados-dispatch-cloud@ uses for now, as it doesn't include any pricing data to the spot instance request.
+When starting preemptible instances on Azure, Arvados configures the eviction policy to 'delete', with max price set to '-1'. This has the effect that preemptible VMs will not be evicted for pricing reasons. The price paid for the instance will be the current spot price for the VM type, up to a maximum of the price for a standard, non-spot VM of that type.
-The real price that a spot instance has at any point in time is discovered at the end of each usage hour, depending on instance demand. For this reason, AWS provides a data feed subscription to get hourly logs, as described on "Amazon's User Guide":https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-data-feeds.html.
+Please note that Azure provides no SLA for preemptible instances. Even in this configuration, preemptible instances can still be evicted for capacity reasons. If that happens and a container is aborted, Arvados will try to restart it, subject to the usual retry rules.
+Spot pricing is not available on 'B-series' VMs; those should not be defined in the configuration file with the _Preemptible_ flag set to true. Spot instances have a separate quota pool, so make sure you have sufficient quota available.
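
As an illustrative sketch (the VM type, sizes, and prices below are example values, not recommendations), a matching pair of on-demand and spot entries for an Azure VM type in @config.yml@ might look like:

```yaml
Clusters:
  ClusterID:
    Containers:
      UsePreemptibleInstances: true
    InstanceTypes:
      Standard_D2s_v3:
        ProviderType: Standard_D2s_v3
        VCPUs: 2
        RAM: 8GiB
        Price: 0.096
      Standard_D2s_v3.preemptible:
        ProviderType: Standard_D2s_v3
        Preemptible: true
        VCPUs: 2
        RAM: 8GiB
        Price: 0.096
```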
A regular Workbench "download" link is also accepted, but credentials passed via cookie, header, etc. are ignored. Only public data can be served this way:
pre. http://collections.example.com/collections/uuid_or_pdh/foo/bar.txt
+
+h2(#same-site). Same-site requirements for requests with tokens
+
+Although keep-web doesn't care about the domain part of the URL, the clients do, especially when rendering inline content.
+
+When a client passes a token in the URL, keep-web sends a redirect response placing the token in a @Set-Cookie@ header with the @SameSite=Lax@ attribute. The browser will ignore the cookie if it's not coming from a _same-site_ request, and thus its subsequent request will fail with a @401 Unauthorized@ error.
+
+This mainly affects Workbench's ability to show inline content, so it should be taken into account when configuring both services' URL schemes.
+
+You can read more about the definition of a _same-site_ request at the "RFC 6265bis-03 page":https://tools.ietf.org/html/draft-ietf-httpbis-rfc6265bis-03#section-5.2
\ No newline at end of file
--- /dev/null
+---
+layout: default
+navsection: architecture
+title: Keep components overview
+...
+{% comment %}
+Copyright (C) The Arvados Authors. All rights reserved.
+
+SPDX-License-Identifier: CC-BY-SA-3.0
+{% endcomment %}
+
+Keep has a number of components. This page describes each component and the role it plays.
+
+h3. Keep clients for data access
+
+To store data in and retrieve data from Keep, a client is needed. Several types of Keep clients exist:
+* a command line client like "@arv-get@":/user/tutorials/tutorial-keep-get.html#download-using-arv or "@arv-put@":/user/tutorials/tutorial-keep.html#upload-using-command
+* a FUSE mount provided by "@arv-mount@":/user/tutorials/tutorial-keep-mount-gnu-linux.html
+* a WebDAV mount provided by @keep-web@
+* an S3-compatible endpoint provided by @keep-web@
+* programmatic access via the "Arvados SDKs":/sdk/index.html
+
+In essence, these clients all do the same thing: they translate file and directory references into requests for Keep blocks and collection manifests. How Keep clients work, and how they use rendezvous hashing, is described in greater detail in "the next section":/architecture/keep-clients.html.
+
+For example, when a request comes in to read a file from Keep, the client will
+* request the collection object (including its manifest) from the API server
+* look up the file in the collection manifest, and retrieve the hashes of the block(s) that contain its content
+* request the blocks with those hashes from the keepstore(s)
+* return the contents of the file to the requestor
+
+All of those steps are subject to access control, which applies at the level of the collection: in the example above, the API server and the keepstore daemons verify that the client has permission to read the collection, and will reject the request if it does not.
+
+h3. API server
+
+The API server stores collection objects and all associated metadata. That includes data about where the blocks for a collection are to be stored, e.g. when "storage classes":/admin/storage-classes.html are configured, as well as the desired and confirmed replication count for each block. It also stores the ACLs that control access to the collections. Finally, the API server provides Keep clients with time-based block signatures for access.
+
+h3. Keepstore
+
+The @keepstore@ daemon is Keep's workhorse, the storage server that stores and retrieves data from an underlying storage system. Keepstore exposes an HTTP REST API. Keepstore only handles requests for blocks. Because blocks are content-addressed, they can be written and deleted, but there is no _update_ operation: blocks are immutable.
+
+So what happens if the content of a file changes? When a client changes a file, it first writes any new blocks to the keepstore(s). Then, it updates the manifest for the collection the file belongs to with the references to the new blocks.
+
+A keepstore can store its blocks in object storage (S3 or an S3-compatible system, or Azure Blob Storage). It can also store blocks on a POSIX file system. A keepstore can be configured with multiple storage volumes. Each keepstore volume is configured with a replication number; e.g. a POSIX file system backed by a single disk would have a replication factor of 1, while an Azure 'LRS' storage volume could be configured with a replication factor of 3 (that is how many copies LRS stores under the hood, according to the Azure documentation).
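
A sketch of two keepstore volumes with different replication factors in @config.yml@ (the volume UUIDs, paths, and container name are placeholders; field names follow the current cluster config schema):

```yaml
Clusters:
  zzzzz:
    Volumes:
      zzzzz-nyw5e-000000000000000:
        Driver: Directory
        DriverParameters:
          Root: /mnt/local-disk
        Replication: 1
      zzzzz-nyw5e-000000000000001:
        Driver: Azure
        DriverParameters:
          ContainerName: keepblocks
        Replication: 3
```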
+
+By default, Arvados uses a replication factor of 2. See the @DefaultReplication@ configuration parameter in "the configuration reference":https://doc.arvados.org/admin/config.html. Additionally, each collection can be configured with its own replication factor. It's worth noting that it is the responsibility of the Keep clients to make sure that all blocks are stored subject to their desired replica count, which is derived from the collections the blocks belong to. @keepstore@ itself does not provide replication; all it does is store blocks on the volumes it knows about. The @keepproxy@ and @keep-balance@ processes (see below) make sure that blocks are replicated properly.
+
+The maximum block size for @keepstore@ is 64 MiB, and Keep clients typically combine small files into larger blocks. In a typical Arvados installation, the majority of blocks stored in Keep will be 64 MiB, though some fraction will be smaller.
+
+h3. Keepproxy
+
+The @keepproxy@ server is a gateway into your Keep storage. Unlike the Keepstore servers, which are only accessible on the local LAN, Keepproxy is suitable for clients located elsewhere on the internet. A client writing through Keepproxy only writes one copy of each block; the Keepproxy server will write additional copies of the data to the Keepstore servers, to fulfill the requested replication factor. Keepproxy also checks API token validity before processing requests.
+
+h3. Keep-web
+
+The @keep-web@ server provides read/write access to files stored in Keep using the HTTP, WebDAV and S3 protocols. This makes it easy to access files in Keep from a browser, or mount Keep as a network folder using WebDAV support in various operating systems. It serves public data to unauthenticated clients, and serves private data to clients that supply Arvados API tokens.
+
+h3. Keep-balance
+
+Keep is a garbage-collected system. When a block is no longer referenced by any collection manifest in the system, it becomes eligible for garbage collection. When the desired replication factor for a block (derived from the default replication factor and the replication factor of any collection(s) the block belongs to) does not match reality, the number of copies stored on the available Keepstore servers needs to be adjusted.
+
+The @keep-balance@ program takes care of these things. It runs as a service, and wakes up periodically to do a scan of the system and send instructions to the Keepstore servers. That process is described in more detail at "Balancing Keep servers":https://doc.arvados.org/admin/keep-balance.html.
SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}
-h2. Arados /etc/arvados/config.yml
+h2. Arvados /etc/arvados/config.yml
The configuration file is normally found at @/etc/arvados/config.yml@ and will be referred to as just @config.yml@ in this guide. This configuration file must be kept in sync across every service node in the cluster, but not shell and compute nodes (which do not require config.yml).
# "Introduction":#introduction
# "Create an SSH keypair":#sshkeypair
# "The build script":#building
-# "Build an Azure image":#azure
# "Build an AWS image":#aws
+# "Build an Azure image":#azure
h2(#introduction). Introduction
Output debug information (default: false)
</code></pre></notextile>
-h2(#azure). Build an Azure image
+h2(#aws). Build an AWS image
-<notextile><pre><code>~$ <span class="userinput">./build.sh --json-file arvados-images-azure.json \
+<notextile><pre><code>~$ <span class="userinput">./build.sh --json-file arvados-images-aws.json \
--arvados-cluster-id ClusterID \
- --azure-resource-group ResourceGroup \
- --azure-location AzureRegion \
- --azure-sku AzureSKU \
- --azure-secrets-file AzureSecretsFilePath \
+ --aws-profile AWSProfile \
+ --aws-source-ami AMI \
+ --aws-vpc-id VPC \
+ --aws-subnet-id Subnet \
+ --ssh_user admin \
--resolver ResolverIP \
--public-key-file ArvadosDispatchCloudPublicKeyPath
</span>
</code></pre></notextile>
-For @ClusterID@, fill in your cluster ID. The @ResourceGroup@ and @AzureRegion@ (e.g. 'eastus2') should be configured for where you want the compute image to be generated and stored. The @AzureSKU@ is the SKU of the base image to be used, e.g. '18.04-LTS' for Ubuntu 18.04.
-
-@AzureSecretsFilePath@ should be replaced with the path to a shell script that loads the Azure secrets with sufficient permissions to create the image. The file would look like this:
-
-<notextile><pre><code>export ARM_CLIENT_ID=...
-export ARM_CLIENT_SECRET=...
-export ARM_SUBSCRIPTION_ID=...
-export ARM_TENANT_ID=...
-</code></pre></notextile>
-
-These secrets can be generated from the Azure portal, or with the cli using a command like this:
+For @ClusterID@, fill in your cluster ID. The @VPC@ and @Subnet@ should be configured for where you want the compute image to be generated and stored. The @AMI@ is the identifier for the base image to be used. Current AMIs are maintained by "Debian":https://wiki.debian.org/Cloud/AmazonEC2Image/Buster and "Ubuntu":https://cloud-images.ubuntu.com/locator/ec2/.
-<notextile><pre><code>~$ <span class="userinput">az ad sp create-for-rbac --name Packer --password ...</span>
-</code></pre></notextile>
+@AWSProfile@ should be replaced with the name of an AWS profile with sufficient permissions to create the image.
@ArvadosDispatchCloudPublicKeyPath@ should be replaced with the path to the ssh *public* key file generated in "Create an SSH keypair":#sshkeypair, above.
Adding these lines to the @/etc/hosts@ file in the compute node image could be done with a small change to the Packer template and the @scripts/base.sh@ script, which will be left as an exercise for the reader.
-h2(#aws). Build an AWS image
+h2(#azure). Build an Azure image
-<notextile><pre><code>~$ <span class="userinput">./build.sh --json-file arvados-images-aws.json \
+<notextile><pre><code>~$ <span class="userinput">./build.sh --json-file arvados-images-azure.json \
--arvados-cluster-id ClusterID \
- --aws-profile AWSProfile \
- --aws-source-ami AMI \
- --aws-vpc-id VPC \
- --aws-subnet-id Subnet \
- --ssh_user admin \
+ --azure-resource-group ResourceGroup \
+ --azure-location AzureRegion \
+ --azure-sku AzureSKU \
+ --azure-secrets-file AzureSecretsFilePath \
--resolver ResolverIP \
--public-key-file ArvadosDispatchCloudPublicKeyPath
</span>
</code></pre></notextile>
-For @ClusterID@, fill in your cluster ID. The @VPC@ and @Subnet@ should be configured for where you want the compute image to be generated and stored. The @AMI@ is the identifier for the base image to be used. Current AMIs are maintained by "Debian":https://wiki.debian.org/Cloud/AmazonEC2Image/Buster and "Ubuntu":https://cloud-images.ubuntu.com/locator/ec2/.
+For @ClusterID@, fill in your cluster ID. The @ResourceGroup@ and @AzureRegion@ (e.g. 'eastus2') should be configured for where you want the compute image to be generated and stored. The @AzureSKU@ is the SKU of the base image to be used, e.g. '18.04-LTS' for Ubuntu 18.04.
-@AWSProfile@ should be replaced with the name of an AWS profile with sufficient permissions to create the image.
+@AzureSecretsFilePath@ should be replaced with the path to a shell script that loads the Azure secrets with sufficient permissions to create the image. The file would look like this:
+
+<notextile><pre><code>export ARM_CLIENT_ID=...
+export ARM_CLIENT_SECRET=...
+export ARM_SUBSCRIPTION_ID=...
+export ARM_TENANT_ID=...
+</code></pre></notextile>
+
+These secrets can be generated from the Azure portal, or with the cli using a command like this:
+
+<notextile><pre><code>~$ <span class="userinput">az ad sp create-for-rbac --name Packer --password ...</span>
+</code></pre></notextile>
@ArvadosDispatchCloudPublicKeyPath@ should be replaced with the path to the ssh *public* key file generated in "Create an SSH keypair":#sshkeypair, above.
h4. Minimal configuration example for Amazon EC2
+The <span class="userinput">ImageID</span> value is the compute node image that was built in "the previous section":install-compute-node.html#aws.
+
<notextile>
<pre><code> Containers:
CloudVMs:
- ImageID: ami-01234567890abcdef
+ ImageID: <span class="userinput">ami-01234567890abcdef</span>
Driver: ec2
DriverParameters:
AccessKeyID: XXXXXXXXXXXXXXXXXXXX
Using managed disks:
+The <span class="userinput">ImageID</span> value is the compute node image that was built in "the previous section":install-compute-node.html#azure.
+
<notextile>
<pre><code> Containers:
CloudVMs:
- ImageID: "zzzzz-compute-v1597349873"
+ ImageID: <span class="userinput">"zzzzz-compute-v1597349873"</span>
Driver: azure
# (azure) managed disks: set MaxConcurrentInstanceCreateOps to 20 to avoid timeouts, cf
# https://docs.microsoft.com/en-us/azure/virtual-machines/linux/capture-image
<notextile>
<pre><code> Containers:
CloudVMs:
- ImageID: "shared_image_gallery_image_definition_name"
+ ImageID: <span class="userinput">"shared_image_gallery_image_definition_name"</span>
Driver: azure
DriverParameters:
# Credentials.
Using unmanaged disks (deprecated):
+The <span class="userinput">ImageID</span> value is the compute node image that was built in "the previous section":install-compute-node.html#azure.
+
<notextile>
<pre><code> Containers:
CloudVMs:
- ImageID: "https://zzzzzzzz.blob.core.windows.net/system/Microsoft.Compute/Images/images/zzzzz-compute-osDisk.55555555-5555-5555-5555-555555555555.vhd"
+ ImageID: <span class="userinput">"https://zzzzzzzz.blob.core.windows.net/system/Microsoft.Compute/Images/images/zzzzz-compute-osDisk.55555555-5555-5555-5555-555555555555.vhd"</span>
Driver: azure
DriverParameters:
# Credentials.
There are two approaches to mitigate this.
# The service can tell the browser that all files should go to download instead of in-browser preview, except in situations where an attacker is unlikely to be able to gain access to anything they didn't already have access to.
-# Each each collection served by @keep-web@ is served on its own virtual host. This allows for file with executable content to be displayed in-browser securely. The virtual host embeds the collection uuid or portable data hash in the hostname. For example, a collection with uuid @xxxxx-4zz18-tci4vn4fa95w0zx@ could be served as @xxxxx-4zz18-tci4vn4fa95w0zx.collections.ClusterID.example.com@ . The portable data hash @dd755dbc8d49a67f4fe7dc843e4f10a6+54@ could be served at @dd755dbc8d49a67f4fe7dc843e4f10a6-54.collections.ClusterID.example.com@ . This requires "wildcard DNS record":https://en.wikipedia.org/wiki/Wildcard_DNS_record and "wildcard TLS certificate.":https://en.wikipedia.org/wiki/Wildcard_certificate
+# Each collection served by @keep-web@ is served on its own virtual host. This allows files with executable content to be displayed in-browser securely. The virtual host embeds the collection uuid or portable data hash in the hostname. For example, a collection with uuid @xxxxx-4zz18-tci4vn4fa95w0zx@ could be served as @xxxxx-4zz18-tci4vn4fa95w0zx.collections.ClusterID.example.com@ . The portable data hash @dd755dbc8d49a67f4fe7dc843e4f10a6+54@ could be served at @dd755dbc8d49a67f4fe7dc843e4f10a6-54.collections.ClusterID.example.com@ . This requires a "wildcard DNS record":https://en.wikipedia.org/wiki/Wildcard_DNS_record and a "wildcard TLS certificate.":https://en.wikipedia.org/wiki/Wildcard_certificate
h3. Collections download URL
Note the trailing slash.
+{% include 'notebox_begin' %}
+Whether you choose to serve collections from their own subdomain or from a single domain, it's important to keep in mind that they should be served from the same _site_ as Workbench for the inline previews to work.
+
+Please check "keep-web's URL pattern guide":/api/keep-web-urls.html#same-site to learn more.
+{% include 'notebox_end' %}
+
h2. Set InternalURLs
<notextile>
A shell node runs the @arvados-login-sync@ service to manage user accounts, and typically has Arvados utilities and SDKs pre-installed. Users are allowed to log in and run arbitrary programs. For optimal performance, the Arvados shell server should be on the same LAN as the Arvados cluster.
-Because it _contains secrets_ shell nodes should *not* have a copy of the Arvados @config.yml@.
+Because Arvados @config.yml@ _contains secrets_, it should *not* be present on shell nodes.
Shell nodes should be separate virtual machines from the VMs running other Arvados services. You may choose to grant root access to users so that they can customize the node, for example, installing new programs. This has security considerations depending on whether a shell node is single-user or multi-user.
Set @ARVADOS_VIRTUAL_MACHINE_UUID@ to the UUID from "Create record for VM":#vm-record
+h3. Standalone cluster
+
<notextile>
<pre>
<code>shellserver:# <span class="userinput">umask 0700; tee /etc/cron.d/arvados-login-sync <<EOF
</pre>
</notextile>
+h3. Part of a LoginCluster federation
+
+If this cluster is part of a "federation with centralized user management":../admin/federation.html#LoginCluster , the login sync script also needs to be given the host and user token for the login cluster.
+
+<notextile>
+<pre>
+<code>shellserver:# <span class="userinput">umask 0700; tee /etc/cron.d/arvados-login-sync <<EOF
+ARVADOS_API_HOST="<strong>ClusterID.example.com</strong>"
+ARVADOS_API_TOKEN="<strong>xxxxxxxxxxxxxxxxx</strong>"
+LOGINCLUSTER_ARVADOS_API_HOST="<strong>LoginClusterID.example.com</strong>"
+LOGINCLUSTER_ARVADOS_API_TOKEN="<strong>yyyyyyyyyyyyyyyyy</strong>"
+ARVADOS_VIRTUAL_MACHINE_UUID="<strong>zzzzz-2x53u-zzzzzzzzzzzzzzz</strong>"
+*/2 * * * * root arvados-login-sync
+EOF</span></code>
+</pre>
+</notextile>
+
+
h2(#confirm-working). Confirm working installation
A user should be able to log in to the shell server when the following conditions are satisfied:
The SDK is packaged as a JAR named @arvados-java-<version>.jar@, which is published to Maven Central and can be included using Maven, Gradle, or by hand.
-Here is an example @build.gradle@ file that uses the Arados java sdk:
+Here is an example @build.gradle@ file that uses the Arvados Java SDK:
<pre>
apply plugin: 'application'
--- /dev/null
+// Copyright (C) The Arvados Authors. All rights reserved.
+//
+// SPDX-License-Identifier: AGPL-3.0
+
+package boot
+
+import (
+ "context"
+ "net/url"
+
+ "git.arvados.org/arvados.git/lib/controller/rpc"
+ "git.arvados.org/arvados.git/lib/service"
+ "git.arvados.org/arvados.git/sdk/go/arvados"
+ "git.arvados.org/arvados.git/sdk/go/arvadosclient"
+ "git.arvados.org/arvados.git/sdk/go/auth"
+ "git.arvados.org/arvados.git/sdk/go/ctxlog"
+ "git.arvados.org/arvados.git/sdk/go/keepclient"
+ "gopkg.in/check.v1"
+)
+
+// TestCluster stores the data of a working test cluster.
+type TestCluster struct {
+ Super Supervisor
+ Config arvados.Config
+ ControllerURL *url.URL
+ ClusterID string
+}
+
+type logger struct {
+ loggerfunc func(...interface{})
+}
+
+func (l logger) Log(args ...interface{}) {
+	l.loggerfunc(args...)
+}
+
+// NewTestCluster loads the provided configuration, and sets up a test cluster
+// ready for being started.
+func NewTestCluster(srcPath, clusterID string, cfg *arvados.Config, listenHost string, logWriter func(...interface{})) *TestCluster {
+ return &TestCluster{
+ Super: Supervisor{
+ SourcePath: srcPath,
+ ClusterType: "test",
+ ListenHost: listenHost,
+ ControllerAddr: ":0",
+ OwnTemporaryDatabase: true,
+ Stderr: &service.LogPrefixer{
+ Writer: ctxlog.LogWriter(logWriter),
+ Prefix: []byte("[" + clusterID + "] ")},
+ },
+ Config: *cfg,
+ ClusterID: clusterID,
+ }
+}
+
+// Start the test cluster.
+func (tc *TestCluster) Start() {
+ tc.Super.Start(context.Background(), &tc.Config, "-")
+}
+
+// WaitReady waits for all components to report healthy, and finishes setting
+// up the TestCluster struct.
+func (tc *TestCluster) WaitReady() bool {
+ au, ok := tc.Super.WaitReady()
+ if !ok {
+ return ok
+ }
+ u := url.URL(*au)
+ tc.ControllerURL = &u
+ return ok
+}
+
+// ClientsWithToken returns Context, Arvados.Client and keepclient structs
+// initialized to connect to the cluster with the supplied Arvados token.
+func (tc *TestCluster) ClientsWithToken(token string) (context.Context, *arvados.Client, *keepclient.KeepClient) {
+ cl := tc.Config.Clusters[tc.ClusterID]
+ ctx := auth.NewContext(context.Background(), auth.NewCredentials(token))
+ ac, err := arvados.NewClientFromConfig(&cl)
+ if err != nil {
+ panic(err)
+ }
+ ac.AuthToken = token
+ arv, err := arvadosclient.New(ac)
+ if err != nil {
+ panic(err)
+ }
+ kc := keepclient.New(arv)
+ return ctx, ac, kc
+}
+
+// UserClients logs in as a user called "example", gets the user's API token,
+// initializes clients with that token, sets up the user, and optionally
+// activates the user. It returns client structs for communicating with the
+// cluster on behalf of the 'example' user.
+func (tc *TestCluster) UserClients(rootctx context.Context, c *check.C, conn *rpc.Conn, authEmail string, activate bool) (context.Context, *arvados.Client, *keepclient.KeepClient, arvados.User) {
+ login, err := conn.UserSessionCreate(rootctx, rpc.UserSessionCreateOptions{
+ ReturnTo: ",https://example.com",
+ AuthInfo: rpc.UserSessionAuthInfo{
+ Email: authEmail,
+ FirstName: "Example",
+ LastName: "User",
+ Username: "example",
+ },
+ })
+ c.Assert(err, check.IsNil)
+ redirURL, err := url.Parse(login.RedirectLocation)
+ c.Assert(err, check.IsNil)
+ userToken := redirURL.Query().Get("api_token")
+ c.Logf("user token: %q", userToken)
+ ctx, ac, kc := tc.ClientsWithToken(userToken)
+ user, err := conn.UserGetCurrent(ctx, arvados.GetOptions{})
+ c.Assert(err, check.IsNil)
+ _, err = conn.UserSetup(rootctx, arvados.UserSetupOptions{UUID: user.UUID})
+ c.Assert(err, check.IsNil)
+ if activate {
+ _, err = conn.UserActivate(rootctx, arvados.UserActivateOptions{UUID: user.UUID})
+ c.Assert(err, check.IsNil)
+ user, err = conn.UserGetCurrent(ctx, arvados.GetOptions{})
+ c.Assert(err, check.IsNil)
+ c.Logf("user UUID: %q", user.UUID)
+ if !user.IsActive {
+ c.Fatalf("failed to activate user -- %#v", user)
+ }
+ }
+ return ctx, ac, kc, user
+}
+
+// RootClients returns Context, arvados.Client and keepclient structs initialized
+// to communicate with the cluster as the system root user.
+func (tc *TestCluster) RootClients() (context.Context, *arvados.Client, *keepclient.KeepClient) {
+ return tc.ClientsWithToken(tc.Config.Clusters[tc.ClusterID].SystemRootToken)
+}
+
+// AnonymousClients returns Context, arvados.Client and keepclient structs initialized
+// to communicate with the cluster as the anonymous user.
+func (tc *TestCluster) AnonymousClients() (context.Context, *arvados.Client, *keepclient.KeepClient) {
+ return tc.ClientsWithToken(tc.Config.Clusters[tc.ClusterID].Users.AnonymousUserToken)
+}
+
+// Conn gets rpc connection struct initialized to communicate with the
+// specified cluster.
+func (tc *TestCluster) Conn() *rpc.Conn {
+ return rpc.NewConn(tc.ClusterID, tc.ControllerURL, true, rpc.PassthroughTokenProvider)
+}
},
}
+ if instanceType.Preemptible {
+ // Setting maxPrice to -1 is the equivalent of paying spot price, up to the
+ // normal price. This means the node will not be pre-empted for price
+ // reasons. It may still be pre-empted for capacity reasons though. And
+ // Azure offers *no* SLA on spot instances.
+ var maxPrice float64 = -1
+ vmParameters.VirtualMachineProperties.Priority = compute.Spot
+ vmParameters.VirtualMachineProperties.EvictionPolicy = compute.Delete
+ vmParameters.VirtualMachineProperties.BillingProfile = &compute.BillingProfile{MaxPrice: &maxPrice}
+ }
+
vm, err := az.vmClient.createOrUpdate(az.ctx, az.azconfig.ResourceGroup, name, vmParameters)
if err != nil {
// Do some cleanup. Otherwise, an unbounded number of new unused nics and
Price: .02,
Preemptible: false,
},
+ "tinyp": {
+ Name: "tiny",
+ ProviderType: "Standard_D1_v2",
+ VCPUs: 1,
+ RAM: 4000000000,
+ Scratch: 10000000000,
+ Price: .002,
+ Preemptible: true,
+ },
})}
if *live != "" {
var exampleCfg testConfig
c.Check(tags["TestTagName"], check.Equals, "test tag value")
c.Logf("inst.String()=%v Address()=%v Tags()=%v", inst.String(), inst.Address(), tags)
+ instPreemptable, err := ap.Create(cluster.InstanceTypes["tinyp"],
+ img, map[string]string{
+ "TestTagName": "test tag value",
+ }, "umask 0600; echo -n test-file-data >/var/run/test-file", pk)
+
+ c.Assert(err, check.IsNil)
+
+ tags = instPreemptable.Tags()
+ c.Check(tags["TestTagName"], check.Equals, "test tag value")
+ c.Logf("instPreemptable.String()=%v Address()=%v Tags()=%v", instPreemptable.String(), instPreemptable.Address(), tags)
+
}
func (*AzureInstanceSetSuite) TestListInstances(c *check.C) {
"math"
"net"
"net/http"
- "net/url"
"os"
"os/exec"
"path/filepath"
"git.arvados.org/arvados.git/lib/boot"
"git.arvados.org/arvados.git/lib/config"
- "git.arvados.org/arvados.git/lib/controller/rpc"
- "git.arvados.org/arvados.git/lib/service"
"git.arvados.org/arvados.git/sdk/go/arvados"
- "git.arvados.org/arvados.git/sdk/go/arvadosclient"
"git.arvados.org/arvados.git/sdk/go/arvadostest"
- "git.arvados.org/arvados.git/sdk/go/auth"
"git.arvados.org/arvados.git/sdk/go/ctxlog"
- "git.arvados.org/arvados.git/sdk/go/keepclient"
check "gopkg.in/check.v1"
)
var _ = check.Suite(&IntegrationSuite{})
-type testCluster struct {
- super boot.Supervisor
- config arvados.Config
- controllerURL *url.URL
-}
-
type IntegrationSuite struct {
- testClusters map[string]*testCluster
+ testClusters map[string]*boot.TestCluster
oidcprovider *arvadostest.OIDCProvider
}
s.oidcprovider.ValidClientID = "clientid"
s.oidcprovider.ValidClientSecret = "clientsecret"
- s.testClusters = map[string]*testCluster{
+ s.testClusters = map[string]*boot.TestCluster{
"z1111": nil,
"z2222": nil,
"z3333": nil,
ExternalURL: https://` + hostport[id] + `
TLS:
Insecure: true
- Login:
- LoginCluster: z1111
SystemLogs:
Format: text
RemoteClusters:
loader.SkipAPICalls = true
cfg, err := loader.Load()
c.Assert(err, check.IsNil)
- s.testClusters[id] = &testCluster{
- super: boot.Supervisor{
- SourcePath: filepath.Join(cwd, "..", ".."),
- ClusterType: "test",
- ListenHost: "127.0.0." + id[3:],
- ControllerAddr: ":0",
- OwnTemporaryDatabase: true,
- Stderr: &service.LogPrefixer{Writer: ctxlog.LogWriter(c.Log), Prefix: []byte("[" + id + "] ")},
- },
- config: *cfg,
- }
- s.testClusters[id].super.Start(context.Background(), &s.testClusters[id].config, "-")
+ tc := boot.NewTestCluster(
+ filepath.Join(cwd, "..", ".."),
+ id, cfg, "127.0.0."+id[3:], c.Log)
+ s.testClusters[id] = tc
+ s.testClusters[id].Start()
}
for _, tc := range s.testClusters {
- au, ok := tc.super.WaitReady()
+ ok := tc.WaitReady()
c.Assert(ok, check.Equals, true)
- u := url.URL(*au)
- tc.controllerURL = &u
}
}
func (s *IntegrationSuite) TearDownSuite(c *check.C) {
for _, c := range s.testClusters {
- c.super.Stop()
- }
-}
-
-// Get rpc connection struct initialized to communicate with the
-// specified cluster.
-func (s *IntegrationSuite) conn(clusterID string) *rpc.Conn {
- return rpc.NewConn(clusterID, s.testClusters[clusterID].controllerURL, true, rpc.PassthroughTokenProvider)
-}
-
-// Return Context, Arvados.Client and keepclient structs initialized
-// to connect to the specified cluster (by clusterID) using with the supplied
-// Arvados token.
-func (s *IntegrationSuite) clientsWithToken(clusterID string, token string) (context.Context, *arvados.Client, *keepclient.KeepClient) {
- cl := s.testClusters[clusterID].config.Clusters[clusterID]
- ctx := auth.NewContext(context.Background(), auth.NewCredentials(token))
- ac, err := arvados.NewClientFromConfig(&cl)
- if err != nil {
- panic(err)
- }
- ac.AuthToken = token
- arv, err := arvadosclient.New(ac)
- if err != nil {
- panic(err)
+ c.Super.Stop()
}
- kc := keepclient.New(arv)
- return ctx, ac, kc
-}
-
-// Log in as a user called "example", get the user's API token,
-// initialize clients with the API token, set up the user and
-// optionally activate the user. Return client structs for
-// communicating with the cluster on behalf of the 'example' user.
-func (s *IntegrationSuite) userClients(rootctx context.Context, c *check.C, conn *rpc.Conn, clusterID string, activate bool) (context.Context, *arvados.Client, *keepclient.KeepClient, arvados.User) {
- login, err := conn.UserSessionCreate(rootctx, rpc.UserSessionCreateOptions{
- ReturnTo: ",https://example.com",
- AuthInfo: rpc.UserSessionAuthInfo{
- Email: "user@example.com",
- FirstName: "Example",
- LastName: "User",
- Username: "example",
- },
- })
- c.Assert(err, check.IsNil)
- redirURL, err := url.Parse(login.RedirectLocation)
- c.Assert(err, check.IsNil)
- userToken := redirURL.Query().Get("api_token")
- c.Logf("user token: %q", userToken)
- ctx, ac, kc := s.clientsWithToken(clusterID, userToken)
- user, err := conn.UserGetCurrent(ctx, arvados.GetOptions{})
- c.Assert(err, check.IsNil)
- _, err = conn.UserSetup(rootctx, arvados.UserSetupOptions{UUID: user.UUID})
- c.Assert(err, check.IsNil)
- if activate {
- _, err = conn.UserActivate(rootctx, arvados.UserActivateOptions{UUID: user.UUID})
- c.Assert(err, check.IsNil)
- user, err = conn.UserGetCurrent(ctx, arvados.GetOptions{})
- c.Assert(err, check.IsNil)
- c.Logf("user UUID: %q", user.UUID)
- if !user.IsActive {
- c.Fatalf("failed to activate user -- %#v", user)
- }
- }
- return ctx, ac, kc, user
-}
-
-// Return Context, arvados.Client and keepclient structs initialized
-// to communicate with the cluster as the system root user.
-func (s *IntegrationSuite) rootClients(clusterID string) (context.Context, *arvados.Client, *keepclient.KeepClient) {
- return s.clientsWithToken(clusterID, s.testClusters[clusterID].config.Clusters[clusterID].SystemRootToken)
-}
-
-// Return Context, arvados.Client and keepclient structs initialized
-// to communicate with the cluster as the anonymous user.
-func (s *IntegrationSuite) anonymousClients(clusterID string) (context.Context, *arvados.Client, *keepclient.KeepClient) {
- return s.clientsWithToken(clusterID, s.testClusters[clusterID].config.Clusters[clusterID].Users.AnonymousUserToken)
}
func (s *IntegrationSuite) TestGetCollectionByPDH(c *check.C) {
- conn1 := s.conn("z1111")
- rootctx1, _, _ := s.rootClients("z1111")
- conn3 := s.conn("z3333")
- userctx1, ac1, kc1, _ := s.userClients(rootctx1, c, conn1, "z1111", true)
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ conn3 := s.testClusters["z3333"].Conn()
+ userctx1, ac1, kc1, _ := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
// Create the collection to find its PDH (but don't save it
// anywhere yet)
testText := "IntegrationSuite.TestS3WithFederatedToken"
- conn1 := s.conn("z1111")
- rootctx1, _, _ := s.rootClients("z1111")
- userctx1, ac1, _, _ := s.userClients(rootctx1, c, conn1, "z1111", true)
- conn3 := s.conn("z3333")
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ userctx1, ac1, _, _ := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
+ conn3 := s.testClusters["z3333"].Conn()
createColl := func(clusterID string) arvados.Collection {
- _, ac, kc := s.clientsWithToken(clusterID, ac1.AuthToken)
+ _, ac, kc := s.testClusters[clusterID].ClientsWithToken(ac1.AuthToken)
var coll arvados.Collection
fs, err := coll.FileSystem(ac, kc)
c.Assert(err, check.IsNil)
c.Assert(err, check.IsNil)
mtxt, err := fs.MarshalManifest(".")
c.Assert(err, check.IsNil)
- coll, err = s.conn(clusterID).CollectionCreate(userctx1, arvados.CreateOptions{Attrs: map[string]interface{}{
+ coll, err = s.testClusters[clusterID].Conn().CollectionCreate(userctx1, arvados.CreateOptions{Attrs: map[string]interface{}{
"manifest_text": mtxt,
}})
c.Assert(err, check.IsNil)
}
func (s *IntegrationSuite) TestGetCollectionAsAnonymous(c *check.C) {
- conn1 := s.conn("z1111")
- conn3 := s.conn("z3333")
- rootctx1, rootac1, rootkc1 := s.rootClients("z1111")
- anonctx3, anonac3, _ := s.anonymousClients("z3333")
+ conn1 := s.testClusters["z1111"].Conn()
+ conn3 := s.testClusters["z3333"].Conn()
+ rootctx1, rootac1, rootkc1 := s.testClusters["z1111"].RootClients()
+ anonctx3, anonac3, _ := s.testClusters["z3333"].AnonymousClients()
// Make sure anonymous token was set
c.Assert(anonac3.AuthToken, check.Not(check.Equals), "")
c.Check(err, check.IsNil)
// Make a v2 token of the z3 anonymous user, and use it on z1
- _, anonac1, _ := s.clientsWithToken("z1111", outAuth.TokenV2())
+ _, anonac1, _ := s.testClusters["z1111"].ClientsWithToken(outAuth.TokenV2())
outUser2, err := anonac1.CurrentUser()
c.Check(err, check.IsNil)
// z3 anonymous user will be mapped to the z1 anonymous user
// Get a token from the login cluster (z1111), use it to submit a
// container request on z2222.
func (s *IntegrationSuite) TestCreateContainerRequestWithFedToken(c *check.C) {
- conn1 := s.conn("z1111")
- rootctx1, _, _ := s.rootClients("z1111")
- _, ac1, _, _ := s.userClients(rootctx1, c, conn1, "z1111", true)
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ _, ac1, _, _ := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
// Use ac2 to get the discovery doc with a blank token, so the
// SDK doesn't magically pass the z1111 token to z2222 before
// we're ready to start our test.
- _, ac2, _ := s.clientsWithToken("z2222", "")
+ _, ac2, _ := s.testClusters["z2222"].ClientsWithToken("")
var dd map[string]interface{}
err := ac2.RequestAndDecode(&dd, "GET", "discovery/v1/apis/arvados/v1/rest", nil, nil)
c.Assert(err, check.IsNil)
}
func (s *IntegrationSuite) TestCreateContainerRequestWithBadToken(c *check.C) {
- conn1 := s.conn("z1111")
- rootctx1, _, _ := s.rootClients("z1111")
- _, ac1, _, au := s.userClients(rootctx1, c, conn1, "z1111", true)
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ _, ac1, _, au := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, "user@example.com", true)
tests := []struct {
name string
// to test tokens that are secret, so there is no API response that will give them back
func (s *IntegrationSuite) dbConn(c *check.C, clusterID string) (*sql.DB, *sql.Conn) {
ctx := context.Background()
- db, err := sql.Open("postgres", s.testClusters[clusterID].super.Cluster().PostgreSQL.Connection.String())
+ db, err := sql.Open("postgres", s.testClusters[clusterID].Super.Cluster().PostgreSQL.Connection.String())
c.Assert(err, check.IsNil)
conn, err := db.Conn(ctx)
db, dbconn := s.dbConn(c, "z1111")
defer db.Close()
defer dbconn.Close()
- conn1 := s.conn("z1111")
- rootctx1, _, _ := s.rootClients("z1111")
- userctx1, ac1, _, au := s.userClients(rootctx1, c, conn1, "z1111", true)
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ userctx1, ac1, _, au := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, "user@example.com", true)
tests := []struct {
name string
// one cluster with another cluster as the destination
// and check the tokens are being handled properly
func (s *IntegrationSuite) TestIntermediateCluster(c *check.C) {
- conn1 := s.conn("z1111")
- rootctx1, _, _ := s.rootClients("z1111")
- uctx1, ac1, _, _ := s.userClients(rootctx1, c, conn1, "z1111", true)
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ uctx1, ac1, _, _ := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, "user@example.com", true)
tests := []struct {
name string
// Test for bug #16263
func (s *IntegrationSuite) TestListUsers(c *check.C) {
- rootctx1, _, _ := s.rootClients("z1111")
- conn1 := s.conn("z1111")
- conn3 := s.conn("z3333")
- userctx1, _, _, _ := s.userClients(rootctx1, c, conn1, "z1111", true)
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ conn1 := s.testClusters["z1111"].Conn()
+ conn3 := s.testClusters["z3333"].Conn()
+ userctx1, _, _, _ := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
// Make sure LoginCluster is properly configured
for cls := range s.testClusters {
c.Check(
- s.testClusters[cls].config.Clusters[cls].Login.LoginCluster,
+ s.testClusters[cls].Config.Clusters[cls].Login.LoginCluster,
check.Equals, "z1111",
check.Commentf("incorrect LoginCluster config on cluster %q", cls))
}
}
func (s *IntegrationSuite) TestSetupUserWithVM(c *check.C) {
- conn1 := s.conn("z1111")
- conn3 := s.conn("z3333")
- rootctx1, rootac1, _ := s.rootClients("z1111")
+ conn1 := s.testClusters["z1111"].Conn()
+ conn3 := s.testClusters["z3333"].Conn()
+ rootctx1, rootac1, _ := s.testClusters["z1111"].RootClients()
// Create user on LoginCluster z1111
- _, _, _, user := s.userClients(rootctx1, c, conn1, "z1111", false)
+ _, _, _, user := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
// Make a new root token (because rootClients() uses SystemRootToken)
var outAuth arvados.APIClientAuthorization
c.Check(err, check.IsNil)
// Make a v2 root token to communicate with z3333
- rootctx3, rootac3, _ := s.clientsWithToken("z3333", outAuth.TokenV2())
+ rootctx3, rootac3, _ := s.testClusters["z3333"].ClientsWithToken(outAuth.TokenV2())
// Create VM on z3333
var outVM arvados.VirtualMachine
}
func (s *IntegrationSuite) TestOIDCAccessTokenAuth(c *check.C) {
- conn1 := s.conn("z1111")
- rootctx1, _, _ := s.rootClients("z1111")
- s.userClients(rootctx1, c, conn1, "z1111", true)
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
accesstoken := s.oidcprovider.ValidAccessToken()
- for _, clusterid := range []string{"z1111", "z2222"} {
- c.Logf("trying clusterid %s", clusterid)
+ for _, clusterID := range []string{"z1111", "z2222"} {
+ c.Logf("trying clusterid %s", clusterID)
- conn := s.conn(clusterid)
- ctx, ac, kc := s.clientsWithToken(clusterid, accesstoken)
+ conn := s.testClusters[clusterID].Conn()
+ ctx, ac, kc := s.testClusters[clusterID].ClientsWithToken(accesstoken)
var coll arvados.Collection
if srcobj["class"] == "Directory" and "listing" not in srcobj:
raise WorkflowException("Directory literal '%s' is missing `listing`" % src)
elif src.startswith("http:") or src.startswith("https:"):
- keepref = http_to_keep(self.arvrunner.api, self.arvrunner.project_uuid, src)
- logger.info("%s is %s", src, keepref)
- self._pathmap[src] = MapperEnt(keepref, keepref, srcobj["class"], True)
+ try:
+ keepref = http_to_keep(self.arvrunner.api, self.arvrunner.project_uuid, src)
+ logger.info("%s is %s", src, keepref)
+ self._pathmap[src] = MapperEnt(keepref, keepref, srcobj["class"], True)
+ except Exception as e:
+ logger.warning("Unable to download '%s': %s", src, e)
else:
self._pathmap[src] = MapperEnt(src, src, srcobj["class"], True)
if "$schemas" in workflowobj:
sch = CommentedSeq()
for s in workflowobj["$schemas"]:
- sch.append(mapper.mapper(s).resolved)
+ if s in mapper:
+ sch.append(mapper.mapper(s).resolved)
workflowobj["$schemas"] = sch
return mapper
--- /dev/null
+# Copyright (C) The Arvados Authors. All rights reserved.
+#
+# SPDX-License-Identifier: Apache-2.0
+
+cwlVersion: v1.0
+class: CommandLineTool
+$schemas:
+ - http://example.com/schema.xml
+inputs: []
+outputs:
+ out: stdout
+baseCommand: [echo, "foo"]
+stdout: foo.txt
}
tool: 16377-missing-default.cwl
doc: "Test issue 16377 - missing default fails even when it should be overridden by valid input"
+
+- job: hello.yml
+ output:
+ "out":
+ "checksum": "sha1$f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"
+ "class": "File"
+ "location": "foo.txt"
+ "size": 4
+ tool: 17267-broken-schemas.cwl
+ doc: "Test issue 17267 - inaccessible $schemas URL is not a fatal error"
begin
arv = Arvados.new({ :suppress_ssl_warnings => false })
+ logincluster_arv = Arvados.new({ :api_host => (ENV['LOGINCLUSTER_ARVADOS_API_HOST'] || ENV['ARVADOS_API_HOST']),
+ :api_token => (ENV['LOGINCLUSTER_ARVADOS_API_TOKEN'] || ENV['ARVADOS_API_TOKEN']),
+ :suppress_ssl_warnings => false })
vm_uuid = ENV['ARVADOS_VIRTUAL_MACHINE_UUID']
begin
if !File.exist?(tokenfile)
- user_token = arv.api_client_authorization.create(api_client_authorization: {owner_uuid: l[:user_uuid], api_client_id: 0})
+ user_token = logincluster_arv.api_client_authorization.create(api_client_authorization: {owner_uuid: l[:user_uuid], api_client_id: 0})
f = File.new(tokenfile, 'w')
f.write("ARVADOS_API_HOST=#{ENV['ARVADOS_API_HOST']}\n")
f.write("ARVADOS_API_TOKEN=v2/#{user_token[:uuid]}/#{user_token[:api_token]}\n")
--- /dev/null
+// Copyright (C) The Arvados Authors. All rights reserved.
+//
+// SPDX-License-Identifier: AGPL-3.0
+
+package main
+
+import (
+ "bytes"
+ "net"
+ "os"
+ "path/filepath"
+
+ "git.arvados.org/arvados.git/lib/boot"
+ "git.arvados.org/arvados.git/lib/config"
+ "git.arvados.org/arvados.git/sdk/go/arvados"
+ "git.arvados.org/arvados.git/sdk/go/arvadostest"
+ "git.arvados.org/arvados.git/sdk/go/ctxlog"
+ check "gopkg.in/check.v1"
+)
+
+var _ = check.Suite(&FederationSuite{})
+
+var origAPIHost, origAPIToken string
+
+type FederationSuite struct {
+ testClusters map[string]*boot.TestCluster
+ oidcprovider *arvadostest.OIDCProvider
+}
+
+func (s *FederationSuite) SetUpSuite(c *check.C) {
+ origAPIHost = os.Getenv("ARVADOS_API_HOST")
+ origAPIToken = os.Getenv("ARVADOS_API_TOKEN")
+
+ cwd, _ := os.Getwd()
+
+ s.oidcprovider = arvadostest.NewOIDCProvider(c)
+ s.oidcprovider.AuthEmail = "user@example.com"
+ s.oidcprovider.AuthEmailVerified = true
+ s.oidcprovider.AuthName = "Example User"
+ s.oidcprovider.ValidClientID = "clientid"
+ s.oidcprovider.ValidClientSecret = "clientsecret"
+
+ s.testClusters = map[string]*boot.TestCluster{
+ "z1111": nil,
+ "z2222": nil,
+ }
+ hostport := map[string]string{}
+ for id := range s.testClusters {
+ hostport[id] = func() string {
+ // TODO: Instead of expecting random ports on
+ // 127.0.0.11 and 127.0.0.22 to be race-safe,
+ // try different 127.x.y.z until finding one
+ // that isn't in use.
+ ln, err := net.Listen("tcp", ":0")
+ c.Assert(err, check.IsNil)
+ ln.Close()
+ _, port, err := net.SplitHostPort(ln.Addr().String())
+ c.Assert(err, check.IsNil)
+ return "127.0.0." + id[3:] + ":" + port
+ }()
+ }
+ for id := range s.testClusters {
+ yaml := `Clusters:
+ ` + id + `:
+ Services:
+ Controller:
+ ExternalURL: https://` + hostport[id] + `
+ TLS:
+ Insecure: true
+ SystemLogs:
+ Format: text
+ RemoteClusters:
+ z1111:
+ Host: ` + hostport["z1111"] + `
+ Scheme: https
+ Insecure: true
+ Proxy: true
+ ActivateUsers: true
+`
+ if id != "z2222" {
+ yaml += ` z2222:
+ Host: ` + hostport["z2222"] + `
+ Scheme: https
+ Insecure: true
+ Proxy: true
+ ActivateUsers: true
+`
+ }
+ if id == "z1111" {
+ yaml += `
+ Login:
+ LoginCluster: z1111
+ OpenIDConnect:
+ Enable: true
+ Issuer: ` + s.oidcprovider.Issuer.URL + `
+ ClientID: ` + s.oidcprovider.ValidClientID + `
+ ClientSecret: ` + s.oidcprovider.ValidClientSecret + `
+ EmailClaim: email
+ EmailVerifiedClaim: email_verified
+`
+ } else {
+ yaml += `
+ Login:
+ LoginCluster: z1111
+`
+ }
+
+ loader := config.NewLoader(bytes.NewBufferString(yaml), ctxlog.TestLogger(c))
+ loader.Path = "-"
+ loader.SkipLegacy = true
+ loader.SkipAPICalls = true
+ cfg, err := loader.Load()
+ c.Assert(err, check.IsNil)
+ tc := boot.NewTestCluster(
+ filepath.Join(cwd, "..", ".."),
+ id, cfg, "127.0.0."+id[3:], c.Log)
+ s.testClusters[id] = tc
+ s.testClusters[id].Start()
+ }
+ for _, tc := range s.testClusters {
+ ok := tc.WaitReady()
+ c.Assert(ok, check.Equals, true)
+ }
+
+ // Activate user, make it admin.
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ userctx1, _, _, _ := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
+ user1, err := conn1.UserGetCurrent(userctx1, arvados.GetOptions{})
+ c.Assert(err, check.IsNil)
+ c.Assert(user1.IsAdmin, check.Equals, false)
+ user1, err = conn1.UserUpdate(rootctx1, arvados.UpdateOptions{
+ UUID: user1.UUID,
+ Attrs: map[string]interface{}{
+ "is_admin": true,
+ },
+ })
+ c.Assert(err, check.IsNil)
+ c.Assert(user1.IsAdmin, check.Equals, true)
+}
+
+func (s *FederationSuite) TearDownSuite(c *check.C) {
+ for _, tc := range s.testClusters {
+ tc.Super.Stop()
+ }
+ _ = os.Setenv("ARVADOS_API_HOST", origAPIHost)
+ _ = os.Setenv("ARVADOS_API_TOKEN", origAPIToken)
+}
+
+func (s *FederationSuite) TestGroupSyncingOnFederatedCluster(c *check.C) {
+ // Get admin user's V2 token
+ conn1 := s.testClusters["z1111"].Conn()
+ rootctx1, _, _ := s.testClusters["z1111"].RootClients()
+ userctx1, _, _, _ := s.testClusters["z1111"].UserClients(rootctx1, c, conn1, s.oidcprovider.AuthEmail, true)
+ user1Auth, err := conn1.APIClientAuthorizationCurrent(userctx1, arvados.GetOptions{})
+ c.Check(err, check.IsNil)
+ userV2Token := user1Auth.TokenV2()
+
+ // Get federated admin clients on z2222 to set up environment
+ conn2 := s.testClusters["z2222"].Conn()
+ userctx2, userac2, _ := s.testClusters["z2222"].ClientsWithToken(userV2Token)
+ user2, err := conn2.UserGetCurrent(userctx2, arvados.GetOptions{})
+ c.Check(err, check.IsNil)
+ c.Check(user2.IsAdmin, check.Equals, true)
+
+ // Set up environment for sync-groups using admin user credentials on z2222
+ err = os.Setenv("ARVADOS_API_HOST", userac2.APIHost)
+ c.Assert(err, check.IsNil)
+ err = os.Setenv("ARVADOS_API_TOKEN", userac2.AuthToken)
+ c.Assert(err, check.IsNil)
+
+ // Check that no parent group is created
+ gl := arvados.GroupList{}
+ params := arvados.ResourceListParams{
+ Filters: []arvados.Filter{{
+ Attr: "owner_uuid",
+ Operator: "=",
+ Operand: s.testClusters["z2222"].ClusterID + "-tpzed-000000000000000",
+ }, {
+ Attr: "name",
+ Operator: "=",
+ Operand: "Externally synchronized groups",
+ }},
+ }
+ err = userac2.RequestAndDecode(&gl, "GET", "/arvados/v1/groups", nil, params)
+ c.Assert(err, check.IsNil)
+ c.Assert(gl.ItemsAvailable, check.Equals, 0)
+
+ // Set up config, confirm that the parent group was created
+ os.Args = []string{"cmd", "somefile.csv"}
+ config, err := GetConfig()
+ c.Assert(err, check.IsNil)
+ err = userac2.RequestAndDecode(&gl, "GET", "/arvados/v1/groups", nil, params)
+ c.Assert(err, check.IsNil)
+ c.Assert(gl.ItemsAvailable, check.Equals, 1)
+
+ // Run the tool with custom config
+ data := [][]string{
+ {"TestGroup1", user2.Email},
+ }
+ tmpfile, err := MakeTempCSVFile(data)
+ c.Assert(err, check.IsNil)
+ defer os.Remove(tmpfile.Name()) // clean up
+ config.Path = tmpfile.Name()
+ err = doMain(&config)
+ c.Assert(err, check.IsNil)
+ // Check the group was created correctly, and has the user as a member
+ groupUUID, err := RemoteGroupExists(&config, "TestGroup1")
+ c.Assert(err, check.IsNil)
+ c.Assert(groupUUID, check.Not(check.Equals), "")
+ c.Assert(GroupMembershipExists(config.Client, user2.UUID, groupUUID, "can_write"), check.Equals, true)
+}