From: Javier Bértoli Date: Thu, 24 Mar 2022 13:27:05 +0000 (-0300) Subject: Merge branch '18633-install-sudo-passwordless-in-shell' X-Git-Tag: 2.4.0~27 X-Git-Url: https://git.arvados.org/arvados.git/commitdiff_plain/970254fba5b9297884c521987d081c232004eb77?hp=e72de00afc11b7681555f30b1dba3433125b84e2 Merge branch '18633-install-sudo-passwordless-in-shell' closes #18633 Arvados-DCO-1.1-Signed-off-by: Javier Bértoli --- diff --git a/doc/_includes/_container_scheduling_parameters.liquid b/doc/_includes/_container_scheduling_parameters.liquid index be046173ad..636b6df59c 100644 --- a/doc/_includes/_container_scheduling_parameters.liquid +++ b/doc/_includes/_container_scheduling_parameters.liquid @@ -11,5 +11,5 @@ Parameters to be passed to the container scheduler (e.g., Slurm) when running a table(table table-bordered table-condensed). |_. Key|_. Type|_. Description|_. Notes| |partitions|array of strings|The names of one or more compute partitions that may run this container. If not provided, the system will choose where to run the container.|Optional.| -|preemptible|boolean|If true, the dispatcher will ask for a preemptible cloud node instance (eg: AWS Spot Instance) to run this container.|Optional. Default is false.| +|preemptible|boolean|If true, the dispatcher should use a preemptible cloud node instance (eg: AWS Spot Instance) to run this container. Whether a preemptible instance is actually used "depends on cluster configuration.":{{site.baseurl}}/admin/spot-instances.html|Optional. Default is false.| |max_run_time|integer|Maximum running time (in seconds) that this container will be allowed to run before being cancelled.|Optional. 
Default is 0 (no limit).| diff --git a/doc/admin/spot-instances.html.textile.liquid b/doc/admin/spot-instances.html.textile.liquid index 7ca57df0ab..3837f30d6d 100644 --- a/doc/admin/spot-instances.html.textile.liquid +++ b/doc/admin/spot-instances.html.textile.liquid @@ -16,13 +16,11 @@ Currently Arvados supports preemptible instances using AWS and Azure spot instan h2. Configuration -First, ensure automatic selection of preemptible instances is not disabled in your configuration file (this is enabled by default, but can be disabled with @AlwaysUsePreemptibleInstances: false@), and add entries to @InstanceTypes@ that have @Preemptible: true@. Typically you want to add both preemptible and non-preemptible entries for each cloud provider VM type. The @Price@ for preemptible instances is the maximum bid price, the actual price paid is dynamic and will likely be lower. For example: +Add entries to @InstanceTypes@ that have @Preemptible: true@. Typically you want to add both preemptible and non-preemptible entries for each cloud provider VM type. The @Price@ for preemptible instances is the maximum bid price, the actual price paid is dynamic and will likely be lower. For example:
 Clusters:
-  ClusterID: 
-    Containers:
-      AlwaysUsePreemptibleInstances: true
+  ClusterID:
     InstanceTypes:
       m4.large:
         Preemptible: false
@@ -40,7 +38,18 @@ Clusters:
         Price: 0.1
 
-When @AlwaysUsePreemptibleInstances@ is enabled, child containers (workflow steps) will automatically be made preemptible. Note that because preempting the workflow runner would cancel the entire workflow, the workflow runner runs in a reserved (non-preemptible) instance. +Next, you can choose to enable automatic use of preemptible instances: + +
+    Containers:
+      AlwaysUsePreemptibleInstances: true
+
+ +If @AlwaysUsePreemptibleInstances@ is "true", child containers (workflow steps) will always select preemptible instances, regardless of user option. + +If @AlwaysUsePreemptibleInstances@ is "false" (the default) or unspecified, preemptible instances are "used when requested by the user.":{{site.baseurl}}/user/cwl/cwl-run-options.html#preemptible + +Note that regardless of the value of @AlwaysUsePreemptibleInstances@, the top level workflow runner container always runs in a reserved (non-preemptible) instance, to avoid situations where the workflow runner is killed requiring the entire workflow to be restarted. No additional configuration is required, "arvados-dispatch-cloud":{{site.baseurl}}/install/crunch2-cloud/install-dispatch-cloud.html will now start preemptible instances where appropriate. diff --git a/doc/api/methods/collections.html.textile.liquid b/doc/api/methods/collections.html.textile.liquid index 5ff8d529f8..a2a6a77e19 100644 --- a/doc/api/methods/collections.html.textile.liquid +++ b/doc/api/methods/collections.html.textile.liquid @@ -109,6 +109,36 @@ table(table table-bordered table-condensed). Note: Because adding access tokens to manifests can be computationally expensive, the @manifest_text@ field is not included in results by default. If you need it, pass a @select@ parameter that includes @manifest_text@. +h4. Searching Collections for names of files or directories + +You can search collections for specific file or directory names (whole or part) using the following filter in a @list@ query. + +
+filters: [["file_names", "ilike", "%sample1234.fastq%"]]
+
+ +Note: @file_names@ is a hidden field used for indexing. It is not returned by any API call. On the client, you can programmatically enumerate all the files in a collection using @arv-ls@, the Python SDK @Collection@ class, Go SDK @FileSystem@ struct, the WebDAV API, or the S3-compatible API. + +As of this writing (Arvados 2.4), you can also search for directory paths, but _not_ complete file paths. + +In other words, this will work (when @dir3@ is a directory): + +
+filters: [["file_names", "ilike", "%dir1/dir2/dir3%"]]
+
+ +However, this will _not_ return the desired results (where @sample1234.fastq@ is a file): + +
+filters: [["file_names", "ilike", "%dir1/dir2/dir3/sample1234.fastq%"]]
+
+ +As a workaround, you can search for both the directory path and file name separately, and then filter on the client side. + +
+filters: [["file_names", "ilike", "%dir1/dir2/dir3%"], ["file_names", "ilike", "%sample1234.fastq%"]]
+
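The client-side half of this workaround could look roughly like the sketch below. It is an illustration only, not part of any Arvados SDK: @has_full_path@ and the sample listings are hypothetical, and it assumes you have already enumerated the file paths of each collection returned by the two-filter query (for example with one of the tools mentioned above). Note that it matches whole path components, which is stricter than the substring match done by @ilike@.

```python
from typing import Iterable


def has_full_path(candidate_paths: Iterable[str], wanted: str) -> bool:
    """Return True if any path contains `wanted` as a run of whole path components.

    Hypothetical helper for illustration; it is not an Arvados API.
    """
    wanted_parts = [p for p in wanted.split("/") if p]
    if not wanted_parts:
        return False
    n = len(wanted_parts)
    for path in candidate_paths:
        parts = [p for p in path.split("/") if p]
        # Slide a window of n components over the path and compare.
        for i in range(len(parts) - n + 1):
            if parts[i:i + n] == wanted_parts:
                return True
    return False


# Made-up file listings for two collections that would both pass the
# server-side filters (directory path and file name are checked separately).
collection_a = ["dir1/dir2/dir3/sample1234.fastq", "dir1/readme.txt"]
collection_b = ["dir1/dir2/dir3/other.fastq", "elsewhere/sample1234.fastq"]

wanted = "dir1/dir2/dir3/sample1234.fastq"
print(has_full_path(collection_a, wanted))
print(has_full_path(collection_b, wanted))
```

In this sketch only the first listing is kept, because the second contains the directory and the file name but never the two together as one path.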
+ h3. update Update attributes of an existing Collection. diff --git a/doc/api/methods/groups.html.textile.liquid b/doc/api/methods/groups.html.textile.liquid index e4b8594dd1..2a762d9248 100644 --- a/doc/api/methods/groups.html.textile.liquid +++ b/doc/api/methods/groups.html.textile.liquid @@ -34,6 +34,17 @@ table(table table-bordered table-condensed). |trash_at|datetime|If @trash_at@ is non-null and in the past, this group and all objects directly or indirectly owned by the group will be hidden from API calls. May be untrashed.|| |delete_at|datetime|If @delete_at@ is non-null and in the past, the group and all objects directly or indirectly owned by the group may be permanently deleted.|| |is_trashed|datetime|True if @trash_at@ is in the past, false if not.|| +|frozen_by_uuid|string|For a frozen project, indicates the user who froze the project; null in all other cases. When a project is frozen, no further changes can be made to the project or its contents, even by admins. Attempting to add new items or modify, rename, move, trash, or delete the project or its contents, including any subprojects, will return an error.|| + +h3. Frozen projects + +A user with @manage@ permission can set the @frozen_by_uuid@ attribute of a @project@ group to their own user UUID. Once this is done, no further changes can be made to the project or its contents, including subprojects. + +The @frozen_by_uuid@ attribute can be cleared by an admin user. It can also be cleared by a user with @manage@ permission, unless the @API.UnfreezeProjectRequiresAdmin@ configuration setting is active. + +The optional @API.FreezeProjectRequiresDescription@ and @API.FreezeProjectRequiresProperties@ configuration settings can be used to prevent users from freezing projects that have empty @description@ and/or specified @properties@ entries. + +h3. Filter groups @filter@ groups are virtual groups; they can not own other objects. 
Filter groups have a special @properties@ field named @filters@, which must be an array of filter conditions. See "list method filters":{{site.baseurl}}/api/methods.html#filters for details on the syntax of valid filters, but keep in mind that the attributes must include the object type (@collections@, @container_requests@, @groups@, @workflows@), separated with a dot from the field to be filtered on. diff --git a/doc/install/install-postgresql.html.textile.liquid b/doc/install/install-postgresql.html.textile.liquid index 1413890cde..a9614b9be5 100644 --- a/doc/install/install-postgresql.html.textile.liquid +++ b/doc/install/install-postgresql.html.textile.liquid @@ -9,7 +9,7 @@ Copyright (C) The Arvados Authors. All rights reserved. SPDX-License-Identifier: CC-BY-SA-3.0 {% endcomment %} -Arvados requires at least version *9.4* of PostgreSQL. +Arvados requires at least version *9.4* of PostgreSQL. We recommend using version 10 or newer. * "AWS":#aws * "CentOS 7":#centos7 diff --git a/doc/user/cwl/cwl-extensions.html.textile.liquid b/doc/user/cwl/cwl-extensions.html.textile.liquid index dd78e989fd..d6148d7eee 100644 --- a/doc/user/cwl/cwl-extensions.html.textile.liquid +++ b/doc/user/cwl/cwl-extensions.html.textile.liquid @@ -63,6 +63,9 @@ hints: cudaComputeCapabilityMin: "9.0" deviceCountMin: 1 deviceCountMax: 1 + + arv:UsePreemptible: + usePreemptible: true {% endcodeblock %} h2(#RunInSingleContainer). arv:RunInSingleContainer @@ -164,6 +167,14 @@ table(table table-bordered table-condensed). |deviceCountMin|integer|Minimum number of GPU devices to allocate on a single node. Required.| |deviceCountMax|integer|Maximum number of GPU devices to allocate on a single node. Optional. If not specified, same as @minDeviceCount@.| +h2(#UsePreemptible). arv:UsePreemptible + +Specify whether a workflow step should request preemptible (e.g. AWS Spot market) instances. 
Such instances are generally cheaper, but can be taken back by the cloud provider at any time (preempted) causing the step to fail. When this happens, Arvados will automatically re-try the step, up to the configuration value of @Containers.MaxRetryAttempts@ (default 3) times. + +table(table table-bordered table-condensed). +|_. Field |_. Type |_. Description | +|usePreemptible|boolean|Required, true to opt-in to using preemptible instances, false to opt-out.| + h2. arv:dockerCollectionPDH This is an optional extension field appearing on the standard @DockerRequirement@. It specifies the portable data hash of the Arvados collection containing the Docker image. If present, it takes precedence over @dockerPull@ or @dockerImageId@. diff --git a/doc/user/cwl/cwl-run-options.html.textile.liquid b/doc/user/cwl/cwl-run-options.html.textile.liquid index d331dad871..94e46ae1bc 100644 --- a/doc/user/cwl/cwl-run-options.html.textile.liquid +++ b/doc/user/cwl/cwl-run-options.html.textile.liquid @@ -63,6 +63,9 @@ table(table table-bordered table-condensed). |==--priority== PRIORITY|Workflow priority (range 1..1000, higher has precedence over lower)| |==--thread-count== THREAD_COUNT|Number of threads to use for container submit and output collection.| |==--http-timeout== HTTP_TIMEOUT|API request timeout in seconds. Default is 300 seconds (5 minutes).| +|==--enable-preemptible==|Use preemptible instances. Control individual steps with "arv:UsePreemptible":cwl-extensions.html#UsePreemptible hint.| +|==--disable-preemptible==|Don't use preemptible instances.| +|==--skip-schemas==|Skip loading of extension schemas (the $schemas section).| |==--trash-intermediate==|Immediately trash intermediate outputs on workflow success.| |==--no-trash-intermediate==|Do not trash intermediate outputs (default).| @@ -143,3 +146,19 @@ Using @--intermediate-output-ttl@ without @--trash-intermediate@ means that inte h3(#federation). 
Run workflow on a remote federated cluster By default, the workflow runner will run on the local (home) cluster. Using @--submit-runner-cluster@ you can specify that the runner should be submitted to a remote federated cluster. When doing this, @--project-uuid@ should specify a project on that cluster. Steps making up the workflow will be submitted to the remote federated cluster by default, but the behavior of @arv:ClusterTarget@ is unchanged. Note: when using this option, any resources that need to be uploaded in order to run the workflow (such as files or Docker images) will be uploaded to the local (home) cluster, and streamed to the federated cluster on demand. + +h3(#preemptible). Using preemptible (spot) instances + +Preemptible instances typically offer lower cost computation with a tradeoff of lower service guarantees. If a compute node is preempted, Arvados will restart the computation on a new instance. + +If the sitewide configuration @Containers.AlwaysUsePreemptibleInstances@ is true, workflow steps will always select preemptible instances, regardless of user option. + +If @Containers.AlwaysUsePreemptibleInstances@ is false, you can request preemptible instances for a specific run with the @arvados-cwl-runner --enable-preemptible@ option. + +Within the workflow, you can control whether individual steps should be preemptible with the "arv:UsePreemptible":cwl-extensions.html#UsePreemptible hint. + +If a workflow requests preemptible instances with "arv:UsePreemptible":cwl-extensions.html#UsePreemptible , but you _do not_ want to use preemptible instances, you can override it for a specific run with the @arvados-cwl-runner --disable-preemptible@ option. + +h3(#gpu). Use CUDA GPU instances + +See "cwltool:CUDARequirement":cwl-extensions.html#CUDARequirement . 
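The precedence between the command-line options and the @arv:UsePreemptible@ hint described in the preemptible section above can be summarized with a small sketch. It follows the order of checks this commit adds to @arvcontainer.py@, but the function name and parameters are illustrative assumptions, and it does not model the sitewide @Containers.AlwaysUsePreemptibleInstances@ setting, which is applied by the cluster rather than by the client.

```python
from typing import Optional


def resolve_preemptible(cli_flag: Optional[bool],
                        step_hint: Optional[bool]) -> Optional[bool]:
    """Decide the value a step would request for scheduling_parameters["preemptible"].

    cli_flag:  True for --enable-preemptible, False for --disable-preemptible,
               None if neither option was given.
    step_hint: the usePreemptible value of an arv:UsePreemptible hint on the
               step, or None if the step has no such hint.
    Returns None when the parameter would be left unset.
    (Illustrative helper, not the actual arvados-cwl-runner function.)
    """
    if cli_flag is False:
        # --disable-preemptible overrides any hint in the workflow.
        return False
    if step_hint is not None:
        # A per-step hint takes precedence over --enable-preemptible.
        return step_hint
    if cli_flag is True:
        return True
    # Nothing requested: leave the decision to the cluster configuration.
    return None


for cli, hint in [(False, True), (True, False), (True, None), (None, True), (None, None)]:
    print(cli, hint, resolve_preemptible(cli, hint))
```

The ordering means @--disable-preemptible@ is a hard override, while @--enable-preemptible@ acts as a default that individual steps can still opt out of.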
diff --git a/doc/user/cwl/cwl-style.html.textile.liquid b/doc/user/cwl/cwl-style.html.textile.liquid index 853ed3b3e2..303ae37e9e 100644 --- a/doc/user/cwl/cwl-style.html.textile.liquid +++ b/doc/user/cwl/cwl-style.html.textile.liquid @@ -13,7 +13,15 @@ h2(#performance). Performance To get the best perfomance from your workflows, be aware of the following Arvados features, behaviors, and best practices. -Does your application support NVIDIA GPU acceleration? Use "cwltool:CUDARequirement":cwl-extensions.html#CUDARequirement to request nodes with GPUs. +h3. Does your application support NVIDIA GPU acceleration? + +Use "cwltool:CUDARequirement":cwl-extensions.html#CUDARequirement to request nodes with GPUs. + +h3. Trying to reduce costs? + +Try "using preemptible (spot) instances":cwl-run-options.html#preemptible . + +h3. You have a sequence of short-running steps If you have a sequence of short-running steps (less than 1-2 minutes each), use the Arvados extension "arv:RunInSingleContainer":cwl-extensions.html#RunInSingleContainer to avoid scheduling and data transfer overhead by running all the steps together in the same container on the same node. To use this feature, @cwltool@ must be installed in the container image. Example: @@ -42,10 +50,16 @@ steps: run: subworkflow-with-short-steps.cwl {% endcodeblock %} +h3. Avoid declaring @InlineJavascriptRequirement@ or @ShellCommandRequirement@ + Avoid declaring @InlineJavascriptRequirement@ or @ShellCommandRequirement@ unless you specifically need them. Don't include them "just in case" because they change the default behavior and may add extra overhead. +h3. Prefer text substitution to Javascript + When combining a parameter value with a string, such as adding a filename extension, write @$(inputs.file.basename).ext@ instead of @$(inputs.file.basename + 'ext')@. 
The first form is evaluated as a simple text substitution, the second form (using the @+@ operator) is evaluated as an arbitrary Javascript expression and requires that you declare @InlineJavascriptRequirement@. +h3. Use @ExpressionTool@ to efficiently rearrange input files + Use @ExpressionTool@ to efficiently rearrange input files between steps of a Workflow. For example, the following expression accepts a directory containing files paired by @_R1_@ and @_R2_@ and produces an array of Directories containing each pair. {% codeblock as yaml %} @@ -80,9 +94,13 @@ expression: | } {% endcodeblock %} -Available compute nodes types vary over time and across different cloud providers, so try to limit the RAM requirement to what the program actually needs. However, if you need to target a specific compute node type, see this discussion on "calculating RAM request and choosing instance type for containers.":{{site.baseurl}}/api/execution.html#RAM +h3. Limit RAM requests to what you really need + +Available compute nodes types vary over time and across different cloud providers, so it is important to limit the RAM requirement to what the program actually needs. However, if you need to target a specific compute node type, see this discussion on "calculating RAM request and choosing instance type for containers.":{{site.baseurl}}/api/execution.html#RAM -Instead of scattering separate steps, prefer to scatter over a subworkflow. +h3. Avoid scattering by step by step + +Instead of a scatter step that feeds into another scatter step, prefer to scatter over a subworkflow. With the following pattern, @step1@ has to wait for all samples to complete before @step2@ can start computing on any samples. This means a single long-running sample can prevent the rest of the workflow from moving on: @@ -148,10 +166,16 @@ h2. 
Portability To write workflows that are easy to modify and portable across CWL runners (in the event you need to share your workflow with others), there are several best practices to follow: +h3. Always provide @DockerRequirement@ + Workflows should always provide @DockerRequirement@ in the @hints@ or @requirements@ section. +h3. Build a reusable library of components + Build a reusable library of components. Share tool wrappers and subworkflows between projects. Make use of and contribute to "community maintained workflows and tools":https://github.com/common-workflow-library and tool registries such as "Dockstore":http://dockstore.org . +h3. Supply scripts as input parameters + CommandLineTools wrapping custom scripts should represent the script as an input parameter with the script file as a default value. Use @secondaryFiles@ for scripts that consist of multiple files. For example: {% codeblock as yaml %} @@ -180,22 +204,21 @@ outputs: glob: "*.fastq" {% endcodeblock %} +h3. Getting the temporary and output directories + You can get the designated temporary directory using @$(runtime.tmpdir)@ in your CWL file, or from the @$TMPDIR@ environment variable in your script. Similarly, you can get the designated output directory using $(runtime.outdir), or from the @HOME@ environment variable in your script. -Avoid specifying resource requirements in CommandLineTool. Prefer to specify them in the workflow. You can provide a default resource requirement in the top level @hints@ section, and individual steps can override it with their own resource requirement. +h3. Specifying @ResourceRequirement@ + +Avoid specifying resources in the @requirements@ section of a @CommandLineTool@, put it in the @hints@ section instead. 
This enables you to override the tool resource hint with a workflow step level requirement: {% codeblock as yaml %} cwlVersion: v1.0 class: Workflow inputs: inp: File -hints: - ResourceRequirement: - ramMin: 1000 - coresMin: 1 - tmpdirMin: 45000 steps: step1: in: {inp: inp} @@ -205,7 +228,7 @@ steps: in: {inp: step1/inp} out: [out] run: tool2.cwl - hints: + requirements: ResourceRequirement: ramMin: 2000 coresMin: 2 diff --git a/lib/config/config.default.yml b/lib/config/config.default.yml index 9800be7047..22e2c58b78 100644 --- a/lib/config/config.default.yml +++ b/lib/config/config.default.yml @@ -240,6 +240,18 @@ Clusters: # https://doc.arvados.org/admin/metadata-vocabulary.html VocabularyPath: "" + # If true, a project must have a non-empty description field in + # order to be frozen. + FreezeProjectRequiresDescription: false + + # Project properties that must have non-empty values in order to + # freeze a project. Example: {"property_name": true} + FreezeProjectRequiresProperties: {} + + # If true, only an admin user can un-freeze a project. If false, + # any user with "manage" permission can un-freeze. + UnfreezeProjectRequiresAdmin: false + Users: # Config parameters to automatically setup new users. If enabled, # this users will be able to self-activate. Enable this if you want @@ -903,14 +915,9 @@ Clusters: # If false, containers are scheduled on preemptible instances # only when requested by the submitter. # - # Note that arvados-cwl-runner does not currently offer a - # feature to request preemptible instances, so this value - # effectively acts as a cluster-wide decision about whether to - # use preemptible instances. - # # This flag is ignored if no preemptible instance types are # configured, and has no effect on top-level containers. - AlwaysUsePreemptibleInstances: true + AlwaysUsePreemptibleInstances: false # PEM encoded SSH key (RSA, DSA, or ECDSA) used by the # cloud dispatcher for executing containers on worker VMs. 
diff --git a/lib/config/export.go b/lib/config/export.go index 4e903a8b3d..db413b97bd 100644 --- a/lib/config/export.go +++ b/lib/config/export.go @@ -62,6 +62,8 @@ var whitelist = map[string]bool{ "API": true, "API.AsyncPermissionsUpdateInterval": false, "API.DisabledAPIs": false, + "API.FreezeProjectRequiresDescription": true, + "API.FreezeProjectRequiresProperties": true, "API.KeepServiceRequestTimeout": false, "API.MaxConcurrentRequests": false, "API.MaxIndexDatabaseRead": false, @@ -72,6 +74,7 @@ var whitelist = map[string]bool{ "API.MaxTokenLifetime": false, "API.RequestTimeout": true, "API.SendTimeout": true, + "API.UnfreezeProjectRequiresAdmin": true, "API.VocabularyPath": false, "API.WebsocketClientEventQueue": false, "API.WebsocketServerEventQueue": false, diff --git a/lib/controller/handler_test.go b/lib/controller/handler_test.go index 723e1011f9..817cff7960 100644 --- a/lib/controller/handler_test.go +++ b/lib/controller/handler_test.go @@ -367,16 +367,14 @@ func (s *HandlerSuite) CheckObjectType(c *check.C, url string, token string, ski for k := range direct { if _, ok := skippedFields[k]; ok { continue - } else if val, ok := proxied[k]; ok { - if direct["kind"] == "arvados#collection" && k == "manifest_text" { - // Tokens differ from request to request - c.Check(strings.Split(val.(string), "+A")[0], check.Equals, strings.Split(direct[k].(string), "+A")[0]) - } else { - c.Check(val, check.DeepEquals, direct[k], - check.Commentf("RailsAPI %s key %q's value %q differs from controller's %q.", direct["kind"], k, direct[k], val)) - } - } else { + } else if val, ok := proxied[k]; !ok { c.Errorf("%s's key %q missing on controller's response.", direct["kind"], k) + } else if direct["kind"] == "arvados#collection" && k == "manifest_text" { + // Tokens differ from request to request + c.Check(strings.Split(val.(string), "+A")[0], check.Equals, strings.Split(direct[k].(string), "+A")[0]) + } else { + c.Check(val, check.DeepEquals, direct[k], + 
check.Commentf("RailsAPI %s key %q's value %q differs from controller's %q.", direct["kind"], k, direct[k], val)) } } } diff --git a/lib/controller/router/response.go b/lib/controller/router/response.go index c0c599be8b..42b3435593 100644 --- a/lib/controller/router/response.go +++ b/lib/controller/router/response.go @@ -208,7 +208,7 @@ func (rtr *router) mungeItemFields(tmp map[string]interface{}) { // they appear in responses as null, rather than a // zero value. switch k { - case "output_uuid", "output_name", "log_uuid", "description", "requesting_container_uuid", "container_uuid": + case "output_uuid", "output_name", "log_uuid", "description", "requesting_container_uuid", "container_uuid", "modified_by_client_uuid", "frozen_by_uuid": if v == "" { tmp[k] = nil } diff --git a/lib/crunchrun/crunchrun.go b/lib/crunchrun/crunchrun.go index 4fa3f26ab5..65f43e9644 100644 --- a/lib/crunchrun/crunchrun.go +++ b/lib/crunchrun/crunchrun.go @@ -19,6 +19,7 @@ import ( "os" "os/exec" "os/signal" + "os/user" "path" "path/filepath" "regexp" @@ -1475,6 +1476,7 @@ func (runner *ContainerRunner) NewArvLogWriter(name string) (io.WriteCloser, err // Run the full container lifecycle. 
func (runner *ContainerRunner) Run() (err error) { runner.CrunchLog.Printf("crunch-run %s started", cmd.Version.String()) + runner.CrunchLog.Printf("%s", currentUserAndGroups()) runner.CrunchLog.Printf("Executing container '%s' using %s runtime", runner.Container.UUID, runner.executor.Runtime()) hostname, hosterr := os.Hostname() @@ -2045,3 +2047,30 @@ func startLocalKeepstore(configData ConfigData, logbuf io.Writer) (*exec.Cmd, er os.Setenv("ARVADOS_KEEP_SERVICES", url) return cmd, nil } + +// return current uid, gid, groups in a format suitable for logging: +// "crunch-run process has uid=1234(arvados) gid=1234(arvados) +// groups=1234(arvados),114(fuse)" +func currentUserAndGroups() string { + u, err := user.Current() + if err != nil { + return fmt.Sprintf("error getting current user ID: %s", err) + } + s := fmt.Sprintf("crunch-run process has uid=%s(%s) gid=%s", u.Uid, u.Username, u.Gid) + if g, err := user.LookupGroupId(u.Gid); err == nil { + s += fmt.Sprintf("(%s)", g.Name) + } + s += " groups=" + if gids, err := u.GroupIds(); err == nil { + for i, gid := range gids { + if i > 0 { + s += "," + } + s += gid + if g, err := user.LookupGroupId(gid); err == nil { + s += fmt.Sprintf("(%s)", g.Name) + } + } + } + return s +} diff --git a/lib/crunchrun/crunchrun_test.go b/lib/crunchrun/crunchrun_test.go index 26f78d2bf7..62df0032b4 100644 --- a/lib/crunchrun/crunchrun_test.go +++ b/lib/crunchrun/crunchrun_test.go @@ -885,6 +885,7 @@ func (s *TestSuite) TestLogVersionAndRuntime(c *C) { c.Assert(s.api.Logs["crunch-run"], NotNil) c.Check(s.api.Logs["crunch-run"].String(), Matches, `(?ms).*crunch-run \S+ \(go\S+\) start.*`) + c.Check(s.api.Logs["crunch-run"].String(), Matches, `(?ms).*crunch-run process has uid=\d+\(.+\) gid=\d+\(.+\) groups=\d+\(.+\)(,\d+\(.+\))*\n.*`) c.Check(s.api.Logs["crunch-run"].String(), Matches, `(?ms).*Executing container 'zzzzz-zzzzz-zzzzzzzzzzzzzzz' using stub runtime.*`) } diff --git a/sdk/cwl/arvados_cwl/__init__.py 
b/sdk/cwl/arvados_cwl/__init__.py index 826467cc09..c73b358ecc 100644 --- a/sdk/cwl/arvados_cwl/__init__.py +++ b/sdk/cwl/arvados_cwl/__init__.py @@ -213,6 +213,10 @@ def arg_parser(): # type: () -> argparse.ArgumentParser parser.add_argument("--http-timeout", type=int, default=5*60, dest="http_timeout", help="API request timeout in seconds. Default is 300 seconds (5 minutes).") + exgroup = parser.add_mutually_exclusive_group() + exgroup.add_argument("--enable-preemptible", dest="enable_preemptible", default=None, action="store_true", help="Use preemptible instances. Control individual steps with arv:UsePreemptible hint.") + exgroup.add_argument("--disable-preemptible", dest="enable_preemptible", default=None, action="store_false", help="Don't use preemptible instances.") + parser.add_argument( "--skip-schemas", action="store_true", @@ -255,7 +259,8 @@ def add_arv_hints(): "http://arvados.org/cwl#ClusterTarget", "http://arvados.org/cwl#OutputStorageClass", "http://arvados.org/cwl#ProcessProperties", - "http://commonwl.org/cwltool#CUDARequirement" + "http://commonwl.org/cwltool#CUDARequirement", + "http://arvados.org/cwl#UsePreemptible", ]) def exit_signal_handler(sigcode, frame): diff --git a/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.0.yml b/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.0.yml index 6e2d4f1d92..af75481431 100644 --- a/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.0.yml +++ b/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.0.yml @@ -385,3 +385,18 @@ $graph: doc: | Maximum number of GPU devices to request. If not specified, same as `cudaDeviceCountMin`. + +- name: UsePreemptible + type: record + extends: cwl:ProcessRequirement + inVocab: false + doc: | + Specify a workflow step should opt-in or opt-out of using preemptible (spot) instances. 
+ fields: + class: + type: string + doc: "Always 'arv:UsePreemptible" + jsonldPredicate: + _id: "@type" + _type: "@vocab" + usePreemptible: boolean diff --git a/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.1.yml b/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.1.yml index 0e81347d72..0ae451ccaa 100644 --- a/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.1.yml +++ b/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.1.yml @@ -328,3 +328,18 @@ $graph: doc: | Maximum number of GPU devices to request. If not specified, same as `cudaDeviceCountMin`. + +- name: UsePreemptible + type: record + extends: cwl:ProcessRequirement + inVocab: false + doc: | + Specify a workflow step should opt-in or opt-out of using preemptible (spot) instances. + fields: + class: + type: string + doc: "Always 'arv:UsePreemptible" + jsonldPredicate: + _id: "@type" + _type: "@vocab" + usePreemptible: boolean diff --git a/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.2.yml b/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.2.yml index e9f70bf1cf..de5e55ca01 100644 --- a/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.2.yml +++ b/sdk/cwl/arvados_cwl/arv-cwl-schema-v1.2.yml @@ -330,3 +330,18 @@ $graph: doc: | Maximum number of GPU devices to request. If not specified, same as `cudaDeviceCountMin`. + +- name: UsePreemptible + type: record + extends: cwl:ProcessRequirement + inVocab: false + doc: | + Specify a workflow step should opt-in or opt-out of using preemptible (spot) instances. 
+ fields: + class: + type: string + doc: "Always 'arv:UsePreemptible" + jsonldPredicate: + _id: "@type" + _type: "@vocab" + usePreemptible: boolean diff --git a/sdk/cwl/arvados_cwl/arvcontainer.py b/sdk/cwl/arvados_cwl/arvcontainer.py index 2a5ff3a13a..8c468dd22d 100644 --- a/sdk/cwl/arvados_cwl/arvcontainer.py +++ b/sdk/cwl/arvados_cwl/arvcontainer.py @@ -300,6 +300,17 @@ class ArvadosContainer(JobBase): "hardware_capability": aslist(cuda_req["cudaComputeCapability"])[0] } + if runtimeContext.enable_preemptible is False: + scheduling_parameters["preemptible"] = False + else: + preemptible_req, _ = self.get_requirement("http://arvados.org/cwl#UsePreemptible") + if preemptible_req: + scheduling_parameters["preemptible"] = preemptible_req["usePreemptible"] + elif runtimeContext.enable_preemptible is True: + scheduling_parameters["preemptible"] = True + elif runtimeContext.enable_preemptible is None: + pass + if self.timelimit is not None and self.timelimit > 0: scheduling_parameters["max_run_time"] = self.timelimit @@ -550,6 +561,12 @@ class RunnerContainer(Runner): if self.enable_dev: command.append("--enable-dev") + if runtimeContext.enable_preemptible is True: + command.append("--enable-preemptible") + + if runtimeContext.enable_preemptible is False: + command.append("--disable-preemptible") + command.extend([workflowpath, "/var/lib/cwl/cwl.input.json"]) container_req["command"] = command diff --git a/sdk/cwl/arvados_cwl/context.py b/sdk/cwl/arvados_cwl/context.py index 4239dd3b51..316250106b 100644 --- a/sdk/cwl/arvados_cwl/context.py +++ b/sdk/cwl/arvados_cwl/context.py @@ -37,6 +37,7 @@ class ArvRuntimeContext(RuntimeContext): self.always_submit_runner = False self.collection_cache_size = 256 self.match_local_docker = False + self.enable_preemptible = None super(ArvRuntimeContext, self).__init__(kwargs) diff --git a/sdk/cwl/arvados_cwl/runner.py b/sdk/cwl/arvados_cwl/runner.py index ad17950a2f..38e2c4d806 100644 --- a/sdk/cwl/arvados_cwl/runner.py +++ 
b/sdk/cwl/arvados_cwl/runner.py @@ -40,7 +40,7 @@ import schema_salad.validate as validate import arvados.collection from .util import collectionUUID -import ruamel.yaml as yaml +from ruamel.yaml import YAML from ruamel.yaml.comments import CommentedMap, CommentedSeq import arvados_cwl.arvdocker @@ -265,7 +265,8 @@ def upload_dependencies(arvrunner, name, document_loader, textIO = StringIO(text.decode('utf-8')) else: textIO = StringIO(text) - return yaml.safe_load(textIO) + yamlloader = YAML(typ='safe', pure=True) + return yamlloader.load(textIO) else: return {} diff --git a/sdk/cwl/test_with_arvbox.sh b/sdk/cwl/test_with_arvbox.sh index 0021bc8d90..d38414fc81 100755 --- a/sdk/cwl/test_with_arvbox.sh +++ b/sdk/cwl/test_with_arvbox.sh @@ -118,14 +118,15 @@ elif [[ "$suite" =~ conformance-(.*) ]] ; then git clone https://github.com/common-workflow-language/cwl-\${version}.git fi cd cwl-\${version} + git checkout \${version}.0 elif [[ "$suite" != "integration" ]] ; then echo "ERROR: unknown suite '$suite'" exit 1 fi -if [[ "$suite" != "integration" ]] ; then - git pull -fi +#if [[ "$suite" != "integration" ]] ; then +# git pull +#fi export ARVADOS_API_HOST=localhost:8000 export ARVADOS_API_HOST_INSECURE=1 @@ -154,18 +155,6 @@ else arv-keepdocker arvados/jobs latest fi -cat >/tmp/cwltest/arv-cwl-jobs </tmp/cwltest/arv-cwl-containers <= 3))) " end - sql_conds = "(#{owner_check} #{direct_check} #{links_cond}) #{trashed_check.empty? ? "" : "AND"} #{trashed_check}" + sql_conds = "(#{owner_check} #{direct_check} #{links_cond}) AND NOT (#{excluded_trash})" end @@ -614,8 +638,12 @@ class ArvadosModel < ApplicationRecord if check_uuid.nil? # old_owner_uuid is nil? New record, no need to check. 
elsif !current_user.can?(write: check_uuid) - logger.warn "User #{current_user.uuid} tried to set ownership of #{self.class.to_s} #{self.uuid} but does not have permission to write #{which} owner_uuid #{check_uuid}" - errors.add :owner_uuid, "cannot be set or changed without write permission on #{which} owner" + if FrozenGroup.where(uuid: check_uuid).any? + errors.add :owner_uuid, "cannot be set or changed because #{which} owner is frozen" + else + logger.warn "User #{current_user.uuid} tried to set ownership of #{self.class.to_s} #{self.uuid} but does not have permission to write #{which} owner_uuid #{check_uuid}" + errors.add :owner_uuid, "cannot be set or changed without write permission on #{which} owner" + end raise PermissionDeniedError elsif rsc_class == Group && Group.find_by_uuid(owner_uuid).group_class != "project" errors.add :owner_uuid, "must be a project" @@ -625,8 +653,12 @@ class ArvadosModel < ApplicationRecord else # If the object already existed and we're not changing # owner_uuid, we only need write permission on the object - # itself. - if !current_user.can?(write: self.uuid) + # itself. (If we're in the act of unfreezing, we only need + # :unfreeze permission, which means "what write permission would + # be if target weren't frozen") + unless ((respond_to?(:frozen_by_uuid) && frozen_by_uuid_was && !frozen_by_uuid) ? 
+ current_user.can?(unfreeze: uuid) : + current_user.can?(write: uuid)) logger.warn "User #{current_user.uuid} tried to modify #{self.class.to_s} #{self.uuid} without write permission" errors.add :uuid, " #{uuid} is not writable by #{current_user.uuid}" raise PermissionDeniedError @@ -643,7 +675,7 @@ class ArvadosModel < ApplicationRecord end def permission_to_create - current_user.andand.is_active + return current_user.andand.is_active end def permission_to_update diff --git a/services/api/app/models/frozen_group.rb b/services/api/app/models/frozen_group.rb new file mode 100644 index 0000000000..bf4ee5d0bd --- /dev/null +++ b/services/api/app/models/frozen_group.rb @@ -0,0 +1,6 @@ +# Copyright (C) The Arvados Authors. All rights reserved. +# +# SPDX-License-Identifier: AGPL-3.0 + +class FrozenGroup < ApplicationRecord +end diff --git a/services/api/app/models/group.rb b/services/api/app/models/group.rb index 8565b2a417..b1b2e942c6 100644 --- a/services/api/app/models/group.rb +++ b/services/api/app/models/group.rb @@ -19,9 +19,11 @@ class Group < ArvadosModel validate :ensure_filesystem_compatible_name validate :check_group_class validate :check_filter_group_filters + validate :check_frozen_state_change_allowed before_create :assign_name after_create :after_ownership_change after_create :update_trash + after_create :update_frozen before_update :before_ownership_change after_update :after_ownership_change @@ -29,7 +31,8 @@ class Group < ArvadosModel after_create :add_role_manage_link after_update :update_trash - before_destroy :clear_permissions_and_trash + after_update :update_frozen + before_destroy :clear_permissions_trash_frozen api_accessible :user, extend: :common do |t| t.add :name @@ -40,6 +43,7 @@ class Group < ArvadosModel t.add :trash_at t.add :is_trashed t.add :properties + t.add :frozen_by_uuid end def ensure_filesystem_compatible_name @@ -92,37 +96,104 @@ class Group < ArvadosModel end end + def check_frozen_state_change_allowed + if frozen_by_uuid == 
"" + self.frozen_by_uuid = nil + end + if frozen_by_uuid_changed? || (new_record? && frozen_by_uuid) + if group_class != "project" + errors.add(:frozen_by_uuid, "cannot be modified on a non-project group") + return + end + if frozen_by_uuid_was && Rails.configuration.API.UnfreezeProjectRequiresAdmin && !current_user.is_admin + errors.add(:frozen_by_uuid, "can only be changed by an admin user, once set") + return + end + if frozen_by_uuid && frozen_by_uuid != current_user.uuid + errors.add(:frozen_by_uuid, "can only be set to the current user's UUID") + return + end + if !new_record? && !current_user.can?(manage: uuid) + raise PermissionDeniedError + end + if trash_at || delete_at || (!new_record? && TrashedGroup.where(group_uuid: uuid).any?) + errors.add(:frozen_by_uuid, "cannot be set on a trashed project") + end + if frozen_by_uuid_was.nil? + if Rails.configuration.API.FreezeProjectRequiresDescription && !attribute_present?(:description) + errors.add(:frozen_by_uuid, "can only be set if description is non-empty") + end + Rails.configuration.API.FreezeProjectRequiresProperties.andand.each do |key, _| + key = key.to_s + if !properties[key] || properties[key] == "" + errors.add(:frozen_by_uuid, "can only be set if properties[#{key}] value is non-empty") + end + end + end + end + end + def update_trash - if saved_change_to_trash_at? or saved_change_to_owner_uuid? - # The group was added or removed from the trash. 
- # - # Strategy: - # Compute project subtree, propagating trash_at to subprojects - # Remove groups that don't belong from trash - # Add/update groups that do belong in the trash - - temptable = "group_subtree_#{rand(2**64).to_s(10)}" - ActiveRecord::Base.connection.exec_query %{ -create temporary table #{temptable} on commit drop -as select * from project_subtree_with_trash_at($1, LEAST($2, $3)::timestamp) -}, - 'Group.update_trash.select', - [[nil, self.uuid], - [nil, TrashedGroup.find_by_group_uuid(self.owner_uuid).andand.trash_at], - [nil, self.trash_at]] - - ActiveRecord::Base.connection.exec_delete %{ -delete from trashed_groups where group_uuid in (select target_uuid from #{temptable} where trash_at is NULL); -}, - "Group.update_trash.delete" - - ActiveRecord::Base.connection.exec_query %{ -insert into trashed_groups (group_uuid, trash_at) - select target_uuid as group_uuid, trash_at from #{temptable} where trash_at is not NULL -on conflict (group_uuid) do update set trash_at=EXCLUDED.trash_at; -}, - "Group.update_trash.insert" + return unless saved_change_to_trash_at? || saved_change_to_owner_uuid? + + # The group was added or removed from the trash. 
+ # + # Strategy: + # Compute project subtree, propagating trash_at to subprojects + # Ensure none of the newly trashed descendants were frozen (if so, bail out) + # Remove groups that don't belong from trash + # Add/update groups that do belong in the trash + + temptable = "group_subtree_#{rand(2**64).to_s(10)}" + ActiveRecord::Base.connection.exec_query( + "create temporary table #{temptable} on commit drop " + + "as select * from project_subtree_with_trash_at($1, LEAST($2, $3)::timestamp)", + "Group.update_trash.select", + [[nil, self.uuid], + [nil, TrashedGroup.find_by_group_uuid(self.owner_uuid).andand.trash_at], + [nil, self.trash_at]]) + frozen_descendants = ActiveRecord::Base.connection.exec_query( + "select uuid from frozen_groups, #{temptable} where uuid = target_uuid", + "Group.update_trash.check_frozen") + if frozen_descendants.any? + raise ArgumentError.new("cannot trash project containing frozen project #{frozen_descendants[0]["uuid"]}") + end + ActiveRecord::Base.connection.exec_delete( + "delete from trashed_groups where group_uuid in (select target_uuid from #{temptable} where trash_at is NULL)", + "Group.update_trash.delete") + ActiveRecord::Base.connection.exec_query( + "insert into trashed_groups (group_uuid, trash_at) "+ + "select target_uuid as group_uuid, trash_at from #{temptable} where trash_at is not NULL " + + "on conflict (group_uuid) do update set trash_at=EXCLUDED.trash_at", + "Group.update_trash.insert") + end + + def update_frozen + return unless saved_change_to_frozen_by_uuid? || saved_change_to_owner_uuid? 
+ temptable = "group_subtree_#{rand(2**64).to_s(10)}" + ActiveRecord::Base.connection.exec_query( + "create temporary table #{temptable} on commit drop as select * from project_subtree_with_is_frozen($1,$2)", + "Group.update_frozen.select", + [[nil, self.uuid], + [nil, !self.frozen_by_uuid.nil?]]) + if frozen_by_uuid + rows = ActiveRecord::Base.connection.exec_query( + "select cr.uuid, cr.state from container_requests cr, #{temptable} frozen " + + "where cr.owner_uuid = frozen.uuid and frozen.is_frozen " + + "and cr.state not in ($1, $2) limit 1", + "Group.update_frozen.check_container_requests", + [[nil, ContainerRequest::Uncommitted], + [nil, ContainerRequest::Final]]) + if rows.any? + raise ArgumentError.new("cannot freeze project containing container request #{rows.first['uuid']} with state = #{rows.first['state']}") + end end + ActiveRecord::Base.connection.exec_delete( + "delete from frozen_groups where uuid in (select uuid from #{temptable} where not is_frozen)", + "Group.update_frozen.delete") + ActiveRecord::Base.connection.exec_query( + "insert into frozen_groups (uuid) select uuid from #{temptable} where is_frozen on conflict do nothing", + "Group.update_frozen.insert") end def before_ownership_change @@ -138,12 +209,16 @@ on conflict (group_uuid) do update set trash_at=EXCLUDED.trash_at; end end - def clear_permissions_and_trash + def clear_permissions_trash_frozen MaterializedPermission.where(target_uuid: uuid).delete_all - ActiveRecord::Base.connection.exec_delete %{ -delete from trashed_groups where group_uuid=$1 -}, "Group.clear_permissions_and_trash", [[nil, self.uuid]] - + ActiveRecord::Base.connection.exec_delete( + "delete from trashed_groups where group_uuid=$1", + "Group.clear_permissions_trash_frozen", + [[nil, self.uuid]]) + ActiveRecord::Base.connection.exec_delete( + "delete from frozen_groups where uuid=$1", + "Group.clear_permissions_trash_frozen", + [[nil, self.uuid]]) end def assign_name @@ -181,4 +256,19 @@ delete from trashed_groups 
where group_uuid=$1 end end end + + def permission_to_update + if !super + return false + elsif frozen_by_uuid && frozen_by_uuid_was + errors.add :uuid, "#{uuid} is frozen and cannot be modified" + return false + else + return true + end + end + + def self.full_text_searchable_columns + super - ["frozen_by_uuid"] + end end diff --git a/services/api/app/models/user.rb b/services/api/app/models/user.rb index 811cd89758..bbb2378f5c 100644 --- a/services/api/app/models/user.rb +++ b/services/api/app/models/user.rb @@ -87,6 +87,7 @@ class User < ArvadosModel VAL_FOR_PERM = {:read => 1, :write => 2, + :unfreeze => 3, :manage => 3} @@ -141,6 +142,23 @@ SELECT 1 FROM #{PERMISSION_VIEW} ).any? return false end + + if action == :write + if FrozenGroup.where(uuid: [target_uuid, target_owner_uuid]).any? + # self or parent is frozen + return false + end + elsif action == :unfreeze + # "unfreeze" permission means "can write, but only if + # explicitly un-freezing at the same time" (see + # ArvadosModel#ensure_owner_uuid_is_permitted). If the + # permission query above passed the permission level of + # :unfreeze (which is the same as :manage), and the parent + # isn't also frozen, then un-freeze is allowed. + if FrozenGroup.where(uuid: target_owner_uuid).any? + return false + end + end end true end diff --git a/services/api/db/migrate/20220224203102_add_frozen_by_uuid_to_groups.rb b/services/api/db/migrate/20220224203102_add_frozen_by_uuid_to_groups.rb new file mode 100644 index 0000000000..d730b25350 --- /dev/null +++ b/services/api/db/migrate/20220224203102_add_frozen_by_uuid_to_groups.rb @@ -0,0 +1,9 @@ +# Copyright (C) The Arvados Authors. All rights reserved. 
+# +# SPDX-License-Identifier: AGPL-3.0 + +class AddFrozenByUuidToGroups < ActiveRecord::Migration[5.2] + def change + add_column :groups, :frozen_by_uuid, :string + end +end diff --git a/services/api/db/migrate/20220301155729_frozen_groups.rb b/services/api/db/migrate/20220301155729_frozen_groups.rb new file mode 100644 index 0000000000..730d051ce2 --- /dev/null +++ b/services/api/db/migrate/20220301155729_frozen_groups.rb @@ -0,0 +1,39 @@ +# Copyright (C) The Arvados Authors. All rights reserved. +# +# SPDX-License-Identifier: AGPL-3.0 + +require '20200501150153_permission_table_constants' + +class FrozenGroups < ActiveRecord::Migration[5.0] + def up + create_table :frozen_groups, :id => false do |t| + t.string :uuid + end + add_index :frozen_groups, :uuid, :unique => true + + ActiveRecord::Base.connection.execute %{ +create or replace function project_subtree_with_is_frozen (starting_uuid varchar(27), starting_is_frozen boolean) +returns table (uuid varchar(27), is_frozen boolean) +STABLE +language SQL +as $$ +WITH RECURSIVE + project_subtree(uuid, is_frozen) as ( + values (starting_uuid, starting_is_frozen) + union + select groups.uuid, project_subtree.is_frozen or groups.frozen_by_uuid is not null + from groups join project_subtree on project_subtree.uuid = groups.owner_uuid + ) + select uuid, is_frozen from project_subtree; +$$; +} + + # Initialize the table. After this, it is updated incrementally. + # See app/models/group.rb#update_frozen_groups + refresh_frozen + end + + def down + drop_table :frozen_groups + end +end diff --git a/services/api/db/migrate/20220303204419_add_frozen_by_uuid_to_group_search_index.rb b/services/api/db/migrate/20220303204419_add_frozen_by_uuid_to_group_search_index.rb new file mode 100644 index 0000000000..52fb16b2aa --- /dev/null +++ b/services/api/db/migrate/20220303204419_add_frozen_by_uuid_to_group_search_index.rb @@ -0,0 +1,17 @@ +# Copyright (C) The Arvados Authors. All rights reserved. 
+# +# SPDX-License-Identifier: AGPL-3.0 + +class AddFrozenByUuidToGroupSearchIndex < ActiveRecord::Migration[5.0] + disable_ddl_transaction! + + def up + remove_index :groups, :name => 'groups_search_index' + add_index :groups, ["uuid", "owner_uuid", "modified_by_client_uuid", "modified_by_user_uuid", "name", "group_class", "frozen_by_uuid"], name: 'groups_search_index', algorithm: :concurrently + end + + def down + remove_index :groups, :name => 'groups_search_index' + add_index :groups, ["uuid", "owner_uuid", "modified_by_client_uuid", "modified_by_user_uuid", "name", "group_class"], name: 'groups_search_index', algorithm: :concurrently + end +end diff --git a/services/api/db/structure.sql b/services/api/db/structure.sql index da9959593c..cfe21f7c9a 100644 --- a/services/api/db/structure.sql +++ b/services/api/db/structure.sql @@ -190,6 +190,24 @@ case (edges.edge_id = perm_edge_id) $$; +-- +-- Name: project_subtree_with_is_frozen(character varying, boolean); Type: FUNCTION; Schema: public; Owner: - +-- + +CREATE FUNCTION public.project_subtree_with_is_frozen(starting_uuid character varying, starting_is_frozen boolean) RETURNS TABLE(uuid character varying, is_frozen boolean) + LANGUAGE sql STABLE + AS $$ +WITH RECURSIVE + project_subtree(uuid, is_frozen) as ( + values (starting_uuid, starting_is_frozen) + union + select groups.uuid, project_subtree.is_frozen or groups.frozen_by_uuid is not null + from groups join project_subtree on (groups.owner_uuid = project_subtree.uuid) + ) + select uuid, is_frozen from project_subtree; +$$; + + -- -- Name: project_subtree_with_trash_at(character varying, timestamp without time zone); Type: FUNCTION; Schema: public; Owner: - -- @@ -548,6 +566,15 @@ CREATE SEQUENCE public.containers_id_seq ALTER SEQUENCE public.containers_id_seq OWNED BY public.containers.id; +-- +-- Name: frozen_groups; Type: TABLE; Schema: public; Owner: - +-- + +CREATE TABLE public.frozen_groups ( + uuid character varying +); + + -- -- Name: groups; Type: 
TABLE; Schema: public; Owner: - -- @@ -567,7 +594,8 @@ CREATE TABLE public.groups ( trash_at timestamp without time zone, is_trashed boolean DEFAULT false NOT NULL, delete_at timestamp without time zone, - properties jsonb DEFAULT '{}'::jsonb + properties jsonb DEFAULT '{}'::jsonb, + frozen_by_uuid character varying ); @@ -1775,7 +1803,7 @@ CREATE INDEX group_index_on_properties ON public.groups USING gin (properties); -- Name: groups_search_index; Type: INDEX; Schema: public; Owner: - -- -CREATE INDEX groups_search_index ON public.groups USING btree (uuid, owner_uuid, modified_by_client_uuid, modified_by_user_uuid, name, group_class); +CREATE INDEX groups_search_index ON public.groups USING btree (uuid, owner_uuid, modified_by_client_uuid, modified_by_user_uuid, name, group_class, frozen_by_uuid); -- @@ -2058,6 +2086,13 @@ CREATE INDEX index_containers_on_secret_mounts_md5 ON public.containers USING bt CREATE UNIQUE INDEX index_containers_on_uuid ON public.containers USING btree (uuid); +-- +-- Name: index_frozen_groups_on_uuid; Type: INDEX; Schema: public; Owner: - +-- + +CREATE UNIQUE INDEX index_frozen_groups_on_uuid ON public.frozen_groups USING btree (uuid); + + -- -- Name: index_groups_on_created_at; Type: INDEX; Schema: public; Owner: - -- @@ -3147,6 +3182,9 @@ INSERT INTO "schema_migrations" (version) VALUES ('20210126183521'), ('20210621204455'), ('20210816191509'), -('20211027154300'); +('20211027154300'), +('20220224203102'), +('20220301155729'), +('20220303204419'); diff --git a/services/api/lib/20200501150153_permission_table_constants.rb b/services/api/lib/20200501150153_permission_table_constants.rb index 74c15bc2e9..7ee5039368 100644 --- a/services/api/lib/20200501150153_permission_table_constants.rb +++ b/services/api/lib/20200501150153_permission_table_constants.rb @@ -15,8 +15,8 @@ # update_permissions reference that file instead. 
PERMISSION_VIEW = "materialized_permissions" - TRASHED_GROUPS = "trashed_groups" +FROZEN_GROUPS = "frozen_groups" # We need to use this parameterized query in a few different places, # including as a subquery in a larger query. @@ -83,3 +83,21 @@ INSERT INTO materialized_permissions }, "refresh_permission_view.do" end end + +def refresh_frozen + ActiveRecord::Base.transaction do + ActiveRecord::Base.connection.execute("LOCK TABLE #{FROZEN_GROUPS}") + ActiveRecord::Base.connection.execute("DELETE FROM #{FROZEN_GROUPS}") + + # Compute entire frozen_groups table, starting with top-level + # projects (i.e., project groups owned by a user). + ActiveRecord::Base.connection.execute(%{ +INSERT INTO #{FROZEN_GROUPS} +select ps.uuid from groups, + lateral project_subtree_with_is_frozen(groups.uuid, groups.frozen_by_uuid is not null) ps + where groups.owner_uuid like '_____-tpzed-_______________' + and group_class = 'project' + and ps.is_frozen +}) + end +end diff --git a/services/api/test/functional/arvados/v1/groups_controller_test.rb b/services/api/test/functional/arvados/v1/groups_controller_test.rb index 0819c23067..fcdce0e600 100644 --- a/services/api/test/functional/arvados/v1/groups_controller_test.rb +++ b/services/api/test/functional/arvados/v1/groups_controller_test.rb @@ -920,4 +920,24 @@ class Arvados::V1::GroupsControllerTest < ActionController::TestCase assert_response 422 end + + test "include_trash does not return trash inside frozen project" do + authorize_with :active + trashtime = Time.now - 1.second + outerproj = Group.create!(group_class: 'project') + innerproj = Group.create!(group_class: 'project', owner_uuid: outerproj.uuid) + innercoll = Collection.create!(name: 'inner-not-trashed', owner_uuid: innerproj.uuid) + innertrash = Collection.create!(name: 'inner-trashed', owner_uuid: innerproj.uuid, trash_at: trashtime) + innertrashproj = Group.create!(group_class: 'project', name: 'inner-trashed-proj', owner_uuid: innerproj.uuid, trash_at: trashtime) + 
outertrash = Collection.create!(name: 'outer-trashed', owner_uuid: outerproj.uuid, trash_at: trashtime) + innerproj.update_attributes!(frozen_by_uuid: users(:active).uuid) + get :contents, params: {id: outerproj.uuid, include_trash: true, recursive: true} + assert_response :success + uuids = json_response['items'].collect { |item| item['uuid'] } + assert_includes uuids, outertrash.uuid + assert_includes uuids, innerproj.uuid + assert_includes uuids, innercoll.uuid + refute_includes uuids, innertrash.uuid + refute_includes uuids, innertrashproj.uuid + end end diff --git a/services/api/test/unit/group_test.rb b/services/api/test/unit/group_test.rb index 10932e116d..7a16962402 100644 --- a/services/api/test/unit/group_test.rb +++ b/services/api/test/unit/group_test.rb @@ -313,4 +313,219 @@ update links set tail_uuid='#{g5}' where uuid='#{l1.uuid}' assert Link.where(tail_uuid: g3, head_uuid: g4, link_class: "permission", name: "can_manage").any? assert !Link.where(link_class: 'permission', name: 'can_manage', tail_uuid: g5, head_uuid: g4).any? 
end + + test "freeze project" do + act_as_user users(:active) do + Rails.configuration.API.UnfreezeProjectRequiresAdmin = false + + test_cr_attrs = { + command: ["echo", "foo"], + container_image: links(:docker_image_collection_tag).name, + cwd: "/tmp", + environment: {}, + mounts: {"/out" => {"kind" => "tmp", "capacity" => 1000000}}, + output_path: "/out", + runtime_constraints: {"vcpus" => 1, "ram" => 2}, + name: "foo", + description: "bar", + } + parent = Group.create!(group_class: 'project', name: 'freeze-test-parent', owner_uuid: users(:active).uuid) + proj = Group.create!(group_class: 'project', name: 'freeze-test', owner_uuid: parent.uuid) + proj_inner = Group.create!(group_class: 'project', name: 'freeze-test-inner', owner_uuid: proj.uuid) + coll = Collection.create!(name: 'freeze-test-collection', manifest_text: '', owner_uuid: proj_inner.uuid) + + # Cannot set frozen_by_uuid to a different user + assert_raises do + proj.update_attributes!(frozen_by_uuid: users(:spectator).uuid) + end + proj.reload + + # Cannot set frozen_by_uuid without can_manage permission + act_as_system_user do + Link.create!(link_class: 'permission', name: 'can_write', tail_uuid: users(:spectator).uuid, head_uuid: proj.uuid) + end + act_as_user users(:spectator) do + # First confirm we have write permission + assert Collection.create(name: 'bar', owner_uuid: proj.uuid) + assert_raises(ArvadosModel::PermissionDeniedError) do + proj.update_attributes!(frozen_by_uuid: users(:spectator).uuid) + end + end + proj.reload + + # Cannot set frozen_by_uuid without description (if so configured) + Rails.configuration.API.FreezeProjectRequiresDescription = true + err = assert_raises do + proj.update_attributes!(frozen_by_uuid: users(:active).uuid) + end + assert_match /can only be set if description is non-empty/, err.inspect + proj.reload + err = assert_raises do + proj.update_attributes!(frozen_by_uuid: users(:active).uuid, description: '') + end + assert_match /can only be set if description 
is non-empty/, err.inspect + proj.reload + + # Cannot set frozen_by_uuid without properties (if so configured) + Rails.configuration.API.FreezeProjectRequiresProperties['frobity'] = true + err = assert_raises do + proj.update_attributes!( + frozen_by_uuid: users(:active).uuid, + description: 'ready to freeze') + end + assert_match /can only be set if properties\[frobity\] value is non-empty/, err.inspect + proj.reload + + # Cannot set frozen_by_uuid while project or its parent is + # trashed + [parent, proj].each do |trashed| + trashed.update_attributes!(trash_at: db_current_time) + err = assert_raises do + proj.update_attributes!( + frozen_by_uuid: users(:active).uuid, + description: 'ready to freeze', + properties: {'frobity' => 'bar baz'}) + end + assert_match /cannot be set on a trashed project/, err.inspect + proj.reload + trashed.update_attributes!(trash_at: nil) + end + + # Can set frozen_by_uuid if all conditions are met + ok = proj.update_attributes( + frozen_by_uuid: users(:active).uuid, + description: 'ready to freeze', + properties: {'frobity' => 'bar baz'}) + assert ok, proj.errors.messages.inspect + + # Once project is frozen, cannot create new items inside it or + # its descendants + [proj, proj_inner].each do |frozen| + assert_raises do + collections(:collection_owned_by_active).update_attributes!(owner_uuid: frozen.uuid) + end + assert_raises do + Collection.create!(owner_uuid: frozen.uuid, name: 'inside-frozen-project') + end + assert_raises do + Group.create!(owner_uuid: frozen.uuid, group_class: 'project', name: 'inside-frozen-project') + end + cr = ContainerRequest.new(test_cr_attrs.merge(owner_uuid: frozen.uuid)) + assert_raises ArvadosModel::PermissionDeniedError do + cr.save + end + assert_match /frozen/, cr.errors.inspect + # Check the frozen-parent condition is the only reason save failed. 
+ cr.owner_uuid = users(:active).uuid + assert cr.save + cr.destroy + end + + # Once project is frozen, cannot change name/contents, move, + # trash, or delete the project or anything beneath it + [proj, proj_inner, coll].each do |frozen| + assert_raises(StandardError, "should reject rename of #{frozen.uuid} (#{frozen.name}) with parent #{frozen.owner_uuid}") do + frozen.update_attributes!(name: 'foo2') + end + frozen.reload + + if frozen.is_a?(Collection) + assert_raises(StandardError, "should reject manifest change of #{frozen.uuid}") do + frozen.update_attributes!(manifest_text: ". d41d8cd98f00b204e9800998ecf8427e+0 0:0:foo\n") + end + else + assert_raises(StandardError, "should reject moving a project into #{frozen.uuid}") do + groups(:private).update_attributes!(owner_uuid: frozen.uuid) + end + end + frozen.reload + + assert_raises(StandardError, "should reject moving #{frozen.uuid} to a different parent project") do + frozen.update_attributes!(owner_uuid: groups(:private).uuid) + end + frozen.reload + assert_raises(StandardError, "should reject setting trash_at of #{frozen.uuid}") do + frozen.update_attributes!(trash_at: db_current_time) + end + frozen.reload + assert_raises(StandardError, "should reject setting delete_at of #{frozen.uuid}") do + frozen.update_attributes!(delete_at: db_current_time) + end + frozen.reload + assert_raises(StandardError, "should reject delete of #{frozen.uuid}") do + frozen.destroy + end + frozen.reload + if frozen != proj + assert_equal [], frozen.writable_by + end + end + + # User with write permission (but not manage) cannot unfreeze + act_as_user users(:spectator) do + # First confirm we have write permission on the parent project + assert Collection.create(name: 'bar', owner_uuid: parent.uuid) + assert_raises(ArvadosModel::PermissionDeniedError) do + proj.update_attributes!(frozen_by_uuid: nil) + end + end + proj.reload + + # User with manage permission can unfreeze, then create items + # inside it and its children + assert 
proj.update_attributes(frozen_by_uuid: nil) + assert Collection.create!(owner_uuid: proj.uuid, name: 'inside-unfrozen-project') + assert Collection.create!(owner_uuid: proj_inner.uuid, name: 'inside-inner-unfrozen-project') + + # Re-freeze, and reconfigure so only admins can unfreeze. + assert proj.update_attributes(frozen_by_uuid: users(:active).uuid) + Rails.configuration.API.UnfreezeProjectRequiresAdmin = true + + # Owner cannot unfreeze, because not admin. + err = assert_raises do + proj.update_attributes!(frozen_by_uuid: nil) + end + assert_match /can only be changed by an admin user, once set/, err.inspect + proj.reload + + # Cannot trash or delete a frozen project's ancestor + assert_raises(StandardError, "should not be able to set trash_at on parent of frozen project") do + parent.update_attributes!(trash_at: db_current_time) + end + parent.reload + assert_raises(StandardError, "should not be able to set delete_at on parent of frozen project") do + parent.update_attributes!(delete_at: db_current_time) + end + parent.reload + assert_nil parent.frozen_by_uuid + + act_as_user users(:admin) do + # Even admin cannot change frozen_by_uuid to someone else's UUID. + err = assert_raises do + proj.update_attributes!(frozen_by_uuid: users(:project_viewer).uuid) + end + assert_match /can only be set to the current user's UUID/, err.inspect + proj.reload + + # Admin can unfreeze. 
+ assert proj.update_attributes(frozen_by_uuid: nil), proj.errors.messages + end + + # Cannot freeze a project if it contains container requests in + # Committed state (this would cause operations on the relevant + # Containers to fail when syncing container request state) + creq_uncommitted = ContainerRequest.create!(test_cr_attrs.merge(owner_uuid: proj_inner.uuid)) + creq_committed = ContainerRequest.create!(test_cr_attrs.merge(owner_uuid: proj_inner.uuid, state: 'Committed')) + err = assert_raises do + proj.update_attributes!(frozen_by_uuid: users(:active).uuid) + end + assert_match /container request zzzzz-xvhdp-.* with state = Committed/, err.inspect + proj.reload + + # Can freeze once all container requests are in Uncommitted or + # Final state + creq_committed.update_attributes!(state: ContainerRequest::Final) + assert proj.update_attributes(frozen_by_uuid: users(:active).uuid) + end + end end diff --git a/services/api/test/unit/permission_test.rb b/services/api/test/unit/permission_test.rb index 647139d9ec..efc43dfde5 100644 --- a/services/api/test/unit/permission_test.rb +++ b/services/api/test/unit/permission_test.rb @@ -602,4 +602,17 @@ class PermissionTest < ActiveSupport::TestCase end end end + + # Show query plan for readable_by query. The plan for a test db + # might not resemble the plan for a production db, but it doesn't + # hurt to show the test db plan in test logs, and the . 
+ [false, true].each do |include_trash| + test "query plan, include_trash=#{include_trash}" do + sql = Collection.readable_by(users(:active), include_trash: include_trash).to_sql + sql = "explain analyze #{sql}" + STDERR.puts sql + q = ActiveRecord::Base.connection.exec_query(sql) + q.rows.each do |row| STDERR.puts(row) end + end + end end diff --git a/tools/arvbash/arvbash.sh b/tools/arvbash/arvbash.sh index ecad0888df..1d4fbade8b 100755 --- a/tools/arvbash/arvbash.sh +++ b/tools/arvbash/arvbash.sh @@ -15,10 +15,10 @@ Syntax: arvswitch Set ARVADOS_API_HOST and ARVADOS_API_TOKEN in the current environment based on $HOME/.config/arvados/.conf - With no arguments, list available Arvados configurations. + With no arguments, print current API host and available Arvados configurations. arvsave - Save values of ARVADOS_API_HOST and ARVADOS_API_TOKEN in the current environment to + Save current values of ARVADOS_API_HOST and ARVADOS_API_TOKEN in the current environment to $HOME/.config/arvados/.conf arvrm @@ -26,12 +26,12 @@ arvrm arvboxswitch Set ARVBOX_CONTAINER to - With no arguments, list available arvboxes. + With no arguments, print current arvbox and available arvboxes. 
-arvopen: +arvopen Open an Arvados uuid in web browser (http://arvadosapi.com) -arvissue +arvissue Open an Arvados ticket in web browser (http://dev.arvados.org) EOF @@ -61,7 +61,8 @@ arvswitch() { fi else echo "Switch Arvados environment conf" - echo "Usage: arvswitch name" + echo "Current host: ${ARVADOS_API_HOST}" + echo "Usage: arvswitch " echo "Available confs:" $((cd $HOME/.config/arvados && ls --indicator-style=none *.conf) | rev | cut -c6- | rev) fi } @@ -73,7 +74,7 @@ arvsave() { env | grep ARVADOS_ > $HOME/.config/arvados/$1.conf else echo "Save current Arvados environment variables to conf file" - echo "Usage: arvsave name" + echo "Usage: arvsave " fi } @@ -86,25 +87,25 @@ arvrm() { fi else echo "Delete Arvados environment conf" - echo "Usage: arvrm name" + echo "Usage: arvrm " fi } arvboxswitch() { if [[ -n "$1" ]] ; then + export ARVBOX_CONTAINER=$1 if [[ -d $HOME/.arvbox/$1 ]] ; then - export ARVBOX_CONTAINER=$1 echo "Arvbox switched to $1" else - echo "$1 unknown" + echo "Warning: $1 doesn't exist, will be created." fi else if test -z "$ARVBOX_CONTAINER" ; then ARVBOX_CONTAINER=arvbox fi echo "Switch Arvbox environment conf" - echo "Usage: arvboxswitch name" echo "Your current container is: $ARVBOX_CONTAINER" + echo "Usage: arvboxswitch " echo "Available confs:" $(cd $HOME/.arvbox && ls --indicator-style=none) fi } @@ -114,7 +115,7 @@ arvopen() { xdg-open https://arvadosapi.com/$1 else echo "Open Arvados uuid in browser" - echo "Usage: arvopen uuid" + echo "Usage: arvopen " fi } @@ -123,6 +124,6 @@ arvissue() { xdg-open https://dev.arvados.org/issues/$1 else echo "Open Arvados issue in browser" - echo "Usage: arvissue uuid" + echo "Usage: arvissue " fi } diff --git a/tools/arvbox/bin/arvbox b/tools/arvbox/bin/arvbox index e021b442f1..e7416947d6 100755 --- a/tools/arvbox/bin/arvbox +++ b/tools/arvbox/bin/arvbox @@ -237,7 +237,7 @@ run() { fi if ! 
(docker ps -a | grep -E "$ARVBOX_CONTAINER-data$" -q) ; then - docker create -v /var/lib/postgresql -v $ARVADOS_CONTAINER_PATH --name $ARVBOX_CONTAINER-data arvados/arvbox-demo /bin/true + docker create -v /var/lib/postgresql -v $ARVADOS_CONTAINER_PATH --name $ARVBOX_CONTAINER-data arvados/arvbox-demo$TAG /bin/true fi docker run \ diff --git a/tools/arvbox/lib/arvbox/docker/createusers.sh b/tools/arvbox/lib/arvbox/docker/createusers.sh index 9c81a66ced..4cafd8c09c 100755 --- a/tools/arvbox/lib/arvbox/docker/createusers.sh +++ b/tools/arvbox/lib/arvbox/docker/createusers.sh @@ -34,20 +34,12 @@ if ! grep "^arvbox:" /etc/passwd >/dev/null 2>/dev/null ; then chown arvbox:arvbox -R /usr/local $ARVADOS_CONTAINER_PATH \ /var/lib/passenger /var/lib/postgresql \ /var/lib/nginx /var/log/nginx /etc/ssl/private \ - /var/lib/gopath /var/lib/pip /var/lib/npm \ - /var/lib/arvados + /var/lib/gopath /var/lib/pip /var/lib/npm fi mkdir -p /tmp/crunch0 /tmp/crunch1 chown crunch:crunch -R /tmp/crunch0 /tmp/crunch1 - # singularity needs to be owned by root and suid - chown root /var/lib/arvados/bin/singularity \ - /var/lib/arvados/etc/singularity/singularity.conf \ - /var/lib/arvados/etc/singularity/capability.json \ - /var/lib/arvados/etc/singularity/ecl.toml - chmod u+s /var/lib/arvados/bin/singularity - echo "arvbox ALL=(crunch) NOPASSWD: ALL" >> /etc/sudoers cat < /etc/profile.d/paths.sh