Merge branch '18670-flaky-lsf-test'

author Tom Clegg <tom@curii.com>

Thu, 27 Jan 2022 14:23:02 +0000 (09:23 -0500)

committer Tom Clegg <tom@curii.com>

Thu, 27 Jan 2022 14:23:02 +0000 (09:23 -0500)
author Tom Clegg <tom@curii.com>
Thu, 27 Jan 2022 14:23:02 +0000 (09:23 -0500)
committer Tom Clegg <tom@curii.com>
Thu, 27 Jan 2022 14:23:02 +0000 (09:23 -0500)
diff --git a/doc/_includes/_install_cuda.liquid b/doc/_includes/_install_cuda.liquid

new file mode 100644 (file)

index 0000000..cb1519a
--- /dev/null
+++ b/doc/_includes/_install_cuda.liquid
@@ -0,0 +1,21 @@
+{% comment %}
+Copyright (C) The Arvados Authors. All rights reserved.
+
+SPDX-License-Identifier: CC-BY-SA-3.0
+{% endcomment %}
+
+h2(#cuda). Install NVIDA CUDA Toolkit (optional)
+
+If you want to use NVIDIA GPUs, "install the CUDA toolkit.":https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
+
+In addition, you also must install the NVIDIA Container Toolkit:
+
+<pre>
+DIST=$(. /etc/os-release; echo $ID$VERSION_ID)
+curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | \
+  sudo apt-key add -
+curl -s -L https://nvidia.github.io/libnvidia-container/$DIST/libnvidia-container.list | \
+  sudo tee /etc/apt/sources.list.d/libnvidia-container.list
+sudo apt-get update
+apt-get install libnvidia-container1 libnvidia-container-tools nvidia-container-toolkit
+</pre>
diff --git a/doc/admin/metadata-vocabulary.html.textile.liquid b/doc/admin/metadata-vocabulary.html.textile.liquid

index 170699ab6c36d3207ad08fe36d6a6631dce9d1f4..38a8d05b8c1195d33ebed45131000c387d894e35 100644 (file)
--- a/doc/admin/metadata-vocabulary.html.textile.liquid
+++ b/doc/admin/metadata-vocabulary.html.textile.liquid
@@ -28,7 +28,7 @@ The site administrator should place the JSON vocabulary file on the same host as
  </code></pre>
  </notextile>
  
-h2. Vocabulary definition format
+h2. Definition format
  
  The JSON file describes the available keys and values and if the user is allowed to enter free text not defined by the vocabulary.
  
@@ -54,6 +54,74 @@ When any key or value has more than one label option, Workbench2's user interfac
  
  Internally, Workbench2 uses the IDs to do property based searches, so if the user searches by @Animal: Human@ or @Species: Homo sapiens@, both will return the same results.
  
+h2. Definition validation
+
+Because the vocabulary definition is prone to syntax or logical errors, the @controller@ service needs to do some validation before answering requests. If the vocabulary validation fails, the service won't start.
+The site administrator can make sure the vocabulary file is correct before even trying to start the @controller@ service by running @arvados-server config-check@. When the vocabulary definition isn't correct, the administrator will get a list of issues like the one below:
+
+<notextile>
+<pre><code># arvados-server config-check -config /etc/arvados/config.yml
+Error loading vocabulary file "/etc/arvados/vocabulary.json" for cluster zzzzz:
+duplicate JSON key "tags.IDTAGFRUITS.values.IDVALFRUITS1"
+tag key "IDTAGCOMMENT" is configured as strict but doesn't provide values
+tag value label "Banana" for pair ("IDTAGFRUITS":"IDVALFRUITS8") already seen on value "IDVALFRUITS4"
+exit status 1
+</code></pre>
+</notextile>
+
+bq. NOTE: These validation checks are performed only on the node that hosts the vocabulary file defined on the configuration. As the same configuration file is shared between different nodes, those who don't host the file won't produce spurious errors when running @arvados-server config-check@.
+
+h2. Live updates
+
+Sometimes it may be necessary to modify the vocabulary definition in a running production environment.
+When a change is detected, the @controller@ service will automatically attempt to load the new vocabulary and check its validity before making it active.
+If the new vocabulary has some issue, the last valid one will keep being active. The service will export any errors on its health endpoint so that a monitoring solution can send an alert appropriately.
+With the above mechanisms in place, no outages should occur from making typos or other errors when updating the vocabulary file.
+
+h2. Health status
+
+To be able for the administrator to guarantee the system's metadata integrity, the @controller@ service exports a specific health endpoint for the vocabulary at @/_health/vocabulary@.
+As a first measure, the service won't start if the vocabulary file is incorrect. Once running, if there are updates (that may even be periodical), the service needs to keep running while notifying the operator that some fixing is in order.
+An example of a vocabulary health error is included below:
+
+<notextile>
+<pre><code>$ curl --silent -H "Authorization: Bearer xxxtokenxxx" https://controller/_health/vocabulary | jq .
+{
+  "error": "while loading vocabulary file \"/etc/arvados/vocabulary.json\": duplicate JSON key \"tags.IDTAGSIZES.values.IDVALSIZES3\"",
+  "health": "ERROR"
+}
+</code></pre>
+</notextile>
+
+h2. Client support
+
+Workbench2 currently takes advantage of this vocabulary definition by providing an easy-to-use interface for searching and applying metadata to different objects in the system. Because the definition file only resides on the @controller@ node, and Workbench2 is just a static web application run by every users' web browser, there's a mechanism in place that allows Workbench2 and any other client to request the active vocabulary.
+
+The @controller@ service provides an unauthenticated endpoint at @/arvados/v1/vocabulary@ where it exports the contents of the vocabulary JSON file:
+
+<notextile>
+<pre><code>$ curl --silent https://controller/arvados/v1/vocabulary | jq .
+{
+  "kind": "arvados#vocabulary",
+  "strict_tags": false,
+  "tags": {
+    "IDTAGANIMALS": {
+      "labels": [
+        {
+          "label": "Animal"
+        },
+        {
+          "label": "Creature"
+        }
+      ],
+      "strict": false,
+...
+}
+</code></pre>
+</notextile>
+
+Although the vocabulary enforcement is done on the backend side, clients can use this information to provide helping features to users, like doing ID-to-label translations, preemptive error checking, etc.
+
  h2. Properties migration
  
  After installing the new vocabulary definition, it may be necessary to migrate preexisting properties that were set up using literal strings. This can be a big task depending on the number of properties on the vocabulary and the amount of collections and projects on the cluster.
diff --git a/doc/install/crunch2-cloud/install-compute-node.html.textile.liquid b/doc/install/crunch2-cloud/install-compute-node.html.textile.liquid

index 131dde5996dcebf73d33276500c4b05141cee6cb..89771514e9fdf380c141ae3e474b78ecdb3ef832 100644 (file)
--- a/doc/install/crunch2-cloud/install-compute-node.html.textile.liquid
+++ b/doc/install/crunch2-cloud/install-compute-node.html.textile.liquid
@@ -125,10 +125,16 @@ Options:
        Path to the public key file that a-d-c will use to log into the compute node
    --mksquashfs-mem (default: 256M)
        Only relevant when using Singularity. This is the amount of memory mksquashfs is allowed to use.
-  --debug
-      Output debug information (default: false)
+  --nvidia-gpu-support (default: false)
+      Install all the necessary tooling for Nvidia GPU support
+  --debug (default: false)
+      Output debug information
  </code></pre></notextile>
  
+h2(#building). NVIDIA GPU support
+
+If you plan on using instance types with NVIDIA GPUs, add @--nvidia-gpu-support@ to the build command line.  Arvados uses the same compute image for both GPU and non-GPU instance types.  The GPU tooling is ignored when using the image with a non-GPU instance type.
+
  h2(#aws). Build an AWS image
  
  <notextile><pre><code>~$ <span class="userinput">./build.sh --json-file arvados-images-aws.json \
diff --git a/doc/install/crunch2-cloud/install-dispatch-cloud.html.textile.liquid b/doc/install/crunch2-cloud/install-dispatch-cloud.html.textile.liquid

index b4987f44373eb533e616c0b6f263cbf086f5562b..06a918dd37bcfab0df9a050dccbc1e4e0de68170 100644 (file)
--- a/doc/install/crunch2-cloud/install-dispatch-cloud.html.textile.liquid
+++ b/doc/install/crunch2-cloud/install-dispatch-cloud.html.textile.liquid
@@ -74,6 +74,27 @@ Add or update the following portions of your cluster configuration file, @config
  </code></pre>
  </notextile>
  
+h4. NVIDIA GPU support
+
+To specify instance types with NVIDIA GPUs, you must include an additional @CUDA@ section:
+
+<notextile>
+<pre><code>    InstanceTypes:
+      g4dn:
+        ProviderType: g4dn.xlarge
+        VCPUs: 4
+        RAM: 16GiB
+        IncludedScratch: 125GB
+        Price: 0.56
+        CUDA:
+          DriverVersion: "11.4"
+          HardwareCapability: "7.5"
+          DeviceCount: 1
+</code></pre>
+</notextile>
+
+The @DriverVersion@ is the version of the CUDA toolkit installed in your compute image (in X.Y format, do not include the patchlevel).  The @HardwareCapability@ is the CUDA compute capability of the GPUs available for this instance type.  The @DeviceCount@ is the number of GPU cores available for this instance type.
+
  h4. Minimal configuration example for Amazon EC2
  
  The <span class="userinput">ImageID</span> value is the compute node image that was built in "the previous section":install-compute-node.html#aws.
diff --git a/doc/install/crunch2-lsf/install-dispatch.html.textile.liquid b/doc/install/crunch2-lsf/install-dispatch.html.textile.liquid

index 7e44c8ec43c080fe26140003ab6bce9874b908b9..37adffd18d4e9bef5162614b015a3155df3333a5 100644 (file)
--- a/doc/install/crunch2-lsf/install-dispatch.html.textile.liquid
+++ b/doc/install/crunch2-lsf/install-dispatch.html.textile.liquid
@@ -64,17 +64,39 @@ Alternatively, you can arrange for the arvados-dispatch-lsf process to run as an
  
  h3(#SbatchArguments). Containers.LSF.BsubArgumentsList
  
-When arvados-dispatch-lsf invokes @bsub@, you can add arguments to the command by specifying @BsubArgumentsList@.  You can use this to send the jobs to specific cluster partitions or add resource requests.  Set @BsubArgumentsList@ to an array of strings.  For example:
+When arvados-dispatch-lsf invokes @bsub@, you can add arguments to the command by specifying @BsubArgumentsList@.  You can use this to send the jobs to specific cluster partitions or add resource requests.  Set @BsubArgumentsList@ to an array of strings.
+
+Template variables starting with % will be substituted as follows:
+
+%U uuid
+%C number of VCPUs
+%M memory in MB
+%T tmp in MB
+%G number of GPU devices (@runtime_constraints.cuda.device_count@)
+
+Use %% to express a literal %. The %%J in the default will be changed to %J, which is interpreted by @bsub@ itself.
+
+For example:
  
  <notextile>
  <pre>    Containers:
        LSF:
-        <code class="userinput">BsubArgumentsList: <b>["-C", "0", "-o", "/tmp/crunch-run.%J.out", "-e", "/tmp/crunch-run.%J.err"]</b></code>
+        <code class="userinput">BsubArgumentsList: <b>["-o", "/tmp/crunch-run.%%J.out", "-e", "/tmp/crunch-run.%%J.err", "-J", "%U", "-n", "%C", "-D", "%MMB", "-R", "rusage[mem=%MMB:tmp=%TMB] span[hosts=1]", "-R", "select[mem>=%MMB]", "-R", "select[tmp>=%TMB]", "-R", "select[ncpus>=%C]"]</b></code>
  </pre>
  </notextile>
  
  Note that the default value for @BsubArgumentsList@ uses the @-o@ and @-e@ arguments to write stdout/stderr data to files in @/tmp@ on the compute nodes, which is helpful for troubleshooting installation/configuration problems. Ensure you have something in place to delete old files from @/tmp@, or adjust these arguments accordingly.
  
+h3(#SbatchArguments). Containers.LSF.BsubCUDAArguments
+
+If the container requests access to GPUs (@runtime_constraints.cuda.device_count@ of the container request is greater than zero), the command line arguments in @BsubCUDAArguments@ will be added to the command line _after_ @BsubArgumentsList@.  This should consist of the additional @bsub@ flags your site requires to schedule the job on a node with GPU support.  Set @BsubCUDAArguments@ to an array of strings.  For example:
+
+<notextile>
+<pre>    Containers:
+      LSF:
+        <code class="userinput">BsubCUDAArguments: <b>["-gpu", "num=%G"]</b></code>
+</pre>
+</notextile>
  
  h3(#PollPeriod). Containers.PollInterval
  
diff --git a/doc/install/crunch2/install-compute-node-docker.html.textile.liquid b/doc/install/crunch2/install-compute-node-docker.html.textile.liquid

index 66bd85b7c5038073beaf95d342fde7c2060d90b2..6204d524f4e3c259f200e81315dd9d02e047dcf3 100644 (file)
--- a/doc/install/crunch2/install-compute-node-docker.html.textile.liquid
+++ b/doc/install/crunch2/install-compute-node-docker.html.textile.liquid
@@ -31,6 +31,8 @@ h2(#docker). Set up Docker
  
  See "Set up Docker":../install-docker.html
  
+{% include 'install_cuda' %}
+
  {% assign arvados_component = 'python-arvados-fuse crunch-run arvados-docker-cleaner' %}
  
  {% include 'install_compute_fuse' %}
diff --git a/doc/install/crunch2/install-compute-node-singularity.html.textile.liquid b/doc/install/crunch2/install-compute-node-singularity.html.textile.liquid

index 14f95e48fb3be044f5cd8d5bb1730d6a32fcfed0..e61b6cbe3783b192325b77c1454747abcc128a1e 100644 (file)
--- a/doc/install/crunch2/install-compute-node-singularity.html.textile.liquid
+++ b/doc/install/crunch2/install-compute-node-singularity.html.textile.liquid
@@ -32,6 +32,8 @@ This page describes how to configure a compute node so that it can be used to ru
  
  {% include 'install_packages' %}
  
+{% include 'install_cuda' %}
+
  h2(#singularity). Set up Singularity
  
  Follow the "Singularity installation instructions":https://sylabs.io/guides/3.7/user-guide/quick_start.html. Make sure @singularity@ and @mksquashfs@ are working:
diff --git a/doc/install/install-workbench2-app.html.textile.liquid b/doc/install/install-workbench2-app.html.textile.liquid

index c9a1c7012659fd1a3b629431441dcda69f018a68..63159611828268dc05b4fac99d4d243881571552 100644 (file)
--- a/doc/install/install-workbench2-app.html.textile.liquid
+++ b/doc/install/install-workbench2-app.html.textile.liquid
@@ -73,9 +73,9 @@ server {
  </code></pre>
  </notextile>
  
-h2. Vocabulary configuration (optional)
+h2. Vocabulary configuration
  
-Workbench2 can load a vocabulary file which lists available metadata properties for groups and collections.  To configure the property vocabulary definition, please visit the "Metadata Vocabulary Format":{{site.baseurl}}/admin/metadata-vocabulary.html page in the Admin section.
+Workbench2 will load, if available, a vocabulary definition which lists available metadata properties for groups and collections.  To learn how to configure the property vocabulary definition, please visit the "Metadata Vocabulary Format":{{site.baseurl}}/admin/metadata-vocabulary.html page in the Admin section.
  
  {% assign arvados_component = 'arvados-workbench2' %}
  
diff --git a/doc/user/cwl/arvados-vscode-training.html.md.liquid b/doc/user/cwl/arvados-vscode-training.html.md.liquid

index 1858f5b574260ebd0addc49379d3a9ce0dd4cb4e..804b3ad7bbd07febe180edbd73dc3c5dcd72ac99 100644 (file)
--- a/doc/user/cwl/arvados-vscode-training.html.md.liquid
+++ b/doc/user/cwl/arvados-vscode-training.html.md.liquid
@@ -86,9 +86,11 @@ Code (abbreviated "vscode") to develop CWL workflows on Arvados.
     1. Go to Vscode
     1. Vscode: Click on the `Terminal` menu
     1. Vscode: Click `Run Task…`
-   1. Vscode: Select `Configure Arvados`
-   1. Vscode: Paste text into the `Current API_TOKEN and API_HOST from Workbench` prompt
-   1. Vscode: This will create files called `API_HOST` and `API_TOKEN`
+   1. Vscode: Select `Set Arvados Host`
+   1. Vscode: Paste the value of API Host from the Workbench `Get API Token` dialog (found in the User menu) at the prompt
+   1. Vscode: Next, run task `Set Arvados Token`
+   1. Vscode: Paste the value of API Token from the Workbench `Get API Token` dialog
+   1. Vscode: These will create files called `API_HOST` and `API_TOKEN`
  
  ## 3. Register & run a workflow
  
diff --git a/doc/user/cwl/cwl-extensions.html.textile.liquid b/doc/user/cwl/cwl-extensions.html.textile.liquid

index 0580dca289f431a10ebaab322833368ddd3ef107..dd78e989fd52afe4ddd24940a00d76634f546a2d 100644 (file)
--- a/doc/user/cwl/cwl-extensions.html.textile.liquid
+++ b/doc/user/cwl/cwl-extensions.html.textile.liquid
@@ -58,7 +58,7 @@ hints:
        property1: value1
        property2: $(inputs.value2)
  
-  arv:CUDARequirement:
+  cwltool:CUDARequirement:
      cudaVersionMin: "11.0"
      cudaComputeCapabilityMin: "9.0"
      deviceCountMin: 1
@@ -153,7 +153,7 @@ table(table table-bordered table-condensed).
  |_. Field |_. Type |_. Description |
  |processProperties|key-value map, or list of objects with the fields {propertyName, propertyValue}|The properties that will be set on the container request.  May include expressions that reference `$(inputs)` of the current workflow or tool.|
  
-h2(#CUDARequirement). arv:CUDARequirement
+h2(#CUDARequirement). cwltool:CUDARequirement
  
  Request support for Nvidia CUDA GPU acceleration in the container.  Assumes that the CUDA runtime (SDK) is installed in the container, and the host will inject the CUDA driver libraries into the container (equal or later to the version requested).
  
diff --git a/doc/user/cwl/cwl-style.html.textile.liquid b/doc/user/cwl/cwl-style.html.textile.liquid

index bd07161ce3b203aca424b5287a48362d51d46787..853ed3b3e2be241d7e7c7dad9ae2c64312636449 100644 (file)
--- a/doc/user/cwl/cwl-style.html.textile.liquid
+++ b/doc/user/cwl/cwl-style.html.textile.liquid
@@ -11,9 +11,36 @@ SPDX-License-Identifier: CC-BY-SA-3.0
  
  h2(#performance). Performance
  
-To get the best perfomance from your workflows, be aware of the following Arvados features, behaviors, and best practices:
+To get the best perfomance from your workflows, be aware of the following Arvados features, behaviors, and best practices.
  
-If you have a sequence of short-running steps (less than 1-2 minutes each), use the Arvados extension "arv:RunInSingleContainer":cwl-extensions.html#RunInSingleContainer to avoid scheduling and data transfer overhead by running all the steps together at once.  To use this feature, @cwltool@ must be installed in the container image.
+Does your application support NVIDIA GPU acceleration?  Use "cwltool:CUDARequirement":cwl-extensions.html#CUDARequirement to request nodes with GPUs.
+
+If you have a sequence of short-running steps (less than 1-2 minutes each), use the Arvados extension "arv:RunInSingleContainer":cwl-extensions.html#RunInSingleContainer to avoid scheduling and data transfer overhead by running all the steps together in the same container on the same node.  To use this feature, @cwltool@ must be installed in the container image.  Example:
+
+{% codeblock as yaml %}
+class: Workflow
+cwlVersion: v1.0
+$namespaces:
+  arv: "http://arvados.org/cwl#"
+inputs:
+  file: File
+outputs: []
+requirements:
+  SubworkflowFeatureRequirement: {}
+steps:
+  subworkflow-with-short-steps:
+    in:
+      file: file
+    out: [out]
+    # This hint indicates that the subworkflow should be bundled and
+    # run in a single container, instead of the normal behavior, which
+    # is to run each step in a separate container.  This greatly
+    # reduces overhead if you have a series of short jobs, without
+    # requiring any changes the CWL definition of the sub workflow.
+    hints:
+      - class: arv:RunInSingleContainer
+    run: subworkflow-with-short-steps.cwl
+{% endcodeblock %}
  
  Avoid declaring @InlineJavascriptRequirement@ or @ShellCommandRequirement@ unless you specifically need them.  Don't include them "just in case" because they change the default behavior and may add extra overhead.
  
@@ -123,7 +150,7 @@ To write workflows that are easy to modify and portable across CWL runners (in t
  
  Workflows should always provide @DockerRequirement@ in the @hints@ or @requirements@ section.
  
-Build a reusable library of components.  Share tool wrappers and subworkflows between projects.  Make use of and contribute to "community maintained workflows and tools":https://github.com/common-workflow-language/workflows and tool registries such as "Dockstore":http://dockstore.org .
+Build a reusable library of components.  Share tool wrappers and subworkflows between projects.  Make use of and contribute to "community maintained workflows and tools":https://github.com/common-workflow-library and tool registries such as "Dockstore":http://dockstore.org .
  
  CommandLineTools wrapping custom scripts should represent the script as an input parameter with the script file as a default value.  Use @secondaryFiles@ for scripts that consist of multiple files.  For example:
  
diff --git a/lib/config/cmd.go b/lib/config/cmd.go

index eeab6ac8cd0dfad10f82f97880db0187f3e412bd..528d748c86f858d353698adc4ae573a569e12c4d 100644 (file)
--- a/lib/config/cmd.go
+++ b/lib/config/cmd.go
@@ -6,6 +6,7 @@ package config
  
  import (
         "bytes"
+       "errors"
         "flag"
         "fmt"
         "io"
@@ -13,6 +14,7 @@ import (
         "os/exec"
  
         "git.arvados.org/arvados.git/lib/cmd"
+       "git.arvados.org/arvados.git/sdk/go/arvados"
         "git.arvados.org/arvados.git/sdk/go/ctxlog"
         "github.com/ghodss/yaml"
         "github.com/sirupsen/logrus"
@@ -107,6 +109,34 @@ func (checkCommand) RunCommand(prog string, args []string, stdin io.Reader, stdo
         if err != nil {
                 return 1
         }
+
+       // Check for configured vocabulary validity.
+       for id, cc := range withDepr.Clusters {
+               if cc.API.VocabularyPath == "" {
+                       continue
+               }
+               vd, err := os.ReadFile(cc.API.VocabularyPath)
+               if err != nil {
+                       if errors.Is(err, os.ErrNotExist) {
+                               // If the vocabulary path doesn't exist, it might mean that
+                               // the current node isn't the controller; so it's not an
+                               // error.
+                               continue
+                       }
+                       logger.Errorf("Error reading vocabulary file %q for cluster %s: %s\n", cc.API.VocabularyPath, id, err)
+                       continue
+               }
+               mk := make([]string, 0, len(cc.Collections.ManagedProperties))
+               for k := range cc.Collections.ManagedProperties {
+                       mk = append(mk, k)
+               }
+               _, err = arvados.NewVocabulary(vd, mk)
+               if err != nil {
+                       logger.Errorf("Error loading vocabulary file %q for cluster %s:\n%s\n", cc.API.VocabularyPath, id, err)
+                       continue
+               }
+       }
+
         cmd := exec.Command("diff", "-u", "--label", "without-deprecated-configs", "--label", "relying-on-deprecated-configs", "/dev/fd/3", "/dev/fd/4")
         for _, obj := range []interface{}{withoutDepr, withDepr} {
                 y, _ := yaml.Marshal(obj)
diff --git a/lib/config/cmd_test.go b/lib/config/cmd_test.go

index a5cc28b80c978670a52cfceaa385b567fa73b245..7167982ccd7021f3b43a895190909694c493b7da 100644 (file)
--- a/lib/config/cmd_test.go
+++ b/lib/config/cmd_test.go
@@ -53,6 +53,7 @@ Clusters:
    SystemRootToken: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    API:
      MaxItemsPerResponse: 1234
+    VocabularyPath: /this/path/does/not/exist
    Collections:
      BlobSigningKey: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    PostgreSQL:
@@ -77,6 +78,60 @@ Clusters:
         c.Check(stderr.String(), check.Equals, "")
  }
  
+func (s *CommandSuite) TestCheck_VocabularyErrors(c *check.C) {
+       tmpFile, err := ioutil.TempFile("", "")
+       c.Assert(err, check.IsNil)
+       defer os.Remove(tmpFile.Name())
+       _, err = tmpFile.WriteString(`
+{
+ "tags": {
+  "IDfoo": {
+   "labels": [
+    {"label": "foo"}
+   ]
+  },
+  "IDfoo": {
+   "labels": [
+    {"label": "baz"}
+   ]
+  }
+ }
+}`)
+       c.Assert(err, check.IsNil)
+       tmpFile.Close()
+       vocPath := tmpFile.Name()
+       var stdout, stderr bytes.Buffer
+       in := `
+Clusters:
+ z1234:
+  ManagementToken: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+  SystemRootToken: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+  API:
+    MaxItemsPerResponse: 1234
+    VocabularyPath: ` + vocPath + `
+  Collections:
+    BlobSigningKey: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+  PostgreSQL:
+    Connection:
+      sslmode: require
+  Services:
+    RailsAPI:
+      InternalURLs:
+        "http://0.0.0.0:8000": {}
+  Workbench:
+    UserProfileFormFields:
+      color:
+        Type: select
+        Options:
+          fuchsia: {}
+    ApplicationMimetypesWithViewIcon:
+      whitespace: {}
+`
+       code := CheckCommand.RunCommand("arvados config-check", []string{"-config", "-"}, bytes.NewBufferString(in), &stdout, &stderr)
+       c.Check(code, check.Equals, 1)
+       c.Check(stderr.String(), check.Matches, `(?ms).*Error loading vocabulary file.*for cluster.*duplicate JSON key.*tags.IDfoo.*`)
+}
+
  func (s *CommandSuite) TestCheck_DeprecatedKeys(c *check.C) {
         var stdout, stderr bytes.Buffer
         in := `
diff --git a/lib/controller/localdb/conn.go b/lib/controller/localdb/conn.go

index 323e660c6f1e75d79721466e513dc8c611a52833..104cfe28f5e0cccfb9cf785955229b4f3b297fdf 100644 (file)
--- a/lib/controller/localdb/conn.go
+++ b/lib/controller/localdb/conn.go
@@ -99,7 +99,7 @@ func (conn *Conn) maybeRefreshVocabularyCache(logger logrus.FieldLogger) error {
  func (conn *Conn) loadVocabularyFile() error {
         vf, err := os.ReadFile(conn.cluster.API.VocabularyPath)
         if err != nil {
-               return fmt.Errorf("couldn't reading the vocabulary file: %v", err)
+               return fmt.Errorf("while reading the vocabulary file: %v", err)
         }
         mk := make([]string, 0, len(conn.cluster.Collections.ManagedProperties))
         for k := range conn.cluster.Collections.ManagedProperties {
diff --git a/sdk/go/arvados/vocabulary.go b/sdk/go/arvados/vocabulary.go

index 150091b308505b51f1830d33351cc7a706161650..bb1bec789f7f459a3cd49657c4df5337711cce19 100644 (file)
--- a/sdk/go/arvados/vocabulary.go
+++ b/sdk/go/arvados/vocabulary.go
@@ -7,8 +7,10 @@ package arvados
  import (
         "bytes"
         "encoding/json"
+       "errors"
         "fmt"
         "reflect"
+       "strconv"
         "strings"
  )
  
@@ -55,11 +57,32 @@ func NewVocabulary(data []byte, managedTagKeys []string) (voc *Vocabulary, err e
         }
         err = json.Unmarshal(data, &voc)
         if err != nil {
-               return nil, fmt.Errorf("invalid JSON format error: %q", err)
+               var serr *json.SyntaxError
+               if errors.As(err, &serr) {
+                       offset := serr.Offset
+                       errorMsg := string(data[:offset])
+                       line := 1 + strings.Count(errorMsg, "\n")
+                       column := offset - int64(strings.LastIndex(errorMsg, "\n")+len("\n"))
+                       return nil, fmt.Errorf("invalid JSON format: %q (line %d, column %d)", err, line, column)
+               }
+               return nil, fmt.Errorf("invalid JSON format: %q", err)
         }
         if reflect.DeepEqual(voc, &Vocabulary{}) {
                 return nil, fmt.Errorf("JSON data provided doesn't match Vocabulary format: %q", data)
         }
+
+       shouldReportErrors := false
+       errors := []string{}
+
+       // json.Unmarshal() doesn't error out on duplicate keys.
+       dupedKeys := []string{}
+       err = checkJSONDupedKeys(json.NewDecoder(bytes.NewReader(data)), nil, &dupedKeys)
+       if err != nil {
+               shouldReportErrors = true
+               for _, dk := range dupedKeys {
+                       errors = append(errors, fmt.Sprintf("duplicate JSON key %q", dk))
+               }
+       }
         voc.reservedTagKeys = make(map[string]bool)
         for _, managedKey := range managedTagKeys {
                 voc.reservedTagKeys[managedKey] = true
@@ -67,63 +90,122 @@ func NewVocabulary(data []byte, managedTagKeys []string) (voc *Vocabulary, err e
         for systemKey := range voc.systemTagKeys() {
                 voc.reservedTagKeys[systemKey] = true
         }
-       err = voc.validate()
+       validationErrs, err := voc.validate()
         if err != nil {
-               return nil, err
+               shouldReportErrors = true
+               errors = append(errors, validationErrs...)
+       }
+       if shouldReportErrors {
+               return nil, fmt.Errorf("%s", strings.Join(errors, "\n"))
         }
         return voc, nil
  }
  
-func (v *Vocabulary) validate() error {
-       if v == nil {
+func checkJSONDupedKeys(d *json.Decoder, path []string, errors *[]string) error {
+       t, err := d.Token()
+       if err != nil {
+               return err
+       }
+       delim, ok := t.(json.Delim)
+       if !ok {
                 return nil
         }
+       switch delim {
+       case '{':
+               keys := make(map[string]bool)
+               for d.More() {
+                       t, err := d.Token()
+                       if err != nil {
+                               return err
+                       }
+                       key := t.(string)
+
+                       if keys[key] {
+                               *errors = append(*errors, strings.Join(append(path, key), "."))
+                       }
+                       keys[key] = true
+
+                       if err := checkJSONDupedKeys(d, append(path, key), errors); err != nil {
+                               return err
+                       }
+               }
+               // consume closing '}'
+               if _, err := d.Token(); err != nil {
+                       return err
+               }
+       case '[':
+               i := 0
+               for d.More() {
+                       if err := checkJSONDupedKeys(d, append(path, strconv.Itoa(i)), errors); err != nil {
+                               return err
+                       }
+                       i++
+               }
+               // consume closing ']'
+               if _, err := d.Token(); err != nil {
+                       return err
+               }
+       }
+       if len(path) == 0 && len(*errors) > 0 {
+               return fmt.Errorf("duplicate JSON key(s) found")
+       }
+       return nil
+}
+
+func (v *Vocabulary) validate() ([]string, error) {
+       if v == nil {
+               return nil, nil
+       }
         tagKeys := map[string]string{}
         // Checks for Vocabulary strictness
         if v.StrictTags && len(v.Tags) == 0 {
-               return fmt.Errorf("vocabulary is strict but no tags are defined")
+               return nil, fmt.Errorf("vocabulary is strict but no tags are defined")
         }
         // Checks for collisions between tag keys, reserved tag keys
         // and tag key labels.
+       errors := []string{}
         for key := range v.Tags {
                 if v.reservedTagKeys[key] {
-                       return fmt.Errorf("tag key %q is reserved", key)
+                       errors = append(errors, fmt.Sprintf("tag key %q is reserved", key))
                 }
                 lcKey := strings.ToLower(key)
                 if tagKeys[lcKey] != "" {
-                       return fmt.Errorf("duplicate tag key %q", key)
+                       errors = append(errors, fmt.Sprintf("duplicate tag key %q", key))
                 }
                 tagKeys[lcKey] = key
                 for _, lbl := range v.Tags[key].Labels {
                         label := strings.ToLower(lbl.Label)
                         if tagKeys[label] != "" {
-                               return fmt.Errorf("tag label %q for key %q already seen as a tag key or label", lbl.Label, key)
+                               errors = append(errors, fmt.Sprintf("tag label %q for key %q already seen as a tag key or label", lbl.Label, key))
                         }
                         tagKeys[label] = lbl.Label
                 }
                 // Checks for value strictness
                 if v.Tags[key].Strict && len(v.Tags[key].Values) == 0 {
-                       return fmt.Errorf("tag key %q is configured as strict but doesn't provide values", key)
+                       errors = append(errors, fmt.Sprintf("tag key %q is configured as strict but doesn't provide values", key))
                 }
                 // Checks for collisions between tag values and tag value labels.
                 tagValues := map[string]string{}
                 for val := range v.Tags[key].Values {
                         lcVal := strings.ToLower(val)
                         if tagValues[lcVal] != "" {
-                               return fmt.Errorf("duplicate tag value %q for tag %q", val, key)
+                               errors = append(errors, fmt.Sprintf("duplicate tag value %q for tag %q", val, key))
                         }
                         // Checks for collisions between labels from different values.
                         tagValues[lcVal] = val
                         for _, tagLbl := range v.Tags[key].Values[val].Labels {
                                 label := strings.ToLower(tagLbl.Label)
                                 if tagValues[label] != "" && tagValues[label] != val {
-                                       return fmt.Errorf("tag value label %q for pair (%q:%q) already seen on value %q", tagLbl.Label, key, val, tagValues[label])
+                                       errors = append(errors, fmt.Sprintf("tag value label %q for pair (%q:%q) already seen on value %q", tagLbl.Label, key, val, tagValues[label]))
                                 }
                                 tagValues[label] = val
                         }
                 }
         }
-       return nil
+       if len(errors) > 0 {
+               return errors, fmt.Errorf("invalid vocabulary")
+       }
+       return nil, nil
  }
  
  func (v *Vocabulary) getLabelsToKeys() (labels map[string]string) {
diff --git a/sdk/go/arvados/vocabulary_test.go b/sdk/go/arvados/vocabulary_test.go

index 5a5189de2b3e1c9d66d6c818bd1ce1f11554bf17..84b9bf2295e62e6025e0c6f03847c4d3e666a9eb 100644 (file)
--- a/sdk/go/arvados/vocabulary_test.go
+++ b/sdk/go/arvados/vocabulary_test.go
@@ -6,6 +6,8 @@ package arvados
  
  import (
         "encoding/json"
+       "regexp"
+       "strings"
  
         check "gopkg.in/check.v1"
  )
@@ -56,7 +58,7 @@ func (s *VocabularySuite) SetUpTest(c *check.C) {
                         },
                 },
         }
-       err := s.testVoc.validate()
+       _, err := s.testVoc.validate()
         c.Assert(err, check.IsNil)
  }
  
@@ -258,14 +260,26 @@ func (s *VocabularySuite) TestNewVocabulary(c *check.C) {
                         },
                 },
                 {
-                       "Valid data, but uses reserved key",
+                       "Invalid JSON error with line & column numbers",
+                       `{"tags":{
+                               "aKey":{
+                                       "labels": [,{"label": "A label"}]
+                               }
+                       }}`,
+                       false, `invalid JSON format:.*\(line \d+, column \d+\)`, nil,
+               },
+               {
+                       "Invalid JSON with duplicate & reserved keys",
                         `{"tags":{
                                 "type":{
                                         "strict": false,
-                                       "labels": [{"label": "Type"}]
+                                       "labels": [{"label": "Class", "label": "Type"}]
+                               },
+                               "type":{
+                                       "labels": []
                                 }
                         }}`,
-                       false, "tag key.*is reserved", nil,
+                       false, "(?s).*duplicate JSON key \"tags.type.labels.0.label\"\nduplicate JSON key \"tags.type\"\ntag key \"type\" is reserved", nil,
                 },
         }
  
@@ -288,14 +302,14 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
         tests := []struct {
                 name       string
                 voc        *Vocabulary
-               errMatches string
+               errMatches []string
         }{
                 {
                         "Strict vocabulary, no keys",
                         &Vocabulary{
                                 StrictTags: true,
                         },
-                       "vocabulary is strict but no tags are defined",
+                       []string{"vocabulary is strict but no tags are defined"},
                 },
                 {
                         "Collision between tag key and tag key label",
@@ -312,7 +326,7 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "", // Depending on how the map is sorted, this could be one of two errors
+                       nil, // Depending on how the map is sorted, this could be one of two errors
                 },
                 {
                         "Collision between tag key and tag key label (case-insensitive)",
@@ -329,7 +343,7 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "", // Depending on how the map is sorted, this could be one of two errors
+                       nil, // Depending on how the map is sorted, this could be one of two errors
                 },
                 {
                         "Collision between tag key labels",
@@ -346,7 +360,7 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "tag label.*for key.*already seen.*",
+                       []string{"(?s).*tag label.*for key.*already seen.*"},
                 },
                 {
                         "Collision between tag value and tag value label",
@@ -367,7 +381,7 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "", // Depending on how the map is sorted, this could be one of two errors
+                       nil, // Depending on how the map is sorted, this could be one of two errors
                 },
                 {
                         "Collision between tag value and tag value label (case-insensitive)",
@@ -388,7 +402,7 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "", // Depending on how the map is sorted, this could be one of two errors
+                       nil, // Depending on how the map is sorted, this could be one of two errors
                 },
                 {
                         "Collision between tag value labels",
@@ -409,7 +423,7 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "tag value label.*for pair.*already seen.*on value.*",
+                       []string{"(?s).*tag value label.*for pair.*already seen.*on value.*"},
                 },
                 {
                         "Collision between tag value labels (case-insensitive)",
@@ -430,7 +444,7 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "tag value label.*for pair.*already seen.*on value.*",
+                       []string{"(?s).*tag value label.*for pair.*already seen.*on value.*"},
                 },
                 {
                         "Strict tag key, with no values",
@@ -443,15 +457,47 @@ func (s *VocabularySuite) TestValidationErrors(c *check.C) {
                                         },
                                 },
                         },
-                       "tag key.*is configured as strict but doesn't provide values",
+                       []string{"(?s).*tag key.*is configured as strict but doesn't provide values"},
+               },
+               {
+                       "Multiple errors reported",
+                       &Vocabulary{
+                               StrictTags: false,
+                               Tags: map[string]VocabularyTag{
+                                       "IDTAGANIMALS": {
+                                               Strict: true,
+                                               Labels: []VocabularyLabel{{Label: "Animal"}, {Label: "Creature"}},
+                                       },
+                                       "IDTAGSIZES": {
+                                               Labels: []VocabularyLabel{{Label: "Animal"}, {Label: "Size"}},
+                                       },
+                               },
+                       },
+                       []string{
+                               "(?s).*tag key.*is configured as strict but doesn't provide values.*",
+                               "(?s).*tag label.*for key.*already seen.*",
+                       },
                 },
         }
         for _, tt := range tests {
                 c.Log(c.TestName()+" ", tt.name)
-               err := tt.voc.validate()
+               validationErrs, err := tt.voc.validate()
                 c.Assert(err, check.NotNil)
-               if tt.errMatches != "" {
-                       c.Assert(err, check.ErrorMatches, tt.errMatches)
+               for _, errMatch := range tt.errMatches {
+                       seen := false
+                       for _, validationErr := range validationErrs {
+                               if regexp.MustCompile(errMatch).MatchString(validationErr) {
+                                       seen = true
+                                       break
+                               }
+                       }
+                       if len(validationErrs) == 0 {
+                               c.Assert(err, check.ErrorMatches, errMatch)
+                       } else {
+                               c.Assert(seen, check.Equals, true,
+                                       check.Commentf("Expected to see error matching %q:\n%s",
+                                               errMatch, strings.Join(validationErrs, "\n")))
+                       }
                 }
         }
  }
diff --git a/tools/salt-install/config_examples/multi_host/aws/states/custom_certs.sls b/tools/salt-install/config_examples/multi_host/aws/states/custom_certs.sls

index 371650339125a67083b3adb049fb5ac48893fe5b..3b2be59f368c353793bec874b9cf9dae1adde896 100644 (file)
--- a/tools/salt-install/config_examples/multi_host/aws/states/custom_certs.sls
+++ b/tools/salt-install/config_examples/multi_host/aws/states/custom_certs.sls
@@ -6,17 +6,18 @@
  {%- set dest_cert_dir = '/etc/nginx/ssl' %}
  {%- set certs = salt['pillar.get']('extra_custom_certs', [])  %}
  
+{% if certs %}
  extra_custom_certs_file_directory_certs_dir:
    file.directory:
      - name: /etc/nginx/ssl
      - require:
        - pkg: nginx_install
  
-{%- for cert in certs %}
-  {%- set cert_file = 'arvados-' ~ cert ~ '.pem' %}
-  {#- set csr_file = 'arvados-' ~ cert ~ '.csr' #}
-  {%- set key_file = 'arvados-' ~ cert ~ '.key' %}
-  {% for c in [cert_file, key_file] %}
+  {%- for cert in certs %}
+    {%- set cert_file = 'arvados-' ~ cert ~ '.pem' %}
+    {#- set csr_file = 'arvados-' ~ cert ~ '.csr' #}
+    {%- set key_file = 'arvados-' ~ cert ~ '.key' %}
+    {% for c in [cert_file, key_file] %}
  extra_custom_certs_file_copy_{{ c }}:
    file.copy:
      - name: {{ dest_cert_dir }}/{{ c }}
@@ -27,5 +28,6 @@ extra_custom_certs_file_copy_{{ c }}:
      - unless: cmp {{ dest_cert_dir }}/{{ c }} {{ orig_cert_dir }}/{{ c }}
      - require:
        - file: extra_custom_certs_file_directory_certs_dir
+    {%- endfor %}
    {%- endfor %}
-{%- endfor %}
+{%- endif %}
diff --git a/tools/salt-install/config_examples/single_host/multiple_hostnames/states/custom_certs.sls b/tools/salt-install/config_examples/single_host/multiple_hostnames/states/custom_certs.sls

index 371650339125a67083b3adb049fb5ac48893fe5b..3b2be59f368c353793bec874b9cf9dae1adde896 100644 (file)
--- a/tools/salt-install/config_examples/single_host/multiple_hostnames/states/custom_certs.sls
+++ b/tools/salt-install/config_examples/single_host/multiple_hostnames/states/custom_certs.sls
@@ -6,17 +6,18 @@
  {%- set dest_cert_dir = '/etc/nginx/ssl' %}
  {%- set certs = salt['pillar.get']('extra_custom_certs', [])  %}
  
+{% if certs %}
  extra_custom_certs_file_directory_certs_dir:
    file.directory:
      - name: /etc/nginx/ssl
      - require:
        - pkg: nginx_install
  
-{%- for cert in certs %}
-  {%- set cert_file = 'arvados-' ~ cert ~ '.pem' %}
-  {#- set csr_file = 'arvados-' ~ cert ~ '.csr' #}
-  {%- set key_file = 'arvados-' ~ cert ~ '.key' %}
-  {% for c in [cert_file, key_file] %}
+  {%- for cert in certs %}
+    {%- set cert_file = 'arvados-' ~ cert ~ '.pem' %}
+    {#- set csr_file = 'arvados-' ~ cert ~ '.csr' #}
+    {%- set key_file = 'arvados-' ~ cert ~ '.key' %}
+    {% for c in [cert_file, key_file] %}
  extra_custom_certs_file_copy_{{ c }}:
    file.copy:
      - name: {{ dest_cert_dir }}/{{ c }}
@@ -27,5 +28,6 @@ extra_custom_certs_file_copy_{{ c }}:
      - unless: cmp {{ dest_cert_dir }}/{{ c }} {{ orig_cert_dir }}/{{ c }}
      - require:
        - file: extra_custom_certs_file_directory_certs_dir
+    {%- endfor %}
    {%- endfor %}
-{%- endfor %}
+{%- endif %}
author	Tom Clegg <tom@curii.com>
	Thu, 27 Jan 2022 14:23:02 +0000 (09:23 -0500)
committer	Tom Clegg <tom@curii.com>
	Thu, 27 Jan 2022 14:23:02 +0000 (09:23 -0500)
doc/_includes/_install_cuda.liquid	[new file with mode: 0644]	patch \| blob
doc/admin/metadata-vocabulary.html.textile.liquid		patch \| blob \| history
doc/install/crunch2-cloud/install-compute-node.html.textile.liquid		patch \| blob \| history
doc/install/crunch2-cloud/install-dispatch-cloud.html.textile.liquid		patch \| blob \| history
doc/install/crunch2-lsf/install-dispatch.html.textile.liquid		patch \| blob \| history
doc/install/crunch2/install-compute-node-docker.html.textile.liquid		patch \| blob \| history
doc/install/crunch2/install-compute-node-singularity.html.textile.liquid		patch \| blob \| history
doc/install/install-workbench2-app.html.textile.liquid		patch \| blob \| history
doc/user/cwl/arvados-vscode-training.html.md.liquid		patch \| blob \| history
doc/user/cwl/cwl-extensions.html.textile.liquid		patch \| blob \| history
doc/user/cwl/cwl-style.html.textile.liquid		patch \| blob \| history
lib/config/cmd.go		patch \| blob \| history
lib/config/cmd_test.go		patch \| blob \| history
lib/controller/localdb/conn.go		patch \| blob \| history
sdk/go/arvados/vocabulary.go		patch \| blob \| history
sdk/go/arvados/vocabulary_test.go		patch \| blob \| history
tools/salt-install/config_examples/multi_host/aws/states/custom_certs.sls		patch \| blob \| history
tools/salt-install/config_examples/single_host/multiple_hostnames/states/custom_certs.sls		patch \| blob \| history