Merge branch '13570-crunch-ram-doc' closes #13570

[arvados.git] / doc / user / cwl / cwl-style.html.textile.liquid
diff --git a/doc/user/cwl/cwl-style.html.textile.liquid b/doc/user/cwl/cwl-style.html.textile.liquid

index 5c6d04940f6b1928120494e8134681d956efac50..6b5147ddcbda8260f9431fa8ec201bc759a8aa41 100644 (file)
--- a/doc/user/cwl/cwl-style.html.textile.liquid
+++ b/doc/user/cwl/cwl-style.html.textile.liquid
@@ -3,6 +3,13 @@ layout: default
  navsection: userguide
  title: Best Practices for writing CWL
  ...
+{% comment %}
+Copyright (C) The Arvados Authors. All rights reserved.
+
+SPDX-License-Identifier: CC-BY-SA-3.0
+{% endcomment %}
+
+* To run on Arvados, a workflow should provide a @DockerRequirement@ in the @hints@ section.
  
  * Build a reusable library of components.  Share tool wrappers and subworkflows between projects.  Make use of and contribute to "community maintained workflows and tools":https://github.com/common-workflow-language/workflows and tool registries such as "Dockstore":http://dockstore.org .
  
@@ -106,6 +113,8 @@ steps:
          tmpdirMin: 90000
  </pre>
  
+* Available compute nodes types vary over time and across different cloud providers, so try to limit the RAM requirement to what the program actually needs.  However, if you need to target a specific compute node type, see this discussion on "calculating RAM request and choosing instance type for containers.":{{site.baseurl}}/api/execution.html#RAM
+
  * Instead of scattering separate steps, prefer to scatter over a subworkflow.
  
  With the following pattern, @step1@ has to wait for all samples to complete before @step2@ can start computing on any samples.  This means a single long-running sample can prevent the rest of the workflow from moving on:
@@ -166,3 +175,8 @@ steps:
            out: [out]
            run: tool3.cwl
  </pre>
+
+* When migrating from crunch v1 API (--api=jobs) to the crunch v2 API (--api=containers) there are a few differences in behavior:
+** The tool is limited to accessing only collections which are explicitly listed in the input, and further limited to only the subdirectories of collections listed in input.  For example, given an explicit file input @/dir/subdir/file1.txt@, a tool will not be able to implicitly access the file @/dir/file2.txt@.  Use @secondaryFiles@ or a @Directory@ input to describe trees of files.
+** Files listed in @InitialWorkDirRequirement@ appear in the output directory as normal files (not symlinks) but cannot be moved, renamed or deleted.  These files will be added to the output collection but without any additional copies of the underlying data.
+** Tools are disallowed network access by default.  Tools which require network access must include @arv:APIRequirement: {}@ in their @requirements@ section.