_episodes/03-running.md

   1 ---
   2 title: "Running and Debugging a Workflow"
   3 teaching: 15
   4 exercises: 20
   5 questions:
   6 - "How do I provide input to run a workflow?"
   7 - "What should I do if the workflow fails?"
   8 objectives:
   9 - "Write an input parameter file."
  10 - "Execute the workflow."
  11 - "Diagnose workflow errors."
  12 keypoints:
  13 - "The input parameter file is a YAML file with values for each input parameter."
  14 - "A common reason a workflow step fails is insufficient RAM."
  15 - "Use ResourceRequirement to set the amount of RAM to be allocated to the job."
  16 - "Output parameter values are printed as JSON to standard output at the end of the run."
  17 ---
  18
  19 # The input parameter file
  20
  21 CWL input values are provided in the form of a YAML or JSON file.
  22 Create a file called `main-input.yaml`.
  23
  24 This file gives the values for parameters declared in the `inputs`
  25 section of our workflow.  Our workflow takes `fq`, `genome` and `gtf`
  26 as input parameters.
  27
  28 When setting inputs, Files and Directories are given as an object with
  29 `class: File` or `class: Directory`.  This distinguishes them from
  30 plain strings that may or may not be file paths.
  31
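To make the distinction concrete, the sketch below puts a plain string input next to a `File` input. The `sample_name` parameter is hypothetical: our workflow does not declare it, so it should not be added to the real input file. It only illustrates that a bare string is passed along as text, while an object with `class: File` marks `location` as a file reference.

```yaml
# Hypothetical string parameter (NOT one of our workflow's inputs):
# the value is treated as plain text, even though it looks like a path.
sample_name: rnaseq/raw_fastq/Mov10_oe_1.subset.fq

# File parameter: the object form tells the runner that `location`
# refers to a file it must locate and make available to the step.
fq:
  class: File
  location: rnaseq/raw_fastq/Mov10_oe_1.subset.fq
```
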
  32 Note: if you don't have example sequence data or the STAR index files, see [setup](/setup.html).
  33
  34 {% capture generic_input_tab_content %}
  35 main-input.yaml
  36 ```
  37 fq:
  38   class: File
  39   location: rnaseq/raw_fastq/Mov10_oe_1.subset.fq
  40   format: http://edamontology.org/format_1930
  41 genome:
  42   class: Directory
  43   location: hg19-chr1-STAR-index
  44 gtf:
  45   class: File
  46   location: rnaseq/reference_data/chr1-hg19_genes.gtf
  47 ```
  48 {: .language-yaml }
  49
  50 > ## Running the workflow
  51 >
  52 > Type this into the terminal:
  53 >
  54 > ```
  55 > cwl-runner main.cwl main-input.yaml
  56 > ```
  57 > {: .language-bash }
  58 >
  59 > This may take a few minutes to run, and will print some logging
  60 > output.  The logging you see, how to access other logs, and how to
  61 > track workflow progress will depend on your CWL runner platform.
  62 {: .challenge }
  63 {% endcapture %}
  64
  65 {% capture arvados_input_tab_content %}
  66 main-input.yaml
  67 ```
  68 fq:
  69   class: File
  70   location: keep:9178fe1b80a08a422dbe02adfd439764+925/raw_fastq/Mov10_oe_1.subset.fq
  71   format: http://edamontology.org/format_1930
  72 genome:
  73   class: Directory
  74   location: keep:02a12ce9e2707610991bd29d38796b57+2912
  75 gtf:
  76   class: File
  77   location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1-hg19_genes.gtf
  78 ```
  79 {: .language-yaml }
  80
  81 > ## Running the workflow
  82 >
  83 > If you are using VSCode with Arvados tasks, select `main.cwl` and
  84 > then use the `Run CWL Workflow on Arvados` task.
  85 >
  86 {: .challenge }
  87 {% endcapture %}
  88
  89 <div class="tabbed">
  90   <ul class="tab">
  91       <li><a href="#section-arvados-input">arvados</a></li>
  92       <li><a href="#section-generic-input">generic</a></li>
  93   </ul>
  94
  95   <section id="section-arvados-input">{{ arvados_input_tab_content | markdownify}}</section>
  96   <section id="section-generic-input">{{ generic_input_tab_content | markdownify}}</section>
  97 </div>
  98
  99 # Debugging the workflow
 100
 101 Depending on whether and how your workflow platform enforces memory
 102 limits, your workflow may fail.  Let's talk about what to do when a
 103 workflow fails.
 104
 105 A workflow can fail for many reasons, including bad input, bugs in
 106 the code, or running out of memory.  In our example, the STAR step
 107 may fail with an out-of-memory error.
 108
 109 To help diagnose these errors, the workflow runner produces logs that
 110 record what happened, either in the terminal or the web interface.
 111
 112 Errors like the following in the logs indicate an out-of-memory
 113 condition:
 114
 115 ```
 116 EXITING: fatal error trying to allocate genome arrays, exception thrown: std::bad_alloc
 117 Possible cause 1: not enough RAM. Check if you have enough RAM 5711762337 bytes
 118 Possible cause 2: not enough virtual memory allowed with ulimit. SOLUTION: run ulimit -v 5711762337
 119 ```
 120
 121 or
 122
 123 ```
 124 Container exited with code: 137
 125 ```
 126
 127 Exit code 137 means the process was killed with SIGKILL (128 + 9), which most commonly happens when a container runs out of memory and the operating system terminates it.
 128
 129 If this happens, you will need to request more RAM.
 130
 131 # Setting runtime RAM requirements
 132
 133 By default, a step is allocated 256 MB of RAM.  From the STAR error message:
 134
 135 > Check if you have enough RAM 5711762337 bytes
 136
 137 STAR needs about 5.7 GB just for its genome arrays, far more than 256 MB.  To
 138 request more RAM, add a `requirements` section with
 139 `ResourceRequirement` to the `STAR` step:
 140
 141 ```
 142   STAR:
 143     requirements:
 144       ResourceRequirement:
 145         ramMin: 9000
 146     run: bio-cwl-tools/STAR/STAR-Align.cwl
 147         ...
 148 ```
 149 {: .language-yaml }
 150
 151 Resource requirements you can set include:
 152
 153 * coresMin: number of CPU cores
 154 * ramMin: RAM (in mebibytes)
 155 * tmpdirMin: temporary directory available space (in mebibytes)
 156 * outdirMin: output directory available space (in mebibytes)
 157
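As a rough sizing check, the 5711762337 bytes reported by STAR come to about 5447 MiB (5711762337 / 1048576), so a `ramMin` of 9000 leaves headroom above the genome arrays. The sketch below shows all four fields set together on the `STAR` step. Only the `ramMin` value comes from this lesson; the other numbers are illustrative assumptions and should be adjusted for your data and platform.

```yaml
  STAR:
    requirements:
      ResourceRequirement:
        coresMin: 4       # assumed value: CPU cores to reserve
        ramMin: 9000      # from this lesson: RAM to reserve, in MiB
        tmpdirMin: 2048   # assumed value: temporary directory space, in MiB
        outdirMin: 2048   # assumed value: output directory space, in MiB
    run: bio-cwl-tools/STAR/STAR-Align.cwl
    # the step's existing `in` and `out` sections stay unchanged
```

These values are minimums that the runner reserves for the step. Whether exceeding them causes a failure depends on how your platform enforces limits.
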
 158 > ## Running the workflow
 159 >
 160 > Now that you've fixed the workflow, run it again.
 161 >
 162 {: .challenge }
 163
 164 > ## Episode solution
 165 > * <a href="../assets/answers/ep3/main.cwl">main.cwl</a>
 166 {: .solution}
 167
 168 # Workflow results
 169
 170 The CWL runner will print a results JSON object to standard output.  It will look something like this (it may include additional fields).
 171
 172 {% capture generic_output_tab_content %}
 173
 174 ```
 175 {
 176     "bam_sorted_indexed": {
 177         "location": "file:///home/username/rnaseq-cwl-training-exercises/Aligned.sortedByCoord.out.bam",
 178         "basename": "Aligned.sortedByCoord.out.bam",
 179         "class": "File",
 180         "size": 25370707,
 181         "secondaryFiles": [
 182             {
 183                 "basename": "Aligned.sortedByCoord.out.bam.bai",
 184                 "location": "file:///home/username/rnaseq-cwl-training-exercises/Aligned.sortedByCoord.out.bam.bai",
 185                 "class": "File",
 186                 "size": 176552,
 187             }
 188         ]
 189     },
 190     "qc_html": {
 191         "location": "file:///home/username/rnaseq-cwl-training-exercises/Mov10_oe_1.subset_fastqc.html",
 192         "basename": "Mov10_oe_1.subset_fastqc.html",
 193         "class": "File",
 194         "size": 383589
 195     }
 196 }
 197 ```
 198 {: .language-yaml }
 199 {% endcapture %}
 200
 201 {% capture arvados_output_tab_content %}
 202 ```
 203 {
 204     "bam_sorted_indexed": {
 205         "basename": "Aligned.sortedByCoord.out.bam",
 206         "class": "File",
 207         "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Aligned.sortedByCoord.out.bam",
 208         "secondaryFiles": [
 209             {
 210                 "basename": "Aligned.sortedByCoord.out.bam.bai",
 211                 "class": "File",
 212                 "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Aligned.sortedByCoord.out.bam.bai"
 213             }
 214         ],
 215         "size": 25370695
 216     },
 217     "qc_html": {
 218         "basename": "Mov10_oe_1.subset_fastqc.html",
 219         "class": "File",
 220         "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Mov10_oe_1.subset_fastqc.html",
 221         "size": 383589
 222     }
 223 }
 224 ```
 225 {: .language-yaml }
 226 {% endcapture %}
 227
 228 <div class="tabbed">
 229   <ul class="tab">
 230       <li><a href="#section-arvados-output">arvados</a></li>
 231       <li><a href="#section-generic-output">generic</a></li>
 232   </ul>
 233
 234   <section id="section-arvados-output">{{ arvados_output_tab_content | markdownify}}</section>
 235   <section id="section-generic-output">{{ generic_output_tab_content | markdownify}}</section>
 236 </div>
 237
 238 This has a structure similar to `main-input.yaml`.  Each output
 239 parameter is listed, with the `location` field of each `File` object
 240 indicating where the output file can be found.