title: "Writing a CWL workflow"

{% include 'what_is_cwl' %}

{% include 'tutorial_expectations' %}

h2. Registering a workflow to use in Workbench

Use @--create-workflow@ to register a CWL workflow with Arvados. This enables you to share workflows with other Arvados users, and run them by clicking the <span class="btn btn-sm btn-primary"><i class="fa fa-fw fa-gear"></i> Run a process...</span> button on the Workbench Dashboard.

{% include 'register_cwl_workflow' %}

h2. Making workflows directly executable

You can make a workflow file directly executable (@cwl-runner@ should be an alias to @arvados-cwl-runner@) by adding the following line to the top of the file:

<pre><code>#!/usr/bin/env cwl-runner
<pre><code>~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">./bwa-mem.cwl bwa-mem-input.yml</span>
arvados-cwl-runner 1.0.20160628195002, arvados-python-client 0.1.20160616015107, cwltool 1.0.20160629140624
2016-06-30 14:56:36 arvados.arv-run[27002] INFO: Upload local files: "bwa-mem.cwl"
2016-06-30 14:56:36 arvados.arv-run[27002] INFO: Uploaded to qr1hi-4zz18-h7ljh5u76760ww2
2016-06-30 14:56:40 arvados.cwl-runner[27002] INFO: Submitted job qr1hi-8i9sb-fm2n3b1w0l6bskg
2016-06-30 14:56:41 arvados.cwl-runner[27002] INFO: Job bwa-mem.cwl (qr1hi-8i9sb-fm2n3b1w0l6bskg) is Running
2016-06-30 14:57:12 arvados.cwl-runner[27002] INFO: Job bwa-mem.cwl (qr1hi-8i9sb-fm2n3b1w0l6bskg) is Complete
2016-06-30 14:57:12 arvados.cwl-runner[27002] INFO: Overall process status is success
"path": "keep:54325254b226664960de07b3b9482349+154/HWI-ST1027_129_D0THKACXX.1_1.sam",
"checksum": "sha1$0dc46a3126d0b5d4ce213b5f0e86e2d05a54755a",

You can even make an input file directly executable the same way with the following two lines at the top:

<pre><code>#!/usr/bin/env cwl-runner
cwl:tool: <span class="userinput">bwa-mem.cwl</span>
<pre><code>~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">./bwa-mem-input.yml</span>
arvados-cwl-runner 1.0.20160628195002, arvados-python-client 0.1.20160616015107, cwltool 1.0.20160629140624
2016-06-30 14:56:36 arvados.arv-run[27002] INFO: Upload local files: "bwa-mem.cwl"
2016-06-30 14:56:36 arvados.arv-run[27002] INFO: Uploaded to qr1hi-4zz18-h7ljh5u76760ww2
2016-06-30 14:56:40 arvados.cwl-runner[27002] INFO: Submitted job qr1hi-8i9sb-fm2n3b1w0l6bskg
2016-06-30 14:56:41 arvados.cwl-runner[27002] INFO: Job bwa-mem.cwl (qr1hi-8i9sb-fm2n3b1w0l6bskg) is Running
2016-06-30 14:57:12 arvados.cwl-runner[27002] INFO: Job bwa-mem.cwl (qr1hi-8i9sb-fm2n3b1w0l6bskg) is Complete
2016-06-30 14:57:12 arvados.cwl-runner[27002] INFO: Overall process status is success
"path": "keep:54325254b226664960de07b3b9482349+154/HWI-ST1027_129_D0THKACXX.1_1.sam",
"checksum": "sha1$0dc46a3126d0b5d4ce213b5f0e86e2d05a54755a",
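
For either file to run directly, it also needs execute permission in addition to the header lines. The following is a minimal Python sketch of both steps, not part of the Arvados tooling: the helper name @make_directly_executable@ and the @example_input@ key are hypothetical, and the demo works on a throwaway temporary file.

```python
import os
import stat
import tempfile

SHEBANG = "#!/usr/bin/env cwl-runner"


def make_directly_executable(path, extra_header_lines=()):
    """Prepend the cwl-runner shebang (plus optional header lines) and
    set the execute bits on `path`.

    For an input file, pass the tool reference as an extra header line,
    e.g. ["cwl:tool: bwa-mem.cwl"]. A file that already starts with the
    shebang is left unchanged apart from its permissions.
    """
    with open(path, "r", encoding="utf-8") as f:
        body = f.read()

    if not body.startswith(SHEBANG):
        header = "\n".join([SHEBANG, *extra_header_lines]) + "\n"
        with open(path, "w", encoding="utf-8") as f:
            f.write(header + body)

    # Equivalent of `chmod a+x`: add execute permission for user, group, others.
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Demo on a temporary file so that no real workflow or input file is modified.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "bwa-mem-input.yml")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("example_input: 1\n")  # hypothetical input value

    make_directly_executable(demo_path, ["cwl:tool: bwa-mem.cwl"])
    make_directly_executable(demo_path, ["cwl:tool: bwa-mem.cwl"])  # second call is a no-op

    with open(demo_path, encoding="utf-8") as f:
        demo_text = f.read()
    demo_is_executable = bool(os.stat(demo_path).st_mode & stat.S_IXUSR)

print(demo_text, end="")
```

For a workflow file, call the helper without extra header lines so that only the shebang is added.
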
h2. Developing workflows

For an introduction and detailed documentation about writing CWL, see the "CWL User Guide":http://commonwl.org/v1.0/UserGuide.html and the "CWL Specification":http://commonwl.org/v1.0 .

To run on Arvados, a workflow should provide a @DockerRequirement@ in the @hints@ section.

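
As an illustration of that requirement, here is a small Python sketch that checks a parsed CWL document for a @DockerRequirement@ hint. It assumes the document is written as JSON (CWL accepts JSON as well as YAML), which keeps the example free of third-party libraries. The function name @has_docker_hint@ and the image name @example/bwa@ are placeholders, not values taken from the tutorial files.

```python
import json


def has_docker_hint(cwl_doc):
    """Return True if the parsed CWL document lists DockerRequirement under `hints`.

    CWL allows `hints` either as a map keyed by class name or as a list of
    objects that each carry a `class` field, so both forms are handled.
    Only the top-level document is inspected, not embedded workflow steps.
    """
    hints = cwl_doc.get("hints") or []
    if isinstance(hints, dict):
        # Map form: {"DockerRequirement": {...}}
        return "DockerRequirement" in hints
    # List form: [{"class": "DockerRequirement", ...}, ...]
    return any(
        isinstance(h, dict) and h.get("class") == "DockerRequirement"
        for h in hints
    )


# A pared-down tool description; "example/bwa" is a placeholder image name.
EXAMPLE_TOOL = json.loads("""
{
  "cwlVersion": "v1.0",
  "class": "CommandLineTool",
  "baseCommand": ["bwa", "mem"],
  "hints": {
    "DockerRequirement": {"dockerPull": "example/bwa"}
  },
  "inputs": {},
  "outputs": {}
}
""")

print(has_docker_hint(EXAMPLE_TOOL))  # True
```

A check like this could run before submitting a workflow, to catch a missing hint early.
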
When developing a workflow, it is often helpful to run it on the local host to avoid the overhead of submitting to the cluster. To execute a workflow only on the local host (without submitting jobs to an Arvados cluster), you can use the @cwltool@ command. Note that the input data must also be accessible on the local host; you can use @arv-get@ to fetch it from Keep.

<pre><code>~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">arv-get 2463fa9efeb75e099685528b3b9071e0+438/ .</span>
156 MiB / 156 MiB 100.0%
~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">arv-get ae480c5099b81e17267b7445e35b4bc7+180/ .</span>
23 MiB / 23 MiB 100.0%
~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">cwltool bwa-mem.cwl bwa-mem-input-local.yml</span>
cwltool 1.0.20160629140624
[job bwa-mem.cwl] /home/example/arvados/doc/user/cwl/bwa-mem$ docker \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.ann:/var/lib/cwl/job979368791_bwa-mem/19.fasta.ann:ro \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq:/var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq:ro \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.sa:/var/lib/cwl/job979368791_bwa-mem/19.fasta.sa:ro \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.amb:/var/lib/cwl/job979368791_bwa-mem/19.fasta.amb:ro \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.pac:/var/lib/cwl/job979368791_bwa-mem/19.fasta.pac:ro \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq:/var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq:ro \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.bwt:/var/lib/cwl/job979368791_bwa-mem/19.fasta.bwt:ro \
--volume=/home/example/arvados/doc/user/cwl/bwa-mem:/var/spool/cwl:rw \
--volume=/tmp/tmpgzyou9:/tmp:rw \
--workdir=/var/spool/cwl \
--env=HOME=/var/spool/cwl \
'@RG ID:arvados_tutorial PL:illumina SM:HWI-ST1027_129' \
/var/lib/cwl/job979368791_bwa-mem/19.fasta \
/var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq \
/var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq > /home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 100000 sequences (10000000 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4745, 1, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (154, 181, 214)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (34, 334)
[M::mem_pestat] mean and std.dev: (185.63, 44.88)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 394)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 100000 reads in 9.848 CPU sec, 9.864 real sec
[main] Version: 0.7.12-r1039
[main] CMD: bwa mem -t 1 -R @RG ID:arvados_tutorial PL:illumina SM:HWI-ST1027_129 /var/lib/cwl/job979368791_bwa-mem/19.fasta /var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq /var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq
[main] Real time: 10.061 sec; CPU: 10.032 sec
Final process status is success
"path": "/home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.sam",
"checksum": "sha1$0c668cca45fef02397bb5302880526d300ee4dac",

If you get the error @JavascriptException: Long-running script killed after 20 seconds.@, this may be due to the Dockerized Node.js engine taking too long to start. You may address this by installing Node.js locally (run @apt-get install nodejs@ on Debian or Ubuntu) or by specifying a longer timeout with the @--eval-timeout@ option. For example, run the workflow with @cwltool --eval-timeout=40@ for a 40-second timeout.

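
The local-run steps described above (fetch the inputs with @arv-get@, then run @cwltool@, optionally with a longer expression timeout) can be scripted. The sketch below only builds and prints the command lines; the helper names @arv_get_argv@ and @cwltool_argv@ are hypothetical, and the collection identifiers and file names are the ones used in this tutorial.

```python
import shlex


def arv_get_argv(collection_locator, dest="."):
    """Command line that downloads a whole collection from Keep into `dest`.

    The trailing slash after the locator matches the arv-get calls in the
    tutorial transcript.
    """
    return ["arv-get", collection_locator.rstrip("/") + "/", dest]


def cwltool_argv(workflow, input_file, eval_timeout=None):
    """Command line for a local cwltool run.

    `eval_timeout` (seconds) is passed through as --eval-timeout when given.
    """
    argv = ["cwltool"]
    if eval_timeout is not None:
        argv.append(f"--eval-timeout={eval_timeout}")
    argv += [workflow, input_file]
    return argv


commands = [
    arv_get_argv("2463fa9efeb75e099685528b3b9071e0+438"),
    arv_get_argv("ae480c5099b81e17267b7445e35b4bc7+180"),
    cwltool_argv("bwa-mem.cwl", "bwa-mem-input-local.yml", eval_timeout=40),
]

# Print the commands instead of running them, so the sketch has no side effects.
for argv in commands:
    print(shlex.join(argv))
```

To execute the steps rather than print them, each list could be passed to @subprocess.run(argv, check=True)@ on a machine where @arv-get@ and @cwltool@ are installed.
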
h2. Running a CWL workflow

h3. Running a workflow at command prompt

h3. Running a workflow using Workbench

The workflow can also be executed using Workbench. Go to the Workbench Dashboard, click the <span class="btn btn-sm btn-primary"><i class="fa fa-fw fa-gear"></i> Run a process...</span> button, and select the desired workflow.