---
layout: default
navsection: userguide
title: "Developing workflows with CWL"
...
{% comment %}
Copyright (C) The Arvados Authors. All rights reserved.

SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}

{% include 'what_is_cwl' %}

{% include 'tutorial_expectations' %}

h2. Developing workflows

For an introduction and detailed documentation about writing CWL, see the "CWL User Guide":https://www.commonwl.org/user_guide and the "CWL Specification":http://commonwl.org/v1.2 .

See "Writing Portable High-Performance Workflows":{{site.baseurl}}/user/cwl/cwl-style.html and "Arvados CWL Extensions":{{site.baseurl}}/user/cwl/cwl-extensions.html for additional information about using CWL on Arvados.

See "Repositories of CWL Tools and Workflows":https://www.commonwl.org/#Repositories_of_CWL_Tools_and_Workflows for links to repositories of existing tools for reuse.

See "Software for working with CWL":https://www.commonwl.org/#Software_for_working_with_CWL for links to software tools to help create CWL documents.

h2. Using cwltool

When developing a workflow, it is often helpful to run it on the local host to avoid the overhead of submitting to the cluster.  To execute a workflow only on the local host (without submitting jobs to an Arvados cluster), use the @cwltool@ command.  Note that @cwltool@ needs the input data to be accessible on the local file system, so either mount Keep with @arv-mount@ or fetch the data from Keep with @arv-get@.

<notextile>
<pre><code>~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">arv-get 2463fa9efeb75e099685528b3b9071e0+438/ .</span>
156 MiB / 156 MiB 100.0%
~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">arv-get ae480c5099b81e17267b7445e35b4bc7+180/ .</span>
23 MiB / 23 MiB 100.0%
~/arvados/doc/user/cwl/bwa-mem$ <span class="userinput">cwltool bwa-mem.cwl bwa-mem-input-local.yml</span>
cwltool 1.0.20160629140624
[job bwa-mem.cwl] /home/example/arvados/doc/user/cwl/bwa-mem$ docker \
    run \
    -i \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.ann:/var/lib/cwl/job979368791_bwa-mem/19.fasta.ann:ro \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq:/var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq:ro \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.sa:/var/lib/cwl/job979368791_bwa-mem/19.fasta.sa:ro \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.amb:/var/lib/cwl/job979368791_bwa-mem/19.fasta.amb:ro \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.pac:/var/lib/cwl/job979368791_bwa-mem/19.fasta.pac:ro \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq:/var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq:ro \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem/19.fasta.bwt:/var/lib/cwl/job979368791_bwa-mem/19.fasta.bwt:ro \
    --volume=/home/example/arvados/doc/user/cwl/bwa-mem:/var/spool/cwl:rw \
    --volume=/tmp/tmpgzyou9:/tmp:rw \
    --workdir=/var/spool/cwl \
    --read-only=true \
    --log-driver=none \
    --user=1001 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/var/spool/cwl \
    biodckr/bwa \
    bwa \
    mem \
    -t \
    1 \
    -R \
    '@RG        ID:arvados_tutorial     PL:illumina     SM:HWI-ST1027_129' \
    /var/lib/cwl/job979368791_bwa-mem/19.fasta \
    /var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq \
    /var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq > /home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 100000 sequences (10000000 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4745, 1, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (154, 181, 214)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (34, 334)
[M::mem_pestat] mean and std.dev: (185.63, 44.88)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 394)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 100000 reads in 9.848 CPU sec, 9.864 real sec
[main] Version: 0.7.12-r1039
[main] CMD: bwa mem -t 1 -R @RG ID:arvados_tutorial     PL:illumina     SM:HWI-ST1027_129 /var/lib/cwl/job979368791_bwa-mem/19.fasta /var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.fastq /var/lib/cwl/job979368791_bwa-mem/HWI-ST1027_129_D0THKACXX.1_2.fastq
[main] Real time: 10.061 sec; CPU: 10.032 sec
Final process status is success
{
    "aligned_sam": {
        "size": 30738959,
        "path": "/home/example/arvados/doc/user/cwl/bwa-mem/HWI-ST1027_129_D0THKACXX.1_1.sam",
        "checksum": "sha1$0c668cca45fef02397bb5302880526d300ee4dac",
        "class": "File"
    }
}
</code></pre>
</notextile>
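
The JSON object printed at the end of the run records a @path@ and a @sha1$@-prefixed @checksum@ for each output file.  As a minimal sketch, the shell function below compares that checksum against the file on disk.  The function name @verify_cwl_output@ and the file name @output.json@ are hypothetical, and the parsing assumes the pretty-printed, single-output JSON shape shown above and a system that provides @sha1sum@; it is not a general JSON parser.

```shell
# Sketch only: verify_cwl_output is a hypothetical helper, not part of
# cwltool or Arvados. It assumes one "path" and one "checksum" entry,
# each on its own line, as in the pretty-printed output shown above.
verify_cwl_output() {
    json_file=$1

    # Extract the first "path" value and the hex digest after "sha1$".
    path=$(sed -n 's/.*"path": *"\([^"]*\)".*/\1/p' "$json_file" | head -n 1)
    expected=$(sed -n 's/.*"checksum": *"sha1\$\([0-9a-f]*\)".*/\1/p' "$json_file" | head -n 1)

    if [ -z "$path" ] || [ -z "$expected" ]; then
        echo "could not find path/checksum in $json_file" >&2
        return 2
    fi

    # Recompute the SHA-1 of the file on disk and compare.
    actual=$(sha1sum "$path" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK $path"
    else
        echo "MISMATCH $path" >&2
        return 1
    fi
}
```

To use it, redirect the standard output of @cwltool@ (which carries only the final JSON object; log messages go to standard error) to a file such as @output.json@, then run @verify_cwl_output output.json@.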

If you get the error @JavascriptException: Long-running script killed after 20 seconds.@, the Dockerized Node.js engine may be taking too long to start.  You can address this by installing Node.js locally (run @apt-get install nodejs@ on Debian or Ubuntu) or by specifying a longer timeout with the @--eval-timeout@ option.  For example, run the workflow with @cwltool --eval-timeout=40@ for a 40-second timeout.
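
The two workarounds can be combined in a small wrapper.  The sketch below is only an illustration: @eval_timeout_opts@ is a hypothetical helper name, and it assumes that a @nodejs@ or @node@ executable found on @PATH@ is what @cwltool@ would use in place of the Dockerized engine.  It prints a longer @--eval-timeout@ only when no local Node.js is found.

```shell
# Sketch only: eval_timeout_opts is a hypothetical helper. It assumes a
# local Node.js is visible on PATH as "nodejs" or "node".
eval_timeout_opts() {
    timeout=${1:-40}    # timeout in seconds; defaults to 40
    if command -v nodejs >/dev/null 2>&1 || command -v node >/dev/null 2>&1; then
        # Local Node.js found: print nothing and keep the default timeout.
        :
    else
        # No local Node.js: ask for a longer expression timeout.
        echo "--eval-timeout=$timeout"
    fi
}
```

The output can be spliced into the command line, for example @cwltool $(eval_timeout_opts 40)@ followed by the workflow and input file.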