X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/bef270550d49533c46df8741db9f9dfa67afc1b8..4d8209df92198e6207d3d38cbb7a189cb319bf3c:/doc/user/topics/running-pipeline-command-line.html.textile.liquid
diff --git a/doc/user/topics/running-pipeline-command-line.html.textile.liquid b/doc/user/topics/running-pipeline-command-line.html.textile.liquid
index 8acce956e9..147fbf0d3a 100644
--- a/doc/user/topics/running-pipeline-command-line.html.textile.liquid
+++ b/doc/user/topics/running-pipeline-command-line.html.textile.liquid
@@ -4,119 +4,44 @@ navsection: userguide
title: "Running a pipeline on the command line"
...
-In "Writing a pipeline":{{ site.baseurl }}/user/tutorials/tutorial-firstscript.html, we learned how to create a pipeline template on the command-line. Let's create one that doesn't require any user input to start:
+This tutorial demonstrates how to use the command line to run the same pipeline as described in "running a pipeline using Workbench.":{{site.baseurl}}/user/tutorials/tutorial-pipeline-workbench.html
-
-~$ cat >the_pipeline <<EOF
-{
- "name":"Filter MD5 hash values",
- "components":{
- "do_hash":{
- "script":"hash.py",
- "script_parameters":{
- "input": "887cd41e9c613463eab2f0d885c6dd96+83"
- },
- "repository":"$USER",
- "script_version":"master"
- },
- "filter":{
- "script":"0-filter.py",
- "script_parameters":{
- "input":{
- "output_of":"do_hash"
- }
- },
- "repository":"$USER",
- "script_version":"master"
- }
- }
-}
-EOF
-~$ arv pipeline_template create --pipeline-template "$(cat the_pipeline)"
-~$ arv pipeline run --run-here --template qr1hi-p5p6p-xxxxxxxxxxxxxxx
-2013-12-16 14:08:40 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
-do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 queued 2013-12-16T14:08:40Z
-filter - -
+* "Tutorial align using bwa mem (qr1hi-p5p6p-itzkwxblfermlwv)":https://{{ site.arvados_workbench_host }}/pipeline_templates/qr1hi-p5p6p-itzkwxblfermlwv
+* "Tutorial chromosome 19 reference (2463fa9efeb75e099685528b3b9071e0+438)":https://{{ site.arvados_workbench_host }}/collections/2463fa9efeb75e099685528b3b9071e0+438
+* "Tutorial sample exome (3229739b505d2b878b62aed09895a55a+142)":https://{{ site.arvados_workbench_host }}/collections/3229739b505d2b878b62aed09895a55a+142
-2013-12-16 14:08:51 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
-do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 1ed9ed18ef31ad21bcabcfeff7777bae+162
-filter qr1hi-8i9sb-w5k40fztqgg9i2x queued 2013-12-16T14:08:50Z
-
-2013-12-16 14:09:01 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
-do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 1ed9ed18ef31ad21bcabcfeff7777bae+162
-filter qr1hi-8i9sb-w5k40fztqgg9i2x d3bcc2ee0f0ea31049000c721c0f3a2a+56
-
-~$ arv keep get 1ed9ed18ef31ad21bcabcfeff7777bae+162/md5sum.txt
-0f1d6bcf55c34bed7f92a805d2d89bbf 887cd41e9c613463eab2f0d885c6dd96+83/./alice.txt
-504938460ef369cd275e4ef58994cffe 887cd41e9c613463eab2f0d885c6dd96+83/./bob.txt
-8f3b36aff310e06f3c5b9e95678ff77a 887cd41e9c613463eab2f0d885c6dd96+83/./carol.txt
-~$ arv keep get d3bcc2ee0f0ea31049000c721c0f3a2a+56/0-filter.txt
-0f1d6bcf55c34bed7f92a805d2d89bbf 887cd41e9c613463eab2f0d885c6dd96+83/./alice.txt
-
~$ arv pipeline run --run-pipeline-here --template qr1hi-p5p6p-itzkwxblfermlwv bwa-mem::reference_collection=2463fa9efeb75e099685528b3b9071e0+438 bwa-mem::sample=3229739b505d2b878b62aed09895a55a+142
-h3. Running a pipeline with different parameters
+2014-07-25 18:05:26 +0000 -- pipeline_instance qr1hi-d1hrv-d14trje19pna7f2
+bwa-mem qr1hi-8i9sb-67n1qvsronmd2z6 queued 2014-07-25T18:05:25Z
-Notice that the pipeline template explicitly specifies the Keep locator for the input:
+2014-07-25 18:05:36 +0000 -- pipeline_instance qr1hi-d1hrv-d14trje19pna7f2
+bwa-mem qr1hi-8i9sb-67n1qvsronmd2z6 {:done=>0, :running=>1, :failed=>0, :todo=>0}
-
-...
- "do_hash":{
- "script_parameters":{
- "input": "887cd41e9c613463eab2f0d885c6dd96+83"
- },
- }
-...
+2014-07-25 18:05:46 +0000 -- pipeline_instance qr1hi-d1hrv-d14trje19pna7f2
+bwa-mem qr1hi-8i9sb-67n1qvsronmd2z6 49bae1066f4ebce72e2587a3efa61c7d+88
-You can specify values for pipeline component script_parameters like this:
-
-
-~$ arv pipeline run --run-here --template qr1hi-p5p6p-xxxxxxxxxxxxxxx do_hash::input=c1bad4b39ca5a924e481008009d94e32+210
-2013-12-17 20:31:24 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
-do_hash qr1hi-8i9sb-rffhuay4jryl2n2 queued 2013-12-17T20:31:24Z
-filter - -
+This instantiates your pipeline and displays periodic status reports in your terminal window. The new pipeline instance will also show up on the Workbench Dashboard.
-2013-12-17 20:31:34 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
-do_hash qr1hi-8i9sb-rffhuay4jryl2n2 {:done=>1, :running=>1, :failed=>0, :todo=>0}
-filter - -
+@arv pipeline run@ submits a job for each pipeline component as soon as the component's inputs are known (i.e., any dependencies are satsified). It terminates when there is no work left to do: this means either all components are satisfied and all jobs have completed successfully, _or_ one or more jobs have failed and it is therefore unproductive to submit any further jobs.
-2013-12-17 20:31:55 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
-do_hash qr1hi-8i9sb-rffhuay4jryl2n2 50cafdb29cc21dd6eaec85ba9e0c6134+56
-filter qr1hi-8i9sb-j347g1sqovdh0op queued 2013-12-17T20:31:55Z
-
-2013-12-17 20:32:05 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
-do_hash qr1hi-8i9sb-rffhuay4jryl2n2 50cafdb29cc21dd6eaec85ba9e0c6134+56
-filter qr1hi-8i9sb-j347g1sqovdh0op 490cd451c8108824b8a17e3723e1f236+19
-
-
-
-Now check the output:
+The Keep locators of the output of of the @bwa-mem@ components are available from the last status report shown above:
-~$ arv keep get 50cafdb29cc21dd6eaec85ba9e0c6134+56/md5sum.txt
-44b8ae3fde7a8a88d2f7ebd237625b4f c1bad4b39ca5a924e481008009d94e32+210/./var-GS000016015-ASM.tsv.bz2
-~$ arv keep get 490cd451c8108824b8a17e3723e1f236+19/0-filter.txt
+~$ arv keep ls -s 49bae1066f4ebce72e2587a3efa61c7d+88
+ 29226 ./HWI-ST1027_129_D0THKACXX.1_1.sam
-Since none of the files in the collection have hash code that start with 0, the output of the filter component is empty.
+h2. Re-using existing jobs and outputs
+
+When satisfying a pipeline component that is not marked as nondeterministic in the pipeline template, @arv pipeline run@ checks for a previously submitted job that satisfies the component's requirements. If such a job is found, @arv pipeline run@ uses the existing job rather than submitting a new one. Usually this is a safe way to conserve time and compute resources. In some cases it's desirable to re-run jobs with identical specifications (e.g., to demonstrate that a job or entire pipeline thought to be repeatable is in fact repeatable). For such cases, job re-use features can be disabled entirely by passing the @--no-reuse@ flag to the @arv pipeline run@ command.