X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/bbf7272aa2b831102c47fc93f8966ec32e918205..448f667a574b50da096051a0d062b9059ab3609f:/doc/user/tutorials/running-external-program.html.textile.liquid

diff --git a/doc/user/tutorials/running-external-program.html.textile.liquid b/doc/user/tutorials/running-external-program.html.textile.liquid
deleted file mode 100644
index 6f1dae3de2..0000000000
--- a/doc/user/tutorials/running-external-program.html.textile.liquid
+++ /dev/null
@@ -1,63 +0,0 @@
----
-layout: default
-navsection: userguide
-title: "Writing a pipeline template"
-...
-
-This tutorial demonstrates how to construct a two stage pipeline template that uses the "bwa mem":http://bio-bwa.sourceforge.net/ tool to produce a "Sequence Alignment/Map (SAM)":https://samtools.github.io/ file, then uses the "Picard SortSam tool":http://picard.sourceforge.net/command-line-overview.shtml#SortSam to produce a BAM (Binary Alignment/Map) file.
-
-{% include 'tutorial_expectations' %}
-
-Use the following command to create an empty template using @arv create pipeline_template@:
-
-
~$ arv create pipeline_template
-
-
-This will open the template record in an interactive text editor (as specified by $EDITOR or $VISUAL; otherwise it defaults to @nano@). Now, update the contents of the editor with the following content:
-
-{% code 'tutorial_bwa_sortsam_pipeline' as javascript %}
-
-* @"name"@ is a human-readable name for the pipeline.
-* @"components"@ is a set of scripts or commands that make up the pipeline. Each component is given an identifier (@"bwa-mem"@ and @"SortSam"@ in this example).
-** Each entry in @"components"@ is an Arvados job submission. For more information about individual jobs, see the "job object reference":{{site.baseurl}}/api/schema/Job.html and "job create method.":{{site.baseurl}}/api/methods/jobs.html#create
-* @"repository"@, @"script_version"@, and @"script"@ indicate that we intend to use the external @"run-command"@ tool wrapper that is part of Arvados. These parameters are described in more detail in "Writing a script":tutorial-firstscript.html.
-* @"runtime_constraints"@ describes runtime resource requirements for the component.
-** @"docker_image"@ specifies the "Docker":https://www.docker.com/ runtime environment in which to run the job. The Docker image @"bcosc/arv-base-java"@ supplied here has the Java runtime environment, bwa, and samtools installed.
-** @"arvados_sdk_version"@ specifies a version of the Arvados SDK to load alongside the job's script.
-* @"script_parameters"@ describes the component parameters.
-** @"command"@ is the actual command line to invoke @bwa@ and then @SortSam@. The notation @$()@ denotes macro substitution commands evaluated by the run-command tool wrapper.
-** @"task.stdout"@ indicates that the output of this command should be captured to a file.
-** @$(node.cores)@ evaluates to the number of cores available on the compute node at the time the command is run.
-** @$(tmpdir)@ evaluates to the local path of the temporary directory the command should use for scratch data.
-** @$(reference_collection)@ evaluates to the value of the script parameter @"reference_collection"@.
-** @$(dir $(...))@ constructs a local path to a directory representing the supplied Arvados collection.
-** @$(file $(...))@ constructs a local path to a given file within the supplied Arvados collection.
-** @$(glob $(...))@ searches the specified path based on a file glob pattern and evaluates to the first result.
-** @$(basename $(...))@ evaluates to the supplied path with the leading path portion and trailing filename extensions stripped.
-* @"output_of"@ indicates that the @output@ of the @bwa-mem@ component should be used as the @"input"@ script parameter of @SortSam@. Arvados uses these dependencies between components to automatically determine the correct order to run them.
-
-When using @run-command@, the tool should write its output to the current working directory. The output will be automatically uploaded to Keep when the job completes.
-
-See the "run-command reference":{{site.baseurl}}/user/topics/run-command.html for more information about using @run-command@.
-
-*Note:* When trying to get job reproducibility without re-computation, you need to set these parameters to their specific hashes. Using a version such as master in @"arvados_sdk_version"@ will grab the latest version hash, which will allow Arvados to re-compute your job if the SDK gets updated.
-* @"arvados_sdk_version"@ : The latest version can be found on the "Arvados Python sdk repository":https://arvados.org/projects/arvados/repository/revisions/master/show/sdk/python under *Latest revisions*.
-* @"script_version"@ : The current version of your script in your git repository can be found by using the following command:
-
-
~$ git rev-parse HEAD
-
-
-* @"docker_image"@ : The Docker image hash is found on the "Collection page":https://cloud.curoverse.com/collections/qr1hi-4zz18-dov6im679g3jr1n as the *Content address*.
-
-h2. Running your pipeline
-
-Your new pipeline template should appear at the top of the Workbench "pipeline templates":{{site.arvados_workbench_host}}/pipeline_templates page. You can run your pipeline "using Workbench":tutorial-pipeline-workbench.html or the "command line.":{{site.baseurl}}/user/topics/running-pipeline-command-line.html
-
-Test data is available in the "Arvados Tutorial":{{site.arvados_workbench_host}}/projects/qr1hi-j7d0g-u7zg1qdaowykd8d project:
-
-* Choose "Tutorial chromosome 19 reference (2463fa9efeb75e099685528b3b9071e0+438)":{{site.arvados_workbench_host}}/collections/2463fa9efeb75e099685528b3b9071e0+438 for the "reference_collection" parameter
-* Choose "Tutorial sample exome (3229739b505d2b878b62aed09895a55a+142)":{{site.arvados_workbench_host}}/collections/3229739b505d2b878b62aed09895a55a+142 for the "sample" parameter
-
-For more information and examples for writing pipelines, see the "pipeline template reference":{{site.baseurl}}/api/schema/PipelineTemplate.html