X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/715869b9a22e22ac68a7dbefa96f27150017f75d..2b5d9607892a48d32401ff59516e8d73234eee89:/doc/user/tutorials/running-external-program.html.textile.liquid

diff --git a/doc/user/tutorials/running-external-program.html.textile.liquid b/doc/user/tutorials/running-external-program.html.textile.liquid
index e286013247..56b71c05ee 100644
--- a/doc/user/tutorials/running-external-program.html.textile.liquid
+++ b/doc/user/tutorials/running-external-program.html.textile.liquid
@@ -1,67 +1,68 @@
 ---
 layout: default
 navsection: userguide
-navmenu: Tutorials
-title: "Running external programs"
-
+title: "Using Crunch to run external programs"
 ...
 
-h1. Running external programs
-
 This tutorial demonstrates how to use Crunch to run an external program by writing a wrapper using the Python SDK.
 
-*This tutorial assumes that you are "logged into an Arvados VM instance":{{site.basedoc}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.basedoc}}/user/getting_started/check-environment.html*
+*This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html*
 
 In this tutorial, you will use the external program @md5sum@ to compute hashes instead of the built-in Python library used in earlier tutorials.
 
 Start by entering the @crunch_scripts@ directory of your git repository:
 
-$ cd you/crunch_scripts
+~$ cd you/crunch_scripts
 
-
-Next, using your favorite text editor, create a new file called @run-md5sum.py@ in the @crunch_scripts@ directory. Add the following code to use the @md5sum@ program to compute the hash of each file in a collection:
-{% include 'run_md5sum_py' %}
+Next, using @nano@ or your favorite Unix text editor, create a new file called @run-md5sum.py@ in the @crunch_scripts@ directory.
+
+notextile. ~/you/crunch_scripts$ nano run-md5sum.py
+
+Add the following code to use the @md5sum@ program to compute the hash of each file in a collection:
+
+{% code 'run_md5sum_py' as python %}
 
 Make the file executable:
 
-notextile. $ chmod +x run-md5sum.py
+notextile. ~/you/crunch_scripts$ chmod +x run-md5sum.py
 
-Next, add the file to @git@ staging, commit and push:
+Next, use @git@ to stage the file, commit, and push:
 
-$ git add run-md5sum.py
-$ git commit -m"run external md5sum program"
-$ git push origin master
+~/you/crunch_scripts$ git add run-md5sum.py
+~/you/crunch_scripts$ git commit -m"run external md5sum program"
+~/you/crunch_scripts$ git push origin master
 
-You should now be able to run your new script using Crunch, with "script" referring to our new "run-md5sum.py" script.
+You should now be able to run your new script using Crunch, with @"script"@ referring to our new @run-md5sum.py@ script.
 
-$ cat >the_job <<EOF
+~/you/crunch_scripts$ cat >~/the_pipeline <<EOF
 {
- "script": "run-md5sum.py",
- "script_version": "you:master",
- "script_parameters":
- {
-  "input": "c1bad4b39ca5a924e481008009d94e32+210"
- }
-}
-EOF
-$ arv -h job create --job "$(cat the_job)"
-{
- ...
- "uuid":"qr1hi-xxxxx-xxxxxxxxxxxxxxx"
- ...
-}
-$ arv -h job get --uuid qr1hi-xxxxx-xxxxxxxxxxxxxxx
-{
- ...
- "output":"4d164b1658c261b9afc6b479130016a3+54",
- ...
+  "name":"Run external md5sum program",
+  "components":{
+    "do_hash":{
+      "script":"run-md5sum.py",
+      "script_parameters":{
+        "input":{
+          "required": true,
+          "dataclass": "Collection"
+        }
+      },
+      "repository":"$USER",
+      "script_version":"master"
+    }
+  }
 }
+EOF
+~/you/crunch_scripts$ arv pipeline_template create --pipeline-template "$(cat ~/the_pipeline)"
 
+
+(Your shell should automatically fill in @$USER@ with your login name. The JSON that gets saved should have @"repository"@ pointed at your personal git repository.)
+
+Your new pipeline template will appear on the Workbench "Compute %(rarr)→% Pipeline templates":https://{{ site.arvados_workbench_host }}/pipeline_instances page. You can run the "pipeline using Workbench":tutorial-pipeline-workbench.html.
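
Note on the diff above: the tutorial pulls its wrapper script in through the @run_md5sum_py@ include, so the script itself is not visible in this change. As a rough illustration of the idea the text describes (a Python wrapper that runs the external @md5sum@ program and collects its output), a minimal sketch might look like the following. It is not the contents of @run_md5sum_py@ and makes no Arvados SDK calls. The function names @parse_md5sum_line@ and @run_md5sum@ are hypothetical, and the sketch assumes a GNU-style @md5sum@ on @PATH@ and Python 3.7 or later.

```python
import subprocess


def parse_md5sum_line(line):
    """Split one line of md5sum output into (hex_digest, filename).

    Assumes the GNU layout: digest, one space, a mode character
    (' ' for text or '*' for binary), then the file name.
    """
    digest, _, rest = line.rstrip("\n").partition(" ")
    # rest[0] is the mode character; the file name follows it.
    return digest, rest[1:]


def run_md5sum(path):
    """Run the external md5sum program on one file and return its hex digest.

    Hypothetical helper for illustration only. check=True raises
    CalledProcessError if md5sum exits with a non-zero status.
    """
    result = subprocess.run(
        ["md5sum", path],
        check=True,
        capture_output=True,
        text=True,
    )
    digest, _name = parse_md5sum_line(result.stdout)
    return digest
```

Calling @run_md5sum@ on a local file should return the same digest that @md5sum@ prints in its first column for that file. A real Crunch script would also have to read its input collection and record its output through the SDK, which this sketch deliberately leaves out.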