X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/49a6ced3c7a540a7da7155ab1c3120a5227c620c..50128b53da4003912635b03fb27b5be2c5beaca1:/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
diff --git a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
index 9de1a9c61e..d4caafef5c 100644
--- a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
+++ b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
@@ -2,58 +2,28 @@
layout: default
navsection: userguide
navmenu: Tutorials
-title: "Writing a pipeline"
+title: "Writing a Crunch script"
...
-In this tutorial, we will write the "hash" script demonstrated in the first tutorial.
+{% include 'pipeline_deprecation_notice' %}
-*This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html*
+This tutorial demonstrates how to write a script using the Arvados Python SDK. The Arvados SDK supports access to advanced features not available using the @run-command@ wrapper, such as scheduling concurrent tasks across nodes.
-This tutorial uses *@you@* to denote your username. Replace *@you@* with your user name in all the following examples.
+{% include 'tutorial_expectations' %}
-h2. Setting up Git
+This tutorial uses @$USER@ to denote your username. Replace @$USER@ with your username in all of the following examples.
-All Crunch scripts are managed through the Git revision control system. Before you start using Git, you should do some basic configuration (you only need to do this the first time):
+Start by creating a directory called @tutorial@ in your home directory. Next, create a subdirectory called @crunch_scripts@ and change to that directory:
-~$ git config --global user.name "Your Name"
-~$ git config --global user.email you@example.com
git@git.{{ site.arvados_api_host }}:you.git
-~$ cd $HOME # (or wherever you want to install)
-~$ git clone git@git.{{ site.arvados_api_host }}:you.git
-Cloning into 'you'...
$ man gittutorial
-
-or *"search Google for Git tutorials":http://google.com/#q=git+tutorial*.
-{% include 'notebox_end' %}
-
-h2. Creating a Crunch script
-
-Start by entering the *@you@* directory created by @git clone@. Next create a subdirectory called @crunch_scripts@ and change to that directory:
-
-~$ cd you
-~/you$ mkdir crunch_scripts
-~/you$ cd crunch_scripts
+~$ cd $HOME
+~$ mkdir -p tutorial/crunch_scripts
+~$ cd tutorial/crunch_scripts
~/you/crunch_scripts$ nano hash.py
+notextile. ~/tutorial/crunch_scripts$ nano hash.py
Add the following code to compute the MD5 hash of each file in a collection:
@@ -61,85 +31,77 @@ Add the following code to compute the MD5 hash of each file in a collection:
Make the file executable:
-notextile. ~/you/crunch_scripts$ chmod +x hash.py
-
-{% include 'notebox_begin' %}
-The steps below describe how to execute the script after committing changes to Git. To run a script locally for testing, please see "debugging a crunch script":{{site.baseurl}}/user/topics/tutorial-job-debug.html.
-
-{% include 'notebox_end' %}
+notextile. ~/tutorial/crunch_scripts$ chmod +x hash.py
-Next, add the file to the staging area. This tells @git@ that the file should be included on the next commit.
-
-notextile. ~/you/crunch_scripts$ git add hash.py
-
-Next, commit your changes. All staged changes are recorded into the local git repository:
-
-~/you/crunch_scripts$ git commit -m"my first script"
-[master (root-commit) 27fd88b] my first script
- 1 file changed, 45 insertions(+)
- create mode 100755 crunch_scripts/hash.py
-~/you/crunch_scripts$ git push origin master
-Counting objects: 4, done.
-Compressing objects: 100% (2/2), done.
-Writing objects: 100% (4/4), 682 bytes, done.
-Total 4 (delta 0), reused 0 (delta 0)
-To git@git.qr1hi.arvadosapi.com:you.git
- * [new branch] master -> master
+~/tutorial/crunch_scripts$ cat >~/the_job <<EOF
+{
+ "repository":"",
+ "script":"hash.py",
+ "script_version":"$HOME/tutorial",
+ "script_parameters":{
+ "input":"c1bad4b39ca5a924e481008009d94e32+210"
+ }
+}
+EOF
+
~/you/crunch_scripts$ cd ~
-~$ cat >the_pipeline <<EOF
-{
- "name":"My first pipeline",
- "components":{
- "do_hash":{
- "script":"hash.py",
- "script_parameters":{
- "input":{
- "required": true,
- "dataclass": "Collection"
- }
- },
- "repository":"$USER",
- "script_version":"master",
- "output_is_persistent":true
- }
- }
-}
-EOF
-
+~/tutorial/crunch_scripts$ arv-crunch-job --job "$(cat ~/the_job)"
+2014-08-06_15:16:22 qr1hi-8i9sb-qyrat80ef927lam 14473 check slurm allocation
+2014-08-06_15:16:22 qr1hi-8i9sb-qyrat80ef927lam 14473 node localhost - 1 slots
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 start
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 script hash.py
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 script_version $HOME/tutorial
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 script_parameters {"input":"c1bad4b39ca5a924e481008009d94e32+210"}
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 runtime_constraints {"max_tasks_per_node":0}
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 start level 0
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 status: 0 done, 0 running, 1 todo
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 0 job_task qr1hi-ot0gb-lptn85mwkrn9pqo
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 0 child 14478 started on localhost.1
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 status: 0 done, 1 running, 0 todo
+2014-08-06_15:16:24 qr1hi-8i9sb-qyrat80ef927lam 14473 0 stderr crunchstat: Running [stdbuf --output=0 --error=0 /home/$USER/tutorial/crunch_scripts/hash.py]
+2014-08-06_15:16:24 qr1hi-8i9sb-qyrat80ef927lam 14473 0 child 14478 on localhost.1 exit 0 signal 0 success=true
+2014-08-06_15:16:24 qr1hi-8i9sb-qyrat80ef927lam 14473 0 success in 1 seconds
+2014-08-06_15:16:24 qr1hi-8i9sb-qyrat80ef927lam 14473 0 output
+2014-08-06_15:16:25 qr1hi-8i9sb-qyrat80ef927lam 14473 wait for last 0 children to finish
+2014-08-06_15:16:25 qr1hi-8i9sb-qyrat80ef927lam 14473 status: 1 done, 0 running, 1 todo
+2014-08-06_15:16:25 qr1hi-8i9sb-qyrat80ef927lam 14473 start level 1
+2014-08-06_15:16:25 qr1hi-8i9sb-qyrat80ef927lam 14473 status: 1 done, 0 running, 1 todo
+2014-08-06_15:16:25 qr1hi-8i9sb-qyrat80ef927lam 14473 1 job_task qr1hi-ot0gb-e3obm0lv6k6p56a
+2014-08-06_15:16:25 qr1hi-8i9sb-qyrat80ef927lam 14473 1 child 14504 started on localhost.1
+2014-08-06_15:16:25 qr1hi-8i9sb-qyrat80ef927lam 14473 status: 1 done, 1 running, 0 todo
+2014-08-06_15:16:26 qr1hi-8i9sb-qyrat80ef927lam 14473 1 stderr crunchstat: Running [stdbuf --output=0 --error=0 /home/$USER/tutorial/crunch_scripts/hash.py]
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 1 child 14504 on localhost.1 exit 0 signal 0 success=true
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 1 success in 10 seconds
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 1 output 8c20281b9840f624a486e4f1a78a1da8+105+A234be74ceb5ea31db6e11b6be26f3eb76d288ad0@54987018
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 wait for last 0 children to finish
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 status: 2 done, 0 running, 0 todo
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 release job allocation
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 Freeze not implemented
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 collate
+2014-08-06_15:16:36 qr1hi-8i9sb-qyrat80ef927lam 14473 collated output manifest text to send to API server is 105 bytes with access tokens
+2014-08-06_15:16:36 qr1hi-8i9sb-qyrat80ef927lam 14473 output hash c1b44b6dc41ef334cf1136033ca950e6+54
+2014-08-06_15:16:37 qr1hi-8i9sb-qyrat80ef927lam 14473 finish
+2014-08-06_15:16:38 qr1hi-8i9sb-qyrat80ef927lam 14473 log manifest is 7fe8cf1d45d438a3ca3ac4a184b7aff4+83
+
~$ arv pipeline_template create --pipeline-template "$(cat the_pipeline)"
+~/tutorial/crunch_scripts$ arv-ls c1b44b6dc41ef334cf1136033ca950e6+54
+./md5sum.txt
+~/tutorial/crunch_scripts$ arv-get c1b44b6dc41ef334cf1136033ca950e6+54/ .
+0 MiB / 0 MiB 100.0%
+~/tutorial/crunch_scripts$ cat md5sum.txt
+44b8ae3fde7a8a88d2f7ebd237625b4f c1bad4b39ca5a924e481008009d94e32+210/var-GS000016015-ASM.tsv.bz2
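
The @md5sum.txt@ output above pairs an MD5 digest with the path of each input file. As a rough local illustration of that per-file hashing step, the sketch below walks an ordinary directory and produces lines in the same "digest path" shape. It uses only the Python standard library and does not call the Arvados SDK; the function names @md5_of_file@ and @md5sum_lines@ are hypothetical and are not taken from @hash.py@.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5sum_lines(root):
    """Yield '<digest> <relative path>' for every file under root, in sorted order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so the traversal order is deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            yield "%s %s" % (md5_of_file(full), os.path.relpath(full, root))


if __name__ == "__main__":
    # Hypothetical demo data: a throwaway directory holding one small file.
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "sample.txt"), "wb") as f:
            f.write(b"hello")
        for line in md5sum_lines(workdir):
            print(line)
```

Reading in chunks keeps memory use flat for large inputs such as the compressed file in the example collection, which is why the sketch avoids loading a whole file at once.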