X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/fd2c7f4ea5bdf6e03dfccd3747ea50d9e183ee1e..77f1129ec53edffb5ed5a859106675cf262977e8:/doc/user/tutorials/tutorial-firstscript.html.textile.liquid?ds=inline
diff --git a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
index 476fdf2a09..1269699a20 100644
--- a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
+++ b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
@@ -2,14 +2,14 @@
layout: default
navsection: userguide
navmenu: Tutorials
-title: "Writing a pipeline"
+title: "Writing a script"
...
-In this tutorial, we will write the "hash" script demonstrated in the first tutorial.
+This tutorial demonstrates how to write a crunch script using the Arvados Python SDK. The Arvados SDK gives access to advanced features that are not available through the @run-command@ wrapper, such as scheduling parallel tasks across nodes.
{% include 'tutorial_expectations' %}
-This tutorial uses *@you@* to denote your username. Replace *@you@* with your user name in all the following examples.
+This tutorial uses @$USER@ to denote your username. Replace @$USER@ with your username in all the following examples.
h2. Setting up Git
@@ -17,20 +17,20 @@ All Crunch scripts are managed through the Git revision control system. Before
~$ git config --global user.name "Your Name"
-~$ git config --global user.email you@example.com
+~$ git config --global user.email $USER@example.com
-git@git.{{ site.arvados_api_host }}:you.git
+git@git.{{ site.arvados_api_host }}:$USER.git
~$ cd $HOME # (or wherever you want to install)
-~$ git clone git@git.{{ site.arvados_api_host }}:you.git
-Cloning into 'you'...
+~$ git clone git@git.{{ site.arvados_api_host }}:$USER.git
+Cloning into '$USER'...
-~$ cd you
-~/you$ mkdir crunch_scripts
-~/you$ cd crunch_scripts
+~$ cd $USER
+~/$USER$ mkdir crunch_scripts
+~/$USER$ cd crunch_scripts
-~/you/crunch_scripts$ nano hash.py
+notextile. ~/$USER/crunch_scripts$ nano hash.py
Add the following code to compute the MD5 hash of each file in a collection:
@@ -61,7 +61,7 @@ Add the following code to compute the MD5 hash of each file in a collection:
Make the file executable:
-notextile. ~/you/crunch_scripts$ chmod +x hash.py
+notextile. ~/$USER/crunch_scripts$ chmod +x hash.py
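The body of @hash.py@ is not shown in this diff. As a rough illustration of the per-file MD5 computation the tutorial describes, the following standalone sketch hashes every file under a local directory using only the Python standard library. It does not use the Arvados Python SDK, and the function names @md5_of_file@ and @md5_of_tree@ are hypothetical, chosen only for this example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_tree(root):
    """Map the relative path of every regular file under root to its MD5 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Calling @md5_of_tree(".")@ from a directory returns a dictionary keyed by relative file path. A real crunch script would read its input from a collection rather than from the local filesystem.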
{% include 'notebox_begin' %}
The steps below describe how to execute the script after committing changes to Git. To run a script locally for testing, please see "debugging a crunch script":{{site.baseurl}}/user/topics/tutorial-job-debug.html.
@@ -70,12 +70,12 @@ The steps below describe how to execute the script after committing changes to G
Next, add the file to the staging area. This tells @git@ that the file should be included on the next commit.
-notextile. ~/you/crunch_scripts$ git add hash.py
+notextile. ~/$USER/crunch_scripts$ git add hash.py
Next, commit your changes. All staged changes are recorded into the local git repository:
-~/you/crunch_scripts$ git commit -m"my first script"
+~/$USER/crunch_scripts$ git commit -m"my first script"
[master (root-commit) 27fd88b] my first script
1 file changed, 45 insertions(+)
create mode 100755 crunch_scripts/hash.py
@@ -84,12 +84,12 @@ Next, commit your changes. All staged changes are recorded into the local git r
Finally, upload your changes to the Arvados server:
-~/you/crunch_scripts$ git push origin master
+~/$USER/crunch_scripts$ git push origin master
Counting objects: 4, done.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (4/4), 682 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
-To git@git.qr1hi.arvadosapi.com:you.git
+To git@git.qr1hi.arvadosapi.com:$USER.git
* [new branch] master -> master
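The add, commit, and push steps above can also be driven from a script. The sketch below is one possible way to do that with Python's @subprocess@ module; the helper names @publish_commands@ and @publish@ are hypothetical, and the defaults simply mirror the file name, commit message, remote, and branch used in this tutorial.

```python
import subprocess


def publish_commands(script="hash.py", message="my first script",
                     remote="origin", branch="master"):
    """Build the argument lists for the add / commit / push sequence."""
    return [
        ["git", "add", script],
        ["git", "commit", "-m", message],
        ["git", "push", remote, branch],
    ]


def publish(repo_dir, dry_run=True, **kwargs):
    """Return the git commands and, unless dry_run is set, run them in repo_dir.

    check=True makes subprocess.run raise CalledProcessError on the first
    failing command, so a failed commit is never followed by a push.
    """
    commands = publish_commands(**kwargs)
    if not dry_run:
        for argv in commands:
            subprocess.run(argv, cwd=repo_dir, check=True)
    return commands
```

With the default @dry_run=True@ nothing is executed, which makes it easy to inspect the commands before running them against the @crunch_scripts@ directory.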
@@ -98,7 +98,7 @@ h2. Create a pipeline template
Next, create a file that contains the pipeline definition:
-~/you/crunch_scripts$ cd ~
+~/$USER/crunch_scripts$ cd ~
~$ cat >the_pipeline <<EOF
{
"name":"My first pipeline",
@@ -113,7 +113,10 @@ Next, create a file that contains the pipeline definition:
},
"repository":"$USER",
"script_version":"master",
- "output_is_persistent":true
+ "output_is_persistent":true,
+ "runtime_constraints":{
+ "docker_image":"arvados/jobs"
+ }
}
}
}
@@ -121,17 +124,9 @@ EOF
-* @cat@ is a standard Unix utility that writes a sequence of input to standard output.
-* @<the_pipeline@ redirects standard output to a file called @the_pipeline@.
-* @"name"@ is a human-readable name for the pipeline.
-* @"components"@ is a set of scripts that make up the pipeline.
-* The component is listed with a human-readable name (@"do_hash"@ in this example).
-* @"repository"@ is the name of a git repository to search for the script version. You can access a list of available git repositories on the Arvados Workbench under "Code repositories":https://{{site.arvados_workbench_host}}/repositories. Your shell should automatically fill in @$USER@ with your login name, so that the final JSON has @"repository"@ pointed at your personal Git repository.
+* @"repository"@ is the name of a git repository to search for the script version. You can access a list of available git repositories on the Arvados Workbench under "Code repositories":https://{{site.arvados_workbench_host}}/repositories.
* @"script_version"@ specifies the version of the script that you wish to run. This can be in the form of an explicit Git revision hash, a tag, or a branch (in which case it will use the HEAD of the specified branch). Arvados logs the script version that was used in the run, enabling you to go back and re-run any past job with the guarantee that the exact same code will be used as was used in the previous run.
* @"script"@ specifies the filename of the script to run. Crunch expects to find this in the @crunch_scripts/@ subdirectory of the Git repository.
-* @"script_parameters"@ describes the parameters for the script. In this example, there is one parameter called @input@ which is @required@ and is a @Collection@.
-* @"output_is_persistent"@ indicates whether the output of the job is considered valuable. If this value is false (or not given), the output will be treated as intermediate data and eventually deleted to reclaim disk space.
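The fields described above can be assembled programmatically instead of through a shell heredoc. The following is a minimal sketch, assuming the component layout visible in this diff; @build_template@ is a hypothetical helper, and because the @"script_parameters"@ block is elided from the diff, it is passed in by the caller rather than hard-coded. The key names @required@ and @dataclass@ in the usage note are assumptions based on the prose description of the @input@ parameter.

```python
import json
import os


def build_template(script_parameters, repository=None):
    """Return the pipeline template as a dict, ready for json.dumps().

    repository defaults to the USER environment variable, mirroring how
    the shell expands $USER inside the heredoc.
    """
    repository = repository or os.environ.get("USER")
    if not repository:
        raise ValueError("no repository given and USER is not set")
    return {
        "name": "My first pipeline",
        "components": {
            "do_hash": {
                "script": "hash.py",
                # Supplied by the caller; the exact shape is not shown in this diff.
                "script_parameters": script_parameters,
                "repository": repository,
                "script_version": "master",
                "output_is_persistent": True,
                "runtime_constraints": {"docker_image": "arvados/jobs"},
            }
        },
    }
```

For example, @json.dumps(build_template({"input": {"required": True, "dataclass": "Collection"}}), indent=2)@ produces text that could be written to @the_pipeline@.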
Now, use @arv pipeline_template create@ to register your pipeline template in Arvados: