X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/78b4e097593088a9c3614bf922a13e7eb454ea06..711711827bb0c3564836707bb7d4453c60c6a98c:/doc/user/tutorials/tutorial-firstscript.html.textile.liquid

diff --git a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
index d59275d5c4..3937698476 100644
--- a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
+++ b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid
@@ -4,6 +4,13 @@ navsection: userguide
 navmenu: Tutorials
 title: "Writing a Crunch script"
 ...
+{% comment %}
+Copyright (C) The Arvados Authors. All rights reserved.
+
+SPDX-License-Identifier: CC-BY-SA-3.0
+{% endcomment %}
+
+{% include 'pipeline_deprecation_notice' %}
 This tutorial demonstrates how to write a script using Arvados Python SDK. The Arvados SDK supports access to advanced features not available using the @run-command@ wrapper, such as scheduling concurrent tasks across nodes.
@@ -11,10 +18,11 @@ This tutorial demonstrates how to write a script using Arvados Python SDK. The
 This tutorial uses @$USER@ to denote your username. Replace @$USER@ with your user name in all the following examples.
-Start by creating a directory called @$USER@ . Next, create a subdirectory called @crunch_scripts@ and change to that directory:
+Start by creating a directory called @tutorial@ in your home directory. Next, create a subdirectory called @crunch_scripts@ and change to that directory:
-
~$ mkdir -p tutorial/crunch_scripts
+
~$ cd $HOME
+~$ mkdir -p tutorial/crunch_scripts
 ~$ cd tutorial/crunch_scripts
@@ -54,7 +62,7 @@ You can now run your script on your local workstation or VM using @arv-crunch-jo
 2014-08-06_15:16:22 qr1hi-8i9sb-qyrat80ef927lam 14473 node localhost - 1 slots
 2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 start
 2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 script hash.py
-2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 script_version /home/peter/peter
+2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 script_version $HOME/tutorial
 2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 script_parameters {"input":"c1bad4b39ca5a924e481008009d94e32+210"}
 2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 runtime_constraints {"max_tasks_per_node":0}
 2014-08-06_15:16:23 qr1hi-8i9sb-qyrat80ef927lam 14473 start level 0
@@ -76,26 +84,29 @@ You can now run your script on your local workstation or VM using @arv-crunch-jo
 2014-08-06_15:16:26 qr1hi-8i9sb-qyrat80ef927lam 14473 1 stderr crunchstat: Running [stdbuf --output=0 --error=0 /home/$USER/tutorial/crunch_scripts/hash.py]
 2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 1 child 14504 on localhost.1 exit 0 signal 0 success=true
 2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 1 success in 10 seconds
-2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 1 output 50cafdb29cc21dd6eaec85ba9e0c6134+56+Aef0f991b80fa0b75f802e58e70b207aa184d24ff@53f4bbd3
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 1 output 8c20281b9840f624a486e4f1a78a1da8+105+A234be74ceb5ea31db6e11b6be26f3eb76d288ad0@54987018
 2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 wait for last 0 children to finish
 2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 status: 2 done, 0 running, 0 todo
+2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 release job allocation
 2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 Freeze not implemented
 2014-08-06_15:16:35 qr1hi-8i9sb-qyrat80ef927lam 14473 collate
-2014-08-06_15:16:36 qr1hi-8i9sb-qyrat80ef927lam 14473 output d6338df28d6b8e5d14929833b417e20e+107+Adf1ce81222b6992ce5d33d8bfb28a6b5a1497898@53f4bbd4
+2014-08-06_15:16:36 qr1hi-8i9sb-qyrat80ef927lam 14473 collated output manifest text to send to API server is 105 bytes with access tokens
+2014-08-06_15:16:36 qr1hi-8i9sb-qyrat80ef927lam 14473 output hash c1b44b6dc41ef334cf1136033ca950e6+54
 2014-08-06_15:16:37 qr1hi-8i9sb-qyrat80ef927lam 14473 finish
 2014-08-06_15:16:38 qr1hi-8i9sb-qyrat80ef927lam 14473 log manifest is 7fe8cf1d45d438a3ca3ac4a184b7aff4+83
-Although the job runs locally, the output of the job has been saved to Keep, the Arvados file store. The "output" line (third from the bottom) provides the "Keep locator":{{site.baseurl}}/user/tutorials/tutorial-keep-get.html to which the script's output has been saved. Copy the output identifier and use @arv-ls@ to list the contents of your output collection, and @arv-get@ to download it to the current directory:
+Although the job runs locally, the output of the job has been saved to Keep, the Arvados file store. The "output hash" line (third from the bottom) provides the portable data hash of the Arvados collection where the script's output has been saved. Copy the output hash and use @arv-ls@ to list the contents of your output collection, and @arv-get@ to download it to the current directory:
-
~/tutorial/crunch_scripts$ arv-ls d6338df28d6b8e5d14929833b417e20e+107+Adf1ce81222b6992ce5d33d8bfb28a6b5a1497898@53f4bbd4
+
~/tutorial/crunch_scripts$ arv-ls c1b44b6dc41ef334cf1136033ca950e6+54
 ./md5sum.txt
-~/tutorial/crunch_scripts$ arv-get d6338df28d6b8e5d14929833b417e20e+107+Adf1ce81222b6992ce5d33d8bfb28a6b5a1497898@53f4bbd4/ .
+~/tutorial/crunch_scripts$ arv-get c1b44b6dc41ef334cf1136033ca950e6+54/ .
+0 MiB / 0 MiB 100.0%
 ~/tutorial/crunch_scripts$ cat md5sum.txt
-44b8ae3fde7a8a88d2f7ebd237625b4f c1bad4b39ca5a924e481008009d94e32+210/./var-GS000016015-ASM.tsv.bz2
+44b8ae3fde7a8a88d2f7ebd237625b4f c1bad4b39ca5a924e481008009d94e32+210/var-GS000016015-ASM.tsv.bz2
 
-Running locally is convenient for development and debugging, as it permits a fast iterative development cycle. Your job run is also recorded by Arvados, and will appear in the *Recent jobs and pipelines* panel on the "Workbench Dashboard":https://{{site.arvados_workbench_host}}. This provides limited provenance, by recording the input parameters, the execution log, and the output. However, running locally does not allow you to scale out to multiple nodes, and does not store the complete system snapshot required to achieve reproducibility; to do that you need to "submit a job to the Arvados cluster":{{site.baseurl}}/user/tutorials/tutorial-submit-job.html.
+Running locally is convenient for development and debugging, as it permits a fast iterative development cycle. Your job run is also recorded by Arvados, and will appear in the *Recent jobs and pipelines* panel on the "Workbench Dashboard":{{site.arvados_workbench_host}}. This provides limited provenance, by recording the input parameters, the execution log, and the output. However, running locally does not allow you to scale out to multiple nodes, and does not store the complete system snapshot required to achieve reproducibility; to do that you need to "submit a job to the Arvados cluster":{{site.baseurl}}/user/tutorials/tutorial-submit-job.html.
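
Reviewer note: this change replaces identifiers of the form @hash+size+A...@...@ with the shorter @hash+size@ form in the tutorial text. The sketch below may help when checking which form a given string has. It is not part of the Arvados SDK: the names @Locator@, @parse_locator@, and @is_signed@ are hypothetical, and it assumes that a @+A...@ suffix is a permission hint. It only splits the string and does not compute or verify a portable data hash.

```python
import re
from typing import List, NamedTuple


class Locator(NamedTuple):
    """Pieces of an identifier such as 'c1b44b6d...+54' (hypothetical helper type)."""
    md5: str          # 32 lowercase hex digits
    size: int         # decimal size that follows the first '+'
    hints: List[str]  # any further '+'-separated fields, kept as opaque strings


# Assumption: hints start with an uppercase letter and contain no '+'.
_LOCATOR_RE = re.compile(r"^([0-9a-f]{32})\+(\d+)((?:\+[A-Z][^+]*)*)$")


def parse_locator(text: str) -> Locator:
    """Split an identifier into hash, size, and hint fields."""
    match = _LOCATOR_RE.match(text)
    if match is None:
        raise ValueError("not a hash+size[+hint...] string: %r" % text)
    md5, size, hint_text = match.groups()
    hints = [field for field in hint_text.split("+") if field]
    return Locator(md5, int(size), hints)


def is_signed(locator: Locator) -> bool:
    """True if any hint begins with 'A' (assumed to mark a permission hint)."""
    return any(hint.startswith("A") for hint in locator.hints)


if __name__ == "__main__":
    old = parse_locator(
        "d6338df28d6b8e5d14929833b417e20e+107"
        "+Adf1ce81222b6992ce5d33d8bfb28a6b5a1497898@53f4bbd4"
    )
    new = parse_locator("c1b44b6dc41ef334cf1136033ca950e6+54")
    print(old.size, is_signed(old))
    print(new.size, is_signed(new))
```

Both sample strings are taken from the removed and added log lines in this diff. Hints are kept as opaque strings on purpose, so the sketch makes no claim about their internal layout.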