title: "Tutorial: Your first job"

h1. Tutorial: Your first job

Here you will use the "arv" command line tool to run a simple Crunch script on some sample data.
* Log in to a VM "using SSH":ssh-access.html
* Put an "API token":api-tokens.html in your @ARVADOS_API_TOKEN@ environment variable
* Put the API host name in your @ARVADOS_API_HOST@ environment variable

If everything is set up correctly, the command @arv -h user current@ will display your account information.
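Before calling @arv@, it can help to confirm that both variables are actually set. The following is a minimal sketch, not part of the @arv@ tool: the function name @check_arvados_env@ is hypothetical, and the check only looks for empty values, not for a valid token or a reachable host.

```shell
# Hypothetical helper (not part of arv): report which of the two
# required variables are unset or empty.
check_arvados_env() {
  missing=0
  for name in ARVADOS_API_TOKEN ARVADOS_API_HOST; do
    # Indirect variable lookup that works in any POSIX shell.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

if check_arvados_env; then
  echo "environment looks complete"
else
  echo "set the variables listed above, then retry" >&2
fi
```

The function returns a non-zero status when anything is missing, so it can guard later commands, for example @check_arvados_env && arv -h user current@.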
The @arv@ tool depends on a few Ruby gems and tells you which ones to install if any are missing. If you install them as a non-root user, set @GEM_HOME@ before you run @gem install@:

export GEM_HOME=~/.gem
We will run the "hash" program, which computes the MD5 hash of each file in a collection.

Pick a data collection. We'll use @33a9f3842b01ea3fdf27cc582f5ea2af@ here.

the_collection=33a9f3842b01ea3fdf27cc582f5ea2af

Pick a code version. We'll use @5565778cf15ae9af22ad392053430213e9016631@ here.

the_version=5565778cf15ae9af22ad392053430213e9016631

Make a JSON object describing the job.
read -rd "\000" the_job <<EOF
{
 "script":"hash",
 "script_version":"$the_version",
 "script_parameters":
 {
  "input":"$the_collection"
 }
}
EOF
(The @read -rd "\000"@ construct just reads the whole multi-line string, double quotation marks included, into a single shell variable.)
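The same capture can be sketched without @read -d@, which is a bash extension. The example below uses command substitution around a here-document, which works in any POSIX shell. The variable name @the_demo@ is an assumption for illustration, and the object is cut down to the two fields shown in this tutorial; it is not meant as a complete job description. One caveat, offered with some caution: bash uses only the first character of the @-d@ argument, so @"\000"@ appears to set the delimiter to a backslash rather than a NUL byte. The JSON here contains no backslash, so the whole here-document is still read.

```shell
the_version=5565778cf15ae9af22ad392053430213e9016631
the_collection=33a9f3842b01ea3fdf27cc582f5ea2af

# Portable alternative to `read -d`: capture the here-document through
# command substitution. Variables expand because EOF is unquoted.
# Illustrative object only; it holds just two fields from the tutorial.
the_demo=$(cat <<EOF
{
 "script_version":"$the_version",
 "input":"$the_collection"
}
EOF
)

# Quote the variable when printing, or the newlines are lost.
printf '%s\n' "$the_demo"
```

Command substitution strips trailing newlines, which is harmless for JSON.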
arv -h job create --job "$the_job"

"etag":"dwbrasqcozpjsqtfshzdjfiii",
"uuid":"qr1hi-8i9sb-3i0yi357k0mauwz",
"input":"33a9f3842b01ea3fdf27cc582f5ea2af"
"script_version":"5565778cf15ae9af22ad392053430213e9016631",
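The later steps need the job's @uuid@, so it is convenient to pull it out of the response. The sketch below works on a sample copied from the response shown above and uses plain text matching with @sed@ rather than a JSON parser. The variable names @response@ and @job_uuid@ are assumptions, and the pattern assumes the field appears on one line without spaces around the colon.

```shell
# Sample text copied from the response shown above; in practice this
# would be the captured output of the job create command.
response='"etag":"dwbrasqcozpjsqtfshzdjfiii",
"uuid":"qr1hi-8i9sb-3i0yi357k0mauwz",'

# Keep only the value of the "uuid" field. This is text matching, not
# real JSON parsing, so it assumes the field sits on a single line.
job_uuid=$(printf '%s\n' "$response" | sed -n 's/.*"uuid":"\([^"]*\)".*/\1/p')

echo "$job_uuid"
```

The resulting value can stand in for @JOB_UUID_HERE@ in the commands that follow.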
h3. Monitor job progress

Go to Workbench, open the Compute menu, and click Jobs. The job you submitted should appear at the top of the list.

Click "Refresh" until the job finishes.

You can also watch the log messages while the job runs:
curl -s -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \
  https://{{ site.arvados_api_host }}/arvados/v1/jobs/JOB_UUID_HERE/log_tail_follow
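To avoid editing the URL by hand for each job, the path can be assembled from the host name and the job UUID. This is a small sketch only: the function name @job_log_url@ and the placeholder host @api.example.invalid@ are assumptions, and the path is copied from the @curl@ command above.

```shell
# Hypothetical helper: build the log-follow URL for one job.
# $1 = API host name, $2 = job UUID.
job_log_url() {
  printf 'https://%s/arvados/v1/jobs/%s/log_tail_follow\n' "$1" "$2"
}

# Example values; "api.example.invalid" is a placeholder host.
job_log_url "api.example.invalid" "qr1hi-8i9sb-3i0yi357k0mauwz"

# Assumed usage, following the curl command above:
#   curl -s -H "Authorization: OAuth2 $ARVADOS_API_TOKEN" \
#     "$(job_log_url "$ARVADOS_API_HOST" "$job_uuid")"
```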
h3. Inspect the job output

Find the output of the job by looking at the Jobs page (in the Compute menu) in Workbench, or by using the API:

arv -h job get --uuid JOB_UUID_HERE

The output locator will look like <code>5894dfae5d6d8edf135f0ea3dba849c2+62+K@qr1hi</code>.
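The locator is a single string with fields separated by @+@. The sketch below splits the example locator using POSIX parameter expansion. The labels @digest@, @size@ and @hints@ are assumptions chosen for illustration; the tutorial itself does not name the fields.

```shell
# Locator copied from the example above.
locator='5894dfae5d6d8edf135f0ea3dba849c2+62+K@qr1hi'

# Split on "+" with POSIX parameter expansion.
digest=${locator%%+*}   # text before the first "+"
rest=${locator#*+}      # text after the first "+"
size=${rest%%+*}        # second field
hints=${rest#*+}        # everything after the second "+"

echo "digest: $digest"
echo "size:   $size"
echo "hints:  $hints"
```

When passing the locator to other commands, use the whole string unchanged, as in the commands that follow.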
List the files in the collection:

arv keep ls 5894dfae5d6d8edf135f0ea3dba849c2+62+K@qr1hi

Show the contents of the md5sum.txt file:

arv keep less 5894dfae5d6d8edf135f0ea3dba849c2+62+K@qr1hi/md5sum.txt
The @script@ and @script_version@ attributes of a Job allow you to confirm the code that was used to run the job. Specifically, @script@ refers to a file in the @/crunch_scripts@ directory in the tree indicated by the commit hash @script_version@.
git clone git://github.com/clinicalfuture/arvados.git
cd arvados
git checkout "$the_version"
less crunch_scripts/hash
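After the checkout, it is possible to confirm that the working tree really is at the commit recorded in @script_version@. The sketch below compares the output of @git rev-parse HEAD@ with an expected hash. The function name @checkout_matches@ is hypothetical, and it assumes a Git version that supports the @-C@ option.

```shell
# Hypothetical helper: succeed only if the repository in directory $1
# currently has commit $2 checked out.
checkout_matches() {
  # Return 2 if the directory is not a usable Git repository.
  actual=$(git -C "$1" rev-parse HEAD) || return 2
  [ "$actual" = "$2" ]
}

# Assumed usage, run from inside the cloned repository:
#   checkout_matches . "$the_version" && less crunch_scripts/hash
```

The comparison expects the full 40-character hash, which is the form used for @script_version@ in this tutorial.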