the_job@ redirects standard output to a file called @the_job@
* @"script"@ specifies the name of the script to run. The script is searched for in the "crunch_scripts/" subdirectory of the @git@ checkout specified by @"script_version"@.
* @"script_version"@ specifies the version of the script that you wish to run. This can be an explicit @git@ revision hash, or take the form "repository:branch" (in which case the HEAD of the specified branch is used). Arvados logs the script version used in each run, so you can re-run any past job with the guarantee that it uses exactly the same code as the original run. You can access a list of available @git@ repositories on the Arvados workbench through _Access %(rarr)→% Repositories_.
* @"script_parameters"@ are provided to the script. In this case, the input is the locator for the collection that we inspected in the previous section.
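As an illustration only, the same job description can be assembled programmatically. The sketch below uses Python's standard @json@ module; the helper name @build_job@ is hypothetical, and the field values are the ones that appear in this tutorial.

```python
import json


def build_job(script, script_version, input_locator):
    """Return a job description as a dict (hypothetical helper)."""
    return {
        "script": script,
        "script_version": script_version,
        "script_parameters": {"input": input_locator},
    }


# Values taken from this tutorial: the "hash" script, an explicit git
# revision, and the locator of the input collection.
job = build_job(
    "hash",
    "d3b10812b443dcf0189c1c432483bf7ac06507fe",
    "33a9f3842b01ea3fdf27cc582f5ea2af",
)

# Serialize to JSON text, suitable for saving to a file such as the_job.
job_json = json.dumps(job, indent=1)
print(job_json)
```

Building the description as a dictionary and serializing it at the end avoids hand-written quoting mistakes in the JSON text.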
Use @arv job create@ to actually submit the job. It should print out a JSON object which describes the newly created job:
$ arv -h job create --job "$(cat the_job)"
{
"href":"https://qr1hi.arvadosapi.com/arvados/v1/jobs/qr1hi-8i9sb-xxxxxxxxxxxxxxx",
"kind":"arvados#job",
"etag":"aulvmdxezwxo4zrw15gz1v7x3",
"uuid":"qr1hi-8i9sb-xxxxxxxxxxxxxxx",
"owner_uuid":"qr1hi-tpzed-9zdpkpni2yddge6",
"created_at":"2013-12-10T17:07:08Z",
"modified_by_client_uuid":"qr1hi-ozdt8-obw7foaks3qjyej",
"modified_by_user_uuid":"qr1hi-tpzed-9zdpkpni2yddge6",
"modified_at":"2013-12-10T17:07:08Z",
"updated_at":"2013-12-10T17:07:08Z",
"submit_id":null,
"priority":null,
"script":"hash",
"script_parameters":{
"input":"33a9f3842b01ea3fdf27cc582f5ea2af"
},
"script_version":"d3b10812b443dcf0189c1c432483bf7ac06507fe",
"cancelled_at":null,
"cancelled_by_client_uuid":null,
"cancelled_by_user_uuid":null,
"started_at":null,
"finished_at":null,
"output":null,
"success":null,
"running":null,
"is_locked_by_uuid":null,
"log":null,
"runtime_constraints":{},
"tasks_summary":{},
"dependencies":[
"33a9f3842b01ea3fdf27cc582f5ea2af"
],
"log_stream_href":"https://qr1hi.arvadosapi.com/arvados/v1/jobs/qr1hi-8i9sb-xxxxxxxxxxxxxxx/log_tail_follow"
}
The job is now queued and will start running as soon as it reaches the front of the queue. Fields to pay attention to include:
* @"uuid"@ is the unique identifier for this specific job
* @"script_version"@ is the actual revision of the script used. This is useful if the version was described using the "repository:branch" format.
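If you capture the response in a script, the two fields above can be pulled out with any JSON parser. The following is a minimal sketch in Python; @response_text@ is an abbreviated copy of the response shown above, and the variable names are illustrative.

```python
import json

# Abbreviated copy of the JSON printed when the job was created.
response_text = """
{
 "uuid": "qr1hi-8i9sb-xxxxxxxxxxxxxxx",
 "script": "hash",
 "script_version": "d3b10812b443dcf0189c1c432483bf7ac06507fe",
 "success": null,
 "running": null
}
"""

job = json.loads(response_text)

# The uuid identifies this job in later commands; script_version is the
# concrete revision that was recorded for the run.
job_uuid = job["uuid"]
resolved_version = job["script_version"]
print(job_uuid, resolved_version)
```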
h3. Monitor job progress
Go to Workbench, and use the menu to navigate to _Compute %(rarr)→% Jobs_. The job you submitted can be identified by the *uuid* column, which will match the "uuid" field of the JSON object returned when the job was created.
Click "Refresh" until the job finishes. Successful completion is indicated by a green check mark in the *status* column.
You can access log messages while the job runs using @arv job log_tail_follow@:
notextile. $ arv job log_tail_follow --uuid qr1hi-8i9sb-xxxxxxxxxxxxxxx
This will print out the last several lines of the log for that job.
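A script that watches a job can also derive a coarse status from the job record itself. The sketch below is an assumption about how the @"success"@, @"running"@, and @"cancelled_at"@ fields might be interpreted, not a documented rule; the function name @job_state@ is hypothetical, and the two sample records mirror the fields of a freshly created and a finished job in this tutorial.

```python
def job_state(job):
    """Guess a coarse state from a job record (assumed field semantics)."""
    if job.get("cancelled_at") is not None:
        return "cancelled"
    if job.get("success") is True:
        return "complete"
    if job.get("success") is False:
        return "failed"
    if job.get("running"):
        return "running"
    # Nothing has happened to the job yet, so treat it as waiting in the queue.
    return "queued"


# A newly created job: every status field is still null.
new_job = {"success": None, "running": None, "cancelled_at": None}
# A finished job: success is true and running is false.
done_job = {"success": True, "running": False, "cancelled_at": None}

print(job_state(new_job), job_state(done_job))
```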
h3. Inspect the job output
You can access the job output under the *output* column of the _Compute %(rarr)→% Jobs_ page. Alternatively, you can use @arv job get@ to retrieve a JSON object describing the job, including its output:
$ arv -h job get --uuid qr1hi-8i9sb-xxxxxxxxxxxxxxx
{
"href":"https://qr1hi.arvadosapi.com/arvados/v1/jobs/qr1hi-8i9sb-zs6d9pxkr0vk175",
"kind":"arvados#job",
"etag":"eoe99lw7rnqxo7j29fh53hz",
"uuid":"qr1hi-8i9sb-zs6d9pxkr0vk175",
"owner_uuid":"qr1hi-tpzed-9zdpkpni2yddge6",
"created_at":"2013-12-10T17:23:26Z",
"modified_by_client_uuid":null,
"modified_by_user_uuid":"qr1hi-tpzed-9zdpkpni2yddge6",
"modified_at":"2013-12-10T17:23:45Z",
"updated_at":"2013-12-10T17:23:45Z",
"submit_id":null,
"priority":null,
"script":"hash",
"script_parameters":{
"input":"33a9f3842b01ea3fdf27cc582f5ea2af"
},
"script_version":"0a8c7c6fce7a9667ee42c1984a845100f51906a2",
"cancelled_at":null,
"cancelled_by_client_uuid":null,
"cancelled_by_user_uuid":null,
"started_at":"2013-12-10T17:23:29Z",
"finished_at":"2013-12-10T17:23:44Z",
"output":"880b55fb4470b148a447ff38cacdd952+54+K@qr1hi",
"success":true,
"running":false,
"is_locked_by_uuid":"qr1hi-tpzed-9zdpkpni2yddge6",
"log":"f760f3dd3105103e058a043310f7e72b+3028+K@qr1hi",
"runtime_constraints":{},
"tasks_summary":{
"done":2,
"running":0,
"failed":0,
"todo":0
},
"dependencies":[
"33a9f3842b01ea3fdf27cc582f5ea2af"
],
"log_stream_href":null
}
* @"output"@ is the unique identifier for this specific job's output. This is a Keep collection. Because the output of Arvados jobs should be deterministic, the known expected output is 880b55fb4470b148a447ff38cacdd952+54+K@qr1hi.
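The locator above can be taken apart into its @+@-separated fields. The sketch below assumes the layout seen in this tutorial: a 32-character hexadecimal digest, an optional size in bytes, and any remaining fields treated as opaque hints. The function name @parse_locator@ is hypothetical.

```python
import re


def parse_locator(locator):
    """Split a locator into (digest, size, hints), assuming digest+size+hints."""
    digest, *rest = locator.split("+")
    if not re.fullmatch(r"[0-9a-f]{32}", digest):
        raise ValueError("expected a 32-character hex digest: %r" % digest)
    # The size field is optional; when present it is a plain decimal number.
    size = int(rest[0]) if rest and rest[0].isdigit() else None
    hints = rest[1:] if size is not None else rest
    return digest, size, hints


print(parse_locator("880b55fb4470b148a447ff38cacdd952+54+K@qr1hi"))
```

A bare digest such as the input locator used earlier parses with a size of @None@ and an empty hint list.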
Now you can list the files in the collection:
$ arv keep get 880b55fb4470b148a447ff38cacdd952+54+K@qr1hi
. 78b268d1e03d87f8270bdee9d5d427c5+61 0:61:md5sum.txt
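The line printed above is a manifest entry: a stream name, one or more data block locators, and one or more file tokens of the form @position:size:filename@. The sketch below splits such a line into those parts. The function name @parse_manifest_line@ is hypothetical, and the parser is deliberately simple: it assumes block locators contain no colons and does not handle escaped characters in file names.

```python
def parse_manifest_line(line):
    """Split one manifest line into (stream_name, blocks, files)."""
    tokens = line.split()
    stream_name = tokens[0]
    blocks, files = [], []
    for token in tokens[1:]:
        if ":" in token:
            # File token: byte offset in the stream, length, and file name.
            position, size, name = token.split(":", 2)
            files.append((int(position), int(size), name))
        else:
            # Anything without a colon is treated as a data block locator.
            blocks.append(token)
    return stream_name, blocks, files


line = ". 78b268d1e03d87f8270bdee9d5d427c5+61 0:61:md5sum.txt"
print(parse_manifest_line(line))
```

Here the single file starts at offset 0 and is 61 bytes long, which matches the size recorded in the block locator.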
This collection consists of the md5sum.txt file. Use @arv keep get@ to show the contents of the md5sum.txt file:
$ arv keep get 880b55fb4470b148a447ff38cacdd952+54+K@qr1hi/md5sum.txt
44b8ae3fde7a8a88d2f7ebd237625b4f var-GS000016015-ASM.tsv.bz2
This MD5 hash matches the one we computed earlier.
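The comparison can also be scripted. The sketch below uses Python's standard @hashlib@ module; the helper names @parse_md5sum_line@ and @md5_matches@ are hypothetical, and the check at the end runs on sample bytes rather than the real input file, which is not reproduced here.

```python
import hashlib


def parse_md5sum_line(line):
    """Split a 'digest filename' line into its two parts."""
    digest, filename = line.split(None, 1)
    return digest.lower(), filename.strip()


def md5_matches(data, expected_digest):
    """Return True if the MD5 of data equals expected_digest."""
    return hashlib.md5(data).hexdigest() == expected_digest.lower()


digest, filename = parse_md5sum_line(
    "44b8ae3fde7a8a88d2f7ebd237625b4f var-GS000016015-ASM.tsv.bz2"
)
print(filename, digest)

# Demonstrate the comparison on bytes we control.
sample = b"example data"
print(md5_matches(sample, hashlib.md5(sample).hexdigest()))
```

To check the real file, read its bytes and pass them to @md5_matches@ together with the digest parsed from @md5sum.txt@.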
This concludes the first tutorial. In the next tutorial, we will "write a script to compute the hash.":tutorial-firstscript.html