doc/api/crunch-scripts.textile

---
layout: default
navsection: api
title: Crunch scripts
navorder: 5
---

h2. Crunch scripts

A crunch script is responsible for completing a single JobTask. In doing so, it will:

* (optionally) read some input from Keep
* (optionally) store some output in Keep
* (optionally) create some new JobTasks and add them to the current Job
* (optionally) update the current JobTask record with the "output" attribute set to a Keep locator or a fragment of a manifest
* update the current JobTask record with the "success" attribute set to True

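The last two steps in the list above can be sketched as a small helper. This is an illustration rather than the SDK's own shortcut: @finish_task@ is a hypothetical name, and the @update(uuid=..., body=...)@ call shape is an assumption about the client object returned by @arvados.api()@.

```python
def finish_task(api_client, task_uuid, output_locator=None):
    """Mark a JobTask as successful, optionally recording its output.

    api_client is assumed to be the object returned by arvados.api().
    The update(uuid=..., body=...) call shape is an assumption, not
    something this page confirms.
    """
    # "success" is always set; "output" only when the task produced some.
    body = {'success': True}
    if output_locator is not None:
        body['output'] = output_locator
    return api_client.job_tasks().update(uuid=task_uuid, body=body).execute()


# Typical use inside a crunch script (needs the Arvados Python SDK and a
# reachable API server, so it is shown here only as a comment):
#
#   import arvados, os
#   finish_task(arvados.api(), os.environ['TASK_UUID'], output_locator)
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub object when no API server is available.
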
A task's context is provided in environment variables.

table(table table-bordered table-condensed).
|Environment variable|Description|
|@JOB_UUID@|UUID of the current "Job":Jobs.html|
|@TASK_UUID@|UUID of the current "JobTask":JobTasks.html|
|@ARVADOS_API_HOST@|Hostname and port number of the API server|
|@ARVADOS_API_TOKEN@|Authentication token to use with API calls made by the current task|

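As a minimal sketch of how a script might collect this context up front, the helper below reads the four variables from the table and fails early if any is missing. The names @read_task_context@ and @REQUIRED_VARS@ are hypothetical and are not part of the Arvados SDK.

```python
import os

# The four variables listed in the table above.
REQUIRED_VARS = ('JOB_UUID', 'TASK_UUID',
                 'ARVADOS_API_HOST', 'ARVADOS_API_TOKEN')


def read_task_context(environ=None):
    """Return the task context as a dict, or raise if a variable is unset."""
    if environ is None:
        environ = os.environ
    missing = [name for name in REQUIRED_VARS if name not in environ]
    if missing:
        raise RuntimeError("missing environment variables: "
                           + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Checking all four at startup gives one clear error message instead of a @KeyError@ partway through the task.
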
The crunch script typically uses the Python SDK (or another suitable client library / SDK) to connect to the Arvados service and retrieve the rest of the details about the current job and task.

The Python SDK has some shortcuts for common operations.

In general, a crunch script can access information about the current job and task like this:

<pre>
import arvados
import os
import sys

# Look up the current job record using the UUID from the environment.
job = arvados.api().jobs().get(uuid=os.environ['JOB_UUID']).execute()
sys.stderr.write("script_parameters['foo'] == %s\n"
                 % job['script_parameters']['foo'])

# Look up the current task record the same way.
task = arvados.api().job_tasks().get(uuid=os.environ['TASK_UUID']).execute()
sys.stderr.write("current task sequence number is %d\n"
                 % task['sequence'])
</pre>
