Applications submit compute jobs when:

* Provenance is important, i.e., it is worth recording how the output was produced; or
* Computation time is significant; or
* The job management features are convenient (failure detection/recovery, regression testing, etc.).

See "jobs":{{site.baseurl}}/api/methods/jobs.html for the API methods used to work with jobs.
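As a brief illustration, here is a minimal sketch of submitting a job with the Python SDK. It assumes @ARVADOS_API_HOST@ and @ARVADOS_API_TOKEN@ are set in the environment; the repository name, script, and parameter values below are placeholders, not working examples.

<notextile>
<pre><code>import arvados

# Connect using ARVADOS_API_HOST and ARVADOS_API_TOKEN from the environment.
api = arvados.api('v1')

# All names and values below are placeholders.
job = api.jobs().create(body={
    'job': {
        'script': 'hash.py',                    # /crunch_scripts/hash.py in the repository
        'script_version': 'master',             # a branch, tag, or commit hash
        'repository': 'yourusername/yourrepo',
        'script_parameters': {
            'input': 'zzzzz-4zz18-0123456789abcde',  # e.g. a collection UUID
        },
        'submit_id': 'my-client-request-0001',  # optional; see the submit_id attribute below
    },
}).execute()

print(job['uuid'])
</code></pre>
</notextile>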
Each job has, in addition to the usual "attributes of Arvados resources":{{site.baseurl}}/api/resources.html:
table(table table-bordered table-condensed).
|_. Attribute|_. Type|_. Description|_. Notes|
|script|string|The filename of the job script.|This program will be invoked by Crunch for each job task. It is given as a path to an executable file, relative to the @/crunch_scripts@ directory in the Git tree specified by the _repository_ and _script_version_ attributes.|
|script_parameters|hash|The input parameters for the job.|Conventionally, one of the parameters is called @"input"@. Typically, some parameter values are collection UUIDs. Ultimately, though, the significance of parameters is left entirely up to the script itself.|
|repository|string|Git repository name or URL.|Source of the repository where the given script_version is to be found. This can be given as the name of a locally hosted repository, or as a publicly accessible URL starting with @git://@, @http://@, or @https://@.
Examples:
@yourusername/yourrepo@
@https://github.com/curoverse/arvados.git@|
|script_version|string|Git commit|During a **create** transaction, this is the Git branch, tag, or hash supplied by the client. Before the job starts, Arvados updates it to the full 40-character SHA-1 hash of the commit used by the job.
See "Specifying Git versions":#script_version below for more detail about acceptable ways to specify a commit.|
|cancelled_by_client_uuid|string|API client ID|Is null if the job has not been cancelled.|
|cancelled_by_user_uuid|string|Authenticated user ID|Is null if the job has not been cancelled.|
|cancelled_at|datetime|When the job was cancelled|Is null if the job has not been cancelled.|
|started_at|datetime|When the job started running|Is null if the job has not yet started.|
|finished_at|datetime|When the job finished running|Is null if the job has not yet finished.|
|running|boolean|Whether the job is running||
|success|boolean|Whether the job indicated successful completion|Is null if the job has not finished.|
|is_locked_by_uuid|string|UUID of the user who has locked this job|Is null if the job is not locked. The system user locks the job when starting it, in order to prevent job attributes from being altered.|
|node_uuids|array|List of UUID strings for node objects that have been assigned to this job||
|log|string|Collection UUID|Is null if the job has not finished. After the job runs, the given collection contains a text file with log messages provided by the @arv-crunch-job@ task scheduler as well as the standard error streams provided by the task processes.|
|tasks_summary|hash|Summary of task completion states.|Example: @{"done":0,"running":4,"todo":2,"failed":0}@|
|output|string|Collection UUID|Is null if the job has not finished.|
|nondeterministic|boolean|The job is expected to produce different results if run more than once.|If true, this job will not be considered as a candidate for automatic re-use when subsequent identical jobs are submitted.|
|submit_id|string|Unique ID provided by the client when the job was submitted|Optional. This can be used by a client to make the "jobs.create":{{site.baseurl}}/api/methods/jobs.html#create method idempotent.|
|arvados_sdk_version|string|Git commit hash that specifies the SDK version to use from the Arvados repository|This is set by searching the Arvados repository for a match for the @arvados_sdk_version@ runtime constraint.|
|docker_image_locator|string|Portable data hash of the collection that contains the Docker image to use|This is set by searching readable collections for a match for the @docker_image@ runtime constraint.|
|runtime_constraints|hash|Constraints that must be satisfied by the job/task scheduler in order to run the job.|See below.|
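As a sketch of how these attributes are typically read back (the job UUID below is a placeholder), a client might poll a job and, once it has finished, fetch its log and output collections:

<notextile>
<pre><code>import arvados

api = arvados.api('v1')

# Placeholder job UUID; substitute one returned by jobs.create.
job = api.jobs().get(uuid='zzzzz-8i9sb-0123456789abcde').execute()

if job['started_at'] is None:
    print('queued, not started yet')
elif job['running']:
    print('running, task states:', job['tasks_summary'])
elif job['success']:
    # Once the job has finished, "output" and "log" name collections.
    output = api.collections().get(uuid=job['output']).execute()
    log = api.collections().get(uuid=job['log']).execute()
    print('output manifest:', output['manifest_text'][:80])
elif job['cancelled_at'] is not None:
    print('cancelled by', job['cancelled_by_user_uuid'], 'at', job['cancelled_at'])
else:
    print('finished unsuccessfully; check the log collection', job['log'])
</code></pre>
</notextile>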
h3(#script_version). Specifying Git versions

The script_version attribute and arvados_sdk_version runtime constraint are typically given as a branch, tag, or commit hash, but there are many more ways to specify a Git commit. The "specifying revisions" section of the "gitrevisions manual page":http://git-scm.com/docs/gitrevisions.html has a definitive list. Arvados accepts Git versions in any format listed there that names a single commit (not a tree, a blob, or a range of commits). However, some kinds of names can be expected to resolve differently in Arvados than they do in your local repository. For example, <code>HEAD@{1}</code> refers to the local reflog, and @origin/master@ typically refers to a remote branch: neither is likely to work as desired if given as a Git version.
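For example, all of the following forms name a single commit and are acceptable, while reflog and remote-tracking names should be avoided. The values themselves are only illustrative; substitute refs that exist in your repository.

<notextile>
<pre><code># Illustrative script_version values; substitute refs that exist in your repository.
acceptable = [
    'master',                                      # branch name
    'v1.2.0',                                      # tag name
    '0f1c9a8',                                     # abbreviated commit hash
    '0f1c9a8d3e5b7c6a1f2e4d5c6b7a8091a2b3c4d5',    # full 40-character SHA-1
]

# Likely to resolve differently in Arvados than in your local repository:
avoid = [
    'HEAD@{1}',        # refers to your local reflog
    'origin/master',   # remote-tracking branch in your local clone
]
</code></pre>
</notextile>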
h3. Runtime constraints

table(table table-bordered table-condensed).
|_. Key|_. Type|_. Description|_. Implemented|
|arvados_sdk_version|string|The Git version of the SDKs to use from the Arvados git repository. See "Specifying Git versions":#script_version for more detail about acceptable ways to specify a commit. If you use this, you must also specify a @docker_image@ constraint (see below). In order to install the Python SDK successfully, Crunch must be able to find and run virtualenv inside the container.|✓|
|docker_image|string|The Docker image that this Job needs to run. If specified, Crunch will create a Docker container from this image, and run the Job's script inside that. The Keep mount and work directories will be available as volumes inside this container. The image must be uploaded to Arvados using @arv keep docker@. You may specify the image in any format that Docker accepts, such as @arvados/jobs@, @debian:latest@, or the Docker image id. Alternatively, you may specify the portable data hash of the image Collection.|✓|
|min_nodes|integer|Minimum number of nodes to be allocated to this Job|✓|
|min_cores_per_node|integer|Require that each node assigned to this Job have the specified number of CPU cores|✓|
|min_ram_mb_per_node|integer|Require that each node assigned to this Job have the specified amount of real memory (in MiB)|✓|
|min_scratch_mb_per_node|integer|Require that each node assigned to this Job have the specified amount of scratch storage available (in MiB)|✓|
|max_tasks_per_node|integer|Maximum simultaneous tasks on a single node|✓|
|keep_cache_mb_per_task|integer|Size of the file data buffer for the per-task Keep directory ($TASK_KEEPMOUNT), in MiB. Default is 256 MiB. Increase this to reduce cache thrashing in situations such as accessing multiple large (64+ MiB) files at the same time, or accessing different parts of a large file at the same time.|✓|
|min_ram_per_task|integer|Minimum real memory (KiB) per task||
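Taken together, a @runtime_constraints@ hash passed in a job creation request might look like the following sketch. All values are illustrative; in particular, @docker_image@ and @arvados_sdk_version@ must name an image and a commit that actually exist on your cluster.

<notextile>
<pre><code># Illustrative runtime_constraints for a jobs.create request (values are placeholders).
runtime_constraints = {
    'docker_image': 'arvados/jobs',      # or the portable data hash of the image collection
    'arvados_sdk_version': 'master',     # resolved against the Arvados repository
    'min_nodes': 2,
    'min_cores_per_node': 4,
    'min_ram_mb_per_node': 8192,
    'min_scratch_mb_per_node': 10240,
    'max_tasks_per_node': 2,
    'keep_cache_mb_per_task': 512,
}
</code></pre>
</notextile>

This hash would be supplied as the @runtime_constraints@ attribute in the job body, alongside @script@, @script_version@, and the other attributes described above.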