navsection: installguide
title: Test SLURM dispatch

h2. Test compute node setup

You should now be able to submit SLURM jobs that run in Docker containers. On the node where you're running the dispatcher, you can test this by running:

<pre><code>~$ <span class="userinput">sudo -u <b>crunch</b> srun -N1 docker run busybox echo OK</span>
</code></pre>

If it works, this command should print @OK@ (it may also show some status messages from SLURM and/or Docker). If it does not print @OK@, double-check your compute node setup, and that the @crunch@ user can submit SLURM jobs.
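
For a scripted version of this check, the test command can be wrapped in a small shell function. This is only a minimal sketch: the name @check_ok@ is hypothetical, and it assumes the command prints a line consisting of exactly @OK@ somewhere among any SLURM or Docker status messages.

```shell
# Hypothetical helper: run the given command and succeed only if one of its
# output lines is exactly "OK". Other lines (status messages) are ignored.
check_ok() {
  # "$@" is the command to run. stderr is merged into stdout so that status
  # messages on either stream are tolerated; grep -x matches whole lines only.
  "$@" 2>&1 | grep -qx 'OK'
}

# Example, using the crunch user and busybox image from the test above:
#   check_ok sudo -u crunch srun -N1 docker run busybox echo OK \
#     && echo "compute node test passed"
```

The function's exit status is that of @grep@, so it can be used directly in an @if@ statement or a provisioning script.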

h2. Test the dispatcher

On the dispatch node, start monitoring the crunch-dispatch-slurm logs:

<pre><code>~$ <span class="userinput">sudo journalctl -o cat -fu crunch-dispatch-slurm.service</span>
</code></pre>

*On your shell server*, submit a simple container request:

<pre><code>shell:~$ <span class="userinput">arv container_request create --container-request '{
"container_image": "arvados/jobs:latest",
"command": ["echo", "Hello, Crunch!"],
"output_path": "/out",
"runtime_constraints": {
</span></code></pre>

This command should return a record with a @container_uuid@ field. Once crunch-dispatch-slurm polls the API server for new containers to run, you should see it dispatch that same container. It will log messages like:

<pre><code>2016/08/05 13:52:54 Monitoring container zzzzz-dz642-hdp2vpu9nq14tx0 started
2016/08/05 13:53:04 About to submit queued container zzzzz-dz642-hdp2vpu9nq14tx0
2016/08/05 13:53:04 sbatch succeeded: Submitted batch job 8102
</code></pre>
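
To match these log lines against your own request, it can help to capture the @container_uuid@ from the record that the create command returns. The following is a rough sketch, not part of the Arvados tooling: @extract_container_uuid@ is a hypothetical helper, and it assumes the record is JSON with the @container_uuid@ key and its string value on the same line. A real JSON parser would be more robust.

```shell
# Hypothetical helper: read a JSON record on stdin and print the value of
# its "container_uuid" field. Assumes the key and its string value appear
# on one line; prints nothing if the field is missing or not a string.
extract_container_uuid() {
  sed -n 's/.*"container_uuid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example: save the record to a file when creating the request, then
#   uuid=$(extract_container_uuid < record.json)
#   sudo journalctl -o cat -u crunch-dispatch-slurm.service | grep "$uuid"
```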

If crunch-dispatch-slurm does not try to dispatch the container, double-check that it is running and that the API hostname and token in @/etc/arvados/crunch-dispatch-slurm/crunch-dispatch-slurm.yml@ are correct.
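
The first part of that check can be scripted. This is a minimal sketch under a few assumptions: @check_dispatcher@ is a hypothetical name, the dispatcher runs as the systemd unit shown in the @journalctl@ command above, and the script runs as a user who is allowed to read the config file. It does not validate the API hostname or token themselves.

```shell
# Hypothetical helper: report whether the dispatcher's systemd unit is active
# and whether its config file is readable. Defaults are the unit name and
# config path mentioned on this page; both can be overridden by arguments.
check_dispatcher() {
  unit=${1:-crunch-dispatch-slurm.service}
  config=${2:-/etc/arvados/crunch-dispatch-slurm/crunch-dispatch-slurm.yml}
  status=0
  # systemctl is-active exits non-zero when the unit is not running
  if ! systemctl is-active --quiet "$unit"; then
    echo "$unit is not active"
    status=1
  fi
  if [ ! -r "$config" ]; then
    echo "cannot read $config"
    status=1
  fi
  return "$status"
}

# Example: check_dispatcher && echo "dispatcher looks healthy"
```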

Before the container finishes, SLURM's @squeue@ command will show the new job in the list of queued and running jobs. For example, you might see:

<pre><code>~$ <span class="userinput">squeue --long</span>
Fri Aug 5 13:57:50 2016
JOBID PARTITION NAME USER STATE TIME TIMELIMIT NODES NODELIST(REASON)
8103 compute zzzzz-dz crunch RUNNING 1:56 UNLIMITED 1 compute0
</code></pre>

The job's name corresponds to the container's UUID. You can get more information about it by running, e.g., <notextile><code>scontrol show job Name=<b>UUID</b></code></notextile>.
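
Because the job name is the container UUID, a script can wait for the job to leave the queue by filtering @squeue@ on that name. The sketch below is an illustration only: @wait_for_job@ is a hypothetical helper, and it treats "no matching job listed" as "finished", which does not distinguish success from failure.

```shell
# Hypothetical helper: block until squeue no longer lists a job with the
# given name. Assumes the job name equals the container UUID, as above.
wait_for_job() {
  name=$1
  interval=${2:-5}   # seconds between polls
  # -h suppresses the header, -n filters by job name, -o %T prints the state
  while [ -n "$(squeue -h -n "$name" -o %T)" ]; do
    sleep "$interval"
  done
}

# Example: wait_for_job zzzzz-dz642-hdp2vpu9nq14tx0 10
```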

When the container finishes, the dispatcher will log that, with the final result:

<pre><code>2016/08/05 13:53:14 Container zzzzz-dz642-hdp2vpu9nq14tx0 now in state "Complete" with locked_by_uuid ""
2016/08/05 13:53:14 Monitoring container zzzzz-dz642-hdp2vpu9nq14tx0 finished
</code></pre>

After the container finishes, you can get the container record by UUID *from a shell server* to see its results:

<pre><code>shell:~$ <span class="userinput">arv get <b>zzzzz-dz642-hdp2vpu9nq14tx0</b></span>
"log":"a01df2f7e5bc1c2ad59c60a837e90dc6+166",
"output":"d41d8cd98f00b204e9800998ecf8427e+0",
</code></pre>

You can use standard Keep tools to view the container's output and logs from their corresponding fields. For example, to see the logs from the collection referenced in the @log@ field:

<pre><code>~$ <span class="userinput">arv keep ls <b>a01df2f7e5bc1c2ad59c60a837e90dc6+166</b></span>
~$ <span class="userinput">arv keep get <b>a01df2f7e5bc1c2ad59c60a837e90dc6+166</b>/stdout.txt</span>
2016-08-05T13:53:06.201011Z Hello, Crunch!
</code></pre>
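
Each line of @stdout.txt@ in the sample above starts with a timestamp. If you want only the text the container printed, a filter like the following can remove it. This is a sketch that assumes every line begins with one timestamp followed by a single space, as in the sample; @strip_timestamps@ is a hypothetical name.

```shell
# Hypothetical helper: drop the leading timestamp from each log line.
# Removes everything up to and including the first space on the line.
strip_timestamps() {
  sed 's/^[^ ]* //'
}

# Example, using the collection from the log field above:
#   arv keep get a01df2f7e5bc1c2ad59c60a837e90dc6+166/stdout.txt | strip_timestamps
```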

If the container does not dispatch successfully, refer to the crunch-dispatch-slurm logs for information about why it failed.