9187: Slurm dispatcher improvements around squeue
author Peter Amstutz <peter.amstutz@curoverse.com>
Wed, 1 Jun 2016 20:06:26 +0000 (16:06 -0400)
committer Peter Amstutz <peter.amstutz@curoverse.com>
Wed, 1 Jun 2016 20:06:26 +0000 (16:06 -0400)
commit 3ae9a789410e93eeb31ca5670c17a6d03d77f608
tree 563169799f4dd11f1fea638002f4690afac929ce
parent 3a3910fdc8a5003c182f68e3423c96327a136175

* Clarify that status updates are not guaranteed to be delivered on a
heartbeat.
* Refactor slurm dispatcher to monitor the container in squeue in a separate
goroutine.
* Refactor polling squeue to a single goroutine and cache the results so that
monitoring 100 containers doesn't result in 100 calls to squeue.
* No longer set up strigger to cancel the job on finish; instead, cancel
running jobs that are not in squeue.
* Test both cases: a job that is in squeue and a job that is not.
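
The single-poller-plus-cache idea described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the names SqueueCache, NewSqueueCache, Refresh, Poll, Has, parseJobNames and runSqueue are hypothetical, and the squeue invocation (--noheader --format=%j, one job name per line) is an assumption about how the job list would be obtained.

```go
package main

import (
	"os/exec"
	"strings"
	"sync"
	"time"
)

// parseJobNames turns squeue output (assumed: one job name per line)
// into a set of names. Blank lines are skipped.
func parseJobNames(out string) map[string]bool {
	names := make(map[string]bool)
	for _, line := range strings.Split(out, "\n") {
		if name := strings.TrimSpace(line); name != "" {
			names[name] = true
		}
	}
	return names
}

// runSqueue is one possible fetch function. The flags used here are an
// assumption for this sketch, not taken from the commit.
func runSqueue() (string, error) {
	out, err := exec.Command("squeue", "--noheader", "--format=%j").Output()
	return string(out), err
}

// SqueueCache (hypothetical) holds the most recent snapshot of job names.
type SqueueCache struct {
	mu    sync.RWMutex
	names map[string]bool
	fetch func() (string, error) // injected so tests need no real squeue
}

// NewSqueueCache returns an empty cache that refreshes itself via fetch.
func NewSqueueCache(fetch func() (string, error)) *SqueueCache {
	return &SqueueCache{names: map[string]bool{}, fetch: fetch}
}

// Refresh calls fetch once and replaces the cached snapshot.
// If fetch fails, the previous snapshot is kept.
func (c *SqueueCache) Refresh() error {
	out, err := c.fetch()
	if err != nil {
		return err
	}
	names := parseJobNames(out)
	c.mu.Lock()
	c.names = names
	c.mu.Unlock()
	return nil
}

// Poll refreshes the cache immediately and then once per period until
// stop is closed. It is meant to run in a single goroutine.
func (c *SqueueCache) Poll(period time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		_ = c.Refresh() // a real dispatcher would log the error
		select {
		case <-ticker.C:
		case <-stop:
			return
		}
	}
}

// Has reports whether name was present in the latest snapshot.
// It never runs squeue itself, so any number of monitors can call it.
func (c *SqueueCache) Has(name string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.names[name]
}
```

Under these assumptions, the dispatcher would start one `go cache.Poll(period, stop)` with `cache := NewSqueueCache(runSqueue)`, and each per-container monitoring goroutine would call `cache.Has(...)` to decide whether a container marked as running still has a job in squeue.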
sdk/go/dispatch/dispatch.go
services/crunch-dispatch-local/crunch-dispatch-local_test.go
services/crunch-dispatch-slurm/crunch-dispatch-slurm.go
services/crunch-dispatch-slurm/crunch-dispatch-slurm_test.go
services/crunch-dispatch-slurm/crunch-finish-slurm.sh [deleted file]