Brett Smith [Mon, 13 Jun 2016 14:16:07 +0000 (10:16 -0400)]
9309: Fix Ruby source install instructions for CentOS.
* Add missing `make` dependency.
* Use `sudo -i gem install` throughout. Red Hat adds /usr/local
paths to $PATH in `/etc/profile`, so `sudo` needs `-i` to find `gem`.
Brett Smith [Fri, 10 Jun 2016 03:23:35 +0000 (23:23 -0400)]
9187: Add priorities to crunch-dispatch-local test containers.
This is necessary to keep the tests working after 2c4ff054b533c62ecdb269963d3ab0af20d2df8b.
Otherwise, crunch-dispatch-local declines to do anything with them.
Refs #9187.
Tom Clegg [Tue, 7 Jun 2016 17:59:19 +0000 (13:59 -0400)]
9278: Set expires_at=now if a client sets it to a time in the past.
The definition of "now" in the default collection scope changes from
current_timestamp (time the current transaction started) to
statement_timestamp() (time the current statement started) so a test
case can expire a collection and then confirm that it is not in the
default scope, all within a single test transaction.
Brett Smith [Wed, 8 Jun 2016 17:17:43 +0000 (13:17 -0400)]
9309: Separate PostgreSQL setup page in Install Guide.
This provides us with a few benefits:
* We have a place to discuss the different deployment options
installers have around PostgreSQL.
* PostgreSQL setup is very distro-specific (and it's going to get
worse when we add CentOS 7), so this can take some of that noise out
of the Rails server install guides.
* People who want to try new things, like cloud database services,
get a clearer separation of the install process and the database
setup process.
radhika [Sat, 4 Jun 2016 14:06:09 +0000 (10:06 -0400)]
8876: Introduce view helper methods such as link_to_log and queuedtime so that the views do not have to make too many decisions based on the state of the work unit.
Tom Clegg [Fri, 27 May 2016 01:17:31 +0000 (21:17 -0400)]
9272: Fix up state transitions:
* Change state to Running only at the last possible moment before
starting the container.
* When erroring out before Running, change state back to Queued.
* Do not save log/output/exit code when changing state to Cancelled.
Incidental fixes:
* Clean up error handling in Run()
* Don't create a collection for, or try to attach to the container,
the second "cleanup activities" log that gets opened after closing
the real container log.
Peter Amstutz [Thu, 2 Jun 2016 21:59:20 +0000 (17:59 -0400)]
9187: Improve squeue synchronization
* Put squeue functions into separate file.
* CheckSqueue() now blocks on a condition variable until the next successful
update of squeue, which then wakes up all goroutines waiting on CheckSqueue().
* Never do anything when squeue returns an error.
* Merge submitting, monitoring, and cleanup behaviors into a single goroutine
which updates based on CheckSqueue() instead of a ticker.
* Introduce a lock on squeue, sbatch and scancel operations, so that on next
wakeup the queue is guaranteed to reflect most recent sbatch/scancel
operations.
Peter Amstutz [Wed, 1 Jun 2016 20:06:26 +0000 (16:06 -0400)]
9187: Slurm dispatcher improvements around squeue
* Clarify that status updates are not guaranteed to be delivered on a
heartbeat.
* Refactor slurm dispatcher to monitor the container in squeue in a separate
goroutine.
* Refactor polling squeue to a single goroutine and cache the results so that
monitoring 100 containers doesn't result in 100 calls to squeue.
* No longer set up strigger to cancel job on finish; instead, cancel
running jobs not in squeue.
* Test both cases: a job is in squeue, and a job is not in squeue.
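The "cancel running jobs not in squeue" rule amounts to a set
difference against the cached listing. A rough sketch, assuming a
hypothetical `missingFromSqueue` helper and made-up job names:

```go
package main

import "fmt"

// missingFromSqueue (hypothetical) returns the running jobs that do
// not appear in the cached squeue listing, in their original order.
func missingFromSqueue(running []string, inSqueue map[string]bool) []string {
	var missing []string
	for _, name := range running {
		if !inSqueue[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	running := []string{"c1", "c2", "c3"}
	// One cached listing serves every monitored job, so checking
	// many jobs does not require many squeue calls.
	cached := map[string]bool{"c1": true, "c3": true}

	fmt.Println(missingFromSqueue(running, cached)) // prints "[c2]"
}
```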
Brett Smith [Tue, 31 May 2016 20:35:53 +0000 (16:35 -0400)]
9242: Update Python module paths for CentOS 6.
I am more confident that this is correct, based on multiple data points
from Python 2 and 3 packages across CentOS 6 and 7.
This change might be fallout from 44ceaa474a330f12dd9e00115af107d7258044f2.
Refs #9242.