Brett Smith [Fri, 10 Jun 2016 19:57:05 +0000 (15:57 -0400)]
9370: Alphabetize the package_go_binary list.
radhika [Tue, 14 Jun 2016 03:10:51 +0000 (23:10 -0400)]
9372: container display
Peter Amstutz [Mon, 13 Jun 2016 20:31:49 +0000 (16:31 -0400)]
8442: Enable net when API is enabled.
Nico Cesar [Mon, 13 Jun 2016 14:47:21 +0000 (10:47 -0400)]
bumped the selenium driver version to 2.53.1 to see if we can manage our way out of failed builds
no issue #
Brett Smith [Mon, 13 Jun 2016 14:16:07 +0000 (10:16 -0400)]
9309: Bugfix Ruby source install instructions for CentOS.
* Add missing `make` dependency.
* Add `-i` to `sudo gem install` throughout. Red Hat adds /usr/local
paths to $PATH in `/etc/profile`, so we need `-i` to find `gem`.
Refs #9309.
radhika [Sat, 11 Jun 2016 03:53:44 +0000 (23:53 -0400)]
refs #9318
Merge branch '9318-dashboard-uses-work-units'
radhika [Sat, 11 Jun 2016 03:52:02 +0000 (23:52 -0400)]
9318: fixed outputs display issue where "outputs: []" is being shown when there are no outputs.
radhika [Fri, 10 Jun 2016 17:43:03 +0000 (13:43 -0400)]
9274: while creating a ContainerRequest, set requesting_container_uuid based on the token.
Peter Amstutz [Fri, 10 Jun 2016 15:45:38 +0000 (11:45 -0400)]
9353: Add libcloud.common.BaseHTTPError to CLOUD_ERRORS.
Due to eventual consistency, it seems that calling ex_create_tags() on a node
that has just been created can return "InvalidInstanceID.NotFound" for some
period. The ec2 driver raises BaseHTTPError when it doesn't recognize a more
specific error. Prior to this commit, BaseHTTPError wasn't treated as a
retryable error, so node manager would fall over. This commit makes
BaseHTTPError a retryable error.
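
As a rough illustration of the retry behavior described in 9353, here is a
minimal Python sketch. It is not the node manager code: the BaseHTTPError class
below is a local stand-in for the libcloud exception the commit names, and
retry_cloud_call, attempts, and delay are hypothetical names chosen for this
example.

```python
import time


class BaseHTTPError(Exception):
    """Local stand-in for the libcloud BaseHTTPError named in the commit."""


# Assumption: a tuple of exception types that are considered retryable.
CLOUD_ERRORS = (BaseHTTPError, IOError)


def retry_cloud_call(func, attempts=3, delay=0.0):
    """Call func(), retrying when it raises one of CLOUD_ERRORS.

    The last failure is re-raised once the attempts are used up. Any
    exception type not listed in CLOUD_ERRORS propagates immediately.
    """
    for attempt in range(attempts):
        try:
            return func()
        except CLOUD_ERRORS:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```

With BaseHTTPError in the tuple, a tagging call that fails with
"InvalidInstanceID.NotFound" on a freshly created node is simply tried again
instead of propagating out of the caller.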
Peter Amstutz [Fri, 10 Jun 2016 14:45:19 +0000 (10:45 -0400)]
9388: Record every log id sent and don't send duplicates.
radhika [Fri, 10 Jun 2016 14:23:55 +0000 (10:23 -0400)]
closes #9318, closes #8650
Merge branch '9318-dashboard-uses-work-units'
radhika [Fri, 10 Jun 2016 14:23:29 +0000 (10:23 -0400)]
Merge branch 'master' into 9318-dashboard-uses-work-units
Peter Amstutz [Thu, 9 Jun 2016 21:55:37 +0000 (17:55 -0400)]
9388: Process each notify individually instead of attempting to batch them up.
Prior to this commit, websockets used to try to send log events out in batches,
by getting all logs with an id greater than the last log that was sent.
Unfortunately, under concurrent database writes, logs from uncommitted
transactions may not appear in the query even if logs with larger ids do
appear. This results in the uncommitted log never being sent out because
subsequent batch sends would not consider logs prior to the last log id that
was sent (which, in this case, is higher than the log that was missed.)
This commit eliminates the batching behavior. Because NOTIFY includes the log
id of a specific record that was committed, consider only the log record with
that id and process events in the order that the NOTIFY events arrive. This
means events may be delivered out of numeric order (although they now more
closely reflect the "actual" order, i.e. the order that the events were
actually committed to the database).
"Catch ups" where the client has specified a last_log_id and needs to have past
logs replayed continue to be sent in batches.
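
The race described in 9388 can be reproduced with a small simulation. This is
only a sketch in Python, not the websocket server code: batch_send and
per_notify_send are hypothetical names, and the "snapshots" are an assumed
stand-in for what a query would see at each point in time.

```python
def batch_send(visible_snapshots):
    """Old behavior: on each pass, send every visible log with id > last sent id."""
    last_id = 0
    sent = []
    for visible in visible_snapshots:
        new_ids = sorted(i for i in visible if i > last_id)
        sent.extend(new_ids)
        if new_ids:
            last_id = new_ids[-1]
    return sent


def per_notify_send(notify_order):
    """New behavior: send exactly the log id named by each NOTIFY, in arrival order."""
    return list(notify_order)


# Log 2 is written before log 3 but its transaction commits later,
# so a query sees {1, 3} before it ever sees 2.
snapshots = [{1}, {1, 3}, {1, 2, 3}]
notifies = [1, 3, 2]

print(batch_send(snapshots))      # log 2 is never sent
print(per_notify_send(notifies))  # every log is sent, in commit order
```

In the batched version the last sent id jumps to 3, so log 2 is skipped for
good once it becomes visible. Processing one NOTIFY at a time delivers it, at
the cost of ids arriving out of numeric order.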
Brett Smith [Fri, 10 Jun 2016 03:23:35 +0000 (23:23 -0400)]
9187: Add priorities to crunch-dispatch-local test containers.
This is necessary to keep the tests working after
2c4ff054b533c62ecdb269963d3ab0af20d2df8b.
Otherwise, crunch-dispatch-local declines to do anything with them.
Refs #9187.
Peter Amstutz [Thu, 9 Jun 2016 20:21:59 +0000 (16:21 -0400)]
Merge branch '9187-requeued-containers' closes #9187
Peter Amstutz [Thu, 9 Jun 2016 20:21:36 +0000 (16:21 -0400)]
9187: Add comments.
radhika [Thu, 9 Jun 2016 20:16:06 +0000 (16:16 -0400)]
9318: Update the compute node status pane to make sure the Details option is only offered when there are any active nodes.
radhika [Thu, 9 Jun 2016 17:58:58 +0000 (13:58 -0400)]
Merge branch 'master' into 9318-dashboard-uses-work-units
radhika [Thu, 9 Jun 2016 17:57:39 +0000 (13:57 -0400)]
9318: remove :output method in favor of :outputs method and correct the logic for various object models.
Tom Clegg [Thu, 9 Jun 2016 14:21:29 +0000 (10:21 -0400)]
Merge branch '9278-expiring-collections'
refs #9278
Tom Clegg [Wed, 8 Jun 2016 13:52:30 +0000 (09:52 -0400)]
9278: Ensure locator signatures expire no later than expires_at.
Tom Clegg [Tue, 7 Jun 2016 17:59:29 +0000 (13:59 -0400)]
9278: Expose expires_at in API response.
Tom Clegg [Tue, 7 Jun 2016 17:59:19 +0000 (13:59 -0400)]
9278: Set expires_at=now if a client sets it to a time in the past.
The definition of "now" in the default collection scope changes from
current_timestamp (time the current transaction started) to
statement_timestamp() (time the current statement started) so a test
case can expire a collection and then confirm that it is not in the
default scope, all within a single test transaction.
Brett Smith [Wed, 8 Jun 2016 21:29:37 +0000 (17:29 -0400)]
Merge branch '9309-postgresql-install-guide-wip'
Refs #9309. Closes #9367.
Brett Smith [Wed, 8 Jun 2016 17:17:43 +0000 (13:17 -0400)]
9309: Separate PostgreSQL setup page in Install Guide.
This provides us with a few benefits:
* We have a place to discuss the different deployment options
installers have around PostgreSQL.
* PostgreSQL setup is very distro-specific (and it's going to get
worse when we add CentOS 7), so this can take some of that noise out
of the Rails server install guides.
* People who want to try new things, like cloud database services,
get a clearer separation of the install process and the database
setup process.
Peter Amstutz [Wed, 8 Jun 2016 15:46:12 +0000 (11:46 -0400)]
9187: Don't try to take lock on containers with priority 0.
Peter Amstutz [Wed, 8 Jun 2016 15:20:21 +0000 (11:20 -0400)]
9187: If a container is reported Queued, but we are monitoring it, stop monitoring it.
radhika [Wed, 8 Jun 2016 14:37:14 +0000 (10:37 -0400)]
Merge branch '8650-container-work-unit' into 9318-dashboard-uses-work-units
radhika [Wed, 8 Jun 2016 14:31:45 +0000 (10:31 -0400)]
8650: test and fixture update
radhika [Wed, 8 Jun 2016 14:25:36 +0000 (10:25 -0400)]
Merge branch 'master' into 8650-container-work-unit
radhika [Wed, 8 Jun 2016 14:23:15 +0000 (10:23 -0400)]
closes #8087
Merge branch 'wtsi-hgi-8087-arv-cli-request-body-from-file'
radhika [Wed, 8 Jun 2016 11:23:19 +0000 (07:23 -0400)]
Merge branch 'master' into wtsi-hgi-8087-arv-cli-request-body-from-file
radhika [Wed, 8 Jun 2016 11:19:36 +0000 (07:19 -0400)]
Merge branch '8087-arv-cli-request-body-from-file' of https://github.com/wtsi-hgi/arvados into wtsi-hgi-8087-arv-cli-request-body-from-file
radhika [Wed, 8 Jun 2016 03:26:46 +0000 (23:26 -0400)]
Merge branch 'master' into 9318-dashboard-uses-work-units
radhika [Wed, 8 Jun 2016 03:26:15 +0000 (23:26 -0400)]
refs #8876
Merge branch '8876-work-unit'
radhika [Wed, 8 Jun 2016 03:23:43 +0000 (23:23 -0400)]
8876: Pass work unit to determine_wallclock_runtime, not the original object.
radhika [Tue, 7 Jun 2016 21:07:38 +0000 (17:07 -0400)]
Merge branch 'master' into 9318-dashboard-uses-work-units
radhika [Tue, 7 Jun 2016 21:07:04 +0000 (17:07 -0400)]
closes #8876
Merge branch '8876-work-unit'
radhika [Tue, 7 Jun 2016 21:00:41 +0000 (17:00 -0400)]
8876: remove show_child_summary and replace it with is_running?
Peter Amstutz [Tue, 7 Jun 2016 20:43:19 +0000 (16:43 -0400)]
Bugfix submitting cwl jobs with arvados-cwl-runner refs #9275
Peter Amstutz [Tue, 7 Jun 2016 20:24:37 +0000 (16:24 -0400)]
Merge branch '9275-cwl-runner-creates-jobs' closes #9275
Peter Amstutz [Tue, 7 Jun 2016 20:17:49 +0000 (16:17 -0400)]
9275: Initial pipeline/job component update from response
radhika [Tue, 7 Jun 2016 17:23:12 +0000 (13:23 -0400)]
Merge branch '8650-container-work-unit' into 9318-dashboard-uses-work-units
radhika [Tue, 7 Jun 2016 17:22:50 +0000 (13:22 -0400)]
Merge branch '8876-work-unit' into 8650-container-work-unit
radhika [Tue, 7 Jun 2016 16:55:59 +0000 (12:55 -0400)]
8876: when computing cpu and running times, use the work unit's start and finished times if there are no children.
radhika [Tue, 7 Jun 2016 15:43:00 +0000 (11:43 -0400)]
9318: running and finished containers and fixtures updated.
radhika [Tue, 7 Jun 2016 02:40:01 +0000 (22:40 -0400)]
9275: Update the update_pipeline_component method to check if pipeline is null.
radhika [Tue, 7 Jun 2016 00:49:46 +0000 (20:49 -0400)]
Merge branch '8650-container-work-unit' into 9318-dashboard-uses-work-units
Conflicts:
apps/workbench/app/models/proxy_work_unit.rb
radhika [Tue, 7 Jun 2016 00:37:01 +0000 (20:37 -0400)]
8650: some more methods in ContainerWorkUnit
radhika [Tue, 7 Jun 2016 00:29:30 +0000 (20:29 -0400)]
Merge branch '8876-work-unit' into 8650-container-work-unit
radhika [Tue, 7 Jun 2016 00:28:21 +0000 (20:28 -0400)]
8876: Use JobWorkUnit for pipeline components and cleanup.
radhika [Mon, 6 Jun 2016 19:50:58 +0000 (15:50 -0400)]
9275: add record to cwl_runner_job as components
Peter Amstutz [Mon, 6 Jun 2016 19:26:03 +0000 (15:26 -0400)]
Merge branch '9187-crunchv2-dispatching' closes #9187
Peter Amstutz [Mon, 6 Jun 2016 18:46:37 +0000 (14:46 -0400)]
Merge branch 'master' into 9187-crunchv2-dispatching
Peter Amstutz [Mon, 6 Jun 2016 14:44:11 +0000 (10:44 -0400)]
9187: Remove "squeueError" because checkSqueue waits for a successful squeue run. Refactor tests a bit and add a test for canceling containers.
radhika [Sat, 4 Jun 2016 23:19:20 +0000 (19:19 -0400)]
Merge branch '8650-container-work-unit' into 9318-dashboard-uses-work-units
Conflicts:
apps/workbench/app/views/work_unit/_progress.html.erb
apps/workbench/test/unit/work_unit_test.rb
radhika [Sat, 4 Jun 2016 23:09:58 +0000 (19:09 -0400)]
Merge branch '8876-work-unit' into 8650-container-work-unit
Conflicts:
apps/workbench/test/unit/work_unit_test.rb
radhika [Sat, 4 Jun 2016 23:03:32 +0000 (19:03 -0400)]
8876: add tests for link_to_log and queuedtime etc.
radhika [Sat, 4 Jun 2016 14:06:09 +0000 (10:06 -0400)]
8876: introduce view helper methods such as link_to_log and queuedtime etc so that the views do not have to make too many decisions based on the state of the work unit.
Peter Amstutz [Fri, 3 Jun 2016 21:57:48 +0000 (17:57 -0400)]
9187: Fix refactoring mess-up
radhika [Fri, 3 Jun 2016 20:52:06 +0000 (16:52 -0400)]
8876: display "no process has been submitted" when a child uuid is not presented.
Tom Clegg [Fri, 3 Jun 2016 19:54:05 +0000 (15:54 -0400)]
Merge branch '9272-use-container-auth'
closes #9272
Tom Clegg [Fri, 27 May 2016 01:25:47 +0000 (21:25 -0400)]
9272: Skip slow test when running -short tests.
Tom Clegg [Fri, 27 May 2016 01:22:11 +0000 (21:22 -0400)]
9272: Simplify json decoding with Unmarshal.
Tom Clegg [Fri, 27 May 2016 01:17:31 +0000 (21:17 -0400)]
9272: Fix up state transitions:
* Change state to Running only at the last possible moment before
starting the container.
* When erroring out before Running, change state back to Queued.
* Do not save log/output/exit code when changing state to Cancelled.
Incidental fixes:
* Clean up error handling in Run()
* Don't create a collection for (or try to attach to the container)
the second "cleanup activities" log that gets opened after closing
the real container log.
Tom Clegg [Thu, 26 May 2016 20:48:08 +0000 (16:48 -0400)]
9272: Pass container auth info into container if requested.
Tom Clegg [Thu, 26 May 2016 19:50:21 +0000 (15:50 -0400)]
9272: Get container auth instead of passing the dispatcher token into the container.
radhika [Fri, 3 Jun 2016 17:13:26 +0000 (13:13 -0400)]
9275: create pipeline_instance in submit mode as well and add the runner job to its components.
radhika [Fri, 3 Jun 2016 14:36:14 +0000 (10:36 -0400)]
9318: Compute node summary pane includes queued and locked containers.
Peter Amstutz [Fri, 3 Jun 2016 02:46:55 +0000 (22:46 -0400)]
9187: Fix comment typo
Peter Amstutz [Fri, 3 Jun 2016 02:18:55 +0000 (22:18 -0400)]
9187: Add documentation comments to Squeue functions.
Peter Amstutz [Thu, 2 Jun 2016 21:59:20 +0000 (17:59 -0400)]
9187: Improve squeue synchronization
* Put squeue functions into separate file.
* CheckSqueue() now blocks on a condition variable until the next successful
update of squeue, which then wakes up all goroutines waiting on CheckSqueue().
* Never do anything when squeue returns an error.
* Merge submitting, monitoring, and cleanup behaviors into a single goroutine
which updates based on CheckSqueue() instead of a ticker.
* Introduce a lock on squeue, sbatch and scancel operations, so that on next
wakeup the queue is guaranteed to reflect most recent sbatch/scancel
operations.
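
The condition-variable pattern in the list above can be sketched as follows.
The dispatcher itself is written in Go, so this is a Python analog under stated
assumptions rather than the actual code: SqueueCache, update, check, and the
generation counter are all hypothetical names used only for illustration.

```python
import threading


class SqueueCache:
    """Hypothetical analog of a shared squeue result guarded by a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._generation = 0      # counts successful squeue runs
        self._jobs = frozenset()  # job names seen in the last successful run

    def update(self, jobs, error=None):
        """Called by the single polling thread after each squeue run."""
        if error is not None:
            return  # never do anything when squeue returns an error
        with self._cond:
            self._jobs = frozenset(jobs)
            self._generation += 1
            self._cond.notify_all()  # wake every thread blocked in check()

    def check(self, name, timeout=None):
        """Block until the next successful update, then report whether name is queued."""
        with self._cond:
            start = self._generation
            if not self._cond.wait_for(lambda: self._generation > start, timeout):
                raise TimeoutError("no successful squeue update")
            return name in self._jobs
```

Because every caller waits for a run that finished after it started waiting,
one squeue invocation serves all waiting threads, and a failed run leaves them
blocked rather than handing them stale or empty results.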
radhika [Thu, 2 Jun 2016 23:53:24 +0000 (19:53 -0400)]
9318: "Active" and "Recently finished" panes in dashboard are updated to use work_unit interface.
Tom Clegg [Thu, 2 Jun 2016 20:51:11 +0000 (16:51 -0400)]
Merge branch '9343-no-env-vars'
refs #9343
Tom Clegg [Thu, 2 Jun 2016 20:38:55 +0000 (16:38 -0400)]
9343: Do not check env vars when setting up Keep client for pull requests.
radhika [Thu, 2 Jun 2016 19:07:48 +0000 (15:07 -0400)]
Merge branch '8876-work-unit' into 8650-container-work-unit
radhika [Thu, 2 Jun 2016 19:07:15 +0000 (15:07 -0400)]
8876: success? includes Canceled as well.
radhika [Thu, 2 Jun 2016 19:04:18 +0000 (15:04 -0400)]
8650: add children to container_work_unit
radhika [Wed, 1 Jun 2016 23:22:13 +0000 (19:22 -0400)]
Merge branch '8876-work-unit' into 8650-container-work-unit
radhika [Wed, 1 Jun 2016 23:20:42 +0000 (19:20 -0400)]
8876: move some methods such as log and output from job_work_unit into proxy_work_unit to aid reusability.
radhika [Wed, 1 Jun 2016 23:05:37 +0000 (19:05 -0400)]
8650: add container_work_unit
radhika [Wed, 1 Jun 2016 22:55:37 +0000 (18:55 -0400)]
8876: child_summary_str checks if total > 0
Peter Amstutz [Wed, 1 Jun 2016 20:06:26 +0000 (16:06 -0400)]
9187: Slurm dispatcher improvements around squeue
* Clarify that status updates are not guaranteed to be delivered on a
heartbeat.
* Refactor slurm dispatcher to monitor the container in squeue in a separate
goroutine.
* Refactor polling squeue to a single goroutine and cache the results so that
monitoring 100 containers doesn't result in 100 calls to squeue.
* No longer set up strigger to cancel job on finish, instead cancel running
jobs not in squeue.
* Test both cases where a job is/is not in squeue.
radhika [Wed, 1 Jun 2016 19:11:39 +0000 (15:11 -0400)]
Merge branch 'master' into 8876-work-unit
Brett Smith [Wed, 1 Jun 2016 18:28:24 +0000 (14:28 -0400)]
Merge branch '9309-centos-7-packages-wip'
Refs #9309. Closes #9313.
Brett Smith [Fri, 27 May 2016 22:34:56 +0000 (18:34 -0400)]
9309: Add packages and tests for CentOS 7.
Brett Smith [Tue, 31 May 2016 21:37:02 +0000 (17:37 -0400)]
9309: Look for fpm-info in backports/$PACKAGE_NAME.
This lets us define additional fpm flags when we build a non-dir
package from a source directory.
Brett Smith [Tue, 31 May 2016 21:36:08 +0000 (17:36 -0400)]
9242: Restore newer backported versions of Python packages.
I accidentally reverted this in
758d39f.
Refs #9242.
Brett Smith [Tue, 31 May 2016 20:35:53 +0000 (16:35 -0400)]
9242: Update Python module paths for CentOS 6.
I am more sure that this is correct, based on multiple data points
from Python 2 and 3 packages across CentOS 6 and 7.
This might be a change that's fallout from
44ceaa474a330f12dd9e00115af107d7258044f2.
Refs #9242.
Tom Clegg [Tue, 31 May 2016 20:23:30 +0000 (16:23 -0400)]
Merge branch '9162-keep-balance'
closes #9162
Tom Clegg [Tue, 24 May 2016 14:02:39 +0000 (10:02 -0400)]
9162: Add replication level histogram
Ported from
00a8ece1580a894dbbf9f756685eefc134e4d0d6 by jrandall
Tom Clegg [Mon, 16 May 2016 21:09:21 +0000 (17:09 -0400)]
9162: Add keep-balance
Brett Smith [Tue, 31 May 2016 20:09:57 +0000 (16:09 -0400)]
Merge branch '9242-python-backport-prefix-wip'
Closes #9242, #9247.
Brett Smith [Tue, 31 May 2016 15:13:41 +0000 (11:13 -0400)]
9242: Python packages install libraries to the distro path.
This avoids breaking dependent packages that expect to find files in
the same place.
radhika [Tue, 31 May 2016 17:12:06 +0000 (13:12 -0400)]
8876: typo in fixture
radhika [Tue, 31 May 2016 17:07:09 +0000 (13:07 -0400)]
8876: correct the job_reader2 fixture
radhika [Tue, 31 May 2016 16:52:35 +0000 (12:52 -0400)]
8876: improve jobs_with_components test to have components that can be un/read
radhika [Tue, 31 May 2016 15:42:42 +0000 (11:42 -0400)]
Merge branch 'master' into 8876-work-unit
Brett Smith [Thu, 19 May 2016 19:41:16 +0000 (15:41 -0400)]
9242: Refactor Python constant definitions in r-b-p.
There are about to be more of them, which will make this a real space
savings.
Brett Smith [Tue, 31 May 2016 01:32:52 +0000 (21:32 -0400)]
9316: Include documentation in CWL SDK.
This is necessary to make pip distributions installable, since
setup.py tries to open README.rst. Closes #9316.