Brett Smith [Fri, 27 May 2016 22:34:56 +0000 (18:34 -0400)]
9309: Add packages and tests for CentOS 7.
Brett Smith [Tue, 31 May 2016 21:37:02 +0000 (17:37 -0400)]
9309: Look for fpm-info in backports/$PACKAGE_NAME.
This lets us define additional fpm flags when we build a non-dir
package from a source directory.
Brett Smith [Tue, 31 May 2016 21:36:08 +0000 (17:36 -0400)]
9242: Restore newer backported versions of Python packages.
I accidentally reverted this in 758d39f.
Refs #9242.
Brett Smith [Tue, 31 May 2016 20:35:53 +0000 (16:35 -0400)]
9242: Update Python module paths for CentOS 6.
I am more sure that this is correct, based on multiple data points
from Python 2 and 3 packages across CentOS 6 and 7.
This might be a change that's fallout from
44ceaa474a330f12dd9e00115af107d7258044f2.
Refs #9242.
Tom Clegg [Tue, 31 May 2016 20:23:30 +0000 (16:23 -0400)]
Merge branch '9162-keep-balance'
closes #9162
Tom Clegg [Tue, 24 May 2016 14:02:39 +0000 (10:02 -0400)]
9162: Add replication level histogram
Ported from 00a8ece1580a894dbbf9f756685eefc134e4d0d6 by jrandall
Tom Clegg [Mon, 16 May 2016 21:09:21 +0000 (17:09 -0400)]
9162: Add keep-balance
Brett Smith [Tue, 31 May 2016 20:09:57 +0000 (16:09 -0400)]
Merge branch '9242-python-backport-prefix-wip'
Closes #9242, #9247.
Brett Smith [Tue, 31 May 2016 15:13:41 +0000 (11:13 -0400)]
9242: Python packages install libraries to the distro path.
This avoids breaking dependent packages that expect to find files in
the same place.
Brett Smith [Thu, 19 May 2016 19:41:16 +0000 (15:41 -0400)]
9242: Refactor Python constant definitions in r-b-p.
There are about to be more of them, which will make this a real space
savings.
Brett Smith [Tue, 31 May 2016 01:32:52 +0000 (21:32 -0400)]
9316: Include documentation in CWL SDK.
This is necessary to make pip distributions installable, since
setup.py tries to open README.rst. Closes #9316.
Ward Vandewege [Sat, 28 May 2016 13:27:16 +0000 (09:27 -0400)]
Fix centos6 package build (ruamel.yaml package building arguments for fpm).
No issue #
Brett Smith [Fri, 27 May 2016 22:43:38 +0000 (18:43 -0400)]
Update Software Collections package name in Install Guide.
Follows up previous commit. No issue #.
Brett Smith [Fri, 27 May 2016 22:42:09 +0000 (18:42 -0400)]
Update Software Collections package name in CentOS 6 Dockerfiles.
Why does the name of this package keep changing?
It is a mystery.
No issue #.
Brett Smith [Fri, 27 May 2016 22:30:57 +0000 (18:30 -0400)]
8959: Remove redundant python-gflags fpm-info.sh.
I added this file in 495a485ff. Later, Nico pinned the version in
run-build-packages, in a8bbf6ef, to try to fix #8959. However, odds
are that #8959 was an ops problem, and not a package building problem:
the gflags 3.0 packages were still published on our repository, and
needed to be removed there.
Having both files causes trouble when you're building backports from
scratch. We haven't noticed because Jenkins never does that. But
I'm working on new packages and getting:
Loading fpm overrides from /arvados/backports/python-gflags/fpm-info.sh
fpm --maintainer=Ward Vandewege <ward@curoverse.com> -s python -t rpm --exclude=*/dist-packages/tests/* --exclude=*/site-packages/tests/* --verbose --log info -n python-gflags --iteration 1 --python-bin python2.7 --python-easyinstall easy_install-2.7 --python-package-name-prefix python --depends python -v 2.0 python-gflags==2.0
Error: python-gflags==2.0: Unable to figure out package name from fpm results:
{:timestamp=>"2016-05-27T22:20:53.045329+0000", :message=>"Setting workdir", :workdir=>"/tmp", :level=>:info} {:timestamp=>"2016-05-27T22:20:53.049435+0000", :message=>"Trying to download", :package=>"python-gflags==2.0", :level=>:info} {:timestamp=>"2016-05-27T22:20:53.122897+0000", :message=>"error: Not a URL, existing file, or requirement spec: 'python-gflags==2.0==2.0'", :level=>:info} {:timestamp=>"2016-05-27T22:20:53.130325+0000", :message=>"Process failed: easy_install-2.7 failed (exit code 1). Full command was:[\"easy_install-2.7\", \"-i\", \"https://pypi.python.org/simple\", \"--editable\", \"-U\", \"--build-directory\", \"/tmp/package-python-build20160527-1643-3sl5ec/python-gflags==2.0\", \"python-gflags==2.0==2.0\"]", :level=>:error}
Refs #8959.
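The doubled `==2.0==2.0` in the log above is consistent with the version being pinned in two places at once. A minimal sketch of that effect, assuming the requirement spec is formed by joining the package name and the version with `==`; `requirement_spec` is a hypothetical helper for illustration, not fpm's own code.

```python
def requirement_spec(name, version=None):
    """Join a package name and an optional version into a pip-style pin."""
    return name if version is None else "{}=={}".format(name, version)

# The name already carries a pin, and the same version is passed again.
doubled = requirement_spec("python-gflags==2.0", "2.0")

# Pinning the version in one place only yields a valid spec.
single = requirement_spec("python-gflags", "2.0")
```

With only one source of the pin, the spec stays a valid requirement string.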
Ward Vandewege [Fri, 27 May 2016 21:35:24 +0000 (17:35 -0400)]
Add dependency for ruamel.yaml to the build list.
No issue #
Ward Vandewege [Fri, 27 May 2016 21:34:55 +0000 (17:34 -0400)]
Fix bug in run-build-packages-one-target.sh: make sure to escape the *
passed to find.
No issue #
Tom Clegg [Fri, 27 May 2016 15:35:30 +0000 (11:35 -0400)]
Merge branch '9272-test-races'
refs #9272
Tom Clegg [Thu, 26 May 2016 19:49:49 +0000 (15:49 -0400)]
9272: Fix some race conditions in flaky tests.
Peter Amstutz [Fri, 27 May 2016 15:25:22 +0000 (11:25 -0400)]
Arvbox installs binaries for Go 1.6 instead of the golang Debian package
(which is stuck at 1.3). No issue #
Ward Vandewege [Thu, 26 May 2016 15:37:06 +0000 (11:37 -0400)]
Package ruamel.yaml, which is a new dependency of schema-salad.
No issue #
Peter Amstutz [Thu, 26 May 2016 14:09:04 +0000 (10:09 -0400)]
Merge branch '9303-actor-dead-dead' refs #9303
Peter Amstutz [Thu, 26 May 2016 13:51:24 +0000 (09:51 -0400)]
9303: Fetch arv_node before trying to shut down node, because monitor actor may
go away once the node has been successfully shut down. Also handle case of
node_finished_shutdown called after shutdown actor is stopped.
Peter Amstutz [Wed, 25 May 2016 20:33:49 +0000 (16:33 -0400)]
Log watchdog exception refs #9303
Peter Amstutz [Wed, 25 May 2016 20:30:10 +0000 (16:30 -0400)]
Merge branch '9303-kill-nodemanager-on-dead-actor' refs #9303
Peter Amstutz [Wed, 25 May 2016 20:29:30 +0000 (16:29 -0400)]
9303: Watchdog kill node manager on any error
Ward Vandewege [Tue, 24 May 2016 19:41:08 +0000 (15:41 -0400)]
Build distribution packages for the version of python-schema-salad that
python-cwltool now depends on.
refs #8653
Ward Vandewege [Tue, 24 May 2016 17:10:53 +0000 (13:10 -0400)]
Build distribution packages for the version of python-cwltool that
python-arvados-cwl-runner now depends on.
refs #8653
radhika [Tue, 24 May 2016 16:41:17 +0000 (12:41 -0400)]
closes #8556
Merge branch '8556-azure-trash'
radhika [Tue, 24 May 2016 16:39:28 +0000 (12:39 -0400)]
8556: implement trash/untrash for azure volumes.
Tom Clegg [Fri, 20 May 2016 14:01:55 +0000 (10:01 -0400)]
Merge branch 'wtsi-hgi-9231-rename-redunancy-to-replication-desired'
closes #9231
Peter Amstutz [Thu, 19 May 2016 20:02:17 +0000 (16:02 -0400)]
Merge branch '8653-cwl-runner-handle-files' closes #8653
Peter Amstutz [Thu, 19 May 2016 19:48:46 +0000 (15:48 -0400)]
8653: Fix tests.
Peter Amstutz [Thu, 19 May 2016 17:50:47 +0000 (13:50 -0400)]
8653: Use load_tool.fetch_document() instead of Loader() to read raw document.
Peter Amstutz [Thu, 19 May 2016 15:32:00 +0000 (11:32 -0400)]
8653: Add cwlVersion so files validate correctly.
Peter Amstutz [Thu, 19 May 2016 02:08:48 +0000 (22:08 -0400)]
8653: Fix pathmapper API
Peter Amstutz [Thu, 19 May 2016 02:06:19 +0000 (22:06 -0400)]
8653: Set basedir for CollectionFsAccess
Peter Amstutz [Thu, 19 May 2016 02:01:15 +0000 (22:01 -0400)]
8653: Update load_tool in cwl-runner crunch script
Joshua C. Randall [Wed, 18 May 2016 13:35:37 +0000 (14:35 +0100)]
Renames 'redundancy' to 'replication_desired'
Peter Amstutz [Wed, 18 May 2016 21:42:44 +0000 (17:42 -0400)]
8653: Check that parameters are basestring before matching regex.
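A minimal sketch of the guard this commit describes, written for Python 3, where `str` stands in for Python 2's `basestring`. The pattern `KEEP_REF` and the function `is_keep_ref` are hypothetical names chosen for illustration, not taken from the crunch script.

```python
import re

# Hypothetical pattern: 32 hex digits, "+", then a size.
KEEP_REF = re.compile(r"^[0-9a-f]{32}\+\d+")

def is_keep_ref(value):
    """Return True only for strings that match the pattern.

    Non-string parameters (ints, lists, None) are rejected up front,
    because matching a str pattern against them raises TypeError.
    """
    if not isinstance(value, str):
        return False
    return KEEP_REF.match(value) is not None
```

Checking the type first means numeric or structured parameters pass through untouched instead of raising during the match.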
Peter Amstutz [Wed, 18 May 2016 20:40:48 +0000 (16:40 -0400)]
8653: Update cwl-runner to match changes in sdk/arvados-cwl-runner
Peter Amstutz [Wed, 18 May 2016 20:33:57 +0000 (16:33 -0400)]
8653: cwl-runner crunch script rewrites keep file paths into CWL File objects.
Clean up argument handling in arvados-cwl-runner so that --create-template
doesn't require a job object, and that --help doesn't present options that are
irrelevant or don't work.
Peter Amstutz [Wed, 18 May 2016 15:00:59 +0000 (11:00 -0400)]
Merge branch '9018-nodemanager-kill-instead-of-killpg' closes #9018
Peter Amstutz [Wed, 18 May 2016 14:59:03 +0000 (10:59 -0400)]
9018: Change os.killpg() -> os.kill, don't create new process group.
Peter Amstutz [Wed, 18 May 2016 13:27:15 +0000 (09:27 -0400)]
Merge branch '8236-nodemanager-watchdog' closes #8236
Peter Amstutz [Tue, 17 May 2016 20:59:20 +0000 (16:59 -0400)]
8236: Restore os.killpg(). Create a new process group so that it won't kill
the parent process by accident. Watchdog process now only monitors specific
actors.
Brett Smith [Tue, 17 May 2016 20:22:50 +0000 (16:22 -0400)]
Merge branch '9049-arv-copy-filters-wip'
Closes #9049, #9225.
Brett Smith [Tue, 17 May 2016 16:38:39 +0000 (12:38 -0400)]
9049: arv-copy checks and updates pipeline template filters.
Peter Amstutz [Tue, 17 May 2016 15:44:05 +0000 (11:44 -0400)]
8236: Add comment to BogusActor.ping()
Peter Amstutz [Tue, 17 May 2016 15:16:47 +0000 (11:16 -0400)]
Merge branch 'master' into 8236-nodemanager-watchdog
Peter Amstutz [Tue, 17 May 2016 15:15:53 +0000 (11:15 -0400)]
Merge branch '9161-node-state-fixes' closes #9161
Peter Amstutz [Tue, 17 May 2016 15:15:16 +0000 (11:15 -0400)]
8236: Add watchdog actor. This calls ping() on every other actor to check that
it is responsive. If an actor fails to respond, kill node manager.
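A rough sketch of the watchdog idea, using plain Python objects instead of the real actor framework. `Actor`, `check_actors`, and the `on_failure` callback are assumptions made for illustration; the actual node manager code may be structured differently.

```python
class Actor:
    """Stand-in for an actor: answers ping() unless it is wedged."""

    def __init__(self, responsive=True):
        self.responsive = responsive

    def ping(self, timeout):
        # A real actor would reply through its message queue within
        # `timeout` seconds; this stand-in just reports a fixed flag.
        return self.responsive


def check_actors(actors, timeout, on_failure):
    """Ping each actor once; report the first one that fails to answer."""
    for name, actor in actors.items():
        try:
            ok = actor.ping(timeout)
        except Exception:
            # Any error while pinging counts as an unresponsive actor.
            ok = False
        if not ok:
            on_failure(name)
            return False
    return True
```

In a real daemon, `on_failure` would terminate the whole process so that a supervisor can restart it; here it is left as a callback so the check can be exercised without killing anything.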
Peter Amstutz [Tue, 17 May 2016 13:18:06 +0000 (09:18 -0400)]
9161: Remove unused "paired()" function
Peter Amstutz [Tue, 17 May 2016 12:50:51 +0000 (08:50 -0400)]
Merge branch 'master' into 9161-node-state-fixes
Ward Vandewege [Tue, 17 May 2016 01:05:37 +0000 (21:05 -0400)]
Remove hardcoded -v in call to run_upload_packages.py
refs #9224
Ward Vandewege [Mon, 16 May 2016 21:33:44 +0000 (17:33 -0400)]
When running run-build-packages-python-and-ruby.sh with --debug, pass
--verbose to the upload command.
refs #9224
Peter Amstutz [Mon, 16 May 2016 20:36:41 +0000 (16:36 -0400)]
9161: Remove spurious prints
Peter Amstutz [Mon, 16 May 2016 18:30:06 +0000 (14:30 -0400)]
9161: Don't automatically consider nodes with job_uuid set to be 'busy'.
Peter Amstutz [Mon, 16 May 2016 14:29:50 +0000 (10:29 -0400)]
9161: Decisions to start and stop compute nodes are now based on an explicit
set of states: booting, unpaired, idle, busy, down, shutdown. Refactor to
remove 'shutdowns' dict and fold into cloud_nodes. Nodes_wanted uses same
computation of node state as used for decision to shut down nodes. Nodes for
which the state is unclear are either idle (if in the boot grace period) or
down (if older).
Peter Amstutz [Fri, 13 May 2016 20:36:02 +0000 (16:36 -0400)]
9161: Put nodes tagged _nodemanager_recently_booted back into the node list.
Peter Amstutz [Fri, 13 May 2016 20:09:10 +0000 (16:09 -0400)]
9161: Add _nodemanager_recently_booted as new way of remembering nodes which are in intermediate state between being created and showing up in the cloud node list.
Tom Clegg [Fri, 13 May 2016 19:38:51 +0000 (15:38 -0400)]
Accept auth tokens with uppercase letters.
No issue #
Peter Amstutz [Fri, 13 May 2016 18:26:30 +0000 (14:26 -0400)]
9161: Adjusting behavior to accommodate down/broken/missing nodes.
Brett Smith [Fri, 13 May 2016 15:25:36 +0000 (11:25 -0400)]
Merge branch '9213-fix-arv-gems-wip'
Closes #9213, #9215.
Brett Smith [Thu, 12 May 2016 20:48:43 +0000 (16:48 -0400)]
9213: Update arv's `gem install` suggestions.
This makes it match what it actually loads.
Brett Smith [Thu, 12 May 2016 20:40:37 +0000 (16:40 -0400)]
9213: Improve gem loading in `arv`.
* Include the exception string in the error message.
* Separate stdlib loading problems from gem loading problems.
* Load gems with more dependencies first, to avoid situations like
this:
irb(main):001:0> require 'active_support/inflector'
=> true
irb(main):002:0> require 'arvados/google_api_client'
Gem::LoadError: Unable to activate arvados-0.1.20160420143004, because activesupport-4.2.6 conflicts with activesupport (< 4.2.6, >= 3)
Brett Smith [Thu, 12 May 2016 20:37:59 +0000 (16:37 -0400)]
9213: Fix google-api-client dependency range in gemspecs.
Brett Smith [Fri, 13 May 2016 14:55:31 +0000 (10:55 -0400)]
Merge branch '9135-eventclient-run-forever-wip'
Closes #9135, #9157.
Brett Smith [Mon, 9 May 2016 16:54:23 +0000 (12:54 -0400)]
9135: Bring EventClient's public interface closer to PollClient's.
* Restore the run_forever method, which was previously inherited from
WebSocketClient.
* Remove the connect and close_connection methods, which are
WebSocketClient implementation details that don't make sense as part
of the public interface. (A running EventClient will just reconnect
if you call close_connection on it.)
Brett Smith [Mon, 9 May 2016 16:57:42 +0000 (12:57 -0400)]
9135: Make EventClient initialization more consistent.
* DRY up the setup code. This includes always trying to close the
connection after failure, since we were doing that in the initial
connection.
* Make the client a daemon thread, for consistency with PollClient.
Brett Smith [Mon, 9 May 2016 16:40:10 +0000 (12:40 -0400)]
9135: Clean imports in test_events.
Brett Smith [Mon, 9 May 2016 16:18:28 +0000 (12:18 -0400)]
9135: Add basic tests for Python events listeners.
These ensure that both classes have the core methods subscribe,
unsubscribe, run_forever, and close.
Rename the test file to test_events, to better match other test
patterns, and account for the fact it tests both classes in the
module.
Peter Amstutz [Fri, 13 May 2016 14:11:39 +0000 (10:11 -0400)]
9161: Eliminate 'booted' list and put nodes directly into cloud_nodes list.
Refactor logic for registering cloud nodes. Refactor computation of nodes
wanted; explicitly model 'unpaired' and 'down'.
Tom Clegg [Fri, 13 May 2016 13:37:49 +0000 (09:37 -0400)]
9188: Update SetBlobMetadata func signature.
refs #9188
Tom Clegg [Thu, 12 May 2016 15:34:51 +0000 (11:34 -0400)]
Merge branch '8128-crunch2-auth-api'
closes #8128
Tom Clegg [Thu, 12 May 2016 14:23:31 +0000 (10:23 -0400)]
8128: Fix test race.
Tom Clegg [Thu, 12 May 2016 13:08:14 +0000 (09:08 -0400)]
8128: Fix flaky test: pipe the "echo UUID" script to sh, not to "echo UUID".
Tom Clegg [Wed, 11 May 2016 15:01:28 +0000 (11:01 -0400)]
8128: Use row lock during Container update, add comments.
Tom Clegg [Tue, 10 May 2016 14:45:05 +0000 (10:45 -0400)]
8128: Add arvados.v1.api_client_authorizations.current
Tom Clegg [Mon, 9 May 2016 19:33:05 +0000 (15:33 -0400)]
8128: Add runtime tokens for containers, and locks for multiple dispatchers
Tom Clegg [Thu, 5 May 2016 21:50:44 +0000 (17:50 -0400)]
8128: Update crunch-dispatch-local to use new Locked state.
Tom Clegg [Thu, 5 May 2016 21:15:51 +0000 (17:15 -0400)]
8128: Update crunch-dispatch-slurm to use new Locked state.
Tom Clegg [Thu, 5 May 2016 19:46:20 +0000 (15:46 -0400)]
8128: Add Locked state to Container model.
Tom Clegg [Thu, 28 Apr 2016 15:16:50 +0000 (11:16 -0400)]
8128: De-dup container unit tests
Peter Amstutz [Wed, 11 May 2016 20:55:00 +0000 (16:55 -0400)]
9161: There's a window between when a node pings for the first time and the
value of 'slurm_state' is synchronized by crunch-dispatch. In this window, the
node will still report as 'down'. Check first_ping_at and implement a grace
period where the node will be considered 'idle'.
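The grace period described above might look roughly like this. The function name `effective_state`, the `BOOT_GRACE_SECONDS` value, and the use of epoch seconds for `first_ping_at` are all assumptions for the sake of the sketch.

```python
import time

BOOT_GRACE_SECONDS = 600  # hypothetical grace period


def effective_state(slurm_state, first_ping_at, now=None,
                    grace=BOOT_GRACE_SECONDS):
    """Treat a freshly pinged node as 'idle' while its state is unsynced.

    first_ping_at is the epoch time of the node's first ping, or None
    if the node has never pinged.
    """
    now = time.time() if now is None else now
    recently_pinged = (first_ping_at is not None
                       and now - first_ping_at < grace)
    if slurm_state == "down" and recently_pinged:
        return "idle"
    return slurm_state
```

Only the 'down' state is overridden, so a node that already reports another state keeps it, and a node that stays 'down' past the grace period is reported as down.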
Peter Amstutz [Wed, 11 May 2016 15:42:44 +0000 (11:42 -0400)]
Merge branch '8886-async-permission-update' refs #8886
Peter Amstutz [Wed, 11 May 2016 13:50:23 +0000 (09:50 -0400)]
8886: Restore behavior in group_permissions to call
calculate_group_permissions when cache is empty and async_permissions_update is
not true.
radhika [Tue, 10 May 2016 17:35:24 +0000 (13:35 -0400)]
closes #8017
Merge branch '8017-slurm-runtime-constraints'
radhika [Tue, 10 May 2016 17:34:34 +0000 (13:34 -0400)]
8017: RuntimeConstraints uses int64
radhika [Tue, 10 May 2016 17:23:28 +0000 (13:23 -0400)]
closes #8464
Merge branch '8464-crunch2-stdout'
radhika [Tue, 10 May 2016 15:59:16 +0000 (11:59 -0400)]
8017: RuntimeConstraints uses int64
radhika [Tue, 10 May 2016 15:28:54 +0000 (11:28 -0400)]
Merge branch '8017-slurm-runtime-constraints' of git.curoverse.com:arvados into 8017-slurm-runtime-constraints
radhika [Tue, 10 May 2016 15:25:45 +0000 (11:25 -0400)]
8464: stdout handling
Peter Amstutz [Mon, 9 May 2016 20:40:58 +0000 (16:40 -0400)]
8886: Add timestamp checking to permission updates.
radhika [Mon, 9 May 2016 16:08:15 +0000 (12:08 -0400)]
8017: mem-per-cpu
radhika [Tue, 3 May 2016 19:09:54 +0000 (15:09 -0400)]
8017: pass ram and vcpus runtime_constraints from Container to sbatch command.
radhika [Mon, 9 May 2016 16:08:15 +0000 (12:08 -0400)]
8017: mem-per-cpu
radhika [Mon, 9 May 2016 14:34:47 +0000 (10:34 -0400)]
Merge branch 'master' into 8017-slurm-runtime-constraints
radhika [Thu, 5 May 2016 21:51:47 +0000 (17:51 -0400)]
8464: Add stdout redirection in crunch2.
Tom Clegg [Thu, 5 May 2016 14:54:58 +0000 (10:54 -0400)]
Merge branch '9017-apiserver-short-tests'
refs #9017