radhika [Thu, 15 Oct 2015 03:01:13 +0000 (23:01 -0400)]
Merge branch 'master' into 7167-keep-rsync
radhika [Thu, 15 Oct 2015 03:00:05 +0000 (23:00 -0400)]
7167: log progress during keep-rsync and several test improvements.
Brett Smith [Wed, 14 Oct 2015 20:59:40 +0000 (16:59 -0400)]
API server needs an arvados-cli with crunch-job --docker-bin.
crunch-dispatch was extended to use `crunch-job --docker-bin` in
#6838. This commit simply updates the Gemfile to ensure this
dependency is satisfied. No issue #.
radhika [Wed, 14 Oct 2015 19:15:15 +0000 (15:15 -0400)]
Merge branch 'master' into 7167-keep-rsync
Tom Clegg [Wed, 14 Oct 2015 18:51:34 +0000 (14:51 -0400)]
Merge branch '7167-propagate-error' refs #7167
Tom Clegg [Wed, 14 Oct 2015 18:51:01 +0000 (14:51 -0400)]
Merge branch '7159-clean-index' refs #7159 refs #7168
radhika [Wed, 14 Oct 2015 17:43:54 +0000 (13:43 -0400)]
7167: loadConfig and setupKeepclient each handle only one set at a time.
radhika [Wed, 14 Oct 2015 02:27:41 +0000 (22:27 -0400)]
7167: when the config file does not contain '/', use $HOME/.config/arvados/<filename>.
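The lookup rule in the commit above can be sketched as follows. This is a minimal Python illustration of the stated rule only, not the keep-rsync implementation; the function name `resolve_config_path` and its `home` parameter are hypothetical.

```python
import os


def resolve_config_path(name, home=None):
    """Apply the rule described in the commit message.

    A name containing '/' is treated as a path and returned unchanged.
    A bare filename is looked up under $HOME/.config/arvados/.
    """
    if "/" in name:
        return name
    if home is None:
        # Fall back to the HOME environment variable (assumed to be set).
        home = os.environ.get("HOME", "")
    return f"{home}/.config/arvados/{name}"


bare = resolve_config_path("src.conf", home="/home/user")
explicit = resolve_config_path("./src.conf", home="/home/user")
```

Here `bare` resolves under the per-user config directory, while `explicit` is left as given because it already contains a slash.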
radhika [Wed, 14 Oct 2015 01:47:06 +0000 (21:47 -0400)]
7167: replace keep_existing with num_keep_servers and use it to create all required keep servers at once.
radhika [Wed, 14 Oct 2015 01:16:35 +0000 (21:16 -0400)]
7167: replace the keep_existing logic: create all 3 keep servers at once, using the first two as src keep servers and the last one as the dst keep server.
radhika [Tue, 13 Oct 2015 21:01:46 +0000 (17:01 -0400)]
7167: Convert blobSigningKey also into local variable and make necessary changes to accommodate this change.
Remove the New method added in arvadosclient.go and revert MakeArvadosClient to what it was before.
radhika [Tue, 13 Oct 2015 19:46:26 +0000 (15:46 -0400)]
7167: update run_test_servers.py to use action="store_true" instead of converting string to boolean.
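As a rough illustration of why the commit above prefers action="store_true" (a sketch, not the actual run_test_servers.py code): argparse then yields a real boolean, whereas converting a string by hand is easy to get wrong because any non-empty string, including "false", is truthy. The flag name `--enable-feature` is hypothetical.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag, for illustration only.
    # With store_true the value is False when absent, True when present.
    parser.add_argument("--enable-feature", action="store_true")
    return parser


flag_absent = build_parser().parse_args([]).enable_feature
flag_present = build_parser().parse_args(["--enable-feature"]).enable_feature

# The pitfall that store_true avoids: bool() of any non-empty string is True.
naive_conversion = bool("false")
```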
Tom Clegg [Tue, 13 Oct 2015 19:33:33 +0000 (15:33 -0400)]
7159: Address golint complaints
Tom Clegg [Tue, 13 Oct 2015 19:33:02 +0000 (15:33 -0400)]
7159: Omit non-Keep blobs from index
radhika [Tue, 13 Oct 2015 17:38:52 +0000 (13:38 -0400)]
Merge branch 'master' into 7167-keep-rsync
radhika [Tue, 13 Oct 2015 17:37:33 +0000 (13:37 -0400)]
7167: Convert most of the globals in keep-sync into locals and update all the code and tests as needed.
Tom Clegg [Tue, 13 Oct 2015 16:11:23 +0000 (12:11 -0400)]
7159: Return benign os.ErrNotExist error from Compare to avoid excessive logs. refs #7159
Tom Clegg [Tue, 13 Oct 2015 15:17:49 +0000 (11:17 -0400)]
7159: Fix error handling when reading full size block. refs #7159
Tom Clegg [Mon, 12 Oct 2015 17:41:04 +0000 (13:41 -0400)]
7167: Propagate read errors to caller. Fixes failing TestTransferShortBuffer.
Tom Clegg [Mon, 12 Oct 2015 16:40:38 +0000 (12:40 -0400)]
Warn about unhandled case if broken node has no ping time. refs #7286
Tom Clegg [Mon, 12 Oct 2015 16:06:47 +0000 (12:06 -0400)]
Merge branch '7159-empty-blob-race' refs #7159
Tom Clegg [Mon, 12 Oct 2015 13:33:24 +0000 (09:33 -0400)]
7159: Shorten race waits during generic tests
Tom Clegg [Fri, 9 Oct 2015 21:09:42 +0000 (17:09 -0400)]
7159: Log when waiting for get/put races
Tom Clegg [Thu, 8 Oct 2015 18:00:00 +0000 (14:00 -0400)]
7159: Exclude new empty blocks from index.
Tom Clegg [Thu, 8 Oct 2015 17:30:18 +0000 (13:30 -0400)]
7159: Test race deadline
Tom Clegg [Thu, 8 Oct 2015 16:52:17 +0000 (12:52 -0400)]
7159: Work around CreateBlob race by polling for updates when a brand new blob is found empty.
radhika [Mon, 12 Oct 2015 14:40:38 +0000 (10:40 -0400)]
7167: break load config logic out of main into loadConfig func and add several tests.
radhika [Mon, 12 Oct 2015 13:13:45 +0000 (09:13 -0400)]
7167: some more error tests such as error getting block from src and error putting block to dst.
radhika [Sat, 10 Oct 2015 00:09:17 +0000 (20:09 -0400)]
7167: stop rsync operation on any errors during Get or Put operations; add additional tests.
radhika [Fri, 9 Oct 2015 22:01:14 +0000 (18:01 -0400)]
Merge branch '7167-keep-rsync-test-setup' into 7167-keep-rsync
radhika [Fri, 9 Oct 2015 22:00:56 +0000 (18:00 -0400)]
Merge branch 'master' into 7167-keep-rsync-test-setup
Conflicts:
sdk/go/keepclient/perms.go
sdk/go/keepclient/perms_test.go
services/keepstore/perms.go
services/keepstore/perms_test.go
Tom Clegg [Fri, 9 Oct 2015 19:48:28 +0000 (15:48 -0400)]
Merge branch '7491-keepclient-bugs' refs #7491
Tom Clegg [Fri, 9 Oct 2015 18:51:13 +0000 (14:51 -0400)]
Merge branch '7167-blob-sign-sdk' refs #7167
Tom Clegg [Fri, 9 Oct 2015 18:28:29 +0000 (14:28 -0400)]
7167: Deobfuscate variable names
Tom Clegg [Fri, 9 Oct 2015 18:20:15 +0000 (14:20 -0400)]
7167: Update tests and comments to new error vars.
Tom Clegg [Thu, 8 Oct 2015 21:50:22 +0000 (17:50 -0400)]
7167: Fix up comments
Tom Clegg [Thu, 8 Oct 2015 21:33:55 +0000 (17:33 -0400)]
7167: Replace duplicate tests with PermissionSecret tests
Brett Smith [Fri, 9 Oct 2015 15:24:50 +0000 (11:24 -0400)]
Version the CWL runner's dependency on cwltool.
cwltool development is continuing on with API-incompatible changes.
No issue #.
radhika [Fri, 9 Oct 2015 12:44:46 +0000 (08:44 -0400)]
Merge branch '7167-keep-rsync-test-setup' into 7167-keep-rsync
radhika [Fri, 9 Oct 2015 12:43:37 +0000 (08:43 -0400)]
7167: default replications count from discovery doc test updates.
radhika [Thu, 8 Oct 2015 21:39:20 +0000 (17:39 -0400)]
7167: add tests with prefix during rsync
Tom Clegg [Thu, 8 Oct 2015 21:17:46 +0000 (17:17 -0400)]
7167: Tidy up errors. Remove extra comment copy.
radhika [Wed, 7 Oct 2015 20:47:56 +0000 (16:47 -0400)]
7167: move perms code from keepstore into keepclient go SDK.
radhika [Thu, 8 Oct 2015 20:22:45 +0000 (16:22 -0400)]
Merge branch '7167-keep-rsync-test-setup' into 7167-keep-rsync
Conflicts:
tools/keep-rsync/keep-rsync_test.go
Tom Clegg [Thu, 8 Oct 2015 20:20:46 +0000 (16:20 -0400)]
6967: Update test to match improved code.
refs #6967
radhika [Thu, 8 Oct 2015 20:19:37 +0000 (16:19 -0400)]
7167: add tests to replications count
radhika [Thu, 8 Oct 2015 20:00:14 +0000 (16:00 -0400)]
7167: get replications count from destination api discovery doc and use it as default.
Tom Clegg [Tue, 6 Oct 2015 17:48:05 +0000 (13:48 -0400)]
7491: Ensure status channel stays open until all upload workers finish.
Tom Clegg [Tue, 6 Oct 2015 17:39:12 +0000 (13:39 -0400)]
7491: Fix error handling/reporting in keepclient/GET
Tom Clegg [Thu, 8 Oct 2015 19:24:19 +0000 (15:24 -0400)]
Merge branch '6967-yaml-format' closes #6967
Tom Clegg [Thu, 8 Oct 2015 18:46:22 +0000 (14:46 -0400)]
6967: More helpful comment & assertion failure message
Tom Clegg [Thu, 8 Oct 2015 18:45:33 +0000 (14:45 -0400)]
6967: Use git status --porcelain to isolate from user config
Tom Clegg [Wed, 7 Oct 2015 15:11:14 +0000 (11:11 -0400)]
6967: Move source_version detection code from config yaml to lib/app_version.rb.
Tom Clegg [Wed, 7 Oct 2015 14:00:16 +0000 (10:00 -0400)]
6967: Move source_version detection code from config yaml to lib/app_version.rb.
Tom Clegg [Wed, 7 Oct 2015 14:01:22 +0000 (10:01 -0400)]
6967: Treat blob_signing_key like a secret in `rake config:check`.
radhika [Thu, 8 Oct 2015 17:12:37 +0000 (13:12 -0400)]
7167: honor blob signing key while getting blocks.
radhika [Thu, 8 Oct 2015 15:13:41 +0000 (11:13 -0400)]
Merge branch '7167-keep-rsync-test-setup' into 7167-keep-rsync
Conflicts:
tools/keep-rsync/keep-rsync_test.go
radhika [Thu, 8 Oct 2015 15:06:57 +0000 (11:06 -0400)]
7167: set enforce_permissions to true if blob signing key argument is provided.
radhika [Wed, 7 Oct 2015 22:24:51 +0000 (18:24 -0400)]
7167: add --keep-enforce-permissions to run_test_servers.py
radhika [Wed, 7 Oct 2015 20:50:04 +0000 (16:50 -0400)]
Merge branch 'master' into 7167-keep-rsync-test-setup
radhika [Wed, 7 Oct 2015 20:49:15 +0000 (16:49 -0400)]
Merge branch '7167-keep-rsync-test-setup' into 7167-keep-rsync
radhika [Wed, 7 Oct 2015 20:47:56 +0000 (16:47 -0400)]
7167: move perms code from keepstore into keepclient go SDK.
Peter Amstutz [Wed, 7 Oct 2015 18:09:40 +0000 (14:09 -0400)]
Merge branch '6142-cancel-slurm' closes #6142
radhika [Wed, 7 Oct 2015 18:03:12 +0000 (14:03 -0400)]
Merge branch '7167-keep-rsync-test-setup' into 7167-keep-rsync
radhika [Wed, 7 Oct 2015 18:01:18 +0000 (14:01 -0400)]
7167: merge test setup branch
radhika [Wed, 7 Oct 2015 17:52:43 +0000 (13:52 -0400)]
Merge branch 'master' into 7167-keep-rsync
radhika [Wed, 7 Oct 2015 17:52:33 +0000 (13:52 -0400)]
Merge branch 'master' into 7167-keep-rsync-test-setup
radhika [Wed, 7 Oct 2015 17:51:16 +0000 (13:51 -0400)]
7167: rename MakeArvadosClientWithConfig as New
Tom Clegg [Wed, 7 Oct 2015 15:44:25 +0000 (11:44 -0400)]
Merge branch '7254-dont-lose-replication-arg' closes #7254
Brett Smith [Wed, 7 Oct 2015 14:54:08 +0000 (10:54 -0400)]
Merge branch '7435-node-manager-shutdown-cleanup-wip'
Closes #7435, #7445.
Brett Smith [Wed, 7 Oct 2015 14:47:23 +0000 (10:47 -0400)]
7254: Test arv-put preserves replication when cache load fails.
Peter Amstutz [Wed, 7 Oct 2015 14:38:46 +0000 (10:38 -0400)]
6142: Only resume from 'drng' or 'drain'. Add/fix tests.
Brett Smith [Fri, 2 Oct 2015 15:07:27 +0000 (11:07 -0400)]
7435: Node Manager stops trying to shut down delisted cloud nodes.
If the underlying node is gone, trying to destroy it in the cloud will
almost certainly fail. It's hard to predict what will happen to
related actions like draining the node in SLURM. Just cancel the
attempt, and trust other systems like SLURM and Crunch to deal with
the disappearance on their own.
Tom Clegg [Tue, 6 Oct 2015 21:10:28 +0000 (17:10 -0400)]
7254: Test that replication arg is passed through to KeepClient.put()
radhika [Tue, 6 Oct 2015 20:54:01 +0000 (16:54 -0400)]
Merge branch '7167-keep-rsync-test-setup' into 7167-keep-rsync
radhika [Tue, 6 Oct 2015 20:51:36 +0000 (16:51 -0400)]
7167: Use struct instead of map for APIConfig
Peter Amstutz [Tue, 6 Oct 2015 20:42:51 +0000 (16:42 -0400)]
6142: If self._set_node_state('RESUME') in cancel_shutdown() returns non-zero,
check the node state and only retry if the node is in 'drain' or 'draining'.
Tom Clegg [Tue, 6 Oct 2015 19:04:13 +0000 (15:04 -0400)]
7254: Do not forget -replication arg when failing to load resume state.
radhika [Tue, 6 Oct 2015 17:56:12 +0000 (13:56 -0400)]
Merge branch 'master' into 7167-keep-rsync-test-setup
radhika [Tue, 6 Oct 2015 17:54:41 +0000 (13:54 -0400)]
7167: get index from src and dst and copy any missing blocks from src to dst.
Peter Amstutz [Tue, 6 Oct 2015 13:17:30 +0000 (09:17 -0400)]
Merge branch '7286-nodeman-destroy-broken-nodes' closes #7286
Peter Amstutz [Tue, 6 Oct 2015 01:33:00 +0000 (21:33 -0400)]
7286: Add comments clarifying arvados_node_missing() and broken(). Also bump
up version dependency to dev4.
radhika [Mon, 5 Oct 2015 21:22:37 +0000 (17:22 -0400)]
7167: Update test to also put a block in dst and attempt get from src.
radhika [Mon, 5 Oct 2015 15:41:39 +0000 (11:41 -0400)]
7167: Refactor MakeKeepClient and DiscoverKeepServers to allow making KeepClient from input JSON as well.
radhika [Mon, 5 Oct 2015 13:25:34 +0000 (09:25 -0400)]
7167: args not available in all tests; hence store keep_existing argument in a variable rather than accessing it directly from args.
radhika [Mon, 5 Oct 2015 11:48:47 +0000 (07:48 -0400)]
Merge branch 'master' into 7167-keep-rsync-test-setup
radhika [Mon, 5 Oct 2015 11:46:56 +0000 (07:46 -0400)]
7167: keep-rsync parameter loading and initialization. Update test framework to allow creating two sets of keep servers, source and destination.
Tom Clegg [Fri, 2 Oct 2015 22:09:39 +0000 (18:09 -0400)]
7214: Fix "X-Keep-Replicas-Stored: 0" header when block is already present. refs #7214
Tom Clegg [Fri, 2 Oct 2015 20:06:12 +0000 (16:06 -0400)]
Merge branch '7241-azure-blob-volume' closes #7241
Peter Amstutz [Thu, 1 Oct 2015 17:00:00 +0000 (13:00 -0400)]
7286: Add BaseHTTPError to list of "cloud errors"
Peter Amstutz [Thu, 1 Oct 2015 13:32:33 +0000 (09:32 -0400)]
7286: Add drain* and fail* to SLURM_END_STATES, because the '*' means the node
is out of contact with slurm.
Peter Amstutz [Wed, 30 Sep 2015 21:16:09 +0000 (17:16 -0400)]
7286: Fix double count of missing nodes in shutdown
Peter Amstutz [Wed, 30 Sep 2015 20:26:46 +0000 (16:26 -0400)]
7286: Missing nodes are considered in "excess" count (reverts previous change). Added test. Also remove debug log statement.
Peter Amstutz [Wed, 30 Sep 2015 18:23:25 +0000 (14:23 -0400)]
7286: Compute "missing" based on "last_ping_at" instead of using API server's
buggy "status" field.
radhika [Wed, 30 Sep 2015 15:53:20 +0000 (11:53 -0400)]
closes #7200
Merge branch '7200-keepproxy-index-api'
radhika [Wed, 30 Sep 2015 15:46:41 +0000 (11:46 -0400)]
7200: more compact CheckAuthorizationHeader block
radhika [Wed, 30 Sep 2015 15:28:36 +0000 (11:28 -0400)]
7200: Use io.Copy instead of reading all bytes and writing to response. Much improved keep proxy test with code reuse.
Peter Amstutz [Wed, 30 Sep 2015 14:35:04 +0000 (10:35 -0400)]
7286: Add test that "missing" nodes are not counted towards "busy" (but are
counted towards node max).
Brett Smith [Wed, 30 Sep 2015 14:18:18 +0000 (10:18 -0400)]
7207: Remove `arv keep check` and `arv keep less`.
The implementations are no longer maintained and these are stale
references. Closes #7207.
Brett Smith [Wed, 30 Sep 2015 13:10:27 +0000 (09:10 -0400)]
7263: crunch-job checks for refreshes every two seconds.
This avoids the possibility that a constant stream of data from tasks
can prevent the job from being canceled. Refs #7263.