radhika [Fri, 6 Nov 2015 03:17:59 +0000 (22:17 -0500)]
5538: Merge FailHandler and FailThenSucceedHandler into one APIStub to facilitate testing many more error states; also add update and delete retry tests.
radhika [Fri, 6 Nov 2015 01:13:32 +0000 (20:13 -0500)]
5538: code improvements; use switch statement instead of if statement with several status code checks, sleep between retries.
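The retry behavior this commit describes (branch on the response status code, sleep between attempts) can be sketched roughly as follows. This is an illustration in Python, not the Go code in sdk/go/arvadosclient. The commit message does not list the status codes, so the set of retryable codes, the function name, and the parameters below are all assumptions.

```python
import time

# Assumed set of retryable status codes; the commit message does not list them.
RETRYABLE_STATUSES = {408, 409, 422, 423, 500, 502, 503, 504}


def call_with_retries(do_request, retries=2, delay=0.0):
    """Call do_request() until it returns a non-retryable status.

    do_request is a hypothetical callable returning an HTTP status code.
    At most retries + 1 attempts are made; the last status is returned.
    """
    status = None
    for attempt in range(retries + 1):
        status = do_request()
        if status not in RETRYABLE_STATUSES:
            break  # success or a permanent error: do not retry
        if attempt < retries:
            time.sleep(delay)  # pause before the next attempt
    return status
```

For example, a request that fails once with 503 and then succeeds returns 200 after two attempts, while a 404 is returned immediately without any retry.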
radhika [Wed, 4 Nov 2015 22:18:25 +0000 (17:18 -0500)]
5538: update the newly added TestFail* to use proper client with http.Transport
radhika [Wed, 4 Nov 2015 22:11:10 +0000 (17:11 -0500)]
Merge branch 'master' into 5538-arvadosclient-retry
Conflicts:
sdk/go/arvadosclient/arvadosclient.go
radhika [Wed, 4 Nov 2015 21:39:59 +0000 (16:39 -0500)]
refs #5538
Merge branch '5538-close-idle-connections'
radhika [Wed, 4 Nov 2015 21:38:28 +0000 (16:38 -0500)]
5538: update test to reuse arvados client in TestCreatePipelineTemplate between idle and current connections.
radhika [Wed, 4 Nov 2015 21:25:32 +0000 (16:25 -0500)]
Merge branch 'master' into 5538-close-idle-connections
radhika [Wed, 4 Nov 2015 21:19:51 +0000 (16:19 -0500)]
closes #7719
Merge branch '7719-permit-net-delete'
radhika [Wed, 4 Nov 2015 21:13:29 +0000 (16:13 -0500)]
7719: permit never-delete to be set to false; add warning that datamanager is not yet fully tested.
radhika [Wed, 4 Nov 2015 19:58:46 +0000 (14:58 -0500)]
5538: add test with a connection idle for longer than MaxIdleConnectionDuration
radhika [Wed, 4 Nov 2015 19:36:42 +0000 (14:36 -0500)]
Merge branch 'master' into 5538-close-idle-connections
Brett Smith [Wed, 4 Nov 2015 19:32:01 +0000 (14:32 -0500)]
Merge branch '7713-node-manager-blacklist-broken-nodes-wip'
Closes #7713, #7718.
radhika [Wed, 4 Nov 2015 19:08:24 +0000 (14:08 -0500)]
5538: using fake arvados server to generate errors, added tests with retries.
Brett Smith [Wed, 4 Nov 2015 17:20:36 +0000 (12:20 -0500)]
7713: Node Manager blackholes broken nodes that can't shut down.
We are seeing situations on Azure where some nodes in an UNKNOWN state
cannot be shut down. The API call to destroy them always fails.
There are two related halves to this commit. In the first half,
after a cloud shutdown request fails, ComputeNodeShutdownActor checks
whether the node is broken. If it is, it cancels shutdown retries.
In the second half, the daemon checks for this shutdown outcome. When
it happens, it blacklists the broken node: it will immediately filter
it out of node lists from the cloud. It is no longer monitored in any
way or counted as a live node, so Node Manager will boot a replacement
for it.
This lets Node Manager create cloud nodes above max_nodes, up to the
number of broken nodes. We're reasonably bounded for now because
only the Azure driver will ever declare a node broken. Other clouds
will never blacklist nodes this way.
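The blacklisting described above can be sketched roughly as follows. This is a simplified illustration, not Node Manager's actual code: the class, method names, and node IDs are hypothetical, and the real daemon's bookkeeping is more involved.

```python
class NodeTracker:
    """Hypothetical stand-in for the daemon's node bookkeeping."""

    def __init__(self, max_nodes):
        self.max_nodes = max_nodes
        self.blacklist = set()

    def shutdown_failed(self, node_id, is_broken):
        # Only a node the driver reports as broken is blacklisted.
        if is_broken:
            self.blacklist.add(node_id)

    def live_nodes(self, cloud_node_ids):
        # Blacklisted nodes are filtered out of every cloud node listing.
        return [n for n in cloud_node_ids if n not in self.blacklist]

    def room_for_new_nodes(self, cloud_node_ids):
        # Broken nodes no longer count against max_nodes.
        return max(0, self.max_nodes - len(self.live_nodes(cloud_node_ids)))


tracker = NodeTracker(max_nodes=2)
cloud = ["node-a", "node-b"]
tracker.shutdown_failed("node-b", is_broken=True)
```

Here the cloud still lists two nodes, but only node-a counts as live, so there is room to boot one replacement. The cloud can therefore hold max_nodes plus the number of broken nodes.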
radhika [Wed, 4 Nov 2015 16:36:24 +0000 (11:36 -0500)]
Merge branch 'master' into 5538-arvadosclient-retry
radhika [Wed, 4 Nov 2015 16:34:35 +0000 (11:34 -0500)]
5538: close any idle connections before a POST or DELETE request.
radhika [Wed, 4 Nov 2015 15:13:53 +0000 (10:13 -0500)]
5538: retry failed arvados api requests when appropriate.
Tom Clegg [Wed, 4 Nov 2015 05:19:40 +0000 (00:19 -0500)]
Merge branch '7444-dockercleaner-containers' closes #7444
Tom Clegg [Wed, 4 Nov 2015 04:55:11 +0000 (23:55 -0500)]
Merge branch '5824-keep-web'
refs #5824
Tom Clegg [Wed, 4 Nov 2015 04:06:45 +0000 (23:06 -0500)]
5824: Merge branch 'master' into 5824-keep-web
Tom Clegg [Wed, 4 Nov 2015 03:50:50 +0000 (22:50 -0500)]
5824: Avoid sending empty slices through toRead chan. Fixes race in test case.
Tom Clegg [Tue, 3 Nov 2015 19:06:52 +0000 (14:06 -0500)]
5824: Turn off debug printfs unless enabled by calling program.
radhika [Tue, 3 Nov 2015 15:43:33 +0000 (10:43 -0500)]
closes #7534
Merge branch '7534-superuser-token'
radhika [Tue, 3 Nov 2015 15:43:07 +0000 (10:43 -0500)]
Merge branch 'master' into 7534-superuser-token
Tom Clegg [Tue, 3 Nov 2015 14:55:50 +0000 (09:55 -0500)]
5824: Use session cookie instead of persistent.
Tom Clegg [Tue, 3 Nov 2015 14:52:33 +0000 (09:52 -0500)]
5824: Clarity edits in usage docs.
Peter Amstutz [Mon, 2 Nov 2015 22:42:54 +0000 (17:42 -0500)]
Merge branch '7593-cwl-crunchrunner' closes #7593
Tom Clegg [Mon, 2 Nov 2015 20:53:05 +0000 (15:53 -0500)]
7444: Rename kwarg remove_stopped_containers -> remove_containers_onexit
Tom Clegg [Mon, 2 Nov 2015 15:26:39 +0000 (10:26 -0500)]
7444: Set docker container name to {taskUUID}-{attemptNum}.
Tom Clegg [Fri, 30 Oct 2015 22:28:46 +0000 (18:28 -0400)]
7444: Do not remove docker containers with docker --rm; let dockercleaner do it.
Tom Clegg [Mon, 2 Nov 2015 20:46:22 +0000 (15:46 -0500)]
7444: Clean stopped containers at startup.
Peter Amstutz [Mon, 2 Nov 2015 20:42:48 +0000 (15:42 -0500)]
7593: Add version hint to arvados-python-client. Add get_uploaded() and
add_uploaded(). Add comments to clarify CollectionFsAccess._match
Tom Clegg [Mon, 2 Nov 2015 19:03:52 +0000 (14:03 -0500)]
7444: Test deletion error handling.
Tom Clegg [Fri, 30 Oct 2015 22:44:20 +0000 (18:44 -0400)]
7444: Note automatic removal of stopped containers, and how to disable.
Tom Clegg [Fri, 30 Oct 2015 22:18:09 +0000 (18:18 -0400)]
7444: Delete containers as soon as they stop.
Tom Clegg [Fri, 30 Oct 2015 19:57:44 +0000 (15:57 -0400)]
5824: Mention -anonymous-token in godoc. Sync usage messages.
Brett Smith [Fri, 30 Oct 2015 19:12:04 +0000 (15:12 -0400)]
6638/7370: Force new builds of Python backports with dependencies.
Even though we've declared these dependencies for a while now, Jenkins
has not published packages with them, because without a new upstream
version, fpm believes that there's no new package to build. This
resolves that by building a new iteration of the affected packages.
This is less than ideal, because if a new version is released, we'll
automatically package it with iteration 2. That is not correct, but
it doesn't affect any functionality, and we already have a plan to do
things properly in #6885. So we'll live with "correct functionality,
gross aesthetics" until then.
Ward approved in conversation. Refs #6638, #7370.
Brett Smith [Fri, 30 Oct 2015 18:47:31 +0000 (14:47 -0400)]
Merge branch '7668-crunch-node-properties-wip'
Closes #7668, #7672.
Tom Clegg [Thu, 29 Oct 2015 16:06:55 +0000 (12:06 -0400)]
7668: Move node stats from info to properties in fixtures.
Brett Smith [Wed, 28 Oct 2015 15:37:58 +0000 (11:37 -0400)]
7668: crunch-dispatch gets node stats from properties field.
This information moved from the info field to the properties field as
part of #3605. This simply updates crunch-dispatch to catch up with
the change.
Tom Clegg [Fri, 30 Oct 2015 18:22:21 +0000 (14:22 -0400)]
5824: Merge branch 'master' into 5824-keep-web
Tom Clegg [Fri, 30 Oct 2015 18:19:35 +0000 (14:19 -0400)]
Sync Gemfile.lock to current Gemfile.
amends
77460b2190e84df4178c25f014bbf136d559922e
refs #7582
Tom Clegg [Fri, 30 Oct 2015 18:02:10 +0000 (14:02 -0400)]
5824: Add -anonymous-token flag.
Tom Clegg [Fri, 30 Oct 2015 18:01:20 +0000 (14:01 -0400)]
5824: Update arvadostest usage.
radhika [Fri, 30 Oct 2015 00:50:36 +0000 (20:50 -0400)]
7534: move the print statement of the token obtained out of the library into the script.
Tom Clegg [Thu, 29 Oct 2015 20:31:37 +0000 (16:31 -0400)]
5824: Merge branch 'master' into 5824-keep-web
Tom Clegg [Thu, 29 Oct 2015 18:47:16 +0000 (14:47 -0400)]
5824: Add some clarifying comments and golint/vet/fmt fixes.
Tom Clegg [Thu, 29 Oct 2015 17:40:18 +0000 (13:40 -0400)]
5824: Add comments and fix variable names, cf. golint.
Tom Clegg [Thu, 29 Oct 2015 16:06:09 +0000 (12:06 -0400)]
5824: Add tests.
radhika [Thu, 29 Oct 2015 19:55:58 +0000 (15:55 -0400)]
7534: return an existing token instead of creating a new one each time; add tests.
Peter Amstutz [Thu, 29 Oct 2015 15:32:24 +0000 (11:32 -0400)]
7593: Add arvados-cwl-runner to disambiguate if there is a conflict over what
should be default cwl-runner.
Brett Smith [Thu, 29 Oct 2015 15:14:48 +0000 (11:14 -0400)]
7695: Docs reflect that docker_image can't be a collection UUID.
We intended to allow this, but it's not actually implemented. Update
the docs for now. We'll add the functionality in refs #7695.
Tom Clegg [Thu, 29 Oct 2015 15:09:11 +0000 (11:09 -0400)]
5824: Add test for file in subdir.
Tom Clegg [Thu, 29 Oct 2015 15:08:59 +0000 (11:08 -0400)]
5824: Clarify docs.
radhika [Thu, 29 Oct 2015 14:30:14 +0000 (10:30 -0400)]
7534: refactor the code from create_superuser_token.rb script into lib and verify manually that existing behavior is preserved.
Peter Amstutz [Wed, 28 Oct 2015 16:24:36 +0000 (12:24 -0400)]
Fix typo in arvados-cli version number refs #7582
radhika [Wed, 28 Oct 2015 16:22:51 +0000 (12:22 -0400)]
closes #7492
Merge branch '7492-keepproxy-upstream-errors'
Peter Amstutz [Wed, 28 Oct 2015 16:21:46 +0000 (12:21 -0400)]
Update Gemfile pin on arvados-cli to ensure latest crunch-job refs #7582
radhika [Wed, 28 Oct 2015 16:21:21 +0000 (12:21 -0400)]
7167: update keep-rsync tests to use "Contains" instead of "HasSuffix" to make sure the error message checks pass even when retries happen.
radhika [Wed, 28 Oct 2015 16:18:33 +0000 (12:18 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Tom Clegg [Wed, 28 Oct 2015 16:17:54 +0000 (12:17 -0400)]
5824: Rename conventional dl.* to collections.*
Tom Clegg [Wed, 28 Oct 2015 15:37:47 +0000 (11:37 -0400)]
5824: Rename cookie to arvados_api_token.
radhika [Wed, 28 Oct 2015 15:33:46 +0000 (11:33 -0400)]
refs #7167
Add a log statement to see why the test failed intermittently.
Peter Amstutz [Wed, 28 Oct 2015 15:16:32 +0000 (11:16 -0400)]
7593: Make peter/crunchrunner the default repository (so that it works on
cloud.curoverse.com) until the deployment plan is sorted out.
Peter Amstutz [Wed, 28 Oct 2015 15:08:53 +0000 (11:08 -0400)]
7593: Generate files replaces $(task.keep)/ with keep: notation to reference
keep files.
Tom Clegg [Wed, 28 Oct 2015 14:52:34 +0000 (10:52 -0400)]
5824: Rename keepdl to keep-web.
Tom Clegg [Tue, 27 Oct 2015 21:15:16 +0000 (17:15 -0400)]
5824: Clarify docs.
Tom Clegg [Tue, 27 Oct 2015 14:33:08 +0000 (10:33 -0400)]
5824: Fix wrong title.
Ward Vandewege [Wed, 28 Oct 2015 14:22:58 +0000 (10:22 -0400)]
Update redmine links in README.
No issue #
radhika [Wed, 28 Oct 2015 13:51:26 +0000 (09:51 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Tom Clegg [Wed, 28 Oct 2015 13:15:58 +0000 (09:15 -0400)]
Fix typo in example config file
No issue #
radhika [Tue, 27 Oct 2015 23:48:47 +0000 (19:48 -0400)]
7492: add a test that simulates keep server unavailable error.
radhika [Tue, 27 Oct 2015 22:39:12 +0000 (18:39 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
radhika [Tue, 27 Oct 2015 22:38:33 +0000 (18:38 -0400)]
closes #7453
Merge branch '7453-create-new-group-with-role'
Peter Amstutz [Tue, 27 Oct 2015 20:54:30 +0000 (16:54 -0400)]
7593: Fixup to use keep: URI scheme globbing for outputs.
Peter Amstutz [Tue, 27 Oct 2015 20:26:10 +0000 (16:26 -0400)]
7593: References to files in keep must have keep: URI scheme. Improve error
handling. Support configuring which git repo has crunchrunner.
radhika [Tue, 27 Oct 2015 20:09:20 +0000 (16:09 -0400)]
7453: revert back to no generic "add new" button in all those pages.
radhika [Tue, 27 Oct 2015 19:46:33 +0000 (15:46 -0400)]
7453: Upon Nico's request, put back the "Add a new" button in keep disks, keep services, and virtual machines pages.
radhika [Tue, 27 Oct 2015 18:11:13 +0000 (14:11 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Peter Amstutz [Tue, 27 Oct 2015 17:06:19 +0000 (13:06 -0400)]
Merge branch '7582-crunch-runner' refs #7582
radhika [Tue, 27 Oct 2015 15:35:28 +0000 (11:35 -0400)]
7453: Disable submit button in add group dialog until a name is entered.
Observed that the "Add a new user" button is offered in /users page even to non-admin users. Corrected this bug.
Deleted one test in errors_test that was trying to access the "Add new group" button in /groups page.
Also, deleted the one test in virtual_machines_test that was trying to click the "Add new" button.
Added a new test in users_test that clicks the "Add new group" button and verifies the group is added.
Tom Clegg [Tue, 27 Oct 2015 14:39:11 +0000 (10:39 -0400)]
Merge branch '7160-azure-blob-doc' closes #7160
Tom Clegg [Tue, 27 Oct 2015 14:37:33 +0000 (10:37 -0400)]
7160: Clarify exampleAccountName -> exampleStorageAccountName
Peter Amstutz [Tue, 27 Oct 2015 14:22:10 +0000 (10:22 -0400)]
7582: Fixup concurrency around signal catching and forwarding.
radhika [Tue, 27 Oct 2015 13:23:24 +0000 (09:23 -0400)]
7453: do not display generic "add new" button in the groups, keep_disks, keep_services, links, nodes, and virtual_machines listing pages.
Tom Clegg [Mon, 26 Oct 2015 22:14:41 +0000 (18:14 -0400)]
7160: Add Azure Storage config page, update keepstore help text, add run script.
Peter Amstutz [Mon, 26 Oct 2015 19:53:07 +0000 (15:53 -0400)]
7593: Don't upload the same files more than once. Fix handling "./" in glob paths.
radhika [Mon, 26 Oct 2015 19:39:27 +0000 (15:39 -0400)]
7453: Add "Add new group" button to user admin page.
Brett Smith [Mon, 26 Oct 2015 18:46:57 +0000 (14:46 -0400)]
Merge branch 'pr/25'
Closes #7307.
Brett Smith [Mon, 26 Oct 2015 18:24:51 +0000 (14:24 -0400)]
7307: Clarify intended failure in arv-git-httpd SplitHostPort test.
Peter Amstutz [Mon, 26 Oct 2015 18:13:45 +0000 (14:13 -0400)]
7582: Passes draft-2 conformance tests.
radhika [Sat, 24 Oct 2015 20:00:38 +0000 (16:00 -0400)]
7492: add a keepproxy test with temporary connection refused error.
Brett Smith [Sat, 24 Oct 2015 19:21:27 +0000 (15:21 -0400)]
Merge branch '7587-pysdk-retry-test-wip'
Closes #7587, #7647.
Brett Smith [Fri, 23 Oct 2015 20:34:38 +0000 (16:34 -0400)]
7587: Add test for PySDK API client socket.error retries.
Brett Smith [Fri, 23 Oct 2015 19:42:59 +0000 (15:42 -0400)]
7587: Refactor PySDK API tests to use TestCaseWithServers.
History: first this test case used entirely mock responses. Then we
started running the API server to provide a discovery document. Then
people added tests that expected to talk to a real test server,
particularly test_empty_list and test_nonempty_list. These tests
would talk to the API server configured in the user's environment, and
fail if that's not a test API server.
Using TestCaseWithServers fixes the immediate bug in the tests, and
better reflects the current real state of the test case.
Peter Amstutz [Fri, 23 Oct 2015 21:43:38 +0000 (17:43 -0400)]
7582: Fixup to work with latest cwltool. Runs jobs with Go crunchrunner.
Bryan Cosca [Fri, 23 Oct 2015 21:30:43 +0000 (17:30 -0400)]
Merge branch '6600-retry-job-helpers'
refs #6600
radhika [Fri, 23 Oct 2015 21:28:07 +0000 (17:28 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Bryan Cosca [Fri, 23 Oct 2015 21:19:57 +0000 (17:19 -0400)]
6600: Added Retryloop to task_set_output(), current_task(), and current_job() to python SDK
Peter Amstutz [Fri, 23 Oct 2015 19:17:50 +0000 (15:17 -0400)]
Merge branch 'master' into 7582-crunch-runner
Conflicts:
sdk/cli/bin/crunch-job