radhika [Tue, 3 Nov 2015 15:04:02 +0000 (10:04 -0500)]
7661: add --by-pdh option to FUSE and use this option in crunch-job. Do not start web socket client when --by-pdh is used.
Brett Smith [Fri, 30 Oct 2015 19:12:04 +0000 (15:12 -0400)]
6638/7370: Force new builds of Python backports with dependencies.
Even though we've declared these dependencies for a while now, Jenkins
has not published packages with them, because without a new upstream
version, fpm believes that there's no new package to build. This
resolves that by building a new iteration of the affected packages.
This is less than ideal, because if a new version is released, we'll
automatically package it with iteration 2. That is not correct, but
it doesn't affect any functionality, and we already have a plan to do
things properly in #6885. So we'll live with "correct functionality,
gross aesthetics" until then.
Ward approved in conversation. Refs #6638, #7370.
Brett Smith [Fri, 30 Oct 2015 18:47:31 +0000 (14:47 -0400)]
Merge branch '7668-crunch-node-properties-wip'
Closes #7668, #7672.
Tom Clegg [Thu, 29 Oct 2015 16:06:55 +0000 (12:06 -0400)]
7668: Move node stats from info to properties in fixtures.
Brett Smith [Wed, 28 Oct 2015 15:37:58 +0000 (11:37 -0400)]
7668: crunch-dispatch gets node stats from properties field.
This information moved from the info field to the properties field as
part of #3605. This simply updates crunch-dispatch to catch up with
the change.
Tom Clegg [Fri, 30 Oct 2015 18:19:35 +0000 (14:19 -0400)]
Sync Gemfile.lock to current Gemfile.
amends
77460b2190e84df4178c25f014bbf136d559922e
refs #7582
Brett Smith [Thu, 29 Oct 2015 15:14:48 +0000 (11:14 -0400)]
7695: Docs reflect that docker_image can't be a collection UUID.
We intended to allow this, but it's not actually implemented. Update
the docs for now. We'll add the functionality in refs #7695.
Peter Amstutz [Wed, 28 Oct 2015 16:24:36 +0000 (12:24 -0400)]
Fix typo in arvados-cli version number refs #7582
radhika [Wed, 28 Oct 2015 16:22:51 +0000 (12:22 -0400)]
closes #7492
Merge branch '7492-keepproxy-upstream-errors'
Peter Amstutz [Wed, 28 Oct 2015 16:21:46 +0000 (12:21 -0400)]
Update Gemfile pin on arvados-cli to ensure latest crunch-job refs #7582
radhika [Wed, 28 Oct 2015 16:21:21 +0000 (12:21 -0400)]
7167: update keep-rsync tests to use "Contains" instead of "HasSuffix" to make sure the error message checks pass even when retries happen.
radhika [Wed, 28 Oct 2015 16:18:33 +0000 (12:18 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
radhika [Wed, 28 Oct 2015 15:33:46 +0000 (11:33 -0400)]
refs #7167
Add a log statement to see why the test failed intermittently.
Ward Vandewege [Wed, 28 Oct 2015 14:22:58 +0000 (10:22 -0400)]
Update redmine links in README.
No issue #
radhika [Wed, 28 Oct 2015 13:51:26 +0000 (09:51 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Tom Clegg [Wed, 28 Oct 2015 13:15:58 +0000 (09:15 -0400)]
Fix typo in example config file
No issue #
radhika [Tue, 27 Oct 2015 23:48:47 +0000 (19:48 -0400)]
7492: add a test that simulates keep server unavailable error.
radhika [Tue, 27 Oct 2015 22:39:12 +0000 (18:39 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
radhika [Tue, 27 Oct 2015 22:38:33 +0000 (18:38 -0400)]
closes #7453
Merge branch '7453-create-new-group-with-role'
radhika [Tue, 27 Oct 2015 20:09:20 +0000 (16:09 -0400)]
7453: revert back to no generic "add new" button in all those pages.
radhika [Tue, 27 Oct 2015 19:46:33 +0000 (15:46 -0400)]
7453: Upon Nico's request, put back the "Add a new" button in keep disks, keep services, and virtual machines pages.
radhika [Tue, 27 Oct 2015 18:11:13 +0000 (14:11 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Peter Amstutz [Tue, 27 Oct 2015 17:06:19 +0000 (13:06 -0400)]
Merge branch '7582-crunch-runner' refs #7582
radhika [Tue, 27 Oct 2015 15:35:28 +0000 (11:35 -0400)]
7453: Disable submit button in add group dialog until a name is entered.
Observed that the "Add a new user" button was offered on the /users page even to non-admin users. Corrected this bug.
Deleted one test in errors_test that was trying to access the "Add new group" button on the /groups page.
Also deleted the one test in virtual_machines_test that was trying to click the "Add new" button.
Added a new test in users_test that clicks the "Add new group" button and verifies the group is added.
Tom Clegg [Tue, 27 Oct 2015 14:39:11 +0000 (10:39 -0400)]
Merge branch '7160-azure-blob-doc' closes #7160
Tom Clegg [Tue, 27 Oct 2015 14:37:33 +0000 (10:37 -0400)]
7160: Clarify exampleAccountName -> exampleStorageAccountName
Peter Amstutz [Tue, 27 Oct 2015 14:22:10 +0000 (10:22 -0400)]
7582: Fixup concurrency around signal catching and forwarding.
radhika [Tue, 27 Oct 2015 13:23:24 +0000 (09:23 -0400)]
7453: do not display generic "add new" button in the groups, keep_disks, keep_services, links, nodes, and virtual_machines listing pages.
Tom Clegg [Mon, 26 Oct 2015 22:14:41 +0000 (18:14 -0400)]
7160: Add Azure Storage config page, update keepstore help text, add run script.
radhika [Mon, 26 Oct 2015 19:39:27 +0000 (15:39 -0400)]
7453: Add "Add new group" button to user admin page.
Brett Smith [Mon, 26 Oct 2015 18:46:57 +0000 (14:46 -0400)]
Merge branch 'pr/25'
Closes #7307.
Brett Smith [Mon, 26 Oct 2015 18:24:51 +0000 (14:24 -0400)]
7307: Clarify intended failure in arv-git-httpd SplitHostPort test.
radhika [Sat, 24 Oct 2015 20:00:38 +0000 (16:00 -0400)]
7492: add a keepproxy test with temporary connection refused error.
Brett Smith [Sat, 24 Oct 2015 19:21:27 +0000 (15:21 -0400)]
Merge branch '7587-pysdk-retry-test-wip'
Closes #7587, #7647.
Brett Smith [Fri, 23 Oct 2015 20:34:38 +0000 (16:34 -0400)]
7587: Add test for PySDK API client socket.error retries.
Brett Smith [Fri, 23 Oct 2015 19:42:59 +0000 (15:42 -0400)]
7587: Refactor PySDK API tests to use TestCaseWithServers.
History: first this test case used entirely mock responses. Then we
started running the API server to provide a discovery document. Then
people added tests that expected to talk to a real test server,
particularly test_empty_list and test_nonempty_list. These tests
would talk to the API server configured in the user's environment, and
fail if that's not a test API server.
Using TestCaseWithServers fixes the immediate bug in the tests, and
better reflects the current real state of the test case.
Bryan Cosca [Fri, 23 Oct 2015 21:30:43 +0000 (17:30 -0400)]
Merge branch '6600-retry-job-helpers'
refs #6600
radhika [Fri, 23 Oct 2015 21:28:07 +0000 (17:28 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Bryan Cosca [Fri, 23 Oct 2015 21:19:57 +0000 (17:19 -0400)]
6600: Added Retryloop to task_set_output(), current_task(), and current_job() to python SDK
Peter Amstutz [Fri, 23 Oct 2015 19:17:50 +0000 (15:17 -0400)]
Merge branch 'master' into 7582-crunch-runner
Conflicts:
sdk/cli/bin/crunch-job
Peter Amstutz [Fri, 23 Oct 2015 19:00:01 +0000 (15:00 -0400)]
Merge branch '7582-run-any-docker-container' refs #7582
Peter Amstutz [Fri, 23 Oct 2015 18:57:55 +0000 (14:57 -0400)]
7582: Add test for stdbuf in /bin/sh bootstrap script.
Peter Amstutz [Fri, 23 Oct 2015 15:16:27 +0000 (11:16 -0400)]
7582: Adjust signal catching to eliminate races. Tighten up code based on comments.
radhika [Fri, 23 Oct 2015 15:14:43 +0000 (11:14 -0400)]
7492: cleanup error checking in keepproxy
radhika [Fri, 23 Oct 2015 13:44:47 +0000 (09:44 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Brett Smith [Fri, 23 Oct 2015 00:09:37 +0000 (20:09 -0400)]
Merge branch '7587-httplib2-retries-wip'
Refs #7587. Closes #7640.
Brett Smith [Wed, 21 Oct 2015 16:35:45 +0000 (12:35 -0400)]
7587: PySDK retries socket.error exceptions from API requests.
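The general idea of retrying a request on socket.error can be sketched as follows. This is only an illustration and not the PySDK code: the function name, the `retries` and `delay` parameters, and the retry policy are all assumptions.

```python
import socket
import time


def call_with_retries(request, retries=2, delay=0.0):
    """Call request(); retry when it raises socket.error.

    Hypothetical helper for illustration only. It makes at most
    retries + 1 attempts and re-raises the last socket.error.
    """
    for attempt in range(retries + 1):
        try:
            return request()
        except socket.error:
            if attempt == retries:
                raise
            time.sleep(delay)
```

In Python 3, socket.error is an alias of OSError, so this sketch also retries other OSError subclasses.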
radhika [Thu, 22 Oct 2015 19:32:31 +0000 (15:32 -0400)]
closes #7546
Merge branch '7546-put-retry'
radhika [Thu, 22 Oct 2015 19:31:32 +0000 (15:31 -0400)]
7546: update comment to explain why we do not want to retry when status code is 503.
radhika [Thu, 22 Oct 2015 19:24:43 +0000 (15:24 -0400)]
Merge branch 'master' into 7546-put-retry
Peter Amstutz [Thu, 22 Oct 2015 19:08:15 +0000 (15:08 -0400)]
7582: fix typo --user=$try_user to $try_user_arg
Peter Amstutz [Thu, 22 Oct 2015 19:05:11 +0000 (15:05 -0400)]
7582: Don't call stdbuf in minimal run mode.
Peter Amstutz [Thu, 22 Oct 2015 18:55:44 +0000 (14:55 -0400)]
7582: Make fields in Job, Task, TaskDefs public so that json loading reflection works.
Peter Amstutz [Thu, 22 Oct 2015 18:16:56 +0000 (14:16 -0400)]
7582: Add parameter substitution. Improve validity checking for filenames.
Adjust signal handling & added test. Tweak behavior on exit code handling.
Move IArvadosClient to crunchrunner.
radhika [Thu, 22 Oct 2015 17:14:00 +0000 (13:14 -0400)]
refs #7167
Merge branch '7167-keep-rsync'
radhika [Thu, 22 Oct 2015 17:03:18 +0000 (13:03 -0400)]
Merge branch 'master' into 7167-keep-rsync
Peter Amstutz [Thu, 22 Oct 2015 14:20:36 +0000 (10:20 -0400)]
7582: Better reporting in the log about user probe behavior.
Peter Amstutz [Thu, 22 Oct 2015 13:51:37 +0000 (09:51 -0400)]
7582: Runner uploads results. Feature complete.
Peter Amstutz [Thu, 22 Oct 2015 13:20:13 +0000 (09:20 -0400)]
7582: Uploader passes tests
Peter Amstutz [Wed, 21 Oct 2015 20:41:35 +0000 (16:41 -0400)]
7582: Uploader mostly done, writing tests
Brett Smith [Wed, 21 Oct 2015 19:37:53 +0000 (15:37 -0400)]
Fix multiple exception catching in arv-run.
The previous version catches IOError and binds the exception object to
the name OSError. No issue #.
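The pitfall described above is easy to reproduce. In Python 2, `except IOError, OSError:` catches only IOError and binds the exception object to the name OSError; catching both types requires a parenthesized tuple. A minimal sketch of the correct form follows (the function name is hypothetical and this is not the arv-run code):

```python
def read_text_or_none(path):
    """Return the contents of path, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    # The tuple catches either exception type. Append "as error" to
    # bind the caught exception object to a name of your choosing.
    except (IOError, OSError):
        return None
```

Python 3 rejects the comma form as a syntax error, so the mistake can only slip through in Python 2 code.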
Peter Amstutz [Wed, 21 Oct 2015 17:38:57 +0000 (13:38 -0400)]
7582: More tests, add vwd support
Peter Amstutz [Wed, 21 Oct 2015 17:04:07 +0000 (13:04 -0400)]
7582: Working on tests.
radhika [Wed, 21 Oct 2015 16:23:09 +0000 (12:23 -0400)]
7167: expand the src and dst help messages to list the config parameters that are to be included in the config files.
Bryan Cosca [Wed, 21 Oct 2015 15:36:25 +0000 (11:36 -0400)]
Merge branch '7015-update-user-guide'
closes #7015
Ward Vandewege [Wed, 21 Oct 2015 15:19:46 +0000 (11:19 -0400)]
SSO installation doc fix: to run rails console, you need to be in the
/var/www/arvados-sso/current directory.
closes #7623
Bryan Cosca [Wed, 21 Oct 2015 15:18:52 +0000 (11:18 -0400)]
7015: Removed whitespace
Peter Amstutz [Wed, 21 Oct 2015 13:03:27 +0000 (09:03 -0400)]
7582: Crunchrunner work in progress.
radhika [Wed, 21 Oct 2015 13:00:58 +0000 (09:00 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
Tom Clegg [Tue, 20 Oct 2015 21:53:27 +0000 (17:53 -0400)]
6358: Fix probe order test logic.
This request order is OK with two threads: thread "0" just took a long
time to make its request.
expect 0 1 2 3 4 5 6 7
got 1 2 3 4 5 0 6 7
The inverse is not OK. This would mean 0 started before any of
1,2,3,4,5 finished.
expect 1 2 3 4 5 0 6 7
got 0 1 2 3 4 5 6 7
refs #6358
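One way to express the partial-order rule from the two examples above is sketched below. It is an interpretation, not the test's actual code, and it assumes that requests start in the expected order, that a new request starts only when one of `num_threads` slots frees up, and that items are unique. Under those assumptions an item may be observed arbitrarily late, but at most `num_threads - 1` positions early.

```python
def order_is_plausible(expected, got, num_threads):
    """Return True if `got` could be observed when requests are started
    in `expected` order with at most `num_threads` in flight at once.

    Hypothetical helper; items must be unique and hashable.
    """
    if sorted(expected) != sorted(got):
        return False
    got_index = {item: i for i, item in enumerate(got)}
    slack = num_threads - 1
    # An item expected at position i cannot show up earlier than
    # position i - slack, because i - slack earlier requests must
    # already have finished before it was allowed to start.
    return all(got_index[item] >= i - slack
               for i, item in enumerate(expected))
```

With two threads, the first example above passes this check and the inverse fails it.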
Tom Clegg [Tue, 20 Oct 2015 21:52:55 +0000 (17:52 -0400)]
6358: Fix race opportunity in ThreadLimiter.
refs #6358
Peter Amstutz [Tue, 20 Oct 2015 20:34:50 +0000 (16:34 -0400)]
7582: (1) Probe for non-root Docker user to use instead of assuming "crunch".
Tries the default user for the container, then 'crunch', then 'nobody', testing
for whether the actual user id is non-zero. This defends against mistakes but
not malice, but we intend to harden the security in the future so we don't want
anyone getting used to their jobs running as root in their Docker
containers.
(2) If arvados_sdk_version is not present, skip the "pipe to perl to install
the SDK" logic, but instead bootstrap with a small bash script that only
creates temporary directories and runs the crunch script.
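The probing order in (1) can be sketched roughly as below. This is an illustration only, not the crunch-job implementation: the function name, the `lookup_uid` callable, and the use of an empty string for the container's default user are assumptions.

```python
def pick_non_root_user(candidates, lookup_uid):
    """Return the first candidate whose uid is known and non-zero.

    `lookup_uid` is a hypothetical callable that returns the numeric
    uid a candidate resolves to inside the container, or None if the
    user does not exist. Returns None if every candidate is root or
    missing.
    """
    for user in candidates:
        uid = lookup_uid(user)
        if uid is not None and uid != 0:
            return user
    return None
```

A caller would pass the candidates in the order described above, for example `["", "crunch", "nobody"]`, and refuse to run the job if the result is None.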
radhika [Tue, 20 Oct 2015 17:41:29 +0000 (13:41 -0400)]
7546: update some of the failure tests to use keepclient.Retries = 0, so that the tests do not waste too much time
retrying a test that is designed for failures. This update brings down the keepclient_test runtime from 49s to 10s.
radhika [Tue, 20 Oct 2015 17:30:45 +0000 (13:30 -0400)]
Merge branch 'master' into 7546-put-retry
radhika [Tue, 20 Oct 2015 15:41:07 +0000 (11:41 -0400)]
Merge branch 'master' into 7492-keepproxy-upstream-errors
radhika [Tue, 20 Oct 2015 15:40:05 +0000 (11:40 -0400)]
7492: update keep-rsync test with bad blob signing key to expect Forbidden error instead of Block not found error.
radhika [Tue, 20 Oct 2015 15:21:34 +0000 (11:21 -0400)]
7546: also retry when status code is 0, which is the case when a closed connection was used.
Brett Smith [Tue, 20 Oct 2015 15:04:29 +0000 (11:04 -0400)]
Merge branch 'pr/28'
Closes #7324.
Brett Smith [Tue, 20 Oct 2015 15:03:55 +0000 (11:03 -0400)]
Clean redundant except: blocks in run_test_server.
Brett Smith [Tue, 20 Oct 2015 15:02:51 +0000 (11:02 -0400)]
7324: Tighten exception ignoring in run_test_server start_nginx.
We just want to make sure the FIFO's gone. Ignore the OSError that
says "can't remove it because it's already gone," and re-raise all
others.
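The pattern described above, ignoring only the "already gone" error and re-raising everything else, looks roughly like this in Python. The helper name is hypothetical and this is not the run_test_server code itself.

```python
import errno
import os


def remove_if_present(path):
    """Remove path, treating "no such file" as success."""
    try:
        os.remove(path)
    except OSError as error:
        # ENOENT means the file was already gone, which is the state
        # we wanted. Any other failure (permissions, path is a
        # directory, and so on) is unexpected and is re-raised.
        if error.errno != errno.ENOENT:
            raise
```

Compared with a bare `except: pass`, this keeps real problems visible while still making the cleanup idempotent.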
radhika [Tue, 20 Oct 2015 15:02:19 +0000 (11:02 -0400)]
Merge branch 'master' into 7546-put-retry
radhika [Tue, 20 Oct 2015 14:59:41 +0000 (10:59 -0400)]
7492: better error reporting of upstream errors in keepproxy.
Tom Clegg [Mon, 19 Oct 2015 19:29:28 +0000 (15:29 -0400)]
Merge branch '6358-put-rendezvous' closes #6358
Brett Smith [Mon, 19 Oct 2015 18:40:24 +0000 (14:40 -0400)]
7499: Update development link in Workbench "Getting Started" popup.
Also, de-hyphenate "open source."
Refs #7499.
radhika [Mon, 19 Oct 2015 18:13:25 +0000 (14:13 -0400)]
Merge branch 'master' into 7546-put-retry
radhika [Mon, 19 Oct 2015 18:12:00 +0000 (14:12 -0400)]
refs #7167
Merge branch '7167-keep-rsync'
radhika [Mon, 19 Oct 2015 18:11:41 +0000 (14:11 -0400)]
Merge branch 'master' into 7167-keep-rsync
radhika [Mon, 19 Oct 2015 03:21:58 +0000 (23:21 -0400)]
7546: enhance putReplicas method to retry.
Tom Clegg [Sat, 17 Oct 2015 04:39:25 +0000 (00:39 -0400)]
Merge branch '7173-jessie'
closes #7173
Tom Clegg [Fri, 16 Oct 2015 23:40:22 +0000 (19:40 -0400)]
6358: Test partial ordering with multiple writer threads.
Tom Clegg [Fri, 16 Oct 2015 22:26:14 +0000 (18:26 -0400)]
6358: Fix rendezvous probe order on Put.
Bug #1 was that KeepClient.put() was starting threads in the order
given by roots_map.iteritems(), instead of the order they were
supplied by weighted_service_roots(). This is fixed by using the same
logic get() was using.
Bug #2 was that ThreadLimiter didn't unblock threads in the same order
they were created by put(). This is fixed by adding a "set_sequence"
method to ThreadLimiter to indicate the order in which threads should
be unblocked.
The new test case confirms that put(copies=1) always makes requests in
the correct order.
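The fix for bug #2 amounts to a limiter that admits waiting threads in a declared order rather than whichever thread happens to wake first. The sketch below illustrates that idea with a condition variable. It is not the SDK's ThreadLimiter: the class name and the `acquire(sequence)` interface are assumptions and differ from the "set_sequence" method mentioned above.

```python
import threading


class SequencedLimiter:
    """Admit at most `limit` threads at once, in ascending sequence order.

    Hypothetical sketch. Callers must use sequence numbers 0, 1, 2, ...
    with no gaps, otherwise later threads wait forever.
    """

    def __init__(self, limit):
        self._limit = limit
        self._active = 0
        self._next = 0
        self._cond = threading.Condition()

    def acquire(self, sequence):
        with self._cond:
            # Wait until it is this thread's turn and a slot is free.
            while self._next != sequence or self._active >= self._limit:
                self._cond.wait()
            self._active += 1
            self._next += 1
            self._cond.notify_all()

    def release(self):
        with self._cond:
            self._active -= 1
            self._cond.notify_all()
```

With `limit=1`, threads run strictly in sequence order even if they were started in a different order. With a larger limit, only the admission order is fixed, and completion order can still vary.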
Bryan Cosca [Fri, 16 Oct 2015 21:07:59 +0000 (17:07 -0400)]
7015: Finished going through user guide
Peter Amstutz [Fri, 16 Oct 2015 15:42:01 +0000 (11:42 -0400)]
Merge branch '6321-slurm-oserror' closes #6321
Peter Amstutz [Fri, 16 Oct 2015 15:40:36 +0000 (11:40 -0400)]
6321: Add note about rationale for retrying on OSError.
Peter Amstutz [Fri, 16 Oct 2015 15:16:49 +0000 (11:16 -0400)]
6321: Add test that OSError is caught from slurm subprocess invocations.
radhika [Fri, 16 Oct 2015 14:23:12 +0000 (10:23 -0400)]
7167: Remove StartKeepWithParams and StopKeepWithParams and make StartKeep and StopKeep with parameters the only exposed funcs.
The update was small enough, about 10 usages in the entire code, that it did not make sense to postpone it for a "better" time.
Colin Nolan [Fri, 16 Oct 2015 13:15:09 +0000 (14:15 +0100)]
7324: Implemented deletion of previous nginx access log fifo before creation,
as discussed with @jrandall to address issue raised by @brettcs
(see: https://github.com/curoverse/arvados/pull/28#discussion_r39689972).
radhika [Fri, 16 Oct 2015 02:55:03 +0000 (22:55 -0400)]
7167: Break all the code from keep-rsync main method into a separate func so that arg parsing can also be tested.
Rather than using default flag parsing, use FlagSet so that flags can be set multiple times from multiple tests.
Bryan Cosca [Thu, 15 Oct 2015 21:10:19 +0000 (17:10 -0400)]
7015: Checked up to Concurrent Crunch tasks