radhika [Fri, 20 Nov 2015 21:26:18 +0000 (16:26 -0500)]
Merge branch '7490-datamanager-dont-die-return-error' of git.curoverse.com:arvados into 7490-datamanager-dont-die-return-error
Tom Clegg [Fri, 20 Nov 2015 21:21:42 +0000 (16:21 -0500)]
7490: Quote strings in error messages, fixup error matching in tests.
radhika [Fri, 20 Nov 2015 14:31:24 +0000 (09:31 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
radhika [Thu, 19 Nov 2015 20:23:25 +0000 (15:23 -0500)]
7490: remove loggerutil.LogErrorMessage that I added since it is not used
radhika [Thu, 19 Nov 2015 19:54:37 +0000 (14:54 -0500)]
7490: use loggerutil to log any datamanager errors.
Peter Amstutz [Thu, 19 Nov 2015 19:46:38 +0000 (14:46 -0500)]
Merge branch '5353-set-node-size' refs #5353
Peter Amstutz [Thu, 19 Nov 2015 19:44:42 +0000 (14:44 -0500)]
5353: Remove checks that cloud_node.size is None (because it should never be None or
booting multiple node sizes won't work). Set size explicitly for the dummy driver.
Peter Amstutz [Thu, 19 Nov 2015 18:32:44 +0000 (13:32 -0500)]
5353: Explicitly set size field on node objects returned by list_nodes on AWS and Azure.
radhika [Thu, 19 Nov 2015 17:44:15 +0000 (12:44 -0500)]
7490: Add Err to collection.ReadCollections and keep.ServerResponse so that the error can be propagated to clients accessing these through a channel read.
Peter Amstutz [Thu, 19 Nov 2015 17:20:35 +0000 (12:20 -0500)]
Hotfix: use a recursive lock for closed_lock so that EventClient.close() can be
called from on_event(). refs #7654
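A minimal sketch of why a recursive lock matters here, assuming the handler is invoked while closed_lock is held. EventClientSketch and run_demo are hypothetical stand-ins, not the real EventClient. With a plain threading.Lock, the nested close() call below would deadlock; threading.RLock lets the same thread acquire the lock again.

```python
import threading


class EventClientSketch(object):
    """Hypothetical stand-in for an event client; not the real EventClient."""

    def __init__(self, on_event):
        self.on_event = on_event
        self.closed = False
        # RLock can be re-acquired by the thread that already holds it.
        self.closed_lock = threading.RLock()

    def received_message(self, message):
        # The handler runs while the lock is held (an assumption of this sketch).
        with self.closed_lock:
            if not self.closed:
                self.on_event(message)

    def close(self):
        # Safe to call from inside on_event() because the lock is recursive.
        with self.closed_lock:
            self.closed = True


def run_demo():
    events = []

    def handler(message):
        events.append(message)
        client.close()  # re-enters closed_lock on the same thread

    client = EventClientSketch(handler)
    client.received_message("first")
    client.received_message("second")  # ignored: the client is already closed
    return events, client.closed
```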
radhika [Thu, 19 Nov 2015 16:12:22 +0000 (11:12 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
Peter Amstutz [Thu, 19 Nov 2015 02:35:50 +0000 (21:35 -0500)]
Merge branch '7654-ws4py-hang' closes #7654
Peter Amstutz [Thu, 19 Nov 2015 02:27:05 +0000 (21:27 -0500)]
Merge branch '3137-arv-mount-stats' closes #3137
Peter Amstutz [Thu, 19 Nov 2015 02:26:24 +0000 (21:26 -0500)]
3137: Bump Python SDK version requirement. Tweak Stats() class, don't keep two
sets of data points.
Brett Smith [Wed, 18 Nov 2015 23:00:27 +0000 (18:00 -0500)]
Merge branch '6923-crunch-no-dpkg-query-wip'
Closes #6923, #7740.
Brett Smith [Wed, 18 Nov 2015 20:30:09 +0000 (15:30 -0500)]
6923: crunch-job logs PySDK version when minimally bootstrapped.
Brett Smith [Mon, 9 Nov 2015 15:13:21 +0000 (10:13 -0500)]
6923: Improve Arvados SDK version logging in Crunch run script.
* Use a mechanism that works in a wider variety of containers. This
only depends on Python itself and setuptools. It won't generate
spurious warnings by calling dpkg-query on Red Hat containers.
* Don't log the version when we successfully set up the specified
arvados_sdk_version. The version will only be '0.1' in this case,
and that's not helpful.
Peter Amstutz [Wed, 18 Nov 2015 20:24:02 +0000 (15:24 -0500)]
7654: Fix hang in close().
Peter Amstutz [Wed, 18 Nov 2015 18:48:35 +0000 (13:48 -0500)]
3137: Change --stats to --crunchstat-interval as specified on the ticket.
Peter Amstutz [Wed, 18 Nov 2015 17:09:47 +0000 (12:09 -0500)]
Merge branch '5353-node-sizes' closes #5353
Brett Smith [Wed, 18 Nov 2015 16:36:04 +0000 (11:36 -0500)]
6846: Streamline Workbench 404 page.
* Prompt the user to log in with a prominent button.
* Make the page text less verbose.
* DRY up the code in the _report_error partial.
Refs #6846.
Peter Amstutz [Wed, 18 Nov 2015 15:18:03 +0000 (10:18 -0500)]
5353: Remove extra assertion because busywait does it for us.
Peter Amstutz [Wed, 18 Nov 2015 14:52:46 +0000 (09:52 -0500)]
5353: Update comment about min_nodes and node size.
Peter Amstutz [Wed, 18 Nov 2015 14:45:08 +0000 (09:45 -0500)]
5353: Add a couple comments to tests.
Peter Amstutz [Wed, 18 Nov 2015 14:25:32 +0000 (09:25 -0500)]
5353: Fix typo in _nodes_wanted(). Calculate number of nodes that can boot
based on price cap. Don't add jobs to wishlist that exceed max price cap.
radhika [Wed, 18 Nov 2015 13:38:38 +0000 (08:38 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
Peter Amstutz [Wed, 18 Nov 2015 04:07:14 +0000 (23:07 -0500)]
3137: Add --stats to arv-mount in crunch-job
Peter Amstutz [Wed, 18 Nov 2015 03:57:30 +0000 (22:57 -0500)]
3137: Add counter & logging for cache hits
Peter Amstutz [Wed, 18 Nov 2015 03:00:45 +0000 (22:00 -0500)]
3137: Refactor stats to record keep & fuse operations as well as bytes.
sguthrie [Tue, 17 Nov 2015 21:49:04 +0000 (16:49 -0500)]
Closes #7235. Merge branch '7235-python-keep-client-timeout'
sguthrie [Tue, 10 Nov 2015 20:23:18 +0000 (15:23 -0500)]
Closes #7235. Instead of setting KeepService's pycurl.TIMEOUT_MS, set pycurl.LOW_SPEED_LIMIT and pycurl.LOW_SPEED_TIME.
Default LOW_SPEED_LIMIT is 32768 bytes per second. Default LOW_SPEED_TIME is 64 seconds.
If the user specifies a length-two tuple, the first item sets CONNECTTIMEOUT_MS, the second item sets LOW_SPEED_TIME,
and LOW_SPEED_LIMIT is set to 32768 bytes per second.
Added bandwidth simulator to keepstub, which uses millisecond precision (like curl) to measure timeouts.
Added tests to test_keep_client and modified existing tests to only use integers.
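The timeout mapping described above could be sketched roughly as follows. This is an illustration, not the SDK's code: curl_timeout_options is a hypothetical helper, the assumption that the tuple items are given in seconds is mine, and the sketch returns option names as strings so it runs without pycurl installed.

```python
import math

# Defaults taken from the commit message above.
DEFAULT_LOW_SPEED_LIMIT = 32768  # bytes per second
DEFAULT_LOW_SPEED_TIME = 64      # seconds


def curl_timeout_options(timeout=None):
    """Map a timeout spec to curl option names and values (hypothetical helper).

    Assumes tuple items are in seconds: (connect_timeout, low_speed_time).
    """
    if timeout is None:
        return {
            "LOW_SPEED_LIMIT": DEFAULT_LOW_SPEED_LIMIT,
            "LOW_SPEED_TIME": DEFAULT_LOW_SPEED_TIME,
        }
    connect_timeout, low_speed_time = timeout
    return {
        "CONNECTTIMEOUT_MS": int(connect_timeout * 1000),
        # LOW_SPEED_TIME takes whole seconds, so round up.
        "LOW_SPEED_TIME": int(math.ceil(low_speed_time)),
        "LOW_SPEED_LIMIT": DEFAULT_LOW_SPEED_LIMIT,
    }


# With pycurl available, the options could be applied like this:
#     for name, value in curl_timeout_options((2, 30)).items():
#         curl.setopt(getattr(pycurl, name), value)
```

Unlike a fixed total timeout, the low-speed pair aborts a transfer only when throughput stays below the limit for the whole window, so a large but steadily progressing transfer is not cut off.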
Peter Amstutz [Tue, 17 Nov 2015 16:11:20 +0000 (11:11 -0500)]
3137: Add stat counters for bytes uploaded/downloaded (keep) and read/written (fuse).
radhika [Tue, 17 Nov 2015 15:26:09 +0000 (10:26 -0500)]
7490: added a couple more datamanager tests with errors injected during GetCollections
Brett Smith [Tue, 17 Nov 2015 03:42:31 +0000 (22:42 -0500)]
7313: crunch-job reports an error when a task doesn't record state.
Closes #7313.
Peter Amstutz [Mon, 16 Nov 2015 22:01:05 +0000 (17:01 -0500)]
5353: Fixes from testing with Dummy driver.
Peter Amstutz [Mon, 16 Nov 2015 21:25:48 +0000 (16:25 -0500)]
5353: Add note that min_nodes boots cheapest nodes.
Peter Amstutz [Mon, 16 Nov 2015 21:21:34 +0000 (16:21 -0500)]
5353: Added max_total_price. Added more tests for multiple node sizes.
Updated config file examples.
radhika [Sun, 15 Nov 2015 00:04:56 +0000 (19:04 -0500)]
7490: added several error condition check tests for datamanager/keep package; increased code coverage from 14.6% to 72%
radhika [Fri, 13 Nov 2015 18:05:09 +0000 (13:05 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
radhika [Fri, 13 Nov 2015 18:04:26 +0000 (13:04 -0500)]
7490: a few more keep unit tests with simulated errors
Brett Smith [Fri, 13 Nov 2015 14:29:40 +0000 (09:29 -0500)]
Merge branch '7696-pysdk-all-keep-service-types-wip'
Closes #7696, #7758.
Brett Smith [Wed, 11 Nov 2015 22:08:39 +0000 (17:08 -0500)]
7696: Improve PySDK KeepClient.ThreadLimiter.
* Move the calculation of how many threads to allow into the class.
* Teach it to handle cases where max_replicas_per_service is known and
greater than 1. This will never happen today, but is an anticipated
improvement.
* Update docstrings to reflect current reality.
These are all changes I made while debugging the previous race
condition.
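One way the thread-count calculation might look, as a sketch only. The rule in allowed_threads is an assumption made for illustration, and ThreadLimiterSketch is a hypothetical class, not KeepClient.ThreadLimiter itself.

```python
import math
import threading


def allowed_threads(want_copies, max_replicas_per_service=None):
    """Return how many writer threads may run at once (assumed rule)."""
    if not max_replicas_per_service or max_replicas_per_service >= want_copies:
        # Capacity unknown, or one service can satisfy the whole request:
        # let a single writer finish before another one starts.
        return 1
    # Otherwise allow just enough writers to reach the requested copy count.
    return int(math.ceil(float(want_copies) / max_replicas_per_service))


class ThreadLimiterSketch(object):
    """Hypothetical context manager that caps concurrent writers."""

    def __init__(self, want_copies, max_replicas_per_service=None):
        self.slots = allowed_threads(want_copies, max_replicas_per_service)
        self._semaphore = threading.Semaphore(self.slots)

    def __enter__(self):
        self._semaphore.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._semaphore.release()
        return False
```

Keeping the calculation inside the class means callers only pass what they know about the services, and the limiter decides how much concurrency is safe.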
Brett Smith [Wed, 11 Nov 2015 21:50:18 +0000 (16:50 -0500)]
7696: PySDK determines max_replicas_per_service after querying services.
Because max_replicas_per_service was set to 1 in the case where
KeepClient was instantiated with no direct information about available
Keep services, and because ThreadLimiter was being instantiated before
querying available Keep services (via map_new_services), the first
Keep request to talk to non-disk services would let multiple threads
run at once. This fixes that race condition, and adds a test that was
triggering it semi-reliably.
Brett Smith [Wed, 11 Nov 2015 17:17:46 +0000 (12:17 -0500)]
7696: PySDK KeepClient uses all service types.
Filter out gateway services from the list of usable services, rather
than selecting only disk and proxy types.
Brett Smith [Wed, 11 Nov 2015 17:18:46 +0000 (12:18 -0500)]
7696: Clean imports in PySDK arvados.keep module.
Brett Smith [Wed, 11 Nov 2015 15:06:51 +0000 (10:06 -0500)]
7696: Refactor locator builder method in PySDK tests.
Brett Smith [Fri, 13 Nov 2015 14:28:12 +0000 (09:28 -0500)]
Merge branch '7123-crunch-no-record-log-failure-wip'
Closes #7123, #7741.
Brett Smith [Mon, 9 Nov 2015 15:28:51 +0000 (10:28 -0500)]
7123: Crunch doesn't update job log when arv-put fails.
This prevents crunch-job from recording the empty collection as a
job's log. Most other components (Workbench, the log cleaner)
recognize a null log as a special case; less so the empty collection.
Brett Smith [Thu, 12 Nov 2015 21:33:48 +0000 (16:33 -0500)]
Merge branch '7645-doc-client-max-body-size-wip'
Closes #7645, #7742. Refs #7356.
Brett Smith [Mon, 9 Nov 2015 17:44:38 +0000 (12:44 -0500)]
7356: Install guide sets client_max_body_size for arv-git-httpd.
Brett Smith [Mon, 9 Nov 2015 17:43:58 +0000 (12:43 -0500)]
7645: Install guide suggests setting client_max_body_size consistently.
Without these changes, the upstream Passenger processes may reject
large request bodies.
radhika [Thu, 12 Nov 2015 21:26:44 +0000 (16:26 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
Brett Smith [Thu, 12 Nov 2015 21:12:28 +0000 (16:12 -0500)]
Merge branch '6846-workbench-top-nav-login-returns-wip'
Closes #6846, #7739.
Brett Smith [Mon, 9 Nov 2015 17:02:25 +0000 (12:02 -0500)]
6846: Workbench navigation bar login returns user to the same page.
Brett Smith [Thu, 12 Nov 2015 20:31:09 +0000 (15:31 -0500)]
Merge branch '6356-crunch-permfail-task-retry-fix-wip'
Closes #6356, #7738.
Brett Smith [Mon, 9 Nov 2015 13:30:14 +0000 (08:30 -0500)]
6356: crunch-job doesn't create new tasks after job success is set.
#6356 reported that a permanently failed task was retried. Note 3
discusses why this happened and suggests two fixes:
* Only put tempfailed task back on the todo list.
* Run `last THISROUND if $main::please_freeze || defined($main::success);`
after we call reapchildren(), since it's the main place where the
value of $main::success can change.
The first change would revert part of
75be7487c2bbd83aa5116aa5f8ade5ddf31501da, which intentionally puts
these tasks back on the todo list to get a correct tasks count.
The current `last if…` line was added in
b306eb48ab12676ffb365ede8197e4f2d7e92011, with the rationale "Don't
create new tasks if $main::success is defined." This change corrects
the code to implement the desired functionality, by checking and
stopping just before we create a new task (functionally, at least).
Tom Clegg [Thu, 12 Nov 2015 20:00:59 +0000 (15:00 -0500)]
Merge branch '5824-keep-web-workbench' closes #5824
radhika [Thu, 12 Nov 2015 18:03:10 +0000 (13:03 -0500)]
7490: Update the previously failing keep_test.go; no new tests added. We can now add datamanager/keep to gostuff in run-tests.sh
Tom Clegg [Wed, 11 Nov 2015 23:32:50 +0000 (18:32 -0500)]
5824: Fix clear-download-dir helper.
Tom Clegg [Wed, 11 Nov 2015 23:32:23 +0000 (18:32 -0500)]
5824: Fix path and query escapes.
Paths encode spaces as "%20", not "+".
Rails to_query helper does undesirable things like
"disposition[]=attachment".
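The difference between path escaping and form-style escaping can be illustrated in Python (the fix itself is in Ruby and Go code; build_url and the example host are hypothetical). urllib.parse.quote encodes a space as "%20", while quote_plus encodes it as "+", which is only appropriate in a query string.

```python
from urllib.parse import quote, quote_plus, urlencode


def build_url(base, path, query=None):
    """Join base and path, escaping the path with %20 rather than + (sketch)."""
    url = base.rstrip("/") + "/" + quote(path)  # "/" is left unescaped
    if query:
        # Scalar values produce plain key=value pairs, with no "[]" suffix.
        url += "?" + urlencode(query)
    return url


path_style = quote("my file.txt")       # path-safe escaping
form_style = quote_plus("my file.txt")  # form/query-style escaping
example = build_url(
    "https://download.example.com",
    "dir/my file.txt",
    {"disposition": "attachment"},
)
```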
Tom Clegg [Wed, 11 Nov 2015 23:29:39 +0000 (18:29 -0500)]
5824: Fix -attachment-only-host test config. Test more preview/download variants.
radhika [Wed, 11 Nov 2015 17:34:04 +0000 (12:34 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
Tom Clegg [Wed, 11 Nov 2015 17:14:16 +0000 (12:14 -0500)]
Merge branch '5824-keep-web-workbench' refs #5824
Tom Clegg [Wed, 11 Nov 2015 17:11:46 +0000 (12:11 -0500)]
5824: Merge branch 'master' into 5824-keep-web-workbench
Conflicts:
services/keepproxy/keepproxy_test.go
radhika [Wed, 11 Nov 2015 16:36:10 +0000 (11:36 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
radhika [Wed, 11 Nov 2015 16:01:24 +0000 (11:01 -0500)]
closes #7661
Merge branch '7661-fuse-by-pdh'
radhika [Wed, 11 Nov 2015 16:01:02 +0000 (11:01 -0500)]
Merge branch 'master' into 7661-fuse-by-pdh
Tom Clegg [Wed, 11 Nov 2015 01:48:24 +0000 (20:48 -0500)]
5824: Update/clarify docs and comments.
radhika [Tue, 10 Nov 2015 23:41:55 +0000 (18:41 -0500)]
7661: Pass pdh_only when adding by_id subdir; test now passes.
Tom Clegg [Tue, 10 Nov 2015 16:35:03 +0000 (11:35 -0500)]
Merge branch '5538-test-post-retry' refs #5538
Tom Clegg [Tue, 10 Nov 2015 16:33:32 +0000 (11:33 -0500)]
5538: Update comments to match new tests.
radhika [Tue, 10 Nov 2015 15:52:35 +0000 (10:52 -0500)]
7661: added test with only_pdh (not working yet)
Tom Clegg [Tue, 10 Nov 2015 15:10:55 +0000 (10:10 -0500)]
5538: Test that POST method is not retried.
Tom Clegg [Tue, 10 Nov 2015 07:20:34 +0000 (02:20 -0500)]
Use a different port number for each test case. No issue #
Tom Clegg [Tue, 10 Nov 2015 06:29:11 +0000 (01:29 -0500)]
5824: Support configuration with a download-only host.
radhika [Mon, 9 Nov 2015 20:41:46 +0000 (15:41 -0500)]
Merge branch 'master' into 7661-fuse-by-pdh
Tom Clegg [Mon, 9 Nov 2015 20:00:14 +0000 (15:00 -0500)]
5824: Preserve query in keep_web_url template. Warn when redirecting preview to a single-origin keep_web_url.
Peter Amstutz [Mon, 9 Nov 2015 19:33:09 +0000 (14:33 -0500)]
Merge branch '3585-arpi-project-uuid-wip' closes #3585
radhika [Mon, 9 Nov 2015 19:01:17 +0000 (14:01 -0500)]
Merge branch 'master' into 7661-fuse-by-pdh
radhika [Mon, 9 Nov 2015 18:54:56 +0000 (13:54 -0500)]
Merge branch 'master' into 7490-datamanager-dont-die-return-error
radhika [Mon, 9 Nov 2015 18:54:29 +0000 (13:54 -0500)]
closes #5538
Merge branch '5538-arvadosclient-retry'
radhika [Mon, 9 Nov 2015 18:49:31 +0000 (13:49 -0500)]
5538: update the "error" test case to use clearer stub parameters (nil status code and response body) so readers are not confused.
radhika [Mon, 9 Nov 2015 16:21:35 +0000 (11:21 -0500)]
7661: rename MagicDirectory by_pdh as pdh_only
radhika [Mon, 9 Nov 2015 15:43:13 +0000 (10:43 -0500)]
Merge branch 'master' into 7661-fuse-by-pdh
radhika [Mon, 9 Nov 2015 15:37:18 +0000 (10:37 -0500)]
7490: Convert several fatalf statements into returning errors. No new tests are added yet, but all the existing tests are passing.
Peter Amstutz [Mon, 9 Nov 2015 14:27:38 +0000 (09:27 -0500)]
5353: Add a couple of tests to explicitly create nodes of different sizes
radhika [Mon, 9 Nov 2015 13:38:29 +0000 (08:38 -0500)]
5538: add a test that simulates an error while requesting the server, so that we can test the error path as well.
Brett Smith [Mon, 9 Nov 2015 11:05:28 +0000 (06:05 -0500)]
3585: Add --project-uuid switch to a-r-p-i.
Tom Clegg [Mon, 9 Nov 2015 08:28:50 +0000 (03:28 -0500)]
5824: Add anonymous-404 and download-by-pdh tests.
Tom Clegg [Sun, 8 Nov 2015 20:52:29 +0000 (15:52 -0500)]
5824: Propagate non-token parts of query string (notably ?disposition=attachment) when redirecting.
Tom Clegg [Sun, 8 Nov 2015 11:39:05 +0000 (06:39 -0500)]
5824: Support partial content with Range header (only if start==0).
Tom Clegg [Sat, 7 Nov 2015 09:36:01 +0000 (04:36 -0500)]
5824: Fix disposition=attachment handling.
Propagate disposition=attachment from Workbench to keep-web when
redirecting.
Include a filename in the Content-Disposition header if the request
URL contains "?", so UAs don't mistakenly include the query string as
part of the default filename.
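A rough sketch of the filename rule, written in Python for illustration. content_disposition is a hypothetical helper, and the quoting and percent-decoding details are assumptions rather than keep-web's actual behavior.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def content_disposition(request_url):
    """Build a Content-Disposition value for a download (hypothetical helper)."""
    if "?" not in request_url:
        # No query string, so the browser's default filename is already fine.
        return "attachment"
    path = urlsplit(request_url).path
    filename = unquote(posixpath.basename(path))
    if not filename:
        return "attachment"
    # Escape backslashes and double quotes inside the quoted string.
    escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
    return 'attachment; filename="{}"'.format(escaped)
```

Naming the file explicitly keeps user agents from deriving a default name that includes the query string.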
Tom Clegg [Sat, 7 Nov 2015 09:06:47 +0000 (04:06 -0500)]
5824: Fixup new keepproxy tests to use simplified test setup.
See
813d35123538b00ab70719e247b6bb0881269460
Tom Clegg [Sat, 7 Nov 2015 09:03:27 +0000 (04:03 -0500)]
5824: Move "periodically refresh Keep services" func from keepproxy to SDK.
Tom Clegg [Sat, 7 Nov 2015 09:00:50 +0000 (04:00 -0500)]
5824: Fix server shutdown code.
* Pay attention to --num-keep-servers in stop_keep.
* Wait for processes to exit, to avoid start/stop races.
* Tighten exception handling in kill_server_pid() and warn instead of
crashing in various races.
* Log TERM signals.
* Log when a server does not shut down within the given deadline.
Tom Clegg [Sat, 7 Nov 2015 08:54:03 +0000 (03:54 -0500)]
5824: Fix Keep server shutdown, check errors, simplify stderr redirection.
(Oops, we forgot to actually Run() the python command for stop_keep.)
radhika [Sat, 7 Nov 2015 14:25:48 +0000 (09:25 -0500)]
5538: update the test to set resp.body with the given string from the stub rather than hard-coding it (overlooked in previous commit)
radhika [Sat, 7 Nov 2015 14:00:49 +0000 (09:00 -0500)]
5538: correct retryable list and use it to determine whether to close idle connections; add a few more test cases.
radhika [Sat, 7 Nov 2015 13:42:38 +0000 (08:42 -0500)]
Merge branch 'master' into 5538-arvadosclient-retry
Tom Clegg [Sat, 7 Nov 2015 07:22:07 +0000 (02:22 -0500)]
5824: Use fifo2stderr for arv-git-httpd and keep-web logs, too.