Peter Amstutz [Tue, 6 Jan 2015 13:45:08 +0000 (08:45 -0500)]
Merge branch 'master' into 4570-multi-auth-method
Peter Amstutz [Tue, 6 Jan 2015 13:44:49 +0000 (08:44 -0500)]
4570: Revert to links on login page instead of form. Fix up documentation to
describe a production setup.
Peter Amstutz [Mon, 5 Jan 2015 16:37:54 +0000 (11:37 -0500)]
Merge branch '4869-keepalive' refs #4869
Peter Amstutz [Mon, 5 Jan 2015 15:17:42 +0000 (10:17 -0500)]
4869: Client.Timeout and Client.Transport are now correctly set in
DiscoverKeepServers(). Improved comments.
Tom Clegg [Wed, 31 Dec 2014 21:33:57 +0000 (16:33 -0500)]
Remove cruft. No issue #
Ward Vandewege [Wed, 31 Dec 2014 15:01:59 +0000 (10:01 -0500)]
Merge branch '4887-invalidate-duplicate-ip-on-old-compute-nodes'
closes #4887
Ward Vandewege [Wed, 31 Dec 2014 15:01:30 +0000 (10:01 -0500)]
Merge branch 'master' into 4887-invalidate-duplicate-ip-on-old-compute-nodes
Ward Vandewege [Wed, 31 Dec 2014 15:00:21 +0000 (10:00 -0500)]
Address review comments:
* change stale_conflicting_nodes to a local variable
* minor performance optimization: add an additional check for ip_address being nil
refs #4887
Tim Pierce [Tue, 30 Dec 2014 21:50:04 +0000 (16:50 -0500)]
Merge branch '4877-dont-delete-stdout'
Fixes #4877
Tim Pierce [Tue, 30 Dec 2014 21:45:42 +0000 (16:45 -0500)]
4877: don't delete /dev/stdout
Fixed the filename check before trying to delete /dev/stdout.
Ward Vandewege [Tue, 30 Dec 2014 19:31:53 +0000 (14:31 -0500)]
Detect stale compute node records with the same IP address as the new
node on its first ping. Clear the ip_address field on the stale nodes.
Refs #4887
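A minimal sketch of the stale-record check described above, in Python rather than the API server's own language. The record layout (plain dicts with "uuid" and "ip_address" keys) and the function name are assumptions for illustration, not the actual model code.

```python
def clear_stale_ip_addresses(nodes, new_node):
    """Clear ip_address on every other record that shares new_node's IP.

    Hypothetical helper: `nodes` is a list of dict records, `new_node` is
    the record for the node that just sent its first ping.
    """
    ip = new_node.get("ip_address")
    if ip is None:
        # Nothing to compare against, so no record can be in conflict.
        return []
    stale = [n for n in nodes
             if n is not new_node and n.get("ip_address") == ip]
    for n in stale:
        n["ip_address"] = None
    return stale


nodes = [
    {"uuid": "node-old", "ip_address": "10.0.0.5"},
    {"uuid": "node-other", "ip_address": "10.0.0.6"},
    {"uuid": "node-new", "ip_address": "10.0.0.5"},
]
stale = clear_stale_ip_addresses(nodes, nodes[2])
```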
Ward Vandewege [Tue, 30 Dec 2014 18:28:57 +0000 (13:28 -0500)]
Cleanups:
* Remove old commented-out code
* Remove superfluous test for presence of file on disk
refs #4887
Peter Amstutz [Tue, 30 Dec 2014 15:39:50 +0000 (10:39 -0500)]
4869: Enable TCP keepalive and adjust connection timeouts in Keep client.
Tom Clegg [Mon, 29 Dec 2014 22:02:01 +0000 (17:02 -0500)]
Fix whitespace, cf. gofmt. refs #4875
Tom Clegg [Mon, 29 Dec 2014 21:59:35 +0000 (16:59 -0500)]
Merge branch '4875-keepclient-test-race' closes #4875
Tom Clegg [Mon, 29 Dec 2014 21:29:17 +0000 (16:29 -0500)]
4875: Merge branch 'master' into 4875-keepclient-test-race
Conflicts:
sdk/go/keepclient/keepclient_test.go
Tom Clegg [Mon, 29 Dec 2014 20:45:30 +0000 (15:45 -0500)]
Fix version strings to comply with PEP-440. No issue #
Tom Clegg [Mon, 29 Dec 2014 20:12:46 +0000 (15:12 -0500)]
Merge branch '4523-owner_uuid-index' refs #4523
Peter Amstutz [Mon, 29 Dec 2014 20:11:05 +0000 (15:11 -0500)]
Merge branch '4869-keepproxy' refs #4869
Peter Amstutz [Mon, 29 Dec 2014 19:37:13 +0000 (14:37 -0500)]
4869: Strip all newlines from error responses, not just leading and trailing
whitespace.
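To illustrate the difference, here is a small Python sketch (the Keep code itself is Go). The function name is hypothetical, and removing newlines outright, rather than replacing them with spaces, is an assumption based on the wording "strip".

```python
def flatten_error_message(body):
    """Trim surrounding whitespace, then drop every embedded line break."""
    return body.strip().replace("\r", "").replace("\n", "")


raw = "  disk full\nretry later\n"
trimmed_only = raw.strip()            # still contains the inner newline
flattened = flatten_error_message(raw)
```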
Tom Clegg [Mon, 29 Dec 2014 18:58:58 +0000 (13:58 -0500)]
4523: Dry up migration and test cases.
Peter Amstutz [Mon, 29 Dec 2014 18:51:20 +0000 (13:51 -0500)]
4869: Based on Go documentation, don't set a body ReadCloser on the request
when body length is 0.
Tom Clegg [Mon, 29 Dec 2014 17:45:02 +0000 (12:45 -0500)]
4523: Fix column order to match migration order.
Tom Clegg [Mon, 29 Dec 2014 17:44:35 +0000 (12:44 -0500)]
4523: Remove dev-only checks in migration.
Peter Amstutz [Mon, 29 Dec 2014 17:32:38 +0000 (12:32 -0500)]
4869: Correctly handle zero-length blocks in Keep client/Keep proxy. Remove
X-Block-Size. Choose default request timeout based on whether the client is
talking to a proxy. Use double quotes in logging. Rename "tag" to "requestId".
Tom Clegg [Mon, 29 Dec 2014 17:28:44 +0000 (12:28 -0500)]
4523: Fix whitespace.
Peter Amstutz [Mon, 29 Dec 2014 14:23:45 +0000 (09:23 -0500)]
4869: Keepstore now returns Content-Length headers, and logs the error message
sent to the client on errors.
Peter Amstutz [Mon, 29 Dec 2014 14:09:13 +0000 (09:09 -0500)]
4869: KeepClient now has a default timeout per block request (10 minutes). In
keepproxy, the timeout is set to 20 seconds per block. Also rearranged some
keepclient and keepproxy logging to provide better information.
Tom Clegg [Tue, 23 Dec 2014 20:51:49 +0000 (15:51 -0500)]
Merge branch '4754-performance-TC' closes #4754
Ward Vandewege [Tue, 23 Dec 2014 20:47:49 +0000 (15:47 -0500)]
Merge branch '4844-stricter-min-nodes-wip'
refs #4844
Ward Vandewege [Tue, 23 Dec 2014 20:47:23 +0000 (15:47 -0500)]
Merge branch 'master' into 4844-stricter-min-nodes-wip
Ward Vandewege [Tue, 23 Dec 2014 20:44:10 +0000 (15:44 -0500)]
Skip two more CLI tests that need a running API server.
refs #4156
Peter Amstutz [Tue, 23 Dec 2014 14:55:05 +0000 (09:55 -0500)]
4869: Improve logging
Tom Clegg [Sun, 21 Dec 2014 00:28:56 +0000 (19:28 -0500)]
4875: Let the OS choose port numbers for fake servers.
Fixes a race condition where test case N+1 can't listen on port 2990
because test case N hasn't shut down its listener.
Also removes the artificial acceptance requirement that nobody else on
the testing host is using the arbitrarily assigned port range
2990..299x.
Incidental changes:
* rename RunBogusKeepServer to RunFakeKeepServer (to match
RunSomeFakeKeepServers and fix the misleading implication that the
resulting server does something bogus).
* return a KeepServer object from RunFakeKeepServer (for better parity
with RunSomeFakeKeepServers).
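The technique of letting the OS choose the port can be sketched as follows. This is Python, not the Go test code, and listen_on_free_port is a hypothetical name. Binding to port 0 asks the kernel for any unused port, which is then read back from the socket.

```python
import socket


def listen_on_free_port(host="127.0.0.1"):
    """Open a listening socket on an OS-assigned port; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))          # port 0: let the OS pick a free port
    sock.listen(1)
    return sock, sock.getsockname()[1]


# Two fake servers can be live at once without agreeing on a port range.
first, first_port = listen_on_free_port()
second, second_port = listen_on_free_port()
first.close()
second.close()
```

Because each test reads the port back from its own listener, a slow shutdown in one test case cannot block the next one.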
Tom Clegg [Sat, 20 Dec 2014 23:49:53 +0000 (18:49 -0500)]
4875: Use range in for loops.
Phil Hodgson [Sat, 20 Dec 2014 18:34:39 +0000 (19:34 +0100)]
Merge branch '4358-graph-not-comparing' refs #4358
Phil Hodgson [Sat, 20 Dec 2014 17:58:03 +0000 (18:58 +0100)]
Merge branch 'master' into 4358-graph-not-comparing
Brett Smith [Fri, 19 Dec 2014 17:09:17 +0000 (12:09 -0500)]
4844: Node Manager doesn't treat min_nodes as min_nodes_idle.
There's a bad interaction between the past bugfixes to (a) implement
min_nodes, and (b) boot new nodes when existing nodes are busy.
Because min_nodes has been implemented at the server wishlist level in
the past, the daemon can't distinguish between "nodes requested to
fulfill min_nodes" and "nodes requested to fulfill jobs."
This commit puts all the responsibility for enforcing min_nodes in the
daemon, so that the server wishlist always represents real job
requirements. This lets the daemon correctly decide whether or not to
boot a new node when >= min_nodes are busy.
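One way to picture the daemon-side decision is the sketch below. It is not the Node Manager implementation: the function name, its parameters, and the exact arithmetic are assumptions chosen to show min_nodes acting as a floor on total nodes rather than on idle nodes.

```python
def nodes_to_boot(wishlist_size, busy, idle, booting, min_nodes, max_nodes):
    """Hypothetical count of new nodes to boot.

    wishlist_size counts nodes wanted for real queued jobs only; min_nodes
    is enforced here, as a floor on the total number of nodes.
    """
    total = busy + idle + booting
    needed_for_jobs = max(0, wishlist_size - idle - booting)
    needed_for_min = max(0, min_nodes - total)
    wanted = max(needed_for_jobs, needed_for_min)
    return max(0, min(wanted, max_nodes - total))


# min_nodes busy and one more job queued: boot one node for the job.
with_queued_job = nodes_to_boot(1, busy=2, idle=0, booting=0,
                                min_nodes=2, max_nodes=10)
# min_nodes busy and nothing queued: min_nodes is not an idle target.
nothing_queued = nodes_to_boot(0, busy=2, idle=0, booting=0,
                               min_nodes=2, max_nodes=10)
```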
Brett Smith [Thu, 18 Dec 2014 21:20:15 +0000 (16:20 -0500)]
Merge branch '4670-node-manager-robust-tags-wip'
Closes #4670, #4812.
Brett Smith [Fri, 12 Dec 2014 21:16:39 +0000 (16:16 -0500)]
4670: Add a post-create hook to Node Manager for EC2 tagging.
The previous code was relying on the post-create tagging in libcloud's
EC2 driver. Unfortunately, that's not working out too well for us: if
it fails, you get no indication of that, and it doesn't get retried.
This moves the work up into Node Manager, where failures can be logged
and retried appropriately.
The retry support may be sufficient to resolve #4670. If it's not,
then the additional logging will help us track down the root cause.
Brett Smith [Fri, 12 Dec 2014 18:18:51 +0000 (13:18 -0500)]
4670: Node Manager handles more libcloud exceptions.
libcloud compute drivers (at least EC2 and GCE) raise bare Exceptions
when there's some problem talking to the cloud service. The previous
code was expecting to see a LibcloudError, so it wouldn't handle these
errors as intended.
I didn't want to just catch errors with "except Exception" everywhere,
so I added an is_cloud_exception class method to our driver classes to
more accurately identify exceptions that represent trouble talking to
the cloud service. It recognizes exact Exceptions, plus the other
classes we were catching before.
While I was at this, I gave more specific names to the wrapper methods
in compute node actor decorators, as a debugging aid.
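A rough sketch of the is_cloud_exception idea follows. FakeCloudError stands in for libcloud's error class so the example has no dependencies, and the class and attribute names are illustrative assumptions, not the driver code itself.

```python
class FakeCloudError(Exception):
    """Stand-in for the library error class the old code was catching."""


class BaseComputeNodeDriver:
    # Hypothetical tuple of classes that always mean "cloud trouble".
    CLOUD_ERRORS = (FakeCloudError,)

    @classmethod
    def is_cloud_exception(cls, exception):
        # Match the known classes, plus *bare* Exception instances only.
        # Subclasses such as ValueError are not treated as cloud errors.
        return (isinstance(exception, cls.CLOUD_ERRORS)
                or type(exception) is Exception)


def call_cloud(driver_class, func):
    """Run func, swallowing only errors that look like cloud trouble."""
    try:
        return func()
    except Exception as error:
        if not driver_class.is_cloud_exception(error):
            raise
        return None
```

Checking type(exception) is Exception, rather than isinstance, is what keeps this narrower than a blanket "except Exception".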
Brett Smith [Thu, 18 Dec 2014 16:04:23 +0000 (11:04 -0500)]
4800: run-command calls sys.exit() with an integer.
Closes #4800.
Brett Smith [Thu, 18 Dec 2014 15:42:56 +0000 (10:42 -0500)]
4818: Add missing timeout in Node Manager test.
Refs #4818.
Tom Clegg [Thu, 18 Dec 2014 15:14:59 +0000 (10:14 -0500)]
Merge branch '4515-search-empty-project' closes #4515
Tom Clegg [Wed, 17 Dec 2014 14:44:27 +0000 (09:44 -0500)]
Remove excess heading and divider. No issue #
Tom Clegg [Wed, 17 Dec 2014 14:41:49 +0000 (09:41 -0500)]
Restore scroll on projects menu. Do not offer "Add project" button in projects dropdown in chooser dialog. closes #4811
Tom Clegg [Tue, 16 Dec 2014 16:37:01 +0000 (11:37 -0500)]
Ignore .eggs/ (*.egg isn't enough: .eggs/README.txt gets installed too, as a human-readable .gitignore.)
No issue #
Tom Clegg [Tue, 16 Dec 2014 16:16:59 +0000 (11:16 -0500)]
4515: Add controller tests for search dialog.
Brett Smith [Mon, 15 Dec 2014 19:46:43 +0000 (14:46 -0500)]
4818: Node Manager unpairs Arvados node when cloud node shuts down.
Without this, Node Manager doesn't correctly pair the Arvados node
with a new cloud node that's booted later. Closes #4818.
Tom Clegg [Mon, 15 Dec 2014 19:40:32 +0000 (14:40 -0500)]
4754: Move perf/prof deps to :test/:performance groups.
Move "do not reset unless Rails.env==test" logic into one place.
Brett Smith [Mon, 15 Dec 2014 17:58:52 +0000 (12:58 -0500)]
Merge branch '4481-update-user-docs-TC'
Closes #4741, #4790.
Tom Clegg [Mon, 15 Dec 2014 17:18:33 +0000 (12:18 -0500)]
4481: Fix ambiguous "Keep id" -> "locator" in example scripts.
Brett Smith [Wed, 10 Dec 2014 21:40:13 +0000 (16:40 -0500)]
4481: Refresh Crunch script tutorial page.
* The script now normalizes the output path, for consistency with
other scripts, and it looks nicer.
* Modernize the job log output slightly, and adjust text to match.
Brett Smith [Mon, 8 Dec 2014 19:37:23 +0000 (14:37 -0500)]
4481: Update tutorial Crunch scripts to use newer PySDK methods.
Most focus is on the Collection file methods added in #3603.
Brett Smith [Mon, 15 Dec 2014 17:57:25 +0000 (12:57 -0500)]
Merge branch '4792-arv-ls-normalize-wip'
Closes #4792, #4813.
Phil Hodgson [Sun, 14 Dec 2014 12:47:02 +0000 (13:47 +0100)]
4358: fixed: the provenance graph was being generated twice, the second time for only one pipeline
Brett Smith [Fri, 12 Dec 2014 23:01:39 +0000 (18:01 -0500)]
4792: arv-ls normalizes the collection before listing.
I sort of ended up rewriting arv-ls to make it testable, but hey,
that's part of the support task. Normalization is the only functional
change I made.
Radhika Chippada [Fri, 12 Dec 2014 21:29:50 +0000 (16:29 -0500)]
closes #4414
Merge branch '4414-add-new-in-project-dropdown'
Radhika Chippada [Fri, 12 Dec 2014 21:20:44 +0000 (16:20 -0500)]
4414: use ensure_unique_name option to instruct api server to create unique name for new project.
Radhika Chippada [Fri, 12 Dec 2014 19:02:03 +0000 (14:02 -0500)]
4414: add "Add a new project" link to project dropdown.
Radhika Chippada [Fri, 12 Dec 2014 14:26:03 +0000 (09:26 -0500)]
closes #4476
closes #4804
Merge branch '4476-search-next-page-issue'
Radhika Chippada [Fri, 12 Dec 2014 14:22:14 +0000 (09:22 -0500)]
closes #4799
Merge branch '4799-move-selected-error'
Radhika Chippada [Fri, 12 Dec 2014 14:14:02 +0000 (09:14 -0500)]
refs #4754
Merge branch '4754-performance-benchmarks'
Radhika Chippada [Fri, 12 Dec 2014 14:13:31 +0000 (09:13 -0500)]
Merge branch 'master' into 4754-performance-benchmarks
Radhika Chippada [Thu, 11 Dec 2014 21:46:36 +0000 (16:46 -0500)]
4804: search dialog retains project_uuid param in next_page_href.
Radhika Chippada [Thu, 11 Dec 2014 19:38:21 +0000 (14:38 -0500)]
4476: include filters in search next_page_href url.
Radhika Chippada [Thu, 11 Dec 2014 17:26:30 +0000 (12:26 -0500)]
4799: do not offer "move selected" option when current user cannot write to the project.
Brett Smith [Thu, 11 Dec 2014 15:54:28 +0000 (10:54 -0500)]
4027: Document arvados_sdk_version's virtualenv requirement.
Refs #4027.
Radhika Chippada [Thu, 11 Dec 2014 14:16:26 +0000 (09:16 -0500)]
4754: add a comment with the command used to run diagnostics tests.
Radhika Chippada [Wed, 10 Dec 2014 22:41:32 +0000 (17:41 -0500)]
4754: update assertion to look for a data-object-uuid
Radhika Chippada [Wed, 10 Dec 2014 22:18:19 +0000 (17:18 -0500)]
4754: support RAILS_ENV=performance
Radhika Chippada [Wed, 10 Dec 2014 19:18:24 +0000 (14:18 -0500)]
Merge branch 'master' into 4754-performance-benchmarks
Tim Pierce [Wed, 10 Dec 2014 16:26:10 +0000 (11:26 -0500)]
Merge branch '4499-one-task-per-input-file-normalize'
Fixes #4499.
Ward Vandewege [Wed, 10 Dec 2014 14:49:09 +0000 (09:49 -0500)]
Download bwa and samtools from a self-hosted mirror; sf.net downloads
are way too unreliable.
No issue #
Brett Smith [Wed, 10 Dec 2014 13:04:08 +0000 (08:04 -0500)]
Merge branch '4293-node-manager-timed-bootstrap-wip'
Closes #4293, #4732. Refs #4380.
Brett Smith [Fri, 5 Dec 2014 22:27:37 +0000 (17:27 -0500)]
4293: Node Manager shuts down nodes that fail to boot.
This helps Node Manager detect and correct when a node fails to
bootstrap.
Brett Smith [Fri, 5 Dec 2014 22:45:13 +0000 (17:45 -0500)]
4380: Node Manager SLURM dispatcher proceeds from more states.
Per discussion with Ward. Our main concern is that Node Manager
shouldn't shut down nodes that are doing work. We feel comfortable
broadening the definition of "not doing work" to this set of states.
Tom Clegg [Wed, 10 Dec 2014 07:15:13 +0000 (02:15 -0500)]
Merge branch '3781-browser-upload' closes #3781
Tom Clegg [Wed, 10 Dec 2014 05:54:23 +0000 (00:54 -0500)]
3781: Fix test that assumes only one empty collection is readable.
Wait longer for browser timeout in upload-fail test.
Tom Clegg [Wed, 10 Dec 2014 05:00:51 +0000 (00:00 -0500)]
3781: Merge branch 'master' into 3781-browser-upload
Radhika Chippada [Wed, 10 Dec 2014 01:31:02 +0000 (20:31 -0500)]
4754: move search test into browsing_test.rb instead of having its own file.
Radhika Chippada [Tue, 9 Dec 2014 21:57:06 +0000 (16:57 -0500)]
4754: search_test assertions
Tom Clegg [Tue, 9 Dec 2014 21:44:50 +0000 (16:44 -0500)]
3781: Add singletest function.
Tom Clegg [Tue, 9 Dec 2014 21:44:00 +0000 (16:44 -0500)]
3781: Add test cases: empty files, renaming, and error reporting.
Radhika Chippada [Tue, 9 Dec 2014 21:18:50 +0000 (16:18 -0500)]
4754: add rails-perftest and ruby-prof gems to enable performance benchmarking and add search_test.rb
Tim Pierce [Mon, 8 Dec 2014 18:53:02 +0000 (13:53 -0500)]
4499: Normalize manifest in one_task_per_input_file
* arvados.job_setup.one_task_per_input_file now calls cr.normalize()
before creating tasks.
* Added unit test in test_sdk.py to confirm that the expected number of
tasks is created when called on a normalized manifest.
Brett Smith [Tue, 9 Dec 2014 15:21:28 +0000 (10:21 -0500)]
4027: Update arvados-cli in API server bundle.
Refs #4027.
Brett Smith [Tue, 9 Dec 2014 15:14:30 +0000 (10:14 -0500)]
Merge branch '4027-crunch-sdk-install-wip'
Closes #4027, #4667.
Brett Smith [Mon, 8 Dec 2014 15:45:23 +0000 (10:45 -0500)]
4027: crunch-job logs its own version information.
By request in code review, to help detect situations where
crunch-dispatch and crunch-job are out of sync.
Brett Smith [Mon, 24 Nov 2014 21:55:38 +0000 (16:55 -0500)]
4027: Crunch installs jobs' requested arvados_sdk_version.
* crunch-dispatch fetches the requested SDK version into its internal
git repository, just like it does for the Crunch script. Refactored
crunch-dispatch to make that code reusable.
* crunch-job's main script archives the sdk subdirectory as of that
commit, sending it along to compute nodes in the same .tar as the
Crunch script, under .arvados.sdk.
* crunch-job's __DATA__ dispatch section looks for the SDK under
.arvados.sdk, and installs it as much as possible.
Since I was messing with it so much already, I changed the semantics
of crunch-job's __DATA__ section: it is now either in installation
mode or run mode, based on whether there's anything in @ARGV. I
confirmed that this is consistent with current calls to the section.
Brett Smith [Mon, 8 Dec 2014 23:15:39 +0000 (18:15 -0500)]
4027: arvados/jobs includes virtualenv.
This lets you use the arvados/jobs image with the arvados_sdk_version
feature of Crunch.
Brett Smith [Mon, 24 Nov 2014 20:53:44 +0000 (15:53 -0500)]
4027: Bugfix update-gitolite.rb in Docker.
* Load a YAML library.
* Support ARVADOS_API_HOST_INSECURE, and set it in normal Docker use.
Brett Smith [Mon, 24 Nov 2014 20:53:00 +0000 (15:53 -0500)]
4027: Revamp SSH use in our Docker images.
* Don't install or run SSH in most of our Docker images. `docker
exec` is now preferred to inspect running images.
* Do run SSH on the API server, always, for Gitolite.
There is a feature regression here: the user's SSH key is not
automatically installed on the shell account. This needs to be fixed
another way. In the meantime, it's not difficult to run
`docker exec -ti --user=self shell /bin/bash`, and you can clone the
repository from the host system.
Tom Clegg [Tue, 9 Dec 2014 08:37:48 +0000 (03:37 -0500)]
3781: Fix trigger() usage: second argument is an array of handler args.
Tom Clegg [Tue, 9 Dec 2014 08:02:15 +0000 (03:02 -0500)]
3781: Fix progress% (100, not NaN) and manifest format (>=1 data locator) for zero-byte files.
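The progress half of this fix amounts to guarding the division for empty files. The sketch below is Python with a hypothetical function name; the upload code itself is browser JavaScript, where 0/0 yields NaN, whereas Python would raise ZeroDivisionError.

```python
def upload_progress_percent(bytes_sent, bytes_total):
    """Percentage complete, treating a zero-byte file as already done."""
    if bytes_total == 0:
        return 100.0
    return 100.0 * bytes_sent / bytes_total


empty_file = upload_progress_percent(0, 0)
half_done = upload_progress_percent(512, 1024)
```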
Tim Pierce [Mon, 8 Dec 2014 22:02:33 +0000 (17:02 -0500)]
Merge branch '4269-no-collection-uuid-in-script-params'
Refs #4269.
Tim Pierce [Mon, 8 Dec 2014 21:44:17 +0000 (16:44 -0500)]
4269: clean up uuid regex matching
Code review feedback:
* Improved name for validation "no_collection_uuids" to
"ensure_no_collection_uuids_in_script_params"
* Added ArvadosModel.uuid_regex (along the lines of uuid_like_pattern)
and substituted it for hardcoded uuid regexes throughout the code.
Radhika Chippada [Mon, 8 Dec 2014 21:34:05 +0000 (16:34 -0500)]
closes #4477
closes #4719
Merge branch '4477-no-job-log'
Tom Clegg [Mon, 8 Dec 2014 20:26:52 +0000 (15:26 -0500)]
3781: Go to "Done!" state when the last upload completes despite a late call to stop().
Tom Clegg [Mon, 8 Dec 2014 18:59:45 +0000 (13:59 -0500)]
3781: Fix use of committed flag. That is now called state==="Done".