Radhika Chippada [Tue, 3 Mar 2015 17:59:56 +0000 (12:59 -0500)]
Merge branch 'master' into 5349-timestamp-error-for-running-pipeline
Radhika Chippada [Tue, 3 Mar 2015 17:57:31 +0000 (12:57 -0500)]
3761: Pass pullq to RunPullWorker
Radhika Chippada [Tue, 3 Mar 2015 17:26:59 +0000 (12:26 -0500)]
Merge branch 'master' into 3761-pull-list-worker
Radhika Chippada [Tue, 3 Mar 2015 17:26:21 +0000 (12:26 -0500)]
3761: pass keepclient as an arg to RunPullWorker
Peter Amstutz [Tue, 3 Mar 2015 17:16:55 +0000 (12:16 -0500)]
Merge branch '5322-sso-manual-account-doc' closes #5322
Peter Amstutz [Tue, 3 Mar 2015 17:13:35 +0000 (12:13 -0500)]
Merge branch '5305-arv-copy-fixes' closes #5305
Peter Amstutz [Tue, 3 Mar 2015 17:10:09 +0000 (12:10 -0500)]
Fixed SafeApi -> ThreadSafeApiCache refs #4823
Peter Amstutz [Tue, 3 Mar 2015 17:06:28 +0000 (12:06 -0500)]
Fix arv-mount: use arvados.config.settings() to initialize ThreadSafeApiCache
refs #4823
Peter Amstutz [Tue, 3 Mar 2015 16:42:10 +0000 (11:42 -0500)]
5305: Remove erroneous comment
Peter Amstutz [Tue, 3 Mar 2015 16:39:03 +0000 (11:39 -0500)]
4956: Add 'maxRequestSize' to discovery document
Peter Amstutz [Tue, 3 Mar 2015 16:22:47 +0000 (11:22 -0500)]
5305: Added num_retries to all execute() calls. Refactored
collection-name-choosing logic to be easier to follow.
Radhika Chippada [Tue, 3 Mar 2015 16:05:53 +0000 (11:05 -0500)]
Merge branch 'master' into 3761-pull-list-worker
Radhika Chippada [Tue, 3 Mar 2015 16:05:11 +0000 (11:05 -0500)]
3761: improved tests with delays
Tom Clegg [Tue, 3 Mar 2015 15:56:52 +0000 (10:56 -0500)]
5043: Split long stderr lines rather than consume unlimited memory.
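A minimal sketch of the idea behind 5043, written in Python for illustration only; it is not the committed code, and `bounded_lines` and `max_len` are hypothetical names. Reading with a size cap means one very long line is emitted as several bounded pieces instead of being buffered whole.

```python
import io

def bounded_lines(stream, max_len):
    """Yield lines from a binary stream, splitting any line longer than max_len."""
    if max_len < 1:
        raise ValueError("max_len must be positive")
    while True:
        # readline(n) returns at most n bytes and stops early at a newline,
        # so memory use stays bounded by max_len even for an endless line.
        piece = stream.readline(max_len)
        if not piece:
            return
        yield piece

# A 10-byte line is split into pieces of at most 4 bytes; short lines pass through.
pieces = list(bounded_lines(io.BytesIO(b"aaaaaaaaaa\nbb\n"), 4))
```

A caller that needs to tell split pieces from complete lines can check whether a piece ends with a newline.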
Ward Vandewege [Tue, 3 Mar 2015 15:16:23 +0000 (10:16 -0500)]
Follow the naming conventions for hostnames; add SSO server as a
public-facing service that requires an SSL certificate.
refs #5322
Brett Smith [Tue, 3 Mar 2015 15:10:10 +0000 (10:10 -0500)]
Merge branch '5313-node-manager-gce-fixups2-wip'
Refs #5313.
Brett Smith [Tue, 3 Mar 2015 15:08:17 +0000 (10:08 -0500)]
5313: Rely more on datacenter constructor in Node Manager GCE driver.
When initialized with a datacenter argument, the GCE libcloud driver
acts a lot more like the EC2 one. Many listings are implicitly
limited to that zone, saving us the need of limiting searches
ourselves. Let's rely on libcloud instead of our own code.
Brett Smith [Tue, 3 Mar 2015 15:06:24 +0000 (10:06 -0500)]
5313: Revert Node Manager's GCE boot disk destroy code.
After upgrading to libcloud>=0.16, it's redundant to create a node
with ex_disk_auto_delete=True, then destroy the node with
destroy_boot_disk=True. During the destroy process, libcloud will
fail to destroy the boot disk, because Google has already deleted it.
ex_disk_auto_delete is closer to what we want, so just rely on that.
Peter Amstutz [Tue, 3 Mar 2015 14:57:08 +0000 (09:57 -0500)]
Update arvados-fuse dependency on python sdk refs #4823
Peter Amstutz [Tue, 3 Mar 2015 14:34:50 +0000 (09:34 -0500)]
Merge branch '4823-python-sdk-writable-collection-api' closes #4823
Peter Amstutz [Tue, 3 Mar 2015 14:34:05 +0000 (09:34 -0500)]
4823: Docstring and comment fixes.
Tom Clegg [Tue, 3 Mar 2015 02:35:35 +0000 (21:35 -0500)]
5043: Use Go's log package to serialize writes. Lose logChan.
Radhika Chippada [Tue, 3 Mar 2015 02:09:42 +0000 (21:09 -0500)]
3761: code refactoring
Radhika Chippada [Mon, 2 Mar 2015 21:11:45 +0000 (16:11 -0500)]
Merge branch 'master' into 3761-pull-list-worker
Radhika Chippada [Mon, 2 Mar 2015 21:04:17 +0000 (16:04 -0500)]
5349: Reverted "Time.iso8601(current_job[:created_at])" back to "current_job[:created_at]". All tests and manual testing passed, and no negative side effects were observed.
Peter Amstutz [Mon, 2 Mar 2015 20:51:55 +0000 (15:51 -0500)]
4823: More fixes and cleanups.
* Renamed SynchronizedCollectionBase to RichCollectionBase
* Renamed arvapi parameter of one_task_per_input_file to api_client
* KeepLocator.stripped() returns bare hash if self.size is None
* Permit closing an ArvadosFileWriter more than once
* Fix various docstrings
* Strive to follow PEP 8 spacing guidelines
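The `stripped()` item in the list above can be pictured with a small stand-in class. This is a sketch under assumptions: `LocatorSketch` is a hypothetical name, not the SDK's KeepLocator, and it models only the hash and size fields.

```python
class LocatorSketch(object):
    """Hypothetical stand-in for a Keep locator with an optional size."""

    def __init__(self, md5sum, size=None):
        self.md5sum = md5sum
        self.size = size

    def stripped(self):
        # With no known size, return the bare hash rather than
        # formatting None into the locator string.
        if self.size is None:
            return self.md5sum
        return "%s+%d" % (self.md5sum, self.size)

bare = LocatorSketch("acbd18db4cc2f85cedef654fccc4a4d8").stripped()
sized = LocatorSketch("acbd18db4cc2f85cedef654fccc4a4d8", 3).stripped()
```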
Peter Amstutz [Mon, 2 Mar 2015 20:31:18 +0000 (15:31 -0500)]
Merge branch 'origin-5309-keepproxy-panic' closes #5309
Peter Amstutz [Mon, 2 Mar 2015 20:22:39 +0000 (15:22 -0500)]
5309: Add comment about testing for content-length error bug.
Tom Clegg [Mon, 2 Mar 2015 19:42:35 +0000 (14:42 -0500)]
5043: Accept long stderr lines from crunch tasks.
Brett Smith [Mon, 2 Mar 2015 19:08:02 +0000 (14:08 -0500)]
Merge branch '4751-node-manager-stricter-node-pairing-wip'
Closes #4751, #5351.
Brett Smith [Mon, 2 Mar 2015 16:27:59 +0000 (11:27 -0500)]
4751: Node Manager considers ping times for stricter node pairing.
Because the pairing decision is currently based on IP address alone,
Node Manager will occasionally pair a cloud node with the wrong
Arvados node after an IP address is reused. Fix that by bringing the
node's first_ping_at into consideration: if it's older than the cloud
node, refuse to pair.
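The pairing rule described above can be sketched roughly as follows. This is an illustration, not Node Manager's code: `may_pair` and its parameter names are hypothetical, and both timestamps are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone

def may_pair(cloud_ip, cloud_started_at, arvados_ip, first_ping_at):
    """Return True if an Arvados node record may be paired with a cloud node."""
    if cloud_ip != arvados_ip:
        return False
    # A first ping older than the cloud node itself must have come from an
    # earlier machine that held this IP address, so refuse to pair.
    return first_ping_at >= cloud_started_at

started = datetime(2015, 3, 2, 16, 0, tzinfo=timezone.utc)
stale = may_pair("10.0.0.5", started, "10.0.0.5", started - timedelta(hours=1))
fresh = may_pair("10.0.0.5", started, "10.0.0.5", started + timedelta(minutes=2))
```

Here `stale` is False because the record's first ping predates the cloud node, while `fresh` is True.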
Brett Smith [Mon, 2 Mar 2015 19:07:29 +0000 (14:07 -0500)]
Merge branch '5313-node-manager-gce-fixes-wip'
Closes #5313, #5350.
Radhika Chippada [Mon, 2 Mar 2015 18:37:04 +0000 (13:37 -0500)]
3761: use SignLocator
Brett Smith [Mon, 2 Mar 2015 15:37:42 +0000 (10:37 -0500)]
5313: Node Manager's GCE driver destroys boot disks reliably.
This more closely matches the behavior of the EC2 driver, which we
want.
* Upgrade to libcloud 0.16, which adds an ex_disk_auto_delete argument
to GCE's create_node method, with True as the default.
* Set destroy_boot_disk=True when calling destroy_node().
Brett Smith [Mon, 2 Mar 2015 15:29:14 +0000 (10:29 -0500)]
5313: Rename Node Manager's `user-data` GCE tag to `arv-ping-url`.
`user-data` is an EC2-specific name. `arv-ping-url` more clearly
describes what's in it.
Peter Amstutz [Fri, 27 Feb 2015 20:14:35 +0000 (20:14 +0000)]
5305: Add heuristics to choose name when collection is referenced by PDH instead of uuid
Radhika Chippada [Fri, 27 Feb 2015 19:27:55 +0000 (14:27 -0500)]
Merge branch 'master' into 3761-pull-list-worker
Radhika Chippada [Fri, 27 Feb 2015 19:27:15 +0000 (14:27 -0500)]
3761: additional tests
Brett Smith [Fri, 27 Feb 2015 19:23:02 +0000 (14:23 -0500)]
Merge branch '5283-crunch-collation-safety-wip'
Closes #5283, #5306.
Brett Smith [Fri, 27 Feb 2015 19:22:18 +0000 (14:22 -0500)]
5283: Log more crunch-job output handling.
Requested during code review.
Brett Smith [Wed, 25 Feb 2015 16:37:26 +0000 (11:37 -0500)]
5283: crunch-job doesn't use freeze logic after a job fails.
If the job has failed permanently, we want to go through all the
end-of-job logic. Previously, we were getting sidetracked into
freeze_if_want_freeze, which skips some steps like setting the
permanent job output record. Refs #4472.
Brett Smith [Fri, 27 Feb 2015 19:20:12 +0000 (14:20 -0500)]
5283: Improve reliability of crunch-job output collation.
* Check the results of all pipe opens, exit statuses, and writes.
Log any problems.
* Have fetch_block return undef when it encounters trouble, rather
than dying. create_output_collection already checks for this, so it
effectively bubbles up the error.
* Retry all of the associated API calls.
* Kill the manifest creation pipe if we give up on it, per the TODO.
This probably won't resolve #5283, but hopefully these changes will
give us additional information to help diagnose the problem.
Peter Amstutz [Fri, 27 Feb 2015 19:03:03 +0000 (19:03 +0000)]
5305: Handle collection pdh for docker image
Peter Amstutz [Fri, 27 Feb 2015 16:43:53 +0000 (11:43 -0500)]
5322: Add documentation to "install SSO" section. (Possibly this should go
into the admin guide; the admin guide is kind of useless right now.)
Radhika Chippada [Fri, 27 Feb 2015 16:12:29 +0000 (11:12 -0500)]
3761: Run pull list worker, which processes pull requests from the list.
Ward Vandewege [Thu, 26 Feb 2015 18:59:58 +0000 (13:59 -0500)]
Move the licensing info out of the second column and towards the footer of the page on the doc site.
No issue #
Ward Vandewege [Thu, 26 Feb 2015 18:29:44 +0000 (13:29 -0500)]
Merge branch '5310-arv-copy-by-pdh'
refs #5310
Ward Vandewege [Thu, 26 Feb 2015 18:27:26 +0000 (13:27 -0500)]
Merge branch 'master' into 5310-arv-copy-by-pdh
Peter Amstutz [Thu, 26 Feb 2015 18:19:50 +0000 (13:19 -0500)]
4823: Add flush() to ArvadosFile. Fix tests to avoid using internal APIs. Fix
import in _normalize_stream. Make KeepRequestError more generic (now
represents a list of "request errors" instead of "service errors").
Phil Hodgson [Thu, 26 Feb 2015 08:01:04 +0000 (09:01 +0100)]
Merge branch '4232-slow-pipes-n-jobs' closes #4232
Peter Amstutz [Wed, 25 Feb 2015 21:42:10 +0000 (16:42 -0500)]
5309: Fix keepclient and keepproxy bugs related to error handling:
* KeepClient: Handle nil response and nil response Body from Client.Do(GET)
* KeepProxy: Only defer reader.Close() if reader is not nil
* KeepProxy tests: Add test for GET failure to read (404)
Peter Amstutz [Wed, 25 Feb 2015 19:09:09 +0000 (14:09 -0500)]
5310: Use c.get('name') instead of c['name']
* 'name' isn't necessarily present when obj_uuid is a PDH,
src.collections().get(uuid=obj_uuid).execute() may return a synthetic record
without a name field.
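A short sketch of why `.get('name')` is the safer lookup here. The record below is a made-up stand-in for a synthetic response and `collection_name` is a hypothetical helper; neither is copied from arv-copy.

```python
def collection_name(record, default=None):
    """Return the record's name, or default when the field is absent."""
    # record['name'] would raise KeyError for a record without a name;
    # dict.get returns the default instead.
    return record.get('name', default)

# Assumed shape of a synthetic record fetched by portable data hash.
synthetic = {'portable_data_hash': 'd41d8cd98f00b204e9800998ecf8427e+0'}
named = {'uuid': 'example-uuid', 'name': 'my collection'}

missing = collection_name(synthetic)
present = collection_name(named)
```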
Peter Amstutz [Wed, 25 Feb 2015 16:00:05 +0000 (11:00 -0500)]
4823: Add arvapi parameter to one_task_per_input_file() to solve mocking
problems. Refactor Collection.copy() and add Collection.add(). Fix docstring
bugs.
Peter Amstutz [Wed, 25 Feb 2015 14:39:36 +0000 (09:39 -0500)]
Merge branch 'master' into 4823-python-sdk-writable-collection-api
Phil Hodgson [Wed, 25 Feb 2015 08:12:50 +0000 (09:12 +0100)]
Merge branch 'master' into 4232-slow-pipes-n-jobs
Phil Hodgson [Wed, 25 Feb 2015 08:10:41 +0000 (09:10 +0100)]
4232: remove traces of no-longer-needed "dependency" code for pipeline_instances
Peter Amstutz [Tue, 24 Feb 2015 22:13:33 +0000 (17:13 -0500)]
4823: Remove sync_mode() from Collection in favor of writable() flag.
Collection constructor raises ArgumentError() on bad manifest. Fix
assertEquals() -> assertEqual().
Peter Amstutz [Tue, 24 Feb 2015 14:22:31 +0000 (09:22 -0500)]
Merge branch '3785-job-log-collection-owner' closes #3785
Peter Amstutz [Mon, 23 Feb 2015 21:49:23 +0000 (16:49 -0500)]
Merge branch '4520-arv-copy-project-uuid' closes #4520
Peter Amstutz [Mon, 23 Feb 2015 21:48:44 +0000 (16:48 -0500)]
4520: Tweak test that put(u'foo') does the right thing.
Peter Amstutz [Mon, 23 Feb 2015 21:36:09 +0000 (16:36 -0500)]
4520: Coerce unicode strings to ascii in put(). Use result.content (returns
literal result bytes) instead of result.text (returns unicode) when processing
Keep responses.
Peter Amstutz [Mon, 23 Feb 2015 21:09:48 +0000 (16:09 -0500)]
3785: Log tab is no longer suppressed for anonymous users.
Peter Amstutz [Mon, 23 Feb 2015 20:17:19 +0000 (15:17 -0500)]
4520: manifest_text() is utf-8 encoded by default so it can be safely put() to
Keep. Add test that calling put() with a unicode string raises an error.
Fetching user uuid in arv-copy uses num_retries.
Peter Amstutz [Mon, 23 Feb 2015 19:17:12 +0000 (14:17 -0500)]
4520: Better checking to see if collection already exists at the destination.
Set args.project_uuid to default value (current user uuid) if not set on
command line to simplify code.
Peter Amstutz [Mon, 23 Feb 2015 18:55:41 +0000 (13:55 -0500)]
4520: Bonus fix because arv-copy was giving KeepClient.put() unicode strings
instead of byte strings. KeepClient will now reject that.
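The stricter behavior can be sketched in modern Python terms, where the unicode/byte distinction is `str` versus `bytes`. `put_block` is a hypothetical function, not KeepClient.put(); the exception type and the local hash computation are assumptions made for illustration.

```python
import hashlib

def put_block(data):
    """Return a hash+size locator for data, accepting byte strings only."""
    if not isinstance(data, bytes):
        # Text has no single byte representation, so the caller must
        # choose an encoding before the data is hashed and stored.
        raise TypeError("put_block() requires bytes, got %s" % type(data).__name__)
    return "%s+%d" % (hashlib.md5(data).hexdigest(), len(data))

locator = put_block(u"foo".encode("utf-8"))

try:
    put_block(u"foo")
    rejected = False
except TypeError:
    rejected = True
```

Encoding at the call site keeps the hash and the stored bytes consistent, which is why a caller would encode text such as a manifest before calling put().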
Peter Amstutz [Mon, 23 Feb 2015 18:54:47 +0000 (13:54 -0500)]
4520: Refactor code to create the collection record. Also refactored code
which creates Docker metadata links so that copying any collection which
represents a Docker image will also copy over the metadata.
Peter Amstutz [Mon, 23 Feb 2015 16:45:32 +0000 (11:45 -0500)]
3785: Upload job log to collection with --project-uuid set to the same owner_uuid as the job.
Peter Amstutz [Mon, 23 Feb 2015 15:48:04 +0000 (10:48 -0500)]
Merge branch '5277-fuse-mtime-fix' closes #5277
Peter Amstutz [Mon, 23 Feb 2015 15:16:38 +0000 (10:16 -0500)]
Merge branch 'master' into 4823-python-sdk-writable-collection-api
Peter Amstutz [Mon, 23 Feb 2015 15:16:08 +0000 (10:16 -0500)]
4823: Clean up imports in collection.py
Peter Amstutz [Mon, 23 Feb 2015 14:44:27 +0000 (09:44 -0500)]
4823: Handle edge cases of files named '.' so that the FUSE test passes. Added
tests for invalid manifests. Defer populating CollectionReader streams until
needed to avoid extra copy of the manifest.
Phil Hodgson [Sat, 21 Feb 2015 09:16:31 +0000 (10:16 +0100)]
4232: revert experimental change to using find? for each of the jobs in a pipeline, rather than simply a where clause: there is no evidence that this switch to find? was helping to speed up anything overall
Phil Hodgson [Sat, 21 Feb 2015 09:11:43 +0000 (10:11 +0100)]
Merge branch 'master' into 4232-slow-pipes-n-jobs
Peter Amstutz [Fri, 20 Feb 2015 20:58:14 +0000 (15:58 -0500)]
4823: Fix tests broken by prior refactoring. Renamed 'api.py' to 'apisetup.py'
so it wouldn't be awkwardly shadowed by the 'api' method.
Peter Amstutz [Fri, 20 Feb 2015 18:29:10 +0000 (13:29 -0500)]
4823: Refactoring. ReadOnly Collection is now CollectionReader, replacing old
CollectionReader implementation (but maintains the same public API).
Ward Vandewege [Thu, 19 Feb 2015 19:46:58 +0000 (14:46 -0500)]
Fix typo on doc homepage.
No issue #
Peter Amstutz [Thu, 19 Feb 2015 15:20:34 +0000 (10:20 -0500)]
5277: Add test for mtime. Use ciso8601 module to parse arvados timestamps.
Peter Amstutz [Thu, 19 Feb 2015 14:38:53 +0000 (09:38 -0500)]
4823: Revert FUSE changes unrelated to ThreadSafeApiCache
Peter Amstutz [Thu, 19 Feb 2015 14:32:57 +0000 (09:32 -0500)]
Merge branch 'master' into 4823-python-sdk-writable-collection-api
Peter Amstutz [Thu, 19 Feb 2015 14:32:17 +0000 (09:32 -0500)]
4823: Fix fuse tests for SafeApi -> ThreadSafeApiCache changes. Add a couple
of mtime assertions.
Radhika Chippada [Wed, 18 Feb 2015 16:52:48 +0000 (11:52 -0500)]
closes #5197
Merge branch '5197-collection-name-owner-unique'
Brett Smith [Wed, 18 Feb 2015 15:31:00 +0000 (10:31 -0500)]
4759: Update Node Manager to parse new Arvados API timestamps.
Refs #4759.
Nancy Ouyang [Tue, 17 Feb 2015 23:16:24 +0000 (18:16 -0500)]
closes #5243, #5194 Merge branch '5194-quickfix-disambiguate-gettingstarted-sections'
mishaz [Tue, 17 Feb 2015 23:07:14 +0000 (23:07 +0000)]
Merge branch '3408-production-datamanager' refs #3408
Nancy Ouyang [Tue, 17 Feb 2015 23:06:45 +0000 (18:06 -0500)]
5194: minor fixes
mishaz [Tue, 17 Feb 2015 23:06:32 +0000 (23:06 +0000)]
Merge branch 'master' into 3408-production-datamanager refs #3408
mishaz [Tue, 17 Feb 2015 23:02:36 +0000 (23:02 +0000)]
Changes to allow datamanager to run indefinitely:
Logger's worker goroutine returns after final write.
minutes-between-runs flag specifies how many minutes to wait between runs (0 means don't loop)
Brett Smith [Tue, 17 Feb 2015 22:54:59 +0000 (17:54 -0500)]
Merge branch '4138-node-manager-gce-wip'
Closes #4138, #5222.
Thanks, Tim.
Brett Smith [Mon, 16 Feb 2015 16:06:41 +0000 (11:06 -0500)]
4138: Prepare Node Manager GCE driver for production.
* Set node metadata in more appropriate places.
* Bridge more differences between GCE and EC2, like the fact that
sizes are listed for each location they're available, and GCE
doesn't provide node boot times.
* Use more infrastructure from BaseComputeNodeDriver to reduce code
duplication.
* Load as many objects as possible at initialization time, to reduce
API overhead of creating nodes.
Brett Smith [Fri, 13 Feb 2015 20:24:04 +0000 (15:24 -0500)]
4138: Revamp Node Manager driver proxying in BaseComputeNodeDriver.
Accessing attributes through a super() proxy does not invoke
__getattr__ on base classes, so the old implementation made it
impossible for subclasses to be agnostic about whether a method was
implemented in BaseComputeNodeDriver or the real libcloud driver.
This version makes that possible. It's also a little nicer because
now the class will report these method names to dir(), hasattr(), etc.
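The super() behavior described above can be reproduced in a few lines. All class names here are hypothetical and the delegation helper is only one possible approach, not necessarily the one taken in BaseComputeNodeDriver.

```python
class RealDriver(object):
    def list_nodes(self):
        return ["node-from-real-driver"]

class GetattrProxy(object):
    """Old style: forward unknown attributes at lookup time."""
    def __init__(self):
        self.real = RealDriver()
    def __getattr__(self, name):
        return getattr(self.real, name)

class GetattrChild(GetattrProxy):
    def list_nodes(self):
        # super() only searches class dictionaries in the MRO and never
        # falls back to __getattr__, so this raises AttributeError.
        return super(GetattrChild, self).list_nodes()

def _delegate(name):
    # Build a real method that forwards to the wrapped driver.
    def method(self, *args, **kwargs):
        return getattr(self.real, name)(*args, **kwargs)
    method.__name__ = name
    return method

class ExplicitProxy(object):
    """New style: proxied methods exist as real class attributes."""
    def __init__(self):
        self.real = RealDriver()
    list_nodes = _delegate("list_nodes")

class ExplicitChild(ExplicitProxy):
    def list_nodes(self):
        return super(ExplicitChild, self).list_nodes() + ["extra"]

try:
    GetattrChild().list_nodes()
    super_failed = False
except AttributeError:
    super_failed = True

result = ExplicitChild().list_nodes()
```

Because `list_nodes` is a real attribute of `ExplicitProxy`, it also shows up in `dir(ExplicitProxy)`, which matches the commit's remark about dir() and hasattr().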
Brett Smith [Thu, 12 Feb 2015 22:22:12 +0000 (17:22 -0500)]
4138: Refactor out Node Manager DriverTestMixin.
Brett Smith [Thu, 12 Feb 2015 20:53:16 +0000 (15:53 -0500)]
4138: Fix noop Node Manager EC2 driver tests.
The previous tests simply instantiated the driver, then checked that a
mock method was truthy (which it will always be). This makes the test
work as intended.
Brett Smith [Fri, 13 Feb 2015 21:00:30 +0000 (16:00 -0500)]
4138: Refactor common Node Manager driver initialization to base driver.
Brett Smith [Wed, 11 Feb 2015 20:12:37 +0000 (15:12 -0500)]
4138: Simplify Node Manager GCE credential handling.
Because libcloud's GCE driver accepts a key path as a constructor
argument, it's relatively straightforward to put all the constructor
arguments directly in the Node Manager configuration. No need to
parse out JSON.
Tim Pierce [Fri, 23 Jan 2015 22:44:41 +0000 (17:44 -0500)]
4138: updated unit test
Corrected test_create_includes_ping_secret to account for delivering the
ping secret via metadata in GCE.
Tim Pierce [Fri, 23 Jan 2015 22:24:54 +0000 (17:24 -0500)]
4138: GCE fixes
The 'network_id' parameter needs to be delivered as 'location' in GCE.
The ping_url parameter is now delivered in the node metadata as
'pingUrl'.
When creating a new GCE instance, 'name' is a required parameter and
must begin with a letter. The default name is the UUID of the
corresponding Arvados node, prepended with 'arv-'.
Tim Pierce [Wed, 21 Jan 2015 18:06:35 +0000 (13:06 -0500)]
4138: general GCE fixes
* JSON credential file
** GCE credentials are delivered as a JSON string (and the key is formatted as a multi-line RSA private key). Let the GCE config file specify a path to the JSON credential file for simplicity.
* Accept NodeSizes addressed by id or name
** In EC2, NodeSizes are identified by the 'id' field. In GCE they are identified by the 'name' field. Allow the Node Manager config module to accept either.
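Accepting a size by either field could look roughly like the sketch below. `Size` is a stand-in namedtuple rather than libcloud's NodeSize, `find_size` is a hypothetical helper, and the id and name values are illustrative only.

```python
from collections import namedtuple

# Stand-in for a cloud size object; only the two fields that matter here.
Size = namedtuple("Size", ["id", "name"])

def find_size(sizes, key):
    """Return the first size whose id or name equals key."""
    for size in sizes:
        if key in (size.id, size.name):
            return size
    raise KeyError(key)

sizes = [Size(id="m3.medium", name="Medium Instance"),
         Size(id="3001", name="n1-standard-1")]

by_id = find_size(sizes, "m3.medium")
by_name = find_size(sizes, "n1-standard-1")
```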
Tim Pierce [Mon, 24 Nov 2014 22:12:07 +0000 (17:12 -0500)]
4138: code review feedback
Tim Pierce [Tue, 18 Nov 2014 18:49:10 +0000 (13:49 -0500)]
4138: support for Google Compute Engine.
* Added:
** nodemanager/arvnodeman/computenode/drivers/gce.py
** nodemanager/doc/gce.example.cfg
** nodemanager/tests/test_computenode_driver_gce.py
Updated comment in nodemanager/arvnodeman/computenode/drivers/ec2.py.
Nancy Ouyang [Tue, 17 Feb 2015 22:19:37 +0000 (17:19 -0500)]
5194: Quickfix, disambiguated getting started and user guide sections, added 'next steps' to getting started guide