arvados.git
Tom Clegg [Mon, 8 Feb 2016 01:15:00 +0000 (20:15 -0500)]
8341: Use a worker thread to get page N+1 of logs while parsing page N.
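
The prefetching idea in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `fetch_page` and `iter_log_lines` are hypothetical names, the page data is made up, and error handling in the worker thread is left out.

```python
import queue
import threading

def fetch_page(n):
    """Hypothetical stand-in for one logs API request; returns [] past the last page."""
    pages = [["line 1", "line 2"], ["line 3"]]
    return pages[n] if n < len(pages) else []

def iter_log_lines(fetch=fetch_page):
    """Yield log lines, fetching page N+1 in a worker thread while page N is parsed."""
    def start_fetch(n):
        # A one-slot queue hands the fetched page back to the consumer.
        box = queue.Queue(maxsize=1)
        worker = threading.Thread(target=lambda: box.put(fetch(n)), daemon=True)
        worker.start()
        return box

    n = 0
    pending = start_fetch(n)
    while True:
        page = pending.get()        # wait for page N
        if not page:
            return                  # an empty page marks the end (assumption)
        n += 1
        pending = start_fetch(n)    # start fetching page N+1 ...
        for line in page:           # ... while the caller parses page N
            yield line
```

Only one request is in flight at a time, so the server sees the same sequence of requests as before; the gain is that network latency overlaps with parsing.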

Tom Clegg [Mon, 8 Feb 2016 00:43:02 +0000 (19:43 -0500)]
8341: Get job log from logs API if the log has not been written to Keep yet.

Tom Clegg [Mon, 8 Feb 2016 19:29:03 +0000 (14:29 -0500)]
Merge branch '8289-no-extra-orders' closes #8289

Tom Clegg [Mon, 8 Feb 2016 19:28:02 +0000 (14:28 -0500)]
8289: Strip redundant orders, even when provided explicitly by client.

Tom Clegg [Sat, 23 Jan 2016 05:23:49 +0000 (00:23 -0500)]
8289: Do not add fallback orders if client already specified an unambiguous order.
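
One way to read the two 8289 commits is sketched below. The rules are assumptions, not taken from the commits: an order is treated as redundant if its column already appeared or if an earlier column is unique, and the fallback list and the choice of `uuid` as the unique column are illustrative. `normalize_orders` is a hypothetical name.

```python
def normalize_orders(orders, unique_columns=("uuid",),
                     fallback=("modified_at desc", "uuid")):
    """Return an order list with redundant entries removed.

    Each order is a string of the form "column [asc|desc]" (assumption).
    Fallback orders are appended only while the result is still ambiguous.
    """
    result = []
    seen = set()
    for order in list(orders) + list(fallback):
        column = order.split()[0]
        if column in seen:
            continue                # same column again: redundant
        seen.add(column)
        result.append(order)
        if column in unique_columns:
            break                   # ordering is now unambiguous; drop the rest
    return result
```

With these rules, `normalize_orders(["uuid desc", "created_at asc"])` keeps only `"uuid desc"`, and no fallback is added because the client's order is already unambiguous.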

Peter Amstutz [Mon, 8 Feb 2016 16:28:53 +0000 (11:28 -0500)]
Merge branch '7667-node-manager-logging' refs #7667

Peter Amstutz [Mon, 8 Feb 2016 16:28:11 +0000 (11:28 -0500)]
7667: Store node size in a table to avoid blocking on booting and shutdown
actors to ask for the node size.

Peter Amstutz [Mon, 8 Feb 2016 03:52:51 +0000 (22:52 -0500)]
7667: Fix log message

Tom Clegg [Sat, 6 Feb 2016 00:45:30 +0000 (19:45 -0500)]
Merge branch '8285-fuse-subscribe-websockets' closes #8285

Tom Clegg [Sat, 6 Feb 2016 00:39:42 +0000 (19:39 -0500)]
8285: Test that arvados.events.subscribe() is called only when needed.

Add missing TagsDirectory.want_event_subscribe().

Peter Amstutz [Sat, 6 Feb 2016 00:17:42 +0000 (19:17 -0500)]
8285: Add test for listen_for_events

Peter Amstutz [Fri, 5 Feb 2016 21:39:25 +0000 (16:39 -0500)]
8285: Add want_event_subscribe flag to subclasses of fusedir.Directory,
determine whether to call listen_for_events based on it.
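
The pattern described here might look like the sketch below. Only `want_event_subscribe` and `listen_for_events` come from the commit messages; the class names and the `mount` helper are hypothetical stand-ins, not the arv-mount code.

```python
class Directory:
    """Hypothetical stand-in for fusedir.Directory."""
    def want_event_subscribe(self):
        raise NotImplementedError

class StaticDirectory(Directory):
    """Contents never change, so no event subscription is needed."""
    def want_event_subscribe(self):
        return False

class LiveDirectory(Directory):
    """Contents can change on the server, so events are useful."""
    def want_event_subscribe(self):
        return True

def mount(directory, listen_for_events):
    """Subscribe to events only if the mounted directory asks for it."""
    if directory.want_event_subscribe():
        listen_for_events()
```

Raising `NotImplementedError` in the base class forces every subclass to make an explicit choice, which is the kind of omission the later "Add missing TagsDirectory.want_event_subscribe()" commit addresses.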

Peter Amstutz [Fri, 5 Feb 2016 16:10:43 +0000 (11:10 -0500)]
7667: Combine polling logs into fewer lines for less noise.  Adjust message
when last_ping_at is unexpectedly none to be less severe (can happen in
innocent circumstances).  Report nodes in "booted" list as "booting" since they
are unpaired.  Fix tests.

Brett Smith [Fri, 5 Feb 2016 09:52:43 +0000 (04:52 -0500)]
7868: Update API server's arvados-cli version.

Curoverse clusters are deployed by setting CRUNCH_JOB_BIN,
effectively excluding it from the bundle, but this is not true for
clusters deployed following the install guide.  Out of the box,
they'll use the version of crunch-job that's actually in the
arvados-cli gem in the bundle.

crunch-dispatch has functionality in it that requires a newer
arvados-cli, so update accordingly.  This is not exactly the version
produced by #7868, but it's pretty close.

I think there's a strong case that we should update this version
whenever we make a substantial change to crunch-job.  But since I'm
pushing this without discussion or review, I'm doing the smallest
thing possible.

Refs #7868.

Peter Amstutz [Thu, 4 Feb 2016 23:46:31 +0000 (18:46 -0500)]
7667: Node manager bug fixes and logging improvements.

 * ComputeNodeSetupActor will now finish if there is an unhandled exception.

 * ComputeNodeMonitorActor now explains why a node that is in the shutdown window
is not eligible for shutdown.

 * Logging in nodes_wanted now distinguishes idle/busy/booting/shutting down.

 * Logging by actors is now class name and a portion of the actor urn, so actions
of a specific actor can be consistently identified.

Tom Clegg [Thu, 4 Feb 2016 19:29:39 +0000 (14:29 -0500)]
Recognize another way slurm tells us about node failures.

Retry, instead of giving up, in situations like this:

2016-02-02_08:42:26 wx7k5-8i9sb-guk2lv53z3572dc 40682 3 stderr srun: error: Aborting, io error and missing step on node 0
2016-02-02_08:42:26 wx7k5-8i9sb-guk2lv53z3572dc 40682 3 stderr srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
2016-02-02_08:42:28 wx7k5-8i9sb-guk2lv53z3572dc 40682 3 stderr srun: error: Timed out waiting for job step to complete
2016-02-02_08:42:28 wx7k5-8i9sb-guk2lv53z3572dc 40682 3 child 42984 on compute26.1 exit 0 success=
2016-02-02_08:42:28 wx7k5-8i9sb-guk2lv53z3572dc 40682 3 ERROR: Task process exited 0, but never updated its task record to indicate success and record its output.
2016-02-02_08:42:28 wx7k5-8i9sb-guk2lv53z3572dc 40682 3 failure (#1, permanent) after 560 seconds
2016-02-02_08:42:28 wx7k5-8i9sb-guk2lv53z3572dc 40682 3 task output (0 bytes):

No issue #
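
As a rough illustration of recognizing this kind of failure, the sketch below matches the first srun message from the excerpt above. The pattern is an assumption made for this example; the commit's actual pattern and the retry logic in crunch-job are not shown here, and `is_node_failure` is a hypothetical name.

```python
import re

# Assumed pattern: treat srun's "Aborting, io error" message as a node failure.
NODE_FAILURE_RE = re.compile(r"srun: error: Aborting, io error\b")

def is_node_failure(stderr_line):
    """Return True if the stderr line looks like a slurm node failure."""
    return NODE_FAILURE_RE.search(stderr_line) is not None
```

A caller could use such a check to mark the task failure as temporary and retry it instead of recording a permanent failure.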

Tom Clegg [Thu, 4 Feb 2016 18:17:31 +0000 (13:17 -0500)]
Merge branch '8288-poll-client-close-timeout' refs #8288

Tom Clegg [Mon, 1 Feb 2016 06:58:34 +0000 (01:58 -0500)]
8288: Add timeout option to close() method of event clients.

Previously in EventClient, close() didn't wait for anything. Now, if a
timeout is given, it waits for ws4py to call the closed() callback to
indicate the connection has closed.

Previously in PollClient, close() waited indefinitely for the polling
thread to terminate.  This can take a very long time if, for example,
there are multiple subscriptions and the "get logs" API transaction is
slow.

The only apparent reason a caller would want to wait here at all is to
guarantee the simplifying assumption that the on_event() callback is never
called after close().  Now, instead of letting the thread run until
all events are received and handled, PollClient achieves this the same
way EventClient does: ignore events that arrive after close().
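
A minimal sketch of the "ignore events that arrive after close()" approach follows. `PollingClient` is a hypothetical class written for this example, not the Arvados PollClient; the `poll` callable and the `interval` parameter are assumptions.

```python
import threading

class PollingClient:
    """Hypothetical polling event client that drops events after close()."""

    def __init__(self, poll, on_event, interval=0.01):
        self._poll = poll              # callable returning a list of events
        self._on_event = on_event
        self._interval = interval
        self._lock = threading.RLock() # RLock so on_event() may call close()
        self._closed = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._closed.is_set():
            for event in self._poll():
                with self._lock:
                    if self._closed.is_set():
                        return         # drop events that arrive after close()
                    self._on_event(event)
            self._closed.wait(self._interval)

    def close(self, timeout=None):
        """Stop event delivery; wait for the thread only if a timeout is given."""
        with self._lock:
            self._closed.set()
        if timeout is not None and threading.current_thread() is not self._thread:
            self._thread.join(timeout)
```

In this sketch close() waits at most for a callback that is already running, never for a slow poll request, and no callback starts after close() has returned.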

Brett Smith [Thu, 4 Feb 2016 10:33:24 +0000 (05:33 -0500)]
Make install guide slurm.conf more Arvados-compliant.

* SelectType=select/linear allocates entire nodes at a time.  The
  previous value scheduled individual cores.
* With that change, SelectTypeParameters=CR_CPU_Memory is not valid.
  Remove it, as we do in production.
* The setting of FastSchedule seems less pressing, but 0 is what we
  use in production, so share that here too.

No issue #.

Peter Amstutz [Wed, 3 Feb 2016 22:51:46 +0000 (17:51 -0500)]
Try to make logging identify the actor consistently

Peter Amstutz [Wed, 3 Feb 2016 20:54:18 +0000 (15:54 -0500)]
Merge branch '6702-gce-node-create-fix' closes #6702

Tom Clegg [Wed, 3 Feb 2016 17:51:54 +0000 (12:51 -0500)]
Merge branch '8288-arv-mount-deadlock' refs #8288

Tom Clegg [Tue, 2 Feb 2016 21:46:35 +0000 (16:46 -0500)]
8288: Do not call operations.destroy() as a last resort, just abandon the llfuse thread.

Tom Clegg [Mon, 1 Feb 2016 08:01:31 +0000 (03:01 -0500)]
8288: Add test case for --exec mode.

Tom Clegg [Mon, 1 Feb 2016 02:43:30 +0000 (21:43 -0500)]
8288: Give fusermount -u a chance to work before resorting to operations.destroy().

Log a warning when resorting to operations.destroy().

De-duplicate setup/teardown code so more of the --exec code path is exercised in tests.

Tom Clegg [Wed, 3 Feb 2016 17:50:31 +0000 (12:50 -0500)]
8123: Install chartjs.js asset file.

...during "setup.py install" too, not just when installing via
package.

refs #8123

Brett Smith [Wed, 3 Feb 2016 11:42:17 +0000 (06:42 -0500)]
Improve install guide Nginx+SCL integration.

No issue #.

Brett Smith [Wed, 3 Feb 2016 11:26:32 +0000 (06:26 -0500)]
login-sync gets user's home from /etc/passwd.

No issue #.

Brett Smith [Wed, 3 Feb 2016 10:37:42 +0000 (05:37 -0500)]
Workbench loads CA certs on Red Hat.

This has the same rationale and logic as #6432 and
9b910084faf3db6fa2071af604620e7d45d12a6c, applied to Workbench.

Changing from `/etc/ssl/certs` to `/etc/ssl/certs/ca-certificates.crt`
is safe, because add_trust_ca accepts either a directory with hashed
certs, or a file with multiple certs.  On Debian, the latter path is a
single file built from the hashed certs in the former, so this is
functionally identical there, and more predictable on Red Hat (where I
don't know what it's doing).

No issue #.

Brett Smith [Wed, 3 Feb 2016 09:53:04 +0000 (04:53 -0500)]
Add fuse dependency to FUSE driver package.

When the fuse tools aren't installed, attempting to run arv-mount
fails with "fuse: failed to exec fusermount".

No issue #.

Brett Smith [Wed, 3 Feb 2016 09:39:27 +0000 (04:39 -0500)]
Add curl library dependency to shell install guide.

No issue #.

Brett Smith [Wed, 3 Feb 2016 09:32:39 +0000 (04:32 -0500)]
SLURM install guide notes slurm.conf path on Red Hat.

No issue #.

Brett Smith [Wed, 3 Feb 2016 09:26:49 +0000 (04:26 -0500)]
Add missing ; in keepproxy Nginx config.

No issue #.

Peter Amstutz [Tue, 2 Feb 2016 17:26:57 +0000 (12:26 -0500)]
6702: Refactor create_node to BaseComputeNodeDriver so logic also applies to
Azure.  Adds new find_node() method; if it returns None or raises an error,
re-raise the original create_node exception.
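
The control flow described here can be sketched as below. The method names `create_node` and `find_node` and the class name come from the commit message; everything else, including the `real_driver` attribute and the lookup by node name through `list_nodes()`, is an assumption for illustration.

```python
class BaseComputeNodeDriver:
    """Hypothetical sketch; only the method names follow the commit message."""

    def __init__(self, real_driver):
        self.real = real_driver

    def find_node(self, name):
        """Return the node with this name, or None if it does not exist."""
        for node in self.real.list_nodes():
            if node.name == name:
                return node
        return None

    def create_node(self, name, **kwargs):
        try:
            return self.real.create_node(name=name, **kwargs)
        except Exception as create_error:
            # The cloud may have created the node even though the client
            # saw an error, so look for it before giving up.
            try:
                node = self.find_node(name)
            except Exception:
                node = None
            if node is None:
                raise create_error   # re-raise the original exception
            return node
```

Re-raising the original exception, rather than any error from the lookup, keeps the reported failure tied to the operation that actually failed.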

Peter Amstutz [Tue, 2 Feb 2016 16:31:15 +0000 (11:31 -0500)]
Merge branch '6702-gce-node-create-fix' closes #6702

Peter Amstutz [Tue, 2 Feb 2016 16:05:50 +0000 (11:05 -0500)]
Merge branch 'fix/build-python-llfuse-version' of https://github.com/wtsi-hgi/arvados
no issue #

Peter Amstutz [Tue, 2 Feb 2016 15:56:13 +0000 (10:56 -0500)]
Merge branch 'master' into 6702-gce-node-create-fix

Peter Amstutz [Tue, 2 Feb 2016 15:55:58 +0000 (10:55 -0500)]
Merge branch '8206-gce-retry-init' closes #8206

Peter Amstutz [Tue, 2 Feb 2016 15:55:39 +0000 (10:55 -0500)]
8206: Mock time.sleep() to avoid unnecessary delay in test.
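
Mocking time.sleep() in a retry test might look like the sketch below. The `retry` helper and the test case are hypothetical examples written for this note, not the node manager's code; the sketch uses the standard library's `unittest.mock`.

```python
import time
import unittest
from unittest import mock

def retry(fn, attempts=3, delay=5):
    """Hypothetical retry helper that sleeps between failed attempts."""
    for i in range(attempts):
        try:
            return fn()
        except IOError:
            if i == attempts - 1:
                raise
            time.sleep(delay)

class RetryTest(unittest.TestCase):
    @mock.patch("time.sleep")
    def test_retries_without_waiting(self, sleep_mock):
        calls = []
        def flaky():
            calls.append(1)
            if len(calls) < 3:
                raise IOError("transient")
            return "ok"
        # With sleep mocked out, two retries finish immediately.
        self.assertEqual(retry(flaky), "ok")
        self.assertEqual(sleep_mock.call_count, 2)
```

Because the mock records its calls, the test can still check that the code asked to sleep the expected number of times without paying for the delay.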

Joshua Randall [Tue, 2 Feb 2016 15:45:46 +0000 (15:45 +0000)]
pins python-llfuse version to 0.41.1 for fpm on all platforms

Peter Amstutz [Tue, 2 Feb 2016 15:03:39 +0000 (10:03 -0500)]
8206: Refactor _retry to RetryMixin.  Make retry timing consistent.

Brett Smith [Tue, 2 Feb 2016 12:23:10 +0000 (07:23 -0500)]
8005: Install guide suggests slurm-munge on Red Hat SLURM nodes.

This package includes the SLURM plugins that talk to MUNGE.
Refs #8005.

Peter Amstutz [Mon, 1 Feb 2016 19:54:28 +0000 (14:54 -0500)]
6702: Catch GCE create_node() errors and check if the node was actually
created.  Added test.

Brett Smith [Mon, 1 Feb 2016 17:43:04 +0000 (12:43 -0500)]
8014: Remove more upgrade script references from install guide.

The steps removed are now handled by Rails package postinst scripts.
This should've been done in 378a988bbf9e29736382339f587582259b641782,
but was overlooked.  Refs #8014.

Brett Smith [Mon, 1 Feb 2016 16:53:29 +0000 (11:53 -0500)]
Refresh Gitolite install guide.

* Tested instructions still work with 3.6.4.  So noted.
* Prefer cloning Gitolite over HTTPS, since that's less likely to be
  firewalled.

No issue #.

Brett Smith [Mon, 1 Feb 2016 16:51:14 +0000 (11:51 -0500)]
Fix install doc rendering of API Nginx config.

<notextile> doesn't actually nest like proper HTML, it's just a
boolean that remembers the last state.  Turn it back on after doing an
include that turns it off.  No issue #.

Peter Amstutz [Mon, 1 Feb 2016 14:14:41 +0000 (09:14 -0500)]
Pin llfuse to 0.41.1 because 0.42 came out and broke things.  no issue #

Brett Smith [Fri, 29 Jan 2016 00:38:04 +0000 (19:38 -0500)]
Merge branch '8005-centos-3rdparty-installs-wip'

Closes #8005, #8135.

Brett Smith [Fri, 29 Jan 2016 00:27:13 +0000 (19:27 -0500)]
8005: Add tar Ruby build dependency on CentOS 6.

Brett Smith [Thu, 28 Jan 2016 00:02:05 +0000 (19:02 -0500)]
8005: Install guide uses runit packages on Red Hat.

The runit RPMs only provide /etc/service.  The .debs provide /etc/sv
and /etc/service.  Our understanding is that /etc/sv is for all
service definitions (akin to /etc/init.d), and /etc/service is for
service definitions that runit should start at boot (akin to
/etc/rcN.d).  To provide uniformity, our install guide instructs users
to make /etc/sv if needed, and link it to /etc/service.

This commit could go farther.  Today it would be best if all the runit
sections in the install guide followed Tom's modern template used for
arv-git-httpd and arvados-docker-cleaner.  However, that will probably
require some creation and testing of log/run scripts, and some
adaptation of the run scripts to fit the template.  I wish I could
include those improvements now, but unfortunately I'm out of time, so
they'll have to wait for another day.

Brett Smith [Thu, 28 Jan 2016 00:08:33 +0000 (19:08 -0500)]
8005: Install guide gets SLURM and MUNGE from RPMs.

Brett Smith [Wed, 27 Jan 2016 23:54:57 +0000 (18:54 -0500)]
8005: Fix bad Textile markup in compute node install guide.

The switch dashes created strikethrough for much of the notebox.

Brett Smith [Wed, 27 Jan 2016 20:15:23 +0000 (15:15 -0500)]
8005: Document installing Git on CentOS 6 from RepoForge.

Brett Smith [Wed, 27 Jan 2016 20:00:17 +0000 (15:00 -0500)]
8005: DRY up PostgreSQL password auth instructions on CentOS 6.

Ward Vandewege [Thu, 28 Jan 2016 19:32:00 +0000 (14:32 -0500)]
Make our API server packages for debian-based distributions depend on
libcurl-ssl-dev rather than libcurl4-openssl-dev.

No issue #

radhika [Tue, 26 Jan 2016 17:10:04 +0000 (12:10 -0500)]
closes #8198
Merge branch '8198-node-ip-address'

radhika [Tue, 26 Jan 2016 17:09:37 +0000 (12:09 -0500)]
Merge branch 'master' into 8198-node-ip-address

radhika [Tue, 26 Jan 2016 17:08:21 +0000 (12:08 -0500)]
refs #8178
Merge branch '8178-keepstore-trash-interface'

radhika [Tue, 26 Jan 2016 15:41:00 +0000 (10:41 -0500)]
Merge branch '8178-keepstore-trash-interface' of git.curoverse.com:arvados into 8178-keepstore-trash-interface

Conflicts:
services/keepstore/handlers.go
services/keepstore/volume_test.go

radhika [Tue, 26 Jan 2016 15:38:28 +0000 (10:38 -0500)]
8178: untrash should fail when ErrNotImplemented is returned.

radhika [Fri, 22 Jan 2016 22:37:15 +0000 (17:37 -0500)]
8178: (for now) all volumes must return ErrNotImplemented if trash-lifetime != 0

radhika [Thu, 21 Jan 2016 20:25:06 +0000 (15:25 -0500)]
8178: All three currently supported volumes return error when trash-lifetime period is not configured. azure blob and s3 volumes are updated to do so.
Returning an error is causing test failures in unix volume and hence is still a work in progress.

radhika [Thu, 21 Jan 2016 18:59:36 +0000 (13:59 -0500)]
8178: rename Delete api as Trash; add Untrash to volume interface; add UndeleteHandler and test for this endpoint.

Peter Amstutz [Mon, 25 Jan 2016 22:02:40 +0000 (17:02 -0500)]
8206: Add test to support retry on create_driver.

Tom Clegg [Mon, 25 Jan 2016 21:08:14 +0000 (16:08 -0500)]
Merge branch '8123-crunchstat-graphs' closes #8123

Tom Clegg [Mon, 25 Jan 2016 21:05:56 +0000 (16:05 -0500)]
8123: Escape HTML chars in page title.

Peter Amstutz [Mon, 25 Jan 2016 20:36:34 +0000 (15:36 -0500)]
8206: Refactor _retry into common function wrapper usable by both dispatch and
compute drivers.

Tom Clegg [Mon, 25 Jan 2016 06:16:44 +0000 (01:16 -0500)]
8123: Explain existing_constraints and use a proper instance variable.

Tom Clegg [Mon, 25 Jan 2016 06:08:27 +0000 (01:08 -0500)]
8123: Fix accidental old-style class.

Tom Clegg [Mon, 25 Jan 2016 06:00:03 +0000 (01:00 -0500)]
8123: Fix type check to accommodate unicode.

Tom Clegg [Mon, 25 Jan 2016 05:59:46 +0000 (00:59 -0500)]
8123: Use -v,-vv instead of --verbose,--debug.

Tom Clegg [Mon, 25 Jan 2016 02:07:42 +0000 (21:07 -0500)]
8123: Change --include-child-jobs to --skip-child-jobs (default False).

Tom Clegg [Mon, 25 Jan 2016 02:06:48 +0000 (21:06 -0500)]
8123: Explain mysterious memory constraint logic.

Tom Clegg [Mon, 25 Jan 2016 02:05:28 +0000 (21:05 -0500)]
8123: Update test dependencies.

Tom Clegg [Sat, 23 Jan 2016 06:28:38 +0000 (01:28 -0500)]
8123: Include chartjs.js in package.

radhika [Fri, 22 Jan 2016 22:49:08 +0000 (17:49 -0500)]
Merge branch '8178-keepstore-trash-interface' of git.curoverse.com:arvados into 8178-keepstore-trash-interface

Conflicts:
services/keepstore/azure_blob_volume.go
services/keepstore/handler_test.go
services/keepstore/handlers.go
services/keepstore/keepstore.go
services/keepstore/s3_volume.go
services/keepstore/volume_test.go
services/keepstore/volume_unix.go

radhika [Fri, 22 Jan 2016 22:37:15 +0000 (17:37 -0500)]
8178: (for now) all volumes must return ErrNotImplemented if trash-lifetime != 0

radhika [Thu, 21 Jan 2016 20:25:06 +0000 (15:25 -0500)]
8178: All three currently supported volumes return error when trash-lifetime period is not configured. azure blob and s3 volumes are updated to do so.
Returning an error is causing test failures in unix volume and hence is still a work in progress.

radhika [Thu, 21 Jan 2016 18:59:36 +0000 (13:59 -0500)]
8178: rename Delete api as Trash; add Untrash to volume interface; add UndeleteHandler and test for this endpoint.

radhika [Fri, 22 Jan 2016 18:34:07 +0000 (13:34 -0500)]
8198: consider X-Forwarded-For header by way of request.remote_ip while setting node ip address.

Peter Amstutz [Fri, 22 Jan 2016 13:23:48 +0000 (08:23 -0500)]
Fix python sdk tests refs #6833

Tom Clegg [Thu, 21 Jan 2016 22:25:48 +0000 (17:25 -0500)]
Merge branch '8281-arv-mount-retry' closes #8281

Peter Amstutz [Thu, 21 Jan 2016 22:25:22 +0000 (17:25 -0500)]
Merge branch '6833-test-token-expiry' closes #6833

Peter Amstutz [Thu, 21 Jan 2016 21:49:45 +0000 (16:49 -0500)]
Merge branch '7846-magic-invalidate-entry' closes #7846

Tom Clegg [Thu, 21 Jan 2016 21:10:11 +0000 (16:10 -0500)]
8281: Limit # write threads to #copies remaining, not #copies total.

Tom Clegg [Thu, 21 Jan 2016 19:35:34 +0000 (14:35 -0500)]
8281: Fix KeepClient retry bugs.

get() and put() were both handling all Curl exceptions -- including
timeouts -- by marking the keep service as unusable. For example, if a
single proxy is the only service available, a single timeout was
fatal. This is fixed by setting the retry loop status to None instead
of False after curl exceptions.

put() was repeating its retry loop until it achieved the desired
number of replicas _in a single iteration_. For example, when trying
to store 2 replicas, 6 loop iterations with a single success in each
iteration would result in 6 copies being stored but put() declaring
failure. This is fixed by checking against a cumulative "done" counter
instead of the "copies done in this loop iteration" counter.
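
The cumulative-counter fix for put() can be sketched as follows. This is an illustration under assumptions, not the KeepClient code: `put_with_retries` and the `try_write` callable are hypothetical names, and real writes, threads, and service selection are left out.

```python
def put_with_retries(try_write, copies_wanted, num_retries):
    """Return the number of replicas stored, raising if too few were stored.

    try_write(n) is a hypothetical callable that attempts up to n writes
    and returns how many succeeded in this loop iteration.
    """
    done = 0
    for _ in range(num_retries + 1):
        remaining = copies_wanted - done   # only ask for what is still missing
        done += try_write(remaining)       # cumulative, not per-iteration
        if done >= copies_wanted:
            return done
    raise IOError("stored %d of %d replicas" % (done, copies_wanted))
```

With the commit message's example of 2 wanted replicas and a single success per iteration, this loop stops after the second iteration with 2 copies stored instead of declaring failure after 6.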

Tom Clegg [Thu, 21 Jan 2016 09:01:16 +0000 (04:01 -0500)]
8281: Fix arv-mount ignoring --retries argument when writing file data.

"num_retries" arguments get passed around extensively in arvfile.py
and collection.py in the Python SDK, but ultimately the writing of
file data is done by a _BlockManager which doesn't have any way to
accept that argument or pass it along to a KeepClient, so PUT requests
always use the CollectionWriter's KeepClient's default num_retries.

In arv-mount's case, we have been telling CollectionWriter the
num_retries we want. When CollectionWriter creates a KeepClient,
num_retries gets passed along -- normally this works around the fact
that num_retries gets lost by the _BlockManager layer. However, we
provided our own KeepClient to use instead of letting CollectionWriter
create one, and we forgot to set num_retries on our own KeepClient, so
we weren't retrying PUT requests.

Peter Amstutz [Thu, 21 Jan 2016 20:46:53 +0000 (15:46 -0500)]
6833: Test setting small blobSignatureTtl.  Fix earlier fix.

Peter Amstutz [Thu, 21 Jan 2016 16:30:23 +0000 (11:30 -0500)]
6833: Test to confirm that enabling polling on CollectionDirectory causes
tokens to be refreshed.

Peter Amstutz [Thu, 21 Jan 2016 16:29:40 +0000 (11:29 -0500)]
6833: Collection update file block list (to get most recent tokens) even when actual
content hasn't changed.

Tom Clegg [Fri, 15 Jan 2016 21:20:49 +0000 (16:20 -0500)]
6833: Add (most of) a test case for token expiry.

radhika [Thu, 21 Jan 2016 20:25:06 +0000 (15:25 -0500)]
8178: All three currently supported volumes return error when trash-lifetime period is not configured. azure blob and s3 volumes are updated to do so.
Returning an error is causing test failures in unix volume and hence is still a work in progress.

Peter Amstutz [Thu, 21 Jan 2016 19:07:11 +0000 (14:07 -0500)]
Merge branch '8008-package-testing' refs #8008

radhika [Thu, 21 Jan 2016 18:59:36 +0000 (13:59 -0500)]
8178: rename Delete api as Trash; add Untrash to volume interface; add UndeleteHandler and test for this endpoint.

Peter Amstutz [Thu, 21 Jan 2016 18:37:20 +0000 (13:37 -0500)]
7846: Better directory entry invalidation, fixes MagicDirApiError test.  Also
fix a bug caused by a typo in the exception handler.

radhika [Wed, 20 Jan 2016 21:15:12 +0000 (16:15 -0500)]
refs #6833
Merge branch '6833-arv-mount-cache-refresh'

radhika [Mon, 11 Jan 2016 14:37:10 +0000 (09:37 -0500)]
6833: get blobSignatureTtl from discovery document and use it to set the poll_time.

radhika [Thu, 7 Jan 2016 18:14:05 +0000 (13:14 -0500)]
6833: add poll_time to CollectionDirectory.

Ward Vandewege [Wed, 20 Jan 2016 18:19:58 +0000 (13:19 -0500)]
Fix typo.

refs #8248

radhika [Wed, 20 Jan 2016 17:37:34 +0000 (12:37 -0500)]
closes #8028
Merge branch '8028-crunch-dispatch-local'