arvados.git
8 years ago6149: crunch-job installer handles egg_info errors better. 6149-quiet-egg-info-stderr-wip
Brett Smith [Thu, 11 Jun 2015 21:16:03 +0000 (17:16 -0400)]
6149: crunch-job installer handles egg_info errors better.

Capture the command's stderr and confirm that the error refers to
git.  If it does, ignore the stderr and set a build tag.  Otherwise,
propagate stderr and abort.
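
The stderr handling described above can be sketched roughly as follows. This is only an illustration in Python, not the crunch-job installer's own code: the function name `run_egg_info`, the `FALLBACK_TAG` value, the use of the exit status, and the simple substring check for "git" are all assumptions.

```python
import subprocess
import sys

# Hypothetical build tag; the tag the installer actually sets is not
# given in the commit message.
FALLBACK_TAG = ".post1"


def run_egg_info(cmd):
    """Run cmd and decide what to do with its stderr.

    Returns None if the command succeeds, FALLBACK_TAG if it fails with
    an error that mentions git, and raises RuntimeError otherwise.
    """
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.PIPE, text=True)
    if proc.returncode == 0:
        return None
    if "git" in proc.stderr.lower():
        # The failure refers to git: swallow stderr and fall back to a tag.
        return FALLBACK_TAG
    # Any other failure: pass stderr through to the caller's log and abort.
    sys.stderr.write(proc.stderr)
    raise RuntimeError("command failed with status %d" % proc.returncode)
```

A caller would pass the egg_info command line as a list and treat a non-None return value as the build tag to use.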

8 years agoUpdate arvados in bundle. refs #6203 refs #6277
Tom Clegg [Tue, 9 Jun 2015 19:16:28 +0000 (15:16 -0400)]
Update arvados in bundle. refs #6203 refs #6277

8 years agoUse default word wrap (no mid-word line breaks). refs #6057
Tom Clegg [Tue, 9 Jun 2015 19:12:46 +0000 (15:12 -0400)]
Use default word wrap (no mid-word line breaks). refs #6057

8 years agoMerge branch '6203-locator-regexp' refs #6203 refs #6277
Tom Clegg [Tue, 9 Jun 2015 14:11:35 +0000 (10:11 -0400)]
Merge branch '6203-locator-regexp' refs #6203 refs #6277

8 years agoAdded shell-image back to the build. I checked on my local machine and it works...
Nico Cesar [Wed, 10 Jun 2015 16:26:15 +0000 (12:26 -0400)]
Added shell-image back to the build. I checked on my local machine and it works great.

no issue #

8 years ago5790: Fix PySDK Docker image listing comparing ints and datetimes.
Brett Smith [Wed, 10 Jun 2015 15:24:38 +0000 (11:24 -0400)]
5790: Fix PySDK Docker image listing comparing ints and datetimes.

Closes #5790.

8 years agoAdd instructions to the 'Create standard objects' page to create a cluster-wide reada...
Ward Vandewege [Wed, 10 Jun 2015 14:32:40 +0000 (10:32 -0400)]
Add instructions to the 'Create standard objects' page to create a cluster-wide readable project for standard Arvados Docker images.

refs #6096

8 years agocloses #6203
radhika [Wed, 10 Jun 2015 14:15:47 +0000 (10:15 -0400)]
closes #6203
Merge branch '6203-collection-perf-api'

8 years agoMerge branch '6203-collection-perf-api-TC' of git.curoverse.com:arvados into 6203...
radhika [Wed, 10 Jun 2015 14:06:35 +0000 (10:06 -0400)]
Merge branch '6203-collection-perf-api-TC' of git.curoverse.com:arvados into 6203-collection-perf-api-TC

Conflicts:
services/api/app/models/collection.rb

8 years agoqr1hi-automated-performance-suite is failing because the test doesn't give enough...
Nico Cesar [Tue, 9 Jun 2015 18:48:29 +0000 (14:48 -0400)]
qr1hi-automated-performance-suite is failing because the test doesn't give enough time for the page to render (50s now makes the test pass). no issue #

8 years agoqr1hi-automated-performance-suite is failing because the test doesn't give enough...
Nico Cesar [Tue, 9 Jun 2015 18:38:18 +0000 (14:38 -0400)]
qr1hi-automated-performance-suite is failing because the test doesn't give enough time for the page to render

no issue #

8 years agoopenssl self cert creation failed because the "tmp" directory was missing.
Nico Cesar [Tue, 9 Jun 2015 15:32:25 +0000 (11:32 -0400)]
openssl self cert creation failed because the "tmp" directory was missing.
api server failed to start because the "tmp/logs" and "tmp/api" directories were missing

no issue #

8 years agoMerge branch '6235-go-sdk-discovery' closes #6235
Peter Amstutz [Tue, 9 Jun 2015 15:22:52 +0000 (11:22 -0400)]
Merge branch '6235-go-sdk-discovery' closes #6235

8 years agoMerge branch '5790-copy-most-recent-docker-image-wip'
Brett Smith [Tue, 9 Jun 2015 14:48:36 +0000 (10:48 -0400)]
Merge branch '5790-copy-most-recent-docker-image-wip'

Closes #5790, #6103.

8 years ago5790: Improve Docker image listing in Python SDK.
Brett Smith [Wed, 3 Jun 2015 20:59:24 +0000 (16:59 -0400)]
5790: Improve Docker image listing in Python SDK.

* Always fetch all relevant Docker links.
* Support finding images by image hash.
* Show image hashes when listing images by name.
* Like Docker itself, when an image has multiple names and we're not
  filtering by name, list each one.
* Better match the API server's priority logic:
  * Ignore links to collections that aren't found.
  * Links with an image_timestamp always have priority over those that
    don't, regardless of their respective created_at timestamps.

The main motivation for this change is to make sure arv-copy gets the
right Docker image when copying a pipeline template recursively.

This implementation goes through some trouble to parse timestamps out
of each Docker link only once.
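
The priority rule and the "parse each timestamp only once" point can be sketched as below. This is a simplified illustration, not the PySDK implementation: links are modeled as flat dicts with hypothetical `created_at` and optional `image_timestamp` keys, and the helper names are made up for this example.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse a timestamp such as '2015-06-03T20:59:24Z'; None stays None."""
    if not value:
        return None
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def sort_links(links):
    """Return links ordered from highest to lowest priority."""
    keyed = []
    for link in links:
        # Each timestamp is parsed exactly once, here.
        image_ts = parse_ts(link.get("image_timestamp"))
        created = parse_ts(link["created_at"])
        # Links with an image_timestamp sort ahead of all links without
        # one; within each group, the newer timestamp wins.
        keyed.append(((image_ts is not None, image_ts or created), link))
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [link for _, link in keyed]
```

Sorting on a precomputed `(has_image_timestamp, timestamp)` tuple keeps the comparison cheap and makes the "always have priority" rule explicit in the key itself.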

8 years agoMerge branch '6152-compute-node-no-arv-wip'
Brett Smith [Tue, 9 Jun 2015 14:26:44 +0000 (10:26 -0400)]
Merge branch '6152-compute-node-no-arv-wip'

Closes #6152, #6256.

8 years ago6152: Use Python SDK for compute node installation.
Brett Smith [Fri, 5 Jun 2015 19:46:08 +0000 (15:46 -0400)]
6152: Use Python SDK for compute node installation.

This eliminates an otherwise-needless dependency on `arv` and the
entire Ruby stack.

8 years agocloses #6093
radhika [Tue, 9 Jun 2015 14:25:59 +0000 (10:25 -0400)]
closes #6093
Merge branch '6093-refresh-docs'

8 years ago6093: remove "Alternate way to add SSH keys" and add the "Manage account" link blurb...
radhika [Tue, 9 Jun 2015 14:21:38 +0000 (10:21 -0400)]
6093: remove "Alternate way to add SSH keys" and add the "Manage account" link blurb to "Adding your keys" section itself.

8 years ago6203: Fix loophole allowing locators in bogus manifests to be accepted
Tom Clegg [Mon, 8 Jun 2015 19:34:48 +0000 (15:34 -0400)]
6203: Fix loophole allowing locators in bogus manifests to be accepted
without verification and then signed in responses. refs #6277

8 years ago6203: Add tests for LOCATOR_REGEXP. Fix regexp to reject "++" and trailing "\n".
Tom Clegg [Mon, 8 Jun 2015 16:03:42 +0000 (12:03 -0400)]
6203: Add tests for LOCATOR_REGEXP. Fix regexp to reject "++" and trailing "\n".
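
The two rejected cases can be illustrated with a simplified pattern. The expression below is a hypothetical stand-in written in Python, not the project's actual LOCATOR_REGEXP (which is Ruby); it only shows why a strict end anchor and a hint class that excludes "+" matter.

```python
import re

# Hypothetical, simplified locator pattern: 32 hex digits, an optional
# "+size", then zero or more "+Hint" parts starting with an uppercase letter.
LOCATOR_RE = re.compile(r"[0-9a-f]{32}(\+[0-9]+)?(\+[A-Z][A-Za-z0-9@_-]*)*")


def is_locator(text):
    """True only if the whole string, with no trailing newline, is a locator."""
    # fullmatch() requires the match to cover the entire string, so a
    # trailing "\n" is rejected. A pattern ending in "$" would accept it.
    return LOCATOR_RE.fullmatch(text) is not None
```

With this sketch, `is_locator("d41d8cd98f00b204e9800998ecf8427e+0")` is true, while the same string followed by `"\n"` or containing `"++"` is rejected.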

8 years agoMerge branch 'master' into 6093-refresh-docs
radhika [Tue, 9 Jun 2015 14:09:07 +0000 (10:09 -0400)]
Merge branch 'master' into 6093-refresh-docs

8 years ago6203: Accept (and discard) hints in client-provided portable_data_hash.
Tom Clegg [Mon, 8 Jun 2015 14:10:26 +0000 (10:10 -0400)]
6203: Accept (and discard) hints in client-provided portable_data_hash.

8 years ago6203: Remove unused vars. Remove unnecessary newline manipulation.
Tom Clegg [Mon, 8 Jun 2015 07:11:22 +0000 (03:11 -0400)]
6203: Remove unused vars. Remove unnecessary newline manipulation.

8 years ago6235: Discovery() returns the requested value directly instead of a single-entry...
Peter Amstutz [Mon, 8 Jun 2015 13:36:33 +0000 (09:36 -0400)]
6235: Discovery() returns the requested value directly instead of a single-entry map.

8 years ago6203: Use faster =~ instead of match.
Tom Clegg [Mon, 8 Jun 2015 06:23:57 +0000 (02:23 -0400)]
6203: Use faster =~ instead of match.

8 years ago6203: Fix cheating test.
Tom Clegg [Mon, 8 Jun 2015 06:23:37 +0000 (02:23 -0400)]
6203: Fix cheating test.

8 years ago6203: Use each_line instead of split.each.
Tom Clegg [Mon, 8 Jun 2015 06:23:20 +0000 (02:23 -0400)]
6203: Use each_line instead of split.each.

8 years ago6203: Remove redundant split before each_line.
Tom Clegg [Mon, 8 Jun 2015 06:22:33 +0000 (02:22 -0400)]
6203: Remove redundant split before each_line.

8 years ago6203: Apply special case only to a 0-byte manifest: don't ignore white space.
Tom Clegg [Mon, 8 Jun 2015 05:50:30 +0000 (01:50 -0400)]
6203: Apply special case only to a 0-byte manifest: don't ignore white space.

8 years ago6203: Eliminate unneeded variable.
Tom Clegg [Mon, 8 Jun 2015 05:49:06 +0000 (01:49 -0400)]
6203: Eliminate unneeded variable.

8 years ago6203: Merge pdh validations into one method. Update comments. Add tests.
Tom Clegg [Mon, 8 Jun 2015 05:48:45 +0000 (01:48 -0400)]
6203: Merge pdh validations into one method. Update comments. Add tests.

8 years ago6203: Use LOCATION_REGEXP from sdk; also add back updated each_manifest_locator method
radhika [Tue, 9 Jun 2015 01:42:30 +0000 (21:42 -0400)]
6203: Use LOCATION_REGEXP from sdk; also add back updated each_manifest_locator method
with the same updates as munge method since this is simpler than the munge method. This
seems to have further improved performance.

8 years ago6203: Use manifest.each_line and line.rstrip! instead of manifest.split("\n").
radhika [Mon, 8 Jun 2015 19:39:37 +0000 (15:39 -0400)]
6203: Use manifest.each_line and line.rstrip! instead of manifest.split("\n").
Performance was comparable in both cases: each_line itself is twice as fast, but the strip we then need eats away the gain.

8 years ago6203: further optimization of munge method; also, match[0].sub(/\+A[^+]*/, '') instea...
radhika [Mon, 8 Jun 2015 18:32:42 +0000 (14:32 -0400)]
6203: further optimization of munge method; also, match[0].sub(/\+A[^+]*/, '') instead of split+append
resulted in another 2 seconds saving in each of create and update operations!
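
The substitution mentioned above removes the signature hint in a single pass instead of splitting the locator on "+" and reassembling it. A rough Python equivalent of the Ruby `sub` call is shown here; the function name is hypothetical and the sample locator in the usage note is made up.

```python
import re

# Same pattern as in the commit message: "+A" followed by everything up
# to the next "+" (or the end of the string).
SIGNATURE_HINT = re.compile(r"\+A[^+]*")


def strip_signature(locator):
    """Remove the first +A... hint from a locator, leaving other hints alone."""
    # count=1 mirrors Ruby's String#sub, which replaces only the first match.
    return SIGNATURE_HINT.sub("", locator, count=1)
```

For example, `strip_signature("acbd18db4cc2f85cedef654fccc4a4d8+3+Adeadbeef@55555555+Kzzzzz")` returns `"acbd18db4cc2f85cedef654fccc4a4d8+3+Kzzzzz"`.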

8 years ago6203: compute_pdh, computed_pdh etc etc etc confusion. clean up to make it easier...
radhika [Mon, 8 Jun 2015 16:24:53 +0000 (12:24 -0400)]
6203: compute_pdh, computed_pdh etc etc etc confusion. clean up to make it easier to follow.

8 years ago6203: Benchmarking revealed that regexp.match(string) is 2.5x more expensive than...
radhika [Sat, 6 Jun 2015 23:36:13 +0000 (19:36 -0400)]
6203: Benchmarking revealed that regexp.match(string) is 2.5x more expensive than string =~ regexp. Updated check_signatures method accordingly.

8 years agoMerge branch 'master' into 6203-collection-perf-api
radhika [Sat, 6 Jun 2015 22:59:33 +0000 (18:59 -0400)]
Merge branch 'master' into 6203-collection-perf-api

8 years agobundler 1.10 breaks workbench build because
Nico Cesar [Fri, 5 Jun 2015 18:54:36 +0000 (14:54 -0400)]
bundler 1.10 breaks workbench build because

https://github.com/lucasefe/themes_for_rails/blob/master/themes_for_rails.gemspec

has a bug. It's a NOTICE in 1.9.9 but a FATAL in 1.10.x.

This is our workaround.

no issue #

8 years agoMerge branch '6146-log-squeue-lost-tasks' refs #6146
Tom Clegg [Fri, 5 Jun 2015 17:00:31 +0000 (13:00 -0400)]
Merge branch '6146-log-squeue-lost-tasks' refs #6146

8 years ago6146: Document how --steps really works. Simplify squeue output format and parsing.
Tom Clegg [Thu, 4 Jun 2015 21:09:59 +0000 (17:09 -0400)]
6146: Document how --steps really works. Simplify squeue output format and parsing.

8 years agoMerge branch '6074-collections-index' closes #6074
Tom Clegg [Thu, 4 Jun 2015 19:29:35 +0000 (15:29 -0400)]
Merge branch '6074-collections-index' closes #6074

8 years agoMerge branch 'master' into 6093-refresh-docs
radhika [Thu, 4 Jun 2015 18:10:33 +0000 (14:10 -0400)]
Merge branch 'master' into 6093-refresh-docs

8 years ago6093: Add button-override css to make any buttons added inside the documentation...
radhika [Thu, 4 Jun 2015 18:09:39 +0000 (14:09 -0400)]
6093: Add button-override css to make any buttons added inside the documentation appear unclickable, to avoid confusion.

8 years ago6203: Corrected one dumb switched order of if conditions that caused 5s lag!!
radhika [Thu, 4 Jun 2015 16:51:26 +0000 (12:51 -0400)]
6203: Corrected one dumb switched order of if conditions that caused 5s lag!!

8 years agoMerge branch 'master' into 6203-collection-perf-api
radhika [Thu, 4 Jun 2015 16:16:55 +0000 (12:16 -0400)]
Merge branch 'master' into 6203-collection-perf-api

8 years ago6087: Fix MissingAttribute check, and change it to a debug warning for now
Tom Clegg [Thu, 4 Jun 2015 15:21:43 +0000 (11:21 -0400)]
6087: Fix MissingAttribute check, and change it to a debug warning for now
because it reveals too many bugs. refs #6087

8 years ago6087: Fix MissingAttribute firing for new records during changes_applied. refs #6087
Tom Clegg [Thu, 4 Jun 2015 14:38:28 +0000 (10:38 -0400)]
6087: Fix MissingAttribute firing for new records during changes_applied. refs #6087

8 years agoMerge branch 'master' into 6203-collection-perf-api
radhika [Thu, 4 Jun 2015 14:27:04 +0000 (10:27 -0400)]
Merge branch 'master' into 6203-collection-perf-api

Conflicts:
apps/workbench/test/integration_performance/collection_unit_test.rb
apps/workbench/test/integration_performance/collections_controller_test.rb
services/api/test/integration/collections_performance_test.rb
services/api/test/unit/collection_performance_test.rb

8 years agoMerge branch '6087-collection-timing' closes #6087
Tom Clegg [Thu, 4 Jun 2015 13:37:59 +0000 (09:37 -0400)]
Merge branch '6087-collection-timing' closes #6087

8 years ago6087: If attributes are accessed but not loaded due to select(), raise instead of...
Tom Clegg [Wed, 3 Jun 2015 18:50:03 +0000 (14:50 -0400)]
6087: If attributes are accessed but not loaded due to select(), raise instead of returning nil/{}/[].

8 years ago6087: Remove unneeded CollectionsController#update special case.
Tom Clegg [Wed, 3 Jun 2015 16:05:11 +0000 (12:05 -0400)]
6087: Remove unneeded CollectionsController#update special case.

ArvadosBase#save now covers the general case of omitting unchanged attributes.

8 years ago6087: Reset changed-attrs list after saving. Fix only-send-changed-attrs logic. Add...
Tom Clegg [Tue, 2 Jun 2015 19:23:51 +0000 (15:23 -0400)]
6087: Reset changed-attrs list after saving. Fix only-send-changed-attrs logic. Add tests.

8 years ago6087: Strengthen "manifest_text is not lost in update" test.
Tom Clegg [Wed, 3 Jun 2015 16:00:19 +0000 (12:00 -0400)]
6087: Strengthen "manifest_text is not lost in update" test.

8 years ago6087: Add big-manifest tests, with some finer-grained performance numbers on stderr.
Tom Clegg [Mon, 25 May 2015 17:25:22 +0000 (13:25 -0400)]
6087: Add big-manifest tests, with some finer-grained performance numbers on stderr.

8 years ago6074: Use each instead of find_each, so our order() and limit() constraints are respe...
Tom Clegg [Thu, 4 Jun 2015 04:52:48 +0000 (00:52 -0400)]
6074: Use each instead of find_each, so our order() and limit() constraints are respected.

According to http://apidock.com/rails/ActiveRecord/Batches/find_each,
both order and limit are ignored.

The existing test "max_index_database_read does not interfere with
order" had two fatal bugs that prevented it from catching the
find_each problem:

1. It didn't select the 'name' column, so the 'name' order was
   ignored. But the test passed because the name wasn't returned,
   item['name'] was nil, and... nil !~ /pattern/ is true. Both
   problems are fixed here.

   (This explains why it seemed to find 15 names starting with
   Collection_9, rather than just the 11 that exist (_9 and _90..99).)

2. It tested only that the returned results followed the requested
   order, not that the order was followed when deciding what the limit
   should be. All of the items_available were the same size, so _any_
   order would have set the limit at 15 and passed the test.

The second problem is fixed by adding a separate test.

8 years ago6203: add trailing newline character in munge_manifest_locators method only when...
radhika [Thu, 4 Jun 2015 03:57:58 +0000 (23:57 -0400)]
6203: add trailing newline character in munge_manifest_locators method only when the original manifest ended with one; one of the unit tests did not like it otherwise.

8 years ago6074: Update config docs to match new max_index_database_read behavior.
Tom Clegg [Thu, 4 Jun 2015 03:41:06 +0000 (23:41 -0400)]
6074: Update config docs to match new max_index_database_read behavior.

8 years ago6203: Merge strip_manifest_text and maybe_clear_replication_confirmed into one method...
radhika [Thu, 4 Jun 2015 03:13:39 +0000 (23:13 -0400)]
6203: Merge strip_manifest_text and maybe_clear_replication_confirmed into one method to avoid repeated manifest parsing related expense.

8 years ago6235: Add method to get parameters from API discovery document.
Peter Amstutz [Wed, 3 Jun 2015 20:40:50 +0000 (16:40 -0400)]
6235: Add method to get parameters from API discovery document.

8 years ago6074: Never exceed the configured max_index_database_read (even by one
Tom Clegg [Wed, 3 Jun 2015 20:37:39 +0000 (16:37 -0400)]
6074: Never exceed the configured max_index_database_read (even by one
record) unless necessary to return one row.

Fix copy&paste error in test case.
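
The rule stated in this commit (stay within the byte budget, except that one row is always returned) can be sketched as a small helper. This is an illustration only: the function name and the idea of passing in a list of predicted row sizes are assumptions, not the API server's code.

```python
def rows_within_budget(row_sizes, max_read):
    """Return how many leading rows fit in max_read bytes.

    The total never exceeds max_read, with one exception: if there is at
    least one row, the first row is always included, however large it is.
    """
    total = 0
    count = 0
    for size in row_sizes:
        if count > 0 and total + size > max_read:
            break
        total += size
        count += 1
    return count
```

With a budget of 25 bytes, sizes `[10, 10, 10]` yield 2 rows, while `[100, 10]` yields 1 row because a single oversized row must still be returned.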

8 years ago6074: Clear any existing ActiveRecord select() before adding our own,
Tom Clegg [Wed, 3 Jun 2015 20:28:01 +0000 (16:28 -0400)]
6074: Clear any existing ActiveRecord select() before adding our own,
otherwise we'll read all the big values when we're really just trying
to predict the result size.

8 years ago6074: Speed up db query by using octet_length() instead of length(). closes #6223
Tom Clegg [Wed, 3 Jun 2015 19:39:01 +0000 (15:39 -0400)]
6074: Speed up db query by using octet_length() instead of length(). closes #6223

8 years agoMerge branch 'master' into 6203-collection-perf-api
radhika [Wed, 3 Jun 2015 19:11:20 +0000 (15:11 -0400)]
Merge branch 'master' into 6203-collection-perf-api

8 years ago6203: Do not use Keep::Locator.parse to parse locator in some of the most expensive...
radhika [Wed, 3 Jun 2015 19:08:56 +0000 (15:08 -0400)]
6203: Do not use Keep::Locator.parse to parse locator in some of the most expensive paths.

8 years ago6087: Use HTTPClient's compression feature (instead of adding the
Tom Clegg [Mon, 25 May 2015 14:35:02 +0000 (10:35 -0400)]
6087: Use HTTPClient's compression feature (instead of adding the
Content-Encoding header ourselves). Rename config knob to describe
purpose instead of implementation.

8 years ago6087: Compute portable_data_hash only once during check_signatures.
Tom Clegg [Mon, 25 May 2015 14:23:53 +0000 (10:23 -0400)]
6087: Compute portable_data_hash only once during check_signatures.

8 years ago6087: Use app-configured key by default for blob signing and verification.
Tom Clegg [Mon, 25 May 2015 14:19:31 +0000 (10:19 -0400)]
6087: Use app-configured key by default for blob signing and verification.

8 years agoMerge branch '6146-log-squeue-lost-tasks' refs #6146
Tom Clegg [Wed, 3 Jun 2015 13:50:55 +0000 (09:50 -0400)]
Merge branch '6146-log-squeue-lost-tasks' refs #6146

8 years ago6146: Better log message.
Tom Clegg [Sun, 31 May 2015 09:48:22 +0000 (05:48 -0400)]
6146: Better log message.

8 years ago6146: Use new SLURM_JOB_ID env var instead of old SLURM_JOBID
Tom Clegg [Sun, 31 May 2015 09:32:55 +0000 (05:32 -0400)]
6146: Use new SLURM_JOB_ID env var instead of old SLURM_JOBID

8 years ago6146: Improvements to "kill srun process if slurm task disappears" feature:
Tom Clegg [Sun, 31 May 2015 06:02:19 +0000 (02:02 -0400)]
6146: Improvements to "kill srun process if slurm task disappears" feature:

* Log when we notice a process is orphaned.

* Log when we decide to kill an orphaned process.

* Use `squeue --jobs $SLURM_JOBID` so slurm doesn't have to tell us
  about other jobs' tasks.

* Do not kill a process that is still reporting stderr.

* Do not check `squeue` at all if every process has reported stderr
  since the last squeue check. (In such cases, it seems safe to assume
  no children are hung/dead.)

* Use the same timer/interval (15 seconds) for both noticing and
  killing orphaned processes.

8 years agoMerge branch 'master' into 6203-collection-perf-api
radhika [Wed, 3 Jun 2015 02:26:22 +0000 (22:26 -0400)]
Merge branch 'master' into 6203-collection-perf-api

8 years agorefs #6093
radhika [Wed, 3 Jun 2015 02:19:54 +0000 (22:19 -0400)]
refs #6093
Merge branch '6093-refresh-docs'

8 years ago6093: delete the redundant details in "alternate way to add ssh keys" section.
radhika [Wed, 3 Jun 2015 02:17:30 +0000 (22:17 -0400)]
6093: delete the redundant details in "alternate way to add ssh keys" section.

8 years agoMerge branch 'master' into 6093-refresh-docs
radhika [Wed, 3 Jun 2015 02:08:42 +0000 (22:08 -0400)]
Merge branch 'master' into 6093-refresh-docs

8 years agocloses #5930. Merge branch '5930-smalldocfix'
Nancy Ouyang [Wed, 3 Jun 2015 00:09:20 +0000 (20:09 -0400)]
closes #5930. Merge branch '5930-smalldocfix'

8 years ago5930: fixed as per code review
Nancy Ouyang [Wed, 3 Jun 2015 00:08:16 +0000 (20:08 -0400)]
5930: fixed as per code review

8 years agoMerge branch '6194-python-arvfile-large-write' closes #6194
Peter Amstutz [Tue, 2 Jun 2015 20:23:22 +0000 (16:23 -0400)]
Merge branch '6194-python-arvfile-large-write' closes #6194

8 years ago6194: Simplify test_large_write a little bit.
Peter Amstutz [Tue, 2 Jun 2015 20:17:31 +0000 (16:17 -0400)]
6194: Simplify test_large_write a little bit.

8 years agoMerge branch 'master' into 6093-refresh-docs
radhika [Tue, 2 Jun 2015 19:39:41 +0000 (15:39 -0400)]
Merge branch 'master' into 6093-refresh-docs

8 years ago6194: Make splitting loop simpler since [n:n+KEEP_BLOCK_SIZE] returns a short
Peter Amstutz [Tue, 2 Jun 2015 17:32:19 +0000 (13:32 -0400)]
6194: Make splitting loop simpler since [n:n+KEEP_BLOCK_SIZE] returns a short
slice when there isn't KEEP_BLOCK_SIZE data.  Update test.
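
The simplification relies on Python slicing never raising when the end index runs past the data: the last slice is simply shorter. A minimal sketch, with a deliberately tiny illustrative block size and a hypothetical function name:

```python
KEEP_BLOCK_SIZE = 4  # tiny value for illustration only


def split_blocks(data, block_size=KEEP_BLOCK_SIZE):
    """Split data into consecutive chunks of at most block_size bytes."""
    # data[n:n + block_size] yields a short final slice when fewer than
    # block_size bytes remain, so no special case is needed for the tail.
    return [data[n:n + block_size] for n in range(0, len(data), block_size)]
```

For ten bytes and a block size of four, this produces chunks of 4, 4, and 2 bytes.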

8 years agoMerge branch 'master' into 6087-collection-timing
radhika [Tue, 2 Jun 2015 13:44:08 +0000 (09:44 -0400)]
Merge branch 'master' into 6087-collection-timing

8 years agoMerge branch 'master' into 6194-python-arvfile-large-write
Peter Amstutz [Mon, 1 Jun 2015 20:57:54 +0000 (16:57 -0400)]
Merge branch 'master' into 6194-python-arvfile-large-write

8 years ago6194: Fix test. Lots of small writes break across blocks differently than one huge
Peter Amstutz [Mon, 1 Jun 2015 20:56:15 +0000 (16:56 -0400)]
6194: Fix test.  Lots of small writes break across blocks differently than one huge
one.

8 years agoMerge branch 'master' into 6087-collection-timing
radhika [Mon, 1 Jun 2015 14:05:56 +0000 (10:05 -0400)]
Merge branch 'master' into 6087-collection-timing

8 years ago6194: Fix typo in invocation of writeto() and use memoryview to avoid copying slices.
Peter Amstutz [Mon, 1 Jun 2015 12:51:35 +0000 (08:51 -0400)]
6194: Fix typo in invocation of writeto() and use memoryview to avoid copying slices.
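
Slicing a `bytes` or `bytearray` object copies the selected range, whereas slicing a `memoryview` of it returns a view onto the same buffer. The sketch below shows the general technique; the function name is hypothetical and this is not the ArvadosFile code itself.

```python
def iter_chunks(data, block_size):
    """Yield memoryview slices of data, each at most block_size bytes long."""
    view = memoryview(data)
    for n in range(0, len(view), block_size):
        # Slicing a memoryview does not copy the underlying bytes.
        yield view[n:n + block_size]
```

Each yielded chunk can be passed to APIs that accept bytes-like objects, or converted with `bytes(chunk)` at the point where a real copy is needed.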

8 years agoRemove non-existent migration from structure.sql. refs #3036
Tom Clegg [Sun, 31 May 2015 12:39:20 +0000 (08:39 -0400)]
Remove non-existent migration from structure.sql. refs #3036

8 years agoUpdate example dns_server_update_command. refs #6146
Tom Clegg [Sun, 31 May 2015 12:38:52 +0000 (08:38 -0400)]
Update example dns_server_update_command. refs #6146

8 years agoMerge branch '6146-dns-update-command' refs #6146
Tom Clegg [Sun, 31 May 2015 12:29:51 +0000 (08:29 -0400)]
Merge branch '6146-dns-update-command' refs #6146

8 years ago6146: Add dns_server_update_command. Update docs & tests for DNS update hooks.
Tom Clegg [Sun, 31 May 2015 12:23:44 +0000 (08:23 -0400)]
6146: Add dns_server_update_command. Update docs & tests for DNS update hooks.

8 years agoTell tar to read to EOF (even if it detects trailing NULs).
Tom Clegg [Sat, 30 May 2015 02:01:06 +0000 (22:01 -0400)]
Tell tar to read to EOF (even if it detects trailing NULs).

Avoids SIGPIPE when feeding a tarball made with tar -A.

refs #6146 refs #6094

8 years ago6194: Chunk large ArvadosFile writes automatically instead of raising an error.
Peter Amstutz [Fri, 29 May 2015 20:28:37 +0000 (16:28 -0400)]
6194: Chunk large ArvadosFile writes automatically instead of raising an error.

8 years agoMerge branch '6146-ignore-tar-sigpipe' refs #6146 refs #6094
Tom Clegg [Fri, 29 May 2015 18:04:01 +0000 (14:04 -0400)]
Merge branch '6146-ignore-tar-sigpipe' refs #6146 refs #6094

8 years ago6146: Ignore SIGPIPE while feeding data to tar. Rely on close() retval instead.
Tom Clegg [Fri, 29 May 2015 17:32:40 +0000 (13:32 -0400)]
6146: Ignore SIGPIPE while feeding data to tar. Rely on close() retval instead.
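
The same idea, expressed in Python rather than in crunch-job's own language, looks roughly like this: a broken pipe while writing is not treated as fatal, and the child's exit status is what decides success. The function name is hypothetical, and the sketch assumes a POSIX system, where Python already ignores SIGPIPE and reports it as `BrokenPipeError`.

```python
import subprocess


def feed(cmd, data):
    """Write data to cmd's stdin and return the child's exit status."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        proc.stdin.write(data)
    except BrokenPipeError:
        # The child stopped reading early; let its exit status decide
        # whether that was an error.
        pass
    try:
        proc.stdin.close()
    except BrokenPipeError:
        # Flushing buffered data on close can hit the same condition.
        pass
    return proc.wait()
```

A caller treats a zero return value as success, regardless of whether the child consumed all of the input.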

8 years agoIn install script, log archive hash before running tar. refs #6146
Tom Clegg [Fri, 29 May 2015 16:02:03 +0000 (12:02 -0400)]
In install script, log archive hash before running tar. refs #6146

8 years agoMerge branch '6146-job-runtime-sanity' refs #6146
Tom Clegg [Fri, 29 May 2015 15:31:43 +0000 (11:31 -0400)]
Merge branch '6146-job-runtime-sanity' refs #6146

8 years ago6146: Exit TEMPFAIL early (without failing the job) if worker nodes cannot run a...
Tom Clegg [Thu, 28 May 2015 21:13:44 +0000 (17:13 -0400)]
6146: Exit TEMPFAIL early (without failing the job) if worker nodes cannot run a trivial command.

This is meant to improve the way we handle a couple of edge cases.

1. A worker node doesn't get bootstrapped properly. It works well
enough to persuade nodemanager and the API server that it's alive and
ready to run jobs, but it can't actually run jobs. This means there's
a bug in the bootstrapping process -- its startup script shouldn't
tell slurm State=RESUME without checking itself -- but even so this
doesn't deserve to fail a job: it's definitely a system problem, and
there's zero chance a different job would have gone any differently.

2. A worker node has a hardware problem, or it has fallen off the
network, or something like that, but slurm hasn't yet noticed and set
its state to DOWN, so slurm still uses it to satisfy crunch-dispatch's
"salloc" commands. As above, there's zero chance this could have gone
differently for any other job, so it doesn't make sense to fail the
job.
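
A sanity check of this kind can be sketched as follows. It is an illustration under stated assumptions, not crunch-job's code: the function name and timeout are hypothetical, and 75 is the conventional `EX_TEMPFAIL` value from `sysexits.h`.

```python
import subprocess

EX_TEMPFAIL = 75  # conventional sysexits.h value, assumed here


def node_sanity_status(cmd, timeout=30):
    """Return 0 if a trivial command runs successfully, else EX_TEMPFAIL."""
    try:
        subprocess.run(cmd, check=True, timeout=timeout,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.SubprocessError):
        # Covers a missing executable, a non-zero exit, and a timeout:
        # all are treated as a system problem, not a job failure.
        return EX_TEMPFAIL
    return 0
```

A dispatcher that sees `EX_TEMPFAIL` can retry the job elsewhere instead of marking it failed.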

8 years agoMerge branch 'master' into 6087-collection-timing
radhika [Thu, 28 May 2015 18:53:44 +0000 (14:53 -0400)]
Merge branch 'master' into 6087-collection-timing