mishaz [Tue, 9 Jun 2015 18:00:30 +0000 (18:00 +0000)]
Merge branch 'master' into 3408-production-datamanager
mishaz [Tue, 9 Jun 2015 17:51:31 +0000 (17:51 +0000)]
Renamed BlockToReplication to BlockToDesiredReplication.
Added protocol field to servers in pull list.
Nico Cesar [Tue, 9 Jun 2015 15:32:25 +0000 (11:32 -0400)]
openssl self-signed cert creation failed because the "tmp" directory was missing.
api server failed to start because the "tmp/logs" and "tmp/api" directories were missing.
no issue #
Peter Amstutz [Tue, 9 Jun 2015 15:22:52 +0000 (11:22 -0400)]
Merge branch '6235-go-sdk-discovery' closes #6235
Brett Smith [Tue, 9 Jun 2015 14:48:36 +0000 (10:48 -0400)]
Merge branch '5790-copy-most-recent-docker-image-wip'
Closes #5790, #6103.
Brett Smith [Wed, 3 Jun 2015 20:59:24 +0000 (16:59 -0400)]
5790: Improve Docker image listing in Python SDK.
* Always fetch all relevant Docker links.
* Support finding images by image hash.
* Show image hashes when listing images by name.
* Like Docker itself, when an image has multiple names and we're not
filtering by name, list each one.
* Better match the API server's priority logic:
* Ignore links to collections that aren't found.
* Links with an image_timestamp always have priority over those that
don't, regardless of their respective created_at timestamps.
The main motivation for this change is to make sure arv-copy gets the
right Docker image when copying a pipeline template recursively.
This implementation goes through some trouble to parse timestamps out
of each Docker link only once.
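The priority rule above can be sketched as a sort key. This is only an illustration, not the SDK's code: the dict layout and the `link_priority` helper are hypothetical, timestamps are assumed to be ISO 8601 strings that compare correctly as text, and ordering by `image_timestamp` within the first tier is an assumption.

```python
def link_priority(link):
    """Sort key: links with image_timestamp outrank links without one."""
    ts = link.get("image_timestamp")
    if ts is not None:
        # Tier 1: has image_timestamp, so created_at cannot demote it.
        return (1, ts)
    # Tier 0: no image_timestamp, fall back to created_at.
    return (0, link["created_at"])


# Hypothetical link records, for illustration only.
links = [
    {"name": "a", "created_at": "2015-06-03T00:00:00Z"},
    {"name": "b", "created_at": "2015-01-01T00:00:00Z",
     "image_timestamp": "2014-12-31T00:00:00Z"},
]

# The key function runs once per link, so each timestamp is read once.
best = max(links, key=link_priority)
```

Here `best` is link "b": it has an `image_timestamp`, so it wins even though link "a" has the newer `created_at`.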
Brett Smith [Tue, 9 Jun 2015 14:26:44 +0000 (10:26 -0400)]
Merge branch '6152-compute-node-no-arv-wip'
Closes #6152, #6256.
Brett Smith [Fri, 5 Jun 2015 19:46:08 +0000 (15:46 -0400)]
6152: Use Python SDK for compute node installation.
This eliminates an otherwise-needless dependency on `arv` and the
entire Ruby stack.
radhika [Tue, 9 Jun 2015 14:25:59 +0000 (10:25 -0400)]
closes #6093
Merge branch '6093-refresh-docs'
radhika [Tue, 9 Jun 2015 14:21:38 +0000 (10:21 -0400)]
6093: remove "Alternate way to add SSH keys" and add the "Manage account" link blurb to "Adding your keys" section itself.
radhika [Tue, 9 Jun 2015 14:09:07 +0000 (10:09 -0400)]
Merge branch 'master' into 6093-refresh-docs
Peter Amstutz [Mon, 8 Jun 2015 13:36:33 +0000 (09:36 -0400)]
6235: Discovery() returns the requested value directly instead of a single-entry map.
Nico Cesar [Fri, 5 Jun 2015 18:54:36 +0000 (14:54 -0400)]
bundler 1.10 breaks the workbench build because
https://github.com/lucasefe/themes_for_rails/blob/master/themes_for_rails.gemspec
has a bug. It's a NOTICE in 1.9.9 but it's a FATAL in 1.10.x.
This is our workaround.
no issue #
Tom Clegg [Fri, 5 Jun 2015 17:00:31 +0000 (13:00 -0400)]
Merge branch '6146-log-squeue-lost-tasks' refs #6146
mishaz [Thu, 4 Jun 2015 21:11:01 +0000 (21:11 +0000)]
Added size to block locators, touching most of the code.
Moved BlockLocator (and associated parsing code) from sdk/go/manifest to sdk/go/blockdigest.
Added some helper methods, some for testing.
Some gofmt cleanup.
Tom Clegg [Thu, 4 Jun 2015 21:09:59 +0000 (17:09 -0400)]
6146: Document how --steps really works. Simplify squeue output format and parsing.
Tom Clegg [Thu, 4 Jun 2015 19:29:35 +0000 (15:29 -0400)]
Merge branch '6074-collections-index' closes #6074
radhika [Thu, 4 Jun 2015 18:10:33 +0000 (14:10 -0400)]
Merge branch 'master' into 6093-refresh-docs
radhika [Thu, 4 Jun 2015 18:09:39 +0000 (14:09 -0400)]
6093: Add button-override css so that any buttons added inside the documentation appear unclickable, to avoid confusion.
Tom Clegg [Thu, 4 Jun 2015 15:21:43 +0000 (11:21 -0400)]
6087: Fix MissingAttribute check, and change it to a debug warning for now
because it reveals too many bugs. refs #6087
Tom Clegg [Thu, 4 Jun 2015 14:38:28 +0000 (10:38 -0400)]
6087: Fix MissingAttribute firing for new records during changes_applied. refs #6087
Tom Clegg [Thu, 4 Jun 2015 13:37:59 +0000 (09:37 -0400)]
Merge branch '6087-collection-timing' closes #6087
Tom Clegg [Wed, 3 Jun 2015 18:50:03 +0000 (14:50 -0400)]
6087: If attributes are accessed but not loaded due to select(), raise instead of returning nil/{}/[].
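The idea of raising instead of returning an empty value can be sketched in Python. This is a hypothetical stand-in, not the Workbench code: `PartialRecord` and its constructor are made up for illustration, and only the `MissingAttribute` name comes from the commit messages.

```python
class MissingAttribute(Exception):
    """Raised when code reads an attribute that was never loaded."""


class PartialRecord:
    """Hypothetical record holding only the attributes that were selected."""

    def __init__(self, loaded):
        self._loaded = dict(loaded)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self.__dict__["_loaded"][name]
        except KeyError:
            raise MissingAttribute(name)


record = PartialRecord({"uuid": "example-uuid", "description": None})
```

With this sketch, `record.description` returns None because it was loaded with that value, while `record.manifest_text` raises MissingAttribute rather than silently looking empty.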
Tom Clegg [Wed, 3 Jun 2015 16:05:11 +0000 (12:05 -0400)]
6087: Remove unneeded CollectionsController#update special case.
ArvadosBase#save now covers the general case of omitting unchanged attributes.
Tom Clegg [Tue, 2 Jun 2015 19:23:51 +0000 (15:23 -0400)]
6087: Reset changed-attrs list after saving. Fix only-send-changed-attrs logic. Add tests.
Tom Clegg [Wed, 3 Jun 2015 16:00:19 +0000 (12:00 -0400)]
6087: Strengthen "manifest_text is not lost in update" test.
Tom Clegg [Mon, 25 May 2015 17:25:22 +0000 (13:25 -0400)]
6087: Add big-manifest tests, with some finer-grained performance numbers on stderr.
Tom Clegg [Thu, 4 Jun 2015 04:52:48 +0000 (00:52 -0400)]
6074: Use each instead of find_each, so our order() and limit() constraints are respected.
According to http://apidock.com/rails/ActiveRecord/Batches/find_each,
both order and limit are ignored.
The existing test "max_index_database_read does not interfere with
order" had two fatal bugs that prevented it from catching the
find_each problem:
1. It didn't select the 'name' column, so the 'name' order was
ignored. But the test passed because the name wasn't returned,
item['name'] was nil, and... nil !~ /pattern/ is true. Both
problems are fixed here.
(This explains why it seemed to find 15 names starting with
Collection_9, rather than just the 11 that exist (_9 and _90..99).)
2. It tested only that the returned results followed the requested
order, not that the order was followed when deciding what the limit
should be. All of the items_available were the same size, so _any_
order would have set the limit at 15 and passed the test.
The second problem is fixed by adding a separate test.
Tom Clegg [Thu, 4 Jun 2015 03:41:06 +0000 (23:41 -0400)]
6074: Update config docs to match new max_index_database_read behavior.
Peter Amstutz [Wed, 3 Jun 2015 20:40:50 +0000 (16:40 -0400)]
6235: Add method to get parameters from API discovery document.
Tom Clegg [Wed, 3 Jun 2015 20:37:39 +0000 (16:37 -0400)]
6074: Never exceed the configured max_index_database_read (even by one
record) unless necessary to return one row.
Fix copy&paste error in test case.
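The limit rule above can be sketched as follows. This is a minimal illustration under assumptions, not the API server's code: `rows_within_budget` is a hypothetical helper, and `sizes` stands for the predicted size in bytes of each row, in the requested order.

```python
def rows_within_budget(sizes, max_bytes):
    """Return how many leading rows fit within max_bytes.

    The budget is never exceeded, except that the first row is always
    included so that a non-empty result can return at least one row.
    """
    total = 0
    count = 0
    for size in sizes:
        if count > 0 and total + size > max_bytes:
            break
        total += size
        count += 1
    return count
```

Because the count is taken over rows in order, the same rows can yield different limits: sizes `[1, 1, 100]` with a budget of 10 give 2 rows, while `[100, 1, 1]` gives 1. A test where every row has the same size cannot detect an ignored order.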
Tom Clegg [Wed, 3 Jun 2015 20:28:01 +0000 (16:28 -0400)]
6074: Clear any existing ActiveRecord select() before adding our own,
otherwise we'll read all the big values when we're really just trying
to predict the result size.
Tom Clegg [Wed, 3 Jun 2015 19:39:01 +0000 (15:39 -0400)]
6074: Speed up db query by using octet_length() instead of length(). closes #6223
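For reference, PostgreSQL's length() counts characters while octet_length() counts bytes. The same distinction in Python, as an analogy only (this does not show the database query itself, and the sample string is arbitrary):

```python
text = "na\u00efve caf\u00e9"  # contains two non-ASCII characters

char_count = len(text)                  # characters, like length()
byte_count = len(text.encode("utf-8"))  # bytes, like octet_length()
```

The two counts agree for pure ASCII text and diverge as soon as a multi-byte character appears.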
Tom Clegg [Mon, 25 May 2015 14:35:02 +0000 (10:35 -0400)]
6087: Use HTTPClient's compression feature (instead of adding the
Content-Encoding header ourselves). Rename config knob to describe
purpose instead of implementation.
Tom Clegg [Mon, 25 May 2015 14:23:53 +0000 (10:23 -0400)]
6087: Compute portable_data_hash only once during check_signatures.
Tom Clegg [Mon, 25 May 2015 14:19:31 +0000 (10:19 -0400)]
6087: Use app-configured key by default for blob signing and verification.
Tom Clegg [Wed, 3 Jun 2015 13:50:55 +0000 (09:50 -0400)]
Merge branch '6146-log-squeue-lost-tasks' refs #6146
Tom Clegg [Sun, 31 May 2015 09:48:22 +0000 (05:48 -0400)]
6146: Better log message.
Tom Clegg [Sun, 31 May 2015 09:32:55 +0000 (05:32 -0400)]
6146: Use new SLURM_JOB_ID env var instead of old SLURM_JOBID
Tom Clegg [Sun, 31 May 2015 06:02:19 +0000 (02:02 -0400)]
6146: Improvements to "kill srun process if slurm task disappears" feature:
* Log when we notice a process is orphaned.
* Log when we decide to kill an orphaned process.
* Use `squeue --jobs $SLURM_JOBID` so slurm doesn't have to tell us
about other jobs' tasks.
* Do not kill a process that is still reporting stderr.
* Do not check `squeue` at all if every process has reported stderr
since the last squeue check. (In such cases, it seems safe to assume
no children are hung/dead.)
* Use the same timer/interval (15 seconds) for both noticing and
killing orphaned processes.
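The decision rules listed above can be sketched roughly like this. It is an illustration under assumptions, not the crunch-job code: both helpers are hypothetical, tasks are keyed by an arbitrary id, times are plain numbers, and "still reporting stderr" is taken to mean "has reported stderr since the last squeue check".

```python
def need_squeue_check(last_stderr_at, last_check_at):
    """Return False when every task has reported stderr since the last check."""
    return any(t <= last_check_at for t in last_stderr_at.values())


def orphans(last_stderr_at, squeue_task_ids, last_check_at):
    """Tasks missing from squeue output that have also gone silent."""
    return sorted(
        task for task, t in last_stderr_at.items()
        if task not in squeue_task_ids and t <= last_check_at
    )
```

A task that is absent from squeue but has written to stderr since the last check is left alone.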
radhika [Wed, 3 Jun 2015 02:19:54 +0000 (22:19 -0400)]
refs #6093
Merge branch '6093-refresh-docs'
radhika [Wed, 3 Jun 2015 02:17:30 +0000 (22:17 -0400)]
6093: delete the redundant details in "alternate way to add ssh keys" section.
radhika [Wed, 3 Jun 2015 02:08:42 +0000 (22:08 -0400)]
Merge branch 'master' into 6093-refresh-docs
Nancy Ouyang [Wed, 3 Jun 2015 00:09:20 +0000 (20:09 -0400)]
closes #5930. Merge branch '5930-smalldocfix'
Nancy Ouyang [Wed, 3 Jun 2015 00:08:16 +0000 (20:08 -0400)]
5930: fixed as per code review
mishaz [Tue, 2 Jun 2015 22:42:59 +0000 (22:42 +0000)]
Added string to error message to help with debugging.
Peter Amstutz [Tue, 2 Jun 2015 20:23:22 +0000 (16:23 -0400)]
Merge branch '6194-python-arvfile-large-write' closes #6194
Peter Amstutz [Tue, 2 Jun 2015 20:17:31 +0000 (16:17 -0400)]
6194: Simplify test_large_write a little bit.
radhika [Tue, 2 Jun 2015 19:39:41 +0000 (15:39 -0400)]
Merge branch 'master' into 6093-refresh-docs
Peter Amstutz [Tue, 2 Jun 2015 17:32:19 +0000 (13:32 -0400)]
6194: Make splitting loop simpler since [n:n+KEEP_BLOCK_SIZE] returns a short
slice when there isn't KEEP_BLOCK_SIZE data. Update test.
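The slicing behaviour this relies on can be shown with a small sketch. `split_blocks` and the tiny `BLOCK_SIZE` are hypothetical stand-ins for illustration, not the SDK's code or the real KEEP_BLOCK_SIZE value.

```python
BLOCK_SIZE = 4  # hypothetical tiny stand-in for KEEP_BLOCK_SIZE


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split data into block_size pieces; the last piece may be shorter."""
    # data[n:n + block_size] simply stops at the end of data, so the
    # final short block needs no special case.
    return [data[n:n + block_size] for n in range(0, len(data), block_size)]


chunks = split_blocks(b"abcdefghij")
```

Here `chunks` is `[b"abcd", b"efgh", b"ij"]`.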
mishaz [Tue, 2 Jun 2015 00:26:37 +0000 (00:26 +0000)]
Changes in response to code review.
Created DataFetcher type so that we can abstract whether our data is read from remote servers or local files.
Moved code for reading from remote servers into BuildDataFetcher().
Split summary.MaybeReadData() into ShouldReadData() and ReadData().
Started paying attention to new writable flag for keep servers. Required reworking CreatePullServers somewhat.
Cleaned up code for adding pull list to new destination.
Updated tests.
Peter Amstutz [Mon, 1 Jun 2015 20:57:54 +0000 (16:57 -0400)]
Merge branch 'master' into 6194-python-arvfile-large-write
Peter Amstutz [Mon, 1 Jun 2015 20:56:15 +0000 (16:56 -0400)]
6194: Fix test. Lots of small writes break across blocks differently than one huge
one.
Peter Amstutz [Mon, 1 Jun 2015 12:51:35 +0000 (08:51 -0400)]
6194: Fix typo in invocation of writeto() and use memoryview to avoid copying slices.
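As a reminder of why memoryview helps here, a small sketch (the variable names are arbitrary and this is not the SDK's code): slicing a bytes object creates a new object holding a copy of the data, while slicing a memoryview gives a window onto the same buffer.

```python
data = bytes(range(10))

copied = data[2:6]        # new bytes object holding a copy of 4 bytes
view = memoryview(data)
window = view[2:6]        # no bytes copied; still backed by data

same_buffer = window.obj is data
same_content = window.tobytes() == copied
```

Both `same_buffer` and `same_content` are True: the window exposes the same four bytes without duplicating them until `tobytes()` is called.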
Tom Clegg [Sun, 31 May 2015 12:39:20 +0000 (08:39 -0400)]
Remove non-existent migration from structure.sql. refs #3036
Tom Clegg [Sun, 31 May 2015 12:38:52 +0000 (08:38 -0400)]
Update example dns_server_update_command. refs #6146
Tom Clegg [Sun, 31 May 2015 12:29:51 +0000 (08:29 -0400)]
Merge branch '6146-dns-update-command' refs #6146
Tom Clegg [Sun, 31 May 2015 12:23:44 +0000 (08:23 -0400)]
6146: Add dns_server_update_command. Update docs & tests for DNS update hooks.
Tom Clegg [Sat, 30 May 2015 02:01:06 +0000 (22:01 -0400)]
Tell tar to read to EOF (even if it detects trailing NULs).
Avoids SIGPIPE when feeding a tarball made with tar -A.
refs #6146 refs #6094
Peter Amstutz [Fri, 29 May 2015 20:28:37 +0000 (16:28 -0400)]
6194: Chunk large ArvadosFile writes automatically instead of raising an error.
Tom Clegg [Fri, 29 May 2015 18:04:01 +0000 (14:04 -0400)]
Merge branch '6146-ignore-tar-sigpipe' refs #6146 refs #6094
Tom Clegg [Fri, 29 May 2015 17:32:40 +0000 (13:32 -0400)]
6146: Ignore SIGPIPE while feeding data to tar. Rely on close() retval instead.
Tom Clegg [Fri, 29 May 2015 16:02:03 +0000 (12:02 -0400)]
In install script, log archive hash before running tar. refs #6146
Tom Clegg [Fri, 29 May 2015 15:31:43 +0000 (11:31 -0400)]
Merge branch '6146-job-runtime-sanity' refs #6146
Tom Clegg [Thu, 28 May 2015 21:13:44 +0000 (17:13 -0400)]
6146: Exit TEMPFAIL early (without failing the job) if worker nodes cannot run a trivial command.
This is meant to improve the way we handle a couple of edge cases.
1. A worker node doesn't get bootstrapped properly. It works well
enough to persuade nodemanager and the API server that it's alive and
ready to run jobs, but it can't actually run jobs. This means there's
a bug in the bootstrapping process -- its startup script shouldn't
tell slurm State=RESUME without checking itself -- but even so this
doesn't deserve to fail a job: it's definitely a system problem,
there's zero chance a different job would have gone any differently.
2. A worker node has a hardware problem, or it has fallen off the
network, or something like that, but slurm hasn't yet noticed and set
its state to DOWN, so slurm still uses it to satisfy crunch-dispatch's
"salloc" commands. As above, there's zero chance this could have gone
differently for any other job, so it doesn't make sense to fail the
job.
Tom Clegg [Wed, 27 May 2015 20:10:24 +0000 (16:10 -0400)]
Merge branch '6146-retry-install' refs #6146
Tom Clegg [Wed, 27 May 2015 19:48:54 +0000 (15:48 -0400)]
6146: Retry install (max 3 attempts) if install script fails with no error messages.
Also: if install fails, croak() instead of exit(1) so we still get a log file.
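The retry policy can be sketched as below. This is a hypothetical illustration, not the crunch-job code: `install_with_retry` is made up, `run_install` is an assumed callable returning `(ok, error_output)`, and treating a failure that does produce error output as final is an assumption drawn from the commit subject.

```python
def install_with_retry(run_install, max_attempts=3):
    """Run run_install() until it succeeds; return the attempt number.

    Only silent failures (no error output) are retried.
    """
    for attempt in range(1, max_attempts + 1):
        ok, errors = run_install()
        if ok:
            return attempt
        if errors:
            # The failure explained itself, so retrying is unlikely to help.
            raise RuntimeError("install failed: " + errors)
    raise RuntimeError(
        "install failed %d times with no error output" % max_attempts
    )
```

Raising instead of exiting immediately mirrors the second point of the commit in spirit: the caller still gets a chance to record what happened.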
radhika [Wed, 27 May 2015 19:38:59 +0000 (15:38 -0400)]
Merge branch 'master' into 6093-refresh-docs
Conflicts:
doc/user/tutorials/tutorial-submit-job.html.textile.liquid
radhika [Wed, 27 May 2015 19:27:36 +0000 (15:27 -0400)]
closes #6057
Merge branch '6057-public-projects-page'
radhika [Wed, 27 May 2015 19:26:20 +0000 (15:26 -0400)]
6057: few more minor tweaks
Peter Amstutz [Wed, 27 May 2015 19:11:58 +0000 (15:11 -0400)]
Merge branch '6141-doc-workbench-links' refs #6141
radhika [Wed, 27 May 2015 19:03:54 +0000 (15:03 -0400)]
Merge branch 'master' into 6057-public-projects-page
Peter Amstutz [Wed, 27 May 2015 18:23:03 +0000 (14:23 -0400)]
Merge branch '6090-docker-use-local-sso' closes #6138
Ward Vandewege [Wed, 27 May 2015 16:42:44 +0000 (12:42 -0400)]
Add GPG key for RVM installation in the doc.
No issue #
Tom Clegg [Wed, 27 May 2015 13:06:10 +0000 (09:06 -0400)]
Merge branch '6098-full-text-index' refs #6098
Tom Clegg [Wed, 27 May 2015 13:05:40 +0000 (09:05 -0400)]
6098: Recreate full text indexes with leading spaces, to persuade Postgres to actually use them.
radhika [Tue, 26 May 2015 22:58:18 +0000 (18:58 -0400)]
Merge branch 'master' into 6057-public-projects-page
radhika [Tue, 26 May 2015 22:54:13 +0000 (18:54 -0400)]
6057: if /projects/public is accessed when anonymous config is not enabled, show 404.
radhika [Tue, 26 May 2015 20:04:41 +0000 (16:04 -0400)]
Merge branch 'master' into 6093-refresh-docs
radhika [Tue, 26 May 2015 19:57:11 +0000 (15:57 -0400)]
6093: one more
radhika [Tue, 26 May 2015 19:51:44 +0000 (15:51 -0400)]
6093: a few more updates
radhika [Tue, 26 May 2015 15:56:29 +0000 (11:56 -0400)]
6093: some more doc updates.
Tom Clegg [Tue, 26 May 2015 14:31:56 +0000 (10:31 -0400)]
Merge branch '6094-install-script-sigpipe' refs #6094
Tom Clegg [Tue, 26 May 2015 14:18:38 +0000 (10:18 -0400)]
6094: Propagate install script stderr+stdout to job log.
Brett Smith [Tue, 26 May 2015 00:32:19 +0000 (20:32 -0400)]
Merge branch '6095-arv-copy-preserve-object-order-wip'
Closes #6095, #6117.
Brett Smith [Fri, 22 May 2015 21:10:37 +0000 (17:10 -0400)]
6095: arv-copy preserves order of copied JSON.
This means arv-copy no longer loses the order of pipeline template
components, which makes for a nicer presentation in Workbench.
Other Python clients that would like to preserve the order of JSON
responses can use OrderedJsonModel the same way.
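The general technique for keeping JSON key order in Python uses only the standard library. The sketch below shows that technique, not OrderedJsonModel's actual implementation, and the sample document is made up.

```python
import json
from collections import OrderedDict

# Hypothetical response body with components in a meaningful order.
raw = '{"components": {"step_b": {}, "step_a": {}, "step_c": {}}}'

# object_pairs_hook receives the key/value pairs in document order.
doc = json.loads(raw, object_pairs_hook=OrderedDict)
names = list(doc["components"])
```

Here `names` is `["step_b", "step_a", "step_c"]`, matching the order in the source text. On Python versions where plain dicts do not preserve insertion order, the hook is what keeps the components from being reshuffled.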
radhika [Mon, 25 May 2015 22:36:49 +0000 (18:36 -0400)]
6093: doc updates
Tom Clegg [Fri, 22 May 2015 21:42:17 +0000 (17:42 -0400)]
Merge branch '6094-install-script-sigpipe' (early part) refs #6094
radhika [Fri, 22 May 2015 20:40:59 +0000 (16:40 -0400)]
Merge branch 'master' into 6057-public-projects-page
radhika [Fri, 22 May 2015 20:39:57 +0000 (16:39 -0400)]
6057: add projects/public page, which lists publicly accessible projects.
Peter Amstutz [Fri, 22 May 2015 20:13:22 +0000 (16:13 -0400)]
6141: Remove hard-coded "https://" from "https://{{site.arvados_workbench_host}}" and require that arvados_workbench_host include the url scheme instead.
Tom Clegg [Fri, 22 May 2015 19:40:29 +0000 (15:40 -0400)]
6094: Consider arvados_sdk_version (not just script_version) when
deciding there's no need to extract or install anything.
Tom Clegg [Fri, 22 May 2015 19:39:00 +0000 (15:39 -0400)]
6094: Avoid SIGPIPE by consuming DATA section even when it's not needed.
Peter Amstutz [Fri, 22 May 2015 19:32:47 +0000 (15:32 -0400)]
6138: Added --domain to set ARVADOS_DOMAIN. Removed useless comments in apache2_vhost that mentioned qr1hi.
Ward Vandewege [Fri, 22 May 2015 14:03:01 +0000 (10:03 -0400)]
Add installation instructions for compute nodes; update the installation
instructions for crunch dispatcher.
No issue #
Tom Clegg [Thu, 21 May 2015 21:15:06 +0000 (17:15 -0400)]
Merge branch '6087-collection-timing' (early part) refs #6087 refs #6092
Tom Clegg [Thu, 21 May 2015 20:51:52 +0000 (16:51 -0400)]
6087: Get database time only once per manifest-signing/verifying event, rather than once per locator.
Brett Smith [Thu, 21 May 2015 20:06:16 +0000 (16:06 -0400)]
Update tutorial pipeline page to match new definition.
No issue #.
Peter Amstutz [Thu, 21 May 2015 19:00:10 +0000 (15:00 -0400)]
6090: Docker install uses local SSO server instead of auth.curoverse.com. Also
clean up references to dev.arvados to use @@ARVADOS_DOMAIN@@.
Brett Smith [Thu, 21 May 2015 18:23:46 +0000 (14:23 -0400)]
Update tutorial pipeline template definition.
This helps it run out of the box again.
No issue #.