Peter Amstutz [Tue, 14 Mar 2017 15:38:46 +0000 (11:38 -0400)]
6520: Add page with sample ping script. Improve instructions on creating
compute node a little bit.
Peter Amstutz [Tue, 14 Mar 2017 15:06:43 +0000 (11:06 -0400)]
6520: Add node_mem_scaling to documentation.
Peter Amstutz [Mon, 20 Feb 2017 21:13:29 +0000 (16:13 -0500)]
6520: Add information about setting up SLURM to crunchv2 documentation.
Peter Amstutz [Thu, 16 Feb 2017 22:20:19 +0000 (17:20 -0500)]
6520: Node manager docs WIP
Peter Amstutz [Thu, 9 Mar 2017 22:44:26 +0000 (17:44 -0500)]
Bugfix: python Collection class sets _portable_data_hash in _populate_from_api_server
refs #10956
Tom Clegg [Thu, 9 Mar 2017 20:28:36 +0000 (15:28 -0500)]
Merge branch '5036-arv-mount-type'
refs #5036
Tom Clegg [Thu, 9 Mar 2017 19:53:51 +0000 (14:53 -0500)]
5036: Add "--subtype foo" flag to set mounted filesystem type to "fuse.foo".
Peter Amstutz [Thu, 9 Mar 2017 19:44:26 +0000 (14:44 -0500)]
Merge branch '11226-discovery-doc-cache' closes #11226
Peter Amstutz [Thu, 9 Mar 2017 18:49:15 +0000 (13:49 -0500)]
11226: Disable google api client discovery doc caching, use only httplib2 caching.
Google API client has its own caching mechanism. By default, it writes to
/tmp/google-api-python-client-discovery-doc.cache, which is a
problem on a multi-user system. Arvados already provides for discovery doc
caching via httplib2.
Peter Amstutz [Wed, 8 Mar 2017 20:33:50 +0000 (15:33 -0500)]
Add ability to use "arvbox start test" to re-run tests without tearing down &
restarting whole container. no issue #
Peter Amstutz [Wed, 8 Mar 2017 20:33:00 +0000 (15:33 -0500)]
Arvbox now uses Go websockets server instead of Puma. no issue #
Tom Clegg [Tue, 7 Mar 2017 20:39:36 +0000 (15:39 -0500)]
Merge branch '3115-keep-disk-create-surprise'
closes #3115
Tom Clegg [Tue, 7 Mar 2017 20:32:59 +0000 (15:32 -0500)]
3115: Create keep_disks implicitly only in #ping action, not #show.
Lucas Di Pentima [Tue, 7 Mar 2017 18:25:44 +0000 (15:25 -0300)]
Merge branch '11139-nodemanager-mem-scale-factor'
Closes #11139
Lucas Di Pentima [Tue, 7 Mar 2017 17:54:22 +0000 (14:54 -0300)]
11139: Merge branch 'master' into 11139-nodemanager-mem-scale-factor
Lucas Di Pentima [Tue, 7 Mar 2017 17:52:59 +0000 (14:52 -0300)]
11139: Added new test to check for non-default values. Updated example config files.
Tom Clegg [Tue, 7 Mar 2017 15:11:38 +0000 (10:11 -0500)]
Merge branch '11166-log-name-collision'
closes #11166
Lucas Di Pentima [Mon, 6 Mar 2017 22:04:35 +0000 (19:04 -0300)]
11139: Added default config parameter 'node_mem_scaling' to be applied to node's ram sticker values. Its default is 95%.
Added test checking the default config and that the value is actually applied to the server calculations.
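The scaling described above can be sketched roughly as follows. This is only an illustration: the helper name, the integer-percent representation, and the megabyte unit are assumptions made for the example and are not taken from the node manager code. Only the 95% default comes from the commit message.

```ruby
# Hypothetical illustration of applying a memory scaling factor to a node's
# advertised ("sticker") RAM. Names and units are assumptions, not the
# node manager's actual implementation.
DEFAULT_NODE_MEM_SCALING_PERCENT = 95 # default stated in the commit message

# Return the RAM (in the same unit as sticker_ram) to assume is usable.
# Integer arithmetic is used so the result is exact and rounds down.
def scaled_ram(sticker_ram, percent = DEFAULT_NODE_MEM_SCALING_PERCENT)
  sticker_ram * percent / 100
end

puts scaled_ram(7680)     # advertised 7680 MB, default scaling
puts scaled_ram(7680, 90) # a non-default scaling value
```

Rounding down is a conservative choice here: it avoids assuming a node has more memory than the scaled figure.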
Tom Clegg [Mon, 6 Mar 2017 19:59:56 +0000 (14:59 -0500)]
11166: Use ensure_unique_name to avoid collisions when saving logs and output.
Tom Clegg [Mon, 6 Mar 2017 15:31:40 +0000 (10:31 -0500)]
Merge branch '11138-docker-load-fail'
closes #11138
Tom Clegg [Fri, 3 Mar 2017 22:05:00 +0000 (17:05 -0500)]
Add missing install step: add keep-balance token to keepstore configs.
No issue #
Tom Clegg [Fri, 3 Mar 2017 21:53:14 +0000 (16:53 -0500)]
Remove obsolete GOMAXPROCS advice.
No issue #
Tom Clegg [Fri, 3 Mar 2017 21:46:50 +0000 (16:46 -0500)]
Merge branch '11168-serialize-json'
closes #11168
Tom Clegg [Fri, 3 Mar 2017 21:41:05 +0000 (16:41 -0500)]
11138: Show actual image ID when checking whether docker image is loaded.
Tom Clegg [Fri, 3 Mar 2017 21:18:10 +0000 (16:18 -0500)]
11168: Double-decode serialized fields if database was mangled by downgraded API server.
Tom Clegg [Fri, 3 Mar 2017 05:52:53 +0000 (00:52 -0500)]
11138: Test for docker image after loading, in case docker-load erroneously proclaimed success.
Tom Clegg [Fri, 3 Mar 2017 05:26:20 +0000 (00:26 -0500)]
11168: Always deep-sort before comparing in where_serialized.
Tom Clegg [Fri, 3 Mar 2017 05:25:57 +0000 (00:25 -0500)]
11168: Remove unused import.
Tom Clegg [Fri, 3 Mar 2017 05:17:43 +0000 (00:17 -0500)]
11168: Revert serialization change in order to avoid breaking job reuse.
Tom Clegg [Wed, 1 Mar 2017 23:07:14 +0000 (18:07 -0500)]
11168: Add missing require.
Tom Clegg [Wed, 1 Mar 2017 22:59:33 +0000 (17:59 -0500)]
11168: Prohibit down-migration to YAML-only codebase.
Tom Clegg [Wed, 1 Mar 2017 21:09:05 +0000 (16:09 -0500)]
11168: Change db serialize from YAML to JSON.
Tom Clegg [Wed, 1 Mar 2017 20:04:34 +0000 (15:04 -0500)]
Merge branch '10764-ws-tests'
closes #10764
Tom Clegg [Wed, 1 Mar 2017 15:58:16 +0000 (10:58 -0500)]
10764: De-duplicate real/test server startup. Add test for broken config.
Tom Clegg [Wed, 1 Mar 2017 07:52:38 +0000 (02:52 -0500)]
10764: Simplify test server shutdown.
Tom Clegg [Wed, 1 Mar 2017 07:36:18 +0000 (02:36 -0500)]
10764: Test v0 session.
Tom Clegg [Tue, 28 Feb 2017 22:08:33 +0000 (17:08 -0500)]
Merge branch '10777-die-if-arv-mount-dies'
closes #10777
radhika [Tue, 28 Feb 2017 18:29:43 +0000 (13:29 -0500)]
closes #11015
Merge branch '11015-crunch-run-output-upload'
radhika [Tue, 28 Feb 2017 18:23:45 +0000 (13:23 -0500)]
Merge branch 'master' into 11015-crunch-run-output-upload
Tom Clegg [Tue, 28 Feb 2017 02:49:42 +0000 (21:49 -0500)]
10764: Permission tests. Support PDH permission check.
Tom Clegg [Mon, 27 Feb 2017 22:51:51 +0000 (17:51 -0500)]
10764: Add unit tests
radhika [Mon, 27 Feb 2017 18:25:50 +0000 (13:25 -0500)]
11015: use multiple writers to increase throughput of goUpload.
Tom Clegg [Mon, 27 Feb 2017 16:54:40 +0000 (11:54 -0500)]
10979: Add missing SqueueChecker initialization.
refs #10979
Lucas Di Pentima [Mon, 27 Feb 2017 13:37:07 +0000 (10:37 -0300)]
Merge branch '11002-arvput-crash-fix'
Closes #11002
Lucas Di Pentima [Mon, 27 Feb 2017 12:42:49 +0000 (09:42 -0300)]
11002: Merge branch 'master' into 11002-arvput-crash-fix
Lucas Di Pentima [Mon, 27 Feb 2017 12:41:17 +0000 (09:41 -0300)]
11002: Added note explaining why we're expecting a SystemExit to catch a SIGINT (KeyboardInterrupt)
Tom Clegg [Sat, 25 Feb 2017 06:56:53 +0000 (01:56 -0500)]
10777: Close and flush logs right away instead of waiting for next tick.
Tom Clegg [Sat, 25 Feb 2017 06:51:20 +0000 (01:51 -0500)]
10777: Stop container if arv-mount dies before container exits.
Tom Clegg [Fri, 24 Feb 2017 22:46:38 +0000 (17:46 -0500)]
run-tests.sh exit non-zero if gofmt fails
No issue #
Tom Clegg [Fri, 24 Feb 2017 21:46:46 +0000 (16:46 -0500)]
10979: Check for orphans only once at startup. Add missing Lock() in
squeue checker. Avoid holding mtx while waiting for API response.
Ensure RunContainer actually gets called in test case.
refs #10979
Ward Vandewege [Fri, 24 Feb 2017 20:56:32 +0000 (15:56 -0500)]
build improvement: really include apps/workbench_functionals when
apps/workbench is specified.
No issue #
Tom Clegg [Fri, 24 Feb 2017 19:43:33 +0000 (14:43 -0500)]
Merge branch '6347-log-timestamps'
closes #6347
Peter Amstutz [Fri, 24 Feb 2017 19:24:18 +0000 (14:24 -0500)]
Merge branch '10629-fuse-listing-perf' closes #10629
Peter Amstutz [Fri, 24 Feb 2017 19:21:52 +0000 (14:21 -0500)]
Merge branch '9277-container-output' closes #9277
Peter Amstutz [Fri, 24 Feb 2017 19:20:15 +0000 (14:20 -0500)]
9277: Add test that setting trashed, unreadable collection is disallowed.
Tom Clegg [Fri, 24 Feb 2017 18:45:12 +0000 (13:45 -0500)]
6347: Use RFC3339Nano to render timestamps loaded from serialized fields.
Psych (YAML) serializes timestamps as ISO8601-with-space-separators,
and safe_load deserializes them to Time even with
whitelist_classes=[].
Psych.dump(Time.now.utc)
=> "--- 2017-02-22 21:33:22.845133778 Z\n...\n"
Psych.safe_load('2017-02-22 21:33:22.845133778 Z').class
=> Time
Psych.safe_load('2017-02-31 21:33:22.845133778 Z').class
=> String
Before:
Psych.safe_load('2017-02-22 21:33:22.845133778 Z').as_json
=> "2017-02-22T21:33:22Z"
After:
Psych.safe_load('2017-02-22 21:33:22.845133778 Z').as_json
=> "2017-02-22T21:33:22.845133778Z"
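The difference between the two renderings can be reproduced with plain Ruby, without Rails or Psych. This is a minimal sketch, not the code from this commit: the helper name rfc3339_nano is hypothetical, and it always prints nine fractional digits, whereas Go's RFC3339Nano layout trims trailing zeros.

```ruby
require 'time'

# Hypothetical helper (not from the Arvados codebase): render a Time in UTC
# as an RFC 3339 string with a fixed nine-digit fractional second.
def rfc3339_nano(t)
  t.utc.strftime('%Y-%m-%dT%H:%M:%S.%NZ')
end

# Build the timestamp from the commit message with exact nanoseconds.
t = Time.utc(2017, 2, 22, 21, 33, 22) + Rational(845_133_778, 1_000_000_000)

puts t.iso8601       # second precision, like the "Before" output
puts t.iso8601(9)    # standard library, nine fractional digits
puts rfc3339_nano(t) # same rendering via strftime
```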
radhika [Fri, 24 Feb 2017 18:17:15 +0000 (13:17 -0500)]
refs #10979
Merge branch '10979-cancelled-job-nodes'
radhika [Fri, 24 Feb 2017 18:16:31 +0000 (13:16 -0500)]
10979: fix failing test
Peter Amstutz [Fri, 24 Feb 2017 16:01:07 +0000 (11:01 -0500)]
9277: arvados-cwl-runner sets "is_trashed" when directly setting output of container.
Peter Amstutz [Fri, 24 Feb 2017 15:53:28 +0000 (10:53 -0500)]
9277: Container output check must be unscoped to include trashed collections.
radhika [Fri, 24 Feb 2017 05:54:47 +0000 (00:54 -0500)]
closes #10979
Merge branch '10979-cancelled-job-nodes'
radhika [Fri, 24 Feb 2017 05:54:24 +0000 (00:54 -0500)]
Merge branch 'master' into 10979-cancelled-job-nodes
radhika [Fri, 24 Feb 2017 05:52:18 +0000 (00:52 -0500)]
10979: ruby way of doing it
Lucas Di Pentima [Thu, 23 Feb 2017 20:56:24 +0000 (17:56 -0300)]
11002: Don't save the state and log the stack trace before quitting upon catching an exception. Also, when receiving SIGINT (KeyboardInterrupt), just quit without any logging.
Updated tests to reflect this new behaviour.
Tom Clegg [Thu, 23 Feb 2017 19:04:42 +0000 (14:04 -0500)]
11156: Fix infinite loop condition.
closes #11156
Tom Clegg [Wed, 22 Feb 2017 21:36:50 +0000 (16:36 -0500)]
Merge branch '7995-keep-balance-docs'
closes #7995
Tom Clegg [Wed, 22 Feb 2017 21:33:35 +0000 (16:33 -0500)]
Remove pidfiles after shutting down test servers.
No issue #
Peter Amstutz [Wed, 22 Feb 2017 21:28:22 +0000 (16:28 -0500)]
10629: Don't flush dirhandles.
Peter Amstutz [Wed, 22 Feb 2017 21:08:41 +0000 (16:08 -0500)]
10629: Make tracking and dirtying of _committed flag efficient.
Peter Amstutz [Wed, 22 Feb 2017 20:40:22 +0000 (20:40 +0000)]
10629: improve debug logging
--debug includes Keep logging.
--logfile includes timestamps.
radhika [Wed, 22 Feb 2017 19:49:37 +0000 (14:49 -0500)]
Merge branch 'master' into 10979-cancelled-job-nodes
radhika [Wed, 22 Feb 2017 19:48:47 +0000 (14:48 -0500)]
10979: refactor squeue invocations
Lucas Di Pentima [Wed, 22 Feb 2017 18:26:16 +0000 (15:26 -0300)]
11002: Track this specific error with its own exception class, for future-proofing.
Tom Clegg [Wed, 22 Feb 2017 18:19:09 +0000 (13:19 -0500)]
Merge branch '11097-reuse-impure'
closes #11097
Tom Clegg [Wed, 22 Feb 2017 16:45:35 +0000 (11:45 -0500)]
7995: Add note about one keep-balance process at a time.
Tom Clegg [Wed, 22 Feb 2017 16:29:48 +0000 (11:29 -0500)]
7995: Fix up dry-run instructions.
Tom Clegg [Wed, 22 Feb 2017 16:24:13 +0000 (11:24 -0500)]
7995: Fix up inconsistent "e.g." vs. "e.g.,".
Tom Clegg [Wed, 22 Feb 2017 16:02:32 +0000 (11:02 -0500)]
7995: Copy edits.
Tom Clegg [Wed, 22 Feb 2017 15:29:31 +0000 (10:29 -0500)]
Fix dispatch panic when processing an update after tracker has been closed/deleted.
refs #11151
Tom Morris [Tue, 21 Feb 2017 21:35:00 +0000 (16:35 -0500)]
A few copy edits
Tom Morris [Tue, 21 Feb 2017 21:34:07 +0000 (16:34 -0500)]
Document epydoc dependency
Tom Clegg [Tue, 21 Feb 2017 20:27:54 +0000 (15:27 -0500)]
11097: Clarify reuse query.
Tom Clegg [Tue, 21 Feb 2017 20:27:37 +0000 (15:27 -0500)]
11097: Update docs to reflect new container reuse behavior.
radhika [Tue, 21 Feb 2017 18:29:58 +0000 (13:29 -0500)]
10979: scancel orphaned job nodes in crunch1.
Lucas Di Pentima [Tue, 21 Feb 2017 17:41:58 +0000 (14:41 -0300)]
11002: Do not try to save internal state when receiving a KeyboardInterrupt exception.
Updated test accordingly.
Lucas Di Pentima [Tue, 21 Feb 2017 16:22:07 +0000 (13:22 -0300)]
11002: Merge branch 'master' into 11002-arvput-crash-fix
Lucas Di Pentima [Tue, 21 Feb 2017 13:00:19 +0000 (10:00 -0300)]
11002: When trying to save the cache's state before quitting, if an exception
is caught because of a BlockManager problem induced by an interruption,
print a warning message and quit without saving the last checkpoint.
Lucas Di Pentima [Tue, 21 Feb 2017 12:54:07 +0000 (09:54 -0300)]
11002: Added missing assertion to test.
Lucas Di Pentima [Tue, 21 Feb 2017 11:41:07 +0000 (08:41 -0300)]
11002: Improved test mocking a more suitable method and catching the specific exception type.
Tom Clegg [Mon, 20 Feb 2017 22:03:10 +0000 (17:03 -0500)]
7995: Add "dry run" note.
Peter Amstutz [Mon, 20 Feb 2017 21:18:44 +0000 (16:18 -0500)]
Merge branch '6520-pending-reason' refs #6520
Peter Amstutz [Mon, 20 Feb 2017 18:54:47 +0000 (13:54 -0500)]
6520: Add ReqNodeNotAvail to list of reasons (along with "Resources") to boot a new node.
Tom Clegg [Mon, 20 Feb 2017 20:45:40 +0000 (15:45 -0500)]
11097: Update test to match new behavior.
Tom Clegg [Mon, 20 Feb 2017 20:34:15 +0000 (15:34 -0500)]
11097: Merge branch 'master' into 11097-reuse-impure
Lucas Di Pentima [Mon, 20 Feb 2017 20:25:05 +0000 (17:25 -0300)]
11002: Added test to make the bug happen.
Tom Clegg [Mon, 20 Feb 2017 15:41:52 +0000 (10:41 -0500)]
7995: Add keep-balance to install guide.
Tom Clegg [Fri, 17 Feb 2017 22:32:53 +0000 (17:32 -0500)]
Merge branch '11127-delete-trash-with-links'
refs #11127
Lucas Di Pentima [Fri, 17 Feb 2017 22:18:54 +0000 (19:18 -0300)]
Merge branch '11121-crunch-output-collection-owner'
Closes #11121
Lucas Di Pentima [Fri, 17 Feb 2017 22:17:40 +0000 (19:17 -0300)]
11121: Merge branch 'master' into 11121-crunch-output-collection-owner
Tom Clegg [Fri, 17 Feb 2017 21:17:26 +0000 (16:17 -0500)]
11127: Delete dependent links too when emptying trash.