Peter Amstutz [Fri, 17 Mar 2017 17:27:03 +0000 (13:27 -0400)]
11288: Slurm requires reason to put node in DOWN state.
Lucas Di Pentima [Fri, 17 Mar 2017 14:44:20 +0000 (11:44 -0300)]
Merge branch '11014-hide-node-status'
Closes #11014
Lucas Di Pentima [Fri, 17 Mar 2017 14:16:05 +0000 (11:16 -0300)]
11014: When the PipelineInstance API and the show_recent_collections_on_dashboard configuration are both off, the "Recent processes" panel now takes the full screen width.
Corrected test name.
Avoid calling PipelineInstance.api_exist?(:index) more than once.
Peter Amstutz [Thu, 16 Mar 2017 20:52:11 +0000 (16:52 -0400)]
Merge branch '11254-nodemanager-no-actor' closes #11254
Peter Amstutz [Thu, 16 Mar 2017 20:49:32 +0000 (16:49 -0400)]
11254: Refactor _node_states
Peter Amstutz [Thu, 16 Mar 2017 20:11:00 +0000 (16:11 -0400)]
11254: Cloud nodes where "actor is None" are considered to be in shutdown. The
only time it should be "None" is the period between a successful shutdown and
when the node disappears from the cloud node list.
Lucas Di Pentima [Thu, 16 Mar 2017 19:16:34 +0000 (16:16 -0300)]
10218: Use a []string for the entire command instead of splitting it.
Lucas Di Pentima [Thu, 16 Mar 2017 18:18:23 +0000 (15:18 -0300)]
11014: Check that the PipelineIndex#index API exists before rendering the compute node status pane on the Dashboard.
Added related test.
Tom Clegg [Thu, 16 Mar 2017 17:24:58 +0000 (13:24 -0400)]
10218: Wait for container to be started (not just created) before trying to cancel it.
Lucas Di Pentima [Thu, 16 Mar 2017 14:01:19 +0000 (11:01 -0300)]
10218: Split multi-line command output so that each line is written to the logs independently.
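The per-line logging described in 10218 can be pictured with a minimal Python sketch. This is not the project's code: the names `log_output_lines` and `ListHandler` are hypothetical and only illustrate the idea of emitting one log record per output line.

```python
import logging


def log_output_lines(logger, output):
    """Write each line of a command's output as its own log record."""
    lines = output.splitlines()
    for line in lines:
        logger.info(line)
    return lines


class ListHandler(logging.Handler):
    """Hypothetical helper: collect log messages in memory for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())
```

Attaching a `ListHandler` to a logger and passing a two-line string to `log_output_lines` should leave two separate messages in `messages`, rather than one message containing an embedded newline.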
Lucas Di Pentima [Wed, 15 Mar 2017 22:21:10 +0000 (19:21 -0300)]
10218: Log node information (cpu, mem, disk) by storing command outputs in the log collection. Added relevant test.
Peter Amstutz [Wed, 15 Mar 2017 15:03:20 +0000 (11:03 -0400)]
8567: Add check that admin token is used and ensure that migration links are
created owned by system user. Also fix tests now that arv-keepdocker uses
logging instead of printing directly to sys.stderr.
radhika [Tue, 14 Mar 2017 20:01:14 +0000 (16:01 -0400)]
closes #11071
Merge branch '11071-fts-perf-test'
radhika [Tue, 14 Mar 2017 20:00:17 +0000 (16:00 -0400)]
11071: formatting on the long if statement
Peter Amstutz [Tue, 14 Mar 2017 19:35:29 +0000 (15:35 -0400)]
Add missing documentation file. refs #6520
Peter Amstutz [Tue, 14 Mar 2017 19:17:49 +0000 (15:17 -0400)]
Merge branch '8567-api-select-docker-fmt' refs #8567
Peter Amstutz [Tue, 14 Mar 2017 19:17:34 +0000 (15:17 -0400)]
Merge branch '8567-cwl-docker-img' refs #8567
Peter Amstutz [Tue, 14 Mar 2017 19:09:00 +0000 (15:09 -0400)]
Merge branch '6520-nodemanager-docs' closes #11123
Peter Amstutz [Tue, 14 Mar 2017 16:44:35 +0000 (12:44 -0400)]
8567: Add note about updating API server configuration.
Peter Amstutz [Tue, 14 Mar 2017 16:39:24 +0000 (12:39 -0400)]
8567: Add docker19 migration instructions to install guide.
Peter Amstutz [Thu, 9 Mar 2017 22:41:55 +0000 (17:41 -0500)]
8567: Fix migrate links to use PDH instead of UUID.
Better error reporting.
Migrate script cleans up /var/lib/docker inside container.
Peter Amstutz [Tue, 7 Mar 2017 15:02:39 +0000 (10:02 -0500)]
8567: Add check for ARVADOS_API_HOST_INSECURE
Peter Amstutz [Tue, 7 Mar 2017 14:26:17 +0000 (09:26 -0500)]
8567: Rename docker19-migrate to migrate-docker19 for consistency with
arv-migrate-docker19. Add docstring to migrate19() function.
Peter Amstutz [Mon, 6 Mar 2017 19:52:55 +0000 (14:52 -0500)]
8567: Move out of tools/ into sdk/python and docker/docker19-migrate.
Peter Amstutz [Mon, 6 Mar 2017 19:31:17 +0000 (14:31 -0500)]
8567: Add status reporting to migrate script.
Peter Amstutz [Mon, 6 Mar 2017 15:39:57 +0000 (10:39 -0500)]
8567: Creates migration links.
Peter Amstutz [Mon, 6 Mar 2017 15:22:52 +0000 (10:22 -0500)]
8567: Docker image migration WIP.
Peter Amstutz [Tue, 14 Mar 2017 16:25:09 +0000 (12:25 -0400)]
8567: Fix tests now that container_image provides docker repo+tag and not PDH.
Peter Amstutz [Tue, 14 Mar 2017 16:03:33 +0000 (12:03 -0400)]
8567: If a search_term looks like a PDH, always treat it as one. Also use
correct optional parameter syntax.
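For readers unfamiliar with the term, a PDH (portable data hash) is a 32-digit lowercase hex MD5 followed by "+" and a decimal size. A minimal sketch of a "looks like a PDH" check follows; the function name `looks_like_pdh` and the exact pattern are assumptions for illustration, not the API server's implementation.

```python
import re

# Assumed shape of a portable data hash: 32 hex digits, "+", decimal size.
PDH_PATTERN = re.compile(r"[0-9a-f]{32}\+\d+")


def looks_like_pdh(search_term):
    """Return True if the whole search term has the shape of a PDH."""
    return PDH_PATTERN.fullmatch(search_term) is not None
```

Under this sketch, a repo name such as "arvados/jobs" does not match and would go through the normal repo+tag lookup, while a string such as "d41d8cd98f00b204e9800998ecf8427e+0" is treated as a PDH.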
Peter Amstutz [Tue, 14 Mar 2017 15:38:46 +0000 (11:38 -0400)]
6520: Add page with sample ping script. Improve instructions on creating
compute node a little bit.
Peter Amstutz [Tue, 14 Mar 2017 15:06:43 +0000 (11:06 -0400)]
6520: Add node_mem_scaling to documentation.
Peter Amstutz [Mon, 20 Feb 2017 21:13:29 +0000 (16:13 -0500)]
6520: Add information about setting up SLURM to crunchv2 documentation.
Peter Amstutz [Thu, 16 Feb 2017 22:20:19 +0000 (17:20 -0500)]
6520: Node manager docs WIP
Peter Amstutz [Fri, 10 Mar 2017 19:38:16 +0000 (14:38 -0500)]
8567: Use Docker image repo+tag name instead of PDH so that API server can select correct image format.
radhika [Mon, 13 Mar 2017 22:05:26 +0000 (18:05 -0400)]
11071: test count=none in groups#contents method.
Peter Amstutz [Mon, 13 Mar 2017 21:37:31 +0000 (17:37 -0400)]
8567: Tests default to image format v1 to avoid breaking all the tests that
use the :docker_image collection.
Peter Amstutz [Mon, 13 Mar 2017 21:31:20 +0000 (17:31 -0400)]
8567: Adjust job container resolve test now that images are filtered based on
supported version.
Peter Amstutz [Mon, 13 Mar 2017 20:40:26 +0000 (16:40 -0400)]
8567: Refactor code that queries migration links into get_compatible_images.
Peter Amstutz [Fri, 10 Mar 2017 19:06:22 +0000 (14:06 -0500)]
8567: Add & tweak tests for selecting compatible Docker image format.
Peter Amstutz [Fri, 10 Mar 2017 18:51:49 +0000 (13:51 -0500)]
8567: find_all_for_docker_image() returns only Docker images compatible with Rails.configuration.docker_image_formats. Follows migration links.
Peter Amstutz [Thu, 9 Mar 2017 22:44:26 +0000 (17:44 -0500)]
Bugfix: python Collection class sets _portable_data_hash in _populate_from_api_server
refs #10956
Tom Clegg [Thu, 9 Mar 2017 20:28:36 +0000 (15:28 -0500)]
Merge branch '5036-arv-mount-type'
refs #5036
Tom Clegg [Thu, 9 Mar 2017 19:53:51 +0000 (14:53 -0500)]
5036: Add "--subtype foo" flag to set mounted filesystem type to "fuse.foo".
Peter Amstutz [Thu, 9 Mar 2017 19:44:26 +0000 (14:44 -0500)]
Merge branch '11226-discovery-doc-cache' closes #11226
Peter Amstutz [Thu, 9 Mar 2017 18:49:15 +0000 (13:49 -0500)]
11226: Disable google api client discovery doc caching, use only httplib2 caching.
Google API client has its own caching mechanism. By default, its cache goes to
/tmp/google-api-python-client-discovery-doc.cache, which is a problem on a
multi-user system. Arvados already provides for discovery doc caching via
httplib2.
Peter Amstutz [Wed, 8 Mar 2017 20:33:50 +0000 (15:33 -0500)]
Add ability to use "arvbox start test" to re-run tests without tearing down &
restarting whole container. no issue #
Peter Amstutz [Wed, 8 Mar 2017 20:33:00 +0000 (15:33 -0500)]
Arvbox now uses Go websockets server instead of Puma. no issue #
Tom Clegg [Tue, 7 Mar 2017 21:15:48 +0000 (16:15 -0500)]
11221: Always restart systemd services, even after a few startup failures.
Tom Clegg [Tue, 7 Mar 2017 20:39:36 +0000 (15:39 -0500)]
Merge branch '3115-keep-disk-create-surprise'
closes #3115
Tom Clegg [Tue, 7 Mar 2017 20:32:59 +0000 (15:32 -0500)]
3115: Create keep_disks implicitly only in #ping action, not #show.
Lucas Di Pentima [Tue, 7 Mar 2017 18:25:44 +0000 (15:25 -0300)]
Merge branch '11139-nodemanager-mem-scale-factor'
Closes #11139
Lucas Di Pentima [Tue, 7 Mar 2017 17:54:22 +0000 (14:54 -0300)]
11139: Merge branch 'master' into 11139-nodemanager-mem-scale-factor
Lucas Di Pentima [Tue, 7 Mar 2017 17:52:59 +0000 (14:52 -0300)]
11139: Added new test to check for non-default values. Updated example config files.
Tom Clegg [Tue, 7 Mar 2017 15:11:38 +0000 (10:11 -0500)]
Merge branch '11166-log-name-collision'
closes #11166
Lucas Di Pentima [Mon, 6 Mar 2017 22:04:35 +0000 (19:04 -0300)]
11139: Added default config parameter 'node_mem_scaling' to be applied to node's ram sticker values. Its default is 95%.
Added test checking the default config and that the value is actually applied to the server calculations.
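The arithmetic behind the scaling in 11139 can be sketched as follows. The function name `scaled_ram_mb` and the sample sizes are hypothetical; only the parameter name node_mem_scaling and the 95% default come from the commit message.

```python
def scaled_ram_mb(sticker_ram_mb, node_mem_scaling=0.95):
    """Return the RAM (in MiB) assumed usable after scaling the advertised size."""
    if not 0 < node_mem_scaling <= 1:
        raise ValueError("node_mem_scaling must be in the range (0, 1]")
    # Truncate to whole MiB; the real rounding behaviour is an assumption here.
    return int(sticker_ram_mb * node_mem_scaling)
```

With the default factor, a node advertised with 4096 MiB would be treated as offering 3891 MiB, which leaves headroom for memory the operating system does not make available to jobs.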
Tom Clegg [Mon, 6 Mar 2017 19:59:56 +0000 (14:59 -0500)]
11166: Use ensure_unique_name to avoid collisions when saving logs and output.
Tom Clegg [Mon, 6 Mar 2017 15:31:40 +0000 (10:31 -0500)]
Merge branch '11138-docker-load-fail'
closes #11138
Tom Clegg [Fri, 3 Mar 2017 22:05:00 +0000 (17:05 -0500)]
Add missing install step: add keep-balance token to keepstore configs.
No issue #
Tom Clegg [Fri, 3 Mar 2017 21:53:14 +0000 (16:53 -0500)]
Remove obsolete GOMAXPROCS advice.
No issue #
Tom Clegg [Fri, 3 Mar 2017 21:46:50 +0000 (16:46 -0500)]
Merge branch '11168-serialize-json'
closes #11168
Tom Clegg [Fri, 3 Mar 2017 21:41:05 +0000 (16:41 -0500)]
11138: Show actual image ID when checking whether docker image is loaded.
Tom Clegg [Fri, 3 Mar 2017 21:18:10 +0000 (16:18 -0500)]
11168: Double-decode serialized fields if database was mangled by downgraded API server.
Tom Clegg [Fri, 3 Mar 2017 05:52:53 +0000 (00:52 -0500)]
11138: Test for docker image after loading, in case docker-load erroneously proclaimed success.
Tom Clegg [Fri, 3 Mar 2017 05:26:20 +0000 (00:26 -0500)]
11168: Always deep-sort before comparing in where_serialized.
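Why deep-sorting matters when comparing serialized values: two structures with the same content can serialize to different strings if their keys are in different orders. The Python sketch below is illustrative only. The names `deep_sort` and `canonical_json` are hypothetical, and it assumes that only mapping keys are sorted while list order is preserved.

```python
import json


def deep_sort(value):
    """Recursively rebuild mappings with their keys in sorted order."""
    if isinstance(value, dict):
        return {key: deep_sort(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # List order is meaningful, so only the elements are normalized.
        return [deep_sort(item) for item in value]
    return value


def canonical_json(value):
    """Serialize a value so that equal content yields an identical string."""
    return json.dumps(deep_sort(value), separators=(",", ":"))
```

Comparing `canonical_json(a) == canonical_json(b)` then gives the same answer regardless of the key order in which `a` and `b` were built.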
Tom Clegg [Fri, 3 Mar 2017 05:25:57 +0000 (00:25 -0500)]
11168: Remove unused import.
Tom Clegg [Fri, 3 Mar 2017 05:17:43 +0000 (00:17 -0500)]
11168: Revert serialization change in order to avoid breaking job reuse.
radhika [Thu, 2 Mar 2017 23:12:03 +0000 (18:12 -0500)]
Merge branch 'master' into 11071-fts-perf-test
radhika [Thu, 2 Mar 2017 22:53:12 +0000 (17:53 -0500)]
11017: Use count='none' for full text search on workbench.
Update groups.load_searchable_objects which is used by contents method
to work with multiple class types when count='none' is used.
Tom Clegg [Wed, 1 Mar 2017 23:07:14 +0000 (18:07 -0500)]
11168: Add missing require.
Tom Clegg [Wed, 1 Mar 2017 22:59:33 +0000 (17:59 -0500)]
11168: Prohibit down-migration to YAML-only codebase.
Tom Clegg [Wed, 1 Mar 2017 21:09:05 +0000 (16:09 -0500)]
11168: Change db serialize from YAML to JSON.
Tom Clegg [Wed, 1 Mar 2017 20:04:34 +0000 (15:04 -0500)]
Merge branch '10764-ws-tests'
closes #10764
Tom Clegg [Wed, 1 Mar 2017 15:58:16 +0000 (10:58 -0500)]
10764: De-duplicate real/test server startup. Add test for broken config.
Tom Clegg [Wed, 1 Mar 2017 07:52:38 +0000 (02:52 -0500)]
10764: Simplify test server shutdown.
Tom Clegg [Wed, 1 Mar 2017 07:36:18 +0000 (02:36 -0500)]
10764: Test v0 session.
Tom Clegg [Tue, 28 Feb 2017 22:08:33 +0000 (17:08 -0500)]
Merge branch '10777-die-if-arv-mount-dies'
closes #10777
radhika [Tue, 28 Feb 2017 18:29:43 +0000 (13:29 -0500)]
closes #11015
Merge branch '11015-crunch-run-output-upload'
radhika [Tue, 28 Feb 2017 18:23:45 +0000 (13:23 -0500)]
Merge branch 'master' into 11015-crunch-run-output-upload
Tom Clegg [Tue, 28 Feb 2017 02:49:42 +0000 (21:49 -0500)]
10764: Permission tests. Support PDH permission check.
Tom Clegg [Mon, 27 Feb 2017 22:51:51 +0000 (17:51 -0500)]
10764: Add unit tests
radhika [Mon, 27 Feb 2017 18:25:50 +0000 (13:25 -0500)]
11015: use multiple writers to increase throughput of goUpload.
Tom Clegg [Mon, 27 Feb 2017 16:54:40 +0000 (11:54 -0500)]
10979: Add missing SqueueChecker initialization.
refs #10979
Lucas Di Pentima [Mon, 27 Feb 2017 13:37:07 +0000 (10:37 -0300)]
Merge branch '11002-arvput-crash-fix'
Closes #11002
Lucas Di Pentima [Mon, 27 Feb 2017 12:42:49 +0000 (09:42 -0300)]
11002: Merge branch 'master' into 11002-arvput-crash-fix
Lucas Di Pentima [Mon, 27 Feb 2017 12:41:17 +0000 (09:41 -0300)]
11002: Added note explaining why we're expecting a SystemExit to catch a SIGINT (KeyboardInterrupt)
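One way a SIGINT can surface as SystemExit rather than KeyboardInterrupt is a signal handler that calls sys.exit(). The sketch below assumes such a handler; the name `exit_on_signal` is hypothetical and this is not presented as the arv-put code.

```python
import signal
import sys


def exit_on_signal(sigcode, frame):
    """Turn a delivered signal into SystemExit with a negative exit code."""
    sys.exit(-sigcode)


def install_handlers():
    """Route SIGINT and SIGTERM through exit_on_signal (main thread only)."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, exit_on_signal)
```

Once a handler like this is installed, Python's default SIGINT behaviour of raising KeyboardInterrupt is replaced, so a test that simulates Ctrl-C would need to expect SystemExit instead.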
Tom Clegg [Sat, 25 Feb 2017 06:56:53 +0000 (01:56 -0500)]
10777: Close and flush logs right away instead of waiting for next tick.
Tom Clegg [Sat, 25 Feb 2017 06:51:20 +0000 (01:51 -0500)]
10777: Stop container if arv-mount dies before container exits.
Tom Clegg [Fri, 24 Feb 2017 22:46:38 +0000 (17:46 -0500)]
run-tests.sh exit non-zero if gofmt fails
No issue #
Tom Clegg [Fri, 24 Feb 2017 21:46:46 +0000 (16:46 -0500)]
10979: Check for orphans only once at startup. Add missing Lock() in
squeue checker. Avoid holding mtx while waiting for API response.
Ensure RunContainer actually gets called in test case.
refs #10979
Ward Vandewege [Fri, 24 Feb 2017 20:56:32 +0000 (15:56 -0500)]
build improvement: really include apps/workbench_functionals when
apps/workbench is specified.
No issue #
Tom Clegg [Fri, 24 Feb 2017 19:43:33 +0000 (14:43 -0500)]
Merge branch '6347-log-timestamps'
closes #6347
Peter Amstutz [Fri, 24 Feb 2017 19:24:18 +0000 (14:24 -0500)]
Merge branch '10629-fuse-listing-perf' closes #10629
Peter Amstutz [Fri, 24 Feb 2017 19:21:52 +0000 (14:21 -0500)]
Merge branch '9277-container-output' closes #9277
Peter Amstutz [Fri, 24 Feb 2017 19:20:15 +0000 (14:20 -0500)]
9277: Add test that setting a trashed, unreadable collection is disallowed.
Tom Clegg [Fri, 24 Feb 2017 18:45:12 +0000 (13:45 -0500)]
6347: Use RFC3339Nano to render timestamps loaded from serialized fields.
Psych (YAML) serializes timestamps as ISO8601-with-space-separators,
and safe_load deserializes them to Time even with
whitelist_classes=[].
Psych.dump(Time.now.utc)
=> "--- 2017-02-22 21:33:22.845133778 Z\n...\n"
Psych.safe_load('2017-02-22 21:33:22.845133778 Z').class
=> Time
Psych.safe_load('2017-02-31 21:33:22.845133778 Z').class
=> String
Before:
Psych.safe_load('2017-02-22 21:33:22.845133778 Z').as_json
=> "2017-02-22T21:33:22Z"
After:
Psych.safe_load('2017-02-22 21:33:22.845133778 Z').as_json
=> "2017-02-22T21:33:22.845133778Z"
radhika [Fri, 24 Feb 2017 18:17:15 +0000 (13:17 -0500)]
refs #10979
Merge branch '10979-cancelled-job-nodes'
radhika [Fri, 24 Feb 2017 18:16:31 +0000 (13:16 -0500)]
10979: fix failing test
Peter Amstutz [Fri, 24 Feb 2017 16:01:07 +0000 (11:01 -0500)]
9277: arvados-cwl-runner sets "is_trashed" when directly setting output of container.
Peter Amstutz [Fri, 24 Feb 2017 15:53:28 +0000 (10:53 -0500)]
9277: Container output check must be unscoped to include trashed collections.
radhika [Fri, 24 Feb 2017 05:54:47 +0000 (00:54 -0500)]
closes #10979
Merge branch '10979-cancelled-job-nodes'