mishaz [Wed, 24 Dec 2014 20:26:38 +0000 (20:26 +0000)]
Added string copying to try to reduce memory usage, didn't seem to work. Cleaned up logging (and logging logic) so that we only see one line per batch.
mishaz [Wed, 24 Dec 2014 19:29:08 +0000 (19:29 +0000)]
Started parsing modification date as a timestamp instead of leaving it as a string.
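A minimal sketch of the string-to-timestamp change, assuming the modification date arrives in RFC 3339 form. The function name parseModifiedAt and the sample value are hypothetical and are not taken from the data manager code.

```go
package main

import (
	"fmt"
	"time"
)

// parseModifiedAt converts a date string into a time.Time.
// Assumption: the input is RFC 3339; the real field format may differ.
func parseModifiedAt(s string) (time.Time, error) {
	return time.Parse(time.RFC3339, s)
}

func main() {
	t, err := parseModifiedAt("2014-12-24T19:29:08Z")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// A parsed timestamp can be compared and sorted directly.
	fmt.Println(t.Before(time.Now()))
}
```

Holding a time.Time rather than a string lets later code compare dates without re-parsing them.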
mishaz [Wed, 24 Dec 2014 01:36:43 +0000 (01:36 +0000)]
Switched from strings to BlockDigests to hold block digests more efficiently. Started clearing out manifest text once we finished with it. Made profiling conditional on a flag (previously it crashed if the flag was not provided). Added final heap profile once collections were finished.
Runs to completion!
mishaz [Wed, 24 Dec 2014 00:24:07 +0000 (00:24 +0000)]
Changes to manifest that I forgot to add to previous checkin.
mishaz [Tue, 23 Dec 2014 23:55:12 +0000 (23:55 +0000)]
Added blockdigest class to store digests more efficiently. This has the nice side effect of reducing how many string slices we use from the SDK, so the large string can get garbage collected once we remove other usages.
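A rough sketch of the idea, assuming digests are 32-character hexadecimal strings (as with MD5). The type layout and the names FromString, H and L are illustrative assumptions, not necessarily what this checkin contains.

```go
package main

import (
	"fmt"
	"strconv"
)

// BlockDigest holds a 128-bit digest as two 64-bit words (16 bytes)
// instead of a 32-character hex string.
type BlockDigest struct {
	H uint64 // high 64 bits
	L uint64 // low 64 bits
}

// FromString parses a 32-character hexadecimal digest. The result is
// made of plain integers, so it keeps no reference to the input string.
func FromString(s string) (BlockDigest, error) {
	if len(s) != 32 {
		return BlockDigest{}, fmt.Errorf("digest must be 32 hex characters, got %d", len(s))
	}
	h, err := strconv.ParseUint(s[:16], 16, 64)
	if err != nil {
		return BlockDigest{}, err
	}
	l, err := strconv.ParseUint(s[16:], 16, 64)
	if err != nil {
		return BlockDigest{}, err
	}
	return BlockDigest{H: h, L: l}, nil
}

// String renders the digest back as 32 lowercase hex characters.
func (d BlockDigest) String() string {
	return fmt.Sprintf("%016x%016x", d.H, d.L)
}

func main() {
	d, err := FromString("d41d8cd98f00b204e9800998ecf8427e")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(d)
}
```

Because the parsed value holds integers rather than a slice of the manifest text, it does not keep the large source string reachable, which is the garbage collection side effect described above.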
mishaz [Tue, 23 Dec 2014 19:33:07 +0000 (19:33 +0000)]
Long overdue checkin of data manager. Current code runs, but uses way too much memory and eventually crashes. This checkin includes heap profiling to track down memory usage.
mishaz [Sat, 22 Nov 2014 00:57:40 +0000 (00:57 +0000)]
Added reporting of disk usage. This is the Collection Storage of each user as described here: https://arvados.org/projects/arvados/wiki/Data_Manager_Design_Doc#Reports-Produced
But it does not include the size of projects owned by the user (projects and subprojects are each reported as their own users)
Report is just logged to screen for now.
mishaz [Thu, 16 Oct 2014 20:57:06 +0000 (20:57 +0000)]
Started reading index from keep servers.
Added lots of code to handle unexpected results from keep server.
mishaz [Wed, 15 Oct 2014 20:53:53 +0000 (20:53 +0000)]
Started reading response from keep server.
Tom Clegg [Fri, 13 Feb 2015 21:22:55 +0000 (16:22 -0500)]
Merge branch 'master' of git.curoverse.com:arvados into 3408-production-datamanager
Brett Smith [Tue, 14 Oct 2014 18:49:49 +0000 (14:49 -0400)]
4126: API server uses fixed a-r-p-i.
Refs #4126.
Peter Amstutz [Tue, 14 Oct 2014 15:41:02 +0000 (11:41 -0400)]
Merge branch '3692-event-bus-fix-and' closes #3692
Peter Amstutz [Tue, 14 Oct 2014 15:33:31 +0000 (11:33 -0400)]
Merge branch '3656-arv-create' closes #3656
mishaz [Tue, 14 Oct 2014 19:46:12 +0000 (19:46 +0000)]
Added flags to read data manager token from a file.
Started trying to get index from keep servers but it's not working yet.
Peter Amstutz [Tue, 14 Oct 2014 15:32:38 +0000 (11:32 -0400)]
3656: Delete unused documentation page
Peter Amstutz [Tue, 14 Oct 2014 14:54:51 +0000 (10:54 -0400)]
3656: Alphabetize list of subcommands. Rename tmp -> tmp_file. Small wording change on doc pages.
Peter Amstutz [Tue, 14 Oct 2014 14:40:36 +0000 (10:40 -0400)]
3692: Bug fix for inadequate grouping when constructing selection.
Brett Smith [Tue, 14 Oct 2014 14:19:41 +0000 (10:19 -0400)]
Merge branch '4139-blocking-node-manager-tests-wip'
Refs #2881, #4139. Closes #4184.
Brett Smith [Tue, 14 Oct 2014 14:18:49 +0000 (10:18 -0400)]
4139: Node Manager README links to compute node lifecycle page.
Brett Smith [Mon, 13 Oct 2014 19:19:15 +0000 (15:19 -0400)]
4139: Speed up Node Manager tests.
Previously, the tests would poll interesting mocks, waiting for them
to be called. This introduced significant overhead to the tests, and
they would frequently time out on Jenkins. This modifies the tests to
get more information by blocking on the tested actors, which means
more predictability and less fighting for CPU (typical runtimes for
all the tests improved from 5 seconds to 0.5 seconds on my
workstation).
The downside to this approach is that it ties the tests more tightly
to the underlying actors' implementation. In particular, they
sometimes send a message and block for a response to ensure that any
internal messages generated by the *last* message have been handled.
This is less than ideal, but I don't have a better idea right now.
mishaz [Fri, 10 Oct 2014 20:59:55 +0000 (20:59 +0000)]
Deleted unused type.
mishaz [Fri, 10 Oct 2014 20:27:36 +0000 (20:27 +0000)]
Added tests to check that we're iterating on manifest lines correctly and handling blank lines in manifests.
LineIter will now handle long manifest lines properly and added test to check that we continue to do so.
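One way to read lines of arbitrary length in Go, shown as a sketch only. It assumes the earlier failure came from a fixed line-length limit such as bufio.Scanner's default token size (bufio.MaxScanTokenSize); that cause is a guess, and readLines and readLinesString are hypothetical names, not the LineIter implementation.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLines returns the non-blank lines from r. It uses
// bufio.Reader.ReadString, which grows its result as needed,
// so there is no fixed limit on line length.
func readLines(r io.Reader) ([]string, error) {
	br := bufio.NewReader(r)
	var lines []string
	for {
		line, err := br.ReadString('\n')
		line = strings.TrimRight(line, "\n") // strip the trailing newline
		if line != "" {
			lines = append(lines, line) // blank lines are skipped
		}
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

// readLinesString is a convenience wrapper for in-memory text.
func readLinesString(s string) ([]string, error) {
	return readLines(strings.NewReader(s))
}

func main() {
	// A line twice as long as bufio.Scanner's default token limit.
	long := strings.Repeat("x", 2*bufio.MaxScanTokenSize)
	lines, err := readLinesString(long + "\n\nshort\n")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(len(l))
	}
}
```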
Peter Amstutz [Fri, 10 Oct 2014 20:14:15 +0000 (16:14 -0400)]
3692: Explicitly incorporate sequence number test into where clause
Peter Amstutz [Fri, 10 Oct 2014 19:43:52 +0000 (15:43 -0400)]
Websocket server side fix: perform database notify in after_save callback on
the log object instead of in log_change on ArvadosBase because crunch-dispatch
was creating Log objects directly and bypassing the notification in log_change.
mishaz [Fri, 10 Oct 2014 15:28:09 +0000 (15:28 +0000)]
Added test to show that our code fails on long manifests.
radhika [Fri, 10 Oct 2014 13:45:09 +0000 (09:45 -0400)]
closes #4126
Merge branch '4126-preserve-parameter-hash'
Peter Amstutz [Fri, 10 Oct 2014 12:52:27 +0000 (08:52 -0400)]
3692: Fixed test, and fixed the actual bug
Peter Amstutz [Thu, 9 Oct 2014 18:56:47 +0000 (14:56 -0400)]
3656: Add missing file
Brett Smith [Thu, 9 Oct 2014 17:46:51 +0000 (13:46 -0400)]
4139: Add README to Node Manager.
Refs #4139.
Brett Smith [Thu, 9 Oct 2014 17:32:50 +0000 (13:32 -0400)]
4139: Add *.egg to Node Manager's .gitignore.
`python setup.py test` will automatically download dependencies to the
source directory if you don't already have them available in your
environment. Refs #2881, #4139.
Brett Smith [Thu, 9 Oct 2014 17:31:47 +0000 (13:31 -0400)]
4139: Add environment configuration knobs for Node Manager tests.
These are settings I've fiddled with regularly over the course of
development, and now it looks like we're going to need to fiddle them
some more to accommodate Jenkins. I'm exposing them as environment
variables so I can stop messing with the code appropriately.
Refs #4139.
Peter Amstutz [Thu, 9 Oct 2014 17:21:27 +0000 (13:21 -0400)]
3656: Documentation updated to use "arv create".
Peter Amstutz [Thu, 9 Oct 2014 15:37:39 +0000 (11:37 -0400)]
3656: Support additional create parameters on the command line. Only open the
editor on the object itself, which should be less confusing.
Peter Amstutz [Thu, 9 Oct 2014 15:04:43 +0000 (11:04 -0400)]
3656: Add arv-create command. Refactor run_editor to be shared by arv_edit and arv_create.
Peter Amstutz [Thu, 9 Oct 2014 13:33:34 +0000 (09:33 -0400)]
Merge branch '4042-run-command-MxN' closes #4042
Peter Amstutz [Thu, 9 Oct 2014 13:32:36 +0000 (09:32 -0400)]
4042: Typo fixes. Highlight run-command and script_parameters in text. Rename
--job-parameters to --script-parameters and add mention of --dry-run mode.
Brett Smith [Thu, 9 Oct 2014 13:20:38 +0000 (09:20 -0400)]
Update install docs for keep→keepstore rename.
No issue #. Reported on #arvados.
Peter Amstutz [Thu, 9 Oct 2014 13:15:30 +0000 (09:15 -0400)]
Merge branch '3381-job-progress-bar-bug' closes #3381
Peter Amstutz [Thu, 9 Oct 2014 13:14:56 +0000 (09:14 -0400)]
3381: Fix layout comment
radhika [Thu, 9 Oct 2014 12:52:23 +0000 (08:52 -0400)]
4126: undo the hash parameter retention logic for value
radhika [Wed, 8 Oct 2014 21:30:59 +0000 (17:30 -0400)]
Merge branch 'master' into 4126-preserve-parameter-hash
radhika [Wed, 8 Oct 2014 21:28:38 +0000 (17:28 -0400)]
4126: when parameter is a hash, use it as value if nothing else matches.
Brett Smith [Wed, 8 Oct 2014 20:52:44 +0000 (16:52 -0400)]
Merge branch '2881-node-manager'
Closes #2881, #4106.
Brett Smith [Fri, 3 Oct 2014 21:53:57 +0000 (17:53 -0400)]
2881: Add Node Manager service.
Peter Amstutz [Wed, 8 Oct 2014 19:00:42 +0000 (15:00 -0400)]
Updated examples.
Peter Amstutz [Wed, 8 Oct 2014 18:50:12 +0000 (14:50 -0400)]
Merge branch '4042-run-command-MxN' of git.curoverse.com:arvados into 4042-run-command-MxN
Peter Amstutz [Wed, 8 Oct 2014 18:49:54 +0000 (14:49 -0400)]
4042: Update documentation samples. Small fix to dry-run to allow supplying custom TASK_KEEPMOUNT.
Peter Amstutz [Wed, 8 Oct 2014 15:12:57 +0000 (11:12 -0400)]
4042: Rename bad reuse of 'p' to 'match' in expand_item. Finish describing
$(task.outdir). Clarify that list functions take a user parameter name. Fix
other spelling and grammatical errors in documentation.
Tom Clegg [Wed, 8 Oct 2014 14:58:31 +0000 (10:58 -0400)]
Merge branch '4044-crunchstat-wait' refs #4044
Tom Clegg [Wed, 8 Oct 2014 14:58:23 +0000 (10:58 -0400)]
4044: Merge branch 'master' into 4044-crunchstat-wait
Tom Clegg [Wed, 8 Oct 2014 14:58:11 +0000 (10:58 -0400)]
4044: Add comments to "continue" statements.
Peter Amstutz [Wed, 8 Oct 2014 14:45:35 +0000 (10:45 -0400)]
3381: Reorganize _running_components to be clearer. Fixing workbench.
Tom Clegg [Wed, 8 Oct 2014 14:05:27 +0000 (10:05 -0400)]
4126: Preserve hash form when populating component parameters.
radhika [Wed, 8 Oct 2014 13:58:44 +0000 (09:58 -0400)]
closes #3990
Merge branch '3990-owner-when-rerunning-pipeline'
radhika [Wed, 8 Oct 2014 13:58:10 +0000 (09:58 -0400)]
3990: minor test update
Tom Clegg [Tue, 7 Oct 2014 20:24:15 +0000 (16:24 -0400)]
4044: Clean up channel and pipe usage.
* Remove code for using one goroutine to copy stdout and stderr from
channels to pipes. Instead, copy stderr from channel to pipe in a
simple goroutine, and give the child our own Stdout to use (we
don't use it ourselves anyway).
* Rename functions (OutputChannel -> CopyChanToPipe, ReadLineByLine ->
CopyPipeToChan).
* Add "stop" channel to shut down polling loop when child process
exits.
* Check for errors when opening cgroup stats files. Report errors
instead of displaying bogus stats.
* Split main() into run() and a short main() with os.Exit logic, so
deferred functions run regardless of exit path.
* Close dangling filehandle when cgroup_cidfile succeeds.
* Fix slight divide-by-zero opportunity when elapsed==0.
* Fix condition triggering the "could not read cid file" error
message.
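Two of the items above, the run()/main() split and the elapsed==0 guard, can be sketched as follows. This is an illustration under assumptions, not the crunchstat code: rate and the cleanup message are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// rate returns units per second, or 0 when no time has elapsed,
// which avoids dividing by zero.
func rate(delta float64, elapsed time.Duration) float64 {
	if elapsed <= 0 {
		return 0
	}
	return delta / elapsed.Seconds()
}

// run does the real work and returns an exit code. Its deferred
// calls execute when it returns, before main calls os.Exit.
func run() int {
	defer fmt.Fprintln(os.Stderr, "cleanup done")
	fmt.Println(rate(10, 2*time.Second))
	return 0
}

func main() {
	// os.Exit skips deferred functions in the calling function, so
	// main itself defers nothing and only passes on run's result.
	os.Exit(run())
}
```

Keeping os.Exit in a main that has no deferred calls of its own means every exit path out of run still triggers its cleanup.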
radhika [Tue, 7 Oct 2014 18:04:26 +0000 (14:04 -0400)]
Merge branch 'master' into 3990-owner-when-rerunning-pipeline
radhika [Tue, 7 Oct 2014 18:02:35 +0000 (14:02 -0400)]
closes #3882
Merge branch '3882-cancel-already-cancelled-job'
radhika [Tue, 7 Oct 2014 17:55:32 +0000 (13:55 -0400)]
3882: update "after_validation :update_timestamps_when_state_changes" to "before_save :update_timestamps_when_state_changes",
and also add a couple more test combinations around cancelled state.
radhika [Tue, 7 Oct 2014 17:29:01 +0000 (13:29 -0400)]
Merge branch 'master' into 3882-cancel-already-cancelled-job
radhika [Tue, 7 Oct 2014 17:26:18 +0000 (13:26 -0400)]
Merge branch 'master' into 3990-owner-when-rerunning-pipeline
radhika [Tue, 7 Oct 2014 17:17:33 +0000 (13:17 -0400)]
3990: rerunning pipeline from within a project with no write permission
Tom Clegg [Tue, 7 Oct 2014 16:04:34 +0000 (12:04 -0400)]
Merge branch '3775-fetch-git-repo' closes #3775
Conflicts:
sdk/cli/bin/crunch-job
radhika [Tue, 7 Oct 2014 14:42:57 +0000 (10:42 -0400)]
Merge branch 'master' into 3990-owner-when-rerunning-pipeline
radhika [Tue, 7 Oct 2014 14:41:30 +0000 (10:41 -0400)]
3990: add integration tests to rerun pipeline.
Brett Smith [Tue, 7 Oct 2014 13:44:15 +0000 (09:44 -0400)]
Merge branch '4012-crunch-job-api-retries-wip'
Closes #4012.
Brett Smith [Tue, 7 Oct 2014 13:35:11 +0000 (09:35 -0400)]
4012: crunch-job retries all API operations.
This will make jobs more robust against transient errors when talking
to the API server.
Tom Clegg [Tue, 7 Oct 2014 03:24:53 +0000 (23:24 -0400)]
3775: Fix SDK usage.
Tom Clegg [Tue, 7 Oct 2014 03:07:35 +0000 (23:07 -0400)]
3775: Update comment
Tom Clegg [Tue, 7 Oct 2014 02:16:56 +0000 (22:16 -0400)]
3775: Update comment
Tom Clegg [Tue, 7 Oct 2014 02:06:50 +0000 (22:06 -0400)]
3775: Update perlpod. Use items_available. Be more conservative when
using already-cached commit sha1s, expand related comment. Move "skip
if not possible now" checks into subroutines.
radhika [Mon, 6 Oct 2014 23:54:15 +0000 (19:54 -0400)]
Merge branch 'master' into 3990-owner-when-rerunning-pipeline
radhika [Mon, 6 Oct 2014 23:50:19 +0000 (19:50 -0400)]
3990: refactor pipeline instances integration test to reuse logic that creates and runs a pipeline.
Tom Clegg [Mon, 6 Oct 2014 21:55:57 +0000 (17:55 -0400)]
Merge branch '3828-keepproxy-race' closes #3828
Brett Smith [Mon, 6 Oct 2014 21:36:31 +0000 (17:36 -0400)]
3634: Update user setup tests for preserved tab state.
These tests assume that the page reloads after submitting Ajax
dialogs. They started failing when we started preserving tab state.
Update the tests to expressly refresh the page. Future improvements
might find a solution with lower overhead.
Refs #3634.
Tom Clegg [Mon, 6 Oct 2014 21:25:02 +0000 (17:25 -0400)]
Merge branch '3775-fetch-git-repo' closes #3775
Tom Clegg [Mon, 6 Oct 2014 21:21:25 +0000 (17:21 -0400)]
3775: Set state=Running when creating a Job in local mode.
Tom Clegg [Mon, 6 Oct 2014 21:13:17 +0000 (17:13 -0400)]
3775: Merge branch 'master' into 3775-fetch-git-repo
Conflicts:
sdk/cli/bin/crunch-job
Peter Amstutz [Mon, 6 Oct 2014 20:30:29 +0000 (16:30 -0400)]
4042: When listing directory, return list of absolute paths
radhika [Mon, 6 Oct 2014 20:27:25 +0000 (16:27 -0400)]
3990: set owner_uuid of a copied pipeline instance to that of the source, provided it is a project and writable by the current user.
Tom Clegg [Mon, 6 Oct 2014 20:05:54 +0000 (16:05 -0400)]
3828: Wait for listener to start before connecting to it. Fix test
panic in listener.Close() when listener does not exist.
Tom Clegg [Mon, 6 Oct 2014 20:04:33 +0000 (16:04 -0400)]
3828: Use defer to close pidfile. Avoids leftover pidfile if Listen fails.
Tom Clegg [Mon, 6 Oct 2014 19:56:17 +0000 (15:56 -0400)]
3828: Remove duplicate "write pidfile" block.
Brett Smith [Mon, 6 Oct 2014 19:16:23 +0000 (15:16 -0400)]
arv-put parses arguments before instantiating an API client.
This helps ensure that `--help` responds quickly even if the server is
slow or down. No issue #.
Phil Hodgson [Mon, 6 Oct 2014 19:11:19 +0000 (15:11 -0400)]
Merge branch '3634-tab-state' refs #3634
Peter Amstutz [Mon, 6 Oct 2014 19:03:37 +0000 (15:03 -0400)]
3381: Merge job_status_label and job_progress into a single job_progress
partial. This partial renders a progress bar if the job is running, otherwise
renders a label with the job state. The progress bar now shows only 'done'
tasks and renders the progress bar in orange if any tasks have failed. Move
"done, failure, running, todo" from panel body to panel heading on
running_component partial. Dashboard now uses job_progress partial with
"scaleby" to indicate pipeline progress more precisely.
radhika [Mon, 6 Oct 2014 18:55:10 +0000 (14:55 -0400)]
closes #4046
Merge branch '4046-job-queue-position'
Phil Hodgson [Mon, 6 Oct 2014 18:53:08 +0000 (14:53 -0400)]
Merge branch 'master' into 3634-tab-state
Conflicts (resolved):
apps/workbench/app/views/application/_title_and_buttons.html.erb
radhika [Mon, 6 Oct 2014 18:51:20 +0000 (14:51 -0400)]
Merge branch 'master' into 4046-job-queue-position
Phil Hodgson [Mon, 6 Oct 2014 18:49:45 +0000 (14:49 -0400)]
3634: add anchor to URL when switching project tabs
radhika [Mon, 6 Oct 2014 18:47:43 +0000 (14:47 -0400)]
4046: update assert error message
Tom Clegg [Mon, 6 Oct 2014 18:37:06 +0000 (14:37 -0400)]
Merge branch '3782-stub-io-pipe' refs #3782
radhika [Mon, 6 Oct 2014 18:30:52 +0000 (14:30 -0400)]
Merge branch 'master' into 4046-job-queue-position
Tim Pierce [Mon, 6 Oct 2014 18:22:00 +0000 (14:22 -0400)]
Merge branch '3825-crunch-pipe-to-arv-put-final'
Closes #3825.
Tim Pierce [Mon, 6 Oct 2014 17:24:47 +0000 (13:24 -0400)]
3825: code review
* avoid overloading "output" (usually used for the output from a task or job rather than for diagnostic output from crunch)
** renamed:
*** start_output_log -> log_writer_start
*** write_output_log -> log_writer_send
*** finish_output_log -> log_writer_finish
*** output_log_is_active -> log_writer_is_active
* fixed missing semicolon
Tom Clegg [Mon, 6 Oct 2014 17:56:30 +0000 (13:56 -0400)]
3782: Merge branch 'master' into 3782-stub-io-pipe
radhika [Mon, 6 Oct 2014 17:50:11 +0000 (13:50 -0400)]
closes #3583
Merge branch '3583-provenance-graph-issue'
radhika [Mon, 6 Oct 2014 17:49:33 +0000 (13:49 -0400)]
Merge branch 'master' into 3583-provenance-graph-issue
radhika [Mon, 6 Oct 2014 17:47:22 +0000 (13:47 -0400)]
Merge branch 'master' into 4046-job-queue-position
radhika [Mon, 6 Oct 2014 17:45:58 +0000 (13:45 -0400)]
4046: update queue_position method to increment index and add unit test.