mishaz [Wed, 7 Jan 2015 04:16:40 +0000 (04:16 +0000)]
Started focusing on Keep Server responses again. Switched to using blockdigest instead of strings. Added per block info so that we can track block replication across servers.
mishaz [Wed, 24 Dec 2014 20:26:38 +0000 (20:26 +0000)]
Added string copying to try to reduce memory usage, but it didn't seem to work. Cleaned up logging (and logging logic) so that we only see one line per batch.
mishaz [Wed, 24 Dec 2014 01:36:43 +0000 (01:36 +0000)]
Switched from strings to BlockDigests to hold block digests more efficiently. Started clearing out manifest text once we finished with it. Made profiling conditional on a flag (previously it crashed if the flag was not provided). Added final heap profile once collections were finished.
mishaz [Tue, 23 Dec 2014 23:55:12 +0000 (23:55 +0000)]
Added blockdigest class to store digests more efficiently. This has the nice side effect of reducing how many string slices we use from the SDK, so the large string can get garbage collected once we remove other usages.
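The idea behind this commit can be illustrated with a minimal sketch, assuming a block digest is a 32-character hexadecimal MD5 string. The type name, field names, and FromString helper below are illustrative assumptions, not necessarily what the commit contains. Parsing the hex text into two 64-bit integers produces a fixed-size value that holds no reference to the original string, so the large source string can be garbage collected once nothing else points at it.

```go
package main

import (
	"fmt"
	"strconv"
)

// BlockDigest stores a 128-bit digest as two 64-bit halves (16 bytes total)
// instead of a 32-character string slice that pins its backing array.
// The layout is an assumption made for this sketch.
type BlockDigest struct {
	H uint64 // high 64 bits (first 16 hex characters)
	L uint64 // low 64 bits (last 16 hex characters)
}

// FromString parses a 32-character hex digest into a BlockDigest.
// Hypothetical helper; it copies the numeric value out of s, so the
// result does not share memory with s.
func FromString(s string) (BlockDigest, error) {
	if len(s) != 32 {
		return BlockDigest{}, fmt.Errorf("expected 32 hex characters, got %d", len(s))
	}
	h, err := strconv.ParseUint(s[:16], 16, 64)
	if err != nil {
		return BlockDigest{}, err
	}
	l, err := strconv.ParseUint(s[16:], 16, 64)
	if err != nil {
		return BlockDigest{}, err
	}
	return BlockDigest{H: h, L: l}, nil
}

// String renders the digest back as 32 lowercase hex characters.
func (d BlockDigest) String() string {
	return fmt.Sprintf("%016x%016x", d.H, d.L)
}

func main() {
	d, err := FromString("d41d8cd98f00b204e9800998ecf8427e")
	if err != nil {
		panic(err)
	}
	fmt.Println(d)
}
```

Because BlockDigest is a comparable struct, it can also be used directly as a map key, which suits per-block bookkeeping.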
mishaz [Tue, 23 Dec 2014 19:33:07 +0000 (19:33 +0000)]
Long overdue checkin of data manager. Current code runs, but uses way too much memory and eventually crashes. This checkin includes heap profiling to track down memory usage.
mishaz [Sat, 22 Nov 2014 00:57:40 +0000 (00:57 +0000)]
Added reporting of disk usage. This is the Collection Storage of each user as described here: https://arvados.org/projects/arvados/wiki/Data_Manager_Design_Doc#Reports-Produced
But it does not include the size of projects owned by the user (projects and subprojects are each reported as their own users).
Peter Amstutz [Wed, 7 Jan 2015 19:38:41 +0000 (14:38 -0500)]
4312: Use "install" phase of bootstrap script to report the installed versions
of any arvados pip or debian packages. Like virtualenv logic, only reports for
task 0 (since every task starts the same image).
Brett Smith [Fri, 19 Dec 2014 22:40:13 +0000 (17:40 -0500)]
4836: Trigger Workbench infinite scroll load on tab show.
If an infinite scroller is in the first tab of a show page, but the
user is going to a different tab, we'll queue up the first event
to load data for the container, but when it fires the container won't
be visible so it will decline to load anything. Then you can only get
data to load if you resize the window.
Fire a scroll event when a new tab is shown, to spur the infinite
scroller to load data as appropriate.
Tim Pierce [Tue, 6 Jan 2015 16:03:10 +0000 (11:03 -0500)]
4598: account for queued and cancelled jobs, fix sorting
Per code review:
* Updated report to include job states "Cancelled" and "Queued" as well
as Failed, Running and Complete, and to take these into account when
calculating job counts.
* Fixed sorting for failure classes.
Tim Pierce [Mon, 5 Jan 2015 19:22:47 +0000 (14:22 -0500)]
4598: formatting and calculation fixes (code review)
Incorporating code review feedback from #4598-13.
Bugs fixed:
* Correct counting and percentage calculation of job failures.
** Jobs were getting categorized as both "unknown" and as a specific failure type.
* Crashes fixed: should not raise any unhandled exceptions.
Formatting fixes:
* Itemized failures are now sorted in descending order by failure type.
* Better horizontal alignment.
* Modified formatting to account for updated description.
Peter Amstutz [Mon, 29 Dec 2014 17:32:38 +0000 (12:32 -0500)]
4869: Correctly handle zero-length blocks in Keep client/Keep proxy. Remove
X-Block-Size. Choose default request timeout based on whether the client is
talking to a proxy. Use double quotes in logging. Rename "tag" to "requestId".
Peter Amstutz [Mon, 29 Dec 2014 14:09:13 +0000 (09:09 -0500)]
4869: KeepClient now has a default timeout per block request (10 minutes). In
keepproxy, the timeout is set to 20 seconds per block. Also rearranged some
keepclient and keepproxy logging to provide better information.
Tom Clegg [Sun, 21 Dec 2014 00:28:56 +0000 (19:28 -0500)]
4875: Let the OS choose port numbers for fake servers.
Fixes a race condition where test case N+1 can't listen on port 2990
because test case N hasn't shut down its listener.
Also removes the artificial acceptance requirement that nobody else on
the testing host is using the arbitrarily assigned port range
2990..299x.
Incidental changes:
* rename RunBogusKeepServer to RunFakeKeepServer (to match
RunSomeFakeKeepServers and fix the misleading implication that the
resulting server does something bogus).
* return a KeepServer object from RunFakeKeepServer (for better parity
with RunSomeFakeKeepServers).