Brett Smith [Tue, 14 Apr 2015 17:13:29 +0000 (13:13 -0400)]
5717: crunch-job uses fewer slots when few tasks at this level.
When crunch-job begins tasks at a new level, it looks at the number of
tasks scheduled for that level. If that's smaller than the maximum
number of slots available, then it only considers slots "free" up to
the number of tasks scheduled, or the number of nodes available,
whichever is greater.
This change lets Crunch scale whole-node resources like RAM more
effectively. It may not be desired when a level starts with a small
number of tasks queued and later schedules more that want maximum
parallelization, but that's uncommon enough that this seems like a net
win. Previously, Crunch could overallocate RAM in this scenario,
which seems worse.
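The slot rule described above can be sketched as follows. This is a minimal
illustration in Python, not the crunch-job code itself (crunch-job is Perl);
the function and parameter names are hypothetical.

```python
def free_slot_limit(max_slots, tasks_scheduled, nodes_available):
    """Return how many slots to treat as "free" when a new level starts.

    Hypothetical helper: names and signature are assumptions made for
    illustration, not taken from crunch-job.
    """
    if tasks_scheduled >= max_slots:
        # Enough tasks to fill every slot: no reduction.
        return max_slots
    # Fewer tasks than slots: use the greater of tasks and nodes,
    # never exceeding the real maximum.
    return min(max_slots, max(tasks_scheduled, nodes_available))
```

With 16 slots on 2 nodes, a level that starts with 3 tasks would use 3 slots,
and a level that starts with 1 task would still use 2 (one per node).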
Brett Smith [Mon, 13 Apr 2015 20:48:16 +0000 (16:48 -0400)]
5714: Avoid Node Manager race conditions around stop_if_no_cloud_node.
Checking .is_alive() seems to always lead to race conditions.
Instead, have CloudNodeSetupActor.stop_if_no_cloud_node() return True
if it's going to stop, else False. Have NodeManagerDaemonActor
respect this return value consistently.
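A rough sketch of the pattern, assuming plain Python objects in place of the
real actors. The class, the helper function, and the "booting" dictionary are
hypothetical stand-ins; only the method name stop_if_no_cloud_node comes from
the message above.

```python
class SetupActorSketch:
    """Stand-in for a setup actor (hypothetical, not the real class)."""

    def __init__(self, cloud_node=None):
        self.cloud_node = cloud_node
        self.stopped = False

    def stop_if_no_cloud_node(self):
        # Decide and report in one step, so the caller never has to
        # poll liveness afterwards and race against the shutdown.
        if self.cloud_node is not None:
            return False
        self.stopped = True
        return True


def cancel_pending_setup(booting, key):
    """Drop a booting entry only if its actor says it is stopping."""
    actor = booting[key]
    if actor.stop_if_no_cloud_node():
        del booting[key]
        return True
    return False
```

The caller acts on the returned value alone, which is the point of the change:
the answer and the stop decision come from the same call.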
5440: remove all links to /start temporarily to avoid confusing users
Put in quickstart sections on docs homepage for both 1) public pipeline 2) pipeline developers
Revert changes made in #5090 (home title back to "Arvados Docs", topnav link to "arvados.org")
Refactor /user/index.html, removing references to /start and directing new users to homepage quickstart section
* Get links with API list calls, instead of fetching each one
individually.
* Get a list mapping portable data hashes to UUIDs, and add a single
UUID per portable data hash to the fetch list. This helps us avoid
downloading multiple copies of the same manifest text, and is probably
the single-biggest win in this entire commit for most use cases.
* Use the Ruby SDK to build the new collection. This lets us avoid
spawning new arv-normalize processes, and piping large manifests to
them. It also lets us build the entire collection and normalize
only when we're done.
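The deduplication step in the second bullet might look like this. It is a
sketch in Python with hypothetical names; the commit itself uses the Ruby SDK,
and the input here is assumed to be (portable data hash, UUID) pairs already
returned by list calls.

```python
def one_uuid_per_pdh(pairs):
    """Pick a single UUID for each portable data hash.

    `pairs` is an iterable of (portable_data_hash, uuid) tuples.
    The first UUID seen for each hash wins; this choice is an
    assumption, since any UUID with that hash has the same manifest.
    """
    chosen = {}
    for pdh, uuid in pairs:
        chosen.setdefault(pdh, uuid)
    # One fetch per distinct manifest instead of one per collection.
    return list(chosen.values())
```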
Peter Amstutz [Mon, 13 Apr 2015 14:18:16 +0000 (10:18 -0400)]
5692: Add flush flag to manifest_text() which calls commit_all(). Added
portable_manifest_text() which returns the stripped, normalized manifest text
free of side effects. Fixed tests.
Peter Amstutz [Fri, 10 Apr 2015 19:27:05 +0000 (15:27 -0400)]
5692: Collection.manifest_text(strip=False) will flush open files and wait for
all blocks to be committed in order to return a manifest text with valid
authorization tokens. Fix tests affected by the change.
Tom Clegg [Sun, 29 Mar 2015 00:26:00 +0000 (20:26 -0400)]
5414: Add client support for Keep service hints.
Also, some incidental improvements in nearby code:
* Consistent logging in keepproxy, with one reusable logging statement
instead of a different statement/format for each outcome.
* In sdk/go/keepclient, remove public AuthorizedGet and AuthorizedAsk
methods. Instead, Get() and Ask() accept a locator (with or without
a permission token) and do the right thing. Callers don't have to
parse locators to decide which method to call.
* In sdk/go/keepclient, use an RWMutex instead of atomic.LoadPointer()
and unsafe.Pointer() to update KeepClient root maps safely.
* In sdk/go/keepclient, DiscoverKeepServers() doesn't return the new
root maps, just an error. In normal usage, the caller only cares
whether discovery was successful.
Also, some Go style fixes in nearby code:
* Use pointer receivers for all KeepClient methods.
https://golang.org/doc/faq#methods_on_values_or_pointers
* Use receiver name "kc", not "this".
https://github.com/golang/go/wiki/CodeReviewComments#receiver-names
* Handle errors first, use minimal indentation for normal code path.
https://github.com/golang/go/wiki/CodeReviewComments#indent-error-flow
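To illustrate why callers no longer need to parse locators before choosing a
method, here is a small Python sketch (not the Go client code). It assumes
locators are "+"-separated, that a permission token appears as a hint starting
with "A", and that a service hint starts with "K@"; the function name and the
returned dictionary are hypothetical.

```python
def parse_locator(locator):
    """Split a locator into hash, size, and the hints of interest.

    Assumed format: <hash>[+<size>][+<hint>...], where a hint beginning
    with "A" is a permission token and "K@" marks a service hint.
    """
    parts = locator.split("+")
    block_hash = parts[0]
    has_size = len(parts) > 1 and parts[1].isdigit()
    size = int(parts[1]) if has_size else None
    hints = parts[2:] if has_size else parts[1:]
    return {
        "hash": block_hash,
        "size": size,
        "signed": any(h.startswith("A") for h in hints),
        "service_hints": [h[2:] for h in hints if h.startswith("K@")],
    }
```

A single entry point can inspect "signed" and "service_hints" itself, so the
same call works for bare and signed locators alike.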
Brett Smith [Wed, 8 Apr 2015 13:46:23 +0000 (09:46 -0400)]
5642: Explicitly make all swap available under Docker in crunch-job.
Without this, Docker 1.2 through 1.5 send subprocesses SIGKILL if they
exceed the memory limit. Refer to #5642 for an example.
--memory-swap is pretty new (newer than 1.3.3), so we don't want to
require it. At the same time, we don't want to impose any memory
limits if we can't use it, because killing subprocesses that exceed a
--memory limit is too strict. This commit arranges to use both
--memory and --memory-swap only if the latter is available.
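The all-or-nothing choice can be sketched like this in Python (crunch-job
itself is Perl). The function name, the idea of checking captured help text,
and the kilobyte units are assumptions for illustration; --memory-swap is
treated as the combined memory-plus-swap total, which is how Docker defines it.

```python
def docker_memory_args(docker_run_help, ram_kb, swap_kb):
    """Return memory-limit arguments for `docker run`, or none at all.

    `docker_run_help` is assumed to be help text captured from the
    installed Docker; how it is obtained is outside this sketch.
    """
    if "--memory-swap" not in docker_run_help:
        # Without --memory-swap, a bare --memory limit would get
        # subprocesses killed, so impose no limit at all.
        return []
    return [
        "--memory=%dk" % ram_kb,
        "--memory-swap=%dk" % (ram_kb + swap_kb),
    ]
```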