Peter Amstutz [Fri, 10 Feb 2017 16:02:33 +0000 (11:02 -0500)]
9397: Update comment & tests for CollectionFileReader to reflect it is more
lenient in the paths it accepts as a result of updates to implementation of
manifest.FileSegmentIterByName.
Lucas Di Pentima [Wed, 8 Feb 2017 23:03:32 +0000 (20:03 -0300)]
10956: When asked for the recently uploaded collection's PDH, arv-put computes a PDH from the local collection's manifest and compares it with the version provided by the API server. If they differ, it logs a warning. It always returns the API server's version.
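A minimal sketch of the comparison described above, not the actual arv-put code. It assumes a PDH has the form "<MD5 hex digest of the manifest text>+<manifest length in bytes>"; the function names local_pdh and portable_data_hash and the logger name are hypothetical.

```python
import hashlib
import logging

logger = logging.getLogger("pdh_sketch")  # hypothetical logger name


def local_pdh(manifest_text):
    """Compute a PDH locally (assumed format: md5 hex digest + '+' + byte length)."""
    data = manifest_text.encode("utf-8")
    return "{}+{}".format(hashlib.md5(data).hexdigest(), len(data))


def portable_data_hash(manifest_text, api_pdh):
    """Warn if the local and API server values differ; always return the API value."""
    computed = local_pdh(manifest_text)
    if computed != api_pdh:
        logger.warning(
            "Locally computed PDH %s differs from API server PDH %s",
            computed, api_pdh)
    return api_pdh
```

The API server's value wins in every case; the local computation only serves as a consistency check.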
Lucas Di Pentima [Wed, 8 Feb 2017 22:37:45 +0000 (19:37 -0300)]
10956: Get PDH from API server's response when saving a collection so that it doesn't have to be calculated when being asked for later on.
Updated tests to reflect this change.
Lucas Di Pentima [Wed, 8 Feb 2017 16:18:36 +0000 (13:18 -0300)]
3900: When deleting items inside a project, use the delete API. In the special case of trashing collections, first remove them from their parent project.
Peter Amstutz [Mon, 6 Feb 2017 22:16:01 +0000 (17:16 -0500)]
9397: Add manifest normalization and sub-manifest extraction by path.
Introduces "SegmentedManifest" which stores streams -> files -> file segments.
Enables reexport of manifest in normalized form, as well as extraction of
individual files, streams or sets of streams. Also adds binary search for
efficiently determining first block to access for some stream offset.
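The binary search mentioned above can be sketched as follows. This is an illustration in Python rather than the commit's own code; the function name first_block_for_offset and the representation of a stream as a list of block sizes are assumptions.

```python
import bisect


def first_block_for_offset(block_sizes, offset):
    """Return (block_index, offset_within_block) for a byte offset in a stream."""
    # starts[i] is the stream offset at which block i begins.
    starts = []
    total = 0
    for size in block_sizes:
        starts.append(total)
        total += size
    if offset < 0 or offset >= total:
        raise ValueError("offset is outside the stream")
    # bisect_right returns the index of the first start greater than offset,
    # so the block just before it is the one that contains the offset.
    index = bisect.bisect_right(starts, offset) - 1
    return index, offset - starts[index]
```

For a stream with blocks of 10, 20 and 5 bytes, offset 10 falls at the start of the second block, so the call returns (1, 0). In practice the starts list would be built once per stream and reused across lookups.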
Lucas Di Pentima [Fri, 3 Feb 2017 19:21:24 +0000 (16:21 -0300)]
10968: Added a notification when uploading at least one directory, to let the user know that computing the expected byte count can take some time when uploading many files.
Lucas Di Pentima [Fri, 3 Feb 2017 17:52:16 +0000 (14:52 -0300)]
10968: Changed the periodic update thread to run every 1 second while arv-put is checking which files to skip, only notifying the user via the progress indicator.
When it starts uploading the rest of the files, the update thread returns to the previous behaviour, running once every minute and checkpointing to the cache.
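A rough sketch of an update thread with two intervals, under stated assumptions: the class name PeriodicUpdater, its methods, and the report/checkpoint callbacks are hypothetical, and reporting progress during the upload phase as well is an assumption not stated in the message.

```python
import threading


class PeriodicUpdater:
    """Runs tick() every `fast` seconds while checking files, then every `slow` seconds."""

    def __init__(self, report, checkpoint, fast=1.0, slow=60.0):
        self._report = report          # callback that updates the progress indicator
        self._checkpoint = checkpoint  # callback that saves state to the cache
        self._fast = fast
        self._slow = slow
        self._checking = True
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def interval(self):
        return self._fast if self._checking else self._slow

    def finish_checking(self):
        # Called once the "which files to skip" phase is over.
        self._checking = False

    def tick(self):
        self._report()
        if not self._checking:
            # Checkpoint only during the upload phase.
            self._checkpoint()

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.interval()):
            self.tick()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

Waiting on an Event instead of sleeping lets stop() interrupt a long one-minute wait immediately.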
Lucas Di Pentima [Fri, 3 Feb 2017 15:09:58 +0000 (12:09 -0300)]
10932: Changed _file_paths from a list to a set so that it no longer has to be copied when checking for missing files in the local collection at resume start.
Added comments on cache saving explaining why it is better to use json.dumps() instead of copy.deepcopy().
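To illustrate the list-to-set change, here is a minimal sketch of a missing-file check; the function name files_missing_locally and its arguments are hypothetical and not taken from arv-put.

```python
def files_missing_locally(collection_paths, local_paths):
    """Return collection paths that are absent from the local file list."""
    # Membership tests on a set take constant time on average, whereas
    # `p in some_list` scans the list on every lookup.
    local = set(local_paths)
    return [p for p in collection_paths if p not in local]
```

When the paths are already kept in a set, as the commit does with _file_paths, the conversion step disappears as well.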
radhika [Fri, 3 Feb 2017 00:16:25 +0000 (19:16 -0500)]
9397: Use manifest.FileSegmentForPath to get the manifest segment for a file path. Cache collections to avoid fetching
the same collection repeatedly. If no manifest segment is found for a mounted path, log that fact.
Peter Amstutz [Thu, 2 Feb 2017 14:41:30 +0000 (09:41 -0500)]
10846: Revise shutdown behavior.
* Remove node blacklisting. Because of arvados node record reuse and assigning
compute node ids based on record uuid, it was possible to boot a new node
with the same id as a previously blacklisted node. Previously blacklisted
'broken' nodes are now considered 'down' when determining if it is necessary
to bring up new nodes.
* Failure to destroy a node is no longer retried by the shutdown actor. A
failure cancels the shutdown. The daemon is expected to schedule a new
shutdown attempt.
* Restored the concept of "cancellable" shutdown, which checks if the node is
still shutdown eligible before actually making the call to shut it down.
* Adjusted mocking behavior to fix tests which were producing suppressed
errors (visible in debug logging but not failing the test) when node sizes
were inconsistent between the wishlist, cloud_node objects, and
ServerCalculator.
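The revised shutdown flow could look roughly like the sketch below. It is an illustration only: attempt_shutdown, is_eligible and destroy are hypothetical names, and the real node manager uses actors rather than a plain function.

```python
def attempt_shutdown(node, is_eligible, destroy, cancellable=True):
    """Try once to shut down a node; return (success, reason)."""
    # A cancellable shutdown re-checks eligibility right before destroying.
    if cancellable and not is_eligible(node):
        return False, "node is no longer eligible for shutdown"
    try:
        destroy(node)
    except Exception as error:
        # No retry here: a failure cancels this shutdown, and the caller
        # (the daemon, in the commit's terms) schedules a new attempt.
        return False, "destroy failed: {}".format(error)
    return True, "node destroyed"
```

Keeping the retry decision in the caller means a node whose state changed between attempts is re-evaluated before the next shutdown is issued.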
Lucas Di Pentima [Thu, 2 Feb 2017 22:10:45 +0000 (19:10 -0300)]
10932: Replaced the list with a set when checking whether files in the local collection are in the local file list, which greatly reduces the resume start time.
Also, the save_state method was spending too much time on two operations: deepcopy() and json.dump(). Replaced both with a single call to json.dumps(), which is a lot faster than json.dump().
This will improve overall performance when uploading large file collections.
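A minimal sketch of the save_state idea: serialize the state with a single json.dumps() call while holding a lock, instead of taking a deepcopy() and then calling json.dump(). The StateCache class and its methods are hypothetical, not the arv-put implementation.

```python
import json
import threading


class StateCache:
    """Keeps upload state in memory and serializes it for checkpointing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {"files": {}}

    def record(self, path, size):
        with self._lock:
            self._state["files"][path] = {"size": size}

    def snapshot(self):
        # The JSON string is itself an independent copy of the state, so no
        # separate deepcopy() is needed before writing it out.
        with self._lock:
            return json.dumps(self._state)

    def save(self, fileobj):
        # The write happens outside the lock, using the serialized snapshot.
        fileobj.write(self.snapshot())
```

The lock is held only for the serialization step, so other threads can keep updating the state while the snapshot is written to disk.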