15699: Fix handling of streams with multiple refs to a block ID.
author Tom Clegg <tclegg@veritasgenetics.com>
Fri, 11 Oct 2019 05:11:45 +0000 (01:11 -0400)
committer Tom Clegg <tclegg@veritasgenetics.com>
Fri, 11 Oct 2019 05:11:45 +0000 (01:11 -0400)
commit a50fab63068c1e8d67ce1d477c6f2c2429464b5c
tree ae697c52338aa8bd0dd790b0c3240800f209b0a5
parent 36990378eeaef059619855a17ec824436502c52d

The manifest normalization code in the Ruby SDK (and therefore
Workbench) was based on an incorrect assumption that each block
locator could only appear once in a given stream.

If a manifest referenced the same block more than once, copying a file
from that manifest into a new one would produce a new file with the
correct size, but wrong data.
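
To make the failure mode concrete, here is a minimal standalone sketch, not the SDK's actual code. It assumes the usual manifest stream layout (stream name, then block locators of the form md5+size, then position:size:filename tokens); the helper names `locator_for` and `block_refs` are hypothetical. It shows that a stream referencing one block twice has three block positions, while an index keyed by locator can remember only two.

```ruby
require 'digest'

# Hypothetical helper: build a locator ("md5+size") for a piece of data.
def locator_for(data)
  "#{Digest::MD5.hexdigest(data)}+#{data.bytesize}"
end

foo = locator_for("foo")
bar = locator_for("bar")

# One stream whose block list references the "foo" block twice.
# The file token covers all nine bytes of the stream.
stream_line = ". #{foo} #{bar} #{foo} 0:9:foobarfoo.txt"

# Hypothetical helper: list every block reference in order, recording
# where each reference starts within the stream.
def block_refs(stream_line)
  pos = 0
  stream_line.split.drop(1).grep(/\A\h{32}\+\d+/).map do |loc|
    size = loc.split("+")[1].to_i
    ref = { locator: loc, stream_pos: pos, size: size }
    pos += size
    ref
  end
end

refs = block_refs(stream_line)

# Illustration of the flawed assumption: keying by locator keeps a single
# stream position per block, so one of the two "foo" references is lost.
by_locator = refs.map { |r| [r[:locator], r[:stream_pos]] }.to_h

puts refs.length        # number of block references in the stream
puts by_locator.length  # number of distinct locators
```

Any lookup that goes from locator to stream position through such an index resolves both "foo" references to the same offset, which is one way a copied file can end up with the right length but the wrong bytes.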

The new code uses a different strategy that deduplicates block
references in common cases, although not in all possible cases.

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tclegg@veritasgenetics.com>
sdk/ruby/lib/arvados/collection.rb
sdk/ruby/test/test_collection.rb