---
layout: default
navsection: api
navmenu: API Methods
title: "collections"

...
{% comment %}
Copyright (C) The Arvados Authors. All rights reserved.

SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}

API endpoint base: @https://{{ site.arvados_api_host }}/arvados/v1/collections@

Object type: @4zz18@

Example UUID: @zzzzz-4zz18-0123456789abcde@

h2. Resource

Collections describe sets of files in terms of data blocks stored in Keep.  See "Keep - Content-Addressable Storage":{{site.baseurl}}/architecture/storage.html for details.

Each collection has, in addition to the "Common resource fields":{{site.baseurl}}/api/resources.html:

table(table table-bordered table-condensed).
|_. Attribute|_. Type|_. Description|_. Example|
|name|string|||
|description|text|||
|properties|hash|User-defined metadata, may be used in queries using "subproperty filters":{{site.baseurl}}/api/methods.html#subpropertyfilters ||
|portable_data_hash|string|The MD5 sum of the manifest text stripped of block hints other than the size hint.||
|manifest_text|text|||
|replication_desired|number|Minimum storage replication level desired for each data block referenced by this collection. A value of @null@ signifies that the site default replication level (typically 2) is desired.|@2@|
|replication_confirmed|number|Replication level most recently confirmed by the storage system. This field is null when a collection is first created, and is reset to null when the manifest_text changes in a way that introduces a new data block. An integer value indicates the replication level of the _least replicated_ data block in the collection.|@2@, null|
|replication_confirmed_at|datetime|When replication_confirmed was confirmed. If replication_confirmed is null, this field is also null.||
|trash_at|datetime|If @trash_at@ is non-null and in the past, this collection will be hidden from API calls.  May be untrashed.||
|delete_at|datetime|If @delete_at@ is non-null and in the past, the collection may be permanently deleted.||
|is_trashed|boolean|True if @trash_at@ is in the past, false if not.||
|current_version_uuid|string|UUID of the collection's current version. On new collections, it'll be equal to the @uuid@ attribute.||
|version|number|Version number, starting at 1 on new collections. This attribute is read-only.||
|preserve_version|boolean|When set to true on a current version, it will be persisted. When passing @true@ as part of a bigger update call, both current and newly created versions are persisted.||
|file_count|number|The total number of files in the collection. This attribute is read-only.||
|file_size_total|number|The sum of the file sizes in the collection. This attribute is read-only.||

h3. Conditions of creating a Collection

The @portable_data_hash@ and @manifest_text@ attributes must be provided when creating a Collection. The cryptographic digest of the supplied @manifest_text@ must match the supplied @portable_data_hash@.

h3. Side effects of creating a Collection

Referenced blocks are protected from garbage collection in Keep.

Data can be shared with other users via the Arvados permission model.

h2. Methods

See "Common resource methods":{{site.baseurl}}/api/methods.html for more information about @create@, @delete@, @get@, @list@, and @update@.

Required arguments are displayed in %{background:#ccffcc}green%.

Supports federated @get@ only, which may be called with either a uuid or a portable data hash.  When requesting a portable data hash which is not available on the home cluster, the query is forwarded to all the clusters listed in @RemoteClusters@ and returns the first successful result.

h3. create

Create a new Collection.

Arguments:

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
|collection|object||query||

h3. delete

Put a Collection in the trash.  This sets the @trash_at@ field to @now@ and @delete_at@ field to @now@ + token TTL.  A trashed collection is invisible to most API calls unless the @include_trash@ parameter is true.

Arguments:

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
{background:#ccffcc}.|uuid|string|The UUID of the Collection in question.|path||

h3. get

Gets a Collection's metadata by UUID or portable data hash.  When making a request by portable data hash, the returned record will only have the @portable_data_hash@ and @manifest_text@.

Arguments:

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
{background:#ccffcc}.|uuid|string|The UUID of the Collection in question.|path||

h3. list

List collections.

See "common resource list method.":{{site.baseurl}}/api/methods.html#index

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
|include_trash|boolean (default false)|Include trashed collections.|query||
|include_old_versions|boolean (default false)|Include past versions of the collection(s) being listed, if any.|query||

Note: Because adding access tokens to manifests can be computationally expensive, the @manifest_text@ field is not included in results by default.  If you need it, pass a @select@ parameter that includes @manifest_text@.

h3. update

Update attributes of an existing Collection.

Arguments:

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
{background:#ccffcc}.|uuid|string|The UUID of the Collection in question.|path||
|collection|object||query||

h3. untrash

Remove a Collection from the trash.  This sets the @trash_at@ and @delete_at@ fields to @null@.

Arguments:

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
{background:#ccffcc}.|uuid|string|The UUID of the Collection to untrash.|path||
|ensure_unique_name|boolean (default false)|Rename collection uniquely if untrashing it would fail with a unique name conflict.|query||


h3. provenance

Returns a list of objects in the database that directly or indirectly contributed to producing this collection, such as the container request that produced this collection as output.

The general algorithm is:

# Visit the container request that produced this collection (via @output_uuid@ or @log_uuid@ attributes of the container request)
# Visit the input collections to that container request (via @mounts@ and @container_image@ of the container request)
# Iterate until there are no more objects to visit

Arguments:

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
{background:#ccffcc}.|uuid|string|The UUID of the Collection to get provenance.|path||

h3. used_by

Returns a list of objects in the database this collection directly or indirectly contributed to, such as containers that takes this collection as input.

The general algorithm is:

# Visit containers that take this collection as input (via @mounts@ or @container_image@ of the container)
# Visit collections produced by those containers (via @output@ or @log@ of the container)
# Iterate until there are no more objects to visit

Arguments:

table(table table-bordered table-condensed).
|_. Argument |_. Type |_. Description |_. Location |_. Example |
{background:#ccffcc}.|uuid|string|The UUID of the Collection to get usage.|path||