title: Computing with Crunch
...
-Crunch is the name for the Arvados system for managing computation. It provides an abstract API to various clouds and HPC resource allocation and scheduling systems.
+Crunch is the name for the Arvados system for managing computation. It provides an abstract API to various clouds and HPC resource allocation and scheduling systems, and integrates closely with Keep storage and the Arvados permission system.
h2. Container API
This reference describes the semantics of Arvados resources and how to programmatically access Arvados via its REST API. Each resource listed in this section is exposed on the Arvados API server under the @/arvados/v1/@ path prefix, for example, @https://{{ site.arvados_api_host }}/arvados/v1/collections@.
The API server publishes a machine-readable description of its endpoints and some additional site configuration values via a JSON-formatted discovery document. This is available at @/discovery/v1/apis/arvados/v1/rest@, for example @https://{{ site.arvados_api_host }}/discovery/v1/apis/arvados/v1/rest@. Some Arvados SDKs use the discovery document to generate language bindings.
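
As a minimal sketch of the two URL patterns described above, the following Python helpers build a resource URL and the discovery document URL for a given API host, and fetch the discovery document with the standard library. The function names and the host @arvados.example.com@ are illustrative assumptions, not part of any Arvados SDK.

```python
import json
import urllib.request


def resource_url(api_host: str, resource: str) -> str:
    """Build the URL of a resource under the /arvados/v1/ path prefix."""
    return f"https://{api_host}/arvados/v1/{resource}"


def discovery_url(api_host: str) -> str:
    """Build the URL of the JSON discovery document."""
    return f"https://{api_host}/discovery/v1/apis/arvados/v1/rest"


def fetch_discovery_document(api_host: str) -> dict:
    """Download and parse the discovery document (requires network access)."""
    with urllib.request.urlopen(discovery_url(api_host)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical host name, used only to show the resulting URLs.
    print(resource_url("arvados.example.com", "collections"))
    print(discovery_url("arvados.example.com"))
```

In practice an SDK would normally handle this step; the sketch only makes the URL layout concrete.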
+
+Many Arvados Workbench pages, under the *Advanced* tab, provide examples of API and SDK use for accessing the current resource.
Required arguments are displayed in %{background:#ccffcc}green%.
-h2. Methods
-
-See "collections":{{site.baseurl}}/api/methods/collections.html
-
-h3. Conditions of creating a Collection
-
-The @uuid@ and @manifest_text@ attributes must be provided when creating a Collection. The cryptographic digest of the supplied @manifest_text@ must match the supplied @uuid@.
-
-h3. Side effects of creating a Collection
-
-Referenced data can be protected from garbage collection. See the section about "resources" links on the "Links":Link.html page.
-
-Data can be shared with other users via the Arvados permission model.
-
-Clients can request checks of data integrity and storage redundancy.
-
h2. Resource
Each collection has, in addition to the usual "attributes of Arvados resources":{{site.baseurl}}/api/resources.html:
|_. Attribute|_. Type|_. Description|_. Example|
|name|string|||
|description|text|||
-|portable_data_hash|string|||
+|portable_data_hash|string|The MD5 sum of the stripped manifest text.||
|manifest_text|text|||
|replication_desired|number|Minimum storage replication level desired for each data block referenced by this collection. A value of @null@ signifies that the site default replication level (typically 2) is desired.|@2@|
|replication_confirmed|number|Replication level most recently confirmed by the storage system. This field is null when a collection is first created, and is reset to null when the manifest_text changes in a way that introduces a new data block. An integer value indicates the replication level of the _least replicated_ data block in the collection.|@2@, null|
|replication_confirmed_at|datetime|When replication_confirmed was confirmed. If replication_confirmed is null, this field is also null.||
+h3. Conditions of creating a Collection
+
+The @portable_data_hash@ and @manifest_text@ attributes must be provided when creating a Collection. The cryptographic digest of the supplied @manifest_text@ must match the supplied @portable_data_hash@.
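
The digest check described above could look roughly like the following Python sketch. It assumes the manifest text has already been stripped and that the supplied value is a bare hexadecimal MD5 digest; the exact stripping rules and the full @portable_data_hash@ format are not specified here, and the function names are hypothetical.

```python
import hashlib


def manifest_digest(stripped_manifest_text: str) -> str:
    """Return the MD5 hex digest of an already-stripped manifest text."""
    return hashlib.md5(stripped_manifest_text.encode("utf-8")).hexdigest()


def digest_matches(stripped_manifest_text: str, supplied_digest: str) -> bool:
    """Check that the supplied digest matches the manifest text.

    Assumption: supplied_digest is a bare MD5 hex string.
    """
    return manifest_digest(stripped_manifest_text) == supplied_digest
```

A client can run the same computation before creating the collection to avoid a rejected request.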
+
+h3. Side effects of creating a Collection
+
+Referenced blocks are protected from garbage collection in Keep.
+
+Data can be shared with other users via the Arvados permission model.
+
h2. Methods
h3. create
--- /dev/null
+---
+layout: default
+navsection: api
+title: Storage in Keep
+...
+
+h2. Storing data
+
+Storing data in Keep follows this process:
+
+# The client fetches a list of Keep servers (or proxies) using the @accessible@ method on "keep_services":{{site.baseurl}}/api/methods/keep_services.html
+# Data is split into 64 MiB blocks and the MD5 hash is computed for each block.
+# The client uploads each block to one or more Keep servers, based on the number of desired replicas. The priority order is determined using rendezvous hashing, described below.
+# The Keep server returns a block locator (the MD5 sum of the block) and a "signed token" which the client can use as proof of knowledge for the block.
+# The client constructs a @manifest@ which lists the blocks by MD5 hash and how to reassemble them into the original files.
+# The client creates a "collection":{{site.baseurl}}/api/methods/collections.html and provides the @manifest_text@.
+# The API server accepts the collection after validating the signed tokens (proof of knowledge) for each block.
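
Step 2 of the process above (splitting data into blocks and hashing each block) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and a real client would stream data rather than hold it all in memory.

```python
import hashlib

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MiB, as stated in step 2


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield consecutive chunks of at most block_size bytes."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE):
    """Return a list of (MD5 hex digest, size in bytes), one per block."""
    return [
        (hashlib.md5(block).hexdigest(), len(block))
        for block in split_into_blocks(data, block_size)
    ]
```

The resulting digests are the values a manifest would list for the blocks; the upload, token signing and manifest construction steps are not covered by this sketch.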
+
+h2. Fetching data
+
+# The client requests a @collection@ object including @manifest_text@.
+# The server adds "token signatures" which provide proof of access for each block.
+# The client fetches a list of Keep servers (or proxies) using the @accessible@ method on "keep_services":{{site.baseurl}}/api/methods/keep_services.html
+# For each data block, the client chooses the highest priority server using rendezvous hashing, described below.
+# The client sends the data block request to the Keep server, including the token signature (proof of access).
+# The server provides the block data after validating the token signature for the block.
+
+h2. Keep server API
+
+The Keep server is accessed via a simple HTTP REST API.
+
+*GET /blockidentifier+size+A@token*
+
+Fetch the data block, if the token is valid.
+
+*PUT /blockidentifier*
+
+Store the data block. Returns the MD5 sum of the data along with the signed token.
+
+h2. Rendezvous hashing
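
The following Python sketch shows generic rendezvous (highest random weight) hashing: every client computes a weight for each server from the block hash and the server identifier, then sorts the servers by that weight. The weight function used here (MD5 over the concatenated block hash and server identifier) and the server identifiers are assumptions for illustration; the weighting actually used by Keep clients is not specified in this document.

```python
import hashlib
from typing import List


def rendezvous_order(block_hash: str, server_ids: List[str]) -> List[str]:
    """Return server_ids sorted from highest to lowest weight for a block."""

    def weight(server_id: str) -> str:
        # Hypothetical weight: MD5 over block hash + server identifier.
        combined = (block_hash + server_id).encode("utf-8")
        return hashlib.md5(combined).hexdigest()

    return sorted(server_ids, key=weight, reverse=True)


if __name__ == "__main__":
    # Hypothetical server identifiers and an example block hash.
    servers = ["keep0", "keep1", "keep2", "keep3"]
    print(rendezvous_order("d41d8cd98f00b204e9800998ecf8427e", servers))
```

Because each server's weight depends only on the block and that server, all clients arrive at the same order without coordination, and removing one server leaves the relative order of the remaining servers unchanged.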
+
+
+
+h2. Manifest format
title: Examples
...
+See "Arvados GoDoc":https://godoc.org/git.curoverse.com/arvados.git/sdk/go for detailed documentation.
+
h2. Initialize SDK
<pre>
The Go ("Golang":http://golang.org) SDK provides a generic set of wrappers so you can make API calls easily.
+See "Arvados GoDoc":https://godoc.org/git.curoverse.com/arvados.git/sdk/go for detailed documentation.
+
h3. Installation
Use @go get git.curoverse.com/arvados.git/sdk/go/arvadosclient@. The go tools will fetch the relevant code and dependencies for you.
* "Perl SDK":{{site.baseurl}}/sdk/perl/index.html
* "Ruby SDK":{{site.baseurl}}/sdk/ruby/index.html
* "Java SDK":{{site.baseurl}}/sdk/java/index.html
+
+Many Arvados Workbench pages, under the *Advanced* tab, provide examples of API and SDK use for accessing the current resource.
<notextile>
<pre><code>~$ <span class="userinput">virtualenv ~/venv</span>
~$ <span class="userinput">. ~/venv/bin/activate</span>
+~$ <span class="userinput">pip install -U setuptools</span>
~$ <span class="userinput">pip install arvados-cwl-runner</span>
</code></pre>
</notextile>