-{% include 'tutorial_expectations' %}
-
-The Arvados distributed file system is called *Keep*. Keep is a content-addressable file system. This means that files are managed using unique identifiers derived from the _contents_ of the file (specifically, the MD5 hash), rather than human-assigned file names. This has a number of advantages:
-* Files can be stored and replicated across a cluster of servers without requiring a central name server.
-* Both the server and client systematically validate data integrity because the checksum is built into the identifier.
-* Data duplication is minimized: two files with the same contents will have the same identifier and will not be stored twice.
-* It avoids data race conditions, since an identifier always points to the same data.
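
The properties listed above can be illustrated with a minimal Python sketch. This is a toy model, not the Arvados SDK or the Keep server: the names @content_id@ and @ToyStore@ are hypothetical and exist only for this example, and the identifier format (MD5 hex digest, a plus sign, and the size in bytes) is an assumption modeled on the locator that appears in the download URL later in this tutorial.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the data itself (hypothetical helper).

    Assumed format: "<md5 hex digest>+<size in bytes>".
    """
    return f"{hashlib.md5(data).hexdigest()}+{len(data)}"


class ToyStore:
    """In-memory stand-in for a content-addressable store (illustration only)."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        key = content_id(data)
        # Identical contents map to the same key, so a second copy is not stored.
        self.blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        data = self.blocks[key]
        # The checksum is part of the identifier, so a reader can verify integrity.
        if content_id(data) != key:
            raise ValueError("stored data does not match its identifier")
        return data


store = ToyStore()
first = store.put(b"ACGT")
second = store.put(b"ACGT")   # same contents, same identifier
other = store.put(b"ACGTT")   # different contents, different identifier
print(first == second, first != other, len(store.blocks))
```

Because the identifier is computed from the bytes alone, any client can derive it without consulting a central name server, and an identifier can never come to refer to different data.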
-
-h1. Putting Data into Keep
-
-We will start by downloading a freely available VCF file from "Personal Genome Project (PGP)":http://www.personalgenomes.org subject "hu599905":https://my.personalgenomes.org/profile/hu599905 to a staging directory on the VM, and adding it to Keep. In the following commands, replace *@you@* with your login name.
-
-First, log into your Arvados VM and set up the staging area:
-
-notextile. <pre><code>~$ <span class="userinput">mkdir /scratch/<b>you</b></span></code></pre>
-
-Next, download the file:
-
-<notextile>
-<pre><code>~$ <span class="userinput">cd /scratch/<b>you</b></span>
-/scratch/<b>you</b>$ <span class="userinput">curl -o var-GS000016015-ASM.tsv.bz2 'https://warehouse.personalgenomes.org/warehouse/f815ec01d5d2f11cb12874ab2ed50daa+234+K@ant/var-GS000016015-ASM.tsv.bz2'</span>
-  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
-                                 Dload  Upload   Total   Spent    Left  Speed
-100  216M  100  216M    0     0  10.0M      0  0:00:21  0:00:21 --:--:-- 9361k
-</code></pre>
-</notextile>