---
layout: default
navsection: userguide
title: "Accessing Keep from GNU/Linux"
...
{% comment %}
Copyright (C) The Arvados Authors. All rights reserved.

SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}

This tutorial describes how to access Arvados collections on GNU/Linux using traditional filesystem tools by mounting Keep as a file system using @arv-mount@.

{% include 'tutorial_expectations' %}

h2. Arv-mount

@arv-mount@ provides several features:

* You can browse, open and read Keep entries as if they were regular files.
* It is easy for existing tools to access files in Keep.
* Data is streamed on demand. It is not necessary to download an entire file or collection to start processing.

The default mode permits browsing any collection in Arvados as a subdirectory under the mount directory. To avoid having to fetch a potentially large list of all collections, collection directories only come into existence when explicitly accessed by UUID or portable data hash. For instance, a collection may be found by its content hash in the @keep/by_id@ directory.

<notextile>
<pre><code>~$ mkdir -p keep
~$ arv-mount keep
~$ cd keep/by_id/c1bad4b39ca5a924e481008009d94e32+210
~/keep/by_id/c1bad4b39ca5a924e481008009d94e32+210$ ls
var-GS000016015-ASM.tsv.bz2
~/keep/by_id/c1bad4b39ca5a924e481008009d94e32+210$ md5sum var-GS000016015-ASM.tsv.bz2
44b8ae3fde7a8a88d2f7ebd237625b4f  var-GS000016015-ASM.tsv.bz2
~/keep/by_id/c1bad4b39ca5a924e481008009d94e32+210$ cd ../..
~$ fusermount -u keep
</code></pre>
</notextile>

The last line unmounts Keep. Subdirectories will no longer be accessible.

In the top level directory of each collection, @arv-mount@ provides a special file called @.arvados#collection@ that contains a JSON-formatted API record for the collection. This can be used to determine the collection's @portable_data_hash@, @uuid@, etc. This file does not show up in @ls@ or @ls -a@.

h3. Modifying files and directories in Keep

By default, all files in the Keep mount are read-only. However, @arv-mount --read-write@ enables you to perform the following operations using normal Unix command line tools (@touch@, @mv@, @rm@, @mkdir@, @rmdir@) and your own programs using standard POSIX file system APIs:

* Create, update, rename and delete individual files within collections
* Create and delete subdirectories inside collections
* Move files and directories within and between collections
* Create and delete collections within a project (using @mkdir@ and @rmdir@ in a project directory)

Not supported:

* Symlinks, hard links
* Changing permissions
* Extended attributes
* Moving a subdirectory of a collection into a project, or moving a collection from a project into another collection

If multiple clients (separate instances of @arv-mount@ or other Arvados applications) modify the same file in the same collection within a short time interval, this may result in a conflict. In this case, the most recent commit wins, and the "loser" will be renamed to a conflict file in the form @name~YYYYMMDD-HHMMSS~conflict~@.

Please note this feature is in beta testing. In particular, the conflict mechanism is itself currently subject to race conditions with potential for data loss when a collection is being modified simultaneously by multiple clients. This issue will be resolved in future development.
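
Because conflict copies follow the @name~YYYYMMDD-HHMMSS~conflict~@ naming pattern described above, they can be located with ordinary tools. The following is a minimal sketch, not part of @arv-mount@: the helper name @list_conflicts@ is hypothetical, and it assumes conflict copies appear as regular files whose names end in exactly that pattern (eight digits, a hyphen, six digits).

```shell
#!/bin/sh
# Hypothetical helper (not provided by arv-mount): list files under a
# directory whose names end in ~YYYYMMDD-HHMMSS~conflict~.
# Assumption: conflict copies are regular files named exactly as the
# pattern in this document describes.
list_conflicts() {
    # $1: directory to search, e.g. a collection directory in a Keep mount
    find "$1" -type f \
        -name '*~[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]~conflict~' \
        | sort
}
```

With Keep mounted as in the earlier example, running @list_conflicts keep/by_id/c1bad4b39ca5a924e481008009d94e32+210@ would print any conflict copies in that collection, one path per line, and print nothing if there are none. Reviewing these files after a period of concurrent writes is one way to notice which changes lost a conflict.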