--- layout: default navsection: userguide title: "Customizing Crunch environment using Docker" ... This page describes how to customize the runtime environment (e.g. the programs, libraries, and other dependencies needed to run a job) that a crunch script will be run in using "Docker.":https://www.docker.com/ Docker is a tool for building and running containers that isolate applications from other applications running on the same node. For detailed information about Docker, see the "Docker User Guide.":https://docs.docker.com/userguide/ This page will demonstrate how to: # Fetch the arvados/jobs Docker image # Manually install additional software into the container # Create a new custom image # Upload that image to Arvados for use by Crunch jobs # Share your image with others {% include 'tutorial_expectations' %} You also need ensure that "Docker is installed,":https://docs.docker.com/installation/ the Docker daemon is running, and you have permission to access Docker. You can test this by running @docker version@. If you receive a permission denied error, your user account may need to be added to the @docker@ group. If you have root access, you can add yourself to the @docker@ group using @$ sudo addgroup $USER docker@ then log out and log back in again; otherwise consult your local sysadmin. h2. Fetch a starting image The easiest way to begin is to start from the "arvados/jobs" image which already has the Arvados SDK installed along with other configuration required for use with Crunch. Download the latest "arvados/jobs" image from the Docker registry:
$ docker pull arvados/jobs
Pulling repository arvados/jobs
3132168f2acb: Download complete
a42b7f2c59b6: Download complete
e5afdf26a7ae: Download complete
5cae48636278: Download complete
7a4f91b70558: Download complete
a04a275c1fd6: Download complete
c433ff206a22: Download complete
b2e539b45f96: Download complete
073b2581c6be: Download complete
593915af19dc: Download complete
32260b35005e: Download complete
6e5b860c1cde: Download complete
95f0bfb43d4d: Download complete
c7fd77eedb96: Download complete
0d7685aafd00: Download complete
h2. Install new packages Next, enter the container using @docker run@, providing the arvados/jobs image and the program you want to run (in this case the bash shell).
$ docker run --interactive --tty --user root arvados/jobs /bin/bash
root@a0e8299b59aa:/#
Next, update the package list using @apt-get update@.
root@a0e8299b59aa:/# apt-get update
Get:1 http://apt.arvados.org wheezy Release.gpg [490 B]
Get:2 http://apt.arvados.org wheezy Release [1568 B]
Get:3 http://apt.arvados.org wheezy/main amd64 Packages [34.6 kB]
Get:4 http://ftp.us.debian.org wheezy Release.gpg [1655 B]
Get:5 http://ftp.us.debian.org wheezy-updates Release.gpg [836 B]
Get:6 http://ftp.us.debian.org wheezy Release [168 kB]
Ign http://apt.arvados.org wheezy/main Translation-en
Get:7 http://security.debian.org wheezy/updates Release.gpg [836 B]
Get:8 http://security.debian.org wheezy/updates Release [102 kB]
Get:9 http://ftp.us.debian.org wheezy-updates Release [124 kB]
Get:10 http://ftp.us.debian.org wheezy/main amd64 Packages [5841 kB]
Get:11 http://security.debian.org wheezy/updates/main amd64 Packages [218 kB]
Get:12 http://security.debian.org wheezy/updates/main Translation-en [123 kB]
Hit http://ftp.us.debian.org wheezy/main Translation-en
Hit http://ftp.us.debian.org wheezy-updates/main amd64 Packages/DiffIndex
Hit http://ftp.us.debian.org wheezy-updates/main Translation-en/DiffIndex
Fetched 6617 kB in 5s (1209 kB/s)
Reading package lists... Done
In this example, we will install the "R" statistical language Debian package "r-base-core". Use @apt-get install@:
root@a0e8299b59aa:/# apt-get install r-base-core
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  [...]
libxv1 libxxf86dga1 libxxf86vm1 r-base-core r-base-dev r-base-html r-cran-boot r-cran-class r-cran-cluster r-cran-codetools
  [...]
Suggested packages:
  [...]
The following NEW packages will be installed:
  [...]
  libxv1 libxxf86dga1 libxxf86vm1 r-base r-base-core r-base-dev r-base-html r-cran-boot r-cran-class r-cran-cluster
  [...]
0 upgraded, 107 newly installed, 0 to remove and 9 not upgraded.
Need to get 88.2 MB of archives.
After this operation, 219 MB of additional disk space will be used.
Do you want to continue [Y/n]? y
[...]
Get:85 http://ftp.us.debian.org/debian/ wheezy/main r-base-core amd64 2.15.1-4 [20.6 MB]
Get:86 http://ftp.us.debian.org/debian/ wheezy/main r-base-dev all 2.15.1-4 [3882 B]
Get:87 http://ftp.us.debian.org/debian/ wheezy/main r-cran-boot all 1.3-5-1 [472 kB]
[...]
Fetched 88.2 MB in 2min 17s (642 kB/s)
Extracting templates from packages: 100%
Preconfiguring packages ...
[...]
Unpacking r-base-core (from .../r-base-core_2.15.1-4_amd64.deb) ...
Selecting previously unselected package r-base-dev.
Unpacking r-base-dev (from .../r-base-dev_2.15.1-4_all.deb) ...
Selecting previously unselected package r-cran-boot.
Unpacking r-cran-boot (from .../r-cran-boot_1.3-5-1_all.deb) ...
[...]
Setting up r-base-core (2.15.1-4) ...
Setting R_PAPERSIZE_USER default to 'a4'

Creating config file /etc/R/Renviron with new version
Setting up r-base-dev (2.15.1-4) ...
Setting up r-cran-boot (1.3-5-1) ...
[...]
Now we can verify that "R" is installed:
root@a0e8299b59aa:/# R

R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>
Note that you are not limited to installing Debian packages. You may compile programs or libraries from source and install them, edit systemwide configuration files, use other package managers such as @pip@ or @gem@, and perform any other customization necessary to run your program. h2. Create a new image We're now ready to create a new Docker image. First, quit the container, then use @docker commit@ to create a new image from the stopped container. The container id can be found in the default hostname of the container displayed in the prompt, in this case @a0e8299b59aa@:
root@a0e8299b59aa:/# exit
$ docker commit a0e8299b59aa arvados/jobs-with-r
33ea6b87792364cb9989a149c36a31e5a9c8cf96694ba05f66545ad7b842522e
$ docker images
REPOSITORY            TAG                 IMAGE ID            CREATED              VIRTUAL SIZE
arvados/jobs-with-r   latest              33ea6b877923        43 seconds ago       1.607 GB
arvados/jobs          latest              3132168f2acb        22 hours ago         1.314 GB
h2. Upload your image Finally, we are ready to upload the new Docker image to Arvados. Use @arv keep docker@ with the image repository name to upload the image. Without arguments, @arv keep docker@ will print out the list of Docker images in Arvados that are available to you.
$ arv keep docker arvados/jobs-with-r
1591M / 1591M 100.0%
Collection saved as 'Docker image arvados/jobs-with-r:latest 33ea6b877923'
qr1hi-4zz18-3fk2px2ji25nst2
$ arv keep docker
REPOSITORY                      TAG         IMAGE ID      COLLECTION                     CREATED
arvados/jobs-with-r             latest      33ea6b877923  qr1hi-4zz18-3fk2px2ji25nst2    Thu Oct 16 13:58:53 2014
You are now able to specify the runtime environment for your program using the @docker_image@ field of the @runtime_constaints@ section of your pipeline components: {% code 'example_docker' as javascript %} * The @docker_image@ field can be one of: the Docker repository name (as shown above), the Docker image hash, the Arvados collection UUID, or the Arvados collection portable data hash. h2. Share Docker images Docker images are subject to normal Arvados permissions. If wish to share your Docker image with others (or wish to share a pipeline template that uses your Docker image) you will need to use @arv keep docker@ with the @--project-uuid@ option to upload the image to a shared project.
$ arv keep docker --project-uuid zzzzz-j7d0g-u7zg1qdaowykd8d arvados/jobs-with-r