title: "Customizing the Crunch runtime environment"
This page describes how to use "Docker":https://www.docker.com/ to customize the runtime environment (the programs, libraries, and other dependencies needed to run a job) in which a Crunch script runs.
This page will demonstrate how to:
# Fetch the arvados/jobs Docker image
# Manually install additional software into the container
# Create a new custom image
# Upload that image to Arvados for use by Crunch jobs
{% include 'tutorial_expectations' %}
h2. Fetching a starting image
First, download the latest image from the Docker registry:
<pre><code>$ <span class="userinput">docker pull arvados/jobs</span>
Pulling repository arvados/jobs
3132168f2acb: Download complete
a42b7f2c59b6: Download complete
e5afdf26a7ae: Download complete
5cae48636278: Download complete
7a4f91b70558: Download complete
a04a275c1fd6: Download complete
c433ff206a22: Download complete
b2e539b45f96: Download complete
073b2581c6be: Download complete
593915af19dc: Download complete
32260b35005e: Download complete
6e5b860c1cde: Download complete
95f0bfb43d4d: Download complete
c7fd77eedb96: Download complete
0d7685aafd00: Download complete
</code></pre>
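To confirm that the pull succeeded before moving on, you can check that the image is present in the local Docker image store. The following is a minimal sketch; @image_present@ is a hypothetical helper name introduced here for illustration, not part of Docker or Arvados.

```shell
# image_present (hypothetical helper): succeed if the named image exists locally.
# "docker images --quiet REPOSITORY" prints one image ID per matching image,
# or nothing when no image matches.
image_present() {
    [ -n "$(docker images --quiet "$1")" ]
}

# Usage (requires a running Docker daemon):
#   image_present arvados/jobs && echo "arvados/jobs is available locally"
```

The helper relies only on whether @docker images@ prints anything, so it does not depend on the exact layout of the image listing.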
h2. Installing new packages
Next, start a container from the arvados/jobs image using @docker run@, passing the program you want to run (in this case the bash shell). The @--user root@ option runs the shell as root, which is needed to install packages.
<pre><code>$ <span class="userinput">docker run --interactive --tty --user root arvados/jobs /bin/bash</span>
</code></pre>
Next, update the package list using @apt-get update@.
57 <pre><code>root@a0e8299b59aa:/# <span class="userinput">apt-get update</span>
58 Get:1 http://apt.arvados.org wheezy Release.gpg [490 B]
59 Get:2 http://apt.arvados.org wheezy Release [1568 B]
60 Get:3 http://apt.arvados.org wheezy/main amd64 Packages [34.6 kB]
61 Get:4 http://ftp.us.debian.org wheezy Release.gpg [1655 B]
62 Get:5 http://ftp.us.debian.org wheezy-updates Release.gpg [836 B]
63 Get:6 http://ftp.us.debian.org wheezy Release [168 kB]
64 Ign http://apt.arvados.org wheezy/main Translation-en
65 Get:7 http://security.debian.org wheezy/updates Release.gpg [836 B]
66 Get:8 http://security.debian.org wheezy/updates Release [102 kB]
67 Get:9 http://ftp.us.debian.org wheezy-updates Release [124 kB]
68 Get:10 http://ftp.us.debian.org wheezy/main amd64 Packages [5841 kB]
69 Get:11 http://security.debian.org wheezy/updates/main amd64 Packages [218 kB]
70 Get:12 http://security.debian.org wheezy/updates/main Translation-en [123 kB]
71 Hit http://ftp.us.debian.org wheezy/main Translation-en
72 Hit http://ftp.us.debian.org wheezy-updates/main amd64 Packages/DiffIndex
73 Hit http://ftp.us.debian.org wheezy-updates/main Translation-en/DiffIndex
74 Fetched 6617 kB in 5s (1209 kB/s)
75 Reading package lists... Done
In this example, we will install the "R" statistical language from the Debian package @r-base-core@. Use @apt-get install@:
82 <pre><code>root@a0e8299b59aa:/# <span class="userinput">apt-get install r-base-core</span>
83 Reading package lists... Done
84 Building dependency tree
85 Reading state information... Done
86 The following extra packages will be installed:
88 libxv1 libxxf86dga1 libxxf86vm1 r-base-core r-base-dev r-base-html r-cran-boot r-cran-class r-cran-cluster r-cran-codetools
92 The following NEW packages will be installed:
94 libxv1 libxxf86dga1 libxxf86vm1 r-base r-base-core r-base-dev r-base-html r-cran-boot r-cran-class r-cran-cluster
96 0 upgraded, 107 newly installed, 0 to remove and 9 not upgraded.
97 Need to get 88.2 MB of archives.
98 After this operation, 219 MB of additional disk space will be used.
99 Do you want to continue [Y/n]? y
101 Get:85 http://ftp.us.debian.org/debian/ wheezy/main r-base-core amd64 2.15.1-4 [20.6 MB]
102 Get:86 http://ftp.us.debian.org/debian/ wheezy/main r-base-dev all 2.15.1-4 [3882 B]
103 Get:87 http://ftp.us.debian.org/debian/ wheezy/main r-cran-boot all 1.3-5-1 [472 kB]
105 Fetched 88.2 MB in 2min 17s (642 kB/s)
106 Extracting templates from packages: 100%
107 Preconfiguring packages ...
109 Unpacking r-base-core (from .../r-base-core_2.15.1-4_amd64.deb) ...
110 Selecting previously unselected package r-base-dev.
111 Unpacking r-base-dev (from .../r-base-dev_2.15.1-4_all.deb) ...
112 Selecting previously unselected package r-cran-boot.
113 Unpacking r-cran-boot (from .../r-cran-boot_1.3-5-1_all.deb) ...
115 Setting up r-base-core (2.15.1-4) ...
116 Setting R_PAPERSIZE_USER default to 'a4'
118 Creating config file /etc/R/Renviron with new version
119 Setting up r-base-dev (2.15.1-4) ...
120 Setting up r-cran-boot (1.3-5-1) ...
Now we can verify that "R" is installed:
128 <pre><code>root@a0e8299b59aa:/# <span class="userinput">R</span>
130 R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
131 Copyright (C) 2012 The R Foundation for Statistical Computing
133 Platform: x86_64-pc-linux-gnu (64-bit)
135 R is free software and comes with ABSOLUTELY NO WARRANTY.
136 You are welcome to redistribute it under certain conditions.
137 Type 'license()' or 'licence()' for distribution details.
139 R is a collaborative project with many contributors.
140 Type 'contributors()' for more information and
141 'citation()' on how to cite R or R packages in publications.
143 Type 'demo()' for some demos, 'help()' for on-line help, or
144 'help.start()' for an HTML browser interface to help.
145 Type 'q()' to quit R.
Note that you are not limited to installing Debian packages. You may compile C programs or libraries from source and install them, edit system-wide configuration files, use other package managers such as @pip@ or @gem@, and perform any other customization necessary to run your program.
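Whichever installation method you use, it can help to confirm that the resulting commands are on the @PATH@ before leaving the container. The following is a minimal sketch meant to be run inside the container; @check_tools@ is a hypothetical helper, and the tool names in the usage comment are only examples.

```shell
# check_tools (hypothetical helper): print a line for every command in the
# argument list that cannot be found on PATH. Returns 0 if all were found,
# 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (inside the container):
#   check_tools R pip gem || echo "install the missing tools before committing"
```

This only checks that each command can be found; it does not verify versions or that the program actually runs.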
h2. Create a new image
We're now ready to create a new Docker image. First, quit the container, then use @docker commit@ to create a new image from the stopped container. The container ID is the container's default hostname, which is displayed in the prompt; in this case, @a0e8299b59aa@:
158 <pre><code>root@a0e8299b59aa:/# <span class="userinput">exit</span>
159 $ <span class="userinput">docker commit a0e8299b59aa arvados/jobs-with-r</span>
160 33ea6b87792364cb9989a149c36a31e5a9c8cf96694ba05f66545ad7b842522e
161 $ <span class="userinput">docker images</span>
162 REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
163 arvados/jobs-with-r latest 33ea6b877923 43 seconds ago 1.607 GB
164 arvados/jobs latest 3132168f2acb 22 hours ago 1.314 GB
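The interactive steps above can also be scripted. The sketch below assumes the installation commands can run unattended (for example, with @apt-get install --yes@). @build_custom_image@ and the @CONTAINER_NAME@ variable are hypothetical names introduced for illustration.

```shell
# build_custom_image (hypothetical helper)
#   $1      base image to start from
#   $2      repository name for the new image
#   $3...   shell command line to run as root inside the container
# CONTAINER_NAME optionally overrides the temporary container name.
build_custom_image() {
    base_image=$1
    new_image=$2
    shift 2
    container_name=${CONTAINER_NAME:-customize-$$}

    # Run the installation commands non-interactively in a named container.
    docker run --name "$container_name" --user root "$base_image" \
        /bin/bash -c "$*" || return 1

    # Snapshot the stopped container as a new image, then remove the container.
    docker commit "$container_name" "$new_image" || return 1
    docker rm "$container_name" >/dev/null
}

# Usage (requires a running Docker daemon):
#   build_custom_image arvados/jobs arvados/jobs-with-r \
#       'apt-get update && apt-get install --yes r-base-core'
```

Naming the container means you do not have to copy its ID from the prompt. If the installation command fails, the function returns early and leaves the container in place so that you can inspect it.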
h2. Upload your image
Finally, we are ready to upload the new Docker image to Arvados. Use @arv keep docker@ with the image repository name to upload the image. Without arguments, @arv keep docker@ will print out the list of Docker images in Arvados that are available to you.
<pre><code>$ <span class="userinput">arv keep docker arvados/jobs-with-r</span>
Collection saved as 'Docker image arvados/jobs-with-r:latest 33ea6b877923'
qr1hi-4zz18-3fk2px2ji25nst2
$ <span class="userinput">arv keep docker</span>
REPOSITORY            TAG       IMAGE ID       COLLECTION                    CREATED
arvados/jobs-with-r   latest    33ea6b877923   qr1hi-4zz18-3fk2px2ji25nst2   Thu Oct 16 13:58:53 2014
</code></pre>
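If a script needs the collection UUID of an uploaded image, it can be extracted from this listing. The sketch below assumes the whitespace-separated column layout shown above, with the repository, tag, image ID, and collection in the first four fields of each row; @collection_for_image@ is a hypothetical helper.

```shell
# collection_for_image (hypothetical helper): read an "arv keep docker"
# listing on standard input and print the COLLECTION column for the given
# repository and tag (the tag defaults to "latest").
collection_for_image() {
    awk -v repo="$1" -v tag="${2:-latest}" \
        'NR > 1 && $1 == repo && $2 == tag { print $4 }'
}

# Usage (requires the Arvados command line tools):
#   arv keep docker | collection_for_image arvados/jobs-with-r
```

The header row is skipped by line number, so the two-word "IMAGE ID" heading does not affect the field positions used for the data rows.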
You can now specify the runtime environment for your program using the @docker_image@ field of the @runtime_constraints@ section of your pipeline components:
{% code 'example_docker' as javascript %}