{
"name":"run-command example pipeline",
"components":{
- "bwa-mem": {
+ "bwa-mem": {
"script": "run-command",
"script_version": "master",
"repository": "arvados",
"script_parameters": {
"command": [
- "bwa",
+ "$(dir $(bwa_collection))/bwa",
"mem",
"-t",
"$(node.cores)",
+ "-R",
+ "@RG\\\tID:group_id\\\tPL:illumina\\\tSM:sample_id",
"$(glob $(dir $(reference_collection))/*.fasta)",
"$(glob $(dir $(sample))/*_1.fastq)",
"$(glob $(dir $(sample))/*_2.fastq)"
],
- "task.stdout": "$(basename $(glob $(dir $(sample))/*_1.fastq)).sam",
"reference_collection": {
"required": true,
"dataclass": "Collection"
},
+ "bwa_collection": {
+ "required": true,
+ "dataclass": "Collection",
+ "default": "39c6f22d40001074f4200a72559ae7eb+5745"
+ },
"sample": {
"required": true,
"dataclass": "Collection"
- }
+ },
+ "task.stdout": "$(basename $(glob $(dir $(sample))/*_1.fastq)).sam"
}
}
}
h1(#login). Using SSH to log into an Arvados VM
-To see a list of virtual machines that you have access to and determine the name and login information, click on the dropdown menu icon <span class="fa fa-lg fa-user"></span> <span class="caret"></span> in the upper right corner of the top navigation menu to access the user settings menu and click on the menu item *Virtual machines* to go to the Virtual machines page. This page lists the virtual machines you can access. The *hostname* column lists the name of each available VM. The *logins* column will have a list of comma separated values of the form @you@. In this guide the hostname will be *_shell_* and the login will be *_you_*. Replace these with your hostname and login name as appropriate.
+To see a list of virtual machines that you have access to and determine the name and login information, click on the dropdown menu icon <span class="fa fa-lg fa-user"></span> <span class="caret"></span> in the upper right corner of the top navigation menu to access the user settings menu and click on the menu item *Virtual machines* to go to the Virtual machines page. This page lists the virtual machines you can access. The *Host name* column lists the name of each available VM. The *Login name* column contains a list of comma-separated values of the form @you@. In this guide the hostname will be *_shell_* and the login will be *_you_*. Replace these with your hostname and login name as appropriate.
"-t",
"$(node.cores)",
"-R",
- "@RG\\tID:group_id\\tPL:illumina\\tSM:sample_id",
+ "@RG\\\tID:group_id\\\tPL:illumina\\\tSM:sample_id",
"$(glob $(dir $(reference_collection))/*.fasta)",
"$(glob $(dir $(sample))/*_1.fastq)",
"$(glob $(dir $(sample))/*_2.fastq)"
<div class="col-sm-6" style="border-left: solid; border-width: 1px">
<p><strong>Quickstart</strong>
<p>
- Try any pipeline from the <a href="https://dev.arvados.org/projects/arvados/wiki/Public_Pipelines_and_Datasets">list of public pipelines</a>. For instance, the <a href="http://curover.se/pathomap">Pathomap Pipeline</a> links to these <a href="https://dev.arvados.org/projects/arvados/wiki/pathomap_tutorial/">step-by-step instructions</a> for trying Arvados out right in your browser using Curoverse's <a href="http://lp.curoverse.com/beta-signup/">public Arvados instance</a>.
+ Try any pipeline from the <a href="https://cloud.curoverse.com/projects/public">list of public pipelines</a>. For instance, the <a href="http://curover.se/pathomap">Pathomap Pipeline</a> links to these <a href="https://dev.arvados.org/projects/arvados/wiki/pathomap_tutorial/">step-by-step instructions</a> for trying Arvados out right in your browser using Curoverse's <a href="http://lp.curoverse.com/beta-signup/">public Arvados instance</a>.
</p>
<!--<p>-->
<!--<ol>-->
<notextile>
<pre><code>~$ <span class="userinput">ruby -e 'puts rand(2**400).to_s(36)'</span>
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-~$ <span class="userinput">RAILS_ENV=production bundle exec rails console</span>
+~$ <span class="userinput">cd /var/www/arvados-sso/current</span>
+/var/www/arvados-sso/current$ <span class="userinput">RAILS_ENV=production bundle exec rails console</span>
:001 > <span class="userinput">c = Client.new</span>
:002 > <span class="userinput">c.name = "joshid"</span>
:003 > <span class="userinput">c.app_id = "arvados-server"</span>
title: Accessing an Arvados VM with SSH - Unix Environments
...
-This document is for accessing an arvados VM using SSK keys in Unix environments (Linux, OS X, Cygwin). If you would like to access VM through your browser, please visit the "Accessing an Arvados VM with Webshell":vm-login-with-webshell.html page. If you are using a Windows environment, please visit the "Accessing an Arvados VM with SSH - Windows Environments":ssh-access-windows.html page.
+This document is for accessing an Arvados VM using SSH keys in Unix environments (Linux, OS X, Cygwin). If you would like to access the VM through your browser, please visit the "Accessing an Arvados VM with Webshell":vm-login-with-webshell.html page. If you are using a Windows environment, please visit the "Accessing an Arvados VM with SSH - Windows Environments":ssh-access-windows.html page.
{% include 'ssh_intro' %}
title: Accessing an Arvados VM with SSH - Windows Environments
...
-This document is for accessing an arvados VM using SSK keys in Windows environments. If you would like to use to access VM through your browser, please visit the "Accessing an Arvados VM with Webshell":vm-login-with-webshell.html page. If you are using a Unix environment (Linux, OS X, Cygwin), please visit the "Accessing an Arvados VM with SSH - Unix Environments":ssh-access-unix.html page.
+This document is for accessing an Arvados VM using SSH keys in Windows environments. If you would like to access the VM through your browser, please visit the "Accessing an Arvados VM with Webshell":vm-login-with-webshell.html page. If you are using a Unix environment (Linux, OS X, Cygwin), please visit the "Accessing an Arvados VM with SSH - Unix Environments":ssh-access-unix.html page.
{% include 'ssh_intro' %}
# After downloading PuTTY and installing it, you should have a PuTTY folder in @C:\Program Files\@ or @C:\Program Files (x86)\@ (if you are using a 64-bit operating system).
# Open the Control Panel.
# Select _Advanced System Settings_, and choose _Environment Variables_.
-If you are using newer systems like Windows 7, you may use the following to open _Advanced System Settings_. Open Control Panel. Click on _System and Security_. Click on _System_. Click on _Advanced system settings_ and choose _Environment Variables..._
+If you are using newer systems like Windows 10, you may use the following to open _Advanced System Settings_. Open Control Panel. Click on _System and Security_. Click on _System_. Click on _Advanced system settings_ and choose _Environment Variables..._
# Under system variables, find and edit @PATH@.
# If you installed PuTTY in @C:\Program Files\PuTTY\@, add the following to the end of PATH:
<code>;C:\Program Files\PuTTY</code>
# Open PuTTY from the Start Menu.
# On the Session screen set the Host Name (or IP address) to “shell”, which is the hostname listed in the _Virtual Machines_ page.
# On the Session screen set the Port to “22”.
-# On the Connection %(rarr)→% Data screen set the Auto-login username to the username listed in the *logins* column on the Arvados Workbench _Settings %(rarr)→% Virtual machines_ page.
+# On the Connection %(rarr)→% Data screen set the Auto-login username to the username listed in the *Login name* column on the Arvados Workbench _Virtual machines_ page.
# On the Connection %(rarr)→% Proxy screen set the Proxy Type to “Local”.
# On the Connection %(rarr)→% Proxy screen in the “Telnet command, or local proxy command” box enter:
<code>plink -P 2222 turnout@switchyard.{{ site.arvados_api_host }} %host</code>
Webshell gives you access to an Arvados virtual machine from your browser with no additional setup.
-In the Arvados Workbench, click on the dropdown menu icon <span class="fa fa-lg fa-user"></span> <span class="caret"></span> in the upper right corner of the top navigation menu to access the user settings menu, and click on the menu item *Virtual machines* to see the list of virtual machines you can access.
+In the Arvados Workbench, click on the dropdown menu icon <span class="fa fa-lg fa-user"></span> <span class="caret"></span> in the upper right corner of the top navigation menu to access the user settings menu, and click on the menu item *Virtual machines* to see the list of virtual machines you can access. If you do not have access to any virtual machines, please click on <span class="btn btn-sm btn-primary">Send request for shell access</span> or send an email to "support@curoverse.com":mailto:support@curoverse.com.
Each row in the Virtual Machines panel lists the hostname of the VM, along with a <code>Log in as *you*</code> button under the column "Web shell beta". Clicking on this button will open up a webshell terminal for you in a new browser tab and log you in.
* Storing and querying metadata about genome sequence files, such as human subjects and their phenotypic traits using the "Arvados Metadata Database.":{{site.baseurl}}/user/topics/tutorial-trait-search.html
* Accessing, organizing, and sharing data, pipelines and results using the "Arvados Workbench":{{site.baseurl}}/user/getting_started/workbench.html web application.
-The examples in this guide use the Arvados instance located at <a href="{{site.arvados_workbench_host}}/" target="_blank">{{site.arvados_workbench_host}}</a>. If you are using a different Arvados instance replace @{{ site.arvados_workbench_host }}@ with your private instance in all of the examples in this guide.
-
-Curoverse maintains a public Arvados instance located at <a href="https://workbench.qr1hi.arvadosapi.com/" target="_blank">https://workbench.qr1hi.arvadosapi.com/</a>. You must have an account in order to use this service. If you would like to request an account, please send an email to "arvados@curoverse.com":mailto:arvados@curoverse.com.
+The examples in this guide use the public Arvados instance located at <a href="{{site.arvados_workbench_host}}/" target="_blank">{{site.arvados_workbench_host}}</a>. If you are using a different Arvados instance replace @{{ site.arvados_workbench_host }}@ with your private instance in all of the examples in this guide.
h2. Typographic conventions
# Start from the *Workbench Dashboard*. You can access the Dashboard by clicking on *<i class="fa fa-lg fa-fw fa-dashboard"></i> Dashboard* in the upper left corner of any Workbench page.
# Click on the <span class="btn btn-sm btn-primary"><i class="fa fa-fw fa-gear"></i> Run a pipeline...</span> button. This will open a dialog box titled *Choose a pipeline to run*.
-# Click to open the *All projects <span class="caret"></span>* menu. Under the *Projects shared with me* header, select *<i class="fa fa-fw fa-share-alt"></i> Arvados Tutorial*.
+# In the search box, type in *Tutorial align using bwa mem*.
# Select *<i class="fa fa-fw fa-gear"></i> Tutorial align using bwa mem* and click the <span class="btn btn-sm btn-primary" >Next: choose inputs <i class="fa fa-fw fa-arrow-circle-right"></i></span> button. This will create a new pipeline in your *Home* project and will open it. You can now supply the inputs for the pipeline.
# The first input parameter to the pipeline is *"reference_collection" parameter for run-command script in bwa-mem component*. Click the <span class="btn btn-sm btn-primary">Choose</span> button beneath that header. This will open a dialog box titled *Choose a dataset for "reference_collection" parameter for run-command script in bwa-mem component*.
-# Once again, open the *All projects <span class="caret"></span>* menu and select *<i class="fa fa-fw fa-share-alt"></i> Arvados Tutorial*. Select *<i class="fa fa-fw fa-archive"></i> Tutorial chromosome 19 reference* and click the <span class="btn btn-sm btn-primary" >OK</span> button.
+# Open the *Home <span class="caret"></span>* menu and select *All Projects*. Search for and select *<i class="fa fa-fw fa-archive"></i> Tutorial chromosome 19 reference* and click the <span class="btn btn-sm btn-primary" >OK</span> button.
# Repeat the previous two steps to set the *"sample" parameter for run-command script in bwa-mem component* parameter to *<i class="fa fa-fw fa-archive"></i> Tutorial sample exome*.
# Click on the <span class="btn btn-sm btn-primary" >Run <i class="fa fa-fw fa-play"></i></span> button. The page updates to show you that the pipeline has been submitted to run on the Arvados cluster.
-# After the pipeline starts running, you can track the progress by watching log messages from jobs. This page refreshes automatically. You will see a <span class="label label-success">complete</span> label under the *job* column when the pipeline completes successfully.
+# After the pipeline starts running, you can track the progress by watching log messages from jobs. This page refreshes automatically. You will see a <span class="label label-success">complete</span> label when the pipeline completes successfully.
# Click on the *Output* link to see the results of the job. This will load a new page listing the output files from this pipeline. You'll see the output SAM file from the alignment tool under the *Files* tab.
# Click on the <span class="btn btn-sm btn-info"><i class="fa fa-download"></i></span> download button to the right of the SAM file to download your results.
kc, _ := MakeKeepClient(&arv)
kc.Want_replicas = 2
+ kc.Retries = 0
arv.ApiToken = "abc123"
localRoots := make(map[string]string)
writableLocalRoots := make(map[string]string)
kc, _ := MakeKeepClient(&arv)
arv.ApiToken = "abc123"
kc.SetServiceRoots(map[string]string{"x": ks.url}, nil, nil)
+ kc.Retries = 0
r, n, url2, err := kc.Get(hash)
errNotFound, _ := err.(ErrNotFound)
}
kc.SetServiceRoots(localRoots, writableLocalRoots, nil)
+ kc.Retries = 0
// This test works only if one of the failing services is
// attempted before the succeeding service. Otherwise,
c.Check(err2, Equals, nil)
c.Check(content, DeepEquals, st.body[0:len(st.body)-1])
}
+
+type FailThenSucceedPutHandler struct {
+ handled chan string
+ count int
+ successhandler StubPutHandler
+}
+
+func (h *FailThenSucceedPutHandler) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
+ if h.count == 0 {
+ resp.WriteHeader(500)
+ h.count += 1
+ h.handled <- fmt.Sprintf("http://%s", req.Host)
+ } else {
+ h.successhandler.ServeHTTP(resp, req)
+ }
+}
+
+func (s *StandaloneSuite) TestPutBRetry(c *C) {
+ st := &FailThenSucceedPutHandler{make(chan string, 1), 0,
+ StubPutHandler{
+ c,
+ Md5String("foo"),
+ "abc123",
+ "foo",
+ make(chan string, 5)}}
+
+ arv, _ := arvadosclient.MakeArvadosClient()
+ kc, _ := MakeKeepClient(&arv)
+
+ kc.Want_replicas = 2
+ arv.ApiToken = "abc123"
+ localRoots := make(map[string]string)
+ writableLocalRoots := make(map[string]string)
+
+ ks := RunSomeFakeKeepServers(st, 2)
+
+ for i, k := range ks {
+ localRoots[fmt.Sprintf("zzzzz-bi6l4-fakefakefake%03d", i)] = k.url
+ writableLocalRoots[fmt.Sprintf("zzzzz-bi6l4-fakefakefake%03d", i)] = k.url
+ defer k.listener.Close()
+ }
+
+ kc.SetServiceRoots(localRoots, writableLocalRoots, nil)
+
+ hash, replicas, err := kc.PutB([]byte("foo"))
+
+ c.Check(err, Equals, nil)
+ c.Check(hash, Equals, "")
+ c.Check(replicas, Equals, 2)
+}
// Take the hash of locator and timestamp in order to identify this
// specific transaction in log statements.
- requestId := fmt.Sprintf("%x", md5.Sum([]byte(locator+time.Now().String())))[0:8]
+ requestId := fmt.Sprintf("%x", md5.Sum([]byte(hash+time.Now().String())))[0:8]
// Calculate the ordering for uploading to servers
sv := NewRootSorter(this.WritableLocalRoots(), hash).GetSortedRoots()
replicasPerThread = remaining_replicas
}
- for remaining_replicas > 0 {
- for active*replicasPerThread < remaining_replicas {
- // Start some upload requests
- if next_server < len(sv) {
- log.Printf("[%v] Begin upload %s to %s", requestId, hash, sv[next_server])
- go this.uploadToKeepServer(sv[next_server], hash, tr.MakeStreamReader(), upload_status, expectedLength, requestId)
- next_server += 1
- active += 1
- } else {
- if active == 0 {
- return locator, (this.Want_replicas - remaining_replicas), InsufficientReplicasError
+ retriesRemaining := 1 + this.Retries
+ var retryServers []string
+
+ for retriesRemaining > 0 {
+ retriesRemaining -= 1
+ next_server = 0
+ retryServers = []string{}
+ for remaining_replicas > 0 {
+ for active*replicasPerThread < remaining_replicas {
+ // Start some upload requests
+ if next_server < len(sv) {
+ log.Printf("[%v] Begin upload %s to %s", requestId, hash, sv[next_server])
+ go this.uploadToKeepServer(sv[next_server], hash, tr.MakeStreamReader(), upload_status, expectedLength, requestId)
+ next_server += 1
+ active += 1
} else {
- break
+ if active == 0 && retriesRemaining == 0 {
+ return locator, (this.Want_replicas - remaining_replicas), InsufficientReplicasError
+ } else {
+ break
+ }
+ }
+ }
+ log.Printf("[%v] Replicas remaining to write: %v active uploads: %v",
+ requestId, remaining_replicas, active)
+
+ // Now wait for something to happen.
+ if active > 0 {
+ status := <-upload_status
+ active -= 1
+
+ if status.statusCode == 200 {
+ // good news!
+ remaining_replicas -= status.replicas_stored
+ locator = status.response
+ } else if status.statusCode == 0 || status.statusCode == 408 || status.statusCode == 429 ||
+ (status.statusCode >= 500 && status.statusCode != 503) {
+ // Timeout, too many requests, or other server side failure
+ // Do not retry when status code is 503, which means the keep server is full
+ retryServers = append(retryServers, status.url[0:strings.LastIndex(status.url, "/")])
}
+ } else {
+ break
}
}
- log.Printf("[%v] Replicas remaining to write: %v active uploads: %v",
- requestId, remaining_replicas, active)
-
- // Now wait for something to happen.
- status := <-upload_status
- active -= 1
- if status.statusCode == 200 {
- // good news!
- remaining_replicas -= status.replicas_stored
- locator = status.response
- }
+ sv = retryServers
}
return locator, this.Want_replicas, nil
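The retry decision in the hunk above can be read as a small predicate over the status code. The sketch below restates that condition on its own so it is easier to check case by case; @isRetryable@ is a hypothetical name, since the patch inlines the check rather than defining a helper.

```go
package main

import "fmt"

// isRetryable (hypothetical helper) restates the condition used in the
// patch: status 0 (no HTTP response, e.g. a timeout), 408, 429, and any
// 5xx other than 503 are queued for another attempt. 503 is excluded
// because, per the patch comment, it means the keep server is full.
func isRetryable(statusCode int) bool {
	return statusCode == 0 || statusCode == 408 || statusCode == 429 ||
		(statusCode >= 500 && statusCode != 503)
}

func main() {
	for _, code := range []int{0, 200, 404, 408, 429, 500, 503, 504} {
		fmt.Printf("status %d retryable=%v\n", code, isRetryable(code))
	}
}
```

Keeping the rule in one place like this would make it easy to unit test, though whether to factor it out is a design choice the patch does not make.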
import logging
import os
import re
+import socket
import types
import apiclient
# previous call did not succeed, so this is slightly
# risky.
return self.orig_http_request(uri, **kwargs)
+ except socket.error:
+ # This is the one case where httplib2 doesn't close the
+ # underlying connection first. Close all open connections,
+ # expecting this object only has the one connection to the API
+ # server. This is safe because httplib2 reopens connections when
+ # needed.
+ _logger.debug("Retrying API request after socket error", exc_info=True)
+ for conn in self.connections.itervalues():
+ conn.close()
+ return self.orig_http_request(uri, **kwargs)
def _patch_http_request(http, api_token):
http.arvados_api_token = api_token
else:
sp = os.path.split(root)
return is_in_collection(sp[0], os.path.join(sp[1], branch))
- except IOError, OSError:
+ except (IOError, OSError):
return (None, None)
# Determine the project to place the output of this command by searching upward
else:
sp = os.path.split(root)
return determine_project(sp[0], current_user)
- except IOError, OSError:
+ except (IOError, OSError):
return current_user
# Determine if string corresponds to a file, and if that file is part of a
srcConfigFile := flags.String(
"src",
"",
- "Source configuration filename. May be either a pathname to a config file, or (for example) 'foo' as shorthand for $HOME/.config/arvados/foo.conf")
+ "Source configuration filename. May be either a pathname to a config file, or (for example) 'foo' as shorthand for $HOME/.config/arvados/foo.conf. This file is expected to specify the values for ARVADOS_API_TOKEN, ARVADOS_API_HOST, ARVADOS_API_HOST_INSECURE, and ARVADOS_BLOB_SIGNING_KEY for the source.")
dstConfigFile := flags.String(
"dst",
"",
- "Destination configuration filename. May be either a pathname to a config file, or (for example) 'foo' as shorthand for $HOME/.config/arvados/foo.conf")
+ "Destination configuration filename. May be either a pathname to a config file, or (for example) 'foo' as shorthand for $HOME/.config/arvados/foo.conf. This file is expected to specify the values for ARVADOS_API_TOKEN, ARVADOS_API_HOST, and ARVADOS_API_HOST_INSECURE for the destination.")
srcKeepServicesJSON := flags.String(
"src-keep-services-json",