---
title: Setup
---

{% capture generic_tab_content %}

# Setting up a practice repository

We will create a new git repository and import a library of existing
tool definitions that will help us build our workflow.

Create a new empty git repository to hold our workflow with this command:

```
git init rnaseq-cwl-training-exercises
```
{: .language-bash }

Next, change into the new repository and import bio-cwl-tools as a git submodule:

```
cd rnaseq-cwl-training-exercises
git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git
```
{: .language-bash }
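
`git submodule add` clones the library into a `bio-cwl-tools` directory and records it in a `.gitmodules` file at the top of the repository. One way to confirm the import worked is to read that file back. This is a minimal sketch; `list_submodule_urls` is a hypothetical helper name introduced here for illustration:

```bash
# List every submodule recorded in a .gitmodules file together with its URL.
# Usage: list_submodule_urls [path-to-.gitmodules]   (defaults to ./.gitmodules)
list_submodule_urls() {
  git config --file "${1:-.gitmodules}" --get-regexp '^submodule\..*\.url$'
}

# Only query the file when it exists, i.e. after a submodule has been added.
if [ -f .gitmodules ]; then
  list_submodule_urls
fi
```

If the import succeeded, the output should include a line starting with `submodule.bio-cwl-tools.url`. Running `git submodule status` gives a similar overview, including the checked-out commit.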

# Downloading sample and reference data

Start from your `rnaseq-cwl-training-exercises` directory.

```
mkdir rnaseq
cd rnaseq
wget --mirror --no-parent --no-host --cut-dirs=1 https://download.jutro.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/
```
{: .language-bash }
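
The `wget` flags determine where the mirrored files land: `--no-host-directories` (abbreviated in the command above as `--no-host`) drops the host name from the local path, and `--cut-dirs=1` drops the first remaining directory, here the `c=...` collection component. The sketch below only imitates that path mapping with shell string operations to show the expected layout; it is not how `wget` is implemented, and `mirror_local_path` is a hypothetical helper name:

```bash
# Illustration only: compute the local path that a mirrored URL maps to
# when the host directory is dropped and N leading directories are cut.
# Usage: mirror_local_path <url> <cut-dirs>
mirror_local_path() {
  local url="$1" cut="$2" path i
  path="${url#*://}"     # drop the scheme, e.g. "https://"
  path="${path#*/}"      # drop the host name (--no-host-directories)
  i=0
  while [ "$i" -lt "$cut" ]; do
    path="${path#*/}"    # drop one leading directory (--cut-dirs)
    i=$((i + 1))
  done
  printf '%s\n' "$path"
}

mirror_local_path "https://download.jutro.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1.fa" 1
# prints: reference_data/chr1.fa
```

Because the download runs inside `rnaseq/`, that file ends up at `rnaseq/reference_data/chr1.fa` relative to the repository root.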

# Downloading or generating STAR index

Running STAR requires index files generated from the reference.

The prebuilt index is rather large (4 GB). Depending on your bandwidth, it may be faster to generate it yourself.

## Downloading

```
mkdir hg19-chr1-STAR-index
cd hg19-chr1-STAR-index
wget --mirror --no-parent --no-host --cut-dirs=1 https://download.jutro.arvadosapi.com/c=02a12ce9e2707610991bd29d38796b57+2912/
```
{: .language-bash }

## Generating

Create `chr1-star-index.yaml` in the top level of your repository (the `location` paths are relative to this file):

```
InputFiles:
  - class: File
    location: rnaseq/reference_data/chr1.fa
    format: http://edamontology.org/format_1930
IndexName: 'hg19-chr1-STAR-index'
Gtf:
  class: File
  location: rnaseq/reference_data/chr1-hg19_genes.gtf
Overhang: 99
```
{: .language-yaml }
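
Before generating the index, it can help to confirm that both files named in `chr1-star-index.yaml` are present. This is a minimal sketch, assuming you run it from the directory that contains the YAML file and that the sample data was downloaded into `rnaseq/`; `check_inputs` is a hypothetical helper name, not part of any CWL tool:

```bash
# Report every argument that is not an existing regular file.
# Returns 0 when all files exist, 1 when at least one is missing.
check_inputs() {
  local rc=0 f
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}

# The two paths are the `location` values from chr1-star-index.yaml.
check_inputs rnaseq/reference_data/chr1.fa rnaseq/reference_data/chr1-hg19_genes.gtf \
  || echo "Download the sample and reference data before generating the index."
```

The check prints nothing when both files are in place.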

Generate the index with your local cwl-runner.

```
cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml
```
{: .language-bash }

{% endcapture %}

{% capture arvados_tab_content %}

# Setting up a practice repository

We will create a new git repository and import a library of existing
tool definitions that will help us build our workflow.

When using the recommended [VSCode environment to develop on Arvados](https://doc.arvados.org/v2.3/user/cwl/arvados-vscode-training.html),
start by forking the
[arvados-vscode-cwl-template](https://github.com/arvados/arvados-vscode-cwl-template)
repository.

1. VSCode: On the left sidebar, choose `Explorer` ![](assets/img/Explorer.png)
1. Select `Clone Repository` and enter [https://github.com/arvados/arvados-vscode-cwl-template](https://github.com/arvados/arvados-vscode-cwl-template), then click `Open`
1. If asked `Would you like to open the cloned repository?` choose `Open`

Next, import the [bio-cwl-tools](https://github.com/common-workflow-library/bio-cwl-tools) repository:

1. VSCode: In the top menu, select `Terminal` &rarr; `New Terminal`
1. This will open a terminal window in the lower part of the screen
1. Run this command:
```
git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git
```
{: .language-bash }

# Downloading sample and reference data

> ## Note
>
> You may already have access to this collection.
>
> You can check by going to Workbench and pasting
> `9178fe1b80a08a422dbe02adfd439764+925` into the search box.  If you
> arrive at a collection page instead of a "not found" error, then
> you do not need to perform this download step.
{: .callout}

1. Go to [https://workbench2.jutro.arvadosapi.com](https://workbench2.jutro.arvadosapi.com) and sign in; this will create an account
2. Go to `Get an API token` under the user menu
3. Log in to the shell node of your Arvados cluster
4. On the shell node, copy the host name and token for the `jutro` cluster into the file `~/.config/arvados/jutro.conf` as described on the page for [arv-copy](https://doc.arvados.org/user/topics/arv-copy.html).
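
The configuration file from step 4 holds two settings, `ARVADOS_API_HOST` and `ARVADOS_API_TOKEN`. Below is a minimal sketch of creating it from the shell. It assumes the API host for the `jutro` cluster is `jutro.arvadosapi.com`, which is inferred from the Workbench URL, so check it against the values Workbench shows you. `write_cluster_conf` and the `JUTRO_API_TOKEN` variable are hypothetical names used only for this example:

```bash
# Write an arv-copy source-cluster configuration file.
# Usage: write_cluster_conf <conf-file> <api-host> <api-token>
write_cluster_conf() {
  local conf_file="$1" api_host="$2" api_token="$3"
  mkdir -p "$(dirname "$conf_file")"
  # A newly created file is readable by its owner only, since it holds a secret.
  (
    umask 077
    printf 'ARVADOS_API_HOST=%s\nARVADOS_API_TOKEN=%s\n' "$api_host" "$api_token" > "$conf_file"
  )
}

# Assumption: the token copied from Workbench was exported as JUTRO_API_TOKEN.
# Nothing is written when the variable is unset.
if [ -n "${JUTRO_API_TOKEN:-}" ]; then
  write_cluster_conf "$HOME/.config/arvados/jutro.conf" jutro.arvadosapi.com "$JUTRO_API_TOKEN"
fi
```

The file name matters: `arv-copy --src jutro` looks for a configuration named after the cluster, which is why the file is called `jutro.conf`.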

Now, on the shell node of your Arvados cluster, use `arv-copy` to copy the collection:

```
arv-copy --src jutro 9178fe1b80a08a422dbe02adfd439764+925
```
{: .language-bash }

# Downloading or generating STAR index

Running STAR requires index files generated from the reference.

The prebuilt index is rather large (4 GB). Depending on your bandwidth, it may be faster to generate it yourself.

## Downloading

> ## Note
>
> As above, you can check by going to Workbench and pasting
> `02a12ce9e2707610991bd29d38796b57+2912` into the search box to see
> if you already have access to this collection.
{: .callout}

Use `arv-copy` to copy the collection:

```
arv-copy --src jutro 02a12ce9e2707610991bd29d38796b57+2912
```
{: .language-bash }

## Generating

Create `chr1-star-index.yaml`:

```
InputFiles:
  - class: File
    location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1.fa
    format: http://edamontology.org/format_1930
IndexName: 'hg19-chr1-STAR-index'
Gtf:
  class: File
  location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1-hg19_genes.gtf
Overhang: 99
```
{: .language-yaml }

Generate the index with arvados-cwl-runner.

```
arvados-cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml
```
{: .language-bash }

{% endcapture %}

<div class="tabbed">
  <ul class="tab">
      <li><a href="#section-arvados">arvados</a></li>
      <li><a href="#section-generic">generic</a></li>
  </ul>

  <section id="section-arvados">{{ arvados_tab_content | markdownify}}</section>
  <section id="section-generic">{{ generic_tab_content | markdownify}}</section>
</div>

{% include links.md %}