---
title: Setup
---

<div>
{% tabs setup %}
{% tab setup generic %}

# Setting up a practice repository

We will create a new git repository and import a library of existing
tool definitions that will help us build our workflow.

Create a new empty git repository to hold our workflow with this command:

```
git init rnaseq-cwl-training-exercises
```
{: .language-bash }

Next, change into the new repository and import bio-cwl-tools as a git submodule:

```
cd rnaseq-cwl-training-exercises
git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git
```
{: .language-bash }

# Downloading sample and reference data

Start from your rnaseq-cwl-training-exercises directory.

```
mkdir rnaseq
cd rnaseq
wget --mirror --no-parent --no-host-directories --cut-dirs=1 https://download.jutro.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/
```
{: .language-bash }
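
The `wget` options above determine where the downloaded files land, which matters because later input files refer to paths such as `rnaseq/reference_data/chr1.fa`. The following is a minimal sketch of that mapping, not part of `wget` itself: `wget_local_path` is a hypothetical helper that imitates the effect of dropping the host directory and cutting leading directory components from a URL.

```bash
# Hypothetical helper (illustration only): predict the local path wget uses
# for a URL when the host directory is dropped and N leading directories are cut.
wget_local_path() {
    wlp_url=$1
    wlp_cut=$2
    # Drop the scheme and the host name (the effect of dropping the host directory).
    wlp_rest=${wlp_url#*://}
    wlp_rest=${wlp_rest#*/}
    # Drop N leading directory components (the effect of --cut-dirs=N).
    wlp_i=0
    while [ "$wlp_i" -lt "$wlp_cut" ]; do
        wlp_rest=${wlp_rest#*/}
        wlp_i=$((wlp_i + 1))
    done
    printf '%s\n' "$wlp_rest"
}

# With one directory cut, the collection identifier disappears from the path.
wget_local_path "https://download.jutro.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1.fa" 1
# prints reference_data/chr1.fa
```
{: .language-bash }

Because the download is run from inside `rnaseq`, that file ends up at `rnaseq/reference_data/chr1.fa` relative to the repository root.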

# Downloading or generating STAR index

Running STAR requires index files generated from the reference.
You can either download a prebuilt index or generate it yourself.

The prebuilt index is a rather large download (4 GB).  Depending on your bandwidth, it may be faster to generate it yourself.

## Downloading

Start from your rnaseq-cwl-training-exercises directory (run `cd ..` first if you are still inside `rnaseq`).

```
mkdir hg19-chr1-STAR-index
cd hg19-chr1-STAR-index
wget --mirror --no-parent --no-host-directories --cut-dirs=1 https://download.jutro.arvadosapi.com/c=02a12ce9e2707610991bd29d38796b57+2912/
```
{: .language-bash }

## Generating

Create `chr1-star-index.yaml`:

```
InputFiles:
  - class: File
    location: rnaseq/reference_data/chr1.fa
    format: http://edamontology.org/format_1930
IndexName: 'hg19-chr1-STAR-index'
Gtf:
  class: File
  location: rnaseq/reference_data/chr1-hg19_genes.gtf
Overhang: 99
```
{: .language-yaml }

Generate the index with your local cwl-runner.

```
cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml
```
{: .language-bash }
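
Index generation fails if the input files named in `chr1-star-index.yaml` are not where the file says they are, so it can help to check first. This is a minimal sketch, assuming you run it from the directory that contains `chr1-star-index.yaml`; `check_inputs` is a hypothetical helper, not part of cwl-runner.

```bash
# Hypothetical pre-flight check: report every argument that is not an existing file.
check_inputs() {
    ci_missing=0
    for ci_file in "$@"; do
        if [ ! -f "$ci_file" ]; then
            echo "missing: $ci_file" >&2
            ci_missing=1
        fi
    done
    # Returns 0 when all files exist, 1 otherwise.
    return "$ci_missing"
}

# Paths copied from the `location` fields of chr1-star-index.yaml.
if check_inputs rnaseq/reference_data/chr1.fa rnaseq/reference_data/chr1-hg19_genes.gtf; then
    echo "inputs present"
else
    echo "inputs missing"
fi
```
{: .language-bash }

If a file is reported missing, revisit the sample and reference data download before running the index step.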


{% endtab %}

{% tab setup arvados %}

# Setting up a practice repository

We will create a new git repository and import a library of existing
tool definitions that will help us build our workflow.

When using the recommended VSCode environment to develop on Arvados, start by cloning this template repository:

```
git clone https://github.com/arvados/arvados-vscode-cwl-template.git rnaseq-cwl-training-exercises
```
{: .language-bash }

Next, change into the new repository and import bio-cwl-tools as a git submodule:

```
cd rnaseq-cwl-training-exercises
git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git
```
{: .language-bash }

# Downloading sample and reference data

> ## Note
>
> You may already have access to this collection.
>
> You can check by going to Workbench and pasting
> `9178fe1b80a08a422dbe02adfd439764+925` into the search box.  If you
> arrive at a collection page instead of a "not found" error, then
> you do not need to perform this download step.
{: .callout}

Otherwise, copy the collection from the jutro cluster:

```
arv-copy --src jutro 9178fe1b80a08a422dbe02adfd439764+925
```
{: .language-bash }
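
The collection identifiers used on this page share one shape: 32 hexadecimal digits, a `+`, and a decimal number. A truncated copy and paste is easy to miss, so the sketch below checks that shape before you search in Workbench or run `arv-copy`. It assumes the identifiers always follow this pattern; `looks_like_collection_id` is a hypothetical helper and does not contact Arvados or confirm that the collection exists.

```bash
# Hypothetical helper: succeed only if the argument is 32 lowercase hex digits,
# a plus sign, and one or more decimal digits.
looks_like_collection_id() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{32}\+[0-9]+$'
}

# The two identifiers used on this page.
for id in 9178fe1b80a08a422dbe02adfd439764+925 02a12ce9e2707610991bd29d38796b57+2912; do
    if looks_like_collection_id "$id"; then
        echo "ok: $id"
    else
        echo "malformed: $id"
    fi
done
```
{: .language-bash }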

# Downloading or generating STAR index

Running STAR requires index files generated from the reference.
You can either copy a prebuilt index or generate it yourself.

The prebuilt index is rather large (4 GB).  Depending on your bandwidth, it may be faster to generate it yourself.

## Downloading

> ## Note
>
> As above, you can check by going to Workbench and pasting
> `02a12ce9e2707610991bd29d38796b57+2912` into the search box to see
> if you already have access to this collection.
{: .callout}

```
arv-copy --src jutro 02a12ce9e2707610991bd29d38796b57+2912
```
{: .language-bash }

## Generating

Create `chr1-star-index.yaml`:

```
InputFiles:
  - class: File
    location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1.fa
    format: http://edamontology.org/format_1930
IndexName: 'hg19-chr1-STAR-index'
Gtf:
  class: File
  location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1-hg19_genes.gtf
Overhang: 99
```
{: .language-yaml }

Generate the index with arvados-cwl-runner.

```
arvados-cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml
```
{: .language-bash }

{% endtab %}
{% endtabs %}
</div>

{% include links.md %}