ruby '>=2.5.5'
gem 'github-pages', group: :jekyll_plugins
+
+group :jekyll_plugins do
+ gem "jekyll-tabs"
+end
# Turn on built-in syntax highlighting.
highlighter: rouge
+
+plugins:
+ - jekyll-tabs
```
{: .language-yaml }
-> ## `Exercise`
+> ## Exercise
>
> Look at `STAR-Align.cwl` and identify the other input parameters that
> correspond to the command line arguments used in the source script.
> Also identify the output parameter. Use these to write the STAR
> step.
>
-> > ## `Solution`
+> > ## Solution
> >
> > ```
> > STAR:
> > GenomeDir: genome
> > ForwardReads: fq
> > OutSAMtype: {default: BAM}
+> > SortedByCoordinate: {default: true}
> > OutSAMunmapped: {default: Within}
> > out: [alignment]
> > ```
outputSource: samtools/bam_sorted_indexed
```
{: .language-yaml }
+
+> ## Episode solution
+> * <a href="{% link assets/answers/ep2/main.cwl %}">main.cwl</a>
+{: .solution}
Note: if you don't have example sequence data or the STAR index files, see [setup](/setup.html).
-main-input.yaml
+<div>
+{% tabs input %}
+{% tab input generic %}
+main-input.yaml
```
fq:
class: File
> ```
> cwl-runner main.cwl main-input.yaml
> ```
+> {: .language-bash }
>
> This may take a few minutes to run, and will print some amount of
> logging. The logging you see, how to access other logs, and how to
> track workflow progress will depend on your CWL runner platform.
+{: .challenge }
+
+{% endtab %}
+
+{% tab input arvados %}
+main-input.yaml
+```
+fq:
+ class: File
+ location: keep:9178fe1b80a08a422dbe02adfd439764+925/raw_fastq/Mov10_oe_1.subset.fq
+ format: http://edamontology.org/format_1930
+genome:
+ class: Directory
+ location: keep:02a12ce9e2707610991bd29d38796b57+2912
+gtf:
+ class: File
+ location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1-hg19_genes.gtf
+```
+{: .language-yaml }
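
The Arvados input file above refers to data with `keep:` references rather than local paths. A minimal sketch of a sanity check for such a reference follows. It assumes a reference has the shape `keep:<32 hex digits>+<size>` with an optional `/path` suffix, which is inferred only from the examples in this lesson. The `isKeepReference` helper is hypothetical and is not part of any Arvados tool.

```javascript
// Assumption: a Keep reference looks like keep:<32 hex digits>+<size>[/path],
// the shape used by the examples in this lesson.
const KEEP_REFERENCE = /^keep:[0-9a-f]{32}\+\d+(\/.*)?$/;

// Hypothetical helper for illustration only.
function isKeepReference(location) {
  return KEEP_REFERENCE.test(location);
}

// A collection reference with the prefix passes the check.
console.log(isKeepReference('keep:02a12ce9e2707610991bd29d38796b57+2912'));
// The same kind of value without the "keep:" prefix does not.
console.log(isKeepReference('02a12ce9e2707610991bd29d38796b57+2912/some/file'));
```

A check like this can catch a `location` that was pasted without its `keep:` prefix before the workflow is submitted.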
+
+> ## Running the workflow
+>
+> If you are using VSCode with Arvados tasks, select `main.cwl` and
+> then use the `Run CWL Workflow on Arvados` task.
>
-> {: .language-bash }
{: .challenge }
+{% endtab %}
+{% endtabs %}
+</div>
# Debugging the workflow
>
{: .challenge }
+> ## Episode solution
+> * <a href="{% link assets/answers/ep3/main.cwl %}">main.cwl</a>
+{: .solution}
+
# Workflow results
The CWL runner will print a results JSON object to standard output. It will look something like this (it may include additional fields).
+<div>
+{% tabs output %}
+
+{% tab output generic %}
```
{
"bam_sorted_indexed": {
}
```
{: .language-yaml }
+{% endtab %}
+
+{% tab output arvados %}
+```
+{
+ "bam_sorted_indexed": {
+ "basename": "Aligned.sortedByCoord.out.bam",
+ "class": "File",
+ "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Aligned.sortedByCoord.out.bam",
+ "secondaryFiles": [
+ {
+ "basename": "Aligned.sortedByCoord.out.bam.bai",
+ "class": "File",
+ "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Aligned.sortedByCoord.out.bam.bai"
+ }
+ ],
+ "size": 25370695
+ },
+ "qc_html": {
+ "basename": "Mov10_oe_1.subset_fastqc.html",
+ "class": "File",
+ "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Mov10_oe_1.subset_fastqc.html",
+ "size": 383589
+ }
+}
+```
+{: .language-yaml }
+{% endtab %}
+{% endtabs %}
+</div>
This has a similar structure to `main-input.yaml`. Each output
parameter is listed, with the `location` field of each `File` object
> > {: .language-yaml }
> {: .solution}
{: .challenge}
+
+> ## Episode solution
+> * <a href="{% link assets/answers/ep4/main.cwl %}">main.cwl</a>
+> * <a href="{% link assets/answers/ep4/featureCounts.cwl %}">featureCounts.cwl</a>
+{: .solution}
>
{: .challenge }
+> ## Part 1 solution
+> * <a href="{% link assets/answers/ep5/part1/main.cwl %}">main.cwl</a>
+> * <a href="{% link assets/answers/ep5/part1/alignment.cwl %}">alignment.cwl</a>
+> * <a href="{% link assets/answers/ep5/part1/featureCounts.cwl %}">featureCounts.cwl</a>
+{: .solution}
+
# Scattering
The "wrapper" step lets us do something useful. We can modify the
```
{: .language-yaml }
+> ## Part 2 solution
+> * <a href="{% link assets/answers/ep5/part2/main.cwl %}">main.cwl</a>
+> * <a href="{% link assets/answers/ep5/part2/alignment.cwl %}">alignment.cwl</a>
+> * <a href="{% link assets/answers/ep5/part2/featureCounts.cwl %}">featureCounts.cwl</a>
+{: .solution}
+
# Input parameter lists
The `fq` parameter needs to be a list. You write a list in yaml by
> `featurecounts.tsv` file with a column for each bam file.
>
{: .challenge }
+
+> ## Episode solution
+> * <a href="{% link assets/answers/ep5/part4/main.cwl %}">main.cwl</a>
+> * <a href="{% link assets/answers/ep5/part4/alignment.cwl %}">alignment.cwl</a>
+> * <a href="{% link assets/answers/ep5/part4/featureCounts.cwl %}">featureCounts.cwl</a>
+{: .solution}
> bam files.
>
{: .challenge }
+
+> ## Episode solution
+> * <a href="{% link assets/answers/ep6/main.cwl %}">main.cwl</a>
+> * <a href="{% link assets/answers/ep6/alignment.cwl %}">alignment.cwl</a>
+> * <a href="{% link assets/answers/ep6/featureCounts.cwl %}">featureCounts.cwl</a>
+> * <a href="{% link assets/answers/ep6/subdirs.cwl %}">subdirs.cwl</a>
+{: .solution}
<link rel="stylesheet" type="text/css" href="{{ relative_root_path }}/assets/css/lesson.css" />
<link rel="stylesheet" type="text/css" href="{{ relative_root_path }}/assets/css/syntax.css" />
<link rel="license" href="#license-info" />
+ <script type="text/javascript" src="{{ relative_root_path }}/assets/js/tabs.js"></script>
+ <link rel="stylesheet" href="{{ relative_root_path }}/assets/css/tabs.css">
{% include favicons.html %}
<title>
{% if page.title %}{{ page.title }}{% endif %}{% if page.title and site.title %} – {% endif %}{% if site.title %}{{ site.title }}{% endif %}
- </title>
+ </title>
</head>
<body>
GenomeDir: genome
ForwardReads: fq
OutSAMtype: {default: BAM}
+ SortedByCoordinate: {default: true}
OutSAMunmapped: {default: Within}
out: [alignment]
GenomeDir: genome
ForwardReads: fq
OutSAMtype: {default: BAM}
+ SortedByCoordinate: {default: true}
OutSAMunmapped: {default: Within}
out: [alignment]
GenomeDir: genome
ForwardReads: fq
OutSAMtype: {default: BAM}
+ SortedByCoordinate: {default: true}
OutSAMunmapped: {default: Within}
out: [alignment]
GenomeDir: genome
ForwardReads: fq
OutSAMtype: {default: BAM}
+ SortedByCoordinate: {default: true}
OutSAMunmapped: {default: Within}
out: [alignment]
GenomeDir: genome
ForwardReads: fq
OutSAMtype: {default: BAM}
+ SortedByCoordinate: {default: true}
OutSAMunmapped: {default: Within}
out: [alignment]
GenomeDir: genome
ForwardReads: fq
OutSAMtype: {default: BAM}
+ SortedByCoordinate: {default: true}
OutSAMunmapped: {default: Within}
out: [alignment]
GenomeDir: genome
ForwardReads: fq
OutSAMtype: {default: BAM}
+ SortedByCoordinate: {default: true}
OutSAMunmapped: {default: Within}
### 1. Expressions on step inputs
OutFileNamePrefix: {valueFrom: "$(inputs.ForwardReads.nameroot)."}
--- /dev/null
+.tab {
+ display: flex;
+ flex-wrap: wrap;
+ margin-left: -20px;
+ padding: 0;
+ list-style: none;
+ position: relative;
+}
+
+.tab > * {
+ flex: none;
+ padding-left: 20px;
+ position: relative;
+}
+
+.tab > * > a {
+ display: block;
+ text-align: center;
+ padding: 9px 20px;
+ color: #999;
+ border-bottom: 2px solid transparent;
+ border-bottom-color: transparent;
+ font-size: 12px;
+ text-transform: uppercase;
+ transition: color .1s ease-in-out;
+ line-height: 20px;
+}
+
+.tab > .active > a {
+ color:#222;
+ border-color: #1e87f0;
+}
+
+.tab li a {
+ text-decoration: none;
+ cursor: pointer;
+}
+
+.tab-content{
+ padding: 0;
+}
+
+.tab-content li {
+ display: none;
+}
+.tab-content li.active {
+ display: initial;
+}
--- /dev/null
+const removeActiveClasses = function (ulElement) {
+ const lis = ulElement.querySelectorAll('li');
+ Array.prototype.forEach.call(lis, function(li) {
+ li.classList.remove('active');
+ });
+ }
+
+ const getChildPosition = function (element) {
+ const parent = element.parentNode;
+ // Return the index of `element` among its parent's children.
+ for (let i = 0; i < parent.children.length; i++) {
+ if (parent.children[i] === element) {
+ return i;
+ }
+ }
+
+ throw new Error("Element not found among its parent's children");
+ }
+
+window.addEventListener('load', function () {
+ const tabLinks = document.querySelectorAll('ul.tab li a');
+
+ Array.prototype.forEach.call(tabLinks, function(link) {
+ link.addEventListener('click', function (event) {
+ event.preventDefault();
+
+ // Declare locals with const so they do not leak as implicit globals.
+ const liTab = link.parentNode;
+ const ulTab = liTab.parentNode;
+ const position = getChildPosition(liTab);
+ if (liTab.classList.contains('active')) {
+ return;
+ }
+
+ removeActiveClasses(ulTab);
+ const tabContentId = ulTab.getAttribute('data-tab');
+ const tabContentElement = document.getElementById(tabContentId);
+ removeActiveClasses(tabContentElement);
+
+ tabContentElement.querySelectorAll('li')[position].classList.add('active');
+ liTab.classList.add('active');
+ }, false);
+ });
+});
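
The click handler in `tabs.js` pairs a tab with its content by position: the index of the clicked `li` inside `ul.tab` selects the `li` at the same index in the content list. A minimal, DOM-free sketch of that pairing follows. The `activateByPosition` function and the plain objects with an `active` flag are hypothetical stand-ins for the real elements and their `active` class, used only to illustrate the logic.

```javascript
// Hypothetical stand-in for the DOM logic: each item is a plain object
// whose `active` flag plays the role of the "active" CSS class.
function activateByPosition(tabs, contents, clicked) {
  const position = tabs.indexOf(clicked);
  if (position === -1) {
    throw new Error('Clicked tab is not in the tab list');
  }
  if (position >= contents.length) {
    throw new Error('No content item at position ' + position);
  }

  // Clear the active flag everywhere, then set it on the matching pair.
  tabs.forEach(function (tab) { tab.active = false; });
  contents.forEach(function (content) { content.active = false; });
  tabs[position].active = true;
  contents[position].active = true;
  return position;
}

const tabs = [
  { name: 'generic', active: true },
  { name: 'arvados', active: false }
];
const contents = [{ active: true }, { active: false }];

// Simulate a click on the second tab.
const position = activateByPosition(tabs, contents, tabs[1]);
console.log(position, contents.map(function (c) { return c.active; }));
```

Because the pairing is purely positional, the tab list and the content list must stay in the same order, with one content item per tab.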
title: Setup
---
+<div>
+{% tabs setup %}
+{% tab setup generic %}
+
# Setting up a practice repository
We will create a new git repository and import a library of existing
tool definitions that will help us build our workflow.
-Create a new git repository to hold our workflow with this command:
+Create a new empty git repository to hold our workflow with this command:
```
git init rnaseq-cwl-training-exercises
```
-
-On Arvados use this:
-
-```
-git clone https://github.com/arvados/arvados-vscode-cwl-template.git rnaseq-cwl-training-exercises
-```
+{: .language-bash }
Next, import bio-cwl-tools with this command:
```
git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git
```
+{: .language-bash }
# Downloading sample and reference data
```
mkdir rnaseq
cd rnaseq
-wget --mirror --no-parent --no-host --cut-dirs=1 https://download.pirca.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/
+wget --mirror --no-parent --no-host --cut-dirs=1 https://download.jutro.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/
```
+{: .language-bash }
# Downloading or generating STAR index
```
mkdir hg19-chr1-STAR-index
cd hg19-chr1-STAR-index
-wget --mirror --no-parent --no-host --cut-dirs=1 https://download.pirca.arvadosapi.com/c=02a12ce9e2707610991bd29d38796b57+2912/
+wget --mirror --no-parent --no-host --cut-dirs=1 https://download.jutro.arvadosapi.com/c=02a12ce9e2707610991bd29d38796b57+2912/
```
+{: .language-bash }
## Generating
location: rnaseq/reference_data/chr1-hg19_genes.gtf
Overhang: 99
```
+{: .language-yaml }
Generate the index with your local cwl-runner.
```
cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml
```
+{: .language-bash }
+
+
+{% endtab %}
+
+{% tab setup arvados %}
+
+# Setting up a practice repository
+
+We will create a new git repository and import a library of existing
+tool definitions that will help us build our workflow.
+
+When using the recommended VSCode environment to develop on Arvados, start by cloning this template repository:
+```
+git clone https://github.com/arvados/arvados-vscode-cwl-template.git rnaseq-cwl-training-exercises
+```
+{: .language-bash }
+
+Next, import bio-cwl-tools with this command:
+
+```
+git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git
+```
+{: .language-bash }
+
+# Downloading sample and reference data
+
+> ## Note
+>
+> You may already have access to this collection.
+>
+> You can check by going to Workbench and pasting
+> `9178fe1b80a08a422dbe02adfd439764+925` into the search box. If you
+> arrive at a collection page instead of a "not found" error, then
+> you do not need to perform this download step.
+{: .callout}
+
+```
+arv-copy --src jutro 9178fe1b80a08a422dbe02adfd439764+925
+```
+{: .language-bash }
+
+# Downloading or generating STAR index
+
+Running STAR requires index files generated from the reference.
+
+This is a rather large download (4 GB). Depending on your bandwidth, it may be faster to generate it yourself.
+
+## Downloading
+
+> ## Note
+>
+> As above, you can check by going to Workbench and pasting
+> `02a12ce9e2707610991bd29d38796b57+2912` into the search box to see
+> if you already have access to this collection.
+{: .callout}
+
+```
+arv-copy --src jutro 02a12ce9e2707610991bd29d38796b57+2912
+```
+{: .language-bash }
+
+## Generating
+
+Create `chr1-star-index.yaml`:
+
+```
+InputFiles:
+ - class: File
+ location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1.fa
+ format: http://edamontology.org/format_1930
+IndexName: 'hg19-chr1-STAR-index'
+Gtf:
+ class: File
+ location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1-hg19_genes.gtf
+Overhang: 99
+```
+{: .language-yaml }
+
+Generate the index with arvados-cwl-runner.
+
+```
+arvados-cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml
+```
+{: .language-bash }
+{% endtab %}
+{% endtabs %}
+</div>
{% include links.md %}