From: Peter Amstutz Date: Thu, 4 Feb 2021 19:36:56 +0000 (-0500) Subject: Add tabs, arvados specific instructions & links to solutions X-Git-Url: https://git.arvados.org/rnaseq-cwl-training.git/commitdiff_plain/75fb4494aee7c80e7139da7403dced296fe9620b Add tabs, arvados specific instructions & links to solutions Arvados-DCO-1.1-Signed-off-by: Peter Amstutz --- diff --git a/Gemfile b/Gemfile index 20ebaa1..fa54c96 100644 --- a/Gemfile +++ b/Gemfile @@ -8,3 +8,7 @@ git_source(:github) { |repo_name| "https://github.com/#{repo_name}" } ruby '>=2.5.5' gem 'github-pages', group: :jekyll_plugins + +group :jekyll_plugins do + gem "jekyll-tabs" +end diff --git a/_config.yml b/_config.yml index 7648b55..7d0172b 100644 --- a/_config.yml +++ b/_config.yml @@ -99,3 +99,6 @@ exclude: # Turn on built-in syntax highlighting. highlighter: rouge + +plugins: + - jekyll-tabs diff --git a/_episodes/02-workflow.md b/_episodes/02-workflow.md index 8ef18cd..6e584e6 100644 --- a/_episodes/02-workflow.md +++ b/_episodes/02-workflow.md @@ -334,14 +334,14 @@ For example: ``` {: .language-yaml } -> ## `Exercise` +> ## Exercise > > Look at `STAR-Align.cwl` and identify the other input parameters that > correspond to the command line arguments used in the source script. > Also identify the output parameter. Use these to write the STAR > step. > -> > ## `Solution` +> > ## Solution > > > > ``` > > STAR: @@ -351,6 +351,7 @@ For example: > > GenomeDir: genome > > ForwardReads: fq > > OutSAMtype: {default: BAM} +> > SortedByCoordinate: {default: true} > > OutSAMunmapped: {default: Within} > > out: [alignment] > > ``` @@ -425,3 +426,7 @@ outputs: outputSource: samtools/bam_sorted_indexed ``` {: .language-yaml } + +> ## Episode solution +> * main.cwl +{: .solution} diff --git a/_episodes/03-running.md b/_episodes/03-running.md index ba37ad3..3dc3176 100644 --- a/_episodes/03-running.md +++ b/_episodes/03-running.md @@ -31,8 +31,11 @@ plain strings that may or may not be file paths. 
Note: if you don't have example sequence data or the STAR index files, see [setup](/setup.html). -main-input.yaml +
+{% tabs input %} +{% tab input generic %} +main-input.yaml ``` fq: class: File @@ -54,13 +57,40 @@ gtf: > ``` > cwl-runner main.cwl main-input.yaml > ``` +> {: .language-bash } > > This may take a few minutes to run, and will print some amount of > logging. The logging you see, how to access other logs, and how to > track workflow progress will depend on your CWL runner platform. +{: .challenge } + +{% endtab %} + +{% tab input arvados %} +main-input.yaml +``` +fq: + class: File + location: keep:9178fe1b80a08a422dbe02adfd439764+925/raw_fastq/Mov10_oe_1.subset.fq + format: http://edamontology.org/format_1930 +genome: + class: Directory + location: keep:02a12ce9e2707610991bd29d38796b57+2912 +gtf: + class: File + location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1-hg19_genes.gtf +``` +{: .language-yaml } + +> ## Running the workflow +> +> If you are using VSCode with Arvados tasks, select `main.cwl` and +> then use the `Run CWL Workflow on Arvados` task. > -> {: .language-bash } {: .challenge } +{% endtab %} +{% endtabs %} +
# Debugging the workflow @@ -127,10 +157,18 @@ Resource requirements you can set include: > {: .challenge } +> ## Episode solution +> * main.cwl +{: .solution} + # Workflow results The CWL runner will print a results JSON object to standard output. It will look something like this (it may include additional fields). +
+{% tabs output %} + +{% tab output generic %} ``` { "bam_sorted_indexed": { @@ -156,6 +194,36 @@ The CWL runner will print a results JSON object to standard output. It will loo } ``` {: .language-yaml } +{% endtab %} + +{% tab output arvados %} +``` +{ + "bam_sorted_indexed": { + "basename": "Aligned.sortedByCoord.out.bam", + "class": "File", + "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Aligned.sortedByCoord.out.bam", + "secondaryFiles": [ + { + "basename": "Aligned.sortedByCoord.out.bam.bai", + "class": "File", + "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Aligned.sortedByCoord.out.bam.bai" + } + ], + "size": 25370695 + }, + "qc_html": { + "basename": "Mov10_oe_1.subset_fastqc.html", + "class": "File", + "location": "keep:2dbaaef5aefd558e37f14280e47091a9+327/Mov10_oe_1.subset_fastqc.html", + "size": 383589 + } +} +``` +{: .language-yaml } +{% endtab %} +{% endtabs %} +
This has a similar structure to `main-input.yaml`. Each output parameter is listed, with the `location` field of each `File` object diff --git a/_episodes/04-commandlinetool.md b/_episodes/04-commandlinetool.md index 07378aa..3c33d66 100644 --- a/_episodes/04-commandlinetool.md +++ b/_episodes/04-commandlinetool.md @@ -264,3 +264,8 @@ add it to our workflow. > > {: .language-yaml } > {: .solution} {: .challenge} + +> ## Episode solution +> * main.cwl +> * featureCounts.cwl +{: .solution} diff --git a/_episodes/05-scatter.md b/_episodes/05-scatter.md index 0881326..9137731 100644 --- a/_episodes/05-scatter.md +++ b/_episodes/05-scatter.md @@ -74,6 +74,12 @@ requirements: > {: .challenge } +> ## Part 1 solution +> * main.cwl +> * alignment.cwl +> * featureCounts.cwl +{: .solution} + # Scattering The "wrapper" step lets us do something useful. We can modify the @@ -135,6 +141,12 @@ requirements: ``` {: .language-yaml } +> ## Part 2 solution +> * main.cwl +> * alignment.cwl +> * featureCounts.cwl +{: .solution} + # Input parameter lists The `fq` parameter needs to be a list. You write a list in yaml by @@ -246,3 +258,9 @@ outputs: > `featurecounts.tsv` file with a column for each bam file. > {: .challenge } + +> ## Episode solution +> * main.cwl +> * alignment.cwl +> * featureCounts.cwl +{: .solution} diff --git a/_episodes/06-expressions.md b/_episodes/06-expressions.md index 9c79ad5..e0bba58 100644 --- a/_episodes/06-expressions.md +++ b/_episodes/06-expressions.md @@ -164,3 +164,10 @@ outputs: > bam files. 
> {: .challenge } + +> ## Episode solution +> * main.cwl +> * alignment.cwl +> * featureCounts.cwl +> * subdirs.cwl +{: .solution} diff --git a/_layouts/base.html b/_layouts/base.html index cc4132d..66afa7a 100644 --- a/_layouts/base.html +++ b/_layouts/base.html @@ -17,6 +17,8 @@ + + {% include favicons.html %} @@ -29,7 +31,7 @@ {% if page.title %}{{ page.title }}{% endif %}{% if page.title and site.title %} – {% endif %}{% if site.title %}{{ site.title }}{% endif %} - + diff --git a/answers/ep2/main.cwl b/assets/answers/ep2/main.cwl similarity index 95% rename from answers/ep2/main.cwl rename to assets/answers/ep2/main.cwl index cb1ef84..df6adbf 100644 --- a/answers/ep2/main.cwl +++ b/assets/answers/ep2/main.cwl @@ -25,6 +25,7 @@ steps: GenomeDir: genome ForwardReads: fq OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} OutSAMunmapped: {default: Within} out: [alignment] diff --git a/answers/ep3/main.cwl b/assets/answers/ep3/main.cwl similarity index 95% rename from answers/ep3/main.cwl rename to assets/answers/ep3/main.cwl index 09af85f..1fc4b56 100644 --- a/answers/ep3/main.cwl +++ b/assets/answers/ep3/main.cwl @@ -25,6 +25,7 @@ steps: GenomeDir: genome ForwardReads: fq OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} OutSAMunmapped: {default: Within} out: [alignment] diff --git a/answers/ep4/featureCounts.cwl b/assets/answers/ep4/featureCounts.cwl similarity index 100% rename from answers/ep4/featureCounts.cwl rename to assets/answers/ep4/featureCounts.cwl diff --git a/answers/ep4/main.cwl b/assets/answers/ep4/main.cwl similarity index 96% rename from answers/ep4/main.cwl rename to assets/answers/ep4/main.cwl index a6a8d3c..3e3fa8e 100644 --- a/answers/ep4/main.cwl +++ b/assets/answers/ep4/main.cwl @@ -24,6 +24,7 @@ steps: GenomeDir: genome ForwardReads: fq OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} OutSAMunmapped: {default: Within} out: [alignment] diff --git a/answers/ep5/part2/alignment.cwl 
b/assets/answers/ep5/part1/alignment.cwl similarity index 96% rename from answers/ep5/part2/alignment.cwl rename to assets/answers/ep5/part1/alignment.cwl index 8ab6bd2..a499e19 100644 --- a/answers/ep5/part2/alignment.cwl +++ b/assets/answers/ep5/part1/alignment.cwl @@ -24,6 +24,7 @@ steps: GenomeDir: genome ForwardReads: fq OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} OutSAMunmapped: {default: Within} out: [alignment] diff --git a/answers/ep5/part1/featureCounts.cwl b/assets/answers/ep5/part1/featureCounts.cwl similarity index 100% rename from answers/ep5/part1/featureCounts.cwl rename to assets/answers/ep5/part1/featureCounts.cwl diff --git a/answers/ep5/part1/main.cwl b/assets/answers/ep5/part1/main.cwl similarity index 100% rename from answers/ep5/part1/main.cwl rename to assets/answers/ep5/part1/main.cwl diff --git a/answers/ep5/part1/alignment.cwl b/assets/answers/ep5/part2/alignment.cwl similarity index 96% rename from answers/ep5/part1/alignment.cwl rename to assets/answers/ep5/part2/alignment.cwl index 8ab6bd2..a499e19 100644 --- a/answers/ep5/part1/alignment.cwl +++ b/assets/answers/ep5/part2/alignment.cwl @@ -24,6 +24,7 @@ steps: GenomeDir: genome ForwardReads: fq OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} OutSAMunmapped: {default: Within} out: [alignment] diff --git a/answers/ep5/part2/featureCounts.cwl b/assets/answers/ep5/part2/featureCounts.cwl similarity index 100% rename from answers/ep5/part2/featureCounts.cwl rename to assets/answers/ep5/part2/featureCounts.cwl diff --git a/answers/ep5/part2/main.cwl b/assets/answers/ep5/part2/main.cwl similarity index 100% rename from answers/ep5/part2/main.cwl rename to assets/answers/ep5/part2/main.cwl diff --git a/answers/ep5/part4/alignment.cwl b/assets/answers/ep5/part4/alignment.cwl similarity index 95% rename from answers/ep5/part4/alignment.cwl rename to assets/answers/ep5/part4/alignment.cwl index b69fa6e..3646a56 100644 --- a/answers/ep5/part4/alignment.cwl 
+++ b/assets/answers/ep5/part4/alignment.cwl @@ -24,6 +24,7 @@ steps: GenomeDir: genome ForwardReads: fq OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} OutSAMunmapped: {default: Within} out: [alignment] diff --git a/answers/ep5/part4/featureCounts.cwl b/assets/answers/ep5/part4/featureCounts.cwl similarity index 100% rename from answers/ep5/part4/featureCounts.cwl rename to assets/answers/ep5/part4/featureCounts.cwl diff --git a/answers/ep5/part4/main.cwl b/assets/answers/ep5/part4/main.cwl similarity index 100% rename from answers/ep5/part4/main.cwl rename to assets/answers/ep5/part4/main.cwl diff --git a/answers/ep6/alignment.cwl b/assets/answers/ep6/alignment.cwl similarity index 95% rename from answers/ep6/alignment.cwl rename to assets/answers/ep6/alignment.cwl index 73d9323..6712aae 100644 --- a/answers/ep6/alignment.cwl +++ b/assets/answers/ep6/alignment.cwl @@ -27,6 +27,7 @@ steps: GenomeDir: genome ForwardReads: fq OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} OutSAMunmapped: {default: Within} ### 1. 
Expressions on step inputs OutFileNamePrefix: {valueFrom: "$(inputs.ForwardReads.nameroot)."} diff --git a/answers/ep6/featureCounts.cwl b/assets/answers/ep6/featureCounts.cwl similarity index 100% rename from answers/ep6/featureCounts.cwl rename to assets/answers/ep6/featureCounts.cwl diff --git a/answers/ep6/main.cwl b/assets/answers/ep6/main.cwl similarity index 100% rename from answers/ep6/main.cwl rename to assets/answers/ep6/main.cwl diff --git a/answers/ep6/subdirs.cwl b/assets/answers/ep6/subdirs.cwl similarity index 100% rename from answers/ep6/subdirs.cwl rename to assets/answers/ep6/subdirs.cwl diff --git a/assets/css/tabs.css b/assets/css/tabs.css new file mode 100644 index 0000000..e9d72f2 --- /dev/null +++ b/assets/css/tabs.css @@ -0,0 +1,48 @@ +.tab { + display: flex; + flex-wrap: wrap; + margin-left: -20px; + padding: 0; + list-style: none; + position: relative; +} + +.tab > * { + flex: none; + padding-left: 20px; + position: relative; +} + +.tab > * > a { + display: block; + text-align: center; + padding: 9px 20px; + color: #999; + border-bottom: 2px solid transparent; + border-bottom-color: transparent; + font-size: 12px; + text-transform: uppercase; + transition: color .1s ease-in-out; + line-height: 20px; +} + +.tab > .active > a { + color:#222; + border-color: #1e87f0; +} + +.tab li a { + text-decoration: none; + cursor: pointer; +} + +.tab-content{ + padding: 0; +} + +.tab-content li { + display: none; +} +.tab-content li.active { + display: initial; +} diff --git a/assets/js/tabs.js b/assets/js/tabs.js new file mode 100644 index 0000000..0148f32 --- /dev/null +++ b/assets/js/tabs.js @@ -0,0 +1,43 @@ +const removeActiveClasses = function (ulElement) { + const lis = ulElement.querySelectorAll('li'); + Array.prototype.forEach.call(lis, function(li) { + li.classList.remove('active'); + }); + } + + const getChildPosition = function (element) { + var parent = element.parentNode; + var i = 0; + for (var i = 0; i < parent.children.length; i++) { + if 
(parent.children[i] === element) { + return i; + } + } + + throw new Error('No parent found'); + } + +window.addEventListener('load', function () { + const tabLinks = document.querySelectorAll('ul.tab li a'); + + Array.prototype.forEach.call(tabLinks, function(link) { + link.addEventListener('click', function (event) { + event.preventDefault(); + + liTab = link.parentNode; + ulTab = liTab.parentNode; + position = getChildPosition(liTab); + if (liTab.className.includes('active')) { + return; + } + + removeActiveClasses(ulTab); + tabContentId = ulTab.getAttribute('data-tab'); + tabContentElement = document.getElementById(tabContentId); + removeActiveClasses(tabContentElement); + + tabContentElement.querySelectorAll('li')[position].classList.add('active'); + liTab.classList.add('active'); + }, false); + }); +}); diff --git a/setup.md b/setup.md index f907ec7..882d4f9 100644 --- a/setup.md +++ b/setup.md @@ -2,28 +2,28 @@ title: Setup --- +
+{% tabs setup %} +{% tab setup generic %} + # Setting up a practice repository We will create a new git repository and import a library of existing tool definitions that will help us build our workflow. -Create a new git repository to hold our workflow with this command: +Create a new empty git repository to hold our workflow with this command: ``` git init rnaseq-cwl-training-exercises ``` - -On Arvados use this: - -``` -git clone https://github.com/arvados/arvados-vscode-cwl-template.git rnaseq-cwl-training-exercises -``` +{: .language-bash } Next, import bio-cwl-tools with this command: ``` git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git ``` +{: .language-bash } # Downloading sample and reference data @@ -32,8 +32,9 @@ Start from your rnaseq-cwl-exercises directory. ``` mkdir rnaseq cd rnaseq -wget --mirror --no-parent --no-host --cut-dirs=1 https://download.pirca.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/ +wget --mirror --no-parent --no-host --cut-dirs=1 https://download.jutro.arvadosapi.com/c=9178fe1b80a08a422dbe02adfd439764+925/ ``` +{: .language-bash } # Downloading or generating STAR index @@ -46,8 +47,9 @@ This is a rather large download (4 GB). Depending on your bandwidth, it may be ``` mkdir hg19-chr1-STAR-index cd hg19-chr1-STAR-index -wget --mirror --no-parent --no-host --cut-dirs=1 https://download.pirca.arvadosapi.com/c=02a12ce9e2707610991bd29d38796b57+2912/ +wget --mirror --no-parent --no-host --cut-dirs=1 https://download.jutro.arvadosapi.com/c=02a12ce9e2707610991bd29d38796b57+2912/ ``` +{: .language-bash } ## Generating @@ -64,12 +66,101 @@ Gtf: location: rnaseq/reference_data/chr1-hg19_genes.gtf Overhang: 99 ``` +{: .language-yaml } Generate the index with your local cwl-runner. 
``` cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml ``` +{: .language-bash } + + +{% endtab %} + +{% tab setup arvados %} + +# Setting up a practice repository + +We will create a new git repository and import a library of existing +tool definitions that will help us build our workflow. + +When using the recommended VSCode environment to develop on Arvados, start by forking this repository: +``` +git clone https://github.com/arvados/arvados-vscode-cwl-template.git rnaseq-cwl-training-exercises +``` +{: .language-bash } + +Next, import bio-cwl-tools with this command: + +``` +git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git +``` +{: .language-bash } + +# Downloading sample and reference data + +> ## Note +> +> You may already have access to this collection. +> +> You can check by going to Workbench and pasting +> `9178fe1b80a08a422dbe02adfd439764+925` into the search box. If you +> arrived at a collection page instead of a "not found" error, then +> you do not need to perform this download step. +{: .callout} + +``` +arv-copy --src jutro 9178fe1b80a08a422dbe02adfd439764+925 +``` +{: .language-bash } + +# Downloading or generating STAR index + +Running STAR requires index files generated from the reference. + +This is a rather large download (4 GB). Depending on your bandwidth, it may be faster to generate it yourself. + +## Downloading + +> ## Note +> +> As above, you can check by going to Workbench and pasting +> `02a12ce9e2707610991bd29d38796b57+2912` into the search box to see +> if you already have access to this collection. 
+{: .callout} + +``` +arv-copy --src jutro 02a12ce9e2707610991bd29d38796b57+2912 +``` +{: .language-bash } + +## Generating + +Create `chr1-star-index.yaml`: + +``` +InputFiles: + - class: File + location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1.fa + format: http://edamontology.org/format_1930 +IndexName: 'hg19-chr1-STAR-index' +Gtf: + class: File + location: keep:9178fe1b80a08a422dbe02adfd439764+925/reference_data/chr1-hg19_genes.gtf +Overhang: 99 +``` +{: .language-yaml } + +Generate the index with arvados-cwl-runner. + +``` +arvados-cwl-runner bio-cwl-tools/STAR/STAR-Index.cwl chr1-star-index.yaml +``` +{: .language-bash } +{% endtab %} +{% endtabs %} +
{% include links.md %}
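
The tab switching added in `assets/js/tabs.js` boils down to two steps: find the index of the clicked tab among its siblings, then mark the tab and the content panel at that index as active. The following is a minimal sketch of that logic, not the committed code. It assumes plain objects stand in for the `<li>` DOM nodes, and the `activate` helper and the boolean `active` flag are hypothetical substitutes for the `classList` calls in the patch.

```javascript
// Hypothetical stand-alone sketch: plain objects replace the <li> DOM nodes.

// Returns the index of `element` among its parent's children.
function getChildPosition(element) {
  const parent = element.parentNode;
  const position = Array.prototype.indexOf.call(parent.children, element);
  if (position === -1) {
    throw new Error('Element is not a child of its parentNode');
  }
  return position;
}

// Marks only the item at `position` as active. This mirrors calling
// removeActiveClasses() and then classList.add('active') in tabs.js,
// but uses a boolean flag instead of a CSS class.
function activate(items, position) {
  items.forEach(function (item, index) {
    item.active = index === position;
  });
}

// Build a two-tab header and a matching two-panel content list.
const tabList = { children: [] };
const genericTab = { parentNode: tabList, active: true };
const arvadosTab = { parentNode: tabList, active: false };
tabList.children.push(genericTab, arvadosTab);
const panels = [{ active: true }, { active: false }];

// Simulate a click on the "arvados" tab.
const clicked = getChildPosition(arvadosTab);
activate(tabList.children, clicked);
activate(panels, clicked);
```

Declaring every variable with `const` keeps the handler's state local; the committed handler assigns `liTab`, `ulTab`, `position`, and the others without a declaration, which makes them implicit globals.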