Migrating to carpentries WIP
author Peter Amstutz <peter.amstutz@curii.com>
Tue, 26 Jan 2021 21:36:18 +0000 (16:36 -0500)
committer Peter Amstutz <peter.amstutz@curii.com>
Tue, 26 Jan 2021 21:36:18 +0000 (16:36 -0500)
Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>

44 files changed:
AUTHORS [new file with mode: 0644]
CITATION [new file with mode: 0644]
CONTRIBUTING.md [new file with mode: 0644]
Gemfile
README.md
README.md.old [new file with mode: 0644]
_config.yml [new file with mode: 0644]
_episodes/01-introduction.md [new file with mode: 0644]
_episodes/02-workflow.md [moved from lesson1/lesson1.md with 58% similarity]
_episodes/03-running.md [moved from lesson2/lesson2.md with 95% similarity]
_episodes/04-commandlinetool.md [moved from lesson3/lesson3.md with 95% similarity]
_episodes/05-scatter.md [moved from lesson4/lesson4.md with 95% similarity]
_episodes/06-expressions.md [moved from lesson5/lesson5.md with 95% similarity]
_episodes/07-resources.md [moved from lesson6/lesson6.md with 82% similarity]
_extras/about.md [new file with mode: 0644]
_extras/discuss.md [new file with mode: 0644]
_extras/figures.md [new file with mode: 0644]
_extras/guide.md [new file with mode: 0644]
answers/ep2/bio-cwl-tools [new symlink]
answers/ep2/main.cwl [moved from lesson1/answers/main.cwl with 100% similarity]
answers/ep4/bio-cwl-tools [new symlink]
answers/ep4/featureCounts.cwl [moved from lesson3/answers/featureCounts.cwl with 100% similarity]
answers/ep4/main.cwl [moved from lesson3/answers/main.cwl with 100% similarity]
answers/ep5/part1/alignment.cwl [moved from lesson4/answers/part1/alignment.cwl with 100% similarity]
answers/ep5/part1/bio-cwl-tools [new symlink]
answers/ep5/part1/featureCounts.cwl [moved from lesson4/answers/part1/featureCounts.cwl with 100% similarity]
answers/ep5/part1/main.cwl [moved from lesson4/answers/part1/main.cwl with 100% similarity]
answers/ep5/part2/alignment.cwl [moved from lesson4/answers/part2/alignment.cwl with 100% similarity]
answers/ep5/part2/bio-cwl-tools [new symlink]
answers/ep5/part2/featureCounts.cwl [moved from lesson4/answers/part2/featureCounts.cwl with 100% similarity]
answers/ep5/part2/main.cwl [moved from lesson4/answers/part2/main.cwl with 100% similarity]
answers/ep5/part4/alignment.cwl [moved from lesson4/answers/part4/alignment.cwl with 100% similarity]
answers/ep5/part4/bio-cwl-tools [new symlink]
answers/ep5/part4/featureCounts.cwl [moved from lesson4/answers/part4/featureCounts.cwl with 100% similarity]
answers/ep5/part4/main.cwl [moved from lesson4/answers/part4/main.cwl with 100% similarity]
answers/ep6/alignment.cwl [moved from lesson5/answers/alignment.cwl with 100% similarity]
answers/ep6/bio-cwl-tools [new symlink]
answers/ep6/featureCounts.cwl [moved from lesson5/answers/featureCounts.cwl with 100% similarity]
answers/ep6/main.cwl [moved from lesson5/answers/main.cwl with 100% similarity]
answers/ep6/subdirs.cwl [moved from lesson5/answers/subdirs.cwl with 100% similarity]
assets/img/RNAseqWorkflow.png [moved from lesson1/RNAseqWorkflow.png with 100% similarity]
index.md [new file with mode: 0644]
reference.md [new file with mode: 0644]
setup.md [new file with mode: 0644]

diff --git a/AUTHORS b/AUTHORS
new file mode 100644 (file)
index 0000000..41d9a60
--- /dev/null
+++ b/AUTHORS
@@ -0,0 +1 @@
+Peter Amstutz <peter.amstutz@curii.com>
\ No newline at end of file
diff --git a/CITATION b/CITATION
new file mode 100644 (file)
index 0000000..56ece3c
--- /dev/null
+++ b/CITATION
@@ -0,0 +1 @@
+FIXME: describe how to cite this lesson.
\ No newline at end of file
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644 (file)
index 0000000..f5158b0
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,151 @@
+# Contributing
+
+[The Carpentries][c-site] ([Software Carpentry][swc-site], [Data Carpentry][dc-site], and [Library Carpentry][lc-site]) are open source projects,
+and we welcome contributions of all kinds:
+new lessons,
+fixes to existing material,
+bug reports,
+and reviews of proposed changes are all welcome.
+
+## Contributor Agreement
+
+By contributing,
+you agree that we may redistribute your work under [our license](LICENSE.md).
+In exchange,
+we will address your issues and/or assess your change proposal as promptly as we can,
+and help you become a member of our community.
+Everyone involved in [The Carpentries][c-site]
+agrees to abide by our [code of conduct](CODE_OF_CONDUCT.md).
+
+## How to Contribute
+
+The easiest way to get started is to file an issue
+to tell us about a spelling mistake,
+some awkward wording,
+or a factual error.
+This is a good way to introduce yourself
+and to meet some of our community members.
+
+1.  If you do not have a [GitHub][github] account,
+    you can [send us comments by email][email].
+    However,
+    we will be able to respond more quickly if you use one of the other methods described below.
+
+2.  If you have a [GitHub][github] account,
+    or are willing to [create one][github-join],
+    but do not know how to use Git,
+    you can report problems or suggest improvements by [creating an issue][issues].
+    This allows us to assign the item to someone
+    and to respond to it in a threaded discussion.
+
+3.  If you are comfortable with Git,
+    and would like to add or change material,
+    you can submit a pull request (PR).
+    Instructions for doing this are [included below](#using-github).
+
+## Where to Contribute
+
+1.  If you wish to change this lesson,
+    please work in <https://github.com/swcarpentry/FIXME>,
+    which can be viewed at <https://swcarpentry.github.io/FIXME>.
+
+2.  If you wish to change the example lesson,
+    please work in <https://github.com/carpentries/lesson-example>,
+    which documents the format of our lessons
+    and can be viewed at <https://carpentries.github.io/lesson-example>.
+
+3.  If you wish to change the template used for workshop websites,
+    please work in <https://github.com/carpentries/workshop-template>.
+    The home page of that repository explains how to set up workshop websites,
+    while the extra pages in <https://carpentries.github.io/workshop-template>
+    provide more background on our design choices.
+
+4.  If you wish to change CSS style files, tools,
+    or HTML boilerplate for lessons or workshops stored in `_includes` or `_layouts`,
+    please work in <https://github.com/carpentries/styles>.
+
+## What to Contribute
+
+There are many ways to contribute,
+from writing new exercises and improving existing ones
+to updating or filling in the documentation
+and submitting [bug reports][issues]
+about things that do not work, are not clear, or are missing.
+If you are looking for ideas, please see the 'Issues' tab for
+a list of issues associated with this repository,
+or you may also look at the issues for [Data Carpentry][dc-issues], 
+[Software Carpentry][swc-issues], and [Library Carpentry][lc-issues] projects.
+
+Comments on issues and reviews of pull requests are just as welcome:
+we are smarter together than we are on our own.
+Reviews from novices and newcomers are particularly valuable:
+it is easy for people who have been using these lessons for a while
+to forget how impenetrable some of this material can be,
+so fresh eyes are always welcome.
+
+## What *Not* to Contribute
+
+Our lessons already contain more material than we can cover in a typical workshop,
+so we are usually *not* looking for more concepts or tools to add to them.
+As a rule,
+if you want to introduce a new idea,
+you must (a) estimate how long it will take to teach
+and (b) explain what you would take out to make room for it.
+The first encourages contributors to be honest about requirements;
+the second, to think hard about priorities.
+
+We are also not looking for exercises or other material that only run on one platform.
+Our workshops typically contain a mixture of Windows, macOS, and Linux users;
+in order to be usable,
+our lessons must run equally well on all three.
+
+## Using GitHub
+
+If you choose to contribute via GitHub, you may want to look at
+[How to Contribute to an Open Source Project on GitHub][how-contribute].
+To manage changes, we follow [GitHub flow][github-flow].
+Each lesson has two maintainers who review issues and pull requests or encourage others to do so.
+The maintainers are community volunteers and have final say over what gets merged into the lesson.
+To use the web interface for contributing to a lesson:
+
+1.  Fork the originating repository to your GitHub profile.
+2.  Within your version of the forked repository, move to the `gh-pages` branch and
+create a new branch for each significant change being made.
+3.  Navigate to the file(s) you wish to change within the new branches and make revisions as required.
+4.  Commit all changed files within the appropriate branches.
+5.  Create individual pull requests from each of your changed branches
+to the `gh-pages` branch within the originating repository.
+6.  If you receive feedback, make changes using your issue-specific branches of the forked
+repository and the pull requests will update automatically.
+7.  Repeat as needed until all feedback has been addressed.
+
+When starting work, please make sure your clone of the originating `gh-pages` branch is up-to-date
+before creating your own revision-specific branch(es) from there.
+Additionally, please only work from your newly-created branch(es) and *not*
+your clone of the originating `gh-pages` branch.
+Lastly, published copies of all the lessons are available in the `gh-pages` branch of the originating
+repository for reference while revising.
+
+## Other Resources
+
+General discussion of [Software Carpentry][swc-site] and [Data Carpentry][dc-site]
+happens on the [discussion mailing list][discuss-list],
+which everyone is welcome to join.
+You can also [reach us by email][email].
+
+[email]: mailto:admin@software-carpentry.org
+[dc-issues]: https://github.com/issues?q=user%3Adatacarpentry
+[dc-lessons]: http://datacarpentry.org/lessons/
+[dc-site]: http://datacarpentry.org/
+[discuss-list]: https://carpentries.topicbox.com/groups/discuss
+[github]: https://github.com
+[github-flow]: https://guides.github.com/introduction/flow/
+[github-join]: https://github.com/join
+[how-contribute]: https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github
+[issues]: https://guides.github.com/features/issues/
+[swc-issues]: https://github.com/issues?q=user%3Aswcarpentry
+[swc-lessons]: https://software-carpentry.org/lessons/
+[swc-site]: https://software-carpentry.org/
+[c-site]: https://carpentries.org/
+[lc-site]: https://librarycarpentry.org/
+[lc-issues]: https://github.com/issues?q=user%3Alibrarycarpentry
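The branch-per-change flow that CONTRIBUTING.md describes under "Using GitHub" can also be followed from the command line. The sketch below is a minimal illustration and is not part of this commit: it builds a throwaway local repository that stands in for a fork, and the branch name `fix-typo`, the file `index.md`, and the identity passed to `git -c` are all hypothetical.

```shell
#!/bin/sh
# Hypothetical local stand-in for a fork; nothing here touches GitHub.
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init -q lesson-fork
cd lesson-fork

# Lessons are published from gh-pages, so start from that branch.
git checkout -q -b gh-pages
echo "# Lesson" > index.md
git add index.md
git -c user.name="Example" -c user.email="example@example.org" \
    commit -q -m "Initial lesson content"

# One branch per significant change, created from an up-to-date gh-pages.
git checkout -q -b fix-typo
echo "Fixed wording." >> index.md
git -c user.name="Example" -c user.email="example@example.org" \
    commit -q -a -m "Fix wording in index.md"

# Report where we are and how far the change branch has moved.
current_branch=$(git rev-parse --abbrev-ref HEAD)
commits_ahead=$(git rev-list --count gh-pages..fix-typo)
echo "On $current_branch, $commits_ahead commit(s) ahead of gh-pages"
```

In a real contribution the remaining steps would be to push the change branch to the fork and open a pull request against the `gh-pages` branch of the originating repository.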
diff --git a/Gemfile b/Gemfile
index 41c97a9a57d879681825f8a65434b4752d0716b0..20ebaa15c575c53b6476925ab3b4c8a135aa0b1c 100644 (file)
--- a/Gemfile
+++ b/Gemfile
@@ -5,6 +5,6 @@ source 'https://rubygems.org'
 git_source(:github) { |repo_name| "https://github.com/#{repo_name}" }
 
 # Synchronize with https://pages.github.com/versions
-ruby '>=2.7.1'
+ruby '>=2.5.5'
 
 gem 'github-pages', group: :jekyll_plugins
diff --git a/README.md b/README.md
index 510d494c2ec8fd2b037159b6203f37c5d43e217b..060994aef38102e7f2b4fdf7196e76ffa9b5dd29 100644 (file)
--- a/README.md
+++ b/README.md
@@ -1,27 +1,40 @@
-# Lessons
+# FIXME Lesson title
 
-These lessons walk through the development of a CWL workflow for
-rnaseq.
+[![Create a Slack Account with us](https://img.shields.io/badge/Create_Slack_Account-The_Carpentries-071159.svg)](https://swc-slack-invite.herokuapp.com/)
 
-| Lesson   | Description |
-|----------|-------------|
-| [Lesson 1](lesson1/lesson1.md) | Turning a shell script into a workflow by composing existing tools  |
-| [Lesson 2](lesson2/lesson2.md) | Running and debugging a workflow  |
-| [Lesson 3](lesson3/lesson3.md) | Writing a tool wrapper  |
-| [Lesson 4](lesson4/lesson4.md) | Analyzing multiple samples  |
-| [Lesson 5](lesson5/lesson5.md) | Dynamic Workflow behavior with expressions  |
-| [Lesson 6](lesson6/lesson6.md) | Resources for further learning |
+This repository generates the corresponding lesson website from [The Carpentries](https://carpentries.org/) repertoire of lessons. 
 
-# Acknowledgements
+## Contributing
 
-These CWL lessons are based on "Introduction to RNA-seq using
-high-performance computing (HPC)" lessons developed by members of the
-teaching team at the Harvard Chan Bioinformatics Core (HBC) and
-obtained from
+We welcome all contributions to improve the lesson! Maintainers will do their best to help you if you have any
+questions, concerns, or experience any difficulties along the way.
 
-https://github.com/hbctraining/Intro-to-rnaseq-hpc-O2
+We'd like to ask you to familiarize yourself with our [Contribution Guide](CONTRIBUTING.md) and have a look at
+the [more detailed guidelines][lesson-example] on proper formatting, ways to render the lesson locally, and even
+how to write new episodes.
 
-The original lessons are open access materials distributed under the
-terms of the Creative Commons Attribution license (CC BY 4.0), which
-permits unrestricted use, distribution, and reproduction in any
-medium, provided the original author and source are credited.
+Please see the current list of [issues][FIXME] for ideas for contributing to this
+repository. For making your contribution, we use the GitHub flow, which is
+nicely explained in the chapter [Contributing to a Project](http://git-scm.com/book/en/v2/GitHub-Contributing-to-a-Project) in Pro Git
+by Scott Chacon.
+Look for the tag ![good_first_issue](https://img.shields.io/badge/-good%20first%20issue-gold.svg). This indicates that the maintainers will welcome a pull request fixing this issue.  
+
+
+## Maintainer(s)
+
+Current maintainers of this lesson are 
+
+* FIXME
+* FIXME
+* FIXME
+
+
+## Authors
+
+A list of contributors to the lesson can be found in [AUTHORS](AUTHORS)
+
+## Citation
+
+To cite this lesson, please consult [CITATION](CITATION)
+
+[lesson-example]: https://carpentries.github.io/lesson-example
diff --git a/README.md.old b/README.md.old
new file mode 100644 (file)
index 0000000..510d494
--- /dev/null
+++ b/README.md.old
@@ -0,0 +1,27 @@
+# Lessons
+
+These lessons walk through the development of a CWL workflow for
+rnaseq.
+
+| Lesson   | Description |
+|----------|-------------|
+| [Lesson 1](lesson1/lesson1.md) | Turning a shell script into a workflow by composing existing tools  |
+| [Lesson 2](lesson2/lesson2.md) | Running and debugging a workflow  |
+| [Lesson 3](lesson3/lesson3.md) | Writing a tool wrapper  |
+| [Lesson 4](lesson4/lesson4.md) | Analyzing multiple samples  |
+| [Lesson 5](lesson5/lesson5.md) | Dynamic Workflow behavior with expressions  |
+| [Lesson 6](lesson6/lesson6.md) | Resources for further learning |
+
+# Acknowledgements
+
+These CWL lessons are based on "Introduction to RNA-seq using
+high-performance computing (HPC)" lessons developed by members of the
+teaching team at the Harvard Chan Bioinformatics Core (HBC) and
+obtained from
+
+https://github.com/hbctraining/Intro-to-rnaseq-hpc-O2
+
+The original lessons are open access materials distributed under the
+terms of the Creative Commons Attribution license (CC BY 4.0), which
+permits unrestricted use, distribution, and reproduction in any
+medium, provided the original author and source are credited.
diff --git a/_config.yml b/_config.yml
new file mode 100644 (file)
index 0000000..a67f14b
--- /dev/null
+++ b/_config.yml
@@ -0,0 +1,101 @@
+#------------------------------------------------------------
+# Values for this lesson.
+#------------------------------------------------------------
+
+# Which carpentry is this ("swc", "dc", "lc", "cp", or "incubator")?
+# swc: Software Carpentry
+# dc: Data Carpentry
+# lc: Library Carpentry
+# cp: Carpentries (to use for instructor training for instance)
+# incubator: Carpentries Incubator
+carpentry: "swc"
+
+# Overall title for pages.
+title: "Lesson Title"
+
+# Life cycle stage of the lesson
+# See this page for more details: https://cdh.carpentries.org/the-lesson-life-cycle.html
+# Possible values: "pre-alpha", "alpha", "beta", "stable"
+life_cycle: "pre-alpha"
+
+#------------------------------------------------------------
+# Generic settings (should not need to change).
+#------------------------------------------------------------
+
+# What kind of thing is this ("workshop" or "lesson")?
+kind: "lesson"
+
+# Magic to make URLs resolve both locally and on GitHub.
+# See https://help.github.com/articles/repository-metadata-on-github-pages/.
+# Please don't change it: <USERNAME>/<PROJECT> is correct.
+repository: <USERNAME>/<PROJECT>
+
+# Email address, no mailto:
+email: "info@curii.com"
+
+# Sites.
+amy_site: "https://amy.carpentries.org/"
+carpentries_github: "https://github.com/carpentries"
+carpentries_pages: "https://carpentries.github.io"
+carpentries_site: "https://carpentries.org/"
+dc_site: "https://datacarpentry.org"
+example_repo: "https://github.com/carpentries/lesson-example"
+example_site: "https://carpentries.github.io/lesson-example"
+lc_site: "https://librarycarpentry.org/"
+swc_github: "https://github.com/swcarpentry"
+swc_pages: "https://swcarpentry.github.io"
+swc_site: "https://software-carpentry.org"
+template_repo: "https://github.com/carpentries/styles"
+training_site: "https://carpentries.github.io/instructor-training"
+workshop_repo: "https://github.com/carpentries/workshop-template"
+workshop_site: "https://carpentries.github.io/workshop-template"
+cc_by_human: "https://creativecommons.org/licenses/by/4.0/"
+
+# Surveys.
+pre_survey: "https://carpentries.typeform.com/to/wi32rS?slug="
+post_survey: "https://carpentries.typeform.com/to/UgVdRQ?slug="
+instructor_pre_survey: "https://www.surveymonkey.com/r/instructor_training_pre_survey?workshop_id="
+instructor_post_survey: "https://www.surveymonkey.com/r/instructor_training_post_survey?workshop_id="
+
+
+# Start time in minutes (0 to be clock-independent, 540 to show a start at 09:00 am).
+start_time: 0
+
+# Specify that things in the episodes collection should be output.
+collections:
+  episodes:
+    output: true
+    permalink: /:path/index.html
+  extras:
+    output: true
+    permalink: /:path/index.html
+
+# Set the default layout for things in the episodes collection.
+defaults:
+  - values:
+      root: .
+      layout: page
+  - scope:
+      path: ""
+      type: episodes
+    values:
+      root: ..
+      layout: episode
+  - scope:
+      path: ""
+      type: extras
+    values:
+      root: ..
+      layout: page
+
+# Files and directories that are not to be copied.
+exclude:
+  - Makefile
+  - bin/
+  - .Rproj.user/
+  - .vendor/
+  - vendor/
+  - .docker-vendor/
+
+# Turn on built-in syntax highlighting.
+highlighter: rouge
diff --git a/_episodes/01-introduction.md b/_episodes/01-introduction.md
new file mode 100644 (file)
index 0000000..fa10f79
--- /dev/null
+++ b/_episodes/01-introduction.md
@@ -0,0 +1,111 @@
+---
+title: "Introduction"
+teaching: 0
+exercises: 0
+questions:
+- "Key question (FIXME)"
+objectives:
+- "First learning objective. (FIXME)"
+keypoints:
+- "First key point. Brief Answer to questions. (FIXME)"
+---
+
+## Introduction
+
+The goal of this training is to walk through the development of a
+best-practices CWL workflow by translating an existing bioinformatics
+shell script into CWL.  Specific knowledge of the biology of RNA-seq
+is *not* a prerequisite for these lessons.
+
+These lessons are based on "Introduction to RNA-seq using
+high-performance computing (HPC)" lessons developed by members of the
+teaching team at the Harvard Chan Bioinformatics Core (HBC).  The
+original training, which includes additional lectures about the
+biology of RNA-seq, can be found here:
+
+https://github.com/hbctraining/Intro-to-rnaseq-hpc-O2
+
+## Background
+
+RNA-seq is the process of sequencing RNA in a biological sample.  From
+the sequence reads, we want to measure the relative number of RNA
+molecules appearing in the sample that were produced by particular
+genes.  This analysis is called "differential gene expression".
+
+The entire process looks like this:
+
+![](/assets/img/RNAseqWorkflow.png)
+
+For this training, we are only concerned with the middle analytical
+steps (skipping adapter trimming).
+
+* Quality control (FASTQC)
+* Alignment (mapping)
+* Counting reads associated with genes
+
+## Analysis shell script
+
+This analysis is already available as a Unix shell script, which we
+will refer to in order to build the workflow.
+
+Reasons to use CWL over a plain shell script include portability,
+scalability, and the ability to run on platforms that are not traditional HPC.
+
+rnaseq_analysis_on_input_file.sh
+
+```
+#!/bin/bash
+
+# Based on
+# https://hbctraining.github.io/Intro-to-rnaseq-hpc-O2/lessons/07_automating_workflow.html
+#
+
+# This script takes a fastq file of RNA-Seq data, runs FastQC and outputs a counts file for it.
+# USAGE: sh rnaseq_analysis_on_input_file.sh <name of fastq file>
+
+set -e
+
+# initialize a variable with an intuitive name to store the name of the input fastq file
+fq=$1
+
+# grab base of filename for naming outputs
+base=`basename $fq .subset.fq`
+echo "Sample name is $base"
+
+# specify the number of cores to use
+cores=4
+
+# directory with genome reference FASTA and index files + name of the gene annotation file
+genome=rnaseq/reference_data
+gtf=rnaseq/reference_data/chr1-hg19_genes.gtf
+
+# make all of the output directories
+# The -p option means mkdir will create the whole path if it
+# does not exist and refrain from complaining if it does exist
+mkdir -p rnaseq/results/fastqc
+mkdir -p rnaseq/results/STAR
+mkdir -p rnaseq/results/counts
+
+# set up output filenames and locations
+fastqc_out=rnaseq/results/fastqc
+align_out=rnaseq/results/STAR/${base}_
+counts_input_bam=rnaseq/results/STAR/${base}_Aligned.sortedByCoord.out.bam
+counts=rnaseq/results/counts/${base}_featurecounts.txt
+
+echo "Processing file $fq"
+
+# Run FastQC and move output to the appropriate folder
+fastqc $fq
+
+# Run STAR
+STAR --runThreadN $cores --genomeDir $genome --readFilesIn $fq --outFileNamePrefix $align_out --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within --outSAMattributes Standard
+
+# Create BAM index
+samtools index $counts_input_bam
+
+# Count mapped reads
+featureCounts -T $cores -s 2 -a $gtf -o $counts $counts_input_bam
+```
+
+
+{% include links.md %}
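The script derives every output name from the input FASTQ path, which is the behaviour the CWL workflow later has to reproduce. A minimal sketch of just that naming logic follows. The input path `rnaseq/raw_data/sample1.subset.fq` is a hypothetical example; the variable names follow the script.

```shell
#!/bin/sh
# Hypothetical input path; only the .subset.fq suffix matters here.
fq=rnaseq/raw_data/sample1.subset.fq

# basename drops the directory part and the given suffix.
base=$(basename "$fq" .subset.fq)

# Output names are built from the sample name, as in the script above.
align_out=rnaseq/results/STAR/${base}_
counts_input_bam=rnaseq/results/STAR/${base}_Aligned.sortedByCoord.out.bam
counts=rnaseq/results/counts/${base}_featurecounts.txt

echo "$base"
echo "$counts_input_bam"
echo "$counts"
```

Because each name depends only on `$base`, running the script on a different FASTQ file writes to a different set of output files under the same `rnaseq/results` tree.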
similarity index 58%
rename from lesson1/lesson1.md
rename to _episodes/02-workflow.md
index 03173b6d82abe9adf4025a68b8a0ab7f19a2b301..a3700a9def6b1cf2727dbdcd609e959c78ca9d6c 100644 (file)
-# Turning a shell script into a workflow by composing existing tools
-
-## Introduction
-
-The goal of this training is to walk through the development of a
-best-practices CWL workflow by translating an existing bioinformatics
-shell script into CWL.  Specific knowledge of the biology of RNA-seq
-is *not* a prerequisite for these lessons.
-
-These lessons are based on "Introduction to RNA-seq using
-high-performance computing (HPC)" lessons developed by members of the
-teaching team at the Harvard Chan Bioinformatics Core (HBC).  The
-original training, which includes additional lectures about the
-biology of RNA-seq can be found here:
-
-https://github.com/hbctraining/Intro-to-rnaseq-hpc-O2
-
-## Background
-
-RNA-seq is the process of sequencing RNA in a biological sample.  From
-the sequence reads, we want to measure the relative number of RNA
-molecules appearing in the sample that were produced by particular
-genes.  This analysis is called "differential gene expression".
-
-The entire process looks like this:
-
-![](RNAseqWorkflow.png)
-
-For this training, we are only concerned with the middle analytical
-steps (skipping adapter trimming).
-
-* Quality control (FASTQC)
-* Alignment (mapping)
-* Counting reads associated with genes
-
-## Analysis shell script
-
-This analysis is already available as a Unix shell script, which we
-will refer to in order to build the workflow.
-
-Some of the reasons to use CWL over a plain shell script: portability,
-scalability, ability to run on platforms that are not traditional HPC.
-
-rnaseq_analysis_on_input_file.sh
-
-```
-#!/bin/bash
-
-# Based on
-# https://hbctraining.github.io/Intro-to-rnaseq-hpc-O2/lessons/07_automating_workflow.html
-#
-
-# This script takes a fastq file of RNA-Seq data, runs FastQC and outputs a counts file for it.
-# USAGE: sh rnaseq_analysis_on_input_file.sh <name of fastq file>
-
-set -e
-
-# initialize a variable with an intuitive name to store the name of the input fastq file
-fq=$1
-
-# grab base of filename for naming outputs
-base=`basename $fq .subset.fq`
-echo "Sample name is $base"
-
-# specify the number of cores to use
-cores=4
-
-# directory with genome reference FASTA and index files + name of the gene annotation file
-genome=rnaseq/reference_data
-gtf=rnaseq/reference_data/chr1-hg19_genes.gtf
-
-# make all of the output directories
-# The -p option means mkdir will create the whole path if it
-# does not exist and refrain from complaining if it does exist
-mkdir -p rnaseq/results/fastqc
-mkdir -p rnaseq/results/STAR
-mkdir -p rnaseq/results/counts
-
-# set up output filenames and locations
-fastqc_out=rnaseq/results/fastqc
-align_out=rnaseq/results/STAR/${base}_
-counts_input_bam=rnaseq/results/STAR/${base}_Aligned.sortedByCoord.out.bam
-counts=rnaseq/results/counts/${base}_featurecounts.txt
-
-echo "Processing file $fq"
-
-# Run FastQC and move output to the appropriate folder
-fastqc $fq
-
-# Run STAR
-STAR --runThreadN $cores --genomeDir $genome --readFilesIn $fq --outFileNamePrefix $align_out --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within --outSAMattributes Standard
-
-# Create BAM index
-samtools index $counts_input_bam
-
-# Count mapped reads
-featureCounts -T $cores -s 2 -a $gtf -o $counts $counts_input_bam
-```
-
-## Setting up
+---
+title: "Turning a shell script into a workflow by composing existing tools"
+teaching: 0
+exercises: 0
+questions:
+- "Key question (FIXME)"
+objectives:
+- "First learning objective. (FIXME)"
+keypoints:
+- "First key point. Brief Answer to questions. (FIXME)"
+---
+
+# Setting up
 
 We will create a new git repository and import a library of existing
 tool definitions that will help us build our workflow.
@@ -120,9 +33,9 @@ Next, import bio-cwl-tools with this command:
 git submodule add https://github.com/common-workflow-library/bio-cwl-tools.git
 ```
 
-## Writing the workflow
+# Writing the workflow
 
-### 1. File header
+## 1. File header
 
 Create a new file "main.cwl"
 
@@ -135,7 +48,7 @@ class: Workflow
 label: RNAseq CWL practice workflow
 ```
 
-### 2. Workflow Inputs
+## 2. Workflow Inputs
 
 The purpose of a workflow is to consume some input parameters, run a
 series of steps, and produce output values.
@@ -168,7 +81,7 @@ inputs:
   gtf: File
 ```
 
-### 3. Workflow Steps
+## 3. Workflow Steps
 
 A workflow consists of one or more steps.  This is the `steps` section.
 
@@ -203,7 +116,7 @@ steps:
     out: [html_file]
 ```
 
-### 4. Running alignment with STAR
+## 4. Running alignment with STAR
 
 STAR has more parameters.  Sometimes we want to provide input values
 to a step without making them workflow-level inputs.  We can do
@@ -225,7 +138,7 @@ this with `{default: N}`
     out: [alignment]
 ```
 
-### 5. Running samtools
+## 5. Running samtools
 
 The third step is to generate an index for the aligned BAM.
 
@@ -244,7 +157,7 @@ step will not run until the `STAR` step has completed successfully.
     out: [bam_sorted_indexed]
 ```
 
-### 6. featureCounts
+## 6. featureCounts
 
 As of this writing, the `subread` package that provides
 `featureCounts` is not available in bio-cwl-tools (and if it has been
@@ -252,7 +165,7 @@ added since writing this, let's pretend that it isn't there.)  We will
 go over how to write a CWL wrapper for a command line tool in
 lesson 3.  For now, we will leave off the final step.
 
-### 7. Workflow Outputs
+## 7. Workflow Outputs
 
 The last thing to do is declare the workflow outputs in the `outputs` section.
 
similarity index 95%
rename from lesson2/lesson2.md
rename to _episodes/03-running.md
index d7fb5a98b3b96d06e3d5bb84bf789dfb832075b6..b851a2959f70af583ad60987b7fb36069941cdcc 100644 (file)
@@ -1,3 +1,15 @@
+---
+title: "Running and debugging a workflow"
+teaching: 0
+exercises: 0
+questions:
+- "Key question (FIXME)"
+objectives:
+- "First learning objective. (FIXME)"
+keypoints:
+- "First key point. Brief Answer to questions. (FIXME)"
+---
+
 # Running and debugging a workflow
 
 ### 1. The input parameter file
similarity index 95%
rename from lesson3/lesson3.md
rename to _episodes/04-commandlinetool.md
index f24bce49c82d3f8bf9dc9dc8e1e11943489b6b29..0575a01561fd71919f991a11bf925592f31f5943 100644 (file)
@@ -1,4 +1,14 @@
-# Writing a tool wrapper
+---
+title: "Writing a tool wrapper"
+teaching: 0
+exercises: 0
+questions:
+- "Key question (FIXME)"
+objectives:
+- "First learning objective. (FIXME)"
+keypoints:
+- "First key point. Brief Answer to questions. (FIXME)"
+---
 
 It is time to add the last step in the analysis.
 
similarity index 95%
rename from lesson4/lesson4.md
rename to _episodes/05-scatter.md
index 91df9bdbdf07f5cb8a8db50c47c4cbd12b91bb45..6160baeb27cec93a0be94030d08b52f57add9a3f 100644 (file)
@@ -1,3 +1,15 @@
+---
+title: "Analyzing multiple samples"
+teaching: 0
+exercises: 0
+questions:
+- "Key question (FIXME)"
+objectives:
+- "First learning objective. (FIXME)"
+keypoints:
+- "First key point. Brief Answer to questions. (FIXME)"
+---
+
 # Analyzing multiple samples
 
 Analyzing a single sample is great, but in the real world you probably
similarity index 95%
rename from lesson5/lesson5.md
rename to _episodes/06-expressions.md
index 664017613cb8bc28af5da5718953718a79b95477..7b83de6d28c831f5b6f124f39377c0705d7df7ac 100644 (file)
@@ -1,4 +1,14 @@
-# Dynamic Workflow behavior with expressions
+---
+title: "Dynamic Workflow behavior with expressions"
+teaching: 0
+exercises: 0
+questions:
+- "Key question (FIXME)"
+objectives:
+- "First learning objective. (FIXME)"
+keypoints:
+- "First key point. Brief Answer to questions. (FIXME)"
+---
 
 ### 1. Expressions on step inputs
 
similarity index 82%
rename from lesson6/lesson6.md
rename to _episodes/07-resources.md
index c367a2b58e9815e3c6f59ba3ca823f5d8b7ebbfe..81fd2e1bd4915c67cb25c51f73ac67526480561f 100644 (file)
@@ -1,4 +1,14 @@
-# Resources for further learning
+---
+title: "Resources for further learning"
+teaching: 0
+exercises: 0
+questions:
+- "Key question (FIXME)"
+objectives:
+- "First learning objective. (FIXME)"
+keypoints:
+- "First key point. Brief Answer to questions. (FIXME)"
+---
 
 Hopefully you now have a basic grasp of the steps involved in
 developing a CWL workflow. There are many resources out there to
diff --git a/_extras/about.md b/_extras/about.md
new file mode 100644 (file)
index 0000000..5f07f65
--- /dev/null
+++ b/_extras/about.md
@@ -0,0 +1,5 @@
+---
+title: About
+---
+{% include carpentries.html %}
+{% include links.md %}
diff --git a/_extras/discuss.md b/_extras/discuss.md
new file mode 100644 (file)
index 0000000..bfc33c5
--- /dev/null
+++ b/_extras/discuss.md
@@ -0,0 +1,6 @@
+---
+title: Discussion
+---
+FIXME
+
+{% include links.md %}
diff --git a/_extras/figures.md b/_extras/figures.md
new file mode 100644 (file)
index 0000000..0012c88
--- /dev/null
+++ b/_extras/figures.md
@@ -0,0 +1,79 @@
+---
+title: Figures
+---
+
+{% include base_path.html %}
+{% include manual_episode_order.html %}
+
+<script>
+  window.onload = function() {
+    var lesson_episodes = [
+    {% for lesson_episode in lesson_episodes %}
+      {% if site.episode_order %}
+        {% assign episode = site.episodes | where: "slug", lesson_episode | first %}
+      {% else %}
+        {% assign episode = lesson_episode %}
+      {% endif %}
+    "{{ episode.url }}"{% unless forloop.last %},{% endunless %}
+    {% endfor %}
+    ];
+
+    var xmlHttp = [];  /* Required since we are going to query every episode. */
+    for (i=0; i < lesson_episodes.length; i++) {
+
+      xmlHttp[i] = new XMLHttpRequest();
+      xmlHttp[i].episode = lesson_episodes[i];  /* Stored so the handler can find the matching article later. */
+      xmlHttp[i].onreadystatechange = function() {
+
+        if (this.readyState == 4 && this.status == 200) {
+          var parser = new DOMParser();
+          var htmlDoc = parser.parseFromString(this.responseText,"text/html");
+          var htmlDocArticle = htmlDoc.getElementsByTagName("article")[0];
+
+          var article_here = document.getElementById(this.episode);
+          var images = htmlDocArticle.getElementsByTagName("img");
+
+          if (images.length > 0) {
+            var h1text = htmlDocArticle.getElementsByTagName("h1")[0].innerHTML;
+
+            var htitle = document.createElement('h2');
+            htitle.innerHTML = h1text;
+            article_here.appendChild(htitle);
+
+            var image_num = 0;
+            for (let image of images) {
+              image_num++;
+
+              var title = document.createElement('p');
+              title.innerHTML = "<strong>Figure " + image_num + ".</strong> " + image.alt;
+              article_here.appendChild(title);
+
+              article_here.appendChild(image.cloneNode(false));
+
+              if (image_num < images.length) {
+                var hr = document.createElement('hr');
+                article_here.appendChild(hr);
+              }
+            }
+          }
+        }
+      }
+      var episode_url = "{{ relative_root_path }}" + lesson_episodes[i];
+      xmlHttp[i].open("GET", episode_url);
+      xmlHttp[i].send(null);
+    }
+  }
+</script>
+
+{% comment %} Create anchor for each one of the episodes.  {% endcomment %}
+
+{% for lesson_episode in lesson_episodes %}
+  {% if site.episode_order %}
+    {% assign episode = site.episodes | where: "slug", lesson_episode | first %}
+  {% else %}
+    {% assign episode = lesson_episode %}
+  {% endif %}
+<article id="{{ episode.url }}" class="figures"></article>
+{% endfor %}
+
+{% include links.md %}
diff --git a/_extras/guide.md b/_extras/guide.md
new file mode 100644 (file)
index 0000000..50f266f
--- /dev/null
+++ b/_extras/guide.md
@@ -0,0 +1,6 @@
+---
+title: "Instructor Notes"
+---
+FIXME
+
+{% include links.md %}
diff --git a/answers/ep2/bio-cwl-tools b/answers/ep2/bio-cwl-tools
new file mode 120000 (symlink)
index 0000000..fc4057d
--- /dev/null
@@ -0,0 +1 @@
+/home/peter/work/rnaseq-cwl-training-exercises/bio-cwl-tools/
\ No newline at end of file
diff --git a/answers/ep4/bio-cwl-tools b/answers/ep4/bio-cwl-tools
new file mode 120000 (symlink)
index 0000000..fc4057d
--- /dev/null
@@ -0,0 +1 @@
+/home/peter/work/rnaseq-cwl-training-exercises/bio-cwl-tools/
\ No newline at end of file
diff --git a/answers/ep5/part1/bio-cwl-tools b/answers/ep5/part1/bio-cwl-tools
new file mode 120000 (symlink)
index 0000000..fc4057d
--- /dev/null
@@ -0,0 +1 @@
+/home/peter/work/rnaseq-cwl-training-exercises/bio-cwl-tools/
\ No newline at end of file
diff --git a/answers/ep5/part2/bio-cwl-tools b/answers/ep5/part2/bio-cwl-tools
new file mode 120000 (symlink)
index 0000000..fc4057d
--- /dev/null
@@ -0,0 +1 @@
+/home/peter/work/rnaseq-cwl-training-exercises/bio-cwl-tools/
\ No newline at end of file
diff --git a/answers/ep5/part4/bio-cwl-tools b/answers/ep5/part4/bio-cwl-tools
new file mode 120000 (symlink)
index 0000000..fc4057d
--- /dev/null
@@ -0,0 +1 @@
+/home/peter/work/rnaseq-cwl-training-exercises/bio-cwl-tools/
\ No newline at end of file
diff --git a/answers/ep6/bio-cwl-tools b/answers/ep6/bio-cwl-tools
new file mode 120000 (symlink)
index 0000000..fc4057d
--- /dev/null
@@ -0,0 +1 @@
+/home/peter/work/rnaseq-cwl-training-exercises/bio-cwl-tools/
\ No newline at end of file
diff --git a/index.md b/index.md
new file mode 100644 (file)
index 0000000..95ccdbd
--- /dev/null
+++ b/index.md
@@ -0,0 +1,17 @@
+---
+layout: lesson
+root: .  # Is the only page that doesn't follow the pattern /:path/index.html
+permalink: index.html  # Is the only page that doesn't follow the pattern /:path/index.html
+---
+FIXME: home page introduction
+
+<!-- this is an html comment -->
+
+{% comment %} This is a comment in Liquid {% endcomment %}
+
+> ## Prerequisites
+>
+> FIXME
+{: .prereq}
+
+{% include links.md %}
diff --git a/reference.md b/reference.md
new file mode 100644 (file)
index 0000000..8c82616
--- /dev/null
+++ b/reference.md
@@ -0,0 +1,9 @@
+---
+layout: reference
+---
+
+## Glossary
+
+FIXME
+
+{% include links.md %}
diff --git a/setup.md b/setup.md
new file mode 100644 (file)
index 0000000..b8c5032
--- /dev/null
+++ b/setup.md
@@ -0,0 +1,7 @@
+---
+title: Setup
+---
+FIXME
+
+
+{% include links.md %}