From cc20b244789bda85dece42ae01db9739eda2b192 Mon Sep 17 00:00:00 2001
From: Peter Amstutz
Date: Fri, 29 Jan 2021 10:45:38 -0500
Subject: [PATCH] Turn the "run the workflow again" instructions into exercises.

Arvados-DCO-1.1-Signed-off-by: Peter Amstutz
---
 _episodes/02-workflow.md        |  2 +-
 _episodes/03-running.md         |  8 +++-
 _episodes/04-commandlinetool.md | 68 ++++++++++++++++++++-------------
 _episodes/05-scatter.md         | 31 ++++++++++-----
 _episodes/06-expressions.md     | 18 +++++++--
 _episodes/07-resources.md       |  6 +--
 6 files changed, 88 insertions(+), 45 deletions(-)

diff --git a/_episodes/02-workflow.md b/_episodes/02-workflow.md
index 3271ba8..8ef18cd 100644
--- a/_episodes/02-workflow.md
+++ b/_episodes/02-workflow.md
@@ -1,6 +1,6 @@
 ---
 title: "Create a Workflow by Composing Tools"
-teaching: 20
+teaching: 30
 exercises: 10
 questions:
 - "What is the syntax of CWL?"
diff --git a/_episodes/03-running.md b/_episodes/03-running.md
index f3c9779..ba37ad3 100644
--- a/_episodes/03-running.md
+++ b/_episodes/03-running.md
@@ -1,6 +1,6 @@
 ---
 title: "Running and Debugging a Workflow"
-teaching: 10
+teaching: 15
 exercises: 20
 questions:
 - "How do I provide input to run a workflow?"
@@ -121,7 +121,11 @@ Resource requirements you can set include:
 * tmpdirMin: temporary directory available space
 * outdirMin: output directory available space
 
-After setting the RAM requirements, re-run the workflow.
+> ## Running the workflow
+>
+> Now that you've fixed the workflow, run it again.
+>
+{: .challenge }
 
 # Workflow results
 
diff --git a/_episodes/04-commandlinetool.md b/_episodes/04-commandlinetool.md
index 22110fa..07378aa 100644
--- a/_episodes/04-commandlinetool.md
+++ b/_episodes/04-commandlinetool.md
@@ -1,7 +1,7 @@
 ---
 title: "Writing a Tool Wrapper"
-teaching: 15
-exercises: 20
+teaching: 20
+exercises: 30
 questions:
 - "What are the key components of a tool wrapper?"
 - "How do I use software containers to supply the software I want to run?"
@@ -29,6 +29,11 @@ This will use the "featureCounts" tool from the "subread" package.
 
 # File header
 
+A CommandLineTool describes a single invocation of a command line
+program. It consumes some input parameters, runs a program, and
+captures output, mainly in the form of files produced by the
+program.
+
 Create a new file "featureCounts.cwl"
 
 Let's start with the header. This is very similar to the workflow, except that we use `class: CommandLineTool`.
@@ -42,21 +47,29 @@ label: featureCounts tool
 
 # Command line tool inputs
 
-A CommandLineTool describes a single invocation of a command line program.
-
-It consumes some input parameters, runs a program, and captures
-output, mainly in in the form of files produced by the program.
-
-The variables used in the bash script are `$cores`, `$gtf`, `$counts` and `$counts_input_bam`.
-
-This gives us two file inputs, `gtf` and `counts_input_bam` which we can declare in our `inputs` section:
+The `inputs` section describes input parameters with the same form as
+the Workflow `inputs` section.
 
-```
-inputs:
-  gtf: File
-  counts_input_bam: File
-```
-{: .language-yaml }
+> ## Exercise
+>
+> The variables used in the bash script are `$cores`, `$gtf`, `$counts` and `$counts_input_bam`.
+>
+> * $cores is the number of CPU cores to use.
+> * $gtf is the input .gtf file
+> * $counts is the name we will give to the output file
+> * $counts_input_bam is the input .bam file
+>
+> Write the `inputs` section for the File inputs `gtf` and `counts_input_bam`.
+>
+> > ## Solution
+> > ```
+> > inputs:
+> >   gtf: File
+> >   counts_input_bam: File
+> > ```
+> > {: .language-yaml }
+> {: .solution}
+{: .challenge}
 
 # Specifying the program to run
 
@@ -188,7 +201,7 @@ When creating a tool wrapper, it is helpful to run it on its own to test it.
 The input to a single tool is the same kind of input parameters file
 that we used as input to a workflow in the previous lesson.
 
-featureCounts.yaml:
+`featureCounts.yaml`
 
 ```
 counts_input_bam:
@@ -200,20 +213,23 @@ gtf:
 ```
 {: .language-yaml }
 
-The invocation is also the same:
-
-```
-cwl-runner featureCounts.cwl featureCounts.yaml
-```
-{: .language-bash }
+> ## Running the tool
+>
+> Run the tool on its own to confirm it has correct behavior:
+>
+> ```
+> cwl-runner featureCounts.cwl featureCounts.yaml
+> ```
+> {: .language-bash }
+{: .challenge }
 
 # Adding it to the workflow
 
+Now that we have confirmed that the tool wrapper works, it is time to
+add it to our workflow.
+
 > ## Exercise
 >
-> Now that we have confirmed that the tool wrapper works, it is time
-> to add it to our workflow.
->
 > 1. Add a new step called `featureCounts` that runs our tool
 >    wrapper. The new step should take input from
 >    `samtools/bam_sorted_indexed`, and should be allocated a
diff --git a/_episodes/05-scatter.md b/_episodes/05-scatter.md
index 2903952..0881326 100644
--- a/_episodes/05-scatter.md
+++ b/_episodes/05-scatter.md
@@ -1,7 +1,7 @@
 ---
 title: "Analyzing Multiple Samples"
-teaching: 20
-exercises: 0
+teaching: 30
+exercises: 30
 questions:
 - "How can you run the same workflow over multiple samples?"
 objectives:
@@ -66,9 +66,13 @@ requirements:
 ```
 {: .language-yaml }
 
-If you run this workflow, you will get exactly the same results as
-before, as all we have done so far is to wrap the inner workflow with
-an outer workflow.
+> ## Running the workflow
+>
+> Run this workflow. You should get exactly the same results as
+> before, as all we have done so far is to wrap the inner workflow with
+> an outer workflow.
+>
+{: .challenge }
 
 # Scattering
 
@@ -165,8 +169,12 @@ gtf:
 ```
 {: .language-yaml }
 
-If you run the workflow, you will get results for each one of the
-input fastq files.
+> ## Running the workflow
+>
+> Run this workflow. You will now get results for each one of the
+> input fastq files.
+>
+{: .challenge }
 
 # Combining results
 
@@ -231,5 +239,10 @@ outputs:
 ```
 {: .language-yaml }
 
-Run this workflow to get a single `featurecounts.tsv` file with a
-column for each bam file.
+> ## Running the workflow
+>
+> Run this workflow. You will still have separate results from fastq
+> and STAR, but now you will only have a single
+> `featurecounts.tsv` file with a column for each bam file.
+>
+{: .challenge }
diff --git a/_episodes/06-expressions.md b/_episodes/06-expressions.md
index cf26be4..9c79ad5 100644
--- a/_episodes/06-expressions.md
+++ b/_episodes/06-expressions.md
@@ -1,13 +1,15 @@
 ---
 title: "Dynamic Workflow Behavior"
 teaching: 20
-exercises: 0
+exercises: 10
 questions:
-- "How can I adjust workflow behavior at runtime?"
+- "What kind of custom logic can happen between steps?"
 objectives:
-- "Set "
+- "Customize the STAR output filename to use the input filename."
+- "Organize files into directories."
 keypoints:
-- "First key point. Brief Answer to questions. (FIXME)"
+- "CWL expressions allow you to use custom logic to determine input parameter values."
+- "CWL ExpressionTool can be used to reshape data, such as declaring directories that should contain output files."
 ---
 
 # Expressions on step inputs
@@ -154,3 +156,11 @@ outputs:
     outputSource: featureCounts/featurecounts
 ```
 {: .language-yaml }
+
+> ## Running the workflow
+>
+> Run the workflow. Look at the output. The BAM and fastqc files
+> should now be organized into directories, with better naming of the
+> bam files.
+>
+{: .challenge }
diff --git a/_episodes/07-resources.md b/_episodes/07-resources.md
index c68b5ac..6b12771 100644
--- a/_episodes/07-resources.md
+++ b/_episodes/07-resources.md
@@ -3,11 +3,11 @@ title: " Resources for further learning"
 teaching: 10
 exercises: 0
 questions:
-- "Key question (FIXME)"
+- "Where should I go to learn more?"
 objectives:
-- "First learning objective. (FIXME)"
+- "Become a part of the CWL community."
 keypoints:
-- "First key point. Brief Answer to questions. (FIXME)"
+- "Learn more advanced techniques from the CWL user guide, by asking questions on the CWL forum and chat channel, and by reading the specification."
 ---
 
 Hopefully you now have a basic grasp of the steps involved in
-- 
2.30.2