From e4d896f5f94a9cf7b157cf87d5042e416649d87b Mon Sep 17 00:00:00 2001
From: Peter Amstutz
Date: Fri, 8 Nov 2024 13:57:48 -0500
Subject: [PATCH] More description of WGS workflow

Arvados-DCO-1.1-Signed-off-by: Peter Amstutz
---
 WGS-processing/cwl/wgs-processing-wf.cwl | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/WGS-processing/cwl/wgs-processing-wf.cwl b/WGS-processing/cwl/wgs-processing-wf.cwl
index 3a96f44..486c63b 100644
--- a/WGS-processing/cwl/wgs-processing-wf.cwl
+++ b/WGS-processing/cwl/wgs-processing-wf.cwl
@@ -2,10 +2,23 @@ cwlVersion: v1.1
 class: Workflow
 label: Whole Genome Sequence processing workflow scattered over samples
 doc: |
-  This is a “real-world” workflow example that takes in NGS Whole
-  Genome Sequence (WGS) data as FASTQs and performs quality checking,
-  alignment, and variant calling, returning GVCFs and accompanying
-  clinvar variant reports.
+  This is a “real-world” workflow example for processing Next
+  Generation Sequencing (NGS) Whole Genome Sequence (WGS) data.
+
+  You can learn more and run this workflow yourself by going
+  through the Processing
+  Whole Genome Sequences walkthrough in the Arvados user
+  guide.
+
+  The steps of this workflow include:
+
+  1. Check of fastq quality using FastQC
+  2. Local alignment using BWA-MEM
+  3. Variant calling in parallel using GATK Haplotype Caller
+  4. Generation of an HTML report comparing variants against ClinVar archive
+
   The primary input parameter is the Directory of paired FASTQ files, which should contain paired FASTQ files (suffixed with _1
-- 
2.30.2
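
As a reading aid for the input description in the patched `doc` text, here is a minimal sketch of how a directory of paired FASTQ files could be grouped into per-sample pairs before being scattered over samples. It is not part of the patch or of the workflow. Only the `_1` suffix is visible in the hunk, so the `_2` mate suffix, the accepted file extensions, and the helper name `pair_fastqs` are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming: <sample>_1.<ext> and <sample>_2.<ext>. Only "_1" appears in
# the patch; "_2" and the accepted extensions are illustrative assumptions.
_MATE_RE = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.(?:fastq|fq)(?:\.gz)?$")


def pair_fastqs(directory):
    """Group FASTQ files in `directory` into {sample: (mate1_path, mate2_path)}.

    Hypothetical helper: files that do not match the assumed naming are
    ignored, and a sample with only one mate raises ValueError.
    """
    mates = defaultdict(dict)
    for path in sorted(Path(directory).iterdir()):
        match = _MATE_RE.match(path.name)
        if path.is_file() and match:
            mates[match["sample"]][match["mate"]] = path

    incomplete = sorted(s for s, m in mates.items() if set(m) != {"1", "2"})
    if incomplete:
        raise ValueError(f"samples missing a mate file: {incomplete}")

    return {s: (m["1"], m["2"]) for s, m in mates.items()}
```

Calling `pair_fastqs("fastqs/")` on a directory holding `A_1.fastq.gz` and `A_2.fastq.gz` would return a single entry keyed by `A`. Failing on an unpaired sample, rather than skipping it, is a design choice of this sketch so that a missing mate is noticed before alignment starts.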