This directory contains an Arvados demo that processes whole genome sequencing (WGS) data. The workflow includes:

* Quality check of the FASTQ files using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
* Local alignment using BWA-MEM (http://bio-bwa.sourceforge.net/bwa.shtml)
* Variant calling in parallel using GATK HaplotypeCaller (https://gatk.broadinstitute.org/hc/en-us)
* Generation of an HTML report comparing the variants against the ClinVar archive (https://www.ncbi.nlm.nih.gov/clinvar/)

Workflows are written in CWL v1.1.

The directory is organized as follows:

* cwl - contains the CWL code for the demo
* yml - contains the YAML input files for the CWL demo code
* src - contains any source code for the demo
* docker - contains the Dockerfiles needed to re-create the Docker images used by the demo
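
To run the demo:

* cd into the cwl directory
* Run the following command, replacing YOUR_PROJECT_UUID and YOURINPUTS.yml with your own values:

arvados-cwl-runner --no-wait --project-uuid YOUR_PROJECT_UUID wgs-processing-wf.cwl ../yml/YOURINPUTS.yml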
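
The run steps above can be wrapped in a small shell function. This is only a sketch under stated assumptions: the function name submit_demo and the RUNNER variable are hypothetical names introduced here, not part of the demo, and the function assumes it is called from the directory that contains the cwl and yml directories.

```shell
# Sketch: submit the WGS demo workflow to Arvados.
# submit_demo and RUNNER are hypothetical helpers, not part of the demo itself.
submit_demo() {
    project_uuid="$1"   # UUID of the Arvados project that should own the run
    inputs_yml="$2"     # name of an input file inside the yml directory

    # Run in a subshell so the caller's working directory is left unchanged.
    # RUNNER defaults to arvados-cwl-runner; set RUNNER=echo for a dry run
    # that only prints the arguments.
    (
        cd cwl &&
        "${RUNNER:-arvados-cwl-runner}" --no-wait \
            --project-uuid "$project_uuid" \
            wgs-processing-wf.cwl "../yml/$inputs_yml"
    )
}
```

With the function defined, `submit_demo YOUR_PROJECT_UUID YOURINPUTS.yml` submits the workflow, and prefixing the call with `RUNNER=echo` prints the arguments without contacting Arvados.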

The WGS data used in this demo is public data made available by the Personal Genome Project.
This data set comes from PGP-UK (https://www.personalgenomes.org.uk/).