Here you will use the GATK VariantFiltration program to assign pass/fail scores to variants in a VCF file.
+_This should be motivated better using a specific biomedical research
+or diagnostic question that involves this analysis_
+
+_From conversation with Ward: We should link to a discussion of the
+Personal Genome Project and explain that it is a freely available dataset
+that any researcher can use, which makes it appropriate for these examples._
+
h3. Prerequisites
* Log in to a VM "using SSH":ssh-access.html
h3. Get the GATK binary distribution.
+_Perhaps separate out this and the next section and link to them, so the user only
+has to do this if they really don't have GATK installed. Also provide
+a way to determine whether they do have it._
+
Download the GATK binary tarball[1] -- e.g., @GenomeAnalysisTK-2.6-4.tar.bz2@ -- and copy it to your Arvados VM.
+_Is it necessary to copy it to the Arvados VM first? You could put it into
+Keep from your desktop and/or use Workbench. Also, if we are
+telling them to copy it to the VM, maybe we should mention scp?_
+
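For the copy step, a minimal sketch using scp might look like the following. The function name @copy_to_vm@ and the LOGIN and VM_HOSTNAME placeholders are hypothetical and not part of Arvados; substitute the login and hostname you already use for SSH access.

```shell
# Hypothetical helper: copy a local tarball to your home directory on a remote VM.
# Usage: copy_to_vm GenomeAnalysisTK-2.6-4.tar.bz2 LOGIN@VM_HOSTNAME
copy_to_vm() {
  tarball="$1"      # path to the downloaded tarball on your desktop
  destination="$2"  # LOGIN@VM_HOSTNAME (placeholder, supplied by you)
  if [ ! -f "$tarball" ]; then
    echo "no such file: $tarball" >&2
    return 1
  fi
  # The trailing colon tells scp to write into the remote home directory.
  scp "$tarball" "$destination:"
}
```

The block only defines the helper; nothing is copied until you call it with a real file and destination. The file check runs first so that a mistyped filename fails locally before any connection is attempted.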
Store it in Keep.
<pre>
↓
+_Make the itty bitty down arrows bigger, and maybe center them_
+
<pre>
c905c8d8443a9c44274d98b7c6cfaa32+94+K@qr1hi
</pre>
gatk_binary=c905c8d8443a9c44274d98b7c6cfaa32+94+K@qr1hi
gatk_bundle=d237a90bae3870b3b033aea1e99de4a9+10820+K@qr1hi
-read -rd "\000" the_job <<EOF
+read -rd $'\000' the_job <<EOF
{
"script":"GATK2-VariantFiltration",
"script_version":"$src_version",
h3. Monitor job progress
+_This was already covered in tutorial1_
+
There are three ways to monitor job progress:
# Go to Workbench, drop down the Compute menu, and click Jobs. The job you submitted should appear at the top of the list. Hit "Refresh" until it finishes.
https://{{ site.arvados_api_host }}/arvados/v1/jobs/JOB_UUID_HERE/log_tail_follow
</pre>
+
+_That's it? Say something about the output we're going to get_
+
h3. Notes
fn1. Download the GATK tools → http://www.broadinstitute.org/gatk/download