Tom Clegg [Thu, 15 Dec 2022 15:59:03 +0000 (10:59 -0500)]
Merge branch '19526-manhattan-plot'
refs #19526
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 13 Dec 2022 15:03:06 +0000 (10:03 -0500)]
Merge branch '19566-glm'
refs #19566
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 12 Dec 2022 16:37:21 +0000 (11:37 -0500)]
19566: Test p-value vs. Python.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 2 Dec 2022 19:23:34 +0000 (14:23 -0500)]
19566: Normalize pca values before glm.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 2 Dec 2022 18:59:29 +0000 (13:59 -0500)]
19566: Add constant, check GLM results against Python.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Jiayong Li [Fri, 9 Dec 2022 21:09:17 +0000 (21:09 +0000)]
Merge branch '19785-add-cwl' into main
refs #19785
Arvados-DCO-1.1-Signed-off-by: Jiayong Li <jli@curii.com>
Jiayong Li [Fri, 9 Dec 2022 21:08:35 +0000 (21:08 +0000)]
Fix readme
refs #19785
Arvados-DCO-1.1-Signed-off-by: Jiayong Li <jli@curii.com>
Jiayong Li [Fri, 9 Dec 2022 21:05:32 +0000 (21:05 +0000)]
Merge branch '19785-add-cwl' into main
refs #19785
Arvados-DCO-1.1-Signed-off-by: Jiayong Li <jli@curii.com>
Jiayong Li [Fri, 9 Dec 2022 21:03:13 +0000 (21:03 +0000)]
Add cwl and docker files
refs #19785
Arvados-DCO-1.1-Signed-off-by: Jiayong Li <jli@curii.com>
Tom Clegg [Thu, 1 Dec 2022 18:13:18 +0000 (13:13 -0500)]
19526: Tidy manhattan plot.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 29 Nov 2022 16:22:12 +0000 (11:22 -0500)]
19566: Option to limit pca components used in glm. Fix onehot use.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 29 Nov 2022 16:10:32 +0000 (11:10 -0500)]
19566: glm one column at a time.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 29 Nov 2022 15:43:29 +0000 (10:43 -0500)]
19566: Logistic regression p-value.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 28 Nov 2022 20:18:32 +0000 (15:18 -0500)]
Merge branch '19526-manhattan-plot'
refs #19526
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 28 Nov 2022 18:34:48 +0000 (13:34 -0500)]
19526: Add manhattan plot.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 18 Nov 2022 18:28:36 +0000 (13:28 -0500)]
Check for unparsed command line args.
refs #19780
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 18 Nov 2022 17:50:49 +0000 (12:50 -0500)]
19780: Fix indexing error.
refs #19780
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 16 Nov 2022 20:22:55 +0000 (15:22 -0500)]
19527: slice-numpy accepts samples.csv with or without p-values.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 14 Nov 2022 00:06:41 +0000 (19:06 -0500)]
19527: Update arvados sdk.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 11 Nov 2022 21:04:54 +0000 (16:04 -0500)]
19527: Fix odd # columns to pca.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 11 Nov 2022 17:44:16 +0000 (12:44 -0500)]
19527: Fix one-hot matrix.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 11 Nov 2022 17:35:09 +0000 (12:35 -0500)]
19527: Enable choose-samples to work without case/control info.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 11 Nov 2022 01:55:11 +0000 (20:55 -0500)]
19527: Fix Χ² calculation.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 10 Nov 2022 20:49:35 +0000 (15:49 -0500)]
19527: Output samples.csv earlier.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 10 Nov 2022 19:40:45 +0000 (14:40 -0500)]
19527: Fix crash on tag skipped for min-coverage.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 10 Nov 2022 16:16:59 +0000 (11:16 -0500)]
19527: Option to exclude non-case/control samples.
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 10 Nov 2022 15:24:53 +0000 (10:24 -0500)]
Merge branch '19527-training-set'
refs #19527
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 9 Nov 2022 23:29:58 +0000 (18:29 -0500)]
19527: Accommodate header row in samples csv.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 9 Nov 2022 23:08:57 +0000 (18:08 -0500)]
19527: Ignore empty line at EOF.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 9 Nov 2022 20:12:33 +0000 (15:12 -0500)]
Merge branch '19524-pca'
refs #19524
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 9 Nov 2022 20:11:48 +0000 (15:11 -0500)]
19527: Fix p-value calculation.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 9 Nov 2022 19:39:31 +0000 (14:39 -0500)]
19527: Load training-set flag from samples.csv.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 7 Nov 2022 14:29:47 +0000 (09:29 -0500)]
19527: choose-samples: training/validation set.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 9 Nov 2022 19:24:49 +0000 (14:24 -0500)]
19527: Load training-set flag from samples.csv.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 7 Nov 2022 14:29:47 +0000 (09:29 -0500)]
choose-samples: training/validation set.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 2 Nov 2022 14:49:09 +0000 (10:49 -0400)]
19524: Fit PCA to specified training set.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 31 Oct 2022 15:53:26 +0000 (11:53 -0400)]
Merge branch '19524-pca'
refs #19524
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 21 Oct 2022 13:23:12 +0000 (09:23 -0400)]
19524: Fix matrix alloc size.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 20 Oct 2022 17:06:35 +0000 (13:06 -0400)]
19524: Flags choose which PCA components to plot.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 20 Oct 2022 15:23:08 +0000 (11:23 -0400)]
19524: Update colors, plot unknown-phenotype behind known.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 20 Oct 2022 14:07:11 +0000 (10:07 -0400)]
19524: Limit size of PCA input matrix.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 19 Oct 2022 20:17:36 +0000 (16:17 -0400)]
19524: Limit size of PCA input matrix.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 19 Oct 2022 19:55:56 +0000 (15:55 -0400)]
19524: configurable vcpus/ram
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 14 Oct 2022 17:34:23 +0000 (13:34 -0400)]
Merge branch '19524-pca'
refs #19524
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 13 Oct 2022 18:46:46 +0000 (14:46 -0400)]
19524: Use marker shape to indicate second category variable.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 13 Oct 2022 15:44:05 +0000 (11:44 -0400)]
19524: Remove obsolete pca cmds.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 13 Oct 2022 14:47:51 +0000 (10:47 -0400)]
19524: Fix deprecated scipy.load.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 13 Oct 2022 14:47:02 +0000 (10:47 -0400)]
19524: Read multiple phenotype files.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 13 Oct 2022 14:43:36 +0000 (10:43 -0400)]
19524: Generalize plot colors a little.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 13 Oct 2022 13:57:37 +0000 (09:57 -0400)]
Fail if inadvertently using randomness.
No issue #
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 12 Oct 2022 18:36:33 +0000 (14:36 -0400)]
19524: Fix colormap.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 12 Oct 2022 05:11:26 +0000 (01:11 -0400)]
19524: propagate pca-components arg.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 11 Oct 2022 18:40:03 +0000 (14:40 -0400)]
19524: plot: get sample list from csv instead of fasta filenames.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 11 Oct 2022 14:07:14 +0000 (10:07 -0400)]
19524: Output PCA.
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 7 Oct 2022 19:18:39 +0000 (15:18 -0400)]
Update deps, improve error reporting
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 7 Oct 2022 18:11:31 +0000 (14:11 -0400)]
Use min-coverage filter
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 5 Aug 2022 19:45:52 +0000 (15:45 -0400)]
Fix diff case
refs #19236 #note-20
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 3 Aug 2022 20:14:27 +0000 (16:14 -0400)]
Fix diff case
refs #19236 #note-15.7
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 27 Jul 2022 20:56:09 +0000 (16:56 -0400)]
Fix diff case
refs #19236 #note-15.6
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 27 Jul 2022 20:02:40 +0000 (16:02 -0400)]
Fix diff case
refs #19236 #note-15.4, #note-15.5
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 27 Jul 2022 18:48:05 +0000 (14:48 -0400)]
Fix diff case
refs #19236 #note-15.2, #note-15.3
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 26 Jul 2022 17:16:15 +0000 (13:16 -0400)]
Fix diff case
refs #19236 #note-15.1
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 20 Jul 2022 16:21:32 +0000 (12:21 -0400)]
Fix crash when ref tile is dropped due to duplicate tag.
refs #19236
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 15 Jul 2022 19:21:01 +0000 (15:21 -0400)]
Add test for variant at right end of spanning tile.
refs #19271
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 15 Jul 2022 17:51:56 +0000 (13:51 -0400)]
Generate annotations for spanning tiles.
refs #19271
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 14 Jul 2022 14:43:48 +0000 (10:43 -0400)]
Update tests.
No issue #
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 7 Jul 2022 18:28:36 +0000 (14:28 -0400)]
Fix wrong index in chunk>0 case.
refs #19168
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 1 Jul 2022 20:28:23 +0000 (16:28 -0400)]
Fix low-coverage tiles counting toward min coverage threshold.
refs #19168
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 3 Jun 2022 04:05:42 +0000 (00:05 -0400)]
Fix loss of precision in p-value calculation.
refs #19014
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 4 May 2022 05:16:51 +0000 (01:16 -0400)]
19073: Fix dup tag detection.
refs #19073
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 4 May 2022 03:22:31 +0000 (23:22 -0400)]
19073: Fix dup tag detection.
refs #19073
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 3 May 2022 17:57:01 +0000 (13:57 -0400)]
19073: Remove dup tags (>1 ref placement) from tilestats bed file.
refs #19073
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 29 Apr 2022 18:57:44 +0000 (14:57 -0400)]
Add -log10(pvalue) row to onehot-columns.npy output from slicenumpy.
closes #19014
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 16 Mar 2022 22:08:31 +0000 (18:08 -0400)]
Add tilestats cmd.
refs #18582
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 16 Mar 2022 17:16:22 +0000 (13:16 -0400)]
Write chunk-tag-offset.csv with chunked tilevariant# matrix.
refs #17996
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 4 Mar 2022 19:48:35 +0000 (14:48 -0500)]
Change zygosity column info from het=1 to hom=1.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 24 Feb 2022 15:50:13 +0000 (10:50 -0500)]
Fix left-most diff cases.
refs #18721
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 22 Feb 2022 19:04:00 +0000 (14:04 -0500)]
Encode spanning tile as 0 in tile variant matrix.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Sun, 20 Feb 2022 03:08:38 +0000 (22:08 -0500)]
More debugging info for -debug-tag.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Sun, 20 Feb 2022 03:07:14 +0000 (22:07 -0500)]
Skip input files that aren't needed because -max-tag.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 18 Feb 2022 20:11:19 +0000 (15:11 -0500)]
Fix left-most diff cases.
refs #18721
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 18 Feb 2022 20:11:17 +0000 (15:11 -0500)]
Add -debug-tag flag.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 18 Feb 2022 14:57:01 +0000 (09:57 -0500)]
Fix -include-variant-1
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 17 Feb 2022 15:07:02 +0000 (10:07 -0500)]
Update tests (don't include both het+hom if only one passes filter).
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 17 Feb 2022 15:06:55 +0000 (10:06 -0500)]
Fix -max-tag filter.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 17 Feb 2022 14:23:20 +0000 (09:23 -0500)]
Fix sparse one-hot coordinates for chunk n>0.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 17 Feb 2022 14:22:47 +0000 (09:22 -0500)]
Fix -max-tag filter.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Tue, 15 Feb 2022 17:54:27 +0000 (12:54 -0500)]
Option to include variant 1 in one-hot matrix.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 11 Feb 2022 20:01:57 +0000 (15:01 -0500)]
Include tiles in one-hot matrix even if there is no ref tile.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Wed, 9 Feb 2022 21:17:10 +0000 (16:17 -0500)]
Support -max-tag flag for debugging.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 4 Feb 2022 21:17:23 +0000 (16:17 -0500)]
Don't include het just because corresponding hom passed.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 4 Feb 2022 06:11:58 +0000 (01:11 -0500)]
Update logged stats.
refs #18664
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 4 Feb 2022 05:41:40 +0000 (00:41 -0500)]
Don't use tags that appear more than once per sequence.
refs #18664
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 3 Feb 2022 19:25:47 +0000 (14:25 -0500)]
Fix log message.
refs #18664
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 3 Feb 2022 02:37:19 +0000 (21:37 -0500)]
Skip tags that appear twice in the same chromosome.
refs #18664
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 31 Jan 2022 18:56:56 +0000 (13:56 -0500)]
Add dump command.
No issue #
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 27 Jan 2022 14:16:35 +0000 (09:16 -0500)]
Update memory-size log message.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Thu, 27 Jan 2022 05:31:02 +0000 (00:31 -0500)]
Output -single-onehot as coordinates of non-zero values.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Fri, 21 Jan 2022 19:05:34 +0000 (14:05 -0500)]
Fix Χ² calculation.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>
Tom Clegg [Mon, 17 Jan 2022 18:03:44 +0000 (13:03 -0500)]
Use native client to read annotations.csv.
refs #18581
Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tom@curii.com>