# HG changeset patch
# User iuc
# Date 1606900284 0
# Node ID 12f2b14549f6e903c7e7abfceed703c8e86dd226
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/vsnp commit 524a39e08f2bea8b8754284df606ff8dd27ed24b"
diff -r 000000000000 -r 12f2b14549f6 macros.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/macros.xml Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,30 @@
+
+
+ 1.0
+ 19.09
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ @misc{None,
+ journal = {None},
+ author = {1. Stuber T},
+ title = {Manuscript in preparation},
+ year = {None},
+ url = {https://github.com/USDA-VS/vSNP},}
+
+
+
+
+
diff -r 000000000000 -r 12f2b14549f6 static/images/table_description.png
Binary file static/images/table_description.png has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_avg_mq.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01D6_avg_mq.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"name":null,"index":["NC_002945.4:1057","NC_002945.4:4480","NC_002945.4:8741","NC_002945.4:29061","NC_002945.4:33788","NC_002945.4:41228","NC_002945.4:41437","NC_002945.4:50470","NC_002945.4:59861","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:70438","NC_002945.4:79918","NC_002945.4:96244","NC_002945.4:110198","NC_002945.4:114965","NC_002945.4:117800","NC_002945.4:127447","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:140686","NC_002945.4:143799","NC_002945.4:144992","NC_002945.4:148871","NC_002945.4:159370","NC_002945.4:160535","NC_002945.4:165799","NC_002945.4:166696","NC_002945.4:179885","NC_002945.4:189083","NC_002945.4:192177","NC_002945.4:198890","NC_002945.4:223919","NC_002945.4:230661","NC_002945.4:232188","NC_002945.4:295519","NC_002945.4:299636","NC_002945.4:304339","NC_002945.4:319911","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:362818","NC_002945.4:364560","NC_002945.4:364804","NC_002945.4:366022","NC_002945.4:407246","NC_002945.4:430077","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:479644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:641896","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:673880","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102"
,"NC_002945.4:745507","NC_002945.4:760347","NC_002945.4:792617","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:839308","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:998183","NC_002945.4:998196","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1096802","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:1186381","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1213847","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:1295457","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:1442194","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.
4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_002945.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:1659174","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:1799442","NC_002945.4:1803035","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1974665","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2204661","NC_002945.4:2210027","NC_002945.4:2239061","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_00294
5.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:2283371","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:2333994","NC_002945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2884747","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:3045025","NC_002945.4:3053649","NC_002945.4:3053756","NC_002
945.4:3063074","NC_002945.4:3068041","NC_002945.4:3069493","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_002945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:3310831","NC_002945.4:3319244","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:3413486","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464485","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3553753","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:3630061","NC_002945.4:3645682","NC_002945.4:3655045","NC_002945.4:3667823","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3839650","NC_002945.4:3846859","NC_0
02945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4070056","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4245762","NC_002945.4:4251588","NC_002945.4:4264139","NC_002945.4:4278315","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4293932","NC_002945.4:4298964","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:228109","NC_002945.4:331051","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:960995","NC_002945.4:997676","NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1723583","NC_002945.4:1961826","NC_002945.4:3373966","NC_002945.4:3941254","NC_002945.4:4236320","NC_002945.4:1277988","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1806623","NC_002945.4:1911237","NC_002945.4:3942270"],"data":[60,60,60,59,60,59,60,59,59,60,60,59,60,59,60,60,59,59,60,60,60,60,59,59,59,59,60,59,60,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,59,59,57,57,57,57,57,58,59,60,60,60,59,59,60,59,59,60,59,60,60,59,60,60,59,59,59,59,60,60,59,59,59,59,60,60,60,60,60,59,60,59,60,60,60,60,59,59,60,59,59,59,59,59,59,59,60,60,59,58,60,59,60,59,59,59,59,59,60,59,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,60,59,59,55,60,60,6
0,59,59,60,59,60,60,52,55,56,59,59,59,60,59,60,60,59,59,60,60,59,59,60,59,59,59,60,59,60,60,59,59,56,56,59,60,59,58,60,59,59,60,59,59,59,59,59,60,59,58,57,57,60,59,60,60,59,60,59,60,59,59,59,60,59,59,60,60,60,59,60,60,60,60,59,60,59,59,60,60,59,59,59,60,60,59,60,59,60,60,59,59,59,59,60,60,59,59,59,59,59,59,59,59,59,60,60,59,59,60,60,60,59,60,59,59,60,60,59,59,59,59,59,60,60,60,59,60,59,59,59,59,59,59,60,60,60,60,60,60,60,60,59,60,59,60,60,60,59,59,60,59,60,60,60,60,59,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,60,59,59,59,60,59,59,60,59,59,60,60,60,60,59,60,60,60,59,60,59,59,60,60,60,60,59,60,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,60,60,60,60,59,59,59,60,60,60,60,58,60,59,60,59,59,60,60,60,59,59,59,59,59,59,60,59,59,60,60,60,59,60,60,59,60,59,60,60,60,60,60,60,60,59,60,59,59,59,59,59,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,59,59,59,60,59,59,59,60,59,59,60,60,59,60,59,60,60,60,59,59,59,60,60,60,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,59,59,60,56,60,60,59,60,60,60]}
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_cascade_table.xlsx
Binary file test-data/Mbovis-01D6_cascade_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_snps.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01D6_snps.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]}
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_snps.newick
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01D6_snps.newick Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc);
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_sort_table.xlsx
Binary file test-data/Mbovis-01D6_sort_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_avg_mq.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01D_avg_mq.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"name":null,"index":["NC_002945.4:1057","NC_002945.4:4480","NC_002945.4:8741","NC_002945.4:29061","NC_002945.4:33788","NC_002945.4:41228","NC_002945.4:41437","NC_002945.4:50470","NC_002945.4:59861","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:70438","NC_002945.4:79918","NC_002945.4:96244","NC_002945.4:110198","NC_002945.4:114965","NC_002945.4:117800","NC_002945.4:127447","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:140686","NC_002945.4:143799","NC_002945.4:144992","NC_002945.4:148871","NC_002945.4:159370","NC_002945.4:160535","NC_002945.4:165799","NC_002945.4:166696","NC_002945.4:179885","NC_002945.4:189083","NC_002945.4:192177","NC_002945.4:198890","NC_002945.4:223919","NC_002945.4:230661","NC_002945.4:232188","NC_002945.4:295519","NC_002945.4:299636","NC_002945.4:304339","NC_002945.4:319911","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:362818","NC_002945.4:364560","NC_002945.4:364804","NC_002945.4:366022","NC_002945.4:407246","NC_002945.4:430077","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:479644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:641896","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:673880","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102"
,"NC_002945.4:745507","NC_002945.4:760347","NC_002945.4:792617","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:839308","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:998183","NC_002945.4:998196","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1096802","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:1186381","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1213847","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:1295457","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:1442194","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.
4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_002945.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:1659174","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:1799442","NC_002945.4:1803035","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1974665","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2204661","NC_002945.4:2210027","NC_002945.4:2239061","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_00294
5.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:2283371","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:2333994","NC_002945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2884747","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:3045025","NC_002945.4:3053649","NC_002945.4:3053756","NC_002
945.4:3063074","NC_002945.4:3068041","NC_002945.4:3069493","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_002945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:3310831","NC_002945.4:3319244","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:3413486","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464485","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3553753","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:3630061","NC_002945.4:3645682","NC_002945.4:3655045","NC_002945.4:3667823","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3839650","NC_002945.4:3846859","NC_0
02945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4070056","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4245762","NC_002945.4:4251588","NC_002945.4:4264139","NC_002945.4:4278315","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4293932","NC_002945.4:4298964","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:228109","NC_002945.4:331051","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:960995","NC_002945.4:997676","NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1723583","NC_002945.4:1961826","NC_002945.4:3373966","NC_002945.4:3941254","NC_002945.4:4236320","NC_002945.4:1277988","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1806623","NC_002945.4:1911237","NC_002945.4:3942270"],"data":[60,60,60,59,60,59,60,59,59,60,60,59,60,59,60,60,59,59,60,60,60,60,59,59,59,59,60,59,60,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,59,59,57,57,57,57,57,58,59,60,60,60,59,59,60,59,59,60,59,60,60,59,60,60,59,59,59,59,60,60,59,59,59,59,60,60,60,60,60,59,60,59,60,60,60,60,59,59,60,59,59,59,59,59,59,59,60,60,59,58,60,59,60,59,59,59,59,59,60,59,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,60,59,59,55,60,60,6
0,59,59,60,59,60,60,52,55,56,59,59,59,60,59,60,60,59,59,60,60,59,59,60,59,59,59,60,59,60,60,59,59,56,56,59,60,59,58,60,59,59,60,59,59,59,59,59,60,59,58,57,57,60,59,60,60,59,60,59,60,59,59,59,60,59,59,60,60,60,59,60,60,60,60,59,60,59,59,60,60,59,59,59,60,60,59,60,59,60,60,59,59,59,59,60,60,59,59,59,59,59,59,59,59,59,60,60,59,59,60,60,60,59,60,59,59,60,60,59,59,59,59,59,60,60,60,59,60,59,59,59,59,59,59,60,60,60,60,60,60,60,60,59,60,59,60,60,60,59,59,60,59,60,60,60,60,59,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,60,59,59,59,60,59,59,60,59,59,60,60,60,60,59,60,60,60,59,60,59,59,60,60,60,60,59,60,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,60,60,60,60,59,59,59,60,60,60,60,58,60,59,60,59,59,60,60,60,59,59,59,59,59,59,60,59,59,60,60,60,59,60,60,59,60,59,60,60,60,60,60,60,60,59,60,59,59,59,59,59,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,59,59,59,60,59,59,59,60,59,59,60,60,59,60,59,60,60,60,59,59,59,60,60,60,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,59,59,60,56,60,60,59,60,60,60]}
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_cascade_table.xlsx
Binary file test-data/Mbovis-01D_cascade_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_snps.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01D_snps.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]}
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_snps.newick
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01D_snps.newick Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc);
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_sort_table.xlsx
Binary file test-data/Mbovis-01D_sort_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_avg_mq.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01_avg_mq.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"name":null,"index":["NC_002945.4:1057","NC_002945.4:4480","NC_002945.4:8741","NC_002945.4:29061","NC_002945.4:33788","NC_002945.4:41228","NC_002945.4:41437","NC_002945.4:50470","NC_002945.4:59861","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:70438","NC_002945.4:75274","NC_002945.4:79918","NC_002945.4:96244","NC_002945.4:110198","NC_002945.4:114965","NC_002945.4:117800","NC_002945.4:127447","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:140686","NC_002945.4:143799","NC_002945.4:144992","NC_002945.4:148871","NC_002945.4:159370","NC_002945.4:160535","NC_002945.4:165799","NC_002945.4:166696","NC_002945.4:179885","NC_002945.4:189083","NC_002945.4:192177","NC_002945.4:198890","NC_002945.4:223919","NC_002945.4:230661","NC_002945.4:232188","NC_002945.4:249090","NC_002945.4:295519","NC_002945.4:299636","NC_002945.4:304339","NC_002945.4:319911","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:364560","NC_002945.4:364804","NC_002945.4:366022","NC_002945.4:407246","NC_002945.4:430077","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:479644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:642172","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:707522","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102","NC_002945.4:745507","NC_002945.4:760347",
"NC_002945.4:792617","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:998183","NC_002945.4:998196","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1096802","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:1186381","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:1295457","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:1442194","NC_002945.4:1462755","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_00294
5.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:1659174","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1723479","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:1796727","NC_002945.4:1803035","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1967341","NC_002945.4:1974665","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2051968","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2178975","NC_002945.4:2204661","NC_002945.4:2239061","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_002945.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:2283371","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:2333994","NC_002
945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2429853","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:3043695","NC_002945.4:3045025","NC_002945.4:3053649","NC_002945.4:3053756","NC_002945.4:3063074","NC_002945.4:3068041","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_0
02945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3252431","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:3309513","NC_002945.4:3310831","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3488828","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:3630061","NC_002945.4:3645682","NC_002945.4:3655045","NC_002945.4:3667823","NC_002945.4:3672841","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3825329","NC_002945.4:3839650","NC_002945.4:3846859","NC_002945.4:3872596","NC_002945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC
_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:3993571","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4017949","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4052766","NC_002945.4:4070056","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4245762","NC_002945.4:4264139","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4298964","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:362818","NC_002945.4:641896","NC_002945.4:673880","NC_002945.4:839308","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1213847","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1799442","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2210027","NC_002945.4:2413021","NC_002945.4:2884747","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3413486","NC_002945.4:3464485","NC_002945.4:3553753","NC_002945.4:4251588","NC_002945.4:4278315","NC_002945.4:4293932","NC_002945.4:228109","NC_002945.4:331051","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:960995","NC_002945.4:997676","NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1723583","NC_002945.4:1961826","NC_002945.4:3373966","NC_002945.4:3941254","NC_002945.4:4236320","NC_002945.4:1277988","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1806623","NC_002945.4:1
911237","NC_002945.4:3942270"],"data":[60,60,60,59,60,59,60,59,59,60,60,59,60,60,59,60,60,59,59,60,60,60,60,59,59,59,59,60,59,60,59,60,60,60,60,60,60,59,60,60,59,59,59,59,59,59,57,57,57,57,57,59,60,60,60,59,59,60,59,59,60,59,60,60,59,60,60,59,59,59,59,60,60,59,59,59,59,60,60,60,60,59,59,59,60,60,60,60,60,59,59,60,59,59,59,59,59,59,59,60,60,59,58,59,60,59,59,59,59,59,60,59,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,60,59,59,55,60,60,60,59,59,60,59,60,60,59,59,59,60,59,60,59,59,60,60,59,59,60,59,59,59,60,59,60,60,59,59,56,56,59,60,60,59,58,60,59,59,60,59,59,59,59,59,60,59,58,57,56,60,59,60,60,59,60,59,60,59,59,59,60,59,59,59,60,60,60,60,60,60,59,60,60,59,60,60,59,59,59,60,60,59,60,59,60,60,59,59,59,59,60,60,60,59,59,59,59,59,59,59,60,60,59,60,59,60,60,60,59,60,59,59,60,60,59,59,59,59,60,59,60,60,59,60,59,59,59,59,59,59,60,60,60,60,60,60,60,60,59,60,59,60,60,60,59,59,60,59,60,60,60,59,60,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,59,59,59,60,59,59,60,59,59,60,60,60,60,59,60,60,60,59,60,59,59,60,60,60,60,59,59,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,60,57,60,60,60,59,59,59,60,60,60,60,58,60,59,60,59,59,60,60,59,59,59,59,59,59,59,59,60,60,60,59,60,60,60,59,60,59,60,60,60,60,60,60,59,60,59,59,59,59,59,60,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,60,59,59,60,59,59,59,59,59,60,60,59,60,59,60,60,60,60,59,60,60,60,59,59,59,60,60,59,59,59,60,59,59,59,60,59,59,60,60,59,60,59,60,60,60,59,59,60,60,59,59,59,59,60,59,59,59,59,59,58,60,60,60,52,55,56,60,59,60,59,59,59,60,60,60,60,60,60,60,60,59,60,60,59,60,60,60,59,59,60,60,60,59,59,59,60,56,60,60,59,60,60,60]}
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_cascade_table.xlsx
Binary file test-data/Mbovis-01_cascade_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_snps.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01_snps.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]}
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_snps.newick
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/Mbovis-01_snps.newick Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc);
diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_sort_table.xlsx
Binary file test-data/Mbovis-01_sort_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/Mcap_Deer_DE_SRR650221.fastq.gz
Binary file test-data/Mcap_Deer_DE_SRR650221.fastq.gz has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/NC_002945v4.fasta
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/NC_002945v4.fasta Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,101 @@
+>NC_002945.4 Mycobacterium bovis AF2122/97 genome assembly, chromosome: Mycobacterium_bovis_AF2122/97
+TTGACCGATGACCCCGGTTCAGGCTTCACCACAGTGTGGAACGCGGTCGTCTCCGAACTTAACGGCGACC
+CTAAGGTTGACGACGGACCCAGCAGTGATGCTAATCTCAGCGCTCCGCTGACCCCTCAGCAAAGGGCTTG
+GCTCAATCTCGTCCAGCCATTGACCATCGTCGAGGGGTTTGCTCTGTTATCCGTGCCGAGCAGCTTTGTC
+CAAAACGAAATCGAGCGCCATCTGCGGGCCCCGATTACCGACGCTCTCAGCCGCCGACTCGGACATCAGA
+TCCAACTCGGGGTCCGCATCGCTCCGCCGGCGACCGACGAAGCCGACGACACTACCGTGCCGCCTTCCGA
+AAATCCTGCTACCACATCGCCAGACACCACAACCGACAACGACGAGATTGATGACAGCGCTGCGGCACGG
+GGCGATAACCAGCACAGTTGGCCAAGTTACTTCACCGAGCGCCCGCGCAATACCGATTCCGCTACCGCTG
+GCGTAACCAGCCTTAACCGTCGCTACACCTTTGATACGTTCGTTATCGGCGCCTCCAACCGGTTCGCGCA
+CGCCGCCGCCTTGGCGATCGCAGAAGCACCCGCCCGCGCTTACAACCCCCTGTTCATCTGGGGCGAGTCC
+GGTCTCGGCAAGACACACCTGCTACACGCGGCAGGCAACTATGCCCAACGGTTGTTCCCGGGAATGCGGG
+TCAAATATGTCTCCACCGAGGAATTCACCAACGACTTCATTAACTCGCTCCGCGATGACCGCAAGGTCGC
+ATTCAAACGCAGCTACCGCGACGTAGACGTGCTGTTGGTCGACGACATCCAATTCATTGAAGGCAAAGAG
+GGTATTCAAGAGGAGTTCTTCCACACCTTCAACACCTTGCACAATGCCAACAAGCAAATCGTCATCTCAT
+CTGACCGCCCACCCAAGCAGCTCGCCACCCTCGAGGACCGGCTGAGAACCCGCTTTGAGTGGGGGCTGAT
+CACTGACGTACAACCACCCGAGCTGGAGACCCGCATCGCCATCTTGCGCAAGAAAGCACAGATGGAACGG
+CTCGCGATCCCCGACGATGTCCTCGAACTCATCGCCAGCAGTATCGAACGCAATATCCGTGAACTCGAGG
+GCGCGCTGATCCGGGTCACCGCGTTCGCCTCATTGAACAAAACACCAATCGACAAAGCGCTGGCCGAGAT
+TGTGCTTCGCGATCTGATCGCCGACGCCAACACCATGCAAATCAGCGCGGCGACGATCATGGCTGCCACC
+GCCGAATACTTCGACACTACCGTCGAAGAGCTTCGCGGGCCCGGCAAGACCCGAGCACTGGCCCAGTCAC
+GACAGATTGCGATGTACCTGTGTCGTGAGCTCACCGATCTTTCGTTGCCCAAAATCGGCCAAGCGTTCGG
+CCGTGATCACACAACCGTCATGTACGCCCAACGCAAGATCCTGTCCGAGATGGCCGAGCGCCGTGAGGTC
+TTTGATCACGTCAAAGAACTCACCACTCGCATCCGTCAGCGCTCCAAGCGCTAGCACGGCGTGTTCTTCC
+GACAACGTTCTTAAAAAAACTTCTCTCTCCCAGGTCACACCAGTCACAGAGATTGGCTGTGAGTGTCGCT
+GTGCACAAACCGCGCACAGACTCATACAGTCCCGGCGGTTCCGTTCACAACCCACGCCTCATCCCCACCG
+ACCCAACACACACCCCACAGTCATCGCCACCGTCATCCACAACTCCGACCGACGTCGACCTGCACCAAGA
+CCAGACTGTCCCCAAACTGCACACCCTCTAATACTGTTACCGAGATTTCTTCGTCGTTTGTTCTTGGAAA
+GACAGCGCTGGGGATCGTTCGCTGGATACCACCCGCATAACTGGCTCGTCGCGGTGGGTCAGAGGTCAAT
+GATGAACTTTCAAGTTGACGTGAGAAGCTCTACGGTTGTTGTTCGACTGCTGTTGCGGCCGTCGTGGCGG
+GTCACGCGTCATGGGCGTTCGTCGTTGGCAGTCCCCACGCTAGCGGGGCGCTAGCCACGGGATCGAACTC
+ATCGTGAGGTGAAAGGGCGCAATGGACGCGGCTACGACAAGAGTTGGCCTCACCGACTTGACGTTTCGTT
+TGCTACGAGAGTCTTTCGCCGATGCGGTGTCGTGGGTGGCTAAAAATCTGCCAGCCAGGCCCGCGGTGCC
+GGTGCTCTCCGGCGTGTTGTTGACCGGCTCGGACAACGGTCTGACGATTTCCGGATTCGACTACGAGGTT
+TCCGCCGAGGCCCAGGTTGGCGCTGAAATTGTTTCTCCTGGAAGCGTTTTAGTTTCTGGCCGATTGTTGT
+CCGATATTACCCGGGCGTTGCCTAACAAGCCCGTAGGCGTTCATGTCGAAGGTAACCGGGTCGCATTGAC
+CTGCGGTAACGCCAGGTTTTCGCTACCGACGATGCCAGTCGAGGATTATCCGACGCTGCCGACGCTGCCG
+GAAGAGACCGGATTGTTGCCTGCGGAATTATTCGCCGAGGCAATCAGTCAGGTCGCTATCGCCGCCGGCC
+GGGACGACACGCTGCCTATGTTGACCGGCATCCGGGTCGAAATCCTCGGTGAGACGGTGGTTTTGGCCGC
+TACCGACAGGTTTCGCCTGGCTGTTCGAGAACTGAAGTGGTCGGCGTCGTCGCCAGATATCGAAGCGGCT
+GTGCTGGTCCCGGCCAAGACGCTGGCCGAGGCCGCCAAAGCGGGCATCGGCGGCTCTGACGTTCGTTTGT
+CGTTGGGTACTGGGCCGGGGGTGGGCAAGGATGGCCTGCTCGGTATCAGTGGGAACGGCAAGCGCAGCAC
+CACGCGACTTCTTGATGCCGAGTTCCCGAAGTTTCGGCAGTTGCTACCAACCGAACACACCGCGGTGGCC
+ACCATGGACGTGGCCGAGTTGATCGAAGCGATCAAGCTGGTTGCGTTGGTAGCTGATCGGGGCGCGCAGG
+TGCGCATGGAGTTCGCTGATGGCAGCGTGCGGCTTTCTGCGGGTGCCGATGATGTTGGACGAGCCGAGGA
+AGATCTTGTTGTTGACTATGCCGGTGAACCATTGACGATTGCGTTTAACCCAACCTATCTAACGGACGGT
+TTGAGTTCGTTGCGCTCGGAGCGAGTGTCTTTCGGGTTTACGACTGCGGGTAAGCCTGCCTTGCTACGTC
+CGGTGTCCGGGGACGATCGCCCTGTGGCGGGTCTGAATGGCAACGGTCCGTTCCCGGCGGTGTCGACGGA
+CTATGTCTATCTGTTGATGCCGGTTCGGTTGCCGGGCTGAGCACTTGGCGCCCGGGTAGGTGTACGTCCG
+TCATTTGGGGCTGCGTGACTTCCGGTCCTGGGCATGTGTAGATCTGGAATTGCATCCAGGGCGGACGGTT
+TTTGTTGGGCCTAACGGTTATGGTAAGACGAATCTTATTGAGGCACTGTGGTATTCGACGACGTTAGGTT
+CGCACCGCGTTAGCGCCGATTTGCCGTTGATCCGGGTAGGTACCGATCGTGCGGTGATCTCCACGATCGT
+GGTGAACGACGGTAGAGAATGTGCCGTCGACCTCGAGATCGCCACGGGGCGAGTCAACAAAGCGCGATTG
+AATCGATCATCGGTCCGAAGTACACGTGATGTGGTCGGAGTGCTTCGAGCTGTGTTGTTTGCCCCTGAGG
+ATCTGGGGTTGGTTCGTGGGGATCCCGCTGACCGGCGGCGCTATCTGGATGATCTGGCGATCGTGCGTAG
+GCCTGCGATCGCTGCGGTACGAGCCGAATATGAGAGGGTGGTGCGCCAGCGGACGGCGTTATTGAAGTCC
+GTACCTGGAGCACGGTATCGGGGTGACCGGGGTGTGTTTGACACTCTTGAGGTATGGGACAGTCGTTTGG
+CGGAGCACGGGGCTGAACTGGTGGCCGCCCGCATCGATTTGGTCAACCAGTTGGCACCGGAAGTGAAGAA
+GGCATACCAGCTGTTGGCGCCGGAATCGCGATCGGCGTCTATCGGTTATCGGGCCAGCATGGATGTAACC
+GGTCCCAGCGAGCAGTCAGATACCGATCGGCAATTGTTAGCAGCTCGGCTGTTGGCGGCGCTGGCGGCCC
+GTCGGGATGCCGAACTCGAGCGTGGGGTTTGTCTAGTTGGTCCGCACCGTGACGACCTAATACTGCGACT
+AGGCGATCAACCCGCGAAAGGATTTGCTAGCCATGGGGAGGCGTGGTCGTTGGCGGTGGCACTGCGGTTG
+GCGGCCTATCAACTGTTACGCGTTGATGGTGGTGAGCCGGTGTTGTTGCTCGACGACGTGTTCGCCGAAC
+TGGATGTCATGCGCCGTCGAGCGTTGGCGACGGCGGCCGAGTCCGCCGAACAGGTGTTGGTGACTGCCGC
+GGTGCTCGAGGATATTCCCGCCGGCTGGGACGCCAGGCGGGTGCACATCGATGTGCGTGCCGATGACACC
+GGATCGATGTCGGTGGTTCTGCCATGACGGGTTCTGTTGACCGGCCCGACCAGAATCGCGGTGAGCGATT
+AATGAAGTCACCAGGGTTGGATTTGGTCAGGCGCACCCTGGACGAAGCTCGTGCTGCTGCCCGCGCGCGC
+GGACAAGACGCCGGTCGAGGGCGGGTCGCTTCCGTTGCGTCGGGTCGGGTGGCCGGACGGCGACGAAGCT
+GGTCGGGTCCGGGGCCCGACATTCGTGATCCACAACCGCTGGGTAAGGCCGCTCGTGAGCTGGCAAAGAA
+ACGCGGCTGGTCGGTGCGGGTCGCCGAGGGTATGGTGCTCGGCCAGTGGTCTGCGGTGGTCGGCCACCAG
+ATCGCCGAACATGCACGCCCGACTGCGCTAAACGACGGGGTGTTGAGCGTGATTGCGGAGTCGACGGCGT
+GGGCGACGCAGTTGAGGATCATGCAGGCCCAGCTTCTGGCCAAGATCGCCGCAGCGGTTGGCAACGATGT
+GGTGCGATCGCTAAAGATCACCGGGCCGGCGGCACCATCGTGGCGCAAGGGGCCTCGCCATATTGCCGGT
+AGGGGTCCGCGCGACACCTACGGATAACACGTCGATCGGCCCAGAACAAGGCGCTCCGGTCCCGGCCTGA
+GAGCCTCGAGGACGAAGCGGATCCGTATGCCGGACGTCGGGACGCACCAGGAAGAAAGATGTCCGACGCA
+CGGCGCGGTTAGATGGGTAAAAACGAGGCCAGAAGATCGGCCCTGGCGCCCGATCACGGTACAGTGGTGT
+GCGACCCCCTGCGGCGACTCAACCGCATGCACGCAACCCCTGAGGAGAGTATTCGGATCGTGGCTGCCCA
+GAAAAAGAAGGCCCAAGACGAATACGGCGCTGCGTCTATCACCATTCTCGAAGGGCTGGAGGCCGTCCGC
+AAACGTCCCGGCATGTACATTGGCTCGACCGGTGAGCGCGGTTTACACCATCTCATTTGGGAGGTGGTCG
+ACAACGCGGTCGACGAGGCGATGGCCGGTTATGCAACCACAGTGAACGTAGTGCTGCTTGAGGATGGCGG
+TGTCGAGGTCGCCGACGACGGCCGCGGCATTCCGGTCGCCACCCACGCCTCCGGCATACCGACCGTCGAC
+GTGGTGATGACACAACTACATGCCGGCGGCAAGTTCGACTCGGACGCGTATGCGATATCTGGTGGTCTGC
+ACGGCGTCGGCGTGTCGGTGGTTAACGCGCTATCCACCCGGCTCGAAGTCGAGATCAAGCGCGACGGGTA
+CGAGTGGTCTCAGGTTTATGAGAAGTCGGAACCCCTGGGCCTCAAGCAAGGGGCGCCGACCAAGAAGACG
+GGGTCAACGGTACGGTTCTGGGCCGACCCCGCTGTTTTCGAAACCACGGAATACGACTTCGAAACCGTCG
+CCCGCCGGCTGCAAGAGATGGCGTTCCTCAACAAGGGGCTGACCATCAACCTGACCGACGAGAGGGTGAC
+CCAAGACGAGGTCGTCGACGAAGTGGTCAGCGACGTCGCCGAGGCGCCGAAGTCGGCAAGTGAACGCGCA
+GCCGAATCCACTGCACCGCACAAAGTTAAGAGCCGCACCTTTCACTATCCGGGTGGCCTGGTGGACTTCG
+TGAAACACATCAACCGCACCAAGAACGCGATTCATAGCAGCATCGTGGACTTTTCCGGCAAGGGCACCGG
+GCACGAGGTGGAGATCGCGATGCAATGGAACGCCGGGTATTCGGAGTCGGTGCACACCTTCGCCAACACC
+ATCAACACCCACGAGGGCGGCACCCACGAAGAGGGCTTCCGCAGCGCGCTGACGTCGGTGGTGAACAAGT
+ACGCCAAGGACCGCAAGCTACTGAAGGACAAGGACCCCAACCTCACCGGTGACGATATCCGGGAAGGCCT
+GGCCGCTGTGATCTCGGTGAAGGTCAGCGAACCGCAGTTCGAGGGCCAGACCAAGACCAAGTTGGGCAAC
+ACCGAGGTCAAATCGTTTGTGCAGAAGGTCTGTAATGAACAGCTGACCCACTGGTTTGAAGCCAACCCCA
+CCGACTCGAAAGTCGTTGTGAACAAGGCTGTGTCCTCGGCGCAAGCCCGTATCGCGGCACGTAAGGCACG
+AGAGTTGGTGCGGCGTAAGAGCGCCACCGACATCGGTGGATTGCCCGGCAAGCTGGCCGATTGCCGTTCC
+ACGGATCCGCGCAAGTCCGAACTGTATGTCGTAGAAGGTGACTCGGCCGGCGGTTCTGCAAAAAGCGGTC
+GCGATTCGATGTTCCAGGCGATACTTCCGCTGCGCGGCAAGATCATCAATGTGGAGAAAGCGCGCATCGA
+CCGGGTGCTAAAGAACACCGAAGTTCAGGCGATCATCACGGCGCTGGGCACCGGGATCCACGACGAGTTC
+GATATCGGCAAGCTGCGCTACCACAAGATCGTGCTGATGGCCGACGCCGATGTTGACGGCCAACATATTT
+CCACGCTGTTGTTGACGTTGTTGTTCCGGTTCATGCGGCCGCTCATCGAGAACGGGCATGTGTTTTTGGC
+ACAACCGCCGCTGTACAAACTCAAGTGGCAGCGCAGTGACCCGGAATTCGCATACTCCGACCGCGAGCGC
diff -r 000000000000 -r 12f2b14549f6 test-data/NC_002945v4.yml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/NC_002945v4.yml Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,5 @@
+bovis:
+ - '11001110'
+ - '11011110'
+ - '11001100'
+
diff -r 000000000000 -r 12f2b14549f6 test-data/bam_input.bam
Binary file test-data/bam_input.bam has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/cascade_table.xlsx
Binary file test-data/cascade_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/fasta_indexes.loc
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/fasta_indexes.loc Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+89 89 Mycobacterium_AF2122 ${__HERE__}/NC_002945v4.fasta
diff -r 000000000000 -r 12f2b14549f6 test-data/forward.fastq.gz
Binary file test-data/forward.fastq.gz has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/input_avg_mq_json.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/input_avg_mq_json.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"name":null,"index":["NC_002945.4:1005705","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1057","NC_002945.4:1096802","NC_002945.4:110198","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:114965","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:117800","NC_002945.4:1186381","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1213847","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:127447","NC_002945.4:1277988","NC_002945.4:1295457","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:140686","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:143799","NC_002945.4:1442194","NC_002945.4:144992","NC_002945.4:1463503","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:148871","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_002945.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:159370","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:160535","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4
:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:165799","NC_002945.4:1659174","NC_002945.4:166696","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1704859","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1723583","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:179885","NC_002945.4:1799442","NC_002945.4:1803035","NC_002945.4:1806623","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:189083","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1911237","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:192177","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1961826","NC_002945.4:1974665","NC_002945.4:198890","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2204661","NC_002945.4:2210027","NC_002945.4:2239061","NC_002945.4:223919","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:228109","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_002945.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:22833
71","NC_002945.4:230661","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:232188","NC_002945.4:2333994","NC_002945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2884747","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:29061","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:295519","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:299636","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:304339","NC_002945.4:3045025","
NC_002945.4:3053649","NC_002945.4:3053756","NC_002945.4:3063074","NC_002945.4:3068041","NC_002945.4:3069493","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_002945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:319911","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:331051","NC_002945.4:3310831","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:3319244","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3373966","NC_002945.4:33788","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:3413486","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464485","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3553753","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:36281
8","NC_002945.4:3630061","NC_002945.4:364560","NC_002945.4:3645682","NC_002945.4:364804","NC_002945.4:3655045","NC_002945.4:366022","NC_002945.4:3667823","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3839650","NC_002945.4:3846859","NC_002945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4070056","NC_002945.4:407246","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:41228","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:41437","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4236320","NC_002945.4:4245762","NC_002945.4:4251588","NC_002945.4:4264139","NC_002945.4:4278315","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4293932","NC_002945.4:4298964","NC_002945.4:430077","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:4480","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:4
79644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:50470","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:59861","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:641896","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:673880","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:70438","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102","NC_002945.4:745507","NC_002945.4:760347","NC_002945.4:792617","NC_002945.4:79918","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:839308","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:8741","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:960995","NC_002945.4:96244","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:997676","NC_002945.4:998183","NC_002945.4:998196"],"data":[60,60,59,60,59,59,60,55,60,60,60,60,59,59,60,60,59,60,59,60,52,55,56,59,59,59,60,59,60,60,59,59,60,60,59,59,59,56,60,60,60,59,59,59,60,59,60,60,60,60,60,59,59,56,56,59,60,60,59,60,59,58,60,59,59,59,60,59,59,59,59,59,60,59,58,57,57,60,59,60,60,59,59,60,59,60,59,59,59,59,60,59,59,60,60,59,60,60,59,59,60,60,60,60,60,59,60,60,59,59,60,60,60,59,59,59,60,59,60,59,60,60,59,60,60,60,59,59,59,59,60,59,60,60,59,59,59,59,59,5
9,59,59,59,60,60,59,59,60,60,60,59,60,59,59,60,60,59,59,59,59,59,60,60,60,60,59,60,59,59,59,59,59,59,59,60,60,60,60,60,60,60,60,60,59,60,60,59,60,60,60,59,59,60,59,60,60,60,60,59,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,60,59,59,59,60,59,59,59,60,59,59,60,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,60,60,60,59,60,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,59,60,60,60,60,59,59,59,60,60,60,60,60,60,59,59,59,59,59,59,59,59,60,58,60,59,60,59,60,59,59,57,57,57,57,57,60,60,60,59,59,59,59,59,59,60,59,59,60,60,60,59,60,60,59,60,59,60,60,60,60,60,60,60,59,60,59,58,59,59,59,60,59,60,59,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,60,59,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,60,59,59,59,59,60,59,60,59,59,60,59,59,60,60,59,60,59,60,60,60,59,60,59,59,60,60,60,59,60,59,59,59,59,60,59,59,59,60,60,59,59,60,59,60,60,59,60,60,59,59,59,59,59,60,60,59,59,59,59,59,60,60,60,60,60,59,60,59,60,60,60,60,60,59,60,59,59,60,59,59,59,59,60,59,59,59,60,60,59,58,60,59,60,59,59,60,59,59,59,60,59,59,60,59,59,59,60,59,59,59,59,59,60,60,60,59,59,59,60,60]}
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/input_newick.newick
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/input_newick.newick Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc);
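The test tree above carries leaf names only, with no branch lengths or internal node labels. A minimal sketch, assuming a single-line Newick string of that shape, of how the leaf names could be flattened into a sample order with `root` first. The helper name `newick_leaf_order` is hypothetical and is not part of this changeset.

```python
import re


def newick_leaf_order(newick_text):
    """Return leaf names in tree order, with 'root' moved to the front."""
    # Drop optional ":<branch length>" parts, then parentheses and the semicolon.
    cleaned = re.sub(r":[0-9.eE+-]+", "", newick_text.strip())
    cleaned = re.sub(r"[();]", "", cleaned)
    names = [name.strip() for name in cleaned.split(",") if name.strip()]
    # Keep 'root' as the first entry so it can serve as the reference row.
    return ["root"] + [name for name in names if name != "root"]


tree = "(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc);"
sample_order = newick_leaf_order(tree)
```

Internal node labels are not handled here; a tree that uses them would need a real Newick parser.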
diff -r 000000000000 -r 12f2b14549f6 test-data/input_snps_json.json
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/input_snps_json.json Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]}
\ No newline at end of file
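The SNP table above uses the `split` layout (`columns`, `index`, `data`). A hedged sketch, using only the standard library and a shortened excerpt of that data, of reading the layout into per-sample rows and listing the positions where a sample differs from `root`. The names `rows_from_split` and `differs_from_root` are illustrative and do not appear in this changeset.

```python
import json


def rows_from_split(text):
    """Map each index label to a {column: value} dict."""
    table = json.loads(text)
    return {
        label: dict(zip(table["columns"], row))
        for label, row in zip(table["index"], table["data"])
    }


def differs_from_root(rows, sample):
    """Return the columns where `sample` does not match the 'root' row."""
    root = rows["root"]
    return [col for col, call in rows[sample].items() if call != root[col]]


# Shortened excerpt: first three positions, two samples plus root.
snps_json = json.dumps({
    "columns": ["NC_002945.4:1005705", "NC_002945.4:1348342", "NC_002945.4:1382465"],
    "index": ["SRR1792265_zc", "SRR1791772_zc", "root"],
    "data": [["C", "G", "G"], ["G", "A", "C"], ["C", "G", "G"]],
})
rows = rows_from_split(snps_json)
```

In this excerpt `SRR1792265_zc` matches `root` at all three positions, while `SRR1791772_zc` differs at each of them.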
diff -r 000000000000 -r 12f2b14549f6 test-data/output_dbkey.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/output_dbkey.txt Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+AF2122
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/output_metrics.tabular
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/output_metrics.tabular Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,2 @@
+# File Number of Good SNPs Average Coverage Genome Coverage
+ 0
diff -r 000000000000 -r 12f2b14549f6 test-data/output_metrics.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/output_metrics.txt Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,6 @@
+Sample: Mcap_Deer_DE_SRR650221
+Brucella counts: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
+TB counts: 2,2,0,0,4,5,0,0,
+Para counts: 0,0,0,
+Group: TB
+dbkey: AF2122
diff -r 000000000000 -r 12f2b14549f6 test-data/output_vcf.vcf
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/output_vcf.vcf Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,100 @@
+##fileformat=VCFv4.2
+##fileDate=20200302
+##source=freeBayes v1.3.1-dirty
+##reference=/home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa
+##contig=
+##phasing=none
+##commandline="freebayes --region NC_002945.4:0..4349904 --bam b_0.bam --fasta-reference /home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa --vcf ./vcf_output/part_NC_002945.4:0..4349904.vcf -u -n 0 --haplotype-length -1 --min-repeat-size 5 --min-repeat-entropy 1 -m 1 -q 0 -R 0 -Y 0 -e 1 -F 0.05 -C 2 -G 1 --min-alternate-qsum 0"
+##filter="QUAL > 0"
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 13-1941-6
+NC_002945.4 1 . N . . . . GT ./.
+NC_002945.4 2 . N . . . . GT ./.
+NC_002945.4 3 . N . . . . GT ./.
+NC_002945.4 4 . N . . . . GT ./.
+NC_002945.4 5 . N . . . . GT ./.
+NC_002945.4 6 . N . . . . GT ./.
+NC_002945.4 7 . N . . . . GT ./.
+NC_002945.4 8 . N . . . . GT ./.
+NC_002945.4 9 . N . . . . GT ./.
+NC_002945.4 10 . N . . . . GT ./.
+NC_002945.4 11 . N . . . . GT ./.
+NC_002945.4 12 . N . . . . GT ./.
+NC_002945.4 13 . N . . . . GT ./.
+NC_002945.4 14 . N . . . . GT ./.
+NC_002945.4 15 . N . . . . GT ./.
+NC_002945.4 16 . N . . . . GT ./.
+NC_002945.4 17 . N . . . . GT ./.
+NC_002945.4 18 . N . . . . GT ./.
+NC_002945.4 19 . N . . . . GT ./.
+NC_002945.4 20 . N . . . . GT ./.
+NC_002945.4 21 . N . . . . GT ./.
+NC_002945.4 22 . N . . . . GT ./.
+NC_002945.4 23 . N . . . . GT ./.
+NC_002945.4 24 . N . . . . GT ./.
+NC_002945.4 25 . N . . . . GT ./.
+NC_002945.4 26 . N . . . . GT ./.
+NC_002945.4 27 . N . . . . GT ./.
+NC_002945.4 28 . N . . . . GT ./.
+NC_002945.4 29 . N . . . . GT ./.
+NC_002945.4 30 . N . . . . GT ./.
+NC_002945.4 31 . N . . . . GT ./.
+NC_002945.4 32 . N . . . . GT ./.
+NC_002945.4 33 . N . . . . GT ./.
+NC_002945.4 34 . N . . . . GT ./.
+NC_002945.4 35 . N . . . . GT ./.
+NC_002945.4 36 . N . . . . GT ./.
+NC_002945.4 37 . N . . . . GT ./.
diff -r 000000000000 -r 12f2b14549f6 test-data/paired_dbkey.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/paired_dbkey.txt Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,1 @@
+AF2122
\ No newline at end of file
diff -r 000000000000 -r 12f2b14549f6 test-data/paired_metrics.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/paired_metrics.txt Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,6 @@
+Sample: forward
+Brucella counts: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
+TB counts: 4,4,0,0,8,10,0,0,
+Para counts: 0,0,0,
+Group: TB
+dbkey: AF2122
diff -r 000000000000 -r 12f2b14549f6 test-data/reverse.fastq.gz
Binary file test-data/reverse.fastq.gz has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/sort_table.xlsx
Binary file test-data/sort_table.xlsx has changed
diff -r 000000000000 -r 12f2b14549f6 test-data/vcf_input.vcf
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/vcf_input.vcf Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,64 @@
+##fileformat=VCFv4.2
+##fileDate=20200302
+##source=freeBayes v1.3.1-dirty
+##reference=/home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa
+##contig=
+##phasing=none
+##commandline="freebayes --region NC_002945.4:0..4349904 --bam b_0.bam --fasta-reference /home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa --vcf ./vcf_output/part_NC_002945.4:0..4349904.vcf -u -n 0 --haplotype-length -1 --min-repeat-size 5 --min-repeat-entropy 1 -m 1 -q 0 -R 0 -Y 0 -e 1 -F 0.05 -C 2 -G 1 --min-alternate-qsum 0"
+##filter="QUAL > 0"
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 13-1941-6
+NC_002945.4 2898437 . T G 0.263449 . AB=0;ABP=0;AC=0;AF=0;AN=2;AO=2;CIGAR=1X;DP=2;DPB=2;DPRA=0;EPP=3.0103;EPPR=0;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=2.77259;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=0;QR=0;RO=0;RPL=2;RPP=7.35324;RPPR=0;RPR=0;RUN=1;SAF=1;SAP=3.0103;SAR=1;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.ILLUMINA=1 GT:DP:AD:RO:QR:AO:QA:GL 0/0:2:0,2:0:0:2:0:0,-0.60206,-8.68589e-09
diff -r 000000000000 -r 12f2b14549f6 test-data/vsnp_dnaprints.loc
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/vsnp_dnaprints.loc Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,4 @@
+## vSNP DNAprints files
+#Value Name Path Description
+AF2122 Mycobacterium_AF2122/NC_002945v4.yml ${__HERE__}/NC_002945v4.yml DNAprints file for Mycobacterium bovis AF2122/97
+#NC_006932 Brucella_abortus1/NC_006932-NC_006933.yml /vsnp/NC_006932/Brucella_abortus1/NC_006932-NC_006933.yml DNAprints file for Brucella abortus bv. 1 str. 9-941
diff -r 000000000000 -r 12f2b14549f6 tool-data/fasta_indexes.loc.sample
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool-data/fasta_indexes.loc.sample Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,29 @@
+#This is a sample file distributed with Galaxy that enables tools
+#to use a directory of Samtools indexed sequences data files. You will need
+#to create these data files and then create a fasta_indexes.loc file
+#similar to this one (store it in this directory) that points to
+#the directories in which those files are stored. The fasta_indexes.loc
+#file has this format (white space characters are TAB characters):
+#
+#
+#
+#So, for example, if you had hg19 Canonical indexed stored in
+#
+# /depot/data2/galaxy/hg19/sam/,
+#
+#then the fasta_indexes.loc entry would look like this:
+#
+#hg19canon hg19 Human (Homo sapiens): hg19 Canonical /depot/data2/galaxy/hg19/sam/hg19canon.fa
+#
+#and your /depot/data2/galaxy/hg19/sam/ directory
+#would contain hg19canon.fa and hg19canon.fa.fai files.
+#
+#Your fasta_indexes.loc file should include an entry per line for
+#each index set you have stored. The file in the path does actually
+#exist, but it should never be directly used. Instead, the name serves
+#as a prefix for the index file. For example:
+#
+#hg18canon hg18 Human (Homo sapiens): hg18 Canonical /depot/data2/galaxy/hg18/sam/hg18canon.fa
+#hg18full hg18 Human (Homo sapiens): hg18 Full /depot/data2/galaxy/hg18/sam/hg18full.fa
+#hg19canon hg19 Human (Homo sapiens): hg19 Canonical /depot/data2/galaxy/hg19/sam/hg19canon.fa
+#hg19full hg19 Human (Homo sapiens): hg19 Full /depot/data2/galaxy/hg19/sam/hg19full.fa
diff -r 000000000000 -r 12f2b14549f6 tool-data/vsnp_dnaprints.loc.sample
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool-data/vsnp_dnaprints.loc.sample Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,4 @@
+## vSNP DNAprints files
+#Value Name Path Description
+#AF2122 Mycobacterium_AF2122/NC_002945v4.yml /vsnp/AF2122/Mycobacterium_AF2122/NC_002945v4.yml DNAprints file for Mycobacterium bovis AF2122/97
+#NC_006932 Brucella_abortus1/NC_006932-NC_006933.yml /vsnp/NC_006932/Brucella_abortus1/NC_006932-NC_006933.yml DNAprints file for Brucella abortus bv. 1 str. 9-941
diff -r 000000000000 -r 12f2b14549f6 tool-data/vsnp_genbank.loc.sample
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool-data/vsnp_genbank.loc.sample Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,4 @@
+## vSNP Genbank files
+#Value Name Path Description
+#AF2122 Mycobacterium_AF2122/NC_002945v4.gbk vsnp/AF2122/Mycobacterium_AF2122/NC_002945v4.gbk Genbank file for Mycobacterium bovis AF2122/97
+#NC_006932 Brucella_abortus1/NC_006932-NC_006933.gbk vsnp/NC_006932/Brucella_abortus1/NC_006932-NC_006933.gbk Genbank file for Brucella abortus bv. 1 str. 9-941
diff -r 000000000000 -r 12f2b14549f6 tool_data_table_conf.xml.sample
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_data_table_conf.xml.sample Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,18 @@
+
+
+
+ value, dbkey, name, path
+
+
+
+
+ value, name, path, description
+
+
+
+
+ value, name, path, description
+
+
+
+
diff -r 000000000000 -r 12f2b14549f6 tool_data_table_conf.xml.test
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_data_table_conf.xml.test Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,15 @@
+
+
+ value, dbkey, name, path
+
+
+
+
+ value, name, path, description
+
+
+
+ value, name, path, description
+
+
+
diff -r 000000000000 -r 12f2b14549f6 vsnp_add_zero_coverage.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/vsnp_add_zero_coverage.py Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,189 @@
+#!/usr/bin/env python
+
+import argparse
+import multiprocessing
+import os
+import queue
+import re
+import shutil
+
+import pandas
+import pysam
+from Bio import SeqIO
+
+INPUT_BAM_DIR = 'input_bam_dir'
+INPUT_VCF_DIR = 'input_vcf_dir'
+OUTPUT_VCF_DIR = 'output_vcf_dir'
+OUTPUT_METRICS_DIR = 'output_metrics_dir'
+
+
+def get_base_file_name(file_path):
+ base_file_name = os.path.basename(file_path)
+ if base_file_name.find(".") > 0:
+ # Eliminate the extension.
+ return os.path.splitext(base_file_name)[0]
+ elif base_file_name.endswith("_vcf"):
+ # The "." character has likely
+ # changed to an "_" character.
+ return base_file_name[:-len("_vcf")]
+ return base_file_name
+
+
+def get_coverage_and_snp_count(task_queue, reference, output_metrics, output_vcf, timeout):
+ while True:
+ try:
+ tup = task_queue.get(block=True, timeout=timeout)
+ except queue.Empty:
+ break
+ bam_file, vcf_file = tup
+ # Create a coverage dictionary.
+ coverage_dict = {}
+ coverage_list = pysam.depth(bam_file, split_lines=True)
+ for line in coverage_list:
+ chrom, position, depth = line.split('\t')
+ coverage_dict["%s-%s" % (chrom, position)] = depth
+ # Convert it to a data frame.
+ coverage_df = pandas.DataFrame.from_dict(coverage_dict, orient='index', columns=["depth"])
+ # Create a zero coverage dictionary.
+ zero_dict = {}
+ for record in SeqIO.parse(reference, "fasta"):
+ chrom = record.id
+ total_len = len(record.seq)
+ for pos in list(range(1, total_len + 1)):
+ zero_dict["%s-%s" % (str(chrom), str(pos))] = 0
+ # Convert it to a data frame with depth_x
+ # and depth_y columns - index is NaN.
+ zero_df = pandas.DataFrame.from_dict(zero_dict, orient='index', columns=["depth"])
+ coverage_df = zero_df.merge(coverage_df, left_index=True, right_index=True, how='outer')
+ # depth_x "0" column no longer needed.
+ coverage_df = coverage_df.drop(columns=['depth_x'])
+ coverage_df = coverage_df.rename(columns={'depth_y': 'depth'})
+ # Convert the NaN to 0 coverage and get some metrics.
+ coverage_df = coverage_df.fillna(0)
+ coverage_df['depth'] = coverage_df['depth'].apply(int)
+ total_length = len(coverage_df)
+ average_coverage = coverage_df['depth'].mean()
+ zero_df = coverage_df[coverage_df['depth'] == 0]
+ total_zero_coverage = len(zero_df)
+ total_coverage = total_length - total_zero_coverage
+ genome_coverage = "{:.2%}".format(total_coverage / total_length)
+ # Process the associated VCF input.
+ column_names = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "Sample"]
+ vcf_df = pandas.read_csv(vcf_file, sep='\t', header=None, names=column_names, comment='#')
+ good_snp_count = len(vcf_df[(vcf_df['ALT'].str.len() == 1) & (vcf_df['REF'].str.len() == 1) & (vcf_df['QUAL'] > 150)])
+ base_file_name = get_base_file_name(vcf_file)
+ if total_zero_coverage > 0:
+ header_file = "%s_header.csv" % base_file_name
+ with open(header_file, 'w') as outfile:
+ with open(vcf_file) as infile:
+ for line in infile:
+ if re.search('^#', line):
+ outfile.write("%s" % line)
+ vcf_df_snp = vcf_df[vcf_df['REF'].str.len() == 1]
+ vcf_df_snp = vcf_df_snp[vcf_df_snp['ALT'].str.len() == 1]
+ vcf_df_snp['ABS_VALUE'] = vcf_df_snp['CHROM'].map(str) + "-" + vcf_df_snp['POS'].map(str)
+ vcf_df_snp = vcf_df_snp.set_index('ABS_VALUE')
+ cat_df = pandas.concat([vcf_df_snp, zero_df], axis=1, sort=False)
+ cat_df = cat_df.drop(columns=['CHROM', 'POS', 'depth'])
+ cat_df[['ID', 'ALT', 'QUAL', 'FILTER', 'INFO']] = cat_df[['ID', 'ALT', 'QUAL', 'FILTER', 'INFO']].fillna('.')
+ cat_df['REF'] = cat_df['REF'].fillna('N')
+ cat_df['FORMAT'] = cat_df['FORMAT'].fillna('GT')
+ cat_df['Sample'] = cat_df['Sample'].fillna('./.')
+ cat_df['temp'] = cat_df.index.str.rsplit('-', n=1)
+ cat_df[['CHROM', 'POS']] = pandas.DataFrame(cat_df.temp.values.tolist(), index=cat_df.index)
+ cat_df = cat_df[['CHROM', 'POS', 'ID', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO', 'FORMAT', 'Sample']]
+ cat_df['POS'] = cat_df['POS'].astype(int)
+ cat_df = cat_df.sort_values(['CHROM', 'POS'])
+ body_file = "%s_body.csv" % base_file_name
+ cat_df.to_csv(body_file, sep='\t', header=False, index=False)
+ if output_vcf is None:
+ output_vcf_file = os.path.join(OUTPUT_VCF_DIR, "%s.vcf" % base_file_name)
+ else:
+ output_vcf_file = output_vcf
+ with open(output_vcf_file, "w") as outfile:
+ for cf in [header_file, body_file]:
+ with open(cf, "r") as infile:
+ for line in infile:
+ outfile.write("%s" % line)
+ else:
+ if output_vcf is None:
+ output_vcf_file = os.path.join(OUTPUT_VCF_DIR, "%s.vcf" % base_file_name)
+ else:
+ output_vcf_file = output_vcf
+ shutil.copyfile(vcf_file, output_vcf_file)
+ bam_metrics = [base_file_name, "", "%4f" % average_coverage, genome_coverage]
+ vcf_metrics = [base_file_name, str(good_snp_count), "", ""]
+ if output_metrics is None:
+ output_metrics_file = os.path.join(OUTPUT_METRICS_DIR, "%s.tabular" % base_file_name)
+ else:
+ output_metrics_file = output_metrics
+ metrics_columns = ["File", "Number of Good SNPs", "Average Coverage", "Genome Coverage"]
+ with open(output_metrics_file, "w") as fh:
+ fh.write("# %s\n" % "\t".join(metrics_columns))
+ fh.write("%s\n" % "\t".join(bam_metrics))
+ fh.write("%s\n" % "\t".join(vcf_metrics))
+ task_queue.task_done()
+
+
+def set_num_cpus(num_files, processes):
+ num_cpus = int(multiprocessing.cpu_count())
+ if num_files < num_cpus and num_files < processes:
+ return num_files
+ if num_cpus < processes:
+ half_cpus = int(num_cpus / 2)
+ if num_files < half_cpus:
+ return num_files
+ return half_cpus
+ return processes
+
+
+if __name__ == '__main__':
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--output_metrics', action='store', dest='output_metrics', required=False, default=None, help='Output metrics text file')
+ parser.add_argument('--output_vcf', action='store', dest='output_vcf', required=False, default=None, help='Output VCF file')
+ parser.add_argument('--reference', action='store', dest='reference', help='Reference dataset')
+ parser.add_argument('--processes', action='store', dest='processes', type=int, help='User-selected number of processes to use for job splitting')
+
+ args = parser.parse_args()
+
+ # The assumption here is that the lists of files
+ # in INPUT_BAM_DIR and INPUT_VCF_DIR are equal in
+ # number and named such that they are properly
+ # matched if the directories contain more than
+ # 1 file (i.e., hopefully the bam file names and
+ # vcf file names will be something like
+ # Mbovis-01D6_* so they can be sorted and properly
+ # associated with each other).
+ bam_files = []
+ for file_name in sorted(os.listdir(INPUT_BAM_DIR)):
+ file_path = os.path.abspath(os.path.join(INPUT_BAM_DIR, file_name))
+ bam_files.append(file_path)
+ vcf_files = []
+ for file_name in sorted(os.listdir(INPUT_VCF_DIR)):
+ file_path = os.path.abspath(os.path.join(INPUT_VCF_DIR, file_name))
+ vcf_files.append(file_path)
+
+ multiprocessing.set_start_method('spawn')
+ queue1 = multiprocessing.JoinableQueue()
+ num_files = len(bam_files)
+ cpus = set_num_cpus(num_files, args.processes)
+ # Set a timeout for get()s in the queue.
+ timeout = 0.05
+
+ # Add each associated bam and vcf file pair to the queue.
+ for i, bam_file in enumerate(bam_files):
+ vcf_file = vcf_files[i]
+ queue1.put((bam_file, vcf_file))
+
+ # Complete the get_coverage_and_snp_count task.
+ processes = [multiprocessing.Process(target=get_coverage_and_snp_count, args=(queue1, args.reference, args.output_metrics, args.output_vcf, timeout, )) for _ in range(cpus)]
+ for p in processes:
+ p.start()
+ for p in processes:
+ p.join()
+ queue1.join()
+
+ if queue1.empty():
+ queue1.close()
+ queue1.join_thread()
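When a dataset name arrives with its extension rewritten to `_vcf`, the suffix has to be removed as a literal string. `str.rstrip` takes a set of characters, so it can remove more than intended. A minimal sketch of the difference, using a sample name from the test data; the helper name `strip_vcf_suffix` is hypothetical.

```python
def strip_vcf_suffix(name):
    """Remove a literal trailing "_vcf" from name, if present."""
    suffix = "_vcf"
    if name.endswith(suffix):
        return name[:-len(suffix)]
    return name


sample = "SRR1791698_zc_vcf"
literal = strip_vcf_suffix(sample)  # removes the suffix as a whole
char_set = sample.rstrip("_vcf")    # strips any trailing '_', 'v', 'c' or 'f'
```

Here `literal` keeps the `_zc` tag, while `char_set` also loses the trailing `c` because that character belongs to the stripped set.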
diff -r 000000000000 -r 12f2b14549f6 vsnp_build_tables.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/vsnp_build_tables.py Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,382 @@
+#!/usr/bin/env python
+
+import argparse
+import multiprocessing
+import os
+import queue
+import re
+
+import pandas
+import pandas.io.formats.excel
+from Bio import SeqIO
+
+INPUT_JSON_AVG_MQ_DIR = 'input_json_avg_mq_dir'
+INPUT_JSON_DIR = 'input_json_dir'
+INPUT_NEWICK_DIR = 'input_newick_dir'
+# Maximum columns allowed in a LibreOffice
+# spreadsheet is 1024. Excel allows for
+# 16,384 columns, but we'll set the lower
+# number as the maximum. Some browsers
+# (e.g., Firefox on Linux) are configured
+# to use LibreOffice for Excel spreadsheets.
+MAXCOLS = 1024
+OUTPUT_EXCEL_DIR = 'output_excel_dir'
+
+
+def annotate_table(table_df, group, annotation_dict):
+ for gbk_chrome, pro in list(annotation_dict.items()):
+ ref_pos = list(table_df)
+ ref_series = pandas.Series(ref_pos)
+ ref_df = pandas.DataFrame(ref_series.str.split(':', expand=True).values, columns=['reference', 'position'])
+ all_ref = ref_df[ref_df['reference'] == gbk_chrome]
+ positions = all_ref.position.to_frame()
+ # Create an annotation file.
+ annotation_file = "%s_annotations.csv" % group
+ with open(annotation_file, "a") as fh:
+ for _, row in positions.iterrows():
+ pos = row.position
+ try:
+ aaa = pro.iloc[pro.index.get_loc(int(pos))][['chrom', 'locus', 'product', 'gene']]
+ try:
+ chrom, name, locus, tag = aaa.values[0]
+ print("{}:{}\t{}, {}, {}".format(chrom, pos, locus, tag, name), file=fh)
+ except ValueError:
+ # If only one annotation for the entire
+ # chromosome (e.g., flu) then having [0] fails
+ chrom, name, locus, tag = aaa.values
+ print("{}:{}\t{}, {}, {}".format(chrom, pos, locus, tag, name), file=fh)
+ except KeyError:
+ print("{}:{}\tNo annotated product".format(gbk_chrome, pos), file=fh)
+ # Read the annotation file into a data frame.
+ annotations_df = pandas.read_csv(annotation_file, sep='\t', header=None, names=['index', 'annotations'], index_col='index')
+ # Remove the annotation_file from disk since both
+ # cascade and sort tables are built using the file,
+ # and it is opened for writing in append mode.
+ os.remove(annotation_file)
+ # Process the data.
+ table_df_transposed = table_df.T
+ table_df_transposed.index = table_df_transposed.index.rename('index')
+ table_df_transposed = table_df_transposed.merge(annotations_df, left_index=True, right_index=True)
+ table_df = table_df_transposed.T
+ return table_df
+
+
+def excel_formatter(json_file_name, excel_file_name, group, annotation_dict):
+ pandas.io.formats.excel.header_style = None
+ table_df = pandas.read_json(json_file_name, orient='split')
+ if annotation_dict is not None:
+ table_df = annotate_table(table_df, group, annotation_dict)
+ else:
+ table_df = table_df.append(pandas.Series(name='no annotations'))
+ writer = pandas.ExcelWriter(excel_file_name, engine='xlsxwriter')
+ table_df.to_excel(writer, sheet_name='Sheet1')
+ writer_book = writer.book
+ ws = writer.sheets['Sheet1']
+ format_a = writer_book.add_format({'bg_color': '#58FA82'})
+ format_g = writer_book.add_format({'bg_color': '#F7FE2E'})
+ format_c = writer_book.add_format({'bg_color': '#0000FF'})
+ format_t = writer_book.add_format({'bg_color': '#FF0000'})
+ format_normal = writer_book.add_format({'bg_color': '#FDFEFE'})
+ formatlowqual = writer_book.add_format({'font_color': '#C70039', 'bg_color': '#E2CFDD'})
+ format_ambigous = writer_book.add_format({'font_color': '#C70039', 'bg_color': '#E2CFDD'})
+ format_n = writer_book.add_format({'bg_color': '#E2CFDD'})
+ rows, cols = table_df.shape
+ ws.set_column(0, 0, 30)
+ ws.set_column(1, cols, 2.1)
+ ws.freeze_panes(2, 1)
+ format_annotation = writer_book.add_format({'font_color': '#0A028C', 'rotation': '-90', 'align': 'top'})
+ # Set last row.
+ ws.set_row(rows + 1, cols + 1, format_annotation)
+ # Make sure that row/column locations don't overlap.
+ ws.conditional_format(rows - 2, 1, rows - 1, cols, {'type': 'cell', 'criteria': '<', 'value': 55, 'format': formatlowqual})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'cell', 'criteria': '==', 'value': 'B$2', 'format': format_normal})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'A', 'format': format_a})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'G', 'format': format_g})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'C', 'format': format_c})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'T', 'format': format_t})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'S', 'format': format_ambigous})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'Y', 'format': format_ambigous})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'R', 'format': format_ambigous})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'W', 'format': format_ambigous})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'K', 'format': format_ambigous})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'M', 'format': format_ambigous})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'N', 'format': format_n})
+ ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': '-', 'format': format_n})
+ format_rotation = writer_book.add_format({})
+ format_rotation.set_rotation(90)
+ for column_num, column_name in enumerate(list(table_df.columns)):
+ ws.write(0, column_num + 1, column_name, format_rotation)
+ format_annotation = writer_book.add_format({'font_color': '#0A028C', 'rotation': '-90', 'align': 'top'})
+ # Set last row.
+ ws.set_row(rows, 400, format_annotation)
+ writer.save()
+
+
+def get_annotation_dict(gbk_file):
+ gbk_dict = SeqIO.to_dict(SeqIO.parse(gbk_file, "genbank"))
+ annotation_dict = {}
+ tmp_file = "features.csv"
+ # Create a file of chromosomes and features.
+ for chromosome in list(gbk_dict.keys()):
+ with open(tmp_file, 'w+') as fh:
+ for feature in gbk_dict[chromosome].features:
+ if "CDS" in feature.type or "rRNA" in feature.type:
+ try:
+ product = feature.qualifiers['product'][0]
+ except KeyError:
+ product = None
+ try:
+ locus = feature.qualifiers['locus_tag'][0]
+ except KeyError:
+ locus = None
+ try:
+ gene = feature.qualifiers['gene'][0]
+ except KeyError:
+ gene = None
+ fh.write("%s\t%d\t%d\t%s\t%s\t%s\n" % (chromosome, int(feature.location.start), int(feature.location.end), locus, product, gene))
+ # Read the chromosomes and features file into a data frame.
+ df = pandas.read_csv(tmp_file, sep='\t', names=["chrom", "start", "stop", "locus", "product", "gene"])
+ # Process the data.
+ df = df.sort_values(['start', 'gene'], ascending=[True, False])
+ df = df.drop_duplicates('start')
+ pro = df.reset_index(drop=True)
+ pro.index = pandas.IntervalIndex.from_arrays(pro['start'], pro['stop'], closed='both')
+ annotation_dict[chromosome] = pro
+ return annotation_dict
+
+
+def get_base_file_name(file_path):
+ base_file_name = os.path.basename(file_path)
+ if base_file_name.find(".") > 0:
+ # Eliminate the extension.
+ return os.path.splitext(base_file_name)[0]
+ elif base_file_name.find("_") > 0:
+ # The dot extension was likely changed to
+ # the "_" character.
+ items = base_file_name.split("_")
+ return "_".join(items[0:-1])
+ else:
+ return base_file_name
+
+
+def output_cascade_table(cascade_order, mqdf, group, annotation_dict):
+ cascade_order_mq = pandas.concat([cascade_order, mqdf], join='inner')
+ output_table(cascade_order_mq, "cascade", group, annotation_dict)
+
+
+def output_excel(df, type_str, group, annotation_dict, count=None):
+ # Output the temporary json file that
+ # is used by the excel_formatter.
+ if count is None:
+ if group is None:
+ json_file_name = "%s_order_mq.json" % type_str
+ excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_table.xlsx" % type_str)
+ else:
+ json_file_name = "%s_%s_order_mq.json" % (group, type_str)
+ excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_%s_table.xlsx" % (group, type_str))
+ else:
+ if group is None:
+ json_file_name = "%s_order_mq_%d.json" % (type_str, count)
+ excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_table_%d.xlsx" % (type_str, count))
+ else:
+ json_file_name = "%s_%s_order_mq_%d.json" % (group, type_str, count)
+ excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_%s_table_%d.xlsx" % (group, type_str, count))
+ df.to_json(json_file_name, orient='split')
+ # Output the Excel file.
+ excel_formatter(json_file_name, excel_file_name, group, annotation_dict)
+
+
+def output_sort_table(cascade_order, mqdf, group, annotation_dict):
+ sort_df = cascade_order.T
+ sort_df['abs_value'] = sort_df.index
+ sort_df[['chrom', 'pos']] = sort_df['abs_value'].str.split(':', expand=True)
+ sort_df = sort_df.drop(['abs_value', 'chrom'], axis=1)
+ sort_df.pos = sort_df.pos.astype(int)
+ sort_df = sort_df.sort_values(by=['pos'])
+ sort_df = sort_df.drop(['pos'], axis=1)
+ sort_df = sort_df.T
+ sort_order_mq = pandas.concat([sort_df, mqdf], join='inner')
+ output_table(sort_order_mq, "sort", group, annotation_dict)
+
+
+def output_table(df, type_str, group, annotation_dict):
+ if isinstance(group, str) and group.startswith("dataset"):
+ # Inputs are single files, not collections,
+ # so input file names are not useful for naming
+ # output files.
+ group_str = None
+ else:
+ group_str = group
+ count = 0
+ chunk_start = 0
+ chunk_end = 0
+ column_count = df.shape[1]
+ if column_count > MAXCOLS:
+ # Here the number of columns is greater than
+ # the maximum allowed by Excel, so multiple
+ # outputs will be produced.
+ while column_count > MAXCOLS:
+ count += 1
+ chunk_end += MAXCOLS
+ df_of_type = df.iloc[:, chunk_start:chunk_end]
+ output_excel(df_of_type, type_str, group_str, annotation_dict, count=count)
+ chunk_start += MAXCOLS
+ column_count -= MAXCOLS
+ count += 1
+ df_of_type = df.iloc[:, chunk_start:]
+ output_excel(df_of_type, type_str, group_str, annotation_dict, count=count)
+ else:
+ output_excel(df, type_str, group_str, annotation_dict)
+
+
+def preprocess_tables(task_queue, annotation_dict, timeout):
+ while True:
+ try:
+ tup = task_queue.get(block=True, timeout=timeout)
+ except queue.Empty:
+ break
+ newick_file, json_file, json_avg_mq_file = tup
+ avg_mq_series = pandas.read_json(json_avg_mq_file, typ='series', orient='split')
+ # Map quality to dataframe.
+ mqdf = avg_mq_series.to_frame(name='MQ')
+ mqdf = mqdf.T
+ # Get the group.
+ group = get_base_file_name(newick_file)
+ snps_df = pandas.read_json(json_file, orient='split')
+ with open(newick_file, 'r') as fh:
+ for line in fh:
+ line = re.sub('[:,]', '\n', line)
+ line = re.sub('[)(]', '', line)
+ line = re.sub(r'[0-9].*\.[0-9].*\n', '', line)
+ line = re.sub('root\n', '', line)
+ sample_order = line.split('\n')
+ sample_order = list([_f for _f in sample_order if _f])
+ sample_order.insert(0, 'root')
+ tree_order = snps_df.loc[sample_order]
+ # Count number of SNPs in each column.
+ snp_per_column = []
+ for column_header in tree_order:
+ count = 0
+ column = tree_order[column_header]
+ for element in column:
+ if element != column[0]:
+ count = count + 1
+ snp_per_column.append(count)
+ row1 = pandas.Series(snp_per_column, tree_order.columns, name="snp_per_column")
+ # Count the number of SNPs from the
+ # top of each column in the table.
+ snp_from_top = []
+ for column_header in tree_order:
+ count = 0
+ column = tree_order[column_header]
+ # for each element in the column
+ # skip the first element
+ for element in column[1:]:
+ if element == column[0]:
+ count = count + 1
+ else:
+ break
+ snp_from_top.append(count)
+ row2 = pandas.Series(snp_from_top, tree_order.columns, name="snp_from_top")
+ tree_order = tree_order.append([row1])
+ tree_order = tree_order.append([row2])
+ # In pandas=0.18.1 even this does not work:
+ # abc = row1.to_frame()
+ # abc = abc.T --> tree_order.shape (5, 18), abc.shape (1, 18)
+ # tree_order.append(abc)
+ # Continue to get error: "*** ValueError: all the input arrays must have same number of dimensions"
+ tree_order = tree_order.T
+ tree_order = tree_order.sort_values(['snp_from_top', 'snp_per_column'], ascending=[True, False])
+ tree_order = tree_order.T
+ # Remove snp_per_column and snp_from_top rows.
+ cascade_order = tree_order[:-2]
+ # Output the cascade table.
+ output_cascade_table(cascade_order, mqdf, group, annotation_dict)
+ # Output the sorted table.
+ output_sort_table(cascade_order, mqdf, group, annotation_dict)
+ task_queue.task_done()
+
+
+def set_num_cpus(num_files, processes):
+ num_cpus = int(multiprocessing.cpu_count())
+ if num_files < num_cpus and num_files < processes:
+ return num_files
+ if num_cpus < processes:
+ half_cpus = max(1, int(num_cpus / 2))
+ if num_files < half_cpus:
+ return num_files
+ return half_cpus
+ return processes
+
+
+if __name__ == '__main__':
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--input_avg_mq_json', action='store', dest='input_avg_mq_json', required=False, default=None, help='Average MQ json file')
+ parser.add_argument('--input_newick', action='store', dest='input_newick', required=False, default=None, help='Newick file')
+ parser.add_argument('--input_snps_json', action='store', dest='input_snps_json', required=False, default=None, help='SNPs json file')
+ parser.add_argument('--gbk_file', action='store', dest='gbk_file', required=False, default=None, help='Optional gbk file')
+ parser.add_argument('--processes', action='store', dest='processes', type=int, help='User-selected number of processes to use for job splitting')
+
+ args = parser.parse_args()
+
+ if args.gbk_file is not None:
+ # Create the annotation_dict for annotating
+ # the Excel tables.
+ annotation_dict = get_annotation_dict(args.gbk_file)
+ else:
+ annotation_dict = None
+
+ # The assumption here is that the files in
+ # INPUT_NEWICK_DIR, INPUT_JSON_DIR and INPUT_JSON_AVG_MQ_DIR
+ # are named such that they are properly matched if
+ # the directories contain more than 1 file (i.e.,
+ # hopefully the newick file names and json file names
+ # will be something like Mbovis-01D6_* so they can be
+ # sorted and properly associated with each other).
+ if args.input_newick is not None:
+ newick_files = [args.input_newick]
+ else:
+ newick_files = []
+ for file_name in sorted(os.listdir(INPUT_NEWICK_DIR)):
+ file_path = os.path.abspath(os.path.join(INPUT_NEWICK_DIR, file_name))
+ newick_files.append(file_path)
+ if args.input_snps_json is not None:
+ json_files = [args.input_snps_json]
+ else:
+ json_files = []
+ for file_name in sorted(os.listdir(INPUT_JSON_DIR)):
+ file_path = os.path.abspath(os.path.join(INPUT_JSON_DIR, file_name))
+ json_files.append(file_path)
+ if args.input_avg_mq_json is not None:
+ json_avg_mq_files = [args.input_avg_mq_json]
+ else:
+ json_avg_mq_files = []
+ for file_name in sorted(os.listdir(INPUT_JSON_AVG_MQ_DIR)):
+ file_path = os.path.abspath(os.path.join(INPUT_JSON_AVG_MQ_DIR, file_name))
+ json_avg_mq_files.append(file_path)
+
+ multiprocessing.set_start_method('spawn')
+ queue1 = multiprocessing.JoinableQueue()
+ queue2 = multiprocessing.JoinableQueue()
+ num_files = len(newick_files)
+ cpus = set_num_cpus(num_files, args.processes)
+ # Set a timeout for get()s in the queue.
+ timeout = 0.05
+
+ for i, newick_file in enumerate(newick_files):
+ json_file = json_files[i]
+ json_avg_mq_file = json_avg_mq_files[i]
+ queue1.put((newick_file, json_file, json_avg_mq_file))
+
+ # Complete the preprocess_tables task.
+ processes = [multiprocessing.Process(target=preprocess_tables, args=(queue1, annotation_dict, timeout, )) for _ in range(cpus)]
+ for p in processes:
+ p.start()
+ for p in processes:
+ p.join()
+ queue1.join()
+
+ if queue1.empty():
+ queue1.close()
+ queue1.join_thread()
diff -r 000000000000 -r 12f2b14549f6 vsnp_determine_ref_from_data.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/vsnp_determine_ref_from_data.py Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,231 @@
+#!/usr/bin/env python
+
+import argparse
+import gzip
+import os
+from collections import OrderedDict
+
+import yaml
+from Bio.SeqIO.QualityIO import FastqGeneralIterator
+
+OUTPUT_DBKEY_DIR = 'output_dbkey'
+OUTPUT_METRICS_DIR = 'output_metrics'
+
+
+def get_base_file_name(file_path):
+ base_file_name = os.path.basename(file_path)
+ if base_file_name.find(".") > 0:
+ # Eliminate the extension.
+ return os.path.splitext(base_file_name)[0]
+ elif base_file_name.find("_fq") > 0:
+ # The "." character has likely
+ # changed to an "_" character.
+ return base_file_name.split("_fq")[0]
+ elif base_file_name.find("_fastq") > 0:
+ return base_file_name.split("_fastq")[0]
+ return base_file_name
+
+
+def get_dbkey(dnaprints_dict, key, s):
+ # dnaprints_dict looks something like this:
+ # {'brucella': {'NC_002945v4': ['11001110', '11011110', '11001100']},
+ # 'bovis': {'NC_006895': ['11111110', '00010010', '01111011']}}
+ d = dnaprints_dict.get(key, {})
+ for data_table_value, v_list in d.items():
+ if s in v_list:
+ return data_table_value
+ return ""
+
+
+def get_dnaprints_dict(dnaprint_fields):
+ # A dnaprint_fields entry looks something like this:
+ # [['AF2122', '/galaxy/tool-data/vsnp/AF2122/dnaprints/NC_002945v4.yml']]
+ dnaprints_dict = {}
+ for item in dnaprint_fields:
+ # Here item is a 2-element list of data
+ # table components: value and path.
+ value = item[0]
+ path = item[1].strip()
+ with open(path, "rt") as fh:
+ # The format of all dnaprints yaml
+ # files is something like this:
+ # brucella:
+ # - 0111111111111111
+ print_dict = yaml.load(fh, Loader=yaml.Loader)
+ for print_dict_k, print_dict_v in print_dict.items():
+ dnaprints_v_dict = dnaprints_dict.get(print_dict_k, {})
+ if len(dnaprints_v_dict) > 0:
+ # dnaprints_dict already contains k (e.g., 'brucella'),
+ # and dnaprints_v_dict will be a dictionary that
+ # looks something like this:
+ # {'NC_002945v4': ['11001110', '11011110', '11001100']}
+ value_list = dnaprints_v_dict.get(value, [])
+ value_list = value_list + print_dict_v
+ dnaprints_v_dict[value] = value_list
+ else:
+ # dnaprints_v_dict is an empty dictionary.
+ dnaprints_v_dict[value] = print_dict_v
+ dnaprints_dict[print_dict_k] = dnaprints_v_dict
+ # dnaprints_dict looks something like this:
+ # {'brucella': {'NC_002945v4': ['11001110', '11011110', '11001100']},
+ # 'bovis': {'NC_006895': ['11111110', '00010010', '01111011']}}
+ return dnaprints_dict
+
+
+def get_group_and_dbkey(dnaprints_dict, brucella_string, brucella_sum, bovis_string, bovis_sum, para_string, para_sum):
+ if brucella_sum > 3:
+ group = "Brucella"
+ dbkey = get_dbkey(dnaprints_dict, "brucella", brucella_string)
+ elif bovis_sum > 3:
+ group = "TB"
+ dbkey = get_dbkey(dnaprints_dict, "bovis", bovis_string)
+ elif para_sum >= 1:
+ group = "paraTB"
+ dbkey = get_dbkey(dnaprints_dict, "para", para_string)
+ else:
+ group = ""
+ dbkey = ""
+ return group, dbkey
+
+
+def get_oligo_dict():
+ oligo_dict = {}
+ oligo_dict["01_ab1"] = "AATTGTCGGATAGCCTGGCGATAACGACGC"
+ oligo_dict["02_ab3"] = "CACACGCGGGCCGGAACTGCCGCAAATGAC"
+ oligo_dict["03_ab5"] = "GCTGAAGCGGCAGACCGGCAGAACGAATAT"
+ oligo_dict["04_mel"] = "TGTCGCGCGTCAAGCGGCGTGAAATCTCTG"
+ oligo_dict["05_suis1"] = "TGCGTTGCCGTGAAGCTTAATTCGGCTGAT"
+ oligo_dict["06_suis2"] = "GGCAATCATGCGCAGGGCTTTGCATTCGTC"
+ oligo_dict["07_suis3"] = "CAAGGCAGATGCACATAATCCGGCGACCCG"
+ oligo_dict["08_ceti1"] = "GTGAATATAGGGTGAATTGATCTTCAGCCG"
+ oligo_dict["09_ceti2"] = "TTACAAGCAGGCCTATGAGCGCGGCGTGAA"
+ oligo_dict["10_canis4"] = "CTGCTACATAAAGCACCCGGCGACCGAGTT"
+ oligo_dict["11_canis"] = "ATCGTTTTGCGGCATATCGCTGACCACAGC"
+ oligo_dict["12_ovis"] = "CACTCAATCTTCTCTACGGGCGTGGTATCC"
+ oligo_dict["13_ether2"] = "CGAAATCGTGGTGAAGGACGGGACCGAACC"
+ oligo_dict["14_63B1"] = "CCTGTTTAAAAGAATCGTCGGAACCGCTCT"
+ oligo_dict["15_16M0"] = "TCCCGCCGCCATGCCGCCGAAAGTCGCCGT"
+ oligo_dict["16_mel1b"] = "TCTGTCCAAACCCCGTGACCGAACAATAGA"
+ oligo_dict["17_tb157"] = "CTCTTCGTATACCGTTCCGTCGTCACCATGGTCCT"
+ oligo_dict["18_tb7"] = "TCACGCAGCCAACGATATTCGTGTACCGCGACGGT"
+ oligo_dict["19_tbbov"] = "CTGGGCGACCCGGCCGACCTGCACACCGCGCATCA"
+ oligo_dict["20_tb5"] = "CCGTGGTGGCGTATCGGGCCCCTGGATCGCGCCCT"
+ oligo_dict["21_tb2"] = "ATGTCTGCGTAAAGAAGTTCCATGTCCGGGAAGTA"
+ oligo_dict["22_tb3"] = "GAAGACCTTGATGCCGATCTGGGTGTCGATCTTGA"
+ oligo_dict["23_tb4"] = "CGGTGTTGAAGGGTCCCCCGTTCCAGAAGCCGGTG"
+ oligo_dict["24_tb6"] = "ACGGTGATTCGGGTGGTCGACACCGATGGTTCAGA"
+ oligo_dict["25_para"] = "CCTTTCTTGAAGGGTGTTCG"
+ oligo_dict["26_para_sheep"] = "CGTGGTGGCGACGGCGGCGGGCCTGTCTAT"
+ oligo_dict["27_para_cattle"] = "TCTCCTCGGTCGGTGATTCGGGGGCGCGGT"
+ return oligo_dict
+
+
+def get_seq_counts(value, fastq_list, gzipped):
+ count = 0
+ for fastq_file in fastq_list:
+ if gzipped:
+ with gzip.open(fastq_file, 'rt') as fh:
+ for title, seq, qual in FastqGeneralIterator(fh):
+ count += seq.count(value)
+ else:
+ with open(fastq_file, 'r') as fh:
+ for title, seq, qual in FastqGeneralIterator(fh):
+ count += seq.count(value)
+ return value, count
+
+
+def get_species_counts(fastq_list, gzipped):
+ count_summary = {}
+ oligo_dict = get_oligo_dict()
+ for v1 in oligo_dict.values():
+ returned_value, count = get_seq_counts(v1, fastq_list, gzipped)
+ for key, v2 in oligo_dict.items():
+ if returned_value == v2:
+ count_summary.update({key: count})
+ count_list = []
+ for v in count_summary.values():
+ count_list.append(v)
+ brucella_sum = sum(count_list[:16])
+ bovis_sum = sum(count_list[16:24])
+ para_sum = sum(count_list[24:])
+ return count_summary, count_list, brucella_sum, bovis_sum, para_sum
+
+
+def get_species_strings(count_summary):
+ binary_dictionary = {}
+ for k, v in count_summary.items():
+ if v > 1:
+ binary_dictionary.update({k: 1})
+ else:
+ binary_dictionary.update({k: 0})
+ binary_dictionary = OrderedDict(sorted(binary_dictionary.items()))
+ binary_list = []
+ for v in binary_dictionary.values():
+ binary_list.append(v)
+ brucella_binary = binary_list[:16]
+ brucella_string = ''.join(str(e) for e in brucella_binary)
+ bovis_binary = binary_list[16:24]
+ bovis_string = ''.join(str(e) for e in bovis_binary)
+ para_binary = binary_list[24:]
+ para_string = ''.join(str(e) for e in para_binary)
+ return brucella_string, bovis_string, para_string
+
+
+def output_dbkey(file_name, dbkey, output_file):
+ # Output the dbkey.
+ with open(output_file, "w") as fh:
+ fh.write("%s" % dbkey)
+
+
+def output_files(fastq_file, count_list, group, dbkey, dbkey_file, metrics_file):
+ base_file_name = get_base_file_name(fastq_file)
+ output_dbkey(base_file_name, dbkey, dbkey_file)
+ output_metrics(base_file_name, count_list, group, dbkey, metrics_file)
+
+
+def output_metrics(file_name, count_list, group, dbkey, output_file):
+ # Output the metrics.
+ with open(output_file, "w") as fh:
+ fh.write("Sample: %s\n" % file_name)
+ fh.write("Brucella counts: ")
+ for i in count_list[:16]:
+ fh.write("%d," % i)
+ fh.write("\nTB counts: ")
+ for i in count_list[16:24]:
+ fh.write("%d," % i)
+ fh.write("\nPara counts: ")
+ for i in count_list[24:]:
+ fh.write("%d," % i)
+ fh.write("\nGroup: %s" % group)
+ fh.write("\ndbkey: %s\n" % dbkey)
+
+
+if __name__ == '__main__':
+ parser = argparse.ArgumentParser()
+
+ parser.add_argument('--dnaprint_fields', action='append', dest='dnaprint_fields', nargs=2, required=False, default=None, help="List of dnaprints data table value and path fields")
+ parser.add_argument('--read1', action='store', dest='read1', required=True, default=None, help='Required: single read')
+ parser.add_argument('--read2', action='store', dest='read2', required=False, default=None, help='Optional: paired read')
+ parser.add_argument('--gzipped', action='store_true', dest='gzipped', default=False, help='Input files are gzipped')
+ parser.add_argument('--output_dbkey', action='store', dest='output_dbkey', required=True, default=None, help='Output reference file')
+ parser.add_argument('--output_metrics', action='store', dest='output_metrics', required=True, default=None, help='Output metrics file')
+
+ args = parser.parse_args()
+
+ fastq_list = [args.read1]
+ if args.read2 is not None:
+ fastq_list.append(args.read2)
+
+ # The value of dnaprint_fields is a list of lists, where each list is
+ # the [value, path] components of the vsnp_dnaprints data table.
+ # The data_manager_vsnp_dnaprints tool assigns the dbkey column from the
+ # all_fasta data table to the value column in the vsnp_dnaprints data
+ # table to ensure a proper mapping for discovering the dbkey.
+ dnaprints_dict = get_dnaprints_dict(args.dnaprint_fields)
+
+ # Here fastq_list consists of either a single read
+ # or a set of paired reads, producing single outputs.
+ count_summary, count_list, brucella_sum, bovis_sum, para_sum = get_species_counts(fastq_list, args.gzipped)
+ brucella_string, bovis_string, para_string = get_species_strings(count_summary)
+ group, dbkey = get_group_and_dbkey(dnaprints_dict, brucella_string, brucella_sum, bovis_string, bovis_sum, para_string, para_sum)
+ output_files(args.read1, count_list, group, dbkey, dbkey_file=args.output_dbkey, metrics_file=args.output_metrics)
diff -r 000000000000 -r 12f2b14549f6 vsnp_determine_ref_from_data.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/vsnp_determine_ref_from_data.xml Wed Dec 02 09:11:24 2020 +0000
@@ -0,0 +1,154 @@
+
+ from input data
+
+ macros.xml
+
+
+ biopython
+ pyyaml
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+**What it does**
+
+Accepts a single fastqsanger read, a set of paired reads, or a collection of single or paired reads (bacterial samples) and
+inspects the data to discover the best reference genome for aligning the reads.
+
+The information needed to discover the best reference is maintained by the USDA in this repository_. References are currently
+limited to TB complex, paraTB, and Brucella, but information for additional references will be added. The information for each
+reference is a string consisting of zeros and ones, compiled by USDA researchers, which we call a "DNA print". These strings
+are maintained in yaml files for use in Galaxy, and are installed via the **vSNP DNAprints data manager** tool.
+
+.. _repository: https://github.com/USDA-VS/vSNP_reference_options
+
+This tool creates an in-memory dictionary of these DNA print strings for matching with a string generated by inspecting the
+input sample data. During inspection, this tool accrues sequence counts for supported species, ultimately generating a string
+consisting of zeros and ones based on the counts (i.e., a DNA print). This string is then compared to the strings contained
+in the in-memory dictionary of DNA prints to find a match.
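
The matching described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's own code: the function names ``counts_to_print`` and ``find_dbkey``, the dictionary contents and the sample counts are all assumptions made for the example. The rule that a count greater than 1 yields a "1" mirrors ``get_species_strings`` in vsnp_determine_ref_from_data.py.

```python
def counts_to_print(counts, threshold=1):
    """Turn per-oligo sequence counts into a string of zeros and ones."""
    return "".join("1" if c > threshold else "0" for c in counts)


def find_dbkey(dnaprints, group, dna_print):
    """Return the dbkey whose list of prints contains dna_print, else ''."""
    for dbkey, prints in dnaprints.get(group, {}).items():
        if dna_print in prints:
            return dbkey
    return ""


# Hypothetical in-memory dictionary: group -> dbkey -> known DNA prints.
dnaprints = {
    "bovis": {"AF2122": ["11101111", "11101101"]},
    "para": {"NC_002944": ["110", "101"]},
}

# Hypothetical counts for the eight TB oligos found in one sample.
tb_counts = [12, 9, 30, 0, 7, 5, 1, 4]
tb_print = counts_to_print(tb_counts)
print(tb_print)
print(find_dbkey(dnaprints, "bovis", tb_print))
```

When no print matches, the lookup returns an empty string, which is also what the tool writes to the dbkey file in that case.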
+
+The strings in the in-memory dictionary are each associated with a Galaxy "dbkey" (i.e., genome build), so when a match is found,
+the associated "dbkey" is passed to a mapper (e.g., **Map with BWA-MEM**), typically within a workflow via an expression tool,
+to align the reads to the associated reference.
+
+This tool produces 2 text files, a "dbkey" file that contains the dbkey string and a "metrics" file that provides information
+about the sequence counts that were discovered in the input sample data that produced the "DNA print" string.
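
For orientation, the layout of the metrics file can be reproduced with the sketch below. It follows the slicing used by ``output_metrics`` (16 Brucella, 8 TB and 3 paraTB oligo counts), but ``format_metrics``, the sample name and the counts are hypothetical and only serve the illustration.

```python
def format_metrics(sample, counts, group, dbkey):
    """Build the metrics text from 16 Brucella, 8 TB and 3 paraTB counts."""
    def joined(values):
        # Each count is followed by a comma, including the last one.
        return "".join("%d," % v for v in values)

    lines = [
        "Sample: %s" % sample,
        "Brucella counts: %s" % joined(counts[:16]),
        "TB counts: %s" % joined(counts[16:24]),
        "Para counts: %s" % joined(counts[24:]),
        "Group: %s" % group,
        "dbkey: %s" % dbkey,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical sample: no Brucella hits, TB oligos present, no paraTB hits.
counts = [0] * 16 + [12, 9, 30, 0, 7, 5, 1, 4] + [0, 0, 0]
text = format_metrics("sample_01", counts, "TB", "AF2122")
print(text, end="")
```

The dbkey file, by contrast, contains only the dbkey string itself.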
+
+This tool is important for samples containing bacterial species because many of the samples have a "mixed bag" of species,
+and discovering the primary species is critical. DNA print matching is currently supported for the following genomes.
+
+ * Mycobacterium bovis AF2122/97
+ * Brucella abortus bv. 1 str. 9-941
+ * Brucella abortus strain BER
+ * Brucella canis ATCC 23365
+ * Brucella ceti TE10759-12
+ * Brucella melitensis bv. 1 str. 16M
+ * Brucella melitensis bv. 3 str. Ether
+ * Brucella melitensis BwIM_SOM_36b
+ * Brucella melitensis ATCC 23457
+ * Brucella ovis ATCC 25840
+ * Brucella suis 1330
+ * Mycobacterium tuberculosis H37Rv
+ * Mycobacterium avium subsp. paratuberculosis strain Telford
+ * Mycobacterium avium subsp. paratuberculosis K-10
+ * Brucella suis ATCC 23445
+ * Brucella suis bv. 3 str. 686
+
+**Required Options**
+
+ * **Choose the category of the files to be analyzed** - select "Single files" or "Collection of files", then select the appropriate history items (single or paired fastqsanger reads or a collection of fastqsanger reads) based on the selected option.
+
+
+
+