# HG changeset patch # User iuc # Date 1606900284 0 # Node ID 12f2b14549f6e903c7e7abfceed703c8e86dd226 "planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/vsnp commit 524a39e08f2bea8b8754284df606ff8dd27ed24b" diff -r 000000000000 -r 12f2b14549f6 macros.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/macros.xml Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,30 @@ + + + 1.0 + 19.09 + + + + + + + + + + + + + + + + @misc{None, + journal = {None}, + author = {1. Stuber T}, + title = {Manuscript in preparation}, + year = {None}, + url = {https://github.com/USDA-VS/vSNP},} + + + + + diff -r 000000000000 -r 12f2b14549f6 static/images/table_description.png Binary file static/images/table_description.png has changed diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_avg_mq.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01D6_avg_mq.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"name":null,"index":["NC_002945.4:1057","NC_002945.4:4480","NC_002945.4:8741","NC_002945.4:29061","NC_002945.4:33788","NC_002945.4:41228","NC_002945.4:41437","NC_002945.4:50470","NC_002945.4:59861","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:70438","NC_002945.4:79918","NC_002945.4:96244","NC_002945.4:110198","NC_002945.4:114965","NC_002945.4:117800","NC_002945.4:127447","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:140686","NC_002945.4:143799","NC_002945.4:144992","NC_002945.4:148871","NC_002945.4:159370","NC_002945.4:160535","NC_002945.4:165799","NC_002945.4:166696","NC_002945.4:179885","NC_002945.4:189083","NC_002945.4:192177","NC_002945.4:198890","NC_002945.4:223919","NC_002945.4:230661","NC_002945.4:232188","NC_002945.4:295519","NC_002945.4:299636","NC_002945.4:304339","NC_002945.4:319911","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:362818","NC_002945.4:364560","NC_002945.4:364804","NC_002945.4:366022","NC_002945.4:407246","NC_002945.4:430077","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:479644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:641896","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:673880","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102","NC_002945.4:745507","NC_002945.4:760347","NC_002945.4:792617","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:839308","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:998183","NC_002945.4:998196","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1096802","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:1186381","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1213847","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:1295457","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:1442194","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_002945.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:1659174","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:1799442","NC_002945.4:1803035","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1974665","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2204661","NC_002945.4:2210027","NC_002945.4:2239061","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_002945.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:2283371","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:2333994","NC_002945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2884747","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:3045025","NC_002945.4:3053649","NC_002945.4:3053756","NC_002945.4:3063074","NC_002945.4:3068041","NC_002945.4:3069493","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_002945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:3310831","NC_002945.4:3319244","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:3413486","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464485","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3553753","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:3630061","NC_002945.4:3645682","NC_002945.4:3655045","NC_002945.4:3667823","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3839650","NC_002945.4:3846859","NC_002945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4070056","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4245762","NC_002945.4:4251588","NC_002945.4:4264139","NC_002945.4:4278315","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4293932","NC_002945.4:4298964","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:228109","NC_002945.4:331051","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:960995","NC_002945.4:997676","NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1723583","NC_002945.4:1961826","NC_002945.4:3373966","NC_002945.4:3941254","NC_002945.4:4236320","NC_002945.4:1277988","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1806623","NC_002945.4:1911237","NC_002945.4:3942270"],"data":[60,60,60,59,60,59,60,59,59,60,60,59,60,59,60,60,59,59,60,60,60,60,59,59,59,59,60,59,60,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,59,59,57,57,57,57,57,58,59,60,60,60,59,59,60,59,59,60,59,60,60,59,60,60,59,59,59,59,60,60,59,59,59,59,60,60,60,60,60,59,60,59,60,60,60,60,59,59,60,59,59,59,59,59,59,59,60,60,59,58,60,59,60,59,59,59,59,59,60,59,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,60,59,59,55,60,60,60,59,59,60,59,60,60,52,55,56,59,59,59,60,59,60,60,59,59,60,60,59,59,60,59,59,59,60,59,60,60,59,59,56,56,59,60,59,58,60,59,59,60,59,59,59,59,59,60,59,58,57,57,60,59,60,60,59,60,59,60,59,59,59,60,59,59,60,60,60,59,60,60,60,60,59,60,59,59,60,60,59,59,59,60,60,59,60,59,60,60,59,59,59,59,60,60,59,59,59,59,59,59,59,59,59,60,60,59,59,60,60,60,59,60,59,59,60,60,59,59,59,59,59,60,60,60,59,60,59,59,59,59,59,59,60,60,60,60,60,60,60,60,59,60,59,60,60,60,59,59,60,59,60,60,60,60,59,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,60,59,59,59,60,59,59,60,59,59,60,60,60,60,59,60,60,60,59,60,59,59,60,60,60,60,59,60,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,60,60,60,60,59,59,59,60,60,60,60,58,60,59,60,59,59,60,60,60,59,59,59,59,59,59,60,59,59,60,60,60,59,60,60,59,60,59,60,60,60,60,60,60,60,59,60,59,59,59,59,59,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,59,59,59,60,59,59,59,60,59,59,60,60,59,60,59,60,60,60,59,59,59,60,60,60,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,59,59,60,56,60,60,59,60,60,60]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_cascade_table.xlsx Binary file test-data/Mbovis-01D6_cascade_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_snps.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01D6_snps.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_snps.newick --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01D6_snps.newick Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc); diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D6_sort_table.xlsx Binary file test-data/Mbovis-01D6_sort_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_avg_mq.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01D_avg_mq.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"name":null,"index":["NC_002945.4:1057","NC_002945.4:4480","NC_002945.4:8741","NC_002945.4:29061","NC_002945.4:33788","NC_002945.4:41228","NC_002945.4:41437","NC_002945.4:50470","NC_002945.4:59861","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:70438","NC_002945.4:79918","NC_002945.4:96244","NC_002945.4:110198","NC_002945.4:114965","NC_002945.4:117800","NC_002945.4:127447","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:140686","NC_002945.4:143799","NC_002945.4:144992","NC_002945.4:148871","NC_002945.4:159370","NC_002945.4:160535","NC_002945.4:165799","NC_002945.4:166696","NC_002945.4:179885","NC_002945.4:189083","NC_002945.4:192177","NC_002945.4:198890","NC_002945.4:223919","NC_002945.4:230661","NC_002945.4:232188","NC_002945.4:295519","NC_002945.4:299636","NC_002945.4:304339","NC_002945.4:319911","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:362818","NC_002945.4:364560","NC_002945.4:364804","NC_002945.4:366022","NC_002945.4:407246","NC_002945.4:430077","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:479644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:641896","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:673880","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102","NC_002945.4:745507","NC_002945.4:760347","NC_002945.4:792617","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:839308","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:998183","NC_002945.4:998196","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1096802","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:1186381","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1213847","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:1295457","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:1442194","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_002945.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:1659174","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:1799442","NC_002945.4:1803035","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1974665","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2204661","NC_002945.4:2210027","NC_002945.4:2239061","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_002945.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:2283371","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:2333994","NC_002945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2884747","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:3045025","NC_002945.4:3053649","NC_002945.4:3053756","NC_002945.4:3063074","NC_002945.4:3068041","NC_002945.4:3069493","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_002945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:3310831","NC_002945.4:3319244","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:3413486","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464485","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3553753","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:3630061","NC_002945.4:3645682","NC_002945.4:3655045","NC_002945.4:3667823","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3839650","NC_002945.4:3846859","NC_002945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4070056","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4245762","NC_002945.4:4251588","NC_002945.4:4264139","NC_002945.4:4278315","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4293932","NC_002945.4:4298964","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:228109","NC_002945.4:331051","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:960995","NC_002945.4:997676","NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1723583","NC_002945.4:1961826","NC_002945.4:3373966","NC_002945.4:3941254","NC_002945.4:4236320","NC_002945.4:1277988","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1806623","NC_002945.4:1911237","NC_002945.4:3942270"],"data":[60,60,60,59,60,59,60,59,59,60,60,59,60,59,60,60,59,59,60,60,60,60,59,59,59,59,60,59,60,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,59,59,57,57,57,57,57,58,59,60,60,60,59,59,60,59,59,60,59,60,60,59,60,60,59,59,59,59,60,60,59,59,59,59,60,60,60,60,60,59,60,59,60,60,60,60,59,59,60,59,59,59,59,59,59,59,60,60,59,58,60,59,60,59,59,59,59,59,60,59,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,60,59,59,55,60,60,60,59,59,60,59,60,60,52,55,56,59,59,59,60,59,60,60,59,59,60,60,59,59,60,59,59,59,60,59,60,60,59,59,56,56,59,60,59,58,60,59,59,60,59,59,59,59,59,60,59,58,57,57,60,59,60,60,59,60,59,60,59,59,59,60,59,59,60,60,60,59,60,60,60,60,59,60,59,59,60,60,59,59,59,60,60,59,60,59,60,60,59,59,59,59,60,60,59,59,59,59,59,59,59,59,59,60,60,59,59,60,60,60,59,60,59,59,60,60,59,59,59,59,59,60,60,60,59,60,59,59,59,59,59,59,60,60,60,60,60,60,60,60,59,60,59,60,60,60,59,59,60,59,60,60,60,60,59,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,60,59,59,59,60,59,59,60,59,59,60,60,60,60,59,60,60,60,59,60,59,59,60,60,60,60,59,60,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,60,60,60,60,59,59,59,60,60,60,60,58,60,59,60,59,59,60,60,60,59,59,59,59,59,59,60,59,59,60,60,60,59,60,60,59,60,59,60,60,60,60,60,60,60,59,60,59,59,59,59,59,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,59,59,59,60,59,59,59,60,59,59,60,60,59,60,59,60,60,60,59,59,59,60,60,60,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,59,59,60,56,60,60,59,60,60,60]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_cascade_table.xlsx Binary file test-data/Mbovis-01D_cascade_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_snps.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01D_snps.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_snps.newick --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01D_snps.newick Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc); diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01D_sort_table.xlsx Binary file test-data/Mbovis-01D_sort_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_avg_mq.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01_avg_mq.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"name":null,"index":["NC_002945.4:1057","NC_002945.4:4480","NC_002945.4:8741","NC_002945.4:29061","NC_002945.4:33788","NC_002945.4:41228","NC_002945.4:41437","NC_002945.4:50470","NC_002945.4:59861","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:70438","NC_002945.4:75274","NC_002945.4:79918","NC_002945.4:96244","NC_002945.4:110198","NC_002945.4:114965","NC_002945.4:117800","NC_002945.4:127447","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:140686","NC_002945.4:143799","NC_002945.4:144992","NC_002945.4:148871","NC_002945.4:159370","NC_002945.4:160535","NC_002945.4:165799","NC_002945.4:166696","NC_002945.4:179885","NC_002945.4:189083","NC_002945.4:192177","NC_002945.4:198890","NC_002945.4:223919","NC_002945.4:230661","NC_002945.4:232188","NC_002945.4:249090","NC_002945.4:295519","NC_002945.4:299636","NC_002945.4:304339","NC_002945.4:319911","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:364560","NC_002945.4:364804","NC_002945.4:366022","NC_002945.4:407246","NC_002945.4:430077","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:479644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:642172","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:707522","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102","NC_002945.4:745507","NC_002945.4:760347","NC_002945.4:792617","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:998183","NC_002945.4:998196","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1096802","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:1186381","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:1295457","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:1442194","NC_002945.4:1462755","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_002945.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:1659174","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1723479","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:1796727","NC_002945.4:1803035","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1967341","NC_002945.4:1974665","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2051968","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2178975","NC_002945.4:2204661","NC_002945.4:2239061","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_002945.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:2283371","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:2333994","NC_002945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2429853","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:3043695","NC_002945.4:3045025","NC_002945.4:3053649","NC_002945.4:3053756","NC_002945.4:3063074","NC_002945.4:3068041","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_002945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3252431","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:3309513","NC_002945.4:3310831","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3488828","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:3630061","NC_002945.4:3645682","NC_002945.4:3655045","NC_002945.4:3667823","NC_002945.4:3672841","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3825329","NC_002945.4:3839650","NC_002945.4:3846859","NC_002945.4:3872596","NC_002945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:3993571","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4017949","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4052766","NC_002945.4:4070056","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4245762","NC_002945.4:4264139","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4298964","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:362818","NC_002945.4:641896","NC_002945.4:673880","NC_002945.4:839308","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1213847","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1799442","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2210027","NC_002945.4:2413021","NC_002945.4:2884747","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3413486","NC_002945.4:3464485","NC_002945.4:3553753","NC_002945.4:4251588","NC_002945.4:4278315","NC_002945.4:4293932","NC_002945.4:228109","NC_002945.4:331051","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:960995","NC_002945.4:997676","NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1723583","NC_002945.4:1961826","NC_002945.4:3373966","NC_002945.4:3941254","NC_002945.4:4236320","NC_002945.4:1277988","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1806623","NC_002945.4:1911237","NC_002945.4:3942270"],"data":[60,60,60,59,60,59,60,59,59,60,60,59,60,60,59,60,60,59,59,60,60,60,60,59,59,59,59,60,59,60,59,60,60,60,60,60,60,59,60,60,59,59,59,59,59,59,57,57,57,57,57,59,60,60,60,59,59,60,59,59,60,59,60,60,59,60,60,59,59,59,59,60,60,59,59,59,59,60,60,60,60,59,59,59,60,60,60,60,60,59,59,60,59,59,59,59,59,59,59,60,60,59,58,59,60,59,59,59,59,59,60,59,59,60,59,59,59,60,59,59,59,60,60,60,59,59,60,60,60,59,60,59,59,55,60,60,60,59,59,60,59,60,60,59,59,59,60,59,60,59,59,60,60,59,59,60,59,59,59,60,59,60,60,59,59,56,56,59,60,60,59,58,60,59,59,60,59,59,59,59,59,60,59,58,57,56,60,59,60,60,59,60,59,60,59,59,59,60,59,59,59,60,60,60,60,60,60,59,60,60,59,60,60,59,59,59,60,60,59,60,59,60,60,59,59,59,59,60,60,60,59,59,59,59,59,59,59,60,60,59,60,59,60,60,60,59,60,59,59,60,60,59,59,59,59,60,59,60,60,59,60,59,59,59,59,59,59,60,60,60,60,60,60,60,60,59,60,59,60,60,60,59,59,60,59,60,60,60,59,60,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,59,59,59,60,59,59,60,59,59,60,60,60,60,59,60,60,60,59,60,59,59,60,60,60,60,59,59,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,60,57,60,60,60,59,59,59,60,60,60,60,58,60,59,60,59,59,60,60,59,59,59,59,59,59,59,59,60,60,60,59,60,60,60,59,60,59,60,60,60,60,60,60,59,60,59,59,59,59,59,60,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,60,59,59,60,59,59,59,59,59,60,60,59,60,59,60,60,60,60,59,60,60,60,59,59,59,60,60,59,59,59,60,59,59,59,60,59,59,60,60,59,60,59,60,60,60,59,59,60,60,59,59,59,59,60,59,59,59,59,59,58,60,60,60,52,55,56,60,59,60,59,59,59,60,60,60,60,60,60,60,60,59,60,60,59,60,60,60,59,59,60,60,60,59,59,59,60,56,60,60,59,60,60,60]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_cascade_table.xlsx Binary file test-data/Mbovis-01_cascade_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_snps.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01_snps.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_snps.newick --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/Mbovis-01_snps.newick Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc); diff -r 000000000000 -r 12f2b14549f6 test-data/Mbovis-01_sort_table.xlsx Binary file test-data/Mbovis-01_sort_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/Mcap_Deer_DE_SRR650221.fastq.gz Binary file test-data/Mcap_Deer_DE_SRR650221.fastq.gz has changed diff -r 000000000000 -r 12f2b14549f6 test-data/NC_002945v4.fasta --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/NC_002945v4.fasta Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,101 @@ +>NC_002945.4 Mycobacterium bovis AF2122/97 genome assembly, chromosome: Mycobacterium_bovis_AF2122/97 +TTGACCGATGACCCCGGTTCAGGCTTCACCACAGTGTGGAACGCGGTCGTCTCCGAACTTAACGGCGACC +CTAAGGTTGACGACGGACCCAGCAGTGATGCTAATCTCAGCGCTCCGCTGACCCCTCAGCAAAGGGCTTG +GCTCAATCTCGTCCAGCCATTGACCATCGTCGAGGGGTTTGCTCTGTTATCCGTGCCGAGCAGCTTTGTC +CAAAACGAAATCGAGCGCCATCTGCGGGCCCCGATTACCGACGCTCTCAGCCGCCGACTCGGACATCAGA +TCCAACTCGGGGTCCGCATCGCTCCGCCGGCGACCGACGAAGCCGACGACACTACCGTGCCGCCTTCCGA +AAATCCTGCTACCACATCGCCAGACACCACAACCGACAACGACGAGATTGATGACAGCGCTGCGGCACGG +GGCGATAACCAGCACAGTTGGCCAAGTTACTTCACCGAGCGCCCGCGCAATACCGATTCCGCTACCGCTG +GCGTAACCAGCCTTAACCGTCGCTACACCTTTGATACGTTCGTTATCGGCGCCTCCAACCGGTTCGCGCA +CGCCGCCGCCTTGGCGATCGCAGAAGCACCCGCCCGCGCTTACAACCCCCTGTTCATCTGGGGCGAGTCC +GGTCTCGGCAAGACACACCTGCTACACGCGGCAGGCAACTATGCCCAACGGTTGTTCCCGGGAATGCGGG +TCAAATATGTCTCCACCGAGGAATTCACCAACGACTTCATTAACTCGCTCCGCGATGACCGCAAGGTCGC +ATTCAAACGCAGCTACCGCGACGTAGACGTGCTGTTGGTCGACGACATCCAATTCATTGAAGGCAAAGAG +GGTATTCAAGAGGAGTTCTTCCACACCTTCAACACCTTGCACAATGCCAACAAGCAAATCGTCATCTCAT +CTGACCGCCCACCCAAGCAGCTCGCCACCCTCGAGGACCGGCTGAGAACCCGCTTTGAGTGGGGGCTGAT +CACTGACGTACAACCACCCGAGCTGGAGACCCGCATCGCCATCTTGCGCAAGAAAGCACAGATGGAACGG +CTCGCGATCCCCGACGATGTCCTCGAACTCATCGCCAGCAGTATCGAACGCAATATCCGTGAACTCGAGG +GCGCGCTGATCCGGGTCACCGCGTTCGCCTCATTGAACAAAACACCAATCGACAAAGCGCTGGCCGAGAT +TGTGCTTCGCGATCTGATCGCCGACGCCAACACCATGCAAATCAGCGCGGCGACGATCATGGCTGCCACC +GCCGAATACTTCGACACTACCGTCGAAGAGCTTCGCGGGCCCGGCAAGACCCGAGCACTGGCCCAGTCAC +GACAGATTGCGATGTACCTGTGTCGTGAGCTCACCGATCTTTCGTTGCCCAAAATCGGCCAAGCGTTCGG +CCGTGATCACACAACCGTCATGTACGCCCAACGCAAGATCCTGTCCGAGATGGCCGAGCGCCGTGAGGTC +TTTGATCACGTCAAAGAACTCACCACTCGCATCCGTCAGCGCTCCAAGCGCTAGCACGGCGTGTTCTTCC +GACAACGTTCTTAAAAAAACTTCTCTCTCCCAGGTCACACCAGTCACAGAGATTGGCTGTGAGTGTCGCT +GTGCACAAACCGCGCACAGACTCATACAGTCCCGGCGGTTCCGTTCACAACCCACGCCTCATCCCCACCG +ACCCAACACACACCCCACAGTCATCGCCACCGTCATCCACAACTCCGACCGACGTCGACCTGCACCAAGA +CCAGACTGTCCCCAAACTGCACACCCTCTAATACTGTTACCGAGATTTCTTCGTCGTTTGTTCTTGGAAA +GACAGCGCTGGGGATCGTTCGCTGGATACCACCCGCATAACTGGCTCGTCGCGGTGGGTCAGAGGTCAAT +GATGAACTTTCAAGTTGACGTGAGAAGCTCTACGGTTGTTGTTCGACTGCTGTTGCGGCCGTCGTGGCGG +GTCACGCGTCATGGGCGTTCGTCGTTGGCAGTCCCCACGCTAGCGGGGCGCTAGCCACGGGATCGAACTC +ATCGTGAGGTGAAAGGGCGCAATGGACGCGGCTACGACAAGAGTTGGCCTCACCGACTTGACGTTTCGTT +TGCTACGAGAGTCTTTCGCCGATGCGGTGTCGTGGGTGGCTAAAAATCTGCCAGCCAGGCCCGCGGTGCC +GGTGCTCTCCGGCGTGTTGTTGACCGGCTCGGACAACGGTCTGACGATTTCCGGATTCGACTACGAGGTT +TCCGCCGAGGCCCAGGTTGGCGCTGAAATTGTTTCTCCTGGAAGCGTTTTAGTTTCTGGCCGATTGTTGT +CCGATATTACCCGGGCGTTGCCTAACAAGCCCGTAGGCGTTCATGTCGAAGGTAACCGGGTCGCATTGAC +CTGCGGTAACGCCAGGTTTTCGCTACCGACGATGCCAGTCGAGGATTATCCGACGCTGCCGACGCTGCCG +GAAGAGACCGGATTGTTGCCTGCGGAATTATTCGCCGAGGCAATCAGTCAGGTCGCTATCGCCGCCGGCC +GGGACGACACGCTGCCTATGTTGACCGGCATCCGGGTCGAAATCCTCGGTGAGACGGTGGTTTTGGCCGC +TACCGACAGGTTTCGCCTGGCTGTTCGAGAACTGAAGTGGTCGGCGTCGTCGCCAGATATCGAAGCGGCT +GTGCTGGTCCCGGCCAAGACGCTGGCCGAGGCCGCCAAAGCGGGCATCGGCGGCTCTGACGTTCGTTTGT +CGTTGGGTACTGGGCCGGGGGTGGGCAAGGATGGCCTGCTCGGTATCAGTGGGAACGGCAAGCGCAGCAC +CACGCGACTTCTTGATGCCGAGTTCCCGAAGTTTCGGCAGTTGCTACCAACCGAACACACCGCGGTGGCC +ACCATGGACGTGGCCGAGTTGATCGAAGCGATCAAGCTGGTTGCGTTGGTAGCTGATCGGGGCGCGCAGG +TGCGCATGGAGTTCGCTGATGGCAGCGTGCGGCTTTCTGCGGGTGCCGATGATGTTGGACGAGCCGAGGA +AGATCTTGTTGTTGACTATGCCGGTGAACCATTGACGATTGCGTTTAACCCAACCTATCTAACGGACGGT +TTGAGTTCGTTGCGCTCGGAGCGAGTGTCTTTCGGGTTTACGACTGCGGGTAAGCCTGCCTTGCTACGTC +CGGTGTCCGGGGACGATCGCCCTGTGGCGGGTCTGAATGGCAACGGTCCGTTCCCGGCGGTGTCGACGGA +CTATGTCTATCTGTTGATGCCGGTTCGGTTGCCGGGCTGAGCACTTGGCGCCCGGGTAGGTGTACGTCCG +TCATTTGGGGCTGCGTGACTTCCGGTCCTGGGCATGTGTAGATCTGGAATTGCATCCAGGGCGGACGGTT +TTTGTTGGGCCTAACGGTTATGGTAAGACGAATCTTATTGAGGCACTGTGGTATTCGACGACGTTAGGTT +CGCACCGCGTTAGCGCCGATTTGCCGTTGATCCGGGTAGGTACCGATCGTGCGGTGATCTCCACGATCGT +GGTGAACGACGGTAGAGAATGTGCCGTCGACCTCGAGATCGCCACGGGGCGAGTCAACAAAGCGCGATTG +AATCGATCATCGGTCCGAAGTACACGTGATGTGGTCGGAGTGCTTCGAGCTGTGTTGTTTGCCCCTGAGG +ATCTGGGGTTGGTTCGTGGGGATCCCGCTGACCGGCGGCGCTATCTGGATGATCTGGCGATCGTGCGTAG +GCCTGCGATCGCTGCGGTACGAGCCGAATATGAGAGGGTGGTGCGCCAGCGGACGGCGTTATTGAAGTCC +GTACCTGGAGCACGGTATCGGGGTGACCGGGGTGTGTTTGACACTCTTGAGGTATGGGACAGTCGTTTGG +CGGAGCACGGGGCTGAACTGGTGGCCGCCCGCATCGATTTGGTCAACCAGTTGGCACCGGAAGTGAAGAA +GGCATACCAGCTGTTGGCGCCGGAATCGCGATCGGCGTCTATCGGTTATCGGGCCAGCATGGATGTAACC +GGTCCCAGCGAGCAGTCAGATACCGATCGGCAATTGTTAGCAGCTCGGCTGTTGGCGGCGCTGGCGGCCC +GTCGGGATGCCGAACTCGAGCGTGGGGTTTGTCTAGTTGGTCCGCACCGTGACGACCTAATACTGCGACT +AGGCGATCAACCCGCGAAAGGATTTGCTAGCCATGGGGAGGCGTGGTCGTTGGCGGTGGCACTGCGGTTG +GCGGCCTATCAACTGTTACGCGTTGATGGTGGTGAGCCGGTGTTGTTGCTCGACGACGTGTTCGCCGAAC +TGGATGTCATGCGCCGTCGAGCGTTGGCGACGGCGGCCGAGTCCGCCGAACAGGTGTTGGTGACTGCCGC +GGTGCTCGAGGATATTCCCGCCGGCTGGGACGCCAGGCGGGTGCACATCGATGTGCGTGCCGATGACACC +GGATCGATGTCGGTGGTTCTGCCATGACGGGTTCTGTTGACCGGCCCGACCAGAATCGCGGTGAGCGATT +AATGAAGTCACCAGGGTTGGATTTGGTCAGGCGCACCCTGGACGAAGCTCGTGCTGCTGCCCGCGCGCGC +GGACAAGACGCCGGTCGAGGGCGGGTCGCTTCCGTTGCGTCGGGTCGGGTGGCCGGACGGCGACGAAGCT +GGTCGGGTCCGGGGCCCGACATTCGTGATCCACAACCGCTGGGTAAGGCCGCTCGTGAGCTGGCAAAGAA +ACGCGGCTGGTCGGTGCGGGTCGCCGAGGGTATGGTGCTCGGCCAGTGGTCTGCGGTGGTCGGCCACCAG +ATCGCCGAACATGCACGCCCGACTGCGCTAAACGACGGGGTGTTGAGCGTGATTGCGGAGTCGACGGCGT +GGGCGACGCAGTTGAGGATCATGCAGGCCCAGCTTCTGGCCAAGATCGCCGCAGCGGTTGGCAACGATGT +GGTGCGATCGCTAAAGATCACCGGGCCGGCGGCACCATCGTGGCGCAAGGGGCCTCGCCATATTGCCGGT +AGGGGTCCGCGCGACACCTACGGATAACACGTCGATCGGCCCAGAACAAGGCGCTCCGGTCCCGGCCTGA +GAGCCTCGAGGACGAAGCGGATCCGTATGCCGGACGTCGGGACGCACCAGGAAGAAAGATGTCCGACGCA +CGGCGCGGTTAGATGGGTAAAAACGAGGCCAGAAGATCGGCCCTGGCGCCCGATCACGGTACAGTGGTGT +GCGACCCCCTGCGGCGACTCAACCGCATGCACGCAACCCCTGAGGAGAGTATTCGGATCGTGGCTGCCCA +GAAAAAGAAGGCCCAAGACGAATACGGCGCTGCGTCTATCACCATTCTCGAAGGGCTGGAGGCCGTCCGC +AAACGTCCCGGCATGTACATTGGCTCGACCGGTGAGCGCGGTTTACACCATCTCATTTGGGAGGTGGTCG +ACAACGCGGTCGACGAGGCGATGGCCGGTTATGCAACCACAGTGAACGTAGTGCTGCTTGAGGATGGCGG +TGTCGAGGTCGCCGACGACGGCCGCGGCATTCCGGTCGCCACCCACGCCTCCGGCATACCGACCGTCGAC +GTGGTGATGACACAACTACATGCCGGCGGCAAGTTCGACTCGGACGCGTATGCGATATCTGGTGGTCTGC +ACGGCGTCGGCGTGTCGGTGGTTAACGCGCTATCCACCCGGCTCGAAGTCGAGATCAAGCGCGACGGGTA +CGAGTGGTCTCAGGTTTATGAGAAGTCGGAACCCCTGGGCCTCAAGCAAGGGGCGCCGACCAAGAAGACG +GGGTCAACGGTACGGTTCTGGGCCGACCCCGCTGTTTTCGAAACCACGGAATACGACTTCGAAACCGTCG +CCCGCCGGCTGCAAGAGATGGCGTTCCTCAACAAGGGGCTGACCATCAACCTGACCGACGAGAGGGTGAC +CCAAGACGAGGTCGTCGACGAAGTGGTCAGCGACGTCGCCGAGGCGCCGAAGTCGGCAAGTGAACGCGCA +GCCGAATCCACTGCACCGCACAAAGTTAAGAGCCGCACCTTTCACTATCCGGGTGGCCTGGTGGACTTCG +TGAAACACATCAACCGCACCAAGAACGCGATTCATAGCAGCATCGTGGACTTTTCCGGCAAGGGCACCGG +GCACGAGGTGGAGATCGCGATGCAATGGAACGCCGGGTATTCGGAGTCGGTGCACACCTTCGCCAACACC +ATCAACACCCACGAGGGCGGCACCCACGAAGAGGGCTTCCGCAGCGCGCTGACGTCGGTGGTGAACAAGT +ACGCCAAGGACCGCAAGCTACTGAAGGACAAGGACCCCAACCTCACCGGTGACGATATCCGGGAAGGCCT +GGCCGCTGTGATCTCGGTGAAGGTCAGCGAACCGCAGTTCGAGGGCCAGACCAAGACCAAGTTGGGCAAC +ACCGAGGTCAAATCGTTTGTGCAGAAGGTCTGTAATGAACAGCTGACCCACTGGTTTGAAGCCAACCCCA +CCGACTCGAAAGTCGTTGTGAACAAGGCTGTGTCCTCGGCGCAAGCCCGTATCGCGGCACGTAAGGCACG +AGAGTTGGTGCGGCGTAAGAGCGCCACCGACATCGGTGGATTGCCCGGCAAGCTGGCCGATTGCCGTTCC +ACGGATCCGCGCAAGTCCGAACTGTATGTCGTAGAAGGTGACTCGGCCGGCGGTTCTGCAAAAAGCGGTC +GCGATTCGATGTTCCAGGCGATACTTCCGCTGCGCGGCAAGATCATCAATGTGGAGAAAGCGCGCATCGA +CCGGGTGCTAAAGAACACCGAAGTTCAGGCGATCATCACGGCGCTGGGCACCGGGATCCACGACGAGTTC +GATATCGGCAAGCTGCGCTACCACAAGATCGTGCTGATGGCCGACGCCGATGTTGACGGCCAACATATTT +CCACGCTGTTGTTGACGTTGTTGTTCCGGTTCATGCGGCCGCTCATCGAGAACGGGCATGTGTTTTTGGC +ACAACCGCCGCTGTACAAACTCAAGTGGCAGCGCAGTGACCCGGAATTCGCATACTCCGACCGCGAGCGC diff -r 000000000000 -r 12f2b14549f6 test-data/NC_002945v4.yml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/NC_002945v4.yml Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,5 @@ +bovis: + - '11001110' + - '11011110' + - '11001100' + diff -r 000000000000 -r 12f2b14549f6 test-data/bam_input.bam Binary file test-data/bam_input.bam has changed diff -r 000000000000 -r 12f2b14549f6 test-data/cascade_table.xlsx Binary file test-data/cascade_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/fasta_indexes.loc --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/fasta_indexes.loc Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +89 89 Mycobacterium_AF2122 ${__HERE__}/NC_002945v4.fasta diff -r 000000000000 -r 12f2b14549f6 test-data/forward.fastq.gz Binary file test-data/forward.fastq.gz has changed diff -r 000000000000 -r 12f2b14549f6 test-data/input_avg_mq_json.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/input_avg_mq_json.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"name":null,"index":["NC_002945.4:1005705","NC_002945.4:1018313","NC_002945.4:1021422","NC_002945.4:1034434","NC_002945.4:1036102","NC_002945.4:1036530","NC_002945.4:1057","NC_002945.4:1096802","NC_002945.4:110198","NC_002945.4:1104019","NC_002945.4:1104291","NC_002945.4:1124266","NC_002945.4:1137800","NC_002945.4:1139489","NC_002945.4:114965","NC_002945.4:1159390","NC_002945.4:1160992","NC_002945.4:1168458","NC_002945.4:117800","NC_002945.4:1186381","NC_002945.4:1190076","NC_002945.4:1190080","NC_002945.4:1190084","NC_002945.4:1191092","NC_002945.4:1199529","NC_002945.4:1199530","NC_002945.4:1199951","NC_002945.4:1206896","NC_002945.4:1212203","NC_002945.4:1213847","NC_002945.4:1214540","NC_002945.4:1224899","NC_002945.4:1230875","NC_002945.4:1244746","NC_002945.4:1259250","NC_002945.4:1264712","NC_002945.4:127447","NC_002945.4:1277988","NC_002945.4:1295457","NC_002945.4:130166","NC_002945.4:130237","NC_002945.4:1312836","NC_002945.4:1314197","NC_002945.4:1333537","NC_002945.4:1335092","NC_002945.4:1341613","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1383731","NC_002945.4:1405922","NC_002945.4:140686","NC_002945.4:1412824","NC_002945.4:1412828","NC_002945.4:1412885","NC_002945.4:1412893","NC_002945.4:1421904","NC_002945.4:143799","NC_002945.4:1442194","NC_002945.4:144992","NC_002945.4:1463503","NC_002945.4:1467394","NC_002945.4:1470606","NC_002945.4:1479827","NC_002945.4:1481327","NC_002945.4:1484942","NC_002945.4:148871","NC_002945.4:1492328","NC_002945.4:1498639","NC_002945.4:1501932","NC_002945.4:1509487","NC_002945.4:1517866","NC_002945.4:1524526","NC_002945.4:1529147","NC_002945.4:1533175","NC_002945.4:1535299","NC_002945.4:1535303","NC_002945.4:1535366","NC_002945.4:1536267","NC_002945.4:1547426","NC_002945.4:1568090","NC_002945.4:1584881","NC_002945.4:1591357","NC_002945.4:159370","NC_002945.4:1594398","NC_002945.4:1597464","NC_002945.4:1597847","NC_002945.4:1600443","NC_002945.4:160535","NC_002945.4:1619153","NC_002945.4:1619361","NC_002945.4:1625561","NC_002945.4:1628068","NC_002945.4:1632869","NC_002945.4:165799","NC_002945.4:1659174","NC_002945.4:166696","NC_002945.4:1682044","NC_002945.4:1701507","NC_002945.4:1704859","NC_002945.4:1711760","NC_002945.4:1716413","NC_002945.4:1717086","NC_002945.4:1720220","NC_002945.4:1723583","NC_002945.4:1741553","NC_002945.4:1762390","NC_002945.4:1790296","NC_002945.4:179885","NC_002945.4:1799442","NC_002945.4:1803035","NC_002945.4:1806623","NC_002945.4:1817260","NC_002945.4:1828312","NC_002945.4:1833330","NC_002945.4:1863248","NC_002945.4:1871114","NC_002945.4:1880430","NC_002945.4:189083","NC_002945.4:1894922","NC_002945.4:1896107","NC_002945.4:1911237","NC_002945.4:1915461","NC_002945.4:1915936","NC_002945.4:1920100","NC_002945.4:192177","NC_002945.4:1932972","NC_002945.4:1941781","NC_002945.4:1954048","NC_002945.4:1957978","NC_002945.4:1958977","NC_002945.4:1961656","NC_002945.4:1961826","NC_002945.4:1974665","NC_002945.4:198890","NC_002945.4:1989922","NC_002945.4:1996251","NC_002945.4:2002061","NC_002945.4:2007303","NC_002945.4:2010421","NC_002945.4:2020061","NC_002945.4:2021640","NC_002945.4:2024890","NC_002945.4:2027869","NC_002945.4:2035774","NC_002945.4:2036697","NC_002945.4:2049171","NC_002945.4:2057553","NC_002945.4:2059249","NC_002945.4:2059920","NC_002945.4:2075405","NC_002945.4:2078648","NC_002945.4:2093479","NC_002945.4:2096812","NC_002945.4:2099043","NC_002945.4:2118096","NC_002945.4:2121160","NC_002945.4:2137049","NC_002945.4:2138896","NC_002945.4:2145868","NC_002945.4:2163576","NC_002945.4:2204661","NC_002945.4:2210027","NC_002945.4:2239061","NC_002945.4:223919","NC_002945.4:2257546","NC_002945.4:2267557","NC_002945.4:2268821","NC_002945.4:228109","NC_002945.4:2283200","NC_002945.4:2283218","NC_002945.4:2283220","NC_002945.4:2283227","NC_002945.4:2283235","NC_002945.4:2283236","NC_002945.4:2283350","NC_002945.4:2283353","NC_002945.4:2283355","NC_002945.4:2283362","NC_002945.4:2283366","NC_002945.4:2283367","NC_002945.4:2283368","NC_002945.4:2283371","NC_002945.4:230661","NC_002945.4:2308525","NC_002945.4:2310215","NC_002945.4:232188","NC_002945.4:2333994","NC_002945.4:2339770","NC_002945.4:2358298","NC_002945.4:2360219","NC_002945.4:2368982","NC_002945.4:2369407","NC_002945.4:2378324","NC_002945.4:2381437","NC_002945.4:2384647","NC_002945.4:2410761","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:2418267","NC_002945.4:2428397","NC_002945.4:2433602","NC_002945.4:2479007","NC_002945.4:2492067","NC_002945.4:2497022","NC_002945.4:2499336","NC_002945.4:2506199","NC_002945.4:2508626","NC_002945.4:2513801","NC_002945.4:2515130","NC_002945.4:2520576","NC_002945.4:2524942","NC_002945.4:2528517","NC_002945.4:2529413","NC_002945.4:2532958","NC_002945.4:2538021","NC_002945.4:2539896","NC_002945.4:2549198","NC_002945.4:2573831","NC_002945.4:2615591","NC_002945.4:2631265","NC_002945.4:2656304","NC_002945.4:2656651","NC_002945.4:2662768","NC_002945.4:2663582","NC_002945.4:2667489","NC_002945.4:2683485","NC_002945.4:2688315","NC_002945.4:2729845","NC_002945.4:2747797","NC_002945.4:2749502","NC_002945.4:2758761","NC_002945.4:2767533","NC_002945.4:2770129","NC_002945.4:2794510","NC_002945.4:2806603","NC_002945.4:2807510","NC_002945.4:2807511","NC_002945.4:2809255","NC_002945.4:2819758","NC_002945.4:2823105","NC_002945.4:2870414","NC_002945.4:2870624","NC_002945.4:2873027","NC_002945.4:2884747","NC_002945.4:2886118","NC_002945.4:2890220","NC_002945.4:2893045","NC_002945.4:2899163","NC_002945.4:2899584","NC_002945.4:2900525","NC_002945.4:29061","NC_002945.4:2918203","NC_002945.4:2924775","NC_002945.4:2927134","NC_002945.4:2931071","NC_002945.4:2931113","NC_002945.4:2942926","NC_002945.4:2946800","NC_002945.4:295519","NC_002945.4:2956778","NC_002945.4:2964207","NC_002945.4:2978162","NC_002945.4:2978164","NC_002945.4:2983580","NC_002945.4:2984156","NC_002945.4:299636","NC_002945.4:3018593","NC_002945.4:3031841","NC_002945.4:3039600","NC_002945.4:3040820","NC_002945.4:3042914","NC_002945.4:304339","NC_002945.4:3045025","NC_002945.4:3053649","NC_002945.4:3053756","NC_002945.4:3063074","NC_002945.4:3068041","NC_002945.4:3069493","NC_002945.4:3070642","NC_002945.4:3088868","NC_002945.4:3093531","NC_002945.4:3098932","NC_002945.4:3100639","NC_002945.4:3103354","NC_002945.4:3106064","NC_002945.4:3106527","NC_002945.4:3116059","NC_002945.4:3127117","NC_002945.4:3137471","NC_002945.4:3140342","NC_002945.4:3151212","NC_002945.4:3154140","NC_002945.4:3172929","NC_002945.4:3173568","NC_002945.4:3191792","NC_002945.4:319911","NC_002945.4:3247551","NC_002945.4:3250072","NC_002945.4:3250245","NC_002945.4:3270181","NC_002945.4:3294771","NC_002945.4:3295991","NC_002945.4:3297558","NC_002945.4:3304410","NC_002945.4:3304946","NC_002945.4:3306898","NC_002945.4:331051","NC_002945.4:3310831","NC_002945.4:331241","NC_002945.4:331411","NC_002945.4:3319244","NC_002945.4:332124","NC_002945.4:332128","NC_002945.4:332144","NC_002945.4:332145","NC_002945.4:332154","NC_002945.4:332215","NC_002945.4:332218","NC_002945.4:333010","NC_002945.4:3330907","NC_002945.4:3338298","NC_002945.4:3347870","NC_002945.4:3368453","NC_002945.4:3371156","NC_002945.4:3373966","NC_002945.4:33788","NC_002945.4:3396621","NC_002945.4:3396650","NC_002945.4:340088","NC_002945.4:340090","NC_002945.4:340091","NC_002945.4:340092","NC_002945.4:340097","NC_002945.4:3413486","NC_002945.4:3414355","NC_002945.4:3421983","NC_002945.4:3422650","NC_002945.4:3439578","NC_002945.4:3451869","NC_002945.4:3453219","NC_002945.4:3460907","NC_002945.4:3464357","NC_002945.4:3464485","NC_002945.4:3464524","NC_002945.4:3468669","NC_002945.4:3476130","NC_002945.4:3482644","NC_002945.4:3484836","NC_002945.4:3486507","NC_002945.4:3493554","NC_002945.4:3495510","NC_002945.4:3497957","NC_002945.4:3533661","NC_002945.4:3546799","NC_002945.4:3553753","NC_002945.4:3564896","NC_002945.4:3567535","NC_002945.4:3574014","NC_002945.4:3574955","NC_002945.4:3591452","NC_002945.4:3600600","NC_002945.4:3622899","NC_002945.4:3624371","NC_002945.4:3626128","NC_002945.4:362818","NC_002945.4:3630061","NC_002945.4:364560","NC_002945.4:3645682","NC_002945.4:364804","NC_002945.4:3655045","NC_002945.4:366022","NC_002945.4:3667823","NC_002945.4:3712401","NC_002945.4:3718169","NC_002945.4:3718628","NC_002945.4:3719802","NC_002945.4:3723554","NC_002945.4:3725203","NC_002945.4:3729351","NC_002945.4:3751627","NC_002945.4:3769174","NC_002945.4:3776764","NC_002945.4:3778473","NC_002945.4:3800223","NC_002945.4:3805467","NC_002945.4:3816878","NC_002945.4:3821259","NC_002945.4:3839650","NC_002945.4:3846859","NC_002945.4:3874432","NC_002945.4:3877448","NC_002945.4:3884519","NC_002945.4:3888418","NC_002945.4:3902781","NC_002945.4:3905690","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:3957298","NC_002945.4:3966140","NC_002945.4:3969490","NC_002945.4:3969558","NC_002945.4:3969875","NC_002945.4:4003460","NC_002945.4:4008509","NC_002945.4:4010760","NC_002945.4:4017319","NC_002945.4:4018300","NC_002945.4:4029201","NC_002945.4:4046572","NC_002945.4:4070056","NC_002945.4:407246","NC_002945.4:4076594","NC_002945.4:4077189","NC_002945.4:4080736","NC_002945.4:4096612","NC_002945.4:41228","NC_002945.4:4128841","NC_002945.4:4130927","NC_002945.4:41437","NC_002945.4:4149101","NC_002945.4:4155870","NC_002945.4:4159272","NC_002945.4:4160820","NC_002945.4:4162407","NC_002945.4:4162554","NC_002945.4:4180986","NC_002945.4:4205111","NC_002945.4:4207380","NC_002945.4:4214259","NC_002945.4:4219009","NC_002945.4:4222196","NC_002945.4:4226875","NC_002945.4:4231626","NC_002945.4:4236320","NC_002945.4:4245762","NC_002945.4:4251588","NC_002945.4:4264139","NC_002945.4:4278315","NC_002945.4:4281136","NC_002945.4:4282825","NC_002945.4:4293932","NC_002945.4:4298964","NC_002945.4:430077","NC_002945.4:4303164","NC_002945.4:4311425","NC_002945.4:4321337","NC_002945.4:4339036","NC_002945.4:4347304","NC_002945.4:438482","NC_002945.4:441762","NC_002945.4:4480","NC_002945.4:449922","NC_002945.4:452398","NC_002945.4:460722","NC_002945.4:467343","NC_002945.4:467402","NC_002945.4:479644","NC_002945.4:483845","NC_002945.4:485584","NC_002945.4:488897","NC_002945.4:490878","NC_002945.4:50470","NC_002945.4:507929","NC_002945.4:518522","NC_002945.4:519412","NC_002945.4:541571","NC_002945.4:544180","NC_002945.4:577068","NC_002945.4:59861","NC_002945.4:598704","NC_002945.4:600207","NC_002945.4:611077","NC_002945.4:622386","NC_002945.4:641896","NC_002945.4:642875","NC_002945.4:644245","NC_002945.4:649910","NC_002945.4:652349","NC_002945.4:673880","NC_002945.4:680416","NC_002945.4:685069","NC_002945.4:69913","NC_002945.4:70082","NC_002945.4:701329","NC_002945.4:701386","NC_002945.4:70438","NC_002945.4:712319","NC_002945.4:723170","NC_002945.4:726979","NC_002945.4:737636","NC_002945.4:738102","NC_002945.4:745507","NC_002945.4:760347","NC_002945.4:792617","NC_002945.4:79918","NC_002945.4:804997","NC_002945.4:808601","NC_002945.4:811737","NC_002945.4:812709","NC_002945.4:828003","NC_002945.4:832093","NC_002945.4:833960","NC_002945.4:839308","NC_002945.4:843812","NC_002945.4:854043","NC_002945.4:865821","NC_002945.4:870116","NC_002945.4:8741","NC_002945.4:884432","NC_002945.4:889897","NC_002945.4:905912","NC_002945.4:917766","NC_002945.4:920753","NC_002945.4:941068","NC_002945.4:942431","NC_002945.4:943719","NC_002945.4:946102","NC_002945.4:948022","NC_002945.4:948811","NC_002945.4:948974","NC_002945.4:960995","NC_002945.4:96244","NC_002945.4:965529","NC_002945.4:967989","NC_002945.4:973459","NC_002945.4:974604","NC_002945.4:976327","NC_002945.4:982301","NC_002945.4:990611","NC_002945.4:997676","NC_002945.4:998183","NC_002945.4:998196"],"data":[60,60,59,60,59,59,60,55,60,60,60,60,59,59,60,60,59,60,59,60,52,55,56,59,59,59,60,59,60,60,59,59,60,60,59,59,59,56,60,60,60,59,59,59,60,59,60,60,60,60,60,59,59,56,56,59,60,60,59,60,59,58,60,59,59,59,60,59,59,59,59,59,60,59,58,57,57,60,59,60,60,59,59,60,59,60,59,59,59,59,60,59,59,60,60,59,60,60,59,59,60,60,60,60,60,59,60,60,59,59,60,60,60,59,59,59,60,59,60,59,60,60,59,60,60,60,59,59,59,59,60,59,60,60,59,59,59,59,59,59,59,59,59,60,60,59,59,60,60,60,59,60,59,59,60,60,59,59,59,59,59,60,60,60,60,59,60,59,59,59,59,59,59,59,60,60,60,60,60,60,60,60,60,59,60,60,59,60,60,60,59,59,60,59,60,60,60,60,59,60,60,59,60,60,59,59,60,60,60,59,60,60,59,59,60,59,60,60,59,59,60,60,60,59,60,59,59,59,59,60,60,59,59,59,60,60,60,59,59,60,60,59,60,60,60,59,59,59,60,59,59,59,60,59,59,60,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,60,60,60,59,60,60,60,59,59,60,59,60,59,59,59,59,59,60,59,60,59,60,59,60,59,59,59,60,60,60,60,59,59,59,60,60,60,60,60,60,59,59,59,59,59,59,59,59,60,58,60,59,60,59,60,59,59,57,57,57,57,57,60,60,60,59,59,59,59,59,59,60,59,59,60,60,60,59,60,60,59,60,59,60,60,60,60,60,60,60,59,60,59,58,59,59,59,60,59,60,59,59,59,59,59,59,59,59,60,60,60,60,60,59,60,60,59,59,59,59,59,59,59,60,59,60,60,59,60,59,60,60,60,59,60,60,59,59,60,60,60,59,59,59,59,60,59,60,59,59,60,59,59,60,60,59,60,59,60,60,60,59,60,59,59,60,60,60,59,60,59,59,59,59,60,59,59,59,60,60,59,59,60,59,60,60,59,60,60,59,59,59,59,59,60,60,59,59,59,59,59,60,60,60,60,60,59,60,59,60,60,60,60,60,59,60,59,59,60,59,59,59,59,60,59,59,59,60,60,59,58,60,59,60,59,59,60,59,59,59,60,59,59,60,59,59,59,60,59,59,59,59,59,60,60,60,59,59,59,60,60]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/input_newick.newick --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/input_newick.newick Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +(root,((((SRR1792271_zc,SRR1792272_zc),SRR1791772_zc),SRR8073662_zc),SRR1791698_zc_vcf),SRR1792265_zc); diff -r 000000000000 -r 12f2b14549f6 test-data/input_snps_json.json --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/input_snps_json.json Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +{"columns":["NC_002945.4:1005705","NC_002945.4:1348342","NC_002945.4:1382465","NC_002945.4:1463503","NC_002945.4:1704859","NC_002945.4:1723583","NC_002945.4:1911237","NC_002945.4:1961826","NC_002945.4:228109","NC_002945.4:2412437","NC_002945.4:2413021","NC_002945.4:3069493","NC_002945.4:3319244","NC_002945.4:3373966","NC_002945.4:3413486","NC_002945.4:3941254","NC_002945.4:3942270","NC_002945.4:4236320","NC_002945.4:4278315","NC_002945.4:960995","NC_002945.4:997676"],"index":["SRR1792265_zc","SRR1792272_zc","SRR1792271_zc","SRR8073662_zc","SRR1791772_zc","SRR1791698_zc_vcf","root"],"data":[["C","G","G","A","C","G","C","G","C","R","C","A","C","G","A","G","A","G","T","T","C"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","A","C","C","T","A","T","C","A","A","G","A","A","A","C","G","T"],["G","A","G","A","C","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","C","G","T","G","C","C","T","A","T","C","A","G","G","G","A","G","C","G","T"],["G","A","G","A","C","G","T","C","T","A","T","C","A","G","G","G","C","G","C","G","T"],["C","G","G","A","C","G","C","G","C","G","T","C","A","G","G","G","A","G","C","T","C"]]} \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/output_dbkey.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/output_dbkey.txt Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +AF2122 \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/output_metrics.tabular --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/output_metrics.tabular Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,2 @@ +# File Number of Good SNPs Average Coverage Genome Coverage + 0 diff -r 000000000000 -r 12f2b14549f6 test-data/output_metrics.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/output_metrics.txt Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,6 @@ +Sample: Mcap_Deer_DE_SRR650221 +Brucella counts: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, +TB counts: 2,2,0,0,4,5,0,0, +Para counts: 0,0,0, +Group: TB +dbkey: AF2122 diff -r 000000000000 -r 12f2b14549f6 test-data/output_vcf.vcf --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/output_vcf.vcf Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,100 @@ +##fileformat=VCFv4.2 +##fileDate=20200302 +##source=freeBayes v1.3.1-dirty +##reference=/home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa +##contig= +##phasing=none +##commandline="freebayes --region NC_002945.4:0..4349904 --bam b_0.bam --fasta-reference /home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa --vcf ./vcf_output/part_NC_002945.4:0..4349904.vcf -u -n 0 --haplotype-length -1 --min-repeat-size 5 --min-repeat-entropy 1 -m 1 -q 0 -R 0 -Y 0 -e 1 -F 0.05 -C 2 -G 1 --min-alternate-qsum 0" +##filter="QUAL > 0" +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 13-1941-6 +NC_002945.4 1 . N . . . . GT ./. +NC_002945.4 2 . N . . . . GT ./. +NC_002945.4 3 . N . . . . GT ./. +NC_002945.4 4 . N . . . . GT ./. +NC_002945.4 5 . N . . . . GT ./. +NC_002945.4 6 . N . . . . GT ./. +NC_002945.4 7 . N . . . . GT ./. +NC_002945.4 8 . N . . . . GT ./. +NC_002945.4 9 . N . . . . GT ./. +NC_002945.4 10 . N . . . . GT ./. +NC_002945.4 11 . N . . . . GT ./. +NC_002945.4 12 . N . . . . GT ./. +NC_002945.4 13 . N . . . . GT ./. +NC_002945.4 14 . N . . . . GT ./. +NC_002945.4 15 . N . . . . GT ./. +NC_002945.4 16 . N . . . . GT ./. +NC_002945.4 17 . N . . . . GT ./. +NC_002945.4 18 . N . . . . GT ./. +NC_002945.4 19 . N . . . . GT ./. +NC_002945.4 20 . N . . . . GT ./. +NC_002945.4 21 . N . . . . GT ./. +NC_002945.4 22 . N . . . . GT ./. +NC_002945.4 23 . N . . . . GT ./. +NC_002945.4 24 . N . . . . GT ./. +NC_002945.4 25 . N . . . . GT ./. +NC_002945.4 26 . N . . . . GT ./. +NC_002945.4 27 . N . . . . GT ./. +NC_002945.4 28 . N . . . . GT ./. +NC_002945.4 29 . N . . . . GT ./. +NC_002945.4 30 . N . . . . GT ./. +NC_002945.4 31 . N . . . . GT ./. +NC_002945.4 32 . N . . . . GT ./. +NC_002945.4 33 . N . . . . GT ./. +NC_002945.4 34 . N . . . . GT ./. +NC_002945.4 35 . N . . . . GT ./. +NC_002945.4 36 . N . . . . GT ./. +NC_002945.4 37 . N . . . . GT ./. diff -r 000000000000 -r 12f2b14549f6 test-data/paired_dbkey.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/paired_dbkey.txt Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,1 @@ +AF2122 \ No newline at end of file diff -r 000000000000 -r 12f2b14549f6 test-data/paired_metrics.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/paired_metrics.txt Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,6 @@ +Sample: forward +Brucella counts: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, +TB counts: 4,4,0,0,8,10,0,0, +Para counts: 0,0,0, +Group: TB +dbkey: AF2122 diff -r 000000000000 -r 12f2b14549f6 test-data/reverse.fastq.gz Binary file test-data/reverse.fastq.gz has changed diff -r 000000000000 -r 12f2b14549f6 test-data/sort_table.xlsx Binary file test-data/sort_table.xlsx has changed diff -r 000000000000 -r 12f2b14549f6 test-data/vcf_input.vcf --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/vcf_input.vcf Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,64 @@ +##fileformat=VCFv4.2 +##fileDate=20200302 +##source=freeBayes v1.3.1-dirty +##reference=/home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa +##contig= +##phasing=none +##commandline="freebayes --region NC_002945.4:0..4349904 --bam b_0.bam --fasta-reference /home/galaxy/galaxy/tool-data/AF2122/seq/AF2122.fa --vcf ./vcf_output/part_NC_002945.4:0..4349904.vcf -u -n 0 --haplotype-length -1 --min-repeat-size 5 --min-repeat-entropy 1 -m 1 -q 0 -R 0 -Y 0 -e 1 -F 0.05 -C 2 -G 1 --min-alternate-qsum 0" +##filter="QUAL > 0" +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##INFO= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +##FORMAT= +#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 13-1941-6 +NC_002945.4 2898437 . T G 0.263449 . AB=0;ABP=0;AC=0;AF=0;AN=2;AO=2;CIGAR=1X;DP=2;DPB=2;DPRA=0;EPP=3.0103;EPPR=0;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=2.77259;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=0;QR=0;RO=0;RPL=2;RPP=7.35324;RPPR=0;RPR=0;RUN=1;SAF=1;SAP=3.0103;SAR=1;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.ILLUMINA=1 GT:DP:AD:RO:QR:AO:QA:GL 0/0:2:0,2:0:0:2:0:0,-0.60206,-8.68589e-09 diff -r 000000000000 -r 12f2b14549f6 test-data/vsnp_dnaprints.loc --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/vsnp_dnaprints.loc Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,4 @@ +## vSNP DNAprints files +#Value Name Path Description +AF2122 Mycobacterium_AF2122/NC_002945v4.yml ${__HERE__}/NC_002945v4.yml DNAprints file for Mycobacterium bovis AF2122/97 +#NC_006932 Brucella_abortus1/NC_006932-NC_006933.yml /vsnp/NC_006932/Brucella_abortus1/NC_006932-NC_006933.yml DNAprints file for Brucella abortus bv. 1 str. 9-941 diff -r 000000000000 -r 12f2b14549f6 tool-data/fasta_indexes.loc.sample --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool-data/fasta_indexes.loc.sample Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,29 @@ +#This is a sample file distributed with Galaxy that enables tools +#to use a directory of Samtools indexed sequences data files. You will need +#to create these data files and then create a fasta_indexes.loc file +#similar to this one (store it in this directory) that points to +#the directories in which those files are stored. The fasta_indexes.loc +#file has this format (white space characters are TAB characters): +# +# +# +#So, for example, if you had hg19 Canonical indexed stored in +# +# /depot/data2/galaxy/hg19/sam/, +# +#then the fasta_indexes.loc entry would look like this: +# +#hg19canon hg19 Human (Homo sapiens): hg19 Canonical /depot/data2/galaxy/hg19/sam/hg19canon.fa +# +#and your /depot/data2/galaxy/hg19/sam/ directory +#would contain hg19canon.fa and hg19canon.fa.fai files. +# +#Your fasta_indexes.loc file should include an entry per line for +#each index set you have stored. The file in the path does actually +#exist, but it should never be directly used. Instead, the name serves +#as a prefix for the index file. For example: +# +#hg18canon hg18 Human (Homo sapiens): hg18 Canonical /depot/data2/galaxy/hg18/sam/hg18canon.fa +#hg18full hg18 Human (Homo sapiens): hg18 Full /depot/data2/galaxy/hg18/sam/hg18full.fa +#hg19canon hg19 Human (Homo sapiens): hg19 Canonical /depot/data2/galaxy/hg19/sam/hg19canon.fa +#hg19full hg19 Human (Homo sapiens): hg19 Full /depot/data2/galaxy/hg19/sam/hg19full.fa diff -r 000000000000 -r 12f2b14549f6 tool-data/vsnp_dnaprints.loc.sample --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool-data/vsnp_dnaprints.loc.sample Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,4 @@ +## vSNP DNAprints files +#Value Name Path Description +#AF2122 Mycobacterium_AF2122/NC_002945v4.yml /vsnp/AF2122/Mycobacterium_AF2122/NC_002945v4.yml DNAprints file for Mycobacterium bovis AF2122/97 +#NC_006932 Brucella_abortus1/NC_006932-NC_006933.yml /vsnp/NC_006932/Brucella_abortus1/NC_006932-NC_006933.yml DNAprints file for Brucella abortus bv. 1 str. 9-941 diff -r 000000000000 -r 12f2b14549f6 tool-data/vsnp_genbank.loc.sample --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool-data/vsnp_genbank.loc.sample Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,4 @@ +## vSNP Genbank files +#Value Name Path Description +#AF2122 Mycobacterium_AF2122/NC_002945v4.gbk vsnp/AF2122/Mycobacterium_AF2122/NC_002945v4.gbk Genbank file for Mycobacterium bovis AF2122/97 +#NC_006932 Brucella_abortus1/NC_006932-NC_006933.gbk vsnp/NC_006932/Brucella_abortus1/NC_006932-NC_006933.gbk Genbank file for Brucella abortus bv. 1 str. 9-941 diff -r 000000000000 -r 12f2b14549f6 tool_data_table_conf.xml.sample --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool_data_table_conf.xml.sample Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,18 @@ + + + + value, dbkey, name, path + +
+ + + value, name, path, description + +
+ + + value, name, path, description + +
+
+ diff -r 000000000000 -r 12f2b14549f6 tool_data_table_conf.xml.test --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool_data_table_conf.xml.test Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,15 @@ + + + value, dbkey, name, path + +
+ + + value, name, path, description + +
+ + value, name, path, description + +
+
diff -r 000000000000 -r 12f2b14549f6 vsnp_add_zero_coverage.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/vsnp_add_zero_coverage.py Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,189 @@ +#!/usr/bin/env python + +import argparse +import multiprocessing +import os +import queue +import re +import shutil + +import pandas +import pysam +from Bio import SeqIO + +INPUT_BAM_DIR = 'input_bam_dir' +INPUT_VCF_DIR = 'input_vcf_dir' +OUTPUT_VCF_DIR = 'output_vcf_dir' +OUTPUT_METRICS_DIR = 'output_metrics_dir' + + +def get_base_file_name(file_path): + base_file_name = os.path.basename(file_path) + if base_file_name.find(".") > 0: + # Eliminate the extension. + return os.path.splitext(base_file_name)[0] + elif base_file_name.endswith("_vcf"): + # The "." character has likely + # changed to an "_" character. + return base_file_name.rstrip("_vcf") + return base_file_name + + +def get_coverage_and_snp_count(task_queue, reference, output_metrics, output_vcf, timeout): + while True: + try: + tup = task_queue.get(block=True, timeout=timeout) + except queue.Empty: + break + bam_file, vcf_file = tup + # Create a coverage dictionary. + coverage_dict = {} + coverage_list = pysam.depth(bam_file, split_lines=True) + for line in coverage_list: + chrom, position, depth = line.split('\t') + coverage_dict["%s-%s" % (chrom, position)] = depth + # Convert it to a data frame. + coverage_df = pandas.DataFrame.from_dict(coverage_dict, orient='index', columns=["depth"]) + # Create a zero coverage dictionary. + zero_dict = {} + for record in SeqIO.parse(reference, "fasta"): + chrom = record.id + total_len = len(record.seq) + for pos in list(range(1, total_len + 1)): + zero_dict["%s-%s" % (str(chrom), str(pos))] = 0 + # Convert it to a data frame with depth_x + # and depth_y columns - index is NaN. + zero_df = pandas.DataFrame.from_dict(zero_dict, orient='index', columns=["depth"]) + coverage_df = zero_df.merge(coverage_df, left_index=True, right_index=True, how='outer') + # depth_x "0" column no longer needed. + coverage_df = coverage_df.drop(columns=['depth_x']) + coverage_df = coverage_df.rename(columns={'depth_y': 'depth'}) + # Covert the NaN to 0 coverage and get some metrics. + coverage_df = coverage_df.fillna(0) + coverage_df['depth'] = coverage_df['depth'].apply(int) + total_length = len(coverage_df) + average_coverage = coverage_df['depth'].mean() + zero_df = coverage_df[coverage_df['depth'] == 0] + total_zero_coverage = len(zero_df) + total_coverage = total_length - total_zero_coverage + genome_coverage = "{:.2%}".format(total_coverage / total_length) + # Process the associated VCF input. + column_names = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "Sample"] + vcf_df = pandas.read_csv(vcf_file, sep='\t', header=None, names=column_names, comment='#') + good_snp_count = len(vcf_df[(vcf_df['ALT'].str.len() == 1) & (vcf_df['REF'].str.len() == 1) & (vcf_df['QUAL'] > 150)]) + base_file_name = get_base_file_name(vcf_file) + if total_zero_coverage > 0: + header_file = "%s_header.csv" % base_file_name + with open(header_file, 'w') as outfile: + with open(vcf_file) as infile: + for line in infile: + if re.search('^#', line): + outfile.write("%s" % line) + vcf_df_snp = vcf_df[vcf_df['REF'].str.len() == 1] + vcf_df_snp = vcf_df_snp[vcf_df_snp['ALT'].str.len() == 1] + vcf_df_snp['ABS_VALUE'] = vcf_df_snp['CHROM'].map(str) + "-" + vcf_df_snp['POS'].map(str) + vcf_df_snp = vcf_df_snp.set_index('ABS_VALUE') + cat_df = pandas.concat([vcf_df_snp, zero_df], axis=1, sort=False) + cat_df = cat_df.drop(columns=['CHROM', 'POS', 'depth']) + cat_df[['ID', 'ALT', 'QUAL', 'FILTER', 'INFO']] = cat_df[['ID', 'ALT', 'QUAL', 'FILTER', 'INFO']].fillna('.') + cat_df['REF'] = cat_df['REF'].fillna('N') + cat_df['FORMAT'] = cat_df['FORMAT'].fillna('GT') + cat_df['Sample'] = cat_df['Sample'].fillna('./.') + cat_df['temp'] = cat_df.index.str.rsplit('-', n=1) + cat_df[['CHROM', 'POS']] = pandas.DataFrame(cat_df.temp.values.tolist(), index=cat_df.index) + cat_df = cat_df[['CHROM', 'POS', 'ID', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO', 'FORMAT', 'Sample']] + cat_df['POS'] = cat_df['POS'].astype(int) + cat_df = cat_df.sort_values(['CHROM', 'POS']) + body_file = "%s_body.csv" % base_file_name + cat_df.to_csv(body_file, sep='\t', header=False, index=False) + if output_vcf is None: + output_vcf_file = os.path.join(OUTPUT_VCF_DIR, "%s.vcf" % base_file_name) + else: + output_vcf_file = output_vcf + with open(output_vcf_file, "w") as outfile: + for cf in [header_file, body_file]: + with open(cf, "r") as infile: + for line in infile: + outfile.write("%s" % line) + else: + if output_vcf is None: + output_vcf_file = os.path.join(OUTPUT_VCF_DIR, "%s.vcf" % base_file_name) + else: + output_vcf_file = output_vcf + shutil.copyfile(vcf_file, output_vcf_file) + bam_metrics = [base_file_name, "", "%4f" % average_coverage, genome_coverage] + vcf_metrics = [base_file_name, str(good_snp_count), "", ""] + if output_metrics is None: + output_metrics_file = os.path.join(OUTPUT_METRICS_DIR, "%s.tabular" % base_file_name) + else: + output_metrics_file = output_metrics + metrics_columns = ["File", "Number of Good SNPs", "Average Coverage", "Genome Coverage"] + with open(output_metrics_file, "w") as fh: + fh.write("# %s\n" % "\t".join(metrics_columns)) + fh.write("%s\n" % "\t".join(bam_metrics)) + fh.write("%s\n" % "\t".join(vcf_metrics)) + task_queue.task_done() + + +def set_num_cpus(num_files, processes): + num_cpus = int(multiprocessing.cpu_count()) + if num_files < num_cpus and num_files < processes: + return num_files + if num_cpus < processes: + half_cpus = int(num_cpus / 2) + if num_files < half_cpus: + return num_files + return half_cpus + return processes + + +if __name__ == '__main__': + parser = argparse.ArgumentParser() + + parser.add_argument('--output_metrics', action='store', dest='output_metrics', required=False, default=None, help='Output metrics text file') + parser.add_argument('--output_vcf', action='store', dest='output_vcf', required=False, default=None, help='Output VCF file') + parser.add_argument('--reference', action='store', dest='reference', help='Reference dataset') + parser.add_argument('--processes', action='store', dest='processes', type=int, help='User-selected number of processes to use for job splitting') + + args = parser.parse_args() + + # The assumption here is that the list of files + # in both INPUT_BAM_DIR and INPUT_VCF_DIR are + # equal in number and named such that they are + # properly matched if the directories contain + # more than 1 file (i.e., hopefully the bam file + # names and vcf file names will be something like + # Mbovis-01D6_* so they can be # sorted and properly + # associated with each other). + bam_files = [] + for file_name in sorted(os.listdir(INPUT_BAM_DIR)): + file_path = os.path.abspath(os.path.join(INPUT_BAM_DIR, file_name)) + bam_files.append(file_path) + vcf_files = [] + for file_name in sorted(os.listdir(INPUT_VCF_DIR)): + file_path = os.path.abspath(os.path.join(INPUT_VCF_DIR, file_name)) + vcf_files.append(file_path) + + multiprocessing.set_start_method('spawn') + queue1 = multiprocessing.JoinableQueue() + num_files = len(bam_files) + cpus = set_num_cpus(num_files, args.processes) + # Set a timeout for get()s in the queue. + timeout = 0.05 + + # Add each associated bam and vcf file pair to the queue. + for i, bam_file in enumerate(bam_files): + vcf_file = vcf_files[i] + queue1.put((bam_file, vcf_file)) + + # Complete the get_coverage_and_snp_count task. + processes = [multiprocessing.Process(target=get_coverage_and_snp_count, args=(queue1, args.reference, args.output_metrics, args.output_vcf, timeout, )) for _ in range(cpus)] + for p in processes: + p.start() + for p in processes: + p.join() + queue1.join() + + if queue1.empty(): + queue1.close() + queue1.join_thread() diff -r 000000000000 -r 12f2b14549f6 vsnp_build_tables.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/vsnp_build_tables.py Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,382 @@ +#!/usr/bin/env python + +import argparse +import multiprocessing +import os +import queue +import re + +import pandas +import pandas.io.formats.excel +from Bio import SeqIO + +INPUT_JSON_AVG_MQ_DIR = 'input_json_avg_mq_dir' +INPUT_JSON_DIR = 'input_json_dir' +INPUT_NEWICK_DIR = 'input_newick_dir' +# Maximum columns allowed in a LibreOffice +# spreadsheet is 1024. Excel allows for +# 16,384 columns, but we'll set the lower +# number as the maximum. Some browsers +# (e.g., Firefox on Linux) are configured +# to use LibreOffice for Excel spreadsheets. +MAXCOLS = 1024 +OUTPUT_EXCEL_DIR = 'output_excel_dir' + + +def annotate_table(table_df, group, annotation_dict): + for gbk_chrome, pro in list(annotation_dict.items()): + ref_pos = list(table_df) + ref_series = pandas.Series(ref_pos) + ref_df = pandas.DataFrame(ref_series.str.split(':', expand=True).values, columns=['reference', 'position']) + all_ref = ref_df[ref_df['reference'] == gbk_chrome] + positions = all_ref.position.to_frame() + # Create an annotation file. + annotation_file = "%s_annotations.csv" % group + with open(annotation_file, "a") as fh: + for _, row in positions.iterrows(): + pos = row.position + try: + aaa = pro.iloc[pro.index.get_loc(int(pos))][['chrom', 'locus', 'product', 'gene']] + try: + chrom, name, locus, tag = aaa.values[0] + print("{}:{}\t{}, {}, {}".format(chrom, pos, locus, tag, name), file=fh) + except ValueError: + # If only one annotation for the entire + # chromosome (e.g., flu) then having [0] fails + chrom, name, locus, tag = aaa.values + print("{}:{}\t{}, {}, {}".format(chrom, pos, locus, tag, name), file=fh) + except KeyError: + print("{}:{}\tNo annotated product".format(gbk_chrome, pos), file=fh) + # Read the annotation file into a data frame. + annotations_df = pandas.read_csv(annotation_file, sep='\t', header=None, names=['index', 'annotations'], index_col='index') + # Remove the annotation_file from disk since both + # cascade and sort tables are built using the file, + # and it is opened for writing in append mode. + os.remove(annotation_file) + # Process the data. + table_df_transposed = table_df.T + table_df_transposed.index = table_df_transposed.index.rename('index') + table_df_transposed = table_df_transposed.merge(annotations_df, left_index=True, right_index=True) + table_df = table_df_transposed.T + return table_df + + +def excel_formatter(json_file_name, excel_file_name, group, annotation_dict): + pandas.io.formats.excel.header_style = None + table_df = pandas.read_json(json_file_name, orient='split') + if annotation_dict is not None: + table_df = annotate_table(table_df, group, annotation_dict) + else: + table_df = table_df.append(pandas.Series(name='no annotations')) + writer = pandas.ExcelWriter(excel_file_name, engine='xlsxwriter') + table_df.to_excel(writer, sheet_name='Sheet1') + writer_book = writer.book + ws = writer.sheets['Sheet1'] + format_a = writer_book.add_format({'bg_color': '#58FA82'}) + format_g = writer_book.add_format({'bg_color': '#F7FE2E'}) + format_c = writer_book.add_format({'bg_color': '#0000FF'}) + format_t = writer_book.add_format({'bg_color': '#FF0000'}) + format_normal = writer_book.add_format({'bg_color': '#FDFEFE'}) + formatlowqual = writer_book.add_format({'font_color': '#C70039', 'bg_color': '#E2CFDD'}) + format_ambigous = writer_book.add_format({'font_color': '#C70039', 'bg_color': '#E2CFDD'}) + format_n = writer_book.add_format({'bg_color': '#E2CFDD'}) + rows, cols = table_df.shape + ws.set_column(0, 0, 30) + ws.set_column(1, cols, 2.1) + ws.freeze_panes(2, 1) + format_annotation = writer_book.add_format({'font_color': '#0A028C', 'rotation': '-90', 'align': 'top'}) + # Set last row. + ws.set_row(rows + 1, cols + 1, format_annotation) + # Make sure that row/column locations don't overlap. + ws.conditional_format(rows - 2, 1, rows - 1, cols, {'type': 'cell', 'criteria': '<', 'value': 55, 'format': formatlowqual}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'cell', 'criteria': '==', 'value': 'B$2', 'format': format_normal}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'A', 'format': format_a}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'G', 'format': format_g}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'C', 'format': format_c}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'T', 'format': format_t}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'S', 'format': format_ambigous}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'Y', 'format': format_ambigous}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'R', 'format': format_ambigous}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'W', 'format': format_ambigous}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'K', 'format': format_ambigous}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'M', 'format': format_ambigous}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': 'N', 'format': format_n}) + ws.conditional_format(2, 1, rows - 2, cols, {'type': 'text', 'criteria': 'containing', 'value': '-', 'format': format_n}) + format_rotation = writer_book.add_format({}) + format_rotation.set_rotation(90) + for column_num, column_name in enumerate(list(table_df.columns)): + ws.write(0, column_num + 1, column_name, format_rotation) + format_annotation = writer_book.add_format({'font_color': '#0A028C', 'rotation': '-90', 'align': 'top'}) + # Set last row. + ws.set_row(rows, 400, format_annotation) + writer.save() + + +def get_annotation_dict(gbk_file): + gbk_dict = SeqIO.to_dict(SeqIO.parse(gbk_file, "genbank")) + annotation_dict = {} + tmp_file = "features.csv" + # Create a file of chromosomes and features. + for chromosome in list(gbk_dict.keys()): + with open(tmp_file, 'w+') as fh: + for feature in gbk_dict[chromosome].features: + if "CDS" in feature.type or "rRNA" in feature.type: + try: + product = feature.qualifiers['product'][0] + except KeyError: + product = None + try: + locus = feature.qualifiers['locus_tag'][0] + except KeyError: + locus = None + try: + gene = feature.qualifiers['gene'][0] + except KeyError: + gene = None + fh.write("%s\t%d\t%d\t%s\t%s\t%s\n" % (chromosome, int(feature.location.start), int(feature.location.end), locus, product, gene)) + # Read the chromosomes and features file into a data frame. + df = pandas.read_csv(tmp_file, sep='\t', names=["chrom", "start", "stop", "locus", "product", "gene"]) + # Process the data. + df = df.sort_values(['start', 'gene'], ascending=[True, False]) + df = df.drop_duplicates('start') + pro = df.reset_index(drop=True) + pro.index = pandas.IntervalIndex.from_arrays(pro['start'], pro['stop'], closed='both') + annotation_dict[chromosome] = pro + return annotation_dict + + +def get_base_file_name(file_path): + base_file_name = os.path.basename(file_path) + if base_file_name.find(".") > 0: + # Eliminate the extension. + return os.path.splitext(base_file_name)[0] + elif base_file_name.find("_") > 0: + # The dot extension was likely changed to + # the " character. + items = base_file_name.split("_") + return "_".join(items[0:-1]) + else: + return base_file_name + + +def output_cascade_table(cascade_order, mqdf, group, annotation_dict): + cascade_order_mq = pandas.concat([cascade_order, mqdf], join='inner') + output_table(cascade_order_mq, "cascade", group, annotation_dict) + + +def output_excel(df, type_str, group, annotation_dict, count=None): + # Output the temporary json file that + # is used by the excel_formatter. + if count is None: + if group is None: + json_file_name = "%s_order_mq.json" % type_str + excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_table.xlsx" % type_str) + else: + json_file_name = "%s_%s_order_mq.json" % (group, type_str) + excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_%s_table.xlsx" % (group, type_str)) + else: + if group is None: + json_file_name = "%s_order_mq_%d.json" % (type_str, count) + excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_table_%d.xlsx" % (type_str, count)) + else: + json_file_name = "%s_%s_order_mq_%d.json" % (group, type_str, count) + excel_file_name = os.path.join(OUTPUT_EXCEL_DIR, "%s_%s_table_%d.xlsx" % (group, type_str, count)) + df.to_json(json_file_name, orient='split') + # Output the Excel file. + excel_formatter(json_file_name, excel_file_name, group, annotation_dict) + + +def output_sort_table(cascade_order, mqdf, group, annotation_dict): + sort_df = cascade_order.T + sort_df['abs_value'] = sort_df.index + sort_df[['chrom', 'pos']] = sort_df['abs_value'].str.split(':', expand=True) + sort_df = sort_df.drop(['abs_value', 'chrom'], axis=1) + sort_df.pos = sort_df.pos.astype(int) + sort_df = sort_df.sort_values(by=['pos']) + sort_df = sort_df.drop(['pos'], axis=1) + sort_df = sort_df.T + sort_order_mq = pandas.concat([sort_df, mqdf], join='inner') + output_table(sort_order_mq, "sort", group, annotation_dict) + + +def output_table(df, type_str, group, annotation_dict): + if isinstance(group, str) and group.startswith("dataset"): + # Inputs are single files, not collections, + # so input file names are not useful for naming + # output files. + group_str = None + else: + group_str = group + count = 0 + chunk_start = 0 + chunk_end = 0 + column_count = df.shape[1] + if column_count >= MAXCOLS: + # Here the number of columns is greater than + # the maximum allowed by Excel, so multiple + # outputs will be produced. + while column_count >= MAXCOLS: + count += 1 + chunk_end += MAXCOLS + df_of_type = df.iloc[:, chunk_start:chunk_end] + output_excel(df_of_type, type_str, group_str, annotation_dict, count=count) + chunk_start += MAXCOLS + column_count -= MAXCOLS + count += 1 + df_of_type = df.iloc[:, chunk_start:] + output_excel(df_of_type, type_str, group_str, annotation_dict, count=count) + else: + output_excel(df, type_str, group_str, annotation_dict) + + +def preprocess_tables(task_queue, annotation_dict, timeout): + while True: + try: + tup = task_queue.get(block=True, timeout=timeout) + except queue.Empty: + break + newick_file, json_file, json_avg_mq_file = tup + avg_mq_series = pandas.read_json(json_avg_mq_file, typ='series', orient='split') + # Map quality to dataframe. + mqdf = avg_mq_series.to_frame(name='MQ') + mqdf = mqdf.T + # Get the group. + group = get_base_file_name(newick_file) + snps_df = pandas.read_json(json_file, orient='split') + with open(newick_file, 'r') as fh: + for line in fh: + line = re.sub('[:,]', '\n', line) + line = re.sub('[)(]', '', line) + line = re.sub(r'[0-9].*\.[0-9].*\n', '', line) + line = re.sub('root\n', '', line) + sample_order = line.split('\n') + sample_order = list([_f for _f in sample_order if _f]) + sample_order.insert(0, 'root') + tree_order = snps_df.loc[sample_order] + # Count number of SNPs in each column. + snp_per_column = [] + for column_header in tree_order: + count = 0 + column = tree_order[column_header] + for element in column: + if element != column[0]: + count = count + 1 + snp_per_column.append(count) + row1 = pandas.Series(snp_per_column, tree_order.columns, name="snp_per_column") + # Count number of SNPS from the + # top of each column in the table. + snp_from_top = [] + for column_header in tree_order: + count = 0 + column = tree_order[column_header] + # for each element in the column + # skip the first element + for element in column[1:]: + if element == column[0]: + count = count + 1 + else: + break + snp_from_top.append(count) + row2 = pandas.Series(snp_from_top, tree_order.columns, name="snp_from_top") + tree_order = tree_order.append([row1]) + tree_order = tree_order.append([row2]) + # In pandas=0.18.1 even this does not work: + # abc = row1.to_frame() + # abc = abc.T --> tree_order.shape (5, 18), abc.shape (1, 18) + # tree_order.append(abc) + # Continue to get error: "*** ValueError: all the input arrays must have same number of dimensions" + tree_order = tree_order.T + tree_order = tree_order.sort_values(['snp_from_top', 'snp_per_column'], ascending=[True, False]) + tree_order = tree_order.T + # Remove snp_per_column and snp_from_top rows. + cascade_order = tree_order[:-2] + # Output the cascade table. + output_cascade_table(cascade_order, mqdf, group, annotation_dict) + # Output the sorted table. + output_sort_table(cascade_order, mqdf, group, annotation_dict) + task_queue.task_done() + + +def set_num_cpus(num_files, processes): + num_cpus = int(multiprocessing.cpu_count()) + if num_files < num_cpus and num_files < processes: + return num_files + if num_cpus < processes: + half_cpus = int(num_cpus / 2) + if num_files < half_cpus: + return num_files + return half_cpus + return processes + + +if __name__ == '__main__': + parser = argparse.ArgumentParser() + + parser.add_argument('--input_avg_mq_json', action='store', dest='input_avg_mq_json', required=False, default=None, help='Average MQ json file') + parser.add_argument('--input_newick', action='store', dest='input_newick', required=False, default=None, help='Newick file') + parser.add_argument('--input_snps_json', action='store', dest='input_snps_json', required=False, default=None, help='SNPs json file') + parser.add_argument('--gbk_file', action='store', dest='gbk_file', required=False, default=None, help='Optional gbk file'), + parser.add_argument('--processes', action='store', dest='processes', type=int, help='User-selected number of processes to use for job splitting') + + args = parser.parse_args() + + if args.gbk_file is not None: + # Create the annotation_dict for annotating + # the Excel tables. + annotation_dict = get_annotation_dict(args.gbk_file) + else: + annotation_dict = None + + # The assumption here is that the list of files + # in both INPUT_NEWICK_DIR and INPUT_JSON_DIR are + # named such that they are properly matched if + # the directories contain more than 1 file (i.e., + # hopefully the newick file names and json file names + # will be something like Mbovis-01D6_* so they can be + # sorted and properly associated with each other). + if args.input_newick is not None: + newick_files = [args.input_newick] + else: + newick_files = [] + for file_name in sorted(os.listdir(INPUT_NEWICK_DIR)): + file_path = os.path.abspath(os.path.join(INPUT_NEWICK_DIR, file_name)) + newick_files.append(file_path) + if args.input_snps_json is not None: + json_files = [args.input_snps_json] + else: + json_files = [] + for file_name in sorted(os.listdir(INPUT_JSON_DIR)): + file_path = os.path.abspath(os.path.join(INPUT_JSON_DIR, file_name)) + json_files.append(file_path) + if args.input_avg_mq_json is not None: + json_avg_mq_files = [args.input_avg_mq_json] + else: + json_avg_mq_files = [] + for file_name in sorted(os.listdir(INPUT_JSON_AVG_MQ_DIR)): + file_path = os.path.abspath(os.path.join(INPUT_JSON_AVG_MQ_DIR, file_name)) + json_avg_mq_files.append(file_path) + + multiprocessing.set_start_method('spawn') + queue1 = multiprocessing.JoinableQueue() + queue2 = multiprocessing.JoinableQueue() + num_files = len(newick_files) + cpus = set_num_cpus(num_files, args.processes) + # Set a timeout for get()s in the queue. + timeout = 0.05 + + for i, newick_file in enumerate(newick_files): + json_file = json_files[i] + json_avg_mq_file = json_avg_mq_files[i] + queue1.put((newick_file, json_file, json_avg_mq_file)) + + # Complete the preprocess_tables task. + processes = [multiprocessing.Process(target=preprocess_tables, args=(queue1, annotation_dict, timeout, )) for _ in range(cpus)] + for p in processes: + p.start() + for p in processes: + p.join() + queue1.join() + + if queue1.empty(): + queue1.close() + queue1.join_thread() diff -r 000000000000 -r 12f2b14549f6 vsnp_determine_ref_from_data.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/vsnp_determine_ref_from_data.py Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,231 @@ +#!/usr/bin/env python + +import argparse +import gzip +import os +from collections import OrderedDict + +import yaml +from Bio.SeqIO.QualityIO import FastqGeneralIterator + +OUTPUT_DBKEY_DIR = 'output_dbkey' +OUTPUT_METRICS_DIR = 'output_metrics' + + +def get_base_file_name(file_path): + base_file_name = os.path.basename(file_path) + if base_file_name.find(".") > 0: + # Eliminate the extension. + return os.path.splitext(base_file_name)[0] + elif base_file_name.find("_fq") > 0: + # The "." character has likely + # changed to an "_" character. + return base_file_name.split("_fq")[0] + elif base_file_name.find("_fastq") > 0: + return base_file_name.split("_fastq")[0] + return base_file_name + + +def get_dbkey(dnaprints_dict, key, s): + # dnaprints_dict looks something like this: + # {'brucella': {'NC_002945v4': ['11001110', '11011110', '11001100']} + # {'bovis': {'NC_006895': ['11111110', '00010010', '01111011']}} + d = dnaprints_dict.get(key, {}) + for data_table_value, v_list in d.items(): + if s in v_list: + return data_table_value + return "" + + +def get_dnaprints_dict(dnaprint_fields): + # A dndprint_fields entry looks something liek this. + # [['AF2122', '/galaxy/tool-data/vsnp/AF2122/dnaprints/NC_002945v4.yml']] + dnaprints_dict = {} + for item in dnaprint_fields: + # Here item is a 2-element list of data + # table components, # value and path. + value = item[0] + path = item[1].strip() + with open(path, "rt") as fh: + # The format of all dnaprints yaml + # files is something like this: + # brucella: + # - 0111111111111111 + print_dict = yaml.load(fh, Loader=yaml.Loader) + for print_dict_k, print_dict_v in print_dict.items(): + dnaprints_v_dict = dnaprints_dict.get(print_dict_k, {}) + if len(dnaprints_v_dict) > 0: + # dnaprints_dict already contains k (e.g., 'brucella', + # and dnaprints_v_dict will be a dictionary # that + # looks something like this: + # {'NC_002945v4': ['11001110', '11011110', '11001100']} + value_list = dnaprints_v_dict.get(value, []) + value_list = value_list + print_dict_v + dnaprints_v_dict[value] = value_list + else: + # dnaprints_v_dict is an empty dictionary. + dnaprints_v_dict[value] = print_dict_v + dnaprints_dict[print_dict_k] = dnaprints_v_dict + # dnaprints_dict looks something like this: + # {'brucella': {'NC_002945v4': ['11001110', '11011110', '11001100']} + # {'bovis': {'NC_006895': ['11111110', '00010010', '01111011']}} + return dnaprints_dict + + +def get_group_and_dbkey(dnaprints_dict, brucella_string, brucella_sum, bovis_string, bovis_sum, para_string, para_sum): + if brucella_sum > 3: + group = "Brucella" + dbkey = get_dbkey(dnaprints_dict, "brucella", brucella_string) + elif bovis_sum > 3: + group = "TB" + dbkey = get_dbkey(dnaprints_dict, "bovis", bovis_string) + elif para_sum >= 1: + group = "paraTB" + dbkey = get_dbkey(dnaprints_dict, "para", para_string) + else: + group = "" + dbkey = "" + return group, dbkey + + +def get_oligo_dict(): + oligo_dict = {} + oligo_dict["01_ab1"] = "AATTGTCGGATAGCCTGGCGATAACGACGC" + oligo_dict["02_ab3"] = "CACACGCGGGCCGGAACTGCCGCAAATGAC" + oligo_dict["03_ab5"] = "GCTGAAGCGGCAGACCGGCAGAACGAATAT" + oligo_dict["04_mel"] = "TGTCGCGCGTCAAGCGGCGTGAAATCTCTG" + oligo_dict["05_suis1"] = "TGCGTTGCCGTGAAGCTTAATTCGGCTGAT" + oligo_dict["06_suis2"] = "GGCAATCATGCGCAGGGCTTTGCATTCGTC" + oligo_dict["07_suis3"] = "CAAGGCAGATGCACATAATCCGGCGACCCG" + oligo_dict["08_ceti1"] = "GTGAATATAGGGTGAATTGATCTTCAGCCG" + oligo_dict["09_ceti2"] = "TTACAAGCAGGCCTATGAGCGCGGCGTGAA" + oligo_dict["10_canis4"] = "CTGCTACATAAAGCACCCGGCGACCGAGTT" + oligo_dict["11_canis"] = "ATCGTTTTGCGGCATATCGCTGACCACAGC" + oligo_dict["12_ovis"] = "CACTCAATCTTCTCTACGGGCGTGGTATCC" + oligo_dict["13_ether2"] = "CGAAATCGTGGTGAAGGACGGGACCGAACC" + oligo_dict["14_63B1"] = "CCTGTTTAAAAGAATCGTCGGAACCGCTCT" + oligo_dict["15_16M0"] = "TCCCGCCGCCATGCCGCCGAAAGTCGCCGT" + oligo_dict["16_mel1b"] = "TCTGTCCAAACCCCGTGACCGAACAATAGA" + oligo_dict["17_tb157"] = "CTCTTCGTATACCGTTCCGTCGTCACCATGGTCCT" + oligo_dict["18_tb7"] = "TCACGCAGCCAACGATATTCGTGTACCGCGACGGT" + oligo_dict["19_tbbov"] = "CTGGGCGACCCGGCCGACCTGCACACCGCGCATCA" + oligo_dict["20_tb5"] = "CCGTGGTGGCGTATCGGGCCCCTGGATCGCGCCCT" + oligo_dict["21_tb2"] = "ATGTCTGCGTAAAGAAGTTCCATGTCCGGGAAGTA" + oligo_dict["22_tb3"] = "GAAGACCTTGATGCCGATCTGGGTGTCGATCTTGA" + oligo_dict["23_tb4"] = "CGGTGTTGAAGGGTCCCCCGTTCCAGAAGCCGGTG" + oligo_dict["24_tb6"] = "ACGGTGATTCGGGTGGTCGACACCGATGGTTCAGA" + oligo_dict["25_para"] = "CCTTTCTTGAAGGGTGTTCG" + oligo_dict["26_para_sheep"] = "CGTGGTGGCGACGGCGGCGGGCCTGTCTAT" + oligo_dict["27_para_cattle"] = "TCTCCTCGGTCGGTGATTCGGGGGCGCGGT" + return oligo_dict + + +def get_seq_counts(value, fastq_list, gzipped): + count = 0 + for fastq_file in fastq_list: + if gzipped: + with gzip.open(fastq_file, 'rt') as fh: + for title, seq, qual in FastqGeneralIterator(fh): + count += seq.count(value) + else: + with open(fastq_file, 'r') as fh: + for title, seq, qual in FastqGeneralIterator(fh): + count += seq.count(value) + return(value, count) + + +def get_species_counts(fastq_list, gzipped): + count_summary = {} + oligo_dict = get_oligo_dict() + for v1 in oligo_dict.values(): + returned_value, count = get_seq_counts(v1, fastq_list, gzipped) + for key, v2 in oligo_dict.items(): + if returned_value == v2: + count_summary.update({key: count}) + count_list = [] + for v in count_summary.values(): + count_list.append(v) + brucella_sum = sum(count_list[:16]) + bovis_sum = sum(count_list[16:24]) + para_sum = sum(count_list[24:]) + return count_summary, count_list, brucella_sum, bovis_sum, para_sum + + +def get_species_strings(count_summary): + binary_dictionary = {} + for k, v in count_summary.items(): + if v > 1: + binary_dictionary.update({k: 1}) + else: + binary_dictionary.update({k: 0}) + binary_dictionary = OrderedDict(sorted(binary_dictionary.items())) + binary_list = [] + for v in binary_dictionary.values(): + binary_list.append(v) + brucella_binary = binary_list[:16] + brucella_string = ''.join(str(e) for e in brucella_binary) + bovis_binary = binary_list[16:24] + bovis_string = ''.join(str(e) for e in bovis_binary) + para_binary = binary_list[24:] + para_string = ''.join(str(e) for e in para_binary) + return brucella_string, bovis_string, para_string + + +def output_dbkey(file_name, dbkey, output_file): + # Output the dbkey. + with open(output_file, "w") as fh: + fh.write("%s" % dbkey) + + +def output_files(fastq_file, count_list, group, dbkey, dbkey_file, metrics_file): + base_file_name = get_base_file_name(fastq_file) + output_dbkey(base_file_name, dbkey, dbkey_file) + output_metrics(base_file_name, count_list, group, dbkey, metrics_file) + + +def output_metrics(file_name, count_list, group, dbkey, output_file): + # Output the metrics. + with open(output_file, "w") as fh: + fh.write("Sample: %s\n" % file_name) + fh.write("Brucella counts: ") + for i in count_list[:16]: + fh.write("%d," % i) + fh.write("\nTB counts: ") + for i in count_list[16:24]: + fh.write("%d," % i) + fh.write("\nPara counts: ") + for i in count_list[24:]: + fh.write("%d," % i) + fh.write("\nGroup: %s" % group) + fh.write("\ndbkey: %s\n" % dbkey) + + +if __name__ == '__main__': + parser = argparse.ArgumentParser() + + parser.add_argument('--dnaprint_fields', action='append', dest='dnaprint_fields', nargs=2, required=False, default=None, help="List of dnaprints data table value, name and path fields") + parser.add_argument('--read1', action='store', dest='read1', required=True, default=None, help='Required: single read') + parser.add_argument('--read2', action='store', dest='read2', required=False, default=None, help='Optional: paired read') + parser.add_argument('--gzipped', action='store_true', dest='gzipped', default=False, help='Input files are gzipped') + parser.add_argument('--output_dbkey', action='store', dest='output_dbkey', required=True, default=None, help='Output reference file') + parser.add_argument('--output_metrics', action='store', dest='output_metrics', required=True, default=None, help='Output metrics file') + + args = parser.parse_args() + + fastq_list = [args.read1] + if args.read2 is not None: + fastq_list.append(args.read2) + + # The value of dnaprint_fields is a list of lists, where each list is + # the [value, name, path] components of the vsnp_dnaprints data table. + # The data_manager_vsnp_dnaprints tool assigns the dbkey column from the + # all_fasta data table to the value column in the vsnp_dnaprints data + # table to ensure a proper mapping for discovering the dbkey. + dnaprints_dict = get_dnaprints_dict(args.dnaprint_fields) + + # Here fastq_list consists of either a single read + # or a set of paired reads, producing single outputs. + count_summary, count_list, brucella_sum, bovis_sum, para_sum = get_species_counts(fastq_list, args.gzipped) + brucella_string, bovis_string, para_string = get_species_strings(count_summary) + group, dbkey = get_group_and_dbkey(dnaprints_dict, brucella_string, brucella_sum, bovis_string, bovis_sum, para_string, para_sum) + output_files(args.read1, count_list, group, dbkey, dbkey_file=args.output_dbkey, metrics_file=args.output_metrics) diff -r 000000000000 -r 12f2b14549f6 vsnp_determine_ref_from_data.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/vsnp_determine_ref_from_data.xml Wed Dec 02 09:11:24 2020 +0000 @@ -0,0 +1,154 @@ + + from input data + + macros.xml + + + biopython + pyyaml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +**What it does** + +Accepts a single fastqsanger read, a set of paired reads, or a collection of single or paired reads (bacterial samples) and +inspects the data to discover the best reference genome for aligning the reads. + +The information needed to discover the best reference is maintained by the USDA in this repository_. References are curreently + +.. _repository: https://github.com/USDA-VS/vSNP_reference_options + +limited to TB complex, paraTB, and Brucella, but information for additional references will be added. The information for each +reference is a string consisting of zeros and ones, compiled by USDA researchers, which we call a "DNA print". These strings +are maintained in yaml files for use in Galaxy, and are installed via the **vSNP DNAprints data manager** tool. + +This tool creates an in-memory dictionary of these DNA print strings for matching with a string generated by inspecting the +input sample data. During inspection, this tool accrues sequence counts for supported species, ultimately generating a string +consisting of zeros and ones based on the counts, (i.e., a DNA print). This string is then compared to the strings contained +in the in-memory dictionary of DNA prints to find a match. + +The strings in the in-memory dictionary are each associated with a Galaxy "dbkey" (i.e., genome build), so when a match is found, +the associated "dbkey" is passed to a mapper (e.g., **Map with BWA-MEM**), typically within a workflow via an expression tool, +to align the reads to the associated reference. + +This tool produces 2 text files, a "dbkey" file that contains the dbkey string and a "metrics" file that provides information +about the sequence counts that were discovered in the input sample data that produced the "DNA print" string. + +This tool is important for samples containing bacterial species because many of the samples have a "mixed bag" of species, +and discovering the primary species is critical. DNA print matching is currently supported for the following genomes. + + * Mycobacterium bovis AF2122/97 + * Brucella abortus bv. 1 str. 9-941 + * Brucella abortus strain BER + * Brucella canis ATCC 23365 + * Brucella ceti TE10759-12 + * Brucella melitensis bv. 1 str. 16M + * Brucella melitensis bv. 3 str. Ether + * Brucella melitensis BwIM_SOM_36b + * Brucella melitensis ATCC 23457 + * Brucella ovis ATCC 25840 + * Brucella suis 1330 + * Mycobacterium tuberculosis H37Rv + * Mycobacterium avium subsp. paratuberculosis strain Telford + * Mycobacterium avium subsp. paratuberculosis K-10 + * Brucella suis ATCC 23445 + * Brucella suis bv. 3 str. 686 + +**Required Options** + + * **Choose the category of the files to be analyzed** - select "Single files" or "Collection of files", then select the appropriate history items (single or paired fastqsanger reads or a collection of fastqsanger reads) based on the selected option. + + + +