annotate mycrobiota.py @ 0:607c5e7e0a64 draft

planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
author erasmus-medical-center
date Wed, 13 Dec 2017 10:09:50 -0500
import argparse
import csv
import math
import os
from subprocess import call


def main():
    print("Welcome to the MYcrobiota suite")
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--command', required=True,
                        help="What action to perform")
    parser.add_argument('-ct', '--count_table', action='append',
                        help="mothur count table")
    parser.add_argument('-cp', '--copies', help="copies of NC for samples")
    parser.add_argument('-nccp', '--nc_copies', help="copies of NC for itself")
    parser.add_argument('-t', '--taxonomy', action='append',
                        help="mothur taxonomy file")
    parser.add_argument('-s', '--shared_file', action='append',
                        help="mothur shared file")
    parser.add_argument('-otu', '--otutable', action='append',
                        help="mothur OTU table")
    parser.add_argument('-f', '--fasta', action='append', help="fasta")
    parser.add_argument('-sl', '--summary_log', action='append',
                        help="mothur summary log file")
    parser.add_argument('-o', '--outfile', help="output file")
    parser.add_argument('-od', '--outdir', help="output directory", default="")
    parser.add_argument('-lv', '--level', help="taxonomy level")
    parser.add_argument('-nc', '--negative_control',
                        help="sample name of the negative control")
    parser.add_argument('-ncs', '--negative_control_species',
                        help="species name of the negative control",
                        default="Oscillatoria")
    parser.add_argument('-r', '--replicate_suffix',
                        help="suffix to identify replicates")
    parser.add_argument('-l', '--label', action='append',
                        help="label for count table")
    parser.add_argument('--with-otu', dest='with_otu', action='store_true',
                        default=False)
    args = parser.parse_args()

    try:
        os.mkdir(args.outdir)
    except OSError:
        pass

    print("Running command: "+args.command)

    if args.command == 'counttable_totals':
        count_table_totals(args.count_table[0], args.outdir, args.outfile)

    elif args.command == 'qc_report':
        if args.count_table:
            qc_report(args.count_table, args.label, 'counttables', args.outdir)
        elif args.summary_log:
            qc_report(args.summary_log, args.label, 'summarylogs', args.outdir)

    elif args.command == 'create_krona_plot':
        create_krona_plot(args.taxonomy, args.outdir, args.with_otu)

    elif args.command == 'create_krona_plot_multisample':
        create_krona_plot_multisample(args.taxonomy, args.shared_file,
                                      args.level, args.outdir, args.with_otu)

    elif args.command == 'correct_replicates':
        correct_replicates(args.shared_file, args.taxonomy, args.outdir,
                           args.replicate_suffix, args.copies,
                           args.negative_control, args.nc_copies,
                           args.negative_control_species)

    elif args.command == 'make_multi_otutable':
        make_multi_otutable(args.taxonomy, args.shared_file, args.level,
                            args.outdir)

    elif args.command == 'otutable_add_blast_links':
        otutable_add_blast_links(args.otutable, args.fasta)

    elif args.command == 'split_multi_otutable':
        split_multi_otutable(args.otutable)

    else:
        print("unknown command. exiting")


def make_url(seq, baseurl):
    return baseurl+"?DATABASE=nr&PERC_IDENT=97&EXCLUDE_SEQ_UNCULT=on&" \
        "HITLIST_SIZE=10&FILTER=L&FILTER=m&FILTER=R&EXPECT=10&" \
        "FORMAT_TYPE=HTML&PROGRAM=blastn&CLIENT=web&" \
        "SERVICE=megablast&PAGE=Nucleotides&CMD=Put&QUERY=" \
        + seq.lower()


def make_RIDlink(RID, baseurl):
    return "<a target=\"_blank\" href=\""+baseurl+"?CMD=Get&RID="\
        + RID + "\">view results</a>"


def make_rerun_link(seq, baseurl):
    return "<a target=\"_blank\" href=\"" + baseurl +\
        "?DATABASE=nr&EXCLUDE_SEQ_UNCULT=yes&FILTER=L&FORMAT_TYPE=HTML" \
        "&PROGRAM=blastn&CLIENT=web&SERVICE=megablast&PAGE=Nucleotides&" \
        "CMD=Web&QUERY=" + seq.lower() + "\">send to BLAST</a>"


def otutable_add_blast_links(otutable, otureps):
    baseurl = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi"

    # for each otu create blast search of corresponding representative sequence
    reps = [line for line in open(otureps[0], "r")]

    seqs = [r.rstrip('\n').replace('-', '') for r in reps if '>' not in r]
    seq_names = [r for r in reps if '>' in r]
    otulines = [line for line in open(otutable[0], "r")]

    # Add RID link and rerun link to table
    with open("otutable_with_blast.tsv", "w+") as outfile, \
            open("filtered_otureps.fasta", "w+") as repsout:
        outfile.write(otulines[0].rstrip() + "\tBLAST\n")

        for otuline in otulines[1:]:
            otu = otuline.split('\t')[0]
            for i, seq in enumerate(seq_names):
                if otu in seq:
                    outfile.write(otuline.rstrip() + "\t" +
                                  make_rerun_link(seqs[i], baseurl) + "\n")
                    # output otureps for these otus
                    for i, seq in enumerate(reps):
erasmus-medical-center
parents:
diff changeset
128 if otu in seq:
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
129 repsout.write(reps[i])
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
130 repsout.write(reps[i+1])
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
131
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
132
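# Illustrative sketch, not part of the original tool: the lookup above uses
# `otu in seq`, a substring test, so an OTU named "Otu1" would also match a
# header that contains "Otu10". Assuming header fields are separated by
# whitespace or "|" (an assumption about the input format), a hypothetical
# helper could compare whole fields instead.
def _otu_matches_header(otu, header):
    """Return True if `otu` occurs as a complete field in a FASTA header."""
    fields = header.lstrip('>').replace('|', ' ').split()
    return otu in fields

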
def summarylog_total(infile):
    with open(infile) as f:
        summarylog = f.readlines()
        for line in summarylog:
            if line.startswith('# of Seqs:') \
                    or line.startswith('total # of seqs:'):
                return int(line.split('\t')[-1])
    return None


def count_table_totals(infile, outdir='', outfile=''):
    """
    Given a Mothur count table, calculate the total number of sequences for
    each sample. The totals can be appended as an additional row in the count
    table by providing a file name.

    :param infile: Mothur count table.
    :param outfile: Optional. Write the count table with an additional row with
                    totals to this file
    :param outdir: Optional. Write output to this directory
    :return: A list with totals for all columns (samples) in the count table
    """
    # assume a single input file for now
    out_rows = []
    with open(infile) as f:
        count_table = csv.reader(f, delimiter='\t')

        header = next(count_table)
        totals = [0] * (len(header)-1)

        out_rows.append(header)
        for row in count_table:
            if row[0] != 'total':
                out_rows.append(row)
                for i in range(1, len(row)):
                    totals[i-1] += int(row[i])

    # list() keeps this working on Python 3, where map() returns an iterator
    out_rows.append(["total"] + list(map(str, totals)))

    # write count table with totals to file if requested
    if outfile:
        write_output(outdir, outfile, out_rows)

    return totals


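# Illustrative sketch, not part of the original tool: the same per-column
# summation on an in-memory table, which makes the logic of
# count_table_totals() easy to check without touching the file system.
# The helper name and its input layout are hypothetical.
def _column_totals(rows):
    """Sum every column except the first; skip an existing 'total' row."""
    header, body = rows[0], rows[1:]
    totals = [0] * (len(header) - 1)
    for row in body:
        if row[0] == 'total':
            continue
        for idx, value in enumerate(row[1:]):
            totals[idx] += int(value)
    return totals

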
def qc_report(infiles, label, inputtype, outdir=''):
    """
    Construct QC table from multiple count files
      - Report the number of sequences lost at each consecutive QC step
      - Create a multi-sample report and a separate report for each sample

    :param infiles: list of count tables or summary logs, one per QC step
    :param label: labels for each step
    :param inputtype: 'summarylogs' to read summary logs; any other value
                      reads count tables
    :param outdir: directory to place output files. Default: cwd
    :return: None. The reports are written to files in outdir
    """
    assert len(infiles) == len(label), \
        "number of files and labels unequal, stopping"

    print("qcreport")
    previous_totals = []
    outlines = []
    lasttotal = None

    for (i, lab) in zip(infiles, label):
        # text mode: csv.reader needs str rows on Python 3
        with open(i, 'r') as f:
            count_file = csv.reader(f, delimiter='\t')

            if inputtype == 'summarylogs':
                print("summarylogs")
                if not outlines:
                    outlines = [['Step', 'Total', 'Removed', 'Percentage']]

                # get total count
                total = summarylog_total(i)

                # calculate difference with last
                if not lasttotal:
                    outlines.append([lab, total, None, None])
                else:
                    diff = total - lasttotal
                    perc = float(diff)/float(lasttotal)*100.0
                    outlines.append([lab, total, diff, str("%.1f" % perc)+"%"])

                lasttotal = total

            else:
                # add header line to output
                if not outlines:
                    outlines = [['step'] + next(count_file)[1:]]

                # calculate totals of each column in the count file
                totals = count_table_totals(i)

                # calculate difference with previous count file
                if not previous_totals:
                    diffs = [""] * len(totals)
                else:
                    diffs = [" (" + str(t1-t2)+"; "
                             + str("%.1f" % (float(t1-t2)/float(t2)*100.0)) +
                             "%)" for t1, t2 in zip(totals, previous_totals)]

                outlines.append([lab] +
                                [str(a)+b for a, b in zip(totals, diffs)])
                previous_totals = totals

    # write multi-sample output file
    write_output(outdir, 'all_qctable.tsv', outlines)

    # write per-sample output files
    for j in range(2, len(outlines[0])):
        sample = outlines[0][j]
        sample_outlines = [[outlines_line[0], outlines_line[j]]
                           for outlines_line in outlines]
        write_output(outdir, 'persample_qctable_'+sample+'.tsv',
                     sample_outlines)


607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
def column(matrix, i):
    return [row[i] for row in matrix]


def mean(data):
    return sum(data) / float(len(data))


def stdev(data):
    # sample standard deviation (n-1 denominator); needs at least two values
    c = mean(data)
    ss = sum((x-c)**2 for x in data)
    n = len(data)
    return math.sqrt(ss/(n-1))


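# Illustrative sketch, not part of the original tool: mean() and stdev()
# compute the arithmetic mean and the sample standard deviation
# (n-1 denominator), and the background correction in correct_replicates()
# uses mean + 3*sd of the negative control as its cut-off. The copy below is
# self-contained so it can be run on its own; the input values and the names
# `replicate_counts` and `threshold` are made up for the example.

```python
import math


def mean(data):
    return sum(data) / float(len(data))


def stdev(data):
    # sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))
    c = mean(data)
    ss = sum((x - c) ** 2 for x in data)
    return math.sqrt(ss / (len(data) - 1))


# hypothetical copy counts of one OTU in eight negative-control replicates
replicate_counts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

m = mean(replicate_counts)
sd = stdev(replicate_counts)

# same form as the background cut-off used in correct_replicates()
threshold = m + 3 * sd
print(m, sd, threshold)
```

# Note that stdev() divides by n-1, so a single replicate would raise
# ZeroDivisionError; callers may want to check the replicate count first.
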
def write_output(outdir, filename, outlines):
    # 'wb' suits the Python 2 csv module; on Python 3 open the file with
    # mode 'w' and newline='' instead
    with open(os.path.join(outdir, filename), 'wb') as of:
        out_table = csv.writer(of, delimiter='\t', lineterminator='\n')
        for row in outlines:
            out_table.writerow(row)


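# Illustrative sketch, not part of the original tool: how the tab-separated
# writer in write_output() might look on Python 3, where the csv module
# expects a text-mode file opened with newline=''. The function name
# `write_tsv`, the file name and the demo rows are assumptions made for the
# example only.

```python
import csv
import os
import tempfile


def write_tsv(outdir, filename, rows):
    # text mode with newline='' lets the csv writer control line endings
    path = os.path.join(outdir, filename)
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle, delimiter='\t', lineterminator='\n')
        writer.writerows(rows)
    return path


# made-up QC table: a header row followed by two count rows
demo_rows = [['step', 'sampleA'], ['raw', 1000], ['filtered', 900]]
demo_dir = tempfile.mkdtemp()
demo_path = write_tsv(demo_dir, 'demo_qctable.tsv', demo_rows)

with open(demo_path) as handle:
    demo_text = handle.read()
print(demo_text)
```
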
def correct_replicates(shared, taxonomy, outdir, replicate_suffix,
                       sample_copies, negative_control='', nc_copies=-1,
                       negative_control_species='Oscillatoria'):
    with open(shared[0], 'rb') as f, open(taxonomy[0], 'rb') as f2:
        shared_file = csv.reader(f, delimiter='\t')
        taxonomy_file = csv.reader(f2, delimiter='\t')

        # determine which OTU number is the control, Oscillatoria by default
        # (Bacteria;Cyanobacteria;Cyanobacteria;..;Oscillatoria)
        try:
            line = next(taxonomy_file)
            while negative_control_species not in line[2]:
                line = next(taxonomy_file)
            otu = line[0]
        except StopIteration:
            print("negative control species not found in taxonomy, Exiting")
            return 1

        ''' Calculate Copies '''
        # per replicate of sample and NC, determine correction factor,
        # (number Oscillatoria seqs/known copies of it)
        # correct all sequence counts with that
        myshared = [row for row in shared_file if row]
        newshared = [myshared[0]]
        newshared2 = [myshared[0]]
        newshared3 = [myshared[0]]
        newshared4 = [myshared[0]]
        oscil_column = myshared[0].index(otu)

        for row in myshared[1:]:
            if row[1].startswith(negative_control):
                copies = nc_copies
            else:
                copies = sample_copies

            correction_factor = float(row[oscil_column]) / float(copies)

            new_row = row[0:3]
            for count in row[3:]:
                try:
                    newval = float(count) / correction_factor
                except ZeroDivisionError:
                    newval = float(count)
                new_row.append(newval)
            newshared.append(new_row)

        ''' Average copy counts across replicates '''
        levels = set([row[0] for row in newshared[1:]])
        samples = set([row[1].split(replicate_suffix)[0]
                       for row in newshared[1:]])

        for level in levels:
            for sample in samples:
                neg = sample.startswith(negative_control)
                replicates = [row for row in newshared if row[0] == level
                              and row[1].startswith(sample)]
                num_otus = int(replicates[0][2])+3
                total = replicates[0][2]
                avg = [level, sample, total]

                for i in range(3, num_otus):
                    counts = column(replicates, i)
                    avg.append(mean(counts) if 0 not in counts or neg
                               else 0)

                newshared2.append(avg)

        ''' Average *sequence* counts across replicates '''
        levels = set([row[0] for row in myshared[1:]])
        samples = set([row[1].split(replicate_suffix)[0]
                       for row in myshared[1:]])

        for level in levels:
            for sample in samples:
                replicates = [row for row in myshared if row[0] == level
                              and row[1].startswith(sample)]
                num_otus = int(replicates[0][2]) + 3
                total = replicates[0][2]
                avg = [level, sample, total]

                for i in range(3, num_otus):
                    # build a list: mean() calls len(), which a Python 3
                    # map object does not support
                    counts = [int(c) for c in column(replicates, i)]
                    avg.append(int(round(mean(counts))))

                newshared4.append(avg)

        ''' Correct for background '''
        # for each otu, subtract the mean of the negative control
        # replicates plus 3 times their standard deviation
        for level in levels:
            NC = [row for row in newshared if row[0] == level
                  and row[1].startswith(negative_control)]
            samples = [row for row in newshared2 if row[0] == level
                       and not row[1].startswith(negative_control)]
            num_otus = int(samples[0][2])+3

            for i in range(3, num_otus):
                m = mean(column(NC, i))
                sd = stdev(column(NC, i))
                corr = m + 3*sd

                for s in samples:
                    s[i] = max(0, int(round(s[i] - corr)))
erasmus-medical-center
parents:
diff changeset
377
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
378 newshared3 += samples
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
379
        # remove negative control species OTU from table
        for i, row in enumerate(newshared3):
            del row[oscil_column]
            if i > 0:
                row[2] = int(row[2]) - 1

        # sort file or other mothur tools may segfault :/
        # ('unique' rows first, then the remaining labels in string order;
        # the tuple key avoids comparing str with int on Python 3)
        newshared3 = [newshared3[0]] + sorted(
            newshared3[1:],
            key=lambda a_entry: (a_entry[0] != 'unique', a_entry[0]))

        f2.seek(0)
        taxonomy_out = [row for row in taxonomy_file if row and row[0] != otu]
        write_output(outdir, 'taxonomy_corrected.tsv', taxonomy_out)
        write_output(outdir, 'shared_corrected.tsv', newshared3)
        write_output(outdir, 'shared_averaged.tsv', newshared4)


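A minimal sketch of the background correction used in the loop above, on toy data. The helper name `subtract_background` and the numbers are hypothetical, and `statistics.stdev` (sample standard deviation) is an assumption, since the `mean` and `stdev` helpers of this file are defined elsewhere and may differ.

```python
from statistics import mean, stdev


def subtract_background(sample_counts, control_counts, n_sd=3):
    """Subtract mean + n_sd * sd of the control counts from each sample count.

    Hypothetical helper: results are rounded to integers and clipped at zero.
    With fewer than two control values the sd is taken as 0 (an assumption).
    """
    sd = stdev(control_counts) if len(control_counts) > 1 else 0.0
    threshold = mean(control_counts) + n_sd * sd
    return [max(0, int(round(c - threshold))) for c in sample_counts]


# Toy data for one OTU: three negative-control replicates, four samples.
control = [2, 4, 6]          # mean 4, sample sd 2, so the threshold is 10
samples = [50, 9, 12, 0]
corrected = subtract_background(samples, control)
print(corrected)  # [40, 0, 2, 0]
```

Counts at or below the threshold drop to zero, so OTUs that appear only at background level vanish from the corrected table.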
def make_multi_otutable(taxonomy_file, shared_file, level, outdir):
    """
    Create an OTU table from a shared file and a taxonomy file.

    example output:

    OTU sample1 sample2 .. sampleX Kingdom Phylum Class Order Family Genus
    Otu001 13 8 .. 91 Bacteria Bacteroidetes Bacteroidia ..
    ...

    :param taxonomy_file: list whose first element is the path to the
        tab-separated taxonomy file (lineage in the third column)
    :param shared_file: list whose first element is the path to the
        tab-separated shared file
    :param level: label of the shared-file rows to use (first column)
    :param outdir: directory to write multi_otutable.tsv to

    :return:
    """

    outlines = []
    samples = []
    # create multisample taxonomy from counts in shared file
    with open(taxonomy_file[0], 'r') as tax, open(shared_file[0]) as sh:
        taxonomy = csv.reader(tax, delimiter='\t')
        shared = csv.reader(sh, delimiter='\t')
        shared_header = next(shared)
        outlines.append(shared_header[3:])

        # get all taxonomies (skip header row, strip confidence scores)
        taxonomies = []
        for j, t in enumerate(taxonomy):
            if j > 0:
                names = [x.split('(')[0] for x in t[2].split(';')]
                taxonomies.append([name for name in names if name])

        for i, row in enumerate(shared):
            tax.seek(0)  # make sure to start at beginning of file each time
            if row[0] == level:
                samples.append(row[1])
                outlines.append(row[3:])

    # one row per OTU: [OTU name, count in sample 1, count in sample 2, ..]
    transposed = [list(col) for col in zip(*outlines)]
    header = ["OTU"] + samples + ["Kingdom", "Phylum", "Class", "Order",
                                  "Family", "Genus"]

    # drop OTUs whose count is zero in every sample
    writelines = [header] + [a + b for a, b in zip(transposed, taxonomies)
                             if a[1:] != ['0'] * len(a[1:])]

    # sort sample columns by name
    lst = [list(col) for col in zip(*[w[1:-6] for w in writelines])]
    lst.sort(key=lambda x: x[0])
    lst = [list(r) for r in zip(*lst)]
    writelines2 = [[writelines[i][0]] + lst[i] + writelines[i][-6:]
                   for i in range(0, len(writelines))]

    # output multisample OTU table
    write_output(outdir, "multi_otutable.tsv", writelines2)


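A small sketch of the table layout described in the docstring above: sample rows of a shared file are transposed into one row per OTU, and the lineage is appended as six rank columns. The function names and the toy data are hypothetical. Like the original code, the sketch assumes that the taxonomy entries are in the same order as the OTU columns.

```python
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus"]


def parse_lineage(lineage):
    """Turn 'Bacteria(100);Firmicutes(98);' into ['Bacteria', 'Firmicutes']."""
    return [name.split('(')[0] for name in lineage.split(';') if name]


def build_otu_table(shared_header, shared_rows, lineages):
    """Hypothetical helper: one output row per OTU with a non-zero count."""
    otus = shared_header[3:]                    # columns after label/Group/numOtus
    samples = [row[1] for row in shared_rows]   # group (sample) names
    counts = [row[3:] for row in shared_rows]   # one list of counts per sample
    table = [["OTU"] + samples + RANKS]
    # zip(*counts) transposes sample rows into per-OTU tuples of counts
    for otu, per_sample, lineage in zip(otus, zip(*counts), lineages):
        if any(c != '0' for c in per_sample):
            table.append([otu] + list(per_sample) + parse_lineage(lineage))
    return table


header = ["label", "Group", "numOtus", "Otu001", "Otu002"]
rows = [["0.03", "sampleA", "2", "13", "0"],
        ["0.03", "sampleB", "2", "8", "0"]]
lineages = [
    "Bacteria(100);Bacteroidetes(100);Bacteroidia(99);"
    "Bacteroidales(99);Bacteroidaceae(98);Bacteroides(98);",
    "Bacteria(100);Firmicutes(100);Clostridia(100);"
    "Clostridiales(100);Lachnospiraceae(90);Blautia(85);",
]
table = build_otu_table(header, rows, lineages)
for line in table:
    print("\t".join(line))
```

Otu002 is zero in both samples, so only the header and the Otu001 row are printed.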
def split_multi_otutable(otutable, with_avg=True):
    # read the table, skipping blank lines
    with open(otutable[0], 'r') as handle:
        fulltable = [line.strip().split('\t') for line in handle
                     if line.strip()]
    # sample name is the part of the column name before the first '_'
    samples = [s.split('_')[0] for s in fulltable[0][1:-6]]
    numcols = len(fulltable[0])
    # integer division; assumes every sample has the same number of replicates
    numreplicates = (numcols - 7) // len(set(samples))

    for sample in set(samples):
        outlines = []
        # OTU column, this sample's replicate columns, six taxonomy columns
        cols = [0] + [i+1 for i, s in enumerate(samples) if s == sample] \
            + [i for i in range(numcols-6, numcols)]
        for i, line in enumerate(fulltable):
            out = [line[j] for j in cols]
            # skip OTUs that are zero in all replicates of this sample
            if out[1:-6] != ['0'] * numreplicates:
                out.insert(-6, 'mean' if i == 0
                           else int(round(mean([int(c) for c in out[1:-6]]))))
                outlines.append(out)

        write_output('.', sample+'.otutable', outlines)


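A short sketch of the replicate grouping that `split_multi_otutable` relies on: columns are assigned to a sample by the part of the column name before the first underscore, and each group gets a rounded mean. The helper names and the column naming scheme (`A_1`, `A_2`, ..) are assumptions for illustration only.

```python
from collections import defaultdict


def replicate_columns(header):
    """Map each sample name to the column indices of its replicates."""
    groups = defaultdict(list)
    for idx, name in enumerate(header):
        groups[name.split('_')[0]].append(idx)
    return dict(groups)


def rounded_mean(values):
    """Arithmetic mean rounded to the nearest integer."""
    return int(round(sum(values) / float(len(values))))


header = ["A_1", "A_2", "B_1", "B_2"]   # hypothetical replicate columns
counts = [13, 9, 0, 90]                 # counts of one OTU in those columns
groups = replicate_columns(header)
means = {sample: rounded_mean([counts[i] for i in cols])
         for sample, cols in groups.items()}
print(means)  # {'A': 11, 'B': 45}
```

Matching on the exact sample name, rather than on a substring, keeps a sample such as `A1` from also picking up the columns of `A10`.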
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
476 def create_krona_plot_multisample(taxonomy_file, shared_file, level, outdir,
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
477 with_otu):
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
478 """
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
479 Create krona plots from a multisample taxonomy plot and a shared file.
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
480 Create one multisample plot and a plot per individual sample
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
481
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
482 :param taxonomy_file:
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
483 :param shared_file:
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
484 :param level: which level to use, e.g. unique/0.03/..
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
485 :param with_otu:
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
486 :return:
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
487 """
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
488
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
489 taxonomies = []
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
490
607c5e7e0a64 planemo upload for repository https://github.com/ErasmusMC-Bioinformatics/galaxytools-emc/tree/master/tools/mycrobiota commit 1c4c58018b64ff3531a719e789ce71cb0a1244c5
erasmus-medical-center
parents:
diff changeset
491 # create taxonomy file per sample
    with open(taxonomy_file[0], 'r') as tax, open(shared_file[0]) as sh:
        taxonomy = csv.reader(tax, delimiter='\t')
        shared = csv.reader(sh, delimiter='\t')
        shared_header = next(shared)

        for i, row in enumerate(shared):
            tax.seek(0)  # make sure to start at beginning of file each time
            if row[0] == level:
                sample = row[1]

                outfile = os.path.join(outdir, sample+".tsv")
                taxonomies.append(outfile)
                with open(outfile, 'w+') as of:
                    out_table = csv.writer(of, delimiter='\t')
                    out_table.writerow(next(taxonomy))  # header line
                    for j, t in enumerate(taxonomy):
                        assert t[0] == shared_header[j+3], \
                            "OTU mismatch between taxonomy and shared file"
                        t[1] = row[j+3]
                        out_table.writerow(t + [shared_header[j+3]])

    # make krona plot
    create_krona_plot(taxonomies, outdir, with_otu)


def create_krona_plot(taxonomy_files, outdir, with_otu):
    """
    Create a krona plot from one or more mothur taxonomy files

    :param taxonomy_files: list of mothur taxonomy files (output from
        classify.otu)
    :param outdir: directory to store krona-formatted outputs
    :param with_otu: add OTU number as a level in the Krona plot?
    :return:
    """
    krona_input_files = []

    # convert taxonomy files to krona input.
    for tax in taxonomy_files:
        with open(tax, 'r') as f:
            taxonomy = csv.reader(f, delimiter='\t')
            out_rows = []

            next(taxonomy)  # skip header line
            for row in taxonomy:
                # krona row: count, taxonomy levels, optional OTU label;
                # filter() drops empty fields and the None placeholder
                out_rows.append(list(
                    filter(None, [row[1]] + row[2].rstrip(";\n").split(';') +
                           [row[0] if with_otu else None])))

        outfile = os.path.join(outdir, os.path.basename(tax)+"krona")
        krona_input_files.append(outfile)

        with open(outfile, 'w+') as f2:
            out_table = csv.writer(f2, delimiter='\t')
            for row in out_rows:
                out_table.writerow(row)

    # execute krona command
    call(["ktImportText"] + krona_input_files)


if __name__ == "__main__":
    main()
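
The row conversion performed inside create_krona_plot can be looked at in isolation. The following is a minimal sketch, not part of mycrobiota.py: the helper name taxonomy_row_to_krona and the sample row are hypothetical, and it assumes a three-column taxonomy row (OTU label, count, semicolon-separated taxonomy string), which is what the indexing in the listing above implies.

```python
def taxonomy_row_to_krona(row, with_otu=True):
    """Turn one taxonomy row into one Krona text-import row.

    Hypothetical helper mirroring the expression used in create_krona_plot.
    Assumes row = [otu_label, count, "Level1;Level2;...;"].
    """
    otu_label, count, taxonomy = row[0], row[1], row[2]
    # strip the trailing ';' (and newline) before splitting into levels
    levels = taxonomy.rstrip(";\n").split(';')
    fields = [count] + levels + [otu_label if with_otu else None]
    # filter(None, ...) drops empty strings and the None placeholder;
    # a count of "0" is a non-empty string, so it is kept
    return list(filter(None, fields))


# made-up example row, for illustration only
example = ["Otu001", "42", "Bacteria(100);Firmicutes(100);"]
with_label = taxonomy_row_to_krona(example)
without_label = taxonomy_row_to_krona(example, with_otu=False)
```

Here with_label ends with the OTU label as the deepest level, while without_label stops at the last taxonomy level.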