annotate tools/coverage_stats/coverage_stats.xml @ 2:7254ece0c0ff draft

v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
author peterjc
date Thu, 11 May 2017 12:16:10 -0400
parents d1fdfaae5dbe
children
rev   line source
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
1 <tool id="coverage_stats" name="BAM coverage statistics" version="0.0.5">
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
2 <description>using samtools idxstats and depth</description>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
3 <requirements>
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
4 <requirement type="package" version="1.4.1">samtools</requirement>
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
5 </requirements>
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
6 <version_command>
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
7 python $__tool_directory__/coverage_stats.py --version
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
8 </version_command>
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
9 <command detect_errors="aggressive">
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
10 python $__tool_directory__/coverage_stats.py '$input_bam' '${input_bam.metadata.bam_index}' '$out_tabular' '$max_depth'
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
11 </command>
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
12 <inputs>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
13 <param name="input_bam" type="data" format="bam" label="Input BAM file" />
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
14 <param name="max_depth" type="integer" min="0" max="10000000" label="Max depth" value="8000" />
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
15 </inputs>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
16 <outputs>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
17 <data name="out_tabular" format="tabular" label="$input_bam.name (coverage stats)" />
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
18 </outputs>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
19 <tests>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
20 <test>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
21 <param name="input_bam" value="ex1.bam" ftype="bam" />
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
22 <param name="max_depth" value="123" />
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
23 <output name="out_tabular" file="ex1.coverage_stats.tabular" ftype="tabular" />
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
24 </test>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
25 <test>
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
26 <param name="input_bam" value="ex1.bam" ftype="bam" />
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
27 <param name="max_depth" value="50" />
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
28 <output name="out_tabular" file="ex1.coverage_stats.md50.tabular" ftype="tabular" />
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
29 </test>
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
30 <test>
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
31 <param name="input_bam" value="coverage_test.bam" ftype="bam" />
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
32 <param name="max_depth" value="123" />
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
33 <output name="out_tabular" file="coverage_test.coverage_stats.tabular" ftype="tabular" />
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
34 </test>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
35 </tests>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
36 <help>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
37 **What it does**
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
38
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
39 This tool runs the commands ``samtools idxstats`` and ``samtools depth`` from the
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
40 SAMtools toolkit, and parses their output to produce a concise summary of the
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
41 coverage information for each reference sequence.
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
42
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
43 The input is a sorted and indexed BAM file; the output is tabular. The first four
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
44 columns match the output from ``samtools idxstats``; the additional columns are
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
45 calculated from the ``samtools depth`` output. The final row with a star as the
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
46 reference identifier represents unmapped reads, and will have zeros in every
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
47 column except columns one and four.
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
48
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
49 ====== =================================================================================
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
50 Column Description
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
51 ------ ---------------------------------------------------------------------------------
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
52 1 Reference sequence identifier
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
53 2 Reference sequence length
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
54 3 Number of mapped reads
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
55 4 Number of placed but unmapped reads (typically unmapped partners of mapped reads)
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
56 5 Minimum coverage (per base of reference)
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
57 6 Maximum coverage (per base of reference)
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
58 7 Mean coverage (given to 2 dp)
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
59 ====== =================================================================================
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
60
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
61 Example output from a *de novo* assembly:
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
62
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
63 ========== ====== ====== ====== ======= ======= ========
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
64 identifier length mapped placed min_cov max_cov mean_cov
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
65 ---------- ------ ------ ------ ------- ------- --------
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
66 contig_1 833604 436112 0 1 157 71.95
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
67 contig_2 14820 9954 0 1 152 91.27
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
68 contig_3 272099 142958 0 1 150 72.31
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
69 contig_4 135519 73288 0 1 149 75.23
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
70 contig_5 91245 46759 0 1 157 70.92
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
71 contig_6 175604 95744 0 1 146 75.99
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
72 contig_7 90586 48158 0 1 151 72.93
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
73 contig_9 234347 126458 0 1 159 75.40
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
74 contig_10 121515 60211 0 1 152 68.12
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
75 ... ... ... ... ... ... ...
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
76 contig_604 712 85 0 1 49 21.97
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
77 \* 0 0 950320 0 0 0.00
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
78 ========== ====== ====== ====== ======= ======= ========
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
79
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
80 In this example there were 604 contigs, each with one line in the output table,
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
81 plus the final row (labelled with an asterisk) representing 950320 unmapped reads.
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
82 In this BAM file, the fourth column was otherwise zero.
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
83
1
d1fdfaae5dbe Uploaded v0.0.1b with warning about samtools 8000 coverage limit
peterjc
parents: 0
diff changeset
84 .. class:: warningmark
d1fdfaae5dbe Uploaded v0.0.1b with warning about samtools 8000 coverage limit
peterjc
parents: 0
diff changeset
85
2
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
86 **Note**. If using this on a mapping BAM file, beware that the coverage counting is
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
87 done per base of the reference. This means if your reference has any extra bases
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
88 compared to the reads being mapped, those bases will be skipped by CIGAR D operators
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
89 and these "extra" bases can have an extremely low coverage, giving a potentially
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
90 misleading ``min_cov`` value. A sliding window coverage may be more appropriate.
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
91
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
92 **Note**. Up until samtools 1.2, there was an internal hard limit of 8000 for the
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
93 pileup routine, meaning the reported coverage from ``samtools depth`` would show
7254ece0c0ff v0.0.5 - Supports max coverage depth in recent samtools. Expects samtools 1.4.1 via Conda, not via Tool Shed.
peterjc
parents: 1
diff changeset
94 maximum coverage depths *around* 8000. This is now a run-time option.
1
d1fdfaae5dbe Uploaded v0.0.1b with warning about samtools 8000 coverage limit
peterjc
parents: 0
diff changeset
95
0
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
96
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
97 **Citation**
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
98
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
99 If you use this Galaxy tool in work leading to a scientific publication, please
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
100 cite:
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
101
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
102 Heng Li et al. (2009). The Sequence Alignment/Map format and SAMtools.
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
103 Bioinformatics 25(16), 2078-9.
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
104 http://dx.doi.org/10.1093/bioinformatics/btp352
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
105
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
106 Peter J.A. Cock (2013), BAM coverage statistics using samtools idxstats and depth.
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
107 http://toolshed.g2.bx.psu.edu/view/peterjc/coverage_stats
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
108
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
109 This wrapper is available to install into other Galaxy instances via the Galaxy
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
110 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/coverage_stats
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
111 </help>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
112 <citations>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
113 <citation type="doi">10.1093/bioinformatics/btp352</citation>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
114 </citations>
ca8f63f2f7d4 Uploaded v0.0.1
peterjc
parents:
diff changeset
115 </tool>
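
The help text above describes a seven-column table built from ``samtools idxstats`` and ``samtools depth`` output. As a rough illustration of how such a table could be derived, here is a minimal Python sketch. It is not the code in coverage_stats.py: the function name `summarise_coverage` is hypothetical, and it works on text already captured from the two commands rather than calling samtools itself. It assumes the usual tab-separated layouts (four columns for idxstats; reference, position and depth for depth), and it assumes that any reference position missing from the depth text has zero coverage.

```python
from collections import defaultdict


def summarise_coverage(idxstats_text, depth_text):
    """Combine idxstats-style and depth-style text into per-reference rows.

    Hypothetical helper for illustration only. Each returned row is
    (identifier, length, mapped, placed, min_cov, max_cov, mean_cov),
    with mean_cov formatted as a string with two decimal places.
    """
    total = defaultdict(int)    # sum of depths over reported positions
    count = defaultdict(int)    # number of reported positions
    highest = defaultdict(int)  # largest reported depth
    lowest = {}                 # smallest reported depth

    # Depth text: reference name, 1-based position, depth (tab separated).
    for line in depth_text.splitlines():
        if not line.strip():
            continue
        ref, _position, depth = line.split("\t")
        depth = int(depth)
        total[ref] += depth
        count[ref] += 1
        highest[ref] = max(highest[ref], depth)
        lowest[ref] = min(lowest.get(ref, depth), depth)

    rows = []
    # Idxstats text: name, length, mapped reads, placed but unmapped reads.
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        ref, length, mapped, placed = line.split("\t")
        length = int(length)
        if length == 0 or count[ref] == 0:
            # The final '*' row, or a reference with no reported depth.
            min_cov, max_cov, mean = 0, 0, 0.0
        else:
            # Assumption: positions absent from the depth text have depth 0,
            # so the minimum is 0 unless every base was reported.
            min_cov = lowest[ref] if count[ref] >= length else 0
            max_cov = highest[ref]
            mean = total[ref] / length
        rows.append(
            (ref, length, int(mapped), int(placed),
             min_cov, max_cov, "%0.2f" % mean)
        )
    return rows
```

Because the mean is taken over the full reference length, a reference with unreported positions gets a lower mean than one averaged over covered bases only. The warning in the help about deletions dragging ``min_cov`` down applies to this sketch in the same way.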