annotate tools/coverage_stats/coverage_stats.xml @ 1:d1fdfaae5dbe draft
Uploaded v0.0.1b with warning about samtools 8000 coverage limit
| author | peterjc |
|---|---|
| date | Fri, 21 Nov 2014 09:43:58 -0500 |
| parents | ca8f63f2f7d4 |
| children | 7254ece0c0ff |
| rev | line source |
|---|---|
| 0 | 1 <tool id="coverage_stats" name="BAM coverage statistics" version="0.0.1"> |
| 2 <description>using samtools idxstats and depth</description> | |
| 3 <requirements> | |
| 4 <requirement type="binary">samtools</requirement> | |
| 5 <requirement type="package" version="0.1.19">samtools</requirement> | |
| 6 </requirements> | |
| 7 <version_command interpreter="python">coverage_stats.py --version</version_command> | |
| 8 <command interpreter="python">coverage_stats.py "$input_bam" "${input_bam.metadata.bam_index}" "$out_tabular"</command> | |
| 9 <inputs> | |
| 10 <param name="input_bam" type="data" format="bam" label="Input BAM file" /> | |
| 11 </inputs> | |
| 12 <outputs> | |
| 13 <data name="out_tabular" format="tabular" label="$input_bam.name (coverage stats)" /> | |
| 14 </outputs> | |
| 15 <stdio> | |
| 16 <!-- Assume anything other than zero is an error --> | |
| 17 <exit_code range="1:" /> | |
| 18 <exit_code range=":-1" /> | |
| 19 </stdio> | |
| 20 <tests> | |
| 21 <test> | |
| 22 <param name="input_bam" value="ex1.bam" ftype="bam" /> | |
| 23 <output name="out_tabular" file="ex1.coverage_stats.tabular" ftype="tabular" /> | |
| 24 </test> | |
| 25 <test> | |
| 26 <param name="input_bam" value="coverage_test.bam" ftype="bam" /> | |
| 27 <output name="out_tabular" file="coverage_test.coverage_stats.tabular" ftype="tabular" /> | |
| 28 </test> | |
| 29 </tests> | |
| 30 <help> | |
| 31 **What it does** | |
| 32 | |
| 33 This tool runs the commands ``samtools idxstats`` and ``samtools depth`` from the | |
| 34 SAMtools toolkit, and parses their output to produce a concise summary of the | |
| 35 coverage information for each reference sequence. | |
| 36 | |
| 37 Input is a sorted and indexed BAM file; the output is tabular. The first four | |
| 38 columns match the output from ``samtools idxstats``; the additional columns are | |
| 39 calculated from the ``samtools depth`` output. The final row with a star as the | |
| 40 reference identifier represents unmapped reads, and will have zeros in every | |
| 41 column except columns one and four. | |
| 42 | |
| 43 ====== ================================================================================= | |
| 44 Column Description | |
| 45 ------ --------------------------------------------------------------------------------- | |
| 46 1 Reference sequence identifier | |
| 47 2 Reference sequence length | |
| 48 3 Number of mapped reads | |
| 49 4 Number of placed but unmapped reads (typically unmapped partners of mapped reads) | |
| 50 5 Minimum coverage | |
| 51 6 Maximum coverage | |
| 52 7 Mean coverage (given to 2 dp) | |
| 53 ====== ================================================================================= | |
| 54 | |
| 55 Example output from a *de novo* assembly: | |
| 56 | |
| 57 ========== ====== ====== ====== ======= ======= ======== | |
| 58 identifier length mapped placed min_cov max_cov mean_cov | |
| 59 ---------- ------ ------ ------ ------- ------- -------- | |
| 60 contig_1 833604 436112 0 1 157 71.95 | |
| 61 contig_2 14820 9954 0 1 152 91.27 | |
| 62 contig_3 272099 142958 0 1 150 72.31 | |
| 63 contig_4 135519 73288 0 1 149 75.23 | |
| 64 contig_5 91245 46759 0 1 157 70.92 | |
| 65 contig_6 175604 95744 0 1 146 75.99 | |
| 66 contig_7 90586 48158 0 1 151 72.93 | |
| 67 contig_9 234347 126458 0 1 159 75.40 | |
| 68 contig_10 121515 60211 0 1 152 68.12 | |
| 69 ... ... ... ... ... ... ... | |
| 70 contig_604 712 85 0 1 49 21.97 | |
| 71 \* 0 0 950320 0 0 0.00 | |
| 72 ========== ====== ====== ====== ======= ======= ======== | |
| 73 | |
| 74 In this example there were 604 contigs, each with one line in the output table, | |
| 75 plus the final row (labelled with an asterisk) representing 950320 unmapped reads. | |
| 76 In this BAM file, the fourth column was otherwise zero. | |
| 77 | |
| 1 | 78 .. class:: warningmark |
| 79 | |
| 80 **Note**. There is an internal hard limit of 8000 for the pileup routine in | |
| 81 samtools, meaning the reported coverage from ``samtools depth`` will show | |
| 82 maximum coverage depths *around* 8000. | |
| 83 | |
| 0 | 84 |
| 85 **Citation** | |
| 86 | |
| 87 If you use this Galaxy tool in work leading to a scientific publication please | |
| 88 cite: | |
| 89 | |
| 90 Heng Li et al (2009). The Sequence Alignment/Map format and SAMtools. | |
| 91 Bioinformatics 25(16), 2078-9. | |
| 92 http://dx.doi.org/10.1093/bioinformatics/btp352 | |
| 93 | |
| 94 Peter J.A. Cock (2013), BAM coverage statistics using samtools idxstats and depth. | |
| 95 http://toolshed.g2.bx.psu.edu/view/peterjc/coverage_stats | |
| 96 | |
| 97 This wrapper is available to install into other Galaxy Instances via the Galaxy | |
| 98 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/coverage_stats | |
| 99 </help> | |
| 100 <citations> | |
| 101 <citation type="doi">10.1093/bioinformatics/btp352</citation> | |
| 102 </citations> | |
| 103 </tool> |
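
The help text above describes combining ``samtools idxstats`` output (columns 1 to 4) with per-reference minimum, maximum and mean depth derived from ``samtools depth`` (columns 5 to 7). The following is a minimal Python sketch of that combination step, not the actual ``coverage_stats.py`` shipped with this wrapper. The function names ``parse_idxstats``, ``summarise_depth`` and ``coverage_table`` are hypothetical. It assumes both inputs are tab-separated text already captured from samtools, that the mean is taken over the full reference length, and that the minimum is taken only over positions that ``samtools depth`` actually reports; the real script may differ on both points.

```python
def parse_idxstats(text):
    """Parse idxstats-style text into (identifier, length, mapped, placed) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        identifier, length, mapped, placed = line.split("\t")
        rows.append((identifier, int(length), int(mapped), int(placed)))
    return rows


def summarise_depth(text):
    """Collect (min, max, total) depth per reference from depth-style text.

    Each input line is expected to hold: identifier, position, depth.
    """
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        identifier, _position, depth = line.split("\t")[:3]
        depth = int(depth)
        if identifier in stats:
            low, high, total = stats[identifier]
            stats[identifier] = (min(low, depth), max(high, depth), total + depth)
        else:
            stats[identifier] = (depth, depth, depth)
    return stats


def coverage_table(idxstats_text, depth_text):
    """Build the seven-column rows described in the help text."""
    depth_stats = summarise_depth(depth_text)
    table = []
    for identifier, length, mapped, placed in parse_idxstats(idxstats_text):
        # References with no reported depth (including the final "*" row)
        # get zeros, matching the description of the unmapped-reads row.
        low, high, total = depth_stats.get(identifier, (0, 0, 0))
        # Assumption: the mean is the summed depth divided by reference length.
        mean = total / length if length else 0.0
        table.append((identifier, length, mapped, placed, low, high, "%.2f" % mean))
    return table


if __name__ == "__main__":
    # Tiny made-up inputs, for illustration only.
    idxstats_text = "contig_1\t10\t4\t0\n*\t0\t0\t7\n"
    depth_text = "contig_1\t1\t1\ncontig_1\t2\t3\ncontig_1\t3\t2\n"
    for row in coverage_table(idxstats_text, depth_text):
        print("\t".join(str(value) for value in row))
```

Keeping the parsing separate from the samtools invocation means the summary logic can be checked on small hand-written inputs without a BAM file. Because of the pileup limit noted in the help text, a maximum computed this way would be expected to level off at around 8000 on very deep data.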
