changeset 0:95efbdb72961 draft

v0.0.4 - Previously only on Test Tool Shed
author peterjc
date Wed, 01 Feb 2017 07:10:26 -0500
parents
children 3ee6f4d0ac80
files test-data/SRR639755_mito_pairs_vs_NC_010642_clc.bam test-data/SRR639755_mito_pairs_vs_NC_010642_clc.bam.bai test-data/SRR639755_mito_pairs_vs_NC_010642_clc.count-1695-1725.tabular test-data/coverage_test.bam test-data/coverage_test.count_roi_variants.tabular test-data/ex1.bam test-data/ex1.count_roi_variants.tabular tools/count_roi_variants/README.rst tools/count_roi_variants/count_roi_variants.py tools/count_roi_variants/count_roi_variants.xml tools/count_roi_variants/tool_dependencies.xml
diffstat 11 files changed, 505 insertions(+), 0 deletions(-) [+]
line wrap: on
line diff
Binary file test-data/SRR639755_mito_pairs_vs_NC_010642_clc.bam has changed
Binary file test-data/SRR639755_mito_pairs_vs_NC_010642_clc.bam.bai has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/SRR639755_mito_pairs_vs_NC_010642_clc.count-1695-1725.tabular	Wed Feb 01 07:10:26 2017 -0500
@@ -0,0 +1,4 @@
+Variant	Count	Percentage
+AGCCCATGAGATGGGAAGCAATGGGCTACA	14	87.50
+AGCCCATGAGATGGGAAGCAATGGGCTACG	1	6.25
+AGCGCATGAGATGGGAAGCAATGGGCTACG	1	6.25
Binary file test-data/coverage_test.bam has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/coverage_test.count_roi_variants.tabular	Wed Feb 01 07:10:26 2017 -0500
@@ -0,0 +1,2 @@
+Variant	Count	Percentage
+TAAAGG	1	100.00
Binary file test-data/ex1.bam has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/ex1.count_roi_variants.tabular	Wed Feb 01 07:10:26 2017 -0500
@@ -0,0 +1,3 @@
+Variant	Count	Percentage
+CTATCTA	39	97.50
+ATATATA	1	2.50
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/count_roi_variants/README.rst	Wed Feb 01 07:10:26 2017 -0500
@@ -0,0 +1,138 @@
+Count sequence variants in region of interest in BAM file
+=========================================================
+
+This tool is copyright 2016 by Peter Cock, The James Hutton Institute
+(formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
+See the licence text below.
+
+This tool runs the command ``samtools view`` (taking advantage of an
+indexed BAM file) to access only those reads mapped to the region of
+interest (ROI), and then counts the different sequence variants found.
+
+Internally this tool uses the command-line samtools suite.
+
+This tool is available from the Galaxy Tool Shed at:
+http://toolshed.g2.bx.psu.edu/view/peterjc/count_roi_variants
+
+
+Use outside of Galaxy
+=====================
+
+You just need the ``count_roi_variants.py`` script and to have samtools
+on the ``$PATH``.  If you move/copy the script somewhere on your ``$PATH``
+and then you can run it like this::
+
+    $ count_roi_variants.py --help
+
+Or, call the script at an explicit path::
+
+    $ /path/to/my/stuff/count_roi_variants.py --help
+
+Run like this it will use the current default Python. This was written and
+tested under Python 2.7, but should also work under Python 2.6. e.g.::
+
+    $ python2.6 /path/to/my/stuff/count_roi_variants.py --help
+
+This does not yet support Python 3.
+
+The sample data and tests are designed to be run via Galaxy.
+
+
+Automated Galaxy Installation
+=============================
+
+This should be straightforward, Galaxy should automatically download and install
+samtools if required.
+
+
+Manual Galaxy Installation
+==========================
+
+This expects samtools to be on the ``$PATH``, and was tested using v0.1.3
+
+To install the wrapper copy or move the following files under the Galaxy tools
+folder, e.g. in a ``tools/count_roi_variants`` folder:
+
+* ``count_roi_variants.xml`` (the Galaxy tool definition)
+* ``count_roi_variants.py`` (the Python wrapper script)
+* ``README.rst`` (this file)
+
+You will also need to modify the ``tools_conf.xml`` file to tell Galaxy to offer
+the tool. Just add the line, perhaps under the NGS tools section::
+
+  <tool file="count_roi_variants/count_roi_variants.xml" />
+
+If you wish to run the unit tests, also move/copy the ``test-data/`` files
+under Galaxy's ``test-data/`` folder. Then::
+
+    $ ./run_tests.sh -id count_roi_variants
+
+That's it.
+
+
+History
+=======
+
+======= ======================================================================
+Version Changes
+------- ----------------------------------------------------------------------
+v0.0.1  - Initial public release
+v0.0.2  - Cope with pipes in reference name (e.g. NCBI style FASTA naming)
+v0.0.3  - Include a functional test for using an unrecognised reference.
+v0.0.4  - Improved usage text and README for use outside of Galaxy.
+======= ======================================================================
+
+
+Developers
+==========
+
+Development is on this GitHub repository:
+https://github.com/peterjc/pico_galaxy/tree/master/tools/count_roi_variants
+
+For pushing a release to the test or main "Galaxy Tool Shed", use the following
+Planemo commands (which requires you have set your Tool Shed access details in
+``~/.planemo.yml`` and that you have access rights on the Tool Shed)::
+
+    $ planemo shed_update -t testtoolshed --check_diff ~/repositories/pico_galaxy/tools/count_roi_variants/
+    ...
+
+or::
+
+    $ planemo shed_update -t toolshed --check_diff ~/repositories/pico_galaxy/tools/count_roi_variants/
+    ...
+
+To just build and check the tar ball, use::
+
+    $ planemo shed_upload --tar_only  ~/repositories/pico_galaxy/tools/count_roi_variants/
+    ...
+    $ tar -tzf shed_upload.tar.gz
+    tools/count_roi_variants/README.rst
+    tools/count_roi_variants/count_roi_variants.xml
+    tools/count_roi_variants/count_roi_variants.py
+    tools/count_roi_variants/tool_dependencies.xml
+    ...
+
+
+Licence (MIT)
+=============
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+
+NOTE: This is the licence for the Galaxy Wrapper only.
+samtools is available and licenced separately.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/count_roi_variants/count_roi_variants.py	Wed Feb 01 07:10:26 2017 -0500
@@ -0,0 +1,240 @@
+#!/usr/bin/env python
+"""Count sequence variants in region of interest in a BAM file.
+
+This script takes exactly four command line arguments:
+ * Input BAM filename
+ * Input BAI filename (via Galaxy metadata)
+ * Output tabular filename
+ * Region of interest (reference:start-end as used in samtools)
+
+This messes about with the filenames to make samtools happy, then
+runs "samtools view" and parses the reads mapped to the ROI, counts
+the observed variants spanning the ROI, and outputs this as a
+tabular file.
+"""
+import sys
+import os
+import subprocess
+import tempfile
+
+if "-v" in sys.argv or "--version" in sys.argv:
+    # Galaxy seems to invert the order of the two lines
+    print("BAM coverage statistics v0.0.4 (using samtools)")
+    cmd = "samtools 2>&1 | grep -i ^Version"
+    sys.exit(os.system(cmd))
+
+# TODO - Proper command line API
+usage = """Requires 4 arguments: BAM, BAI, tabular filename, samtools-style region
+
+For ease of use, you can use a minus sign as the BAI filename which will use the BAM
+filename with the suffix .bai added to it. Example using one of the test-data files:
+
+$ count_roi_variants.py "SRR639755_mito_pairs_vs_NC_010642_clc.bam" "-" "counts.txt" "gi|187250362|ref|NC_010642.1|:1695-1725"
+Counted 3 variants from 16 reads spanning gi|187250362|ref|NC_010642.1|:1695-1725
+$ cat counts.txt
+Variant	Count	Percentage
+AGCCCATGAGATGGGAAGCAATGGGCTACA	14	87.50
+AGCCCATGAGATGGGAAGCAATGGGCTACG	1	6.25
+AGCGCATGAGATGGGAAGCAATGGGCTACG	1	6.25
+"""
+if len(sys.argv) == 5:
+    bam_filename, bai_filename, tabular_filename, region = sys.argv[1:]
+else:
+    sys.exit(usage)
+
+if not os.path.isfile(bam_filename):
+    sys.exit("Input BAM file not found: %s" % bam_filename)
+if bai_filename == "-":
+    # Make this optional for ease of use at the command line by hand:
+    if os.path.isfile(bam_filename + ".bai"):
+        bai_filename = bam_filename + ".bai"
+if not os.path.isfile(bai_filename):
+    if bai_filename == "None":
+        sys.exit("Error: Galaxy did not index your BAM file")
+    sys.exit("Input BAI file not found: %s" % bai_filename)
+
+try:
+    # Sanity check we have "ref:start-end" to give clear error message
+    # Note can have semi-colon in the reference name
+    # Note can have thousand separator commas in the start/end
+    ref, start_end = region.rsplit(":", 1)
+    start, end = start_end.split("-")
+    start = int(start.replace(",", ""))
+    end = int(end.replace(",", ""))
+except ValueError:
+    sys.exit("Bad argument for region: %r" % region)
+
+
+# Assign sensible names with real extensions, and setup symlinks:
+tmp_dir = tempfile.mkdtemp()
+bam_file = os.path.join(tmp_dir, "temp.bam")
+bai_file = os.path.join(tmp_dir, "temp.bam.bai")
+os.symlink(os.path.abspath(bam_filename), bam_file)
+os.symlink(os.path.abspath(bai_filename), bai_file)
+assert os.path.isfile(bam_file), bam_file
+assert os.path.isfile(bai_file), bai_file
+assert os.path.isfile(bam_file + ".bai"), bam_file
+
+
+def clean_up():
+    os.remove(bam_file)
+    os.remove(bai_file)
+    os.rmdir(tmp_dir)
+
+
+def decode_cigar(cigar):
+    """Returns a list of 2-tuples, integer count and operator char."""
+    count = ""
+    answer = []
+    for letter in cigar:
+        if letter.isdigit():
+            count += letter  # string addition
+        elif letter in "MIDNSHP=X":
+            answer.append((int(count), letter))
+            count = ""
+        else:
+            raise ValueError("Invalid character %s in CIGAR %s" % (letter, cigar))
+    return answer
+
+
+assert decode_cigar("14S15M1P1D3P54M1D34M5S") == [(14, 'S'), (15, 'M'), (1, 'P'), (1, 'D'), (3, 'P'), (54, 'M'), (1, 'D'), (34, 'M'), (5, 'S')]
+
+
+def align_len(cigar_ops):
+    """Sums the CIGAR M/=/X/D/N operators."""
+    return sum(count for count, op in cigar_ops if op in "M=XDN")
+
+
+def expand_cigar(seq, cigar_ops):
+    """Yields (ref_offset, seq_base) pairs."""
+    ref_offset = 0
+    seq_offset = 0
+    for count, op in cigar_ops:
+        if op in "MX=":
+            for (i, base) in enumerate(seq[seq_offset:seq_offset + count]):
+                yield ref_offset + i, base
+            ref_offset += count
+            seq_offset += count
+        elif op == "I":
+            # Give them all an in-between reference position
+            # (Python lets us mix integers and floats, wouldn't work in C)
+            for (i, base) in enumerate(seq[seq_offset:seq_offset + count]):
+                yield ref_offset - 0.5, base
+            # Does not change ref_offset
+            seq_offset += count
+        elif op in "DN":
+            # Deletion/skip,
+            # TODO: Option to return gap characters
+            # for i in range(count):
+            #     yield ref_offset + i, "-"
+            ref_offset += count
+        elif op == "S":
+            # Soft clipping, silently discard the bases (OK?)
+            seq_offset += count
+        elif op in "HP":
+            # Hard trimming or pad, can ignore
+            # TODO: Yield "-" if later offer to report deletions
+            pass
+        else:
+            raise NotImplementedError("Unexpected CIGAR operator %s" % op)
+
+assert list(expand_cigar("ACGT", decode_cigar("4M"))) == [(0, "A"), (1, "C"), (2, "G"), (3, "T")]
+assert list(expand_cigar("ACGT", decode_cigar("2=1X1="))) == [(0, "A"), (1, "C"), (2, "G"), (3, "T")]
+assert list(expand_cigar("ACGT", decode_cigar("2M1D2M"))) == [(0, "A"), (1, "C"), (3, "G"), (4, "T")]
+assert list(expand_cigar("ACtGT", decode_cigar("2M1I2M"))) == [(0, "A"), (1, "C"), (1.5, "t"), (2, "G"), (3, "T")]
+assert list(expand_cigar("tACGT", decode_cigar("1I4M"))) == [(-0.5, 't'), (0, 'A'), (1, 'C'), (2, 'G'), (3, 'T')]
+assert list(expand_cigar("ACGTt", decode_cigar("4M1I"))) == [(0, 'A'), (1, 'C'), (2, 'G'), (3, 'T'), (3.5, 't')]
+assert list(expand_cigar("AAAAGGGGTTTT", decode_cigar("12M"))) == [(0, 'A'), (1, 'A'), (2, 'A'), (3, 'A'), (4, 'G'), (5, 'G'), (6, 'G'), (7, 'G'), (8, 'T'), (9, 'T'), (10, 'T'), (11, 'T')]
+assert list(expand_cigar("AAAAcGGGGTTTT", decode_cigar("4M1I8M"))) == [(0, 'A'), (1, 'A'), (2, 'A'), (3, 'A'), (3.5, 'c'), (4, 'G'), (5, 'G'), (6, 'G'), (7, 'G'), (8, 'T'), (9, 'T'), (10, 'T'), (11, 'T')]
+assert list(expand_cigar("AAAAGGGGcTTTT", decode_cigar("8M1I4M"))) == [(0, 'A'), (1, 'A'), (2, 'A'), (3, 'A'), (4, 'G'), (5, 'G'), (6, 'G'), (7, 'G'), (7.5, "c"), (8, 'T'), (9, 'T'), (10, 'T'), (11, 'T')]
+assert list(expand_cigar("AAAAcGGGGcTTTT", decode_cigar("4M1I4M1I4M"))) == [(0, 'A'), (1, 'A'), (2, 'A'), (3, 'A'), (3.5, 'c'), (4, 'G'), (5, 'G'), (6, 'G'), (7, 'G'), (7.5, 'c'), (8, 'T'), (9, 'T'), (10, 'T'), (11, 'T')]
+
+
+def get_roi(seq, cigar_ops, start, end):
+    """Extract region of seq mapping to the ROI.
+
+    Expect start and end to be zero based Python style end points.
+
+    i.e. The ROI relative to the mapping start recorded in the POS field.
+    Will return part of the SAM/BAM value SEQ based on interpretting the
+    passed CIGAR operators.
+    """
+    if len(cigar_ops) == 1 and cigar_ops[0][1] in "M=X":
+        # Easy case, note start/end/pos all one-based
+        assert cigar_ops[0][0] == len(seq)
+        return seq[start:end]
+    # Would use "start <= i < end" if they were all integers, but
+    # want to exclude e.g. 3.5 and 7.5 when given start 4 and end 8.
+    return "".join(base for i, base in expand_cigar(seq, cigar_ops) if start <= i <= end - 1)
+
+assert "GGGG" == get_roi("AAAAGGGGTTTT", decode_cigar("12M"), 4, 8)
+assert "GGGG" == get_roi("AAAAcGGGGTTTT", decode_cigar("4M1I8M"), 4, 8)
+assert "GGGG" == get_roi("AAAAGGGGcTTTT", decode_cigar("8M1I4M"), 4, 8)
+assert "GGGG" == get_roi("AAAAcGGGGcTTTT", decode_cigar("4M1I4M1I4M"), 4, 8)
+assert "GGaGG" == get_roi("AAAAGGaGGTTTT", decode_cigar("6M1I6M"), 4, 8)
+assert "GGGgA" == get_roi("AAAAGGGgATTTT", decode_cigar("7M1I5M"), 4, 8)
+
+
+def count_region():
+    # Could recreate the region string (with no commas in start/end)?
+    # region = "%s:%i-%i" % (ref, start, end)
+
+    tally = dict()
+
+    # Call samtools view, don't need header so no -h added.
+    # Only want mapped reads, thus flag filter -F 4.
+    child = subprocess.Popen(["samtools", "view", "-F", "4", bam_file, region],
+                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+    for line in child.stdout:
+        assert line[0] != "@", "Got unexpected SAM header line: %s" % line
+        qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, rest = line.split("\t", 10)
+        pos = int(pos)  # one-based
+        if start < pos:
+            # Does not span the ROI
+            continue
+        cigar_ops = decode_cigar(cigar)
+        if pos + align_len(cigar_ops) - 1 < end:
+            # Does not span the ROI
+            continue
+        # All of start/end/pos are currently one-based, making offsets Python style....
+        roi_seq = get_roi(seq, cigar_ops, start - pos, end - pos + 1)
+        assert roi_seq, "Error, empty ROI sequence for: %s" % line
+        try:
+            tally[roi_seq] += 1
+        except KeyError:
+            tally[roi_seq] = 1
+
+    stderr = child.stderr.read()
+    child.stdout.close()
+    child.stderr.close()
+    return_code = child.wait()
+    if return_code:
+        sys.exit("Got return code %i from samtools view" % return_code)
+    elif "specifies an unknown reference name. Continue anyway." in stderr:
+        sys.exit(stderr.strip() + "\n\nERROR: samtools did not recognise the region requested, can't count any variants.")
+
+    return tally
+
+
+def record_counts():
+
+    tally = count_region()
+    total = sum(tally.values())
+
+    # Using negative count to get sort with highest count first,
+    # while tie-breaking by the ROI sequence alphabetically.
+    table = sorted((-count, roi_seq) for (roi_seq, count) in tally.items())
+    del tally
+
+    with open(tabular_filename, "w") as handle:
+        handle.write("Variant\tCount\tPercentage\n")
+        for count, roi_seq in table:
+            handle.write("%s\t%i\t%0.2f\n" % (roi_seq, -count, -count * 100.0 / total))
+
+    print("Counted %i variants from %i reads spanning %s" % (len(table), total, region))
+
+
+# Run it!
+record_counts()
+# Remove the temp symlinks and files:
+clean_up()
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/count_roi_variants/count_roi_variants.xml	Wed Feb 01 07:10:26 2017 -0500
@@ -0,0 +1,112 @@
+<tool id="count_roi_variants" name="Count sequence variants in region of interest" version="0.0.3">
+    <description>using samtools view</description>
+    <requirements>
+        <requirement type="binary">samtools</requirement>
+        <requirement type="package" version="1.2.2">samtools</requirement>
+    </requirements>
+    <stdio>
+        <!-- Assume anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+    </stdio>
+    <version_command interpreter="python">count_roi_variants.py --version</version_command>
+    <command interpreter="python">count_roi_variants.py "$input_bam" "${input_bam.metadata.bam_index}" "$out_tabular" "$region"</command>
+    <inputs>
+        <param name="input_bam" type="data" format="bam" label="Input BAM file" />
+        <param name="region" type="text" label="Region of interest" help="Use the reference:start-end syntax as in samtools.">
+            <sanitizer>
+                <!--
+                    SAM/BAM spec says the name must match regex [!-)+-<>-~][!-~]*
+                    but will focus on ASCII 33 ("!") to ASCII 126 ("~"), i.e.
+                    !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
+                    of which some are going to be potential trouble like !, ", $, &, ', \, `, {, |, }
+                    which is how I came to this hopefully safe set..
+                -->
+                <valid initial="string.letters,string.digits">
+                    <add value="#%+,-./:;=@^_|~" />
+                    <remove value=" " />
+                </valid>
+                <mapping initial="none" />
+            </sanitizer>
+        </param>
+    </inputs>
+    <outputs>
+        <data name="out_tabular" format="tabular" label="$input_bam.name $region variants" />
+    </outputs>
+    <tests>
+        <test>
+            <param name="input_bam" value="ex1.bam" ftype="bam" />
+            <param name="region" value="chr2:400-406" />
+            <output name="out_tabular" file="ex1.count_roi_variants.tabular" ftype="tabular" />
+            <assert_stdout>
+                <has_line line="Counted 2 variants from 40 reads spanning chr2:400-406" />
+            </assert_stdout>
+        </test>
+        <test>
+            <param name="input_bam" value="coverage_test.bam" ftype="bam" />
+            <param name="region" value="ref:10-15" />
+            <output name="out_tabular" file="coverage_test.count_roi_variants.tabular" ftype="tabular" />
+            <assert_stdout>
+                <has_line line="Counted 1 variants from 1 reads spanning ref:10-15" />
+            </assert_stdout>
+        </test>
+        <!-- This test is a tricky one due to the NCBI's love of pipe characters in identifiers -->
+        <test>
+            <param name="input_bam" value="SRR639755_mito_pairs_vs_NC_010642_clc.bam" ftype="bam" />
+            <param name="region" value="gi|187250362|ref|NC_010642.1|:1695-1725" />
+            <output name="out_tabular" file="SRR639755_mito_pairs_vs_NC_010642_clc.count-1695-1725.tabular" ftype="tabular" />
+            <assert_stdout>
+                <has_line line="Counted 3 variants from 16 reads spanning gi|187250362|ref|NC_010642.1|:1695-1725" />
+            </assert_stdout>
+        </test>
+        <test expect_failure="true" expect_exit_code="1">
+            <param name="input_bam" value="SRR639755_mito_pairs_vs_NC_010642_clc.bam" ftype="bam" />
+            <param name="region" value="ref:1695-1725" />
+            <assert_stderr>
+                <has_line line="ERROR: samtools did not recognise the region requested, can't count any variants." />
+            </assert_stderr>
+        </test>
+    </tests>
+    <help>
+**What it does**
+
+This tool runs the command ``samtools view`` from the SAMtools toolkit, getting
+all the reads in your BAM file mapped to the given region of interest (ROI).
+It then counts all the different sequence variants in reads spanning that ROI,
+which are returned as a tab-separated table.
+
+Reads mapped to the ROI but not spanning it completely are ignored.
+
+Input is a sorted and indexed BAM file, the output is tabular. The first column
+is the observed sequence variants within the ROI, the second column is the number
+of reads with that sequence, and the third column gives this as a percentage of
+the reads spanning the ROI.
+
+====== =================================================================================
+Column Description
+------ ---------------------------------------------------------------------------------
+     1 Sequence variant from ROI
+     2 Number of reads with that sequence variant
+     3 Percentage of reads with that sequence variant (2 dp)
+====== =================================================================================
+
+
+**Citation**
+
+If you use this Galaxy tool in work leading to a scientific publication please
+cite:
+
+Heng Li et al (2009). The Sequence Alignment/Map format and SAMtools.
+Bioinformatics 25(16), 2078-9.
+http://dx.doi.org/10.1093/bioinformatics/btp352
+
+Peter J.A. Cock (2016), Count sequence variants in region of interest in BAM file.
+http://toolshed.g2.bx.psu.edu/view/peterjc/count_roi_variants
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/count_roi_variants
+    </help>
+    <citations>
+        <citation type="doi">10.1093/bioinformatics/btp352</citation>
+    </citations>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/count_roi_variants/tool_dependencies.xml	Wed Feb 01 07:10:26 2017 -0500
@@ -0,0 +1,6 @@
+<?xml version="1.0"?>
+<tool_dependency>
+    <package name="samtools" version="1.2">
+        <repository changeset_revision="f6ae3ba3f3c1" name="package_samtools_1_2" owner="iuc" toolshed="https://toolshed.g2.bx.psu.edu" />
+    </package>
+</tool_dependency>