# HG changeset patch
# User devteam
# Date 1380135311 14400
# Node ID f2ab5b44870d47fff5baac9c52f32c0b0fc9fb6e
Uploaded tool tarball.
diff -r 000000000000 -r f2ab5b44870d fasta_clipping_histogram.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/fasta_clipping_histogram.xml Wed Sep 25 14:55:11 2013 -0400
@@ -0,0 +1,112 @@
+
+ chart
+
+ fastx_toolkit
+
+ fasta_clipping_histogram.pl $input $outfile
+
+
+
+
+
+
+
+
+
+
+**What it does**
+
+This tool creates a histogram image of sequence lengths distribution in a given fasta dataset file.
+
+**TIP:** Use this tool after clipping your library (with **FASTX Clipper tool**), to visualize the clipping results.
+
+-----
+
+**Output Examples**
+
+In the following library, most sequences are 24-mers to 27-mers.
+This could indicate an abundance of endo-siRNAs (depending of course of what you've tried to sequence in the first place).
+
+.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png
+
+
+In the following library, most sequences are 19,22 or 23-mers.
+This could indicate an abundance of miRNAs (depending of course of what you've tried to sequence in the first place).
+
+.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png
+
+
+-----
+
+
+**Input Formats**
+
+This tool accepts short-reads FASTA files. The reads don't have to be short, but they do have to be on a single line, like so::
+
+ >sequence1
+ AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG
+ >sequence2
+ GTGTGTGTGGGAAGTTGACACAGTA
+ >sequence3
+ CCTTGAGATTAACGCTAATCAAGTAAAC
+
+
+If the sequences span over multiple lines::
+
+ >sequence1
+ CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
+ TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG
+ aactggtctttacctTTAAGTTG
+
+Use the **FASTA Width Formatter** tool to re-format the FASTA into a single-lined sequences::
+
+ >sequence1
+ CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG
+
+
+-----
+
+
+
+**Multiplicity counts (a.k.a reads-count)**
+
+If the sequence identifier (the text after the '>') contains a dash and a number, it is treated as a multiplicity count value (i.e. how many times that individual sequence repeated in the original FASTA file, before collapsing).
+
+Example 1 - The following FASTA file *does not* have multiplicity counts::
+
+ >seq1
+ GGATCC
+ >seq2
+ GGTCATGGGTTTAAA
+ >seq3
+ GGGATATATCCCCACACACACACAC
+
+Each sequence is counts as one, to produce the following chart:
+
+.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png
+
+
+Example 2 - The following FASTA file have multiplicity counts::
+
+ >seq1-2
+ GGATCC
+ >seq2-10
+ GGTCATGGGTTTAAA
+ >seq3-3
+ GGGATATATCCCCACACACACACAC
+
+The first sequence counts as 2, the second as 10, the third as 3, to produce the following chart:
+
+.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_4.png
+
+Use the **FASTA Collapser** tool to create FASTA files with multiplicity counts.
+
+------
+
+This tool is based on `FASTX-toolkit`__ by Assaf Gordon.
+
+ .. __: http://hannonlab.cshl.edu/fastx_toolkit/
+
+
+
+
\ No newline at end of file
diff -r 000000000000 -r f2ab5b44870d tool_dependencies.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_dependencies.xml Wed Sep 25 14:55:11 2013 -0400
@@ -0,0 +1,6 @@
+
+
+
+
+
+