comparison fasta_clipping_histogram.xml @ 1:f666895cbebd draft

planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/tool_collections/fastx_toolkit/fasta_clipping_histogram commit a1517c9d22029095120643bbe2c8fa53754dd2b7
author devteam
date Wed, 11 Nov 2015 12:36:37 -0500
parents f2ab5b44870d
children 9db07fd39f85
comparison
0:f2ab5b44870d 1:f666895cbebd
1 <tool id="cshl_fasta_clipping_histogram" name="Length Distribution" version="1.0.0"> 1 <tool id="cshl_fasta_clipping_histogram" name="Length Distribution" version="1.0.0">
2 <description>chart</description> 2 <description>chart</description>
3 <requirements> 3 <requirements>
4 <requirement type="package" version="0.0.13">fastx_toolkit</requirement> 4 <requirement type="package" version="0.0.13">fastx_toolkit</requirement>
5 </requirements> 5 </requirements>
6 <command>fasta_clipping_histogram.pl $input $outfile</command> 6 <command>fasta_clipping_histogram.pl $input $outfile</command>
7
8 <inputs>
9 <param format="fasta" name="input" type="data" label="Library to analyze" />
10 </inputs>
11 7
12 <outputs> 8 <inputs>
13 <data format="png" name="outfile" metadata_source="input" /> 9 <param format="fasta" name="input" type="data" label="Library to analyze" />
14 </outputs> 10 </inputs>
15 <help>
16 11
12 <outputs>
13 <data format="png" name="outfile" metadata_source="input" />
14 </outputs>
15 <tests>
16 </tests>
17 <help>
17 **What it does** 18 **What it does**
18 19
19 This tool creates a histogram image of the sequence length distribution in a given FASTA dataset. 20 This tool creates a histogram image of the sequence length distribution in a given FASTA dataset.
20 21
21 **TIP:** Use this tool after clipping your library (with **FASTX Clipper tool**), to visualize the clipping results. 22 **TIP:** Use this tool after clipping your library (with **FASTX Clipper tool**), to visualize the clipping results.
22 23
23 ----- 24 -----
24 25
25 **Output Examples** 26 **Output Examples**
26 27
27 In the following library, most sequences are 24-mers to 27-mers. 28 In the following library, most sequences are 24-mers to 27-mers.
28 This could indicate an abundance of endo-siRNAs (depending, of course, on what you tried to sequence in the first place). 29 This could indicate an abundance of endo-siRNAs (depending, of course, on what you tried to sequence in the first place).
29 30
30 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png 31 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png
31 32
32 33 In the following library, most sequences are 19, 22 or 23-mers.
33 In the following library, most sequences are 19, 22 or 23-mers.
34 This could indicate an abundance of miRNAs (depending, of course, on what you tried to sequence in the first place). 34 This could indicate an abundance of miRNAs (depending, of course, on what you tried to sequence in the first place).
35 35
36 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png 36 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png
37 37
38
39 ----- 38 -----
40
41 39
42 **Input Formats** 40 **Input Formats**
43 41
44 This tool accepts short-reads FASTA files. The reads don't have to be short, but they do have to be on a single line, like so:: 42 This tool accepts short-reads FASTA files. The reads don't have to be short, but they do have to be on a single line, like so::
45 43
47 AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG 45 AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG
48 >sequence2 46 >sequence2
49 GTGTGTGTGGGAAGTTGACACAGTA 47 GTGTGTGTGGGAAGTTGACACAGTA
50 >sequence3 48 >sequence3
51 CCTTGAGATTAACGCTAATCAAGTAAAC 49 CCTTGAGATTAACGCTAATCAAGTAAAC
52
53 50
54 If the sequences span over multiple lines:: 51 If the sequences span over multiple lines::
55 52
56 >sequence1 53 >sequence1
57 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG 54 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
61 Use the **FASTA Width Formatter** tool to re-format the FASTA into single-line sequences:: 58 Use the **FASTA Width Formatter** tool to re-format the FASTA into single-line sequences::
62 59
63 >sequence1 60 >sequence1
64 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG 61 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG
65 62
66
67 ----- 63 -----
68
69
70 64
71 **Multiplicity counts (a.k.a reads-count)** 65 **Multiplicity counts (a.k.a reads-count)**
72 66
73 If the sequence identifier (the text after the '>') contains a dash and a number, it is treated as a multiplicity count value (i.e. how many times that individual sequence was repeated in the original FASTA file, before collapsing). 67 If the sequence identifier (the text after the '>') contains a dash and a number, it is treated as a multiplicity count value (i.e. how many times that individual sequence was repeated in the original FASTA file, before collapsing).
74 68
82 GGGATATATCCCCACACACACACAC 76 GGGATATATCCCCACACACACACAC
83 77
84 Each sequence counts as one, to produce the following chart: 78 Each sequence counts as one, to produce the following chart:
85 79
86 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png 80 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png
87
88 81
89 Example 2 - The following FASTA file has multiplicity counts:: 83 The following FASTA file has multiplicity counts::
90 83
91 >seq1-2 84 >seq1-2
92 GGATCC 85 GGATCC
104 ------ 97 ------
105 98
106 This tool is based on `FASTX-toolkit`__ by Assaf Gordon. 99 This tool is based on `FASTX-toolkit`__ by Assaf Gordon.
107 100
108 .. __: http://hannonlab.cshl.edu/fastx_toolkit/ 101 .. __: http://hannonlab.cshl.edu/fastx_toolkit/
109 102 </help>
110 </help>
111 <!-- FASTA-Clipping-Histogram is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->
112 </tool> 103 </tool>
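
The multiplicity-count rule described in the help text above can be illustrated with a rough Python sketch. This is not the code of `fasta_clipping_histogram.pl`. The function names `multiplicity` and `length_histogram` are hypothetical, and the sketch assumes the count is the number that follows the last dash at the very end of the identifier, which the help text does not state precisely. It also assumes single-line sequences, as the help text requires.

```python
import re
from collections import Counter

# Assumption: the multiplicity count is a trailing "-<number>" on the header line.
_COUNT_RE = re.compile(r"-(\d+)$")


def multiplicity(header):
    """Return the count encoded in a FASTA header line, or 1 if there is none."""
    match = _COUNT_RE.search(header.strip())
    return int(match.group(1)) if match else 1


def length_histogram(lines):
    """Map sequence length to total read count for single-line FASTA records."""
    histogram = Counter()
    count = 1  # used if a sequence line appears before any header
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            count = multiplicity(line)
        else:
            # Each sequence line contributes its multiplicity to its length bin.
            histogram[len(line)] += count
    return histogram
```

With the record `>seq1-2` / `GGATCC` from Example 2, this sketch would add 2 to the bin for length 6, while a header without a dash and number, such as `>sequence2`, would contribute 1.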