sharplabtool: tools/fastx_toolkit/fasta_clipping

annotate tools/fastx_toolkit/fasta_clipping_histogram.xml @ 1:cdcb0ce84a1b

Uploaded

author	xuebing
date	Fri, 09 Mar 2012 19:45:15 -0500
parents	9071e359b9a3
children

rev	line source
0 9071e359b9a3 Uploaded xuebing parents: diff changeset	1 <tool id="cshl_fasta_clipping_histogram" name="Length Distribution">
9071e359b9a3 Uploaded xuebing parents: diff changeset	2 <description>chart</description>
9071e359b9a3 Uploaded xuebing parents: diff changeset	3 <requirements><requirement type="package">fastx_toolkit</requirement></requirements>
9071e359b9a3 Uploaded xuebing parents: diff changeset	4 <command>fasta_clipping_histogram.pl $input $outfile</command>
9071e359b9a3 Uploaded xuebing parents: diff changeset	5
9071e359b9a3 Uploaded xuebing parents: diff changeset	6 <inputs>
9071e359b9a3 Uploaded xuebing parents: diff changeset	7 <param format="fasta" name="input" type="data" label="Library to analyze" />
9071e359b9a3 Uploaded xuebing parents: diff changeset	8 </inputs>
9071e359b9a3 Uploaded xuebing parents: diff changeset	9
9071e359b9a3 Uploaded xuebing parents: diff changeset	10 <outputs>
9071e359b9a3 Uploaded xuebing parents: diff changeset	11 <data format="png" name="outfile" metadata_source="input" />
9071e359b9a3 Uploaded xuebing parents: diff changeset	12 </outputs>
9071e359b9a3 Uploaded xuebing parents: diff changeset	13 <help>
9071e359b9a3 Uploaded xuebing parents: diff changeset	14
9071e359b9a3 Uploaded xuebing parents: diff changeset	15 What it does
9071e359b9a3 Uploaded xuebing parents: diff changeset	16
9071e359b9a3 Uploaded xuebing parents: diff changeset	17 This tool creates a histogram image of the sequence length distribution in a given FASTA dataset.
9071e359b9a3 Uploaded xuebing parents: diff changeset	18
9071e359b9a3 Uploaded xuebing parents: diff changeset	19 TIP: Use this tool after clipping your library (with the FASTX Clipper tool) to visualize the clipping results.
9071e359b9a3 Uploaded xuebing parents: diff changeset	20
9071e359b9a3 Uploaded xuebing parents: diff changeset	21 -----
9071e359b9a3 Uploaded xuebing parents: diff changeset	22
9071e359b9a3 Uploaded xuebing parents: diff changeset	23 Output Examples
9071e359b9a3 Uploaded xuebing parents: diff changeset	24
9071e359b9a3 Uploaded xuebing parents: diff changeset	25 In the following library, most sequences are 24-mers to 27-mers.
9071e359b9a3 Uploaded xuebing parents: diff changeset	26 This could indicate an abundance of endo-siRNAs (depending, of course, on what you tried to sequence in the first place).
9071e359b9a3 Uploaded xuebing parents: diff changeset	27
9071e359b9a3 Uploaded xuebing parents: diff changeset	28 .. image:: ./static/fastx_icons/fasta_clipping_histogram_1.png
9071e359b9a3 Uploaded xuebing parents: diff changeset	29
9071e359b9a3 Uploaded xuebing parents: diff changeset	30
9071e359b9a3 Uploaded xuebing parents: diff changeset	31 In the following library, most sequences are 19-, 22-, or 23-mers.
9071e359b9a3 Uploaded xuebing parents: diff changeset	32 This could indicate an abundance of miRNAs (depending, of course, on what you tried to sequence in the first place).
9071e359b9a3 Uploaded xuebing parents: diff changeset	33
9071e359b9a3 Uploaded xuebing parents: diff changeset	34 .. image:: ./static/fastx_icons/fasta_clipping_histogram_2.png
9071e359b9a3 Uploaded xuebing parents: diff changeset	35
9071e359b9a3 Uploaded xuebing parents: diff changeset	36
9071e359b9a3 Uploaded xuebing parents: diff changeset	37 -----
9071e359b9a3 Uploaded xuebing parents: diff changeset	38
9071e359b9a3 Uploaded xuebing parents: diff changeset	39
9071e359b9a3 Uploaded xuebing parents: diff changeset	40 Input Formats
9071e359b9a3 Uploaded xuebing parents: diff changeset	41
9071e359b9a3 Uploaded xuebing parents: diff changeset	42 This tool accepts short-read FASTA files. The reads don't have to be short, but each sequence does have to be on a single line, like so::
9071e359b9a3 Uploaded xuebing parents: diff changeset	43
9071e359b9a3 Uploaded xuebing parents: diff changeset	44 >sequence1
9071e359b9a3 Uploaded xuebing parents: diff changeset	45 AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG
9071e359b9a3 Uploaded xuebing parents: diff changeset	46 >sequence2
9071e359b9a3 Uploaded xuebing parents: diff changeset	47 GTGTGTGTGGGAAGTTGACACAGTA
9071e359b9a3 Uploaded xuebing parents: diff changeset	48 >sequence3
9071e359b9a3 Uploaded xuebing parents: diff changeset	49 CCTTGAGATTAACGCTAATCAAGTAAAC
9071e359b9a3 Uploaded xuebing parents: diff changeset	50
9071e359b9a3 Uploaded xuebing parents: diff changeset	51
9071e359b9a3 Uploaded xuebing parents: diff changeset	52 If the sequences span multiple lines::
9071e359b9a3 Uploaded xuebing parents: diff changeset	53
9071e359b9a3 Uploaded xuebing parents: diff changeset	54 >sequence1
9071e359b9a3 Uploaded xuebing parents: diff changeset	55 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
9071e359b9a3 Uploaded xuebing parents: diff changeset	56 TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG
9071e359b9a3 Uploaded xuebing parents: diff changeset	57 aactggtctttacctTTAAGTTG
9071e359b9a3 Uploaded xuebing parents: diff changeset	58
9071e359b9a3 Uploaded xuebing parents: diff changeset	59 Use the FASTA Width Formatter tool to re-format the FASTA into single-line sequences::
9071e359b9a3 Uploaded xuebing parents: diff changeset	60
9071e359b9a3 Uploaded xuebing parents: diff changeset	61 >sequence1
9071e359b9a3 Uploaded xuebing parents: diff changeset	62 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG
9071e359b9a3 Uploaded xuebing parents: diff changeset	63
9071e359b9a3 Uploaded xuebing parents: diff changeset	64
9071e359b9a3 Uploaded xuebing parents: diff changeset	65 -----
9071e359b9a3 Uploaded xuebing parents: diff changeset	66
9071e359b9a3 Uploaded xuebing parents: diff changeset	67
9071e359b9a3 Uploaded xuebing parents: diff changeset	68
9071e359b9a3 Uploaded xuebing parents: diff changeset	69 Multiplicity counts (a.k.a. read counts)
9071e359b9a3 Uploaded xuebing parents: diff changeset	70
9071e359b9a3 Uploaded xuebing parents: diff changeset	71 If the sequence identifier (the text after the '>') contains a dash and a number, the number is treated as a multiplicity count (i.e. how many times that individual sequence was repeated in the original FASTA file, before collapsing).
9071e359b9a3 Uploaded xuebing parents: diff changeset	72
9071e359b9a3 Uploaded xuebing parents: diff changeset	73 Example 1 - The following FASTA file does not have multiplicity counts::
9071e359b9a3 Uploaded xuebing parents: diff changeset	74
9071e359b9a3 Uploaded xuebing parents: diff changeset	75 >seq1
9071e359b9a3 Uploaded xuebing parents: diff changeset	76 GGATCC
9071e359b9a3 Uploaded xuebing parents: diff changeset	77 >seq2
9071e359b9a3 Uploaded xuebing parents: diff changeset	78 GGTCATGGGTTTAAA
9071e359b9a3 Uploaded xuebing parents: diff changeset	79 >seq3
9071e359b9a3 Uploaded xuebing parents: diff changeset	80 GGGATATATCCCCACACACACACAC
9071e359b9a3 Uploaded xuebing parents: diff changeset	81
9071e359b9a3 Uploaded xuebing parents: diff changeset	82 Each sequence counts as one, to produce the following chart:
9071e359b9a3 Uploaded xuebing parents: diff changeset	83
9071e359b9a3 Uploaded xuebing parents: diff changeset	84 .. image:: ./static/fastx_icons/fasta_clipping_histogram_3.png
9071e359b9a3 Uploaded xuebing parents: diff changeset	85
9071e359b9a3 Uploaded xuebing parents: diff changeset	86
9071e359b9a3 Uploaded xuebing parents: diff changeset	87 Example 2 - The following FASTA file has multiplicity counts::
9071e359b9a3 Uploaded xuebing parents: diff changeset	88
9071e359b9a3 Uploaded xuebing parents: diff changeset	89 >seq1-2
9071e359b9a3 Uploaded xuebing parents: diff changeset	90 GGATCC
9071e359b9a3 Uploaded xuebing parents: diff changeset	91 >seq2-10
9071e359b9a3 Uploaded xuebing parents: diff changeset	92 GGTCATGGGTTTAAA
9071e359b9a3 Uploaded xuebing parents: diff changeset	93 >seq3-3
9071e359b9a3 Uploaded xuebing parents: diff changeset	94 GGGATATATCCCCACACACACACAC
9071e359b9a3 Uploaded xuebing parents: diff changeset	95
9071e359b9a3 Uploaded xuebing parents: diff changeset	96 The first sequence counts as 2, the second as 10, the third as 3, to produce the following chart:
9071e359b9a3 Uploaded xuebing parents: diff changeset	97
9071e359b9a3 Uploaded xuebing parents: diff changeset	98 .. image:: ./static/fastx_icons/fasta_clipping_histogram_4.png
9071e359b9a3 Uploaded xuebing parents: diff changeset	99
9071e359b9a3 Uploaded xuebing parents: diff changeset	100 Use the FASTA Collapser tool to create FASTA files with multiplicity counts.
9071e359b9a3 Uploaded xuebing parents: diff changeset	101
9071e359b9a3 Uploaded xuebing parents: diff changeset	102 </help>
9071e359b9a3 Uploaded xuebing parents: diff changeset	103 </tool>
9071e359b9a3 Uploaded xuebing parents: diff changeset	104 <!-- FASTA-Clipping-Histogram is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->
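
The two input conventions described in the help text (one sequence per line, and an optional multiplicity count in the identifier) can be sketched in Python. This is a minimal illustration only, not the code of fasta_clipping_histogram.pl: the names `unwrap_fasta` and `length_histogram` are hypothetical, and the sketch assumes the count is the number after the last dash at the very end of the identifier.

```python
import re
from collections import Counter

# Assumption: the multiplicity count is a trailing "-<number>" on the identifier.
_COUNT_SUFFIX = re.compile(r"-(\d+)$")


def unwrap_fasta(lines):
    """Yield (identifier, sequence) pairs, joining sequences that span several lines."""
    identifier, parts = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(parts)
            identifier, parts = line[1:], []
        elif identifier is not None:
            parts.append(line)
    if identifier is not None:
        yield identifier, "".join(parts)


def length_histogram(lines):
    """Map sequence length to total count, weighting by multiplicity when present."""
    histogram = Counter()
    for identifier, sequence in unwrap_fasta(lines):
        match = _COUNT_SUFFIX.search(identifier)
        weight = int(match.group(1)) if match else 1  # no suffix: count as one
        histogram[len(sequence)] += weight
    return dict(histogram)
```

Given the FASTA from Example 2, `length_histogram` would weight the three sequences as 2, 10, and 3, while the FASTA from Example 1 would count each sequence once. The plotting step is left out; the returned dictionary is the data a length-distribution chart would be drawn from.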
