comparison fastx_toolkit-0.0.6/galaxy/tools/fastx_toolkit_with_gzip_and_output_label/fastx_collapser.xml @ 3:997f5136985f draft default tip

Uploaded
author xilinxu
date Thu, 14 Aug 2014 04:52:17 -0400
2:dfe9332138cf 3:997f5136985f
1 <tool id="cshl_fastx_collapser" name="Collapse">
2 <description>sequences</description>
3 <command>zcat -f '$input' | fastx_collapser -v -o '$output' </command>
4
5 <inputs>
6 <param format="fastqsolexa,fasta" name="input" type="data" label="Library to collapse" />
7
8 <!--
9 <param name="GZIPOUT" type="select" label="Compress output file (using GZIP) ">
10 <option value="">no</option>
11 <option value="-g">yes</option>
12 </param>
13 -->
14 </inputs>
15
16 <tests>
17 <test>
18 <param name="input" value="fasta_collapser1.fasta" />
19 <!-- GZIPOUT is commented out in the inputs above, so the test does not set it: <param name="GZIPOUT" value=""/> -->
20 <output name="output" file="fasta_collapser1.out" />
21 </test>
22 </tests>
23
24 <outputs>
25 <data format="fasta" name="output" metadata_source="input" label="$input.tag collapsed" />
26 </outputs>
27 <help>
28
29 **What it does**
30
31 This tool collapses identical sequences in a FASTA file into a single sequence and records how many times each one occurred.
32
33 --------
34
35 **Example**
36
37 Example Input File (Sequence "ATAT" appears multiple times)::
38
39 >CSHL_2_FC0042AGLLOO_1_1_605_414
40 TGCG
41 >CSHL_2_FC0042AGLLOO_1_1_537_759
42 ATAT
43 >CSHL_2_FC0042AGLLOO_1_1_774_520
44 TGGC
45 >CSHL_2_FC0042AGLLOO_1_1_742_502
46 ATAT
47 >CSHL_2_FC0042AGLLOO_1_1_781_514
48 TGAG
49 >CSHL_2_FC0042AGLLOO_1_1_757_487
50 TTCA
51 >CSHL_2_FC0042AGLLOO_1_1_903_769
52 ATAT
53 >CSHL_2_FC0042AGLLOO_1_1_724_499
54 ATAT
55
56 Example Output file::
57
58 >1-1
59 TGCG
60 >2-4
61 ATAT
62 >3-1
63 TGGC
64 >4-1
65 TGAG
66 >5-1
67 TTCA
68
69 .. class:: infomark
70
71 Original Sequence Names / Lane descriptions (e.g. "CSHL_2_FC0042AGLLOO_1_1_742_502") are discarded.
72
73 The output sequence name is composed of two numbers: the first is the sequence's number, the second is the multiplicity value.
74
75 The following output::
76
77 >2-4
78 ATAT
79
80 means that the sequence "ATAT" is the second sequence in the output file, and that it appeared 4 times in the input FASTA file.
81
82 </help>
83 </tool>
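
The collapsing behaviour described in the help text can be sketched in a few lines of Python. This is only an illustration of the idea, not the fastx_collapser implementation: the function names `collapse_fasta` and `parse_collapsed_header` are hypothetical, the sketch assumes each sequence occupies a single line (as in the example input), and it numbers sequences in order of first appearance purely to reproduce the example output shown in the help. The real tool may order its output differently.

```python
def collapse_fasta(lines):
    """Collapse identical sequences from FASTA-formatted lines.

    Assumption: one sequence per line, as in the help example.
    Returns a list of (header, sequence) pairs, where the header is
    ">N-M": N is the sequence's number, M is its multiplicity.
    """
    counts = {}  # dicts keep insertion order in Python 3.7+
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # original sequence names are discarded
        counts[line] = counts.get(line, 0) + 1
    return [
        (f">{number}-{multiplicity}", sequence)
        for number, (sequence, multiplicity) in enumerate(counts.items(), start=1)
    ]


def parse_collapsed_header(header):
    """Split a collapsed header such as ">2-4" into (number, multiplicity)."""
    number, multiplicity = header.lstrip(">").split("-")
    return int(number), int(multiplicity)


example_input = """\
>CSHL_2_FC0042AGLLOO_1_1_605_414
TGCG
>CSHL_2_FC0042AGLLOO_1_1_537_759
ATAT
>CSHL_2_FC0042AGLLOO_1_1_774_520
TGGC
>CSHL_2_FC0042AGLLOO_1_1_742_502
ATAT
>CSHL_2_FC0042AGLLOO_1_1_781_514
TGAG
>CSHL_2_FC0042AGLLOO_1_1_757_487
TTCA
>CSHL_2_FC0042AGLLOO_1_1_903_769
ATAT
>CSHL_2_FC0042AGLLOO_1_1_724_499
ATAT
""".splitlines()

collapsed = collapse_fasta(example_input)
for header, sequence in collapsed:
    print(header)
    print(sequence)
```

Run on the example input from the help text, the sketch yields five records, with "ATAT" reported as the second sequence with a multiplicity of 4.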