comparison fastx_toolkit-0.0.6/galaxy/tools/fastx_toolkit_with_gzip_and_output_label/fastx_collapser.xml @ 3:997f5136985f draft default tip
Uploaded
author | xilinxu |
---|---|
date | Thu, 14 Aug 2014 04:52:17 -0400 |
parents | |
children |
2:dfe9332138cf | 3:997f5136985f |
---|---|
1 <tool id="cshl_fastx_collapser" name="Collapse"> | |
2 <description>sequences</description> | |
3 <command>zcat -f '$input' | fastx_collapser -v -o '$output' </command> | |
4 | |
5 <inputs> | |
6 <param format="fastqsolexa,fasta" name="input" type="data" label="Library to collapse" /> | |
7 | |
8 <!-- | |
9 <param name="GZIPOUT" type="select" label="Compress output file (using GZIP) "> | |
10 <option value="">no</option> | |
11 <option value="-g">yes</option> | |
12 </param> | |
13 --> | |
14 </inputs> | |
15 | |
16 <tests> | |
17 <test> | |
18 <param name="input" value="fasta_collapser1.fasta" /> | |
19 <!-- <param name="GZIPOUT" value=""/> (disabled: the GZIPOUT input above is commented out) --> | |
20 <output name="output" file="fasta_collapser1.out" /> | |
21 </test> | |
22 </tests> | |
23 | |
24 <outputs> | |
25 <data format="fasta" name="output" metadata_source="input" label="$input.tag collapsed" /> | |
26 </outputs> | |
27 <help> | |
28 | |
29 **What it does** | |
30 | |
31 This tool collapses identical sequences in a FASTA file into a single sequence. | |
32 | |
33 -------- | |
34 | |
35 **Example** | |
36 | |
37 Example Input File (Sequence "ATAT" appears multiple times):: | |
38 | |
39 >CSHL_2_FC0042AGLLOO_1_1_605_414 | |
40 TGCG | |
41 >CSHL_2_FC0042AGLLOO_1_1_537_759 | |
42 ATAT | |
43 >CSHL_2_FC0042AGLLOO_1_1_774_520 | |
44 TGGC | |
45 >CSHL_2_FC0042AGLLOO_1_1_742_502 | |
46 ATAT | |
47 >CSHL_2_FC0042AGLLOO_1_1_781_514 | |
48 TGAG | |
49 >CSHL_2_FC0042AGLLOO_1_1_757_487 | |
50 TTCA | |
51 >CSHL_2_FC0042AGLLOO_1_1_903_769 | |
52 ATAT | |
53 >CSHL_2_FC0042AGLLOO_1_1_724_499 | |
54 ATAT | |
55 | |
56 Example Output file:: | |
57 | |
58 >1-1 | |
59 TGCG | |
60 >2-4 | |
61 ATAT | |
62 >3-1 | |
63 TGGC | |
64 >4-1 | |
65 TGAG | |
66 >5-1 | |
67 TTCA | |
68 | |
69 .. class:: infomark | |
70 | |
71 Original Sequence Names / Lane descriptions (e.g. "CSHL_2_FC0042AGLLOO_1_1_742_502") are discarded. | |
72 | |
73 The output sequence name is composed of two numbers: the first is the sequence's number, the second is the multiplicity value. | |
74 | |
75 The following output:: | |
76 | |
77 >2-4 | |
78 ATAT | |
79 | |
80 means that the sequence "ATAT" is the second sequence in the file, and it appeared 4 times in the input FASTA file. | |
81 | |
82 </help> | |
83 </tool> |
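
The naming scheme described in the help text above can be illustrated with a minimal Python sketch. This is not the FASTX-Toolkit implementation: the function name `collapse_fasta` and the constant `EXAMPLE_INPUT` are hypothetical, each record is assumed to hold its sequence on a single line, and sequences are numbered in order of first appearance only because that reproduces the example output shown in the help. The real `fastx_collapser` may order its output differently.

```python
from collections import OrderedDict

# Example input copied from the tool's help text ("ATAT" appears four times).
EXAMPLE_INPUT = """\
>CSHL_2_FC0042AGLLOO_1_1_605_414
TGCG
>CSHL_2_FC0042AGLLOO_1_1_537_759
ATAT
>CSHL_2_FC0042AGLLOO_1_1_774_520
TGGC
>CSHL_2_FC0042AGLLOO_1_1_742_502
ATAT
>CSHL_2_FC0042AGLLOO_1_1_781_514
TGAG
>CSHL_2_FC0042AGLLOO_1_1_757_487
TTCA
>CSHL_2_FC0042AGLLOO_1_1_903_769
ATAT
>CSHL_2_FC0042AGLLOO_1_1_724_499
ATAT
"""


def collapse_fasta(lines):
    """Collapse identical sequences (hypothetical helper, illustration only).

    Assumes one sequence line per record. Returns FASTA lines whose headers
    have the form '>number-multiplicity'; original names are discarded.
    """
    counts = OrderedDict()  # sequence -> multiplicity, in first-seen order
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and the original header lines
        counts[line] = counts.get(line, 0) + 1

    collapsed = []
    for number, (sequence, multiplicity) in enumerate(counts.items(), start=1):
        collapsed.append(">{}-{}".format(number, multiplicity))
        collapsed.append(sequence)
    return collapsed


if __name__ == "__main__":
    print("\n".join(collapse_fasta(EXAMPLE_INPUT.splitlines())))
```

Run on the help text's example input, the sketch yields the same five records as the example output, including `>2-4` for "ATAT".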