comparison fasta_compute_length.xml @ 2:de2db1bdfbf8 draft

planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/tools/fasta_compute_length commit a1517c9d22029095120643bbe2c8fa53754dd2b7
author devteam
date Wed, 11 Nov 2015 12:13:18 -0500
parents d8cc2c8eef14
children 2051602a5f97
1:d8cc2c8eef14 2:de2db1bdfbf8
1 <tool id="fasta_compute_length" name="Compute sequence length" version="1.0.0"> 1 <tool id="fasta_compute_length" name="Compute sequence length" version="1.0.1">
2 <description></description> 2 <description></description>
3 <command interpreter="python">fasta_compute_length.py $input $output $keep_first</command> 3 <command interpreter="python">fasta_compute_length.py $input $output $keep_first $keep_first_word</command>
4 <inputs> 4 <inputs>
5 <param name="input" type="data" format="fasta" label="Compute length for these sequences"/> 5 <param name="input" type="data" format="fasta" label="Compute length for these sequences"/>
6 <param name="keep_first" type="integer" size="5" value="0" label="How many title characters to keep?" help="'0' = keep the whole thing"/> 6 <param name="keep_first" type="integer" value="0" label="How many title characters to keep?" help="'0' = keep the whole thing"/>
7 </inputs> 7 <param name="keep_first_word" type="boolean" truevalue="id_only" falsevalue="id_and_desc"
8 <outputs> 8 selected="false" label="Strip fasta description from header?"
9 <data name="output" format="tabular"/> 9 help="Stripping the description will truncate the fasta header to just the sequence ID. Otherwise the header description will be kept. This step is done before the 'How many characters to keep' option."/>
10 </outputs> 10
11 <tests> 11 </inputs>
12 <test> 12 <outputs>
13 <param name="input" value="454.fasta" /> 13 <data name="output" format="tabular"/>
14 <param name="keep_first" value="0"/> 14 </outputs>
15 <output name="output" file="fasta_tool_compute_length_1.out" /> 15 <tests>
16 </test> 16 <test>
17 17 <param name="input" value="454.fasta" />
18 <test> 18 <param name="keep_first" value="0"/>
19 <param name="input" value="extract_genomic_dna_out1.fasta" /> 19 <param name="keep_first_word" value="id_and_desc" />
20 <param name="keep_first" value="0"/> 20 <output name="output" file="fasta_tool_compute_length_1.out" />
21 <output name="output" file="fasta_tool_compute_length_2.out" /> 21 </test>
22 </test> 22
23 23 <test>
24 <test> 24 <param name="input" value="extract_genomic_dna_out1.fasta" />
25 <param name="input" value="454.fasta" /> 25 <param name="keep_first" value="0"/>
26 <param name="keep_first" value="14"/> 26 <param name="keep_first_word" value="id_and_desc" />
27 <output name="output" file="fasta_tool_compute_length_3.out" /> 27 <output name="output" file="fasta_tool_compute_length_2.out" />
28 </test> 28 </test>
29 </tests> 29
30 <help> 30 <test>
31 <param name="input" value="454.fasta" />
32 <param name="keep_first" value="14"/>
33 <param name="keep_first_word" value="id_and_desc" />
34 <output name="output" file="fasta_tool_compute_length_3.out" />
35 </test>
36 </tests>
37 <help>
31 38
32 **What it does** 39 **What it does**
33 40
34 This tool counts the length of each fasta sequence in the file. The output file has two columns per line (separated by tab): fasta titles and lengths of the sequences. The option *How many characters to keep?* allows to select a specified number of letters from the beginning of each FASTA entry. 41 This tool counts the length of each fasta sequence in the file. The output file has two columns per line (separated by tab): fasta titles and lengths of the sequences. The option *How many characters to keep?* allows to select a specified number of letters from the beginning of each FASTA entry.
35 42
36 ----- 43 -----
37 44
38 **Example** 45 **Example**
39 46
40 Suppose you have the following FASTA formatted sequences from a Roche (454) FLX sequencing run:: 47 Suppose you have the following FASTA formatted sequences from a Roche (454) FLX sequencing run::
41 48
44 TTCGGCCGGCCCTTCTCGTCGAGGAATGACACCAGCGCTTCGCCCACG 51 TTCGGCCGGCCCTTCTCGTCGAGGAATGACACCAGCGCTTCGCCCACG
45 &gt;EYKX4VC02D4GS2 length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_ 52 &gt;EYKX4VC02D4GS2 length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_
46 AATAAAACTAAATCAGCAAAGACTGGCAAATACTCACAGGCTTATACAATACAAATGTAAfa 53 AATAAAACTAAATCAGCAAAGACTGGCAAATACTCACAGGCTTATACAATACAAATGTAAfa
47 54
48 Running this tool while setting **How many characters to keep?** to **14** will produce this:: 55 Running this tool while setting **How many characters to keep?** to **14** will produce this::
49 56
50 EYKX4VC02EQLO5 108 57 EYKX4VC02EQLO5 108
51 EYKX4VC02D4GS2 60 58 EYKX4VC02D4GS2 60
59
60 However, if your IDs are not all the same length, you may wish to just keep the fasta ID, and not the description::
61
62 &gt;EYKX4VC02EQLO5 length=108 xy=1826_0455 region=2 run=R_2007_11_07_16_15_57_
63 TCCGCGCCGAGCATGCCCATCTTGGATTCCGGCGCGATGACCATCGCCCGCTCCACCACG
64 TTCGGCCGGCCCTTCTCGTCGAGGAATGACACCAGCGCTTCGCCCACG
65 &gt;EYKX4VC length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_
66 AATAAAACTAAATCAGCAAAGACTGGCAAATACTCACAGGCTTATACAATACAAATGTAAfa
67
68 Running this tool with **Strip fasta description from header** set to **True** and **How many characters to keep?** set to **0** will produce::
69
70 EYKX4VC02EQLO5 108
71 EYKX4VC 60
52 72
53 73
54 </help> 74 </help>
75 <citations>
76 <citation type="doi">10.1093/bioinformatics/btq281</citation>
77 </citations>
55 </tool> 78 </tool>
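
The diff shows only the XML wrapper, not fasta_compute_length.py itself, so the following is a minimal sketch of the behaviour the help text describes and should not be read as the script's actual code. The function name `compute_lengths` and its arguments are hypothetical; `keep_first` and `keep_first_word` mirror the wrapper parameters, with `keep_first_word=True` standing in for the `id_only` value. As the help for the new option states, the description is stripped before the title is truncated.

```python
def compute_lengths(fasta_text, keep_first=0, keep_first_word=False):
    """Return a list of (title, length) pairs, one per FASTA record.

    Hypothetical helper for illustration only; not the tool's own code.
    """
    results = []
    title = None
    length = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if title is not None:
                results.append((title, length))
            title = line[1:]
            # Step 1: optionally keep only the sequence ID (first word).
            if keep_first_word:
                parts = title.split()
                title = parts[0] if parts else ""
            # Step 2: optionally truncate; 0 means keep the whole title.
            if keep_first > 0:
                title = title[:keep_first]
            length = 0
        elif title is not None:
            # Sequence lines may wrap, so lengths are summed per record.
            length += len(line)
    if title is not None:
        results.append((title, length))
    return results


if __name__ == "__main__":
    example = ">seqA some description\nACGTAC\nGT\n>seqB\nGGG\n"
    # Print in the tool's two-column, tab-separated layout.
    for name, size in compute_lengths(example, keep_first_word=True):
        print(f"{name}\t{size}")
```

Because the description is removed first, a non-zero `keep_first` applied together with `keep_first_word` can only shorten the ID further; it never brings part of the description back.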