comparison fasta_affixer.xml @ 0:a4cd8608ef6b draft
Uploaded
author | petr-novak |
---|---|
date | Mon, 01 Apr 2019 07:56:36 -0400 |
parents | |
children | c2c69c6090f0 |
comparison
-1:000000000000 | 0:a4cd8608ef6b |
---|---|
1 <tool id="fasta_affixer" name="FASTA read name affixer" version="1.0.0"> | |
2 <description> Tool for adding a prefix and suffix to sequence names </description> | |
3 <command interpreter="python3"> | |
4 fasta_affixer.py -f $input -p "$prefix" -s "$suffix" -n $nspace -o $output | |
5 </command> | |
6 | |
7 <inputs> | |
8 <param format="fasta" type="data" name="input" label="Choose your fasta file" /> | |
9 <param name="prefix" type="text" size="10" value="" label="Prefix" help="Enter a prefix to add to all sequence names" /> | |
10 <param name="suffix" type="text" size="10" value="" label="Suffix" help="Enter a suffix to add to all sequence names"/> | |
11 <param name="nspace" type="integer" size="10" value="0" min="0" max="1000" label="Number of spaces in name to ignore" help="The sequence name is the text before the first space. To keep spaces in the name, enter a positive integer: that many spaces are kept as part of the name, and everything after the next space is omitted."/> | |
12 </inputs> | |
13 | |
14 | |
15 <outputs> | |
16 <data format="fasta" name="output" label="fasta dataset ${input.hid} with modified sequence names" /> | |
17 </outputs> | |
18 | |
19 <tests> | |
20 <test> | |
21 <param name="input" value="single_output.fasta" /> | |
22 <param name="prefix" value="TEST" /> | |
23 <param name="suffix" value="OK"/> | |
24 <param name="nspace" value="0" /> | |
25 <output name="output" value="prefix_suffix.fasta" /> | |
26 </test> | |
27 </tests> | |
28 <help> | |
29 **What it does** | |
30 | |
31 Tool for adding a prefix and suffix to sequence names in FASTA-formatted sequences. This tool is useful | |
32 if you want to do comparative analysis with RepeatExplorer and need to | |
33 append sample codes to sequence identifiers. | |
34 | |
35 **Example** | |
36 The following fasta file: | |
37 | |
38 :: | |
39 | |
40 >123454 | |
41 acgtactgactagccatgacg | |
42 >234235 | |
43 acgtactgactagccatgacg | |
44 | |
45 is renamed to: | |
46 | |
47 :: | |
48 | |
49 >prefix123454suffix | |
50 acgtactgactagccatgacg | |
51 >prefix234235suffix | |
52 acgtactgactagccatgacg | |
53 | |
54 | |
55 By default, anything after the first space is | |
56 excluded from the sequence name. For the example sequence: | |
57 | |
58 :: | |
59 | |
60 >SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1 | |
61 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC | |
62 IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIHIIIIIFIIIIIIHDHBBIHFIHIIBHHDDHIFHIHIIIHIHGGDFDEI@EGEGFGFEFB@ECG | |
63 | |
64 when **Number of spaces in name to ignore** is set to 0 (default), the output will be: | |
65 | |
66 :: | |
67 | |
68 >prefixSRR352150.23846180suffix | |
69 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC | |
70 | |
71 | |
72 If you want to keep the text after the first space, setting **Number of spaces in name to ignore** to 1 will yield: | |
73 | |
74 :: | |
75 | |
76 >prefixSRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1suffix | |
77 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC | |
78 | |
79 | |
80 </help> | |
81 </tool> |
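
The renaming rule described in the help text above can be sketched in Python. This is only an illustration of the documented behaviour, not the contents of `fasta_affixer.py`, which is not shown here; the function names `affix_name` and `affix_fasta` are hypothetical, and the sketch assumes that names are delimited by single space characters.

```python
def affix_name(header, prefix="", suffix="", nspace=0):
    """Return a FASTA header with prefix and suffix added to the name.

    Assumption: the name is the header text up to the (nspace + 1)-th
    space; everything after that space is dropped.
    """
    # Drop the leading '>' and any trailing newline.
    body = header[1:] if header.startswith(">") else header
    fields = body.rstrip("\n").split(" ")
    # Keep the first nspace + 1 space-separated fields as the name.
    name = " ".join(fields[: nspace + 1])
    return ">" + prefix + name + suffix


def affix_fasta(lines, prefix="", suffix="", nspace=0):
    """Yield FASTA lines with renamed headers; sequence lines are unchanged."""
    for line in lines:
        if line.startswith(">"):
            yield affix_name(line, prefix, suffix, nspace)
        else:
            yield line.rstrip("\n")


if __name__ == "__main__":
    header = ">SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1"
    print(affix_name(header, "prefix", "suffix", nspace=0))
    print(affix_name(header, "prefix", "suffix", nspace=1))
```

With `nspace=0` the sketch keeps only `SRR352150.23846180` as the name; with `nspace=1` it keeps the text up to the second space, which for this header is the whole line.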