comparison fastq_name_affixer.xml @ 3:e320ef2d105a draft
Uploaded
author | petr-novak |
---|---|
date | Thu, 05 Sep 2019 09:04:56 -0400 |
parents | |
children | c2c69c6090f0 |
2:ff658cf87f16 | 3:e320ef2d105a |
---|---|
1 <tool id="names_affixer" name="FASTQ Read name affixer" version="1.0.0"> | |
2 <description> Tool appending a prefix and suffix to sequence names </description> | |
3 <command interpreter="python"> | |
4 ${__tool_directory__}/name_affixer.py -f $input -p "$prefix" -s "$suffix" -n $nspace > $output | |
5 </command> | |
6 | |
7 <inputs> | |
8 <param format="fastq" type="data" name="input" label="Choose your fastq file" /> | |
9 <param name="prefix" type="text" size="10" value="" label="Prefix" help="Enter a prefix to be added to all sequence names" /> | |
10 <param name="suffix" type="text" size="10" value="" label="Suffix" help="Enter a suffix to be added to all sequence names"/> | |
11 <param name="nspace" type="integer" size="10" value="0" min="0" max="1000" label="Number of spaces in name to ignore" help="The sequence name is the string before the first space. To include spaces in the name, enter a positive integer. Characters beyond the ignored spaces are omitted."/> | |
12 </inputs> | |
13 | |
14 | |
15 <outputs> | |
16 <data format="fastq" name="output" label="fastq dataset ${input.hid} with modified sequence names" /> | |
17 </outputs> | |
18 | |
19 <help> | |
20 **What it does**
21 | |
22 Tool for adding a prefix and a suffix to sequence names in FASTQ-formatted files. | |
23 | |
24 **Example** | |
25 | |
26 The following Solexa-FASTQ file: | |
27 | |
28 :: | |
29 | |
30 @CSHL_4_FC042GAMMII_2_1_517_596 | |
31 GGTCAATGATGAGTTGGCACTGTAGGCACCATCAAT | |
32 +CSHL_4_FC042GAMMII_2_1_517_596 | |
33 40 40 40 40 40 40 40 40 40 40 38 40 40 40 40 40 14 40 40 40 40 40 36 40 13 14 24 24 9 24 9 40 10 10 15 40 | |
34 | |
35 is renamed to: | |
36 | |
37 :: | |
38 | |
39 @prefixCSHL_4_FC042GAMMII_2_1_517_596suffix | |
40 GGTCAATGATGAGTTGGCACTGTAGGCACCATCAAT | |
41 +prefixCSHL_4_FC042GAMMII_2_1_517_596suffix | |
42 40 40 40 40 40 40 40 40 40 40 38 40 40 40 40 40 14 40 40 40 40 40 36 40 13 14 24 24 9 24 9 40 10 10 15 40 | |
43 | |
44 A record with a different header format: | |
45 | |
46 | |
47 :: | |
48 | |
49 @HISEQ1:92:c0190acxx:8:1101:1252:2230 2:N:0:CGATGT | |
50 AGAGGAAAAAACATAGTTCTTGTCTAAAAAAATCCCTTGAAAAAGGGCAGATGTATAGAAATAGAAAATTTCAAAGAAAAACTCTCTACAAATGGAAGAGA | |
51 + | |
52 CCCFFFFFHHHHHJJJJIJJJJJJJJJJJJJJJIJJJJJIIJJJJJJGIJIJIHHHHHHHHFFFFFFDEEEEEDCDDDDDDDCCDDDEDDDDD>CCCCB@9 | |
53 | |
54 is renamed to: | |
55 | |
56 :: | |
57 | |
58 @prefixHISEQ1:92:c0190acxx:8:1101:1252:2230suffix | |
59 AGAGGAAAAAACATAGTTCTTGTCTAAAAAAATCCCTTGAAAAAGGGCAGATGTATAGAAATAGAAAATTTCAAAGAAAAACTCTCTACAAATGGAAGAGA | |
60 + | |
61 CCCFFFFFHHHHHJJJJIJJJJJJJJJJJJJJJIJJJJJIIJJJJJJGIJIJIHHHHHHHHFFFFFFDEEEEEDCDDDDDDDCCDDDEDDDDD>CCCCB@9 | |
62 | |
63 Note that the string after the first space is omitted! | |
64 | |
65 Sequence names sometimes contain spaces that delimit the actual name. By default, anything after the first space is | |
66 excluded from the sequence name. For the example sequence: | |
67 | |
68 :: | |
69 | |
70 @SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1 | |
71 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC | |
72 + | |
73 IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIHIIIIIFIIIIIIHDHBBIHFIHIIBHHDDHIFHIHIIIHIHGGDFDEI@EGEGFGFEFB@ECG | |
74 | |
75 When **Number of spaces in name to ignore** is set to 0 (the default), the output will be: | |
76 | |
77 :: | |
78 | |
79 @prefixSRR352150.23846180suffix | |
80 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC | |
81 + | |
82 IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIHIIIIIFIIIIIIHDHBBIHFIHIIBHHDDHIFHIHIIIHIHGGDFDEI@EGEGFGFEFB@ECG | |
83 | |
84 If you want to keep spaces, setting **Number of spaces in name to ignore** to 1 will yield: | |
85 | |
86 :: | |
87 | |
88 @prefixSRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1suffix | |
89 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC | |
90 + | |
91 IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIHIIIIIFIIIIIIHDHBBIHFIHIIBHHDDHIFHIHIIIHIHGGDFDEI@EGEGFGFEFB@ECG | |
92 | |
93 | |
94 </help> | |
95 </tool> |
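
The renaming rule described in the help text can be sketched in Python as follows. This is not the code of `name_affixer.py`, which is not shown here; the function names `affix_name` and `affix_fastq` are hypothetical. It also assumes that a value of N for **Number of spaces in name to ignore** keeps the text up to the (N+1)-th space, and that a bare `+` separator line is left unchanged, which is consistent with the examples in the help section.

```python
def affix_name(header, prefix="", suffix="", nspace=0):
    """Add prefix and suffix to the name in one FASTQ header line.

    Hypothetical helper: `header` is an "@..." or "+..." line.
    """
    header = header.rstrip("\n")
    if not header:
        return header
    marker, rest = header[0], header[1:]
    if not rest:
        # A bare "+" separator carries no name, so it is left unchanged.
        return marker
    # Assumption: the name is the first nspace + 1 space-delimited fields.
    name = " ".join(rest.split(" ")[: nspace + 1])
    return f"{marker}{prefix}{name}{suffix}"


def affix_fastq(lines, prefix="", suffix="", nspace=0):
    """Yield renamed lines, assuming strict 4-line FASTQ records."""
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        # Lines 0 and 2 of each record are the "@" and "+" header lines.
        if i % 4 in (0, 2):
            yield affix_name(line, prefix, suffix, nspace)
        else:
            yield line


if __name__ == "__main__":
    record = [
        "@SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1",
        "CTGGATTC",
        "+",
        "IIIIIIII",
    ]
    for out in affix_fastq(record, prefix="prefix", suffix="suffix", nspace=0):
        print(out)
```

Working on line position within each 4-line record, rather than on the leading character, avoids misreading a quality line that happens to start with `@` or `+` as a header.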