re_utils: fasta_affixer.xml comparison

comparison fasta_affixer.xml @ 0:a4cd8608ef6b draft

Uploaded

author	petr-novak
date	Mon, 01 Apr 2019 07:56:36 -0400
parents
children	c2c69c6090f0

comparison


--1:000000000000
+:a4cd8608ef6b
+<tool id="fasta_affixer" name="FASTA read name affixer" version="1.0.0">
+<description>Add a prefix and suffix to sequence names</description>
+<command interpreter="python3">
+fasta_affixer.py -f $input -p "$prefix" -s "$suffix" -n $nspace -o $output
+</command>
+<inputs>
+<param format="fasta" type="data" name="input" label="Choose your fasta file" />
+<param name="prefix" type="text" size="10" value="" label="Prefix" help="Prefix to add to every sequence name" />
+<param name="suffix" type="text" size="10" value="" label="Suffix" help="Suffix to add to every sequence name"/>
+<param name="nspace" type="integer" size="10" value="0" min="0" max="1000" label="Number of spaces in name to ignore" help="By default, the sequence name is the text before the first space. To include spaces in the name, enter a positive integer: that many spaces are kept as part of the name, and everything after the next space is omitted."/>
+</inputs>
+<outputs>
+	<data format="fasta" name="output" label="fasta dataset ${input.hid} with modified sequence names" />
+</outputs>
+<tests>
+<test>
+<param name="input" value="single_output.fasta" />
+<param name="prefix" value="TEST" />
+<param name="suffix" value="OK"/>
+<param name="nspace" value="0" />
+<output name="output" value="prefix_suffix.fasta" />
+</test>
+</tests>
+<help>
+**What it does**
+Adds a prefix and suffix to sequence names in FASTA-formatted sequences. This tool is useful
+if you want to do comparative analysis with RepeatExplorer and need to
+append sample codes to sequence identifiers.
+**Example**
+The following fasta file:
+::
+>123454
+acgtactgactagccatgacg
+>234235
+acgtactgactagccatgacg
+is renamed to:
+::
+>prefix123454suffix
+acgtactgactagccatgacg
+>prefix234235suffix
+acgtactgactagccatgacg
+By default, anything after the first space is
+excluded from the sequence name. For example, given the sequence:
+::
+>SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1
+CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC
+IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIHIIIIIFIIIIIIHDHBBIHFIHIIBHHDDHIFHIHIIIHIHGGDFDEI@EGEGFGFEFB@ECG
+when **Number of spaces in name to ignore** is set to 0 (the default), the output will be:
+::
+>prefixSRR352150.23846180suffix
+CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC
+If you want to keep spaces, setting **Number of spaces in name to ignore** to 1 will yield:
+::
+>prefixSRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1suffix
+CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC
+</help>
+</tool>
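
The naming rule described in the help text can be sketched in Python as follows. This is only an illustration of the documented behaviour, not the contents of fasta_affixer.py, which this changeset does not show; the function names `affix_name` and `affix_fasta_lines` are hypothetical, and the sketch assumes names are delimited by single space characters.

```python
def affix_name(header, prefix="", suffix="", nspace=0):
    """Return the affixed sequence name for a header given without the '>'.

    Assumption: the name is the header text up to the (nspace + 1)-th space,
    so nspace=0 keeps only the text before the first space.
    """
    fields = header.split(" ")
    # Keep the first nspace + 1 space-separated fields and drop the rest.
    name = " ".join(fields[: nspace + 1])
    return f"{prefix}{name}{suffix}"


def affix_fasta_lines(lines, prefix="", suffix="", nspace=0):
    """Yield lines with header lines renamed and all other lines unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield ">" + affix_name(line[1:], prefix, suffix, nspace)
        else:
            yield line
```

With `nspace=0` the header `SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1` keeps only `SRR352150.23846180` before the affixes are added; with `nspace=1` the whole header is kept, matching the two examples in the help section.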
