view fastashuffle2.xml @ 13:292186c14b08

Uploaded
author xuebing
date Sat, 10 Mar 2012 08:17:36 -0500
parents b7f1d9f8f3bc

<tool id="seqshuffle2" name="shuffle sequence">
  <description>preserving dinucleotide frequency</description>
  <command interpreter="python">fasta-dinucleotide-shuffle.py -f $input -t $tag -c $n -s $seed > $output </command>
  <inputs>
    <param name="input" format="fasta" type="data" label="Original sequence file"/>
    <param name="tag" type="text" size="40" value="-shuffled" label="Tag added to shuffled sequence names"/>
    <param name="n" type="integer" value="1" label="Number of shuffled copies for each sequence"/>
    <param name="seed" type="integer" value="1" label="Random seed" help="The same seed gives the same shuffled sequences"/>
  </inputs>
  <outputs>
    <data format="fasta" name="output" />
  </outputs>
  <help>

**What it does**

This tool shuffles each sequence in the input file while preserving that sequence's dinucleotide frequencies.

The code implements the Altschul-Erickson dinucleotide shuffle algorithm, described in S.F. Altschul and B.W. Erickson, "Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage", Mol. Biol. Evol., 2(6):526-538, 1985.
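
The idea behind the algorithm can be sketched roughly as follows. This is a minimal illustration, not the code in fasta-dinucleotide-shuffle.py: the function names ``dinucleotide_shuffle`` and ``_reaches`` are hypothetical, and the sketch assumes the usual formulation in which each sequence is treated as a walk through a graph of letters, one "final" outgoing edge is chosen per letter so that every letter still leads to the last letter, and the remaining edges are shuffled.

```python
import random
from collections import Counter


def _reaches(start, last, final_edge):
    """Follow the chosen final edges from start; True if the path ends at last."""
    seen = set()
    node = start
    while node != last:
        if node in seen:  # ran into a cycle that never reaches `last`
            return False
        seen.add(node)
        node = final_edge[node]
    return True


def dinucleotide_shuffle(seq, rng):
    """Return a random rearrangement of seq with identical dinucleotide counts.

    Hypothetical helper for illustration; `rng` is a random.Random instance.
    """
    if len(seq) == 0:
        return seq
    # successors[x] lists every letter that directly follows x in seq.
    successors = {}
    for a, b in zip(seq, seq[1:]):
        successors.setdefault(a, []).append(b)
    last = seq[-1]
    # Every letter except the last one needs a final outgoing edge.
    sources = [x for x in successors if x != last]

    # Rejection sampling: redraw the final edges until each source reaches `last`.
    while True:
        final_edge = {x: rng.choice(successors[x]) for x in sources}
        if all(_reaches(x, last, final_edge) for x in sources):
            break

    # Shuffle the other edges of each letter and place its final edge at the end.
    order = {}
    for x, succ in successors.items():
        rest = list(succ)
        if x in final_edge:
            rest.remove(final_edge[x])
        rng.shuffle(rest)
        if x in final_edge:
            rest.append(final_edge[x])
        order[x] = rest

    # Walk from the first letter, using each letter's edges in the chosen order.
    out = [seq[0]]
    used = Counter()
    for _ in range(len(seq) - 1):
        current = out[-1]
        out.append(order[current][used[current]])
        used[current] += 1
    return "".join(out)
```

Calling ``dinucleotide_shuffle("ACGTTGCAACGGTTAACCGT", random.Random(1))`` twice with the same seed returns the same sequence, which mirrors the purpose of the random seed parameter above. In this formulation the first and last letters of the sequence stay in place.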

Code adapted from http://bioinformatics.bc.edu/clotelab/RNAdinucleotideShuffle/dinucleotideShuffle.html

  </help>
</tool>