view dnaclust.xml @ 0:84be1fe7e34a draft

Uploaded
author john-mccallum
date Thu, 06 Sep 2012 23:05:22 -0400

<?xml version="1.0"?>
<tool id="dnaclust_1" name="dnaclust">
  <description>Cluster sequences into OTUs using DNACLUST</description>
  <command>dnaclust -s $similarity -i $inputFastaFile > $outputfile</command>
  <inputs>
    <param format="fasta" name="inputFastaFile" type="data" label="Fasta input file"/>
    <param name="similarity" size="10" type="text" value="0.99" label="Similarity between the cluster center and the cluster sequences"/>
  </inputs>
  <outputs>

    <data format="tabular" name="outputfile"/>
  </outputs>

<help>

.. class:: infomark

**TIP**

See the DNACLUST documentation at http://dnaclust.sourceforge.net/



Each line of the output file contains the IDs of the sequences in one cluster. The first ID on each line is the cluster representative.

**Example**

To cluster a set of 16S rRNA fragments at 0.98 similarity from the command line, use::

    ./dnaclust file.fasta -l -s 0.98 > clusters

You can optionally specify a k-mer length for the filter. Longer k-mers use more memory but make the filter more specific. The default, log_4(median length), should be good for most cases.
</help>
</tool>
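
The help text above describes the output as one cluster per line, with the representative listed first. A minimal Python sketch of reading such a file downstream might look like the following. It is not part of the tool wrapper; the function name `read_clusters` and the sample IDs are hypothetical, and the sketch assumes that sequence IDs contain no whitespace, so each line can be split on any run of tabs or spaces.

```python
import io


def read_clusters(handle):
    """Map each cluster representative to the full list of IDs in its cluster.

    Assumes one cluster per line, IDs separated by whitespace, and the
    representative as the first ID on the line.
    """
    clusters = {}
    for line in handle:
        ids = line.split()
        if not ids:
            # Skip blank lines rather than treating them as empty clusters.
            continue
        clusters[ids[0]] = ids
    return clusters


# Hypothetical three-cluster output, standing in for a real result file.
example = io.StringIO("seq1\tseq4\tseq7\nseq2\nseq3\tseq5\n")
clusters = read_clusters(example)

for representative, members in clusters.items():
    print(representative, len(members))
```

For a real run, passing an open file handle on the tool's output in place of `example` should work the same way, provided the assumptions above hold for that file.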