view pynast/pynast.xml @ 3:17b2737094ce draft default tip

Uploaded
author qfab
date Sun, 01 Jun 2014 19:49:03 -0400
parents 2813a698bf9c

<tool id='pynast' name='PyNAST' version='1.0'>
  <description>PyNAST is a reimplementation of the NAST sequence aligner for adding new 16S rDNA sequences to existing 16S rDNA alignments.</description>
  <requirements>
     <requirement type="package" version="1.5.2">pycogent</requirement>
     <requirement type="package" version="1.22.2">uclust</requirement>
     <requirement type="binary">@BINARY@</requirement>
     <requirement type="package" version="1.2.2">pynast</requirement>
  </requirements>
  <command>
   which pynast &amp;&amp; which uclust &amp;&amp; python -c "import numpy; print numpy.version.version" &amp;&amp; python -c "import cogent; print cogent.version" &amp;&amp; pynast -i $input -t $template -l $length -a $aligned_output -g $log -f $fail 2&gt;&amp;1
  </command>
  <inputs>
    <param name="input" type="data" format="fasta" label="Candidate file (FASTA format)" help="A set of sequences to align" />
    <param name="template" type="data" format="fasta" label="Template alignment file (FASTA format)" help="A set of pre-aligned sequences against which the candidates will be aligned" />
    <param name="length" type="integer" value="1000" label="Minimum sequence length" help="The minimum sequence length to include in NAST alignment [default: 1000]" />
  </inputs>
  <outputs>
    <data name='aligned_output' format='fasta' label='${tool.name} on ${on_string}: pynast aligned' />
    <data name='log' format='txt' hidden='TRUE' label='${tool.name} on ${on_string}: pynast log file' />
    <data name='fail' format='fasta' hidden='TRUE' label='${tool.name} on ${on_string}: pynast failure' />
  </outputs>
  <help>
===========
Description
===========

.. class:: infomark

This tool generates two additional files, which are hidden from the history list. To view them, click the cog wheel next to the History panel and select 'Include Hidden Dataset'.

PyNAST (Python Nearest Alignment Space Termination) is a reimplementation of the NAST sequence aligner, which has become a popular tool for adding new 16S rDNA sequences to existing 16S rDNA alignments. This reimplementation is more flexible, faster, and easier to install and maintain than the original NAST implementation. PyNAST is built using the PyCogent_Bioinformatics_Toolkit_.

.. _PyCogent_Bioinformatics_Toolkit: http://pycogent.org/

Given a set of sequences and a template alignment, PyNAST aligns each input sequence (the 'candidate') to the best-matching sequence in a pre-aligned set of sequences (the 'template alignment') and returns a multiple sequence alignment that contains the same number of positions (columns) as the template alignment. This facilitates the analysis of new sequences in the context of existing alignments and of additional data derived from those alignments, such as phylogenetic trees. Because any protein or nucleic acid sequences and template alignments can be provided, PyNAST is not limited to the analysis of 16S rDNA sequences.

For further information visit: PyNAST_Documentation_

.. _PyNAST_Documentation: http://biocore.github.io/pynast/

------

-----
Input
-----

(A) Candidate file, a set of sequences in FASTA format, e.g.::

        >0 A
        GAGTTTGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT
        >1 B
        AGTTTGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT
        >2 C
        GAGGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT
        >3 D
        CCCGAGGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT

(B) Template alignment file, e.g.::

        >Y
        GAGTTT-GA--T-CC-T-G-GCTC-AG-AT-TGAA-C-GC--TGG-C--G-GT-A-TG--C----T-T
        >Z
        GAGTTT-GA--T-CC-T-G-GCTC-AG-AT-TGAA-C-GC--TGG-C--G-GC-A-GG--C----C-T

----------
Parameters
----------

Minimum sequence length
 The minimum length a candidate sequence must have to be included in the NAST alignment. The default is 1000.

.. class:: warningmark

Set the minimum sequence length according to the lengths of your candidate sequences; candidates shorter than this value are not included in the alignment.
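
As a rough aid for choosing this value, the following sketch counts the residues in each candidate sequence and reports which sequences would meet a given threshold. The function names ``read_fasta_lengths`` and ``split_by_min_length`` are hypothetical helpers written for this illustration; they are not part of PyNAST or Galaxy. The sketch also assumes that length is counted over non-gap characters, which may differ in detail from how PyNAST applies the filter.

```python
def read_fasta_lengths(lines):
    """Return {sequence_id: residue_count} for FASTA-formatted lines."""
    lengths = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            # Assumption: gap characters do not count towards the length.
            lengths[current] += len(line.replace("-", "").replace(".", ""))
    return lengths


def split_by_min_length(lengths, min_length):
    """Partition sequence IDs into those meeting min_length and the rest."""
    kept = [sid for sid, n in lengths.items() if n >= min_length]
    dropped = [sid for sid, n in lengths.items() if not n >= min_length]
    return kept, dropped


# The four candidate sequences from the Input section above.
candidates = """>0 A
GAGTTTGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT
>1 B
AGTTTGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT
>2 C
GAGGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT
>3 D
CCCGAGGATCCTGGCTCAGATTGAACGCTGGCGGTATGCTT
""".splitlines()

lengths = read_fasta_lengths(candidates)
kept, dropped = split_by_min_length(lengths, 1000)
print(len(kept), len(dropped))  # prints "0 4"
```

All four example candidates are far shorter than the default of 1000, so a much lower threshold would be needed for input of this kind.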

------
Output
------

This tool produces three output files. Two of them, the log file and the failure file, are hidden by default.

.. class:: infomark

You can view the hidden outputs by clicking on the cog wheel next to the History panel and selecting "Include Hidden Dataset".

(A) The alignment, a FASTA file containing the aligned candidate sequences
(B) *(hidden)* Log file
(C) *(hidden)* Failure file, a FASTA file containing the candidate sequences that failed to align
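
Because the alignment is described as having the same number of columns as the template, one simple sanity check on the output is to compare row widths. The sketch below is only an illustration: ``read_fasta`` and ``alignment_width`` are hypothetical helpers, not PyNAST functions, and the "aligned" row is a stand-in copied from the template example rather than real tool output.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into {sequence_id: sequence}."""
    records = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def alignment_width(records):
    """Return the shared column count, or raise if rows differ in length."""
    widths = {len(seq) for seq in records.values()}
    if len(widths) != 1:
        raise ValueError("rows differ in length: %s" % sorted(widths))
    return widths.pop()


# Template alignment from the Input section above.
template = read_fasta(""">Y
GAGTTT-GA--T-CC-T-G-GCTC-AG-AT-TGAA-C-GC--TGG-C--G-GT-A-TG--C----T-T
>Z
GAGTTT-GA--T-CC-T-G-GCTC-AG-AT-TGAA-C-GC--TGG-C--G-GC-A-GG--C----C-T
""".splitlines())

# Stand-in for one row of the aligned output (NOT real PyNAST output):
# candidate 0 written with the same gap pattern as template sequence Y.
aligned = read_fasta(""">0 A
GAGTTT-GA--T-CC-T-G-GCTC-AG-AT-TGAA-C-GC--TGG-C--G-GT-A-TG--C----T-T
""".splitlines())

print(alignment_width(template) == alignment_width(aligned))  # prints "True"
```

In practice the same two functions could be pointed at the downloaded template and alignment files by passing open file handles instead of the inline strings.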

-----

=========
Resources
=========

PyNAST_

.. _PyNAST: https://github.com/biocore/pynast

**Wrapper Author**

QFAB Bioinformatics (support@qfab.org)

  </help>
  <tests>
    <test>
      <param name="input" value="seqs.fasta" />
      <param name="template" value="core_set_aligned.fasta" />
      <param name="length" value="1000" />
      <output name="aligned_output" file="aligned_output.fasta" ftype="fasta" lines_diff="10" />
      <output name="log" file="log.txt" ftype="txt" lines_diff="10" />
      <output name="fail" file="fail.fasta" ftype="fasta" lines_diff="10" />
    </test>
  </tests>
</tool>