<tool id="fastani" name="FastANI" version="1.1">
    <description>fast alignment-free computation of whole-genome Average Nucleotide Identity</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <version_command>fastANI --version</version_command>
    <command detect_errors="exit_code">
    <![CDATA[
    ## Link each query dataset under a readable name and record it in query.lst.
    #for $input_q in $query:
        ln -fs '$input_q' '${input_q.element_identifier}_query' &&
        echo '${input_q.element_identifier}_query' >> query.lst &&
    #end for

    ## Do the same for the reference datasets, recorded in ref.lst.
    #for $input_r in $reference:
        ln -fs '$input_r' '${input_r.element_identifier}_ref' &&
        echo '${input_r.element_identifier}_ref' >> ref.lst &&
    #end for

    ## Pass single genomes directly and multiple genomes as list files.
    fastANI
    #if len($query) == 1 and len($reference) == 1:
        -q '${input_q.element_identifier}_query' -r '${input_r.element_identifier}_ref' -o output
    #elif len($query) == 1:
        -q '${input_q.element_identifier}_query' --rl ref.lst -o output
    #else:
        --ql query.lst --rl ref.lst -o output
    #end if

    2>&1
    ]]></command>
    <inputs>
        <param name="query" type="data" format="fasta" multiple="true" label="Query sequences" />
        <param name="reference" type="data" format="fasta" multiple="true" label="Reference sequences" />
    </inputs>
    <outputs>
        <data name="output" format="txt" from_work_dir="output" label="${tool.name} on ${on_string} Output"/>
    </outputs>
    <tests>
        <test>
            <param name="query" value="Shigella_flexneri_2a_01.fna" />
            <param name="reference" value="Escherichia_coli_str_K12_MG1655.fna" />
            <output name="output" file="fastani.out" compare="sim_size" />
        </test>
    </tests>
    <help><![CDATA[
    FastANI is developed for fast alignment-free computation of whole-genome
    Average Nucleotide Identity (ANI). ANI is defined as the mean nucleotide
    identity of orthologous gene pairs shared between two microbial genomes.
    FastANI supports pairwise comparison of both complete and draft genome
    assemblies. Its underlying procedure follows a workflow similar to the one
    described by Goris et al. 2007. However, it avoids expensive sequence
    alignments and uses Mashmap as its MinHash-based sequence mapping engine to
    compute the orthologous mappings and alignment identity estimates. Based on
    our experiments with complete and draft genomes, its accuracy is on par with
    a BLAST-based ANI solver, and it achieves a speedup of two to three orders
    of magnitude. It is therefore useful for pairwise ANI computation over a
    large number of genome pairs. More details about its speed, accuracy and
    potential applications are described in the paper "High Throughput ANI
    Analysis of 90K Prokaryotic Genomes Reveals Clear Species Boundaries".
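
    As a rough guide, the FastANI call made by this wrapper depends on how many
    datasets are selected. The lines below are a sketch based on the command
    section of this tool; the names ``query.fna`` and ``reference.fna`` are
    illustrative placeholders, not the names the wrapper actually generates::

        # one query, one reference
        fastANI -q query.fna -r reference.fna -o output

        # one query, several references (references passed as a list)
        fastANI -q query.fna --rl ref.lst -o output

        # several queries (both sides passed as lists)
        fastANI --ql query.lst --rl ref.lst -o output

    Here ``query.lst`` and ``ref.lst`` are plain-text files that name one
    genome file per line.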
    ]]></help>
    <expand macro="citations" />
</tool>
<!-- author: thanhlv; date: Fri, 19 Jul 2019 06:49:02 -0400 -->