<tool id="repeatmodeler" name="RepeatModeler" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="20.01">
    <description>Model repetitive DNA</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <requirements>
        <expand macro="requirements" />
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
BuildDatabase -name 'rmdb' '$input_file'

&&

## "RMBlast jobs will use 4 cores each", so -pa is GALAXY_SLOTS divided by 4, rounded up
pa=\$(( (\${GALAXY_SLOTS:-1}+3)/4 ))

&&

RepeatModeler -database 'rmdb' -pa \$pa
    ]]></command>
    <inputs>
        <param type="data" name="input_file" format="fasta" label="Input genome fasta"/>
    </inputs>
    <outputs>
        <data format="fasta" name="sequences" from_work_dir="rmdb-families.fa" label="${tool.name} on ${on_string}: consensus sequences" />
        <data format="stockholm" name="seeds" from_work_dir="rmdb-families.stk" label="${tool.name} on ${on_string}: seed alignments" />
    </outputs>
    <tests>
        <test>
            <param name="input_file" value="eco.fasta.gz" ftype="fasta.gz"/>
            <output name="sequences" ftype="fasta">
                <assert_contents>
                    <has_text text="( RepeatScout Family Size ="/>
                    <has_text text="rnd-1_family-0"/>
                </assert_contents>
            </output>
            <output name="seeds" ftype="stockholm">
                <assert_contents>
                    <has_text text="#=GF DE    RepeatModeler Generated"/>
                </assert_contents>
            </output>
        </test>
    </tests>
    <help><![CDATA[
RepeatModeler is a de novo transposable element (TE) family identification and modeling package. At the heart of RepeatModeler are three de novo repeat-finding programs (RECON, RepeatScout, and LtrHarvest/Ltr_retriever) that employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data.

Given a genomic database, RepeatModeler automates the runs of these programs, clusters redundant results, refines and classifies the families, and produces a high-quality library of TE families suitable for use with RepeatMasker and, ultimately, for submission to the Dfam database (http://dfam.org).
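
This wrapper derives the ``-pa`` value passed to RepeatModeler from the number of job slots Galaxy allocates, on the stated assumption that each RMBlast job uses 4 cores. The following is a minimal sketch of that arithmetic, not part of the tool itself; ``pa_for_slots`` is a hypothetical helper name introduced only for illustration.

```shell
# Sketch of the slot-to-"-pa" arithmetic used in this wrapper's command block.
# pa_for_slots is a hypothetical helper; it is not part of RepeatModeler or Galaxy.
pa_for_slots() {
    slots="${1:-1}"   # default to a single slot, as the wrapper does for GALAXY_SLOTS
    # Integer ceiling of slots/4: adding 3 before the integer division rounds up.
    echo $(( (slots + 3) / 4 ))
}

# Show the resulting -pa value for a few slot counts.
for slots in 1 4 5 8 16; do
    echo "GALAXY_SLOTS=$slots -> -pa $(pa_for_slots "$slots")"
done
```

Because the division rounds up, any slot count that is not a multiple of 4 (a single slot, for example) still yields one more search job than a rounded-down division would, so under the 4-cores-per-job assumption the job may use more cores than were allocated.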
    ]]></help>
    <expand macro="citations" />
</tool>
author	iuc
date	Sat, 27 Nov 2021 10:07:06 +0000
parents	dda44fd49bcd
children	a9fb165808c8