Mercurial > repos > devteam > ncbi_blast_plus
view tools/ncbi_blast_plus/ncbi_makeblastdb.xml @ 10:70e7dcbf6573 draft
Uploaded v0.0.20, handles dependencies via package_blast_plus_2_2_26, development moved to GitHub, RST README, MIT licence, citation information, more tests, percentage identity option to BLASTN, cElementTree to ElementTree fallback.
author | peterjc |
---|---|
date | Mon, 23 Sep 2013 06:14:13 -0400 |
parents | 9dabbfd73c8a |
children | 4c4a0da938ff |
line wrap: on
line source
<tool id="ncbi_makeblastdb" name="NCBI BLAST+ makeblastdb" version="0.0.5"> <description>Make BLAST database</description> <requirements> <requirement type="binary">makeblastdb</requirement> <requirement type="package" version="2.2.26+">blast+</requirement> </requirements> <version_command>makeblastdb -version</version_command> <command> makeblastdb -out "${os.path.join($outfile.extra_files_path,'blastdb')}" $parse_seqids $hash_index ## Single call to -in with multiple filenames space separated with outer quotes ## (presumably any filenames with spaces would be a problem). Note this gives ## some extra spaces, e.g. -in " file1 file2 file3 " but BLAST seems happy: -in " #for $i in $in ${i.file} #end for " #if $title: -title "$title" #else: ##Would default to being based on the cryptic Galaxy filenames, which is unhelpful -title "BLAST Database" #end if -dbtype $dbtype ## #set $sep = '-mask_data ' ## #for $i in $mask_data ## $sep${i.file} ## #set $set = ', ' ## #end for ## #set $sep = '-gi_mask -gi_mask_name ' ## #for $i in $gi_mask ## $sep${i.file} ## #set $set = ', ' ## #end for ## #if $tax.select == 'id': ## -taxid $tax.id ## #else if $tax.select == 'map': ## -taxid_map $tax.map ## #end if </command> <stdio> <!-- Anything other than zero is an error --> <exit_code range="1:" /> <exit_code range=":-1" /> <!-- In case the return code has not been set propery check stderr too --> <regex match="Error:" /> <regex match="Exception:" /> </stdio> <inputs> <param name="dbtype" type="select" display="radio" label="Molecule type of input"> <option value="prot">protein</option> <option value="nucl">nucleotide</option> </param> <!-- TODO Allow merging of existing BLAST databases (conditional on the database type) <repeat name="in" title="BLAST or FASTA Database" min="1"> <param name="file" type="data" format="fasta,blastdbn,blastdbp" label="BLAST or FASTA database" /> </repeat> --> <repeat name="in" title="FASTA file" min="1"> <param name="file" type="data" format="fasta" /> </repeat> <param name="title" type="text" value="" label="Title for BLAST database" help="This is the database name shown in BLAST search output" /> <param name="parse_seqids" type="boolean" truevalue="-parse_seqids" falsevalue="" checked="False" label="Parse the sequence identifiers" help="This is only advised if your FASTA file follows the NCBI naming conventions using pipe '|' symbols" /> <param name="hash_index" type="boolean" truevalue="-hash_index" falsevalue="" checked="true" label="Enable the creation of sequence hash values." help="These hash values can then be used to quickly determine if a given sequence data exists in this BLAST database." /> <!-- SEQUENCE MASKING OPTIONS --> <!-- TODO <repeat name="mask_data" title="Provide one or more files containing masking data"> <param name="file" type="data" format="asnb" label="File containing masking data" help="As produced by NCBI masking applications (e.g. dustmasker, segmasker, windowmasker)" /> </repeat> <repeat name="gi_mask" title="Create GI indexed masking data"> <param name="file" type="data" format="asnb" label="Masking data output file" /> </repeat> --> <!-- TAXONOMY OPTIONS --> <!-- TODO <conditional name="tax"> <param name="select" type="select" label="Taxonomy options"> <option value="">Do not assign sequences to Taxonomy IDs</option> <option value="id">Assign all sequences to one Taxonomy ID</option> <option value="map">Supply text file mapping sequence IDs to taxnomy IDs</option> </param> <when value=""> </when> <when value="id"> <param name="id" type="integer" value="" label="NCBI taxonomy ID" help="Integer >=0" /> </when> <when value="map"> <param name="file" type="data" format="txt" label="Seq ID : Tax ID mapping file" help="Format: SequenceId TaxonomyId" /> </when> </conditional> --> </inputs> <outputs> <!-- If we only accepted one FASTA file, we could use its human name here... --> <data name="outfile" format="data" label="${dbtype.value_label} BLAST database from ${on_string}"> <change_format> <when input="dbtype" value="nucl" format="blastdbn" /> <when input="dbtype" value="prot" format="blastdbp" /> </change_format> </data> </outputs> <tests> </tests> <help> **What it does** Make BLAST database from one or more FASTA files and/or BLAST databases. This is a wrapper for the NCBI BLAST+ tool 'makeblastdb', which is the replacement for the 'formatdb' tool in the NCBI 'legacy' BLAST suite. <!-- Applying masks to an existing BLAST database will not change the original database; a new database will be created. For this reason, it's best to apply all masks at once to minimize the number of unnecessary intermediate databases. --> **Documentation** http://www.ncbi.nlm.nih.gov/books/NBK1763/ **References** If you use this Galaxy tool in work leading to a scientific publication please cite the following papers: Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013). Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology. PeerJ 1:e167 http://dx.doi.org/10.7717/peerj.167 Christiam Camacho et al. (2009). BLAST+: architecture and applications. BMC Bioinformatics. 15;10:421. http://dx.doi.org/10.1186/1471-2105-10-421 This wrapper is available to install into other Galaxy Instances via the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus </help> </tool>