comparison kraken.xml @ 0:0f17a8816b28 draft
planemo upload for repository https://github.com/galaxyproject/tools-devteam/blob/master/tool_collections/kraken/kraken/ commit 00a7926c285bc4a339bd7deebf40b28f39c7d947-dirty
author | devteam |
---|---|
date | Thu, 23 Jul 2015 10:55:44 -0400 |
parents | |
children | 7b3ef9b4af80 |
-1:000000000000 | 0:0f17a8816b28 |
---|---|
1 <?xml version="1.0"?> | |
2 <tool id="kraken" name="Kraken" version="1.1.0"> | |
3 <description> | |
4 assign taxonomic labels to sequencing reads | |
5 </description> | |
6 <macros> | |
7 <import>macros.xml</import> | |
8 </macros> | |
9 <command> | |
10 <![CDATA[ | |
11 @SET_DATABASE_PATH@ && | |
12 kraken --threads \${GALAXY_SLOTS:-1} @INPUT_DATABASE@ | |
13 | |
14 #if $input_sequences.is_of_type( 'fastq' ): | |
15 --fastq-input | |
16 #else: | |
17 --fasta-input | |
18 #end if | |
19 | |
20 ${only_classified_output} | |
21 | |
22 #if str( $quick_operation.quick ) == "yes": | |
23 --quick | |
24 --min-hits ${quick_operation.min_hits} | |
25 | |
26 #end if | |
27 | |
28 "$input_sequences" | |
29 | |
30 #if $split_reads: | |
31 --classified-out "${classified_out}" --unclassified-out "${unclassified_out}" | |
32 #end if | |
33 --output "${output}" | |
34 ##kraken-translate --db ${kraken_database.fields.name} "${output}" > "${translated}" | |
35 ]]> | |
36 </command> | |
37 <inputs> | |
38 <param format="fasta,fastq" label="Input sequences" name="input_sequences" type="data" help="FASTA or FASTQ datasets"/> | |
39 <param label="Output classified and unclassified reads?" name="split_reads" type="boolean" help="Sets --unclassified-out and --classified-out"/> | |
40 | |
41 <conditional name="quick_operation"> | |
42 <param name="quick" type="select" label="Enable quick operation?" help="--quick; Rather than searching all k-mers in a sequence, stop classification after a specified number of database hits"> | |
43 <option value="yes">Yes</option> | |
44 <option selected="True" value="no">No</option> | |
45 </param> | |
46 <when value="yes"> | |
47 <param name="min_hits" type="integer" value="1" label="Number of hits required for classification" help="--min-hits; min-hits will allow you to require multiple hits before declaring a sequence classified, which can be especially useful with custom databases when testing to see if sequences either do or do not belong to a particular genome; default=1"/> | |
48 </when> | |
49 <when value="no"> | |
50 <!-- Do absolutely nothing --> | |
51 </when> | |
52 </conditional> | |
53 | |
54 <param name="only_classified_output" type="boolean" checked="False" truevalue="--only-classified-output" falsevalue="" label="Print no Kraken output for unclassified sequences" help="--only-classified-output"/> | |
55 | |
56 <expand macro="input_database" /> | |
57 </inputs> | |
58 <outputs> | |
59 <data format="tabular" label="${tool.name} on ${on_string}: Classified reads" name="classified_out"> | |
60 <filter>(split_reads)</filter> | |
61 </data> | |
62 <data format="tabular" label="${tool.name} on ${on_string}: Unclassified reads" name="unclassified_out"> | |
63 <filter>(split_reads)</filter> | |
64 </data> | |
65 <data format="tabular" label="${tool.name} on ${on_string}: Classification" name="output" /> | |
66 <!--<data format="tabular" label="${tool.name} on ${on_string}: Translated classification" name="translated" />--> | |
67 </outputs> | |
68 <help> | |
69 <![CDATA[ | |
70 **What it does** | |
71 | |
72 Kraken is a taxonomic sequence classifier that assigns taxonomic labels to short DNA reads. It does this by examining the k-mers within a read and querying a database with those k-mers. This database contains a mapping of every k-mer in Kraken's genomic library to the lowest common ancestor (LCA) in a taxonomic tree of all genomes that contain that k-mer. The set of LCA taxa that correspond to the k-mers in a read is then analyzed to create a single taxonomic label for the read; this label can be any of the nodes in the taxonomic tree. Kraken is designed to be rapid, sensitive, and highly precise. | |
73 | |
74 ----- | |
75 | |
76 **Kraken options** | |
77 | |
78 The Galaxy version of Kraken implements the following options:: | |
79 | |
80 | |
81 --fasta-input Input is FASTA format | |
82 --fastq-input Input is FASTQ format | |
83 --quick Quick operation (use first hit or hits) | |
84 --min-hits NUM In quick op., number of hits req'd for classification | |
85 NOTE: this is ignored if --quick is not specified | |
86 --unclassified-out Print unclassified sequences to filename | |
87 --classified-out Print classified sequences to filename | |
88 | |
89 --only-classified-output Print no Kraken output for unclassified sequences | |
90 | |
91 ------ | |
92 | |
93 **Output Format** | |
94 | |
95 Each sequence classified by Kraken results in a single line of output. Output lines contain five tab-delimited fields; from left to right, they are:: | |
96 | |
97 1. "C"/"U": one-letter code indicating that the sequence was either classified or unclassified. | |
98 2. The sequence ID, obtained from the FASTA/FASTQ header. | |
99 3. The taxonomy ID Kraken used to label the sequence; this is 0 if the sequence is unclassified. | |
100 4. The length of the sequence in bp. | |
101 5. A space-delimited list indicating the LCA mapping of each k-mer in the sequence. For example, "562:13 561:4 A:31 0:1 562:3" would indicate that: | |
102 a) the first 13 k-mers mapped to taxonomy ID #562 | |
103 b) the next 4 k-mers mapped to taxonomy ID #561 | |
104 c) the next 31 k-mers contained an ambiguous nucleotide | |
105 d) the next k-mer was not in the database | |
106 e) the last 3 k-mers mapped to taxonomy ID #562 | |
107 ]]> | |
108 </help> | |
109 <expand macro="requirements" /> | |
110 <expand macro="stdio" /> | |
111 <expand macro="version_command" /> | |
112 <expand macro="citations" /> | |
113 </tool> |
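
The five-field output format described in the help text above can be read with a few lines of Python. This is a minimal sketch, not part of the tool: the names `KrakenRecord` and `parse_kraken_line` are hypothetical, and the example line (including the read ID `read_1` and the length 82) is made up for illustration, reusing only the k-mer string from the help text.

```python
from typing import List, NamedTuple, Tuple


class KrakenRecord(NamedTuple):
    """One line of Kraken classification output (hypothetical helper type)."""
    classified: bool                  # field 1: "C" or "U"
    sequence_id: str                  # field 2: ID from the FASTA/FASTQ header
    taxonomy_id: int                  # field 3: 0 if unclassified
    length_bp: int                    # field 4: sequence length in bp
    kmer_lca: List[Tuple[str, int]]   # field 5: (taxon or "A", k-mer count) pairs


def parse_kraken_line(line: str) -> KrakenRecord:
    """Split one tab-delimited output line into its five fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-delimited fields, got {len(fields)}")
    status, seq_id, tax_id, length, lca = fields
    if status not in ("C", "U"):
        raise ValueError(f"unexpected classification code: {status!r}")

    pairs = []
    for token in lca.split():
        # Each token is "<taxonomy ID or A>:<number of consecutive k-mers>".
        taxon, count = token.rsplit(":", 1)
        pairs.append((taxon, int(count)))

    return KrakenRecord(status == "C", seq_id, int(tax_id), int(length), pairs)


# Made-up example line built around the k-mer string from the help text.
example = "C\tread_1\t562\t82\t562:13 561:4 A:31 0:1 562:3"
record = parse_kraken_line(example)
total_kmers = sum(count for _, count in record.kmer_lca)
```

The taxon in each pair is kept as a string because the fifth field mixes numeric taxonomy IDs with the letter `A` for k-mers containing an ambiguous nucleotide.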