annotate protxml_to_gff.xml @ 0:04dc24d06ddb draft default tip

Uploaded
author iracooke
date Sat, 14 Jun 2014 18:18:41 -0400
parents
children
<tool id="protxml_to_gff" name="ProtXML to GFF" version="1.0.1">
  <requirements>
    <requirement type="package" version="1.3">protk</requirement>
    <requirement type="package" version="2.2.29">blast+</requirement>
  </requirements>

  <description>Map peptides from a protXML file to genomic coordinates</description>

  <command>
    protxml_to_gff.rb -p $protxml_file
    -g $genome_fasta_file
    -d $protein_fasta_file
    -o $output
    --threshold $peptide_threshold
    --prot-threshold $protein_threshold
    $stack_charges
    $collapse_redundant_proteins
  </command>

  <stdio>
    <exit_code range="1:" level="fatal" description="Failure" />
  </stdio>

  <inputs>
    <param name="protxml_file" type="data" format="protxml" help="ProtXML containing combined results from all searches" label="ProtXML File" />
    <param name="genome_fasta_file" type="data" format="fasta" help="The genome against which peptides will be mapped" label="Genome fasta file" />
    <param name="protein_fasta_file" type="data" format="fasta" help="The database used for ms/ms searches (must have genomic coords encoded in the fasta header)" label="Protein fasta file" />

    <param name="peptide_threshold" help="Peptide Probability Threshold" type="float" value="0.95" min="0" max="1" label="Peptide Probability Threshold" />
    <param name="protein_threshold" help="Protein Probability Threshold" type="float" value="0.99" min="0" max="1" label="Protein Probability Threshold" />

    <param name="stack_charges" type="boolean" label="Stack Charges" help="Different peptide charge states get separate gff entries" truevalue="--stack-charge-states" falsevalue=""/>

    <param name="collapse_redundant_proteins" type="boolean" label="Collapse Redundant Proteins" help="Proteins that cover genomic regions already covered will be skipped" truevalue="--collapse-redundant-proteins" falsevalue=""/>

  </inputs>

  <outputs>
    <data format="gff3" name="output" />
  </outputs>

  <help>
**What it does**

Generates a GFF file containing genomic coordinates for peptides present in a protXML file.

For this tool to work, the inputs must satisfy the following requirements.

1. The genome fasta should encode the scaffold numbers as in the following example:

>scaffoldXXX

or

>scaffold_XXX

where XXX represents the digits encoding the scaffold number. Any number of digits is allowed.

2. The protXML should have been generated by searching a database produced with the protk Generate 6 frame translation tool and the extract proteins from gff3 tool. Both of those tools should be run with the genomic coordinates included in the output file.

----

**References**

  </help>
</tool>
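The help text requires genome FASTA headers of the form `>scaffoldXXX` or `>scaffold_XXX`, where XXX is any number of digits. A minimal sketch of a pre-flight check for that convention follows. It is not part of protk or of this wrapper: the function names `scaffold_number` and `invalid_headers` are hypothetical. It also assumes that matching is case-sensitive and that only the first whitespace-delimited token of a header counts, neither of which the tool's help confirms.

```python
import re

# Assumption: the sequence ID (first token after '>') must be "scaffold",
# an optional underscore, then one or more digits. Case-sensitive.
SCAFFOLD_HEADER = re.compile(r"^>scaffold_?(\d+)$")


def scaffold_number(header_line):
    """Return the scaffold number encoded in a FASTA header line, or None."""
    tokens = header_line.strip().split()
    if not tokens:
        return None
    match = SCAFFOLD_HEADER.match(tokens[0])
    return int(match.group(1)) if match else None


def invalid_headers(fasta_lines):
    """Collect header lines that do not follow the scaffold naming convention."""
    return [
        line.rstrip("\n")
        for line in fasta_lines
        if line.startswith(">") and scaffold_number(line) is None
    ]
```

Because a file handle yields its lines when iterated, an open genome FASTA file can be passed directly to `invalid_headers`. An empty result suggests the headers follow the convention described in the help, under the assumptions above.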