comparison parseval.xml @ 0:1b52f0c8ad7f draft

"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/aegean commit e6c01517075cab35e620fe1bbdb5fd68e4d1359f"
author iuc
date Sun, 03 Jan 2021 14:58:02 +0000
parents
children d6c074a93c51
-1:000000000000 0:1b52f0c8ad7f
1 <tool id='aegean_parseval' name='AEGeAn ParsEval' version='@TOOL_VERSION@' profile='20.01'>
2 <description>compare two sets of gene annotations for the same sequence</description>
3 <macros>
4 <import>macros.xml</import>
5 </macros>
6 <expand macro='xrefs'/>
7 <expand macro='edam_ontology'/>
8 <expand macro='requirements'/>
9 <version_command>parseval --version</version_command>
10 <command detect_errors='exit_code'>
11 <![CDATA[
12 #if $output_type == 'html'
13 mkdir -p '${output_html.extra_files_path}' &&
14 #end if
15 parseval '$referencegff3' '$predictiongff3'
16 --delta $delta
17 --maxtrans $maxtrans
18 -w
19 #if $refrlabel
20 --refrlabel '$refrlabel'
21 #end if
22 #if $predlabel
23 --predlabel '$predlabel'
24 #end if
25 #if $output_type == 'text'
26 -f 'text'
27 -o '$output_txt'
28 #else if $output_type == 'html'
29 -f 'html'
30 -o '${output_html.extra_files_path}' &&
31 echo "</div> </body> </html>" >> '${output_html.extra_files_path}/index.html' &&
32 cp '${output_html.extra_files_path}/index.html' '$output_html'
33 #end if
34 ]]>
35 </command>
36 <inputs>
37 <param name='referencegff3' type='data' format='gff3' label="Reference GFF3 file"/>
38 <param name='predictiongff3' type='data' format='gff3' label="Prediction GFF3 file"/>
39 <param argument='--delta' type='integer'
40 min='0' max='20' value='0'
41 label='Number of nucleotides by which to extend gene loci'/>
42 <param argument='--maxtrans' type='integer'
43 min='1' max='50' value='32'
44 label='Maximum number of transcripts allowed per locus' />
45 <param name='output_type' type='select'
46 label='Select the output type'>
47 <option value='text'>Text</option>
48 <option value='html'>HTML</option>
49 </param>
50 <param argument='--refrlabel' type='text'
51 value='' optional='true'
52 label='Reference annotations'
53 help='Optional label for reference annotations'/>
54 <param argument='--predlabel' type='text'
55 value='' optional='true'
56 label='Prediction annotations'
57 help='Optional label for prediction annotations'/>
58 </inputs>
59 <outputs>
60 <data name='output_txt' format='txt'>
61 <filter>output_type == 'text'</filter>
62 </data>
63 <data name='output_html' format='html' from_work_dir="output.html">
64 <filter>output_type == 'html'</filter>
65 </data>
66 </outputs>
67 <tests>
68 <test expect_num_outputs="1">
69 <param name='referencegff3' value='TAIR9_GFF3_genes.gff'/>
70 <param name='predictiongff3' value='TAIR10_GFF3_genes.gff'/>
71 <param name='delta' value='5'/>
72 <param name='maxtrans' value='20'/>
73 <output name='output_txt' file='parseval_output_test1.txt' lines_diff='8'/>
74 </test>
75 <test expect_num_outputs="1">
76 <param name='referencegff3' value='TAIR9_GFF3_genes.gff'/>
77 <param name='predictiongff3' value='TAIR10_GFF3_genes.gff'/>
78 <output name='output_txt' file='parseval_output_test2.txt' lines_diff='8'/>
79 </test>
80 <test expect_num_outputs="1">
81 <param name='referencegff3' value='TAIR9_GFF3_genes.gff'/>
82 <param name='predictiongff3' value='TAIR10_GFF3_genes.gff'/>
83 <param name='delta' value='10'/>
84 <param name='maxtrans' value='10'/>
85 <output name='output_txt' file='parseval_output_test3.txt' lines_diff='8'/>
86 </test>
87 <test expect_num_outputs="1">
88 <param name='referencegff3' value='TAIR9_GFF3_genes.gff'/>
89 <param name='predictiongff3' value='TAIR10_GFF3_genes.gff'/>
90 <param name='output_type' value='html'/>
91 <output name='output_html' file='parseval_output_test4.html' lines_diff='8'/>
92 </test>
93 <test expect_num_outputs="1">
94 <param name='referencegff3' value='TAIR9_GFF3_genes.gff'/>
95 <param name='predictiongff3' value='TAIR10_GFF3_genes.gff'/>
96 <param name='output_type' value='html'/>
97 <param name='delta' value='10'/>
98 <param name='maxtrans' value='10'/>
99 <output name='output_html' file='parseval_output_test5.html' lines_diff='8'/>
100 </test>
101 <test expect_num_outputs="1">
102 <param name='referencegff3' value='TAIR9_GFF3_genes.gff'/>
103 <param name='predictiongff3' value='TAIR10_GFF3_genes.gff'/>
104 <param name='output_type' value='text'/>
105 <param name='refrlabel' value='example_ref_label'/>
106 <param name='predlabel' value='example_pred_label'/>
107 <output name='output_txt' file='parseval_output_test6.txt' lines_diff='8'/>
108 </test>
109 </tests>
110 <help>
111 <![CDATA[
112 .. class:: infomark
113
114 **Purpose**
115
116 ParsEval is a program for comparing two sets of gene annotations for the same sequence. The most common use cases for ParsEval are as follows.
117
118 - You are annotating a newly assembled genome. The optimal parameter settings for annotation are not clear initially, so you do some exploratory data analysis and try several different parameter settings. You can use ParsEval to identify the similarities and differences between the different annotations you have produced.
119
120 - You are doing a genome-wide analysis of genes in your favorite organism. There is a gene annotation available from the consortium that sequenced and assembled the genome, but there is a different annotation available at NCBI. Here too, ParsEval lets you compare the two annotations and quickly identify their similarities and differences.
121
122 -----
123
124 .. class:: infomark
125
126 **Input**
127
128 Input for ParsEval is two sets of annotations in GFF3 format. ParsEval uses the GenomeTools GFF3 parser, which strictly enforces syntax rules laid out in the GFF3 specification. ParsEval itself does some additional checks on the data to make sure valid comparisons are possible. Any features not directly related to protein-coding genes are ignored.
129
130 ParsEval will infer features implicitly encoded in the data. For example, if a gene annotation declares 6 exon features but no intron features, ParsEval will infer the 5 corresponding intron features from the exon boundaries. However, if ParsEval sees any intron features in a gene model it will assume all introns are declared explicitly. Violations of that assumption will likely elicit a program error.
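
The inference described above can be illustrated with a minimal Python sketch. It is not ParsEval's own implementation; the function name ``infer_introns`` and the exon coordinates are hypothetical, and coordinates are assumed to be 1-based and inclusive, as in GFF3.

```python
def infer_introns(exons):
    """Return the introns implied by a list of (start, end) exon intervals.

    Coordinates are assumed to be 1-based and inclusive, as in GFF3.
    """
    ordered = sorted(exons)
    introns = []
    # Walk over each pair of consecutive exons.
    for (_, left_end), (right_start, _) in zip(ordered, ordered[1:]):
        # An intron fills the gap between two consecutive exons;
        # directly adjacent exons leave no gap and yield no intron.
        if right_start - left_end > 1:
            introns.append((left_end + 1, right_start - 1))
    return introns


# Hypothetical gene model with 6 exons and no declared introns.
exons = [(100, 200), (300, 400), (500, 600),
         (700, 800), (900, 1000), (1100, 1200)]
introns = infer_introns(exons)
print(len(introns))  # 5
print(introns[0])    # (201, 299)
```

Six exons yield five introns, matching the example in the paragraph above.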
131
132 ParsEval is flexible in handling the common conventions for encoding gene structure: exons + start/stop codons, exons + CDS, CDS + UTRs, etc. Any subset of features that completely captures the gene’s exon/intron structure, CDS(s), and UTRs should be handled correctly.
133
134 ParsEval requires that gene isoforms be encoded using the feature type mRNA (as opposed to transcript, primary_transcript, or other valid SO terms). For mRNA features lacking an explicitly declared gene parent, ParsEval will create one. Note, however, that ParsEval will treat each such transcript as belonging to its own distinct gene, which will erroneously inflate the summary statistics it reports.
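
Before running the comparison, it may help to check an annotation for mRNA features that declare no parent. The following Python sketch is one way to do that, not part of ParsEval; the helper name ``orphan_mrnas`` and the sample records are hypothetical, and it assumes plain tab-separated GFF3 lines with ``key=value`` attributes.

```python
def orphan_mrnas(gff3_lines):
    """Return the IDs of mRNA features that declare no Parent attribute."""
    orphans = []
    for line in gff3_lines:
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # GFF3 records have nine columns; column 3 is the feature type.
        if len(fields) != 9 or fields[2] != "mRNA":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
        )
        if "Parent" not in attrs:
            orphans.append(attrs.get("ID", "<no ID>"))
    return orphans


# Hypothetical records: mrna2 and mrna3 have no gene parent.
records = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tsrc\tmRNA\t1500\t2400\t.\t+\t.\tID=mrna2",
    "chr1\tsrc\tmRNA\t1600\t2400\t.\t+\t.\tID=mrna3",
]
print(orphan_mrnas(records))  # ['mrna2', 'mrna3']
```

In this hypothetical input, ``mrna2`` and ``mrna3`` would each be assigned to a separate gene, even if they were meant to be isoforms of the same one.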
135
136 -----
137
138 .. class:: infomark
139
140 **Output**
141
142 ParsEval output includes a variety of similarity statistics that measure the agreement between the two annotations. Our use of agreement here instead of accuracy is intentional: except in rare cases, you will not be comparing a prediction to a true high-quality “gold standard.” It is much more common to compare two annotation sets whose relative quality is unknown. ParsEval uses the terms reference and prediction only to distinguish the two sets: it makes no assumptions as to their relative quality.
143
144
145 ]]>
146 </help>
147 <expand macro='citations'/>
148 </tool>