cuffcompare: cuffcompare_wrapper.xml comparison

comparison cuffcompare_wrapper.xml @ 9:e66b9b5b8580 draft

planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/tool_collections/cufflinks/cuffcompare commit 82ee6fc860c52c531b7a57bbb346ab1a67a434a5

author	devteam
date	Sun, 19 Feb 2017 12:12:05 -0500
parents	1322b73ffe44
children	f648e5180e40


-:1322b73ffe44
+:e66b9b5b8580
-<tool id="cuffcompare" name="Cuffcompare" version="2.2.1.0">
+<tool id="cuffcompare" name="Cuffcompare" version="@VERSION@.1">
 <description>compare assembled transcripts to a reference annotation and track Cufflinks transcripts across multiple experiments</description>
 <macros>
 <import>cuff_macros.xml</import>
 </macros>
 <expand macro="requirements" />
-<expand macro="stdio" />
 <version_command>cuffcompare 2>&amp;1 | head -n 1</version_command>
-<command>
+<command detect_errors="aggressive"><![CDATA[
 python '$__tool_directory__/cuffcompare_wrapper.py'
 ## Use annotation reference?
 #if $annotation.use_ref_annotation == "Yes":
 -r '$annotation.reference_annotation'
 #if $annotation.ignore_nonoverlapping_reference:
 -R
 #end if
 #if $annotation.ignore_nonoverlapping_transfrags:
 -Q
 #end if
 #end if
 ## Use sequence data?
 #if $seq_data.use_seq_data == "Yes":
 -s
 #if $seq_data.seq_source.index_source == "history":
 --ref_file '$seq_data.seq_source.ref_file'
 #else:
 --index '${seq_data.seq_source.index.fields.path}'
 #end if
 #end if
 $discard_single_exon
 -e $max_dist_exon
 -d $max_dist_group
 #if $discard_intron_redundant_transfrags:
 -F
 #end if
 ## Outputs.
 --combined-transcripts '${transcripts_combined}'
 @CUFFLINKS_GTF_INPUTS@
-</command>
+]]></command>
 <inputs>
 <expand macro="cufflinks_gtf_inputs" />
 <conditional name="annotation">
 <param name="use_ref_annotation" type="select" label="Use Reference Annotation">
 <option value="No">No</option>
 </when>
 <when value="No">
 </when>
 </conditional>
 <conditional name="seq_data">
 <param name="use_seq_data" type="select" label="Use Sequence Data"
 help="Use sequence data for some optional classification functions, including the addition of the p_id attribute required by Cuffdiff.">
 <option value="Yes">Yes</option>
 <option value="No">No</option>
 </param>
 <when value="No"></when>
 <param type="select" name="discard_single_exon" label="discard (ignore) single-exon transcripts">
 <option value="" selected="True">No</option>
 <option value="-M">Discard single-exon transfrags and reference transcripts</option>
 <option value="-N">Discard single-exon reference transcripts</option>
 </param>
 <param type="integer" name="max_dist_exon" value="100" label="Max. Distance for assessing exon accuracy"
 help="max. distance (range) allowed from free ends of terminal exons of reference transcripts when assessing exon accuracy. Default: 100" />
 <param type="integer" name="max_dist_group" value="100" label="Max. Distance for transcript grouping"
 help="max. distance (range) for grouping transcript start sites. Default: 100" />
 <param type="boolean" name="discard_intron_redundant_transfrags" label="discard intron-redundant transfrags sharing 5'"
 help="Discard intron-redundant transfrags if they share the 5' end (if they differ only at the 3' end)" />
 </inputs>
 <outputs>
 <data format="txt" name="transcripts_accuracy" label="${tool.name} on ${on_string}: transcript accuracy"
 from_work_dir="cc_output.stats" />
 <data format="tabular" name="input1_tmap" label="${tool.name} on ${on_string}: data ${inputs[0].hid} tmap file"
 from_work_dir="cc_output.input1.tmap" />
 <data format="tabular" name="input1_refmap"
 label="${tool.name} on ${on_string}: data ${inputs[0].hid} refmap file"
 from_work_dir="cc_output.input1.refmap">
 <filter>annotation['use_ref_annotation'] == 'Yes'</filter>
 </data>
 <data format="tabular" name="input2_tmap" label="${tool.name} on ${on_string}: data ${inputs[1].hid} tmap file" from_work_dir="cc_output.input2.tmap">
 <filter>@HAS_MULTIPLE_INPUTS@</filter>
 </data>
 <data format="tabular" name="input2_refmap"
 label="${tool.name} on ${on_string}: data ${inputs[1].hid} refmap file"
 from_work_dir="cc_output.input2.refmap">
 <filter>annotation['use_ref_annotation'] == 'Yes' and @HAS_MULTIPLE_INPUTS@</filter>
 </data>
 <data format="tabular" name="transcripts_tracking" label="${tool.name} on ${on_string}: transcript tracking" from_work_dir="cc_output.tracking">
 <filter>@HAS_MULTIPLE_INPUTS@</filter>
 </data>
 <data format="gtf" name="transcripts_combined" label="${tool.name} on ${on_string}: combined transcripts"/>
 </outputs>
 <tests>
 <!--
 cuffcompare -r cuffcompare_in3.gtf -R cuffcompare_in1.gtf cuffcompare_in2.gtf
 -->
 <test>
 <param name="inputs" value="cuffcompare_in1.gtf,cuffcompare_in2.gtf" ftype="gtf"/>
 <param name="use_ref_annotation" value="Yes"/>
 **Input format**
 Cuffcompare takes Cufflinks' GTF output as input and can optionally take a "reference" annotation (such as from Ensembl_).
 .. _Ensembl: http://www.ensembl.org
 ------
 **Outputs**
 Cuffcompare reports various statistics related to the "accuracy" of the transcripts in each sample when compared to the reference annotation data. The typical gene finding measures of "sensitivity" and "specificity" (as defined in Burset, M., Guigó, R.: Evaluation of gene structure prediction programs (1996) Genomics, 34 (3), pp. 353-367. doi: 10.1006/geno.1996.0298) are calculated at various levels (nucleotide, exon, intron, transcript, gene) for each input file and reported in this file. The Sn and Sp columns show sensitivity and specificity values, respectively, at each level, while the fSn and fSp columns are "fuzzy" variants of these same accuracy calculations, allowing for a very small variation in exon boundaries to still be counted as a "match".
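As a rough illustration of the two measures, the sketch below uses the gene-finding convention from the cited paper, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP). The function names and the counts are hypothetical, and how cuffcompare itself tallies matches at each level is not shown here.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of reference features that were recovered: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


def specificity(tp: int, fp: int) -> float:
    """Fraction of predicted features that are correct: TP / (TP + FP).

    This is the gene-finding usage of "specificity", not TN / (TN + FP).
    """
    total = tp + fp
    return tp / total if total else 0.0


# Hypothetical exon-level counts, for illustration only.
true_pos, false_pos, false_neg = 80, 20, 10
sn = sensitivity(true_pos, false_neg)
sp = specificity(true_pos, false_pos)
print(f"Sn={sn:.3f} Sp={sp:.3f}")
```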
 Transcripts Combined File:
 Cuffcompare reports a GTF file containing the "union" of all transfrags in each sample. If a transfrag is present in both samples, it is reported only once in the combined GTF.
 Transcripts Tracking File:
 This file matches transcripts up between samples. Each row contains a transcript structure that is present in one or more input GTF files. Because the transcripts will generally have different IDs (unless you assembled your RNA-Seq reads against a reference transcriptome), cuffcompare examines the structure of each of the transcripts, matching transcripts that agree on the coordinates and order of all of their introns, as well as strand. Matching transcripts are allowed to differ on the length of the first and last exons, since these lengths will naturally vary from sample to sample due to the random nature of sequencing.
 If you ran cuffcompare with the -r option, the third and fourth columns contain the gene and transcript ids of the closest matching reference transcript to the one described by each row.
 Column number   Column name               Example          Description
 ---------------------------------
 1               Cufflinks transfrag id    TCONS_00000045   A unique internal id for the transfrag
 2               Cufflinks locus id        XLOC_000023      A unique internal id for the locus
 3               Reference gene id         Tcea             The gene_name attribute of the reference GTF record for this transcript, or '-' if no reference transcript overlaps this Cufflinks transcript
 4               Reference transcript id   uc007afj.1       The transcript_id attribute of the reference GTF record for this transcript, or '-' if no reference transcript overlaps this Cufflinks transcript
 5               Class code                c                The type of match between the Cufflinks transcripts in column 6 and the reference transcript. See class codes
 Each of the columns after the fifth have the following format:
 qJ:gene_id|transcript_id|FMI|FPKM|conf_lo|conf_hi
 A transcript need not be present in all samples to be reported in the tracking file. A sample not containing a transcript will have a "-" in its entry in the row for that transcript.
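The row layout described above can be read with a few lines of Python. This is a minimal sketch, assuming the file is tab-separated, that the first five columns are the fixed ones listed in the table, and that each per-sample column has exactly the six pipe-separated fields shown. The names `SampleEntry`, `parse_sample_field` and `parse_tracking_row` are hypothetical, and the per-sample values in the example row are made up.

```python
from typing import List, NamedTuple, Optional, Tuple


class SampleEntry(NamedTuple):
    sample: str         # the "qJ" prefix, e.g. "q1"
    gene_id: str
    transcript_id: str
    fmi: str
    fpkm: str
    conf_lo: str
    conf_hi: str


def parse_sample_field(field: str) -> Optional[SampleEntry]:
    """Parse one 'qJ:gene_id|transcript_id|FMI|FPKM|conf_lo|conf_hi' column.

    Returns None for "-", which marks a sample lacking the transcript.
    """
    if field == "-":
        return None
    sample, sep, rest = field.partition(":")
    parts = rest.split("|")
    if not sep or len(parts) != 6:
        raise ValueError(f"unexpected sample column: {field!r}")
    return SampleEntry(sample, *parts)


def parse_tracking_row(line: str) -> Tuple[List[str], List[Optional[SampleEntry]]]:
    """Split a row into its five fixed columns and the per-sample entries."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 5:
        raise ValueError("expected at least five columns")
    return cols[:5], [parse_sample_field(c) for c in cols[5:]]


# Hypothetical row: present in sample 1, absent from sample 2.
row = "TCONS_00000045\tXLOC_000023\tTcea\tuc007afj.1\tc\tq1:CUFF.1|CUFF.1.1|100|12.5|10.0|15.0\t-"
fixed, samples = parse_tracking_row(row)
print(fixed[4], samples)
```

Values are kept as strings so that no assumption is made about how missing or non-numeric fields are written.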
 If you ran cuffcompare with the -r option, the class code column of each tracking row will contain one of the following values. If you did not use -r, the rows will all contain "-" in their class code column::
 Priority         Code           Description
 ---------------------------------
 1                 =               Match
 2                 c               Contained
 3                 j               New isoform
 4                 e               A single exon transcript overlapping a reference exon and at least 10 bp of a reference intron, indicating a possible pre-mRNA fragment.
 5                 i               A single exon transcript falling entirely within a reference intron
 6                 o               Generic exonic overlap with a reference transcript
 7                 p               Possible polymerase run-on fragment
 8                 r               Repeat. Currently determined by looking at the soft-masked reference sequence and applied to transcripts where at least 50% of the bases are lower case
 9                 u               Unknown, intergenic transcript
 10                x               Exonic overlap with reference on the opposite strand
 11                s               An intron of the transfrag overlaps a reference intron on the opposite strand (likely due to read mapping errors)
 12                .               (.tracking file only, indicates multiple classifications)
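The class code table can be carried into downstream scripts as a simple lookup. The sketch below is illustrative only: the names `CLASS_CODE_PRIORITY` and `best_class_code` are hypothetical, and treating a lower priority number as the preferred match is an assumption about how the Priority column is meant to be read.

```python
from typing import Iterable, Optional

# Priority values copied from the class code table above.
CLASS_CODE_PRIORITY = {
    "=": 1, "c": 2, "j": 3, "e": 4, "i": 5, "o": 6,
    "p": 7, "r": 8, "u": 9, "x": 10, "s": 11, ".": 12,
}


def best_class_code(codes: Iterable[str]) -> Optional[str]:
    """Return the code with the lowest priority number, or None.

    Unknown values such as "-" (no reference annotation used) are ignored.
    Assumes a lower number means a stronger match.
    """
    known = [code for code in codes if code in CLASS_CODE_PRIORITY]
    if not known:
        return None
    return min(known, key=CLASS_CODE_PRIORITY.__getitem__)


print(best_class_code(["o", "j", "u"]))
```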
 -------
 **Settings**
 All of the options have a default value. You can change any of them. Most of the options in Cuffcompare have been implemented here.
