comparison infer_experiment.xml @ 32:580ee0c4bc4e

Fixes from Bjorn Gruning: create symlinks under $TMP and clean them up afterwards, replace R dependency with the Tool Shed R3 package, add --install-scripts, prepend tool-ids with rseqc
author lparsons
date Mon, 07 Oct 2013 15:01:13 -0400
parents cc5eaa9376d8
31:cc5eaa9376d8 32:580ee0c4bc4e
1 <tool id="infer_experiment" name="Infer Experiment" version="1.1"> 1 <tool id="rseqc_infer_experiment" name="Infer Experiment" version="1.1">
2 <description>speculates how RNA-seq were configured</description> 2 <description>speculates how RNA-seq were configured</description>
3 <requirements> 3 <requirements>
4 <requirement type="package" version="1.7.1">numpy</requirement> 4 <requirement type="package" version="1.7.1">numpy</requirement>
5 <requirement type="package" version="2.3.7">rseqc</requirement> 5 <requirement type="package" version="2.3.7">rseqc</requirement>
6 </requirements> 6 </requirements>
7 <command> infer_experiment.py -i $input -r $refgene 7 <command>
8 8 infer_experiment.py -i $input -r $refgene
9 #if $sample_size.boolean 9 #if $sample_size.boolean
10 -s $sample_size.size 10 -s $sample_size.size
11 #end if 11 #end if
12 12
13 > $output 13 > $output
14 </command> 14 </command>
15 <inputs>
16 <param name="input" type="data" format="bam,sam" label="Input BAM/SAM file" />
17 <param name="refgene" type="data" format="bed" label="Reference gene model in bed format" />
18 <conditional name="sample_size">
19 <param name="boolean" type="boolean" label="Modify usable sampled reads" value="false" />
20 <when value="true">
21 <param name="size" type="integer" label="Number of usable sampled reads (default = 200000)" value="200000" />
22 </when>
23 </conditional>
24 </inputs>
25 <outputs>
26 <data format="txt" name="output" />
27 </outputs>
28 <stdio> 15 <stdio>
29 <exit_code range="1:" level="fatal" description="An error occured during execution, see stderr and stdout for more information" /> 16 <exit_code range="1:" level="fatal" description="An error occured during execution, see stderr and stdout for more information" />
30 <regex match="[Ee]rror" source="both" description="An error occured during execution, see stderr and stdout for more information" /> 17 <regex match="[Ee]rror" source="both" description="An error occured during execution, see stderr and stdout for more information" />
31 </stdio> 18 </stdio>
32 <help> 19 <inputs>
20 <param name="input" type="data" format="bam,sam" label="Input BAM/SAM file" />
21 <param name="refgene" type="data" format="bed" label="Reference gene model in bed format" />
22 <conditional name="sample_size">
23 <param name="boolean" type="boolean" label="Modify usable sampled reads" value="false" />
24 <when value="true">
25 <param name="size" type="integer" label="Number of usable sampled reads (default = 200000)" value="200000" />
26 </when>
27 </conditional>
28 </inputs>
29 <outputs>
30 <data format="txt" name="output" />
31 </outputs>
32 <help>
33 infer_experiment.py 33 infer_experiment.py
34 +++++++++++++++++++ 34 +++++++++++++++++++
35 35
36 This program is used to speculate how RNA-seq sequencing were configured, especially how 36 This program is used to speculate how RNA-seq sequencing were configured, especially how
37 reads were stranded for strand-specific RNA-seq data, through comparing reads' mapping 37 reads were stranded for strand-specific RNA-seq data, through comparing reads' mapping
40 40
41 Inputs 41 Inputs
42 ++++++++++++++ 42 ++++++++++++++
43 43
44 Input BAM/SAM file 44 Input BAM/SAM file
45 Alignment file in BAM/SAM format. 45 Alignment file in BAM/SAM format.
46 46
47 Reference gene model 47 Reference gene model
48 Gene model in BED format. 48 Gene model in BED format.
49 49
50 Number of usable sampled reads (default=200000) 50 Number of usable sampled reads (default=200000)
51 Number of usable reads sampled from SAM/BAM file. More reads will give more accurate estimation, but make program little slower. 51 Number of usable reads sampled from SAM/BAM file. More reads will give more accurate estimation, but make program little slower.
52 52
53 Outputs 53 Outputs
54 +++++++ 54 +++++++
55 55
56 For pair-end RNA-seq, there are two different 56 For pair-end RNA-seq, there are two different
86 Example Output 86 Example Output
87 ++++++++++++++ 87 ++++++++++++++
88 88
89 **Example1** :: 89 **Example1** ::
90 90
91 ========================================================= 91 =========================================================
92 This is PairEnd Data :: 92 This is PairEnd Data ::
93 93
94 Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992 94 Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992
95 Fraction of reads explained by "1+-,1-+,2++,2--": 0.5008 95 Fraction of reads explained by "1+-,1-+,2++,2--": 0.5008
96 Fraction of reads explained by other combinations: 0.0000 96 Fraction of reads explained by other combinations: 0.0000
97 ========================================================= 97 =========================================================
98 98
99 *Conclusion*: We can infer that this is NOT a strand specific because 50% of reads can be explained by "1++,1--,2+-,2-+", while the other 50% can be explained by "1+-,1-+,2++,2--". 99 *Conclusion*: We can infer that this is NOT a strand specific because 50% of reads can be explained by "1++,1--,2+-,2-+", while the other 50% can be explained by "1+-,1-+,2++,2--".
100 100
101 **Example2** :: 101 **Example2** ::
102 102
103 ============================================================ 103 ============================================================
104 This is PairEnd Data 104 This is PairEnd Data
105 105
106 Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 :: 106 Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 ::
107 Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356 107 Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356
108 Fraction of reads explained by other combinations: 0.0000 108 Fraction of reads explained by other combinations: 0.0000
109 ============================================================ 109 ============================================================
110 110
111 *Conclusion*: We can infer that this is a strand-specific RNA-seq data. strandness of read1 is consistent with that of gene model, while strandness of read2 is opposite to the strand of reference gene model. 111 *Conclusion*: We can infer that this is a strand-specific RNA-seq data. strandness of read1 is consistent with that of gene model, while strandness of read2 is opposite to the strand of reference gene model.
112 112
113 **Example3** :: 113 **Example3** ::
114 114
115 ========================================================= 115 =========================================================
116 This is SingleEnd Data :: 116 This is SingleEnd Data ::
117 117
118 Fraction of reads explained by "++,--": 0.9840 :: 118 Fraction of reads explained by "++,--": 0.9840 ::
119 Fraction of reads explained by "+-,-+": 0.0160 119 Fraction of reads explained by "+-,-+": 0.0160
120 Fraction of reads explained by other combinations: 0.0000 120 Fraction of reads explained by other combinations: 0.0000
121 ========================================================= 121 =========================================================
122 122
123 *Conclusion*: This is single-end, strand specific RNA-seq data. Strandness of reads are concordant with strandness of reference gene. 123 *Conclusion*: This is single-end, strand specific RNA-seq data. Strandness of reads are concordant with strandness of reference gene.
124 124
125 125
126 ----- 126 -----
135 .. image:: http://rseqc.sourceforge.net/_static/logo.png 135 .. image:: http://rseqc.sourceforge.net/_static/logo.png
136 136
137 .. _RSeQC: http://rseqc.sourceforge.net/ 137 .. _RSeQC: http://rseqc.sourceforge.net/
138 138
139 139
140 </help> 140 </help>
141 </tool> 141 </tool>
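
The example outputs in the help text above can be read mechanically. The sketch below is a hypothetical helper, not part of RSeQC or of this Galaxy wrapper: it parses the "Fraction of reads explained by" lines of a report and says whether the data look strand-specific. The names parse_fractions and classify, and the 0.8 cutoff, are assumptions chosen for illustration; the help text only shows examples near 0.5 (not strand-specific) and near 0.96 to 0.98 (strand-specific).

```python
import re

# Hypothetical helper for reading infer_experiment.py reports.
# Not part of RSeQC or the Galaxy tool definition.

# Matches lines such as:
#   Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644
FRACTION_RE = re.compile(
    r'Fraction of reads explained by "([^"]+)":\s*([0-9]+\.[0-9]+)'
)

# Patterns where the read (read1 for paired-end data) maps to the gene's strand.
SENSE = {"1++,1--,2+-,2-+", "++,--"}
# Patterns where the read (read1 for paired-end data) maps to the opposite strand.
ANTISENSE = {"1+-,1-+,2++,2--", "+-,-+"}


def parse_fractions(report):
    """Return {strand pattern: fraction} for every quoted pattern line."""
    return {m.group(1): float(m.group(2)) for m in FRACTION_RE.finditer(report)}


def classify(report, threshold=0.8):
    """Return 'sense', 'antisense', or 'unstranded'.

    threshold is an assumed cutoff, not a value defined by RSeQC.
    """
    fractions = parse_fractions(report)
    if not fractions:
        raise ValueError("no 'Fraction of reads explained by' lines found")
    sense = sum(v for k, v in fractions.items() if k in SENSE)
    antisense = sum(v for k, v in fractions.items() if k in ANTISENSE)
    if sense >= threshold:
        return "sense"
    if antisense >= threshold:
        return "antisense"
    return "unstranded"
```

Applied to the three example reports in the help text, classify returns "unstranded" for Example1 and "sense" for Example2 and Example3, matching the stated conclusions. The unquoted "other combinations" line is ignored by the pattern.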