annotate RPKM_saturation.xml @ 46:68ada7ca4cc4 draft

Fixed geneBody coverage sample names
author lparsons
date Mon, 29 Sep 2014 16:32:13 -0400
parents eb339c5849bb
children 6b33e31bda10
rev   line source
45
eb339c5849bb Reupload, toolshed removed all files of previous version.
lparsons
parents:
diff changeset
<tool id="rseqc_RPKM_saturation" name="RPKM Saturation" version="2.4">
    <description>calculates raw count and RPKM values for transcript at exon, intron, and mRNA level</description>
    <requirements>
        <requirement type="package" version="3.0.3">R</requirement>
        <requirement type="package" version="1.7.1">numpy</requirement>
        <requirement type="package" version="2.4">rseqc</requirement>
    </requirements>
    <command> RPKM_saturation.py -i $input -o output -r $refgene

        #if str($strand_type.strand_specific) == "pair"
            -d
            #if str($strand_type.pair_type) == "sd"
                '1++,1--,2+-,2-+'
            #else
                '1+-,1-+,2++,2--'
            #end if
        #end if

        #if str($strand_type.strand_specific) == "single"
            -d
            #if str($strand_type.single_type) == "s"
                '++,--'
            #else
                '+-,-+'
            #end if
        #end if

        -l $percentileFloor -u $percentileCeiling -s $percentileStep -c $rpkmCutoff

    </command>
    <stdio>
        <exit_code range="1:" level="fatal" description="An error occurred during execution, see stderr and stdout for more information" />
        <regex match="[Ee]rror" source="both" description="An error occurred during execution, see stderr and stdout for more information" />
    </stdio>
    <inputs>
        <param name="input" type="data" format="bam" label="Input BAM/SAM file" />
        <param name="refgene" type="data" format="bed" label="Reference gene model" />
        <conditional name="strand_type">
            <param name="strand_specific" type="select" label="Strand-specific?" value="none">
                <option value="none">None</option>
                <option value="pair">Pair-End RNA-seq</option>
                <option value="single">Single-End RNA-seq</option>
            </param>
            <when value="pair">
                <param name="pair_type" type="select" display="radio" label="Pair-End Read Type (format: mapped --> parent)" value="sd">
                    <option value="sd">read1 (positive --> positive; negative --> negative), read2 (positive --> negative; negative --> positive)</option>
                    <option value="ds">read1 (positive --> negative; negative --> positive), read2 (positive --> positive; negative --> negative)</option>
                </param>
            </when>
            <when value="single">
                <param name="single_type" type="select" display="radio" label="Single-End Read Type (format: mapped --> parent)" value="s">
                    <option value="s">positive --> positive; negative --> negative</option>
                    <option value="d">positive --> negative; negative --> positive</option>
                </param>
            </when>
            <when value="none"></when>
        </conditional>
        <param name="percentileFloor" type="integer" value="5" label="Begin sampling from this percentile (default=5)" />
        <param name="percentileCeiling" type="integer" value="100" label="End sampling at this percentile (default=100)" />
        <param name="percentileStep" type="integer" value="5" label="Sampling step size (default=5)" />
        <param name="rpkmCutoff" type="text" value="0.01" label="Ignore transcripts with RPKM smaller than this number (default=0.01)" />
    </inputs>
    <outputs>
        <data format="xls" name="outputxls" from_work_dir="output.eRPKM.xls" label="${tool.name} on ${on_string} (RPKM XLS)"/>
        <data format="xls" name="outputrawxls" from_work_dir="output.rawCount.xls" label="${tool.name} on ${on_string} (Raw Count XLS)"/>
        <data format="txt" name="outputr" from_work_dir="output.saturation.r" label="${tool.name} on ${on_string} (R Script)"/>
        <data format="pdf" name="outputpdf" from_work_dir="output.saturation.pdf" label="${tool.name} on ${on_string} (PDF)"/>
    </outputs>
    <help>
RPKM_saturation.py
++++++++++++++++++

The precision of any sample statistic (RPKM) is affected by sample size (sequencing depth);
'resampling' or 'jackknifing' is a method to estimate the precision of sample statistics by
using subsets of available data. This module resamples a series of subsets from the total RNA
reads and then calculates the RPKM value using each subset. This makes it possible to check
whether the current sequencing depth is saturated (that is, whether the RPKM values are stable)
in terms of gene expression estimation. If the sequencing depth is saturated, the estimated
RPKM value will be stationary or reproducible. By default, this module calculates 20
RPKM values (using 5%, 10%, ..., 95%, 100% of total reads) for each transcript.
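
The resampling idea in the paragraph above can be illustrated with a minimal Python sketch. It is not the RSeQC implementation: the function names, the toy data and the way reads are subsampled are assumptions made for illustration. Only the floor, ceiling and step defaults (5, 100, 5) and the standard definition of RPKM (reads per kilobase of transcript per million mapped reads) are taken as given.

```python
import random


def sampling_percentages(floor=5, ceiling=100, step=5):
    """Percentages of total reads to resample: floor, floor+step, ..., up to ceiling."""
    return list(range(floor, ceiling + 1, step))


def rpkm(read_count, transcript_length_bp, total_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_reads)


def resampled_rpkm(read_hits, transcript_length_bp, percentages, seed=0):
    """RPKM of one transcript for a random subset of reads at each percentage.

    read_hits holds one boolean per mapped read: True if the read falls in
    the transcript (a hypothetical, simplified representation of the input).
    """
    rng = random.Random(seed)
    values = []
    for pct in percentages:
        subset_size = len(read_hits) * pct // 100
        if subset_size == 0:
            values.append(0.0)  # too few reads to form a subset at this percentage
            continue
        subset = rng.sample(read_hits, subset_size)
        values.append(rpkm(sum(subset), transcript_length_bp, subset_size))
    return values


percentages = sampling_percentages()
# Toy data: 2,000 mapped reads, 100 of which hit a 1,500 bp transcript.
read_hits = [True] * 100 + [False] * 1900
values = resampled_rpkm(read_hits, 1500, percentages)
```

With the default settings the sketch yields 20 values per transcript, one for each resampling percentage, and the last value (100% of reads) equals the RPKM computed from all reads.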

In the output figure, the Y axis is "Percent Relative Error" or "Percent Error", which
measures how the RPKM estimated from a subset of reads (i.e. RPKMobs) deviates from the real
expression level (i.e. RPKMreal). In practice, however, RPKMreal cannot be known, so the
RPKM estimated from total reads is used as a proxy for RPKMreal.

.. image:: http://rseqc.sourceforge.net/_images/RelativeError.png
    :height: 80 px
    :width: 400 px
    :scale: 100 %
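
As a hedged illustration of the percent relative error described above, the following Python sketch computes 100 * |RPKMobs - RPKMreal| / RPKMreal, using the RPKM from total reads as the stand-in for RPKMreal. The function name is an assumption, and the numbers are the illustrative per-subset RPKM values from the R example in this help text.

```python
def percent_relative_error(rpkm_obs, rpkm_real):
    """Percent relative error of an observed RPKM against a reference RPKM."""
    if rpkm_real == 0:
        raise ValueError("reference RPKM must be non-zero")
    return 100.0 * abs(rpkm_obs - rpkm_real) / rpkm_real


# RPKM of one transcript at 5%, 10%, ..., 100% of the reads.
rpkm = [32.95, 35.43, 35.15, 36.04, 36.41, 37.76, 38.96, 38.62, 37.81, 38.14,
        37.97, 38.58, 38.59, 38.54, 38.67, 38.67, 38.87, 38.68, 38.42, 38.23]

# The last entry (100% of reads) serves as the proxy for RPKMreal.
errors = [percent_relative_error(value, rpkm[-1]) for value in rpkm]
```

The error at 100% of reads is zero by construction, so the shape of the curve at lower percentages is what indicates saturation.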

Inputs
++++++++++++++

Input BAM/SAM file
    Alignment file in BAM/SAM format.

Reference gene model
    Gene model in BED format.

Strand sequencing type (default=none)
    See Infer Experiment tool if uncertain.

Options
++++++++++++++

Begin sampling from this percentile (default=5)
    Smallest percentage of total reads to resample (passed to RPKM_saturation.py as -l).

End sampling at this percentile (default=100)
    Largest percentage of total reads to resample (passed as -u).

Sampling step size (default=5)
    Increment between successive resampling percentages (passed as -s).

Ignore transcripts with RPKM smaller than this number (default=0.01)
    Transcripts whose RPKM falls below this cutoff are ignored (passed as -c).

Output
++++++++++++++

1. output.eRPKM.xls: RPKM values for each transcript
2. output.rawCount.xls: Raw count for each transcript
3. output.saturation.r: R script to generate plot
4. output.saturation.pdf:

.. image:: http://rseqc.sourceforge.net/_images/saturation.png
    :height: 600 px
    :width: 600 px
    :scale: 80 %

- All transcripts are sorted in ascending order of expression level (RPKM), then divided into 4 groups:

  1. Q1 (0-25%): Transcripts with expression level ranked below the 25th percentile.
  2. Q2 (25-50%): Transcripts with expression level ranked between the 25th and 50th percentiles.
  3. Q3 (50-75%): Transcripts with expression level ranked between the 50th and 75th percentiles.
  4. Q4 (75-100%): Transcripts with expression level ranked above the 75th percentile.

- A BAM/SAM file containing more than 100 million alignments will make the module very slow.
- Follow the example below to visualize a particular transcript (using the R console)::

      pdf("xxx.pdf") # start the graphics device driver for producing PDF graphics
      x &lt;- seq(5,100,5) # resampling percentages (5,10,15,...,100)
      rpkm &lt;- c(32.95,35.43,35.15,36.04,36.41,37.76,38.96,38.62,37.81,38.14,37.97,38.58,38.59,38.54,38.67,38.67,38.87,38.68,38.42,38.23) # paste the RPKM values calculated from each subset
      scatter.smooth(x,100*abs(rpkm-rpkm[length(rpkm)])/(rpkm[length(rpkm)]),type="p",ylab="Percent Relative Error",xlab="Resampling Percentage")
      dev.off() # close the graphics device

.. image:: http://rseqc.sourceforge.net/_images/saturation_eg.png
    :height: 600 px
    :width: 600 px
    :scale: 80 %
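
The quartile grouping described in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code RSeQC runs: the function name, the transcript IDs and the RPKM values are hypothetical, and the handling of ties and group boundaries in RSeQC itself may differ.

```python
def quartile_groups(rpkm_by_transcript):
    """Split transcript IDs into Q1..Q4 by ascending RPKM rank."""
    ranked = sorted(rpkm_by_transcript, key=rpkm_by_transcript.get)
    n = len(ranked)
    bounds = [n * i // 4 for i in range(5)]  # rank cut points at 0, 25, 50, 75, 100 %
    return {
        "Q%d" % (i + 1): ranked[bounds[i]:bounds[i + 1]]
        for i in range(4)
    }


# Hypothetical transcript IDs and RPKM values, for illustration only.
example = {"t1": 0.5, "t2": 3.0, "t3": 12.0, "t4": 40.0,
           "t5": 1.2, "t6": 7.5, "t7": 25.0, "t8": 90.0}
groups = quartile_groups(example)
```

Each group then gets its own panel in the saturation plot, so weakly and strongly expressed transcripts can be judged separately.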

-----

About RSeQC
+++++++++++

The RSeQC_ package provides a number of useful modules that can comprehensively evaluate high-throughput sequence data, especially RNA-seq data. "Basic modules" quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while "RNA-seq specific modules" investigate sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation.

The RSeQC package is licensed under the GNU GPL v3 license.

.. image:: http://rseqc.sourceforge.net/_static/logo.png

.. _RSeQC: http://rseqc.sourceforge.net/


    </help>
</tool>