annotate infer_experiment.xml @ 29:907d4b021ff6

Uploaded
author nilesh
date Thu, 11 Jul 2013 12:31:33 -0400
parents 8dbd613bd835
children cc5eaa9376d8
rev   line source
19
8dbd613bd835 Uploaded
nilesh
parents:
diff changeset
<tool id="infer_experiment" name="Infer Experiment">
    <description>speculates how RNA-seq sequencing was configured</description>
    <requirements>
        <requirement type="package" version="2.3.7">rseqc</requirement>
    </requirements>
    <command interpreter="python"> infer_experiment.py -i $input -r $refgene

        #if $sample_size.boolean
            -s $sample_size.size
        #end if

        > $output
    </command>
    <inputs>
        <param name="input" type="data" format="bam,sam" label="Input BAM/SAM file" />
        <param name="refgene" type="data" format="bed" label="Reference gene model in bed format" />
        <conditional name="sample_size">
            <param name="boolean" type="boolean" label="Modify usable sampled reads" value="false" />
            <when value="true">
                <param name="size" type="integer" label="Number of usable sampled reads (default = 200000)" value="200000" />
            </when>
        </conditional>
    </inputs>
    <outputs>
        <data format="txt" name="output" />
    </outputs>
    <help>
.. image:: https://code.google.com/p/rseqc/logo?cct=1336721062

-----

About RSeQC
+++++++++++

The RSeQC package provides a number of useful modules that can comprehensively evaluate high-throughput sequence data, especially RNA-seq data. “Basic modules” quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while “RNA-seq specific modules” investigate the sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation.

The RSeQC package is licensed under the GNU GPL v3 license.

Inputs
++++++++++++++

Input BAM/SAM file
    Alignment file in BAM/SAM format.

Reference gene model
    Gene model in BED format.

Number of usable sampled reads (default=200000)
    Number of usable reads sampled from the SAM/BAM file. Sampling more reads gives a more accurate estimate but makes the program slightly slower.
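
As a rough illustration of how these inputs map onto the underlying command line, the sketch below assembles the argument list in Python. The flags -i, -r and -s are the ones this wrapper passes to infer_experiment.py; the helper name build_command and the file names are hypothetical and exist only for this example.

```python
def build_command(input_path, refgene_path, sample_size=None):
    """Assemble the infer_experiment.py argument list used by this wrapper.

    build_command and its parameter names are hypothetical; only the
    -i, -r and -s flags come from the wrapper's command template.
    """
    args = ["infer_experiment.py", "-i", input_path, "-r", refgene_path]
    # -s is added only when the user chooses to modify the sample size,
    # mirroring the "#if $sample_size.boolean" branch of the template.
    if sample_size is not None:
        args += ["-s", str(sample_size)]
    return args


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_command("sample.bam", "genes.bed", sample_size=200000)))
```

Leaving sample_size unset omits the -s flag, so the program falls back to its own default number of sampled reads.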

Output
++++++++++++++

This program is used to speculate how RNA-seq sequencing was configured, especially how reads were stranded for strand-specific RNA-seq data, by comparing the reads' mapping information to the underlying gene model. Generally, strand-specific RNA-seq data should be handled differently in both visualization and RPKM calculation.

For paired-end RNA-seq, there are two different ways to strand reads:

1) 1++,1--,2+-,2-+

   - read1 mapped to '+' strand indicates parental gene on '+' strand
   - read1 mapped to '-' strand indicates parental gene on '-' strand
   - read2 mapped to '+' strand indicates parental gene on '-' strand
   - read2 mapped to '-' strand indicates parental gene on '+' strand

2) 1+-,1-+,2++,2--

   - read1 mapped to '+' strand indicates parental gene on '-' strand
   - read1 mapped to '-' strand indicates parental gene on '+' strand
   - read2 mapped to '+' strand indicates parental gene on '+' strand
   - read2 mapped to '-' strand indicates parental gene on '-' strand

For single-end RNA-seq, there are also two different ways to strand reads:

1) ++,--

   - read mapped to '+' strand indicates parental gene on '+' strand
   - read mapped to '-' strand indicates parental gene on '-' strand

2) +-,-+

   - read mapped to '+' strand indicates parental gene on '-' strand
   - read mapped to '-' strand indicates parental gene on '+' strand
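
The rules above can be sketched as a small lookup in Python. This is only an illustration of the mapping between a read's strand, the gene's strand and the two configurations, not RSeQC's actual implementation; the function name explained_by and its signature are assumptions made for this example.

```python
def explained_by(read_strand, gene_strand, mate=None):
    """Return the configuration string that explains one read.

    read_strand and gene_strand are '+' or '-'; mate is 1 or 2 for
    paired-end reads and None for single-end reads. The function name and
    signature are hypothetical, not part of RSeQC.
    """
    same = read_strand == gene_strand
    if mate is None:
        return "++,--" if same else "+-,-+"
    # For read2 the relationship is mirrored: a read2 on the opposite strand
    # belongs to the same configuration as a read1 on the same strand.
    if (mate == 1) == same:
        return "1++,1--,2+-,2-+"
    return "1+-,1-+,2++,2--"


if __name__ == "__main__":
    # read2 mapped to '+' with the gene on '-' strand
    print(explained_by("+", "-", mate=2))
```

Counting how often each configuration is returned over the sampled reads gives fractions of the kind reported in the program's output.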

Example Output
++++++++++++++

**Example1** ::

    =========================================================
    This is PairEnd Data

    Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992
    Fraction of reads explained by "1+-,1-+,2++,2--": 0.5008
    Fraction of reads explained by other combinations: 0.0000
    =========================================================

*Conclusion*: We can infer that this is NOT strand-specific data because 50% of reads can be explained by "1++,1--,2+-,2-+", while the other 50% can be explained by "1+-,1-+,2++,2--".

**Example2** ::

    ============================================================
    This is PairEnd Data

    Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644
    Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356
    Fraction of reads explained by other combinations: 0.0000
    ============================================================

*Conclusion*: We can infer that this is strand-specific RNA-seq data. The strandedness of read1 is consistent with that of the gene model, while the strandedness of read2 is opposite to the strand of the reference gene model.

**Example3** ::

    =========================================================
    This is SingleEnd Data

    Fraction of reads explained by "++,--": 0.9840
    Fraction of reads explained by "+-,-+": 0.0160
    Fraction of reads explained by other combinations: 0.0000
    =========================================================

*Conclusion*: This is single-end, strand-specific RNA-seq data. The strandedness of the reads is concordant with the strandedness of the reference gene.
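
The reasoning in these conclusions can be approximated in Python, as a minimal sketch. It assumes the report has the layout shown in the examples above; the function names and the 0.8 cutoff are assumptions chosen for illustration and are not defined by RSeQC.

```python
import re


def parse_fractions(report_text):
    """Extract a {label: fraction} mapping from the report text."""
    pattern = re.compile(r'Fraction of reads explained by (.+?): ([0-9.]+)')
    fractions = {}
    for label, value in pattern.findall(report_text):
        fractions[label.strip('"')] = float(value)
    return fractions


def infer_strandedness(fractions, threshold=0.8):
    """Return the dominant configuration, or None if neither dominates.

    The threshold of 0.8 is an arbitrary assumption for this sketch.
    """
    candidates = {k: v for k, v in fractions.items() if k != "other combinations"}
    best_label, best_value = max(candidates.items(), key=lambda item: item[1])
    if best_value >= threshold:
        return best_label
    return None


if __name__ == "__main__":
    report = (
        'This is PairEnd Data\n'
        'Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644\n'
        'Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356\n'
        'Fraction of reads explained by other combinations: 0.0000\n'
    )
    print(infer_strandedness(parse_fractions(report)))
```

A result of None corresponds to a case like Example1, where the two configurations explain roughly equal shares of the reads and the data is likely not strand-specific.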
    </help>
</tool>