comparison RPKM_count.xml @ 31:cc5eaa9376d8

Lance's updates
author nilesh
date Wed, 02 Oct 2013 02:20:04 -0400
parents 907d4b021ff6
children 580ee0c4bc4e
30:b5d2f575ccb6 31:cc5eaa9376d8
1 <tool id="RPKM_count" name="RPKM Count"> 1 <tool id="RPKM_count" name="RPKM Count" version="1.1">
2 <description>calculates raw count and RPKM values for transcript at exon, intron, and mRNA level</description> 2 <description>calculates raw count and RPKM values for transcript at exon, intron, and mRNA level</description>
3 <requirements> 3 <requirements>
4 <requirement type="package" version="0.1.18">samtools</requirement> 4 <requirement type="package" version="1.7.1">numpy</requirement>
5 <requirement type="package" version="2.3.7">rseqc</requirement> 5 <requirement type="package" version="2.3.7">rseqc</requirement>
6 </requirements> 6 </requirements>
7 <command interpreter="python"> samtoolshelper.py RPKM_count.py -i $input -o output -r $refgene 7 <command>
8 ln -s "${input}" "local_input.bam" &amp;&amp;
9 ln -s "${input.metadata.bam_index}" "local_input.bam.bai" &amp;&amp;
10 RPKM_count.py -i "local_input.bam" -o output -r $refgene
8 11
9 #if $nx
10 -x
11 #end if
12
13 #if str($strand_type.strand_specific) == "pair" 12 #if str($strand_type.strand_specific) == "pair"
14 -d 13 -d
15 #if str($strand_type.pair_type) == "sd" 14 #if str($strand_type.pair_type) == "sd"
16 '1++,1--,2+-,2-+' 15 '1++,1--,2+-,2-+'
17 #else 16 #else
64 <param name="onlyexonic" type="boolean" value="false" label="Only use exonic (UTR exons and CDS exons) reads, otherwise use all reads" /> 63 <param name="onlyexonic" type="boolean" value="false" label="Only use exonic (UTR exons and CDS exons) reads, otherwise use all reads" />
65 </inputs> 64 </inputs>
66 <outputs> 65 <outputs>
67 <data format="xls" name="outputxls" from_work_dir="output_read_count.xls"/> 66 <data format="xls" name="outputxls" from_work_dir="output_read_count.xls"/>
68 </outputs> 67 </outputs>
68 <stdio>
69 <exit_code range="1:" level="fatal" description="An error occured during execution, see stderr and stdout for more information" />
70 <regex match="[Ee]rror" source="both" description="An error occured during execution, see stderr and stdout for more information" />
71 </stdio>
69 <help> 72 <help>
70 .. image:: https://code.google.com/p/rseqc/logo?cct=1336721062 73 RPKM_count.py
74 +++++++++++++
71 75
72 ----- 76 Given a BAM file and reference gene model, this program will calculate the raw count and RPKM
73 77 values for transcript at exon, intron and mRNA level. For strand specific RNA-seq data,
74 About RSeQC 78 program will assign read to its parental gene according to strand rule, if you don't know the
75 +++++++++++ 79 strand rule, run infer_experiment.py. Please note that chromosome ID, genome cooridinates
76 80 should be concordant between BAM and BED files.
77 The RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. “Basic modules” quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while “RNA-seq specific modules” investigate sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation.
78
79 The RSeQC package is licensed under the GNU GPL v3 license.
80 81
81 Inputs 82 Inputs
82 ++++++++++++++ 83 ++++++++++++++
83 84
84 Input BAM/SAM file 85 Input BAM/SAM file
100 Renders program only used exonic (UTR exons and CDS exons) reads, otherwise use all reads. 101 Renders program only used exonic (UTR exons and CDS exons) reads, otherwise use all reads.
101 102
102 Sample Output 103 Sample Output
103 ++++++++++++++ 104 ++++++++++++++
104 105
105 ===== ===== === ========= ===== =========== ============= ============= ======== ======== 106 ===== ======== ======== ===================== ===== =========== ============= ============= ======== =========
106 chrom start end accession score gene strand tag count (+) tag count (-) RPKM (+) RPKM (-) 107 chrom start end accession score gene strand tag count (+) tag count (-) RPKM (+) RPKM (-)
107 ===== ===== === ========= ===== =========== ============= ============= ======== ======== 108 ===== ======== ======== ===================== ===== =========== ============= ============= ======== =========
108 chr1 29213722 29313959 NM_001166007_intron_1 0 + 431 4329 0.086 0.863 109 chr1 29213722 29313959 NM_001166007_intron_1 0 '+' 431 4329 0.086 0.863
109 chr1 29314417 29319841 NM_001166007_intron_2 0 + 31 1 0.114 0.004 110 chr1 29314417 29319841 NM_001166007_intron_2 0 '+' 31 1 0.114 0.004
110 chr1 29320054 29323726 NM_001166007_intron_3 0 + 32 0 0.174 0 111 chr1 29320054 29323726 NM_001166007_intron_3 0 '+' 32 0 0.174 0.000
111 chr1 29323831 29338376 NM_001166007_intron_4 0 + 33 2 0.045 0.003 112 chr1 29213602 29213722 NM_001166007_exon_1 0 '+' 164 0 27.321 0.000
112 chr1 29338419 29342203 NM_001166007_intron_5 0 + 7 0 0.037 0 113 chr1 29313959 29314417 NM_001166007_exon_2 0 '+' 1699 4 74.158 0.175
113 chr1 29342279 29344735 NM_001166007_intron_6 0 + 35 4 0.285 0.033 114 chr1 29319841 29320054 NM_001166007_exon_3 0 '+' 528 1 49.554 0.094
114 chr1 29344954 29356911 NM_001166007_intron_7 0 + 34 2 0.057 0.003 115 ===== ======== ======== ===================== ===== =========== ============= ============= ======== =========
115 chr1 29356999 29359604 NM_001166007_intron_8 0 + 19 1 0.146 0.008
116 chr1 29359757 29362337 NM_001166007_intron_9 0 + 31 0 0.24 0
117 chr1 29362435 29365765 NM_001166007_intron_10 0 + 11 1 0.066 0.006
118 chr1 29365938 29379615 NM_001166007_intron_11 0 + 63 0 0.092 0
119 chr1 29379824 29391493 NM_001166007_intron_12 0 + 383 8 0.656 0.014
120 chr1 29391670 29424318 NM_001166007_intron_13 0 + 817 10 0.5 0.006
121 chr1 29424447 29435847 NM_001166007_intron_14 0 + 28 0 0.049 0
122 chr1 29435949 29438879 NM_001166007_intron_15 0 + 12 0 0.082 0
123 chr1 29438960 29442210 NM_001166007_intron_16 0 + 22 2 0.135 0.012
124 chr1 29442315 29443330 NM_001166007_intron_17 0 + 9 0 0.177 0
125 chr1 29213602 29213722 NM_001166007_exon_1 0 + 164 0 27.321 0
126 chr1 29313959 29314417 NM_001166007_exon_2 0 + 1699 4 74.158 0.175
127 chr1 29319841 29320054 NM_001166007_exon_3 0 + 528 1 49.554 0.094
128 chr1 29323726 29323831 NM_001166007_exon_4 0 + 168 0 31.985 0
129 chr1 29338376 29338419 NM_001166007_exon_5 0 + 88 0 40.911 0
130 chr1 29342203 29342279 NM_001166007_exon_6 0 + 114 3 29.986 0.789
131 chr1 29344735 29344954 NM_001166007_exon_7 0 + 290 10 26.472 0.913
132 chr1 29356911 29356999 NM_001166007_exon_8 0 + 146 1 33.166 0.227
133 chr1 29359604 29359757 NM_001166007_exon_9 0 + 404 11 52.786 1.437
134 chr1 29362337 29362435 NM_001166007_exon_10 0 + 85 7 17.339 1.428
135 chr1 29365765 29365938 NM_001166007_exon_11 0 + 198 2 22.88 0.231
136 chr1 29379615 29379824 NM_001166007_exon_12 0 + 306 5 29.269 0.478
137 chr1 29391493 29391670 NM_001166007_exon_13 0 + 243 7 27.445 0.791
138 chr1 29424318 29424447 NM_001166007_exon_14 0 + 298 7 46.18 1.085
139 chr1 29435847 29435949 NM_001166007_exon_15 0 + 396 8 77.611 1.568
140 chr1 29438879 29438960 NM_001166007_exon_16 0 + 307 0 75.767 0
141 chr1 29442210 29442315 NM_001166007_exon_17 0 + 138 0 26.273 0
142 chr1 29443330 29446558 NM_001166007_exon_18 0 + 2434 84 15.074 0.52
143 chr1 29213602 29446558 NM_001166007_mRNA 0 + 8006 150 27.704 0.519
144 ===== ===== === ========= ===== =========== ============= ============= ======== ========
145 116
117 -----
118
119 About RSeQC
120 +++++++++++
121
122 The RSeQC_ package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. "Basic modules" quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while "RNA-seq specific modules" investigate sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation.
123
124 The RSeQC package is licensed under the GNU GPL v3 license.
125
126 .. image:: http://rseqc.sourceforge.net/_static/logo.png
127
128 .. _RSeQC: http://rseqc.sourceforge.net/
129
130
146 </help> 131 </help>
147 </tool> 132 </tool>
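
Note on the RPKM columns: the help text says the tool reports raw count and RPKM per exon, intron and mRNA region. As a rough illustration of how the two relate, the sketch below applies the standard RPKM definition (reads per kilobase of region per million mapped reads) to one row of the sample output. The function name `rpkm` and the library size of 50,000,000 reads are assumptions made for this example. The changeset does not state the library size, and it does not say which reads RPKM_count.py uses as the denominator (for instance, whether the "onlyexonic" option changes it).

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of region per million mapped reads.

    Hypothetical helper for illustration; not taken from RPKM_count.py.
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Coordinates and count for NM_001166007_exon_1 come from the sample output.
    start, end, plus_strand_count = 29213602, 29213722, 164
    # ASSUMPTION: library size chosen for illustration only.
    total_mapped_reads = 50000000
    value = rpkm(plus_strand_count, end - start, total_mapped_reads)
    print("RPKM (+): %.3f" % value)
```

Because the value scales inversely with the library size, the sample table's figures can only be reproduced exactly with the real read total from the original BAM file.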