annotate read_duplication.xml @ 31:cc5eaa9376d8

Lance's updates
author nilesh
date Wed, 02 Oct 2013 02:20:04 -0400
parents 80f857718ca0
children 580ee0c4bc4e
<tool id="read_duplication" name="Read Duplication" version="1.1">
    <description>determines read duplication rate with sequence-based and mapping-based strategies</description>
    <requirements>
        <requirement type="package" version="2.11.0">R</requirement>
        <requirement type="package" version="1.7.1">numpy</requirement>
        <requirement type="package" version="2.3.7">rseqc</requirement>
    </requirements>
    <command> read_duplication.py -i $input -o output -u $upLimit
    </command>
    <inputs>
        <param name="input" type="data" format="bam,sam" label="input bam/sam file" />
        <param name="upLimit" type="integer" label="Upper Limit of Plotted Duplicated Times (default=500)" value="500" />
    </inputs>
    <outputs>
        <data format="xls" name="outputxls" from_work_dir="output.dup.pos.DupRate.xls" label="${tool.name} on ${on_string} (Position XLS)"/>
        <data format="xls" name="outputseqxls" from_work_dir="output.dup.seq.DupRate.xls" label="${tool.name} on ${on_string} (Sequence XLS)"/>
        <data format="r" name="outputr" from_work_dir="output.DupRate_plot.r" label="${tool.name} on ${on_string} (R Script)" />
        <data format="pdf" name="outputpdf" from_work_dir="output.DupRate_plot.pdf" label="${tool.name} on ${on_string} (PDF)" />
    </outputs>
    <stdio>
        <exit_code range="1:" level="fatal" description="An error occurred during execution, see stderr and stdout for more information" />
        <regex match="[Ee]rror" source="both" description="An error occurred during execution, see stderr and stdout for more information" />
    </stdio>
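    <!-- A minimal sketch of a functional test for this wrapper, assuming
         Galaxy's standard tests syntax. The file names used here (example.bam
         and the expected.* files) are hypothetical placeholders, not files
         shipped with this tool; they would have to exist in the tool's
         test-data directory. Parameter and output names are taken from the
         inputs and outputs sections of this file. -->
    <tests>
        
    </tests>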
    <help>
read_duplication.py
+++++++++++++++++++

Two strategies are used to determine the read duplication rate:

* Sequence based: reads with exactly the same sequence content are regarded as duplicated reads.
* Mapping based: reads mapped to the same genomic location are regarded as duplicated reads. Spliced reads are regarded as duplicated if they map to the same starting position and are spliced the same way.

Inputs
++++++++++++++

Input BAM/SAM file
    Alignment file in BAM/SAM format.

Upper Limit of Plotted Duplicated Times (default=500)
    Only used for plotting.

Output
++++++++++++++

1. output.dup.pos.DupRate.xls: Read duplication rate determined from the mapping position of each read. The first column is the "occurrence" (duplication times); the second column is the number of uniquely mapped reads.
2. output.dup.seq.DupRate.xls: Read duplication rate determined from the sequence of each read. The first column is the "occurrence" (duplication times); the second column is the number of uniquely mapped reads.
3. output.DupRate_plot.r: R script that generates the PDF file.
4. output.DupRate_plot.pdf: Graphical output generated from the R script.

.. image:: http://rseqc.sourceforge.net/_images/duplicate.png
   :height: 600 px
   :width: 600 px
   :scale: 80 %

-----

About RSeQC
+++++++++++

The RSeQC_ package provides a number of useful modules that can comprehensively evaluate high-throughput sequence data, especially RNA-seq data. "Basic modules" quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while "RNA-seq specific modules" investigate sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation.

The RSeQC package is licensed under the GNU GPL v3.

.. image:: http://rseqc.sourceforge.net/_static/logo.png

.. _RSeQC: http://rseqc.sourceforge.net/

    </help>
</tool>