Mercurial > repos > nilesh > rseqc
view infer_experiment.xml @ 54:5873cd7afb67 draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/rseqc commit 62a9135bf04aec398d3172d17ccd60f5242d8e82
author | iuc |
---|---|
date | Wed, 13 Jun 2018 18:02:25 -0400 |
parents | 09846d5169fa |
children | dbedfc5f5a3c |
line wrap: on
line source
<tool id="rseqc_infer_experiment" name="Infer Experiment" version="@WRAPPER_VERSION@.1"> <description>speculates how RNA-seq were configured</description> <macros> <import>rseqc_macros.xml</import> </macros> <expand macro="requirements" /> <expand macro="stdio" /> <version_command><![CDATA[infer_experiment.py --version]]></version_command> <command><![CDATA[ infer_experiment.py -i '${input}' -r '${refgene}' --sample-size ${sample_size} --mapq ${mapq} > '${output}' ]]> </command> <inputs> <expand macro="bam_param" /> <expand macro="refgene_param" /> <expand macro="sample_size_param" /> <expand macro="mapq_param" /> </inputs> <outputs> <data format="txt" name="output" /> </outputs> <tests> <test> <param name="input" value="pairend_strandspecific_51mer_hg19_chr1_1-100000.bam"/> <param name="refgene" value="hg19_RefSeq_chr1_1-100000.bed"/> <output name="output" file="output.infer_experiment.txt"/> </test> </tests> <help><![CDATA[ infer_experiment.py +++++++++++++++++++ This program is used to speculate how RNA-seq sequencing were configured, especially how reads were stranded for strand-specific RNA-seq data, through comparing reads' mapping information to the underneath gene model. Inputs ++++++++++++++ Input BAM/SAM file Alignment file in BAM/SAM format. Reference gene model Gene model in BED format. Number of usable sampled reads (default=200000) Number of usable reads sampled from SAM/BAM file. More reads will give more accurate estimation, but make program little slower. Outputs +++++++ For pair-end RNA-seq, there are two different ways to strand reads (such as Illumina ScriptSeq protocol): 1. 1++,1--,2+-,2-+ * read1 mapped to '+' strand indicates parental gene on '+' strand * read1 mapped to '-' strand indicates parental gene on '-' strand * read2 mapped to '+' strand indicates parental gene on '-' strand * read2 mapped to '-' strand indicates parental gene on '+' strand 2. 1+-,1-+,2++,2-- * read1 mapped to '+' strand indicates parental gene on '-' strand * read1 mapped to '-' strand indicates parental gene on '+' strand * read2 mapped to '+' strand indicates parental gene on '+' strand * read2 mapped to '-' strand indicates parental gene on '-' strand For single-end RNA-seq, there are also two different ways to strand reads: 1. ++,-- * read mapped to '+' strand indicates parental gene on '+' strand * read mapped to '-' strand indicates parental gene on '-' strand 2. +-,-+ * read mapped to '+' strand indicates parental gene on '-' strand * read mapped to '-' strand indicates parental gene on '+' strand Example Output ++++++++++++++ **Example1** :: ========================================================= This is PairEnd Data :: Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992 Fraction of reads explained by "1+-,1-+,2++,2--": 0.5008 Fraction of reads explained by other combinations: 0.0000 ========================================================= *Conclusion*: We can infer that this is NOT a strand specific because 50% of reads can be explained by "1++,1--,2+-,2-+", while the other 50% can be explained by "1+-,1-+,2++,2--". **Example2** :: ============================================================ This is PairEnd Data Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 :: Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356 Fraction of reads explained by other combinations: 0.0000 ============================================================ *Conclusion*: We can infer that this is a strand-specific RNA-seq data. strandness of read1 is consistent with that of gene model, while strandness of read2 is opposite to the strand of reference gene model. **Example3** :: ========================================================= This is SingleEnd Data :: Fraction of reads explained by "++,--": 0.9840 :: Fraction of reads explained by "+-,-+": 0.0160 Fraction of reads explained by other combinations: 0.0000 ========================================================= *Conclusion*: This is single-end, strand specific RNA-seq data. Strandness of reads are concordant with strandness of reference gene. @ABOUT@ ]]> </help> <expand macro="citations" /> </tool>