Mercurial > repos > nilesh > rseqc
view infer_experiment.xml @ 63:27e16a30667a draft default tip
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/rseqc commit d7544582d5599c67a284faf9232cd2ccc4daa1de
author | iuc |
---|---|
date | Tue, 09 Apr 2024 11:24:55 +0000 |
parents | 5968573462fa |
children |
line wrap: on
line source
<tool id="rseqc_infer_experiment" name="Infer Experiment" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@GALAXY_VERSION@"> <description>speculates how RNA-seq were configured</description> <macros> <import>rseqc_macros.xml</import> </macros> <expand macro="bio_tools"/> <expand macro="requirements"/> <expand macro="stdio"/> <version_command><![CDATA[infer_experiment.py --version]]></version_command> <command><![CDATA[ @BAM_SAM_INPUTS@ infer_experiment.py -i 'input.${extension}' -r '${refgene}' --sample-size ${sample_size} --mapq ${mapq} > '${output}' ]]> </command> <inputs> <expand macro="bam_param"/> <expand macro="refgene_param"/> <expand macro="sample_size_param"/> <expand macro="mapq_param"/> </inputs> <outputs> <data format="txt" name="output" label="${tool.name} on ${on_string}: RNA-seq experiment configuration"/> </outputs> <tests> <test expect_num_outputs="1"> <param name="input" value="pairend_strandspecific_51mer_hg19_chr1_1-100000.bam"/> <param name="refgene" value="hg19_RefSeq_chr1_1-100000.bed" ftype="bed12"/> <output name="output" file="output.infer_experiment.txt"/> </test> </tests> <help><![CDATA[ infer_experiment.py +++++++++++++++++++ This program is used to speculate how RNA-seq sequencing were configured, especially how reads were stranded for strand-specific RNA-seq data, through comparing reads' mapping information to the underneath gene model. Inputs ++++++++++++++ Input BAM/SAM file Alignment file in BAM/SAM format. Reference gene model Gene model in BED format. Number of usable sampled reads (default=200000) Number of usable reads sampled from SAM/BAM file. More reads will give more accurate estimation, but make program little slower. Outputs +++++++ For pair-end RNA-seq, there are two different ways to strand reads (such as Illumina ScriptSeq protocol): 1. 1++,1--,2+-,2-+ * read1 mapped to '+' strand indicates parental gene on '+' strand * read1 mapped to '-' strand indicates parental gene on '-' strand * read2 mapped to '+' strand indicates parental gene on '-' strand * read2 mapped to '-' strand indicates parental gene on '+' strand 2. 1+-,1-+,2++,2-- * read1 mapped to '+' strand indicates parental gene on '-' strand * read1 mapped to '-' strand indicates parental gene on '+' strand * read2 mapped to '+' strand indicates parental gene on '+' strand * read2 mapped to '-' strand indicates parental gene on '-' strand For single-end RNA-seq, there are also two different ways to strand reads: 1. ++,-- * read mapped to '+' strand indicates parental gene on '+' strand * read mapped to '-' strand indicates parental gene on '-' strand 2. +-,-+ * read mapped to '+' strand indicates parental gene on '-' strand * read mapped to '-' strand indicates parental gene on '+' strand Example Output ++++++++++++++ **Example1** :: ========================================================= This is PairEnd Data :: Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992 Fraction of reads explained by "1+-,1-+,2++,2--": 0.5008 Fraction of reads explained by other combinations: 0.0000 ========================================================= *Conclusion*: We can infer that this is NOT a strand specific because 50% of reads can be explained by "1++,1--,2+-,2-+", while the other 50% can be explained by "1+-,1-+,2++,2--". **Example2** :: ============================================================ This is PairEnd Data Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 :: Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356 Fraction of reads explained by other combinations: 0.0000 ============================================================ *Conclusion*: We can infer that this is a strand-specific RNA-seq data. strandness of read1 is consistent with that of gene model, while strandness of read2 is opposite to the strand of reference gene model. **Example3** :: ========================================================= This is SingleEnd Data :: Fraction of reads explained by "++,--": 0.9840 :: Fraction of reads explained by "+-,-+": 0.0160 Fraction of reads explained by other combinations: 0.0000 ========================================================= *Conclusion*: This is single-end, strand specific RNA-seq data. Strandness of reads are concordant with strandness of reference gene. @ABOUT@ ]]> </help> <expand macro="citations"/> </tool>