view deseq-hts_1.0/galaxy/deseq.xml @ 9:e27b4f7811c2 draft

Updated DESeq version 1.12
author vipints <vipin@cbio.mskcc.org>
date Tue, 08 Oct 2013 08:09:28 -0400
parents 8ab01cc29c4b
children
line wrap: on
line source

<tool id="deseq-hts" name="DESeq" version="1.12.1">
  <description> Determines differentially expressed transcripts from read alignments</description>
  <requirements>
	<requirement type="package" version="0.1">oqtans</requirement>
  </requirements>
  <command interpreter="bash"> 
./../src/deseq-hts.sh $anno_input_selected $deseq_out $deseq_out.extra_files_path/gene_map.mat 
#for $i in $replicate_groups
#for $j in $i.replicates
$j.bam_alignment:#slurp
#end for

#end for
    >> $Log_File </command>
  <inputs>
	<param format="gff3" name="anno_input_selected" type="data" label="Genome annotation in GFF3 file" help="A tab delimited format for storing sequence features and annotations"/>
   <repeat name="replicate_groups" title="Replicate group" min="2">
     <repeat name="replicates" title="Replicate">
      <param format="bam" name="bam_alignment" type="data" label="BAM alignment file" help="BAM alignment file. Can be generated from SAM files using the SAM Tools."/>
     </repeat>
   </repeat>
  </inputs>

  <outputs>
    <data format="txt" name="deseq_out" label="${tool.name} on ${on_string}: Differential Expression"/>
    <data format="txt" name="Log_File" label="${tool.name} on ${on_string}: log"/>
  </outputs>

  <tests>
    <test> 
	command:
	./deseq-hts.sh ../test_data/deseq_c_elegans_WS200-I-regions.gff3 ../test_data/deseq_c_elegans_WS200-I-regions_deseq.txt ../test_data/genes.mat ../test_data/deseq_c_elegans_WS200-I-regions-SRX001872.bam ../test_data/deseq_c_elegans_WS200-I-regions-SRX001875.bam
	
      <param name="anno_input_selected" value="deseq_c_elegans_WS200-I-regions.gff3" ftype="gff3" />
      <param name="bam_alignments1" value="deseq_c_elegans_WS200-I-regions-SRX001872.bam" ftype="bam" />
      <param name="bam_alignments2" value="deseq_c_elegans_WS200-I-regions-SRX001875.bam" ftype="bam" />
      <output name="deseq_out" file="deseq_c_elegans_WS200-I-regions_deseq.txt" />
    </test>
  </tests> 

  <help>

.. class:: infomark

**What it does** 

DESeq_ is a tool for differential expression testing of RNA-Seq data.

.. _DESeq: http://bioconductor.org/packages/release/bioc/html/DESeq.html

`DESeq` requires:

Genome annotation file in GFF3, containing the necessary information about the transcripts that are to be quantified.

The BAM alignment files grouped into replicate groups, each containing several replicates. BAM files store the read alignments, The program will also work with only two groups containing only a single replicate each. However, this analysis has less statistical power and is therefore not recommended!

------

**Licenses**

If **DESeq** is used to obtain results for scientific publications it
should be cited as [1]_.

**References** 

.. [1] Anders, S and Huber, W (2010): `Differential expression analysis for sequence count data`_. 

.. _Differential expression analysis for sequence count data: http://dx.doi.org/10.1186/gb-2010-11-10-r106

------

.. class:: infomark

**About formats**


**GFF3 format** General Feature Format is a format for describing genes
and other features associated with DNA, RNA and protein
sequences. GFF3 lines have nine tab-separated fields:

1. seqid - The name of a chromosome or scaffold.
2. source - The program that generated this feature.
3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon". 
4. start - The starting position of the feature in the sequence. The first base is numbered 1.
5. stop - The ending position of the feature (inclusive).
6. score - A score between 0 and 1000. If there is no score value, enter ".".
7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
8. phase - If the feature is a coding exon, frame should be a number between 0-2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
9. attributes - All lines with the same group are linked together into a single item.

For more information see http://www.sequenceontology.org/gff3.shtml

**SAM/BAM format** The Sequence Alignment/Map (SAM) format is a
tab-limited text format that stores large nucleotide sequence
alignments. BAM is the binary version of a SAM file that allows for
fast and intensive data processing. The format specification and the
description of SAMtools can be found on
http://samtools.sourceforge.net/.

------

DESeq-hts Wrapper Version 0.5 (Aug 2013)

</help>
</tool>