+ − 1 <tool id="deseq-hts" name="DESeq" version="1.12.1">
+ − 2 <description>Determines differentially expressed transcripts from read alignments</description>
+ − 3 <requirements>
+ − 4 <requirement type="package" version="0.1">oqtans</requirement>
+ − 5 </requirements>
+ − 6 <command interpreter="bash">
+ − 7 ./../src/deseq-hts.sh $anno_input_selected $deseq_out $deseq_out.extra_files_path/gene_map.mat
+ − 8 #for $i in $replicate_groups
+ − 9 #for $j in $i.replicates
+ − 10 $j.bam_alignment:#slurp
+ − 11 #end for
+ − 12
+ − 13 #end for
+ − 14 >> $Log_File </command>
+ − 15 <inputs>
+ − 16 <param format="gff3" name="anno_input_selected" type="data" label="Genome annotation in GFF3 file" help="A tab-delimited format for storing sequence features and annotations"/>
+ − 17 <repeat name="replicate_groups" title="Replicate group" min="2">
+ − 18 <repeat name="replicates" title="Replicate">
+ − 19 <param format="bam" name="bam_alignment" type="data" label="BAM alignment file" help="BAM alignment file. Can be generated from SAM files using SAMtools."/>
+ − 20 </repeat>
+ − 21 </repeat>
+ − 22 </inputs>
+ − 23
+ − 24 <outputs>
+ − 25 <data format="txt" name="deseq_out" label="${tool.name} on ${on_string}: Differential Expression"/>
+ − 26 <data format="txt" name="Log_File" label="${tool.name} on ${on_string}: log"/>
+ − 27 </outputs>
+ − 28
+ − 29 <tests>
+ − 30 <test>
+ − 31 command:
+ − 32 ./deseq-hts.sh ../test_data/deseq_c_elegans_WS200-I-regions.gff3 ../test_data/deseq_c_elegans_WS200-I-regions_deseq.txt ../test_data/genes.mat ../test_data/deseq_c_elegans_WS200-I-regions-SRX001872.bam ../test_data/deseq_c_elegans_WS200-I-regions-SRX001875.bam
+ − 33
+ − 34 <param name="anno_input_selected" value="deseq_c_elegans_WS200-I-regions.gff3" ftype="gff3" />
+ − 35 <param name="bam_alignments1" value="deseq_c_elegans_WS200-I-regions-SRX001872.bam" ftype="bam" />
+ − 36 <param name="bam_alignments2" value="deseq_c_elegans_WS200-I-regions-SRX001875.bam" ftype="bam" />
+ − 37 <output name="deseq_out" file="deseq_c_elegans_WS200-I-regions_deseq.txt" />
+ − 38 </test>
+ − 39 </tests>
+ − 40
+ − 41 <help>
+ − 42
+ − 43 .. class:: infomark
+ − 44
+ − 45 **What it does**
+ − 46
+ − 47 DESeq_ is a tool for differential expression testing of RNA-Seq data.
+ − 48
+ − 49 .. _DESeq: http://bioconductor.org/packages/release/bioc/html/DESeq.html
+ − 50
+ − 51 `DESeq` requires:
+ − 52
+ − 53 A genome annotation file in GFF3 format, containing the necessary information about the transcripts that are to be quantified.
+ − 54
+ − 55 BAM alignment files, grouped into replicate groups that each contain several replicates. BAM files store the read alignments. The program also works with only two groups of a single replicate each. However, such an analysis has less statistical power and is therefore not recommended.
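
As a rough illustration of how the replicate groups reach the wrapper script, the sketch below mirrors the nested loops of the command template: every BAM path is followed by a colon, and each group becomes one whitespace-separated argument. The function name `build_group_args` and the file names are hypothetical and serve only as an example. How the wrapper script itself interprets these arguments is not shown here.

```python
def build_group_args(replicate_groups):
    """Join BAM paths the way the command template appears to.

    Each path is followed by ':' (including the last one in a group),
    and one argument string is produced per replicate group.
    """
    return ["".join(path + ":" for path in group) for group in replicate_groups]


# Hypothetical file names: two groups with two replicates each.
groups = [["ctrl_1.bam", "ctrl_2.bam"], ["treat_1.bam", "treat_2.bam"]]
print(" ".join(build_group_args(groups)))
```

With at least two replicates per group, as recommended above, each argument lists several colon-separated BAM files.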
+ − 56
+ − 57 ------
+ − 58
+ − 59 **Licenses**
+ − 60
+ − 61 If **DESeq** is used to obtain results for scientific publications, it
+ − 62 should be cited as [1]_.
+ − 63
+ − 64 **References**
+ − 65
+ − 66 .. [1] Anders, S and Huber, W (2010): `Differential expression analysis for sequence count data`_.
+ − 67
+ − 68 .. _Differential expression analysis for sequence count data: http://dx.doi.org/10.1186/gb-2010-11-10-r106
+ − 69
+ − 70 ------
+ − 71
+ − 72 .. class:: infomark
+ − 73
+ − 74 **About formats**
+ − 75
+ − 76
+ − 77 **GFF3 format** General Feature Format is a format for describing genes
+ − 78 and other features associated with DNA, RNA and protein
+ − 79 sequences. GFF3 lines have nine tab-separated fields:
+ − 80
+ − 81 1. seqid - The name of a chromosome or scaffold.
+ − 82 2. source - The program that generated this feature.
+ − 83 3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
+ − 84 4. start - The starting position of the feature in the sequence. The first base is numbered 1.
+ − 85 5. end - The ending position of the feature (inclusive).
+ − 86 6. score - The score of the feature, a floating point number. If there is no score value, enter ".".
+ − 87 7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
+ − 88 8. phase - If the feature is a CDS, phase should be 0, 1, or 2, giving the number of bases to skip at the start of the feature to reach the first base of the next codon. For other features, the value should be '.'.
+ − 89 9. attributes - A semicolon-separated list of tag=value pairs, such as ID and Parent, which link related features (for example, exons to their transcript).
+ − 90
+ − 91 For more information see http://www.sequenceontology.org/gff3.shtml
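
To make the nine-column layout concrete, here is a minimal sketch that splits one GFF3 feature line into named fields. The function name `parse_gff3_line` and the example line are made up for illustration and are not taken from the test data. The fifth column is named `end`, following the GFF3 specification. Percent-encoded characters and multi-valued attributes are not handled.

```python
GFF3_FIELDS = ["seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes"]


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict keyed by column name."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        raise ValueError("expected 9 tab-separated fields, got %d" % len(columns))
    record = dict(zip(GFF3_FIELDS, columns))
    # Coordinates are 1-based and inclusive.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attributes are semicolon-separated tag=value pairs.
    record["attributes"] = dict(
        pair.split("=", 1) for pair in record["attributes"].split(";") if pair
    )
    return record


# Made-up example line with hypothetical identifiers.
example = "I\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=abc"
print(parse_gff3_line(example))
```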
+ − 92
+ − 93 **SAM/BAM format** The Sequence Alignment/Map (SAM) format is a
+ − 94 tab-delimited text format that stores large nucleotide sequence
+ − 95 alignments. BAM is the binary version of a SAM file that allows for
+ − 96 fast and intensive data processing. The format specification and the
+ − 97 description of SAMtools can be found on
+ − 98 http://samtools.sourceforge.net/.
+ − 99
+ − 100 ------
+ − 101
+ − 102 DESeq-hts Wrapper Version 0.5 (Aug 2013)
+ − 103
+ − 104 </help>
+ − 105 </tool>