comparison ribotaper_part3_main.xml @ 0:93b90466d533 draft
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/rna_tools/ribotaper/ commit a3232e388d52097083f2662ccb26351fdc2f2412-dirty
author | rnateam |
---|---|
date | Tue, 07 Jun 2016 17:49:46 -0400 |
parents | |
children | a56343c142d5 |
-1:000000000000 | 0:93b90466d533 |
---|---|
1 <tool id="ribotaper_ribosome_profiling" name="ribotaper part 3: ribosome profiling" version="0.1.0"> | |
2 <requirements> | |
3 <requirement type="package" version="1.3.1">ribotaper</requirement> | |
4 </requirements> | |
5 <stdio> | |
6 <exit_code range="1:" /> | |
7 </stdio> | |
8 | |
9 <command><![CDATA[ | |
10 tar | |
11 "xzvf" | |
12 "$annotation_path" | |
13 | |
14 && | |
15 | |
16 Ribotaper.sh | |
17 "$ribo_bam" | |
18 "$rna_bam" | |
19 "annotation_path" | |
20 "$read_lenghts_ribo1,$read_lenghts_ribo2,$read_lenghts_ribo3" | |
21 "$cutoff1,$cutoff2,$cutoff3" | |
22 "\${GALAXY_SLOTS:-12}" | |
23 | |
24 ]]></command> | |
25 <inputs> | |
26 <param name="annotation_path" type="data" format="compressed_archive" label="annotation_path" help="Please run 'ribotaper part 1' to generate the archive."/> | |
27 <param name="ribo_bam" type="data" format="BAM" label="ribo_bam" help="Ribo-seq alignment file in BAM format."/> | |
28 <param name="rna_bam" type="data" format="BAM" label="rna_bam" help="RNA-seq alignment file in BAM format."/> | |
29 <param name="read_lenghts_ribo1" type="text" value="26" label="Read length 1" help="Read length 1, which is used for P-site calculation. Default is '26', but it varies widely between datasets. Please run 'ribotaper part 2' to determine an appropriate value."/> | |
30 <param name="read_lenghts_ribo2" type="text" value="28" label="Read length 2" help="Read length 2, which is used for P-site calculation. Default is '28', but it varies widely between datasets. Please run 'ribotaper part 2' to determine an appropriate value."/> | |
31 <param name="read_lenghts_ribo3" type="text" value="29" label="Read length 3" help="Read length 3, which is used for P-site calculation. Default is '29', but it varies widely between datasets. Please run 'ribotaper part 2' to determine an appropriate value."/> | |
32 <param name="cutoff1" type="text" value="9" label="Cutoff 1" help="Offset 1, which is used for P-site calculation. Default is '9', but it varies widely between datasets. | |
33 Please run 'ribotaper part 2' to determine an appropriate value."/> | |
34 <param name="cutoff2" type="text" value="12" label="Cutoff 2" help="Offset 2, which is used for P-site calculation. Default is '12', but it varies widely between datasets. | |
35 Please run 'ribotaper part 2' to determine an appropriate value."/> | |
36 <param name="cutoff3" type="text" value="12" label="Cutoff 3" help="Offset 3, which is used for P-site calculation. Default is '12', but it varies widely between datasets. | |
37 Please run 'ribotaper part 2' to determine an appropriate value."/> | |
38 </inputs> | |
39 <outputs> | |
40 <data name="output1" type="data" format="pdf" from_work_dir="quality_check_plots.pdf" label="QC plots"/> | |
41 <data name="output2" type="data" format="tabular" from_work_dir="ORFs_genes_found" label="Summary of translated ORFs"/> | |
42 <data name="output3" type="data" format="tabular" from_work_dir="ORFs_max" label="Translated ORFs (max)"/> | |
43 <data name="output4" type="data" format="tabular" from_work_dir="ORFs_max_filt" label="Translated ORFs (max_filt)"/> | |
44 <data name="output5" type="data" format="bed" from_work_dir="translated_ORFs_sorted.bed" label="Translated ORFs (sorted)"/> | |
45 <data name="output6" type="data" format="bed" from_work_dir="translated_ORFs_filtered_sorted.bed" label="Translated ORFs (filtered/sorted)"/> | |
46 <data name="output7" type="data" format="fasta" from_work_dir="protein_db_max.fasta" label="Protein DB"/> | |
47 <data name="output8" type="data" format="pdf" from_work_dir="Final_ORF_results.pdf" label="ORF categories (length/coverage)"/> | |
48 </outputs> | |
49 <tests> | |
50 <test> | |
51 <param name="annotation_path" value="annotation_path.tgz" ftype="compressed_archive"/> | |
52 <param name="ribo_bam" value="test_ribo.bam"/> | |
53 <param name="rna_bam" value="test_rna.bam"/> | |
54 <param name="read_lenghts_ribo1" value="26"/> | |
55 <param name="read_lenghts_ribo2" value="28"/> | |
56 <param name="read_lenghts_ribo3" value="29"/> | |
57 <param name="cutoff1" value="9"/> | |
58 <param name="cutoff2" value="12"/> | |
59 <param name="cutoff3" value="12"/> | |
60 <output name="output2" file="ORFs_genes_found"/> | |
61 </test> | |
62 </tests> | |
63 <help><![CDATA[ | |
64 RiboTaper is an analysis pipeline for Ribosome Profiling | |
65 (Ribo-seq) experiments, | |
66 which exploits the triplet periodicity of | |
67 ribosomal footprints to call translated regions. | |
68 See | |
69 https://ohlerlab.mdc-berlin.de/software/RiboTaper_126/ for details. | |
70 | |
71 | |
72 The RiboTaper Galaxy tool set consists of three tools: | |
73 | |
74 - ``ribotaper part 1``: creation of annotation files | |
75 - ``ribotaper part 2``: metagene analysis for P-sites definition | |
76 - ``ribotaper part 3``: ribosome profiling | |
77 | |
78 The tools should be executed in this order: | |
79 ``ribotaper part 1``, ``ribotaper part 2``, then ``ribotaper part 3``. | |
80 | |
81 The current tool is ``ribotaper part 3``, | |
82 ribosome profiling. | |
83 | |
84 Outputs | |
85 -------- | |
86 | |
87 **QC plots**: | |
88 These plots provide statistics about the Ribo-seq and RNA-seq data used, together with an assessment of the P-site calculations. | |
89 A key value is the pie chart showing the agreement between the frame defined by the P-site positions and the annotated frame. Reliable P-site calculations produce an agreement above 90%. | |
90 The length/coverage statistics for the Ribo-seq data (bottom right) are also very important: | |
91 they show how well the P-site calculations detect active translation in regions of different length and coverage, so the user can estimate the precision of the Ribo-seq data and understand the level of resolution the data allows. | |
92 | |
93 **Summary of translated ORFs**: | |
94 Tab-separated values for the number of ORFs found and their corresponding genes, for the different ORF categories. | |
95 | |
96 **Translated ORFs (max, max_filt)**: | |
97 Tab-separated file containing information about detected ORFs. | |
98 Translated ORFs (max_filt) contains ORFs filtered for excessive multimapping and ORFs in non-coding genes overlapping known coding regions (recommended for further analysis). | |
99 | |
100 **Protein DB**: | |
101 FASTA file containing the peptide sequences of the detected ORFs, suitable as an alternative protein database (not filtered for multimapping). | |
102 | |
103 **Translated ORFs (sorted, filtered/sorted)**: | |
104 BED files with genomic coordinates for the detected ORFs. The total number of P-sites along the ORF is reported in the 5th column. | |
105 | |
106 **ORF categories (length/coverage)**: | |
107 PDF file containing information about the number of ORFs found, together with their length and coverage per category/annotation. | |
108 | |
109 Important notes | |
110 ---------------- | |
111 | |
112 - We ran the RiboTaper analysis on an SGE cluster, using 7 cores and h_vmem 8G. For each dataset, the complete RiboTaper workflow (from the BAM files to the final results) took about 1 day. | |
113 | |
114 - The current RiboTaper framework is not designed to identify and quantify ORFs on different transcripts. This means the transcript annotation is crucial. | |
115 | |
116 - Be careful about including scaffolds in the genome and GTF files; they may slow down the whole pipeline. | |
117 | |
118 ]]></help> | |
119 <citations> | |
120 <citation type="doi">10.1038/nmeth.3688</citation> | |
121 </citations> | |
122 </tool> |
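
The command template above can be read as two steps: unpack the annotation archive, then call `Ribotaper.sh` with six positional arguments. The following is a minimal Python sketch of how those arguments are assembled. It is not part of the wrapper: the function names `build_ribotaper_commands` and `to_shell` are hypothetical, and the sketch assumes that the archive unpacks to a directory named `annotation_path` (which is what the literal third argument in the template suggests) and that each read length is paired with the cutoff at the same position.

```python
import os
import shlex


def build_ribotaper_commands(annotation_archive, ribo_bam, rna_bam,
                             read_lengths=(26, 28, 29), cutoffs=(9, 12, 12),
                             annotation_dir="annotation_path", env=None):
    """Return (extract_cmd, run_cmd) as argument lists.

    Hypothetical helper mirroring the <command> template; the defaults are
    the default values of the tool's parameters.
    """
    env = os.environ if env is None else env
    # Assumption: one offset (cutoff) per read length, matched by position.
    if len(read_lengths) != len(cutoffs):
        raise ValueError("need exactly one cutoff per read length")
    # Same fallback as ${GALAXY_SLOTS:-12}: use 12 when unset or empty.
    slots = env.get("GALAXY_SLOTS") or "12"
    extract_cmd = ["tar", "xzvf", annotation_archive]
    run_cmd = [
        "Ribotaper.sh",
        ribo_bam,
        rna_bam,
        annotation_dir,  # assumed name of the directory inside the archive
        ",".join(str(n) for n in read_lengths),
        ",".join(str(n) for n in cutoffs),
        str(slots),
    ]
    return extract_cmd, run_cmd


def to_shell(extract_cmd, run_cmd):
    """Join both steps with '&&', quoting each argument for the shell."""
    return " && ".join(shlex.join(cmd) for cmd in (extract_cmd, run_cmd))


if __name__ == "__main__":
    extract, run = build_ribotaper_commands(
        "annotation_path.tgz", "test_ribo.bam", "test_rna.bam")
    print(to_shell(extract, run))
```

With the file names from the tool's test case and `GALAXY_SLOTS` unset, this prints a single line in which the read lengths appear as `26,28,29`, the cutoffs as `9,12,12`, and the thread count as `12`.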