annotate gff_to_bed.xml @ 5:6e589f267c14

Uploaded
author devteam
date Tue, 04 Nov 2014 12:15:19 -0500
<tool id="fml_gff2bed" name="GFF-to-BED" version="2.0.0">
  <description>converter</description>
  <command interpreter="python">gff_to_bed.py $inf_gff &gt; $bed_format
  </command>
  <inputs>
    <param format="gtf,gff,gff3" name="inf_gff" type="data" label="Convert this query" help="Provide a genome annotation file in GFF, GTF, or GFF3 format."/>
  </inputs>
  <outputs>
    <data format="bed" name="bed_format" label="${tool.name} on ${on_string}: Converted" />
  </outputs>
  <tests>
    <test>
      <param name="inf_gff" value="Aly_JGI.gff3" />
      <output name="bed_format" file="Aly_JGI.bed" />
    </test>
    <test>
      <param name="inf_gff" value="MB7_3R.gff3" />
      <output name="bed_format" file="MB7_3R.bed" />
    </test>
  </tests>
  <help>

**What it does**

This tool converts gene transcript annotations from GTF, GFF, or GFF3 to UCSC 12-column BED format.

--------

**Example**

- The following data in GFF3::

    ##gff-version 3
    chr1 protein_coding gene 11874 14409 0 + . ID=Gene:uc001aaa.3;Name=Gene:uc001aaa.3
    chr1 protein_coding transcript 11874 14409 0 + . ID=uc001aaa.3;Name=uc001aaa.3;Parent=Gene:uc001aaa.3
    chr1 protein_coding exon 11874 12227 0 + . Parent=uc001aaa.3
    chr1 protein_coding exon 12613 12721 0 + . Parent=uc001aaa.3
    chr1 protein_coding exon 13221 14409 0 + . Parent=uc001aaa.3

- Will be converted to UCSC 12-column BED format::

    chr1 11874 14409 uc001aaa.3 0 + 11874 14409 0 3 354,109,1189, 0,739,1347,
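
The block columns in the example can be reproduced with a few lines of Python. This is only a minimal sketch, not the code of gff_to_bed.py: ``bed12_line`` is a hypothetical helper, and it assumes the exon coordinates are copied from the GFF3 lines unchanged, as the example output shows.

```python
# Sketch: derive the BED12 block columns from the exons of one transcript.
# Assumption: coordinates are taken from the GFF3 lines unchanged, which is
# what the example output shows. bed12_line is a hypothetical helper.

def bed12_line(chrom, name, strand, exons, score="0"):
    """Build one 12-column BED line from a list of (start, end) exon pairs."""
    exons = sorted(exons)
    tx_start = exons[0][0]
    tx_end = exons[-1][1]
    # GFF3 ends are inclusive, so an exon spans end - start + 1 bases.
    block_sizes = "".join("%d," % (end - start + 1) for start, end in exons)
    # Block starts are offsets from the transcript start.
    block_starts = "".join("%d," % (start - tx_start) for start, end in exons)
    fields = [chrom, str(tx_start), str(tx_end), name, score, strand,
              str(tx_start), str(tx_end), "0", str(len(exons)),
              block_sizes, block_starts]
    return "\t".join(fields)


exons = [(11874, 12227), (12613, 12721), (13221, 14409)]
print(bed12_line("chr1", "uc001aaa.3", "+", exons))
```

The printed line carries the same twelve values as the example output, separated by tabs. The trailing commas in the last two columns mirror the example.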

--------

**About formats**

**GFF3 format** General Feature Format is a format for describing genes and other features associated with DNA, RNA and Protein sequences. GFF3 lines have nine tab-separated fields::


    1. seqid - Must be a chromosome or scaffold or contig.
    2. source - The program that generated this feature.
    3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
    4. start - The starting position of the feature in the sequence. The first base is numbered 1.
    5. stop - The ending position of the feature (inclusive).
    6. score - A score between 0 and 1000. If there is no score value, enter ".".
    7. strand - Valid entries include '+', '-', or '.' (unknown or not relevant).
    8. phase - If the feature is a coding exon, phase should be a number between 0 and 2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
    9. attributes - A semicolon-separated list of tag=value pairs such as ID, Name, and Parent. All lines that share the same Parent are linked together into a single item.
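
As an illustration of the nine fields, the sketch below splits one feature line and turns the attributes column into a dictionary. It is a simplified example under stated assumptions, not the parser used by this tool: ``parse_gff3_line`` and ``GFF3_FIELDS`` are hypothetical names, and percent-encoded characters in attribute values are left untouched.

```python
# Sketch: split one GFF3 feature line into its nine fields.
# parse_gff3_line and GFF3_FIELDS are hypothetical names for illustration.

GFF3_FIELDS = ["seqid", "source", "type", "start", "stop",
               "score", "strand", "phase", "attributes"]


def parse_gff3_line(line):
    """Return a dict for one feature line, or None for comments and blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    values = line.split("\t")
    if len(values) != len(GFF3_FIELDS):
        raise ValueError("expected 9 tab-separated fields, got %d" % len(values))
    record = dict(zip(GFF3_FIELDS, values))
    record["start"] = int(record["start"])
    record["stop"] = int(record["stop"])
    # Column 9 holds semicolon-separated tag=value pairs such as ID and Parent.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            tag, value = pair.split("=", 1)
            attributes[tag.strip()] = value.strip()
    record["attributes"] = attributes
    return record


exon = parse_gff3_line("chr1\tprotein_coding\texon\t11874\t12227\t0\t+\t.\tParent=uc001aaa.3")
print(exon["type"], exon["start"], exon["stop"], exon["attributes"]["Parent"])
```

Grouping exon records by their ``Parent`` value is what links them to one transcript, which then becomes a single BED line.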

**BED format** Browser Extensible Data format was designed at UCSC for displaying data tracks in the Genome Browser. It has three required fields and nine optional ones.

The first three BED fields (required) are::

    1. chrom - The name of the chromosome (e.g. chr1, chrY_random).
    2. chromStart - The starting position in the chromosome. (The first base in a chromosome is numbered 0.)
    3. chromEnd - The ending position in the chromosome, plus 1 (i.e., a half-open interval).

The additional BED fields (optional) are::

    4. name - The name of the BED line.
    5. score - A score between 0 and 1000.
    6. strand - Defines the strand - either '+' or '-'.
    7. thickStart - The starting position where the feature is drawn thickly in the Genome Browser.
    8. thickEnd - The ending position where the feature is drawn thickly in the Genome Browser.
    9. reserved - This should always be set to zero.
    10. blockCount - The number of blocks (exons) in the BED line.
    11. blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
    12. blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
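
The two coordinate conventions and the block columns can be checked with a short sketch. Both helper names are hypothetical, and the check encodes the UCSC rule that the first block starts at chromStart and the last block ends at chromEnd; it is meant as an illustration, not as a description of what this tool does internally.

```python
# Sketch: coordinate conventions and a consistency check for BED12 blocks.
# Both helper names are hypothetical.

def gff_to_bed_interval(start, stop):
    """Convert a 1-based inclusive interval to a 0-based half-open one."""
    return start - 1, stop


def check_blocks(chrom_start, chrom_end, block_count, block_sizes, block_starts):
    """Return True if the block columns are consistent with the BED line."""
    sizes = [int(s) for s in block_sizes.split(",") if s]
    starts = [int(s) for s in block_starts.split(",") if s]
    if not sizes or len(sizes) != block_count or len(starts) != block_count:
        return False
    # The first block begins at chromStart and the last one ends at chromEnd.
    if starts[0] != 0:
        return False
    return chrom_start + starts[-1] + sizes[-1] == chrom_end


print(gff_to_bed_interval(11874, 14409))
print(check_blocks(11873, 14409, 3, "354,109,1189,", "0,739,1347,"))
```

The first call prints ``(11873, 14409)`` and the second prints ``True``. With the unshifted start 11874 the same check returns ``False``, because the block sizes count the first base of each exon.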

--------

**Copyright**

2009-2014 Max Planck Society, University of Tübingen &amp; Memorial Sloan Kettering Cancer Center

Sreedharan VT, Schultheiss SJ, Jean G, Kahles A, Bohnert R, Drewe P, Mudrakarta P, Görnitz N, Zeller G, Rätsch G. Oqtans: the RNA-seq workbench in the cloud for complete and reproducible quantitative transcriptome analysis. Bioinformatics 10.1093/bioinformatics/btt731 (2014)

  </help>
</tool>