annotate tools/filters/bed2gff.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
parents
children
<tool id="bed2gff1" name="BED-to-GFF" version="2.0.0">
  <description>converter</description>
  <command interpreter="python">bed_to_gff_converter.py $input $out_file1</command>
  <inputs>
    <param format="bed" name="input" type="data" label="Convert this query"/>
  </inputs>
  <outputs>
    <data format="gff" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name="input" value="9.bed"/>
      <output name="out_file1" file="bed2gff_out.gff"/>
    </test>
  </tests>
  <help>

**What it does**

This tool converts data from BED format to GFF format (scroll down for format description).

--------

**Example**

The following data in BED format::

    chr28 346187 388197 BC114771 0 + 346187 388197 0 9 144,81,115,63,155,96,134,105,112, 0,24095,26190,31006,32131,33534,36994,41793,41898,

Will be converted to GFF (**note** that the start coordinate is incremented by 1)::

    ##gff-version 2
    ##bed_to_gff_converter.py

    chr28 bed2gff mRNA 346188 388197 0 + . mRNA BC114771;
    chr28 bed2gff exon 346188 346331 0 + . exon BC114771;
    chr28 bed2gff exon 370283 370363 0 + . exon BC114771;
    chr28 bed2gff exon 372378 372492 0 + . exon BC114771;
    chr28 bed2gff exon 377194 377256 0 + . exon BC114771;
    chr28 bed2gff exon 378319 378473 0 + . exon BC114771;
    chr28 bed2gff exon 379722 379817 0 + . exon BC114771;
    chr28 bed2gff exon 383182 383315 0 + . exon BC114771;
    chr28 bed2gff exon 387981 388085 0 + . exon BC114771;
    chr28 bed2gff exon 388086 388197 0 + . exon BC114771;

------

.. class:: infomark

**About formats**

**BED format** Browser Extensible Data format was designed at UCSC for displaying data tracks in the Genome Browser. It has three required fields and several additional optional ones:

The first three BED fields (required) are::

    1. chrom - The name of the chromosome (e.g. chr1, chrY_random).
    2. chromStart - The starting position in the chromosome. (The first base in a chromosome is numbered 0.)
    3. chromEnd - The ending position in the chromosome, plus 1 (i.e., a half-open interval).

The additional BED fields (optional) are::

    4. name - The name of the BED line.
    5. score - A score between 0 and 1000.
    6. strand - Defines the strand - either '+' or '-'.
    7. thickStart - The starting position where the feature is drawn thickly at the Genome Browser.
    8. thickEnd - The ending position where the feature is drawn thickly at the Genome Browser.
    9. reserved - This should always be set to zero.
    10. blockCount - The number of blocks (exons) in the BED line.
    11. blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
    12. blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
    13. expCount - The number of experiments.
    14. expIds - A comma-separated list of experiment ids. The number of items in this list should correspond to expCount.
    15. expScores - A comma-separated list of experiment scores. All of the expScores should be relative to expIds. The number of items in this list should correspond to expCount.

**GFF format** General Feature Format is a format for describing genes and other features associated with DNA, RNA and protein sequences. GFF lines have nine tab-separated fields::

    1. seqname - Must be a chromosome or scaffold.
    2. source - The program that generated this feature.
    3. feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".
    4. start - The starting position of the feature in the sequence. The first base is numbered 1.
    5. end - The ending position of the feature (inclusive).
    6. score - A score between 0 and 1000. If there is no score value, enter ".".
    7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
    8. frame - If the feature is a coding exon, frame should be a number between 0-2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
    9. group - All lines with the same group are linked together into a single item.

  </help>
</tool>
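
The conversion described in the help text above can be sketched in a few lines of Python. This is only a minimal illustration, not the bundled bed_to_gff_converter.py (whose source is not shown on this page): the function name bed_line_to_gff is hypothetical, and the sketch assumes a 12-field BED line with whitespace-separated fields. It emits one mRNA line for the whole feature and one exon line per block, turning each 0-based, half-open BED interval into a 1-based, inclusive GFF interval.

```python
def bed_line_to_gff(bed_line: str, source: str = "bed2gff") -> list[str]:
    """Convert one 12-field BED line into tab-separated GFF lines.

    Hypothetical helper for illustration; assumes all 12 BED fields are present.
    """
    fields = bed_line.split()
    chrom, name, score, strand = fields[0], fields[3], fields[4], fields[5]
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    # blockSizes and blockStarts may carry a trailing comma, so drop empty items.
    block_sizes = [int(s) for s in fields[10].split(",") if s]
    block_starts = [int(s) for s in fields[11].split(",") if s]

    def gff(feature: str, start0: int, end: int) -> str:
        # BED start is 0-based, GFF start is 1-based: add 1 to the start.
        # BED end is exclusive, GFF end is inclusive: the number is unchanged.
        columns = [chrom, source, feature, str(start0 + 1), str(end),
                   score, strand, ".", f"{feature} {name};"]
        return "\t".join(columns)

    lines = [gff("mRNA", chrom_start, chrom_end)]
    for size, offset in zip(block_sizes, block_starts):
        # blockStarts are offsets relative to chromStart.
        block_start = chrom_start + offset
        lines.append(gff("exon", block_start, block_start + size))
    return lines


# The BED line from the example in the help text.
EXAMPLE_BED = (
    "chr28 346187 388197 BC114771 0 + 346187 388197 0 9 "
    "144,81,115,63,155,96,134,105,112, "
    "0,24095,26190,31006,32131,33534,36994,41793,41898,"
)

gff_lines = bed_line_to_gff(EXAMPLE_BED)
for line in gff_lines:
    print(line)
```

Run on the BED line from the example, the sketch yields ten lines: one mRNA line spanning 346188 to 388197, followed by nine exon lines, the first covering 346188 to 346331. These match the coordinates in the GFF output shown in the help text.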