annotate DimontDataExtractor.xml @ 1:eb36f7f72fb1 draft

Uploaded
author grau
date Wed, 20 Nov 2013 04:33:20 -0500
parents b7d6db3ba6bc
children
0
b7d6db3ba6bc Uploaded
grau
parents:
diff changeset
<tool id="DimontDataExtractor" name="Dimont Data Extractor" version="0.1">
    <description>prepares data in the format required by Dimont</description>
    <command interpreter="perl">extract_data_single_galaxy.pl $genomefa $regions $chromcol $startcol $seccol $seccoord $width $statcol $extracted</command>
    <inputs>
        <param type="data" format="bed,gtf,txt,tabular" name="regions" label="Genomic regions" value="" optional="false" help="The genomic regions to be extracted, in a BED-like tabular format such as BED, GTF, or narrowPeak." />
        <param type="data" format="fasta" name="genomefa" label="Genome sequence" value="" optional="false" help="The input genome to which the genomic regions refer." />
        <param type="data_column" data_ref="regions" force_select="true" name="chromcol" label="Chromosome column" value="1" optional="false" help="The column of the regions file that contains the chromosome information." />
        <param type="data_column" data_ref="regions" force_select="true" name="startcol" label="Start column" value="2" optional="false" help="The column of the regions file that contains the start position of the genomic region." />
        <param type="select" name="seccol" label="Meaning of second coordinate" optional="false" help="The meaning of the second genomic coordinate. This may be either the position of the peak summit relative to the start position, or the end position of the peak.">
            <option value="end">End of peak (in global coordinates)</option>
            <option value="center">Center of peak (relative to start)</option>
        </param>
        <param type="data_column" data_ref="regions" force_select="true" name="seccoord" label="Second coordinate" value="3" optional="false" help="The column containing the second genomic coordinate, whose meaning is specified by the parameter &quot;Meaning of second coordinate&quot;." />
        <param type="data_column" data_ref="regions" force_select="true" name="statcol" label="Statistics column" value="7" optional="false" help="The column containing the peak statistics information (or another measure of peak confidence)." />
        <param type="integer" name="width" label="Width" value="1000" optional="false" help="The width of the genomic region to be extracted. Recommended values: 1000 for ChIP-seq and 100 for ChIP-exo." />
    </inputs>
    <outputs>
        <data format="fasta" name="extracted">
        </data>
    </outputs>
    <tests>
        <test>
            <param name="regions" value="mini.bed" />
            <param name="genomefa" value="minigenome.fa" />
            <param name="chromcol" value="1" />
            <param name="startcol" value="2" />
            <param name="seccol" value="end" />
            <param name="seccoord" value="3" />
            <param name="statcol" value="7" />
            <param name="width" value="200" />
            <output name="extracted" file="mini_extracted.fa" />
        </test>
        <test>
            <param name="regions" value="mini2.bed" />
            <param name="genomefa" value="minigenome.fa" />
            <param name="chromcol" value="1" />
            <param name="startcol" value="2" />
            <param name="seccol" value="center" />
            <param name="seccoord" value="3" />
            <param name="statcol" value="7" />
            <param name="width" value="200" />
            <output name="extracted" file="mini2_extracted.fa" />
        </test>
    </tests>
    <help>
**Dimont Data Extractor** prepares an annotated FastA file as required by Dimont from a genome (in FastA format) and a tabular file (e.g., BED, GTF, narrowPeak). The regions specified in the tabular file determine the center of each extracted sequence. All extracted sequences have the same length, as specified by the parameter "Width".

In the case of ChIP data, the center position could, for instance, be the peak summit.
An annotated FastA file for ChIP-exo data comprising sequences of length 100 centered around the peak summit might look like::

    > peak: 50; signal: 515
    ggccatgtgtatttttttaaatttccac...
    > peak: 50; signal: 199
    GGTCCCCTGGGAGGATGGGGACGTGCTG...
    ...

where the center is given as 50 for the first two sequences, and the confidence amounts to 515 and 199, respectively.
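
As a rough illustration of how one row of the regions file could map to such a record, the following Python sketch computes the center and cuts a window of length "Width" around it. It is not the Perl script used by this tool: the function name ``extract_region``, the 0-based coordinates, the use of the midpoint when the second coordinate is an end position, and the clipping at the chromosome start are all assumptions made for this example.

```python
def extract_region(genome, chrom, start, second, mode, width, signal):
    """Return (header, sequence) for one region; a hypothetical helper."""
    # Assumption: "center" is the summit offset relative to start, while
    # "end" is a global end coordinate and the midpoint is used as center.
    if mode == "center":
        center = start + second
    elif mode == "end":
        center = (start + second) // 2
    else:
        raise ValueError("mode must be 'end' or 'center'")

    # Assumption: 0-based window, clipped at the start of the chromosome.
    left = max(0, center - width // 2)
    seq = genome[chrom][left:left + width]

    # The header reports the center relative to the extracted sequence.
    header = "> peak: {}; signal: {}".format(center - left, signal)
    return header, seq


genome = {"chr1": "ACGT" * 50}  # made-up 200 bp chromosome for illustration
header, seq = extract_region(genome, "chr1", 90, 10, "center", 100, 515)
print(header)
print(seq)
```

With a start of 90, a summit offset of 10, and a width of 100, the window begins at position 50, so the summit is reported at position 50 of the extracted sequence.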

We also provide an example_ input file and a stand-alone Perl script_ for preparing data in the format required by Dimont_.


If you experience problems using Dimont Data Extractor, please contact_ us.

.. _example: http://www.jstacs.de/downloads/dimont-example.fa
.. _script: http://www.jstacs.de/index.php/Dimont#Data_preparation
.. _Dimont: http://jstacs.de/index.php/Dimont
.. _contact: mailto:grau@informatik.uni-halle.de
    </help>
</tool>