annotate GALAXY_FILES/tools/EMBER/PreProcess_Expression_Data.xml @ 0:003f802d4c7d

Uploaded
author mmaiensc
date Wed, 29 Feb 2012 15:03:33 -0500
parents
children
rev   line source
0
003f802d4c7d Uploaded
mmaiensc
parents:
diff changeset
1 <tool id="prep_data" name="PreProcess Expression Data" version="1.3">
2 <description>Combines gene expression data</description>
3 <command interpreter="perl">PreProcess_Expression_Data.pl -i $data -c $compslist -a $annot -o $output -p $thresh -l $log -v n</command>
4 <inputs>
5 <param format="txt" name="data" type="data" label="Expression data"/>
6 <param format="txt" name="compslist" type="data" label="Comparison list"/>
7 <param format="txt" name="annot" type="data" label="Annotation file"/>
8 <param name="thresh" type="float" min="0" max="1" label="Percentile threshold" value="0.63" optional="true"/>
9 <param name="log" type="select" label="Log transform data?">
10 <option value="n" selected="true">No</option>
11 <option value="y">Yes</option>
12 </param>
13 </inputs>
14 <outputs>
15 <data format="txt" name="output"/>
16 </outputs>
17
18 <tests>
19 <test>
20 <param name="data" value="EMBER/expression.txt"/>
21 <param name="compslist" value="EMBER/comparisons_list.txt"/>
22 <param name="annot" value="EMBER/annotation.txt"/>
23 <param name="thresh" value="0.63"/>
24 <param name="log" value="n"/>
25 <output name="output" file="EMBER/expression_profiles.txt"/>
26 </test>
27 </tests>
28
29 <help>
30
31 This tool discretizes the gene expression data and adds genomic annotations.
32
33 More options for the EMBER tools (especially for the main program, EMBER, including searching for multiple expression patterns) are provided in the command-line version, which can be downloaded from http://dinner-group.uchicago.edu/downloads.html. That package also includes test data and sample outputs.
34
35 When using any of the EMBER tools, please cite: M Maienschein-Cline, J Zhou, KP White, R Sciammas, and AR Dinner. Discovering transcription factor regulatory targets using gene expression and binding data. *Bioinformatics*, 28:206-213 (2012).
36
37 -----
38
39 Description of inputs:
40
41 *Expression Data*:
42
43 Microarray data from N experiments, with at least 2 replicates per condition.
44
45 *Format (N+1 columns)*: [ID] [expt 1 value] [expt 2 value] ... [expt N value]
46
47 IMPORTANT: the first line should be a title line whose first field is "#ID" and whose subsequent fields give the condition and replicate for each column, i.e.,
48
49 #ID [condition]#[replicate]...
50
51 where [condition] matches the values in the Comparison List, and [replicate] is the replicate number of that column. [condition] and [replicate] are delimited by a "#", so do not use that character in the condition name.
52
53 *Comparison List*:
54
55 List of behavior dimension definitions. Each [condition] should match a condition name used in the title line of the Expression Data.
56
57 *Format (2 columns)*: [condition1] [condition2]
58
59 *Annotation File*:
60
61 Gives the genomic coordinates of each probe set.
62
63 *Format (6 columns)*: [probe id] [gene name] [chromosome] [start] [end] [strand]
64
65 *Percentile Threshold* (p):
66
67 Used to eliminate genes that are consistently expressed at a very low level. All data are concatenated into one list, and the pth percentile of that list is taken as the threshold. A probe set is then removed if its value is less than the threshold in ALL conditions.
68
69 p = 1.0 means all probes are retained, p = 0.0 means none are. Note, however, that p = 0.63 does NOT necessarily mean that 63% of probe sets are retained, because a probe set is kept as long as it reaches the threshold in at least one condition.
70
71 *Log Transform*: whether or not to take the log of the data before discretization.
72
73 </help>
74
75 </tool>
76
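
The title-line format and the percentile filter described in the help text above can be sketched in Python as follows. This is a rough illustration only, not the code in PreProcess_Expression_Data.pl: the function names are hypothetical, fields are assumed to be whitespace-delimited, and the percentile is assumed to be ranked from the highest value downward with simple floor rounding, which is inferred from the statement that p = 1.0 retains all probes and p = 0.0 retains none.

```python
import math


def parse_title_line(line):
    """Split a '#ID [condition]#[replicate] ...' title line into pairs."""
    fields = line.split()  # assumption: whitespace-delimited columns
    if not fields or fields[0] != "#ID":
        raise ValueError('title line must start with "#ID"')
    columns = []
    for field in fields[1:]:
        # Split on the last "#"; condition names must not contain "#".
        condition, sep, replicate = field.rpartition("#")
        if not sep or not condition or not replicate:
            raise ValueError("expected [condition]#[replicate]: " + repr(field))
        columns.append((condition, replicate))
    return columns


def percentile_threshold(rows, p):
    """Pool every value and return the cutoff for fraction p.

    Assumption: values are ranked from highest to lowest, so p = 1.0
    gives the minimum as cutoff and p = 0.0 gives an unreachable cutoff.
    """
    pooled = sorted((v for vals in rows.values() for v in vals), reverse=True)
    k = int(math.floor(p * len(pooled)))
    if k == 0:
        return math.inf
    return pooled[k - 1]


def filter_low_expression(rows, p):
    """Drop probe sets whose values are below the cutoff in ALL columns."""
    threshold = percentile_threshold(rows, p)
    # A probe is kept if at least one of its values reaches the cutoff.
    return {pid: vals for pid, vals in rows.items() if max(vals) >= threshold}


# Made-up example table: probe id mapped to one value per column.
example_rows = {
    "probe_a": [9.0, 8.5, 9.2, 8.8],
    "probe_b": [1.0, 1.2, 0.9, 1.1],
    "probe_c": [1.0, 1.1, 7.5, 7.9],
}
kept = filter_low_expression(example_rows, 0.5)
```

With this made-up table and p = 0.5, the cutoff is 7.5 and two of the three probe sets (probe_a and probe_c) are kept, which illustrates why the fraction of probe sets retained need not equal p.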