<tool id="hgv_add_scores" name="phyloP" version="1.0.0">
  <description>interspecies conservation scores</description>

  <command>
    add_scores $input1 ${input1.metadata.dbkey} ${input1.metadata.chromCol} ${input1.metadata.startCol} ${GALAXY_DATA_INDEX_DIR}/add_scores.loc $out_file1
  </command>

  <inputs>
    <param format="interval" name="input1" type="data" label="Dataset">
      <validator type="unspecified_build"/>
      <validator type="dataset_metadata_in_file" filename="add_scores.loc" metadata_name="dbkey" metadata_column="0" message="Data is currently not available for the specified build."/>
    </param>
  </inputs>

  <outputs>
    <data format="input" name="out_file1" />
  </outputs>

  <requirements>
    <requirement type="package">add_scores</requirement>
  </requirements>

  <tests>
    <test>
      <param name="input1" value="add_scores_input1.interval" ftype="interval" dbkey="hg18" />
      <output name="out_file1" file="add_scores_output1.interval" />
    </test>
    <test>
      <param name="input1" value="add_scores_input2.bed" ftype="interval" dbkey="hg18" />
      <output name="out_file1" file="add_scores_output2.interval" />
    </test>
  </tests>

  <help>
.. class:: warningmark

This currently works only for build hg18.

-----

**Dataset formats**

The input can be any interval_ format dataset. The output is also in interval format.
(`Dataset missing?`_)

.. _interval: ./static/formatHelp.html#interval
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool adds a column that measures interspecies conservation at each SNP
position, using conservation scores for primates pre-computed by the
phyloP program. PhyloP performs an exact P-value computation under a
continuous-time Markov substitution model.

Scores are looked up by chromosome and start position only, so if an input
interval spans more than one nucleotide, only the score for its first
nucleotide is returned.
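
The lookup described above can be sketched roughly as follows. This is only an illustration, not the add_scores program itself: the in-memory ``SCORES`` table and the ``add_score`` function are hypothetical names, and the real tool reads its pre-computed phyloP scores from the files listed in add_scores.loc.

```python
# Hypothetical in-memory score table keyed by (chromosome, start position).
# Assumption: the real tool reads pre-computed phyloP scores from disk instead.
SCORES = {
    ("chr22", 14612076): 0.369,
    ("chr22", 14611956): -2.142,
}


def add_score(fields, chrom_col=0, start_col=1):
    """Return the row with one extra column holding the score, or "NA"."""
    key = (fields[chrom_col], int(fields[start_col]))
    score = SCORES.get(key)
    # The end column is never consulted, so a wider interval still gets
    # only the score at its start position.
    return fields + ["NA" if score is None else f"{score:.3f}"]


print(add_score(["chr22", "14612076", "14612077", "A/G"]))
print(add_score(["chr22", "14612076", "14612999", "A/G"]))
print(add_score(["chr22", "14494911", "14494912", "A/T"]))
```

The first two calls append the same score because only the start position is used; the third appends "NA" because the hypothetical table has no entry for that position.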

-----

**Example**

- input file, with SNPs::

    chr22  14440426  14440427  C/T
    chr22  14494851  14494852  A/G
    chr22  14494911  14494912  A/T
    chr22  14550435  14550436  A/G
    chr22  14611956  14611957  G/T
    chr22  14612076  14612077  A/G
    chr22  14668537  14668538  C
    chr22  14668703  14668704  A/T
    chr22  14668775  14668776  G
    chr22  14680074  14680075  A/T
    etc.

- output file, showing conservation scores for primates::

    chr22  14440426  14440427  C/T  0.509
    chr22  14494851  14494852  A/G  0.427
    chr22  14494911  14494912  A/T  NA
    chr22  14550435  14550436  A/G  NA
    chr22  14611956  14611957  G/T  -2.142
    chr22  14612076  14612077  A/G  0.369
    chr22  14668537  14668538  C    0.419
    chr22  14668703  14668704  A/T  -1.462
    chr22  14668775  14668776  G    0.470
    chr22  14680074  14680075  A/T  0.303
    etc.

"NA" means that the phyloP score was not available.

-----

**Reference**

Siepel A, Pollard KS, Haussler D. (2006)
New methods for detecting lineage-specific selection.
In Proceedings of the 10th International Conference on Research in Computational
Molecular Biology (RECOMB 2006), pp. 190-205.

  </help>
</tool>