annotate test-data/probvalueforhetero.xml @ 4:ecfc9041bcc5

Deleted selected files
author arkarachai-fungtammasan
date Wed, 01 Apr 2015 14:05:54 -0400
<tool id="heteroprob" name="Evaluate the probability of the allele combination to generate read profile" version="2.0.0">
<description></description>
<command interpreter="python2.7">heteroprob.py $microsat_raw $microsat_error_profile $expectedminorallele > $microsat_corrected </command>

<inputs>
<param name="microsat_raw" type="data" label="Select microsatellite length profile and allele combination file" />
<param name="microsat_error_profile" type="data" label="Select microsatellite error profile that corresponds to this dataset" />
<param name="expectedminorallele" type="float" value="0.5" label="Expected contribution of minor allele when present (0.5 for genotyping)" />

</inputs>
<outputs>
<data name="microsat_corrected" format="tabular" />
</outputs>
<tests>
<!-- Test data with valid values -->
<test>
<param name="microsat_raw" value="probvalueforhetero_in.txt"/>
<param name="microsat_error_profile" value="PCRinclude.allrate.bymajorallele"/>
<param name="expectedminorallele" value="0.5"/>
<output name="microsat_corrected" file="probvalueforhetero_out.txt"/>
</test>

</tests>
<help>


.. class:: infomark

**What it does**

- This tool calculates the probability that the given allele combination can generate the given read profile. It is part of the pipeline to estimate minimum read depth.
- The probability calculation is very similar to that of the tool **Correct genotype for microsatellite errors**. However, this tool restricts the calculation to the allele combination indicated in the input. Also, when it encounters an allele combination that cannot be generated from the error profile, the total probability is set to zero instead of using the base substitution rate.

**Citation**

When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**

**Input**

The input format is the same as the output from the **Correct genotype for microsatellite errors** tool.

- Column 1 = location of microsatellite locus.
- Column 2 = length profile (length of the microsatellite in each read that mapped to this location, in comma-separated format).
- Column 3 = motif of the microsatellite at this locus. The input file can contain more than three columns.
- Column 4 = homozygous/heterozygous label.
- Column 5 = log base 10 of (the probability of homozygous / the probability of heterozygous).
- Column 6 = Allele for most probable homozygous form.
- Column 7 = Allele 1 for most probable heterozygous form.
- Column 8 = Allele 2 for most probable heterozygous form.

Only columns 2, 3, 7, and 8 are used in the calculation.

**Output**


The output contains the original eight columns from the input, followed by these additional columns.

- Column 9 = Probability of the allele combination generating the given read profile.
- Column 10 = Number of possible rearrangements of the given read profile.
- Column 11 = Probability of the allele combination generating the read profile in any rearrangement (product of column 9 and column 10).
- Column 12 = Read depth.

</help>
</tool>
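
The help text above describes how output columns 10 to 12 relate to the read profile and to column 9. The following is a minimal Python 3 sketch of that bookkeeping, not the code in `heteroprob.py`. It assumes that the "number of possible rearrangements" is the multinomial coefficient over repeated read lengths, that alleles and read lengths are integers, and that rows are tab-separated. The function names `parse_row`, `count_rearrangements`, and `summarize`, and the example row, are hypothetical.

```python
from collections import Counter
from math import factorial


def parse_row(line):
    """Pull out the four input columns the help text says are used (2, 3, 7, 8)."""
    cols = line.rstrip("\n").split("\t")
    lengths = [int(x) for x in cols[1].split(",")]  # column 2: length profile
    motif = cols[2]                                 # column 3: motif
    allele1 = int(cols[6])                          # column 7: heterozygous allele 1
    allele2 = int(cols[7])                          # column 8: heterozygous allele 2
    return lengths, motif, allele1, allele2


def count_rearrangements(lengths):
    """Distinct orderings of the reads: n! / (k1! * k2! * ...).

    Assumption: this multinomial coefficient is what column 10 reports.
    """
    total = factorial(len(lengths))
    for count in Counter(lengths).values():
        total //= factorial(count)
    return total


def summarize(lengths, prob_single_order):
    """Return values shaped like output columns 9 to 12.

    prob_single_order stands in for column 9; computing it requires the
    error profile and is outside the scope of this sketch.
    """
    n_orders = count_rearrangements(lengths)
    return (prob_single_order,             # column 9
            n_orders,                      # column 10
            prob_single_order * n_orders,  # column 11 = column 9 * column 10
            len(lengths))                  # column 12: read depth


# Hypothetical input row with eight tab-separated columns.
row = "chr1_100_110\t10,10,11,12\tA\thetero\t-2.5\t10\t10\t11"
lengths, motif, allele1, allele2 = parse_row(row)
columns_9_to_12 = summarize(lengths, 0.001)
```

For the example row, four reads with one repeated length give 4!/2! distinct orderings, and the read depth is simply the number of entries in the length profile.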