comparison probvalueforhetero.xml @ 2:d5ed5c2e25c3 draft

Uploaded
author arkarachai-fungtammasan
date Wed, 22 Apr 2015 12:48:40 -0400
parents 07588b899c13
children
comparison
1:f2bab38e3cbd 2:d5ed5c2e25c3
26 26
27 .. class:: infomark 27 .. class:: infomark
28 28
29 **What it does** 29 **What it does**
30 30
31 - This tool will calculate the probability that the allele combination can generated the given read profile. This tool is part of the pipeline to estimate minimum read depth. 31 - This tool will calculate the probability that the allele combination can generated the given the STR length profile. This tool is part of the pipeline to estimate minimum read depth.
32 - The calculation of probability is very similar to the tool **Correct genotype for microsatellite errors**. However, this tool will restrict the calculation to only the allele combination indicated in input. Also, when it encounter allele combination that cannot be generated from error profile, the total probability will be zero instead of using base substitution rate. 32 - The calculation of probability is very similar to the tool **Correct genotype for STR errors**. However, this tool will restrict the calculation to only the allele combination indicated in input. Also, when it encounter allele combination that cannot be generated from error profile, the total probability will be zero instead of using base substitution rate.
33 33
34 **Citation** 34 **Citation**
35 35
36 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research** 36 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
37 37
38 **Input** 38 **Input**
39 39
40 The input format is the same as output from **Correct genotype for microsatellite errors** tool. 40 The input format is the same as output from **Correct genotype for STR errors** tool.
41 41
42 - Column 1 = location of microsatellite locus. 42 - Column 1 = location of STR locus.
43 - Column 2 = length profile (length of microsatellite in each read that mapped to this location in comma separated format). 43 - Column 2 = length profile (length of STR in each read that mapped to this location in comma separated format).
44 - Column 3 = motif of microsatellite in this locus. The input file can contain more than three column. 44 - Column 3 = motif of STR in this locus. The input file can contain more than three column.
45 - Column 4 = homozygous/heterozygous label. 45 - Column 4 = homozygote/heterozygote label.
46 - Column 5 = log based 10 of (the probability of homozygous/the probability of heterozygous) 46 - Column 5 = log based 10 of (the probability of homozygote/the probability of heterozygote)
47 - Column 6 = Allele for most probable homozygous form. 47 - Column 6 = Allele for most probable homozygote.
48 - Column 7 = Allele 1 for most probable heterozygous form. 48 - Column 7 = Allele 1 for most probable heterozygote.
49 - Column 8 = Allele 2 for most probable heterozygous form. 49 - Column 8 = Allele 2 for most probable heterozygote.
50 50
51 Only column 2,3,7,8 were used in calculation. 51 Only column 2,3,7,8 were used in calculation.
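As a reading aid for the column list above, here is a minimal sketch of pulling out the four columns the help text says are used (2, 3, 7, 8). It is not taken from the tool's source. It assumes the input is tab-separated and that the alleles in columns 7 and 8 are integer STR lengths; the function name `parse_row` and the sample row are hypothetical.

```python
def parse_row(line, sep="\t"):
    """Extract the columns used in the calculation (2, 3, 7, 8).

    Assumptions (not confirmed by the help text): fields are
    tab-separated, and alleles are integer STR lengths.
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 8:
        raise ValueError("expected at least 8 columns, got %d" % len(fields))
    return {
        # Column 2: comma-separated STR lengths, one per mapped read
        "lengths": [int(x) for x in fields[1].split(",")],
        # Column 3: repeat motif at this locus
        "motif": fields[2],
        # Columns 7 and 8: alleles of the most probable heterozygote
        "allele1": int(fields[6]),
        "allele2": int(fields[7]),
    }


# Made-up row for illustration only; not real data.
example = "chr1_100_120\t10,10,11\tA\thetero\t-1.5\t10\t10\t11"
row = parse_row(example)
```

Columns 1, 4, 5 and 6 are carried through to the output unchanged, so the sketch leaves them untouched.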
52 52
53 **Output** 53 **Output**
54 54
55 55
56 The output will be contain original eight column from the input. However, it will also add these following columns. 56 The output will contain the original eight columns from the input and the following additional columns.
57 - Column 9 = Probability of the allele combination to generate given read profile. 57 - Column 9 = Probability of the allele combination to generate given read profile.
58 - Column 10 = Number of possible rearrangement of given read profile. 58 - Column 10 = Number of possible rearrangements of the given read profile.
59 - Column 11 = Probability of the allele combination to generate read profile with any rearrangement (Product of column 9 and column 10) 59 - Column 11 = Probability of the allele combination to generate read profile with any rearrangement (Product of column 9 and column 10)
60 - Column 12 = Read depth 60 - Column 12 = Read depth
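The relationship between columns 9 to 12 can be sketched as follows. This is an illustration under an assumption, not the tool's implementation: it takes "number of possible rearrangements" to mean the number of distinct orderings of the reads' STR lengths, i.e. the multinomial coefficient n! / (c1! * c2! * ...), where n is the read depth and each c is the count of one distinct length. The function names are hypothetical.

```python
from collections import Counter
from math import factorial


def count_rearrangements(length_profile):
    """Distinct orderings of a comma-separated length profile.

    Assumption: "rearrangements" is the multinomial coefficient
    over the counts of each distinct STR length.
    """
    lengths = [int(x) for x in length_profile.split(",")]
    total = factorial(len(lengths))
    for count in Counter(lengths).values():
        total //= factorial(count)  # exact: each partial quotient is an integer
    return total


def read_depth(length_profile):
    """Column 12: number of reads in the length profile."""
    return len(length_profile.split(","))


def probability_any_rearrangement(prob_profile, length_profile):
    """Column 11: product of column 9 and column 10."""
    return prob_profile * count_rearrangements(length_profile)
```

For the profile `10,10,11,12` the read depth is 4 and, under the assumption above, there are 4! / 2! = 12 orderings, so column 11 would be 12 times the value in column 9.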
61 61
62 62
63 63