annotate test-data/GenotypingSTR.xml @ 5:b27006b0a953

update to latest version
author devteam@galaxyproject.org
date Wed, 22 Apr 2015 12:19:28 -0400
parents ecfc9041bcc5
children
rev   line source
4
ecfc9041bcc5 Deleted selected files
arkarachai-fungtammasan
parents:
diff changeset
1 <tool id="GenotypeSTR" name="Correct genotype for microsatellite errors" version="2.0.0">
2 <description> during sequencing and library prep </description>
3 <command interpreter="python2.7">GenotypeTRcorrection.py $microsat_raw $microsat_error_profile $microsat_corrected $expectedminorallele </command>
4
5 <inputs>
6 <param name="microsat_raw" type="data" label="Select the microsatellite length profile whose genotyping needs to be refined" />
7 <param name="microsat_error_profile" type="data" label="Select the microsatellite error profile that corresponds to this dataset" />
8 <param name="expectedminorallele" type="float" value="0.5" label="Expected contribution of minor allele when present (0.5 for genotyping)" />
9
10 </inputs>
11 <outputs>
12 <data name="microsat_corrected" format="tabular" />
13 </outputs>
14 <tests>
15 <!-- Test data with valid values -->
16 <test>
17 <param name="microsat_raw" value="sampleTRprofile_C.txt"/>
18 <param name="microsat_error_profile" value="PCRinclude.allrate.bymajorallele"/>
19 <param name="expectedminorallele" value="0.5"/>
20 <output name="microsat_corrected" file="sampleTRgenotypingcorrection"/>
21 </test>
22
23 </tests>
24 <help>
25
26
27 .. class:: infomark
28
29 **What it does**
30
31 - This tool corrects for microsatellite sequencing and library preparation errors using error rates estimated from the hemizygous male X chromosome, or any rates provided by the user. The read profile for each locus is processed independently.
32 - First, the tool finds the three most common read lengths in the input read length profile. If the read profile contains only one TR length, the length one motif longer than the observed length is used as the second most common read length.
33 - Second, it calculates the probability of each of the three homozygous forms and keeps the form that gives the highest probability. It does the same for the three heterozygous forms.
34 - Third, the tool calculates the base-10 logarithm of (probability of homozygous / probability of heterozygous). If this value is greater than 0, the locus is predicted to be homozygous. If it is less than 0, the locus is predicted to be heterozygous. If it is exactly 0, the read profile at this locus is discarded.
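
The first and third steps can be sketched in a few lines of Python. This is only an illustration of the description above, not the code in GenotypeTRcorrection.py: the function names, the tie-breaking order, and the "homo" label are assumptions, and the probabilities are taken as given because the error model itself is not described here.

```python
import math
from collections import Counter


def candidate_lengths(lengths, motif_size):
    """Return up to three most common read lengths, most frequent first.

    If only one length is observed, the length one motif longer is
    added as the second candidate, as the help text describes.
    """
    counts = Counter(lengths)
    # Ties are broken by ascending length (an assumption, for determinism).
    ranked = sorted(counts, key=lambda n: (-counts[n], n))[:3]
    if len(ranked) == 1:
        ranked.append(ranked[0] + motif_size)
    return ranked


def call_genotype(p_homozygous, p_heterozygous):
    """Apply the log10 ratio rule; a label of None means 'discard the locus'."""
    log_ratio = math.log10(p_homozygous / p_heterozygous)
    if log_ratio > 0:
        return "homo", log_ratio    # label spelling is an assumption
    if 0 > log_ratio:
        return "hetero", log_ratio
    return None, log_ratio
```

For the profile used in the example further down, `candidate_lengths([5, 6, 6, 6, 6, 7, 7, 8, 8], 1)` yields the lengths 6, 7 and 8.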
35
36 **Citation**
37
38 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
39
40 **Input**
41
42 - The input file needs to contain at least three columns.
43 - Column 1 = location of microsatellite locus.
44 - Column 2 = length profile (the microsatellite length in each read mapped to this location, as a comma-separated list).
45 - Column 3 = motif of the microsatellite at this locus. The input file can contain more than three columns.
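
As a rough illustration of this layout, the sketch below splits one input row into its three required columns. It assumes the columns are tab-separated, which the text above does not state, and `parse_profile_line` is a hypothetical helper, not part of the tool.

```python
def parse_profile_line(line):
    """Split one input row into (location, read lengths, motif, extra columns)."""
    fields = line.rstrip("\n").split("\t")  # tab separation is an assumption
    if 3 > len(fields):
        raise ValueError("expected at least three columns")
    location, profile, motif = fields[:3]
    # Column 2 is a comma-separated list of per-read lengths;
    # int() tolerates spaces around each number.
    lengths = [int(item) for item in profile.split(",")]
    # Any columns beyond the third are passed through untouched.
    return location, lengths, motif, fields[3:]
```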
46
47 **Output**
48
49 The output contains the original three (or more) columns of the input, followed by these additional columns.
50
51 - Additional column 1 = homozygous/heterozygous label.
52 - Additional column 2 = base-10 logarithm of (probability of homozygous / probability of heterozygous).
53 - Additional column 3 = Allele for most probable homozygous form.
54 - Additional column 4 = Allele 1 for most probable heterozygous form.
55 - Additional column 5 = Allele 2 for most probable heterozygous form.
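
A downstream script might read these columns as follows. This is a sketch under stated assumptions: the five additional columns are taken to be the last five tab-separated fields of each row, alleles are taken to be integer lengths, and `GenotypeCall` and `parse_corrected_line` are hypothetical names.

```python
from collections import namedtuple

# Field names are illustrative; the tool itself writes no header row here.
GenotypeCall = namedtuple(
    "GenotypeCall",
    "original label log10_ratio homo_allele hetero_allele_1 hetero_allele_2",
)


def parse_corrected_line(line):
    """Split one output row into the original columns and the five added ones."""
    fields = line.rstrip("\n").split("\t")  # tab separation is an assumption
    if 8 > len(fields):
        raise ValueError("expected three input columns plus five added columns")
    label, ratio, homo, het1, het2 = fields[-5:]
    return GenotypeCall(
        fields[:-5], label, float(ratio), int(homo), int(het1), int(het2)
    )
```

A negative `log10_ratio` should go together with a heterozygous label, and a positive one with a homozygous label.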
56
57 **Example**
58
59 - Suppose that we sequence one microsatellite locus with NGS. This locus has an **A** motif and the following length (bp) profile. ::
60
61 chr1_100_106 5, 6, 6, 6, 6, 7, 7, 8, 8 A
62
63 - We want to determine whether this locus is homozygous or heterozygous, and the corresponding allele(s). Therefore, we use this tool to refine the genotype.
64 - The tool calculates the probability that each homozygous form (A6A6, A7A7, and A8A8) generated the observed length profile. Among these, A7A7 has the highest probability, so it is used as the representative homozygous form.
65 - Then, the tool calculates the probability that each heterozygous form (A6A7, A7A8, and A6A8) generated the observed length profile. Among these, A6A8 has the highest probability, so it is used as the representative heterozygous form.
66 - A6A8 has a higher probability than A7A7. Therefore, the program reports this locus as heterozygous. ::
67
68 chr1 5,6,6,6,6,7,7,8,8 A hetero -14.8744881854 7 6 8
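
The candidate genotypes compared in this example can be enumerated with a short sketch. It assumes the three candidate lengths have already been chosen; `genotype_forms` and `form_name` are hypothetical helpers, and the probability calculation itself is not shown because it depends on the error profile.

```python
from itertools import combinations


def genotype_forms(lengths):
    """Return (homozygous forms, heterozygous forms) for the candidate lengths."""
    homozygous = [(n, n) for n in lengths]
    # Every unordered pair of distinct candidate lengths.
    heterozygous = list(combinations(lengths, 2))
    return homozygous, heterozygous


def form_name(form, motif):
    """Render a form such as (6, 8) with motif 'A' as 'A6A8'."""
    return "".join("{}{}".format(motif, n) for n in form)
```

With candidate lengths 6, 7 and 8 and motif A, this gives three homozygous forms (A6A6, A7A7, A8A8) and three heterozygous forms (A6A7, A6A8, A7A8), which are the six forms scored in the example.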
69
70
71 </help>
72 </tool>