comparison gemini_roh.xml @ 0:720cbfb4190d draft
Imported from capsule None
author | iuc |
---|---|
date | Mon, 25 Aug 2014 17:15:54 -0400 |
parents | |
children | 93bb0cfacefb |
-1:000000000000 | 0:720cbfb4190d |
---|---|
1 <tool id="gemini_@BINARY@" name="GEMINI @BINARY@" version="@VERSION@.0"> | |
2 <description>Identifying runs of homozygosity</description> | |
3 <expand macro="requirements" /> | |
4 <expand macro="version_command" /> | |
5 <macros> | |
6 <import>gemini_macros.xml</import> | |
7 <token name="@BINARY@">roh</token> | |
8 </macros> | |
9 <command> | |
10 <![CDATA[ | |
11 gemini @BINARY@ | |
12 --min-snps $min_snps | |
13 --min-total-depth $min_total_depth | |
14 --min-gt-depth $min_gt_depth | |
15 --min-size $min_size | |
16 --max-hets $max_hets | |
17 --max-unknowns $max_unknowns | |
18 #if $samples.strip() != '': | |
19 -s "${samples}" | |
20 #end if | |
21 "${ infile }" | |
22 > "${ outfile }" | |
23 ]]> | |
24 </command> | |
25 <expand macro="stdio" /> | |
26 <inputs> | |
27 <param name="infile" type="data" format="sqlite" label="GEMINI database" /> | |
28 | |
29 <param name="min_snps" type="integer" value="25" size="5" label="Minimum number of expected homozygous SNPs" help="default: 25 (--min-snps)"> | |
30 <validator type="in_range" min="0"/> | |
31 </param> | |
32 <param name="min_total_depth" type="integer" value="20" size="10" label="The minimum overall sequencing depth required for a SNP to be considered" help="default: 20 (--min-total-depth)"> | |
33 <validator type="in_range" min="0"/> | |
34 </param> | |
35 <param name="min_gt_depth" type="integer" value="0" size="10" label="The minimum required sequencing depth underlying a given sample's genotype for a SNP to be considered" | |
36 help="default: 0 (--min-gt-depth)"> | |
37 <validator type="in_range" min="0"/> | |
38 </param> | |
39 <param name="min_size" type="integer" value="100000" size="10" label="Minimum run size in base pairs" help="default: 100000 (--min-size)"> | |
40 <validator type="in_range" min="1"/> | |
41 </param> | |
42 <param name="max_hets" type="integer" value="1" size="5" label="Maximum number of allowed hets in the run" help="default: 1 (--max-hets)"> | |
43 <validator type="in_range" min="1"/> | |
44 </param> | |
45 <param name="max_unknowns" type="integer" value="3" size="5" label="Maximum number of allowed unknowns in the run" help="default: 3 (--max-unknowns)"> | |
46 <validator type="in_range" min="0"/> | |
47 </param> | |
48 | |
49 <param name="samples" size="30" type="text" value="" label="Comma-separated list of samples to screen for ROHs" help="e.g. S120,S450 (-s)"/> | |
50 | |
51 </inputs> | |
52 | |
53 <outputs> | |
54 <data name="outfile" format="tabular" label="${tool.name} on ${on_string}" /> | |
55 </outputs> | |
56 <tests> | |
57 <test> | |
58 </test> | |
59 </tests> | |
60 <help> | |
61 | |
62 **What it does** | |
63 | |
64 =========================================================================== | |
65 ``ROH``: Identifying runs of homozygosity | |
66 =========================================================================== | |
67 Runs of homozygosity are long stretches of homozygous genotypes that reflect | |
68 segments shared identically by descent and are a result of consanguinity or | |
69 natural selection. Consanguinity elevates the occurrence of rare recessive | |
70 diseases (e.g. cystic fibrosis), which occur in individuals homozygous for strongly deleterious | |
71 mutations. Hence, the identification of these runs holds medical value. | |
72 | |
73 The 'roh' tool in GEMINI returns runs of homozygosity identified in whole-genome data. | |
74 It treats every homozygous position on a chromosome as a possible start site for a run | |
75 and keeps the start sites that can give rise to a long stretch of homozygous | |
76 genotypes. | |
77 | |
78 For example, allowing ``1 HET`` genotype (h) and ``2 UKW`` (unknown) genotypes (u), | |
79 the possible ROH runs (H) for the genotype run below would be: | |
80 | |
81 | |
82 :: | |
83 | |
84 genotype_run = H H H H h H H H H u H H H H H u H H H H H H H h H H H H H h H H H H H | |
85 roh_run1 = H H H H h H H H H u H H H H H u H H H H H H H | |
86 roh_run2 = H H H H u H H H H H u H H H H H H H h H H H H H | |
87 roh_run3 = H H H H H u H H H H H H H h H H H H H | |
88 roh_run4 = H H H H H H H h H H H H H | |
89 | |
90 The ROH runs returned for ``--min-snps 20`` would be: | |
91 | |
92 :: | |
93 | |
94 roh_run1 = H H H H h H H H H u H H H H H u H H H H H H H | |
95 roh_run2 = H H H H u H H H H H u H H H H H H H h H H H H H | |
96 | |
97 | |
98 As the example shows, the first homozygous position to the right of a break (h or u) is a possible | |
99 start of a new ROH run, and genotypes to the left of a break are pruned since they cannot | |
100 be part of a longer run than one already seen. | |
101 | |
102 | |
103 @CITATION@ | |
104 </help> | |
105 <expand macro="citations"/> | |
106 </tool> |
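
The scanning idea described in the help text can be illustrated with a short Python sketch. This is not GEMINI's implementation: the function name `find_roh` and the single-letter genotype encoding (`H`, `h`, `u`) are assumptions taken from the help example. The sketch also assumes that `--min-snps` counts homozygous calls only and that trailing breaks are trimmed from a run, and it ignores the depth and `--min-size` filters entirely. It tries every candidate start site, so before the `min_snps` filter it may consider more short candidates than the four listed in the help.

```python
def find_roh(genotypes, min_snps=25, max_hets=1, max_unknowns=3):
    """Return (start, end) index pairs (end exclusive) of candidate runs.

    genotypes: sequence of 'H' (homozygous), 'h' (het), 'u' (unknown).
    Hypothetical helper for illustration; not part of GEMINI.
    """
    runs = []
    for start, gt in enumerate(genotypes):
        # A run may start only at a homozygous call that is the first
        # genotype or that directly follows a break (h or u).
        if gt != "H" or (start > 0 and genotypes[start - 1] == "H"):
            continue
        hets = unknowns = 0
        last_hom = start
        for i in range(start, len(genotypes)):
            g = genotypes[i]
            if g == "h":
                if hets == max_hets:
                    break  # one more het would exceed the allowance
                hets += 1
            elif g == "u":
                if unknowns == max_unknowns:
                    break  # one more unknown would exceed the allowance
                unknowns += 1
            else:
                last_hom = i
        end = last_hom + 1  # assumption: trailing breaks are trimmed
        n_hom = sum(1 for g in genotypes[start:end] if g == "H")
        if n_hom >= min_snps:
            runs.append((start, end))
    return runs


# Genotype run copied from the tool help.
example = ("H H H H h H H H H u H H H H H u H H H H H H H "
           "h H H H H H h H H H H H").split()

for start, end in find_roh(example, min_snps=20, max_hets=1, max_unknowns=2):
    print(" ".join(example[start:end]))
```

With the settings from the help example (1 het, 2 unknowns, `min_snps=20`), the loop prints the same two runs that the help lists as `roh_run1` and `roh_run2`.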