.TH bcftools 1 "16 March 2011" "bcftools" "Bioinformatics tools"
.SH NAME
.PP
bcftools - Utilities for the Binary Call Format (BCF) and VCF.
.SH SYNOPSIS
.PP
bcftools index in.bcf
.PP
bcftools view in.bcf chr2:100-200 > out.vcf
.PP
bcftools view -vc in.bcf > out.vcf 2> out.afs
.SH DESCRIPTION
.PP
Bcftools is a toolkit for processing VCF/BCF files, calling variants and
estimating site allele frequencies and allele frequency spectra.
.SH COMMANDS AND OPTIONS

.TP 10
.B view
.B bcftools view
.RB [ \-AbFGNQSucgv ]
.RB [ \-D
.IR seqDict ]
.RB [ \-l
.IR listLoci ]
.RB [ \-s
.IR listSample ]
.RB [ \-i
.IR gapSNPratio ]
.RB [ \-t
.IR mutRate ]
.RB [ \-p
.IR varThres ]
.RB [ \-P
.IR prior ]
.RB [ \-1
.IR nGroup1 ]
.RB [ \-d
.IR minFrac ]
.RB [ \-U
.IR nPerm ]
.RB [ \-X
.IR permThres ]
.I in.bcf
.RI [ region ]

Convert between BCF and VCF, call variant candidates and estimate allele
frequencies.
.RS
.TP
.B Input/Output Options:
.TP 10
.B -A
Retain all possible alternate alleles at variant sites. By default, the view
command discards unlikely alleles.
.TP 10
.B -b
Output in the BCF format. The default is VCF.
.TP
.BI -D \ FILE
Sequence dictionary (list of chromosome names) for VCF->BCF conversion [null]
.TP
.B -F
Indicate that PL was generated by r921 or earlier (the ordering is different).
.TP
.B -G
Suppress all individual genotype information.
.TP
.BI -l \ FILE
List of sites at which information is output [all sites]
.TP
.B -N
Skip sites where the REF field is not A/C/G/T
.TP
.B -Q
Output the QCALL likelihood format
.TP
.BI -s \ FILE
List of samples to use. The first column in the input gives the sample names
and the second gives the ploidy, which can only be 1 or 2. When the 2nd column
is absent, the sample ploidy is assumed to be 2. In the output, the ordering of
samples will be identical to the one in
.IR FILE .
[null]
.TP
.B -S
The input is VCF instead of BCF.
.TP
.B -u
Uncompressed BCF output (force -b).
.TP
.B Consensus/Variant Calling Options:
.TP 10
.B -c
Call variants using Bayesian inference. This option automatically invokes option
.BR -e .
.TP
.BI -d \ FLOAT
When
.B -v
is in use, skip loci where the fraction of samples covered by reads is below FLOAT. [0]
.TP
.B -e
Perform max-likelihood inference only, including estimating the site allele frequency,
testing Hardy-Weinberg equilibrium and testing associations with LRT.
.TP
.B -g
Call per-sample genotypes at variant sites (force -c)
.TP
.BI -i \ FLOAT
Ratio of INDEL-to-SNP mutation rate [0.15]
.TP
.BI -p \ FLOAT
A site is considered to be a variant if P(ref|D)<FLOAT [0.5]
.TP
.BI -P \ STR
Prior or initial allele frequency spectrum. STR can be
.IR full ,
.IR cond2 ,
.I flat
or a file containing the error output of a previous variant calling
run.
.TP
.BI -t \ FLOAT
Scaled mutation rate for variant calling [0.001]
.TP
.B -v
Output variant sites only (force -c)
.TP
.B Contrast Calling and Association Test Options:
.TP
.BI -1 \ INT
Number of group-1 samples. This option is used for dividing the samples into
two groups for contrast SNP calling or association test.
When this option is in use, the following VCF INFO tags are output:
PC2, PCHI2 and QCHI2. [0]
.TP
.BI -U \ INT
Number of permutations for association test (effective only with
.BR -1 )
[0]
.TP
.BI -X \ FLOAT
Only perform permutations for P(chi^2)<FLOAT (effective only with
.BR -U )
[0.01]
.RE
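.PP
The sample list accepted by
.B -s
can be sketched as a small parser. This is a minimal illustration in Python,
not code from bcftools: the function name
.I parse_sample_list
and the whitespace-separated column layout are assumptions, and the sample
names are made up. The rules it applies (name in the first column, ploidy 1
or 2 in the second, ploidy 2 when the second column is absent, input order
preserved) are the ones stated for
.BR -s .
.PP
.nf
```python
def parse_sample_list(lines):
    """Return (name, ploidy) pairs in input order.

    Column 1 is the sample name and column 2 the ploidy (1 or 2).
    A missing second column means ploidy 2.
    """
    samples = []
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue  # assumption: blank lines are skipped
        ploidy = int(fields[1]) if len(fields) > 1 else 2
        if ploidy not in (1, 2):
            raise ValueError("ploidy must be 1 or 2: " + line)
        samples.append((fields[0], ploidy))
    return samples


# Hypothetical sample names, used only to exercise the parser.
print(parse_sample_list(["sampleA 2", "sampleB", "sampleC 1"]))
```
.fi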
.TP
.B index
.B bcftools index
.I in.bcf

Index sorted BCF for random access.
.RE

.TP
.B cat
.B bcftools cat
.I in1.bcf
.RI [ "in2.bcf " [ ... ]]

Concatenate BCF files. The input files are required to be sorted and
have identical samples appearing in the same order.
.RE
.SH BCFTOOLS SPECIFIC VCF TAGS

.TS
center box;
cb | cb | cb
l | l | l .
Tag	Format	Description
_
AF1	double	Max-likelihood estimate of the site allele frequency (AF) of the first ALT allele
CI95	double[2]	Equal-tail Bayesian credible interval of AF at the 95% level
DP	int	Raw read depth (without quality filtering)
DP4	int[4]	Number of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases
FQ	int	Consensus quality. Positive: sample genotypes different; negative: otherwise
MQ	int	Root-Mean-Square mapping quality of covering reads
PC2	int[2]	Phred probabilities of AF in group1 samples being larger and smaller than in group2
PCHI2	double	Posterior weighted chi^2 P-value between group1 and group2 samples
PV4	double[4]	P-values for strand bias, baseQ bias, mapQ bias and tail distance bias
QCHI2	int	Phred-scaled PCHI2
RP	int	Number of permutations yielding a smaller PCHI2
.TE
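.PP
The relation between PCHI2 and QCHI2 can be sketched in Python. The sketch
assumes the standard Phred definition, Q = -10 log10(P). Rounding to the
nearest integer and the cap of 255 are assumptions made for illustration and
may differ from what bcftools does internally. The names
.I phred
and
.I unphred
are hypothetical.
.PP
.nf
```python
import math


def phred(p, cap=255):
    """Phred-scale a probability: Q = -10 * log10(p), as an integer."""
    if p <= 0.0:
        return cap  # assumption: cap the score when p is zero
    return min(cap, int(round(-10.0 * math.log10(p))))


def unphred(q):
    """Invert the scaling: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


# A P-value of 0.01 corresponds to a Phred score of 20.
print(phred(0.01), unphred(20))
```
.fi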