.\" Source: pyPRADA_1.2/tools/samtools-0.1.16/bcftools/bcftools.1 @ 0:acc2ca1a3ba4
.\" Uploaded by siyuan, Thu, 20 Feb 2014 00:44:58 -0500
.TH bcftools 1 "16 March 2011" "bcftools" "Bioinformatics tools"
.SH NAME
.PP
bcftools - Utilities for the Binary Call Format (BCF) and VCF.
.SH SYNOPSIS
.PP
bcftools index in.bcf
.PP
bcftools view in.bcf chr2:100-200 > out.vcf
.PP
bcftools view -vc in.bcf > out.vcf 2> out.afs

.SH DESCRIPTION
.PP
Bcftools is a toolkit for processing VCF/BCF files, calling variants, and
estimating site allele frequencies and allele frequency spectra.

.SH COMMANDS AND OPTIONS

.TP 10
.B view
.B bcftools view
.RB [ \-AbFGNQSucegv ]
.RB [ \-D
.IR seqDict ]
.RB [ \-l
.IR listLoci ]
.RB [ \-s
.IR listSample ]
.RB [ \-i
.IR gapSNPratio ]
.RB [ \-t
.IR mutRate ]
.RB [ \-p
.IR varThres ]
.RB [ \-P
.IR prior ]
.RB [ \-1
.IR nGroup1 ]
.RB [ \-d
.IR minFrac ]
.RB [ \-U
.IR nPerm ]
.RB [ \-X
.IR permThres ]
.I in.bcf
.RI [ region ]

Convert between BCF and VCF, call variant candidates and estimate allele
frequencies.

.RS
.TP
.B Input/Output Options:
.TP 10
.B -A
Retain all possible alternate alleles at variant sites. By default, the view
command discards unlikely alleles.
.TP 10
.B -b
Output in the BCF format. The default is VCF.
.TP
.BI -D \ FILE
Sequence dictionary (list of chromosome names) for VCF->BCF conversion [null]
.TP
.B -F
Indicate that PL was generated by r921 or earlier (the ordering of PL values is different).
.TP
.B -G
Suppress all individual genotype information.
.TP
.BI -l \ FILE
List of sites at which information is output [all sites]
.TP
.B -N
Skip sites where the REF field is not A/C/G/T
.TP
.B -Q
Output the QCALL likelihood format
.TP
.BI -s \ FILE
List of samples to use. The first column in the input gives the sample names
and the second gives the ploidy, which can only be 1 or 2. When the 2nd column
is absent, the sample ploidy is assumed to be 2. In the output, the ordering of
samples will be identical to the one in
.IR FILE .
[null]
.TP
.B -S
The input is VCF instead of BCF.
.TP
.B -u
Uncompressed BCF output (force -b).
.TP
.B Consensus/Variant Calling Options:
.TP 10
.B -c
Call variants using Bayesian inference. This option automatically invokes option
.BR -e .
.TP
.BI -d \ FLOAT
When
.B -v
is in use, skip loci where the fraction of samples covered by reads is below FLOAT. [0]
.TP
.B -e
Perform max-likelihood inference only, including estimating the site allele frequency,
testing Hardy-Weinberg equilibrium and testing associations with LRT.
.TP
.B -g
Call per-sample genotypes at variant sites (force -c)
.TP
.BI -i \ FLOAT
Ratio of INDEL-to-SNP mutation rate [0.15]
.TP
.BI -p \ FLOAT
A site is considered to be a variant if P(ref|D)<FLOAT [0.5]
.TP
.BI -P \ STR
Prior or initial allele frequency spectrum. STR can be
.IR full ,
.IR cond2 ,
.I flat
or a file containing the error output from a previous variant calling
run.
.TP
.BI -t \ FLOAT
Scaled mutation rate for variant calling [0.001]
.TP
.B -v
Output variant sites only (force -c)
.TP
.B Contrast Calling and Association Test Options:
.TP
.BI -1 \ INT
Number of group-1 samples. This option divides the samples into
two groups for contrast SNP calling or association testing.
When this option is in use, the following VCF INFO tags are output:
PC2, PCHI2 and QCHI2. [0]
.TP
.BI -U \ INT
Number of permutations for association test (effective only with
.BR -1 )
[0]
.TP
.BI -X \ FLOAT
Only perform permutations for P(chi^2)<FLOAT (effective only with
.BR -U )
[0.01]
.RE
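.PP
The sample list read by
.B -s
can be illustrated with a minimal Python sketch. It is not part of bcftools.
The function name
.I parse_sample_list
is hypothetical, and splitting columns on whitespace is an assumption, since
the column delimiter is not stated above. Only the default ploidy of 2 and the
restriction to ploidy 1 or 2 come from the option description.
.PP
.nf
```python
def parse_sample_list(text):
    """Parse a sample list: name in column 1, optional ploidy in column 2."""
    samples = []
    for line in text.splitlines():
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue  # skip blank lines
        name = fields[0]
        # When the second column is absent, the ploidy is assumed to be 2.
        ploidy = int(fields[1]) if len(fields) > 1 else 2
        if ploidy not in (1, 2):
            raise ValueError("ploidy must be 1 or 2, got %d for %s" % (ploidy, name))
        samples.append((name, ploidy))
    return samples  # order matches the file, as it does in the output
```
.fi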

.TP
.B index
.B bcftools index
.I in.bcf

Index sorted BCF for random access.
.RE

.TP
.B cat
.B bcftools cat
.I in1.bcf
.RI [ "in2.bcf " [ ... ]]

Concatenate BCF files. The input files are required to be sorted and
have identical samples appearing in the same order.
.RE

.SH BCFTOOLS SPECIFIC VCF TAGS

.TS
center box;
cb | cb | cb
l | l | l .
Tag	Format	Description
_
AF1	double	Max-likelihood estimate of the site allele frequency (AF) of the first ALT allele
CI95	double[2]	Equal-tail Bayesian credible interval of AF at the 95% level
DP	int	Raw read depth (without quality filtering)
DP4	int[4]	Number of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases
FQ	int	Consensus quality. Positive: sample genotypes different; negative: otherwise
MQ	int	Root-Mean-Square mapping quality of covering reads
PC2	int[2]	Phred-scaled probability of AF in group1 samples being larger (first value) or smaller (second value) than in group2
PCHI2	double	Posterior weighted chi^2 P-value between group1 and group2 samples
PV4	double[4]	P-value for strand bias, baseQ bias, mapQ bias and tail distance bias
QCHI2	int	Phred-scaled PCHI2
RP	int	Number of permutations yielding a smaller PCHI2
.TE
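.PP
The relationship between PCHI2 and QCHI2 in the table above follows the usual
Phred scaling, Q = -10 log10(P). The Python sketch below illustrates that
conversion only. The function name
.I phred_scale
is hypothetical, and rounding to the nearest integer is an assumption;
bcftools may round differently.
.PP
.nf
```python
import math

def phred_scale(p):
    """Return -10*log10(p) rounded to the nearest integer."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Assumption: round to nearest; the exact rounding used by bcftools is not stated.
    return int(round(-10.0 * math.log10(p)))
```
.fi
.PP
Under this scaling, a PCHI2 of 0.01 corresponds to a QCHI2 of 20.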