V2.0.5
author weilong-guo
date Tue, 05 Nov 2013 01:55:39 -0500

BS-Seeker2
=========

BS-Seeker2 (BS Seeker 2) performs accurate and fast mapping of bisulfite-treated short reads. BS-Seeker2 is an updated version of BS-Seeker.

0. Availability
============

Homepage of [BS-Seeker2](http://pellegrini.mcdb.ucla.edu/BS_Seeker2/).

The source code for this package is available from
[https://github.com/BSSeeker/BSseeker2](https://github.com/BSSeeker/BSseeker2).
Also, you can use an instance of BS-Seeker 2 in Galaxy from [http://galaxy.hoffman2.idre.ucla.edu](http://galaxy.hoffman2.idre.ucla.edu).
(Label: "NGS: Methylation Mapping"/"Methylation Map with BS Seeker2")


1. Remarkable new features
============
* Reduced index for RRBS, which accelerates mapping and increases mappability
* Local alignment with Bowtie 2, which increases mappability

2. Other features
============
* Supported library types
    - whole-genome bisulfite sequencing (WGBS)
    - reduced representation bisulfite sequencing (RRBS)

* Supported formats for input file
    - [fasta](http://en.wikipedia.org/wiki/FASTA_format)
    - [fastq](http://en.wikipedia.org/wiki/FASTQ_format)
    - [qseq](http://jumpgate.caltech.edu/wiki/QSeq)
    - pure sequence (one sequence per line)

* Supported alignment tools
    - [bowtie](http://bowtie-bio.sourceforge.net/index.shtml) : Single-seed
    - [bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) : Multiple-seed, gapped-alignment
        - [local alignment](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#local-alignment-example)
        - [end-to-end alignment](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#end-to-end-alignment-example)
    - [SOAP](http://soap.genomics.org.cn/)

* Supported formats for mapping results
    - [BAM](http://genome.ucsc.edu/FAQ/FAQformat.html#format5.1)
    - [SAM](http://samtools.sourceforge.net/)
    - [BS-seeker 1](http://pellegrini.mcdb.ucla.edu/BS_Seeker/USAGE.html)

3. System requirements
============

* Linux or Mac OS platform
* One of the following aligners
    - [bowtie](http://bowtie-bio.sourceforge.net/) (fast)
    - [bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/) (Default)
    - [soap](http://soap.genomics.org.cn/)
    - [rmap](http://www.cmb.usc.edu/people/andrewds/rmap/)
* [Python](http://www.python.org/download/) (Version 2.6 +)

    (It is normally pre-installed in Linux. Type "python -V" to see the installed version.)

* [pysam](http://code.google.com/p/pysam/) package is needed.

    (Read "Questions & Answers" if you have problems installing this package.)


4. Modules' descriptions
============

(0) FilterReads.py
------------

Optional and independent module.
Some reads may be heavily amplified during PCR. This script keeps only unique reads before mapping. Whether to filter reads before mapping is up to you.

##Usage :

    $ python FilterReads.py
    Usage: FilterReads.py -i <input> -o <output> [-k]
    Author : Guo, Weilong; 2012-11-10
    Unique reads for qseq/fastq/fasta/sequence, and filter
    low quality reads in qseq file.

    Options:
      -h, --help  show this help message and exit
      -i FILE     Name of the input qseq/fastq/fasta/sequence file
      -o FILE     Name of the output file
      -k          Would not filter low quality reads if specified


##Tip :

- This step is not recommended for RRBS libraries, as reads from an RRBS library are more likely to come from the same location.
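
The idea of keeping only unique reads can be sketched in a few lines of Python. This is a minimal illustration and not the code of FilterReads.py: the function name `unique_reads` is hypothetical, and it assumes the "pure sequence" input format (one sequence per line) with no quality filtering.

```python
def unique_reads(sequences):
    """Yield each distinct read sequence once, in first-seen order.

    Hypothetical helper for illustration only; FilterReads.py itself
    may handle formats and quality filtering differently.
    """
    seen = set()
    for seq in sequences:
        key = seq.strip().upper()
        if not key or key in seen:
            continue  # skip blank lines and sequences already emitted
        seen.add(key)
        yield key


reads = ["ACGTTT", "ACGTTT", "TTGACA", "acgttt"]
print(list(unique_reads(reads)))
```

Only the sequence is used as the key here, so two reads with the same bases but different names or qualities count as duplicates.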


(1) bs_seeker2-build.py
------------

Module to build the index for BS-Seeker2.


##Usage :

    $ python bs_seeker2-build.py -h
    Usage: bs_seeker2-build.py [options]

    Options:
      -h, --help            show this help message and exit
      -f FILE, --file=FILE  Input your reference genome file (fasta)
      --aligner=ALIGNER     Aligner program to perform the analysis: bowtie,
                            bowtie2, soap, rmap [Default: bowtie]
      -p PATH, --path=PATH  Path to the aligner program. Detected:
                            bowtie: ~/install/bowtie
                            bowtie2: ~/install/bowtie2
                            rmap: ~/install/rmap_/bin
                            soap: ~/install/soap/
      -d DBPATH, --db=DBPATH
                            Path to the reference genome library (generated in
                            preprocessing genome) [Default:
                            ~/install/BSseeker2/bs_utils/reference_genomes]
      -v, --version         show version of BS-Seeker2

      Reduced Representation Bisulfite Sequencing Options:
        Use these options in conjunction with -r [--rrbs]

        -r, --rrbs          Build index specially for Reduced Representation
                            Bisulfite Sequencing experiments. Genome other than
                            certain fragments will be masked. [Default: False]
        -l LOW_BOUND, --low=LOW_BOUND
                            lower bound of fragment length (excluding recognition
                            sequence such as C-CGG) [Default: 20]
        -u UP_BOUND, --up=UP_BOUND
                            upper bound of fragment length (excluding recognition
                            sequence such as C-CGG ends) [Default: 500]
        -c CUT_FORMAT, --cut-site=CUT_FORMAT
                            Cut sites of restriction enzyme. Ex: MspI(C-CGG),
                            Mael:(C-TAG), double-enzyme MspI&Mael:(C-CGG,C-TAG).
                            [Default: C-CGG]


##Example

* Build genome index for WGBS using bowtie; the path to bowtie should be included in $PATH

        python bs_seeker2-build.py -f genome.fa --aligner=bowtie

* Build genome index for RRBS with default parameters, specifying the path for bowtie2

        python bs_seeker2-build.py -f genome.fa --aligner=bowtie2 -p ~/install/bowtie2-2.0.0-beta7/ -r

* Build genome index for RRBS library using bowtie2, with fragment lengths ranging [40bp, 400bp]

        python bs_seeker2-build.py -f genome.fa -r -l 40 -u 400 --aligner=bowtie2

* Build genome index for RRBS library for double-enzyme : MspI (C-CGG) & ApeKI (G-CWGC, where W=A|T, see [IUPAC code](http://www.bioinformatics.org/sms/iupac.html))

        python bs_seeker2-build.py -f genome.fa -r -c C-CGG,G-CWGC --aligner=bowtie

##Tips:

- The index built for BS-Seeker2 is different from the index for BS-Seeker 1.
For RRBS, you need to specify "-r" in the parameters. You also need to specify LOW_BOUND and UP_BOUND for the range of fragment lengths according to your protocol.

- The fragment length is different from the read length. Fragments refer to the DNA fragments obtained from the size-selection step (e.g. gel cut or AMPure beads). Fragment lengths are expected to fall in a range, such as [50bp,250bp].

- The indexes for RRBS and WGBS are different. Also, indexes for RRBS are specific to the fragment length parameters (LOW_BOUND and UP_BOUND).
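
To make the roles of the cut-site pattern, LOW_BOUND and UP_BOUND more concrete, here is a rough Python sketch of selecting RRBS fragments from a reference sequence. It is an illustration under stated assumptions, not the code of bs_seeker2-build.py: the function names are hypothetical, the genome is assumed to be an upper-case string, and fragment length is simply measured between neighbouring cut positions, which may differ from how BS-Seeker2 excludes the recognition sequence.

```python
import re

# Subset of IUPAC nucleotide codes, written as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
         "M": "[AC]", "K": "[GT]", "N": "[ACGT]"}


def parse_cut_site(cut_format):
    """Turn a pattern such as 'C-CGG' into (regex, offset of the cut)."""
    offset = cut_format.index("-")
    bases = cut_format.replace("-", "")
    regex = "".join(IUPAC[b] for b in bases)
    return regex, offset


def cut_positions(genome, cut_formats):
    """Return sorted 0-based cut coordinates for all listed enzymes."""
    positions = set()
    for cut_format in cut_formats.split(","):
        regex, offset = parse_cut_site(cut_format)
        # A lookahead lets overlapping recognition sites be found too.
        for match in re.finditer("(?=%s)" % regex, genome):
            positions.add(match.start() + offset)
    return sorted(positions)


def select_fragments(genome, cut_formats="C-CGG", low=20, up=500):
    """Keep (start, end) pairs whose length lies within [low, up]."""
    cuts = cut_positions(genome, cut_formats)
    return [(s, e) for s, e in zip(cuts, cuts[1:]) if low <= e - s <= up]


genome = "AACCGG" + "T" * 30 + "CCGG" + "A" * 5 + "CCGGTT"
print(select_fragments(genome, "C-CGG", low=20, up=500))  # [(3, 37)]
```

In this toy genome the second fragment is only 9 bp long, so it falls below the lower bound and is dropped. This is why an index built with one pair of bounds does not fit reads from a library size-selected with another.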




(2) bs_seeker2-align.py
------------

Module to map reads on 3-letter converted genome.
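
As background, the general 3-letter idea used by bisulfite aligners can be sketched as follows: bisulfite treatment turns unmethylated C into T in the reads, so both reads and reference are reduced to a 3-letter alphabet before alignment. The snippet below is a minimal illustration with hypothetical function names; which conversion BS-Seeker2 applies for each strand and library type is not shown here.

```python
def c_to_t(seq):
    """Replace every C with T (3-letter alphabet A, G, T)."""
    return seq.upper().replace("C", "T")


def g_to_a(seq):
    """Replace every G with A (3-letter alphabet A, C, T)."""
    return seq.upper().replace("G", "A")


read = "ACGTCG"
print(c_to_t(read))
print(g_to_a(read))
```

After conversion, a read matches its origin regardless of which cytosines were methylated. The methylation state is recovered afterwards by comparing the original read with the original genome.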

##Usage :

    $ python ~/install/BSseeker2/bs_seeker2-align.py -h
    Usage: bs_seeker2-align.py {-i <single> | -1 <mate1> -2 <mate2>} -g <genome.fa> [options]

    Options:
      -h, --help            show this help message and exit

      For single end reads:
        -i INFILE, --input=INFILE
                            Input read file (FORMAT: sequences, qseq, fasta,
                            fastq). Ex: read.fa or read.fa.gz

      For pair end reads:
        -1 FILE, --input_1=FILE
                            Input read file, mate 1 (FORMAT: sequences, qseq,
                            fasta, fastq)
        -2 FILE, --input_2=FILE
                            Input read file, mate 2 (FORMAT: sequences, qseq,
                            fasta, fastq)
        -I MIN_INSERT_SIZE, --minins=MIN_INSERT_SIZE
                            The minimum insert size for valid paired-end
                            alignments [Default: 0]
        -X MAX_INSERT_SIZE, --maxins=MAX_INSERT_SIZE
                            The maximum insert size for valid paired-end
                            alignments [Default: 500]

      Reduced Representation Bisulfite Sequencing Options:
        -r, --rrbs          Map reads to the Reduced Representation genome
        -c pattern, --cut-site=pattern
                            Cutting sites of restriction enzyme. Ex: MspI(C-CGG),
                            Mael:(C-TAG), double-enzyme MspI&Mael:(C-CGG,C-TAG).
                            [Default: C-CGG]
        -L RRBS_LOW_BOUND, --low=RRBS_LOW_BOUND
                            Lower bound of fragment length (excluding C-CGG ends)
                            [Default: 20]
        -U RRBS_UP_BOUND, --up=RRBS_UP_BOUND
                            Upper bound of fragment length (excluding C-CGG ends)
                            [Default: 500]

      General options:
        -t TAG, --tag=TAG   [Y]es for undirectional lib, [N]o for directional
                            [Default: N]
        -s CUTNUMBER1, --start_base=CUTNUMBER1
                            The first cycle of the read to be mapped [Default: 1]
        -e CUTNUMBER2, --end_base=CUTNUMBER2
                            The last cycle of the read to be mapped [Default: 200]
        -a FILE, --adapter=FILE
                            Input text file of your adaptor sequences (to be
                            trimmed from the 3' end of the reads). Input one seq
                            for dir. lib., two seqs for undir. lib. One line per
                            sequence. Only the first 10bp will be used
        --am=ADAPTER_MISMATCH
                            Number of mismatches allowed in adapter [Default: 0]
        -g GENOME, --genome=GENOME
                            Name of the reference genome (should be the same as
                            "-f" in bs_seeker2-build.py ) [ex. chr21_hg18.fa]
        -m NO_MISMATCHES, --mismatches=NO_MISMATCHES
                            Number of mismatches in one read [Default: 4]
        --aligner=ALIGNER   Aligner program for short reads mapping: bowtie,
                            bowtie2, soap, rmap [Default: bowtie]
        -p PATH, --path=PATH
                            Path to the aligner program. Detected:
                            bowtie: ~/install/bowtie
                            bowtie2: ~/install/bowtie2
                            rmap: ~/install/rmap/bin
                            soap: ~/install/soap/
        -d DBPATH, --db=DBPATH
                            Path to the reference genome library (generated in
                            preprocessing genome) [Default:
                            ~/install/BSseeker2/bs_utils/reference_genomes]
        -l NO_SPLIT, --split_line=NO_SPLIT
                            Number of lines per split (the read file will be split
                            into small files for mapping. The result will be
                            merged. [Default: 4000000]
        -o OUTFILE, --output=OUTFILE
                            The name of output file [INFILE.bs(se|pe|rrbs)]
        -f FORMAT, --output-format=FORMAT
                            Output format: bam, sam, bs_seeker1 [Default: bam]
        --no-header         Suppress SAM header lines [Default: False]
        --temp_dir=PATH     The path to your temporary directory [Detected: /tmp]
        --XS=XS_FILTER      Filter definition for tag XS, format X,Y. X=0.8 and
                            y=5 indicate that for one read, if #(mCH sites)/#(all
                            CH sites)>0.8 and #(mCH sites)>5, then tag XS=1; or
                            else tag XS=0. [Default: 0.5,5]
        --multiple-hit      Output reads with multiple hits to
                            file "Multiple_hit.fa"
        -v, --version       show version of BS-Seeker2

      Aligner Options:
        You may specify any additional options for the aligner. You just have
        to prefix them with --bt- for bowtie, --bt2- for bowtie2, --soap- for
        soap, --rmap- for rmap, and BS Seeker will pass them on. For example:
        --bt-p 4 will increase the number of threads for bowtie to 4,
        --bt--tryhard will instruct bowtie to try as hard as possible to find
        valid alignments when they exist, and so on. Be sure that you know what
        you are doing when using these options! Also, we don't do any
        validation on the values.
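
The prefix convention described under "Aligner Options" can be pictured with a small Python sketch. It only mirrors the two documented examples ("--bt-p 4" becomes "-p 4", "--bt--tryhard" becomes "--tryhard"); the function name and the parsing logic are assumptions made for illustration, not the option handling actually used by bs_seeker2-align.py.

```python
# Prefixes as listed in the help text above.
PREFIXES = {"bowtie": "--bt-", "bowtie2": "--bt2-",
            "soap": "--soap-", "rmap": "--rmap-"}


def extract_aligner_args(argv, aligner):
    """Collect options meant for the aligner and strip their prefix.

    Hypothetical helper: a token right after a prefixed option is treated
    as its value unless it starts with '-'.
    """
    prefix = PREFIXES[aligner]
    passed = []
    expect_value = False
    for token in argv:
        if token.startswith(prefix):
            # '--bt-p' -> '-p', '--bt--tryhard' -> '--tryhard'
            passed.append("-" + token[len(prefix):])
            expect_value = True
        elif expect_value and not token.startswith("-"):
            passed.append(token)
            expect_value = False
        else:
            expect_value = False
    return passed


argv = ["-i", "WGBS.fa", "--bt-p", "4", "--bt--tryhard", "-g", "genome.fa"]
print(extract_aligner_args(argv, "bowtie"))
```

As the help text warns, nothing in this scheme validates the forwarded values, so a mistyped option only surfaces when the aligner itself rejects it.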


##Examples :

* WGBS library ; alignment mode, bowtie ; map to WGBS index

        python bs_seeker2-align.py -i WGBS.fa --aligner=bowtie -o WGBS.bam -f bam -g genome.fa

* WGBS library ; alignment mode, bowtie2-local ; map to WGBS index

        python bs_seeker2-align.py -i WGBS.fa --aligner=bowtie2 -o WGBS.bam -f bam -g genome.fa

* WGBS library ; alignment mode, bowtie2-end-to-end ; map to WGBS index

        python bs_seeker2-align.py -i WGBS.fa -m 3 --aligner=bowtie2 -o WGBS.bam -f bam -g genome.fa --bt2--end-to-end

* RRBS library ; alignment mode, bowtie ; map to RR index

        python bs_seeker2-align.py -i RRBS.fa --aligner=bowtie -o RRBS.bam -g genome.fa -r -a adapter.txt

* RRBS library ; alignment mode, bowtie ; map to WG index

        python bs_seeker2-align.py -i RRBS.fa --aligner=bowtie -o RRBS.bam -g genome.fa -a adapter.txt

* RRBS library ; alignment mode, bowtie2-end-to-end ; map to WG index

        python bs_seeker2-align.py -i RRBS.fa --aligner=bowtie2 -o RRBS.bam -g genome.fa -a adapter.txt --bt2--end-to-end

* Align from qseq format for RRBS with bowtie, specifying lengths of fragments ranging [40bp, 400bp]

        python bs_seeker2-align.py -i RRBS.qseq --aligner=bowtie -o RRBS.bam -f bam -g genome.fa -r --low=40 --up=400 -a adapter.txt

    The parameters '--low' and '--up' must match the corresponding parameters used when building the genome index.

* WGBS library ; alignment mode, bowtie ; map to WGBS index; use 8 threads for alignment

        python bs_seeker2-align.py -i WGBS.fa --aligner=bowtie -o WGBS.bam -f bam -g genome.fa --bt-p 4

    BS-Seeker2 will run TWO bowtie instances in parallel, so "--bt-p 4" uses 8 threads in total.


##Input file:

- Adapter.txt (example) :

        AGATCGGAAGAGCACACGTC
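
The usage text for "-a" and "--am" says that only the first 10bp of each adapter are used and that a number of mismatches may be allowed. A rough Python sketch of such 3' trimming is given below. It is an assumption-laden illustration (hypothetical function name, plain Hamming distance, only full 10bp occurrences are detected) and may not match what BS-Seeker2 does internally.

```python
def trim_adapter(read, adapter, max_mismatch=0, seed_len=10):
    """Cut the read at the first position where the adapter seed matches.

    Only the first `seed_len` bases of the adapter are compared, and up to
    `max_mismatch` mismatching bases are tolerated. Illustrative only.
    """
    seed = adapter[:seed_len]
    for start in range(len(read) - len(seed) + 1):
        window = read[start:start + len(seed)]
        mismatches = sum(1 for a, b in zip(window, seed) if a != b)
        if mismatches <= max_mismatch:
            return read[:start]  # drop the adapter and everything after it
    return read  # no adapter seed found, keep the read as it is


adapter = "AGATCGGAAGAGCACACGTC"
read = "TTGGTTAATT" + "AGATCGGAAGAGC"
print(trim_adapter(read, adapter))
```

With the adapter line from the example file above, only "AGATCGGAAG" is compared against the read in this sketch.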
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
318
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
319
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
320 ##Output files:
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
321
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
322 - SAM file
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
323
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
324 Sample:
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
325
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
326 10918 0 chr1 133859922 255 100M * 0 0 TGGTTGTTTTTGTTATAGTTTTTTGTTGTAGAGTTTTTTTTGGAAAGTTGTGTTTATTTTTTTTTTTGTTTGGGTTTTGTTTGAAAGGGGTGGATGAGTT * XO:Z:+FW XS:i:0 NM:i:3 XM:Z:x--yx-zzzy--y--y--zz-zyx-yx-y--------z------------x--------z--zzz----y----y--x-zyx--------y--------z XG:Z:-C_CGGCCGCCCCTGCTGCAGCCTCCCGCCGCAGAGTTTTCTTTGGAAAGTTGCGTTTATTTCTTCCCTTGTCTGGGCTGCGCCCGAAAGGGGCAGATGAGTC_AC
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
327
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
328
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
329 Format descriptions:
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
330
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
331 BS-Seeker2 specific tags:
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
332 XO : orientation, from forward/reverted
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
333 XS : 1 when read is recognized as not fully converted by bisulfite treatment, or else 0
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
334 XM : number of sites for mismatch
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
335 X: methylated CG
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
336 x: un-methylated CG
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
337 Y: methylated CHG
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
338 y: un-methylated CHG
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
339 Z: methylated CHH
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
340 z: un-methylated CHH
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
341 XG : genome sequences, with 2bp extended on both ends, from 5' to 3'
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
342 YR : tag only for RRBS, serial id of mapped fragment
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
343 YS : tag only for RRBS, start position of mapped fragment
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
344 YE : tag only for RRBS, end position of mapped fragment
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
345
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
346 Note:
e6df770c0e58 Initial upload
weilong-guo
parents:
diff changeset
347 For reads mapped on Watson(minus) strand, the 10th colum in SAM file is not the original reads but the revered sequences.
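
The XM string in the sample above carries one symbol per base, using the X/x, Y/y and Z/z codes from the legend. As a minimal sketch (not part of BS-Seeker2), the snippet below pulls a tag out of a tab-separated SAM line and tallies the calls per context. The helper names `get_tag` and `summarize_xm` are hypothetical, and the code is written for Python 3.

```python
from collections import Counter

# XM symbol -> (context, is_methylated), following the legend above.
XM_CODES = {
    "X": ("CG", True), "x": ("CG", False),
    "Y": ("CHG", True), "y": ("CHG", False),
    "Z": ("CHH", True), "z": ("CHH", False),
}


def get_tag(sam_line, tag):
    """Return the value of an optional SAM field such as XM, or None if absent."""
    # Optional fields start at the 12th tab-separated column of a SAM record.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith(tag + ":"):
            return field.split(":", 2)[2]
    return None


def summarize_xm(xm):
    """Count methylated and unmethylated calls per context in an XM string."""
    counts = Counter()
    for symbol in xm:
        if symbol in XM_CODES:  # '-' and any other symbol are skipped
            counts[XM_CODES[symbol]] += 1
    return counts


if __name__ == "__main__":
    counts = summarize_xm("x--yx-zzzy")
    for (context, methylated), n in sorted(counts.items()):
        print(context, "methylated" if methylated else "unmethylated", n)
```

Keeping the symbol table in one dictionary means the counting loop never has to special-case a context.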


##Tips:

- Removing adapter is recommended.

    If you don't know your adapter sequence, please ask the person who generated the library for you.

    If you are too shy to ask, you can try de novo motif finding tools (such as [DME](http://cb1.utdallas.edu/dme/index.htm) and [MEME](http://meme.nbcr.net/meme/cgi-bin/meme.cgi)) to find the enriched pattern in 1000 reads.

    Of course, you can also use other tools (such as [cutadapt](http://code.google.com/p/cutadapt/)) to remove the adapter first.

- It's always better to use a wider range for fragment length.

    For example, if 95% of reads come from fragments with length range [50bp, 250bp], you'd better choose [40bp, 300bp].

- Fewer mismatches for the 'local alignment' mode.

    In 'local alignment' mode, poorly sequenced bases are usually trimmed, so they are not counted by the parameter "-m".
    It is suggested to use fewer mismatches for the 'local alignment' mode.

(3) bs_seeker2-call_methylation.py
------------


This module calls methylation levels from the mapping result.

##Usage:

    $ python bs_seeker2-call_methylation.py -h
    Options:
      -h, --help            show this help message and exit
      -i INFILE, --input=INFILE
                            BAM output from bs_seeker2-align.py
      -d DBPATH, --db=DBPATH
                            Path to the reference genome library (generated in
                            preprocessing genome) [Default: ~/install
                            /BSseeker2/bs_utils/reference_genomes]
      -o OUTFILE, --output-prefix=OUTFILE
                            The output prefix to create ATCGmap and wiggle files
                            [INFILE]
      --wig=OUTFILE         The output .wig file [INFILE.wig]
      --CGmap=OUTFILE       The output .CGmap file [INFILE.CGmap]
      --ATCGmap=OUTFILE     The output .ATCGmap file [INFILE.ATCGmap]
      -x, --rm-SX           Removed reads with tag 'XS:i:1', which would be
                            considered as not fully converted by bisulfite
                            treatment [Default: False]
      --txt                 Show CGmap and ATCGmap in .gz [Default: False]
      -r READ_NO, --read-no=READ_NO
                            The least number of reads covering one site to be
                            shown in wig file [Default: 1]
      -v, --version         show version of BS-Seeker2


##Example :

- For WGBS (whole genome bisulfite sequencing):

        python bs_seeker2-call_methylation.py -i WGBS.bam -o output --db /path/to/BSseeker2/bs_utils/reference_genomes/genome.fa_bowtie/

- For RRBS:

        python bs_seeker2-call_methylation.py -i RRBS.bam -o output --db /path/to/BSseeker2/bs_utils/reference_genomes/genome.fa_rrbs_40_400_bowtie2/

- For RRBS, removing un-converted reads (with tag XS:i:1):

        python bs_seeker2-call_methylation.py -x -i RRBS.bam -o output --db /path/to/BSseeker2/bs_utils/reference_genomes/genome.fa_rrbs_75_280_bowtie2/

- For RRBS, showing only sites covered by at least 10 reads in the WIG file:

        python bs_seeker2-call_methylation.py -r 10 -i RRBS.bam -o output --db /path/to/BSseeker2/bs_utils/reference_genomes/genome.fa_rrbs_75_280_bowtie2/


The folder passed to `--db` (e.g. “genome.fa\_rrbs\_40\_500\_bowtie2”) is the one built in the first step (bs_seeker2-build.py).

##Output files:

- wig file

    Sample:

        variableStep chrom=chr1
        3000419 0.000000
        3000423 -0.2
        3000440 0.000000
        3000588 0.5
        3000593 -0.000000


    Format descriptions:
    WIG file format. A negative value in the 2nd column indicates a cytosine on the minus strand.
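
Because the sign of the second column carries the strand, a reader has to look at the sign in the text itself: a value such as `-0.000000` would otherwise lose it. The following is a minimal Python 3 sketch of such a reader, assuming that `-0.000000` marks a minus-strand cytosine with zero methylation. `parse_wig` is a hypothetical helper name, not a BS-Seeker2 function.

```python
def parse_wig(lines):
    """Yield (chrom, position, strand, level) from variableStep WIG lines."""
    chrom = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("variableStep"):
            # Declaration line, e.g. "variableStep chrom=chr1"
            for token in line.split()[1:]:
                key, _, value = token.partition("=")
                if key == "chrom":
                    chrom = value
            continue
        pos_text, value_text = line.split()[:2]
        # Test the sign on the text so that "-0.000000" keeps its strand.
        strand = "-" if value_text.startswith("-") else "+"
        yield chrom, int(pos_text), strand, abs(float(value_text))


if __name__ == "__main__":
    sample = [
        "variableStep chrom=chr1",
        "3000419 0.000000",
        "3000423 -0.2",
        "3000593 -0.000000",
    ]
    for record in parse_wig(sample):
        print(record)
```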


- CGmap file

    Sample:

        chr1 G 3000851 CHH CC 0.1 1 10
        chr1 C 3001624 CHG CA 0.0 0 9
        chr1 C 3001631 CG CG 1.0 5 5

    Format descriptions:

        (1) chromosome
        (2) nucleotide on Watson (+) strand
        (3) position
        (4) context (CG/CHG/CHH)
        (5) dinucleotide-context (CA/CC/CG/CT)
        (6) methylation-level = #_of_C / (#_of_C + #_of_T)
        (7) #_of_C (methylated C, the count of reads showing C here)
        (8) #_of_C + #_of_T (all cytosines, the count of reads showing C or T here)
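
The eight columns map naturally onto a small record type. The sketch below (Python 3, with hypothetical names `CGmapRecord`, `parse_cgmap_line` and `methylation_level`) parses one line and recomputes column 6 from columns 7 and 8. It assumes whitespace-separated fields as in the sample.

```python
from collections import namedtuple

CGmapRecord = namedtuple(
    "CGmapRecord",
    "chrom nucleotide position context dinucleotide level methylated total",
)


def parse_cgmap_line(line):
    """Split one CGmap line into typed fields (columns 1-8 described above)."""
    f = line.split()
    return CGmapRecord(f[0], f[1], int(f[2]), f[3], f[4],
                       float(f[5]), int(f[6]), int(f[7]))


def methylation_level(record):
    """Recompute column 6 as #C / (#C + #T); None if no read covers the site."""
    if record.total == 0:
        return None
    return record.methylated / record.total


if __name__ == "__main__":
    rec = parse_cgmap_line("chr1 C 3001631 CG CG 1.0 5 5")
    print(rec.context, methylation_level(rec))
```

Filtering by coverage then becomes a one-liner, for example `[r for r in records if r.total >= 10]`.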


- ATCGmap file

    Sample:

        chr1 T 3009410 -- -- 0 10 0 0 0 0 0 0 0 0 na
        chr1 C 3009411 CHH CC 0 10 0 0 0 0 0 0 0 0 0.0
        chr1 C 3009412 CHG CC 0 10 0 0 0 0 0 0 0 0 0.0
        chr1 C 3009413 CG CG 0 10 50 0 0 0 0 0 0 0 0.83


    Format descriptions:

        (1) chromosome
        (2) nucleotide on Watson (+) strand
        (3) position
        (4) context (CG/CHG/CHH)
        (5) dinucleotide-context (CA/CC/CG/CT)

        (6) - (10) plus strand
        (6) # of reads from Watson strand mapped here, support A on Watson strand
        (7) # of reads from Watson strand mapped here, support T on Watson strand
        (8) # of reads from Watson strand mapped here, support C on Watson strand
        (9) # of reads from Watson strand mapped here, support G on Watson strand
        (10) # of reads from Watson strand mapped here, support N

        (11) - (15) minus strand
        (11) # of reads from Crick strand mapped here, support A on Watson strand and T on Crick strand
        (12) # of reads from Crick strand mapped here, support T on Watson strand and A on Crick strand
        (13) # of reads from Crick strand mapped here, support C on Watson strand and G on Crick strand
        (14) # of reads from Crick strand mapped here, support G on Watson strand and C on Crick strand
        (15) # of reads from Crick strand mapped here, support N

        (16) methylation_level = #C/(#C+#T) = C8/(C7+C8) for Watson strand, = C14/(C11+C14) for Crick strand;
        "na" means no reads support C/T at this position.
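
The formula for column 16 can be checked against the sample rows with a short Python 3 sketch. `atcgmap_level` is a hypothetical helper name. The rule for picking the strand (a `C` in column 2 means a Watson-strand cytosine, a `G` means a Crick-strand cytosine) is an assumption drawn from the column descriptions above.

```python
def atcgmap_level(fields):
    """Recompute column 16 from the read counts in columns 6-15.

    `fields` is one ATCGmap line already split on whitespace.
    Returns None where no methylation level can be computed.
    """
    nucleotide = fields[1].upper()
    counts = [int(x) for x in fields[5:15]]  # columns 6..15 as integers

    def col(n):
        """Read count stored in 1-based column n (6 <= n <= 15)."""
        return counts[n - 6]

    if nucleotide == "C":    # cytosine on the Watson strand
        methylated, unmethylated = col(8), col(7)
    elif nucleotide == "G":  # cytosine on the Crick strand
        methylated, unmethylated = col(14), col(11)
    else:                    # A, T or N: no cytosine on either strand
        return None
    total = methylated + unmethylated
    return methylated / total if total else None


if __name__ == "__main__":
    row = "chr1 C 3009413 CG CG 0 10 50 0 0 0 0 0 0 0 0.83".split()
    print(round(atcgmap_level(row), 2))
```

For the last sample row this gives 50 / (10 + 50), which rounds to the 0.83 shown in the file.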



Contact Information
============

If you still have questions about BS-Seeker 2, find bugs when using it, or have suggestions, please email [Weilong Guo](mailto:guoweilong@gmail.com).



Questions & Answers
============

(1) Speed-up your alignment

Q: "It takes me days to do the alignment for one lane" ...

A: Yes, alignment is time-consuming, especially as sequencing depth keeps increasing. An efficient way to align is:

i. cut the original sequence file into multiple small pieces (for FASTQ input, keep the line count a multiple of 4 so that no read is split);

    Ex: split -l 4000000 input.fq

ii. align them in parallel;

iii. merge all the BAM files into a single one before running "bs_seeker2-call_methylation.py" (use the "samtools merge" command).

    Ex: samtools merge out.bam in1.bam in2.bam in3.bam
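
Where `split` is not available, step i can be sketched in Python 3 as follows. This is an illustration only: `split_fastq` and its parameters are hypothetical names, and the code assumes a plain 4-line-per-read FASTQ file (no wrapped sequence lines). The default of 1,000,000 reads per chunk corresponds to the 4,000,000 lines used in the `split` example.

```python
import itertools


def split_fastq(path, reads_per_chunk=1000000, prefix="chunk_"):
    """Split a FASTQ file into pieces of `reads_per_chunk` reads each.

    Every read occupies exactly four lines, so each chunk receives a
    multiple of four lines and no record is cut in half.
    Returns the list of chunk file paths, in order.
    """
    chunk_paths = []
    with open(path) as handle:
        for index in itertools.count():
            lines = list(itertools.islice(handle, 4 * reads_per_chunk))
            if not lines:
                break
            chunk_path = "%s%03d.fq" % (prefix, index)
            with open(chunk_path, "w") as out:
                out.writelines(lines)
            chunk_paths.append(chunk_path)
    return chunk_paths
```

Each chunk can then be aligned as a separate job, and the resulting BAM files merged as described in step iii.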

(2) read in BAM/SAM

Q: Is the read sequence in the BAM/SAM file the same as my original one?

A: NO. They are different for several reasons.

i. For RRBS, some reads are shorter because the adapters were trimmed.

ii. For reads mapped on the Crick (-) strand, the stored sequence is the antisense version of the original read, opposite in both nucleotides and direction (its reverse complement).
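
To compare a Crick-strand record with the original read, the stored sequence has to be turned back into its antisense form. A minimal Python 3 sketch follows (`reverse_complement` is a hypothetical helper, not a BS-Seeker2 function).

```python
# Translation table for the four bases plus N, in both upper and lower case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the sequence read from the opposite strand, 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    stored = "TGGTTGTTTT"
    print(reverse_complement(stored))
```

Applying the function twice returns the input, which is a quick sanity check when matching SAM records back to the original FASTQ.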

(3) "Pysam" package related problem

Q: I'm a normal (non-root) user on a Linux cluster. I can't install "pysam". I get the following error messages:


    $ python setup.py install
    running install
    error: can't create or remove files in install directory
    The following error occurred while trying to add or remove files in the
    installation directory:
    [Errno 13] Permission denied: '/usr/lib64/python2.6/site-packages/test-easy-install-26802.write-test'
    ...


A: You can ask the administrator of your cluster to install pysam. If you don't want to bother him/her, you might need to build your own python, and then install the "pysam" package. The following script could be helpful for you.


    mkdir ~/install
    cd ~/install/

    # install python
    wget http://www.python.org/ftp/python/2.7.4/Python-2.7.4.tgz  # download the python from websites
    tar zxvf Python-2.7.4.tgz  # decompress
    cd Python-2.7.4
    ./configure --prefix=`pwd`
    make
    make install

    # Add the path of Python to $PATH
    # Please add the following line to file ~/.bashrc

        export PATH=~/install/Python-2.7.4:$PATH

    # save the ~/.bashrc file
    source ~/.bashrc

    # install pysam package
    cd ~/install/  # go back to the install directory
    wget https://pysam.googlecode.com/files/pysam-0.7.4.tar.gz
    tar zxvf pysam-0.7.4.tar.gz
    cd pysam-0.7.4
    python setup.py build
    python setup.py install
    # re-login the shell after finish installing pysam

    # install BS-Seeker2
    cd ~/install/  # go back to the install directory
    wget -O BSSeeker2.zip https://github.com/BSSeeker/BSseeker2/archive/master.zip  # save the archive as BSSeeker2.zip
    unzip BSSeeker2.zip
    cd BSseeker2-master/


(4) Run BS-Seeker2

Q: Can I add the path of BS-Seeker2's *.py to the $PATH, so I can call
BS-Seeker2 from anywhere?

A: If you're using the "python" from path "/usr/bin/python", you can directly
add the path of BS-Seeker2 in file "~/.bash_profile" (bash) or "~/.profile"
(other shell) or "~/.bashrc" (per-interactive-shell startup).
But if you are using python under other directories, you might need to modify
BS-Seeker2's script first. For example, if your python path is "/my_python/python",
please change the first line in "bs_seeker2-build.py", "bs_seeker2-align.py" and
"bs_seeker2-call_methylation.py" to

    #!/my_python/python

Then add

    export PATH=/path/to/BS-Seeker2/:$PATH

to file "~/.bash_profile" (e.g.), and source the file:

    source ~/.bash_profile

Then you can use BS-Seeker2 globally by typing:

    bs_seeker2-build.py -h
    bs_seeker2-align.py -h
    bs_seeker2-call_methylation.py -h

(5) Unique alignment

Q: If I want to only keep alignments that map uniquely, is this an argument I should supply directly
to Bowtie2 (via BS Seeker 2's command line option), or is this an option that's available in
BS Seeker 2 itself?

A: BS-Seeker2 already reports only unique alignments by default. If you want to know how many reads
could have multiple hits, run BS-Seeker2 with parameter "--multiple-hit".


(6) Output

Q: In CGmap files, why do some lines show "--" instead of a motif (CG/CHG/CHH)? For example:

    chr01 C 4303711 -- CC 0.0 0 2
    chr01 C 4303712 -- CN 0.0 0 2

A: In this example, the site 4303713 is "N" in the genome, so the explicit pattern cannot be determined.
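
The rule behind this answer can be sketched in Python 3: the context of a cytosine depends on the next two bases, and an "N" among them leaves it undecided unless the first of them is already a G. `cytosine_context` is a hypothetical helper name, and it only handles cytosines on the Watson strand. This is an illustration of the idea, not BS-Seeker2's actual code.

```python
def cytosine_context(genome, pos):
    """Return (context, dinucleotide) for the C at 0-based index `pos`.

    `genome` is the Watson-strand sequence of one chromosome (a str);
    the caller is expected to pass the index of a cytosine.
    """
    nxt = genome[pos + 1:pos + 3].upper()      # up to two downstream bases
    dinucleotide = ("C" + nxt[0]) if nxt else "--"
    if nxt[:1] == "G":
        return "CG", dinucleotide              # CG needs only one base
    if len(nxt) == 2 and "N" not in nxt:
        return ("CHG" if nxt[1] == "G" else "CHH"), dinucleotide
    return "--", dinucleotide                  # N or sequence end: undecided


if __name__ == "__main__":
    # Mimics the example: two cytosines followed by an N.
    print(cytosine_context("ACCN", 1))
    print(cytosine_context("ACCN", 2))
```

With the toy sequence `ACCN`, both cytosines get the context `--`, while their dinucleotides are `CC` and `CN`, as in the two CGmap lines quoted in the question.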

(7) Algorithm to remove the adapter.

Q: What's the algorithm to remove the adapter?

A: BS-Seeker2 has a built-in algorithm for removing the adapter, which was developed by [Weilong Guo](http://bioinfo.au.tsinghua.edu.cn/member/wguo/index.html).

First, if the adapter was provided as "AGATCGGAAGAGCACACGTC", only the first 10bp would be used.
Second, we use a semi-local alignment strategy for removing the adapters.
Example:

    Read :       ACCGCGTTGATCGAGTACGTACGTGGGTC
    Adapter :    ....................ACGTGGGTCCCG

    no_mismatch : the maximum number allowed for mismatches

    Algorithm: (allowing 1 mismatch)
    -Step 1:
      ACCGCGTTGATCGAGTACGTACGTGGGTC
      ||XX
      ACGTGGGTCCCG
    -Step 2:
      ACCGCGTTGATCGAGTACGTACGTGGGTC
       X||X
      .ACGTGGGTCCCG
    -Step 3:
      ACCGCGTTGATCGAGTACGTACGTGGGTC
        XX
      ..ACGTGGGTCCCG
    -Step ...
    -Step N:
      ACCGCGTTGATCGAGTACGTACGTGGGTC
                          |||||||||
      ....................ACGTGGGTCCCG
    Success & return!

Third, we also remove the synthesized bases at the end of RRBS fragments.
Take the "C-CGG" cutting site as example,

    - - C|U G G - -   =>cut=>   - - C       =>add=>   - - C|C G   =>sequencing
    - - G G C|C - -             - - G G C             - - G G C

In our algorithm, the "CG" in "--CCG" (upper strand) is trimmed, in order to get an accurate methylation level.
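
The sliding comparison in the steps above can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not BS-Seeker2's implementation: `find_adapter` and `trim_adapter` are hypothetical names, only mismatches (no insertions or deletions) are counted, and `min_overlap` is an added parameter that stops a one- or two-base overlap at the very end of a read from counting as a hit.

```python
def find_adapter(read, adapter, no_mismatch=1, seed_len=10, min_overlap=3):
    """Return the read offset where the adapter starts, or None if not found."""
    seed = adapter[:seed_len]              # only the first 10 bp are used
    for offset in range(len(read) - min_overlap + 1):
        overlap = min(len(seed), len(read) - offset)
        mismatches = 0
        for i in range(overlap):
            if read[offset + i] != seed[i]:
                mismatches += 1
                if mismatches > no_mismatch:
                    break                  # too many mismatches: shift by one
        else:
            return offset                  # whole overlap within the limit
    return None


def trim_adapter(read, adapter, **kwargs):
    """Cut the read at the first position where the adapter is found."""
    offset = find_adapter(read, adapter, **kwargs)
    return read if offset is None else read[:offset]


if __name__ == "__main__":
    read = "ACCGCGTTGATCGAGTACGTACGTGGGTC"
    print(trim_adapter(read, "ACGTGGGTCCCG"))
```

On the read and adapter from the example, the first offset that passes is 20, so the last nine bases are cut off. Giving up on an offset as soon as the mismatch limit is exceeded mirrors the `||XX` markers in the step diagram.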