Release 0.7.9 (19 May, 2014)
----------------------------

This release brings several major changes to BWA-MEM. Notably, BWA-MEM now
formally supports PacBio read-to-reference alignment and experimentally supports
PacBio read-to-read alignment. BWA-MEM also runs faster at a minor cost of
accuracy. The speedup is more significant when GRCh38 is in use. More
specifically:

 * Support PacBio subread-to-reference alignment. Although older BWA-MEM works
   with PacBio data in principle, the resultant alignments are frequently
   fragmented. In this release, we fine-tuned existing methods and introduced
   new heuristics to improve PacBio alignment. These changes are not used by
   default. Users need to add option "-x pacbio" to enable the feature.

 * Support PacBio subread-to-subread alignment (EXPERIMENTAL). This feature is
   enabled with option "-x pbread". In this mode, the output only gives the
   overlapping region between a pair of reads without detailed alignment.

 * Output alternative hits in the XA tag if there are not too many of them.
   This is a BWA-backtrack feature.

 * Support mapping to ALT contigs in GRCh38 (EXPERIMENTAL). We provide a script
   to postprocess hits in the XA tag to adjust the mapping quality and generate
   new primary alignments to all overlapping ALT contigs. We would *NOT*
   recommend this feature for production use.

 * Improved alignments to many short reference sequences. Older BWA-MEM may
   generate an alignment bridging two or more adjacent reference sequences.
   Such alignments are split at a later step as postprocessing. This approach
   is complex and does not always work. This release forbids these alignments
   from the very beginning. BWA-MEM should not produce an alignment bridging
   two or more reference sequences any more.

 * Reduced the maximum seed occurrence from 10000 to 500. Reduced the maximum
   rounds of Smith-Waterman mate rescue from 100 to 50. Added a heuristic to
   lower the mapping quality if a read contains seeds with excessive
   occurrences. These changes make BWA-MEM faster at a minor cost of accuracy
   in highly repetitive regions.

 * Added an option "-Y" to use soft clipping for supplementary alignments.

 * Bugfix: incomplete alignment extension in corner cases.

 * Bugfix: integer overflow when aligning long query sequences.

 * Bugfix: chain score is not computed correctly (almost no practical effect).

 * General code cleanup.

 * Added FAQs to README.

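As a usage sketch for the two PacBio presets, the Python helper below assembles
a `bwa mem` command line. The helper name `bwa_mem_argv` and the file names are
hypothetical; only the preset values "pacbio" and "pbread" come from the notes
above, and the positional layout (`bwa mem [options] ref.fa reads.fq`) is
assumed from ordinary bwa usage.

```python
def bwa_mem_argv(ref, reads, preset=None):
    """Build an argument list for `bwa mem` (hypothetical helper)."""
    known = {"pacbio", "pbread"}  # presets named in the 0.7.9 notes
    argv = ["bwa", "mem"]
    if preset is not None:
        if preset not in known:
            raise ValueError(f"unknown preset: {preset}")
        argv += ["-x", preset]
    argv.append(ref)      # reference index prefix
    argv.extend(reads)    # one or two read files
    return argv


print(" ".join(bwa_mem_argv("ref.fa", ["subreads.fq"], preset="pacbio")))
```

The list form can be passed to `subprocess.run` without going through a shell,
which avoids quoting problems in file names.
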
Changes in BWA-backtrack:

 * Bugfix: a segmentation fault when an alignment extends beyond the end of the
   last chromosome.

(0.7.9: 19 May 2014, r783)



Release 0.7.8 (31 March, 2014)
------------------------------

Changes in BWA-MEM:

 * Bugfix: off-diagonal X-dropoff (option -d) not working as intended.
   Short-read alignment is not affected.

 * Bugfix: unnecessarily large bandwidth used during global alignment,
   which reduces the mapping speed by ~5% for short reads. Results are not
   affected.

 * Bugfix: when the matching score is not one, paired-end mapping quality is
   inaccurate.

 * When the matching score (option -A) is changed, scale all score-related
   options accordingly unless overridden by users.

 * Allow specifying different gap open (or extension) penalties for deletions
   and insertions separately.

 * Allow specifying the insert size distribution.

 * Better and more detailed debugging information.

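The scaling rule for option -A can be pictured with a small Python sketch. The
option names and default values below are illustrative placeholders, not
BWA-MEM's actual defaults or internals; the only behaviour taken from the note
above is that score-related options are multiplied by the matching score unless
the user set them explicitly.

```python
def scale_scores(match_score, defaults, overrides):
    """Scale default score-related options by the matching score.

    `defaults` holds values tuned for a matching score of 1;
    `overrides` holds values given explicitly by the user, kept as-is.
    """
    scaled = {}
    for name, value in defaults.items():
        if name in overrides:
            scaled[name] = overrides[name]      # user value wins, unscaled
        else:
            scaled[name] = value * match_score  # keep the ratio to the match score
    return scaled


# Placeholder defaults, for illustration only.
defaults = {"mismatch": 4, "gap_open": 6, "gap_ext": 1}
print(scale_scores(2, defaults, {"gap_open": 10}))
```

Scaling keeps the ratio between the matching score and the penalties constant,
so raising the matching score alone does not make the aligner more tolerant of
mismatches and gaps.
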
With the default setting, 0.7.8 and 0.7.7 gave identical output on one million
100bp read pairs.

(0.7.8: 31 March 2014, r455)



Release 0.7.7 (25 February, 2014)
---------------------------------

This release fixes incorrect MD tags in the BWA-MEM output.

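For readers unfamiliar with the tag, here is a minimal Python sketch of how an
MD string can be derived from a CIGAR, the reference segment and the read,
following the description of MD in the SAM optional-fields specification. It is
not BWA-MEM's implementation; `md_tag` and its arguments are hypothetical names,
`ref` is assumed to start at the alignment position, and `read` is assumed to be
the full SEQ field including soft-clipped bases.

```python
import re


def md_tag(cigar, ref, read):
    """Return the MD string for one alignment (illustrative sketch)."""
    out = []
    run = 0        # length of the current run of matching bases
    r = q = 0      # offsets into ref and read
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(length)
        if op in "M=X":
            for _ in range(n):
                if ref[r] == read[q]:
                    run += 1
                else:
                    out += [str(run), ref[r]]  # mismatch: emit the reference base
                    run = 0
                r += 1
                q += 1
        elif op == "D":
            out += [str(run), "^" + ref[r:r + n]]  # deletion: "^" plus deleted bases
            run = 0
            r += n
        elif op in "IS":
            q += n     # insertions and soft clips consume the read only
        elif op == "N":
            r += n     # skipped region consumes the reference only
        # H and P consume neither sequence
    out.append(str(run))
    return "".join(out)


print(md_tag("2M1D2M", "ACGTA", "ACTA"))
```

A wrong MD tag typically shows up when the string, replayed against the read
and CIGAR, does not reproduce the reference bases at the aligned position.
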
A note about short-read mapping to GRCh38. The new human reference genome
GRCh38 contains 60Mbp program generated alpha repeat arrays, some of which are
hard masked as they cannot be localized. These highly repetitive arrays make
BWA-MEM ~50% slower. If you are concerned with the performance of BWA-MEM, you
may consider using option "-c2000 -m50". On simulated data, this setting helps
the performance at a very minor cost on accuracy. I may consider changing the
default in future releases.

(0.7.7: 25 February 2014, r441)



Release 0.7.6 (31 January, 2014)
--------------------------------

Changes in BWA-MEM:

 * Changed the way mapping quality is estimated. The new method tends to give
   the same alignment a higher mapping quality. On paired-end reads, the change
   is minor because with pairing, the mapping quality is usually high. For
   short single-end reads, the difference is considerable.

 * Improved load balance when many threads are spawned. However, bwa-mem is
   still not very thread efficient, probably due to the frequent heap memory
   allocation. Further improvement is a little difficult and may affect the
   code stability.

 * Allow using different clipping penalties for 5'- and 3'-ends. This helps
   when we do not want to clip one end.

 * Print the @PG line, including the command line options.

 * Improved the band width estimate: a) fixed a bug causing the band
   width estimated from extension not to be used in the final global
   alignment; b) try doubled band width if the global alignment score is
   smaller. Insufficient band width leads to wrong CIGAR and spurious
   mismatches/indels.

 * Added a new option -D to fine tune a heuristic on dropping suboptimal hits.
   Reducing -D increases accuracy but decreases the mapping speed. If unsure,
   leave it to the default.

 * Bugfix: for a repetitive single-end read, the reported hit is not randomly
   distributed among equally best hits.

 * Bugfix: missing paired-end hits due to unsorted list of SE hits.

 * Bugfix: incorrect CIGAR caused by a defect in the global alignment.

 * Bugfix: incorrect CIGAR caused by failed SW rescue.

 * Bugfix: alignments largely mapped to the same position are regarded to be
   distinct from each other, which leads to underestimated mapping quality.

 * Added the MD tag.

There are no changes to BWA-backtrack in this release. However, it has a few
known issues yet to be fixed. If you prefer BWA-backtrack, it is still advised
to use bwa-0.6.x.

While I developed BWA-MEM, I also found a few issues with BWA-SW. It is now
possible to improve BWA-SW with the lessons learned from BWA-MEM. However, as
BWA-MEM is usually better, I will not improve BWA-SW until I find applications
where BWA-SW may excel.

(0.7.6: 31 January 2014, r432)



Release 0.7.5a (30 May, 2013)
-----------------------------

Fixed a bug in BWA-backtrack which leads to off-by-one mapping errors in rare
cases.

(0.7.5a: 30 May 2013, r405)



Release 0.7.5 (29 May, 2013)
----------------------------

Changes in all components:

 * Improved error checking on memory allocation and file I/O. Patches provided
   by Rob Davies.

 * Updated README.

 * Bugfix: return code is zero upon errors.

Changes in BWA-MEM:

 * Changed the way a chimeric alignment is reported (conforming to the upcoming
   SAM spec v1.5). With 0.7.5, if the read has a chimeric alignment, the paired
   or the top hit uses soft clipping and is marked with neither 0x800 nor 0x100
   bits. All other hits that are part of the chimeric alignment will use hard
   clipping and be marked with 0x800 if option "-M" is not in use, or marked
   with 0x100 otherwise.

 * Other hits that are part of a chimeric alignment are now reported in the SA
   tag, conforming to the SAM spec v1.5.

 * Better method for resolving an alignment bridging two or more short
   reference sequences. The current strategy maps the query to the reference
   sequence that covers the middle point of the alignment. For most
   applications, this change has no effect.

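The flag convention for chimeric alignments can be checked with a few lines of
Python. This is only a sketch: the function names are hypothetical, and the two
constants are the FLAG bits named in the note above (0x100 is the SAM
"secondary" bit and 0x800 the "supplementary" bit).

```python
SECONDARY = 0x100      # SAM FLAG bit: not the primary alignment
SUPPLEMENTARY = 0x800  # SAM FLAG bit: supplementary (chimeric) alignment


def chimeric_role(flag):
    """Classify one SAM record by its FLAG field."""
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    return "primary"


def extra_hit_bit(use_M):
    """Bit set on the non-representative hits, with or without option -M."""
    return SECONDARY if use_M else SUPPLEMENTARY


print(chimeric_role(2064), hex(extra_hit_bit(True)))
```

Downstream tools that do not understand the 0x800 bit are the main reason to
prefer the 0x100 marking selected by "-M".
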
Changes in BWA-backtrack:

 * Added a magic number to .sai files. This prevents samse/sampe from reading
   corrupted .sai (e.g. a .sai file containing LSF log) or incompatible .sai
   generated by a different version of bwa.

 * Bugfix: alignments in the XA:Z: tag were wrong.

 * Keep track of #ins and #del during backtracking. This simplifies the code
   and reduces errors in rare corner cases. I should have done this in the
   early days of bwa.

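The idea behind the magic number is easy to sketch in Python. The signature
value `MAGIC` below is a hypothetical placeholder chosen for illustration, and
`check_magic` is not a bwa function; the sketch only shows how a reader can
reject a file whose first bytes are not the expected signature.

```python
import io

MAGIC = b"SAI\x01"  # hypothetical 4-byte signature, for illustration only


def check_magic(stream):
    """Raise ValueError unless the stream starts with the expected signature."""
    head = stream.read(len(MAGIC))
    if head != MAGIC:
        raise ValueError("not a valid .sai file (bad magic number)")


check_magic(io.BytesIO(MAGIC + b"payload"))  # passes silently
```

A file that begins with log text, or one written with a different signature,
fails the check immediately instead of being parsed as alignment records.
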
In addition, if you use BWA-MEM or the fastmap command of BWA, please cite:

 - Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs
   with BWA-MEM. arXiv:1303.3997v2 [q-bio.GN].

Thank you.

(0.7.5: 29 May 2013, r404)



Release 0.7.4 (23 April, 2013)
------------------------------

This is a bugfix release. Most of the bugs are considered minor and occur only
very rarely.

 * Bugfix: wrong CIGAR when a query sequence bridges three or more target
   sequences. This only happens when aligning reads to short assembly contigs.

 * Bugfix: leading "D" operator in CIGAR.

 * Extend more seeds for better alignment around tandem repeats. This is also
   a cause of the leading "D" operator in CIGAR.

 * Bugfix: SSE2-SSW may occasionally find incorrect query starting position
   around tandem repeats. This will lead to a suboptimal CIGAR in BWA-MEM and
   a wrong CIGAR in BWA.

 * Bugfix: clipping penalty does not work as intended when there is a gap
   towards the end of a read.

 * Fixed an issue caused by a bug in the libc from Mac/Darwin. In Darwin,
   fread() is unable to read a data block longer than 2GB due to an integer
   overflow bug in its implementation.

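The note does not say how the fread() limitation was worked around. One common
approach, shown here purely as an assumption and in Python rather than C, is to
split a large read into pieces that stay well below the problematic size. The
function name `read_exact` and the default chunk size are hypothetical.

```python
import io


def read_exact(stream, size, chunk=1 << 24):
    """Read up to `size` bytes by issuing reads no larger than `chunk`."""
    parts = []
    remaining = size
    while remaining > 0:
        piece = stream.read(min(chunk, remaining))
        if not piece:  # end of file reached early
            break
        parts.append(piece)
        remaining -= len(piece)
    return b"".join(parts)


print(read_exact(io.BytesIO(b"abcdefghij"), 7, chunk=3))
```

Because no single request exceeds `chunk`, the total size never has to fit in
whatever integer type the underlying read call uses for one request.
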
Since version 0.7.4, BWA-MEM is considered to reach similar stability to
BWA-backtrack for short-read mapping.

(0.7.4: 23 April 2013, r385)



Release 0.7.3a (15 March, 2013)
-------------------------------

In 0.7.3, the wrong CIGAR bug was only fixed in one scenario, but not fixed
in another corner case.

(0.7.3a: 15 March 2013, r367)



Release 0.7.3 (15 March, 2013)
------------------------------

Changes to BWA-MEM:

 * Bugfix: pairing score is inaccurate when option -A does not take the default
   value. This is a very minor issue even if it happens.

 * Bugfix: occasionally wrong CIGAR. This happens when in the alignment there
   is a 1bp deletion and a 1bp insertion which are close to the end of the
   reads, and there are no other substitutions or indels. BWA-MEM would not do
   a gapped alignment due to the bug.

 * New feature: output other non-overlapping alignments in the XP tag such that
   we can see the entire picture of alignment from one SAM line. XP gives the
   position, CIGAR, NM and mapQ of each aligned subsequence of the query.

BWA-MEM has been used to align ~300Gbp 100-700bp SE/PE reads. SNP/indel calling
has also been evaluated on part of these data. BWA-MEM generally gives better
pre-filtered SNP calls than BWA. No significant issues have been observed since
0.7.2, though minor improvements or bugs (e.g. the bug fixed in this release)
are still possible. If you find potential issues, please send bug reports to
<bio-bwa-help@lists.sourceforge.net> (free registration required).

In addition, more detailed description of the BWA-MEM algorithm can be found at
<https://github.com/lh3/mem-paper>.

(0.7.3: 15 March 2013, r366)



Release 0.7.2 (9 March, 2013)
-----------------------------

Emergency bug fix: 0.7.0 and 0.7.1 give a wrong sign to TLEN. In addition,
flagging 'properly paired' is also improved a little.

(0.7.2: 9 March 2013, r351)



Release 0.7.1 (8 March, 2013)
-----------------------------

Changes to BWA-MEM:

 * Bugfix: rare segmentation fault caused by a partial hit to the end of the
   last sequence.

 * Bugfix: occasional mis-pairing given an interleaved fastq.

 * Bugfix: wrong mate information when the mate is unmapped. SAM generated by
   BWA-MEM can now be validated with Picard.

 * Improved the performance and accuracy for ultra-long query sequences.
   Short-read alignment is not affected.

Changes to other components:

 * In BWA-backtrack and BWA-SW, replaced the code for global alignment,
   Smith-Waterman and SW extension. The performance and accuracy of the two
   algorithms stay the same.

 * Added an experimental subcommand to merge overlapping paired ends. The
   algorithm is very conservative: it may miss true overlaps but rarely makes
   mistakes.

An important note is that like BWA-SW, BWA-MEM may output multiple primary
alignments for a read, which may cause problems for some tools. For aligning
sequence reads, it is advised to use '-M' to flag extra hits as secondary. This
option is not the default because multiple primary alignments are theoretically
possible in sequence alignment.

(0.7.1: 8 March 2013, r347)



Beta Release 0.7.0 (28 February, 2013)
--------------------------------------

This release comes with a new alignment algorithm, BWA-MEM, for 70bp-1Mbp query
sequences. BWA-MEM essentially seeds alignments with a variant of the fastmap
algorithm and extends seeds with banded affine-gap-penalty dynamic programming
(i.e. the Smith-Waterman-Gotoh algorithm). For typical Illumina 100bp reads or
longer low-divergence query sequences, BWA-MEM is about twice as fast as BWA
and BWA-SW and is more accurate. It also supports split alignments like BWA-SW
and may optionally output multiple hits like BWA. BWA-MEM does not guarantee
to find hits within a certain edit distance, but BWA is not efficient for such
a task given longer reads anyway, and the edit-distance criterion is arguably
not as important in long-read alignment.

In addition to the algorithmic improvements, BWA-MEM also implements a few
handy features in practical aspects:

1. BWA-MEM automatically switches between local and glocal (global wrt reads;
   local wrt reference) alignment. It reports the end-to-end glocal alignment
   if the glocal alignment is not much worse than the optimal local alignment.
   Glocal alignment reduces reference bias.

2. BWA-MEM automatically infers pair orientation from a batch of single-end
   alignments. It allows more than one orientation if there are sufficient
   supporting reads. This feature has not been tested on reads from Illumina
   jumping library yet. (EXPERIMENTAL)

3. BWA-MEM optionally takes one interleaved fastq for paired-end mapping. It
   is possible to convert a name-sorted BAM to an interleaved fastq on the fly
   and feed the data stream to BWA-MEM for mapping.

4. BWA-MEM optionally copies FASTA/Q comments to the final SAM output, which
   helps to transfer individual read annotations to the output.

5. BWA-MEM supports more advanced piping. Users can now run:
   (bwa mem ref.fa '<bzcat r1.fq.bz2' '<bzcat r2.fq.bz2') to map bzip'd read
   files without relying on bash features.

6. BWA-MEM provides a few basic APIs for single-end mapping. The 'example.c'
   program in the source code directory implements a full single-end mapper in
   50 lines of code.

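To make the term "interleaved fastq" from item 3 concrete, the Python sketch
below merges two mate files into one stream in which each record of the first
file is followed by its mate from the second. It assumes plain four-line FASTQ
records in the same order in both inputs; the function names are hypothetical
and the code is not part of BWA.

```python
import io


def read_fastq(handle):
    """Yield 4-line FASTQ records as tuples of stripped lines."""
    while True:
        rec = [handle.readline().rstrip("\n") for _ in range(4)]
        if not rec[0]:  # empty header line means end of input
            return
        yield tuple(rec)


def interleave(fq1, fq2):
    """Alternate records from two FASTQ streams: mate 1, then mate 2."""
    for r1, r2 in zip(read_fastq(fq1), read_fastq(fq2)):
        yield r1
        yield r2


mates1 = io.StringIO("@r1/1\nACGT\n+\nIIII\n")
mates2 = io.StringIO("@r1/2\nGGCC\n+\nIIII\n")
for record in interleave(mates1, mates2):
    print("\n".join(record))
```

Writing the interleaved records to standard output lets the result be piped
straight into a mapper that accepts interleaved input.
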
The BWA-MEM algorithm is in the beta phase. It is not advised to use BWA-MEM
for production use yet. However, when the implementation becomes stable after a
few release cycles, existing BWA users are recommended to migrate to BWA-MEM
for 76bp or longer Illumina reads and long query sequences. The original BWA
short-read algorithm will not deliver satisfactory results for 150bp+ Illumina
reads. Change of mappers will be necessary sooner or later.

(0.7.0 beta: 28 February 2013, r313)



Release 0.6.2 (19 June, 2012)
-----------------------------

This is largely a bug-fix release. Notable changes in BWA-short and BWA-SW:

 * Bugfix: BWA-SW may give bad alignments due to incorrect band width.

 * Bugfix: A segmentation fault due to an out-of-boundary error. The fix is a
   temporary solution. The real cause has not been identified.

 * Attempt to read index from prefix.64.bwt, such that the 32-bit and 64-bit
   index can coexist.

 * Added options '-I' and '-S' to control BWA-SW pairing.

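The index lookup described above can be sketched as follows. This is an
illustration of the idea only, written in Python with a hypothetical function
name; the sole detail taken from the note is that a file named prefix.64.bwt is
tried so that both index flavours can sit next to each other.

```python
import os


def index_prefix(prefix):
    """Prefer `prefix.64` when `prefix.64.bwt` exists, else use `prefix`."""
    if os.path.exists(prefix + ".64.bwt"):
        return prefix + ".64"
    return prefix


print(index_prefix("ref.fa"))
```

With this rule a directory can hold both ref.fa.bwt and ref.fa.64.bwt, and each
bwa version picks up the index it understands.
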
(0.6.2: 19 June 2012, r126)



Release 0.6.1 (28 November, 2011)
---------------------------------

Notable changes to BWA-short:

 * Bugfix: duplicated alternative hits in the XA tag.

 * Bugfix: when trimming is enabled, bwa-aln trims 1bp less.

 * Disabled the color-space alignment. 0.6.x is not working with SOLiD reads at
   present.

Notable changes to BWA-SW:

 * Bugfix: segfault due to excessive ambiguous bases.

 * Bugfix: incorrect mate position in the SE mode.

 * Bugfix: rare segfault in the PE mode.

 * When macro _NO_SSE2 is in use, fall back to the standard Smith-Waterman
   instead of SSE2-SW.

 * Optionally mark split hits with lower alignment scores as secondary.

Changes to fastmap:

 * Bugfix: infinite loop caused by ambiguous bases.

 * Optionally output the query sequence.

(0.6.1: 28 November 2011, r104)



Release 0.5.10 and 0.6.0 (12 November, 2011)
--------------------------------------------

The 0.6.0 release comes with two major changes. Firstly, the index data
structure has been changed to support genomes longer than 4GB. The forward and
reverse genomes are now integrated in one index. This change speeds up
BWA-short by about 20% and BWA-SW by 90% with the mapping accuracy largely
unchanged. A tradeoff is that BWA requires more memory, but this is the price
almost all mappers that index the genome have to pay.

Secondly, BWA-SW in 0.6.0 now works with paired-end data. It is more accurate
for highly unique reads and more robust to long indels and structural
variations. However, BWA-short still has an edge for reads with many suboptimal
hits. It is not yet known which algorithm is the best for variant calling.

0.5.10 is a bugfix release only and is likely to be the last release in the 0.5
branch unless I find critical bugs in future.

Other notable changes:

 * Added the 'fastmap' command that finds super-maximal exact matches. It does
   not give the final alignment, but runs much faster. It can be a building
   block for other alignment algorithms. [0.6.0 only]

 * Output the timing information before BWA exits. This also tells users that
   the task has been finished instead of being killed or aborted. [0.6.0 only]

 * Sped up multi-threading when using many (>20) CPU cores.

 * Check I/O error.

 * Increased the maximum barcode length to 63bp.

 * Automatically choose the indexing algorithm.

 * Bugfix: very rare segfault due to an uninitialized variable. The bug also
   affects the placement of suboptimal alignments. The effect is very minor.

This release involves quite a lot of tricky changes. Although it has been
tested on a few data sets, subtle bugs may be still hidden. It is *NOT*
recommended to use this release in a production pipeline. In future, however,
BWA-SW may be better as reads continue to get longer. I would encourage users
to try the 0.6 release. I would also like to hear the users' experience. Thank
you.

(0.6.0: 12 November 2011, r85)



Beta Release 0.5.9 (24 January, 2011)
-------------------------------------

Notable changes:

 * Feature: barcode support via the '-B' option.

 * Feature: Illumina 1.3+ read format support via the '-I' option.

 * Bugfix: RG tags are not attached to unmapped reads.

 * Bugfix: very rare bwasw mismappings.

 * Recommend options for PacBio reads in bwasw help message.


Also, since January 13, the BWA master repository has been moved to github:

  https://github.com/lh3/bwa

The revision number has been reset. All recent changes will be first
committed to this repository.

(0.5.9: 24 January 2011, r16)



Beta Release Candidate 0.5.9rc1 (10 December, 2010)
---------------------------------------------------

Notable changes in bwasw:

 * Output unmapped reads.

 * For a repetitive read, choose a random hit instead of a fixed
   one. This is not well tested.

Notable changes in bwa-short:

 * Fixed a bug in the SW scoring system, which may lead to unexpected
   gaps towards the end of a read.

 * Fixed a bug which invalidates the randomness of repetitive reads.

 * Fixed a rare memory leak.

 * Allow specifying the read group at the command line.

 * Take name-grouped BAM files as input.

Changes to this release are usually safe in that they do not interfere
with the key functionality. However, the release has only been tested on
small samples instead of on large-scale real data. If anything weird
happens, please report the bugs to the bio-bwa-help mailing list.

(0.5.9rc1: 10 December 2010, r1561)



Beta Release 0.5.8 (8 June, 2010)
---------------------------------

Notable changes in bwasw:

 * Fixed an issue of missing alignments. This should happen rarely and
   only when the contig/read alignment is multi-part. Very rarely, bwasw
   may still miss a segment in a multi-part alignment. This is difficult
   to fix, although possible.

Notable changes in bwa-short:

 * Discard the SW alignment when the best single-end alignment is much
   better. Such a SW alignment may be caused by structural variations and
   forcing it to be aligned leads to false alignment. This fix has not
   been tested thoroughly. It would be great to receive more user
   feedback on this issue.

 * Fixed a typo/bug in sampe which leads to unnecessarily large memory
   usage in some cases.

 * Further reduced the chance of reporting 'weird pairing'.

(0.5.8: 8 June 2010, r1442)



Beta Release 0.5.7 (1 March, 2010)
----------------------------------

This release only has an effect on paired-end data with fat insert-size
distribution. Users are still recommended to update as the new release
improves the robustness to poor data.

 * The fix for 'weird pairing' was not working in version 0.5.6, pointed
   out by Carol Scott. It should work now.

 * Optionally output to a normal file rather than to stdout (by Tim
   Fennel).

(0.5.7: 1 March 2010, r1310)



Beta Release 0.5.6 (10 February, 2010)
--------------------------------------

Notable changes in bwa-short:

 * Report multiple hits in the SAM format in a new tag XA encoded as:
   (chr,pos,CIGAR,NM;)*. By default, if a paired or single-end read has
   4 or fewer hits, they will all be reported; if a read in an anomalous
   pair has 11 or fewer hits, all of them will be reported.

 * Perform Smith-Waterman alignment also for anomalous read pairs when
   both ends have quality higher than 17. This reduces false positives
   for some SV discovery algorithms.

 * Do not report "weird pairing" when the insert size distribution is
   too fat or has a mean close to zero.

 * If a read is bridging two adjacent chromosomes, flag it as unmapped.

 * Fixed a small but long existing memory leak in paired-end mapping.

 * Multiple bug fixes in SOLiD mapping: a) quality "-1" can be correctly
   parsed by solid2fastq.pl; b) truncated quality string is resolved; c)
   SOLiD read mapped to the reverse strand is complemented.

 * Bwa now calculates skewness and kurtosis of the insert size
   distribution.

 * Deploy a Bayesian method to estimate the maximum distance for a read
   pair considered to be paired properly. The method is proposed by
   Gerton Lunter, but bwa only implements a simplified version.

 * Export more functions for Java bindings, by Matt Hanna (See:
   http://www.broadinstitute.org/gsa/wiki/index.php/Sting_BWA/C_bindings)

 * Abstract bwa CIGAR for further extension, by Rodrigo Goya.

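The XA encoding (chr,pos,CIGAR,NM;)* is simple to take apart. The Python sketch
below is one way to do it; `AltHit` and `parse_xa` are hypothetical names, and
the position is kept as text on the assumption that it may carry a strand sign.

```python
from typing import List, NamedTuple


class AltHit(NamedTuple):
    chrom: str
    pos: str    # kept as text; may carry a strand sign such as "+" or "-"
    cigar: str
    nm: int     # edit distance


def parse_xa(value: str) -> List[AltHit]:
    """Split the value of an XA tag into its alternative hits."""
    hits = []
    for entry in value.split(";"):
        if not entry:  # the encoding ends with ";", leaving an empty tail
            continue
        # Split from the right so a comma inside the name cannot shift fields.
        chrom, pos, cigar, nm = entry.rsplit(",", 3)
        hits.append(AltHit(chrom, pos, cigar, int(nm)))
    return hits


for hit in parse_xa("chr1,+100,36M,0;chr2,-200,36M,1;"):
    print(hit.chrom, hit.pos, hit.cigar, hit.nm)
```

The function takes only the value part of the tag, i.e. the text after the
"XA:Z:" prefix in the SAM optional field.
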
(0.5.6: 10 February 2010, r1303)



Beta Release 0.5.5 (10 November, 2009)
--------------------------------------

This is a bug fix release:

 * Fixed a serious bug/typo in aln which does not occur given short
   reads, but will lead to segfault for >500bp reads. Of course, the aln
   command is not recommended for reads longer than 200bp, but this is a
   bug anyway.

 * Fixed a minor bug/typo which leads to incorrect single-end mapping
   quality when one end is moved to meet the mate-pair requirement.

 * Fixed a bug in samse for mapping in the color space. This bug is
   caused by quality filtration added since 0.5.1.

(0.5.5: 10 November 2009, r1273)



Beta Release 0.5.4 (9 October, 2009)
------------------------------------

Since this version, the default seed length used in the "aln" command is
changed to 32.

Notable changes in bwa-short:

 * Added a new tag "XC:i" which gives the length of clipped reads.

 * In sampe, skip alignments in case of a bug in the Smith-Waterman
   alignment module.

 * In sampe, fixed a bug in pairing when the read sequence is identical
   to its reverse complement.

 * In sampe, optionally preload the entire FM-index into memory to
   reduce disk operations.

Notable changes in dBWT-SW/BWA-SW:

 * Changed name dBWT-SW to BWA-SW.

 * Optionally use "hard clipping" in the SAM output.

(0.5.4: 9 October 2009, r1245)



Beta Release 0.5.3 (15 September, 2009)
---------------------------------------

Fixed a critical bug in bwa-short: reads mapped to the reverse strand
are not complemented.

(0.5.3: 15 September 2009, r1225)



Beta Release 0.5.2 (13 September, 2009)
---------------------------------------

Notable changes in bwa-short:

 * Optionally trim reads before alignment. See the manual page on 'aln
   -q' for detailed description.

 * Fixed a bug in calculating the NM tag for a gapped alignment.

 * Fixed a bug given a mixture of reads with some longer than the seed
   length and some shorter.

 * Print SAM header.

Notable changes in dBWT-SW:

 * Changed the default value of -T to 30. As a result, the accuracy is a
   little higher for short reads at the cost of speed.

(0.5.2: 13 September 2009, r1223)



Beta Release 0.5.1 (2 September, 2009)
--------------------------------------

Notable changes in the short read alignment component:

 * Fixed a bug in samse: do not write mate coordinates.

Notable changes in dBWT-SW:

 * Randomly choose one alignment if the read is repetitive.

 * Fixed a flaw when a read is mapped across two adjacent reference
   sequences. However, wrong alignment reports may still occur rarely in
   this case.

 * Changed the default band width to 50. The speed is slower due to this
   change.

 * Improved the mapping quality a little given long query sequences.

(0.5.1: 2 September 2009, r1209)



755 Beta Release 0.5.0 (20 August, 2009)
|
|
756 ------------------------------------
|
|
757
|
|
758 This release implements a novel algorithm, dBWT-SW, specifically
|
|
759 designed for long reads. It is 10-50 times faster than SSAHA2, depending
|
|
760 on the characteristics of the input data, and achieves comparable
|
|
761 alignment accuracy while allowing chimera detection. In comparison to
|
|
762 BLAT, dBWT-SW is several times faster and much more accurate especially
|
|
763 when the error rate is high. Please read the manual page for more
|
|
764 information.
|
|
765
|
|
766 The dBWT-SW algorithm is kind of developed for future sequencing
|
|
767 technologies which produce much longer reads with a little higher error
|
|
768 rate. It is still at its early development stage. Some features are
|
|
769 missing and it may be buggy although I have evaluated on several
|
|
770 simulated and real data sets. But following the "release early"
|
|
771 paradigm, I would like the users to try it first.
|
|
772
|
|
773 Other notable changes in BWA are:
|
|
774
|
|
775 * Fixed a rare bug in the Smith-Waterman alignment module.
|
|
776
|
|
777 * Fixed a rare bug that gave a wrong alignment coordinate when a read is
|
|
778 poorly aligned.
|
|
779
|
|
780 * Fixed a bug in generating the "mate-unmap" SAM tag when both ends in
|
|
781 a pair are unmapped.
|
|
782
|
|
783 (0.5.0: 20 August 2009, r1200)
|
|
784
|
|
785
|
|
786
|
|
787 Beta Release 0.4.9 (19 May, 2009)
|
|
788 ---------------------------------
|
|
789
|
|
790 Interestingly, the integer overflow bug claimed to be fixed in 0.4.7 has
|
|
791 not in fact been fixed. Now I have fixed the bug. Sorry for this, and thanks to Quan
|
|
792 Long for pointing out the bug (again).
|
|
793
|
|
794 (0.4.9: 19 May 2009, r1075)
|
|
795
|
|
796
|
|
797
|
|
798 Beta Release 0.4.8 (18 May, 2009)
|
|
799 ---------------------------------
|
|
800
|
|
801 One change to "aln -R". Now by default, if there are no more than '-R'
|
|
802 equally best hits, bwa will search for suboptimal hits. This change
|
|
803 affects the ability to find SNPs in segmental duplications.
|
|
804
|
|
805 I have not tested this option thoroughly, but this simple change is less
|
|
806 likely to cause new bugs. Hope I am right.
|
|
807
|
|
808 (0.4.8: 18 May 2009, r1073)
|
|
809
|
|
810
|
|
811
|
|
812 Beta Release 0.4.7 (12 May, 2009)
|
|
813 ---------------------------------
|
|
814
|
|
815 Notable changes:
|
|
816
|
|
817 * Output SM (single-end mapping quality) and AM (smaller mapping
|
|
818 quality among the two ends) tags in the SAM output.
|
|
819
|
|
820 * Improved the functionality of stdsw.
|
|
821
|
|
822 * Made the XN tag more accurate.
|
|
823
|
|
824 * Fixed a very rare segfault caused by integer overflow.
|
|
825
|
|
826 * Improved the insert size estimation.
|
|
827
|
|
828 * Fixed compiling errors for some Linux systems.
|
|
829
|
|
830 (0.4.7: 12 May 2009, r1066)
|
|
831
|
|
832
|
|
833
|
|
834 Beta Release 0.4.6 (9 March, 2009)
|
|
835 ----------------------------------
|
|
836
|
|
837 This release improves the SOLiD support. First, a script for converting
|
|
838 SOLiD raw data is provided. This script is adapted from solid2fastq.pl
|
|
839 in the MAQ package. Second, a nucleotide reference file can be directly
|
|
840 used with 'bwa index'. Third, SOLiD paired-end support is
|
|
841 completed. Fourth, color-space reads will be converted to nucleotides
|
|
842 when SAM output is generated. Color errors are corrected in this
|
|
843 process. Please note that like MAQ, BWA cannot make use of the primer
|
|
844 base and the first color.
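As a rough illustration of what converting color-space reads to nucleotides involves, the sketch below decodes a SOLiD color string with the standard two-bit XOR rule. It is a naive decoder only: it performs no color error correction, and it starts from the primer base purely for illustration, whereas BWA itself does not use the primer base and the first color. The function name `decode_colors` is hypothetical and is not part of BWA.

```python
# Standard 2-bit codes behind the SOLiD dinucleotide (color) encoding.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def decode_colors(primer_base: str, colors: str) -> str:
    """Naively decode a color string into nucleotides (hypothetical helper).

    Each color is the XOR of the codes of two adjacent bases, so the next
    base is recovered as code(previous base) XOR color. A single wrong
    color corrupts every base after it, which is why real converters
    correct color errors before or during decoding.
    """
    prev = BASE_TO_CODE[primer_base.upper()]
    bases = []
    for c in colors:
        if c not in "0123":
            # Missing calls (often written '.') are not handled in this sketch.
            raise ValueError(f"unsupported color symbol: {c!r}")
        prev ^= int(c)
        bases.append(CODE_TO_BASE[prev])
    return "".join(bases)


if __name__ == "__main__":
    print(decode_colors("T", "0123"))
```

The returned string excludes the primer base itself and contains one nucleotide per color.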
|
|
845
|
|
846 In addition, the calculation of mapping quality is also improved a
|
|
847 little bit, although end-users may barely observe the difference.
|
|
848
|
|
849 (0.4.6: 9 March 2009, r915)
|
|
850
|
|
851
|
|
852
|
|
853 Beta Release 0.4.5 (18 February, 2009)
|
|
854 --------------------------------------
|
|
855
|
|
856 Not much happened, but I think it would be good to let the users use the
|
|
857 latest version.
|
|
858
|
|
859 Notable changes (thanks to Bob Handsaker for catching the two bugs):
|
|
860
|
|
861 * Improved boundary check. Previous versions may still give incorrect
|
|
862 alignment coordinates in rare cases.
|
|
863
|
|
864 * Fixed a bug in SW alignment when no residue matches. This only
|
|
865 affects the 'sampe' command.
|
|
866
|
|
867 * Robustly estimate insert size without setting the maximum on the
|
|
868 command line. Since this release 'sampe -a' only has an effect if
|
|
869 there are not enough good pairs to infer the insert size
|
|
870 distribution.
|
|
871
|
|
872 * Reduced false PE alignments a little bit by using the inferred insert
|
|
873 size distribution. This fix may be more important for long insert
|
|
874 size libraries.
|
|
875
|
|
876 (0.4.5: 18 February 2009, r829)
|
|
877
|
|
878
|
|
879
|
|
880 Beta Release 0.4.4 (15 February, 2009)
|
|
881 --------------------------------------
|
|
882
|
|
883 This is mainly a bug fix release. Notable changes are:
|
|
884
|
|
885 * Imposed boundary check for extracting subsequence from the
|
|
886 genome. Previously this caused memory problems in rare cases.
|
|
887
|
|
888 * Fixed a bug in detecting whether an alignment overlaps with
|
|
889 N on the genome.
|
|
890
|
|
891 * Changed MD tag to meet the latest SAM specification.
|
|
892
|
|
893 (0.4.4: 15 February 2009, r815)
|
|
894
|
|
895
|
|
896
|
|
897 Beta Release 0.4.3 (22 January, 2009)
|
|
898 ------------------------------------
|
|
899
|
|
900 Notable changes:
|
|
901
|
|
902 * Treat an ambiguous base N as a mismatch. Previous versions would not
|
|
903 map reads containing any N.
|
|
904
|
|
905 * Automatically choose the maximum allowed number of differences. This
|
|
906 is important when reads of different lengths are mixed together.
|
|
907
|
|
908 * Print mate coordinate if only one end is unmapped.
|
|
909
|
|
910 * Generate MD tag. This tag encodes the mismatching positions and the
|
|
911 reference bases at these positions. Deletions from the reference will
|
|
912 also be printed.
|
|
913
|
|
914 * Optionally dump multiple hits from samse, in another concise format
|
|
915 rather than SAM.
|
|
916
|
|
917 * Optionally disable iterative search. This is VERY SLOOOOW, though.
|
|
918
|
|
919 * Fixed a bug in generating SAM output.
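To make the MD tag item above concrete, here is a minimal sketch of how such a string can be derived from a gapped pairwise alignment, following the MD encoding in the SAM specification: runs of matches are written as counts, mismatches as the reference base, and deletions from the reference as `^` followed by the deleted reference bases. The function `md_tag` and its two-row input format (with `-` marking gaps) are assumptions made for illustration; this is not BWA's own code.

```python
def md_tag(ref_aln: str, read_aln: str) -> str:
    """Build an MD string from two equal-length gapped alignment rows.

    ref_aln  -- reference row, '-' where the read has an insertion
    read_aln -- read row, '-' where bases are deleted from the reference
    (Hypothetical helper; the input layout is an assumption of this sketch.)
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("alignment rows must have the same length")

    out = []
    run = 0          # length of the current run of matching bases
    in_del = False   # True while inside a deletion from the reference
    for r, q in zip(ref_aln.upper(), read_aln.upper()):
        if r == "-":
            # Insertion in the read: MD does not record it.
            continue
        if q == "-":
            # Deletion from the reference: emit '^' once, then the bases.
            if not in_del:
                out.append(str(run))
                run = 0
                out.append("^")
                in_del = True
            out.append(r)
        elif r == q:
            run += 1
            in_del = False
        else:
            # Mismatch: close the match run, then print the reference base.
            out.append(str(run))
            run = 0
            out.append(r)
            in_del = False
    out.append(str(run))
    return "".join(out)


if __name__ == "__main__":
    print(md_tag("ACGTACGT", "ACGAACGT"))
    print(md_tag("ACGTACGT", "ACG--CGT"))
```

Note that an N in the read simply counts as a mismatch here, which matches the first item in this release.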
|
|
920
|
|
921 (0.4.3: 22 January 2009, r787)
|
|
922
|
|
923
|
|
924
|
|
925 Beta Release 0.4.2 (9 January, 2009)
|
|
926 ------------------------------------
|
|
927
|
|
928 Aaron Quinlan found a bug in the indexer: the bwa indexer segfaults if
|
|
929 there are no comment texts in the FASTA header. This is a critical
|
|
930 bug. Nothing else was changed.
|
|
931
|
|
932 (0.4.2: 9 January 2009, r769)
|
|
933
|
|
934
|
|
935
|
|
936 Beta Release 0.4.1 (7 January, 2009)
|
|
937 ------------------------------------
|
|
938
|
|
939 I am sorry for the quick updates these days. I like to set a milestone
|
|
940 for BWA, and this release seems to be one. For paired-end reads, BWA also
|
|
941 does Smith-Waterman alignment for an unmapped read whose mate can be
|
|
942 mapped confidently. With this strategy BWA achieves similar accuracy to
|
|
943 maq. The benchmark is also updated accordingly.
|
|
944
|
|
945 (0.4.1: 7 January 2009, r760)
|
|
946
|
|
947
|
|
948
|
|
949 Beta Release 0.4.0 (6 January, 2009)
|
|
950 ------------------------------------
|
|
951
|
|
952 In comparison to the release two days ago, this release is mainly tuned
|
|
953 for performance with some tricks I learnt from Bowtie. However, as the
|
|
954 indexing format has also been changed, I have to increase the version
|
|
955 number to 0.4.0 to emphasize that *DATABASE MUST BE RE-INDEXED* with
|
|
956 'bwa index'.
|
|
957
|
|
958 * Improved the speed by about 20%.
|
|
959
|
|
960 * Added multi-threading to 'bwa aln'.
|
|
961
|
|
962 (0.4.0: 6 January 2009, r756)
|
|
963
|
|
964
|
|
965
|
|
966 Beta Release 0.3.0 (4 January, 2009)
|
|
967 ------------------------------------
|
|
968
|
|
969 * Added paired-end support by separating SA calculation and alignment
|
|
970 output.
|
|
971
|
|
972 * Added SAM output.
|
|
973
|
|
974 * Added evaluation to the documentation.
|
|
975
|
|
976 (0.3.0: 4 January 2009, r741)
|
|
977
|
|
978
|
|
979
|
|
980 Beta Release 0.2.0 (15 August, 2008)
|
|
981 -------------------------------------
|
|
982
|
|
983 * Take the subsequence at the 5'-end as seed. Seeding strategy greatly
|
|
984 improves the speed for long reads, at the cost of missing a few true
|
|
985 hits that contain many differences in the seed. Seeding also increases
|
|
986 the memory by 800MB.
|
|
987
|
|
988 * Fixed a bug which may miss some gapped alignments. Fixing the bug
|
|
989 also slows the speed a little.
|
|
990
|
|
991 (0.2.0: 15 August 2008, r428)
|
|
992
|
|
993
|
|
994
|
|
995 Beta Release 0.1.6 (08 August, 2008)
|
|
996 -------------------------------------
|
|
997
|
|
998 * Give accurate CIGAR string.
|
|
999
|
|
1000 * Add a simple interface to SW/NW alignment.
|
|
1001
|
|
1002 (0.1.6: 08 August 2008, r414)
|
|
1003
|
|
1004
|
|
1005
|
|
1006 Beta Release 0.1.5 (27 July, 2008)
|
|
1007 ----------------------------------
|
|
1008
|
|
1009 * Improve the speed. This version is expected to give the same results.
|
|
1010
|
|
1011 (0.1.5: 27 July 2008, r400)
|
|
1012
|
|
1013
|
|
1014
|
|
1015 Beta Release 0.1.4 (22 July, 2008)
|
|
1016 ----------------------------------
|
|
1017
|
|
1018 * Fixed a bug which may cause missing gapped alignments.
|
|
1019
|
|
1020 * More clearly define what alignments can be found by BWA (See
|
|
1021 manual). Now BWA runs a little slower because it will visit more
|
|
1022 potential gapped alignments.
|
|
1023
|
|
1024 * A bit of code cleanup.
|
|
1025
|
|
1026 (0.1.4: 22 July 2008, r387)
|
|
1027
|
|
1028
|
|
1029
|
|
1030 Beta Release 0.1.3 (21 July, 2008)
|
|
1031 ----------------------------------
|
|
1032
|
|
1033 Improve the speed with some tricks on retrieving occurrences. The results
|
|
1034 should be exactly the same as those of 0.1.2.
|
|
1035
|
|
1036 (0.1.3: 21 July 2008, r382)
|
|
1037
|
|
1038
|
|
1039
|
|
1040 Beta Release 0.1.2 (17 July, 2008)
|
|
1041 ----------------------------------
|
|
1042
|
|
1043 Support gapped alignment. Code for ungapped alignment has been removed.
|
|
1044
|
|
1045 (0.1.2: 17 July 2008, r371)
|
|
1046
|
|
1047
|
|
1048
|
|
1049 Beta Release 0.1.1 (03 June, 2008)
|
|
1050 -----------------------------------
|
|
1051
|
|
1052 This is the first release of BWA, Burrows-Wheeler Alignment tool. Please
|
|
1053 read the man page for more information about this software.
|
|
1054
|
|
1055 (0.1.1: 03 June 2008, r349)
|