comparison test-data/summary_pbat.txt @ 17:aa9bf0f29a9f draft

"planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/bismark commit d85012b50faac3234496bb51e2a29c2d8c113bde"
author bgruening
date Wed, 28 Aug 2019 07:08:45 -0400
16:a4504327c890 17:aa9bf0f29a9f
1 Create a temporary index with the offered files from the user. Utilizing the script: bismark_genome_preparation
2 Generating index with: 'bismark_genome_preparation --bowtie2 /tmp/tmpuqE7r1'
3 Writing bisulfite genomes out into a single MFA (multi FastA) file
4
5 Bisulfite Genome Indexer version v0.22.1 (last modified: 14 April 2019)
6
7 Step I - Prepare genome folders - completed
8
9
10
11 Total number of conversions performed:
12 C->T: 146875
13 G->A: 150504
14
15 Step II - Genome bisulfite conversions - completed
16
17
18 Bismark Genome Preparation - Step III: Launching the Bowtie 2 indexer
19 Please be aware that this process can - depending on genome size - take several hours!
20 Settings:
21 Output files: "BS_CT.*.bt2"
22 Line rate: 6 (line is 64 bytes)
23 Lines per side: 1 (side is 64 bytes)
24 Offset rate: 4 (one in 16)
25 FTable chars: 10
26 Strings: unpacked
27 Max bucket size: default
28 Max bucket size, sqrt multiplier: default
29 Max bucket size, len divisor: 4
30 Difference-cover sample period: 1024
31 Endianness: little
32 Actual local endianness: little
33 Sanity checking: disabled
34 Assertions: disabled
35 Random seed: 0
36 Sizeofs: void*:8, int:4, long:8, size_t:8
37 Input files DNA, FASTA:
38 genome_mfa.CT_conversion.fa
39 Building a SMALL index
40 Reading reference sizes
41 Time reading reference sizes: 00:00:00
42 Calculating joined length
43 Writing header
44 Reserving space for joined string
45 Joining reference sequences
46 Time to join reference sequences: 00:00:00
47 bmax according to bmaxDivN setting: 189039
48 Using parameters --bmax 141780 --dcv 1024
49 Doing ahead-of-time memory usage test
50 Passed! Constructing with these parameters: --bmax 141780 --dcv 1024
51 Constructing suffix-array element generator
52 Building DifferenceCoverSample
53 Building sPrime
54 Building sPrimeOrder
55 V-Sorting samples
56 V-Sorting samples time: 00:00:00
57 Allocating rank array
58 Ranking v-sort output
59 Ranking v-sort output time: 00:00:00
60 Invoking Larsson-Sadakane on ranks
61 Invoking Larsson-Sadakane on ranks time: 00:00:00
62 Sanity-checking and returning
63 Building samples
64 Reserving space for 12 sample suffixes
65 Generating random suffixes
66 QSorting 12 sample offsets, eliminating duplicates
67 QSorting sample offsets, eliminating duplicates time: 00:00:00
68 Multikey QSorting 12 samples
69 (Using difference cover)
70 Multikey QSorting samples time: 00:00:00
71 Calculating bucket sizes
72 Splitting and merging
73 Splitting and merging time: 00:00:00
74 Avg bucket size: 756159 (target: 141779)
75 Converting suffix-array elements to index image
76 Allocating ftab, absorbFtab
77 Entering Ebwt loop
78 Getting block 1 of 1
79 No samples; assembling all-inclusive block
80 Sorting block of length 756159 for bucket 1
81 (Using difference cover)
82 Sorting block time: xxxx
83 Returning block of 756160 for bucket 1
84 Exited Ebwt loop
85 fchr[A]: 0
86 fchr[C]: 235897
87 fchr[G]: 235897
88 fchr[T]: 386401
89 fchr[$]: 756159
90 Exiting Ebwt::buildToDisk()
91 Returning from initFromVector
92 Wrote 4446745 bytes to primary EBWT file: BS_CT.1.bt2
93 Wrote 189044 bytes to secondary EBWT file: BS_CT.2.bt2
94 Re-opening _in1 and _in2 as input streams
95 Returning from Ebwt constructor
96 Headers:
97 len: 756159
98 bwtLen: 756160
99 sz: 189040
100 bwtSz: 189040
101 lineRate: 6
102 offRate: 4
103 offMask: 0xfffffff0
104 ftabChars: 10
105 eftabLen: 20
106 eftabSz: 80
107 ftabLen: 1048577
108 ftabSz: 4194308
109 offsLen: 47260
110 offsSz: 189040
111 lineSz: 64
112 sideSz: 64
113 sideBwtSz: 48
114 sideBwtLen: 192
115 numSides: 3939
116 numLines: 3939
117 ebwtTotLen: 252096
118 ebwtTotSz: 252096
119 color: 0
120 reverse: 0
121 Total time for call to driver() for forward index: xxxx
122 Reading reference sizes
123 Time reading reference sizes: 00:00:00
124 Calculating joined length
125 Writing header
126 Reserving space for joined string
127 Joining reference sequences
128 Time to join reference sequences: 00:00:00
129 Time to reverse reference sequence: 00:00:00
130 bmax according to bmaxDivN setting: 189039
131 Using parameters --bmax 141780 --dcv 1024
132 Doing ahead-of-time memory usage test
133 Passed! Constructing with these parameters: --bmax 141780 --dcv 1024
134 Constructing suffix-array element generator
135 Building DifferenceCoverSample
136 Building sPrime
137 Building sPrimeOrder
138 V-Sorting samples
139 V-Sorting samples time: 00:00:00
140 Allocating rank array
141 Ranking v-sort output
142 Ranking v-sort output time: 00:00:00
143 Invoking Larsson-Sadakane on ranks
144 Invoking Larsson-Sadakane on ranks time: 00:00:00
145 Sanity-checking and returning
146 Building samples
147 Reserving space for 12 sample suffixes
148 Generating random suffixes
149 QSorting 12 sample offsets, eliminating duplicates
150 QSorting sample offsets, eliminating duplicates time: 00:00:00
151 Multikey QSorting 12 samples
152 (Using difference cover)
153 Multikey QSorting samples time: 00:00:00
154 Calculating bucket sizes
155 Splitting and merging
156 Splitting and merging time: 00:00:00
157 Avg bucket size: 756159 (target: 141779)
158 Converting suffix-array elements to index image
159 Allocating ftab, absorbFtab
160 Entering Ebwt loop
161 Getting block 1 of 1
162 No samples; assembling all-inclusive block
163 Sorting block of length 756159 for bucket 1
164 (Using difference cover)
165 Sorting block time: xxxx
166 Returning block of 756160 for bucket 1
167 Exited Ebwt loop
168 fchr[A]: 0
169 fchr[C]: 235897
170 fchr[G]: 235897
171 fchr[T]: 386401
172 fchr[$]: 756159
173 Exiting Ebwt::buildToDisk()
174 Returning from initFromVector
175 Wrote 4446745 bytes to primary EBWT file: BS_CT.rev.1.bt2
176 Wrote 189044 bytes to secondary EBWT file: BS_CT.rev.2.bt2
177 Re-opening _in1 and _in2 as input streams
178 Returning from Ebwt constructor
179 Headers:
180 len: 756159
181 bwtLen: 756160
182 sz: 189040
183 bwtSz: 189040
184 lineRate: 6
185 offRate: 4
186 offMask: 0xfffffff0
187 ftabChars: 10
188 eftabLen: 20
189 eftabSz: 80
190 ftabLen: 1048577
191 ftabSz: 4194308
192 offsLen: 47260
193 offsSz: 189040
194 lineSz: 64
195 sideSz: 64
196 sideBwtSz: 48
197 sideBwtLen: 192
198 numSides: 3939
199 numLines: 3939
200 ebwtTotLen: 252096
201 ebwtTotSz: 252096
202 color: 0
203 reverse: 1
204 Total time for backward call to driver() for mirror index: 00:00:01
205 Settings:
206 Output files: "BS_GA.*.bt2"
207 Line rate: 6 (line is 64 bytes)
208 Lines per side: 1 (side is 64 bytes)
209 Offset rate: 4 (one in 16)
210 FTable chars: 10
211 Strings: unpacked
212 Max bucket size: default
213 Max bucket size, sqrt multiplier: default
214 Max bucket size, len divisor: 4
215 Difference-cover sample period: 1024
216 Endianness: little
217 Actual local endianness: little
218 Sanity checking: disabled
219 Assertions: disabled
220 Random seed: 0
221 Sizeofs: void*:8, int:4, long:8, size_t:8
222 Input files DNA, FASTA:
223 genome_mfa.GA_conversion.fa
224 Building a SMALL index
225 Reading reference sizes
226 Time reading reference sizes: 00:00:00
227 Calculating joined length
228 Writing header
229 Reserving space for joined string
230 Joining reference sequences
231 Time to join reference sequences: 00:00:00
232 bmax according to bmaxDivN setting: 189039
233 Using parameters --bmax 141780 --dcv 1024
234 Doing ahead-of-time memory usage test
235 Passed! Constructing with these parameters: --bmax 141780 --dcv 1024
236 Constructing suffix-array element generator
237 Building DifferenceCoverSample
238 Building sPrime
239 Building sPrimeOrder
240 V-Sorting samples
241 V-Sorting samples time: 00:00:00
242 Allocating rank array
243 Ranking v-sort output
244 Ranking v-sort output time: 00:00:00
245 Invoking Larsson-Sadakane on ranks
246 Invoking Larsson-Sadakane on ranks time: 00:00:00
247 Sanity-checking and returning
248 Building samples
249 Reserving space for 12 sample suffixes
250 Generating random suffixes
251 QSorting 12 sample offsets, eliminating duplicates
252 QSorting sample offsets, eliminating duplicates time: 00:00:00
253 Multikey QSorting 12 samples
254 (Using difference cover)
255 Multikey QSorting samples time: 00:00:00
256 Calculating bucket sizes
257 Splitting and merging
258 Splitting and merging time: 00:00:00
259 Avg bucket size: 756159 (target: 141779)
260 Converting suffix-array elements to index image
261 Allocating ftab, absorbFtab
262 Entering Ebwt loop
263 Getting block 1 of 1
264 No samples; assembling all-inclusive block
265 Sorting block of length 756159 for bucket 1
266 (Using difference cover)
267 Sorting block time: xxxx
268 Returning block of 756160 for bucket 1
269 Exited Ebwt loop
270 fchr[A]: 0
271 fchr[C]: 386401
272 fchr[G]: 533276
273 fchr[T]: 533276
274 fchr[$]: 756159
275 Exiting Ebwt::buildToDisk()
276 Returning from initFromVector
277 Wrote 4446745 bytes to primary EBWT file: BS_GA.1.bt2
278 Wrote 189044 bytes to secondary EBWT file: BS_GA.2.bt2
279 Re-opening _in1 and _in2 as input streams
280 Returning from Ebwt constructor
281 Headers:
282 len: 756159
283 bwtLen: 756160
284 sz: 189040
285 bwtSz: 189040
286 lineRate: 6
287 offRate: 4
288 offMask: 0xfffffff0
289 ftabChars: 10
290 eftabLen: 20
291 eftabSz: 80
292 ftabLen: 1048577
293 ftabSz: 4194308
294 offsLen: 47260
295 offsSz: 189040
296 lineSz: 64
297 sideSz: 64
298 sideBwtSz: 48
299 sideBwtLen: 192
300 numSides: 3939
301 numLines: 3939
302 ebwtTotLen: 252096
303 ebwtTotSz: 252096
304 color: 0
305 reverse: 0
306 Total time for call to driver() for forward index: xxxx
307 Reading reference sizes
308 Time reading reference sizes: 00:00:00
309 Calculating joined length
310 Writing header
311 Reserving space for joined string
312 Joining reference sequences
313 Time to join reference sequences: 00:00:00
314 Time to reverse reference sequence: 00:00:00
315 bmax according to bmaxDivN setting: 189039
316 Using parameters --bmax 141780 --dcv 1024
317 Doing ahead-of-time memory usage test
318 Passed! Constructing with these parameters: --bmax 141780 --dcv 1024
319 Constructing suffix-array element generator
320 Building DifferenceCoverSample
321 Building sPrime
322 Building sPrimeOrder
323 V-Sorting samples
324 V-Sorting samples time: 00:00:00
325 Allocating rank array
326 Ranking v-sort output
327 Ranking v-sort output time: 00:00:00
328 Invoking Larsson-Sadakane on ranks
329 Invoking Larsson-Sadakane on ranks time: 00:00:00
330 Sanity-checking and returning
331 Building samples
332 Reserving space for 12 sample suffixes
333 Generating random suffixes
334 QSorting 12 sample offsets, eliminating duplicates
335 QSorting sample offsets, eliminating duplicates time: 00:00:00
336 Multikey QSorting 12 samples
337 (Using difference cover)
338 Multikey QSorting samples time: 00:00:00
339 Calculating bucket sizes
340 Splitting and merging
341 Splitting and merging time: 00:00:00
342 Avg bucket size: 756159 (target: 141779)
343 Converting suffix-array elements to index image
344 Allocating ftab, absorbFtab
345 Entering Ebwt loop
346 Getting block 1 of 1
347 No samples; assembling all-inclusive block
348 Sorting block of length 756159 for bucket 1
349 (Using difference cover)
350 Sorting block time: xxxx
351 Returning block of 756160 for bucket 1
352 Exited Ebwt loop
353 fchr[A]: 0
354 fchr[C]: 386401
355 fchr[G]: 533276
356 fchr[T]: 533276
357 fchr[$]: 756159
358 Exiting Ebwt::buildToDisk()
359 Returning from initFromVector
360 Wrote 4446745 bytes to primary EBWT file: BS_GA.rev.1.bt2
361 Wrote 189044 bytes to secondary EBWT file: BS_GA.rev.2.bt2
362 Re-opening _in1 and _in2 as input streams
363 Returning from Ebwt constructor
364 Headers:
365 len: 756159
366 bwtLen: 756160
367 sz: 189040
368 bwtSz: 189040
369 lineRate: 6
370 offRate: 4
371 offMask: 0xfffffff0
372 ftabChars: 10
373 eftabLen: 20
374 eftabSz: 80
375 ftabLen: 1048577
376 ftabSz: 4194308
377 offsLen: 47260
378 offsSz: 189040
379 lineSz: 64
380 sideSz: 64
381 sideBwtSz: 48
382 sideBwtLen: 192
383 numSides: 3939
384 numLines: 3939
385 ebwtTotLen: 252096
386 ebwtTotSz: 252096
387 color: 0
388 reverse: 1
389 Total time for backward call to driver() for mirror index: 00:00:01
390 Running bismark with: 'bismark --bam --temp_dir /tmp/tmpi3V3GI -o /tmp/tmpi3V3GI/results --quiet --fastq -L 20 -D 15 -R 2 --pbat --un --ambiguous /tmp/tmpuqE7r1 input_1.fq'
391 Bowtie 2 seems to be working fine (tested command 'bowtie2 --version' [2.3.5])
392 Output format is BAM (default)
393 Alignments will be written out in BAM format. Samtools found here: '/home/abretaud/miniconda3/envs/mulled-v1-9f2317dbfb405ed6926c55752e5c11678eee3256a6ea680d1c0f912251153030/bin/samtools'
394 Reference genome folder provided is /tmp/tmpuqE7r1/ (absolute path is '/tmp/tmpuqE7r1/)'
395 FastQ format specified
396
397 Input files to be analysed (in current folder '/tmp/tmpU_oiEI/job_working_directory/000/4/working'):
398 input_1.fq
399 Library was specified as PBAT-Seq (Post-Bisulfite Adapter Tagging), only performing alignments to the complementary strands (CTOT and CTOB)
400 Created output directory /tmp/tmpi3V3GI/results/!
401
402 Output will be written into the directory: /tmp/tmpi3V3GI/results/
403
404 Using temp directory: /tmp/tmpi3V3GI
405 Temporary files will be written into the directory: /tmp/tmpi3V3GI/
406 Setting parallelization to single-threaded (default)
407
408 Summary of all aligner options: -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --quiet
409 Current working directory is: /tmp/tmpU_oiEI/job_working_directory/000/4/working
410
411 Now reading in and storing sequence information of the genome specified in: /tmp/tmpuqE7r1/
412
413 chr chrY_JH584300_random (182347 bp)
414 chr chrY_JH584301_random (259875 bp)
415 chr chrY_JH584302_random (155838 bp)
416 chr chrY_JH584303_random (158099 bp)
417
418 Single-core mode: setting pid to 1
419
420 Single-end alignments will be performed
421 =======================================
422
423 Input file is in FastQ format
424 Writing a G -> A converted version of the input file input_1.fq to /tmp/tmpi3V3GI/input_1.fq_G_to_A.fastq
425
426 Created G -> A converted version of the FastQ file input_1.fq (44115 sequences in total)
427
428 Input file is input_1.fq_G_to_A.fastq (FastQ)
429
430 Now running 2 instances of Bowtie 2 against the bisulfite genome of /tmp/tmpuqE7r1/ with the specified options: -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --quiet
431
432 Now starting the Bowtie 2 aligner for GAreadCTgenome (reading in sequences from /tmp/tmpi3V3GI/input_1.fq_G_to_A.fastq with options -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --quiet --nofw)
433 Using Bowtie 2 index: /tmp/tmpuqE7r1/Bisulfite_Genome/CT_conversion/BS_CT
434
435 Found first alignment: 1_1 4 * 0 0 * * 0 0 TTATATATATTAAATAAATTAATTTTTTTTATTTATATATTAAATTTTTTAATTAATTTATTAATATTTTATAAATTTTTAAATA AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEAEEEEEE YT:Z:UU
436 Now starting the Bowtie 2 aligner for GAreadGAgenome (reading in sequences from /tmp/tmpi3V3GI/input_1.fq_G_to_A.fastq with options -q -L 20 -D 15 -R 2 --score-min L,0,-0.2 --ignore-quals --quiet --norc)
437 Using Bowtie 2 index: /tmp/tmpuqE7r1/Bisulfite_Genome/GA_conversion/BS_GA
438
439 Found first alignment: 1_1 4 * 0 0 * * 0 0 TTATATATATTAAATAAATTAATTTTTTTTATTTATATATTAAATTTTTTAATTAATTTATTAATATTTTATAAATTTTTAAATA AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEAEEEEEE YT:Z:UU
440
441 >>> Writing bisulfite mapping results to /tmp/tmpi3V3GI/results/input_1_bismark_bt2.bam <<<
442
443 Unmapped sequences will be written to /tmp/tmpi3V3GI/results/input_1.fq_unmapped_reads.fq.gz
444 Ambiguously mapping sequences will be written to /tmp/tmpi3V3GI/results/input_1.fq_ambiguous_reads.fq.gz
445
446 Reading in the sequence file input_1.fq
447 Processed 44115 sequences in total
448
449
450 Successfully deleted the temporary file /tmp/tmpi3V3GI/input_1.fq_G_to_A.fastq
451
452 Final Alignment report
453 ======================
454 Sequences analysed in total: 44115
455 Number of alignments with a unique best hit from the different alignments: 13
456 Mapping efficiency: 0.0%
457
458 Sequences with no alignments under any condition: 44059
459 Sequences did not map uniquely: 43
460 Sequences which were discarded because genomic sequence could not be extracted: 0
461
462 Number of sequences with unique best (first) alignment came from the bowtie output:
463 CT/CT: 0 ((converted) top strand)
464 CT/GA: 0 ((converted) bottom strand)
465 GA/CT: 11 (complementary to (converted) top strand)
466 GA/GA: 2 (complementary to (converted) bottom strand)
467
468 Final Cytosine Methylation Report
469 =================================
470 Total number of C's analysed: 307
471
472 Total methylated C's in CpG context: 1
473 Total methylated C's in CHG context: 3
474 Total methylated C's in CHH context: 227
475 Total methylated C's in Unknown context: 0
476
477 Total unmethylated C's in CpG context: 1
478 Total unmethylated C's in CHG context: 4
479 Total unmethylated C's in CHH context: 71
480 Total unmethylated C's in Unknown context: 0
481
482 C methylated in CpG context: 50.0%
483 C methylated in CHG context: 42.9%
484 C methylated in CHH context: 76.2%
485 Can't determine percentage of methylated Cs in Unknown context (CN or CHN) if value was 0
486
487
488 Bismark completed in xxxx
489
490 ====================
491 Bismark run complete
492 ====================
493
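
The figures in the report above can be cross-checked with a few lines of arithmetic. The sketch below is not part of Bismark. It copies the counts from this log into hypothetical variable names and assumes that each percentage is methylated / (methylated + unmethylated), rounded to one decimal place, and that a context with no calls has no defined percentage.

```python
# Counts copied from the "Final Cytosine Methylation Report" in this log.
# All variable and function names here are illustrative, not Bismark's.
methylated = {"CpG": 1, "CHG": 3, "CHH": 227, "Unknown": 0}
unmethylated = {"CpG": 1, "CHG": 4, "CHH": 71, "Unknown": 0}


def percent_methylated(meth, unmeth):
    """Return the methylation percentage, or None when there are no calls."""
    total = meth + unmeth
    if total == 0:
        # Mirrors the "Can't determine percentage" line for Unknown context.
        return None
    return round(100.0 * meth / total, 1)


percentages = {
    ctx: percent_methylated(methylated[ctx], unmethylated[ctx])
    for ctx in methylated
}
total_cs = sum(methylated.values()) + sum(unmethylated.values())

# Read accounting from the "Final Alignment report".
sequences_total = 44115
unique_best = 13
no_alignment = 44059
not_unique = 43
discarded = 0
mapping_efficiency = round(100.0 * unique_best / sequences_total, 1)
accounted = unique_best + no_alignment + not_unique + discarded

# Chromosome lengths as listed when the genome was read in; their sum
# should equal the "len" header reported by the Bowtie 2 indexer.
chrom_lengths = [182347, 259875, 155838, 158099]
genome_length = sum(chrom_lengths)

if __name__ == "__main__":
    print(percentages)
    print(total_cs, mapping_efficiency, accounted, genome_length)
```

Under these assumptions the recomputed values agree with the log: the three per-context percentages, the 307 cytosines analysed, the 0.0% mapping efficiency (13 of 44115 reads), and the indexed length of 756159.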