view test-data/denovo_map/denovo_map.log @ 6:33f0793edf93 draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/stacks2 commit 2f4c9bfc48d63075ae18a1687e8d01ffea509084
author iuc
date Wed, 11 May 2022 06:43:52 +0000
parents 5e10f396979e

denovo_map.pl version 2.55 started at 2021-08-24 13:29:01
/home/berntm/miniconda3/envs/mulled-v1-4cbe4390dde1292aa8dbfc7f3537b788791787f5ad55cd86edeadacb1e3df63a/bin/denovo_map.pl --samples demultiplexed --popmap denovo_map/popmap_cstacks.tsv -o stacks_outputs --paired

ustacks
==========

Sample 1 of 2 'PopA_01'
----------
ustacks -t fastq -f demultiplexed/PopA_01.1.fq -o stacks_outputs -i 1 --name PopA_01
ustacks parameters selected:
  Input file: 'demultiplexed/PopA_01.1.fq'
  Sample ID: 1
  Min depth of coverage to create a stack (m): 3
  Repeat removal algorithm: enabled
  Max distance allowed between stacks (M): 2
  Max distance allowed to align secondary reads: 4
  Max number of stacks allowed per de novo locus: 3
  Deleveraging algorithm: disabled
  Gapped assembly: enabled
  Minimum alignment length: 0.8
  Model type: SNP
  Alpha significance level for model: 0.05

Loading RAD-Tags...

Loaded 66 reads; formed:
  4 stacks representing 63 primary reads (95.5%)
  3 secondary stacks representing 3 secondary reads (4.5%)

Stack coverage: mean=15.75; stdev=7.46; max=27; n_reads=63(95.5%)
Removing repetitive stacks: cov > 39 (mean+3*stdev)...
  Blacklisted 0 stacks.
Coverage after repeat removal: mean=15.75; stdev=7.46; max=27; n_reads=63(95.5%)

Assembling stacks (max. dist. M=2)...
  Assembled 4 stacks into 3; blacklisted 0 stacks.
Coverage after assembling stacks: mean=21.00; stdev=4.24; max=27; n_reads=63(95.5%)

Merging secondary stacks (max. dist. N=4 from consensus)...
  Merged 3 out of 3 secondary reads (100.0%), 1 merged with gapped alignments.
Coverage after merging secondary stacks: mean=22.00; stdev=4.32; max=28; n_reads=66(100.0%)

Assembling stacks, allowing for gaps (min. match length 80.0%)...
  Assembled 3 stacks into 3 stacks.
Coverage after gapped assembly: mean=22.00; stdev=4.32; max=28; n_reads=66(100.0%)

Merging secondary stacks, allowing for gaps (min. match length 80.0%)...
  Merged 0 out of 0 secondary reads (-nan%).
Coverage after merging gapped secondary stacks: mean=22.00; stdev=4.32; max=28; n_reads=66(100.0%)

Final coverage: mean=22.00; stdev=4.32; max=28; n_reads=66(100.0%)
Calling consensus sequences and haplotypes for catalog assembly...
Writing tags, SNPs, and alleles files...
Refetching read IDs...done.
ustacks is done.

Sample 2 of 2 'PopA_02'
----------
ustacks -t fastq -f demultiplexed/PopA_02.1.fq -o stacks_outputs -i 2 --name PopA_02
ustacks parameters selected:
  Input file: 'demultiplexed/PopA_02.1.fq'
  Sample ID: 2
  Min depth of coverage to create a stack (m): 3
  Repeat removal algorithm: enabled
  Max distance allowed between stacks (M): 2
  Max distance allowed to align secondary reads: 4
  Max number of stacks allowed per de novo locus: 3
  Deleveraging algorithm: disabled
  Gapped assembly: enabled
  Minimum alignment length: 0.8
  Model type: SNP
  Alpha significance level for model: 0.05

Loading RAD-Tags...

Loaded 60 reads; formed:
  4 stacks representing 55 primary reads (91.7%)
  5 secondary stacks representing 5 secondary reads (8.3%)

Stack coverage: mean=13.75; stdev=9.12; max=26; n_reads=55(91.7%)
Removing repetitive stacks: cov > 42 (mean+3*stdev)...
  Blacklisted 0 stacks.
Coverage after repeat removal: mean=13.75; stdev=9.12; max=26; n_reads=55(91.7%)

Assembling stacks (max. dist. M=2)...
  Assembled 4 stacks into 3; blacklisted 0 stacks.
Coverage after assembling stacks: mean=18.33; stdev=6.55; max=26; n_reads=55(91.7%)

Merging secondary stacks (max. dist. N=4 from consensus)...
  Merged 5 out of 5 secondary reads (100.0%), 0 merged with gapped alignments.
Coverage after merging secondary stacks: mean=20.00; stdev=6.53; max=28; n_reads=60(100.0%)

Assembling stacks, allowing for gaps (min. match length 80.0%)...
  Assembled 3 stacks into 3 stacks.
Coverage after gapped assembly: mean=20.00; stdev=6.53; max=28; n_reads=60(100.0%)

Merging secondary stacks, allowing for gaps (min. match length 80.0%)...
  Merged 0 out of 0 secondary reads (-nan%).
Coverage after merging gapped secondary stacks: mean=20.00; stdev=6.53; max=28; n_reads=60(100.0%)

Final coverage: mean=20.00; stdev=6.53; max=28; n_reads=60(100.0%)
Calling consensus sequences and haplotypes for catalog assembly...
Writing tags, SNPs, and alleles files...
Refetching read IDs...done.
ustacks is done.

Depths of Coverage for Processed Samples:
PopA_01: 22.00x
PopA_02: 20.00x

cstacks
==========
cstacks -M denovo_map/popmap_cstacks.tsv -P stacks_outputs

cstacks parameters selected:
  Loci matched based on sequence identity.
  Number of mismatches allowed between stacks: 1
  Gapped alignments: enabled
Constructing catalog from 2 samples.

Initializing new catalog...
  Parsing stacks_outputs/PopA_01.tags.tsv
  Parsing stacks_outputs/PopA_01.snps.tsv
  Parsing stacks_outputs/PopA_01.alleles.tsv
  3 loci were newly added to the catalog.

Processing sample stacks_outputs/PopA_02 [2 of 2]
  Parsing stacks_outputs/PopA_02.tags.tsv
  Parsing stacks_outputs/PopA_02.snps.tsv
  Parsing stacks_outputs/PopA_02.alleles.tsv
Searching for sequence matches...
  3 loci in the catalog, 184 kmers in the catalog hash.
Searching for gapped alignments...
Merging matches into catalog...
  3 loci were matched to a catalog locus.
  0 loci were matched to a catalog locus using gapped alignments.
  0 loci were newly added to the catalog.
  0 loci matched more than one catalog locus, linking them.
    0 linked catalog loci were merged into 0 loci.

Writing catalog in directory 'stacks_outputs/'.
Final catalog contains 3 loci.
cstacks is done.

sstacks
==========
sstacks -P stacks_outputs -M denovo_map/popmap_cstacks.tsv

Searching for matches by sequence identity...
  Parsing stacks_outputs/catalog.tags.tsv
  Parsing stacks_outputs/catalog.snps.tsv
  Parsing stacks_outputs/catalog.alleles.tsv
Populating kmer dictionary for exact matches...done.
Populating kmer dictionary for gapped alignments...done.

Processing sample 'stacks_outputs/PopA_01' [1 of 2]
  Parsing stacks_outputs/PopA_01.tags.tsv
  Parsing stacks_outputs/PopA_01.snps.tsv
  Parsing stacks_outputs/PopA_01.alleles.tsv
Searching for sequence matches...
3 sample loci compared against the catalog containing 3 loci.
  3 matching loci, 0 contained no verified haplotypes.
  0 loci matched more than one catalog locus and were excluded.
  0 loci contained SNPs unaccounted for in the catalog and were excluded.
  4 total haplotypes examined from matching loci, 4 verified.
Searching for gapped alignments...
Out of 3 query loci, 0 gapped alignments attempted.
  0 loci matched one catalog locus; 0 total haplotypes examined, 0 verified.
  0 loci matched no catalog locus;
    0 loci matched more than one catalog locus and were excluded.
    0 loci contained SNPs unaccounted for in the catalog and were excluded.
    0 loci had no verified haplotypes.
    0 loci had inconsistent alignments to a catalog locus and were excluded.
Outputing to file stacks_outputs/PopA_01.matches.tsv

Processing sample 'stacks_outputs/PopA_02' [2 of 2]
  Parsing stacks_outputs/PopA_02.tags.tsv
  Parsing stacks_outputs/PopA_02.snps.tsv
  Parsing stacks_outputs/PopA_02.alleles.tsv
Searching for sequence matches...
3 sample loci compared against the catalog containing 3 loci.
  3 matching loci, 0 contained no verified haplotypes.
  0 loci matched more than one catalog locus and were excluded.
  0 loci contained SNPs unaccounted for in the catalog and were excluded.
  4 total haplotypes examined from matching loci, 4 verified.
Searching for gapped alignments...
Out of 3 query loci, 0 gapped alignments attempted.
  0 loci matched one catalog locus; 0 total haplotypes examined, 0 verified.
  0 loci matched no catalog locus;
    0 loci matched more than one catalog locus and were excluded.
    0 loci contained SNPs unaccounted for in the catalog and were excluded.
    0 loci had no verified haplotypes.
    0 loci had inconsistent alignments to a catalog locus and were excluded.
Outputing to file stacks_outputs/PopA_02.matches.tsv

sstacks is done.

tsv2bam
==========
tsv2bam -P stacks_outputs  -M denovo_map/popmap_cstacks.tsv -R demultiplexed/

Logging to 'stacks_outputs/tsv2bam.log'.
Configuration for this run:
  Stacks directory: 'stacks_outputs/'
  Population map: 'denovo_map/popmap_cstacks.tsv'
  Num. samples: 2
  Paired-end reads directory: 'demultiplexed/'

Paired-end reads files found, e.g. 'demultiplexed/PopA_01.2.fq'.
Loading the catalog...
Processing sample 'PopA_01'...
Processing sample 'PopA_02'...

Sample 'PopA_01': matched 3 sample loci to 3 catalog loci; found a paired-end read for 66 (100.0%) of the assembled forward reads; wrote 132 records.
Sample 'PopA_02': matched 3 sample loci to 3 catalog loci; found a paired-end read for 60 (100.0%) of the assembled forward reads; wrote 120 records.

tsv2bam is done.

gstacks
==========
gstacks -P stacks_outputs -M denovo_map/popmap_cstacks.tsv

Logging to 'stacks_outputs/gstacks.log'.
Locus/sample distributions will be written to 'stacks_outputs/gstacks.log.distribs'.

Configuration for this run:
  Input mode: denovo
  Population map: 'denovo_map/popmap_cstacks.tsv'
  Input files: 2, e.g. 'stacks_outputs/PopA_01.matches.bam'
  Output to: 'stacks_outputs/'
  Model: marukilow (var_alpha: 0.01, gt_alpha: 0.05)

Reading BAM headers...
Processing all loci...
20%...
50%...
100%

Attempted to assemble and align paired-end reads for 3 loci:
  0 loci had no or almost no paired-end reads (0.0%);
  0 loci had paired-end reads that couldn't be assembled into a contig (0.0%);
  For the remaining 3 loci (100.0%), a paired-end contig was assembled;
    Average contig size was 204.3 bp;
  0 paired-end contigs overlapped the forward region (0.0%)
    Mean overlap: -nanbp; mean size of overlapped loci after merging: -nan;
  Out of 252 paired-end reads in these loci (mean 84.0 reads per locus),
    252 were successfuly aligned (100.0%);
  Mean insert length was -nan, stdev: nan (based on aligned reads in overlapped loci).

Genotyped 3 loci:
  effective per-sample coverage: mean=31.0x, stdev=1.0x, min=30.0x, max=32.0x
  mean number of sites per locus: 194.3
  a consistent phasing was found for 5 of out 5 (100.0%) diploid loci needing phasing

gstacks is done.

populations
==========
populations -P stacks_outputs -M denovo_map/popmap_cstacks.tsv

Logging to 'stacks_outputs/populations.log'.
Locus/sample distributions will be written to 'stacks_outputs/populations.log.distribs'.
populations parameters selected:
  Percent samples limit per population: 0
  Locus Population limit: 1
  Percent samples overall: 0
  Minor allele frequency cutoff: 0
  Maximum observed heterozygosity cutoff: 1
  Applying Fst correction: none.
  Pi/Fis kernel smoothing: off
  Fstats kernel smoothing: off
  Bootstrap resampling: off

Parsing population map...
The population map contained 2 samples, 1 population(s), 1 group(s).
Working on 2 samples.
Working on 1 population(s):
    1: PopA_01, PopA_02
Working on 1 group(s) of populations:
    defaultgrp: 1

Raw haplotypes will be written to 'stacks_outputs/populations.haplotypes.tsv'
Population-level summary statistics will be written to 'stacks_outputs/populations.sumstats.tsv'
Population-level haplotype summary statistics will be written to 'stacks_outputs/populations.hapstats.tsv'

Processing data in batches:
  * load a batch of catalog loci and apply filters
  * compute SNP- and haplotype-wise per-population statistics
  * write the above statistics in the output files
  * export the genotypes/haplotypes in specified format(s)
More details in 'stacks_outputs/populations.log.distribs'.
Now processing...
Batch 1 

Removed 0 loci that did not pass sample/population constraints from 3 loci.
Kept 3 loci, composed of 613 sites; 0 of those sites were filtered, 6 variant sites remained.
Number of loci with PE contig: 3.00 (100.0%);
  Mean length of loci: 194.33bp (stderr 0.33);
Number of loci with SE/PE overlap: 0.00 (0.0%);
  Mean length of overlapping loci: -nanbp (stderr -0.00); mean overlap: -nanbp (stderr -0.00);
Mean genotyped sites per locus: 194.33bp (stderr 0.33).

Population summary statistics (more detail in populations.sumstats_summary.tsv):
  1: 2 samples per locus; pi: 0.61111; all/variant/polymorphic sites: 583/6/6; private alleles: 0
Populations is done.
denovo_map.pl is done.

denovo_map.pl completed at 2021-08-24 13:29:01