        Note that additional data was saved in multiqc_data when this report was generated.



        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411

        About MultiQC

        This report was generated using MultiQC, version 1.5

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/ewels/MultiQC


        A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.

        Report generated on 2018-04-06, 18:29 based on data in: /Users/bebatut/Documents/galaxy/tools/tools-iuc/tools/multiqc/multiqc_WDir


        General Statistics

        Showing 21/21 rows and 31/34 columns.
        Columns: Sample Name | M Reads Mapped | N50 (Kbp) | Length (Mbp) | Change rate | Ts/Tv | M Variants | TiTV ratio (known) | TiTV ratio (novel) | % Assigned | M Assigned | Vars | SNP | Indel | Ts/Tv | M Assigned | % rRNA | % mRNA | % Aligned | Insert Size | % Dups | Organism | Contigs | CDS | % Dups | Error rate | M Non-Primary | M Reads Mapped | % Mapped | M Total seqs | % Duplicates | % Mapped

        Each row lists the sample name followed by its non-empty cells.

        14892_1#15 | 115.1Kbp | 18.4Mbp
        70: TopHat on data 1, data 4, and data 3: accepted_hits | 70.8% | 0.3
        75: TopHat on data 1, data 6, and data 5: accepted_hits | 69.6% | 0.4
        80: TopHat on data 1, data 8, and data 7: accepted_hits | 71.8% | 0.4
        85: TopHat on data 1, data 10, and data 9: accepted_hits | 72.0% | 0.4
        90: TopHat on data 1, data 12, and data 11: accepted_hits | 71.3% | 0.4
        95: TopHat on data 1, data 14, and data 13: accepted_hits | 70.7% | 0.5
        D11_H4K16ac_Rep1_R1_fastq_gz | 98%
        Sample1 | Helicobacter pylori | 30.0 | 1548
        Sample2 | Escherichia coli | 52.0 | 1548
        Test1 | 5522770 | 4474244 | 902934 | 1.97
        bamtools | 0.0% | 93.1%
        dataset_114 | 0.6%
        dataset_197 | 176 bp
        gatk_varianteval | 0.0 | 2.2
        htseq | 0.0% | 0.0
        picard_CollectRnaSeqMetrics_bam | % | 79.6%
        samtools_flagstat | 20.7
        samtools_stats | 0.42% | 0.0 | 0.6 | 100.0% | 0.6
        snpeff | 3190 | 0.000 | 0.97
        virtual-normal | 1.3%

        QUAST

        QUAST is a quality assessment tool for genome assemblies, written by the Center for Algorithmic Biotechnology.

        Assembly Statistics

        Showing 1/1 rows and 12/12 columns.
        Sample Name | N50 (Kbp) | N75 (Kbp) | L50 (K) | L75 (K) | Largest contig (Kbp) | Length (Mbp) | Misassemblies | Mismatches/100kbp | Indels/100kbp | Genes | Genes (Partial) | Genome Fraction
        14892_1#15 | 115.1Kbp | 62.3Kbp | 0.0K | 100.0K | 435.5Kbp | 18.4Mbp | 107.0 | 262.21 | 33.75 | 7280 | 968 | 95.5%
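
The N50 and L50 columns in the Assembly Statistics table follow the usual assembly definitions: N50 is the length of the contig at which the cumulative length, counting from the longest contig down, first reaches half of the total assembly length, and L50 is the number of contigs needed to get there. The sketch below illustrates that calculation. The helper name `n50_l50` and the contig lengths are made up for illustration; this is not QUAST's own code.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        # The first contig that pushes the running sum to half the
        # assembly length defines N50; its rank is L50.
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Illustrative lengths only, not taken from the report.
example = [100, 80, 60, 40, 20]
print(n50_l50(example))
```

With these example lengths the total is 300, so the running sum crosses 150 at the second-longest contig.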

        Number of Contigs

        This plot shows the number of contigs found for each assembly, broken down by length.


        RSeQC

        The RSeQC package provides a number of useful modules that can comprehensively evaluate high-throughput RNA-seq data.

        Read GC Content

        read_GC calculates a histogram of read GC content.
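
As a rough illustration of what a read GC-content histogram contains, the sketch below computes the GC percentage of each read and counts reads per bin. The function names, the 10% bin width and the example reads are assumptions made for illustration; this is not RSeQC's implementation.

```python
from collections import Counter


def gc_percent(read):
    """GC content of one read as a percentage (0-100)."""
    read = read.upper()
    if not read:
        return 0.0
    gc = read.count("G") + read.count("C")
    return 100.0 * gc / len(read)


def gc_histogram(reads, bin_width=10):
    """Count reads per GC bin; keys are the lower edge of each bin."""
    hist = Counter()
    for read in reads:
        lower = int(gc_percent(read) // bin_width) * bin_width
        # Put reads with 100% GC into the top bin instead of a bin of their own.
        lower = min(lower, 100 - bin_width)
        hist[lower] += 1
    return dict(sorted(hist.items()))


# Made-up reads, for illustration only.
reads = ["ACGT", "GGCC", "ATAT", "GCGA"]
print(gc_histogram(reads))
```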


        BUSCO

        BUSCO assesses genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs.

        Lineage: fungi_odb9


        deepTools

        deepTools is a suite of tools to process and analyze deep sequencing data.

        Filtering metrics

        Estimated percentages of alignments filtered independently for each setting in estimateReadFiltering

        Showing 2/2 rows and 10/10 columns.
        Sample Name | M entries | % Aligned | % Tot. Filtered | % Blacklisted | % Missing Flags | % Forbidden Flags | % deepTools Dupes | % Duplication | % Singletons | % Strand Filtered
        bismark_se.pbat | 30.9 | 100.0 | 44.1 | 0.0 | 0.0 | 0.0 | 43.4 | 0.0 | 0.0 | 5.0
        bwameth_se.pbat | 71.4 | 64.7 | 19.8 | 0.0 | 0.0 | 0.0 | 28.4 | 0.0 | 0.0 | 6.4

        Coverage metrics

        Showing 2/2 rows and 7/7 columns.
        Sample Name | Min | 1st Quartile | Median | Mean | 3rd Quartile | Max | Std. Dev.
        bismark_se.pbat | 0.0 | 0.0 | 0.0 | 3.0 | 0.0 | 959.0 | 16.5
        bwameth_se.pbat | 0.0 | 0.0 | 0.0 | 4.5 | 0.0 | 2589.0 | 23.4

        Coverage distribution

        The fraction of bases covered by a given number of reads/fragments.


        Read lengths

        Showing 5/5 rows and 9/9 columns.
        Sample Name | # Sampled | Min | 1st Quartile | Mean | Median | 3rd Quartile | Max | Std. Dev. | MAD
        bismark_se.pbat | 34118 | 21 | 101 | 99.6 | 101 | 101 | 101 | 7.1 | 0.0
        bwameth_se.pbat | 51142 | 40 | 101 | 98.0 | 101 | 101 | 101 | 10.9 | 0.0
        dnmt1MUT_mat2aMUT_1_RGi | 15504 | 148 | 148 | 148.0 | 148 | 148 | 148 | 0.0 | 0.0
        dnmt1MUT_mat2aMUT_2_RGi | 17327 | 148 | 148 | 148.0 | 148 | 148 | 148 | 0.0 | 0.0
        dnmt1MUT_mat2aMUT_3_RGi | 15266 | 102 | 148 | 148.0 | 148 | 148 | 148 | 0.4 | 0.0

        Fragment lengths

        Showing 3/3 rows and 9/9 columns.
        Sample Name | # Sampled | Min | 1st Quartile | Mean | Median | 3rd Quartile | Max | Std. Dev. | MAD
        dnmt1MUT_mat2aMUT_1_RGi | 15504 | 174 | 185 | 185 | 185 | 185 | 186 | 0.2 | 0.0
        dnmt1MUT_mat2aMUT_2_RGi | 17327 | 176 | 185 | 185 | 185 | 185 | 186 | 0.2 | 0.0
        dnmt1MUT_mat2aMUT_3_RGi | 15266 | 104 | 185 | 185 | 185 | 185 | 186 | 0.7 | 0.0

        Read/fragment length distribution


        Signal enrichment per feature

        Signal enrichment per feature according to plotEnrichment


        Fingerprint

        Signal fingerprint according to plotFingerprint


        SnpEff

        SnpEff is a genetic variant annotation and effect prediction toolbox. It annotates and predicts the effects of variants on genes (such as amino acid changes).

        Variants by Genomic Region

        The stacked bar plot shows locations of detected variants in the genome and the number of variants for each location.

        The upstream and downstream interval size to detect these genomic regions is 5000bp by default.


        Variant Effects by Impact

        The stacked bar plot shows the putative impact of detected variants and the number of variants for each impact.

        There are four levels of impacts predicted by SnpEff:

        • High: High impact (e.g. a stop codon)
        • Moderate: Moderate impact (e.g. substitution by an amino acid of the same type)
        • Low: Low impact (i.e. a silent mutation)
        • Modifier: No impact


        Variant Effects by Class

        The stacked bar plot shows the effect of variants at protein level and the number of variants for each effect type.

        This plot shows the effect of variants on the translation of the mRNA as protein. There are three possible cases:

        • Silent: The amino acid does not change.
        • Missense: The amino acid is different.
        • Nonsense: The variant generates a stop codon.
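
The three cases can be illustrated by translating the reference and the altered codon with the standard genetic code and comparing the results. This is a simplified sketch, not SnpEff's algorithm: the function name is hypothetical, it only handles single-codon substitutions, and it ignores other cases such as the loss of an existing stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def effect_class(ref_codon, alt_codon):
    """Classify a codon substitution as Silent, Missense or Nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if alt_aa == ref_aa:
        return "Silent"      # the amino acid does not change
    if alt_aa == "*":
        return "Nonsense"    # the variant generates a stop codon
    return "Missense"        # the amino acid is different


print(effect_class("CTT", "CTC"))  # Leu -> Leu
print(effect_class("GAA", "GTA"))  # Glu -> Val
print(effect_class("TAC", "TAA"))  # Tyr -> stop
```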

        Error - was not able to plot data.


        Variant Qualities

        The line plot shows the number of variants as a function of the variant quality score.

        The quality score corresponds to the QUAL column of the VCF file. This score is set by the variant caller.
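
For readers who want to reproduce such counts from a VCF file directly, the sketch below tallies variants by their QUAL value, truncated to an integer. QUAL is the sixth tab-separated column of a VCF record. The function name and the example records are made up for illustration, and this is not how SnpEff or MultiQC compute the plot.

```python
from collections import Counter


def qual_histogram(vcf_lines):
    """Count variants per QUAL value, truncated to an integer."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]  # CHROM, POS, ID, REF, ALT, QUAL, ...
        if qual == ".":
            continue  # QUAL is missing for this record
        counts[int(float(qual))] += 1
    return dict(sorted(counts.items()))


# Made-up records, for illustration only.
lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t30.5\tPASS\t.",
    "chr1\t200\t.\tC\tT\t30\tPASS\t.",
    "chr1\t300\t.\tG\tA\t.\tPASS\t.",
    "chr2\t50\t.\tT\tC\t99\tPASS\t.",
]
print(qual_histogram(lines))
```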


        GATK

        GATK is a toolkit offering a wide variety of tools with a primary focus on variant discovery and genotyping.

        Observed Quality Scores

        This plot shows the distribution of base quality scores in each sample before and after base quality score recalibration (BQSR). Applying BQSR should broaden the distribution of base quality scores.

        For more information see the Broad's description of BQSR.


        Variant Counts


        Compare Overlap

        Showing 0/0 rows and 0/0 columns.
        Sample Name

        HTSeq Count

        HTSeq Count is part of the HTSeq Python package - it takes a file with aligned sequencing reads, plus a list of genomic features and counts how many reads map to each feature.


        Bcftools

        Bcftools contains utilities for variant calling and manipulating VCFs and BCFs.

        Variant Substitution Types


        Indel Distribution


        Variant depths

        Read depth support distribution for called variants


        featureCounts

        Subread featureCounts is a highly efficient general-purpose read summarization program that counts mapped reads for genomic features such as genes, exons, promoters, gene bodies, genomic bins and chromosomal locations.


        Picard

        Picard is a set of Java command line tools for manipulating high-throughput sequencing data.

        Alignment Summary

        Please note that Picard's read counts are divided by two for paired-end data.


        Base Distribution

        Plot shows the distribution of bases by cycle.


        GC Coverage Bias

        This plot shows bias in coverage across regions of the genome with varying GC content. A perfect library would be a flat line at y = 1.
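
One way to read the y = 1 baseline: if the mean coverage at each GC content is divided by the overall mean coverage, an unbiased library gives a value of 1 everywhere. The sketch below shows that normalization. It assumes per-window (GC %, coverage) pairs are already available; the function name and the example values are hypothetical, and this is not Picard's own calculation.

```python
from collections import defaultdict


def normalized_coverage_by_gc(windows):
    """Map each GC percentage to (mean coverage at that GC) / (overall mean).

    `windows` is an iterable of (gc_percent, coverage) pairs, one per
    genomic window.
    """
    windows = list(windows)
    if not windows:
        return {}
    overall = sum(cov for _, cov in windows) / len(windows)
    if overall == 0:
        raise ValueError("overall mean coverage is zero")
    by_gc = defaultdict(list)
    for gc, cov in windows:
        by_gc[gc].append(cov)
    return {
        gc: (sum(covs) / len(covs)) / overall
        for gc, covs in sorted(by_gc.items())
    }


# Made-up windows: coverage rises with GC content, so the line is not flat.
biased = [(30, 10), (30, 10), (50, 20), (50, 20), (70, 30), (70, 30)]
print(normalized_coverage_by_gc(biased))

# Uniform coverage gives 1.0 for every GC bin.
flat = [(30, 20), (50, 20), (70, 20)]
print(normalized_coverage_by_gc(flat))
```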


        Insert Size

        Plot shows the number of reads at a given insert size. Reads with different orientations are summed.


        Mark Duplicates


        RnaSeqMetrics Assignment

        Number of bases in primary alignments that align to regions in the reference genome.


        Gene Coverage


        Prokka

        Prokka is a software tool for the rapid annotation of prokaryotic genomes.

        This barplot shows the distribution of different types of features found in each contig.

        Prokka can detect different features:

        • CDS
        • rRNA
        • tmRNA
        • tRNA
        • miscRNA
        • signal peptides
        • CRISPR arrays



        Samblaster

        Samblaster is a tool to mark duplicates and extract discordant and split reads from SAM files.


        Samtools

        Samtools is a suite of programs for interacting with high-throughput sequencing data.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads are expected for each sample. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).


        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.


        Samtools Flagstat

        This module parses the output from samtools flagstat. All numbers in millions.


        Bamtools

        Bamtools provides both a programmer's API and an end-user's toolkit for handling BAM files.

        Bamtools Stats


        VCFTools

        VCFTools is a program for working with and reporting on VCF files.

        TsTv by Qual

        Plot of TSTV-BY-QUAL - the transition to transversion ratio as a function of SNP quality from the output of vcftools TsTv-by-qual.

        A transition is a purine-to-purine or pyrimidine-to-pyrimidine point mutation. A transversion is a purine-to-pyrimidine or pyrimidine-to-purine point mutation. Quality here is the Phred-scaled quality score given in the QUAL column of the VCF file. Note: only bi-allelic SNPs are used (multi-allelic sites and INDELs are skipped). For details on --TsTv-by-qual, refer to the VCFtools manual (https://vcftools.github.io/man_latest.html).
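
The transition/transversion definition translates directly into code. The sketch below classifies single-base substitutions and computes a Ts/Tv ratio for a list of SNPs. The function names and the example SNPs are made up for illustration; this is not VCFtools code, and it does no binning by quality.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(snps):
    """Ts/Tv ratio for (ref, alt) pairs; None if there are no transversions."""
    ts = tv = 0
    for ref, alt in snps:
        if substitution_type(ref, alt) == "transition":
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else None


# Made-up bi-allelic SNPs, for illustration only.
snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "A")]
print(ts_tv_ratio(snps))
```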
