view fastx_quality_statistics.xml @ 2:7306ec78632a draft
planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/tool_collections/fastx_toolkit/fastx_quality_statistics commit a1517c9d22029095120643bbe2c8fa53754dd2b7
author | devteam |
---|---|
date | Wed, 11 Nov 2015 12:39:32 -0500 |
parents | c2af34024061 |
children | 59003d0543cd |
<tool id="cshl_fastx_quality_statistics" version="1.0.0" name="Compute quality statistics">
    <description></description>
    <requirements>
        <requirement type="package" version="0.0.13">fastx_toolkit</requirement>
    </requirements>
    <command>
<![CDATA[
zcat -f < '$input' | fastx_quality_stats -o '$output' -Q 33
]]>
    </command>
    <inputs>
        <param format="fastqsanger" name="input" type="data" label="Library to analyse" />
    </inputs>
    <outputs>
        <data format="txt" name="output" metadata_source="input" />
    </outputs>
    <tests>
        <test>
            <param name="input" value="fastq_stats1.fastq" ftype="fastqsanger"/>
            <output name="output" file="fastq_stats1.out" />
        </test>
    </tests>
    <help>
**What it does**

Creates quality statistics report for the given Solexa/FASTQ library.

.. class:: infomark

**TIP:** This statistics report can be used as input for **Quality Score** and **Nucleotides Distribution** tools.

-----

**The output file will contain the following fields:**

* column  = column number (1 to 36 for a 36-cycles read Solexa file)
* count   = number of bases found in this column.
* min     = Lowest quality score value found in this column.
* max     = Highest quality score value found in this column.
* sum     = Sum of quality score values for this column.
* mean    = Mean quality score value for this column.
* Q1      = 1st quartile quality score.
* med     = Median quality score.
* Q3      = 3rd quartile quality score.
* IQR     = Inter-Quartile range (Q3-Q1).
* lW      = 'Left-Whisker' value (for boxplotting).
* rW      = 'Right-Whisker' value (for boxplotting).
* A_Count = Count of 'A' nucleotides found in this column.
* C_Count = Count of 'C' nucleotides found in this column.
* G_Count = Count of 'G' nucleotides found in this column.
* T_Count = Count of 'T' nucleotides found in this column.
* N_Count = Count of 'N' nucleotides found in this column.

For example::

     1  6362991  -4  40  250734117  39.41  40  40  40   0  40  40  1396976  1329101   678730  2958184    0
     2  6362991  -5  40  250531036  39.37  40  40  40   0  40  40  1786786  1055766  1738025  1782414    0
     3  6362991  -5  40  248722469  39.09  40  40  40   0  40  40  2296384   984875  1443989  1637743    0
     4  6362991  -4  40  248214827  39.01  40  40  40   0  40  40  2536861  1167423  1248968  1409739    0
    36  6362991  -5  40  117158566  18.41   7  15  30  23  -5  40  4074444  1402980    63287   822035  245

------

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

 .. __: http://hannonlab.cshl.edu/fastx_toolkit/
    </help>
<!-- FASTQ-Statistics is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->
</tool>
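
The field list in the tool's help text can be sanity-checked against a report with a short script. The following is a minimal sketch, not part of the tool: it assumes the report is whitespace-delimited and that its columns appear in the order the help text lists them. The names `FIELDS`, `SAMPLE`, `parse_stats`, and `check_row` are hypothetical and chosen only for illustration; the sample rows are copied from the help text's example.

```python
from typing import Dict, List, Union

Number = Union[int, float]

# Field names in the order given by the help text (assumed to be the column order).
FIELDS = ["column", "count", "min", "max", "sum", "mean", "Q1", "med", "Q3",
          "IQR", "lW", "rW", "A_Count", "C_Count", "G_Count", "T_Count", "N_Count"]
BASES = ["A_Count", "C_Count", "G_Count", "T_Count", "N_Count"]

# Rows 1 and 36 from the example in the help text.
SAMPLE = """\
1 6362991 -4 40 250734117 39.41 40 40 40 0 40 40 1396976 1329101 678730 2958184 0
36 6362991 -5 40 117158566 18.41 7 15 30 23 -5 40 4074444 1402980 63287 822035 245
"""


def to_number(value: str) -> Number:
    """Convert a field to int, or to float when it carries a decimal point."""
    return float(value) if "." in value else int(value)


def parse_stats(text: str) -> List[Dict[str, Number]]:
    """Parse report text into one dict per read position.

    Lines that are too short or do not start with a column number
    (for example a header line) are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < len(FIELDS) or not parts[0].isdigit():
            continue
        # zip() stops at the 17 documented fields, ignoring any extra columns.
        rows.append({name: to_number(value) for name, value in zip(FIELDS, parts)})
    return rows


def check_row(row: Dict[str, Number]) -> bool:
    """Check two relations implied by the field descriptions:
    IQR equals Q3 - Q1, and the per-base counts add up to `count`."""
    iqr_ok = row["IQR"] == row["Q3"] - row["Q1"]
    bases_ok = sum(row[b] for b in BASES) == row["count"]
    return iqr_ok and bases_ok


if __name__ == "__main__":
    for row in parse_stats(SAMPLE):
        print(row["column"], row["mean"], check_row(row))
```

Splitting on arbitrary whitespace rather than a fixed delimiter keeps the sketch tolerant of either tab- or space-separated reports; a stricter parser would be preferable once the exact delimiter is confirmed.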