Mercurial > repos > devteam > fastq_stats
view fastq_stats.xml @ 2:e2cf940128d5 draft
planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/tool_collections/galaxy_sequence_utils/fastq_stats commit f2582539542b33240234e8ea6093e25d0aee9b6a
author | devteam |
---|---|
date | Sat, 30 Sep 2017 13:55:30 -0400 |
parents | daaf552153fe |
children | b1cea1b2fcd0 |
line wrap: on
line source
<tool id="fastq_stats" name="FASTQ Summary Statistics" version="1.1.1"> <description>by column</description> <requirements> <requirement type="package" version="1.1.1">galaxy_sequence_utils</requirement> </requirements> <command><![CDATA[ gx-fastq-stats '$input_file' '$output_file' '${input_file.extension[len('fastq'):]}' ]]></command> <inputs> <param name="input_file" type="data" format="fastqsanger,fastqillumina,fastqsolexa,fastqcssanger,fastqsanger.gz,fastqillumina.gz,fastqsolexa.gz,fastqcssanger.gz,fastqsanger.bz2,fastqillumina.bz2,fastqsolexa.bz2,fastqcssanger.bz2" label="FASTQ File"/> </inputs> <outputs> <data name="output_file" format="tabular" /> </outputs> <tests> <test> <param name="input_file" value="fastq_stats1.fastq" ftype="fastqsanger" /> <output name="output_file" file="fastq_stats_1_out.tabular" /> </test> </tests> <help><![CDATA[ **What is does** This tool creates summary statistics on a FASTQ file. .. class:: infomark **TIP:** This statistics report can be used as input for the **Boxplot** tools. ----- **The output file will contain the following fields:** * column = column number (1 to 36 for a 36-cycles read Solexa file) * count = number of bases found in this column. * min = Lowest quality score value found in this column. * max = Highest quality score value found in this column. * sum = Sum of quality score values for this column. * mean = Mean quality score value for this column. * Q1 = 1st quartile quality score. * med = Median quality score. * Q3 = 3rd quartile quality score. * IQR = Inter-Quartile range (Q3-Q1). * lW = 'Left-Whisker' value (for boxplotting). * rW = 'Right-Whisker' value (for boxplotting). * outliers = Scores falling beyond the left and right whiskers (comma separated list). * A_Count = Count of 'A' nucleotides found in this column. * C_Count = Count of 'C' nucleotides found in this column. * G_Count = Count of 'G' nucleotides found in this column. * T_Count = Count of 'T' nucleotides found in this column. * N_Count = Count of 'N' nucleotides found in this column. * Other_Nucs = Comma separated list of other nucleotides found in this column. * Other_Count = Comma separated count of other nucleotides found in this column. For example:: #column count min max sum mean Q1 med Q3 IQR lW rW outliers A_Count C_Count G_Count T_Count N_Count other_bases other_base_count 1 14336356 2 33 450600675 31.4306281875 32.0 33.0 33.0 1.0 31 33 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 4482314 2199633 4425957 3208745 19707 2 14336356 2 34 441135033 30.7703737965 30.0 33.0 33.0 3.0 26 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25 4419184 2170537 4627987 3118567 81 3 14336356 2 34 433659182 30.2489127642 29.0 32.0 33.0 4.0 23 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 4310988 2941988 3437467 3645784 129 4 14336356 2 34 433635331 30.2472490917 29.0 32.0 33.0 4.0 23 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 4110637 3007028 3671749 3546839 103 5 14336356 2 34 432498583 30.167957813 29.0 32.0 33.0 4.0 23 34 2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 4348275 2935903 3293025 3759029 124 ----- .. class:: warningmark Adapter bases in color space reads are excluded from statistics. ]]></help> <citations> <citation type="doi">10.1093/bioinformatics/btq281</citation> </citations> </tool>