
<tool id="FastQC" name="FastQC" version="1.0.0">
  <description>quality control checks on raw sequence data</description>
  <command interpreter="python">fastqc.py
   #if $input.extension.startswith( "fastq"):
    --format=fastq
   #else
    --format=$input.extension
   #end if
   --input='$input'
   --name='$input.name'
   --dir='$report.extra_files_path'
   --report='$report'
   #if $contaminants != None and $contaminants != "None" and $contaminants != "":
     --contaminants='$contaminants'
   #end if
  </command>
  <inputs>
   <param name="input" type="data" format="fastq,sam,bam" label="FASTQ reads" />
   <param name="contaminants" type="data" format="tabular" optional="true" label="Contaminants"
           help="One contaminant per line, as two TAB-separated fields: a name and a DNA sequence.  For example: Illumina Small RNA RT Primer[TAB]CAAGCAGAAGACGGCATACGA"/>
  </inputs>
  <outputs>
    <data name="report" format="html" />
  </outputs>
  <tests>
    <!--
    <test>
      <param name="input1_file" value="3.fastqsanger" ftype="fastqsanger" />
      <output name="output1_file" file="split_pair_reads_1.fastqsanger" />
      <output name="output2_file" file="split_pair_reads_2.fastqsanger" />
    </test>
    -->
  </tests>
  <help>
**What it does**

FastQC_ is a product of the Bioinformatics Group at the Babraham Institute.  It provides a simple way to run quality control checks on raw sequence data coming from high-throughput sequencing pipelines. Its modular set of analyses gives a quick impression of whether your data has any problems you should be aware of before doing further analysis.

The main functions of FastQC are::

  - Import of data from BAM, SAM or FastQ files (any variant)
  - Providing a quick overview to tell you in which areas there may be problems
  - Summary graphs and tables to quickly assess your data
  - Export of results to an HTML based permanent report
  - Offline operation to allow automated generation of reports without running the interactive application


.. _FastQC: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

-----

**Input format**

Any FASTQ file (the tool also accepts SAM and BAM files), for example::

    @HWI-EAS91_1_30788AAXX:7:21:1542:1758
    GTCAATTGTACTGGTCAATACTAAAAGAATAGGATCGCTCCTAGCATCTGGAGTCTCTATCACCTGAGCCCA
    +HWI-EAS91_1_30788AAXX:7:21:1542:1758
    hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh`hfhhVZSWehR
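
The wrapper's command template passes a ``--format`` value derived from the Galaxy datatype extension: any extension starting with ``fastq`` is reported as ``fastq``, and anything else (``sam``, ``bam``) is passed through unchanged. A minimal Python sketch of that rule follows; the function name ``fastqc_format`` is hypothetical and is not part of ``fastqc.py``.

```python
def fastqc_format(extension):
    """Return the --format value for a Galaxy datatype extension.

    Hypothetical helper mirroring the #if/#else block in the command template.
    """
    # Galaxy FASTQ variants such as "fastqsanger" all start with "fastq".
    if extension.startswith("fastq"):
        return "fastq"
    # Other accepted datatypes (sam, bam) are passed through as-is.
    return extension
```

For example, ``fastqc_format("fastqsanger")`` gives ``fastq``, while ``fastqc_format("bam")`` gives ``bam``.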

**Contaminants format**

An optional contaminants file; if none is supplied, FastQC uses its default list. For example::

  # This file contains a list of potential contaminants which are
  # frequently found in high throughput sequencing reactions.  These
  # are mostly sequences of adapters / primers used in the various
  # sequencing chemistries.
  #
  # You can add more sequences to the file by putting one line per entry
  # and specifying a name[tab]sequence.  If the contaminant you add is
  # likely to be of use to others please consider sending it to the FastQC
  # authors, either via a bug report at www.bioinformatics.bbsrc.ac.uk/bugzilla/
  # or by directly emailing simon.andrews@bbsrc.ac.uk so other users of
  # the program can benefit.
  Illumina Single End Adapter 1   ACACTCTTTCCCTACACGACGCTGTTCCATCT
  Illumina Single End Adapter 2   CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
  Illumina Single End PCR Primer 1        AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
  Illumina Single End PCR Primer 2        CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
  Illumina Single End Sequencing Primer   ACACTCTTTCCCTACACGACGCTCTTCCGATCT

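Before uploading a custom contaminants file, it can help to check that every entry really is a name and a sequence separated by a TAB. The following is a minimal Python sketch of such a check, not FastQC's own parser. The function name ``parse_contaminants`` is hypothetical, and tolerating runs of TABs and restricting sequences to ``ACGTN`` are assumptions of this sketch.

```python
def parse_contaminants(text):
    """Parse contaminant entries of the form name[TAB]sequence.

    Blank lines and lines starting with '#' are skipped.
    Returns a list of (name, sequence) tuples.
    """
    entries = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Assumption: several TABs in a row count as one separator,
        # so empty pieces produced by the split are dropped.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) != 2:
            raise ValueError(
                "line %d: expected name[TAB]sequence" % line_number)
        name, sequence = fields
        sequence = sequence.upper()
        # Assumption: only plain DNA letters (plus N) are allowed.
        unexpected = set(sequence) - set("ACGTN")
        if unexpected:
            raise ValueError(
                "line %d: unexpected characters %s"
                % (line_number, "".join(sorted(unexpected))))
        entries.append((name, sequence))
    return entries
```

Calling ``parse_contaminants(open("contaminants.txt").read())`` on a well-formed file returns the list of entries; a line without a TAB, such as one where the name and sequence are separated only by spaces, raises ``ValueError`` with the offending line number.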

-----

**Outputs**

An HTML file with links to::

  - fastqc_report.html
  - summary.txt
  - fastqc_data.txt

  </help>
</tool>
author	jjohnson
date	Tue, 07 Jun 2011 17:22:05 -0400