Mercurial > repos > iuc > spotyping

<tool id="spotyping" name="SpoTyping" version="@TOOL_VERSION@+galaxy2" profile="17.01">
  <description>fast and accurate in silico Mycobacterium spoligotyping from sequence reads</description>

  <macros>
    <import>spotyping_macros.xml</import>
  </macros>

  <requirements>
    <requirement type="package" version="@TOOL_VERSION@">spotyping</requirement>
  </requirements>

  <command detect_errors="aggressive"><![CDATA[
    #set $input_file='input.' + $input.extension
    ln -s '${input}' $input_file &&
    SpoTyping.py
    $advanced.noQuery
    $advanced.seq
    $advanced.swift
    $advanced.filter
    $advanced.sorted
    $input_file &&
    cat SpoTyping.log SpoTyping > '${output_txt}'
    #if $advanced.noQuery != '--noQuery':
      && cp SITVIT_ONLINE.*.xls '${output_xls}'
      #if $advanced.pdf_plot == 'yes':
        && SpoTyping_plot.r '${output_xls}' '${output_pdf}'
      #end if
    #end if
    ]]>
  </command>

  <inputs>
    <param name="input" type="data" format="fastq,fastq.gz,fasta" label="Sequence reads" />
    <section name="advanced" title="Advanced options" expanded="false">
      <param type="boolean" argument="--noQuery" label="Query SITVIT" truevalue="" falsevalue="--noQuery" checked="false" help="Query SITVIT service for report on spoligotype" />
      <param type="boolean" argument="--seq" label="Input is assembled sequence" help="Input is either a complete genomic sequence or assembled contigs from an isolate" truevalue="--seq" falsevalue="" checked="false" />
      <param type="boolean" argument="--swift" label="Swift mode" checked="true" truevalue="--swift=on" falsevalue="--swift=off" />
      <param type="boolean" argument="--filter" label="Stringent filtering of reads" truevalue="--filter" falsevalue="" checked="false" />
      <param type="boolean" argument="--sorted" label="Reads are sorted to a reference genome" truevalue="--sorted" falsevalue="" />
      <param type="boolean" name="pdf_plot" label="Generate plot from SITVIT report output" truevalue="yes" falsevalue="" />
    </section>
  </inputs>

  <outputs>
    <data name="output_txt" label="SpoTyping spoligotyping on ${on_string}" format="txt" />
    <data name="output_xls" label="SpoTyping SITVIT report on ${on_string}" format="excel.xls">
      <filter>advanced['noQuery']</filter>
    </data>
    <data name="output_pdf" label="SpoTyping plot from SITVIT report on ${on_string}" format="pdf">
      <filter>advanced['pdf_plot']</filter>
    </data>
  </outputs>

  <tests>
    <test expect_num_outputs="1">
      <param name="input" value="input.fastq.gz" ftype="fastq.gz" />
      <output name="output_txt">
        <assert_contents>
          <has_text text="1111111111111111101111111111111100001111111" />
        </assert_contents>
      </output>
    </test>
    <test expect_failure="true">
      <param name="input" value="input.fastq.gz" ftype="fastq.gz" />
      <param name="seq" value="--seq" />
      <assert_stderr>
        <has_text text="BLAST options error" />
      </assert_stderr>
    </test>
    <test expect_num_outputs="2">
      <param name="input" value="input.fastq.gz" ftype="fastq.gz" />
      <param name="noQuery" value="" />
      <output name="output_txt">
        <assert_contents>
          <has_text text="1111111111111111101111111111111100001111111" />
        </assert_contents>
      </output>
      <output name="output_xls">
        <assert_contents>
          <has_text text="Spoligotype" />
        </assert_contents>
      </output>
    </test>
    <test expect_num_outputs="3">
      <param name="input" value="input.fastq.gz" ftype="fastq.gz" />
      <param name="noQuery" value="" />
      <param name="pdf_plot" value="yes" />
      <output name="output_txt">
        <assert_contents>
          <has_text text="1111111111111111101111111111111100001111111" />
        </assert_contents>
      </output>
      <output name="output_xls">
        <assert_contents>
          <has_text text="Spoligotype" />
        </assert_contents>
      </output>
      <output name="output_pdf">
        <assert_contents>
          <has_text text="PDF-1.4" />
        </assert_contents>
      </output>
    </test>

  </tests>

  <help><![CDATA[
    SpoTyping_ is a software for predicting spoligotype_ from sequencing reads, complete genomic sequences and assembled contigs.

    **Input:**

    - Fastq file - if paired end data is used, you may choose to concatenate paired reads into a single input (e.g. using the cat tool)
    - Fasta file of a complete genomic sequence or assembled contigs of an isolate (with --seq option)

    *Note on input size*: In swift mode the sampling threshold is reached in approximately 30x coverage when using
    paired end sequencing of a *M. tuberculosis* genome.

    **Output:**

    Count of hits from BLAST result for each spacer sequence and predicted spoligotype in the format of binary code and octal code.

    **Options:**

    \--noQuery
    Avoid querying the SITVIT_ online service to describe the prevalance of the reported spoligotype.

    \--seq
    Set this if input is a fasta file that contains only complete genomic sequence or assembled contigs from an isolate. [Default is off]

    \-s SWIFT, --swift=SWIFT
    Swift mode, either "on" or "off" [Default: on] - swift mode samples 250 million bases to use for spoligotyping

    \--sorted
    Set if input reads are sorted relative to positions on a reference genome. If reads are sorted and swift mode is used, swift mode's sampling is adjusted
    to sample reads across positions in the genome evenly.

    \--filter
    Filter reads such that:

    1. Leading and trailing 'N's would be removed.
    2. Any read with more than 3 'N's in the middle would be removed.
    3. Any read with more than 7 consecutive bases identical would be trimmed/filtered out given
       the length of the flanking regions.

    **Got weird spoligotype prediction?**

    Sequencing throughput is very low (<40Mbp, for example): SpoTyping may not be able to give accurate prediction due to the relatively low read depth.

    **Interpreting the spoligotype**

    The binary or octal spoligotype can be used to look up lineage information using a service
    like `TB Lineage`_.

    **SITVIT reports**

    Optionally a report on the detected spoligotype can be retrieved from the SITVIT_ database. If such a report is requested it can also be
    illustrated as a (PDF format) plot.

  .. _SpoTyping: https://github.com/xiaeryu/SpoTyping
  .. _spoligotype: https://www.ncbi.nlm.nih.gov/pubmed/19521871
  .. _TB Lineage: http://tbinsight.cs.rpi.edu/run_tb_lineage.html
  .. _SITVIT: http://www.pasteur-guadeloupe.fr:8081/SITVIT_ONLINE/
    ]]>
  </help>

  <citations>
    <expand macro="spotyping_citation" />
  </citations>
</tool>
author	iuc
date	Sun, 21 Oct 2018 13:03:00 -0400
parents	f82981245fbe
children