view galaxy-conf/PowerSpectrum.xml @ 22:727fbba02ef7 draft

Fix a bug in GeneTrackToWig where +/- strands were not merged correctly. Add option to vary the chunk size of tools that process chromosomes in chunks.
author timpalpant
date Tue, 19 Jun 2012 22:09:23 -0400
parents 9d56b5b85740
children b43c420a6135
line wrap: on
line source

<tool id="PowerSpectrum" name="Compute the power spectrum" version="1.0.0">
  <description>of data in a Wiggle file</description>
  <command interpreter="sh">galaxyToolRunner.sh ngs.PowerSpectrum -i $input -l $windows -m $max -o $output</command>
  <inputs>
      <param format="bigwig,wig" name="input" type="data" label="Input data" />
      <param format="bed,bedgraph,gff" name="windows" type="data" label="List of intervals" />
      <param name="max" type="integer" value="40" label="Number of frequencies to output" />
  </inputs>
  <outputs>
      <data format="tabular" name="output" />
  </outputs>
  
<help>
  
This tool computes the power spectrum of intervals of sequencing data. For each interval provided, the normalized power spectrum is calculated, representing the relative power in each frequency. Power spectra are normalized to have total power 1, with the DC component (0 frequency) removed. Power spectra are computed using the FFT_ implementation in JTransforms_.

.. _FFT: http://en.wikipedia.org/wiki/Fast_Fourier_transform

.. _JTransforms: http://sites.google.com/site/piotrwendykier/software/jtransforms

-----

**Syntax**

- **Input data** is the genomic data on which to compute the power spectrum.
- **List of intervals:** The power spectrum will be computed for each genomic interval specified in this list.
- **Number of frequencies:** The power spectrum will be truncated at this frequency in the output

-----

**Output**

The output has the following format ::

  chr  start  stop  id  alignment  strand  freq1  freq2  ...
  
up to the maximum frequency specified. Frequencies are truncated to reduce the size of the output since signals are often band-limited.

-----

.. class:: warningmark

**NOTE:** Even though frequencies may be truncated in the output, all frequencies in the power spectrum are computed and used for normalization.
  
.. class:: infomark

**TIP:** If your dataset does not appear in the pulldown menu, it means that it is not in Wig or BigWig format. Use "edit attributes" to set the correct format if it was not detected correctly. Intervals must be provided in Bed, BedGraph, or GFF format.

-----

This tool is equivalent to the following Matlab commands, where x is a vector with the interval of sequencing data ::

    N = length(x);
    f = fft(x);
    p = abs(f(2:N/2)).^2;
    p = p / sum(p);

</help>
</tool>