view galaxy-conf/WigCorrelate.xml @ 13:eb53be9a09f4 draft

Uploaded
author timpalpant
date Sat, 19 May 2012 10:36:45 -0400
parents 81d5b81fb3c2

<tool id="WigCorrelate" name="Correlate" version="1.0.0">
  <description>multiple (Big)Wig files</description>
  <command interpreter="sh">
    galaxyToolRunner.sh wigmath.WigCorrelate -w $window -s $step -t $type -o $output
      #for $input in $inputs
        ${input.file}
      #end for
  </command>
  <inputs>
      <repeat name="inputs" title="(Big)Wig file">
        <param name="file" type="data" format="bigwig,wig" />
      </repeat>
      <param name="window" type="integer" value="100" label="Window size (bp)" />
      <param name="step" type="integer" value="50" label="Sliding step size (bp)" />
      <param name="type" type="select" label="Correlation metric">
        <option value="pearson">Pearson</option>
        <option value="spearman">Spearman</option>
      </param>
  </inputs>
  <outputs>
      <data format="tabular" name="output" />
  </outputs>
  
<help>
  
This tool computes a correlation matrix between the supplied Wig or BigWig files. Rows and columns appear in the order in which the files are added above, starting from the top left. Each file is downsampled into sliding windows of the specified window size and step size by computing the mean value in each window. The windowed values are then correlated using either Pearson_'s product-moment correlation coefficient or Spearman_'s rank correlation coefficient. If the window size is set to 1, the correlation is calculated between all base pairs in the genome.
  
.. _Pearson: http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient

.. _Spearman: http://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient
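
The windowing and correlation steps described above can be sketched in a few lines of Python. This is only an illustration, not the tool's own implementation: the function names are hypothetical, each track is assumed to be a plain list of per-base values of equal length, and a trailing partial window is assumed to be dropped. How the tool itself handles missing data and chromosome boundaries is not covered here.

```python
from math import sqrt


def window_means(values, window, step):
    """Mean of each full window of `window` values, advancing by `step`."""
    # Assumption: a trailing partial window is dropped.
    return [
        sum(values[start:start + window]) / window
        for start in range(0, len(values) - window + 1, step)
    ]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def ranks(x):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    result = [0.0] * len(x)
    i = 0
    while i != len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 != len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_matrix(tracks, window, step, metric=pearson):
    """Square matrix of pairwise correlations, in the order tracks are given."""
    binned = [window_means(track, window, step) for track in tracks]
    return [[metric(a, b) for b in binned] for a in binned]
```

Calling ``correlation_matrix([track_a, track_b], 100, 50)`` would return a 2 x 2 nested list with the diagonal equal to 1 (up to rounding); passing ``metric=spearman`` switches to rank correlation. A track whose windowed values are all identical has zero variance, and this sketch would fail with a division by zero in that case.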
  
-----

**Syntax**

- **Inputs** are the genomic data to correlate
- **Window size** is the width, in base pairs, of each window over which values are averaged
- **Sliding step size** is the distance, in base pairs, that the window advances between consecutive bins
- **Correlation metric** is the type of correlation to calculate (Pearson or Spearman)
  
-----

.. class:: warningmark

**WARN:** In order to calculate the correlation coefficient, the data is loaded entirely into memory. For large genomes, this may require a lot of RAM unless correspondingly larger window sizes are used.

-----

**Citation**

This tool was inspired by ACT_ from the Gerstein lab.

.. _ACT: http://act.gersteinlab.org

J Jee*, J Rozowsky*, KY Yip*, L Lochovsky, R Bjornson, G Zhong, Z Zhang, Y Fu, J Wang, Z Weng, M Gerstein. ACT: Aggregation and Correlation Toolbox for Analyses of Genome Tracks. (2011) Bioinformatics 27(8): 1152-4.
  
</help>
</tool>