carpet_toolsuite: carpet-src-1/tools/CARPET/PeakPeaker.xml comparison

comparison carpet-src-1/tools/CARPET/PeakPeaker.xml @ 0:cdd489d98766

Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository

author	matces
date	Tue, 07 Jun 2011 16:50:41 -0400
parents
children

--1:000000000000
+:cdd489d98766
+<tool id="Find peaks" name="PeakPicker" version="1.0.0">
+<description>Finding Peaks in a GFF Nimblegen File</description>
+<command interpreter="perl">PeakPeaker2.pl --in $input --out $output --t $type --dist_peaks $dist_peaks --col3 $col3 --log $log --perc $perc --num $num --dist $dist --w $window --f_pv $output2</command>
+<inputs>
+<param format="tabular" name="input" type="data" label="Source file"/>
+<param name="col3" size="20" type="text" value="Analisys" label="Analysis name"/>
+<param name="type" type="select" label="Analysis type">
+<option value="p">p-value</option>
+<option value="s">score</option>
+</param>
+<param name="perc" size="4" type="text" value="0.95" label="percentile value"/>
+<param name="log" size="2" type="text" value="7" label="-log p-value cutoff"/>
+<param name="num" size="2" type="text" value="3" label="minimal number of probes"/>
+<param name="dist" size="4" type="text" value="100" label="max distance between two probes"/>
+<param name="dist_peaks" size="4" type="text" value="200" label="min distance between two peaks"/>
+<param name="window" size="4" type="text" value="500" label="window length"/>
+</inputs>
+<outputs>
+<data format="bed" name="output" />
+<data format="gff" name="output2" />
+</outputs>
+<help>
+.. class:: infomark
+**What it does**
+This tool takes NimbleGen ratio files in GFF format as the input file and returns a table of the computed peaks in the same GFF format.
+--------
+**Parameters:**
+- **Analysis type:**
+- **p-value** analysis determines peaks by p-value inference
+- **score** analysis determines peaks with a scoring system
+- **Percentile value:** used to calculate the signal threshold from the dataset distribution, so that background is filtered out
+- **-log p-value cutoff:** (required only for p-value based analysis) cutoff integer to be used to identify a significant peak
+- **minimal # of probes:** minimal number of consecutive probes used to define a peak
+- **max distance 2 probes:** greatest nucleotide distance (bp) at which two probes are still considered adjacent
+- **min distance 2 peaks:** minimum nucleotide distance (bp) required to consider two peaks as separate entities
+- **window length:** length in bp of the window used for statistical analysis
+--------
+**INPUT FILE**
+NimbleGen returns a GFF file with the coordinates of each probe and the normalized signal value, log2(Cy5/Cy3).
+Click here_ to download a GFF file example.
+.. _here: /static/example_file/GFF_file_norm.txt.zip
+Example of Nimblegen GFF format::
+chr19  Nimblegen  tiling_array  100000  1000051  -1.2	 +   .  probe_name
+chr19  Nimblegen  tiling_array  100100  1000151   2.9	 +   .  probe_name
+.. class:: warningmark
+The sixth column **must** contain the normalized log2(Cy5/Cy3) value that NimbleGen returns after the experiment
+---------
+.. class:: infomark
+**How does it work?**
+**Two assumptions:**
+- data are enriched for signal in the positive direction ("one-tailed")
+- a peak (or enriched region) is represented by multiple probes that are genomically located close to each other
+**Statistical approach: sliding window**
+A window centered at each probe of the array moves probe by probe. In each window a chi-squared statistic is calculated
+.. image:: static/images/CARPET/chi_squared.png
+by building a contingency table for each probe, and a p-value is assigned
+.. image:: static/images/CARPET/centered.png
+A **"-log2(p-value)"** is associated with each probe. This value takes into account the effect of the neighbouring probes.
+This approach dramatically decreases the background signal.
+.. image:: static/images/CARPET/background.png
+The new values are used to define an enriched locus
+.. image:: static/images/CARPET/pvalue.png
+In addition, a score is calculated from the length and the raw signal of the peak
+.. image:: static/images/CARPET/pvalue_score.png
+Output is a gff file
+.. image:: static/images/CARPET/table_pv.png
+**Non-statistical approach: score**
+Only the raw signal of each probe is considered. Only regions containing at least the minimal number of consecutive probes above the defined threshold are selected
+.. image:: static/images/CARPET/score.png
+Output is a GFF file
+.. image:: static/images/CARPET/table_score.png
+and a GFF file with the p-values associated with each probe
+</help>
+</tool>
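
The sliding-window step described in the help text can be sketched roughly as follows. PeakPeaker2.pl itself is not part of this diff and its contingency table is only shown as an image, so everything here is an assumption made for illustration: the 2x2 table (inside or outside the window versus above or below a signal threshold), the function names `chi2_pvalue_1df` and `window_pvalues`, and the choice to assign p = 1.0 to windows that are not enriched. It is not the tool's actual implementation.

```python
import math
from typing import List, Tuple

# (start, end, log2 ratio) for probes on one chromosome, sorted by start.
Probe = Tuple[int, int, float]


def chi2_pvalue_1df(chi2: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def window_pvalues(probes: List[Probe], threshold: float, window: int = 500) -> List[float]:
    """Assign a p-value to each probe from a window centered on it.

    Assumed 2x2 table per probe: rows are inside/outside the window,
    columns are signal above/not above `threshold`.
    """
    total = len(probes)
    total_high = sum(1 for _, _, s in probes if s > threshold)
    centers = [(start + end) / 2.0 for start, end, _ in probes]
    half = window / 2.0
    pvalues = []
    for center in centers:
        inside = [p for p, pc in zip(probes, centers) if abs(pc - center) <= half]
        a = sum(1 for _, _, s in inside if s > threshold)  # inside, high
        b = len(inside) - a                                # inside, low
        c = total_high - a                                 # outside, high
        d = (total - len(inside)) - b                      # outside, low
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # Only enrichment in the positive direction is of interest, so
        # windows that are not enriched (or degenerate tables) get p = 1.
        if denom == 0 or a * d <= b * c:
            pvalues.append(1.0)
            continue
        chi2 = total * (a * d - b * c) ** 2 / denom
        pvalues.append(chi2_pvalue_1df(chi2))
    return pvalues
```

The double loop is quadratic in the number of probes, which keeps the sketch short; a real tiling array would need a moving pair of window indices instead.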

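The score-based approach and the parameters `perc`, `num`, `dist` and `dist_peaks` could interact along the lines of the sketch below. This is a minimal illustration under stated assumptions, not the logic of PeakPeaker2.pl: the nearest-rank percentile, measuring distance from the end of one probe to the start of the next, merging peaks that lie closer than `dist_peaks`, and the function names are all hypothetical.

```python
import math
from typing import List, Tuple

# (start, end, log2 ratio) for probes on one chromosome, sorted by start.
Probe = Tuple[int, int, float]
# (start, end, number of probes)
Peak = Tuple[int, int, int]


def percentile_threshold(values: List[float], perc: float) -> float:
    """Nearest-rank percentile; perc is a fraction such as 0.95."""
    ordered = sorted(values)
    rank = max(1, math.ceil(perc * len(ordered)))
    return ordered[rank - 1]


def call_peaks(probes: List[Probe], perc: float = 0.95, num: int = 3,
               dist: int = 100, dist_peaks: int = 200) -> List[Peak]:
    threshold = percentile_threshold([s for _, _, s in probes], perc)

    # Collect runs of consecutive probes at or above the threshold,
    # breaking a run when the gap to the previous probe exceeds `dist`.
    runs: List[List[Probe]] = []
    current: List[Probe] = []
    for probe in probes:
        start, _, signal = probe
        if signal < threshold:
            if current:
                runs.append(current)
            current = []
            continue
        if current and start - current[-1][1] > dist:
            runs.append(current)
            current = []
        current.append(probe)
    if current:
        runs.append(current)

    # Keep only runs with at least `num` probes.
    peaks = [(run[0][0], run[-1][1], len(run)) for run in runs if len(run) >= num]

    # Merge peaks separated by less than `dist_peaks`.
    merged: List[Peak] = []
    for start, end, n in peaks:
        if merged and start - merged[-1][1] < dist_peaks:
            prev_start, prev_end, prev_n = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_n + n)
        else:
            merged.append((start, end, n))
    return merged
```

With the tool's defaults (`perc` 0.95, `num` 3, `dist` 100, `dist_peaks` 200) only runs of at least three closely spaced probes in the top 5% of the signal distribution would be reported under these assumptions.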