sharplabtool: tools/human_genome_variation/pass.xml annotate

annotate tools/human_genome_variation/pass.xml @ 0:9071e359b9a3

Uploaded

author	xuebing
date	Fri, 09 Mar 2012 19:37:19 -0500

<tool id="hgv_pass" name="PASS" version="1.0.0">
  <description>significant transcription factor binding sites from ChIP data</description>

  <command interpreter="bash">
    pass_wrapper.sh "$input" "$min_window" "$max_window" "$false_num" "$output"
  </command>

  <inputs>
    <param format="gff" name="input" type="data" label="Dataset"/>
    <param name="min_window" label="Smallest window size (by # of probes)" type="integer" value="2" />
    <param name="max_window" label="Largest window size (by # of probes)" type="integer" value="6" />
    <param name="false_num" label="Expected total number of false positive intervals to be called" type="float" value="5.0" help="N.B.: this is a &lt;em&gt;count&lt;/em&gt;, not a rate." />
  </inputs>

  <outputs>
    <data format="tabular" name="output" />
  </outputs>

  <requirements>
    <requirement type="package">pass</requirement>
    <requirement type="binary">sed</requirement>
  </requirements>

  <!-- we need to be able to set the seed for the random number generator
  <tests>
    <test>
      <param name="input" ftype="gff" value="pass_input.gff"/>
      <param name="min_window" value="2"/>
      <param name="max_window" value="6"/>
      <param name="false_num" value="5"/>
      <output name="output" file="pass_output.tab"/>
    </test>
  </tests>
  -->

  <help>
Dataset formats

The input is in GFF_ format, and the output is tabular_.
(`Dataset missing?`_)

.. _GFF: ./static/formatHelp.html#gff
.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

What it does

PASS (Poisson Approximation for Statistical Significance) detects
significant transcription factor binding sites in the genome from
ChIP data. This is probably the only peak-calling method that
accurately controls the false-positive rate and the false discovery
rate (FDR) in ChIP data, which is important given the huge discrepancy
in results obtained from different peak-calling algorithms. At the
same time, this method achieves similar or better power than previous
methods.

<!-- we don't have wrapper support for the "prior" file yet
Another unique feature of this method is that it allows varying
thresholds to be used for peak calling at different genomic
locations. For example, if a position lies in an open chromatin
region, is depleted of nucleosomes, or has a co-binding protein
detected in its neighborhood, then the position is more likely to
be bound by the target protein of interest, and hence a lower
threshold will be used to call significant peaks.
As a result, weak but real binding sites can be detected.
-->

-----

Hints

- ChIP-Seq data:

  If the data is from ChIP-Seq, you need to convert the ChIP-Seq values
  into z-scores before using this program. It is also recommended that
  you group read counts within a neighborhood together, e.g. in tiled
  windows of 30bp. In this way, the ChIP-Seq data will resemble
  ChIP-chip data in format.

- Choosing window size options:

  The window size is related to the probe tiling density. For example,
  if the probes are tiled every 100bp, then setting the smallest
  window = 2 and largest window = 6 is appropriate, because the DNA
  fragment size is around 300-500bp.

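The two hints above can be sketched in Python. This is only an illustration under stated assumptions, not part of PASS or its wrapper: the function names are hypothetical, the z-score is assumed to be a plain standardization against all windows in the region, and the 50bp probe length is taken from the example input rows. With 100bp tiling and 50bp probes, windows of 2 and 6 probes span roughly 150bp and 550bp, which brackets the 300-500bp fragment size.

```python
import math
import statistics


def bin_read_counts(read_starts, region_start, region_end, bin_size=30):
    """Count reads in consecutive, non-overlapping windows of bin_size bp."""
    n_bins = math.ceil((region_end - region_start) / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        idx = (pos - region_start) // bin_size
        if idx in range(n_bins):  # skip reads that fall outside the region
            counts[idx] += 1
    return counts


def to_zscores(values):
    """Standardize values to mean 0 and standard deviation 1 (assumed definition)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / sd for v in values]


def window_span_bp(n_probes, spacing_bp, probe_len_bp):
    """Approximate span covered by n_probes consecutive, evenly tiled probes."""
    return (n_probes - 1) * spacing_bp + probe_len_bp


# Toy read start positions in a 90bp region, binned into three 30bp windows.
counts = bin_read_counts([0, 5, 29, 30, 61, 89], region_start=0, region_end=90)
zscores = to_zscores(counts)

# Spans of the default smallest and largest windows at 100bp tiling.
spans = [window_span_bp(k, spacing_bp=100, probe_len_bp=50) for k in (2, 6)]
```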
-----

Example

- input file::

    chr7  Nimblegen  ID  40307603  40307652  1.668944     .  .  .
    chr7  Nimblegen  ID  40307703  40307752  0.8041307    .  .  .
    chr7  Nimblegen  ID  40307808  40307865  -1.089931    .  .  .
    chr7  Nimblegen  ID  40307920  40307969  1.055044     .  .  .
    chr7  Nimblegen  ID  40308005  40308068  2.447853     .  .  .
    chr7  Nimblegen  ID  40308125  40308174  0.1638694    .  .  .
    chr7  Nimblegen  ID  40308223  40308275  -0.04796628  .  .  .
    chr7  Nimblegen  ID  40308318  40308367  0.9335709    .  .  .
    chr7  Nimblegen  ID  40308526  40308584  0.5143972    .  .  .
    chr7  Nimblegen  ID  40308611  40308660  -1.089931    .  .  .
    etc.

  In GFF, a value of dot '.' is used to mean "not applicable".

- output file::

    ID  Chr   Start     End       WinSz  PeakValue  # of FPs  FDR
    1   chr7  40310931  40311266  4      1.663446   0.248817  0.248817

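Input rows like those shown above could be read with a short Python sketch such as the following. It assumes the standard nine tab-separated GFF columns and maps the '.' placeholder to None; the names GffRecord and parse_gff_line are hypothetical and are not part of PASS or its wrapper.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: Optional[str]
    frame: Optional[str]
    attribute: Optional[str]


def _dot_to_none(field):
    """GFF uses '.' to mean "not applicable"; represent that as None."""
    return None if field == "." else field


def parse_gff_line(line):
    """Parse one tab-separated GFF line into a GffRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated GFF fields, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attribute = fields
    score = _dot_to_none(score)
    return GffRecord(
        seqname, source, feature, int(start), int(end),
        None if score is None else float(score),
        _dot_to_none(strand), _dot_to_none(frame), _dot_to_none(attribute),
    )


# First row of the example input, written with explicit tabs.
record = parse_gff_line("chr7\tNimblegen\tID\t40307603\t40307652\t1.668944\t.\t.\t.\n")
```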
-----

References

Zhang Y. (2008)
Poisson approximation for significance in genome-wide ChIP-chip tiling arrays.
Bioinformatics. 24(24):2825-31. Epub 2008 Oct 25.

Chen KB, Zhang Y. (2010)
A varying threshold method for ChIP peak calling using multiple sources of information.
Submitted.

  </help>
</tool>
