Mercurial > repos > xuebing > sharplabtool
diff tools/human_genome_variation/gpass.xml @ 0:9071e359b9a3
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:37:19 -0500 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/human_genome_variation/gpass.xml Fri Mar 09 19:37:19 2012 -0500
@@ -0,0 +1,112 @@
+<tool id="hgv_gpass" name="GPASS" version="1.0.0">
+  <description>significant single-SNP associations in case-control studies</description>
+
+  <command interpreter="perl">
+    gpass.pl ${input1.extra_files_path}/${input1.metadata.base_name}.map ${input1.extra_files_path}/${input1.metadata.base_name}.ped $output $fdr
+  </command>
+
+  <inputs>
+    <param name="input1" type="data" format="lped" label="Dataset"/>
+    <param name="fdr" type="float" value="0.05" label="FDR"/>
+  </inputs>
+
+  <outputs>
+    <data name="output" format="tabular" />
+  </outputs>
+
+  <requirements>
+    <requirement type="package">gpass</requirement>
+  </requirements>
+
+  <!-- we need to be able to set the seed for the random number generator
+  <tests>
+    <test>
+      <param name='input1' value='gpass_and_beam_input' ftype='lped' >
+        <metadata name='base_name' value='gpass_and_beam_input' />
+        <composite_data value='gpass_and_beam_input.ped' />
+        <composite_data value='gpass_and_beam_input.map' />
+        <edit_attributes type='name' value='gpass_and_beam_input' />
+      </param>
+      <param name="fdr" value="0.05" />
+      <output name="output" file="gpass_output.txt" />
+    </test>
+  </tests>
+  -->
+
+  <help>
+**Dataset formats**
+
+The input dataset must be in lped_ format, and the output is tabular_.
+(`Dataset missing?`_)
+
+.. _lped: ./static/formatHelp.html#lped
+.. _tabular: ./static/formatHelp.html#tab
+.. _Dataset missing?: ./static/formatHelp.html
+
+-----
+
+**What it does**
+
+GPASS (Genome-wide Poisson Approximation for Statistical Significance)
+detects significant single-SNP associations in case-control studies at a user-specified FDR. Unlike previous methods, this tool can accurately approximate the genome-wide significance and FDR of SNP associations, while adjusting for millions of multiple comparisons, within seconds or minutes.
+
+The program has two main functionalities:
+
+1. Detect significant single-SNP associations at a user-specified false
+   discovery rate (FDR).
+
+   *Note*: a "typical" definition of FDR could be
+   FDR = E(# of false positive SNPs / # of significant SNPs)
+
+   This definition, however, is inappropriate for association mapping, since SNPs are
+   highly correlated. Our FDR is
+   defined differently to account for SNP correlations, and thus yields
+   a proper FDR in terms of "proportion of false positive loci".
+
+2. Approximate the significance of a list of candidate SNPs, adjusting for
+   multiple comparisons. If you have isolated a few SNPs of interest and want
+   to know their significance in a GWAS, you can supply the GWAS data and let
+   the program specifically test those SNPs.
+
+
+*Also note*: the number of SNPs in a study cannot be both too small and at the same
+time too clustered in a local region. A few hundred SNPs, or tens of SNPs
+spread in different regions, will be fine. The sample size cannot be too small
+either; around 100 or more individuals (case + control combined) will be fine.
+Otherwise, use permutation.
+
+-----
+
+**Example**
+
+- input map file::
+
+    1 rs0 0 738547
+    1 rs1 0 5597094
+    1 rs2 0 9424115
+    etc.
+
+- input ped file::
+
+    1 1 0 0 1 1 G G A A A A A A A A A G A A G G G G A A G G G G G G A A A A A G A A G G A G A G A A G G A A G G A A G G A G A A G G A A G G A A A G A G G G A G G G G G A A A G A A G G G G G G G G A G A A A A A A A A
+    1 1 0 0 1 1 G G A G G G A A A A A G A A G G G G G G A A G G A G A G G G G G A G G G A G A A G G A G G G A A G G G G A G A G G G A G A A A A G G G G A G A G G G A G A A A A A G G G A G G G A G G G G G A A G G A G
+    etc.
+
+- output dataset, showing significant SNPs and their p-values and FDR::
+
+    #ID   chr   position   Statistics  adj-Pvalue  FDR
+    rs35  chr1  136606952  4.890849    0.991562    0.682138
+    rs36  chr1  137748344  4.931934    0.991562    0.795827
+    rs44  chr2  14423047   7.712832    0.665086    0.218776
+    etc.
+
+-----
+
+**Reference**
+
+Zhang Y, Liu JS. (2010)
+Fast and accurate significance approximation for genome-wide association studies.
+Submitted.
+
+  </help>
+</tool>
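
The map and ped layouts shown in the help text's Example section can be checked with a short script before a dataset is handed to the tool. The sketch below is illustrative only and is not part of gpass.pl or Galaxy. It assumes the usual PLINK-style column order (map: chromosome, SNP ID, genetic distance, base-pair position; ped: six leading columns followed by two alleles per SNP), and the names `parse_map`, `parse_ped`, `MAP_LINES`, and `PED_LINES` are hypothetical. The ped rows are the two example rows from the help text, truncated to their first three SNPs to match the three map rows.

```python
# Sketch only: parse_map and parse_ped are hypothetical helpers,
# not functions provided by gpass.pl or Galaxy.

# The three map rows from the help text, plus its "etc." placeholder.
MAP_LINES = [
    "1 rs0 0 738547",
    "1 rs1 0 5597094",
    "1 rs2 0 9424115",
    "etc.",
]

# The two ped rows from the help text, truncated to the first three SNPs.
PED_LINES = [
    "1 1 0 0 1 1 G G A A A A",
    "1 1 0 0 1 1 G G A G G G",
    "etc.",
]


def parse_map(lines):
    """Return (chromosome, snp_id, genetic_distance, bp_position) tuples."""
    snps = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines and placeholders such as "etc."
        chrom, snp_id, dist, pos = fields
        snps.append((chrom, snp_id, float(dist), int(pos)))
    return snps


def parse_ped(lines, n_snps):
    """Return (phenotype, genotypes) per individual.

    Assumes six leading columns (family, individual, father, mother,
    sex, phenotype) followed by two alleles for each SNP.
    """
    individuals = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank lines and placeholders such as "etc."
        alleles = fields[6:]
        if len(alleles) != 2 * n_snps:
            raise ValueError(
                "expected %d alleles, found %d" % (2 * n_snps, len(alleles))
            )
        genotypes = [
            (alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)
        ]
        individuals.append((fields[5], genotypes))
    return individuals


if __name__ == "__main__":
    snps = parse_map(MAP_LINES)
    people = parse_ped(PED_LINES, n_snps=len(snps))
    print(len(snps), "SNPs,", len(people), "individuals")
```

A check of this kind catches the most common mismatch, a ped row whose allele count does not equal twice the number of map rows, before the job is submitted.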