<tool id="SPIA" name="SPIA (Signaling Pathway Impact Analysis)" version="0.1.0">
    <description>A method based on over-representation and signaling perturbation accumulation to analyze KEGG signaling pathways.</description>

    <requirements>
        <requirement type="package" version="1.20.3">r-getopt</requirement>
        <requirement type="package" version="2.42.0">bioconductor-SPIA</requirement>
    </requirements>

    <command detect_errors="exit_code"><![CDATA[
        Rscript '$__tool_directory__/SPIA.R'
            -D '$input_data'
            -O '$organism'
            -R '$sigP_output'
            -P '$adv.P_value_threshold'
            -N '$adv.Number_bootstrap'
            -C '$adv.method_combine_pvalue'
            #if $adv.plot_perturbation == "True":
                -W
                -L '$SPIA_Perturbation_Plots'
            #end if
            #if $adv.pathwayId != "":
                -I '$adv.pathwayId'
            #end if
    ]]></command>

    <inputs>
        <param type="data" name="input_data" format="csv" multiple="false" label="Input data" help="A CSV file including the columns ENTREZ, logFC, and adj.P.Val"/>
        <param type="text" name="organism" value="hsa" label="Organism" help="A three-letter KEGG code designating the organism. Default is `hsa` (human). See the full list at https://www.genome.jp/kegg/catalog/org_list.html"/>

        <section name="adv" title="Advanced Options" expanded="false">
            <param type="float" name="P_value_threshold" label="P value threshold to select DE genes" value="0.05" min="0.00" max="1.00" help="Set a threshold value to define differentially expressed (DE) genes"/>
            <param type="integer" name="Number_bootstrap" value="2000" min="100" label="Bootstrap iterations" help="Number of bootstrap iterations used to compute the pPERT value. Must be at least 100. The recommended value is 2000."/>
            <param type="select" name="method_combine_pvalue" label="Method to combine P values" help="Method used to combine the two types of P-values. If set to 'fisher', Fisher's method is used. If set to 'norminv', the normal inversion method is used.">
                <option value="fisher" selected="True">fisher</option>
                <option value="norminv">norminv</option>
            </param>
            <param type="boolean" name="plot_perturbation" truevalue="True" falsevalue="False" checked="False" label="Plot perturbation" help="If set to Yes, plot the gene perturbation accumulation vs. log2 fold change for every gene on each pathway. Default is No."/>
            <param type="text" name="pathwayId" value="" label="Pathway IDs" help="Restrict the analysis to one or more specific pathways by entering their pathway IDs, for example: 03018, 03320. Leave empty (the default) to analyze all pathways."/>
        </section>
    </inputs>

    <outputs>
        <data name="sigP_output" format="csv" label="SPIA_enrich_kegg"/>
        <data format="pdf" name="SPIA_Perturbation_Plots" label="SPIA_Perturbation_Plots">
            <filter>adv['plot_perturbation'] == True</filter>
        </data>
    </outputs>

    <tests>
        <test>
            <param name="input_data" value="SPIA_input.csv" ftype="csv"/>
            <output name="sigP_output" file="x.csv" ftype="csv"/>
        </test>
    </tests>

    <help><![CDATA[

.. class:: infomark

**What it does**

SPIA (Signaling Pathway Impact Analysis) combines the evidence obtained from
classical enrichment analysis with a novel type of evidence, which measures the actual
perturbation on a given pathway under a given condition.

A bootstrap procedure is used to assess the significance of the observed total pathway perturbation.

A global pathway significance P-value is then calculated by combining the enrichment and perturbation P-values.

The SPIA tool analyzes KEGG signaling pathways.

-------


=========
**Input**
=========

Basic options
--------------

**Input data**

The input data is a CSV file that includes the columns `ENTREZ`, `logFC` and `adj.P.Val`.
This file should contain all genes in your dataset; differentially expressed genes are selected from it using the P value threshold.

====== ========== ======= ========== ========= ==== ========
logFC  AveExpr    t       P.Value    adj.P.Val B    ENTREZ
====== ========== ======= ========== ========= ==== ========
5.96   6.23       23.9    1.79e-17   9.78e-13  25.4 3491
5.14   7.49       17.4    1.56e-14   2.84e-10  21.0 2353
4.15   7.04       16.5    5.15e-14   7.04e-10  20.1 1958
2.43   9.59       14.1    1.29e-12   1.41e-08  17.7 1843
1.53   8.22       11.0    1.69e-10   1.15e-06  13.6 3725
1.43   5.33       10.5    4.27e-10   2.42e-06  12.8 23645
====== ========== ======= ========== ========= ==== ========
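
As a rough pre-flight check before uploading, the required columns can be verified with a few lines of Python. This is only a sketch: the helper name `load_spia_rows` and the inline sample are illustrative assumptions and are not part of the tool.

```python
import csv
import io

# Columns the tool expects in the input CSV (taken from the description above).
REQUIRED_COLUMNS = {"ENTREZ", "logFC", "adj.P.Val"}


def load_spia_rows(handle):
    """Read a CSV handle and fail early if a required column is missing.

    Hypothetical helper for illustration; the tool itself does not call it.
    """
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError("missing required columns: " + ", ".join(sorted(missing)))
    return list(reader)


# Small in-memory sample using two rows from the table above.
sample = io.StringIO(
    "logFC,adj.P.Val,ENTREZ\n"
    "5.96,9.78e-13,3491\n"
    "1.43,2.42e-06,23645\n"
)
rows = load_spia_rows(sample)
print(len(rows), "rows loaded; first ENTREZ id:", rows[0]["ENTREZ"])
```

Extra columns such as `AveExpr` or `B` are accepted by this check; only the three required names are tested.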

**Organism**

A three-letter KEGG code designating the organism of your data. Default is `hsa` (human). See the full list of options at https://www.genome.jp/kegg/catalog/org_list.html.

------

Advanced Options
-----------------

**P value threshold to select DE genes**

Set a threshold value to define differentially expressed (DE) genes. Default is 0.05.

**Bootstrap iterations**

Number of bootstrap iterations used to compute the `pPERT` value. Must be at least 100. The recommended value is 2000.

**Method to combine P values**

Method used to combine the two types of P-values (`pNDE` and `pPERT`). If set to `fisher`, Fisher's method is used. If set to `norminv`, the normal inversion method is used.
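
The two combination rules can be sketched as follows. This is an illustrative Python rendering of the formulas as they are commonly stated for SPIA (Fisher: `c - c*ln(c)` with `c = pNDE * pPERT`; normal inversion: average of the two normal quantiles, rescaled by the square root of 2). It is not the code the tool runs, and the function name `combine_pvalues` is hypothetical.

```python
import math
from statistics import NormalDist


def combine_pvalues(p_nde, p_pert, method="fisher"):
    """Combine two P-values into a single global P-value (sketch)."""
    if method == "fisher":
        c = p_nde * p_pert
        if c == 0:
            # Limit of c - c*ln(c) as c approaches 0.
            return 0.0
        return c - c * math.log(c)
    if method == "norminv":
        std = NormalDist()  # standard normal: mean 0, sd 1
        # inv_cdf raises StatisticsError unless 0 < p < 1.
        z = (std.inv_cdf(p_nde) + std.inv_cdf(p_pert)) / math.sqrt(2)
        return std.cdf(z)
    raise ValueError("method must be 'fisher' or 'norminv'")


print(combine_pvalues(0.1, 0.1, "fisher"))
print(combine_pvalues(0.1, 0.1, "norminv"))
```

In this sketch, P-values of exactly 0 or 1 are only handled by the `fisher` branch; the `norminv` branch raises an error for them.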

**Plot perturbation**

If set to `Yes`, plots the gene perturbation accumulation vs. log2 fold change for every gene on each pathway. Default is `No`.

**Pathway IDs**

To restrict the analysis to one or more specific pathways, enter their pathway IDs here, for example `03018, 03320`. By default this field is empty and all pathways are analyzed.

------

==========
**Output**
==========

**CSV file**

This file contains the ranked pathways and various statistics:

- **Name** is the pathway name;
- **ID** is the pathway ID;
- **pSize** is the number of genes on the pathway;
- **NDE** is the number of DE genes per pathway;
- **tA** is the observed total perturbation accumulation in the pathway;
- **pNDE** is the probability of observing at least NDE genes on the pathway using a hypergeometric model;
- **pPERT** is the probability of observing a total accumulation more extreme than tA by chance alone;
- **pG** is the P-value obtained by combining pNDE and pPERT;
- **pGFdr** and **pGFWER** are the False Discovery Rate and Bonferroni adjusted global P-values;
- **Status** gives the direction in which the pathway is perturbed (activated or inhibited);
- **KEGGLINK** gives a web link to the KEGG website that displays the pathway image with the differentially expressed genes highlighted in red.
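
To make the `pNDE` definition above concrete, the upper tail of a hypergeometric distribution can be computed directly. This is a minimal sketch assuming `pNDE` is the probability of drawing at least `NDE` pathway genes when the DE genes are sampled from all measured genes; the function and argument names are hypothetical and the numbers in the example are made up.

```python
from math import comb


def p_nde(nde, p_size, n_de_total, n_genes_total):
    """P(X >= nde) for a hypergeometric X (sketch).

    nde           -- DE genes observed on the pathway
    p_size        -- genes on the pathway
    n_de_total    -- DE genes in the whole dataset
    n_genes_total -- all genes in the dataset
    """
    denom = comb(n_genes_total, n_de_total)
    upper = min(p_size, n_de_total)
    # Sum the probabilities of every outcome from nde up to the maximum possible.
    tail = sum(
        comb(p_size, k) * comb(n_genes_total - p_size, n_de_total - k)
        for k in range(nde, upper + 1)
    )
    return tail / denom


# Made-up example: 1000 genes, 20 of them DE, a 50-gene pathway with 5 DE genes.
print(p_nde(nde=5, p_size=50, n_de_total=20, n_genes_total=1000))
```

Exact integer arithmetic via `math.comb` keeps the sketch short; for large datasets a statistics library would be the more practical choice.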

**PDF file**

If `Plot perturbation` is set to `Yes`, the tool outputs a PDF with the plots of gene perturbation accumulation vs. log2 fold change for every gene on each pathway.

------

Please cite SPIA_ appropriately if you use it.

.. _SPIA: https://pubmed.ncbi.nlm.nih.gov/18990722/

    ]]></help>


    <citations>
        <citation type="doi">10.1093/bioinformatics/btn577</citation>
    </citations>
</tool>