comparison gss.Rmd @ 0:d0cbe6cc1f04 draft default tip

"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/genomic_super_signature commit 1aadd5dce3b254e7714c2fdd39413029fd4b9b7a"
author iuc
date Wed, 12 Jan 2022 19:07:45 +0000
-1:000000000000 0:d0cbe6cc1f04
1 ---
2 title: "Analysis by GenomicSuperSignature"
3 date: "`r Sys.Date()`"
4 output:
5   BiocStyle::html_document:
6     toc: true
7     toc_float: false
8     toc_depth: 3
9 params:
10   val_all: val_all
11   dat: dat
12   RAVmodel: RAVmodel
13   inputName: inputName
14   numOut: numOut
15 ---
16
17 ```{r setup, include=FALSE}
18 knitr::opts_chunk$set(echo = FALSE)
19 ```
20
21 # RAVs that best represent your dataset
22 The *validation* provides a quantitative representation of the relevance
23 between your dataset and RAVs. The table below shows the first 6 rows of the
24 validation result; the complete result is saved as `{input_name}_validate.csv`.
25
26 ```{r}
27 head(params$val_all)
28 ```
29
30 ## Heatmap Table
31 `heatmapTable` takes validation results as its input and displays them in
32 a two-panel table: the top panel shows the average silhouette width (avg.sw)
33 and the bottom panel displays the validation score.
34
35 `heatmapTable` can display different subsets of the validation output. For
36 example, if you specify `scoreCutoff`, any validation result above that score
37 will be shown. If you specify the number (n) of top validation results through
38 `num.out`, the output will be an n-column heatmap table. You can also filter by
39 average silhouette width (`swCutoff`), cluster size (`clsizeCutoff`), or
40 one of the top 8 PCs from the dataset (`whichPC`).
41
42 Here, we print the top `r params$numOut` validated RAVs with an average
43 silhouette width above 0.
44
45 ```{r out.height="45%", out.width="45%", message=FALSE, warning=FALSE}
46 heatmapTable(params$val_all, num.out = params$numOut, swCutoff = 0)
47 ```
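As a rough illustration of what `num.out` combined with `swCutoff = 0` selects, the sketch below reproduces the idea in base R on a made-up table. The `score` and `sw` columns are assumed to match the validation table printed above; the values, the helper `top_validated`, and the strict `>` comparison are assumptions for illustration, not the package's implementation.

```r
# Toy validation table; all values are invented for illustration only.
val_toy <- data.frame(
  score = c(0.59, 0.42, 0.71, 0.33, 0.65),
  sw    = c(0.12, -0.05, 0.08, 0.20, -0.01),
  row.names = paste0("RAV", c(23, 188, 832, 1575, 61))
)

# Hypothetical helper: keep rows above the silhouette cutoff, then return
# the n rows with the highest validation score.
top_validated <- function(val, n, sw_cutoff = 0) {
  kept <- val[val$sw > sw_cutoff, , drop = FALSE]
  kept <- kept[order(kept$score, decreasing = TRUE), , drop = FALSE]
  head(kept, n)
}

top_toy <- top_validated(val_toy, n = 2)
print(top_toy)
```

Note that RAV61 has the second-highest score in the toy table but is dropped because its silhouette width is not above 0, which is why the cutoff and the top-n selection are best read together.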
48
49 ## Interactive Graph
50 Under the default condition, `plotValidate` plots validation results of all
51 non-single-element RAVs in one graph, where the x-axis represents the average
52 silhouette width of the RAVs (a quality control measure of RAVs) and the y-axis
53 represents the validation score. We recommend that users focus on RAVs with
54 higher validation scores and use average silhouette width as a secondary criterion.
55
56 ```{r out.height="80%", out.width="80%"}
57 plotValidate(params$val_all, interactive = TRUE)
58 ```
59
60 Note that `interactive = TRUE` produces a zoomable, interactive plot that
61 includes tooltips, which is saved as `{input_name}_validate_plot.html`.
62
63 You can hover over each data point for more information:
64
65 - **sw** : the average silhouette width of the cluster
66 - **score** : the top validation score between the 8 PCs of the dataset and the RAV
67 - **cl_size** : the size of the RAV, represented by the dot size
68 - **cl_num** : the RAV number. You need this index to find more information
69 about the RAV.
70 - **PC** : the PC of the test dataset that validates the given RAV. Because we
71 used the top 8 PCs of the test dataset for validation, there are 8 categories.
72
73 If you double-click a PC in the legend on the right, you will enter an
74 individual display mode where you can add further groups of data
75 points with a single click.
76
77
78 # Prior information associated with your dataset
79 ```{r echo=FALSE}
80 validated_ind <- validatedSignatures(params$val_all, num.out = params$numOut,
81                                      swCutoff = 0, indexOnly = TRUE)
82
83 # Guard against fewer validated RAVs than the number of outputs the user requested
84 n <- min(params$numOut, length(validated_ind), na.rm = TRUE)
85 ```
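The guard above matters because every later loop runs over `seq_len(n)`. A minimal base-R sketch, using made-up stand-ins (`num_out`, `validated_toy`) for `params$numOut` and `validated_ind`, shows how the pair behaves when fewer RAVs pass the filters than were requested, and why `seq_len()` is safer than `1:n` when nothing passes at all.

```r
num_out <- 5L                  # stands in for params$numOut
validated_toy <- c(832L, 23L)  # pretend only two RAVs passed the filters

# n is capped at the number of RAVs actually available.
n <- min(num_out, length(validated_toy), na.rm = TRUE)

visited <- integer(0)
for (i in seq_len(n)) {
  visited <- c(visited, validated_toy[i])
}

# With nothing validated, seq_len(0) is empty, so a loop body never runs;
# 1:0 would instead iterate over c(1, 0).
empty_iterations <- length(seq_len(0))
naive_iterations <- length(1:0)
```

Without the cap, indexing `validated_toy[3]` and beyond would return `NA`, and downstream calls would receive `NA` as a RAV index.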
86
87 ## MeSH terms in wordcloud
88 ```{r out.height="60%", out.width="60%", fig.width=8, fig.height=8}
89 for (i in seq_len(n)) {
90     set.seed(1)
91     print(paste0("MeSH terms related to RAV", validated_ind[i]))
92     drawWordcloud(params$RAVmodel, validated_ind[i])
93 }
94 ```
95
96 ## GSEA
97 The complete result is saved as `{input_name}_genesets_RAV*.csv`.
98 ```{r}
99 res_all <- vector(mode = "list", length = n)
100 for (i in seq_len(n)) {
101     RAVnum <- validated_ind[i]
102     RAVname <- paste0("RAV", RAVnum)
103     res <- gsea(params$RAVmodel)[[RAVname]]
104     res_all[[i]] <- head(res)
105     names(res_all)[i] <- paste0("Enriched gene sets for RAV", validated_ind[i])
106 }
107 res_all
108 ```
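The loop above looks up each validated RAV by name in a list and keeps the head of each table. The same pattern can be sketched without the package: the toy list `gsea_toy`, its column names, and its values are invented here and only stand in for whatever `gsea(params$RAVmodel)` returns. The sketch indexes the list once by name rather than once per iteration, which is a stylistic choice and not a claim about the package.

```r
# Toy stand-in for a named list of per-RAV enrichment tables.
gsea_toy <- list(
  RAV23  = data.frame(Description = paste0("set_", 1:8), NES = (8:1) / 4),
  RAV61  = data.frame(Description = paste0("set_", 1:2), NES = c(1.5, 1.1)),
  RAV832 = data.frame(Description = paste0("set_", 1:5), NES = (5:1) / 4)
)

validated_toy <- c(832L, 23L)             # made-up validated RAV indices
rav_names <- paste0("RAV", validated_toy)

# Select the validated RAVs by name and keep the first 3 rows of each table.
res_toy <- lapply(gsea_toy[rav_names], head, n = 3)
names(res_toy) <- paste0("Enriched gene sets for ", rav_names)
print(res_toy)
```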
109
110 ## Publication
111 The complete result is saved as `{input_name}_literatures_RAV*.csv`.
112 ```{r}
113 res_all <- vector(mode = "list", length = n)
114 for (i in seq_len(n)) {
115     RAVnum <- validated_ind[i]
116     res <- findStudiesInCluster(params$RAVmodel, RAVnum, studyTitle = TRUE)
117     res_all[[i]] <- head(res)
118     names(res_all)[i] <- paste0("Studies related to RAV", validated_ind[i])
119 }
120 res_all
121 ```
122