diff gss.Rmd @ 0:d0cbe6cc1f04 draft default tip
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/genomic_super_signature commit 1aadd5dce3b254e7714c2fdd39413029fd4b9b7a"
author | iuc |
---|---|
date | Wed, 12 Jan 2022 19:07:45 +0000 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/gss.Rmd Wed Jan 12 19:07:45 2022 +0000
@@ -0,0 +1,122 @@
+---
+title: "Analysis by GenomicSuperSignature"
+date: "`r Sys.Date()`"
+output:
+  BiocStyle::html_document:
+    toc: true
+    toc_float: false
+    toc_depth: 3
+params:
+  val_all: val_all
+  dat: dat
+  RAVmodel: RAVmodel
+  inputName: inputName
+  numOut: numOut
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = FALSE)
+```
+
+# RAVs that best represent your dataset
+The *validation* provides a quantitative representation of the relevance
+between your dataset and RAVs. The top 6 validated RAVs are shown below, and
+the complete result is saved as `{input_name}_validate.csv`.
+
+```{r}
+head(params$val_all)
+```
+
+## Heatmap Table
+`heatmapTable` takes validation results as its input and displays them in
+a two-panel table: the top panel shows the average silhouette width (avg.sw)
+and the bottom panel displays the validation score.
+
+`heatmapTable` can display different subsets of the validation output. For
+example, if you specify `scoreCutoff`, any validation result above that score
+will be shown. If you specify the number (n) of top validation results through
+`num.out`, the output will be an n-column heatmap table. You can also filter by
+the average silhouette width (`swCutoff`), the cluster size (`clsizecutoff`),
+or one of the top 8 PCs of the dataset (`whichPC`).
+
+Here, we print the top `r params$numOut` validated RAVs with average silhouette
+width above 0.
+
+```{r out.height="45%", out.width="45%", message=FALSE, warning=FALSE}
+heatmapTable(params$val_all, num.out = params$numOut, swCutoff = 0)
+```
+
+## Interactive Graph
+Under the default condition, `plotValidate` plots validation results of all
+non-single-element RAVs in one graph, where the x-axis represents the average
+silhouette width of the RAVs (a quality control measure of RAVs) and the y-axis
+represents the validation score. We recommend focusing on RAVs with higher
+validation scores and using average silhouette width as a secondary criterion.
+
+```{r out.height="80%", out.width="80%"}
+plotValidate(params$val_all, interactive = TRUE)
+```
+
+Note that `interactive = TRUE` produces a zoomable, interactive plot that
+includes tooltips, which is saved as the `{input_name}_validate_plot.html` file.
+
+You can hover over each data point for more information:
+
+- **sw** : the average silhouette width of the cluster
+- **score** : the top validation score between the 8 PCs of the dataset and RAVs
+- **cl_size** : the size of RAVs, represented by the dot size
+- **cl_num** : the RAV number. You need this index to find more information
+about the RAV.
+- **PC** : the PC number of the test dataset that validates the given RAV. Because
+we used the top 8 PCs of the test dataset for validation, there are 8 categories.
+
+If you double-click the PC legend on the right, you will enter an
+individual display mode where you can add another group of data
+points with a single click.
+
+
+# Prior information associated with your dataset
+```{r echo=FALSE}
+validated_ind <- validatedSignatures(params$val_all, num.out = params$numOut,
+                                     swCutoff = 0, indexOnly = TRUE)
+
+# In case there are fewer validated RAVs than the number of outputs the user set
+n <- min(params$numOut, length(validated_ind), na.rm = TRUE)
+```
+
+## MeSH terms in wordcloud
+```{r out.height="60%", out.width="60%", fig.width=8, fig.height=8}
+for (i in seq_len(n)) {
+    set.seed(1)
+    print(paste0("MeSH terms related to RAV", validated_ind[i]))
+    drawWordcloud(params$RAVmodel, validated_ind[i])
+}
+```
+
+## GSEA
+The complete result is saved as `{input_name}_genesets_RAV*.csv`.
+```{r}
+res_all <- vector(mode = "list", length = n)
+for (i in seq_len(n)) {
+    RAVnum <- validated_ind[i]
+    RAVname <- paste0("RAV", RAVnum)
+    res <- gsea(params$RAVmodel)[[RAVname]]
+    res_all[[i]] <- head(res)
+    names(res_all)[i] <- paste0("Enriched gene sets for RAV", validated_ind[i])
+}
+res_all
+```
+
+## Publication
+The complete result is saved as `{input_name}_literatures_RAV*.csv`.
+```{r}
+res_all <- vector(mode = "list", length = n)
+for (i in seq_len(n)) {
+    RAVnum <- validated_ind[i]
+    res <- findStudiesInCluster(params$RAVmodel, RAVnum, studyTitle = TRUE)
+    res_all[[i]] <- head(res)
+    names(res_all)[i] <- paste0("Studies related to RAV", validated_ind[i])
+}
+res_all
+```
+
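
The GSEA and Publication chunks in `gss.Rmd` share one pattern: cap the loop count at the number of validated RAVs, pre-allocate a list, and fill and name it inside a `seq_len(n)` loop. The sketch below isolates that pattern in base R so it can be checked without GenomicSuperSignature installed. The helper name `collect_results`, the RAV numbers, and the stored values are hypothetical stand-ins; the report itself stores `head()` of the `gsea()` and `findStudiesInCluster()` results.

```r
# Minimal sketch of the capped, pre-allocated loop used in gss.Rmd.
# `collect_results` is a hypothetical helper, not part of the commit.
collect_results <- function(num_out, validated_ind) {
    # Never iterate past the RAVs that actually passed validation.
    n <- min(num_out, length(validated_ind), na.rm = TRUE)

    # Pre-allocate the result list, then fill and name it slot by slot.
    res_all <- vector(mode = "list", length = n)
    for (i in seq_len(n)) {
        # Placeholder payload: the report stores a data frame here.
        res_all[[i]] <- validated_ind[i]
        names(res_all)[i] <- paste0("RAV", validated_ind[i])
    }
    res_all
}

# Fewer validated RAVs than requested: the loop stops at two.
some <- collect_results(5L, c(23L, 1552L))

# No validated RAVs: seq_len(0) is empty, so the loop body never runs.
none <- collect_results(3L, integer(0))
```

Using `seq_len(n)` rather than `1:n` matters in the empty case: `1:0` evaluates to `c(1, 0)`, so the loop body would run twice with out-of-range indices.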