- `_ - disease associations, drug-target associations, cancer hallmarks, and druggability/tractability rankings

- `The Cancer Genome Atlas `_ - gene aberration frequencies and co-expression patterns in approximately 10,000 primary tumor samples

- `The Human Protein Atlas `_ - expression data for healthy human tissues (`GTEx `_)/cell types, and prognostic gene expression associations in cancer (`The Pathology Atlas `_)

- `Molecular Signatures Database (MSigDB) `_ - collection of annotated gene sets (e.g. pathway annotations) for enrichment/overrepresentation analysis. This includes gene sets from `Gene Ontology `_, `Reactome `_, `KEGG `_, `WikiPathways `_, `BIOCARTA `_, as well as curated `immunologic `_ and `cancer-specific `_ signatures.

- `NetPath `_ - manually curated resource of signal transduction pathways in humans

- `STRING `_ - protein-protein interaction database

- `CellChatDB `_ - database of ligand-receptor interactions

- `DoRothEA `_ - gene set resource containing signed transcription factor (TF) - target interactions

- `CORUM `_ - protein complex database

- `Compleat `_ - protein complex resource

- `ComplexPortal `_ - manually curated, encyclopaedic resource of macromolecular complexes

- `hu.MAP2 `_ - human protein complex map

- `ComPPI `_ - subcellular compartment database

- `CancerMine `_ - literature-mined resource on cancer drivers, oncogenes and tumor suppressor genes

- `Network of Cancer Genes `_ - manually curated collection of cancer genes, healthy drivers and their properties

- `Project Score `_ - database on the effects on cancer cell line viability elicited by CRISPR-Cas9 mediated gene inactivation

- `Genetic determinants of survival in cancer `_ - resource on the prognostic impact of genetic aberrations (methylation, CNA, mutation, expression) in human cancers (TCGA)

- `Predicted synthetic lethality interactions `_ - comprehensive prediction of synthetic lethality interactions in human cancer cell lines

The contents of the gene set analysis report attempt to answer the following questions related to the query set:

- Which diseases/tumor types are known to be associated with genes in the query set, and to what extent? Which genes are classified as proto-oncogenes, tumor suppressors or cancer driver genes?

- Which query genes have been linked (through literature) to the various hallmarks of cancer?

- Which genes in the query set are poorly characterized or have an unknown function?

- Which proteins in the query set can be targeted by inhibitors for different cancer conditions (early and late clinical development phases)? What is the tractability/druggability status for other targets in the query set?

- Which cancer-relevant protein complexes involve proteins in the query set?

- Are there known cancer-relevant regulatory interactions (transcription factor (TF) - target) found in the query set?

- Are there known ligand-receptor interactions in the query set?

- Which subcellular compartments (nucleus, cytosol, plasma membrane etc.) are dominant localizations for members of the query set?

- Are specific tissues or cell types enriched in the query set, considering healthy tissue/cell-type specific expression patterns (GTEx/Human Protein Atlas) of query genes?

- Which protein-protein interactions are known within the query set? Are there interactions between members of the query set and other cancer-relevant proteins (e.g. proto-oncogenes, tumor suppressors or predicted cancer drivers)? Which proteins constitute hubs in the protein-protein interaction network?

- Are there specific pathways, biological processes or molecular functions that are enriched within the query set, as compared to a reference/background set?

- Which members of the query set are frequently mutated in tumor sample cohorts (TCGA - SNVs/InDels / homozygous deletions / copy number amplifications)? What are the most frequent recurrent somatic variants (SNVs/InDels) in the query set genes?

- Which members of the query set are co-expressed (strong negative or positive correlations) with cancer-relevant genes (i.e. proto-oncogenes or tumor suppressors) in tumor sample cohorts (TCGA)?

- Which members of the query set are associated with better/worse survival in different cancers, considering mutation, expression, methylation or copy number levels in tumors?

- Which members of the query set are predicted as partners of synthetic lethality interactions?

- Which members of the query set are associated with cellular loss-of-fitness in CRISPR/Cas9 whole-genome dropout screens of cancer cell lines (i.e. reduction of cell viability elicited by gene inactivation)? Which genes should be prioritized considering genomic biomarkers and fitness scores in combination?

]]>

10.48550/arXiv.2107.13247