annotate wgcna_preprocessing.Rmd @ 6:2f4df2be0572 draft

planemo upload for repository https://github.com/statonlab/docker-GRReport/tree/master/my_tools/rmarkdown_wgcna commit d91f269e8bc09a488ed2e005122bbb4a521f44a0-dirty
author mingchen0919
date Tue, 08 Aug 2017 12:35:11 -0400

---
title: 'WGCNA: data preprocessing'
output:
  html_document:
    number_sections: true
    toc: true
    theme: cosmo
    highlight: tango
---

```{r setup, include=FALSE, warning=FALSE, message=FALSE}
# Packages used below: WGCNA provides goodSamplesGenes() and printFlush(),
# DT provides datatable().
library(WGCNA)
library(DT)
knitr::opts_chunk$set(
  echo = ECHO
)
```

```{r}
# Inspect the structure of the options object `opt` (defined outside this document).
str(opt)
```

# Import data

The expression data is read from a CSV file in which each row represents a gene and each column represents a sample. The first column holds the gene identifiers and the header row holds the sample names.

```{r}
# Use the first column as row names (gene identifiers).
expression_data = read.csv('EXPRESSION_DATA', header = TRUE, row.names = 1)
```
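
As a minimal sketch of the expected input layout, the snippet below reads a small made-up CSV from an inline string instead of the `EXPRESSION_DATA` file. The gene names, sample names, and values are hypothetical; only the `read.csv` arguments are taken from the chunk above.

```r
# Hypothetical toy input: first column holds gene IDs, header row holds sample names.
csv_text <- "gene,sample_A,sample_B,sample_C
gene_1,5.1,4.8,5.3
gene_2,2.0,NA,2.2
gene_3,7.4,7.9,7.1"

# Same arguments as above; `text =` stands in for the file path.
toy_expression <- read.csv(text = csv_text, header = TRUE, row.names = 1)

dim(toy_expression)       # genes in rows, samples in columns
rownames(toy_expression)  # gene IDs taken from the first column
```

Missing measurements written as `NA` in the file are read as missing values, which matters for the quality check later in this document.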

Display the first 100 genes.

```{r}
# Interactive table with per-column filters and horizontal scrolling.
datatable(head(expression_data, 100), style="bootstrap", filter = 'top',
          class="table-condensed", options = list(dom = 'tp', scrollX = TRUE))
```

Transpose the expression data matrix so that each row represents a sample and each column represents a gene, which is the layout that WGCNA functions expect.

```{r}
# t() returns a matrix, so convert the result back to a data frame.
expression_data = as.data.frame(t(expression_data))
```
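
The effect of the transposition can be checked on a small example. This is only a sketch on hypothetical toy data; the names `toy` and `toy_t` are not part of the workflow.

```r
# Hypothetical toy data: 3 genes (rows) by 2 samples (columns).
toy <- data.frame(sample_A = c(5.1, 2.0, 7.4),
                  sample_B = c(4.8, 2.1, 7.9),
                  row.names = c("gene_1", "gene_2", "gene_3"))

# t() returns a matrix, so as.data.frame() converts it back to a data frame.
toy_t <- as.data.frame(t(toy))

dim(toy_t)       # samples in rows, genes in columns
rownames(toy_t)  # sample names
colnames(toy_t)  # gene IDs
```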

# Checking data

Check the data for genes and samples with excessive missing values and remove any that fail the check. Outlier samples are identified in the next section by clustering.

```{r}
# Flag genes and samples that fail WGCNA's quality checks (e.g. too many missing values).
gsg = goodSamplesGenes(expression_data, verbose = 3)
if (!gsg$allOK) {
  # Optionally, print the gene and sample names that were removed:
  if (sum(!gsg$goodGenes) > 0)
    printFlush(paste("Removing genes:", paste(names(expression_data)[!gsg$goodGenes], collapse = ", ")))
  if (sum(!gsg$goodSamples) > 0)
    printFlush(paste("Removing samples:", paste(rownames(expression_data)[!gsg$goodSamples], collapse = ", ")))
  # Remove the offending genes and samples from the data:
  expression_data = expression_data[gsg$goodSamples, gsg$goodGenes]
} else {
  print('All genes and samples are OK!')
}
```
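
To illustrate the kind of filtering this step performs, here is a simplified base-R sketch that flags genes and samples with too many missing values, plus genes with zero variance. It is not the WGCNA implementation: the helper name `flag_good`, the `max_missing` threshold of 0.5, and the toy data are assumptions for illustration, and the real `goodSamplesGenes` applies its own criteria and may iterate the filtering.

```r
# Simplified, single-pass approximation (hypothetical helper, not part of WGCNA).
# `data` has samples in rows and genes in columns.
flag_good <- function(data, max_missing = 0.5) {
  m <- as.matrix(data)
  gene_missing   <- colMeans(is.na(m))   # fraction of missing values per gene
  sample_missing <- rowMeans(is.na(m))   # fraction of missing values per sample
  gene_var <- apply(m, 2, var, na.rm = TRUE)
  good_genes   <- gene_missing <= max_missing & !is.na(gene_var) & gene_var > 0
  good_samples <- sample_missing <= max_missing
  list(goodGenes = good_genes, goodSamples = good_samples,
       allOK = all(good_genes) && all(good_samples))
}

# Hypothetical toy data: 4 samples by 3 genes.
toy <- data.frame(
  gene_1 = c(5.1, 4.8, 5.3, 5.0),
  gene_2 = c(NA, NA, NA, 2.2),   # mostly missing
  gene_3 = c(1, 1, 1, 1),        # zero variance
  row.names = c("s1", "s2", "s3", "s4")
)

qc <- flag_good(toy)
toy_clean <- toy[qc$goodSamples, qc$goodGenes, drop = FALSE]
colnames(toy_clean)  # only the gene that passed both checks remains
```

The returned list mirrors the `goodGenes`, `goodSamples`, and `allOK` fields used in the chunk above, so the subsetting step has the same shape.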

# Clustering samples

If the dendrogram shows any outliers, choose a cut height that separates the offending samples from the rest. Note this number, since you will need it in further analysis.

```{r fig.align='center'}
# Average-linkage hierarchical clustering of samples on Euclidean distance.
sampleTree = hclust(dist(expression_data), method = "average")
plot(sampleTree, main = "Sample clustering to detect outliers", sub="", xlab="",
     cex.axis = 1, cex.main = 1, cex = 0.5)
```
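
One possible way to apply a chosen cut height is sketched below with base R's `cutree`. The toy data, the cut height of 10, and the rule of keeping the largest cluster are assumptions for illustration only; the appropriate height depends on the dendrogram for your own data.

```r
# Hypothetical toy data: 5 similar samples plus one obvious outlier, 4 genes each.
toy <- rbind(
  s1 = c(1.0, 2.0, 3.0, 4.0),
  s2 = c(1.1, 2.1, 2.9, 4.2),
  s3 = c(0.9, 1.8, 3.1, 3.9),
  s4 = c(1.2, 2.2, 3.0, 4.1),
  s5 = c(1.0, 1.9, 3.2, 4.0),
  s6 = c(20, 25, 30, 35)         # outlier
)

sample_tree <- hclust(dist(toy), method = "average")

cut_height <- 10  # assumed value; in practice, read it off the dendrogram
clusters <- cutree(sample_tree, h = cut_height)

# Keep the samples in the largest cluster; the rest are treated as outliers.
main_cluster <- as.integer(names(which.max(table(clusters))))
keep <- clusters == main_cluster
toy_filtered <- toy[keep, , drop = FALSE]
rownames(toy_filtered)
```

Cutting the tree at a fixed height splits it into the branches that merge below that height, so a sample that only joins the others far above the cut ends up in its own cluster.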

```{r echo=FALSE}
# Drop the options object, then save every remaining object for later steps.
rm("opt")
save(list=ls(all.names = TRUE), file='PREPROCESSING_WORKSPACE')
```
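
A later analysis step could restore the saved objects along these lines. This sketch writes to a temporary file rather than `PREPROCESSING_WORKSPACE`, and the object values are placeholders; loading into a separate environment is one option for avoiding name clashes, not something the original workflow is known to do.

```r
# Save a couple of placeholder objects to a temporary file.
workspace_file <- tempfile(fileext = ".RData")
expression_data <- data.frame(gene_1 = c(5.1, 4.8), gene_2 = c(2.0, 2.2),
                              row.names = c("s1", "s2"))
sampleTree <- hclust(dist(expression_data), method = "average")
save(list = c("expression_data", "sampleTree"), file = workspace_file)

# Load them into a fresh environment so existing variables are not overwritten.
restored <- new.env()
loaded_names <- load(workspace_file, envir = restored)

sort(loaded_names)             # names of the restored objects
dim(restored$expression_data)  # samples in rows, genes in columns
```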