This question has been asked before, but not in a way that fits my data, so I'll try again :)
I want to make multiple individual ggplots without having to specify how it should be made every time.
My dataset contains gene expression data and from these I want to plot specific genes.
Let's use this as an example
df <- read.table(header = TRUE,
stringsAsFactors = FALSE,
text="GENE SYMBOL Patient1 Patient2
TP53 ILMN_2 3.55 3.66
TP53 ILMN_3 5.49 4.99
XBP1 ILMN_5 4.06 2.53
TP27 ILMN_1 2.53 3.33
REDD1 ILMN_4 3.99 4.56
ERO1L ILMN_6 5.02 6.95
STK11 ILMN_9 3.64 2.01
HIF2A ILMN_8 2.96 4.76 ")
In order to plot selected genes from df, I usually do the following:
First I load the packages and make an object I can use for searching in the dataframe
library(dplyr)
library(tidyr)
library(ggplot2)
SYMBOL_info <- select(df, SYMBOL)
Then, I define the gene I'm interested in as:
geneOfInterest <- c(SYMBOL_info == "ILMN_5")
Next, the gene of interest is found in the dataframe and the dataframe is gathered to fit ggplot's requirements:
df_gather<- df %>%
filter(geneOfInterest) %>%
gather(key=Patient, value=values, -c(GENE, SYMBOL))
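As a side note, gather() still works, but the tidyr documentation marks it as superseded by pivot_longer(). A minimal sketch of what the same reshaping step might look like with pivot_longer(), using two rows copied from the example data (df_small and df_long are illustrative names, not part of the original code):

```r
library(dplyr)
library(tidyr)

# Two rows copied from the example data (df_small is a hypothetical name)
df_small <- data.frame(
  GENE = c("XBP1", "TP27"),
  SYMBOL = c("ILMN_5", "ILMN_1"),
  Patient1 = c(4.06, 2.53),
  Patient2 = c(2.53, 3.33),
  stringsAsFactors = FALSE
)

# Keep one probe, then turn the patient columns into Patient/values pairs
df_long <- df_small %>%
  filter(SYMBOL == "ILMN_5") %>%
  pivot_longer(cols = -c(GENE, SYMBOL),
               names_to = "Patient",
               values_to = "values")
```

For a single filtered row this should give the same two rows as the gather() call; with several rows the row order differs, which does not matter for plotting.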
In the end, the gene of interest in the dataset is plotted:
ggplot()+
geom_point(df_gather, mapping = aes(x=Patient, y=values, color=GENE))+
labs(title="XBP1 plot", subtitle = "Symbol: ILMN_5")
ggsave("XBP1_plot.png")
However, I have many genes I'd like to plot, e.g. both versions of TP53, REDD1, STK11 and HIF1A. Any suggestions on how this can be done without having to change geneOfInterest and the information within the plotting part of the code every time? I guess a for loop needs to be made, but copying the other solutions given here didn't help me (as shown here: R: saving multiple ggplots using a for loop).
Thanks in advance! :)
EDIT: SYMBOL-values are changed to start with ILMN_ instead of just numbers
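One possible direction, sketched under a few assumptions rather than offered as a definitive answer: wrap the filter/gather/plot steps in a small function and loop over the SYMBOL values of the wanted genes. The helper name plot_symbol, the vector genesOfInterest, the file name pattern GENE_SYMBOL_plot.png and the 6 x 4 inch size are all illustrative choices, not taken from the question. HIF2A is used because it is the HIF gene present in the example data, and the sketch assumes one plot per SYMBOL with each SYMBOL occurring only once.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example data from the question
df <- read.table(header = TRUE, stringsAsFactors = FALSE,
                 text = "GENE SYMBOL Patient1 Patient2
TP53 ILMN_2 3.55 3.66
TP53 ILMN_3 5.49 4.99
XBP1 ILMN_5 4.06 2.53
TP27 ILMN_1 2.53 3.33
REDD1 ILMN_4 3.99 4.56
ERO1L ILMN_6 5.02 6.95
STK11 ILMN_9 3.64 2.01
HIF2A ILMN_8 2.96 4.76")

# Hypothetical helper: build the plot for a single SYMBOL value
plot_symbol <- function(data, symbol_id) {
  df_gather <- data %>%
    filter(SYMBOL == symbol_id) %>%
    gather(key = Patient, value = values, -c(GENE, SYMBOL))
  gene <- unique(df_gather$GENE)
  ggplot(df_gather, aes(x = Patient, y = values, color = GENE)) +
    geom_point() +
    labs(title = paste(gene, "plot"),
         subtitle = paste("Symbol:", symbol_id))
}

# Genes to plot (assumed list); every SYMBOL of these genes gets its own plot
genesOfInterest <- c("TP53", "REDD1", "STK11", "HIF2A")
symbolsOfInterest <- df %>%
  filter(GENE %in% genesOfInterest) %>%
  pull(SYMBOL)

plots <- list()
for (s in symbolsOfInterest) {
  gene <- df$GENE[df$SYMBOL == s][1]
  plots[[s]] <- plot_symbol(df, s)
  # File name pattern and plot size are arbitrary choices for this sketch
  ggsave(filename = paste0(gene, "_", s, "_plot.png"),
         plot = plots[[s]], width = 6, height = 4)
}
```

Passing the plot explicitly via plot = avoids relying on the last plot displayed, which matters inside a loop where nothing is printed. Including the SYMBOL in the file name keeps the two TP53 probes from overwriting each other.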