Tuesday, August 6, 2019

r - How to jitter text to avoid overlap in a ggplot2 scatterplot?

I would like to create a clean scatterplot of text labels in ggplot2. The goal is to show visually the increasing values associated with about 25 items. I am already using position_jitter(), but I wonder if I can do better.



Here is some mock data:



title <- rep("A Really Rather Long Text Label", 25)
value <- runif(25, 1, 10)   # random y values between 1 and 10
spacing <- seq_len(25)      # one x slot per label
df <- data.frame(title, value, spacing, stringsAsFactors = FALSE)


And here is the code that generates the graph:



library(ggplot2)
myplot <- ggplot(data = df, aes(x = spacing, y = value, label = title)) +
  geom_text(aes(colour = value),
            size = 2.5, fontface = "bold",
            vjust = 0,
            position = position_jitter(width = 5, height = 0)) +
  theme_bw() +
  scale_x_continuous(limits = c(-5, 30)) +
  scale_colour_gradient(low = "#6BAED6", high = "#08306B") +
  theme(axis.title.x = element_blank(),
        axis.ticks = element_blank(),
        axis.text.x = element_blank(),
        legend.position = "none")

myplot


There is plenty of space for all of this text in a graph of reasonable size -- so long as the text is free to shift as far as it needs to horizontally! I don't want to jitter vertically, because the point is to show the y value associated with each text label.



The graph comes out slightly differently every time you run the ggplot command -- and sometimes the jittering results are better than other times. But I haven't found a way to systematically prevent the labels from overlapping.
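As a side note on the run-to-run variation: position_jitter() accepts a seed argument (ggplot2 3.0.0 and later), so a layout that happens to look good can at least be reproduced. The sketch below is a trimmed-down version of the plot above; the seed value 1 is an arbitrary assumption, and fixing it does not prevent overlap, it only makes the result repeatable.

```r
library(ggplot2)

# Mock data shaped like the data frame in the question.
df <- data.frame(
  title = rep("A Really Rather Long Text Label", 25),
  value = runif(25, 1, 10),
  spacing = seq_len(25),
  stringsAsFactors = FALSE
)

# seed = 1 is an arbitrary choice; any fixed integer pins the jitter offsets.
myplot <- ggplot(df, aes(x = spacing, y = value, label = title)) +
  geom_text(
    size = 2.5, fontface = "bold", vjust = 0,
    position = position_jitter(width = 5, height = 0, seed = 1)
  ) +
  scale_x_continuous(limits = c(-5, 30))
```

Trying a handful of seeds and keeping the best-looking one is still trial and error, just a repeatable form of it.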



I'd be grateful for any suggestions on how to clean up the horizontal positioning of the text without having to move individual items manually. I'd also be glad to hear other tips on how to improve this type of visual representation.
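One possible direction, offered as a rough sketch rather than a definitive answer, is to drop random jitter and compute the x positions deterministically: rank the labels by value and deal them out across a few fixed columns, so that labels adjacent in y never share an x position. The names n_cols and col_x and the choice of 5 columns are assumptions for illustration, not part of the original code.

```r
# Mock data shaped like the data frame in the question.
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(
  title = rep("A Really Rather Long Text Label", 25),
  value = runif(25, 1, 10),
  stringsAsFactors = FALSE
)

n_cols <- 5                                # hypothetical number of label columns
col_x  <- seq(0, 25, length.out = n_cols)  # x position of each column

# Rank labels by value and assign them to the columns in turn,
# so that neighbours in y always land in different columns.
rank_y <- rank(df$value, ties.method = "first")
df$spacing <- col_x[(rank_y - 1) %% n_cols + 1]
```

With spacing computed this way, the position = position_jitter(...) argument can be left out of geom_text(). Labels in the same column are n_cols ranks apart, which usually separates them vertically, but values that are very close together can still collide, so n_cols may need tuning for the plot size.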
