library(echarts4r)
library(dplyr)
library(lubridate)
library(nycflights13)
library(stringr)
As web-oriented presentation in R Markdown and Shiny becomes more and more popular, there is increasing demand for interactive graphics with R. Whereas ggplot2 and its vast extension ecosystem is clearly leading in static graphics, there is no one go-to package for interactivity. This article is a tour of echarts4r, an interface with the echarts.js JavaScript library.
There are numerous options for interactive graphics with R:
- plotly, an interface to Plotly.js that also works with ggplot2
- ggiraph, an extension for ggplot2 that implements interactive geoms
- highcharter, an interface to Highcharts
- r2d3, an interface to custom graphics with D3
- googleVis, an interface to the Google Visualization API
In addition, there are many packages specializing in a type of graph, such as dygraphs (time series) and visNetwork (network graphs).
echarts4r is a relatively new addition (CRAN release was 2018-09-17). It is an R interface with echarts.js, a free JavaScript charting library developed by Baidu and now part of the Apache Foundation.
Why echarts?
- Charts look great out of the box, especially the opening animations, tooltips and hover highlighting look great and work on mobile too
- While Plotly is optimized for exploratory data visualization by experts, echarts provides simpler interactions for a general audience, similar to Highcharts
- It covers almost all chart types imaginable, so there’s no need to switch between packages and have inconsistent styling
- echarts.js is highly customizable and it thoroughly documented (see documentation and cheat sheet cheat sheet). There is also a giant library of examples, all with source code
- It’s free to use commercially, unlike Highcharts, which otherwise ticks the same boxes
- In the development version, echarts4r offers proxies for interaction with Shiny (see https://echarts4r.john-coene.com/articles/shiny.html)
In addition to these advantages, it also offers features not seen in other packages (or at least not in this specific form): Geospatial 3D maps and timelines
In terms of ease of use, I’d put echarts4r in the middle of the pack. The R documentation is easy to follow and has good examples, but it cannot cover every detail, so one has to consult the official echarts documentation frequently. However, in contrast with learning D3 from scratch, this doesn’t take much time, an advantage that GitLab also noted in their comparison. As a long time ggplot2 user, I do miss the in-depth aesthetic mappings of ggplot2, faceting and ease of use when customizing axes. One last thing to keep in mind is that as a recent package and a larger userbase in China than in the West, StackOverflow doesn’t yet have many questions and answers on echarts.
Let’s get started
The remainder of this article is a tour of echarts4r’s features using the nycflights13 dataset. The official package website shows many more types of graphs, including maps, which are not covered here.
Set a theme
Like ggplot’s theme_set(), e_common() let’s us set a theme for all plots to come.
e_common(font_family = "helvetica", theme = "westeros")
Bar charts
We start with a classic bar chart. Composition is similar to ggplot’s geom_col() and even e_flip_coords() sounds suspiciously like ggplot’s coord_flip().
A little quirk of the package is that its Chinese origins at Baidu sometimes show through. Here, I gave the save as image button a new English title instead of the Chinese tooltip.
Horizontal bar chart
<- flights %>%
top_destinations count(dest) %>%
top_n(15, n) %>%
arrange(n)
%>%
top_destinations e_charts(x = dest) %>%
e_bar(n, legend = FALSE, name = "Flights") %>%
e_labels(position = "right") %>%
e_tooltip() %>%
e_title("Flights by destination", "Top 15 destinations") %>%
e_flip_coords() %>%
e_y_axis(splitLine = list(show = FALSE)) %>%
e_x_axis(show = FALSE) %>%
e_toolbox_feature(
feature = "saveAsImage",
title = "Save as image"
)
Stacked bar chart
echarts uses grouping with dplyr::group_by() instead of an aes() function like ggplot2 and highcharter.
<- flights %>%
flights_daytime transmute(origin, daytime = case_when(
>= 22 & hour < 6 ~ "Night",
hour >= 6 & hour < 12 ~ "Morning",
hour >= 12 & hour < 18 ~ "Afternoon",
hour TRUE ~ "Evening"
%>%
)) count(origin, daytime) %>%
group_by(daytime)
%>%
flights_daytime e_charts(origin, stack = "grp") %>%
e_bar(n) %>%
e_tooltip(
trigger = "axis",
axisPointer = list(
type = "shadow"
)%>%
) e_title(
text = "Outgoing flights by time of day",
subtext = "There are no night flights"
%>%
) e_y_axis(
splitArea = list(show = FALSE),
splitLine = list(show = FALSE)
)
Scatter plot
I plot arrival delay and departure delay. The original dataset has 336776 rows, which is too much to plot. I simply draw a sample of 1000 rows for the scatterplot and later show the full data in a heatmap.
For 1000 closely clustered points, it doesn’t make much sense to have a tooltip for each of them, so I used spike lines instead (called axis pointers in echarts).
A linear regression model fits the relationship between arrival and departure delay well, so I added a regression line with the convenient e_lm() function.
set.seed(123)
<- flights %>%
flights_sm filter(complete.cases(.)) %>%
sample_n(1000)
%>%
flights_sm e_charts(x = dep_delay) %>%
e_scatter(arr_delay, name = "Flight") %>%
e_lm(arr_delay ~ dep_delay, name = "Linear model") %>%
e_axis_labels(x = "Departure delay", y = "Arrival delay") %>%
e_title(
text = "Arrival delay vs. departure delay",
subtext = "The later you start, the later you finish"
%>%
) e_x_axis(
nameLocation = "center",
splitArea = list(show = FALSE),
axisLabel = list(margin = 3),
axisPointer = list(
show = TRUE,
lineStyle = list(
color = "#999999",
width = 0.75,
type = "dotted"
)
)%>%
) e_y_axis(
nameLocation = "center",
splitArea = list(show = FALSE),
axisLabel = list(margin = 0),
axisPointer = list(
show = TRUE,
lineStyle = list(
color = "#999999",
width = 0.75,
type = "dotted"
)
) )
<- 100 # binning
n_bins %>%
flights filter(complete.cases(.)) %>%
mutate(
arr_delay = cut(arr_delay, n_bins),
dep_delay = cut(dep_delay, n_bins)
%>%
) count(arr_delay, dep_delay) %>%
e_charts(dep_delay) %>%
e_heatmap(arr_delay, n) %>%
e_visual_map(n) %>%
e_title("Arrival delay vs. departure delay") %>%
e_axis_labels("Departure delay", "Arrival delay")
Pie chart
Pie charts tend to be a bad choice for accurate visualization, but they look nice. Here, the plot shows about even numbers of flights for the three origin airports, but it’s near impossible to tell that EWR has the most flights. Creating pie charts is surprisingly hard in ggplot2, especially when it comes to labeling them. In echarts it’s very easy.
<- count(flights, origin) %>%
pie e_charts(x = origin) %>%
e_pie(n, legend = FALSE, name = "Flights") %>%
e_tooltip() %>%
e_title("Flights by origin", "This is really hard with ggplot2")
pie
Time series
I need time series graphs for an upcoming project, so that’ll be a focus of this article. Let’s analyze departure delays from all three origin airports.
<- flights %>%
flights_ts transmute(week = as.Date(cut(time_hour, "week")), dep_delay, origin) %>%
group_by(origin, week) %>% # works with echarts
summarise(dep_delay = sum(dep_delay, na.rm = TRUE), .groups = "drop_last")
Regular time series
After much testing, I found that the way Highcharts does time series is the most intuitive and easy to use. The graph has a slider on the bottom for zooming, tooltips for multiple series are collected in one box and points grow when brushing over them.
<- flights_ts %>%
ts_base e_charts(x = week) %>%
e_datazoom(
type = "slider",
toolbox = FALSE,
bottom = -5
%>%
) e_tooltip() %>%
e_title("Departure delays by airport") %>%
e_x_axis(week, axisPointer = list(show = TRUE))
%>% e_line(dep_delay) ts_base
Stacked area
Switching from line to area graphs is done in one line of code. It’s also possible to reuse chart elements.
<- ts_base %>% e_area(dep_delay, stack = "grp")
area area
Timeline
A standout feature of echarts is the timeline visualization. It’s somewhat similar to gganimate, but instead of videos it outputs an HTMLwidget with controls. Here, I used it to show weekly aggregated departure delays for JFK airport. One thing to look out for is that the timeline view doesn’t provide as much context as the classic time series graphs shown before. However, this focus on a single time period at a time also lends itself to data storytelling.
%>%
flights_ts filter(origin == "JFK") %>%
group_by(month = month(week, label = TRUE)) %>%
e_charts(x = week, timeline = TRUE) %>%
e_bar(
dep_delay,name = "Departure Delay",
symbol = "none",
legend = FALSE
)
Synchronized plots
Similar to the crosstalk package, echarts allows linking of plots. They share legend, sliders and data zooms. From my point of view, it’s easier to use but not as flexible. Crosstalk allows linking of any compatible HTMLWidgets like Leaflet maps and DT tables, while echarts is limited to echarts itself. When used in Shiny applications, the complex interactions can be handled by the server functions, so echarts can also be linked without limitations.
Grab bag
This final section is a collection of various specialized graphs.
Correlation matrix
The convenient e_correlations() function combines e_heatmap() with corrMatOrder() from the corrplot package. As a specialized function, the original corrplot() function has many more options though, such as only displaying the upper of lower triangle and visualizing the results of statistical significance tests.
<- flights %>%
cor_data select(arr_delay, dep_delay, air_time, distance, hour) %>%
filter(complete.cases(.)) %>%
::set_colnames(colnames(.) %>% str_replace("_", " ") %>% str_to_title()) %>%
magrittrcor()
%>%
cor_data e_charts() %>%
e_correlations(
visual_map = TRUE,
order = "hclust",
inRange = list(color = c("#edafda", "#eeeeee", "#59c4e6")), # scale colors
itemStyle = list(
borderWidth = 2,
borderColor = "#fff"
)%>%
) e_title("Correlation")
Wordcloud
Wordclouds fall in the same category as pie charts - a pretty, but imprecise display. A hover effect can add more detail by showing the precise word frequency. Here, I show the 50 top destinations by number of flights.
<- flights %>%
tf count(dest, sort = TRUE) %>%
head(50)
%>%
tf e_color_range(n, color, colors = c("#59c4e6", "#edafda")) %>%
e_charts() %>%
e_cloud(
word = dest,
freq = n,
color = color,
shape = "circle",
rotationRange = c(0, 0),
sizeRange = c(8, 100)
%>%
) e_tooltip() %>%
e_title("Flight destinations")
Wrapup
The charts covered in this article are just a sample of the large variety of capabilities of echarts. There are many more examples on the official site and the echarts4r website. Personally, echarts4r will become my go-to for interactive HTML publications in R Markdown and Shiny. For static graphs I’ll stick with ggplot2 and vast ecosystem of extension packages, and for quick exploratory data analysis plotly’s ggplotly() is by far the easiest tool.