R code for animated chord diagrams
Background
A little while ago my paper in International Migration Review on global migration flow estimates came out online. The paper includes a number of directional chord diagrams to visualize the estimates.
Recently I have been playing around tweenr
and the magick
packages for animated population pyramids. In this post I attempt to show how to use these packages to produce animated directional chord diagrams of global migration flow estimates
Data
The first step is to read into R two data frames (these are in my migest
R package if you wish to replicate the code below).
- Time series of bilateral migration flow estimates:
# install.packages("migest")
library(tidyverse)
d0 <- read_csv(system.file("imr", "reg_flow.csv", package = "migest"))
d0
## # A tibble: 891 × 4
## year0 orig_reg dest_reg flow
## <dbl> <chr> <chr> <dbl>
## 1 1960 Africa Africa 1377791
## 2 1960 Africa Eastern Asia 5952
## 3 1960 Africa Eastern Europe & Central Asia 7303
## 4 1960 Africa Europe 919252
## 5 1960 Africa Latin America & Caribbean 15796
## 6 1960 Africa Northern America 82463
## 7 1960 Africa Oceania 32825
## 8 1960 Africa Southern Asia 35603
## 9 1960 Africa Western Asia 106580
## 10 1960 Eastern Asia Africa 37301
## # … with 881 more rows
- Some regional meta data for chord diagram plots:
d1 <- read_csv(system.file("vidwp", "reg_plot.csv", package = "migest"))
d1
## # A tibble: 9 × 5
## region order1 col1 reg1 reg2
## <chr> <dbl> <chr> <chr> <chr>
## 1 Northern America 1 #40A4D8 Northern America
## 2 Africa 2 #33BEB7 Africa <NA>
## 3 Europe 3 #B2C224 Europe <NA>
## 4 Eastern Europe & Central Asia 4 #FECC2F Eastern Europe & Central Asia
## 5 Western Asia 5 #FBA127 Western Asia
## 6 Southern Asia 6 #F66320 Southern Asia
## 7 Eastern Asia 7 #DB3937 Eastern Asia
## 8 Oceania 8 #A463D7 Oceania <NA>
## 9 Latin America & Caribbean 9 #0C5BCE Latin America & Caribbean
Tween
The next step is to tween the data by migration corridor.
library(tweenr)
d2 <- d0 %>%
mutate(corridor = paste(orig_reg, dest_reg, sep = " -> ")) %>%
select(corridor, year0, flow) %>%
mutate(ease = "linear") %>%
tween_elements(time = "year0", group = "corridor", ease = "ease", nframes = 100) %>%
as_tibble()
d2
## # A tibble: 8,181 × 4
## year0 flow .frame .group
## <dbl> <dbl> <int> <chr>
## 1 1960 1377791 0 Africa -> Africa
## 2 1960 5952 0 Africa -> Eastern Asia
## 3 1960 7303 0 Africa -> Eastern Europe & Central Asia
## 4 1960 919252 0 Africa -> Europe
## 5 1960 15796 0 Africa -> Latin America & Caribbean
## 6 1960 82463 0 Africa -> Northern America
## 7 1960 32825 0 Africa -> Oceania
## 8 1960 35603 0 Africa -> Southern Asia
## 9 1960 106580 0 Africa -> Western Asia
## 10 1960 37301 0 Eastern Asia -> Africa
## # … with 8,171 more rows
This creates larger data frame d2
, with 100 observations for each corridor, one for each frame in the animation. In the original data d0
there are only 11 observations for each corridor, one for each five-year period.
Then some further minor data wrangling is required to ready the data for plotting using the chordDiagram
function; namely the first three columns in the data must correspond to the origin, destination and flow.
d2 <- d2 %>%
separate(col = .group, into = c("orig_reg", "dest_reg"), sep = " -> ") %>%
select(orig_reg, dest_reg, flow, everything()) %>%
mutate(flow = flow/1e06)
d2
## # A tibble: 8,181 × 5
## orig_reg dest_reg flow year0 .frame
## <chr> <chr> <dbl> <dbl> <int>
## 1 Africa Africa 1.38 1960 0
## 2 Africa Eastern Asia 0.00595 1960 0
## 3 Africa Eastern Europe & Central Asia 0.00730 1960 0
## 4 Africa Europe 0.919 1960 0
## 5 Africa Latin America & Caribbean 0.0158 1960 0
## 6 Africa Northern America 0.0825 1960 0
## 7 Africa Oceania 0.0328 1960 0
## 8 Africa Southern Asia 0.0356 1960 0
## 9 Africa Western Asia 0.107 1960 0
## 10 Eastern Asia Africa 0.0373 1960 0
## # … with 8,171 more rows
Plots for Each Frame
Now the data is in the correct format, chord diagrams can be produced for each frame of the eventual animation. To do this, I used a for
loop to cycle through the tweend data. The arguments I used in the circos.par
, chordDiagram
and circos.track
functions to produce each plot are explained in more detail in the comments of the migest
demo.
# create a directory to store the individual plots
dir.create("./plot-ani/")
library(circlize)
for(f in unique(d2$.frame)){
# open a PNG plotting device
png(file = paste0("./plot-ani/globalchord", f, ".png"), height = 7, width = 7,
units = "in", res = 500)
# initialise the circos plot
circos.clear()
par(mar = rep(0, 4), cex=1)
circos.par(start.degree = 90, track.margin=c(-0.1, 0.1),
gap.degree = 4, points.overflow.warning = FALSE)
# plot the chord diagram
chordDiagram(x = filter(d2, .frame == f), directional = 1, order = d1$region,
grid.col = d1$col1, annotationTrack = "grid",
transparency = 0.25, annotationTrackHeight = c(0.05, 0.1),
direction.type = c("diffHeight", "arrows"), link.arr.type = "big.arrow",
diffHeight = -0.04, link.sort = TRUE, link.largest.ontop = TRUE)
# add labels and axis
circos.track(track.index = 1, bg.border = NA, panel.fun = function(x, y) {
xlim = get.cell.meta.data("xlim")
sector.index = get.cell.meta.data("sector.index")
reg1 = d1 %>% filter(region == sector.index) %>% pull(reg1)
reg2 = d1 %>% filter(region == sector.index) %>% pull(reg2)
circos.text(x = mean(xlim), y = ifelse(is.na(reg2), 3, 4),
labels = reg1, facing = "bending", cex = 1.1)
circos.text(x = mean(xlim), y = 2.75, labels = reg2, facing = "bending", cex = 1.1)
circos.axis(h = "top", labels.cex = 0.8
labels.niceFacing = FALSE, labels.pos.adjust = FALSE)
})
# close plotting device
dev.off()
}
Creating an animation
Using the magick
package an animation can be created by using the code below to
- Read in an initial plot and then combine together all other images created above.
- Scale the combined images.
- Animate the combined images and save as a
.gif
ormp4
library(magick)
img <- image_read(path = "./plot-ani/globalchord0.png")
for(f in unique(d2$.frame)[-1]){
img0 <- image_read(path = paste0("./plot-ani/globalchord",f,".png"))
img <- c(img, img0)
message(f)
}
img1 <- image_scale(image = img, geometry = "720x720")
ani0 <- image_animate(image = img1, fps = 10)
image_write(image = ani0, path = "./plot-ani/globalchord.gif")
image_write_video(image = img1, path = "./plot-ani/globalchord.mp4", framerate = 10)
This gives an output much like this minus the additional details in the corners:
Fixing Scales in Chord Diagrams
Whilst the plot above allows comparisons of the distributions of flows overtime it is more difficult to compare volumes. For such comparisons, Zuguang Gu suggests scaling the gaps between the sectors on the outside of the chord diagram. I wrote a little function that can do this for flow data arranged in a tidy format;
scale_gap <- function(flow_m, flow_max, gap_at_max = 1, gaps = NULL) {
p <- flow_m / flow_max
if(length(gap_at_max) == 1 & !is.null(gaps)) {
gap_at_max <- rep(gap_at_max, gaps)
}
gap_degree <- (360 - sum(gap_at_max)) * (1 - p)
gap_m <- (gap_degree + sum(gap_at_max))/gaps
return(gap_m)
}
where
flow_m
is the size of total flows in the matrix for the given year being re-scaled.flow_max
is the maximum size of the flow matrix over all yearsgap_at_max
is the size in degrees of the gaps in the flow matrix in the year where the flows are at their all time maximum.gaps
is the number of gaps in the chord diagram (i.e. the number of regions).
The function can be used to derive the size of gaps in each frame for a new animation.
d3 <- d2 %>%
group_by(.frame) %>%
summarise(flow = sum(flow)) %>%
mutate(gaps = scale_gap(flow_m = flow, flow_max = max(.$flow),
gap_at_max = 4, gaps = 9))
d3
## # A tibble: 101 × 3
## .frame flow gaps
## <int> <dbl> <dbl>
## 1 0 17.6 25.9
## 2 1 17.8 25.7
## 3 2 18.1 25.6
## 4 3 18.3 25.4
## 5 4 18.5 25.2
## 6 5 18.7 25.1
## 7 6 18.9 24.9
## 8 7 19.1 24.7
## 9 8 19.3 24.6
## 10 9 19.6 24.4
## # … with 91 more rows
The calculations in d3
can then be plugged into the for
loop above, where the circos.par()
function is replaced by
circos.par(start.degree = 90, track.margin = c(-0.1, 0.1),
gap.degree = filter(d3, .frame == f)$gaps,
points.overflow.warning = FALSE)
Once the for loop has produced a new set of images, the same code to produce previous animation can be run to obtain the animated chord diagrams with changing gaps;
Whilst the sector axes are now fixed, I am not convinced that changing the relative gaps is the best way to compare volumes when using animated chord diagrams. The sectors of all regions - bar Northern America - are rotating making it hard follow their changes over time.
Fortunately there is new xmax
option in chordDiagram
that can be used to fix the lengths of the x-axis for each sector using a named vector. In the context of producing an animation, the historic maximum migration flows (of combined immigration and emigration flows) in each region can be used, calculated from the original data d0
library(magrittr)
reg_max <- d0 %>%
group_by(year0, orig_reg) %>%
mutate(tot_out = sum(flow)) %>%
group_by(year0, dest_reg) %>%
mutate(tot_in = sum(flow)) %>%
filter(orig_reg == dest_reg) %>%
mutate(tot = tot_in + tot_out) %>%
mutate(reg = orig_reg) %>%
group_by(reg) %>%
summarise(tot_max = max(tot)/1e06) %$%
'names<-'(tot_max, reg)
reg_max
## Africa Eastern Asia
## 17.429942 14.805479
## Eastern Europe & Central Asia Europe
## 8.361300 15.536978
## Latin America & Caribbean Northern America
## 7.697638 10.416927
## Oceania Southern Asia
## 2.968412 15.067631
## Western Asia
## 15.072561
The reg_max
object can then be used in the chordDiagram
function in the for
loop above, replacing the original call with
chordDiagram(x = filter(d2, .frame == f), directional = 1, order = d1$region,
grid.col = d1$col1, annotationTrack = "grid",
transparency = 0.25, annotationTrackHeight = c(0.05, 0.1),
direction.type = c("diffHeight", "arrows"), link.arr.type = "big.arrow",
diffHeight = -0.04, link.sort = TRUE, link.largest.ontop = TRUE,
xmax = reg_max)
Running the complete code - the adapted for
loop to produce the images and then the magick
functions to compile the animation - results in the following: