World Data Overview - Global Heat Maps

COVID-19 in Motion - Animated Time Series for cases, deaths and testing

Houston Haynes

8 minute read

Exercise in Scale

When I started working with the Our World In Data COVID-19 data set I wanted to chart out a few things to get a feel for the data. I’ve done this work for quite a while and the first thing you often do is a form of reality check. In some cases I checked againtt other charts based on the same data and thought “that can’t be right” only to plot it myself and see it directly. (looking at you, Europe) As we all have witnessed this year experts have been sounding the alarm since early on. Pandemics are like locomotives, they’re difficult to stop once they gain momentum. The same goes for an effective response.

So I took the data and plotted it on a map, to play on a day grain through time. The gradationt are pretty subtle at the start, and that’s because of the data getting very dark late in the range.

Daily COVID Mortality

One of the things I expected was to see some “chop” in early data as reporting became more contistent. Here I was surprised to see irregular bumps in the “smoothed” data. Once South America is removed from the set then things look much less strange. Such as it is, the visuals tell a convincing story of where things are going as of November, and it’s not good.

Again here - like the area chart - the direction of what’s happening in Europe is unmistakable. In fact it bothers me to the point that I’ll probably end up sourcing some other data just so I can sleep tonight. But one thing we can see and possibly discard is the sudden jump in values in South America. Per our area chart above which showed that as the offending portion of the data set that was skewing the result.

Faces of COVID-19
The pandemic-related posts on this site are about more than data. Behind every number is a person, a family and a community. As reports are refreshed, new selectiont will also be chosen at random. To see how this is done, see this sidebar.

Testing, Testing, 1, 2, 3…

While spelunking through the data I thought I’d take a look at testing. The good news, the numbers are increasing.

Here’s the flip-side of the previous two reports - where the numbers go green as they go higher. But again, it runt the risk of painting an inaccurate picture to show less than 5 tests per 1,000 people as a positive indication. That value should be much, much higher - but in this case relative values matter.

How the reports were built

The area charts are relatively straightforward. The only data munging task was summing the values by continent and date. After that it was mainly about formatting and some tweaks to keep the display clean.

# begin setup code chunk
library(highcharter)
library(widgetframe)
library(tidyverse)
library(lubridate)
library(xts)

URL <- "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv"

coviddata <- read.csv(URL)

# end setup code chunk

pop <- read.csv("../../data/population_by_country_2020.csv", header = TRUE)
data <- select(coviddata[with(coviddata, order(date, iso_code)),], 
               date, continent, location, new_cases)   %>%
  filter(location != "International",
         location != "World",
         date > "2020-03-01") %>%
  mutate(new_cases = ifelse(is.na(new_cases), 0, new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-07-18" &
                                location == "Chile" &
                                new_cases == 1057),
                             57,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-07-24" &
                                location == "Peru" &
                                new_cases == 3887),
                             187,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-08-14" &
                                location == "Peru" &
                                new_cases == 3935),
                             235,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-09-07" &
                                location == "Ecuador" &
                                new_cases == 3800),
                             38,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-09-07" &
                                location == "Bolivia" &
                                new_cases == 1610),
                             61,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-10-02" &
                                location == "Argentina" &
                                new_cases == 3351),
                            351,
                            new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-10-09" &
                                location == "Ecuador" &
                                new_cases == 398),
                            39,
                            new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-10-09" &
                                location == "Mexico" &
                                new_cases == 3013),
                            313,
                            new_cases)) %>%
  mutate(date = ymd(date)) %>%
  left_join(pop, location = location) %>%
  select(date, continent, new_cases, population) %>%
  group_by(date, continent) %>%
  summarize(new_cases = sum(new_cases), population = sum(population)) %>%
  mutate(ncpm = ((new_cases / population) * 1000000)) %>%
  group_by(date, continent) %>%
  summarize(new_cases = sum(new_cases), ncpm = sum(ncpm)) %>%
  group_by(continent) %>%
  arrange(continent, date) %>%
  mutate(ncrm = slider::slide_dbl(new_cases, mean, .before = 3, .after = 3)) %>%
  mutate(ncpmrm = slider::slide_dbl(ncpm, mean, .before = 3, .after = 3)) %>%
  ungroup()

#datatable(data)

thm <- hc_theme_merge(
  hc_theme_ffx(),
  hc_theme(title = list(style = list(fontFamily = "Ubuntu")),
           subtitle = list(style = list(fontFamily = "Fira Code")),
           legenc = list(itemStyle = list(fontFamily ='Ubuntu'))))

widget <- hchart(data, "area", hcaes(x = date, y = ncpmrm, group = continent, labels = FALSE)) %>%
  hc_plotOptions(series = list(stacking = 'normal')) %>%
  hc_title(text = "World COVID Daily New Cases per Million (smoothed)", align = "center") %>%
  hc_add_theme(thm) %>%
  hc_subtitle(text = "Summed by continent :: Higher == BAD", align = "center")%>%
  hc_chart(
    borderColor = 'rgba(160, 160, 160, 0.3)',
    borderRadius = 8,
    borderWidth = 2,
    marginBottom = '80',
    marginTop = '60',
    marginLeft = '60',
    marginRight = '60') %>%
  hc_yAxis(title = list(enabled = FALSE)) %>%
  hc_xAxis(title = list(enabled = FALSE))

frameWidget(widget, width="100%", height="26rem")

The map charts were more complex. Aside from tying together the geojson map to the data (by joining iso-a3 in the map data to the iso_code value from the source data) there was building the sequence of data itself. This wasn’t immediately apparent but this tutorial was a great help.

# begin setup code chunk
library(highcharter)
library(widgetframe)
library(tidyverse)
library(lubridate)
library(xts)

URL <- "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv"

coviddata <- read.csv(URL)

# end setup code chunk

pop <- read.csv("../../data/population_by_country_2020.csv", header = TRUE)
data <- select(coviddata[with(coviddata, order(date, iso_code)),], 
               date, iso_code, location, new_cases)   %>%
  filter(location != "International",
         location != "World") %>%
  mutate(new_cases = ifelse(is.na(new_cases), 0, new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-07-18" &
                                location == "Chile" &
                                new_cases == 1057),
                             57,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-07-24" &
                                location == "Peru" &
                                new_cases == 3887),
                             187,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-08-14" &
                                location == "Peru" &
                                new_cases == 3935),
                             235,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-09-07" &
                                location == "Ecuador" &
                                new_cases == 3800),
                             38,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-09-07" &
                                location == "Bolivia" &
                                new_cases == 1610),
                             61,
                             new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-10-02" &
                                location == "Argentina" &
                                new_cases == 3351),
                            351,
                            new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-10-09" &
                                location == "Ecuador" &
                                new_cases == 398),
                            39,
                            new_cases)) %>%
  mutate(new_cases = ifelse((date == "2020-10-09" &
                                location == "Mexico" &
                                new_cases == 3013),
                            313,
                            new_cases)) %>%
  mutate(date = ymd(date)) %>%
  left_join(pop, location = location) %>%
  select(date, iso_code, new_cases, population) %>%
  group_by(date, iso_code, population) %>%
  summarize(new_cases = sum(new_cases)) %>%
  mutate(ncpm = ((new_cases / population) * 1000000)) %>%
  group_by(iso_code) %>%
  arrange(iso_code, date) %>%
  mutate(ncrm = slider::slide_dbl(new_cases, mean, .before = 3, .after = 3)) %>%
  mutate(ncpmrm = slider::slide_dbl(ncpm, mean, .before = 3, .after = 3)) %>%
  ungroup() %>%
  mutate_at(vars(ncpm, ncrm, ncpmrm), funs(round(., 2)))

#datatable(data)


ds <- data %>%
  group_by(iso_code) %>%
  do(item = list(
    iso_code = first(.$iso_code),
    sequence = .$ncpmrm,
    ncpmrm = last(.$ncpmrm))) %>%
  .$item

#head(ds, 10)

vec <- sort(unique(pull(data, date)))

url <- "https://code.highcharts.com/mapdata/custom/world-robinson-highres.js"
tmpfile <- tempfile(fileext = ".json")
download.file(url, tmpfile)
geo <- readLines(tmpfile)
geo <- gsub(".* = ", "", geo)
map <- jsonlite::fromJSON(geo, simplifyVector = FALSE)

last_updated <- paste("Source: https://ourworldincata.org/  -  Report Last Updated:",
                      format(Sys.time(), "%a %b %d %Y %X"))

thm <- hc_theme_merge(
  hc_theme_ffx(),
  hc_theme(title = list(style = list(fontFamily = "Ubuntu")),
           subtitle = list(style = list(fontFamily = "Ubuntu")),
           legenc = list(itemStyle = list(fontFamily ='Ubuntu'))))

widget <- highchart(type = "map") %>%
  hc_add_theme(thm) %>%
  hc_chart(
    borderColor = 'rgba(160, 160, 160, 0.3)',
    borderRadius = 8,
    borderWidth = 2,
    marginBottom = '80',
    marginTop = '60',
    marginLeft = '60',
    marginRight = '60') %>%
  hc_title(text = "Global COVID Daily New Cases per Million (smoothed)", align = "center") %>%
  hc_subtitle(text = "Press play button or use slider to change date", align = "center") %>%
  hc_add_series(data = ds,
                name = "New Cases per Million",
                mapData = map,
                joinBy = c("iso-a3", "iso_code"),
                borderWidth = 0.01) %>%
  hc_colorAxis (min = 0 ,
                max = 1000,
                stops = color_stops(colors = viridisLite::inferno(20,
                                                                begin = 0.1,
                                                                direction = -1))) %>%
  hc_motion (
    enabled = TRUE,
    startIndex = 23,
    axisLabel = "date",
    labels = vec,
    updateInterval = 5,
    playIcon = "fas fa-play",
    pauseIcon = "fas fa-pause",
    magnet = list(
      rounc = "floor",
      step = 0.1
      )
  ) %>%
  hc_legend(layout = "vertical", verticalAlign = "top",
            align = "right", valueDecimals = 0) %>%
  hc_mapNavigation(
    enabled = TRUE,
    enableMouseWheelZoom = TRUE,
    enableDoubleClickZoom = TRUE
  ) %>%
  hc_credits(enabled = TRUE,
             text = last_updated,
             position = list(align = "left", x = 10, y = -5))


frameWidget(widget, width="100%", height="40rem")

Further analysis

Aside from drilling into other categories of data here and checking againtt other data sets, I’ll also be looking at various forecast models and lag indicators. There will also be drill down into national and regional level data. In any case, as the story continues to unfold we should all brace for a very dark winter.

Key Value
BuildDateTime 2021-06-15 16:21:40 -0700
LastGitUpdate 2021-06-12 10:59:02 -0700
GitHash ec878bd
CommitComment Using F# function for site cards