
COVID-19 US Data - Inverse Correlation of Masks vs Cases
Cumulative, population-adjusted case data versus public mask percentage
Contrast and Compare
Counties where more people report always wearing masks tend to have fewer COVID-19 cases per capita. It’s that simple, and it couldn’t be clearer in the data.
I started looking into US data using the latest cumulative counts of probable and confirmed COVID-19 cases and deaths compiled by the New York Times. Beyond the split between “probable” and “confirmed” cases (the latter backed by a positive laboratory test), I also wanted to look at the survey data collected on public mask wearing.
I had suspected that there would be an inverse correlation, but I didn’t expect it to show up this quickly or to be such a clear, stark result. I still have quite a bit of refinement work to do on presenting the report. I also want to start looking at time series data so that I can use a weighted average matching the period when the mask-wearing surveys were conducted. Right now I’m simply taking the full cumulative count, but as mask guidelines change and compliance shifts, I want to be sure I also evaluate changes in the case rate.
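As a rough illustration of how the inverse correlation can be measured, here is a minimal sketch in base R. The four-county data frames and their values are made up for illustration; the column names `always_pct` and `ncpht` mirror the ones used in the report code, and the choice of Pearson and Spearman coefficients is an assumption rather than a description of how the charts were built.

```r
# Toy county-level inputs; the values are made up for illustration only.
mask_toy <- data.frame(
  fips = c("01001", "01003", "01005", "01007"),
  always_pct = c(44.4, 43.6, 49.1, 38.6)
)
case_toy <- data.frame(
  fips = c("01001", "01003", "01005", "01007"),
  cases = c(4190, 13601, 1514, 1834),
  population = c(55869, 223234, 24686, 22394)
)

# Join on the zero-padded FIPS code and normalize cases per 100,000 residents.
combined <- merge(mask_toy, case_toy, by = "fips")
combined$ncpht <- combined$cases / combined$population * 100000

# Negative coefficients indicate an inverse relationship.
pearson_r <- cor(combined$always_pct, combined$ncpht)
spearman_rho <- cor(combined$always_pct, combined$ncpht, method = "spearman")
print(c(pearson = pearson_r, spearman = spearman_rho))
```

The Spearman coefficient only uses the rank order of the counties, so it is less sensitive to a handful of counties with extreme case rates.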
The Charts
After some edits, including surfacing state labels in the tooltips (which pop up when you hover over any county in the chart), I re-rendered the reports here for more detailed review. Like the other reports that pull data from OWID and JHU, these will continue to shift as new cumulative case numbers are added. Likewise, when the mask-wearing surveys are updated, there will be new data there as well. Once I have a better view of the full COVID time series data set, I’ll shift to a weighted average over the past few weeks or month, or match the case counts to the survey period. For now, this provides a valuable, if slowly moving, snapshot.
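Matching case counts to a survey period could look something like the sketch below. It assumes a cumulative time series with one row per county per date, and takes the difference between the largest and smallest cumulative count inside the window. The data values are made up, and `window_start` and `window_end` are hypothetical placeholders, not the actual survey dates.

```r
# Toy cumulative time series for two counties (made-up values).
ts_toy <- data.frame(
  date = as.Date(rep(c("2020-07-01", "2020-07-08", "2020-07-15"), times = 2)),
  fips = rep(c("01001", "01003"), each = 3),
  cases = c(500, 560, 650, 800, 830, 900)
)

# Hypothetical window; replace with the real survey period.
window_start <- as.Date("2020-07-01")
window_end <- as.Date("2020-07-15")

# Keep only the rows that fall inside the window.
in_window <- ts_toy[ts_toy$date >= window_start & ts_toy$date <= window_end, ]

# New cases in the window = last cumulative count minus first cumulative count.
# This assumes the cumulative counts never decrease within the window.
window_cases <- aggregate(cases ~ fips, data = in_window,
                          FUN = function(x) max(x) - min(x))
names(window_cases)[2] <- "new_cases"
print(window_cases)
```

The resulting `new_cases` column could then be divided by population and joined to the mask data in the same way as the cumulative counts.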
Exceptions Worth Noting
It should be noted that the NYT found a marginal gap between the observed level of mask wearing and the answers reported in the survey.
Researchers who hand-counted Wisconsin grocery shoppers in May and June found about 40 percent of shoppers wore masks, a level that is lower than the 45 percent who said they always wore masks in the recent Dynata sample (another 24 percent said they frequently wore masks).
They noted that the gap didn’t affect the result in aggregate. It bears further scrutiny, though: as public officials place more emphasis on masks, the “virtue signal” of answering questionnaires and surveys may widen the gap with actual use.
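One way to see why a gap like this might matter little for a correlation: if respondents overstated mask use by roughly the same amount everywhere, the Pearson correlation would not move at all. The sketch below uses made-up values and a hypothetical uniform `gap` of 5 percentage points. The uniformity is an assumption; a gap that differs from county to county could change the result.

```r
# Toy values (made up): self-reported "always" percentages and case rates.
reported_pct <- c(45, 60, 30, 75, 52)
cases_per_100k <- c(7200, 5100, 8900, 4300, 6400)

# Hypothetical uniform over-reporting gap, in percentage points.
gap <- 5
adjusted_pct <- reported_pct - gap

r_reported <- cor(reported_pct, cases_per_100k)
r_adjusted <- cor(adjusted_pct, cases_per_100k)

# Subtracting a constant from one variable leaves the Pearson correlation unchanged.
print(isTRUE(all.equal(r_reported, r_adjusted)))
```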
MARTIN FOLEY, 63 of Fremont, New Hampshire, died of COVID on Dec. 11.
"Marty was the best husband, father, papa, and friend this world has ever seen. He had a one-of-a-kind sense of humor and could light up a room the minute he walked in." https://t.co/diiVQbAvSK pic.twitter.com/UJpHScpL2c
- FacesOfCOVID (@FacesOfCOVID), December 27, 2020
WAYNE EDWARDS, 61 of Queens, New York died of COVID on April 1.
"He was a stocker and went all through the hospital to stock rooms with supplies, then he handed out masks and that’s how he got exposed...I miss him so much." https://t.co/TKKresGqEu pic.twitter.com/HzkWH8580O
- FacesOfCOVID (@FacesOfCOVID), January 6, 2021
KEITH JACOBS, 64 of Stoughton, Massachusetts died of COVID on April 14.
"He was a talented photographer who told everyone to 'have a picture perfect day.' He was as caring, generous, and empathetic person you will find."
Submitted to @FacesOfCOVID by his son. pic.twitter.com/4NdYniwxGt
- FacesOfCOVID (@FacesOfCOVID), December 2, 2020
The code behind the reports
# begin setup code chunk
library(tidyverse)
library(highcharter)
library(widgetframe)
library(lubridate)

# County reference table; zero-pad FIPS codes to five characters
counties <- read.csv("../../data/us_county.csv", header = TRUE)
counties <-
  counties %>%
  mutate(fips = str_pad(fips, 5, side = 'left', pad = '0'))
county_pop <- select(counties, fips, state, population)

# NYT mask-use survey estimates by county
mask_URL <-
  "https://raw.githubusercontent.com/nytimes/covid-19-data/master/mask-use/mask-use-by-county.csv"
mask_data <- read.csv(mask_URL)
mask_data <-
  mask_data %>%
  mutate(COUNTYFP = str_pad(COUNTYFP, 5, side = 'left', pad = '0')) %>%
  select(COUNTYFP, NEVER, RARELY, SOMETIMES, FREQUENTLY, ALWAYS) %>%
  mutate(always_pct = (ALWAYS * 100)) %>%  # share (0-1) to percentage
  rename(fips = COUNTYFP) %>%
  left_join(county_pop, by = "fips") %>%   # join key must be passed via `by`
  rename(location = state)

last_updated <- paste("Source: New York Times - Report Last Updated:",
                      format(Sys.time(), "%a %b %d %Y %X"))
# end setup code chunk
# County choropleth of the share of respondents who always wear masks
widget <- hcmap("countries/us/us-all-all",
                data = mask_data,
                value = "always_pct",
                name = "Percentage that always wears masks in public",
                joinBy = "fips") %>%
  hc_colorAxis(minColor = "white", maxColor = "#32644F") %>%
  hc_title(text = "Percentage of US Respondents Claiming to Always Wear Masks in Public",
           align = "center") %>%
  hc_chart(
    borderColor = 'rgba(160, 160, 160, 0.3)',
    borderRadius = 8,
    borderWidth = 2) %>%
  hc_tooltip(pointFormat = "State: {point.location}<br>{point.name} County: {point.value}%") %>%
  hc_legend(layout = "vertical", verticalAlign = "top",
            align = "right", valueDecimals = 0) %>%
  hc_credits(enabled = TRUE,
             text = last_updated,
             position = list(align = "left", x = 10, y = -5))
frameWidget(widget, height = "100%", width = "40rem")
# begin setup code chunk
library(tidyverse)
library(highcharter)
library(widgetframe)
library(lubridate)

# County reference table; zero-pad FIPS codes to five characters
counties <- read.csv("../../data/us_county.csv", header = TRUE)
counties <-
  counties %>%
  mutate(fips = str_pad(fips, 5, side = 'left', pad = '0'))
county_pop <- select(counties, fips, state, population)

# NYT live county-level cumulative cases and deaths
US_COVID_data <-
  "https://raw.githubusercontent.com/nytimes/covid-19-data/master/live/us-counties.csv"
us_data <- read.csv(US_COVID_data)
us_data_map <-
  us_data %>%
  filter(!is.na(fips), fips != "") %>%  # drop rows without a county FIPS code
  mutate(date = ymd(date)) %>%
  mutate(fips = str_pad(fips, 5, side = 'left', pad = '0')) %>%
  select(date, fips, state, county, cases, deaths) %>%
  left_join(county_pop, by = c("fips", "state")) %>%  # both tables share fips and state
  arrange(fips, date) %>%
  mutate(ncpht = ((cases / population) * 100000)) %>%   # cases per 100,000 residents
  mutate(ndpht = ((deaths / population) * 100000)) %>%  # deaths per 100,000 residents
  rename(location = state)

last_updated <- paste("Source: New York Times - Report Last Updated:",
                      format(Sys.time(), "%a %b %d %Y %X"))
# end setup code chunk
# County choropleth of cumulative cases per 100,000 residents (ncpht)
widget <- hcmap("countries/us/us-all-all",
                data = us_data_map,
                value = "ncpht",
                name = "Cumulative cases per 100,000 residents",
                joinBy = "fips") %>%
  hc_colorAxis(minColor = "white", maxColor = "#32644F") %>%
  hc_title(text = "Cumulative COVID-19 Cases per 100,000 Residents by US County",
           align = "center") %>%
  hc_chart(
    borderColor = 'rgba(160, 160, 160, 0.3)',
    borderRadius = 8,
    borderWidth = 2) %>%
  hc_tooltip(pointFormat = "State: {point.location}<br>{point.name} County: {point.value:.0f} cases per 100,000") %>%
  hc_legend(layout = "vertical", verticalAlign = "top",
            align = "right", valueDecimals = 0) %>%
  hc_credits(enabled = TRUE,
             text = last_updated,
             position = list(align = "left", x = 10, y = -5))
frameWidget(widget, height = "100%", width = "40rem")
Further analysis
As I mentioned above, the focus will eventually shift to the time series data rather than the latest cumulative data files, and there will be more detailed reporting as well. Right now the data looks bleak, but there is a silver lining here: following simple guidelines and hygiene protocols can limit the spread of infection and save lives.