Coronaviruses are a family of viruses that affect the respiratory system (SARS being among the best known and deadliest). They are zoonotic, meaning they can infect and be transmitted between humans and animals. The most recent strain, 2019-nCoV, continues to consume the news cycle given its potential to spread quickly and the uncertainty that comes with a virus not previously identified in humans.
I will not pretend to be an expert on the health risks, strain sequencing or the external economic risks it may impose. But part of understanding a complex issue with the potential for geographic spread is being able to quickly detect geographic transference points, hot spots or emerging contagion.
Others have built far more sophisticated tools and models for tracking/forecasting such as:
For our quick map, we’ll source recent data from the Johns Hopkins GitHub repository. There are certainly more dynamic ways to do this, but sometimes a quick visualization is all we want.
Pull in the Data:
We’ll use the Johns Hopkins data available here. We could simply use the pre-geocoded data, but we’ll take the hard path to add another tool to our mapping toolbelt: we’ll use the Google API (through ggmap’s mutate_geocode) to geocode the locations, that is, look up coordinates from their names.
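# Packages: tidyverse for read_csv/mutate/%>%, ggmap for mutate_geocode
library(tidyverse)
library(ggmap)
# ggmap needs a Google API key registered before geocoding
# (GOOGLE_API_KEY is an assumed environment variable name)
register_google(key = Sys.getenv("GOOGLE_API_KEY"))
# Link to raw data
case_url <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/daily_case_updates/02-13-2020_2115.csv"
# The column renames
case_col_nms <- c("prov_state", "cntry_region", "dt_updated", "confirmed", "death", "recovered")
# Load data **note last step** which passes the "state,country" string to Google's API
# ****NOTE***** there is a cost based on usage
d_case <- read_csv(case_url, col_names = case_col_nms, skip = 1) %>%
  mutate(row_id = row_number()) %>%
  # Blank out missing provinces, then combine province and country into one lookup string
  mutate(prov_state = ifelse(is.na(prov_state), "", prov_state),
         cbin_st_region = paste(prov_state, cntry_region, sep = ",")) %>%
  # Adds lon and lat columns, one API call per row
  mutate_geocode(cbin_st_region)
# Create a smaller dataset with just the relevant information to write out to csv to store
# d_geo_out <- d_case %>%
#   select(row_id, cbin_st_region, lon, lat)
# OPTIONAL (d_path is assumed to be an output directory path defined elsewhere)
# write_csv(d_geo_out, paste0(d_path, "geo_point_wid.csv"))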
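Because each geocoding call is billed, it can be worth reusing the stored coordinates on later runs rather than calling the API again. Below is a minimal sketch of that idea, assuming the optional CSV was written with the columns row_id, cbin_st_region, lon and lat. The small tibbles, their placeholder coordinates and counts, and the temporary file path are illustrative stand-ins, not data from the Johns Hopkins files.

```r
library(dplyr)
library(readr)
library(tibble)

# Stand-in for the geocoded lookup written out earlier (placeholder coordinates)
d_geo_out <- tibble(
  row_id = 1:2,
  cbin_st_region = c("Hubei,Mainland China", "Guangdong,Mainland China"),
  lon = c(112.3, 113.4),
  lat = c(31.0, 23.4)
)

# Hypothetical cache location; the post's version is paste0(d_path, "geo_point_wid.csv")
geo_cache <- file.path(tempdir(), "geo_point_wid.csv")
write_csv(d_geo_out, geo_cache)

# Stand-in for a later pull of case data that has not been geocoded yet
# (counts are placeholders, not reported figures)
d_case_new <- tibble(
  row_id = 1:2,
  cbin_st_region = c("Guangdong,Mainland China", "Hubei,Mainland China"),
  confirmed = c(10, 100)
)

# Reuse the stored coordinates instead of calling the API again
d_geo_cached <- read_csv(geo_cache, show_col_types = FALSE)
d_case_geo <- d_case_new %>%
  left_join(select(d_geo_cached, cbin_st_region, lon, lat), by = "cbin_st_region")
```

The join here is on the location string rather than row_id, since row order may differ between daily files. Any location missing from the cache comes back with NA coordinates, and only those rows would need to go to the API.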
Map Data
We’ll use leaflet for this plot, similar to previous posts. We’ll first create our popup text based on the variables aggregated by Johns Hopkins. Based on outbreak figures, we’ll set the view to China given the impact thus far (a bounding box coordinates tool helps with finding the coordinates).
Finally, we’ll add some character by using leaflet’s built-in tiles to get the dramatic black background coloring. A full list of map tile options for leaflet can be found here.
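A minimal sketch of these mapping steps, assuming the leaflet package and a geocoded data frame with the columns used earlier (lon, lat, confirmed, death, recovered). The sample rows, the center point and zoom level for China, the marker scaling and the CartoDB.DarkMatter provider tiles are all assumptions for illustration, not necessarily the choices behind the original map.

```r
library(leaflet)

# Stand-in for d_case after geocoding (placeholder values, not reported figures)
d_map <- data.frame(
  cbin_st_region = c("Hubei,Mainland China", "Guangdong,Mainland China"),
  lon = c(112.3, 113.4),
  lat = c(31.0, 23.4),
  confirmed = c(100, 10),
  death = c(5, 0),
  recovered = c(20, 1)
)

# Build the popup text (HTML) from the aggregated variables
d_map$popup_txt <- paste0(
  "<b>", d_map$cbin_st_region, "</b><br/>",
  "Confirmed: ", d_map$confirmed, "<br/>",
  "Deaths: ", d_map$death, "<br/>",
  "Recovered: ", d_map$recovered
)

m <- leaflet(d_map) %>%
  # Dark basemap from leaflet's built-in provider tiles
  addProviderTiles(providers$CartoDB.DarkMatter) %>%
  # Center roughly on China; coordinates and zoom are approximate assumptions
  setView(lng = 104, lat = 35, zoom = 4) %>%
  # One circle per location, sized by the square root of confirmed cases
  addCircleMarkers(
    lng = ~lon, lat = ~lat,
    radius = ~sqrt(confirmed),
    color = "red", stroke = FALSE, fillOpacity = 0.6,
    popup = ~popup_txt
  )

m
```

Scaling the radius by the square root keeps a single large outbreak from covering its neighbors. With real data, d_case would take the place of d_map.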
In a few lines of code we have a quick way to navigate geographic occurrences and outcomes by region. Viruses like these are very scary, given the lack of prior research available and the limitations on containment. Many teams and medical professionals are assuredly working around the clock. Hoping everyone remains safe and healthy!