A Quick Map of Age of Structures in the U.S.
The U.S. Census Bureau collects a wide range of household, housing stock, and population data. One measure it updates every year through the American Community Survey (ACS) is the median year in which housing units in the United States were built.
We will build a quick map to compare the median year built across states, using the 2020 ACS 5-year survey.
Read in Data with Tidycensus
The U.S. Census table code for median year of structure built is B25035_001. We also note the codes for the distribution of units by year built (table B25034), which a future analysis could use to plot the proportions of housing by year built at a more granular level than state.
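# Load packages: tidycensus and tigris for Census data, sf for spatial
# operations, tidyverse for dplyr, purrr and ggplot2
# (get_acs() needs a Census API key; see tidycensus::census_api_key())
library(tidyverse)
library(sf)
library(tigris)
library(tidycensus)
# Census code for reading in median structure year
med_yr_structure <- "B25035_001"
# We will NOT use these, but they are the total count of housing units
# and the counts by year-built cohort
# tot_built <- "B25034_001"
# built_age_cohorts <- sprintf("B25034_%s", append_cohort)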
# Set coordinate system for projecting our map
# See crsuggest package for alternatives
set_crs <- 2163
# Get National Median
g_natl <- tidycensus::get_acs(geography = "us",
variables = med_yr_structure,
year = 2020,
survey = "acs5",
geometry = FALSE)
# Read in State Boundaries with additional detail
# Remove DC and Territories
g_state <- tigris::states()%>%
filter(!STUSPS %in% c("VI", "MP", "GU", "PR", "AS", "DC"))%>%
shift_geometry()%>%
st_transform(set_crs)
# Read in Median Year of Structure Built by State for ACS-5, 2020 Survey
# Include Geometry for Plot
g_yr_built <- tidycensus::get_acs(geography = "state",
variables = med_yr_structure,
year = 2020,
survey = "acs5",
geometry = TRUE)%>%
shift_geometry()%>%
st_transform(set_crs)
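The commented-out cohort codes above rely on an `append_cohort` vector that is never defined. As a minimal sketch, assuming the year-built distribution table (B25034) uses zero-padded suffixes 002 through 011 for its cohorts in the 2020 ACS 5-year data, the codes could be generated as follows. The suffix range is an assumption; the variable list returned by `tidycensus::load_variables(2020, "acs5")` is the place to confirm it.

```r
# Hypothetical suffix range for the year-built cohorts (an assumption;
# confirm against tidycensus::load_variables(2020, "acs5"))
append_cohort <- sprintf("%03d", 2:11)

# Total housing units, then one code per year-built cohort
tot_built <- "B25034_001"
built_age_cohorts <- sprintf("B25034_%s", append_cohort)

print(built_age_cohorts)
```

Passing `built_age_cohorts` as the `variables` argument of `get_acs()` would then return one row per geography and cohort, which can be divided by the `tot_built` estimate to get proportions.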
Setting Up Plot
Before building the map, we need a few tweaks so the labels showing each state's median year built are legible. In particular, labels for the small states in the Northeast are shifted away from their states and connected back to them with line segments. We also leave the District of Columbia unlabeled, since it is too small to be visible at this scale.
# Create cuts for categorizing and building colors
start_seq <- min(g_yr_built$estimate)
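# Bins are left-closed (cut(..., right = FALSE) below), so each finite break
# is the first year of its bin and must match the first year in its label
seq_built <- c(-Inf, 1950, 1960, 1965, 1970, 1975, 1980, 1985, 1990, +Inf)
lab_built <- c("Before 1950","1950-59","1960-64", "1965-69",
"1970-74", "1975-79", "1980-84", "1985-89", "1990+")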
# Create a way to filter out states that need adjustments
ne_state_shift <- c("CT","DE","RI", "MD", "MA","NH", "NJ", "VT")
# Create a full list of centroids of each state for our labels
# We remove geometry because we use coords x and y
d_centroid <- g_yr_built[,c("GEOID", "NAME", "estimate")]%>%
mutate(centroid = st_centroid(geometry),
x = unlist(map(centroid, 1)),
y = unlist(map(centroid, 2)))%>%
st_drop_geometry(.)%>%
inner_join(., g_state%>%
select(STUSPS, NAME)%>%
st_drop_geometry(), by = c("NAME"="NAME"))%>%
select(fips2 = GEOID, state_name = NAME, state_abb = STUSPS, x, y, estimate)
# Extract northeast states
# Adjust Northeast states further east (x plus) and to the south (y minus)
# so geom_text labels are clean and legible
lab_repel <- d_centroid%>%
filter(state_abb %in% ne_state_shift)%>%
mutate(x_adj = case_when(
state_abb == "MA" ~ x + 300000,
state_abb == "MD" ~ x + 300000,
# state_abb == "DC" ~ x + 300000,
state_abb == "RI" ~ x + 200000,
state_abb == "DE" ~ x + 200000,
state_abb == "NJ" ~ x + 200000,
state_abb == "NH" ~ x + 200000,
state_abb == "VT" ~ x - 100000,
TRUE ~ x
),
y_adj = case_when(
state_abb == "VT" ~ y + 200000,
state_abb == "MD" ~ y - 175000,
state_abb == "CT" ~ y - 75000,
state_abb == "RI" ~ y - 55000,
# state_abb == "DC" ~ y - 55000,
TRUE ~ y
))
# Create separate object for remaining (non-Northeast) states
# Make minor position adjustments so geom_text labels are legible (MI, FL, LA, WV)
# Make Color coding so the estimates can be seen clearer with the fill background
lab_centroid <- d_centroid%>%
filter(!state_abb %in% ne_state_shift)%>%
mutate(x = case_when(
state_abb == "MI" ~ x + 75000,
state_abb == "FL" ~ x + 75000,
state_abb == "LA" ~ x - 45000,
state_abb == "WV" ~ x - 25000,
TRUE ~ x
),
y = case_when(
state_abb == "LA" ~ y - 35000,
state_abb == "WV" ~ y - 35000,
state_abb == "MI" ~ y - 100000,
TRUE ~ y
),
color_txt = case_when(
state_abb == "NY" | state_abb == "HI" ~ "B",
TRUE ~ "W"
))
m <- g_yr_built%>%
mutate(plot_me = cut(estimate, breaks = seq_built, labels = lab_built, right = F))%>%
ggplot() +
geom_sf(aes(fill = plot_me), color = "white", lwd = 0.1) +
geom_text(data = lab_centroid, aes(x = x, y = y, label = estimate, color = color_txt),
size = 2.5, fontface = "bold") +
scale_color_manual(values = c("B" = "black", "W"= "white"),
guide = "none") +
geom_segment(data = lab_repel, aes(x = x, xend = x_adj-80000,
y = y, yend = y_adj-15000)) +
geom_text(data = lab_repel, aes(x = x_adj, y = y_adj, label = estimate),
size = 2.5, fontface = "bold") +
# Manually adjusted in-lieu of ggrepel package
# ggrepel::geom_label_repel(data = lab_repel,
# aes(x = x, y = y, label = estimate)) +
scale_fill_viridis_d("", option = "magma", direction = -1) +
theme_void()+
labs(title = "Median Year of Structure Built by State",
subtitle = "ACS 5-year, 2020",
x = NULL,
y = NULL)+
theme(legend.position = "top")
print(m)
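The binning step in the plot relies on `cut()` with `right = FALSE`, which makes every interval closed on the left and open on the right, so a break value belongs to the bin that starts at it. The following base R sketch uses illustrative breaks and a handful of made-up years (not ACS estimates) to show where boundary years land.

```r
# Illustrative breaks: each finite break is the first year of its bin
breaks <- c(-Inf, 1950, 1960, 1970, 1980, 1990, Inf)
labels <- c("Before 1950", "1950-59", "1960-69", "1970-79", "1980-89", "1990+")

# Made-up years chosen to sit on or next to bin boundaries
years <- c(1949, 1950, 1959, 1960, 1989, 1990)

# right = FALSE gives intervals of the form [lower, upper)
binned <- cut(years, breaks = breaks, labels = labels, right = FALSE)

print(data.frame(year = years, bin = binned))
```

A label such as "1970-74" therefore needs breaks at 1970 and 1975, not at 1974, for the label and the bin contents to agree.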
Quick Summary and Future Posts
Unsurprisingly, the Northeast has the oldest housing in the U.S.: the District of Columbia and New York have median year-built values of 1954 and 1957, respectively. Meanwhile, Nevada and Arizona have the newest housing, with median year-built values of 1994 and 1990, respectively.
g_yr_built%>%
arrange(estimate)%>%
select(State = NAME, `Year Built` = estimate)%>%
st_drop_geometry(.)%>%
head(10)%>%
knitr::kable(., caption = "Top 10: Oldest Median Year Structure Built by State")
State | Year Built |
---|---|
District of Columbia | 1954 |
New York | 1957 |
Massachusetts | 1961 |
Rhode Island | 1961 |
Pennsylvania | 1963 |
Connecticut | 1965 |
New Jersey | 1968 |
Ohio | 1968 |
Illinois | 1969 |
Iowa | 1970 |
g_yr_built%>%
arrange(desc(estimate))%>%
select(State = NAME, `Year Built` = estimate)%>%
st_drop_geometry(.)%>%
head(10)%>%
knitr::kable(., caption = "Top 10: Most Recent Median Year Structure Built by State")
State | Year Built |
---|---|
Nevada | 1994 |
Arizona | 1990 |
Georgia | 1989 |
Utah | 1989 |
South Carolina | 1989 |
Idaho | 1988 |
North Carolina | 1988 |
Texas | 1987 |
Florida | 1986 |
Delaware | 1985 |
Overall, the housing stock in the United States is fairly old: the national median structure was built in 1978, about 42 years before the 2020 survey. Since 1978, the U.S. has adopted a significant number of measures to improve the safety of structures and of the people living in them. For example, the Lead Disclosure Rule under Title X, signed into law in 1992, requires the disclosure of known lead-based paint hazards before the sale or lease of most housing built before 1978:
Congress passed the Residential Lead-Based Paint Hazard Reduction Act of 1992, also known as Title X, to protect families from exposure to lead from paint, dust, and soil. Section 1018 of this law directed HUD and EPA to require the disclosure of known information on lead-based paint and lead-based paint hazards before the sale or lease of most housing built before 1978.
Further detail is available from the Department of Housing and Urban Development (HUD).
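To make the arithmetic concrete, here is a small sketch that takes a few of the median year-built values reported above, computes the age of the median structure as of 2020, and flags whether that median falls before the 1978 cutoff. Using 2020 as the reference year is an assumption (the 5-year survey pools several years of responses), and the flag describes the median structure only, not every unit in an area.

```r
# Median year built for selected areas, copied from the tables and text above
yr_built <- data.frame(
  area = c("District of Columbia", "New York", "Nevada", "Arizona", "United States"),
  year_built = c(1954, 1957, 1994, 1990, 1978)
)

ref_year <- 2020      # assumed reference year for computing age
lead_cutoff <- 1978   # disclosure covers most housing built before this year

# Age of the median structure and whether the median predates the cutoff
yr_built$age <- ref_year - yr_built$year_built
yr_built$median_pre_1978 <- yr_built$year_built < lead_cutoff

print(yr_built)
```

A median exactly at 1978 means roughly half of the units in that area were built before the cutoff, which is why the unit-level distribution matters more than the median alone.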
In future posts, we’ll explore how this varies at a more granular geographic level. The age of structures should not be underestimated as a factor when federal, state, and local agencies consider demolishing vacant homes or rehabilitating the existing housing stock to make it more efficient and more resilient to climate change and natural disasters.