I'm following along with one of Julia Silge's awesome blog posts here, and I want to modify one of the plots to include the major cities in Texas.
I'm using the maps package to get the lat/long coordinates of the top cities and want to add these as labels to a ggplot map. The map plots just fine, but once I add geom_text(color = 'black', data = cities_data, check_overlap = TRUE, aes(x = long, y = lat, label = fixed_name)), the plot fails with an error.
I'm also following along with this blog, where the author appears to add geom_text labels to a ggplot map from another data frame, so I'm confused about what's causing this error:
Error in `geom_text()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 2nd layer.
Caused by error:
! object '.estimate' not found
Can someone help me add the main cities to the map, please? Here's the code I'm using:
# LOAD NECESSARY LIBRARIES
pacman::p_load(tidyverse,tidycensus,tidymodels,spatialsample,googleway,sf,maps)
# IMPORT DROUGHT DATA
drought_raw <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-06-14/drought-fips.csv')
# PREPARE DROUGHT DATA: FILTER FOR TEXAS IN 2021 AND CALCULATE MEAN DSCI PER COUNTY
drought <- drought_raw %>%
  filter(State == "TX", lubridate::year(date) == 2021) %>%
  group_by(GEOID = FIPS) %>%
  # DSCI = Drought Severity and Coverage Index
  summarise(DSCI = mean(DSCI)) %>%
  ungroup()
# GET MEDIAN INCOME DATA FROM THE ACS FOR TEXAS COUNTIES IN 2020
tx_median_rent <-
  get_acs(
    geography = "county",
    state = "TX",
    # https://walker-data.com/tidycensus/articles/basic-usage.html
    variables = "B19013_001", # Median household income
    year = 2020,
    geometry = TRUE # Include county geometries for mapping
  )
# COMBINE DROUGHT AND MEDIAN INCOME DATA
drought_sf <- tx_median_rent %>% left_join(drought)
# VISUALIZE DROUGHT SEVERITY ACROSS TEXAS COUNTIES
drought_sf %>%
  ggplot(aes(fill = DSCI)) +
  geom_sf(alpha = 0.9, color = NA) + # Use geom_sf to plot the county shapes
  scale_fill_viridis_c() # Use a viridis color scale
# VISUALIZE THE CORRELATION BETWEEN DROUGHT SCORE AND MEDIAN INCOME
drought_sf %>%
  ggplot(aes(DSCI, estimate)) +
  geom_point(size = 2, alpha = 0.8) + # Scatter plot of DSCI vs. median income
  geom_smooth(method = "lm") + # Add a linear regression line
  scale_y_continuous(labels = scales::dollar_format()) + # Format y-axis labels as dollars
  labs(x = "Drought score", y = "Median household income")
# CREATE SPATIAL CROSS-VALIDATION FOLDS
set.seed(123) # Set seed for reproducibility
folds <- spatial_block_cv(drought_sf, v = 10) # 10 spatial folds
# VISUALIZE THE SPATIAL FOLDS
folds
autoplot(folds)
autoplot(folds$splits[[1]])
# BUILD AND EVALUATE A LINEAR REGRESSION MODEL
drought_res <-
  workflow(estimate ~ DSCI, linear_reg()) %>% # Define workflow with linear regression
  fit_resamples(folds, control = control_resamples(save_pred = TRUE)) # Fit model with spatial CV
# INSPECT THE RESAMPLING RESULTS AND COLLECT MODEL PREDICTIONS
drought_res
collect_predictions(drought_res)
# CALCULATE ROOT MEAN SQUARED ERROR (RMSE) BY COUNTY
drought_rmse <-
  drought_sf %>%
  mutate(.row = row_number()) %>%
  left_join(collect_predictions(drought_res)) %>% # Join predictions to original data
  group_by(GEOID) %>%
  rmse(estimate, .pred) %>% # Calculate RMSE
  select(GEOID, .estimate) # Select GEOID and RMSE estimate
cities_data <- us.cities %>%
  filter(country.etc == 'TX' & pop > 500000) %>%
  mutate(fixed_name = str_replace(name,country.etc,''))
drought_sf %>%
  left_join(drought_rmse, by = 'GEOID') %>%
  ggplot(
    aes(fill = .estimate)
  ) +
  geom_sf(color = NA, alpha = 0.8) +
  labs(fill = "RMSE") +
  scale_fill_viridis_c(labels = scales::dollar_format()) +
  geom_text(color = 'black', data = cities_data, check_overlap = TRUE, aes(x = long, y = lat, label = fixed_name))
When you add geom_text, it inherits all of the aesthetics you specified inside ggplot, but because you are using different data for geom_text, it can't find .estimate. You can just move aes(fill = .estimate) from inside ggplot to inside geom_sf.
drought_sf %>%
  left_join(drought_rmse, by = 'GEOID') %>%
  ggplot() +
  geom_sf(color = NA, alpha = 0.8, aes(fill = .estimate)) +
  labs(fill = "RMSE") +
  scale_fill_viridis_c(labels = scales::dollar_format()) +
  geom_text(
    data = cities_data,
    aes(x = long, y = lat, label = fixed_name),
    color = 'black',
    check_overlap = TRUE
  )
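An alternative that should also work is to leave the global aesthetics where they are and tell the label layer not to inherit them, using the inherit.aes argument that ggplot2 layers accept. The sketch below is only an illustration on made-up data: tiles_df and labels_df are hypothetical stand-ins for the joined county data and cities_data, not objects from the question.

```r
library(ggplot2)

# Hypothetical stand-in for the main data: it has the variable mapped to fill
tiles_df <- data.frame(x = c(1, 2, 3), y = c(1, 2, 3), value = c(10, 20, 30))

# Hypothetical stand-in for the label data: it has no `value` column
labels_df <- data.frame(x = c(1.5, 2.5), y = c(1.5, 2.5), label = c("A", "B"))

p <- ggplot(tiles_df, aes(x = x, y = y, fill = value)) +
  geom_tile() +
  # inherit.aes = FALSE stops this layer from picking up fill = value,
  # so only the aesthetics given here are evaluated against labels_df
  geom_text(
    data = labels_df,
    aes(x = x, y = y, label = label),
    inherit.aes = FALSE
  )

p
```

Applied to the plot in the question, this would mean keeping aes(fill = .estimate) inside ggplot() and adding inherit.aes = FALSE to the geom_text() call. Moving the fill mapping into geom_sf is arguably tidier when only one layer uses it; inherit.aes = FALSE is handy when several layers share the global mapping and only the labels should opt out.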