r ggplot2 tidycensus

Adding city names to a ggplot map from another data frame


I'm following along with one of Julia Silge's awesome blog posts, and I want to modify one of the plots to include the major cities in Texas.

I'm using the maps package to get the lat/long coordinates of the top cities, and I want to add these as labels to a ggplot map. The map plots just fine, but it breaks once I add this layer: geom_text(color = 'black', data = cities_data, check_overlap = TRUE, aes(x = long, y = lat, label = fixed_name))

I'm also following another blog post where the author appears to add geom_text labels to a ggplot map from another data frame, so I'm confused about what's causing the error below:

Error in `geom_text()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 2nd layer.
Caused by error:
! object '.estimate' not found

Can someone help me add the main cities to the map, please? Here's the code I'm using:

# LOAD NECESSARY LIBRARIES
pacman::p_load(tidyverse,tidycensus,tidymodels,spatialsample,googleway,sf,maps) 

# IMPORT DROUGHT DATA
drought_raw <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-06-14/drought-fips.csv')

# PREPARE DROUGHT DATA: FILTER FOR TEXAS IN 2021 AND CALCULATE MEAN DSCI PER COUNTY
drought <- drought_raw %>%
  filter(State == "TX", lubridate::year(date) == 2021) %>%
  group_by(GEOID = FIPS) %>%
  # DSCI = Drought Severity and Coverage Index
  summarise(DSCI = mean(DSCI)) %>% 
  ungroup()

# GET MEDIAN INCOME DATA FROM THE ACS FOR TEXAS COUNTIES IN 2020
tx_median_rent <-
  get_acs(
    geography = "county",
    state = "TX",
    #https://walker-data.com/tidycensus/articles/basic-usage.html
    variables = "B19013_001", # Median household income
    year = 2020,
    geometry = TRUE  # Include county geometries for mapping
  )

# COMBINE DROUGHT AND MEDIAN INCOME DATA
drought_sf <- tx_median_rent %>% left_join(drought)

# VISUALIZE DROUGHT SEVERITY ACROSS TEXAS COUNTIES
drought_sf %>%
  ggplot(aes(fill = DSCI)) +
  geom_sf(alpha = 0.9, color = NA) +  # Use geom_sf to plot the county shapes
  scale_fill_viridis_c() # Use a viridis color scale

# VISUALIZE THE CORRELATION BETWEEN DROUGHT SCORE AND MEDIAN INCOME
drought_sf %>%
  ggplot(aes(DSCI, estimate)) +
  geom_point(size = 2, alpha = 0.8) +  # Scatter plot of DSCI vs. median income
  geom_smooth(method = "lm") +  # Add a linear regression line
  scale_y_continuous(labels = scales::dollar_format()) + # Format y-axis labels as dollars
  labs(x = "Drought score", y = "Median household income")

# CREATE SPATIAL CROSS-VALIDATION FOLDS
set.seed(123)  # Set seed for reproducibility
folds <- spatial_block_cv(drought_sf, v = 10)  # 10 spatial folds

# VISUALIZE THE SPATIAL FOLDS
folds
autoplot(folds) 
autoplot(folds$splits[[1]]) 

# BUILD AND EVALUATE A LINEAR REGRESSION MODEL
drought_res <-
  workflow(estimate ~ DSCI, linear_reg()) %>% # Define workflow with linear regression
  fit_resamples(folds, control = control_resamples(save_pred = TRUE)) # Fit model with spatial CV

# COLLECT MODEL PREDICTIONS
drought_res 
collect_predictions(drought_res) 

# CALCULATE ROOT MEAN SQUARED ERROR (RMSE) BY COUNTY
drought_rmse <-
  drought_sf %>%
  mutate(.row = row_number()) %>%
  left_join(collect_predictions(drought_res)) %>%  # Join predictions to original data
  group_by(GEOID) %>%
  rmse(estimate, .pred) %>%  # Calculate RMSE
  select(GEOID, .estimate)  # Select GEOID and RMSE estimate

cities_data <- us.cities %>% 
  filter(country.etc == 'TX' & pop > 500000)  %>% 
  mutate(fixed_name = str_replace(name,country.etc,''))

drought_sf %>%
  left_join(drought_rmse, by = 'GEOID') %>%
  ggplot(
    aes(fill = .estimate)
    ) +
  geom_sf(color = NA, alpha = 0.8) +
  labs(fill = "RMSE") +
  scale_fill_viridis_c(labels = scales::dollar_format()) +
  geom_text(color = 'black', data = cities_data, check_overlap = TRUE, aes(x = long, y = lat, label = fixed_name))

Solution

  • When you add geom_text(), it inherits every aesthetic you specified inside ggplot(). Because the text layer uses a different data frame (cities_data), which has no .estimate column, the inherited fill = .estimate mapping fails with "object '.estimate' not found". Move aes(fill = .estimate) from ggplot() into geom_sf() so that it applies only to the county layer.

    drought_sf %>%
      left_join(drought_rmse, by = 'GEOID') %>%
      ggplot() +
      geom_sf(color = NA, alpha = 0.8, aes(fill = .estimate)) +
      labs(fill = "RMSE") +
      scale_fill_viridis_c(labels = scales::dollar_format()) +
      geom_text(
        data = cities_data, 
        aes(x = long, y = lat, label = fixed_name), 
        color = 'black', 
        check_overlap = TRUE
      )
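The inheritance behaviour described above can be reproduced without the census data. The following is a minimal sketch using only ggplot2; the data frames `regions` and `labels_df` and the column `rmse_value` are hypothetical stand-ins for the county data, `cities_data`, and `.estimate`. It also shows a second option, `inherit.aes = FALSE`, which should work if you prefer to keep the mapping inside `ggplot()`.

```r
library(ggplot2)

# Hypothetical toy data: `regions` stands in for the county data frame,
# `labels_df` stands in for cities_data and has no `rmse_value` column.
regions <- data.frame(x = c(1, 2, 3), y = c(1, 2, 3), rmse_value = c(10, 20, 30))
labels_df <- data.frame(x = c(1.5, 2.5), y = c(1.5, 2.5), name = c("A", "B"))

# Problem: the global mapping colour = rmse_value is inherited by geom_text(),
# whose data has no such column, so building the plot raises an error.
p_bad <- ggplot(regions, aes(x, y, colour = rmse_value)) +
  geom_point() +
  geom_text(data = labels_df, aes(label = name))
bad_result <- tryCatch(ggplot_build(p_bad), error = function(e) e)

# Fix 1 (the approach in this answer): map the aesthetic only in the layer
# that needs it.
p_fix1 <- ggplot(regions, aes(x, y)) +
  geom_point(aes(colour = rmse_value)) +
  geom_text(data = labels_df, aes(label = name))
fix1_result <- tryCatch(ggplot_build(p_fix1), error = function(e) e)

# Fix 2 (alternative): keep the global mapping, but tell the text layer not
# to inherit it and give that layer its own complete set of aesthetics.
p_fix2 <- ggplot(regions, aes(x, y, colour = rmse_value)) +
  geom_point() +
  geom_text(
    data = labels_df,
    aes(x = x, y = y, label = name),
    inherit.aes = FALSE
  )
fix2_result <- tryCatch(ggplot_build(p_fix2), error = function(e) e)

# Check which of the three plots failed to build.
c(
  bad  = inherits(bad_result, "error"),
  fix1 = inherits(fix1_result, "error"),
  fix2 = inherits(fix2_result, "error")
)
```

Applied to the map in the question, the second option would mean leaving `aes(fill = .estimate)` inside `ggplot()` and adding `inherit.aes = FALSE` to the `geom_text()` call, which already supplies its own `x`, `y`, and `label`.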
    
