pythonpostgresqlfastapipsycopg2multi-tenant

FastAPI Middleware for Postgres Multi-Tenant Schema Switching Causes Race Conditions with Concurrent Requests


I'm building a multi-tenant FastAPI application that uses PostgreSQL schemas to separate tenant data. I have a middleware that extracts an X-Tenant-ID header, looks up the tenant's schema, and then switches the current schema for the database session accordingly. For a single request (via Postman) the middleware works fine; however, when sending multiple requests concurrently, I sometimes get errors such as:

It appears that the DB connection is closing prematurely or reverting to the public schema too soon, so tenant-specific tables are not found.

Below are the relevant code snippets:


Middleware (SchemaSwitchMiddleware)

from typing import Optional, Callable
from fastapi import Request, Response
from fastapi.responses import JSONResponse
from starlette.middleware.base import BaseHTTPMiddleware
from app.db.session import SessionLocal, switch_schema
from app.repositories.tenant_repository import TenantRepository
from app.core.logger import logger
from contextvars import ContextVar


current_schema: ContextVar[str] = ContextVar("current_schema", default="public")

class SchemaSwitchMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next: Callable) -> Response:
        """
        Middleware to dynamically switch the schema based on the `X-Tenant-ID` header.
        If no header is present, defaults to `public` schema.
        """
        db = SessionLocal()  # Create a session here
        try:
            tenant_id: Optional[str] = request.headers.get("X-Tenant-ID")

            if tenant_id:
                try:
                    tenant_repo = TenantRepository(db)
                    tenant = tenant_repo.get_tenant_by_id(tenant_id)

                    if tenant:
                        schema_name = tenant.schema_name
                    else:
                        logger.warning("Invalid Tenant ID received in request headers")
                        return JSONResponse(
                            {"detail": "Invalid access"},
                            status_code=400
                        )
                except Exception as e:
                    logger.error(f"Error fetching tenant: {e}. Defaulting to public schema.")
                    db.rollback()
                    schema_name = "public"
            else:
                schema_name = "public"

            current_schema.set(schema_name)
            switch_schema(db, schema_name)
            request.state.db = db  # Store the session in request state

            response = await call_next(request)
            return response

        except Exception as e:
            logger.error(f"SchemaSwitchMiddleware error: {str(e)}")
            db.rollback()
            return JSONResponse({"detail": "Internal Server Error"}, status_code=500)

        finally:
            switch_schema(db, "public")  # Always revert to public
            db.close()


Database Session (app/db/session.py)

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker, declarative_base, Session
from app.core.logger import logger
from app.core.config import settings

# Base for models
Base = declarative_base()

DATABASE_URL = settings.DATABASE_URL

# SQLAlchemy engine
engine = create_engine(
    DATABASE_URL,
    pool_pre_ping=True,
    pool_size=20,
    max_overflow=30,
)

# Session factory
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

def switch_schema(db: Session, schema_name: str):
    """Helper function to switch the search_path to the desired schema."""
    db.execute(text(f"SET search_path TO {schema_name}"))
    db.commit()
    # logger.debug(f"Switched schema to: {schema_name}")


Example tables

Public Schema: Contains tables like users, roles, tenants, and user_lookup.

Tenant Schema: Contains tables like users, roles, buildings, and floors.

When I test with a single request, everything works fine. However, with concurrent requests, the switching sometimes reverts to the public schema too early, resulting in errors because tenant-specific tables are missing.

Question

  1. What could be causing the race condition where the connection’s schema gets switched back to public during concurrent requests?
  2. How can I ensure that each request correctly maintains its tenant schema throughout the request lifecycle without interference from concurrent requests?
  3. Is there a better approach (such as using middleware or context variables) to avoid this issue?

any help on this is much apricated. Thankyou


Solution

  • I also implemented a multi-tenant FastAPI application using PostgreSQL via schemas.

    In my case, I avoided using middleware because the database session (obtained from SessionLocal) and its state need to be isolated per request.

    When using middleware, the connection (and its state) from the connection pool may be reused across requests. Even though ContextVar is designed to be isolated in asynchronous contexts, the actual database connection can still be shared, leading to race conditions. For example, if one request changes the schema and the connection is then reused for another request, that new request might unexpectedly start with the wrong schema (like reverting to "public").

    Instead, I handle the tenant schema switching in a dependency (using Depends). This way, each request gets its own session, and we can safely set the schema for that specific request without affecting others.

    Below is an example implementation using a synchronous SQLAlchemy Session

    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import sessionmaker, declarative_base, Session
    from app.core.logger import logger
    from app.core.config import settings
    from typing import Annotated
    from fastapi import Header
    
    # Base for models
    Base = declarative_base()
    
    DATABASE_URL = settings.DATABASE_URL
    
    # SQLAlchemy engine
    engine = create_engine(
        DATABASE_URL,
        pool_pre_ping=True,
        pool_size=20,
        max_overflow=30,
    )
    
    # Session factory
    SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
    
    # TODO: use this get_db_session function in path operation.
    def get_db_session(tenant_id: Annotated[str, Header(alias="X-Tenant-ID")]) -> Generator[Session, None, None]:
        session = SessionLocal()
        try:
            # TODO: Implement tenant_id to tenant_schema here
            session.execute(text(f"SET search_path TO {tenant_id};"))
            session.commit()  # Ensure the schema change is applied immediately
            yield session
        finally:
            session.close()