pythonsqlalchemy

When do I need to use sqlalchemy back_populates?


When I try SQLAlchemy Relation Example following this guide: Basic Relationship Patterns

I have this code

#!/usr/bin/env python
# encoding: utf-8
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine('sqlite:///:memory:', echo=True)
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base(bind=engine)

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship("Parent")

Base.metadata.create_all()

p = Parent()
session.add(p)
session.commit()
c = Child(parent_id=p.id)
session.add(c)
session.commit()
print "children: {}".format(p.children[0].id)
print "parent: {}".format(c.parent.id)

It works well, but in the guide, it says the model should be:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship("Parent", back_populates="children")

Why don't I need back_populates or backref in my example? When should I use one or the other?


Solution

  • If you use backref you don't need to declare the relationship on the second table.

    class Parent(Base):
        __tablename__ = 'parent'
        id = Column(Integer, primary_key=True)
        children = relationship("Child", backref="parent")
    
    class Child(Base):
        __tablename__ = 'child'
        id = Column(Integer, primary_key=True)
        parent_id = Column(Integer, ForeignKey('parent.id'))
    

    If you're not using backref, and defining the relationship's separately, then if you don't use back_populates, sqlalchemy won't know to connect the relationships, so that modifying one also modifies the other.

    So, in your example, where you've defined the relationship's separately, but didn't provide a back_populates argument, modifying one field wouldn't automatically update the other in your transaction.

    >>> parent = Parent()
    >>> child = Child()
    >>> child.parent = parent
    >>> print(parent.children)
    []
    

    See how it didn't automatically fill out the children field?

    Now, if you supply a back_populates argument, sqlalchemy will connect the fields.

    class Parent(Base):
        __tablename__ = 'parent'
        id = Column(Integer, primary_key=True)
        children = relationship("Child", back_populates="parent")
    
    class Child(Base):
        __tablename__ = 'child'
        id = Column(Integer, primary_key=True)
        parent_id = Column(Integer, ForeignKey('parent.id'))
        parent = relationship("Parent", back_populates="children")
    

    So now we get

    >>> parent = Parent()
    >>> child = Child()
    >>> child.parent = parent
    >>> print(parent.children)
    [Child(...)]
    

    Sqlalchemy knows these two fields are related now, and will update each as the other is updated. It's worth noting that using backref will do this, too. Using back_populates is nice if you want to define the relationships on every class, so it's easy to see all the fields just be glancing at the model class, instead of having to look at other classes that define fields via backref.