I'm trying to get Marshmallow-SQLAlchemy to deserialize an object with a nested object without specifying the foreign key for the nested object (which should be the primary key of the parent object). Here's a standalone example:
# Python version == 3.8.2
from datetime import datetime
import re
# SQLAlchemy == 1.3.23
from sqlalchemy import func, create_engine, Column, ForeignKey, Text, DateTime
from sqlalchemy.ext.declarative import as_declarative, declared_attr
from sqlalchemy.orm import relationship, sessionmaker
# marshmallow==3.10.0
# marshmallow-sqlalchemy==0.24.2
from marshmallow import fields
from marshmallow.fields import Nested
from marshmallow_sqlalchemy import SQLAlchemyAutoSchema
################################################################################
# Set up
################################################################################
engine = create_engine("sqlite:///test.db")
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
################################################################################
# Models
################################################################################
@as_declarative()
class Base(object):
@declared_attr
def __tablename__(cls):
# From https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case
name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', cls.__name__)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', name).lower()
@declared_attr
def updated(cls):
return Column(DateTime, default=func.now(), onupdate=func.now(), nullable=False)
class Account(Base):
id = Column(Text, primary_key=True)
name = Column(Text, nullable=False)
tags = relationship("AccountTag", backref="account")
class AccountTag(Base):
account_id = Column(Text, ForeignKey('account.id'), primary_key=True)
Key = Column(Text, primary_key=True)
Value = Column(Text, nullable=False)
################################################################################
# Schemas
################################################################################
class AutoSchemaWithUpdate(SQLAlchemyAutoSchema):
class Meta:
load_instance = True
sqla_session = session
updated = fields.DateTime(default=lambda: datetime.now())
class AccountSchema(AutoSchemaWithUpdate):
class Meta:
model = Account
include_relationships = True
tags = Nested("AccountTagSchema", many=True)
class AccountTagSchema(AutoSchemaWithUpdate):
class Meta:
model = AccountTag
include_fk = True
################################################################################
# Test
################################################################################
Base.metadata.create_all(engine)
account_object = AccountSchema().load({
"id": "ABC1234567",
"name": "Account Name",
"tags": [
{
"Value": "Color",
"Key": "Blue"
}
]
})
session.merge(account_object)
session.commit()
And here's the error I'm getting:
Traceback (most recent call last):
File "example.py", line 88, in <module>
account_object = AccountSchema().load({
File "C:\python\site-packages\marshmallow_sqlalchemy\schema\load_instance_mixin.py", line 92, in load
return super().load(data, **kwargs)
File "C:\python\site-packages\marshmallow\schema.py", line 727, in load
return self._do_load(
File "C:\python\site-packages\marshmallow\schema.py", line 909, in _do_load
raise exc
marshmallow.exceptions.ValidationError: {'tags': {0: {'account_id': ['Missing data for required field.']}}}
I feel like I'm trying to do something intuitive but I'm not sure anymore. I'm sure that I'm close here but am having no luck getting this to work. Help is much appreciated.
You are getting the error because you've specified include_fk
in the Meta
class for AccountTagSchema
.
You can inspect the fields that have been generated for the schema:
print(AccountTagSchema._declared_fields["account_id"])
# <fields.String(default=<marshmallow.missing>, attribute=None, validate=[], required=True, load_only=False, dump_only=False, missing=<marshmallow.missing>, allow_none=False, error_messages={'required': 'Missing data for required field.', 'null': 'Field may not be null.', 'validator_failed': 'Invalid value.', 'invalid': 'Not a valid string.', 'invalid_utf8': 'Not a valid utf-8 string.'})>
Notice that it generates account_id
with required=True
, this is due to the sqlalchemy column that it represents being NOT NULL
as it is part of the primary key.
So the simplest thing is to remove include_fk
from the schema meta:
class AccountTagSchema(AutoSchemaWithUpdate):
class Meta(AutoSchemaWithUpdate.Meta):
model = AccountTag
# include_fk = True <--- remove
...however, run the script and you'll run into another problem:
sqlalchemy.orm.exc.UnmappedInstanceError: Class 'builtins.dict' is not mapped
This means that we end up passing a dict
to SQLAlchemy session where it is expecting a mapped Base
subclass.
The reason for this is that when a child class inherits from a base schema, e.g., AutoSchemaWithUpdate
in this case, the child class does not automatically inherit the parent's meta configuration. The docs provide a couple of strategies for this, the simplest being that the child's Meta
class should also inherit from the parent's Meta
class:
class AccountSchema(AutoSchemaWithUpdate):
class Meta(AutoSchemaWithUpdate.Meta): # <--- this here
model = Account
include_relationships = True
tags = Nested("AccountTagSchema", many=True)
Once we do that for both AccountSchema
and AccountTagSchema
we are ready to run the script again and it works...the first time. Immediately run the script again, and another error occurs:
AssertionError: Dependency rule tried to blank-out primary key column 'account_tag.account_id' on instance '<AccountTag at 0x7f14b0f9b670>'
This is a consequence of the design decision to have the loaded AccountTag
instances unidentifiable (i.e., excluding the primary key from the payload) and the decision to include the foreign key field as part of the primary key for AccountTag
.
SQLAlchemy cannot identify that the newly created AccountTag
instances are the same as the ones that already exist, so it first tries to disassociate the original account tags from the account by setting the value of the foreign key field to None
. However, this isn't allowed as the foreign key is also the primary key and cannot be set NULL.
The solution for this is described here and involves setting an explicit cascade
on the relationship
:
class Account(Base):
id = Column(Text, primary_key=True)
name = Column(Text, nullable=False)
tags = relationship("AccountTag", backref="account", cascade="all,delete-orphan")
Now run the script again, and it will work every time.