pythonsqlalchemy

SqlAlchemy 2.x with specific columns makes scalars() return non-orm objects


This question is probably me not understanding architecture of (new) sqlalchemy, typically I use code like this:

query = select(models.Organization).where(
    models.Organization.organization_id == organization_id
)
result = await self.session.execute(query)

return result.scalars().all()

Works fine, I get a list of models (if any).

With a query with specific columns only:

query = (
    select(
        models.Payment.organization_id,
        models.Payment.id,
        models.Payment.payment_type,
    )
    .where(
        models.Payment.is_cleared.is_(True),
    )
    .limit(10)
)

result = await self.session.execute(query)

return result.scalars().all()

I am getting first row, first column only. Same it seems to: https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=scalar#sqlalchemy.engine.Result.scalar

My understanding so far was that in new sqlalchemy we should always call scalars() on the query, as described here: https://docs.sqlalchemy.org/en/14/changelog/migration_20.html#migration-orm-usage

But with specific columns, it seems we cannot use scalars() at all. What is even more confusing is that result.scalars() returns sqlalchemy.engine.result.ScalarResult that has fetchmany(), fechall() among other methods that I am unable to iterate in any meaningful way.

My question is, what do I not understand?


Solution

  • My understanding so far was that in new sqlalchemy we should always call scalars() on the query

    That is mostly true, but only for queries that return whole ORM objects.

    Just a regular .execute()

    query = select(Payment)
    
    results = sess.execute(query).all()
    print(results)  # [(Payment(id=1),), (Payment(id=2),)]
    print(type(results[0]))  # <class 'sqlalchemy.engine.row.Row'>
    

    returns a list of Row objects, each containing a single ORM object. Users found that awkward since they needed to unpack the ORM object from the Row object. So .scalars() is now recommended

    results = sess.scalars(query).all()
    print(results)  # [Payment(id=1), Payment(id=2)]
    print(type(results[0]))  # <class '__main__.Payment'>
    

    However, for queries that return individual attributes (columns) we don't want to use .scalars() because that will just give us one column from each row, normally the first column

    query = select(
        Payment.id,
        Payment.organization_id,
        Payment.payment_type,
    )
    
    results = sess.scalars(query).all()
    print(results)  # [1, 2]
    

    Instead, we want to use a regular .execute() so we can see all the columns

    results = sess.execute(query).all()
    print(results)  # [(1, 123, None), (2, 234, None)]
    

    Notes: