SqlAlchemy 2.x with specific columns makes scalars() return non-orm objects

This question is probably me not understanding architecture of (new) sqlalchemy, typically I use code like this:

query = select(models.Organization).where(
    models.Organization.organization_id == organization_id
)
result = await self.session.execute(query)

return result.scalars().all()

Works fine, I get a list of models (if any).

With a query with specific columns only:

query = (
    select(
        models.Payment.organization_id,
        models.Payment.id,
        models.Payment.payment_type,
    )
    .where(
        models.Payment.is_cleared.is_(True),
    )
    .limit(10)
)

result = await self.session.execute(query)

return result.scalars().all()

I am getting first row, first column only. Same it seems to: https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=scalar#sqlalchemy.engine.Result.scalar

My understanding so far was that in new sqlalchemy we should always call scalars() on the query, as described here: https://docs.sqlalchemy.org/en/14/changelog/migration_20.html#migration-orm-usage

But with specific columns, it seems we cannot use scalars() at all. What is even more confusing is that result.scalars() returns sqlalchemy.engine.result.ScalarResult that has fetchmany(), fechall() among other methods that I am unable to iterate in any meaningful way.

My question is, what do I not understand?

Solution

My understanding so far was that in new sqlalchemy we should always call scalars() on the query

That is mostly true, but only for queries that return whole ORM objects.

Just a regular .execute()

query = select(Payment)

results = sess.execute(query).all()
print(results)  # [(Payment(id=1),), (Payment(id=2),)]
print(type(results[0]))  # <class 'sqlalchemy.engine.row.Row'>

returns a list of Row objects, each containing a single ORM object. Users found that awkward since they needed to unpack the ORM object from the Row object. So .scalars() is now recommended

results = sess.scalars(query).all()
print(results)  # [Payment(id=1), Payment(id=2)]
print(type(results[0]))  # <class '__main__.Payment'>

However, for queries that return individual attributes (columns) we don't want to use .scalars() because that will just give us one column from each row, normally the first column

query = select(
    Payment.id,
    Payment.organization_id,
    Payment.payment_type,
)

results = sess.scalars(query).all()
print(results)  # [1, 2]

Instead, we want to use a regular .execute() so we can see all the columns

results = sess.execute(query).all()
print(results)  # [(1, 123, None), (2, 234, None)]

Notes:

.scalars() is doing the same thing in both cases: return a list containing a single (scalar) value from each row (default is index=0).
sess.scalars() is the preferred construct. It is simply shorthand for sess.execute().scalars().