This question is probably me not understanding architecture of (new) sqlalchemy, typically I use code like this:
query = select(models.Organization).where(
models.Organization.organization_id == organization_id
)
result = await self.session.execute(query)
return result.scalars().all()
Works fine, I get a list of models (if any).
With a query with specific columns only:
query = (
select(
models.Payment.organization_id,
models.Payment.id,
models.Payment.payment_type,
)
.where(
models.Payment.is_cleared.is_(True),
)
.limit(10)
)
result = await self.session.execute(query)
return result.scalars().all()
I am getting first row, first column only. Same it seems to: https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=scalar#sqlalchemy.engine.Result.scalar
My understanding so far was that in new sqlalchemy we should always call scalars()
on the query, as described here: https://docs.sqlalchemy.org/en/14/changelog/migration_20.html#migration-orm-usage
But with specific columns, it seems we cannot use scalars() at all. What is even more confusing is that result.scalars()
returns sqlalchemy.engine.result.ScalarResult
that has fetchmany(), fechall() among other methods that I am unable to iterate in any meaningful way.
My question is, what do I not understand?
My understanding so far was that in new sqlalchemy we should always call scalars() on the query
That is mostly true, but only for queries that return whole ORM objects.
Just a regular .execute()
query = select(Payment)
results = sess.execute(query).all()
print(results) # [(Payment(id=1),), (Payment(id=2),)]
print(type(results[0])) # <class 'sqlalchemy.engine.row.Row'>
returns a list of Row
objects, each containing a single ORM object. Users found that awkward since they needed to unpack the ORM object from the Row
object. So .scalars()
is now recommended
results = sess.scalars(query).all()
print(results) # [Payment(id=1), Payment(id=2)]
print(type(results[0])) # <class '__main__.Payment'>
However, for queries that return individual attributes (columns) we don't want to use .scalars()
because that will just give us one column from each row, normally the first column
query = select(
Payment.id,
Payment.organization_id,
Payment.payment_type,
)
results = sess.scalars(query).all()
print(results) # [1, 2]
Instead, we want to use a regular .execute()
so we can see all the columns
results = sess.execute(query).all()
print(results) # [(1, 123, None), (2, 234, None)]
Notes:
.scalars()
is doing the same thing in both cases: return a list containing a single (scalar) value from each row (default is index=0).
sess.scalars()
is the preferred construct. It is simply shorthand for sess.execute().scalars()
.