mysqlpython-3.xpeewee

Peewee - Using custom SQL as a subquery


I ran into a problem using Peewee with MySQL. Basically I want to filter the logs of my users and bin them into the last 8 weeks (say week 0 to week 7). The result should look like:

{ 'userid': 1, 'week':1, 'bin': ?},
...
{ 'userid': 1, 'week':7, 'bin': ?},
{ 'userid': 2, 'week':0, 'bin': ?},
...
etc.

However, I want to ensure that even if the user has no entry in the log for a particular week, it still shows as 'bin': 0. I would use an OUTER JOIN, but I have no corresponding "week" table to OUTER JOIN with. In SQL, I would do the following:

WITH  weeks (week) AS (
VALUES ROW(0),ROW(1),ROW(2),ROW(3),ROW(4),ROW(5),ROW(6),ROW(7)
)'

and then use "weeks" in the join. However I can't figure out how to do this in peewee. If I create a "RawQuery" I cannot use it for joining as I would do with a regular query (it has no .c to access the fields or .select()).

I ended up using a dirty fix like this, on a table that has not too many records:

max_weeks = 8
weeks = (
    MyTable.select(
        (fn.MOD(MyTable.id,max_weeks)).alias("week")
    ).distinct()
)

but I would prefer to do it the right way.

I must specify here that my requirement is that I make only one query (composed of multiple subqueries) to the database, and that the number of weeks (8 in this case) is variable. Also, I have read-only permission on the database.


Solution

  • I don't have MySQL handy as I use MariaDB which uses slightly different syntax, but hopefully something like this should work:

    from peewee import *
    from peewee import NodeList, CommaNodeList, CTE
    
    db = MySQLDatabase('peewee_test')
    
    class Reg(db.Model):
        value = IntegerField()
    
    db.create_tables([Reg])
    Reg.create(value=1)
    
    weeks = CommaNodeList([fn.ROW(i) for i in range(8)])
    
    cte = CTE('weeks', NodeList([SQL('VALUES'), weeks]), columns=['week'])
    
    # Just a simple cartesian product of Reg x Weeks - in your example
    # you would be performing a join instead.
    query = Reg.select(Reg.value, cte.c.week).from_(Reg, cte).with_cte(cte)
    print(query.sql())
    for row in query:
        print(row.value, row.week)
    

    The SQL generated:

    WITH `weeks` (`week`) AS (
        VALUES ROW(0), ROW(1), ROW(2), ROW(3),
               ROW(4), ROW(5), ROW(6), ROW(7)) 
    SELECT `t1`.`value`, `weeks`.`week` 
    FROM `reg` AS `t1`, `weeks`