sqlpostgresqlgrafana

Merging 2 sql queries into 1


I am extracting some database data using PostgreSQL via Grafana.

The first SQL gives me a list of user_ids (User2, User3...) who have interacted with a specific User1 I know beforehand:

SELECT DISTINCT user_id AS user
FROM transactions
WHERE (type_id, operation_id) IN (
    SELECT type_id, operation_id
    FROM transactions
    WHERE user_id = 11111 --This is User1
)
AND user_id != 11111;

The second SQL gives me some stats for ONE specific user_id (e.g. User2) from another table:

SELECT
  COUNT(DISTINCT operation_id) AS "# of operations"
FROM
  operations
WHERE
  category = "category1" AND
  user_id = 22222 --This is User2

How can I "merge" both queries, so that, for each of the users retrieved from the 1st query (User2, User3... so I don't know them beforehand) I can get the stats on the second query (rather than only for a hardcoded User2)?

Thanks!


Solution

  • JOIN the tables, merge the lists of conditions, then GROUP BY. You can also FILTER what goes into the aggregate functions, like the count(*) you used: demo at db<>fiddle

    SELECT user_id 
          ,COUNT(DISTINCT o.operation_id)
             FILTER(WHERE category='category1')AS "# of category1 operations"
          ,COUNT(DISTINCT o.operation_id)
             FILTER(WHERE category='category2')AS "# of category2 operations"
    FROM transactions AS t
    JOIN operations AS o
    USING (user_id)
    WHERE (t.type_id, t.operation_id) IN (
        SELECT type_id, operation_id
        FROM transactions
        WHERE user_id = 11111 --This is User1
    )
    AND t.user_id <> 11111
    AND category IN ('category1','category2')
    GROUP BY user_id;
    
    user_id # of operations # of category1 operations # of category2 operations
    22222 3 2 0
    33333 1 1 0

    Make sure those tables are indexed:

    create index on transactions(type_id,operation_id); 
    create index on transactions(user_id,type_id,operation_id);
    create index on operations(user_id,category)include(operation_id); 
    

    The demo test on 200k randomised rows in each table shows how these indexes help reduce the execution time from 200ms down to 5ms.