sqlpostgresqljoingreatest-n-per-group

SELECT posts and their comments


I have three tables: users, posts, and comments.

How do I SELECT the first three posts, each with their comment count and first three comments?

Example: SQL Fiddle

Goal is to build a table like the one below

Expected output:

-- post_id   comment_id   user_id   body               created_at
-- 1         4            1         Hello. I'm Jane.   August, 28 2016 14:12:01
-- 1         1            2         Nice post, Jane.   August, 28 2016 14:12:01
-- 1         2            1         Thank you, John.   August, 28 2016 14:12:01
-- 1         3            2         You're welcome.    August, 28 2016 14:12:01
-- 2         2            1         This is post 2.    August, 28 2016 14:12:01
-- 2         5            2         I like this.       August, 28 2016 14:12:01
-- 2         6            1         Why, thank you.    August, 28 2016 14:12:01
-- 3         0            1         This is post 3.    August, 28 2016 14:12:01

Solution

  • Understanding the logic behind the expected output

    You have a pretty sophisticated demand for your expected output. From what I understood from your sql fiddle looking at expected result, you want to:

    1. get first three posts
    2. get first three comments for them
    3. append the result from 1 and 2 with different logic for columns comment_id and body

    The difference in logic seems to be like below:

    For each row representing a post:

    While for each row representing comment the logic is analogical (but for a comment, not a post) with the exception of comment_id column where you want to store the comment id.

    Query and explanation

    For live example look at SQL fiddle

    First, take first three posts and build rows for them counting comments for each one. Then, union those posts rows with comment rows, and use row_number() function to limit comment rows in the output to maximum of 3 per a post.

    Assigning 0 as row number for posts means they fulfill the condition of rn <= 3.

    To order the output as you wish so that for every post their comments are sorted right after them I've added an order_column to be able to include it in ORDER BY.

    WITH first_posts AS (
        SELECT p.id AS post_id, COUNT(c.id) AS comment_id, p.user_id, p.body, p.created_at
        FROM (SELECT * FROM posts ORDER BY id LIMIT 3) AS p
        LEFT JOIN comments AS c
        ON p.id = c.post_id
        GROUP BY 1, 3, 4, 5
    )
    SELECT post_id, comment_id, user_id, body, created_at
    FROM (
        SELECT 1 AS type, post_id, comment_id, user_id, body, created_at, 0 AS r
        FROM first_posts
        UNION ALL
        SELECT 2 AS type, p.post_id, c.id, c.user_id, c.body, c.created_at, 
            ROW_NUMBER() OVER (PARTITION BY p.post_id ORDER BY c.id) AS r
        FROM first_posts AS p
        INNER JOIN comments AS c
        ON p.post_id = c.post_id
        ORDER BY post_id, type, comment_id
    ) AS f
    WHERE r <= 3;