sqlmany-to-manyjoinjunction-table

When Querying Many-To-Many Relationship in SQL, Return Multiple Connections As an Array In Single Row?


Basically, I have 3 tables, titles, providers, and provider_titles.

Let's say they look like this:

| title_id   |     title_name |
|------------|----------------|
| 1          |     San Andres |
| 2          |Human Centipede |
| 3          |    Zoolander 2 |
| 4          |    Hot Pursuit |

| provider_id|  provider_name |
|------------|----------------|
| 1          |           Hulu |
| 2          |        Netflix |
| 3          |   Amazon_Prime |
| 4          |         HBO_GO |

| provider_id|       title_id |
|------------|----------------|
| 1          |              1 |
| 1          |              2 |
| 2          |              1 |
| 3          |              1 |
| 3          |              3 |
| 4          |              4 |

So, clearly there are titles with multiple providers, yeah? Typical many-to-many so far.

So what I'm doing to query it is with a JOIN like the following:

SELECT * FROM provider_title JOIN provider ON provider_title.provider_id = provider.provider_id JOIN title ON title.title_id = provider_title.title_id WHERE provider.name IN ('Netflix', 'HBO_GO', 'Hulu', 'Amazon_Prime')

Ok, now to the actual issue. I don't want repeated title names back, but I do want all of the providers associated with the title. Let me explain with another table. Here is what I am getting back with the current query, as is:

| provider_id| provider_name | title_id | title_name    |
|------------|---------------|----------|---------------|
| 1          |          Hulu |         1|San Andreas    |
| 1          |          Hulu |         2|Human Centipede|
| 2          |       Netflix |         1|San Andreas    |
| 3          |  Amazon_Prime |         1|San Andreas    |
| 3          |  Amazon_prime |         3|Zoolander 2    |
| 4          |        HBO_GO |         4|Hot Pursuit    |

But what I really want would be something more like

| provider_id| provider_name               |title_id| title_name|
|------------|-----------------------------|--------|-----------|
| [1, 2, 3]  |[Hulu, Netflix, Amazon_Prime]|       1|San Andreas|

Meaning I only want distinct titles back, but I still want each title's associated providers. Is this only possible to do post-sql query with logic iterating through the returned rows?


Solution

  • Depending on your database engine, there may be an aggregation function to help achieve this.

    For example, this SQLfiddle demonstrates the postgres array_agg function:

    SELECT  t.title_id,
            t.title_name,
            array_agg( p.provider_id ),
            array_agg( p.provider_name )
    FROM    provider_title as pt
            JOIN 
            provider as p
            ON pt.provider_id = p.provider_id 
            JOIN title as t
            ON t.title_id = pt.title_id 
    GROUP BY t.title_id,
            t.title_name
    

    Other database engines have equivalents. For example:

    If your database isn't covered by the above, you can google '[Your database engine] aggregate comma delimited string'