postgresqlgraphrelationshipsocial-network-friendship

SELECT query to return friends up to the 5th level


I am trying to implement the social network problem from the Graph Databases book, but in SQL (using PostgreSQL). I have these 2 tables, that will define a person and their friendship with someone else.

CREATE TABLE IF NOT EXISTS Person(
  ID INTEGER PRIMARY KEY,
  Person VARCHAR(200)
);

CREATE TABLE IF NOT EXISTS PersonFriend(
  PersonID INTEGER,
  FriendID INTEGER,
  PRIMARY KEY (PersonID, FriendID)
);

What I am looking for is a query that will return all the friends, friends of friends, friends of friends of friends, etc, up to the 5th level, of a person.

Friends graph

I've attached a graph example that I made in Neo4j that should help the visualization. I solved it easily using Cypher, but I'm having some trouble to do the equivalent in SQL. Using Alice as reference, my query should return Olivia, Zach, Bob, Mary, Caitlyn, Violet and Zerus, as they're her (direct and indirect) friends up to the 5th level.

The book includes this snippet of code, but it only goes up to the 2nd level (and does not return the first one).

SELECT p1.Person AS PERSON, p2.Person AS FRIEND_OF_FRIEND
FROM PersonFriend pf1 JOIN Person p1
 ON pf1.PersonID = p1.ID
JOIN PersonFriend pf2
 ON pf2.PersonID = pf1.FriendID
JOIN Person p2
 ON pf2.FriendID = p2.ID
WHERE p1.Person = 'Alice' AND pf2.FriendID <> p1.ID

I would very much appreciate any suggestions to solve this problem.


Solution

  • Recursive queries are the easiest way to handle this:

    WITH RECURSIVE FriendCTE (PersonID, depth)
    AS (
        SELECT 1, 0  -- 1 being the ID of Alice
        UNION
        SELECT pf.FriendID, depth + 1
        FROM FriendCTE cte
        JOIN PersonFriend pf ON (cte.PersonID = pf.PersonID)
        WHERE depth < 5
    )
    SELECT PersonID, Person
    FROM FriendCTE cte
    JOIN Person p ON (p.ID = cte.PersonID)
    WHERE depth > 0
    GROUP BY PersonID, Person;  -- to deal with loops
    
     PersonID | Person
    ----------+---------
            2 | Olivia
            3 | Zach
            4 | Bob
            5 | Mary
            6 | Caitlyn
            7 | Violet
            8 | Zerus
    (7 rows)
    

    Sidenote, you should be using the text type and not VARCHAR(200).