
'Push' vs 'Pull' when designing social network feeds (Twitter, Facebook News Feed, etc.)


We all know about push (fan-out on write) vs. pull (fan-out on read) when designing a feed system (Twitter-style) for a social network.

In push mode, each time an author publishes a new post, we write it to the update list (posts, tweets, etc.) of each of the author's friends (or followers), so that a follower doesn't need to query every followee's feed on each read.
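To make push mode concrete, here is a minimal sketch in Python. The in-memory dictionaries stand in for whatever store is actually used (SQL, Redis, etc.), and the names `followers`, `feeds`, `follow`, `publish` and `read_feed` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real stores (SQL, Redis, ...).
followers = defaultdict(set)  # author -> ids of users following that author
feeds = defaultdict(list)     # user -> post ids in that user's feed, newest first


def follow(follower, author):
    followers[author].add(follower)


def publish(author, post_id):
    """Fan out on write: copy the post pointer into every follower's feed."""
    for follower in followers[author]:
        feeds[follower].insert(0, post_id)


def read_feed(user):
    """Reading is a single lookup; no per-followee queries are needed."""
    return list(feeds[user])


follow("alice", "john")
follow("bob", "john")
publish("john", "post-1")
publish("john", "post-2")
print(read_feed("alice"))  # ['post-2', 'post-1']
```

The write cost grows with the number of followers, while each read stays cheap.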

In pull mode, a follower queries the feeds of everyone they follow each time they want to see their friends' updates.
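A comparable sketch of pull mode, again in Python with hypothetical in-memory stores (`following`, `posts`) and made-up function names. Here a write touches only the author's own list, and the merge happens at read time.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real stores.
following = defaultdict(set)  # user -> authors that user follows
posts = defaultdict(list)     # author -> list of (timestamp, post_id)


def follow(follower, author):
    following[follower].add(author)


def publish(author, timestamp, post_id):
    """A write only appends to the author's own list of posts."""
    posts[author].append((timestamp, post_id))


def read_feed(user, limit=10):
    """Fan out on read: gather every followee's posts and sort newest first."""
    candidates = [p for author in following[user] for p in posts[author]]
    candidates.sort(reverse=True)  # tuples sort by timestamp first
    return [post_id for _, post_id in candidates[:limit]]


follow("alice", "john")
follow("alice", "mary")
publish("john", 1, "j1")
publish("mary", 2, "m1")
publish("john", 3, "j2")
print(read_feed("alice"))  # ['j2', 'm1', 'j1']
```

The read cost grows with the number of followees, while each write stays cheap.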

But in both cases, what mechanism is commonly used to let a person see updated feeds in real time on the website? (I would think FB or Twitter doesn't require you to manually refresh the page to see new posts from friends.)

Let's say John writes a post and, in push mode, the system pushes a pointer to this post (written to SQL or a Redis cache) into all of his friends' feeds. How would one of his friends' browsers know that there's now an update from John?


Solution

  • I assume you have a dynamic (SPA) front-end.

    In pull mode, you have two options:

    In push mode:

    Generally people use a hybrid approach:

    With the push method, it's very important to cap the number of items stored in a user's feed. If a user requests more feed items than the cap holds, you can fall back to pulling. And because of the cap, you don't need to push to inactive users: the pushed items would probably be replaced by newer ones before they log in.
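The capped feed with a pull fallback could be sketched like this in Python. Everything here is an assumption for illustration: the cap of 3, the `active_users` set, the in-memory stores, and the function names. It also assumes posts are published in increasing timestamp order, so the cached items are always the newest ones.

```python
from collections import defaultdict, deque

FEED_CAPACITY = 3  # hypothetical cap on pushed items per user

followers = defaultdict(set)  # author -> followers
following = defaultdict(set)  # user -> followees
posts = defaultdict(list)     # author -> list of (timestamp, post_id)
# A deque with maxlen drops the oldest item (right end) on appendleft when full.
feeds = defaultdict(lambda: deque(maxlen=FEED_CAPACITY))
active_users = set()          # hypothetical: users considered recently active


def follow(follower, author):
    followers[author].add(follower)
    following[follower].add(author)


def publish(author, timestamp, post_id):
    posts[author].append((timestamp, post_id))
    for follower in followers[author]:
        if follower in active_users:  # skip the push for inactive users
            feeds[follower].appendleft((timestamp, post_id))


def pull_feed(user, limit):
    """Fan out on read, used when the pushed feed cannot serve the request."""
    candidates = [p for author in following[user] for p in posts[author]]
    candidates.sort(reverse=True)
    return [post_id for _, post_id in candidates[:limit]]


def read_feed(user, limit):
    cached = feeds[user]
    if limit <= len(cached):  # the capped, pushed feed is enough
        return [post_id for _, post_id in list(cached)[:limit]]
    return pull_feed(user, limit)  # more than the cap holds: fall back to pull


active_users.add("alice")
follow("alice", "john")
follow("bob", "john")  # bob is inactive, so nothing is pushed to him
for t in range(1, 6):
    publish("john", t, f"p{t}")

print(read_feed("alice", 2))  # ['p5', 'p4'], served from the pushed feed
print(read_feed("alice", 5))  # ['p5', 'p4', 'p3', 'p2', 'p1'], via pull
print(read_feed("bob", 2))    # ['p5', 'p4'], via pull
```

In this sketch an inactive user simply has an empty pushed feed, so every read for them takes the pull path.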