We all know about push (fanout on write) vs. pull (fanout on read) when designing a feed (Twitter-like) system for a social network.
In push mode, each time an author creates a new post, we write it to the list of updates (posts, tweets, etc.) of each of the author's friends (or followers), so that a follower doesn't need to query all of their followees' feeds on every read.
In pull mode, a follower queries the feeds of everyone they follow each time they want to see their friends' updates.
But in both cases, what mechanism is commonly used to let a person see feed updates in real time on the website? (I would think FB or Twitter don't require you to manually refresh the page to see new posts from friends.)
Let's say John writes a post and, in push mode, the system pushes (writes to SQL or a Redis cache) a pointer to this post into each of his friends' feeds. How would a friend's browser know that there is now an update from John?
I assume you have a dynamic (SPA) front-end.
In pull mode, you have two options:
Periodically re-fetch the feed data, sending the time of the last query each time so that only new feed items are returned. This approach works fine when starting a new project, but it won't scale well, because every open client repeats the multi-followee query even when nothing has changed.
Have a message broker: after a new post is created, publish an event to all online clients whose feed is potentially updated, and have the client reload its feed when it receives such an event. You could also include the new content in the event payload itself.
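To make the two pull-mode options concrete, here is a minimal in-memory Python sketch. Everything in it is an assumption for illustration: the class name `PullFeedService`, the `fetch_since` and `create_post` methods, and the `callback` that stands in for whatever open connection the server holds to an online client. A real system would use a database and a message broker instead of Python lists and dicts.

```python
from collections import defaultdict


class PullFeedService:
    """Hypothetical in-memory stand-in for a pull-mode (fanout-on-read) backend."""

    def __init__(self):
        self.posts = []                    # (timestamp, author, text) tuples
        self.followees = defaultdict(set)  # follower -> authors they follow
        self.followers = defaultdict(set)  # author -> their followers
        self.online = {}                   # user -> callback for an open connection

    def follow(self, follower, author):
        self.followees[follower].add(author)
        self.followers[author].add(follower)

    def connect(self, user, callback):
        # The callback stands in for a live connection to the user's browser.
        self.online[user] = callback

    def fetch_since(self, user, since):
        """Option 1: the client polls, passing the time of its last query."""
        return [
            post for post in self.posts
            if post[1] in self.followees[user] and post[0] > since
        ]

    def create_post(self, author, text, now):
        """Option 2: after saving the post, notify only followers who are online."""
        post = (now, author, text)
        self.posts.append(post)
        for follower in self.followers[author]:
            callback = self.online.get(follower)
            if callback is not None:
                callback({"type": "feed_updated", "author": author})
        return post
```

On receiving a `feed_updated` event, the client would call `fetch_since` with its last query time, so it only re-queries when something has actually changed.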
In push mode:
Periodically re-fetch the feed data (since the feed query is not complex in push mode, this has much less performance overhead).
When you push a post to a follower's feed, check whether that client has an active connection and, if so, publish an event to it at the same time.
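The second push-mode option could look roughly like the following Python sketch, which writes the post pointer into each follower's feed and, in the same loop, notifies followers that currently have a connection. The names `PushFeedService`, `connections`, and the event shape are hypothetical, and the `send` callable is only a placeholder for a real client connection.

```python
from collections import defaultdict


class PushFeedService:
    """Hypothetical in-memory stand-in for a push-mode (fanout-on-write) backend."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> their followers
        self.feeds = defaultdict(list)     # user -> post ids, newest first
        self.connections = {}              # user -> send callable for an open connection

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def connect(self, user, send):
        self.connections[user] = send

    def disconnect(self, user):
        self.connections.pop(user, None)

    def create_post(self, author, post_id):
        """Fan the post out on write and notify connected followers in the same pass."""
        notified = []
        for follower in self.followers[author]:
            # Write the post pointer into the follower's precomputed feed.
            self.feeds[follower].insert(0, post_id)
            # If the follower is connected right now, publish an event as well.
            send = self.connections.get(follower)
            if send is not None:
                send({"type": "new_post", "author": author, "post_id": post_id})
                notified.append(follower)
        return sorted(notified)
```

Followers without a connection still get the post in their stored feed and see it on their next load or periodic re-fetch.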
Generally people use a hybrid approach:
For producers who have a lot of active consumers (logged in at least once in the last month), use the pull method.
For producers who have a smaller number of active consumers, use the push method.
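As a rough sketch of how the hybrid decision might be made per producer, the Python below counts followers who logged in within the last 30 days and picks a strategy from that count. The threshold value `PULL_THRESHOLD` is purely an assumption (the right number depends on your workload), as are the function names.

```python
from datetime import datetime, timedelta

ACTIVE_WINDOW = timedelta(days=30)  # "logged in at least once in the last month"
PULL_THRESHOLD = 10_000             # hypothetical cutoff; tune for your own system


def count_active(follower_last_logins, now):
    """Count followers whose last login falls inside the active window."""
    return sum(1 for last in follower_last_logins if now - last <= ACTIVE_WINDOW)


def choose_strategy(follower_last_logins, now, threshold=PULL_THRESHOLD):
    """Return "pull" for producers with many active followers, else "push"."""
    active = count_active(follower_last_logins, now)
    return "pull" if active > threshold else "push"
```

For example, a producer with two recently active followers would get `"pull"` with `threshold=1` and `"push"` with `threshold=5`.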
In the push method it's very important to cap the number of items in a user's feed. If a user requests more feed items than the cap holds, you can fall back to pulling. Also, since there is a cap, you don't need to push to inactive users (the items would probably be replaced by newer ones before they log in).
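A capped feed with a pull fallback might be sketched like this in Python. The class name `CappedFeed`, the tiny `FEED_CAPACITY`, and the `pull_fallback` callable are all assumptions for illustration; in practice the cap would be far larger and the fallback would run the pull-mode query against your data store.

```python
from collections import deque

FEED_CAPACITY = 3  # hypothetical cap, kept tiny for demonstration


class CappedFeed:
    """Per-user pushed feed that keeps only the newest `capacity` items."""

    def __init__(self, capacity=FEED_CAPACITY):
        # With maxlen set, appendleft drops the oldest item from the right when full.
        self.items = deque(maxlen=capacity)

    def push(self, post_id):
        self.items.appendleft(post_id)

    def read(self, limit, pull_fallback):
        """Serve from the pushed feed; beyond the cap, fall back to pulling."""
        if limit <= self.items.maxlen:
            return list(self.items)[:limit]
        # The caller wants more than the cap can hold, so use the pull query.
        return pull_fallback(limit)
```

Here `pull_fallback(limit)` is expected to return up to `limit` items, newest first, using the pull method.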