mysqldatabasedesign-patternsdatabase-designdatabase-normalization

Best approach for implementing comments and likes system like Instagram or Twitter


We are designing a database (MySQL) for the course project. This is exactly what we're stuck in: comments and likes system. So we have found this question: Implementing Comments and Likes in database

It is beautifully and precisely explained. But every likes or comments has to be a new row. The Instagram's Like button is hit an average of ~4.5 billion times per day. This is too huge for likes table. 4.5bx30 days=135 trillion per month! I don't believe they're doing a design that way.

That's how we actually thought:

database_struct

Edit: We are designing as relational database.


Solution

  • Use a separate table for Comments. It will have the 5 columns you indicated.

    Putting things in JSON makes them hard to get to, search, filter on, etc. Ditto for any kind of array. You are doing both -- putting an array of JSON objects in a single cell.

    Learn about JOIN for re-connecting the things I am telling you to separate.

    To get into the trillions, you will also need 'sharding'. But let's not discuss that until after you have gotten into the millions.

    (More advice)

    The "large companies" have a dataset that is "sharded" across hundreds of servers. Comments are very likely to be in a regular table. JSON is not likely to be used; especially not for anything to search on or sort on. JSON is good for miscellany kruft that needs to be saved, but not searched/sorted.

    It is really best for you to

    1. design and implement something (even if it includes JSON);
    2. Put in into production;
    3. Study the problems that arise;
    4. In a few months redesign - be willing to throw away most of the original design.
    5. "Rinse and repeat". There are too many hurdles to leap over to get to where the large companies got to after a decade and dozens of engineers.

    I can only help you do one iteration at a time.

    Likes... If you are keeping a counter, then do it in a 'parallel' table. This will lower contention on the main table. If you are keeping a list of who liked what, then that is a table unto itself.

    IDs... Do not use AUTO_INCREMENT on a table that has a perfectly good PK. The main example is any many:many table -- use the composite of the two ids.

    Normalize, but don't over-normalize. This is something that you will begin to understand in my 'step 3', above.

    Do not use EAV (Entity-Attribute-Value) schema design. It does not scale well.

    Subclassing often gets clumsy. In that link, they have Photo/Article/Place "is a" Entity. No. Photos should be its own table with its own columns, quirks, indexes, etc.

    Do not use any 3rd party software. OK, you can use it for the first iteration of my steps above. But in step 4, throw it entirely out. By then, you have been forced to learn the details of MySQL (since the software will fail to fully keep you from needing to learn the details).