Tags: database, mongodb, mongoose, nosql, non-relational-database

Storing Likes in a Non-Relational Database


Gist

I implemented a like button in my application. Let's say users can like other users' products.

Issue

I am wondering which of the following is the most efficient and robust way to store those likes in a non-relational database (in my case MongoDB). It is important that no user can like a product twice.

Possible Solutions

(1) Store the IDs of the users who liked a product on the product itself, and derive the number of likes from likes.length

// Product in database
{
    likes: [
        'userId1',
        'userId2',
        'userId3',
        ...
    ],
    ...
}

(2) Store the IDs of all products a user liked on the user itself, and keep track of the number of likes with a counter on the product

// User in database
{
    likedProducts: [
        'productId1',
        'productId2',
        'productId3',
        ...
    ],
    ...
}
// Product in database
{
    numberOfLikes: 42,
    ...
}

(3) Maybe there is even a better solution for this?

Either way, if a product has many likes or a user has liked many products, a large amount of data has to be loaded just to show the likes and check whether the user has already liked the product.


Solution

  • Whether to use (1) or (2) depends on your use case. Specifically, think about which data you will need to access more often: all products liked by a particular user (2), or all users who liked a particular product (1). (1) looks like the more frequent case: you can easily tell whether the user has already liked the product, and the number of likes is simply the array length.

    I would argue that any further improvement would likely be a premature optimization; it is better to optimize with a concrete problem in hand.

    If showing the number of likes, for example, turns out to be a bottleneck, you can denormalize your data further by storing the array length under a separate key. That way, displaying the product list would not require fetching the array of user IDs from the database.

    Even less likely, with millions of likes on a single product, you will see a significant slowdown from looping through the likes array to check whether the userId is already in it. You could, of course, keep the likes array sorted, but database communication would still be slow (slower than looping through an array in memory, anyway). It is better to let a database index do the lookup: instead of embedding the likes as an array in the product (or user), store them in a separate collection:

    {
        _id: $oid1,
        productId: $oid2,
        userId: $oid3
    }
    

    Assuming the product also has a key holding the number of likes, this should be the fastest way to access likes, provided the keys are indexed (_id is indexed automatically; productId and userId need indexes of their own).

    You can also get creative and use the concatenation of $oid2+$oid3 as $oid1, which would automatically enforce uniqueness of the user-product pair. Then you would just try to save the like and ignore the database error on a duplicate (this might lead to subtle bugs, so it would be safer to check that the like exists when a save fails).
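
To make the two variants above more concrete, here is a minimal sketch rather than a definitive implementation. It assumes `products` and `likes` are collections from the MongoDB Node.js driver; the helper names (`buildEmbeddedLike`, `ensureLikeIndexes`, `likeOnce`) are made up for illustration. Variant A keeps the likes embedded in the product together with a denormalized `numberOfLikes` counter. Variant B uses the separate likes collection, with a unique compound index standing in for the concatenated `_id` idea.

```javascript
// Sketch only: `products` and `likes` stand for MongoDB Node.js driver
// collections; all helper names below are hypothetical.

// MongoDB reports a unique-index violation with error code 11000.
const DUPLICATE_KEY = 11000;

// Variant A: likes embedded in the product, plus a denormalized counter.
// Intended use: products.updateOne(filter, update)
function buildEmbeddedLike(productId, userId) {
  return {
    // Match the product only if this user is not yet in its likes array,
    filter: { _id: productId, likes: { $ne: userId } },
    // so the push and the counter bump happen at most once per user.
    update: { $push: { likes: userId }, $inc: { numberOfLikes: 1 } },
  };
}

// Variant B: one document per like in a separate collection.
// Run once at startup; the unique compound index rejects duplicate pairs.
async function ensureLikeIndexes(likes) {
  await likes.createIndex({ productId: 1, userId: 1 }, { unique: true });
  // Supports "all products liked by this user" lookups.
  await likes.createIndex({ userId: 1 });
}

// Returns true if the like was newly stored, false if it already existed.
async function likeOnce(likes, products, productId, userId) {
  try {
    await likes.insertOne({ productId, userId });
  } catch (err) {
    if (err && err.code === DUPLICATE_KEY) return false; // already liked
    throw err; // anything else is a real failure
  }
  // Not atomic with the insert above; that would need a transaction.
  await products.updateOne({ _id: productId }, { $inc: { numberOfLikes: 1 } });
  return true;
}
```

In variant A, a `modifiedCount` of 0 in the update result would mean the user had already liked the product. If you go with the concatenated `_id` instead, the built-in `_id` index plays the role of the unique compound index, and the same duplicate-key check applies.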