mysqldatabase-designlivechatapplication-designdenormalized

Database Denormalization?


Just to provide some context, I am working on a realtime chat application using Pusher(Websocket Service over cloud), where each messaged is logged in my server before it is pushed to the websocket server which in turn pushes the data to the client.So I am using a MySQL for storing the messages.

So one such table is my messages, which has the following fields id,chat_payload,message_time,user_id,room_id.

For populating the initial chatroom I would need to retrieve the message history, wherein each message object would include the username,useravatar,chat_payload,timestamp.But the useravatar,username is stored in users, user_meta tables respectively.So fetching the data on the run using joins clearly seems expensive since user_meta table grows very fast.And clearly storing username,useravatar in messages table doesn't seem right as it would pose updation problems.

So considering the above scenario could anyone please suggest an appropriate DB/Application Design for the same?


Solution

  • storing username,useravatar in messages table ... would pose updation problems

    Good data is job #1, right? You are contemplating compromising that for the promise of performance you can't be sure of.

    It's sometimes under-apprecated that normal forms were invented specifically to prevent what are called update anomalies. 3NF will go a long way toward keeping your data consistent. As an added benefit, the lack of redundancy is efficient because it minimizes the amount of information that has to be updated.

    For those reasons, among 1000 others, the best course of action is to design a normalized database, just like the textbooks say, and let the chips fall where they may. Your DBMS was designed with joins in mind, you can be sure. And it has many features to make them efficient, not least indexes. Rely on that, and not on your first-blush impression of what might be fast or slow.

    If you do run into performance problems, it's unlikely the database design per se will be at fault. If it is, it will happen slowly, and you'll have time to examine it and redress it based on an actual situation, with real facts. Better then with that, than now without.