mysqldatabasejoinoptimizationchat

Is there any other method to stop join when first matching column is found?


I'm using MySQL database. I have to join two tables (A, B). They have relationship (N->N). Actually, A is chatting room table and B is chat (chat record).

Chatting room has many chats. If a user enters chat app, the list of chat rooms should be orderd by the latest received chat.

But at this moment, chatting room has too many chats, so join performance is severely degraded. So, my idea is :

  1. Chat table is ordered by its created time. So, this is a type of time series data.

  2. Chatting room finds chat table reversely, therefore they can find the latest created chat.

  3. When certain chatting room finds its first chat, IT STOPS SEARCH FOR JOIN.

My question is :

  1. Is there any other method to find columns reversely?

  2. Is there any other method to stop on the moment that a certain column is matched(found) when two tables join?

Chat

CREATE TABLE `chat` (
  `chat_room_id` bigint DEFAULT NULL,
  `created_at` datetime(6) DEFAULT NULL,
  `id` bigint NOT NULL AUTO_INCREMENT,
  `member_id` bigint DEFAULT NULL,
  `subject_id` bigint DEFAULT NULL,
  `content` varchar(255) NOT NULL,
  `message_type` enum('POST','IMAGE','TEXT') DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `FK44b6elhh512d2722l09i6qdku` (`chat_room_id`),
  KEY `FKgvc5hrt0h18xk63qosss3ti30` (`member_id`),
  CONSTRAINT `FK44b6elhh512d2722l09i6qdku` FOREIGN KEY (`chat_room_id`) REFERENCES `chat_room` (`id`),
  CONSTRAINT `FKgvc5hrt0h18xk63qosss3ti30` FOREIGN KEY (`member_id`) REFERENCES `member` (`id`)
) 

Chat Room

CREATE TABLE `chat_room` (
  `member_cnt` int DEFAULT NULL,
  `id` bigint NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `room_type` enum('GROUP','PERSONAL') DEFAULT NULL,
  PRIMARY KEY (`id`)
)

Thank you.


Solution

  • You can do something like

    select chat_room.*
    from chat
    join chat_room
    on chat.chat_room_id = chat_room.id
    left join chat newer_chat
    on newest_chat.chat_room_id = chat_room.id and newer_chat.created_at > chat.created_at
    where newer_chat.id is null
    order by chat.created_at desc
    

    this joins the chat rooms with individual chats, provided that there is no newer chat than the one we join with. Now, if this is not quick-enough, then you might want to make sure you have an index on (chat_room_id, created_at) for the chat table, which will greatly increase the speed of this search. Alternatively you could have a last_chat_in_room table where, whenever you insert a new chat, its corresponding record in this table would be properly upserted so it will be up-to-date and use this aggregate table rather than the chat table. Or you could have a cron job ranking the rooms by latest chats and caching the result and reading from that cache.

    EDIT: How to store a command queue?

    You can store a command queue at the place where you desire to do it, it could be your database, in-memory, like in Redis or files alike, so it's a matter of preference. A neat way to store the commands is to (I will use general terms here, so both you and readers may benefit of the steps):

    1. Create a table based on your table:

    Assuming that t is your table, run

    show create table t;
    

    Copy the script it gives you into a text editor and make some edits upon it. First of all, your primary key should not be a primary key, nor be auto_increment, because this is only a copy of the main table and you will have some nulls for the id for items to be inserted later on. Instead, create a separate field for pyimary key and have it auto_increment. So, your chat table:

    CREATE TABLE `chat` (
      `chat_room_id` bigint DEFAULT NULL,
      `created_at` datetime(6) DEFAULT NULL,
      `id` bigint NOT NULL AUTO_INCREMENT,
      `member_id` bigint DEFAULT NULL,
      `subject_id` bigint DEFAULT NULL,
      `content` varchar(255) NOT NULL,
      `message_type` enum('POST','IMAGE','TEXT') DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `FK44b6elhh512d2722l09i6qdku` (`chat_room_id`),
      KEY `FKgvc5hrt0h18xk63qosss3ti30` (`member_id`),
      CONSTRAINT `FK44b6elhh512d2722l09i6qdku` FOREIGN KEY (`chat_room_id`) REFERENCES `chat_room` (`id`),
      CONSTRAINT `FKgvc5hrt0h18xk63qosss3ti30` FOREIGN KEY (`member_id`) REFERENCES `member` (`id`)
    ) 
    

    would have a pair of

    CREATE TABLE `chat_queue` (
      `chat_room_id` bigint DEFAULT NULL,
      `created_at` datetime(6) DEFAULT NULL,
      `id` bigint DEFAULT NULL,
      `member_id` bigint DEFAULT NULL,
      `subject_id` bigint DEFAULT NULL,
      `content` varchar(255) DEFAULT NULL,
      `message_type` enum('POST','IMAGE','TEXT') DEFAULT NULL,
      `chat_queue_id` int primary key auto_increment,
      CONSTRAINT `chat_queue_chat_room_id` FOREIGN KEY (`chat_room_id`) REFERENCES `chat_room` (`id`) on delete cascade on update cascade,
      CONSTRAINT `chat_queue_member_id` FOREIGN KEY (`member_id`) REFERENCES `member` (`id`) on delete cascade on update cascade,
      CONSTRAINT `chat_queue_chat_id` FOREIGN KEY (`id`)
    REFERENCES `chat` (`id`) on delete cascade on update cascade
    ) 
    

    Note that I have set all values to nullable, because you might decide not to upsert some fields and leave them as they were, in which case we leave them null in the queue. Note further that I kept the foreign keys, made them cascade, so if you update or remove a chat record that the queue references to, then the change will trigger a similar change on this table too. I also added a foreign key of id to the chat table with similar approach.

    2. How to add items

    When you add an item to this queue, you are basically preparing an insert or an update. So you define the field values you want and leave the fields you want to be left unchanged as null. So, an update of an item could be prepared like this:

    insert into chat_queue(id, content, message_type)
    values(42, 'jdkfhkdsjf', 'TEXT');
    

    This basically means that you want to update the record whose id is 42, changing its content and message_type to the values you define, but you do not want to change anything else, so they are left as null. Similarly, if you want to prepare an insert, then insert a chat_queue record without an id, notifying that there is nothing to be updated, so it's an import.

    3. Executing the commands

    For the inserts, you just run

    select concat(
        'insert into t(',
        group_concat(COLUMN_NAME separator ','),
        ') select ',
        group_concat(COLUMN_NAME separator, ','),
        ' from t_queue where id is null'
    )
    from t_queue
    join information_schema.columns
    on TABLE_SCHEMA = 't_queue' and
       COLUMN_NAME not in ('id', 't_queue_id')
    
    

    The query generates the insert for you.

    Of course, you could write your query instead of generating it like in the above, but, the advantage of the approach I am suggesting here is that it will be reusable for other tables, you just change the names of t and t_queue and t_queue_id to your values. This generates the insert commands you want to run.

    Now, let's generate the update

    select concat(
        'select group_concat(',
        COLUMN_NAME,
        ' separator \',\') as ',
        COLUMN_NAME,
        ' from ',
        TABLE_SCHEMA,
        ' where id is not null group by id'
    )
    from t_queue
    join information_schema.columns
    on TABLE_SCHEMA = 't_queue' and
       COLUMN_NAME not in ('id', 't_queue_id')
    

    This will give you the values of each fields separated by comma, so if there were multiple updates on the same record, then you will see what the values were and decide for yourself how to handle them.

    Of course, once you finish executing your commands, remove the commands you executed. It's best to store the maximum queue id before you proceed with your operations, always filter by the queue id being lower or equal to the one you have seen and then remove all records having that queue id or lower.

    4. Periodic schedule

    You will likely want a cron job or some other periodic means to repeatedly execute your command.