amazon-web-services architecture notifications webhooks message-queue

Ordered Messaging Service Architecture



Hi everyone, I need to create a service that receives messages from a FIFO data stream and forwards them, in order, to each client server.

Let's say the data stream contains A1, A2, A3, A4, B1, B2, B3, C1, C2, etc. I then need to send messages A1, A2, A3, A4 sequentially to server A, messages B1, B2, B3, B4 sequentially to server B, and so on. Each client server must receive its messages in order. The input data stream is guaranteed to be in order.

Requirements:

  1. Up to 10,000 messages per second are pushed into the data stream.
  2. The number of distinct client servers (A, B, C, ...) is up to 1,000,000 and increasing.
  3. Each client server receives a sequence of up to 10 messages (A1, A2, ..., A10).
  4. Client servers need to receive messages in near real time (less than 2 minutes after a message is pushed into the data stream).

Here's the problem I encountered:

There is a high chance that some client servers will not respond. In that case, the service needs to stop sending messages to that client server and wait until it is ready to receive messages again. For example, if server A is down, the service should stop sending messages to server A but keep sending messages to the other servers that are still up.

Here's my current solution:

I was thinking about storing all messages from the input data stream in a DB. In parallel, a cron job would select the first unsent messages from the DB, ordered by timestamp. The selected messages would then be sent asynchronously to their respective client servers.
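To make the idea concrete, here is a minimal sketch of one polling tick, using an in-memory SQLite database purely for illustration. The table layout, the function names, and the `send` callback are all hypothetical; an auto-incrementing `id` stands in for the timestamp, and I am assuming that "first unsent message" means the oldest unsent message per client server, so that a server that is down only holds back its own messages.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table: `id` records arrival order, `sent` is 0 or 1.
    conn.execute(
        """CREATE TABLE messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               client TEXT NOT NULL,
               payload TEXT NOT NULL,
               sent INTEGER NOT NULL DEFAULT 0
           )"""
    )


def store(conn, client, payload):
    # Called for every message read from the input data stream.
    conn.execute(
        "INSERT INTO messages (client, payload) VALUES (?, ?)",
        (client, payload),
    )


def next_unsent_per_client(conn):
    # Oldest unsent message of each client, as (id, client, payload) rows.
    return conn.execute(
        """SELECT m.id, m.client, m.payload
           FROM messages m
           JOIN (SELECT MIN(id) AS first_id
                 FROM messages
                 WHERE sent = 0
                 GROUP BY client) f
             ON m.id = f.first_id
           ORDER BY m.id"""
    ).fetchall()


def poll_once(conn, send):
    # One cron tick: try the head message of every client.
    # `send(client, payload)` is assumed to return True on success.
    # A failed send leaves the row unsent, so later messages for that
    # client are not attempted before it.
    for msg_id, client, payload in next_unsent_per_client(conn):
        if send(client, payload):
            conn.execute("UPDATE messages SET sent = 1 WHERE id = ?", (msg_id,))
    conn.commit()
```

Each tick delivers at most one message per client, so a client with 10 queued messages needs 10 ticks; the polling interval would have to be short enough to stay within the 2-minute requirement.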


However, many blog posts I have read advise against storing messages in a DB ("Never ever use a database as a message queue"), so I'm looking for other architecture suggestions but couldn't find any.

Does anyone have any architecture suggestions for this?

Any third-party services (AWS, GCP, Kafka, Ably, etc.) are allowed.


Solution

  • Given the constraints you mentioned around the servers' availability, I would think of a solution that uses one queue per server.

    Basically, your app reads from the FIFO queue, accumulates up to 10 messages per server in memory (e.g. for server A), and pushes those messages to a dedicated queue that is consumed only by server A.

    This way you switch the data transfer model from push-based to pull-based, and you don't have to worry about server A's availability: if it's down, the messages remain in its queue, and when it comes back up, it processes the pending messages in the same order.

    Now of course, you'd need 1M+ queues. If you use AWS SQS, whose FIFO queues guarantee ordering, there is no limit on the number of queues you can create per account, so at least technically it should work.

    For those types of high-throughput, low-latency use-cases, I wouldn't use a DB (especially a relational one), because it can quickly become a bottleneck.

    Instead, I would check whether Redis Streams fits this use case. Some people report that it handles 200K streams well, and I think it could go up to 1M+ without issues.
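The one-queue-per-server idea can be sketched in plain Python, with in-memory deques standing in for the real queues (SQS FIFO queues, Redis Streams, or similar). The `PerServerQueues` class and its method names are made up for illustration and are not any library's API. The point is only to show that a server that is down accumulates its own backlog without blocking anyone else.

```python
from collections import defaultdict, deque


class PerServerQueues:
    """Toy stand-in for one dedicated FIFO queue per client server."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def route(self, server_id, message):
        # Called by the app that reads the input stream: append the
        # message to the queue that belongs to its target server.
        self._queues[server_id].append(message)

    def pull(self, server_id, max_messages=10):
        # Called by a server when it is ready: returns up to
        # `max_messages` of its messages, oldest first, and removes them.
        queue = self._queues[server_id]
        batch = []
        while queue and len(batch) < max_messages:
            batch.append(queue.popleft())
        return batch

    def pending(self, server_id):
        # Number of messages still waiting for this server.
        return len(self._queues.get(server_id, ()))


queues = PerServerQueues()
stream = [("A", "A1"), ("B", "B1"), ("A", "A2"), ("B", "B2"), ("A", "A3")]
for server_id, message in stream:
    queues.route(server_id, message)

print(queues.pull("B"))     # server B is up: ['B1', 'B2']
print(queues.pending("A"))  # server A is down, backlog waits: 3
print(queues.pull("A"))     # server A recovers: ['A1', 'A2', 'A3']
```

In this sketch a message is removed as soon as it is pulled. A real queue would normally remove it only after the consumer acknowledges it, so that a crash in the middle of processing does not lose messages.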