I have a use case for which QLDB makes the most sense. I also think the Outbox pattern makes sense for data reliability. However, I am worried about polluting the Journal with the outbox entries.
My understanding is that while I can have my 'outbox' table separate from my main data table, the journal is shared across the entire ledger. It seems that the outbox pattern traditionally uses a relational DB where the concept of an immutable journal just isn't a concern.
Is this going to be a problem as the data set grows? More importantly, is there an alternative pattern that would make more sense to use?
Since the journal is an immutable history of every transaction ever committed to the ledger, if you use the Outbox pattern with QLDB, your ledger will contain a permanent history of every message that passed through your Outbox table. This is great if you need an unfalsifiable audit history of the messages queued for sending and a record of them being sent ("deleted" from the table). However, if you don't need that, then you'll be paying for storage of those messages for the life of the ledger and not getting much value from it.
The typical event-driven approach would be to use QLDB's streaming feature, which associates a Kinesis Data Stream with your ledger. Every time you commit a transaction, QLDB will publish the transaction to the Kinesis Data Stream. This enables you to drive events from transactions occurring in your ledger. With this approach, you commit your business data to the ledger without worrying about an Outbox table at all. The documents you commit should contain the information you need in your messaging. Upon commit, QLDB pushes the document(s) from the transaction into Kinesis, where you process them using a Lambda function that sends the message onward.
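To make that concrete, here is a rough sketch of what the Lambda side could look like. Treat it as an illustration, not a drop-in implementation. The table name `Orders`, the `publish` callback, and the `decode_json` helper are all made up for the example. In particular, QLDB writes stream records as Amazon Ion (and may aggregate several into one Kinesis record), so in a real function you would swap `decode_json` for an Ion decoder and de-aggregate first; JSON is used here only so the sketch runs with the standard library.

```python
import base64
import json
from typing import Any, Callable, Dict, List


def decode_json(raw: bytes) -> Dict[str, Any]:
    """Stand-in decoder (assumption). Real QLDB stream records are Amazon Ion,
    so replace this with an Ion decoder in an actual Lambda."""
    return json.loads(raw)


def extract_revisions(
    event: Dict[str, Any],
    decode: Callable[[bytes], Dict[str, Any]] = decode_json,
    table: str = "Orders",  # hypothetical table name
) -> List[Dict[str, Any]]:
    """Pull document revisions for one table out of a Kinesis-triggered Lambda event."""
    revisions = []
    for record in event.get("Records", []):
        # Lambda delivers the Kinesis payload base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        doc = decode(raw)
        # The stream also carries control and block-summary records; skip those.
        if doc.get("recordType") != "REVISION_DETAILS":
            continue
        payload = doc["payload"]
        if payload["tableInfo"]["tableName"] != table:
            continue
        revisions.append(payload["revision"])
    return revisions


def handler(event: Dict[str, Any], context: Any,
            publish: Callable[[Any], None] = print) -> int:
    """Lambda entry point: forward each committed document onward.
    `publish` is a placeholder for whatever actually sends your message."""
    revisions = extract_revisions(event)
    for revision in revisions:
        publish(revision["data"])
    return len(revisions)
```

The point of the design is that the business document itself is the message source: nothing extra is written to the ledger just to trigger the send.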
One thing to note is that QLDB offers an at-least-once guarantee of delivering data into Kinesis. This means that you'll need to identify and handle (or just tolerate) potential duplicate messages. You should always be thinking about idempotence in distributed systems anyway, though.
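One way to handle duplicates is to key each message on the document id and version from the revision metadata and skip anything you have already sent. The sketch below assumes the revision arrives as a dict with a `metadata` section containing `id` and `version`; the `IdempotentSender` class and the in-memory set are purely illustrative. In a real consumer the "seen" keys would have to live in a durable store shared across Lambda invocations.

```python
from typing import Any, Callable, Dict, Set, Tuple


class IdempotentSender:
    """Sends each (document id, version) at most once per process.
    The in-memory set is a stand-in (assumption) for a durable store."""

    def __init__(self, send: Callable[[Any], None]) -> None:
        self._send = send
        self._seen: Set[Tuple[str, int]] = set()

    def handle(self, revision: Dict[str, Any]) -> bool:
        """Return True if the message was sent, False if it was a duplicate."""
        meta = revision["metadata"]
        key = (meta["id"], meta["version"])
        if key in self._seen:
            return False
        # Record the key only after a successful send, so a failed send
        # is retried when the record is delivered again.
        self._send(revision["data"])
        self._seen.add(key)
        return True


if __name__ == "__main__":
    sender = IdempotentSender(send=print)
    rev = {"metadata": {"id": "doc-1", "version": 0}, "data": {"orderId": "o-1"}}
    sender.handle(rev)  # sends
    sender.handle(rev)  # duplicate delivery, skipped
```

If your downstream consumer is already idempotent (for example, it upserts by document id), you may be able to skip this bookkeeping entirely and just tolerate the occasional repeat.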
If you don't want to pay for Kinesis and don't need a real-time approach, there are things you can do with scheduled QLDB exports into S3 and some batch processing of the export files, but I'd start with streaming. DM me if you want to hear more about the export approach.
See also: