we developed a web app that relies on real-time interaction between our users. We use Angular for the frontend and Hasura with GraphQL on Postgres as our backend.
What we noticed is that when more than 300 users are active at the same time we experience crucial performance losses.
Therefore, we want to improve our subscriptions setup.
We think that possible issues could be:
Concerning 1. each user has approximately 5-10 subscriptions active when using the web app. Concerning 2. we have subscriptions that are complex as we join up to 6 tables together.
The solutions we think of:
OP question is quite broad and impossible to be answered in a general case.
So what I describe here reflects my experience with optimization of subscriptions - it's for OP to decide is it reflects their situation.
Users of system: uploads documents, extracts information, prepare new documents, converse during process (IM-like functionality), there are AI-bots that tries to reduce the burden of repetitive tasks, services that exchange data with external systems.
There are a lot of entities, a lot of interaction between both human and robot participants. Plus quite complex authorization rules: visibility of data depends on organization, departments and content of documents.
At first it was:
query
to subscription
It was OK for first 2-3 months then:
First we did optimization of queries itself but it did not suffice:
Instead of complex subscription for whole data split into parts:
A. Subscription for a single field that indicates that entity was changed
E.g.
Instead of:
subscription{
document{
id
title
# other fields
pages{ # array relation
...
}
tasks{ # array relation
...
}
# multiple other array/object relations
# pagination and ordering
}
that returns thousands of rows.
Create a function that:
So it became:
subscription{
doc_change_date{
max_change_date
}
}
Always one row and always one field
B. Change of application logic
doc_change_date
max_change_date
It's absolutely OK if subscription function sometimes returns false positives.
There is no need to replicate all predicates from source query to subscription function.
E.g.
In our case: visibility of data depends on organizations and departments (and even more).
So if a user of one department creates/modifies document - this change is not visible to user of other department.
But those changes are like ones/twice in a minute per organization.
So for subscription function we can ignore those granularity and calculate max_change_date
for whole organization.
It's beneficial to have faster and cruder subscription function: it will trigger refresh of data more frequently but whole cost will be less.
The first step is a crucial one.
And Hasura has a multiplexing of subscriptions: https://hasura.io/docs/latest/graphql/core/databases/postgres/subscriptions/execution-and-performance.html#subscription-multiplexing
So in theory Hasura could be smart enough and solve your problems.
But if you think "explicit better than implicit" there is another step you can do.
In our case:
So subscriptions becames: doc_change_date, dossier_change_date, msg_change_date and so on.
But actually it could be beneficial to have just one subscription: "hey! there are changes for you!"
So instead of multiple subscriptions application makes just one.
We thought about 2 formats of multiplexed subscription:
{max_change_date}
that is accumulative for all entities{doc_change_date, dossier_change_date, msg_change_date}
Right now "A" works for us. But maybe we change to "B" in future.
That's what we did not tried yet.
Hasura 2.0 allows registering VOLATILE functions for queries.
That allows creation of functions with memoization in DB:
That allows further optimizations both for subscription functions and query functions.
Actually it's possible to do that without waiting for hasura 2.0 but it requires trickery on postgresql side:
It works but that's trick is hard to recommend.
Who knows, maybe future postgresql versions or updates will make it impossible.
That's everything that I can say on the topic right now.
Actually I would be glad to read something similar a year ago.
If somebody sees some pitfalls - please comment, I would be glad to hear opinions and maybe alternative ways.
I hope that this explanation will help somebody or at least provoke thought how to deal with subscriptions in other ways.