djangodjango-annotatedjango-aggregationdjango-subquery

Annotating with the count of a subquery filtering by jsonb fields in Django


I have a single model with a jsonb field. There is a value inside this jsonb field that can be shared amongst other rows. I am trying to get the count of a subquery while filtering by this jsonb field.

Some pseudo code of what I have been attempting borrowing examples from this post.

alpha_qs = MyModel.objects.filter(class_id="ALPHA")
# unnest jsonb field so I can leverage it via OuterRef
alpha_qs = alpha_qs.annotate(nested_value_id=KeyTextTransform("nested_value_id", "a_jsonb_field"))
related_beta_subquery = MyModel.objects.filter(class_id="BETA", a_jsonb_field__nested_value_id=OuterRef("nested_value_id"))
related_beta_subquery_count = related_beta_subquery.annotate(count=Count("*")).values("count")
alpha_qs = alpha_qs.annotate(related_beta_count=Subquery(related_beta_subquery))

Using this example data I would expect the top instance to have a related_beta_count of 2 because there are two associated betas with the same nested_value_id.

{
  "class_id": "ALPHA",
  "a_jsonb_field": {
    "nested_value_id": 'abc'
  }
}

{
  "class_id": "BETA",
  "a_jsonb_field": {
    "nested_value_id": 'abc'
  }
}
{
  "class_id": "BETA",
  "a_jsonb_field": {
    "nested_value_id": 'abc'
  }
}
{
  "class_id": "BETA",
  "a_jsonb_field": {
    "nested_value_id": 'zyz'
  }
}

I've been getting an error below but haven't been able to resolve it.

ProgrammingError: operator does not exist: jsonb = text
LINE 1: ...d AND (U0."a_jsonb_field" -> 'nested_value_id') = ("my_model...
                                                             ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

Solution

  • I got it working my use case. I don't consider myself well versed in Django so there may be a cleaner way to achieve this.

    # Get initial queryset
    alpha_qs = MyModel.objects.filter(class_id="ALPHA")
    # Pull nested jsonb value out via annotation/KeyTextTransform
    alpha_qs = alpha_qs.annotate(nested_value_id=KeyTextTransform("nested_value_id", "a_jsonb_field"))
    related_beta_subquery = (
        MyModel.objects.filter(
            class_id="BETA"
        )
        # Pull out nested jsonb for subquery as well
        .annotate(
            nested_value_id=KeyTextTransform("nested_value_id", "a_jsonb_field")
        )
        # Filter to ensure these BETA models have a matched nested_value_id to my ALPHA models
        .filter(
            nested_value_id=OuterRef("nested_value_id")
        )
        # This was honestly the harder part that took a lot of trial and error
        .order_by()
        .values("nested_value_id")
        .annotate(count=Count("*"))
        .values("count")
    )
    alpha_qs = alpha_qs.annotate(
        count=Case(
            # This Case statement is mostly just bulletproofing. May be unnecessary for your use case
            When(a_jsonb_field__has_key="nested_value_id", then=Subquery(related_beta_subquery)),
            default=Value(0),
            output_field=IntegerField(),
        )
    )