pythondjangojoindjango-orm

django ORM query filter methods running multiple filter duplicates joins


I'm trying to run filters using methods in two separate attributes.

In ICD10Filter:

class Icd10Filter(filters.FilterSet):


    # New Filters for DOS Range
    dosFrom = filters.DateFilter(method='filter_by_dos_from', lookup_expr='gte')
    dosTo = filters.DateFilter(method='filter_by_dos_to', lookup_expr='lte')



    def filter_by_dos_from(self, queryset, name, value):
    
        return queryset.filter(
            base_icd__chart_review_dos__dos_from__gte=value
        )

    def filter_by_dos_to(self, queryset, name, value):        
        return queryset.filter(
            base_icd__chart_review_dos__dos_to__lte=value
        )

ICD10 filter is referenced in ChartReviewDx Model:

class ChartReviewDx(models.Model):
    chart_review_dos = models.ForeignKey(
        ChartReviewDos, on_delete=models.SET_NULL, null=True, related_name="diagnosis_details"
    )
    diagnosis_code = models.CharField(max_length=1024, null=True, blank=True)
    diagnosis_description = models.CharField(max_length=1024, null=True, blank=True)
    icd10 = models.ForeignKey("risk_adjustment.Icd10", on_delete=models.SET_NULL, null=True)
    base_icd = models.ForeignKey(
        "risk_adjustment.Icd10", on_delete=models.SET_NULL, null=True, blank=True, related_name="base_icd"
    )

and ChartReviewDx is referenced in ChartReviewDOS model:

class ChartReviewDos(models.Model):
    chart = models.ForeignKey(Chart, on_delete=models.SET_NULL, null=True, blank=True, related_name="diagnosis")
    dos_from = models.DateField()
    dos_to = models.DateField()

I want to fetch the ICD10 codes for particular DOS range only.

The desired query is:

SELECT 
  distinct id, 
  code, 
  description 
FROM 
  risk_adjustment_icd10 
  INNER JOIN healthcare_data_chart_review_dx ON (
    id = healthcare_data_chart_review_dx.base_icd_id
  ) 
  INNER JOIN healthcare_data_chart_review_dos ON (
    healthcare_data_chart_review_dx.chart_review_dos_id = healthcare_data_chart_review_dos.id
  ) 
WHERE 
  (
    valid = 1     
    AND healthcare_data_chart_review_dos.dos_from >= '2023-08-19' 
    AND healthcare_data_chart_review_dos.dos_to <= '2023-08-19' 
  ) 
ORDER BY 
  code ASC

When I only run the filter for one of the fields, the query is working fine.

But running filters on both fields give redundant JOINS and thus inaccurate results:

The query that is generated after applying both filters:

SELECT 
  DISTINCT id, 
  code, 
  description 
FROM 
  risk_adjustment_icd10 
  INNER JOIN healthcare_data_chart_review_dx ON (
    id = healthcare_data_chart_review_dx.base_icd_id
  ) 
  INNER JOIN healthcare_data_chart_review_dos ON (
    healthcare_data_chart_review_dx.chart_review_dos_id = healthcare_data_chart_review_dos.id
  ) 
  INNER JOIN healthcare_data_chart_review_dx T4 ON (
    id = T4.base_icd_id
  ) 
  INNER JOIN healthcare_data_chart_review_dos T5 ON (
    T4.chart_review_dos_id = T5.id
  ) 
WHERE 
  (
    valid = 1 
    AND healthcare_data_chart_review_dos.dos_from >= '2023-08-19' 
    AND T5.dos_to <= '2023-08-19' 
  ) 
ORDER BY 
  code asc

How can I remove this redundant joins?


Solution

  • If you join on models, there is a huge difference between:

    queryset.filter(
        base_icd__chart_review_dos__dos_from__gte=value
    ).filter(
        base_icd__chart_review_dos__dos_to__lte=value
    )

    and:

    queryset.filter(
        base_icd__chart_review_dos__dos_from__gte=value,
        base_icd__chart_review_dos__dos_to__lte=value,
    )

    The former asks for a record where there is a related review_dos for which from is greater than or equal to value, and a related review_dos, possibly a different one, for which to is less than or equal to value. For the latter, that review_dos record has to be the same.

    If you thus use the same .filter(…) [Django-doc], then the JOINs are reused along the clauses in that .filter(…) whereas separate .filter(…) calls will thus generate different joins.

    This is thus not a bug, both are valid cases, depending on what you want, you use one of the scenarios.

    You thus probably should make a third method to filter where you use the same .filter(…).