pythonsqldjangorawsql

Django Error "django.db.utils.ProgrammingError: subquery has too many columns "


The raw query itself is correct and I am able to get retrieve the rawqueryset from the db. I need to convert this into queryset for further processing and I am facing below error.

Creating corresponding django query was hard for me and that is why I created SQL query, got the raw query set and now attempting to convert it to query set for further processing.

I have changed django model names and table names for anonymity.

Here is the output of what I tried in django shell. I was able to execute the below query but getting the error "django.db.utils.ProgrammingError: subquery has too many columns" when I try to access "queryset" below.

from django.db.models.expressions import RawSQL
from xyz.models import *
value = '1.2.3.4'
queryset = Test1.objects.filter(id__in=RawSQL("SELECT DISTINCT ON (test1.start_time, test1.id) test1.id, test1.name, test1.start_time FROM test1 WHERE EXISTS (SELECT * FROM test2 JOIN test3 ON test2.test3_id = test3.id AND test3.value = %s JOIN test4 ON test2.test4_id = test4.id AND test4.test1_id = test1.id) ORDER BY test1.start_time DESC", params=[value]))

For readability I have formatted the query used below.

SELECT
  DISTINCT ON (test1.start_time, test1.id)
  test1.id,
  test1.name,
  test1.start_time
FROM
  test1
WHERE
  EXISTS (
    SELECT
      *
    FROM
      test2
      JOIN test3 ON test2.test3_id = test3.id
      AND test3.value = 'value'
      JOIN test4 ON test2.test4_id = test4.id
      AND test4.test1_id = test1.id
  )
ORDER BY
  test1.start_time DESC

Solution

  • As the error states, you are selecting too many columns which your filter condition does not require. You subquery must select only Test1 table id field as you filter on id field. So basically you subquery should only project id field like this:

    SELECT
      test1.id
    FROM
      test1
    WHERE
      EXISTS (
        SELECT
          *
        FROM
          test2
          JOIN test3 ON test2.test3_id = test3.id
          AND test3.value = 'value'
          JOIN test4 ON test2.test4_id = test4.id
          AND test4.test1_id = test1.id
      )
    ORDER BY
      test1.start_time DESC
    

    So your final queryset should be like:

    queryset = Test1.objects.filter(id__in=RawSQL("SELECT test1.id FROM test1 WHERE EXISTS (SELECT * FROM test2 JOIN test3 ON test2.test3_id = test3.id AND test3.value = %s JOIN test4 ON test2.test4_id = test4.id AND test4.test1_id = test1.id) ORDER BY test1.start_time DESC", params=[value]))