I want to build a complicated filter:
queryset.filter(
(Q(k__contains=“1”) & Q(k__contains=“2”) & (~Q(k__contains=“3”))) |
(Q(k1__contains=“2”) & (~Q(k4__contains=“3”)))
)
The structure is fixed, but the query is dynamic and depends on a case specified by given input.
Tthe input could be for example :
(k=1&k=2&~k=3) | (k1=1&~k4=3)
or
(k=1&~k=3) | (k1=1&~k4=3) | (k=4&~k=3)
How to add parentheses to build this query to make it run as expected?
Two answers are presented: one simple and one perfect
It your expressions are simple and it seems better to read it by a short Python code. The most useful simplification is that parentheses are not nested. Less important simplification is that the operator OR ("|") is never in parentheses. The operator NOT is used only inside parentheses. The operator NOT is never double repeated ("~~").
Syntax: I express these simplifications in a form of EBNF syntax rules that could be useful later in a discussion about Python code.
expression = term, [ "|", term ];
term = "(", factor, { "&", factor }, ")";
factor = [ "~" ], variable, "=", constant;
variable = "a..z_0..9"; # anything except "(", ")", "|", "&", "~", "="
constant = "0-9_a-z... ,'\""; # anything except "(", ")", "|", "&", "~", "="
Optional white spaces are handled easily by .strip() method to can freely accept expressions like in mathematics. White spaces inside constants are supported.
Solution:
def compile_q(input_expression):
q_expression = ~Q() # selected empty
for term in input_expression.split('|'):
q_term = Q() # selected all
for factor in term.strip().lstrip('(').rstrip(')').split('&'):
left, right = factor.strip().split('=', 1)
negated, left = left.startswith('~'), left.lstrip('~')
q_factor = Q(**{left.strip() + '__contains': right.strip()})
if negated:
q_factor = ~q_factor
q_term &= q_factor
q_expression |= q_term
return q_expression
The superfluous empty and full Q expressions ~Q()
and Q()
are finally optimized and eliminated by Django.
Example:
>>> expression = "(k=1&k=2&~k=3) | ( k1 = 1 & ~ k4 = 3 )"
>>> qs = queryset.filter(compile_q(expression))
Check:
>>> print(str(qs.values('pk').query)) # a simplified more readable sql print
SELECT id FROM ... WHERE
((k LIKE %1% AND k LIKE %2% AND NOT (k LIKE %3%)) OR (k1 LIKE %1% AND NOT (k4 LIKE %3%)))
>>> sql, params = qs.values('pk').query.get_compiler('default').as_sql()
>>> print(sql); print(params) # a complete parametrized escaped print
SELECT... k LIKE %s ...
[... '%2%', ...]
The first "print" is a Django command for simplified more readable SQL without apostrophes and escaping, because it is actually delegated to the driver. The second print is a more complicated parametrized SQL command with all possible safe escaping.
This answer can compile any combination of boolean operators "|", "&", "~", any level of nested parentheses and a comparision operator "=" to a Q() expression:
Solution: (not much more complicated)
import ast # builtin Python parser
from django.contrib.auth.models import User
from django.db.models import Q
def q_combine(node: ast.AST) -> Q:
if isinstance(node, ast.Module):
assert len(node.body) == 1 and isinstance(node.body[0], ast.Expr)
return q_combine(node.body[0].value)
if isinstance(node, ast.BoolOp):
if isinstance(node.op, ast.And):
q = Q() # select all rows initially
for val in node.values:
q &= q_combine(val)
return q
if isinstance(node.op, ast.Or):
q = ~Q() # select none row initially
for val in node.values:
q |= q_combine(val)
return q
if isinstance(node, ast.UnaryOp):
assert isinstance(node.op, ast.Not)
return ~q_combine(node.operand)
if isinstance(node, ast.Compare):
assert isinstance(node.left, ast.Name)
assert len(node.ops) == 1 and isinstance(node.ops[0], ast.Eq)
assert len(node.comparators) == 1 and isinstance(node.comparators[0], ast.Constant)
return Q(**{node.left.id + '__contains': str(node.comparators[0].value)})
raise ValueError('unexpected node {}'.format(type(node).__name__))
def compile_q(expression: str) -> Q:
std_expr = (expression.replace('=', '==').replace('~', ' not ')
.replace('&', ' and ').replace('|', ' or ').lstrip())
return q_combine(ast.parse(std_expr))
Example: the same as in my previous answer a more complex following:
>>> expression = "~(~k=1&(k1=2|k1=3|(k=5 & k4=3)))"
>>> qs = queryset.filter(compile_q(expression))
The same example gives the same result, a more nested example gives a correct more nested result.
EBNF syntax rules are not important in this case, because no parser is implemented in this solution and the standard Python parser AST is used. It is little different with recursion.
expression = term, [ "|", term ];
term = factor, { "&", factor };
factor = [ "~" ], variable, "=", constant | [ "~" ], "(", expression, ")";
variable = "a..z_0..9"; # any identifier acceptable by Python, e.g. not a Python keyword
constant = "0-9_a-z... ,'\""; # any numeric or string literal acceptable by Python