I'm guessing my question was asked before, but after hours of scouring, either I'm not able to figure out how the answers match my dilemma or it truly hasn't been asked (I'm guessing the former). So, I'm going to ask again, in my own way, so that I may better understand the answer.
The goal is to grab data for a project based on criteria that may be provided to me. There are three different types of criteria: C, M, and P. I usually receive two of the three different types of criteria for a project, and it's not always the same two. I need to be able to restrict a particular type of criteria only if they provide to me any of that type of criteria.
Here's the general layout of my query that I'm starting with (assuming project_id = 1 for this example):
WITH
cte_c AS (SELECT c FROM prj_rqst_c_criteria WHERE project_id = 1)
, cte_m AS (SELECT m FROM prj_rqst_m_criteria WHERE project_id = 1)
, cte_p AS (SELECT p FROM prj_rqst_p_criteria WHERE project_id = 1)
SELECT * FROM data_table
INNER JOIN cte_c ON data_table.c = cte_c.c
INNER JOIN cte_m ON data_table.m = cte_m.m
INNER JOIN cte_p ON data_table.p = cte_p.p
;
Assuming I have rows returned in cte_c and cte_p but no rows returned in cte_m, this would obviously yield 0 results for the entire query. What I want in that scenario is for cte_m to effectively be ignored while applying the JOINs for cte_c and cte_p. How do I modify the JOINs so that if any cte returns no rows, that particular cte will be ignored?
Thanks!
Edit: Some additional items: data_table has well over 1 million rows, so the goal is to only return rows that match the criteria that was provided for the project.
Some example data:
data_table
id | c | m | p
------------------
1 | A | 101 | 999
2 | B | 102 | 998
3 | A | 103 | 998
4 | A | 102 | 999
5 | B | 101 | 998
If I'm asked to grab a project where c = 'A' and p = '999' but I'm not given any criteria for m, then I'd want the following rwos returned:
id | c | m | p
------------------
1 | A | 101 | 999
4 | A | 102 | 999
If I'm asked to pull a project where m = 102 and c = 'A', then I just want the following returned:
id | c | m | p
------------------
4 | A | 102 | 999
I hope this helps to visualize. Thanks again!
In the absence of any further context, this can be solved using EXISTS
and NOT EXISTS
.
WITH
param_c AS (SELECT * FROM prj_rqst_c_criteria WHERE project_id = 1),
param_m AS (SELECT * FROM prj_rqst_m_criteria WHERE project_id = 1),
param_p AS (SELECT * FROM prj_rqst_p_criteria WHERE project_id = 1)
SELECT
*
FROM
data_table AS d
WHERE
(
NOT EXISTS (SELECT * FROM param_c)
OR EXISTS (SELECT * FROM param_c WHERE c = d.c)
)
AND
(
NOT EXISTS (SELECT * FROM param_m)
OR EXISTS (SELECT * FROM param_m WHERE m = d.m)
)
AND
(
NOT EXISTS (SELECT * FROM param_p)
OR EXISTS (SELECT * FROM param_p WHERE p = d.p)
)
It's a little long winded, but that's not the biggest problem.
The biggest problem is the repeated use of OR
to cater for cases where there are no criteria for a given project.
Ideally you'd want to tell the DBMS just to ignore that whole block, but SQL doesn't work that way. In SQL, a statement is compiled to a single execution plan, able to cater for all scenarios. Essentially a pessimistic least-worst plan.
That could be avoided with dynamic-SQL (Using SQL to write SQL, with a custom WHERE clause).
Fiddle (adapted from ValNik's) : https://dbfiddle.uk/tWuJJlY3