I have 2 tables: projects and tasks (that's a simplification, but will do for now). PROJECTS
has a hierarchy of projects where there can be a detached (standalone) project or a projects hierarchy. TASKS
contains a list of tasks. Tasks are hierarchical in nature and are linked to projects. In case of the projects hierarchy, tasks in the top level are kind of templates and can be redefined in the lower level projects (in my case, i have a hierarchy of 2 projects). That is, i want:
Sample table structure:
create table projects
( project_id varchar(14)
, parent_project_id varchar(14));
create table tasks
( project_id varchar(14)
, task_id varchar(6)
, parent_task_id varchar(6)
, inactive char(1)
, redefined char(1)
);
Sample data:
PROJECTS
:
PROJECT_ID | PARENT_PROJECT_ID |
---|---|
MASTER_PROJECT | ' ' |
EXECUTION_PROJ | MASTER_PROJECT |
UNRELATED_PROJ | ' ' |
TASKS
:
PROJECT_ID | TASK_ID | PARENT_TASK_ID | INACTIVE | REDEFINED |
---|---|---|---|---|
EXECUTION_PROJ | 101000 | 100000 | X | X |
EXECUTION_PROJ | 101010 | 101000 | X | |
EXECUTION_PROJ | 101011 | 101010 | X | |
EXECUTION_PROJ | 101012 | 101010 | ||
EXECUTION_PROJ | 102000 | 100000 | X | |
EXECUTION_PROJ | 102010 | 102000 | ||
EXECUTION_PROJ | 102011 | 102010 | ||
EXECUTION_PROJ | 210000 | 200000 | ||
EXECUTION_PROJ | 300000 | 000000 | X | |
EXECUTION_PROJ | 310000 | 000000 | X | X |
MASTER_PROJECT | 100000 | 000000 | ||
MASTER_PROJECT | 101000 | 100000 | ||
MASTER_PROJECT | 102000 | 100000 | X | |
MASTER_PROJECT | 200000 | 000000 | ||
MASTER_PROJECT | 300000 | 000000 | ||
MASTER_PROJECT | 310000 | 300000 |
I can determine the top-level project easily (or i can hardcode it, doesn't matter); and i can show hierarchy of tasks, that also works fine (sorry, working in the ABAP environment, can't create Oracle tables; trying to emulate this with select from dual
):
with projects as (
select 'MASTER_PROJECT' project_id, ' ' parent_project_id from dual
union all
select 'EXECUTION_PROJ', 'MASTER_PROJECT' from dual
union all
select 'UNRELATED_PROJ', ' ' from dual),
toplvl_proj as (
select project_id from projects
where connect_by_isleaf = 1
connect by prior parent_project_id = project_id
start with project_id = 'EXECUTION_PROJ'),
tasks(PROJECT_ID, TASK_ID, PARENT_TASK_ID, INACTIVE, REDEFINED) as (
select 'EXECUTION_PROJ', '101000', '100000', 'X', 'X' from dual union all
select 'EXECUTION_PROJ', '101010', '101000', 'X', ' ' from dual union all
select 'EXECUTION_PROJ', '101011', '101010', 'X', ' ' from dual union all
select 'EXECUTION_PROJ', '101012', '101010', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '102000', '100000', ' ', 'X' from dual union all
select 'EXECUTION_PROJ', '102010', '102000', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '102011', '102010', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '210000', '200000', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '300000', '000000', ' ', 'X' from dual union all
select 'EXECUTION_PROJ', '310000', '000000', 'X', 'X' from dual union all
select 'MASTER_PROJECT', '100000', '000000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '101000', '100000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '102000', '100000', 'X', ' ' from dual union all
select 'MASTER_PROJECT', '200000', '000000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '300000', '000000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '310000', '300000', ' ', ' ' from dual)
select t.*, p.parent_project_id, level tsklvl
from tasks t
join projects p on p.project_id = t.project_id
connect by prior t.task_id = t.parent_task_id
start with t.project_id in (select project_id from toplvl_proj)
and t.parent_task_id = '000000'
What i don't understand is how to leave only the lowest level (from the task_id perspective) - that is, if a task exists only in the parent project or only in the execution project, select it; if it exists in both projects, select only the record belonging to the execution project.
The above query produces this result:
PROJECT_ID | TASK_ID | PARENT_TASK_ID | INACTIVE | REDEFINED | PARENT_PROJECT_ID | TSKLVL |
---|---|---|---|---|---|---|
MASTER_PROJECT | 100000 | 000000 | 1 | |||
EXECUTION_PROJ | 101000 | 100000 | X | X | MASTER_PROJECT | 2 |
EXECUTION_PROJ | 101010 | 101000 | X | MASTER_PROJECT | 3 | |
EXECUTION_PROJ | 101011 | 101010 | X | MASTER_PROJECT | 4 | |
EXECUTION_PROJ | 101012 | 101010 | MASTER_PROJECT | 4 | ||
MASTER_PROJECT | 101000 | 100000 | 2 | |||
EXECUTION_PROJ | 101010 | 101000 | X | MASTER_PROJECT | 3 | |
EXECUTION_PROJ | 101011 | 101010 | X | MASTER_PROJECT | 4 | |
EXECUTION_PROJ | 101012 | 101010 | MASTER_PROJECT | 4 | ||
EXECUTION_PROJ | 102000 | 100000 | X | MASTER_PROJECT | 2 | |
EXECUTION_PROJ | 102010 | 102000 | MASTER_PROJECT | 3 | ||
EXECUTION_PROJ | 102011 | 102010 | MASTER_PROJECT | 4 | ||
MASTER_PROJECT | 102000 | 100000 | X | 2 | ||
EXECUTION_PROJ | 102010 | 102000 | MASTER_PROJECT | 3 | ||
EXECUTION_PROJ | 102011 | 102010 | MASTER_PROJECT | 4 | ||
MASTER_PROJECT | 200000 | 000000 | 1 | |||
EXECUTION_PROJ | 210000 | 200000 | MASTER_PROJECT | 2 | ||
MASTER_PROJECT | 300000 | 000000 | 1 | |||
MASTER_PROJECT | 310000 | 300000 | 2 |
What i need is this (duplicate blocks removed, tasks starting with 3* show EXECUTION_PROJ instead of MASTER_PROJECT and respective values in REDEFINED and INACTIVE fields):
PROJECT_ID | TASK_ID | PARENT_TASK_ID | INACTIVE | REDEFINED | PARENT_PROJECT_ID | TSKLVL |
---|---|---|---|---|---|---|
MASTER_PROJECT | 100000 | 000000 | 1 | |||
EXECUTION_PROJ | 101000 | 100000 | X | X | MASTER_PROJECT | 2 |
EXECUTION_PROJ | 101010 | 101000 | X | MASTER_PROJECT | 3 | |
EXECUTION_PROJ | 101011 | 101010 | X | MASTER_PROJECT | 4 | |
EXECUTION_PROJ | 101012 | 101010 | MASTER_PROJECT | 4 | ||
EXECUTION_PROJ | 102000 | 100000 | X | MASTER_PROJECT | 2 | |
EXECUTION_PROJ | 102010 | 102000 | MASTER_PROJECT | 3 | ||
EXECUTION_PROJ | 102011 | 102010 | MASTER_PROJECT | 4 | ||
MASTER_PROJECT | 200000 | 000000 | 1 | |||
EXECUTION_PROJ | 210000 | 200000 | MASTER_PROJECT | 2 | ||
EXECUTION_PROJ | 300000 | 000000 | 1 | |||
EXECUTION_PROJ | 310000 | 300000 | 2 |
Can this be achieved?
Silly me. ROW_NUMBER()
allows to get rid of unnecessary duplicates, and then hierarchical query doesn't need to worry about cross-project connections:
with projects as (
select 'MASTER_PROJECT' project_id, ' ' parent_project_id from dual
union all
select 'EXECUTION_PROJ', 'MASTER_PROJECT' from dual
union all
select 'UNRELATED_PROJ', ' ' from dual),
proj_hier as (
select project_id, level proj_level from projects
connect by parent_project_id = prior project_id
start with project_id = 'MASTER_PROJECT'),
tasks(PROJECT_ID, TASK_ID, PARENT_TASK_ID, INACTIVE, REDEFINED) as (
select 'EXECUTION_PROJ', '101000', '100000', 'X', 'X' from dual union all
select 'EXECUTION_PROJ', '101010', '101000', 'X', ' ' from dual union all
select 'EXECUTION_PROJ', '101011', '101010', 'X', ' ' from dual union all
select 'EXECUTION_PROJ', '101012', '101010', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '102000', '100000', ' ', 'X' from dual union all
select 'EXECUTION_PROJ', '102010', '102000', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '102011', '102010', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '210000', '200000', ' ', ' ' from dual union all
select 'EXECUTION_PROJ', '300000', '000000', ' ', 'X' from dual union all
select 'EXECUTION_PROJ', '310000', '000000', 'X', 'X' from dual union all
select 'MASTER_PROJECT', '100000', '000000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '101000', '100000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '102000', '100000', 'X', ' ' from dual union all
select 'MASTER_PROJECT', '200000', '000000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '300000', '000000', ' ', ' ' from dual union all
select 'MASTER_PROJECT', '310000', '300000', ' ', ' ' from dual),
proj_tasks as (
select t.*, ph.proj_level,
row_number() over(partition by t.task_id order by ph.proj_level desc) rn
from tasks t
join proj_hier ph on t.project_id = ph.project_id)
select t.*, level task_level from proj_tasks t
where rn = 1
connect by prior task_id = parent_task_id
start with parent_task_id = '000000'
Sorry... it took me the whole day and a post to StackOverflow to realise what the solution should look like...