sqlsqliterecursive-query

Find none overlapping intervals using sqlite


I have a data base with ids and time intervals example:

TimeId Start End
1 0 10
1 2 13
1 11 21
1 15 30
2 0 10
2 2 13
2 11 21
2 15 30

I want to write a query which selects none-overlapping intervals.Rows with different TimeId are considered non-overlapping even if the range [start, end] from both overlaps. So expected output from the previous example would be

TimeId Start End
1 0 10
1 11 21
2 0 10
2 11 21

another possible output

timeid start end
1 0 10
1 15 30
2 0 10
2 15 30

or

timeid start end
1 2 13
1 15 30
2 0 10
2 15 30

or

timeid start end
1 2 13
1 15 30
2 2 13
2 15 30

What is important is that you output as much as possible none-overlapping intervals.

So Basically there are two conditions:

  1. The output must not contain overlaping intervals (where the defintion of overlaping contains also the timeid as explained above)
  2. In the remaining (not selected or filtered out) rows there must be no rows which can be added to the result without breaking the first condition. Meaning: if I added any other row from the not selected ones, the outputed result will contain now an overlapping interval. Maybe I can rephrase this condition as the outputted set of rows should be maximum.

So for example the following output is not acceptable:

timeid start end
1 2 13
1 15 30
2 2 13

sine I can add the row

timeid start end
2 15 30

and it will still not overlaps with any of the three outputted rows.

Example database

CREATE TABLE "test" (
    "timeid" INTEGER,
    "start"  INTEGER,
    "end"    INTEGER
);

INSERT INTO test VALUES(1, 0, 10);
INSERT INTO test VALUES(1, 2, 13);
INSERT INTO test VALUES(1, 3, 15);
INSERT INTO test VALUES(1, 11, 21);
INSERT INTO test VALUES(1, 15, 30);

INSERT INTO test VALUES(2, 0, 10);
INSERT INTO test VALUES(2, 2, 13);
INSERT INTO test VALUES(2, 11, 21);
INSERT INTO test VALUES(2, 15, 30);

Last thing which gave hope was

with recursive nol(id, start, end) AS (
  select test.timeid, test.start, test.end from test inner join 
 ( select timeid as ttid, min(start) as ttstart from test group by id)  as tt on
 tt.ttid = test.timeid and tt.ttstart = test.start
    union all 
    select test.timeid , test.start, test.end from test  inner join nol 
    on test.timeid = nol.id and test.start > nol.end    
    GROUP By test.timeid
    Having min(test.start)
)
select * from nol 

Assumptions and Problems

In the above code (and in the real case that I have), as you may noticed, i assume that the combination timeid and start are unique, i.e., no two rows shoud have the same timeid and start.

The problem with the above code is that it is not supported. I get the error (Result: recursive aggregate queries not supported).

Could you please help what would be an alternative to using aggregate inside recursive?

thanks in advance


Solution

  • You could try to prepare your data in a way that you can recognize the possible overlaps, flag them and filter the result accordingly.
    With your sample data the code resulting as in one of your expectations is:

    --      prepare your table data for overlap testings 
    WITH 
      grid AS
        ( Select      t.timeid, t.starts, t.ends, 
                      ( Select Max(starts) From tbl Where timeid = t.timeid And starts < t.starts ) as last_start, 
                      ( Select Max(ends) From tbl Where timeid = t.timeid And starts < t.starts ) as last_end, 
                      t2.starts as starts_2, t2.ends as ends_2 
          From        tbl t
          Left Join   tbl t2 ON( t2.timeid = t.timeid And 
                                 t2.starts > t.ends ) 
        ), 
    --      overlaps and flags - final preparation for main SQL  
      overlaps AS 
        ( SELECT DISTINCT
                     g.timeid, g.starts, g.ends, 
                     g2.starts as starts_overlap, g2.ends as ends_overlap,
                     Case When g2.starts < LAG(g2.ends) Over(Partition By g2.timeid Order By g2.starts)
                          Then 'OUT' 
                     Else 'IN' 
                     End as flag
          FROM       tbl t
          INNER JOIN ( Select     * 
                       From       grid
                       Where      last_start Is Null OR starts > last_end
                     ) g ON( ( g.timeid = t.timeid  And g.starts = t.starts And g.ends = t.ends )  
                            OR
                             ( g.timeid = t.timeid  And g.starts_2 = t.starts And g.ends_2 = t.ends )  
                           )
          LEFT JOIN  grid g2 ON( g2.timeid = t.timeid And g2.starts > t.starts And g2.starts < t.ends )
          ORDER BY   t.timeid, t.starts
        ) 
    
    --      S Q L : 
    Select     t.* 
    From       overlaps o 
    Inner Join tbl t ON( t.timeid = o.timeid And 
                              t.starts = Coalesce(o.starts_overlap, o.starts) And 
                              t.ends = Coalesce(o.ends_overlap, o.ends) )
    Where      ( o.starts_overlap Is Null OR o.starts_overlap > o.ends ) And o.flag = 'IN'
    Order By   o.timeid, o.starts 
    

    R e s u l t :

    timeid starts ends
    1 0 10
    1 15 30
    2 0 10
    2 15 30

    fiddle