sqloraclejoinmultiple-join-rows

How to get previous N rows for multiple joining rows


I am writing a SQL with Oracle Client 12 driver. I have two tables simplified as appended, and I want to get a table with following logic. The "B.TIME_B <= A0.TIME_A" seems created massive joining and made the query very slow. Please help to find best solution.

WITH A0 AS (
  SELECT *
  FROM A
  WHERE A.EVENT = 'a0'
)

SELECT * FROM (
SELECT
  ROW_NUMBER() OVER (PARTITION BY A0.TIME_A0 ORDER BY B.TIME_B DESC) RN, 
  A0.*,
  B.*
FROM 
   A0,B
WHERE
   B.TIME_B <= A0.TIME_A) B0

WHERE B0.RN <= 3   
  1. Find TIME_A, where EVENT_A = 'a0', as TIME_A0,

  2. Find TIME_B = TIME_A0, as EVENT_B0,

  3. And then get the row and previous 2 rows of table B, where EVENT_B0 found. N in this example is 3, and M is 2, but in real case both number are over 3000, so efficiency will be appreciated.

TableA

TIME_A  EVENT_A
1       a1
2       a1
3       a1
4       a0
5       a2
6       a2
7       a3
8       a0

Table B

TIME_B  EVENT_B
1       b1
2       b2
3       b3
4       b4
5       b5
6       b5
7       b6
8       b7

JOIN A_B

TIME_A  EVENT_A TIME_B  EVENT_B
4       a0      2       b2
4       a0      3       b3
4       a0      4       b4
8       a0      6       b5
8       a0      7       b6
8       a0      8       b7

Solution

  • Query 1:

    If you are not going to have overlapping ranges then you can use:

    SELECT *
    FROM   (
      SELECT TIME_B,
             EVENT_B,
             MAX( TIME_A  ) OVER ( ORDER BY TIME_B ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING )
               AS TIME_A,
             MAX( EVENT_A ) OVER ( ORDER BY TIME_B ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING )
               AS EVENT_A
      FROM   tableB B
             LEFT OUTER JOIN tableA A
             ON ( B.TIME_B = A.TIME_A AND A.EVENT_A = 'a0' )
    )
    WHERE  TIME_A IS NOT NULL;
    

    Which only uses a single join and then finds the valid rows with analytic functions.

    Output:

    TIME_B | EVENT_B | TIME_A | EVENT_A
    -----: | :------ | -----: | :------
         2 | b2      |      4 | a0     
         3 | b3      |      4 | a0     
         4 | b4      |      4 | a0     
         6 | b5      |      8 | a0     
         7 | b6      |      8 | a0     
         8 | b7      |      8 | a0     
    

    db<>fiddle here


    Query 2:

    If you could have overlapping ranges then you could use a hierarchical query to generate the rows:

    SELECT TIME_B,
           EVENT_B,
           CONNECT_BY_ROOT( TIME_A ) AS TIME_A,
           CONNECT_BY_ROOT( EVENT_A ) AS EVENT_A
    FROM   (
      SELECT A.*,
             B.*,
             ROW_NUMBER() OVER ( ORDER BY TIME_B ) AS rn
      FROM   tableB B
             LEFT OUTER JOIN tableA A
             ON ( B.TIME_B = A.TIME_A AND A.EVENT_A = 'a0' )
    )
    WHERE LEVEL <= 2
    START WITH EVENT_A IS NOT NULL
    CONNECT BY PRIOR rn -2 <= rn AND rn < PRIOR rn
    ORDER BY time_a, time_b
    

    Output:

    TIME_B | EVENT_B | TIME_A | EVENT_A
    -----: | :------ | -----: | :------
         2 | b2      |      4 | a0     
         3 | b3      |      4 | a0     
         4 | b4      |      4 | a0     
         6 | b5      |      8 | a0     
         7 | b6      |      8 | a0     
         8 | b7      |      8 | a0     
         8 | b7      |     10 | a0     
         9 | b8      |     10 | a0     
        10 | b9      |     10 | a0     
    

    db<>fiddle here