sqlgoogle-bigquerywindow-functionsstring-agg

How to concatenate N rows of a column based on a specific column value using a condition in Google BigQuery?


TimeStamp USER ID string_col
1 100001 Here
2 100001 there
3 100001 Apple
4 200002 this is
5 200002 that is
6 200002 Apple
7 200002 some
8 200002 summer
9 200002 winter
10 200002 Apple

That is my raw table & I want to use the word "Apple" as a trigger word to concatenate ONLY N (For this example 2) rows above that for every userID.

Below table is the output I'm looking for:

TimeStamp USER ID string_col Result
1 100001 Here null
2 100001 there null
3 100001 Apple Here There
4 200002 this is null
5 200002 that is null
6 200002 Apple this is that is
7 200002 some null
8 200002 summer null
9 200002 winter null
10 200002 Apple summer winter

Solution

  • For the passion of problem solving and coding, I would like to share this simple solution to see if it helps to answer your question:

    SELECT
      t1."TimeStamp",
      t1."USER ID",
      t1."string_col",
      (
        SELECT
          STRING_AGG(t2."string_col", ' ')
        FROM
          raw_table AS t2
        WHERE
          t2."USER ID" = t1."USER ID"
          AND t2."TimeStamp" < t1."TimeStamp"
          AND t1."string_col" = 'Apple'
          AND t2."string_col" != 'Apple'
          AND t1."TimeStamp" - t2."TimeStamp" <= 2
      ) AS Result
    FROM
      raw_table AS t1
    

    The output looks like this:
    enter image description here