In a Databricks SQL query, I am creating a column to get the difference in Timestamps of successive rows. I use something like the following,
select
(Timestamp - (LAG(Timestamp, 1) OVER (partition by colA order by Timestamp))) tdiff,
*
from
the_table
But this gives me value of type interval day to second
like {"seconds": 180, "nano": 0, "negative": false, "zero": false, "units": ["SECONDS", "NANOS"]}
How can I get it converted to seconds. For this example, I should basically see 180.
Looks like in earlier runtimes (tested at least for 13.2) datediff supports the seconds unit.
https://docs.databricks.com/en/sql/language-manual/functions/datediff3.html
This just generates some random ids with multiple values and a fuzzed time to show the syntax.
Also shows equivalent extract from interval as originally posed.
%sql
with some_ids_and_random_ints as (
select
id,
floor(rand(5) * 1000) random_int
from
range(1000) AS t
),
force_dups_on_id as (
select
mod(id, 10) as id_with_dups,
timestampadd(second, random_int, current_timestamp()) as some_time
from
some_ids_and_random_ints
),
result_compare as (
select
*,
datediff(
second,
lag(some_time) over (
partition by id_with_dups
order by
some_time
),
some_time
) as diff_inbuilt_function,
(some_time - (LAG(some_time, 1) OVER (partition by id_with_dups order by some_time))) tdiff_as_interval
from
force_dups_on_id
order by
id_with_dups,
some_time
)
select id_with_dups,
some_time,
diff_inbuilt_function,
cast(extract(second from tdiff_as_interval) as integer) as seconds_from_interval
from result_compare