I have a string like A18.30 and I want to compare if this falls under these 2 min and max values: A17.0 and A19.0.
Expectation is first letter needs to compared and remaining numeric digits, if it falls into the range to be compared.
I have tried with substring, but I want to know if there is a better way to do the comparison with regex. I use snowflake db.
It looks like the values that are compared are sort of "versions" which requires "natural sort". It can be achieved with SQL:
SELECT value, lower_limit, upper_limit,
TRANSFORM(REGEXP_EXTRACT_ALL(value,'(\\d+|[^\\d]+)'),e->IFF(e RLIKE '\\d+',e::INT,e))
BETWEEN TRANSFORM(REGEXP_EXTRACT_ALL(lower_limit,'(\\d+|[^\\d]+)'),e->IFF(e RLIKE '\\d+',e::INT,e))
AND TRANSFORM(REGEXP_EXTRACT_ALL(upper_limit,'(\\d+|[^\\d]+)'),e->IFF(e RLIKE '\\d+',e::INT,e))
AS is_value_between_lower_upper_limit
FROM VALUES ('A18.30','A17.0','A19.0'),
('P1A.5', 'P1.0', 'A2.0') AS s(value, lower_limit, upper_limit);
Output:
More at: How to sort "version" strings with SQL in Snowflake?
Alternatively by using UDF:
create or replace function natural_sort(str TEXT)
returns array
language python
runtime_version = '3.11'
packages = ('natsort')
handler = 'sort'
as
$$
import natsort
def sort(str):
return list(natsort.natsort_key(str))
$$;
SELECT value, lower_limit, upper_limit,
natural_sort(value) BETWEEN natural_sort(lower_limit)
AND natural_sort(upper_limit) AS is_value_between_lower_upper_limit
FROM VALUES ('A18.30','A17.0','A19.0'),
('P1A.5', 'P1.0', 'A2.0') AS s(value, lower_limit, upper_limit);
Output: