I have a table with columns name
and full name
, and am interested in knowing whether one is part of the other (but they are not equal).
Example:
name | full name | is_subset
-------------------------------------------
john smith | john smith | false
john smith | john h. smith | true
john smith | alice jones | false
I found a method that works using SPLIT
and array intersections, but was curious as to whether there is a simpler or more efficient method.
SELECT
name,
full name,
CARDINALITY(
(
ARRAY_INTERSECT(
SPLIT(name, ' '),
SPLIT(full name, ' ')
)
)
) = CARDINALITY(SPLIT(name, ' ')) AND CARDINALITY(
SPLIT(full name, ' ')
) > CARDINALITY(SPLIT(name, ' ')) AS is_subset
from t
-- name is a logical subset of fullname
ARRAY_EXCEPT(
SPLIT(name, ' '),
SPLIT(fullname, ' ')
)
=
ARRAY []
AND
-- name and fullname are unequal sets
ARRAY_EXCEPT(
SPLIT(fullname, ' '),
SPLIT(name, ' ')
)
!=
ARRAY []
The second condition can probably be reduced to...
AND
LENGTH(name) < LENGTH(fullname)
(The shorter LENGTH() version can be "tricked" by certain uses of duplicates.)