select distinct
promo_name
,case
when substring(promo_name,instr(promo_name, "P0"),2) = "P0" then 0
when substring(promo_name,instr(promo_name, "P1"),2) = "P1" then 1
When substring(promo_name,instr(promo_name, "P01"),3) = "P01" then 1
when substring(promo_name,instr(promo_name, "P2"),2) = "P2" then 2
When substring(promo_name,instr(promo_name, "P02"),3) = "P02" then 2
when substring(promo_name,instr(promo_name, "P3"),2) = "P3" then 3
when substring(promo_name,instr(promo_name, "P03"),3) = "P03" then 3
when substring(promo_name,instr(promo_name, "P4"),2) = "P4" then 4
when substring(promo_name,instr(promo_name, "P04"),3) = "P04" then 4
when substring(promo_name,instr(promo_name, "P5"),2) = "P5" then 5
when substring(promo_name,instr(promo_name, "P05"),3) = "P05" then 5
when substring(promo_name,instr(promo_name, "P6"),2) = "P6" then 6
when substring(promo_name,instr(promo_name, "P06"),3) = "P06" then 6
when substring(promo_name,instr(promo_name, "P7"),2) = "P7" then 7
when substring(promo_name,instr(promo_name, "P07"),3) = "P07" then 7
when trim(substring(promo_name,instr(promo_name, "P8"),2)) ="P8" then 8
when trim(substring(promo_name,instr(promo_name, "P08"),3)) ="P08" then 8
when trim(substring(promo_name,instr(promo_name, "P9"),2)) ="P9" then 9
when trim(substring(promo_name,instr(promo_name, "P09"),3)) ="P09" then 9
when trim(substring(promo_name,instr(promo_name, "P10"),3)) ="P10" then 10
when trim(substring(promo_name,instr(promo_name, "P11"),3)) ="P11" then 11
when trim(substring(promo_name,instr(promo_name, "P12"),3)) ="P12" then 12
else 0 end as promo_id ,Case when trim(substring(promo_name,instr(promo_name, "P10"),3)) = "P10" then 10 when trim(substring(promo_name,instr(promo_name, "P11"),3)) = "P11" then 11 when trim(substring(promo_name,instr(promo_name, "P12"),3)) = "P12" then 12 when trim(substring(promo_name,instr(promo_name, "P13"),3)) = "P13" then 13 when trim(substring(promo_name,instr(promo_name, "P14"),3)) = "P14" then 14 else 0 end as id from hbi_dns_protected.store_zones_stock_v7_1_4 where promo_name is not null
Trying to extract ID from String, it works fine from P10 to P14 when I use in a separate column, when I do in the same column it just pick 1 instead of 11, 1 instead of 12 etc...
Am I making an mistake here? sample data
Why not use regexp_extract
to do a RegEx extraction from your string, rather than write code for each case, something like:
%sql
SELECT *,
regexp_extract( promo_name, ' P(\\d+)', 1 ) AS promoNumber
FROM tmp
My results:
NB The regex is case-sensitive. If you have a requirement to capture either lower or upper case Ps, then you could use a character class ie [pP]
instead.
A full explanation of the RegEx pattern used:
[pP]
which says match any character (case-sensitive) in the brackets(\\d+)
. This is made up of the RegEx pattern \d
for matching digits, the +
symbol which means, 'match one or more'. The brackets make it into a group, which is group 1. The \d
has an additional slash which is an escape character required for the Spark SQL implementation of regexp_extract
.regexp_extract
has the value of 1 which means 'return group 1 from the function'I use regex101.com for testing and practicing with RegEx expressions.