I want to write a case statement which can extract value for a particular substring from a column named details
which has multiple occurrences for [address]
is it possible with REGEX along with case?
sample Data in the column:
[address]:kattengat 28
[address]:1012 SZ
[address]: Amsterdam
The below SQL only outputs:
kattengat 28
Select case when to_number(instr(details),'[address')>0 then substr(details,REGEXP_INSTR(details,'address',1,1)+8,instr(substr(details,REGEXP_INSTR(details,'address',1,1)+8,'[')-1)) else '' end from table_name;
Expected output is :
kattengat 28 1012 SZ Amsterdam
Create table statement:
Create table test (id number(10), details clob);
Insert statement :
insert into test (id, details) values (1,to_clob ('[ADDRESS ] kattengat 28
[NAME ] ALEX
[PHONE ] 65438
[ADDRESS ] 1012 SZ
[DOB ] 1st Jan 1998
[ADDRESS ] Amsterdam')):
Please note I don't want to concat and add statements rather looking for a solution which can extract values associated with the substring [address] based on the number of occurrences of the substring in a single line
Here's one option:
SQL> with test (col) as
2 (select '[address]:kattengat 28
3 [address]:1012 SZ
4 [address]: Amsterdam' from dual
5 )
6 select trim(replace(regexp_substr(replace(col, chr(10), '#'), '[^#]+', 1, column_value), '[address]:', '')) result
7 from test cross join
8 table(cast(multiset(select level from dual
9 connect by level <= regexp_count(col, ':')
10 ) as sys.odcinumberlist));
RESULT
--------------------------------------------------------------------------------
kattengat 28
1012 SZ
Amsterdam
SQL>
What does it do?
regexp_substr
part of code is responsible for splitting source column value into separate rowsreplace
) by #
, and that character is used as a separator for regexp_substr
replace
removes [address]:
from the sourcetrim
removes leading/trailing empty strings (as the one in front of "Amsterdam"With sample data you posted later (by the way, are you sure there are spaces in front of [NAME] etc.? I guess NOT!):
SQL> select * from test;
ID DETAILS
---------- --------------------------------------------------
1 [ADDRESS ] kattengat 28
[NAME ] ALEX
[PHONE ] 65438
[ADDRESS ] 1012 SZ
[DOB ] 1st Jan 1998
[ADDRESS ] Amsterdam
Code I previously posted, slightly modified because previously address
was in lower case, there were NO spaces within square brackets, and there was a colon sign):
SQL> with temp as
2 (select trim(replace(regexp_substr(replace(details, chr(10), '#'), '[^#]+', 1, column_value), '[ADDRESS ]', '')) result
3 from test cross join
4 table(cast(multiset(select level from dual
5 connect by level <= regexp_count(details, '\[')
6 ) as sys.odcinumberlist))
7 )
8 select *
9 from temp
10 where instr(result, '[') = 0;
RESULT
--------------------------------------------------------------------------------
kattengat 28
1012 SZ
Amsterdam
SQL>
If you want to get result in one line, you could aggregate values returned by that query as
SQL> with temp as
2 (select trim(replace(regexp_substr(replace(details, chr(10), '#'), '[^#]+', 1, column_value), '[ADDRESS ]', '')) result,
3 column_value cv
4 from test cross join
5 table(cast(multiset(select level from dual
6 connect by level <= regexp_count(details, '\[')
7 ) as sys.odcinumberlist))
8 )
9 select listagg (result, ', ') within group (order by cv) final_result
10 from temp
11 where instr(result, '[') = 0;
FINAL_RESULT
--------------------------------------------------------------------------------
kattengat 28, 1012 SZ, Amsterdam
SQL>