sqloracle-database

Need to extract value from key-value pair data


I have one column that contains the value in the key-value pair. But I need to extract only value from that column.

Input Table:

+------+---------------------------------+
| Col1 |              Col2               |
+------+---------------------------------+
|    1 | key1:Val1;Key2:Val2;;;Key5:Val5 |
+------+---------------------------------+

Expected Output:

+------+--------------------------------+------+------+------+------+
| Col1 |              Col2              | Col3 | Col4 | Col5 | Col6 |
+------+--------------------------------+------+------+------+------+
|    1 | key1:Val1;Key2:Val2;;Key4:Val4 | Val1 | Val2 |      | Val4 |
+------+--------------------------------+------+------+------+------+

Explanation:

As Col2 contains the value in key-value pair out of which I need to extract only value and store it in a different column and wherever there is no value we need to put it as blank. So basically, need to identify ; and :.

My attempt:

SELECT
  Col1,
  SUBSTR(REGEXP_SUBSTR(Col1, 'Key1:(.*?)(;|$)', 1, 1, NULL, 1), 5) AS Col2
From table_name;

DB - Oracle Sql Developer tool
Version - 19c

I am not sure whether this is the optimized way to handle this requirement and hence seeking help for the same.


Solution

  • The technique you are using is broadly correct but there are a few minor issues:

    1. You do not anchor the pattern to a start of a term so if you have the data kkey1:not this;key1:this and you search for key1 then you will partially match the kkey1 key and get the wrong value.
    2. Your data has a mix of title- and lower-case keys and you are only searching for title-case key names.
    3. It is unclear why you are using SUBSTR as you are extracting the capturing group from the regular expression and not the entire match so you do not need to remove the key prefix.

    If you want to UPDATE the table (i.e. store it in a different column) then use:

    UPDATE table_name
      SET col3 = REGEXP_SUBSTR(Col2, '(;|^)key1:(.*?)(;|$)', 1, 1, 'i', 2),
          col4 = REGEXP_SUBSTR(Col2, '(;|^)key2:(.*?)(;|$)', 1, 1, 'i', 2),
          col5 = REGEXP_SUBSTR(Col2, '(;|^)key3:(.*?)(;|$)', 1, 1, 'i', 2),
          col6 = REGEXP_SUBSTR(Col2, '(;|^)key4:(.*?)(;|$)', 1, 1, 'i', 2),
          col7 = REGEXP_SUBSTR(Col2, '(;|^)key5:(.*?)(;|$)', 1, 1, 'i', 2)
    

    Which for the sample data:

    CREATE TABLE table_name (
      col1 VARCHAR2(20),
      col2 VARCHAR2(100),
      col3 VARCHAR2(20),
      col4 VARCHAR2(20),
      col5 VARCHAR2(20),
      col6 VARCHAR2(20),
      col7 VARCHAR2(20)
    );
    
    INSERT INTO table_name (col1, col2)
      VALUES (1, 'key1:Val1;Key2:Val2;;;Key5:Val5');
    

    Then, after the UPDATE, the table contains:

    COL1 COL2 COL3 COL4 COL5 COL6 COL7
    1 key1:Val1;Key2:Val2;;;Key5:Val5 Val1 Val2 null null Val5

    If you just want a SELECT statement then use the same REGEXP_SUBSTR function calls as the UPDATE in the SELECT:

    SELECT col1,
           col2,
           REGEXP_SUBSTR(Col2, '(;|^)key1:(.*?)(;|$)', 1, 1, 'i', 2) AS col3,
           REGEXP_SUBSTR(Col2, '(;|^)key2:(.*?)(;|$)', 1, 1, 'i', 2) AS col4,
           REGEXP_SUBSTR(Col2, '(;|^)key3:(.*?)(;|$)', 1, 1, 'i', 2) AS col5,
           REGEXP_SUBSTR(Col2, '(;|^)key4:(.*?)(;|$)', 1, 1, 'i', 2) AS col6,
           REGEXP_SUBSTR(Col2, '(;|^)key5:(.*?)(;|$)', 1, 1, 'i', 2) AS col7
    FROM  table_name;
    

    fiddle