sqlregexsnowflake-cloud-data-platformregexp-replace

Remove all occurrence of specific character from strings surrounded by double quotes


I have below string (this is a single string - I've added newlines only for readability):

"someField0";"some value1; some value2; some value3; some value4";
"someField1";"some value1; some value2; some value3";
"someField2";"some value1; some value2";
"someField3";"some value123";

I need to get rid of any occurrence of ; character from quoted parts using regex / substitution. Desired result is like below:

"someField0";"some value1 some value2 some value3 some value4";
"someField1";"some value1 some value2 some value3";
"someField2";"some value1 some value2";
"someField3";"some value123";

Is it possible to achieve this using only regex with substitution?

I have found several solutions like below, but all of them are capable to remove only one occurrence.

("[^",]+);([^"]+")

And then substitute with

$1 $2

I am searching of solution based on pure regex because I need to implement this inside SQL query using regex_replace function. This is a part of more complex parsing statement.


Solution

  • You could try using backreferences on capture groups. The idea is to stitch first and third capture group, and leave the second out.

    select regexp_replace(str,'([^"])([;]+)([^"])','\\1\\3')