regex

how can I make a Regex to match prefix and multiple suffixes at same time?


I've encountered this problem: I need to clean some corrupted data that has the following pattern:

VALUES ( 1', NULL, 'Smith', 'Jackson',...
VALUES ( 2', 'jon@doe', 'John', 'Doe',...

Basically, I need to remove the extra ', for that I first need to match it...

So I came up with this

(?<=VALUES \(  \d+)(.*')(?=(, '[a-zA-Z])||(^NULL,$))

Unfortunately, it's not matching anything, the \d+ is specially problematic.

Any idea how I can isolate the 1', 2', ... (only the ' after the number that can have N digits) and be able to match the prefix for 'string or NULL


Solution

  • One option is to use

    \b(VALUES\s*\(\s*\d+)'(?=,\s*(?:NULL|'[a-zA-Z]))
    

    Explanation

    Regex demo

    In the replacement use group 1.

    Output

    VALUES (  1, NULL, 'Smith', 'Jackson',...
    VALUES (  2, 'jon@doe', 'John', 'Doe',...