regex string textwrangler

Regex in Textwrangler - Remove String Between Two Characters


I have a text file of weather statistics for popular cities. It includes not only today's highs and lows but also yesterday's, as shown below:

City/Town, State;Yesterday’s High Temp (F);Yesterday’s Low Temp (F);Today’s High Temp (F);Today’s Low Temp (F);Weather Condition;Wind Direction;Wind Speed (MPH);Humidity (%);Chance of Precip. (%);UV Index

Atlanta, GA;43;22;44;22;Partly sunny, chilly;NW;9;38%;7%;4
Atlantic City, NJ;45;24;37;9;A snow squall;WNW;22;36%;58%;3
Baltimore, MD;40;23;34;8;A snow squall, windy;NW;19;37%;57%;1
Bismarck, ND;-10;-29;-8;-15;Frigid;SSE;6;73%;58%;2

I'd like a regex that removes the first two numbers after the state (yesterday's high and low temperatures), so that the data looks like this:

City/Town, State;Yesterday’s High Temp (F);Yesterday’s Low Temp (F);Today’s High Temp (F);Today’s Low Temp (F);Weather Condition;Wind Direction;Wind Speed (MPH);Humidity (%);Chance of Precip. (%);UV Index

Atlanta, GA;44;22;Partly sunny, chilly;NW;9;38%;7%;4
Atlantic City, NJ;37;9;A snow squall;WNW;22;36%;58%;3
Baltimore, MD;34;8;A snow squall, windy;NW;19;37%;57%;1
Bismarck, ND;-8;-15;Frigid;SSE;6;73%;58%;2

Is there an easy way to do this?


Solution

  • Match portion:

    -?\d+;-?\d+;(-?\d+;-?\d+)
    

    Substitution:

    $1
    

    Breaking it down:

    Match an optional hyphen (minus sign)
    -?
    Match one or more digits
    \d+
    Match a semicolon
    ;
    Repeat the above for the second number
    -?\d+;
    Start of capturing group
    (
    Match two more numbers separated by a semicolon
    -?\d+;-?\d+
    End of capturing group
    )
    

    $1 replaces the whole match with the contents of the first capturing group, so yesterday's two numbers are dropped and today's two are kept.

    If you'd rather leave the replacement field empty, you can use this pattern instead:

    -?\d+;-?\d+;(?=-?\d+;-?\d+)
    

    It uses a lookahead to check that two more numbers follow the match without making them part of it, so only yesterday's two numbers and their semicolons are removed.
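To check both patterns outside the editor, here is a minimal Python sketch run against the sample rows from the question. It is only an illustration, not TextWrangler's own behavior: the function names `drop_yesterday_capture` and `drop_yesterday_lookahead` are made up for this example, and Python's `re` module writes the `$1` backreference as `\1`.

```python
import re

# Sample rows copied from the question.
rows = [
    "Atlanta, GA;43;22;44;22;Partly sunny, chilly;NW;9;38%;7%;4",
    "Atlantic City, NJ;45;24;37;9;A snow squall;WNW;22;36%;58%;3",
    "Baltimore, MD;40;23;34;8;A snow squall, windy;NW;19;37%;57%;1",
    "Bismarck, ND;-10;-29;-8;-15;Frigid;SSE;6;73%;58%;2",
]

# Pattern 1: match all four numbers, capture the last two.
CAPTURE = re.compile(r"-?\d+;-?\d+;(-?\d+;-?\d+)")

# Pattern 2: match only the first two numbers; the lookahead
# requires two more numbers to follow without consuming them.
LOOKAHEAD = re.compile(r"-?\d+;-?\d+;(?=-?\d+;-?\d+)")


def drop_yesterday_capture(line):
    # Hypothetical helper: replace the match with group 1 ($1 in the editor).
    return CAPTURE.sub(r"\1", line, count=1)


def drop_yesterday_lookahead(line):
    # Hypothetical helper: replace the match with nothing.
    return LOOKAHEAD.sub("", line, count=1)


for row in rows:
    print(drop_yesterday_capture(row))
```

Both helpers should produce the same result for each sample row. The header line contains no digits, so neither pattern touches it.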