regexcoda

Regex replace everything after specific character inside specific tag


I have a list of XML elements with values. I'd like to remove any characters or numbers after a specific character (in this case, a period), but only within the someTag element.

<someTag>123.3</someTag>
<someTag>8623.34</someTag>

I'm able to target periods inside the tag using: \.(?=[^<]*</someTag>). However, I can't figure out how to remove the period and everything after it so that the end result would be:

<someTag>123</someTag>
<someTag>8623</someTag>

Any help is greatly appreciated!


Solution

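  • Your pattern matches only the period. To remove the digits as well, the match has to cover everything from the period up to the closing tag. Try this, replacing each match with </someTag> (the closing tag is part of the match, so the replacement has to put it back):

    \.[^<]+<\/someTag>
    

    [^<]+ is a negated character class: it matches one or more characters that are not <, so the match runs from the period up to, but not including, the next <. The < needs no escaping inside the class. Alternatively, keep your lookahead and extend the match in front of it: \.[^<]*(?=<\/someTag>), replaced with an empty string, leaves the closing tag untouched.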

    Regex101
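
The two replacement strategies can be sketched in Python. This is only an illustration: the question does not name a regex engine, so Python's re module is an assumption, and the <other> element is a made-up addition to show that values outside someTag are left alone.

```python
import re

# Sample input from the question, plus a hypothetical <other> element
# that must not be modified.
xml = (
    "<someTag>123.3</someTag>\n"
    "<someTag>8623.34</someTag>\n"
    "<other>1.5</other>"
)

# Option 1: the match consumes the closing tag, so the replacement
# has to write the closing tag back.
consume = re.compile(r"\.[^<]+</someTag>")
result_consume = consume.sub("</someTag>", xml)

# Option 2: a lookahead checks for the closing tag without consuming it,
# so the match can simply be replaced with an empty string.
lookahead = re.compile(r"\.[^<]*(?=</someTag>)")
result_lookahead = lookahead.sub("", xml)

print(result_lookahead)
# <someTag>123</someTag>
# <someTag>8623</someTag>
# <other>1.5</other>
```

Both options give the same result on this input. They differ on a value with a trailing period and nothing after it, such as 123.: the + in option 1 requires at least one character after the period and leaves that value alone, while the * in option 2 also removes the bare period.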