I'm working with a list extracted from OCR of receipt content in Power Automate Desktop and want to filter for specific numeric groups (prices). The issue is that numbers in 00-00
date format keep getting caught. I’ve tried the following:
1.Regular Expression: [+-]?\b\d{1,3}(?:,\d{3})+\b(.*?)
works for larger numbers but fails to capture smaller ones like -2.
2.Set Text Parsing Start Position: This doesn't work; it always starts from the beginning. For example, I tried setting a dynamically calculated starting position as a variable (in the case of the attached screenshot, 47), but it didn't work as expected.
Additional Information:
The target numeric group always appears at an index calculated as textVariable.length * 4 + 7
The date format is always 00-00
, while the price may start with a + or -, may include thousands separators, and can contain decimal places.
I'd prefer to avoid For Each
due to high processing time.
Any guidance is appreciated!
(?<!\S)([\d]|[+-])?\b(\d{1,3}(?:,\d{3})*)\b\s
Explanation:
(?<!\S)
Ensures no non-whitespace character in the beginningRegexr link: https://regexr.com/87i2a
^([\d]|[+-])?(\b\d{1,3}(?:,\d{3})*\b)\s
(requires multiline)Explanation:
^([\d]|[+-])?
Optional start with 0-9, + or -\s
Having this at the end ensures the ending character is a whitespace, matching whole words
Regexr link: https://regexr.com/87i21