regexocrtext-parsingpower-automate-desktop

Regular Expressions in Power Automate Desktop /Set Text Parsing


enter image description hereenter image description hereenter image description here

I'm working with a list extracted from OCR of receipt content in Power Automate Desktop and want to filter for specific numeric groups (prices). The issue is that numbers in 00-00 date format keep getting caught. I’ve tried the following:

1.Regular Expression: [+-]?\b\d{1,3}(?:,\d{3})+\b(.*?)

works for larger numbers but fails to capture smaller ones like -2.

2.Set Text Parsing Start Position: This doesn't work; it always starts from the beginning. For example, I tried setting a dynamically calculated starting position as a variable (in the case of the attached screenshot, 47), but it didn't work as expected.

Additional Information:

The target numeric group always appears at an index calculated as textVariable.length * 4 + 7 The date format is always 00-00, while the price may start with a + or -, may include thousands separators, and can contain decimal places.

I'd prefer to avoid For Each due to high processing time. Any guidance is appreciated!


Solution

  • Updated answer: (?<!\S)([\d]|[+-])?\b(\d{1,3}(?:,\d{3})*)\b\s

    Explanation:

    Regexr link: https://regexr.com/87i2a enter image description here



    Try this: ^([\d]|[+-])?(\b\d{1,3}(?:,\d{3})*\b)\s (requires multiline)

    Explanation:

    Regexr link: https://regexr.com/87i21 2