regexazure-data-factoryregexp-replaceazure-mapping-data-flow

Azure data factory - mapping data flows regex implementation to format a number


I am creating a mapping data flow where I have a phone number column which can contain values like (555) 555-1234 or (555)555-1234 or 555555-1234

I want to extract numbers from this value. How can that be done. I have tried the below function with different variations but nothing is working.

regexExtract("(555) 555-1234",'\d+)')

regexExtract("(555) 555-1234",'(\d\d\d\d\d\d\d\d\d\d)')


Solution

  • Because you have multiple phone formats, you need to remove parentheses and spaces and dashes so you need multiple statements of regexExtract which will make your solution complicated.

    instead, i suggest that you use regexReplace, mainly keeping only digits. i tried it in ADF and it worked, for the sake of the demo, i added a derived column phoneNumber with a value: (555) 555-1234 in the derived column activity i added a new column 'validPhoneNumber' with a regexReplace value like so:

    regexReplace(phoneNumber,'[^0-9]', '')
    

    enter image description here

    Output:

    enter image description here

    You can read about it here: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expressions-usage#regexReplace