Trying to remove duplicate numbers in the same cell in alteryx using regex replace due to applying to a large amount of columns. Below is my current formula. It is working for all rows except the last where the decimals are numbers are being removed, instead of just removing duplicate numbers between commas in the same cell. Any suggestions? I have been playing with the formula in regex101.com and cannot seem to figure it out.
TRIM(REGEX_Replace([CurrentField], '(-?\d+(?:[.,]\d+)?),(?=.*\1)', ''))
ID | Description | Price | CurrentOUTPUT | Desired Output |
---|---|---|---|---|
0 | red | 0,0,67,67,0 | 0,67 | 0,67 |
1 | blue | -30,-30 | -30 | -30 |
2 | green | 0.03,0.045,0.03 | 0.03,0.045 | 0.03,0.045 |
2 | green | 1.17,0.37,0.36,0.36,0.37 | 1.10.36,0.37 | 1.17,0.37,0.36 |
You can use
TRIM(REGEX_Replace([_CurrentField_], ',?(-?\d+(?:[.,]\d+)?)(?=.*\1)', ''), ',')
See the regex demo.
You will need to trim commas after regex replacement, hence TRIM
is used.
Details:
,?
- an optional comma(-?\d+(?:[.,]\d+)?)
- Group 1: an optional minus, one or more digits, then an optional sequence of .
or ,
and one or more digits(?=.*\1)
- a positive lookahead that matches a location immediately followed with any zero or more chars other than line break chars as many as possible and then Group 1 value.