I have a requirement to run a function over certain fields to identify and redact any numbers which are 5 digits or longer, ensuring all but the last 4 digits are replaced with *
For example: "Some text with 12345 and 1234 and 12345678" would become "Some text with *2345 and 1234 and ****5678"
I've used PATINDEX to identify the the starting character of the pattern:
PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', TEST_TEXT)
I can recursively call that to get the starting character of all the occurrences, but I'm struggling with the actual redaction.
Does anyone have any pointers on how this can be done? I know to use REPLACE to insert the *s where they need to be, it's just the identification of what I should actually be replacing I'm struggling with.
Could do it on a program, but I need it to be T-SQL (can be a function if needed).
Any tips greatly appreciated!
You can do this using the built in functions of SQL Server. All of which used in this example are present in SQL Server 2008 and higher.
DECLARE @String VARCHAR(500) = 'Example Input: 1234567890, 1234, 12345, 123456, 1234567, 123asd456'
DECLARE @StartPos INT = 1, @EndPos INT = 1;
DECLARE @Input VARCHAR(500) = ISNULL(@String, '') + ' '; --Sets input field and adds a control character at the end to make the loop easier.
DECLARE @OutputString VARCHAR(500) = ''; --Initalize an empty string to avoid string null errors
WHILE (@StartPOS <> 0)
BEGIN
SET @StartPOS = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @Input);
IF @StartPOS <> 0
BEGIN
SET @OutputString += SUBSTRING(@Input, 1, @StartPOS - 1); --Seperate all contents before the first occurance of our filter
SET @Input = SUBSTRING(@Input, @StartPOS, 500); --Cut the entire string to the end. Last value must be greater than the original string length to simply cut it all.
SET @EndPos = (PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', @Input)); --First occurance of 4 numbers with a not number behind it.
SET @Input = STUFF(@Input, 1, (@EndPos - 1), REPLICATE('*', (@EndPos - 1))); --@EndPos - 1 gives us the amount of chars we want to replace.
END
END
SET @OutputString += @Input; --Append the last element
SET @OutputString = LEFT(@OutputString, LEN(@OutputString))
SELECT @OutputString;
Which outputs the following:
Example Input: ******7890, 1234, *2345, **3456, ***4567, 123asd456
This entire code could also be made as a function since it only requires an input text.