sqlsql-serverdata-maskingpii

Mask values in a SQL Query string for SQL Server


I am a SQL Server DBA. I would like to write a procedure which I can provide to rest of my team where they can view the text for currently running queries on the server (Similar to how we view in sp_who2) but with all the values masked. Examples:

Query text Query text after Masking
Select * from sometable where rating = '4' Select * from sometable where rating = '****'
Select name, id from sometable where id = '3233' Select name, id from sometable where id = '****'
UPDATE Customers SET ContactName = 'Alfred Schmidt' WHERE CustomerID = 1; UPDATE Customers SET ContactName = '****' WHERE CustomerID = ****;
INSERT INTO Customers (CustomerName, ContactName) VALUES ('Cardinal', 'Tom B. Erichsen'); INSERT INTO Customers (CustomerName, ContactName) VALUES ('*****', '****');

Solution

  • You could try some XML-trickery to handle the strings. First replace all single quotes with an empty tag <X/> to get a XML that looks like this.

    INSERT INTO Customers (CustomerName, ContactName) 
      VALUES (<X />Cardinal<X />, <X />Tom B. Erichsen<X />);
    

    Then you shred the xml to get the text nodes and the node numbers where mod 2 is 0 is the ones you want to mask.

    After that you can rebuild your query string using the mask values.

    I have not found a way to deal with numbers other then removing all numbers from the query using Translate or nested replace and that will of course also remove numbers from table names and column names as well.

    You could try something like this.

    declare @S nvarchar(max);
    declare @X xml;
    
    set @S = N'UPDATE Customers SET ContactName = ''Alfred Schmidt'' WHERE CustomerID = 1;';
    
    set @X = replace(@S, '''', '<X/>');
    
    with C as
    (
      select T.X.value('.', 'nvarchar(max)') as V, 
             row_number() over(order by T.X) as RN
      from @X.nodes('text()') as T(X)
    )
    select @S = (
                select case when C.RN % 2 = 0 then '''*****''' else C.V end
                from C
                order by C.RN
                for xml path(''), type
                ).value('text()[1]', 'nvarchar(max)');
    
    set @S = translate(@S, '0123456789', '**********')
    
    print @S;
    

    Result:

    UPDATE Customers SET ContactName = '*****' WHERE CustomerID = *;
    

    Note: Just realized that this solution does not handle the cases where the string values contains single quotes but I think this is something that possibly can inspire more robust solution so I will leave it here.