We make use of the MASKED
column feature in SQL Server to hide sensitive data when querying. But we've come across a scenario where the INSERT
performed as part of the MERGE
statement is inserting a masked value. I.E. the newly inserted value is seen as the masked value, even when querying the table as a user with the UNMASK
permission.
Specifically, the scenario occurs when:
MASKED
.UNMASK
permission.MERGE
statement's ON
clause uses the MASKED
column.MERGE
statement doesn't match, so drops into the INSERT
section.Below is a minimal reproduction of the issue.
-- Define a test table with one non-masked and one masked column.
CREATE TABLE TestTable(
C1 NVARCHAR(100) MASKED WITH (FUNCTION = 'default()'),
C2 NVARCHAR(100) MASKED WITH (FUNCTION = 'default()'),
C3 NVARCHAR(100) -- Not masked
);
INSERT INTO TestTable
VALUES ('aaa', 'aaa', 'aaa'), ('bbb', 'bbb', 'bbb')
-- Print the current user and initial table content
SELECT SUSER_NAME() AS LoginName, USER_NAME() AS UserName;
SELECT *, 'No columns should be Masked' FROM TestTable
-- Create users with different permissions
CREATE USER user1 WITHOUT LOGIN
-- Setup permissions (No UNMASK permissions)
GRANT SELECT, UPDATE, INSERT TO user1;
-- Switch context to the user WITHOUT UNMASK permissions
EXECUTE AS USER = 'user1';
-- Show the user can only see masked data
SELECT SUSER_NAME() AS LoginName, USER_NAME() AS UserName;
SELECT *, 'Columns 1 & 2 should be Masked' FROM TestTable
-- Execute the merge, matching on the masked column
MERGE TestTable target
USING (
SELECT * FROM (VALUES
('bbb', 'bbb', 'bbb'), -- Will be merged
('ccc', 'ccc', 'ccc'), -- Will be inserted
('ddd', 'ddd', 'ddd')) -- Will be inserted
AS s (C1 ,C2 ,C3)
) AS source
ON (source.C1 = target.C1)
WHEN MATCHED THEN UPDATE SET
target.C2 = source.C2,
target.C3 = source.C3
WHEN NOT MATCHED BY TARGET
THEN INSERT (C1, C2, C3)
VALUES (source.C1, source.C2, source.C3);
-- A regular insert statement to check if masking applies there
INSERT INTO TestTable
VALUES ('eee', 'eee', 'eee')
-- Print the current user and current table content
SELECT SUSER_NAME() AS LoginName, USER_NAME() AS UserName;
SELECT *, 'Columns 1 & 2 should be Masked' FROM TestTable
REVERT; -- Back to the session user context
-- Print the current user and current table content
SELECT SUSER_NAME() AS LoginName, USER_NAME() AS UserName;
SELECT *, 'No columns should be Masked' FROM TestTable
-- Tidy up
DROP TABLE TestTable;
DROP USER user1;
The result of the last SELECT
statement is shown below. This select is being executed as a user that has the UNMASK
permission, but the two rows inserted by a user without that permission still have the column used as a key in the MERGE
apparently masked. I suspect because the masked value was inserted during the MERGE
, rather than any issue with permissions.
Is this is a known issue? Or am I doing something wrong?
This is a consequence of the way MERGE
works.
When multiple actions are defined, CASE
expressions are evaluated in the execution plan to determine the correct value to use for each column. Clearly, an update action will generally cause different values to be assigned than if an insert action applies for that source row.
As the documentation says (emphasis added):
Whenever you project an expression referencing a column for which a data masking function is defined, the expression is also masked.
This is unfortunate when one clause (the insert action in your case) references only literal values (or ones from a non-masked source), while another clause uses masked data. The resulting CASE
expression will always emit masked data.
CASE
WHEN [Action1005]=(4) THEN <unmasked data> -- INSERT action
ELSE <masked data>
END
Changing this behaviour would likely be difficult and risky, Using separate insert, update, and delete statements may be a viable workaround.
Azure Feedback: Bug: Merge inserts with masked key columns