sqlencryptionsubstringamazon-redshiftdata-masking

Alberti or Caesar cipher in SQL on VARCHAR field to mask number


I've used an Alberti cipher in SAS to mask numbers which works great on varying lengths. However, I'm having trouble doing the same task in Redshift SQL.

If I have a number all the same length, I could simply use a combination of substrings and to pick apart and Frankenstein a masked number. If my COL1 is 123456789, it would be something like:

substring(col1,9,1)||substring(col1,2,2)||substring(acct_num_full,3,1)||...

However, the column I'm trying to mask could be anywhere from 5-40 characters and so the example above would not work. I've tried lpad and rpad to standardize the lengths but it doesn't work for the shorter numbers.

Does anyone have any ideas or have done Alberti or Caesar ciphering in SQL on a varying character column? I would need an algorithm that could be reversed as well so I could decrypt.


Solution

  • The translate() function can perform character substitutions - https://docs.aws.amazon.com/redshift/latest/dg/r_TRANSLATE.html

    Just provide the caesar cypher as the chars_to_replace and chars_to_substitute strings.

    ===========================================

    For example performing the caesar cypher example from wikipedia - https://en.wikipedia.org/wiki/Caesar_cipher

    select 'THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG'::text as pt
      into foo;
    
    select pt,
     translate(pt,
               'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
               'XYZABCDEFGHIJKLMNOPQRSTUVW') as ct;