I want to hash a sensitive field in my SAS data set using MD5. After hashing, though, the value looks odd: it is all special characters. Is this the right way to use the hash function?
My Code:
data md5;
set sashelp.class (obs=2);
md5 = md5(strip(name));
keep name md5;
put _all_;
run;
My Output:
Name=Alfred Sex=M Age=14 Height=69 Weight=112.5 md5=�p?ޞ��\�rT]( _ERROR_=0 _N_=1
Name=Alice Sex=F Age=13 Height=56.5 Weight=84 md5=dH���/�x{�͇!K8 _ERROR_=0 _N_=2
That's correct. MD5() returns the 16-byte digest as raw binary, so you just need to apply the hexadecimal format $hex32. to make it readable.
MD5 is a 128-bit hash. SAS also has SHA256(), which produces a 256-bit hash and is the stronger choice.
Code:
data md5;
set sashelp.class (obs=2);
format md5 $hex32.;
md5 = md5(strip(name));
keep name md5;
put _all_;
run;
Output:
Name=Alfred md5=86703FDE9E87DD5C0F8E1072545D0128
Name=Alice md5=64489C85DC2FE0787B85CD87214B3810
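If you want to try SHA256() instead, here is a minimal sketch of the same step. It assumes your SAS release provides the SHA256() function; the data set name sha256 and the variable name digest are just illustrative. The digest is 32 bytes of binary, so the hex format is twice as wide as for MD5.

```sas
data sha256;
  set sashelp.class (obs=2);
  length digest $32;        /* SHA-256 returns 32 bytes of binary */
  format digest $hex64.;    /* display as 64 hexadecimal characters */
  digest = sha256(strip(name));
  keep name digest;
  put name= digest=;
run;
```

The stored value is still binary; the format only changes how it is displayed.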
Note:
You can also add a SALT or PEPPER value to your string for added security. These are strings concatenated to the beginning or end of your string before it is hashed.
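As a rough sketch of that idea, the step below prepends a fixed secret string to the name before hashing. The macro variable name secret and its value are made up for illustration; in practice you would keep the real value out of your program source.

```sas
%let secret = x7Kp2mQ9;     /* hypothetical value, for illustration only */

data md5_salted;
  set sashelp.class (obs=2);
  length md5 $16;           /* MD5 returns 16 bytes of binary */
  format md5 $hex32.;
  /* CATS strips leading and trailing blanks, then concatenates */
  md5 = md5(cats("&secret", name));
  keep name md5;
  put name= md5=;
run;
```

The same input always gives the same hash as long as the secret string stays the same, so you can still match records on the hashed value.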