
OpenResty: Anonymise query parameter


I'm trying to anonymize email addresses (replacing each one with a UUID) to avoid keeping them as plaintext in my nginx access log. So far, I have only managed to replace them with ***** by overriding OpenResty's nginx.conf:

http {
    include       mime.types;
    default_type  application/octet-stream;


    log_format  main  '$remote_addr - $remote_user [$time_local] "$anonymized_request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log  main;

     ....

    map $request $anonymized_request {
        default $request;
        ~([^\?]*)\?(.*)emailAddress=(?<email_address>[^&]*)(&?)(.*)(\s.*) "$1?$2emailAddress=*****$4$5$6"; # $email_address;
    }

    include /etc/nginx/conf.d/*.conf;
}

Current result:

# curl 'http://localhost:8080/?emailAddress=dat@mail.de&attr=hello'

127.0.0.1 - - [24/Jan/2020:11:38:06 +0000] "GET /?emailAddress=*****&attr=hello HTTP/1.1" 200 649 "-" "curl/7.64.1" "-"

Expected:

127.0.0.1 - - [24/Jan/2020:11:38:06 +0000] "GET /?emailAddress=a556c480-3188-5181-8e9c-7ce4e391c1de&attr=hello HTTP/1.1" 200 649 "-" "curl/7.64.1" "-"

Is it possible to pass the email_address variable to a script that converts it to a UUID? Or, how can we produce the same log format using a log_by_lua_block?


Solution

  • This may not be the most robust method, but it uses the first Lua UUID generation function I found through Google (all credit goes to Jacob Rus). I slightly modified the function so that it seeds the random number generator, which means it always generates the same UUID for the same email address. Note that the seed is a 32-bit CRC32 checksum, so two different addresses can occasionally produce the same UUID. You can rewrite it into whatever suits your needs better; this is only the idea:

    http {
        include       mime.types;
        default_type  application/octet-stream;
    
        log_format    main  '$remote_addr - $remote_user [$time_local] "$anonymized_request" '
                            '$status $body_bytes_sent "$http_referer" '
                            '"$http_user_agent" "$http_x_forwarded_for"';
    
        access_log    logs/access.log  main;
    
        ...
    
        map $request $anonymized_request {
            default $request;
            ~([^\?]*)\?(.*)emailAddress=(?<email_address>[^&]*)(&?)(.*)(\s.*) "$1?$2emailAddress=$uuid$4$5$6"; # $email_address;
        }
    
        ...
    
        server {
    
            ...
    
            set $uuid '';
            log_by_lua_block {
                local function uuid(seed)
                    math.randomseed(seed)
                    local template ='xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'
                    return string.gsub(template, '[xy]', function (c)
                        local v = (c == 'x') and math.random(0, 0xf) or math.random(8, 0xb)
                        return string.format('%x', v)
                    end)
                end
                local email = ngx.var.arg_emailAddress
                if email == nil then email = '' end
                -- get CRC32 of 'email' query parameter for using it as a seed for lua randomizer
                -- using https://github.com/openresty/lua-nginx-module#ngxcrc32_short
                -- this will allow to always generate the same UUID for each unique email address
                local seed = ngx.crc32_short(email)
                ngx.var.uuid = uuid(seed)
            }
        }
    
    }
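
As a possible variation on the idea above, the sketch below derives the identifier directly from a hash of the email address instead of reseeding Lua's random number generator, so the global `math.random` state is left untouched. It assumes that a UUID-shaped string (32 hex digits grouped 8-4-4-4-12, without the RFC 4122 version and variant bits) is good enough for the log, and it uses `ngx.md5` from lua-nginx-module, which returns the digest as 32 hexadecimal characters. The `map` block and `log_format` are assumed to stay exactly as shown above.

```nginx
server {
    # $uuid must be declared before Lua is allowed to assign to it
    set $uuid '';

    log_by_lua_block {
        -- Assumption: a reshaped MD5 digest is an acceptable stand-in for a UUID
        local email = ngx.var.arg_emailAddress or ''
        local hex = ngx.md5(email)  -- 32 hexadecimal characters

        -- Group the digest as 8-4-4-4-12 so it looks like a UUID in the log
        ngx.var.uuid = string.format('%s-%s-%s-%s-%s',
            hex:sub(1, 8), hex:sub(9, 12), hex:sub(13, 16),
            hex:sub(17, 20), hex:sub(21, 32))
    }
}
```

The same email address always yields the same value, and because the full 128-bit digest is used, accidental collisions are far less likely than with a 32-bit CRC32 seed.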