Tags: logging, azure-application-insights, open-telemetry, azure-monitoring

Masking multiple instances of sensitive data in logs of Azure Monitor Application Insights for Java


I am trying to mask sensitive data in logs sent to Azure Application Insights, following the official documentation: Masking sensitive data in log message

The documentation only shows how to mask a single instance of sensitive data. I would like to mask a variable number of instances in my logs. Let's say my sensitive data is userIds such as "A12345678Y" and "B23456789Z".

I would like to mask all the userIds in the log: "User A12345678Y has friended User B23456789Z, and User A12345678Y has 20 mutual friends with User B23456789Z". There is a variable number of userIds, meaning there could be tens of userIds per log.

I have tried masking instances of userId by adding the rule ".*(?<redactedUserId>[A|B][0-9]{8}[Y|Z]).*" twice to the "rules" array in the JSON config.

The JSON looks like this:

{
  "connectionString": "InstrumentationKey=00000000-0000-0000-0000-000000000000",
  "preview": {
    "processors": [
      {
        "type": "log",
        "body": {
          "toAttributes": {
            "rules": [
              ".*(?<redactedUserId>[A|B][0-9]{8}[Y|Z]).*",
              ".*(?<redactedUserId>[A|B][0-9]{8}[Y|Z]).*"
            ]
          }
        }
      },
      {
        "type": "attribute",
        "actions": [
          {
            "key": "redactedUserId",
            "action": "delete"
          }
        ]
      }
    ]
  }
}

Only 2 instances of userId have been masked:
Pre Masked: "User A12345678Y has friended User B23456789Z, and User A12345678Y has 20 mutual friends with User B23456789Z"

Post Masked: "User A12345678Y has friended User B23456789Z, and User {redactedUserId} has 20 mutual friends with User {redactedUserId}"
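
The result above is consistent with how a greedy leading ".*" behaves: each rule captures only the last remaining userId. The following is a minimal sketch that reproduces this with plain java.util.regex. It is not the Application Insights agent's code; the class GreedyCaptureDemo and the helper applyRuleOnce are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyCaptureDemo {
    // Same pattern as the rule in the config above.
    static final Pattern RULE =
        Pattern.compile(".*(?<redactedUserId>[A|B][0-9]{8}[Y|Z]).*");

    // Hypothetical helper: applies the rule once and swaps only the captured
    // group for a "{redactedUserId}" placeholder (an assumption about how the
    // rule behaves, based on the output shown above).
    static String applyRuleOnce(String message) {
        Matcher m = RULE.matcher(message);
        if (!m.matches()) {
            return message; // no userId left to capture
        }
        // The leading ".*" is greedy, so the group lands on the LAST userId.
        int start = m.start("redactedUserId");
        int end = m.end("redactedUserId");
        return message.substring(0, start) + "{redactedUserId}" + message.substring(end);
    }

    public static void main(String[] args) {
        String log = "User A12345678Y has friended User B23456789Z, "
            + "and User A12345678Y has 20 mutual friends with User B23456789Z";
        String once = applyRuleOnce(log);   // masks the 4th userId
        String twice = applyRuleOnce(once); // masks the 3rd userId
        System.out.println(once);
        System.out.println(twice);
    }
}
```

Each extra copy of the rule removes exactly one more userId, which is why N instances would need N copies of the rule with this approach.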

What if I have 50 instances of userId, or 70? I would like to mask all of them. Hardcoding the rule 50 times seems unwise.

How should I implement this so that a variable number of userId instances is masked?

Edit with solution:

I have finally solved this, and I can also replace {redactedUserId} with any message I want shown in my logs.

I did it as follows:

  1. The 1st processor is a log processor that converts an entire log message containing a userId into a single attribute named "LogMessage".
  2. The 2nd processor is an attribute processor that searches the attribute for the regex pattern I defined for userId and replaces each match with "**Censored Message Here**". The value of the "LogMessage" attribute is now a censored log message.
  3. The 3rd processor is a log processor that converts the "{LogMessage}" placeholder in the log back to the attribute's value. So rather than displaying "{LogMessage}", my logs display "User **Censored Message Here** has friended User..."
  4. The 4th processor is an attribute processor that cleans up by deleting the "LogMessage" attribute from the log's custom properties.
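
As a rough illustration of the four steps above, here is a minimal sketch in plain Java. It only simulates the processors with java.util.regex and a map standing in for the log's attributes; it is not the agent's implementation, and the class MaskPipelineSketch and method process are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MaskPipelineSketch {
    // Patterns taken from the config in this post.
    static final Pattern CAPTURE =
        Pattern.compile("(?<LogMessage>.*[A|B][0-9]{8}[Y|Z].*)");
    static final Pattern USER_ID = Pattern.compile("[A|B][0-9]{8}[Y|Z]");
    static final String REPLACEMENT = "**Censored Message Here**";

    // Simulates the four processors on one log body (assumed behavior).
    static String process(String body) {
        Map<String, String> attributes = new HashMap<>();

        // Step 1: move the whole message into the "LogMessage" attribute.
        Matcher m = CAPTURE.matcher(body);
        if (!m.matches()) {
            return body; // no userId, so the message is left untouched
        }
        attributes.put("LogMessage", m.group("LogMessage"));
        body = "{LogMessage}";

        // Step 2: mask every userId inside the attribute value.
        String masked = USER_ID.matcher(attributes.get("LogMessage"))
            .replaceAll(Matcher.quoteReplacement(REPLACEMENT));
        attributes.put("LogMessage", masked);

        // Step 3: put the masked attribute value back into the body.
        body = body.replace("{LogMessage}", attributes.get("LogMessage"));

        // Step 4: delete the temporary attribute.
        attributes.remove("LogMessage");
        return body;
    }

    public static void main(String[] args) {
        System.out.println(process(
            "User A12345678Y has friended User B23456789Z, "
            + "and User A12345678Y has 20 mutual friends with User B23456789Z"));
    }
}
```

The key difference from the first attempt is that the masking step replaces every match of the pattern rather than one capture per rule, so the number of userIds in a message no longer matters.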

JSON:

"processors": [
        {
          "type": "log",
          "body": {
            "toAttributes": {
              "rules": [
                "(?<LogMessage>.*[A|B][0-9]{8}[Y|Z].*)"
              ]
            }
          }
        },
        {
          "type": "attribute",
          "actions": [
            {
              "key": "LogMessage",
              "pattern": "[A|B][0-9]{8}[Y|Z]",
              "replace": "**Censored Message Here**",
              "action": "mask"
            }
          ]
        },
        {
          "type": "log",
          "body": {
            "fromAttributes": [
              "LogMessage"
            ]
          }
        },
        {
          "type": "attribute",
          "actions": [
            {
              "key": "LogMessage",
              "action": "delete"
            }
          ]
        }
      ]

Solution

  • A change was made in Application Insights 3.4.16 that allows a single rule to mask every userId. Have you tried version 3.4.16?