I'm trying to use PaperTrail's filter logs tool to filter out specific paths using RegEx. My log string could look like one of the below:
Should NOT pass and NOT be logged
Sep 03 10:12:40 lastmingear heroku/router: at=info method=GET path="/orders/SOME_ID?key=USER_KEY" host=www.lastmingear.com...
Should PASS, and BE logged
Sep 03 10:12:40 lastmingear heroku/router: at=info method=GET path="/orders/SOME_ID?key=USER_KEY&log=true" host=www.lastmingear.com...
The only difference is that the path I want logged has an additional parameter, log=true. So, stated verbally, the RegEx should read:
IF a key=USER_KEY is provided, then do NOT pass into logs UNLESS there is also a log=true
You can use a regex, but it's usually considered a bad practice to match query strings against a pattern like that. What if the parameters are in a different order? What if there are other parameters in-between them? What if they're URL-encoded?
Instead, you might consider parsing the query string and analyzing the key-value pairs:
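require 'uri'

# Returns true when the line should be logged: either no key parameter
# is present, or log=true accompanies it.
def log?(log_line)
  path = log_line[/path="([^"]+)"/, 1]
  return true if path.nil? # assumption: lines without a path are kept
  # query is nil when the path has no "?", so fall back to an empty string
  params = URI.decode_www_form(URI(path).query.to_s).to_h
  !params.key?('key') || params['log'] == 'true'
end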
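To illustrate why parsing holds up against the concerns above (parameter order, extra parameters, URL-encoding), here is a minimal sketch. The helper name loggable? and the sample lines are made up for illustration, and treating lines without a path as loggable is an assumption.

```ruby
require 'uri'

# Hypothetical helper mirroring the rule above: keep a line unless it
# carries a key parameter without log=true.
def loggable?(log_line)
  path = log_line[/path="([^"]+)"/, 1]
  return true if path.nil? # assumption: lines without a path are kept
  params = URI.decode_www_form(URI(path).query.to_s).to_h
  !params.key?('key') || params['log'] == 'true'
end

# Made-up sample lines in the router format from the question.
samples = {
  'key only'         => 'at=info method=GET path="/orders/42?key=abc" host=example.com',
  'key then log'     => 'at=info method=GET path="/orders/42?key=abc&log=true" host=example.com',
  'log then key'     => 'at=info method=GET path="/orders/42?log=true&key=abc" host=example.com',
  'param in between' => 'at=info method=GET path="/orders/42?key=abc&page=2&log=true" host=example.com',
  'encoded name'     => 'at=info method=GET path="/orders/42?key=abc&l%6Fg=true" host=example.com',
  'no query string'  => 'at=info method=GET path="/orders/42" host=example.com'
}

samples.each do |label, line|
  puts "#{label}: #{loggable?(line)}"
end
```

Only the first sample is rejected. The reordered, padded, and percent-encoded variants all pass because the check runs on decoded key-value pairs rather than on the raw string.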
UPDATE: This is a tricky regex problem to solve because there isn't really a way to say if-this-then-that-or-etc. in a regex. You can use assertions but they will only get you so far. You essentially must enumerate all of the patterns you want to pass. I want to stress that this is fairly brittle and you'll want to keep an eye on it over time to see if there is any variance in the pattern.
This pattern matches log lines for the /orders route with a numeric order number and an optional query string. If a query string is present, it must match one of two alternatives: either it contains no numeric key parameter, or it contains log=true. In other words, if a numeric key is provided, log must be true.
/path="\/orders\/\d+
  (?:\?                                      # optional query string
    (?:
      (?:(?!(?<=[?&])key=\d+(?=[&"])).)*?    # no numeric key parameter
      |(?:.+?&)?log=true(?:&.+?)?            # or log=true somewhere
    )
  )?
"/x
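Since a pattern like this is brittle, it may be worth checking it against a handful of lines before relying on it. The following is a rough sketch, not a definitive version: the constant ORDERS_FILTER, the helper passes_filter?, and the sample lines are all made up, order numbers and keys are assumed to be numeric, and the "?" is placed outside the alternation so that both branches begin right after it (which lets log=true appear as the first parameter).

```ruby
# Assumed variant of the pattern: the "?" sits outside the alternation,
# so both branches start right after it.
ORDERS_FILTER = /path="\/orders\/\d+
  (?:\?
    (?:
      (?:(?!(?<=[?&])key=\d+(?=[&"])).)*?
      |(?:.+?&)?log=true(?:&.+?)?
    )
  )?
"/x

# Hypothetical helper: true when the line would pass the filter.
def passes_filter?(log_line)
  ORDERS_FILTER.match?(log_line)
end

# Made-up lines in the router format from the question.
[
  'method=GET path="/orders/42" host=example.com',
  'method=GET path="/orders/42?page=2" host=example.com',
  'method=GET path="/orders/42?key=7" host=example.com',
  'method=GET path="/orders/42?key=7&page=2" host=example.com',
  'method=GET path="/orders/42?key=7&log=true" host=example.com',
  'method=GET path="/orders/42?log=true&key=7" host=example.com'
].each do |line|
  puts "#{passes_filter?(line) ? 'pass' : 'drop'}  #{line}"
end
```

The two lines that carry a key without log=true are dropped and the rest pass. Note that a non-numeric key would slip through, which is one of the reasons to keep watching the pattern over time.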