I have some logs in my codebase that have multiline f-strings, such as:
...
logger.error(
    f'...'
    f'...'
    f'...'
    f'...'
    f'...'
    f'...'
)
Some have only two f'...' strings on separate lines, others have three, and so on.
I am currently duplicating patterns to catch such logs. For example:
...
patterns:
  - pattern-either:
      - pattern: |
          logger.$METHOD(
            f'...'
            f'...'
            f'...'
          )
      - pattern: |
          logger.$METHOD(
            f'...'
            f'...'
            f'...'
            f'...'
            f'...'
          )
These catch the logs with 3 and 5 f'...' strings on multiple lines. I have to write another pattern for those with 2, 4, and so on.
Is there a scalable way to capture all of these with fewer patterns? The current approach won't scale, since there might be logs with 6, 7, 8, 9, or more f-strings spread over multiple lines.
A good solution has been posted here and a live demo here.
Basically, instead of duplicating the lines, a metavariable, $X, was created to represent the message. If the message doesn't match "...", it is flagged as suspect. The full rule is:
rules:
  - id: test
    patterns:
      - pattern-either:
          - pattern: logger.$METHOD(..., $X, ...)
          - pattern: logger.$METHOD(..., message=$X, ...)
          - pattern: logger.$METHOD(..., msg=$X, ...)
      - metavariable-pattern:
          metavariable: $X
          patterns:
            - pattern-not: |
                "..."
    message: Semgrep found a match $X
    languages:
      - python
    severity: WARNING
All credit goes to lagoAbal.
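As a side note on why a single metavariable can stand in for any number of lines: Python itself merges adjacent string literals into one expression, so a log call with 2, 3, or 9 f-string lines still has exactly one message argument. The sketch below checks this with the standard-library ast module. The helper names (make_log_call, describe_message_arg) and the placeholder names logger and value are made up for illustration, and it only shows how CPython parses the call; that Semgrep's own parser sees the message the same way is an assumption, though it is consistent with the rule above.

```python
import ast


def make_log_call(n_lines):
    """Build source text for a logger.error(...) call whose message is
    split over `n_lines` adjacent f-strings (hypothetical example input)."""
    parts = "\n".join(f"    f'part {i} {{value}} '" for i in range(n_lines))
    return f"logger.error(\n{parts}\n)"


def describe_message_arg(source):
    """Return (number of positional args, AST node type of the first arg)
    for the first call expression found in `source`."""
    tree = ast.parse(source)  # parsing only; nothing is executed
    call = next(node for node in ast.walk(tree) if isinstance(node, ast.Call))
    return len(call.args), type(call.args[0]).__name__


if __name__ == "__main__":
    # However many lines the message spans, the call has a single
    # positional argument, and it is one f-string node.
    for n in (2, 3, 6):
        print(n, describe_message_arg(make_log_call(n)))
```

Because the line count never changes the shape of the call, one pattern per argument style (positional, message=, msg=) is enough.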