semgrep

Semgrep determine value of function parameter


I have this Semgrep rule to detect:

any usage of:

new $X.Classic(..., {..., classicName: $Y, ...})

where: classicName's value matches with this regex: .*hello.*

rules:
  - id: tyq
    languages:
      - ts
    message: Hello
    pattern-either:
      - patterns:
          - pattern-either:
              - pattern-inside: |
                  import * as $X from '@abc/abc';
                  ...
          - pattern-either:
              - patterns:
                  - pattern: |
                      new $X.Classic(..., {..., classicName: $Y, ...})
                  - focus-metavariable: $Y
                  - metavariable-comparison:
                      metavariable: $Y
                      comparison: re.match(".*hello.*", $Y)

and this works if $Y is:

  1. hardcoded string: (classicName: "hello")
  2. a variable assignment within scope

ex:

   #1 var varName = "hello";

   #2 new $X.Classic(..., {..., classicName: varName, ...})

or if #2 is within some function and #1 is in root of file

but DOES-NOT work if:

the value of classicName is coming from a function parameter.

like:

class Class {
  constructor() {
    super();
    const classicName = "hello";
    this.createClassic(classicName);
  }

  private createClassic(classicName: string) {
    new abc.Classic(this, 'MyBucket', {
      //ruleid: tyq
      classicName: classicName,
    });
  }
}

Any way I can catch the values of parameters of function. (As in this case parameter value is resolvable)


Solution

  • If you want to track data across functions, you likely want to leverage taint mode. Taint mode functions by determining if any data from a "source" reaches a "sink". The idea would be to mark any string matching you regex as a source, and then any instance of the usage you want to detect as a sink. The rule below implements this.

    rules:
      - id: tyq
        languages:
          - ts
        message: Hello
        mode: taint
        pattern-sources:
          - patterns:
              - pattern: $X
              - metavariable-regex:
                  metavariable: $X
                  regex: .*hello.*
        pattern-sinks:
          - patterns:
              - pattern-inside: |
                  import * as $X from '@abc/abc';
                  ...
              - pattern: "new $X.Classic(..., {..., classicName: $Y, ...})"
              - focus-metavariable: $Y
        severity: WARNING
    

    Playground

    The focus-metavariable in the sink restricts it such that the source has to reach that specific point, rather than just anywhere in the sink pattern. If we didn't have that, we'd also match instances where our source reached say, the first parameter. I've also swapped out the metavariable-comparison for metavariable-regex as a stylistic choice. Note also that $X in the sources and $X in the sinks are independent. If you wanted to write a rule where they had to be the same thing (as they would if they were combined under a patterns) you could use the unify_taint_mvars option, but that isn't needed here.

    One downside to this rule: it is overbroad, since if anything marked as a source is used to compute the value $Y matches, the rule with match. You could use sanitizers or tweak some other options to cut down on the noise.

    Here's an example with an additional finding which I assume is an FP (depends on if you want to forbid this, or if you only care about the final value which reaches the object).

    There are some limitations to taint analysis in the OSS version, so you'll have to use the pro engine if you want interprocedural or interfile analysis.