Using grep in .gitlab-ci.yml to find a line


In my .gitlab-ci.yml file, I have the line

- if: '$CI_COMMIT_MESSAGE =~ /^\[release\]::/'

The line checks for a commit message that starts with [release]:: and follows the GitLab YAML convention that a regular expression must start and end with a / character. I have tested and verified that it works as intended, so that part is fine.

Next, I want to write a GitLab CI/CD job that verifies that this line exists in the .gitlab-ci.yml file. To do so, I tried creating a job like this:

check-job:
  stage: check
  script:
    - |
      #!/bin/bash
      if ! grep '$CI_COMMIT_MESSAGE =~ /^\[release\]::/' .gitlab-ci.yml; then
        echo "missing line starting with [release]::"
        exit 1
      else
        echo "found line starting with [release]::"
      fi

This does not work, however. I am sure the cause is the \[ part of the regular expression, because if I remove \[release\]::/ and check for:

grep '$CI_COMMIT_MESSAGE =~ /^' .gitlab-ci.yml

it works.


Solution

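  • The problem with grep '$CI_COMMIT_MESSAGE =~ /^\[release\]::/' is that grep treats the pattern as a regular expression, so special characters such as \ are interpreted instead of being matched literally. Here \[ in the pattern matches a bare [, whereas the file contains the two characters \[, so the line is never found.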

    Instead of painstakingly escaping the special characters (and having to read the escaped version later), I'd recommend using grep -F, which searches for literal strings instead of regexes.

    grep -F '$CI_COMMIT_MESSAGE =~ /^\[release\]::/' .gitlab-ci.yml
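
    The difference is easy to reproduce outside CI. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the temporary file and the variable names (sample, pattern, regex_result, fixed_result) are made up for illustration, and the file only stands in for .gitlab-ci.yml.

    ```shell
    # Search string, byte for byte as it appears in the rule.
    pattern='$CI_COMMIT_MESSAGE =~ /^\[release\]::/'

    # Hypothetical stand-in for .gitlab-ci.yml containing just the rule line.
    sample=$(mktemp)
    printf "  - if: '%s'\n" "$pattern" > "$sample"

    # As a regex, \[ matches a bare "[", but the file contains "\[".
    if grep -q "$pattern" "$sample" 2>/dev/null; then
      regex_result=found
    else
      regex_result=missing
    fi

    # With -F the pattern is a fixed string, so every character matches itself.
    if grep -qF "$pattern" "$sample"; then
      fixed_result=found
    else
      fixed_result=missing
    fi

    echo "regex: $regex_result"   # prints "regex: missing"
    echo "fixed: $fixed_result"   # prints "fixed: found"

    rm -f "$sample"
    ```

    Keeping the search string in a single-quoted variable means the shell passes it to grep unchanged, so the same text can be used both to build the sample and to search for it.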
    

    Please note that your grep command is only a heuristic.
    On the one hand, both YAML and GitLab CI syntax allow an equivalent rule to be written with a slightly different string, which the check would miss.
    On the other hand, users can trick your check by adding the requested string as a comment or something similar. As pointed out by @Dunes, the grep command might even find itself, because the job that runs it is defined in the same .gitlab-ci.yml.

    To work around the last two problems, you could use something like:

    grep -Eo '^[^#]+' .gitlab-ci.yml | grep -Fv 'grep' | grep -F '$CI_COMMIT_MESSAGE =~ /^\[release\]::/'
    

    The first grep (a heuristic!) strips comments by keeping only the text before the first # on each line, the second drops every line that contains grep and with it the check command itself, and the last one searches for your string in what remains.
    Just like before, this is only a heuristic.
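
    To see what the filters buy you, here is a small sketch under the same assumptions as before (POSIX shell, mktemp). The two temporary files and the helper has_rule are hypothetical: decoy contains the string only in a comment and in a grep call, while real also contains an actual rule.

    ```shell
    pattern='$CI_COMMIT_MESSAGE =~ /^\[release\]::/'

    # has_rule FILE: succeeds only if the string survives both filters.
    has_rule() {
      grep -Eo '^[^#]+' "$1" | grep -Fv 'grep' | grep -qF "$pattern"
    }

    decoy=$(mktemp)  # string appears only in a comment and in a grep command
    real=$(mktemp)   # same content plus a genuine rule line

    printf "# - if: '%s'\n" "$pattern" > "$decoy"
    printf "    - grep -F '%s' .gitlab-ci.yml\n" "$pattern" >> "$decoy"

    cat "$decoy" > "$real"
    printf "    - if: '%s'\n" "$pattern" >> "$real"

    # The plain grep -F check is fooled by the decoy file.
    if grep -qF "$pattern" "$decoy"; then naive_decoy=found; else naive_decoy=missing; fi
    # The filtered pipeline rejects the decoy but still accepts the real rule.
    if has_rule "$decoy"; then filtered_decoy=found; else filtered_decoy=missing; fi
    if has_rule "$real"; then filtered_real=found; else filtered_real=missing; fi

    echo "naive, decoy:    $naive_decoy"     # prints "naive, decoy:    found"
    echo "filtered, decoy: $filtered_decoy"  # prints "filtered, decoy: missing"
    echo "filtered, real:  $filtered_real"   # prints "filtered, real:  found"

    rm -f "$decoy" "$real"
    ```

    Note that the second filter discards any line mentioning grep, so a genuine rule that happened to contain that word would be missed as well.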

    Solving all of these problems reliably seems too hard. You would need not only a YAML parser and a parser for GitLab CI files, but also an interpreter (!) for the latter, as users could define things like include: $VARIABLE.yml.
    I recommend a manual review process using protected branches, merge requests, and maybe CODEOWNERS.