c++ gcc clang-tidy

Compiler or clang-tidy warning on methods that directly access the file system


My C++ code calls classes such as std::ifstream to open files for reading, and std::filesystem functions to iterate over directories or check whether a file exists. I intend to abstract these away behind wrappers, because I'm about to support both cloud storage and the local filesystem.

I want to prevent anyone from adding new direct calls to such methods (i.e., have a compiler warning or a clang-tidy rule that fails) and only allow file system access through my wrappers.

Is it possible? I read about #pragma GCC poison, but that means I have to list every method that I want to ban.
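For context, the poison approach would look roughly like the sketch below. The names read_first_int and load_setting are hypothetical; the point is that each banned identifier has to be poisoned by name, and only after the headers and wrappers that legitimately use it.

```cpp
#include <fstream>
#include <string>

// Hypothetical wrapper, defined *before* the poison directive so that it
// may still use std::ifstream directly.
inline int read_first_int(const std::string &path)
{
  std::ifstream is(path);
  int value = 0;
  is >> value;          // 'value' is 0 if the file cannot be read
  return value;
}

// From here on, any appearance of the identifier token named below is a
// compile error. Every banned name needs its own entry, which is the
// drawback described above.
#pragma GCC poison ifstream

// Client code after the pragma has to go through the wrapper; naming the
// stream class directly here would no longer compile.
inline int load_setting(const std::string &path)
{
  return read_first_int(path);
}
```

Note that the pragma works on bare tokens, so it bans the identifier in every namespace, not just in std.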

I'm still in the assessment stage.


Solution

  • I don't think there's anything pre-built that meets the requirements in the question, but the clang-query tool can be used to construct ad-hoc queries that would.

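    First, let me elaborate on the presumed requirements:

      • Report any expression whose type is std::ifstream.
      • Report any call to a function in std::filesystem.
      • Do not report anything in system headers, nor in the files that implement the abstraction layer itself.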

    Here is a shell script that calls clang-query to do those things:

    #!/bin/sh
    
    PATH=$HOME/opt/clang+llvm-16.0.0-x86_64-linux-gnu-ubuntu-18.04/bin:$PATH
    
    query='m
    
      expr(
        isExpansionInFileMatching("^/home/"),
        unless(
          isExpansionInFileMatching("my-fs-layer")
        ),
        anyOf(
          hasType(
            hasDeclaration(
              namedDecl(
                matchesName("::std::ifstream")
              )
            )
          ),
          callExpr(
            callee(
              functionDecl(
                matchesName("::std::filesystem")
              )
            )
          )
        )
      ).bind("expr")
    
    '
    
    if [ "x$1" = "x" ]; then
      echo "usage: $0 filename.cc -- <compile options like -I, etc.>"
      exit 2
    fi
    
    # Run the query.  Setting 'bind-root' to false means clang-query will
    # not also print a redundant "root" binding.
    clang-query \
      -c="set bind-root false" \
      -c="$query" \
      "$@"
    
    # EOF
    

    Explanation

        isExpansionInFileMatching("^/home/"),
        unless(
          isExpansionInFileMatching("my-fs-layer")
        ),
    

    Report only when the expression's expansion location is in a file somewhere under /home, thereby suppressing a deluge of reports from system headers.

    But suppress the report if the file name contains the substring my-fs-layer, thus allowing certain parts of the code to use the filesystem directly.

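    Note: Clang regexes are POSIX regexes, which do not support lookahead (positive or negative), so we cannot fold the exclusion into the first pattern with something like (?!my-fs-layer). That is why it is written as a separate unless(...) clause.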
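    The include-then-exclude logic can be illustrated outside of clang-query. The following is only a sketch using std::regex in POSIX extended mode, not what clang-query runs internally; the function name should_report is made up for this example.

```cpp
#include <regex>
#include <string>

// Mirrors the query's file filter: report only when the path matches the
// "include" pattern and does not match the "exclude" pattern.
inline bool should_report(const std::string &path)
{
  // std::regex::extended selects POSIX extended syntax, which has no
  // lookahead, so the exclusion needs its own pattern.
  static const std::regex include_re("^/home/", std::regex::extended);
  static const std::regex exclude_re("my-fs-layer", std::regex::extended);

  // regex_search looks for a match anywhere in the string, so only the
  // explicit '^' anchors the first pattern to the start of the path.
  return std::regex_search(path, include_re) &&
         !std::regex_search(path, exclude_re);
}
```

    With this, a path such as /home/user/proj/report.cc is reported, while /home/user/proj/my-fs-layer.cc and /usr/include/c++/12/fstream are not.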

          hasType(
            hasDeclaration(
              namedDecl(
                matchesName("::std::ifstream")
              )
            )
          ),
    

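    This matches any expression whose type is declared by a named declaration whose fully qualified name contains ::std::ifstream. (matchesName treats its argument as a regex and accepts a match anywhere in the name.)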

    Note that if a parameter is declared as std::ifstream &is, the expression "is" is not considered to have reference type, as the reference-ness of the declaration type (conceptually) relates to the lvalue-ness of the expression. Consequently there is no need to skip reference types here.
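    The same distinction is visible in plain C++ through decltype. This is a small sketch (the function names by_ref and by_value are made up, and C++17 is assumed) showing that the declared types differ while the expression naming the parameter looks identical in both cases:

```cpp
#include <fstream>
#include <type_traits>

// Parameter declared with reference type.
inline void by_ref(std::ifstream &is)
{
  // decltype(name) reports the *declared* type of the entity.
  static_assert(std::is_same_v<decltype(is), std::ifstream &>);
  // decltype((expr)) reports T& for an lvalue expression of type T.
  static_assert(std::is_same_v<decltype((is)), std::ifstream &>);
}

// Parameter declared with non-reference type.
inline void by_value(std::ifstream is)
{
  static_assert(std::is_same_v<decltype(is), std::ifstream>);
  // Same result as in by_ref: the expression is an lvalue of type
  // std::ifstream either way.
  static_assert(std::is_same_v<decltype((is)), std::ifstream &>);
}
```

    In both functions the expression is an lvalue of type std::ifstream, which is why a single hasType test covers both declarations.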

          callExpr(
            callee(
              functionDecl(
                matchesName("::std::filesystem")
              )
            )
          )
    

    This matches any call to a function whose name has ::std::filesystem as a substring.

    See AST Matcher Reference for more details on these and other matchers.

    Example run

    Given report.cc:

    // report.cc
    // Report file system access.
    
    #include <filesystem>        // std::filesystem
    #include <fstream>           // std::ifstream
    
    namespace NS {
    
    int f1()
    {
      std::ifstream is("filename");                  // reported
      int i;
      is >> i;                                       // reported
      return i;
    }
    
    int f2(std::ifstream &is)
    {
      int i;
      is >> i;                                       // reported
      return i;
    }
    
    bool f3()
    {
      return std::filesystem::exists("something");   // reported
    }
    
    }
    
    // EOF
    

    and my-fs-layer.cc:

    // my-fs-layer.cc
    // Stand-in for a filesystem abstraction layer that *is* allowed to
    // access the FS directly.
    
    #include <filesystem>        // std::filesystem
    #include <fstream>           // std::ifstream
    
    namespace myfs {
    
    int f1()
    {
      std::ifstream is("filename");
      int i;
      is >> i;
      return i;
    }
    
    int f2(std::ifstream &is)
    {
      int i;
      is >> i;
      return i;
    }
    
    bool f3()
    {
      return std::filesystem::exists("something");
    }
    
    }
    
    // EOF
    

    the script (which I called cmd.sh) produces the output:

    $ ./cmd.sh my-fs-layer.cc report.cc --
    
    Match #1:
    
    $PWD/report.cc:11:17: note: "expr" binds here
      std::ifstream is("filename");                  // reported
                    ^~~~~~~~~~~~~~
    
    Match #2:
    
    $PWD/report.cc:13:3: note: "expr" binds here
      is >> i;                                       // reported
      ^~
    
    Match #3:
    
    $PWD/report.cc:20:3: note: "expr" binds here
      is >> i;                                       // reported
      ^~
    
    Match #4:
    
    $PWD/report.cc:26:10: note: "expr" binds here
      return std::filesystem::exists("something");   // reported
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    4 matches.
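
    For completeness, here is a minimal sketch of what the exempted layer might contain in practice. Everything in it (the FileSystem interface, LocalFileSystem, readAll) is a hypothetical design, not something taken from the question. For the query above to skip it, it would have to live in a file whose path contains my-fs-layer.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>
#include <system_error>

namespace myfs {

// Hypothetical interface that the rest of the code base would use instead
// of std::ifstream and std::filesystem. A cloud-backed implementation
// could derive from the same base.
class FileSystem {
public:
  virtual ~FileSystem() = default;

  virtual bool exists(const std::string &path) const = 0;

  // Returns the whole file contents, or std::nullopt if the file cannot
  // be opened.
  virtual std::optional<std::string> readAll(const std::string &path) const = 0;
};

// Local implementation: the only place allowed to touch the standard
// library's file APIs directly.
class LocalFileSystem : public FileSystem {
public:
  bool exists(const std::string &path) const override
  {
    std::error_code ec;  // non-throwing overload; errors count as "no"
    return std::filesystem::exists(path, ec);
  }

  std::optional<std::string> readAll(const std::string &path) const override
  {
    std::ifstream is(path, std::ios::binary);
    if (!is) {
      return std::nullopt;
    }
    return std::string(std::istreambuf_iterator<char>(is),
                       std::istreambuf_iterator<char>{});
  }
};

} // namespace myfs
```

    Client code would then hold a reference to myfs::FileSystem and never mention std::ifstream or std::filesystem, so the query stays quiet for it.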