
How to create category for external headers in clang-format?


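I want to configure clang-format to sort the included C++ headers into the following groups, in this order: the main header, headers in "" with an extension, headers in <> with an extension, headers in <> from specific external libraries, and headers in <> without an extension.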

I'm using clang-format 8.0.0 on macOS. My current configuration (snippet related only to includes) is as follows:

SortIncludes: true
IncludeBlocks: Regroup
IncludeCategories:
  # Headers in <> without extension.
  - Regex:           '<([A-Za-z0-9\/-_])+>'
    Priority:        4
  # Headers in <> from specific external libraries.
  - Regex:           '<((\bboost\b)|(\bcatch2\b))\/([A-Za-z0-9.\/-_])+>'
    Priority:        3
  # Headers in <> with extension.
  - Regex:           '<([A-Za-z0-9.\/-_])+>'
    Priority:        2
  # Headers in "" with extension.
  - Regex:           '"([A-Za-z0-9.\/-_])+"'
    Priority:        1

In this configuration I assume that system/standard headers have no extension, so it will not work for UNIX/POSIX headers. The main header is detected automatically and assigned priority 0. So far everything seems to work as expected, except for the external libraries category: clang-format appears to assign those headers priority 2 instead of 3.

Expected result:

#include "test.h"

#include <allocator/region.hpp>
#include <page.hpp>
#include <page_allocator.hpp>
#include <test_utils.hpp>
#include <utils.hpp>
#include <zone_allocator.hpp>

#include <catch2/catch.hpp>     // <--------

#include <array>
#include <cmath>
#include <cstring>
#include <map>

Actual result:

#include "test.h"

#include <allocator/region.hpp>
#include <catch2/catch.hpp>     // <--------
#include <page.hpp>
#include <page_allocator.hpp>
#include <test_utils.hpp>
#include <utils.hpp>
#include <zone_allocator.hpp>

#include <array>
#include <cmath>
#include <cstring>
#include <map>

How can I configure the priority 3 category to get the expected result?


Solution

  • The problem is that clang-format uses POSIX ERE regexes, and those do not support word boundaries (\b).

    So <catch2/catch.hpp> never matches the second rule. The same string is then evaluated against the third rule, which does match.

    Rules are tried in order and the first match wins: had the string matched the second rule, evaluation would have stopped there, but since it did not, evaluation continued with the next rule.

    Just remove every \b from the regex. It is safe to remove them because the pattern is already delimited: there is a < on the left and a / on the right, so even if word boundaries were supported, they would be redundant here.

      - Regex:           '<(boost|catch2)\/([A-Za-z0-9.\/-_])+>'
        Priority:        3
    

    NOTE: Bear in mind that - inside [] denotes a range unless it is placed in the first or last position. So when you write [A-Za-z0-9.\/-_] you get A-Z, a-z, 0-9, ., and the range from / to _, which is probably not what you meant. In POSIX bracket expressions a backslash is an ordinary character rather than an escape, so instead of escaping the -, move it to the end: [A-Za-z0-9./_-].
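
The two points above (first match wins, and the accidental / to _ range) can be checked with a small sketch. It uses Python's re module as a stand-in, which is an assumption: Python's dialect is not POSIX ERE (it supports \b and treats a backslash inside [] as an escape), so this models the matching order rather than reproducing clang-format exactly. The names CATEGORIES, classify and in_class are hypothetical, and the patterns are the question's categories with \b removed and - moved to the end of each bracket expression.

```python
import re

# Hypothetical stand-ins for the IncludeCategories entries, in file order.
# Order matters: the first regex that matches decides the priority.
CATEGORIES = [
    (r'<[A-Za-z0-9/_-]+>', 4),                   # <> without extension
    (r'<(boost|catch2)/[A-Za-z0-9./_-]+>', 3),   # external libraries
    (r'<[A-Za-z0-9./_-]+>', 2),                  # <> with extension
    (r'"[A-Za-z0-9./_-]+"', 1),                  # "" with extension
]


def classify(include, categories=CATEGORIES):
    """Return the priority of the first category whose regex matches."""
    for pattern, priority in categories:
        if re.search(pattern, include):
            return priority
    return None  # no category matched


def in_class(char_class, ch):
    """Check whether a single character belongs to a bracket expression."""
    return re.fullmatch(char_class, ch) is not None


for inc in ['"test.h"', '<page.hpp>', '<catch2/catch.hpp>', '<array>']:
    print(inc, '->', classify(inc))

# The original class contains the range '/'..'_', which covers '>' but
# does not contain '-' at all; the reordered class behaves as intended.
print(in_class(r'[A-Za-z0-9.\/-_]', '>'))   # True
print(in_class(r'[A-Za-z0-9.\/-_]', '-'))   # False
print(in_class(r'[A-Za-z0-9./_-]', '>'))    # False
print(in_class(r'[A-Za-z0-9./_-]', '-'))    # True
```

With the external libraries rule listed before the generic "<> with extension" rule, <catch2/catch.hpp> is classified as priority 3. If the library rule can never match, the same header falls through to priority 2, which is the behaviour described in the question.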