rust rust-compiler-plugin

Why isn't `regex!` a wrapper for `Regex::new` to offer the same regex matching speed?


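The Rust regex crate offers the regex! syntax extension, which makes it possible to compile a regex while the program itself is being compiled. This is good in two ways: the work of compiling the regex does not have to be done at runtime, and a malformed regex is reported by the compiler instead of causing a runtime panic.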

Unfortunately, the docs say:

WARNING: The regex! compiler plugin is orders of magnitude slower than the normal Regex::new(...) usage. You should not use the compiler plugin unless you have a very special reason for doing so.

This sounds like a completely different regex engine is used for regex! than for Regex::new(). Why isn't regex!() just a wrapper for Regex::new() to combine the advantages from both worlds? As I understand it, these syntax-extension compiler plugins can execute arbitrary code; why not Regex::new()?


Solution

  • The answer is very subtle: one feature of the macro is that the result of regex! can be put into static data, like so:

    static r: Regex = regex!("t?rust");
    

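    The main problem is that Regex::new() allocates on the heap while compiling the regex, but the initializer of a static must be evaluable at compile time, where heap allocation is not available. Making regex! a wrapper would therefore require rewriting the engine behind Regex::new() so that its result can also live in static storage. You can also read burntsushi's comment about this issue on reddit.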
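To make the static-storage constraint concrete, here is a minimal sketch using only the standard library. `Compiled` is a hypothetical stand-in for a compiled regex (it is not the regex crate's type), and `std::sync::OnceLock` is one possible way to get a lazily initialized static when the constructor allocates; nothing here reflects how regex! itself is implemented.

```rust
use std::sync::OnceLock;

// Hypothetical stand-in for a compiled regex: it owns heap memory.
struct Compiled {
    program: Vec<char>,
}

impl Compiled {
    // Runtime constructor, similar in spirit to Regex::new: it allocates.
    fn new(pattern: &str) -> Compiled {
        Compiled {
            program: pattern.chars().collect(),
        }
    }
}

// This would be rejected by the compiler, because `Compiled::new` is not a
// `const fn` and so cannot be called in a static initializer:
//
// static DIRECT: Compiled = Compiled::new("t?rust");

// The static holds only an empty cell; the heap allocation happens on first use.
static LAZY: OnceLock<Compiled> = OnceLock::new();

fn compiled() -> &'static Compiled {
    LAZY.get_or_init(|| Compiled::new("t?rust"))
}

fn main() {
    // Both calls return a reference to the same lazily built value.
    let first = compiled();
    let second = compiled();
    assert!(std::ptr::eq(first, second));
    println!("program length: {}", first.program.len());
}
```

The trade-off in this sketch is that the value is still built at runtime, just once, so it gives up the compile-time checking that regex! provides.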


    There are some suggestions about how to improve regex!:

    As of the beginning of 2017, the developers are focused on stabilizing the standard API to release version 1.0. Since regex! requires a nightly compiler anyway, it has a low priority right now.

    However, the compiler-plugin approach could offer even better performance than Regex::new(), which is already super fast: since the regex's DFA could be compiled into code instead of data, it has the potential to run a bit faster and benefit from compiler optimizations. But more research has to be done in the future to know for sure.
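The difference between a DFA stored as data and a DFA compiled into code can be sketched by hand for the pattern t?rust. Everything below is an illustrative assumption: the state numbering, the function names, and the choice to match the whole input (anchored at both ends) are made up for this example and are not taken from the regex crate.

```rust
// DFA as data: a transition table (from_state, input_byte, to_state)
// that a generic loop interprets at runtime.
const TRANSITIONS: &[(u8, u8, u8)] = &[
    (0, b't', 1), // optional leading 't'
    (0, b'r', 2), // the 't' was skipped
    (1, b'r', 2),
    (2, b'u', 3),
    (3, b's', 4),
    (4, b't', 5),
];
const ACCEPT: u8 = 5;

fn matches_as_data(input: &str) -> bool {
    let mut state = 0u8;
    for b in input.bytes() {
        match TRANSITIONS
            .iter()
            .find(|&&(from, byte, _)| from == state && byte == b)
        {
            Some(&(_, _, to)) => state = to,
            None => return false, // no transition: dead state
        }
    }
    state == ACCEPT
}

// DFA as code: the same transitions written as a `match`, which the
// compiler can see through and optimize like any other control flow.
fn matches_as_code(input: &str) -> bool {
    let mut state = 0u8;
    for b in input.bytes() {
        state = match (state, b) {
            (0, b't') => 1,
            (0, b'r') | (1, b'r') => 2,
            (2, b'u') => 3,
            (3, b's') => 4,
            (4, b't') => 5,
            _ => return false,
        };
    }
    state == 5
}

fn main() {
    for s in ["trust", "rust", "ttrust", "trus"] {
        // Both forms implement the same automaton, so they always agree.
        assert_eq!(matches_as_data(s), matches_as_code(s));
        println!("{:?} -> {}", s, matches_as_code(s));
    }
}
```

Whether the code form is actually faster depends on the pattern and the optimizer, which is the open question the answer refers to.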