regex, sieve-language

How to match everything inside the first pair of square brackets


I'm trying to create a regular expression in Sieve. The Sieve implementation I'm using is Dovecot Pigeonhole.

I'm subscribed to GitHub project updates, and I receive emails from GitHub with a subject in a format like this:

Re: [Opserver] Create issues on Jira from Exception details page (#77)

The project name is included in square brackets in the subject line. Here is the relevant part of my Sieve script:

if address "From" "notifications@github.com" {
  if header :regex "subject" "\\[(.*)\\]" {
      set :lower :upperfirst "repository" "${1}";
      fileinto :create "Subscribtions.GitHub.${repository}"; stop;
  } else {
      fileinto :create "Subscribtions.GitHub"; stop;
  }
}

As you can see from the above, I'm moving the messages to the appropriate per-project IMAP folders, so the message with the subject above will end up in Subscribtions.GitHub.Opserver.

Unfortunately, there is one small problem with this script. If someone puts square brackets in the title of their GitHub issue, the filter breaks. For example, if the subject is:

[Project] [Please look at it] - very weird issue

The above filter will move the message to the folder "Subscribtions.GitHub.Project] [please look at it", which is completely undesirable. I'd like it to be moved to Subscribtions.GitHub.Project anyway.

This happens because regular expression quantifiers are greedy by default, so .* matches as much as possible and extends to the last closing bracket in the subject. However, when I try to fix it the usual way, by changing "\\[(.*)\\]" to "\\[(.*?)\\]", nothing seems to change.

How do I write this regular expression so that it acts as desired?


Solution

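  • The answer is to change "\\[(.*)\\]" to "\\[([^]]*)\\]".

    Reading the regex spec linked in the question, we discover that POSIX regular expressions are used. Unfortunately, those do not support non-greedy quantifiers such as *?.

    However, there is a workaround in this particular case: the negated bracket expression [^]]* matches any run of characters other than ], so the capture group stops at the first closing bracket. (In a POSIX bracket expression, a ] placed immediately after [^ is taken literally.)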
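
For reference, here is a sketch of the script from the question with the corrected pattern applied. The require line is an assumption, since the question shows only the relevant part of the script; it lists the extensions this fragment appears to depend on (fileinto, mailbox for :create, regex for :regex, variables for set and ${1}). Folder names are kept exactly as in the question.

```sieve
# Assumed extension list; the original script's require line is not shown.
require ["fileinto", "mailbox", "regex", "variables"];

if address "From" "notifications@github.com" {
  # [^]]* matches any characters except "]", so the capture
  # ends at the first closing bracket instead of the last one.
  if header :regex "subject" "\\[([^]]*)\\]" {
      # Normalize the captured project name, e.g. "Opserver".
      set :lower :upperfirst "repository" "${1}";
      fileinto :create "Subscribtions.GitHub.${repository}"; stop;
  } else {
      # No bracketed project name found: use the parent folder.
      fileinto :create "Subscribtions.GitHub"; stop;
  }
}
```

With this pattern, the subject "[Project] [Please look at it] - very weird issue" should capture only "Project", so the message would be filed into Subscribtions.GitHub.Project.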