
Sieve fails on encoded subjects


I'm trying to set up Sieve filter rules that add a flag and a header to incoming messages based on a regex, but Sieve fails as soon as there's a German umlaut in the subject.

Here's my Sieve rule:

require ["fileinto","editheader","variables","regex","imap4flags","encoded-character"];
if anyof (header :regex :comparator "i;ascii-casemap" "Subject" "([0-9]{3}-[0-9]{7}-[0-9]{7})")
{
        addheader :last "X-OrderID" "${0}";
        addflag "\\Flagged";
        addflag "${0}";
}

The subject is something like this:

Rückfrage zur Lieferung einer Bestellung von xxx (Bestellung: 304-1962494-2978192)

and the second letter, ü, is causing the trouble.

When I send the message without it, everything works as expected.

The messages are of this type:

MIME-Version: 1.0
Content-Type: multipart/mixed; 

When there are umlauts within the subject, it is changed to

=?UTF-8?Q?R=C3=BCckfrage_zur_Lieferung_einer_Bestellung_von

but so far I haven't found a way to convert this.

During my research I found a Sieve extension called mime:

https://www.rfc-editor.org/rfc/rfc5703

However, if I try to require it in the require part of my script, I get an error, and if I try to set it as an additional extension to Sieve, it doesn't reload the config, saying the extension is not known.

Can someone help me fix this?


Solution

  • It can't work this way. First of all, you don't need require "encoded-character": it is defined in the base specification (https://www.rfc-editor.org/rfc/rfc5228#page-10) and only matters if your script uses ${hex: ...} or ${unicode: ...} sequences, which it doesn't. Next, you don't need anyof here, because there is only a single test. The :comparator "i;ascii-casemap" restricts the character class to 7-bit US-ASCII. The MIME version of the mail body has nothing to do with the mail headers, so RFC5703 doesn't apply at all.

    To cite from RFC5228 (highlighting by me):

    Comparisons are performed on octets. Implementations convert text from header fields in all charsets [MIME3] to Unicode, encoded as UTF-8, as input to the comparator (see section 2.7.3). Implementations MUST be capable of converting US-ASCII, ISO-8859-1, the US-ASCII subset of ISO-8859-* character sets, and UTF-8.
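
To make the conversion described in that quote concrete, here is a minimal sketch outside of Sieve, using Python's standard email.header module. The raw header value is a hypothetical, complete encoded-word modelled on the subject from the question, not the asker's real header, and the helper name decode_subject is made up for illustration. A Sieve implementation is expected to do the equivalent internally before the comparator runs.

```python
from email.header import decode_header, make_header

# Hypothetical raw header value: one complete RFC 2047 encoded-word
# (Q encoding, UTF-8), modelled on the subject from the question.
RAW_SUBJECT = "=?UTF-8?Q?R=C3=BCckfrage_=28Bestellung=3A_304-1962494-2978192=29?="


def decode_subject(raw: str) -> str:
    """Decode all encoded-words in a header value into one Unicode string."""
    return str(make_header(decode_header(raw)))


decoded = decode_subject(RAW_SUBJECT)
print(decoded)
# The comparator then works on the UTF-8 octets of the decoded text.
print(decoded.encode("utf-8"))
```

The order number is plain ASCII in the decoded text, so a pattern that matches only digits and hyphens does not need to know anything about the umlaut.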

    Everything is done automatically. So, just don't request explicit ASCII comparison. The following expression will do what you want:

    require ["fileinto","editheader","variables","regex","imap4flags"];
    if header :regex :comparator "i;octet" "Subject" "[[:graph:]]* ([0-9]{3}-[0-9]{7}-[0-9]{7})$" {
        ...
    }
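
Before reusing the pattern above, it may help to check what it captures. The following is a rough Python sketch, not Sieve: \S stands in for the POSIX class [[:graph:]], which Python's re module does not support, so it is only an approximation. It assumes that Sieve's ${0} corresponds to the whole match and ${1} to the first parenthesised group. The helper name match_groups and both sample subjects are made up for illustration.

```python
import re

# Approximate Python translation of the Sieve pattern; \S replaces the
# POSIX class [[:graph:]] (assumption: close enough for this illustration).
PATTERN = re.compile(r"\S* ([0-9]{3}-[0-9]{7}-[0-9]{7})$")


def match_groups(subject):
    """Return (whole match, first group), or None if the pattern fails."""
    m = PATTERN.search(subject)
    return None if m is None else (m.group(0), m.group(1))


# Order number at the very end of the subject: the pattern matches.
at_end = match_groups("Rückfrage zur Bestellung: 304-1962494-2978192")
# Order number followed by a closing parenthesis: the $ anchor fails.
in_parens = match_groups("Rückfrage (Bestellung: 304-1962494-2978192)")

print(at_end)
print(in_parens)
```

Two things follow under these assumptions. The whole match includes the word before the number, so the bare order ID is in the first group rather than in the whole match. And because of the trailing $, a subject that ends with a closing parenthesis after the number would not match, so the anchor may need to be dropped or adapted for such subjects.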
    

    BTW: If your Sieve implementation throws an error on a require, that's because an implementation does not need to support all optional extensions. You didn't mention which software you use, so it's left as an exercise for you to find the capability string of your Sieve implementation, which tells you which extensions it supports (see https://www.rfc-editor.org/rfc/rfc5228#page-31).

    Hope this helps, viele Grüße :-)