email smtp dkim canonicalization

How do I format the DKIM header and body?


I've asked a question about this before, but I still don't understand what to do.

I need to produce the canonicalized header and body for an email. I've read this part of the documentation many times. Could someone give an example? I cannot wrap my head around this:

3.4.1. The "simple" Header Canonicalization Algorithm

The "simple" header canonicalization algorithm does not change header fields in any way. Header fields MUST be presented to the signing or verification algorithm exactly as they are in the message being signed or verified. In particular, header field names MUST NOT be case folded and whitespace MUST NOT be changed.

3.4.2. The "relaxed" Header Canonicalization Algorithm

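The "relaxed" header canonicalization algorithm MUST apply the following steps in order:

  • Convert all header field names (not the header field values) to lowercase. For example, convert "SUBJect: AbC" to "subject: AbC".

  • Unfold all header field continuation lines as described in [RFC5322]; in particular, lines with terminators embedded in continued header field values (that is, CRLF sequences followed by WSP) MUST be interpreted without the CRLF. Implementations MUST NOT remove the CRLF at the end of the header field value.

  • Convert all sequences of one or more WSP characters to a single SP character. WSP characters here include those before and after a line folding boundary.

  • Delete all WSP characters at the end of each unfolded header field value.

  • Delete any WSP characters remaining before and after the colon separating the header field name from the header field value. The colon separator MUST be retained.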

3.4.3. The "simple" Body Canonicalization Algorithm

The "simple" body canonicalization algorithm ignores all empty lines at the end of the message body. An empty line is a line of zero length after removal of the line terminator. If there is no body or no trailing CRLF on the message body, a CRLF is added. It makes no other changes to the message body. In more formal terms, the "simple" body canonicalization algorithm converts "*CRLF" at the end of the body to a single "CRLF".

Note that a completely empty or missing body is canonicalized as a single "CRLF"; that is, the canonicalized length will be 2 octets.

The SHA-1 value (in base64) for an empty body (canonicalized to a "CRLF") is:

uoq1oCgLlTqpdDX/iUbLy7J1Wic=

The SHA-256 value is:

frcCV1k9oG9oKj3dpUqdJg1PxRT2RSN/XKdLCPjaYaY=

3.4.4. The "relaxed" Body Canonicalization Algorithm

The "relaxed" body canonicalization algorithm MUST apply the following steps (1) and (2) in order:

  1. Reduce whitespace:

    • Ignore all whitespace at the end of lines. Implementations MUST NOT remove the CRLF at the end of the line.

    • Reduce all sequences of WSP within a line to a single SP character.

  2. Ignore all empty lines at the end of the message body. "Empty line" is defined in Section 3.4.3. If the body is non-empty but does not end with a CRLF, a CRLF is added. (For email, this is only possible when using extensions to SMTP or non-SMTP transport mechanisms.)

The SHA-1 value (in base64) for an empty body (canonicalized to a null input) is:

2jmj7l5rSw0yVb/vlWAYkK/YBwk=

The SHA-256 value is:

47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=

3.4.5. Canonicalization Examples (INFORMATIVE)

In the following examples, actual whitespace is used only for clarity. The actual input and output text is designated using bracketed descriptors: "<SP>" for a space character, "<HTAB>" for a tab character, and "<CRLF>" for a carriage-return/line-feed sequence. For example, "X <SP> Y" and "X<SP>Y" represent the same three characters.

Example 1: A message reading:

A: <SP> X <CRLF>
B <SP> : <SP> Y <HTAB><CRLF>
                <HTAB> Z <SP><SP><CRLF>
<CRLF>
<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>
<CRLF>
<CRLF>

when canonicalized using relaxed canonicalization for both header and body results in a header reading:

a:X <CRLF>
b:Y <SP> Z <CRLF>

and a body reading:

<SP> C <CRLF>
D <SP> E <CRLF>

Example 2: The same message canonicalized using simple canonicalization for both header and body results in a header reading:

A: <SP> X <CRLF>
B <SP> : <SP> Y <HTAB><CRLF>
       <HTAB> Z <SP><SP><CRLF>

and a body reading:

<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>

Example 3: When processed using relaxed header canonicalization and simple body canonicalization, the canonicalized version has a header of:

a:X <CRLF>
b:Y <SP> Z <CRLF>

and a body reading:

<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>

Solution

  • Okay, let's try translating these examples into C strings:

    3.4.5. Canonicalization Examples (INFORMATIVE)

    In the following examples, actual whitespace is used only for clarity. The actual input and output text is designated using bracketed descriptors: "<SP>" for a space character, "<HTAB>" for a tab character, and "<CRLF>" for a carriage-return/line-feed sequence. For example, "X <SP> Y" and "X<SP>Y" represent the same three characters.

    Example 1: A message reading:

    A: <SP> X <CRLF>
    B <SP> : <SP> Y <HTAB><CRLF>
                    <HTAB> Z <SP><SP><CRLF>
    <CRLF>
    <SP> C <SP><CRLF>
    D <SP><HTAB><SP> E <CRLF>
    <CRLF>
    <CRLF>
    

    Translation:

    const char *message = "A: X\r\nB : Y\t\r\n\tZ  \r\n\r\n C \r\nD \t E\r\n\r\n\r\n";
    

    when canonicalized using relaxed canonicalization for both header and body results in a header reading:

    a:X <CRLF>
    b:Y <SP> Z <CRLF>
    

    Translation:

    const char *headers = "a:X\r\nb:Y Z\r\n";
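
    For what it's worth, here is a minimal sketch of how the relaxed header rules from section 3.4.2 of the DKIM specification (RFC 6376) could be coded: lowercase the field name, unfold continuation lines, collapse every run of WSP to a single space, and drop WSP at the end of the value and on both sides of the colon. The names relaxed_headers, is_wsp and at_crlf are made up for this illustration. The sketch assumes a NUL-terminated header block and an output buffer of at least strlen(in) + 3 bytes.

```c
#include <ctype.h>
#include <stddef.h>

/* WSP as used by the DKIM text: space or horizontal tab. */
static int is_wsp(char c)
{
    return c == ' ' || c == '\t';
}

/* True when p points at a CRLF pair. */
static int at_crlf(const char *p)
{
    return p[0] == '\r' && p[1] == '\n';
}

/* Hypothetical helper: relaxed canonicalization of a whole header block.
 * Writes a NUL-terminated result to out and returns its length. */
static size_t relaxed_headers(const char *in, char *out)
{
    size_t o = 0;

    while (*in != '\0') {
        int pending = 0;   /* a WSP run was seen and not yet emitted */
        int started = 0;   /* some value character was already emitted */

        /* Field name: lowercase it and drop WSP before the colon. */
        while (*in != '\0' && *in != ':' && !at_crlf(in)) {
            if (!is_wsp(*in))
                out[o++] = (char)tolower((unsigned char)*in);
            in++;
        }
        if (*in == ':')
            out[o++] = *in++;

        /* Field value: unfold, collapse WSP runs, trim both ends. */
        while (*in != '\0') {
            if (at_crlf(in)) {
                in += 2;
                if (is_wsp(*in))
                    continue;      /* folded line: drop only the CRLF */
                break;             /* real end of this header field */
            }
            if (is_wsp(*in)) {
                pending = 1;
                in++;
                continue;
            }
            if (pending && started)
                out[o++] = ' ';
            pending = 0;
            started = 1;
            out[o++] = *in++;
        }
        out[o++] = '\r';           /* the field keeps its final CRLF */
        out[o++] = '\n';
    }
    out[o] = '\0';
    return o;
}
```

    Called with the unmodified header block "A: X\r\nB : Y\t\r\n\tZ  \r\n", this should produce the relaxed string "a:X\r\nb:Y Z\r\n".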
    

    and a body reading:

    <SP> C <CRLF>
    D <SP> E <CRLF>
    

    Translation:

    const char *body = " C\r\nD E\r\n";
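
    The relaxed body rules can be sketched in a similar way: within each line, drop trailing WSP and collapse the remaining WSP runs to one space, then ignore the empty lines at the end of the body. The function name relaxed_body is hypothetical. The sketch assumes a NUL-terminated body with CRLF line endings and an output buffer of at least strlen(in) + 3 bytes.

```c
#include <stddef.h>

/* WSP as used by the DKIM text: space or horizontal tab. */
static int is_wsp(char c)
{
    return c == ' ' || c == '\t';
}

/* Hypothetical helper: relaxed canonicalization of a message body.
 * Writes a NUL-terminated result to out and returns its length. */
static size_t relaxed_body(const char *in, char *out)
{
    size_t o = 0;
    size_t keep = 0;   /* output length just after the last non-empty line */

    while (*in != '\0') {
        size_t line_start = o;
        int pending = 0;   /* a WSP run was seen and not yet emitted */

        /* Copy one line, collapsing WSP runs and dropping trailing WSP. */
        while (*in != '\0' && !(in[0] == '\r' && in[1] == '\n')) {
            if (is_wsp(*in)) {
                pending = 1;
            } else {
                if (pending)
                    out[o++] = ' ';
                pending = 0;
                out[o++] = *in;
            }
            in++;
        }
        if (*in != '\0')
            in += 2;               /* consume the CRLF */

        /* Every line ends in CRLF; this also supplies a missing final one. */
        out[o++] = '\r';
        out[o++] = '\n';

        if (o - line_start > 2)    /* line has content besides its CRLF */
            keep = o;
    }

    out[keep] = '\0';              /* cut off the trailing empty lines */
    return keep;
}
```

    An empty body comes out with length 0, which matches the "null input" case in section 3.4.4.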
    

    Example 2: The same message canonicalized using simple canonicalization for both header and body results in a header reading:

    A: <SP> X <CRLF>
    B <SP> : <SP> Y <HTAB><CRLF>
           <HTAB> Z <SP><SP><CRLF>
    

    Translation:

    const char *headers = "A: X\r\nB : Y\t\r\n\tZ  \r\n";
    

    and a body reading:

    <SP> C <SP><CRLF>
    D <SP><HTAB><SP> E <CRLF>
    

    Translation:

    const char *body = " C \r\nD \t E\r\n";
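
    Simple body canonicalization is the easiest of the four to code, since it only turns any run of CRLFs at the end of the body into exactly one CRLF. A rough sketch, with the hypothetical name simple_body, assuming a NUL-terminated body and an output buffer of at least strlen(in) + 3 bytes:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: simple canonicalization of a message body.
 * Writes a NUL-terminated result to out and returns its length. */
static size_t simple_body(const char *in, char *out)
{
    size_t len = strlen(in);

    /* Strip every CRLF at the very end of the body. */
    while (len >= 2 && in[len - 2] == '\r' && in[len - 1] == '\n')
        len -= 2;

    /* Copy what is left and terminate it with exactly one CRLF. */
    memcpy(out, in, len);
    out[len++] = '\r';
    out[len++] = '\n';
    out[len] = '\0';
    return len;
}
```

    An empty or missing body therefore becomes a lone "\r\n" of 2 octets, as section 3.4.3 describes.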
    

    Example 3: When processed using relaxed header canonicalization and simple body canonicalization, the canonicalized version has a header of:

    a:X <CRLF>
    b:Y <SP> Z <CRLF>
    

    Translation:

    const char *headers = "a:X\r\nb:Y Z\r\n";
    

    and a body reading:

    <SP> C <SP><CRLF>
    D <SP><HTAB><SP> E <CRLF>
    

    Translation:

    const char *body = " C \r\nD \t E\r\n";
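
    All three examples treat the header and the body separately, so the raw message has to be split first. In the example message the header ends at the first empty line, i.e. the first "\r\n\r\n". A small sketch of that split, assuming a NUL-terminated message that contains at least one header field (split_message is a made-up name):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: locate the body inside a raw message.
 * Returns a pointer to the first byte of the body, or NULL when there is
 * no empty line separating header and body. *header_len receives the
 * header length, including the CRLF that ends the last header field but
 * not the empty separator line. */
static const char *split_message(const char *msg, size_t *header_len)
{
    const char *sep = strstr(msg, "\r\n\r\n");

    if (sep == NULL)
        return NULL;

    *header_len = (size_t)(sep - msg) + 2;  /* keep the field's own CRLF */
    return sep + 4;                         /* skip that CRLF and the empty line */
}
```

    For the message string from Example 1, the first 20 bytes are the header block and the returned pointer refers to " C \r\nD \t E\r\n\r\n\r\n", which is the body before any canonicalization.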