I've asked a question about this before, but I still don't understand what to do.
I need to produce the canonicalized header and body for an email. I've read this piece of documentation many times. Could someone give an example? I cannot wrap my head around this:
The "simple" header canonicalization algorithm does not change header fields in any way. Header fields MUST be presented to the signing or verification algorithm exactly as they are in the message being signed or verified. In particular, header field names MUST NOT be case folded and whitespace MUST NOT be changed.
The "relaxed" header canonicalization algorithm MUST apply the following steps in order:
Convert all header field names (not the header field values) to lowercase. For example, convert "SUBJect: AbC" to "subject: AbC".
Unfold all header field continuation lines as described in [RFC5322]; in particular, lines with terminators embedded in continued header field values (that is, CRLF sequences followed by WSP) MUST be interpreted without the CRLF. Implementations MUST NOT remove the CRLF at the end of the header field value.
Convert all sequences of one or more WSP characters to a single SP character. WSP characters here include those before and after a line folding boundary.
Delete all WSP characters at the end of each unfolded header field value.
Delete any WSP characters remaining before and after the colon separating the header field name from the header field value. The colon separator MUST be retained.
The "simple" body canonicalization algorithm ignores all empty lines at the end of the message body. An empty line is a line of zero length after removal of the line terminator. If there is no body or no trailing CRLF on the message body, a CRLF is added. It makes no other changes to the message body. In more formal terms, the "simple" body canonicalization algorithm converts "*CRLF" at the end of the body to a single "CRLF".
Note that a completely empty or missing body is canonicalized as a single "CRLF"; that is, the canonicalized length will be 2 octets.
The SHA-1 value (in base64) for an empty body (canonicalized to a "CRLF") is:
uoq1oCgLlTqpdDX/iUbLy7J1Wic=
The SHA-256 value is:
frcCV1k9oG9oKj3dpUqdJg1PxRT2RSN/XKdLCPjaYaY=
The "relaxed" body canonicalization algorithm MUST apply the following steps (1) and (2) in order:
(1) Reduce whitespace:
    - Ignore all whitespace at the end of lines. Implementations MUST NOT remove the CRLF at the end of the line.
    - Reduce all sequences of WSP within a line to a single SP character.
(2) Ignore all empty lines at the end of the message body. "Empty line" is defined in Section 3.4.3. If the body is non-empty but does not end with a CRLF, a CRLF is added. (For email, this is only possible when using extensions to SMTP or non-SMTP transport mechanisms.)
The SHA-1 value (in base64) for an empty body (canonicalized to a null input) is:
2jmj7l5rSw0yVb/vlWAYkK/YBwk=
The SHA-256 value is:
47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
In the following examples, actual whitespace is used only for clarity. The actual input and output text is designated using bracketed descriptors: "<SP>" for a space character, "<HTAB>" for a tab character, and "<CRLF>" for a carriage-return/line-feed sequence. For example, "X <SP> Y" and "X<SP>Y" represent the same three characters.
Example 1: A message reading:
A: <SP> X <CRLF>
B <SP> : <SP> Y <HTAB><CRLF>
<HTAB> Z <SP><SP><CRLF>
<CRLF>
<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>
<CRLF>
<CRLF>
when canonicalized using relaxed canonicalization for both header and body results in a header reading:
a:X <CRLF>
b:Y <SP> Z <CRLF>
and a body reading:
<SP> C <CRLF>
D <SP> E <CRLF>
Example 2: The same message canonicalized using simple canonicalization for both header and body results in a header reading:
A: <SP> X <CRLF>
B <SP> : <SP> Y <HTAB><CRLF>
<HTAB> Z <SP><SP><CRLF>
and a body reading:
<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>
Example 3: When processed using relaxed header canonicalization and simple body canonicalization, the canonicalized version has a header of:
a:X <CRLF>
b:Y <SP> Z <CRLF>
and a body reading:
<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>
Okay, let's try translating these examples into C strings:
In the following examples, actual whitespace is used only for clarity. The actual input and output text is designated using bracketed descriptors: "<SP>" for a space character, "<HTAB>" for a tab character, and "<CRLF>" for a carriage-return/line-feed sequence. For example, "X <SP> Y" and "X<SP>Y" represent the same three characters.
Example 1: A message reading:
A: <SP> X <CRLF>
B <SP> : <SP> Y <HTAB><CRLF>
<HTAB> Z <SP><SP><CRLF>
<CRLF>
<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>
<CRLF>
<CRLF>
Translation:
const char *message = "A: X\r\nB : Y\t\r\n\tZ  \r\n\r\n C \r\nD \t E\r\n\r\n\r\n"; /* note: two spaces after Z */
when canonicalized using relaxed canonicalization for both header and body results in a header reading:
a:X <CRLF>
b:Y <SP> Z <CRLF>
Translation:
const char *headers = "a:X\r\nb:Y Z\r\n";
and a body reading:
<SP> C <CRLF>
D <SP> E <CRLF>
Translation:
const char *body = " C\r\nD E\r\n";
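If it helps to see where that body string comes from, here is a minimal sketch of the two "relaxed" body steps quoted in the question. The names `relaxed_body` and `is_wsp` are my own, not from the RFC or any library. The sketch assumes a NUL-terminated body and a caller-supplied output buffer of at least strlen(in) + 3 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* WSP as used by the RFC text: space or horizontal tab. */
static int is_wsp(char c) { return c == ' ' || c == '\t'; }

/* Hypothetical helper: relaxed body canonicalization.
 * 'out' must hold at least strlen(in) + 3 bytes.
 * Returns the canonicalized length (0 for an empty body). */
size_t relaxed_body(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;    /* bytes written so far */
    size_t keep = 0; /* length up to the end of the last non-empty line */

    while (*in != '\0') {
        size_t line_start = o;
        int pending_sp = 0; /* saw WSP that has not been emitted yet */

        /* Copy one line, collapsing each WSP run to a single SP.
         * WSP at the end of the line is never emitted. */
        while (*in != '\0' && !(in[0] == '\r' && in[1] == '\n')) {
            if (is_wsp(*in)) {
                pending_sp = 1;
            } else {
                if (pending_sp)
                    out[o++] = ' ';
                pending_sp = 0;
                out[o++] = *in;
            }
            in++;
        }
        if (*in != '\0')
            in += 2; /* skip the CRLF we stopped at */

        int empty = (o == line_start);
        out[o++] = '\r'; /* also adds a CRLF if the last line had none */
        out[o++] = '\n';
        if (!empty)
            keep = o; /* trailing empty lines stay beyond 'keep' */
    }
    out[keep] = '\0'; /* cut off the empty lines at the end */
    return keep;
}
```

Calling it on the Example 1 body, " C \r\nD \t E\r\n\r\n\r\n", gives " C\r\nD E\r\n" (9 bytes). An empty body gives a zero-length result, which is the "null input" case the RFC text mentions.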
Example 2: The same message canonicalized using simple canonicalization for both header and body results in a header reading:
A: <SP> X <CRLF>
B <SP> : <SP> Y <HTAB><CRLF>
<HTAB> Z <SP><SP><CRLF>
Translation:
const char *headers = "A: X\r\nB : Y\t\r\n\tZ  \r\n"; /* note: two spaces after Z */
and a body reading:
<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>
Translation:
const char *body = " C \r\nD \t E\r\n";
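The "simple" body rule is small enough to sketch directly: strip every CRLF at the end, then put exactly one back. This is only an illustration under my own assumptions. The name `simple_body` is made up, the body is passed with an explicit length, and the output buffer must hold at least len + 3 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: simple body canonicalization.
 * Copies 'body' to 'out' with "*CRLF" at the end reduced to one CRLF.
 * 'out' must hold at least len + 3 bytes. Returns the new length. */
size_t simple_body(const char *body, size_t len, char *out)
{
    assert(out != NULL && (body != NULL || len == 0));

    /* Drop all trailing CRLF pairs (this removes the empty lines
     * at the end as well as the final line terminator). */
    while (len >= 2 && body[len - 2] == '\r' && body[len - 1] == '\n')
        len -= 2;

    if (len > 0)
        memcpy(out, body, len);

    /* Add back a single CRLF; an empty body becomes just CRLF. */
    out[len++] = '\r';
    out[len++] = '\n';
    out[len] = '\0';
    return len;
}
```

For the Example 2 input body this produces " C \r\nD \t E\r\n", and for an empty body it produces the 2-octet "\r\n" that the RFC text describes.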
Example 3: When processed using relaxed header canonicalization and simple body canonicalization, the canonicalized version has a header of:
a:X <CRLF>
b:Y <SP> Z <CRLF>
Translation:
const char *headers = "a:X\r\nb:Y Z\r\n";
and a body reading:
<SP> C <SP><CRLF>
D <SP><HTAB><SP> E <CRLF>
Translation:
const char *body = " C \r\nD \t E\r\n";
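The relaxed header strings in Examples 1 and 3 follow from the five steps quoted in the question. Below is a rough sketch of those steps in one pass, not a vetted implementation. The names `relaxed_headers` and `is_wsp` are my own. It assumes a NUL-terminated header block (without the empty separator line) and an output buffer of at least strlen(in) + 1 bytes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* WSP as used by the RFC text: space or horizontal tab. */
static int is_wsp(char c) { return c == ' ' || c == '\t'; }

/* Hypothetical helper: relaxed header canonicalization of a header block.
 * 'out' must hold at least strlen(in) + 1 bytes. Returns the new length. */
size_t relaxed_headers(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;

    while (*in != '\0') {
        /* Field name: lowercase it and drop WSP before the colon. */
        while (*in != '\0' && *in != ':' &&
               !(in[0] == '\r' && in[1] == '\n')) {
            if (!is_wsp(*in))
                out[o++] = (char)tolower((unsigned char)*in);
            in++;
        }
        if (*in == ':') {
            out[o++] = ':'; /* the colon itself is retained */
            in++;
        }

        /* Field value: unfold, collapse WSP runs, trim both ends. */
        size_t value_start = o;
        int pending_sp = 0;
        for (;;) {
            if (is_wsp(*in)) {
                pending_sp = 1;
                in++;
            } else if (in[0] == '\r' && in[1] == '\n' && is_wsp(in[2])) {
                in += 2; /* folded line: drop the CRLF, keep scanning */
            } else if (*in == '\0' || (in[0] == '\r' && in[1] == '\n')) {
                break;   /* end of this field; pending WSP is discarded */
            } else {
                if (pending_sp && o > value_start)
                    out[o++] = ' '; /* no SP directly after the colon */
                pending_sp = 0;
                out[o++] = *in++;
            }
        }

        /* Keep the CRLF that ends the field, if the input had one. */
        if (*in != '\0') {
            in += 2;
            out[o++] = '\r';
            out[o++] = '\n';
        }
    }
    out[o] = '\0';
    return o;
}
```

Feeding it the header part of the example message, "A: X\r\nB : Y\t\r\n\tZ  \r\n", gives "a:X\r\nb:Y Z\r\n". The RFC's own step 1 example, "SUBJect: AbC", ends up as "subject:AbC" once the later steps remove the space after the colon.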