There are official test cases for XML canonicalization which can be found here: Test cases for Canonical XML 2.0
One of them looks like this:
<!DOCTYPE doc [<!ATTLIST e9 attr CDATA "default">]>
<doc>
<e1 />
<e2 ></e2>
<e3 name = "elem3" id="elem3" />
<e4 name="elem4" id="elem4" ></e4>
<e5 a:attr="out" b:attr="sorted" attr2="all" attr="I'm"
xmlns:b="http://www.ietf.org"
xmlns:a="http://www.w3.org"
xmlns="http://example.org"/>
<e6 xmlns="" xmlns:a="http://www.w3.org">
<e7 xmlns="http://www.ietf.org">
<e8 xmlns="" xmlns:a="http://www.w3.org">
<e9 xmlns="" xmlns:a="http://www.ietf.org"/>
</e8>
</e7>
</e6>
</doc>
The given canonicalized form is:
<doc>
<e1></e1>
<e2></e2>
<e3 id="elem3" name="elem3"></e3>
<e4 id="elem4" name="elem4"></e4>
<e5 xmlns="http://example.org" xmlns:a="http://www.w3.org" xmlns:b="http://www.ietf.org" attr="I'm" attr2="all" b:attr="sorted" a:attr="out"></e5>
<e6>
<e7 xmlns="http://www.ietf.org">
<e8 xmlns="">
<e9 attr="default"></e9>
</e8>
</e7>
</e6>
</doc>
I'm wondering why b:attr="sorted" comes before a:attr="out" in the sorted output. I'd be really thankful if someone could clarify this for me.
Don't look at the namespace prefixes; look at the namespace URIs.
Although the prefix a comes before b, the two URIs share the leading http://www. and then differ at i versus w, and i comes before w:
xmlns:b="http://www.ietf.org"
xmlns:a="http://www.w3.org"
Therefore b:attr="sorted" comes before a:attr="out" canonically.
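To make the ordering concrete, here is a minimal Python sketch of the sort key for the attributes of e5. It is only an illustration, not the spec's full algorithm: the `NAMESPACES` table, the `sort_key` helper and the tuple layout are names I made up for this example, and namespace declarations (the xmlns attributes) are left out because they are ordered separately.

```python
# Prefix-to-URI bindings in scope on <e5> (hypothetical lookup table).
NAMESPACES = {
    "a": "http://www.w3.org",
    "b": "http://www.ietf.org",
}

# Attributes of <e5> in document order, as (prefix, local name, value).
attrs = [
    ("a", "attr", "out"),
    ("b", "attr", "sorted"),
    ("", "attr2", "all"),
    ("", "attr", "I'm"),
]

def sort_key(attr):
    """Primary key: namespace URI; secondary key: local name."""
    prefix, local, _value = attr
    # Unprefixed attributes are in no namespace, so their URI is empty
    # and they sort ahead of every namespaced attribute.
    uri = NAMESPACES.get(prefix, "")
    return (uri, local)

ordered = sorted(attrs, key=sort_key)
names = [f"{p}:{local}" if p else local for p, local, _ in ordered]
print(names)  # ['attr', 'attr2', 'b:attr', 'a:attr']
```

The prefix never enters the key, which is why renaming a and b would not change the order as long as the URIs stay the same.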
This is explained in section 2.3:
Note: In e5, b:attr precedes a:attr because the primary key is namespace URI not namespace prefix
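If you want to check this yourself, Python's standard library (3.8 and later) ships `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0. The snippet below is a quick sanity check on the e5 element alone, assuming that function is available in your Python version; the `E5` variable is just a name for this example.

```python
from xml.etree.ElementTree import canonicalize

# The <e5> element from the test case, taken out of its parent document.
E5 = (
    '<e5 a:attr="out" b:attr="sorted" attr2="all" attr="I\'m" '
    'xmlns:b="http://www.ietf.org" xmlns:a="http://www.w3.org" '
    'xmlns="http://example.org"/>'
)

# With no output file given, canonicalize() returns the result as a string.
canonical = canonicalize(E5)
print(canonical)

# b:attr should appear before a:attr because its namespace URI sorts first.
print(canonical.index("b:attr") < canonical.index("a:attr"))
```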