If I have:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE country[
<!ELEMENT country
(president | king | (king,queen) | queen)>
<!ELEMENT president (#PCDATA)>
<!ELEMENT king (#PCDATA)>
<!ELEMENT queen (#PCDATA)>
]>
Why does (president | king | (king,queen) | queen)
generate an error when we try to validate the following?
<country><king>Luis</king></country>
The error message is: [...]Both 1st and 2nd occurence of "king" are possible
Would it help to write (president | (king) | (king,queen) | queen) instead?
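It's because your content model is non-deterministic. This means that, given the king element, the parser cannot determine which branch of the model is being matched without looking ahead. Wrapping king in parentheses, as in (king), does not change this, because the parentheses do not alter the model. See Deterministic Content Models (Non-Normative) in the XML specification for more details.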
What I would do is make queen optional when a king is present:
<!ELEMENT country (president | (king,queen?) | queen)>
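For reference, here is a minimal sketch of the whole document with that change applied. It reuses the declarations and the king-only instance from the question; the XML comments are my own additions. With this declaration, the instance should validate, because after reading king the parser only has to check whether an optional queen follows.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE country [
<!-- A king may be followed by an optional queen, so only one
     branch of the content model can start with king. -->
<!ELEMENT country (president | (king,queen?) | queen)>
<!ELEMENT president (#PCDATA)>
<!ELEMENT king (#PCDATA)>
<!ELEMENT queen (#PCDATA)>
]>
<!-- The instance from the question: a king with no queen. -->
<country><king>Luis</king></country>
```

The same DTD should also accept a king followed by a queen, a queen alone, or a president alone.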
Response to comment...
The XML processor cannot use "look ahead" in order to figure out what is gonna "happen" after matching "king", right?
Right. For example, let's say we have this country element:
<country>
<king/>
</country>
and we declare country like this in our DTD:
<!ELEMENT country (president | king | (king,queen) | queen)>
there are 4 possible options for the content of country:
1. president
2. king
3. king followed by queen
4. queen
So if we have a king element in our XML, the parser cannot tell, at the moment it reads king, whether it is option #2 or option #3.
If we declare country like this:
<!ELEMENT country (president | (king,queen?) | queen)>
there are 3 possible options for the content of country:
1. president
2. king, optionally followed by queen
3. queen
As you can see, if we have a king element in our XML, there is only one possible option that the parser can choose.