I have the following fake.dtd file:
<!ELEMENT outer - - (#PCDATA, foo, bar) >
<!ELEMENT foo - o (#PCDATA) >
<!ELEMENT bar - - (#PCDATA) >
And the following SGML document:
<!DOCTYPE outer SYSTEM "fake.dtd">
<OUTER>Document Title
<FOO>1234
<BAR>wxyz</BAR>
</OUTER>
I am getting a validation error using nsgmls:
4:19:E: character data is not allowed here
Note that putting </OUTER> on the same line as </BAR> solves the problem; the error refers to the line break.
Is there a way to keep the SGML as is (because I already have thousands of documents like this), but change the DTD so that it validates?
Adding another #PCDATA to the end of the outer element seems silly, because that would make characters other than newline legal.
The SGML Standard (ISO 8879:1986/A1:1988, 11.2.4) explicitly recommends against content models like (#PCDATA, foo, bar) (emphasis mine):
NOTE - It is recommended that "#PCDATA" be used only when data characters are to be permitted anywhere in the content of the element; that is, in a content model where it is the sole token, or where "or" is the only connector used in any model group.
Despite mentioning #PCDATA only as the first token in the group, your outer element type is still declared to have mixed content, so data characters can occur anywhere. That is why the line break (a "record end") after </BAR> is recognized as a data character rather than as a mere separator. Since there is no corresponding #PCDATA token to absorb it, you get the error. (Only the omitted </FOO> end-tag prevented the same error on the line before!)
The proper and common approach in this case would be to place the "Document Title" into an actual title element, for which one can allow omission of both the start-tag and the end-tag:
<!ELEMENT outer - - (title, foo, bar) >
<!ELEMENT title o o (#PCDATA) >
Now the outer content model still reflects the proper order of elements, the outer element has element content (no longer mixed content), and the document title is contained in a title element, as it should be.
(The same technique is used in several standard DTDs, like the "General Document" example in annex E of the Standard.)
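For reference, the pieces above can be put together into one file. This is only a sketch under two assumptions: the foo and bar declarations are carried over unchanged from the question, and tag omission is enabled in the SGML declaration (OMITTAG YES, which as far as I know is what nsgmls uses by default). The trailing comment shows how I would expect the parser to read the unchanged document once the omitted tags are implied.

```sgml
<!-- fake.dtd, revised: outer now has element content only -->
<!ELEMENT outer - - (title, foo, bar) >
<!ELEMENT title o o (#PCDATA) >
<!ELEMENT foo   - o (#PCDATA) >
<!ELEMENT bar   - - (#PCDATA) >

<!-- With the omitted tags implied, the document from the
     question should be read as:

     <OUTER><TITLE>Document Title</TITLE>
     <FOO>1234</FOO>
     <BAR>wxyz</BAR>
     </OUTER>
-->
```

Running nsgmls on the original document against this DTD should no longer report "character data is not allowed here", because the record end after </BAR> now occurs in element content, where it acts only as a separator.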