I'm writing a parser for WordML. Going through the spec I read that the way to count the number of pages in a document is to read the element Pages in DocumentProperties. If I read the spec correctly, DocumentProperties should always be there.
While creating a test document on my Mac, I noticed that there is no Pages or DocumentProperties element in the generated XML. I have a w:document and inside it a w:body with content.
Is DocumentProperties mandatory or is this a Mac thing?
There are two different Word XML formats: the old Word 2003 XML format and the Office Open XML (OOXML) format. OOXML can be saved either as a .docx, which is a .zip container holding a set of XML parts and potentially other file types, or in the "Flat OPC" format, which is a single-file XML representation of the same thing.
Each format stores properties in a different place.
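As a rough illustration, a parser can tell the formats apart by looking at the root element of the XML it was given. The sketch below is only an assumption about how you might structure that check: the function name and the returned labels are made up for this example, and it only recognizes the Transitional OOXML namespace (Strict OOXML files use a different one).

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the root elements of each format.
WORD_2003_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
FLAT_OPC_NS = "http://schemas.microsoft.com/office/2006/xmlPackage"
OOXML_MAIN_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def detect_word_xml_format(xml_text):
    """Classify an XML string by its root element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{WORD_2003_NS}}}wordDocument":
        return "word2003"             # old Word 2003 XML
    if root.tag == f"{{{FLAT_OPC_NS}}}package":
        return "flat-opc"             # single-file OOXML
    if root.tag == f"{{{OOXML_MAIN_NS}}}document":
        return "ooxml-document-part"  # word/document.xml taken from a .docx
    return "unknown"
```

A root of w:document with a w:body inside it, as described in the question, would fall into the third case.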
If you are seeing an element called w:document, then you are actually saving in the OOXML format. In that format, the "built-in" properties are saved in at least two "parts". You would normally find the Pages element within a Properties element in a pkg:part named /docProps/app.xml.
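For a .docx, reading that value could look roughly like the sketch below. It assumes the part sits at the conventional docProps/app.xml path inside the .zip container (strictly, the location is given by the package relationships) and that the file uses the Transitional extended-properties namespace. The function name is hypothetical, and it returns None rather than failing when the part or the Pages element is missing.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the Properties root element in docProps/app.xml (Transitional OOXML).
EXT_PROPS_NS = (
    "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
)


def read_page_count(docx_file):
    """Return the stored Pages value from a .docx, or None if it is absent.

    docx_file may be a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(docx_file) as archive:
        try:
            # Assumption: the extended properties part uses the conventional name.
            app_xml = archive.read("docProps/app.xml")
        except KeyError:
            return None  # the package has no extended properties part

    root = ET.fromstring(app_xml)
    pages = root.find(f"{{{EXT_PROPS_NS}}}Pages")
    if pages is None or not (pages.text or "").strip():
        return None  # the part exists but carries no Pages value
    return int(pages.text)
```

Calling read_page_count("test.docx") on a file like the one in the question would return None if the application that wrote it did not store a Pages element.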
There are at least three complications: