xml ms-word wordml roundtrip

What are the effects of an "XML Roundtrip" on Word 2003 documents?


Saving a Word 2003 document as XML and then back to DOC produces a smaller file, and possibly other changes that I'm not aware of. A diff of the new document's WordML against the old one shows differences only in the revision save IDs. So what is getting lost in the roundtrip?

If nothing is actually getting lost, what accounts for the file being a few thousand bytes smaller?


Solution

  • As far as I know, Word stores information beyond text and formatting in DOC files, for example user information and some of the document history. This information accumulates with each "File > Save". I suppose that saving as XML and re-saving as DOC strips that information.

    If I recall correctly, a simple "Save As" already reduces the file size, and I think there used to be a menu item that let you save a version of the DOC file that was significantly smaller than the "File > Save" version.
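
To double-check the observation in the question that the two WordML exports differ only in their revision save IDs, a comparison along the following lines may help. It is only a rough sketch: the function names are made up for illustration, and it assumes the revision save IDs show up as attributes or elements whose local name starts with "rsid", which should be verified against the actual files. It compares the XML only, so it cannot show what Word keeps in or drops from the binary DOC file.

```python
import xml.etree.ElementTree as ET


def local_name(qualified):
    """Return a tag or attribute name without its '{namespace}' prefix."""
    return qualified.rsplit("}", 1)[-1]


def strip_rsids(element):
    """Remove rsid-like attributes and child elements, in place.

    Assumption: revision save IDs are stored under names that start
    with 'rsid' (case-insensitive). Adjust if your files differ.
    """
    for attr in list(element.attrib):
        if local_name(attr).lower().startswith("rsid"):
            del element.attrib[attr]
    for child in list(element):
        if local_name(child.tag).lower().startswith("rsid"):
            element.remove(child)
        else:
            strip_rsids(child)


def normalized(xml_data):
    """Parse XML (str or bytes) and serialize it without rsid data."""
    root = ET.fromstring(xml_data)
    strip_rsids(root)
    return ET.tostring(root, encoding="unicode")


def same_apart_from_rsids(old_xml, new_xml):
    """True if both documents match once rsid data is ignored."""
    return normalized(old_xml) == normalized(new_xml)
```

Reading both exports as bytes (for example with `open(path, "rb").read()`) and passing them to `same_apart_from_rsids` returns True when nothing but the rsid data differs. In that case the size difference would have to come from data that lives only in the binary DOC file, which fits the explanation above.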