I am interested in reading Microsoft Visio files with Python. I wrote a custom, offline, personal Python module to read them; I took great inspiration from this repository: https://github.com/dave-howard/vsdx, but I recoded it myself and added tests.
I do not have a Microsoft Visio licence, so I cannot open the files myself, but like many Office files, they are zipped archives containing a tree of XML files.
From what I can see, there are "pages" files, "master" files and "relationship" files.
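To make that layout concrete, here is a minimal sketch that groups the members of a `.vsdx` archive by kind, using only the standard library. The function name `group_vsdx_parts` and the path prefixes it checks (`visio/pages/`, `visio/masters/`, the `.rels` suffix) are assumptions based on the usual package layout, not code taken from the vsdx repository.

```python
import zipfile
from collections import defaultdict


def group_vsdx_parts(path_or_file):
    """Group archive member names into pages, masters, relationships, other.

    The prefixes below are assumptions about the usual .vsdx layout.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(path_or_file) as archive:
        for name in archive.namelist():
            # Relationship parts live in "_rels" folders and end with ".rels",
            # so test for them first (they also sit under visio/pages/ etc.).
            if name.endswith(".rels"):
                groups["relationships"].append(name)
            elif name.startswith("visio/pages/"):
                groups["pages"].append(name)
            elif name.startswith("visio/masters/"):
                groups["masters"].append(name)
            else:
                groups["other"].append(name)
    return dict(groups)
```

Calling `group_vsdx_parts("drawing.vsdx")` (a hypothetical file name) returns a dictionary of lists, which is a quick way to check which page and master parts a given file actually contains.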
My issue is that one of the Visio files I extract data from has "too much data". By that I mean that when a colleague opens it in his copy of Visio, a field is shown as empty even though my parser finds a value for it.
I investigated it a bit and ran a "without-master" version of the code as well.
Expected result (what the person with the Visio sees):
Item name | Item property 1 | Item property 2 | Item property 3 |
---|---|---|---|
Item1 | A | | C |
Result with "getting the mastershape and adding the properties":
Item name | Item property 1 | Item property 2 | Item property 3 |
---|---|---|---|
Item1 | A | B | C |
Result without "getting the mastershape and adding the properties" (only the data of the shape in shapeN.xml):
Item name | Item property 1 | Item property 2 | Item property 3 |
---|---|---|---|
Item1 | A | | |
As you can see, the default implementation gives "too much information", while the "only shape" version loses a lot of information (including values that are wanted).
From this experiment, I conclude that part of the information I am looking for is stored in the master files.
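As a sketch of how master and shape data might be combined, the snippet below overlays shape-level cells on master-level cells, one cell at a time. Everything in it is hypothetical: the row names, the labels, the function `merge_rows`, and in particular the guess that the shape stores an explicit empty `Value` for property 2. It also assumes, as far as the VSDX schema goes, that a shape can mark an inherited row as deleted (`Del="1"` on the `Row` element); such rows are passed in as `deleted_rows`. One possible cause of "too much data" is a merge that treats an empty local value as missing and falls back to the master, which the last lines illustrate.

```python
def merge_rows(master_rows, shape_rows, deleted_rows=()):
    """Overlay shape cells on master cells, cell by cell.

    Both inputs map row name -> {cell name: value}. Rows listed in
    deleted_rows are not inherited from the master (assumed Del="1").
    """
    merged = {
        name: dict(cells)
        for name, cells in master_rows.items()
        if name not in deleted_rows
    }
    for name, cells in shape_rows.items():
        # A local cell always wins, even when its value is an empty string.
        merged.setdefault(name, {}).update(cells)
    return merged


# Hypothetical data mirroring the tables in the question.
master_rows = {
    "Prop1": {"Label": "Item property 1", "Value": ""},
    "Prop2": {"Label": "Item property 2", "Value": "B"},
    "Prop3": {"Label": "Item property 3", "Value": "C"},
}
shape_rows = {
    "Prop1": {"Value": "A"},
    "Prop2": {"Value": ""},  # explicit local override with an empty value
}
names = ("Prop1", "Prop2", "Prop3")

merged = merge_rows(master_rows, shape_rows)
values = [merged[name]["Value"] for name in names]
print(values)  # ['A', '', 'C']

# A fallback based on truthiness resurrects the master value "B".
naive = [shape_rows.get(n, {}).get("Value") or master_rows[n]["Value"] for n in names]
print(naive)  # ['A', 'B', 'C']
```

Whether the real file behaves this way can only be checked by looking at the `Cell` elements of the shape and of its master side by side.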
The structure of the XML is described in the answers to this question: Where are Visio Master Shape properties stored?
Therefore I imagine two possibilities:
1. Something on the Visio user's side prevents him from seeing the value. Are there settings or menus that could cause this? He showed me his screen and the field was clearly empty, whereas I parsed a value for it.
2. The information "Visio should (or should not) show this value" is stored somewhere in the file, because the behaviour is consistent when the file is saved and loaded on a different PC. In that case, where would it be located?
There is a visibility flag for each property (the "Invisible" cell of the Shape Data row in the ShapeSheet), which can be set either from the UI or as the result of a formula calculation (depending on the shape).
The property value may also technically be set (or cleared) by some formula (either in this shape or in some other shape). Basically, all Visio shapes are "smart" shapes controlled by formulas.
I would recommend getting access to Visio somehow to deal with this anyway. It is also hard to tell without seeing your file.
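Without Visio, both points can at least be inspected in the XML. The sketch below lists, for each row of a shape's `Property` section, the stored value (`V`), the stored formula (`F`, if any), and whether an `Invisible` cell is set. The helper name `describe_properties` and the sample XML are made up for illustration, and the sketch assumes the Visio 2013+ namespace and that `Invisible` is stored as `V="1"` when the property is hidden.

```python
import xml.etree.ElementTree as ET

# Namespace used by .vsdx page and master parts (assumed: Visio 2013+ format).
NS = {"v": "http://schemas.microsoft.com/office/visio/2012/main"}


def describe_properties(shape_xml):
    """Return {row name: {"value", "formula", "invisible"}} for one Shape element."""
    shape = ET.fromstring(shape_xml)
    report = {}
    for row in shape.findall("v:Section[@N='Property']/v:Row", NS):
        cells = {cell.get("N"): cell for cell in row.findall("v:Cell", NS)}
        value = cells.get("Value")
        invisible = cells.get("Invisible")
        report[row.get("N")] = {
            "value": value.get("V") if value is not None else None,
            "formula": value.get("F") if value is not None else None,
            # Treat a missing cell, "0" or "" as visible (an assumption).
            "invisible": invisible is not None
            and invisible.get("V") not in (None, "0", ""),
        }
    return report


# Made-up sample: Prop2 has a formula-driven value and is flagged invisible.
SAMPLE = """
<Shape xmlns="http://schemas.microsoft.com/office/visio/2012/main" ID="1">
  <Section N="Property">
    <Row N="Prop1"><Cell N="Value" V="A" U="STR"/></Row>
    <Row N="Prop2">
      <Cell N="Value" V="B" U="STR" F="Sheet.5!Prop.Other"/>
      <Cell N="Invisible" V="1"/>
    </Row>
  </Section>
</Shape>
"""

for name, info in describe_properties(SAMPLE).items():
    print(name, info)
```

Running this over both the shape and its master for the problematic property should show whether the value is hidden by a flag or driven by a formula.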