excelxmlvbaxfa

VBA: Extracting XML data from XFA form and passing to XML parser


I am trying to extract the XML information from an XFA form using VBA.

Below code works to extract the XML data to a separate file, but it requires user interaction (the user is requested to give the XML file a name). I have given up trying to automate this without user interaction due to Adobe's "safe path" requirement (which seems impossible to bypass with a VBA automation).

Dim objPDDoc As New AcroPDDoc
Dim objJSO As Object
Dim strSafePath as String

strSafePath = ""

objPDDoc.Open (FileName)
Set objJSO = objPDDoc.GetJSObject
objJSO.xfa.host.exportdata strSafePath, 0

What I would rather do is to parse the XML information directly using MSXML2.DOMDocument60. I was hoping to be able to do something like this:

Dim XMLDoc As New MSXML2.DOMDocument60

If XMLDoc.Load(objJSO.xfa.host.exportdata) = True Then
    Call funcParse(XMLDoc)
End if

However, loading XMLDoc with objJSO.xfa.host.exportdata doesn't work, and I cannot seem to figure out which - if any - possibilities there are to pass the XML information using any xfa.host methods/properties.

Any help is welcome - also telling me this is not possible in VBA.


Solution

  • Original poster here. After about a year of looking into this on-and-off, I found the solution.

    After having accessed the JavaScript object through AccroPDDoc.GetJSObject, I can extract the nested XML as a string by using objJSO.xfa.this.saveXML.

    This way, I don't have to first save the nested XML to file (which would require user interaction) - instead I can immediatly extract the nested XML and pass it to the parser.

    Dim objPDDoc as New AcroPDDoc
    Dim objJSO as Object
    Dim XMLDoc As New MSXML2.DOMDocument60
    
    ObjPDDoc.Open (Filename)
    Set objJSO = objPDDoc.GetJSObject
    If XMLDoc.LoadXML (objJSO.xfa.this.saveXML) = True then
         ParseXML(XMLDoc)
    End if