ms-accessvbainfopath2010

Extract InfoPath Attachment with VBA


We are archiving a bunch of InfoPath [IP] documents. The data are going to be stored in MS Access 2010 (with attachments in the file system, storing references to them in the Access DB).

I found one VBA solution to extract the file that the IP form contains, but it doesn't function. (http://www.infopathdev.com/forums/p/10182/36240.aspx)

I have found many .NET solutions, but have had no luck converting them to VBA.

How can I take the file name & file contents contained in an IP Attachment node, and (using Access's VBA) create a real file, stored in the file system?


Solution

  • InfoPath attachments are base64 encoded. The encoding can be "unpacked" with

    ' The MSXML2.DOMDocument class has methods to
    ' convert a base64—encoded string to a byte array.
    Private Function DecodeBase64(ByVal strData As String) As Byte()
    
        Dim obj_XML As MSXML2.DOMDocument
        Dim objNode As MSXML2.IXMLDOMElement
    
        ' Getting help from MSXML
        Set obj_XML = New MSXML2.DOMDocument
        Set objNode = obj_XML.createElement("b64")
        objNode.DataType = "bin.base64"
        objNode.Text = strData
    
        ' Return the value
        DecodeBase64 = objNode.nodeTypedValue
    
        Set objNode = Nothing
        Set obj_XML = Nothing
    
    End Function
    

    The byte arrays' first 16 bytes are a useless header.

    The next 4 bytes are a little-endian unsigned integer, of the actual size of the file. This information is unnecessary, too.

    The next 4 bytes are a little-endian unsigned integer, of the number of characters in the FileName. The FileName is in UniCode, so it has 2 bytes / char. Multiply the number of the FileName size by 2, therefore. The name will have nulls at the end that need to be removed.

    Finally, the remainder past that point is the file.

    I decided to create a helper class to split the byte array of the attachment. My class takes a byte array, and a number. the input array will be split into 2 properties, representing the top and bottom halves. That number entered is 1) the number of bytes for the bottom array, and 2) the index of the first byte of the entered array to put into the top part.

    I initially fed it the whole attachment, and added it to the object, splitting it at 23. This put the header and the file-size bytes and the filename size, and put them into the bottom array, and the filename and the file contents, and put it into the top.

    I made a reference to those two arrays (to persist them), then fet the bottom array into the object, splitting at 19. I discard the bottom, having the filename size in the top.

    After converting that to a Long, I fet the upper part into the object, splitting it at the value of FileNameSize times 2. Thus the filename is in the bottom, and the file is in the top.

    Using 2 user-defined types, I line up the 4 bytes of the FileName size into 4 contiguous bytes in one, and treat the other UDT as a single long. Using the LSet statement, I copy the 4 contiguous bytes of the 1st UDT to the 4 contiguous bytes of the 2nd UDT. This converts the bytes to a long.

    VBA automatically, implicitly converts byte arrays to strings, if you set a string equal to a byte array of UniCode bytes.

    Finally, the upper array is copied to a file, using Put.

    The class:

    Private pLow()  As Byte
    Private pHigh() As Byte
    Private src()   As Byte
    Private pt      As Long
    
    Public Property Get Low() As Byte():    Low = pLow:     End Property
    Public Property Get High() As Byte():   High = pHigh:   End Property
    
    Private Function outOfBounds() As Boolean
    
        Dim msg As String
    
        ' Check the bounds
        If pt < 0 Then _
            msg = "Division point cannot be less than zero"
    
        If pt > (UBound(src) + 1) Then    ' When = UBound + 1, copy whole array into pLo
            msg = "‘point’ is greater the the size of the array."
        End If
    
        If msg <> "" Then
             outOfBounds = True
            src = Null
            Err.Raise vbObjectError + 6, msg
        End If
    
    End Function
    
    ' point is the index of the 1st element to be copied into pHi
    Public Sub Load(SrcArr() As Byte, point As Long)
    
        src = SrcArr    ' grant class-wide access.
        pt  = point      ' grant class-wide access.
    
        If outOfBounds() Then Exit Sub
    
        ' Create new arrays and assign to private fields
        Dim L()  As Byte
        Dim H()  As Byte
        Dim hiUB As Long
    
        hiUB = UBound(src) - point
        If point <> 0 Then                           ' <————<< If ‘point’ is 0, then this is just going to be a copy of
            ReDim L(point - 1)                       '         the whole array into pHi; don’t initialize pLo.
        End If
        If point <> (UBound(src) + 1) Then           ' <————<< If it is the SIZE of the array (UBound+1), then this is
            ReDim H(hiUB)                            '         just going to be a copy of the whole array into pLo, so
        End If                                       '         there would be no need to initialize pHi.
    
        ' Do the two copies
        If point <> 0 Then _
            MoveMemory L(0), src(0), point           ' ‘point’ is the 0-based 1st element to copy into pHi.  So it
                                                     ' also serves as the 1-based copy SIZE, for copying into pLo.
        If point <> (UBound(src) + 1) Then _
            MoveMemory H(0), src(point), (hiUB + 1)
    
        pLow = L
        pHigh = H
    
    End Sub
    

    The processing:

    Public Sub processAttachment(dataIn As String, Optional path As String)
    
        On Error GoTo Er
    
        ' After development, remove this:
        If IsMissing(path) Or path = "" Then path = "had a default here"
    
        Dim fNum    As Integer       ' File number, for file communication.
        Dim fName   As String
        Dim fNamSz  As Long
        Dim b_Tmp() As Byte
        Dim btArr() As Byte
            btArr = DecodeBase64(dataIn)   ' <————<<< dataIn is a base64-encoded string.  Convert it to a byte array.
    
        Dim cAS As New clsArraySplitter    ' Instantiate the class for getting array sections.
    
        cAS.Load btArr, 24                 ' Separate the data at the beginning of btArr (whose size is set), from the
        btArr = cAS.High                   ' rest of the data (whose sizes will be different for each attachment).
                                           ' Header (16 bytes, 0-15) + 2, 4-byte long int.s  =  16 + 4 + 4  =  24.
                                           ' Set the dymaically-sized portion of the data (fName & the file) aside, for now.
    
        cAS.Load cAS.Low, 16               ' Dump Hdr;  puts part to be dumped in .Low, 2 longs in .High
                                           ' Now .Low  has header to be dumped; just ignore it,
                                           '     .High has fSize & fNameSize (8 bytes, total).
    
        cAS.Load cAS.High, 4               ' Now .Low  has fSize     (4 bytes; I don't need this info),
                                           '     .High has fNameSize (4 bytes).
    
        fNamSz = ByteArrayToLong(cAS.High) ' Get FileName character count
        fNamSz = fNamSz * 2                ' UniCode has 2-bytes per character
    
      ' Now, refocus on the array having fname & file.
      ' Separate into 2 parts, the file name, and the file.
        cAS.Load btArr, fNamSz              ' Now .Low  has the fName,
                                            '     .High has the file.
    
      ' Get fName, then trim null(s) off the right end.
        fName = Trim(cAS.Low)               ' VB handles this Byte array to string conversion.
    
        Dim pos As Integer
            pos = InStr(fName, Chr$(0))    ' No matter how many, pos points at the 1st one.
         If pos > 0 Then _
             fName = Left$(fName, pos - 1)
    
      ' Open output byte array to a file, then close the file.
      ' I need to check for the existence of the file, and add '(n)' to the filename if it already exists.
      ' Since attachments are not stored in the XML by name, InfoPath can store attachments with exactly the same name
      ' (with or w/o different contents) any # of times.   Since I’m saving to the file system, I can’t do that.
    
        fName = UniqueFileName(path, fName)
        path = path & fName
        fNum = FreeFile
    
        Open path For Binary Access Write As fNum
        Put fNum, , cAS.High
    
    Rs: Close fNum   ' This ‘Close’ is put here, at the resume point, so that it is sure to be closed
        Exit Sub
    
    Er: MsgBox "Error, """ & Err.Description & ","" in ""processAttachment()."""
        Resume Rs
    
    End Sub
    

    Hope that all formats right . . .