pythonsql-servervarbinary

Convert MSSQL image / varbinary column data to file


I have data from an (unknown) MSSQL server, in a column of type Image (I think this is an old variant of varbinary).

I want to make a script (preferably in Python or Typescript, but open for anything) that can take the data like the one in the gist below and spew out files.

I have attempted quite a few approaches, but been unable to convert the file in the gist below – the converted file becomes corrupted. I suppose that's due to my own encoding/converting-skill shortcomings.

https://gist.github.com/tomfa/2f9e63ee1cc3cd012ac4c9b1031de39f

My code

Below is my attempt at decoding the data with Python. The code runs fine, but the output PDF is not valid.

import binascii

s = open("data.dat").read()

# chop off the '0x' at the front.
s = s[2:]

# Decode hex to binary.
binary = binascii.unhexlify(s)

with open("test123.pdf", 'wb') as f:
    f.write(binary)

Solution

  • It seems the Image field was zip'ed, and the data needed to be unzipped before it could be read.

    from os import getenv
    import pymssql
    import zipfile
    
    conn = pymssql.connect('server', 'user', 'password', 'database')
    cursor = conn.cursor()
    cursor.execute("""
    SELECT %s, %s FROM %s;
    """ % ('filename_column', 'image_column', 'table'))
    
    def unzip(archive):
        with zipfile.ZipFile(archive, 'r') as zip:
            for file in zip.namelist():
                zip.extract(member=file)
    
    row = cursor.fetchone()
    while row:
        filename = row[1]
        imagedata = row[2]
        zip_path = f'./{filename}.zip'
    
        with open(zip_path, 'wb') as f:
            print("Writing %s" % filename)
            f.write(imagedata)
    
        unzip(zip_path)
    
        row = cursor.fetchone()
    
    conn.close()