I have data from an (unknown) MSSQL server, in a column of type Image
(I think this is an old variant of varbinary
).
I want to make a script (preferably in Python or Typescript, but open for anything) that can take the data like the one in the gist below and spew out files.
I have attempted quite a few approaches, but been unable to convert the file in the gist below – the converted file becomes corrupted. I suppose that's due to my own encoding/converting-skill shortcomings.
https://gist.github.com/tomfa/2f9e63ee1cc3cd012ac4c9b1031de39f
Below is my attempt at decoding the data with Python. The code runs fine, but the output PDF is not valid.
import binascii
s = open("data.dat").read()
# chop off the '0x' at the front.
s = s[2:]
# Decode hex to binary.
binary = binascii.unhexlify(s)
with open("test123.pdf", 'wb') as f:
f.write(binary)
It seems the Image field was zip'ed, and the data needed to be unzipped before it could be read.
from os import getenv
import pymssql
import zipfile
conn = pymssql.connect('server', 'user', 'password', 'database')
cursor = conn.cursor()
cursor.execute("""
SELECT %s, %s FROM %s;
""" % ('filename_column', 'image_column', 'table'))
def unzip(archive):
with zipfile.ZipFile(archive, 'r') as zip:
for file in zip.namelist():
zip.extract(member=file)
row = cursor.fetchone()
while row:
filename = row[1]
imagedata = row[2]
zip_path = f'./{filename}.zip'
with open(zip_path, 'wb') as f:
print("Writing %s" % filename)
f.write(imagedata)
unzip(zip_path)
row = cursor.fetchone()
conn.close()