pythonpdfautomationms-wordwin32com

.doc to pdf using python


I'am tasked with converting tons of .doc files to .pdf. And the only way my supervisor wants me to do this is through MSWord 2010. I know I should be able to automate this with python COM automation. Only problem is I dont know how and where to start. I tried searching for some tutorials but was not able to find any (May be I might have, but I don't know what I'm looking for).

Right now I'm reading through this. Dont know how useful this is going to be.


Solution

  • A simple example using comtypes, converting a single file, input and output filenames given as commandline arguments:

    import sys
    import os
    import comtypes.client
    
    wdFormatPDF = 17
    
    in_file = os.path.abspath(sys.argv[1])
    out_file = os.path.abspath(sys.argv[2])
    
    word = comtypes.client.CreateObject('Word.Application')
    doc = word.Documents.Open(in_file)
    doc.SaveAs(out_file, FileFormat=wdFormatPDF)
    doc.Close()
    word.Quit()
    

    You could also use pywin32, which would be the same except for:

    import win32com.client
    

    and then:

    word = win32com.client.Dispatch('Word.Application')