python, docx, aspose.words

Looping through list to compare bulk .docx files using aspose.words python


I am attempting to bulk compare .docx files using aspose.words.

I have a list that I plan on autogenerating by scanning file names.

I then split that list into two lists, new docs and old docs.

I then call my "compare_doc" function, which compares the documents. If there are changes, it saves the result to a new file; otherwise, it prints a message to the console.

All my print statements run, so I assume the code is working. However, it only ever saves one compared file, even if there are changes in the other set of documents.

Main Script

import aspose.words as aw
from datetime import date
from compare_doc import compare_doc

docs = ["Trial Division - Civil - affidavit_new.docx", "Trial Division - Civil - affidavit_old.docx", "test_doc_new.docx", "test_doc_old.docx"]
new_docs = []
old_docs = []

# Create two lists of old and new files
for i in docs:
    if "old" in i:
        old_docs.append(i)

    if "new" in i:
        new_docs.append(i)


# Cycle through the documents and convert to document type
for i, j in new_docs, old_docs:
    doc = aw.Document(i)
    doc1 = aw.Document(j)
    print(doc)
    print(doc1)
    compare_doc(doc, doc1)




Compare Script

import aspose.words as aw
from datetime import date

# set additional options
options = aw.comparing.CompareOptions()
options.ignore_formatting = True
options.ignore_headers_and_footers = True
options.ignore_case_changes = True
options.ignore_tables = True
options.ignore_fields = True
options.ignore_comments = True
options.ignore_textboxes = True
options.ignore_footnotes = True

#  date.today(),

def compare_doc(doc, doc1):
    doc.compare(doc1, "user", date.today(), options)
    x = doc.revisions.count
    print(x)
    if x > 0:
        i = 0
        print("Test")
        doc.save(f"compared{i}.docx")
        i = i + 1
    else:
        print("Documents are equal")

I am new to Python, so any help would be appreciated.

I was expecting two compare files to save, however only one does.


Solution

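  • Your code always saves the comparison result as compared0.docx, because i is reset to 0 on every call of compare_doc, so each save overwrites the previous file. Also note that "for i, j in new_docs, old_docs" loops over the two lists themselves rather than over new/old pairs; zip pairs them up element by element. Try modifying your code like this:

    # Main script: pair each new file with its old counterpart and number the pairs
    for index, (new_name, old_name) in enumerate(zip(new_docs, old_docs)):
        doc = aw.Document(new_name)
        doc1 = aw.Document(old_name)
        print(doc)
        print(doc1)
        compare_doc(doc, doc1, index)

    # Compare script: the index makes the output file name unique per pair
    def compare_doc(doc, doc1, index):
        doc.compare(doc1, "user", date.today(), options)
        x = doc.revisions.count
        print(x)
        if x > 0:
            print("Test")
            doc.save(f"compared{index}.docx")
        else:
            print("Documents are equal")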
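
Since the list of file names will eventually be autogenerated, it may also be safer not to rely on the new and old lists happening to line up in the same order. Below is a minimal sketch, assuming every file name ends in "_new.docx" or "_old.docx" as in the question. The helper name pair_docs and the "<stem>_compared.docx" output naming are hypothetical, not part of aspose.words; the sketch uses only the standard library.

```python
from pathlib import Path


def pair_docs(filenames):
    """Return (new_name, old_name, output_name) for files that share a stem.

    Hypothetical helper: assumes names end in "_new.docx" / "_old.docx".
    """
    new_by_stem = {}
    old_by_stem = {}
    for name in filenames:
        stem = Path(name).stem  # file name without the ".docx" extension
        if stem.endswith("_new"):
            new_by_stem[stem[: -len("_new")]] = name
        elif stem.endswith("_old"):
            old_by_stem[stem[: -len("_old")]] = name

    pairs = []
    for key in sorted(new_by_stem):
        # Skip files that have no old counterpart instead of mispairing them
        if key in old_by_stem:
            pairs.append((new_by_stem[key], old_by_stem[key], f"{key}_compared.docx"))
    return pairs


docs = [
    "Trial Division - Civil - affidavit_new.docx",
    "Trial Division - Civil - affidavit_old.docx",
    "test_doc_new.docx",
    "test_doc_old.docx",
]

for new_name, old_name, out_name in pair_docs(docs):
    print(new_name, "<->", old_name, "->", out_name)
```

Inside that loop you would load both names with aw.Document and save the result under out_name instead of a numbered name, which would require changing compare_doc to accept the output file name.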