I combined multiple text files into a single text file using simple code:
with open("Combined_file.txt", 'w') as f1:
    for indx1, fil1 in enumerate(files_to_combine):
        with open(files_to_combine[indx1], 'r') as f2:
            for line1 in f2:
                f1.write(line1)
        f1.write("\n")
files_to_combine is a list containing the files to combine into a single file, Combined_file.txt. I want to combine MS Word .docx files in a similar way and looked at this answer, https://stackoverflow.com/a/48925828, which uses the python-docx module. But I couldn't figure out how to open and save a docx file under the top for loop of the code above, since the with open construct won't work here. Also, if a source docx file contains an image, can it be copied using the code above together with the code from that answer?
You will need python-docx for that. From there it is pretty straightforward.
import io
from docx import Document

# Create a new Document object for the combined file
combined_file = Document()
for file in files_to_combine:
    doc = Document(file)
    # Option 1: copy the paragraph text manually (formatting is not kept)
    for para in doc.paragraphs:
        combined_file.add_paragraph(para.text)
    # This is needed if you have images; they are appended after the text
    for rel in doc.part.rels.values():
        if "image" in rel.target_ref:
            # add_picture() expects a path or a file-like object, not raw bytes
            combined_file.add_picture(io.BytesIO(rel.target_part.blob))
    # The same goes for charts and other content types;
    # look up the method for each type in the python-docx docs
    # Option 2: copy the whole document body instead (image relationships
    # are not carried over). Commented out so you can decide which way to choose
    # for element in list(doc.element.body):
    #     combined_file.element.body.append(element)
    combined_file.add_page_break()
combined_file.save('Combined_file.docx')