python, text, merge, file, splitting

How to merge thousands of text files and split them again?


I have thousands of .txt files. Each file contains a single string, and every file has a different string.

I want to edit these strings, but I don't want to open each file manually one by one. So I want to merge all of these files into a single .txt file and, once my editing is done, split them apart again under the same file names they had before the merge.

For example:

I have these text files:

lorem.txt (hi, this is an example line.)

ipsum.txt (hi, this is another line.)

merol123.txt (hi, just another line.)

*

merged.txt >>> edited and ready to split again >>> the result needs to look like this:

*

lorem.txt (hi, this is edited line.)

ipsum.txt (another edited line.)

merol123.txt (another edited line. number 4847887)

Note: The sentences inside the brackets represent the string inside each .txt file.

Is this possible? Thanks in advance for your help!


Solution

  • First of all, I assumed that the strings in your example do not match (like "hi, this is an example line." != "hi, this is edited line.") by mistake and not on purpose (I can't see what the purpose would be).

    I named the accumulated file common.doc to distinguish it from the other .txt files in the target directory. Also, this example code assumes all the files are in the same directory.

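    # merging.py
    import os
    import glob

    # Collect every .txt file in the current directory into common.doc,
    # one "name (content)" entry per line.
    with open("common.doc", "w") as common:
        for txt in glob.glob("./*.txt"):
            with open(txt, "r") as f:
                # strip() drops the trailing newline so each entry stays on one line
                content = f.read().strip()
            common.write("{} ({})\n".format(os.path.basename(txt), content))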
    

    And after editing common.doc:

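    # splitting.py
    with open("common.doc", "r") as common:
        for line in common:
            sep = line.find(" (")
            if sep == -1:
                continue  # skip blank or malformed lines
            name = line[:sep]  # file name before the first " ("
            text = line[sep + 2:line.rfind(")")]  # text between " (" and the last ")"
            with open(name, "w") as f:
                f.write(text)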
    

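    And a solution for multiline text (merging stays the same, but with .strip() removed when writing the content). It reads the whole merged file into memory, so it is not suitable for hundreds of thousands of files, and it only works if the text itself contains no ")" character.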

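    # splitting2.py
    # Read the whole merged file at once so entries may span several lines.
    with open("common.doc", "r") as common:
        everything = common.read()
    # Every entry ends with ")", so the text itself must not contain ")".
    elements = everything.split(")")
    for elem in elements:
        name = elem[:elem.find(" (")].strip()
        text = elem[elem.find(" (")+2:]
        if name:  # skip the empty remainder after the last ")"
            with open(name, "w") as f:
                f.write(text)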
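
If your strings can span several lines or contain parentheses, a different layout for the merged file might be easier to split. The following is only a sketch under my own assumptions, not the format used above: each file gets a header line such as `=== lorem.txt ===` followed by its content, and no content line may itself look like such a header. The names `merge`, `split` and `HEADER` are made up for this illustration. Splitting streams the merged file line by line, and every recreated file ends with exactly one newline.

```python
import os
import re
import tempfile

# Hypothetical header format (an assumption, not from the answer): "=== lorem.txt ==="
HEADER = re.compile(r"^=== (.+) ===$")


def merge(src_dir, merged_path):
    """Write every .txt file in src_dir into merged_path, each under a header line."""
    names = sorted(n for n in os.listdir(src_dir) if n.endswith(".txt"))
    with open(merged_path, "w") as merged:
        for name in names:
            with open(os.path.join(src_dir, name), "r") as f:
                content = f.read()
            # Normalise to exactly one trailing newline per entry.
            merged.write("=== {} ===\n{}\n".format(name, content.rstrip("\n")))
    return names


def split(merged_path, out_dir):
    """Recreate one file per header line, reading the merged file line by line."""
    out = None
    try:
        with open(merged_path, "r") as merged:
            for line in merged:
                match = HEADER.match(line.rstrip("\n"))
                if match:
                    if out:
                        out.close()
                    # basename() keeps the output inside out_dir
                    name = os.path.basename(match.group(1))
                    out = open(os.path.join(out_dir, name), "w")
                elif out:
                    out.write(line)
    finally:
        if out:
            out.close()


if __name__ == "__main__":
    # Small round trip in temporary directories.
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        with open(os.path.join(src, "lorem.txt"), "w") as f:
            f.write("hi (this) is\na two-line example.")
        merged_path = os.path.join(dst, "common.doc")
        merge(src, merged_path)
        split(merged_path, dst)
        with open(os.path.join(dst, "lorem.txt")) as f:
            print(f.read())
```

As in the code above, keeping the merged file under a name that does not end in .txt (or in another directory) avoids picking it up as input on a second run.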