python text-to-speech pdf-reader pyttsx3

Pyttsx3 long input file results in only the last page being converted to MP3


import pyttsx3,PyPDF2
import os

file_path = input('Enter a file path:')

if os.path.exists(file_path):   
    pdfreader = PyPDF2.PdfFileReader(open(file_path, 'rb'))
    engine = pyttsx3.init()

    for page_num in range(pdfreader.numPages):
        text = pdfreader.getPage(page_num).extract_text()
        clean_text = text.strip().replace('\n', '    ')
        print(clean_text)

    voices = engine.getProperty('voices')
    engine.setProperty('voice', voices[1].id)
    rate = engine.getProperty('rate')
    engine.setProperty('rate', rate-55)
    engine.save_to_file(clean_text, 'story.mp3')
    engine.runAndWait()

    engine.stop()

    
else:
    print('The specified file does not exist')   

When I load my thesis, which is about 70 pages, it prints clean_text completely for every page, but in the MP3 only the last page is read out. Am I missing something?


Solution

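  • Found it myself. The loop overwrites clean_text on every iteration, so by the time save_to_file is called after the loop it holds only the last page. I had to concatenate the strings from the for loop so that everything is read out together.

    I defined a new string before the loop:

    final_text = ""

    appended each page to it inside the for loop:

    final_text += clean_text

    and passed final_text instead of clean_text to engine.save_to_file.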
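
For reference, here is a minimal sketch of how the whole fix might fit together. The helper names (clean_page, collect_text, save_pdf_as_audio) are made up for illustration, and the single space added between pages is my own assumption so that words at page boundaries do not run together. The PyPDF2 and pyttsx3 calls are copied from the question and may need adjusting for other library versions.

```python
def clean_page(text):
    """Strip surrounding whitespace and replace newlines, as in the question."""
    return (text or "").strip().replace('\n', '    ')


def collect_text(page_texts):
    """Accumulate the cleaned text of every page into one string."""
    final_text = ""
    for text in page_texts:
        # Append instead of overwriting, so earlier pages are kept.
        # The trailing space between pages is an assumption, not part of the original fix.
        final_text += clean_page(text) + " "
    return final_text.strip()


def save_pdf_as_audio(file_path, out_path='story.mp3'):
    """Read every page of the PDF and queue the combined text for one audio file."""
    # Imported here so the helpers above work without these packages installed.
    import PyPDF2
    import pyttsx3

    with open(file_path, 'rb') as f:
        # Same reader calls as in the question.
        pdfreader = PyPDF2.PdfFileReader(f)
        pages = (pdfreader.getPage(n).extract_text()
                 for n in range(pdfreader.numPages))
        final_text = collect_text(pages)

    engine = pyttsx3.init()
    engine.save_to_file(final_text, out_path)  # called once, with all pages
    engine.runAndWait()
```

Calling save_pdf_as_audio with the path to the PDF should then write all pages to story.mp3 in one go. The voice and rate settings from the question can be added back after pyttsx3.init() if needed.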