import pyttsx3, PyPDF2
import os

file_path = input('Enter a file path:')
if os.path.exists(file_path):
    pdfreader = PyPDF2.PdfFileReader(open(file_path, 'rb'))
    engine = pyttsx3.init()
    for page_num in range(pdfreader.numPages):
        text = pdfreader.getPage(page_num).extract_text()
        clean_text = text.strip().replace('\n', ' ')
        print(clean_text)
    voices = engine.getProperty('voices')
    engine.setProperty('voice', voices[1].id)
    rate = engine.getProperty('rate')
    engine.setProperty('rate', rate-55)
    engine.save_to_file(clean_text, 'story.mp3')
    engine.runAndWait()
    engine.stop()
else:
    print('The specified file does not exist')
When I load my thesis, which is about 70 pages long, the script prints clean_text for every page, but the resulting mp3 only contains the last page. Am I missing something?
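Found it myself. I had to concatenate the strings from the for loop so that everything is read out together. In the original code, clean_text is overwritten on every iteration, so only the last page's text is left by the time it is saved.
I defined a new string before the loop:
final_text=""
and appended each page's text to it inside the for loop:
final_text += clean_text
Then final_text, rather than clean_text, is what gets passed to engine.save_to_file.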
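Putting the fix together, a minimal sketch might look like the following. It is not the exact code from the post: the function names (clean_page, join_pages, pdf_to_audio) are made up for illustration, it assumes PyPDF2 2.x or later (where PdfReader and reader.pages are available), and it joins pages with a space instead of plain concatenation so that words at page boundaries do not run together.

```python
def clean_page(text):
    """Strip surrounding whitespace and replace line breaks with spaces."""
    return (text or "").strip().replace("\n", " ")


def join_pages(page_texts):
    """Combine the cleaned text of all pages into a single string.

    Pages are separated by a space so the last word of one page does not
    merge with the first word of the next; empty pages are skipped.
    """
    cleaned = (clean_page(t) for t in page_texts)
    return " ".join(c for c in cleaned if c)


def pdf_to_audio(pdf_path, out_path="story.mp3"):
    """Read every page of the PDF and save the speech to one audio file."""
    # Third-party imports are local so the helpers above work without them.
    import PyPDF2   # assumption: PyPDF2 >= 2.0
    import pyttsx3

    with open(pdf_path, "rb") as fh:
        reader = PyPDF2.PdfReader(fh)
        final_text = join_pages(page.extract_text() for page in reader.pages)

    engine = pyttsx3.init()
    voices = engine.getProperty("voices")
    if len(voices) > 1:  # the question used the second installed voice
        engine.setProperty("voice", voices[1].id)
    engine.setProperty("rate", engine.getProperty("rate") - 55)

    # Queue the complete text once, after the loop over pages has finished.
    engine.save_to_file(final_text, out_path)
    engine.runAndWait()
```

With this in place, calling pdf_to_audio(file_path) after the os.path.exists check would replace the body of the if branch in the question. The text is only handed to the engine after all pages have been collected, which is the point of the fix.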