Convert text-to-speech output to a WAV file in Python


I'm using pyttsx3 for text-to-speech tasks. Here's an example: https://github.com/padmalcom/AISpeechAssistant/blob/main/code/02_text_to_speech/simple_main_02.py. Is there a way to write the spoken words directly to a WAV file instead of playing them?


Solution

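  • pyttsx3 can write the synthesized speech to a file with save_to_file(). The call only queues the request; the file is written when runAndWait() processes the queue: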

    import pyttsx3
    import wave
    
    # Initialize the pyttsx3 engine
    engine = pyttsx3.init()
    
    # Set properties for the speech output (optional)
    engine.setProperty('rate', 150)  # Speed of speech
    engine.setProperty('volume', 1.0)  # Volume (0.0 to 1.0)
    
    # Set the output file name
    output_file = 'output.wav'
    
    # Convert text to speech
    text = "Hello, this is an example of text-to-speech conversion."
    engine.save_to_file(text, output_file)
    
    # Run the speech synthesis
    engine.runAndWait()
    
    # Optional: open the saved file to inspect its audio properties
    with wave.open(output_file, 'rb') as wav_file:
        # Read the basic header fields of the WAV file
        frames = wav_file.getnframes()
        channels = wav_file.getnchannels()
        sample_width = wav_file.getsampwidth()  # bytes per sample
        frame_rate = wav_file.getframerate()  # frames per second (Hz)
        duration = frames / frame_rate
    
        print("Audio information:")
        print(f"Number of frames: {frames}")
        print(f"Number of channels: {channels}")
        print(f"Sample width: {sample_width} bytes")
        print(f"Frame rate: {frame_rate} Hz")
        print(f"Duration: {duration:.2f} seconds")
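
If you want to reuse the inspection step, it can be wrapped in a small helper. The following is only a sketch: describe_wav and write_silence are hypothetical names, not part of pyttsx3, and write_silence produces one second of silence as a stand-in for the file that save_to_file() would create, so the example runs without a speech engine installed. If the file on disk is not a RIFF/WAV container, wave.open() raises wave.Error, which makes the helper a quick sanity check on the output.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return basic header information for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        frame_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "frame_rate_hz": frame_rate,
            "frames": frames,
            "duration_s": frames / frame_rate,
        }


def write_silence(path, seconds=1, frame_rate=22050):
    """Write mono 16-bit silence (stand-in for a pyttsx3-generated file)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav_file.setframerate(frame_rate)
        wav_file.writeframes(b"\x00\x00" * frame_rate * seconds)


# Demo: create a placeholder file in a temporary directory and inspect it.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_silence(demo_path)
info = describe_wav(demo_path)
print(info)
```

With a real pyttsx3 run, you would call describe_wav('output.wav') after engine.runAndWait() has returned instead of using the placeholder file.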