azure text-to-speech

Azure text to speech python SDK timeout error


I have a minimal Azure text-to-speech example which fails on some computers and not others. All of the computers run macOS 14.5 and Python 3.11.8 with azure-cognitiveservices-speech==1.41.1; there are no other differences between the computers running the code.

Some computers work instantly and produce an audio file; others never work and time out with the following errors:

Error details: USP error: timeout waiting for the first audio chunk Error: File:/Users/runner/work/1/s/external/azure-c-shared-utility/pal/ios-osx/tlsio_appleios.c Func:tlsio_appleios_destroy Line:196 tlsio_appleios_destroy called while not in TLSIO_STATE_CLOSED.

There is an open issue on GitHub, although it only references the TLS error, which I suspect is secondary: https://github.com/azure/azure-c-shared-utility/issues/658

import logging
import os

import azure.cognitiveservices.speech as speechsdk

# TEMP_AZURE_AUDIO_PATH and credentials are defined elsewhere in the project.
logger = logging.getLogger(__name__)


def text_to_speech(text, voice_name='zh-CN-YunfengNeural'):
    # Make sure the output directory exists before writing the audio file
    os.makedirs(TEMP_AZURE_AUDIO_PATH, exist_ok=True)
    output_file = os.path.join(TEMP_AZURE_AUDIO_PATH, f"{text[:10]}--{voice_name}.wav")

    speech_config = speechsdk.SpeechConfig(subscription=credentials.azure_speech_key, region=credentials.azure_service_region)
    speech_config.speech_synthesis_voice_name = voice_name

    # Write the synthesized audio directly to output_file
    audio_config = speechsdk.audio.AudioOutputConfig(filename=output_file)

    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    # Blocks until synthesis finishes or is canceled
    result = synthesizer.speak_text_async(text).get()

    # Check result status
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        logger.info("Speech synthesis completed.")
        # Verify that the file was created successfully
        if os.path.exists(output_file):
            print(f"File '{output_file}' was created successfully.")
        else:
            print(f"File '{output_file}' was not created.")
            return None

    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print(f"Speech synthesis canceled: {cancellation_details.reason}")
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            if cancellation_details.error_details:
                print(f"Error details: {cancellation_details.error_details}")
        return None

    return output_file

With the same environment, OS, Python packages and credentials, I expected it to work on every computer. 5 out of 8 computers produce the error; the others work every time.

Does anyone have any suggestions?


Solution

  • Thanks all for your help with this. In the end it was actually a network issue which was really hard to debug, so Stack Overflow friends didn't have all the info. Sorry!

    The actual solution: all of the computers run the same VPN, but it turns out MSS clamping was not enabled on some routers, which meant the VPN dropped packets. Strangely, all other network activity worked fine; it only caused an issue with the Azure TTS API. Weird!

    (MSS Clamping is a network setting that adjusts the Maximum Segment Size (MSS) of TCP packets to ensure they fit within the limits of your network path, especially when using tunnels like VPNs.)
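
To make the MSS arithmetic concrete, here is a minimal sketch of why a tunnel needs a smaller segment size. The function names and the 80-byte tunnel overhead are hypothetical, chosen only for illustration; the real overhead depends on the VPN protocol in use, and the header sizes assume IPv4 and TCP without options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def max_segment_size(link_mtu: int, tunnel_overhead: int = 0) -> int:
    """Largest TCP payload that fits in one packet on this path.

    tunnel_overhead is the number of bytes the VPN adds per packet
    (an assumed, protocol-dependent figure).
    """
    mss = link_mtu - tunnel_overhead - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    if mss <= 0:
        raise ValueError("overhead leaves no room for TCP payload")
    return mss


def needs_clamping(advertised_mss: int, link_mtu: int, tunnel_overhead: int) -> bool:
    """True if segments of the advertised size would not fit through the tunnel."""
    return advertised_mss > max_segment_size(link_mtu, tunnel_overhead)


if __name__ == "__main__":
    # Ethernet MTU of 1500 with no tunnel
    print("MSS without VPN:", max_segment_size(1500))
    # Same link with a hypothetical 80 bytes of VPN overhead
    print("MSS through VPN:", max_segment_size(1500, 80))
    # An endpoint advertising the untunneled MSS would exceed the tunnel limit
    print("Clamping needed:", needs_clamping(1460, 1500, 80))
```

With clamping enabled, the router rewrites the MSS advertised during the TCP handshake down to the tunneled value, so neither side sends segments that are too large for the VPN path.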