Tags: python, linux, azure, azure-functions, azure-speech

Azure Function with Azure AI Speech network issue (WS_OPEN_ERROR_UNDERLYING_IO_OPEN_FAILED)


I'm using Azure AI Speech (previously Azure Cognitive Services), specifically the Text-to-Speech (TTS) feature in a Python Azure Function app (Linux). However, when my Function app is running and trying to call the AI Speech service, it gives me this error:

Connection failed (no connection to the remote host). Internal error: 1. Error details: Failed with error: WS_OPEN_ERROR_UNDERLYING_IO_OPEN_FAILED wss://eastus.tts.speech.microsoft.com/cognitiveservices/websocket/v1 X-ConnectionId: 45e9f87a28fa40ba981221cb55f6fc15 USP state: Sending. Received audio size: 0 bytes.

My code looks like this:

try:
    file_url = None

    speech_config = speechsdk.SpeechConfig(
        subscription=my_azure_ai_speech_api_key,
        region="eastus"
    )

    local_audio_path = f"/tmp/{filename}"
    ensure_directory_exists(local_audio_path)

    audio_config = speechsdk.audio.AudioOutputConfig(filename=local_audio_path)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    result = synthesizer.speak_text_async(text).get()

    if result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        cancellation_reason = cancellation_details.reason
        cancellation_error = ""

        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            cancellation_error = cancellation_details.error_details

        # Logic to log error here, this is actually where the error occurs that I mentioned above
        return

    with open(local_audio_path, "rb") as audio_file:
        audio_data = audio_file.read()

    # Logic to write the audio file to my Azure Blob Storage here

    # Logic to remove the local file here

    return file_url
except Exception as e:
    # Logic to log error here
    return None

I am developing on a Windows machine and locally, the Azure Function app is working fine and succeeds in calling the AI Speech service. I am only experiencing this problem on Azure Cloud with my deployed Azure Function app. I already tried to search for solutions and a suggestion was to install OpenSSL 1.1 in my Function app environment, so I created this startup script (startup.sh):

#!/bin/bash

# Log start of script
echo "Starting startup script"

# Update package lists
echo "Updating package lists"
apt-get update

# Install dependencies
echo "Installing dependencies"
apt-get install -y build-essential libssl-dev ca-certificates libasound2 wget

# Install OpenSSL 1.1
echo "Installing OpenSSL 1.1"
wget http://security.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb
dpkg -i libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb

# Verify installations
echo "Verifying OpenSSL installation"
openssl version

# Log end of script
echo "Startup script completed"

I then added these environment variables to my Function app:

WEBSITE_STARTUP_SCRIPT=startup.sh
WEBSITE_RUN_FROM_PACKAGE=1

But I am still having the same issue. Besides the above, I have already:


Solution

  • The error indicates that the Function app cannot open a connection to the Text-to-Speech (TTS) endpoint.

    1. Check that no firewall or network policy on the Azure platform is blocking the Function app's connection to the TTS endpoint.
    2. If you read the API key from an environment variable, make sure the variable is also added to the Azure Function app settings.
    3. Check the HTTP/HTTPS version and TLS version in the Azure Function app configuration settings.
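
The three checks above can be sketched as a small diagnostic helper to run from inside the deployed function. This is only a sketch under stated assumptions: the hostname pattern is taken from the URL in the error message, `SPEECH_KEY` is a hypothetical app setting name, and the helper function names are made up for illustration. It uses Python's own `socket` and `ssl` modules, so a successful result shows that the host is reachable from the Function app, not that the Speech SDK's native networking stack will behave the same way.

```python
import os
import socket
import ssl


def tts_host(region: str) -> str:
    """Build the TTS hostname seen in the error message for a given region."""
    return f"{region}.tts.speech.microsoft.com"


def get_speech_key(setting_name: str = "SPEECH_KEY") -> str:
    """Read the key from an app setting (SPEECH_KEY is a hypothetical name)."""
    key = os.environ.get(setting_name)
    if not key:
        raise RuntimeError(f"App setting {setting_name!r} is missing or empty")
    return key


def check_tts_endpoint(region: str, timeout: float = 5.0) -> str:
    """Open a TCP connection on port 443, complete a TLS handshake,
    and return the negotiated TLS version. Raises OSError on failure."""
    host = tts_host(region)
    context = ssl.create_default_context()
    with socket.create_connection((host, 443), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

Calling `check_tts_endpoint("eastus")` inside the function and logging the result (or the raised exception) should help tell a blocked outbound connection apart from a missing key, which `get_speech_key()` reports separately.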

    Code:

    In the code below, I used a local directory path to save the synthesized audio to an output.wav file.

    import os
    import azure.functions as func
    import azure.cognitiveservices.speech as speechsdk
    import logging
    import traceback
    
    def ensure_directory_exists(path):
        directory = os.path.dirname(path)
        if not os.path.exists(directory):
            os.makedirs(directory)
    
    app = func.FunctionApp()
    
    @app.function_name(name="HttpTriggerFunction")
    @app.route(route="tts", methods=["GET", "POST"])
    def main(req: func.HttpRequest) -> func.HttpResponse:
        logging.info('Python HTTP trigger function processed a request.')
    
        text = req.params.get('text')
        if not text:
            try:
                req_body = req.get_json()
            except ValueError:
                pass
            else:
                text = req_body.get('text')
    
        if not text:
            return func.HttpResponse(
                "Please pass a text on the query string or in the request body",
                status_code=400
            )
    
        try:
            my_azure_ai_speech_api_key = "<speech_key>"
            region = "<speech_region>"
    
            speech_config = speechsdk.SpeechConfig(
                subscription=my_azure_ai_speech_api_key,
                region=region
            )
    
            local_audio_path = r"C:/Users/kamali/Documents/xxxxxx/output.wav"
            ensure_directory_exists(local_audio_path)
    
            audio_config = speechsdk.audio.AudioOutputConfig(filename=local_audio_path)
            synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
            result = synthesizer.speak_text_async(text).get()
    
            if result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = result.cancellation_details
                logging.error(f"Speech synthesis canceled: {cancellation_details.reason}")
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
                    logging.error(f"Error details: {cancellation_details.error_details}")
    
                return func.HttpResponse(
                    "Speech synthesis canceled",
                    status_code=500
                )
    
            with open(local_audio_path, "rb") as audio_file:
                audio_data = audio_file.read()
                
            return func.HttpResponse("File generated successfully", status_code=200)
        except Exception as e:
            logging.error(f"Exception: {str(e)}")
            logging.error(f"Traceback: {traceback.format_exc()}")
            return func.HttpResponse("An error occurred", status_code=500)
    

    requirements.txt:

    azure-functions
    azure-cognitiveservices-speech
    

    Local output:

    I called the HTTP trigger function locally with the URL below, and the request succeeded in the browser.

    http://localhost:7071/api/tts?text=Hello World
    

    VS Code terminal output:

    The output.wav file was generated in the local directory.

    Before deploying the project to the Azure Function app, I changed local_audio_path to /tmp/output.wav, as shown in the code below. This worked fine for me after deployment to the Azure Function app.

    import azure.functions as func
    import azure.cognitiveservices.speech as speechsdk
    import logging
    import traceback
    
    app = func.FunctionApp()
    
    @app.function_name(name="HttpTriggerFunction")
    @app.route(route="tts", methods=["GET", "POST"])
    def main(req: func.HttpRequest) -> func.HttpResponse:
        logging.info('Python HTTP trigger function processed a request.')
    
        text = req.params.get('text')
        if not text:
            try:
                req_body = req.get_json()
            except ValueError:
                pass
            else:
                text = req_body.get('text')
    
        if not text:
            return func.HttpResponse(
                "Please pass a text on the query string or in the request body",
                status_code=400
            )
    
        try:
            my_azure_ai_speech_api_key = "<speech_key>"
            region = "<speech_region>"
    
            speech_config = speechsdk.SpeechConfig(
                subscription=my_azure_ai_speech_api_key,
                region=region
            )
    
            # /tmp is writable in the Linux Function app environment
            local_audio_path = "/tmp/output.wav"
    
            audio_config = speechsdk.audio.AudioOutputConfig(filename=local_audio_path)
            synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
            result = synthesizer.speak_text_async(text).get()
    
            if result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = result.cancellation_details
                logging.error(f"Speech synthesis canceled: {cancellation_details.reason}")
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
                    logging.error(f"Error details: {cancellation_details.error_details}")
    
                return func.HttpResponse(
                    "Speech synthesis canceled",
                    status_code=500
                )
    
            with open(local_audio_path, "rb") as audio_file:
                audio_data = audio_file.read()
    
            return func.HttpResponse("File generated successfully", status_code=200)
        except Exception as e:
            logging.error(f"Exception: {str(e)}")
            logging.error(f"Traceback: {traceback.format_exc()}")
            return func.HttpResponse("An error occurred", status_code=500)
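
The only difference between the local and deployed versions above is the output path. As a minimal sketch, the path could instead be derived from the platform's temporary directory so that the same code runs on Windows and on the Linux Function app. The helper name `build_audio_path` is hypothetical, and the sketch assumes that `tempfile.gettempdir()` resolves to a writable location such as /tmp on the Function app, which is typical on Linux but not guaranteed.

```python
import os
import tempfile


def build_audio_path(filename: str) -> str:
    """Return a path for `filename` inside the platform's temp directory.

    Only the base name is kept, so a caller cannot write outside the
    temp directory by passing something like "../other/output.wav".
    """
    return os.path.join(tempfile.gettempdir(), os.path.basename(filename))
```

The result can be passed directly as `filename=` to `speechsdk.audio.AudioOutputConfig`, replacing the hard-coded `local_audio_path`.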
    

    I deployed the above code to the Azure Function app, and it ran successfully when tested in the Azure Portal.