I am building a simple Python application that does continuous speech translation with the Azure Cognitive Services Speech SDK. Translation and language detection both appear to work, as far as I can tell. My goal is to update the UI according to the detected language, but I cannot find a way to read the detected language. The docs say that continuous language detection is supported for Python. How can I get a signal for the detected language?
My question is about:
speechsdk.languageconfig.AutoDetectSourceLanguageConfig(self.detectable_languages)
I am setting the languages according to the SDK specs using the language code and locale:
detectable_languages = ["en-US", "es-MX"]
This is where I am configuring the translation recognizer:
speech_translation_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceConnection_LanguageIdMode, value='Continuous')
def configure(self):
    self.speech_translation_config = self.init_speech_translation_config()
    self.audio_config = self.set_audio_source()
    self.auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(self.detectable_languages)
    print("Detectable languages set in AutoDetectSourceLanguageConfig:")
    print(self.auto_detect_source_language_config.language)
    self.translation_recognizer = self.init_translation_recognizer()
    self.set_event_callbacks()
When I run my app I get this output:
Error in executing callback: 'AutoDetectSourceLanguageConfig' object has no attribute 'language'
It seems to me that I should be able to get a signal when the service has detected "en-US" or "es-MX". Can someone tell me whether the Python API is limited in this regard, or whether the detected language is not exposed for a reason I am not aware of? Thank you!
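I tried the code below and it successfully detected both "en-US" and "es-MX". It uses a SpeechRecognizer with recognize_once() in a loop and reads the detected language from the result properties rather than from the AutoDetectSourceLanguageConfig object.
Code: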
import sys

import azure.cognitiveservices.speech as speechsdk

speech_key = "<speech_key>"
service_region = "<speech_region>"

# Candidate locales the service should choose between.
detectable_languages = ["en-US", "es-MX"]
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=detectable_languages)
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)


def main():
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                            audio_config=audio_config,
                                            auto_detect_source_language_config=auto_detect_source_language_config)
    print("Session started")
    print("Recognition loop started, press Ctrl+C to stop...")
    while True:
        try:
            print("Speech start detected")
            # recognize_once() blocks until a single utterance has been processed.
            result = recognizer.recognize_once()
            print("Speech end detected")
            if result.reason == speechsdk.ResultReason.RecognizedSpeech:
                text = result.text
                # The detected locale is stored on the result, not on the config object.
                # .get() avoids a KeyError when the property is missing.
                language_result = result.properties.get(
                    speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult)
                if language_result:
                    detected_language = language_result
                    print("Recognized:", text)
                    print("Detected language:", detected_language)
                else:
                    print("Could not detect the language.")
            elif result.reason == speechsdk.ResultReason.NoMatch:
                print("No speech could be recognized")
            elif result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = result.cancellation_details
                print("Speech recognition canceled:", cancellation_details.reason)
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
                    print("Error details:", cancellation_details.error_details)
        except KeyboardInterrupt:
            print("Exiting...")
            sys.exit(0)
        except Exception as e:
            print("Error: {0}".format(e))


if __name__ == "__main__":
    main()
Output :
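For the continuous translation scenario in the question, the same result property can probably be read inside the recognizer's `recognized` event callback. The following is only a sketch under assumptions, not a tested setup: `detected_language`, `make_recognized_handler`, `build_recognizer`, and the `on_language` UI hook are hypothetical names, and the target languages "en" and "es" are placeholders. Depending on the SDK version, continuous language identification may also require a different endpoint, so check the documentation for the version you use. The SDK is imported lazily so that the callback logic can be exercised with fake events.

```python
from types import SimpleNamespace


def detected_language(result, property_key):
    """Return the detected locale stored on a result, or None if absent.

    `result` only needs a dict-like `.properties` attribute.
    """
    return result.properties.get(property_key) or None


def make_recognized_handler(property_key, on_language):
    """Build a `recognized` callback that forwards the locale to the UI."""
    def handler(evt):
        language = detected_language(evt.result, property_key)
        if language:
            on_language(language, evt.result.text)
    return handler


def build_recognizer(speech_key, service_region, languages, on_language):
    """Wire the handler to a TranslationRecognizer (needs the Speech SDK)."""
    import azure.cognitiveservices.speech as speechsdk  # lazy import

    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=speech_key, region=service_region)
    # Assumption: placeholder target languages, adjust to your application.
    config.add_target_language("en")
    config.add_target_language("es")
    # Same property as in the question: keep detecting during the session.
    config.set_property(
        property_id=speechsdk.PropertyId.SpeechServiceConnection_LanguageIdMode,
        value="Continuous")

    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=config,
        audio_config=speechsdk.audio.AudioConfig(use_default_microphone=True),
        auto_detect_source_language_config=(
            speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
                languages=languages)))
    key = speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult
    recognizer.recognized.connect(make_recognized_handler(key, on_language))
    return recognizer  # caller invokes recognizer.start_continuous_recognition()


# Offline check of the callback logic with fake events (no SDK required).
seen = []
fake_handler = make_recognized_handler(
    "lang", lambda language, text: seen.append((language, text)))
fake_handler(SimpleNamespace(
    result=SimpleNamespace(properties={"lang": "es-MX"}, text="hola")))
fake_handler(SimpleNamespace(
    result=SimpleNamespace(properties={}, text="ignored")))
print(seen)
```

In a real application, `on_language` would be whatever function updates the UI, for example one that switches the displayed language label when the locale changes from "en-US" to "es-MX".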