I am using the Microsoft Speech SDK to transcribe WAV audio files, which I receive as binary data through an API. I tried to pass that binary data to the SDK directly, but I couldn't find a way to use it as input to the SDK functions. So I wrote a function that first saves the audio as a .wav file and then passes the path to the SDK. Once transcription is done I no longer need the file, so I call os.remove() to delete it, but every time I get an error saying that another process is using the file. I debugged and found that one of the Microsoft SDK functions is the one holding the file.
This is my code:
def function_that_gets_binary(file):
    with open(os.path.join("Data_to_remove", file.filename), "wb") as f:
        f.write(file.file.read())
        filename = file.filename
        f.close()
    transcript, time_seconds = MS_SDK(os.path.join("Data_to_remove", filename))

def MS_SDK(voice):
    audio_config = speechsdk.audio.AudioConfig(filename=voice)
    speech_config.speech_recognition_language = "ar-SA"
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        nonlocal done
        done = True
        speech_recognizer.stop_continuous_recognition()
        print("Stopped")
        os.remove(voice)

    full_text = []
    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognized.connect(lambda evt: full_text.append(format(evt.result.text)))
    speech_recognizer.session_stopped.connect(stop_cb)
    # Start continuous speech recognition
    start = timeit.default_timer()
    speech_recognizer.start_continuous_recognition()
    while not done:
        time.sleep(0.5)
    transcript = " ".join(full_text)
    end = timeit.default_timer()
    duration = "{} seconds".format(round((end - start), 6))
    return transcript, duration

I tried adding a sleep to give the process time to release the file, but I got the same error. I also tried a separate remove function that runs in the background after a 20-second sleep, and got the same error again. I don't understand what is happening, because by then I have already received the transcript and the process has printed "Stopped".
I converted speech to text and then successfully deleted the .wav file with the code below. In it, the recognizer is released at the end of MS_SDK, and the file is removed only after MS_SDK has returned.
Code:
import time
import timeit
import os
import azure.cognitiveservices.speech as speechsdk


def MS_SDK(voice):
    audio_config = speechsdk.audio.AudioConfig(filename=voice)
    speech_config = speechsdk.SpeechConfig(subscription="<speech_key>", region="<speech_region>")
    speech_config.speech_recognition_language = "ar-SA"
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    done = False

    def stop_cb(evt):
        """callback stops continuous recognition upon receiving an event `evt`"""
        nonlocal done
        done = True
        speech_recognizer.stop_continuous_recognition()
        print("Stopped")

    full_text = []
    speech_recognizer.recognized.connect(lambda evt: full_text.append(format(evt.result.text)))
    speech_recognizer.session_stopped.connect(stop_cb)
    start = timeit.default_timer()
    speech_recognizer.start_continuous_recognition()
    while not done:
        time.sleep(0.5)
    transcript = " ".join(full_text)
    end = timeit.default_timer()
    duration = "{} seconds".format(round((end - start), 6))
    # Release the recognizer explicitly so it no longer holds the audio file
    speech_recognizer.__del__()
    return transcript, duration


def transcribe_audio_and_print(audio_file_path):
    transcript, duration = MS_SDK(audio_file_path)
    print("Transcript:", transcript)
    print("Duration:", duration)
    # Delete the file only after MS_SDK has returned
    try:
        os.remove(audio_file_path)
        print("File deleted successfully.")
    except Exception as e:
        print("Error occurred while deleting file:", e)


audio_file_path = "path to wav file/Data_to_remove/<filename>.wav"
transcribe_audio_and_print(audio_file_path)

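The same ordering (close your own handle, let the recognizer go, then delete) can be wrapped around the upload case from the question. This is only a minimal sketch: `transcribe_bytes` and its `transcribe` parameter are hypothetical names introduced here, and it assumes the callable you pass in (for example `MS_SDK` above) has released its own handle on the file before it returns. If it has not, `os.remove` will still fail on Windows.

```python
import os
import tempfile


def transcribe_bytes(audio_bytes, transcribe):
    """Write audio_bytes to a temporary .wav file, transcribe it, then delete it.

    `transcribe` is a hypothetical callable (for example MS_SDK) that takes a
    file path and is assumed to release the file before it returns.
    """
    # delete=False so the file can be reopened by path on Windows.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        tmp.write(audio_bytes)
        path = tmp.name
    # Leaving the `with` block has closed our own handle on the file.
    try:
        return transcribe(path)
    finally:
        # Runs whether transcription succeeded or raised.
        os.remove(path)
```

With the question's upload object this might be called as `transcript, duration = transcribe_bytes(file.file.read(), MS_SDK)`.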
Output:
Running the script (main.py) converted speech to text and deleted the .wav file successfully, as shown below.
C:\Users\xxxxxxxx\Documents\xxxxxxxxx>python main.py
Stopped
Transcript: Hello this is a test of the speech synthesis service.
Duration: 3.507588 seconds
File deleted successfully.
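Since the question mentions wanting to feed the received bytes to the SDK directly, one option that may avoid the temporary file altogether is the SDK's push audio stream. The following is an untested sketch, not a verified solution: `read_wav` and `audio_config_from_bytes` are hypothetical helper names, and it assumes the upload is an uncompressed PCM WAV (the standard `wave` module rejects other encodings) small enough to hold in memory.

```python
import io
import wave


def read_wav(audio_bytes):
    """Return (sample_rate, bits_per_sample, channels, pcm_frames) from WAV bytes."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        return (
            wav.getframerate(),
            wav.getsampwidth() * 8,
            wav.getnchannels(),
            wav.readframes(wav.getnframes()),
        )


def audio_config_from_bytes(audio_bytes):
    """Build an AudioConfig backed by an in-memory push stream (no file on disk)."""
    # Imported here so read_wav stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    rate, bits, channels, frames = read_wav(audio_bytes)
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=rate, bits_per_sample=bits, channels=channels
    )
    stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    stream.write(frames)
    stream.close()  # marks the end of the audio
    audio_config = speechsdk.audio.AudioConfig(stream=stream)
    # Return the stream too, so the caller can keep it referenced
    # for as long as the recognizer is in use.
    return stream, audio_config
```

The returned `audio_config` would take the place of `speechsdk.audio.AudioConfig(filename=voice)` in `MS_SDK`. With no file on disk there is nothing to delete afterwards.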