I recently started developing a React Native app with Expo, and a couple of days ago I ran into a problem that is driving me crazy.
I want to send a text string to the OpenAI TTS API and play the audio response in my app immediately, without saving the file locally. Link to the API: https://platform.openai.com/docs/guides/text-to-speech
So far I have successfully made the HTTP request, and it looks like I get the correct response from the API. When I log response.blob() I get:
{"_data": {"__collector": {}, "blobId": "5C362F63-71CB-45EC-913E-9A808AF2194F", "name": "speech.mp3", "offset": 0, "size": 143520, "type": "audio/mpeg"}}
The issue seems to be playing the sound with expo-av. I have been Googling for days now, and no matter which solution I try, I get some error when trying to play the sound. With the current code, the error is:
Error: The AVPlayerItem instance has failed with the error code -1002 and domain "NSURLErrorDomain".
I would really appreciate it if anyone could help me (please include code examples).
I also read this post, where the solution was to switch to Google's TTS API, but that is not the solution I want: https://www.reddit.com/r/reactnative/comments/13pa9wx/playing_blob_audio_using_expo_audio_react_native/
This is my code:
const convertTextToSpeech = async (textToConvert) => {
  const apiKey = 'myKey';
  const url = 'https://api.openai.com/v1/audio/speech';

  const requestOptions = {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'tts-1',
      input: textToConvert,
      voice: 'alloy',
      language: 'da',
    }),
  };

  await fetch(url, requestOptions)
    .then((response) => {
      if (!response.ok) {
        throw new Error('Network response was not ok');
      }
      return response.blob();
    })
    .then((blob) => {
      playSound(blob);
    })
    .catch((error) => {
      console.error('There was a problem with the request:', error);
    });
};

async function playSound(blob) {
  const url = URL.createObjectURL(blob);
  const { sound } = await Audio.Sound.createAsync({ uri: url });
  await sound.playAsync();
}
Okay, after some serious attempts at figuring this out, here is my solution. I hope it serves you and others well. It works for me for now (I am still testing functionality, so it might run into issues down the road, but fingers crossed).
Note that this relies on Buffer, so install a Buffer polyfill such as react-native-buffer (or the buffer package) before you implement it. Also, I haven't handled the sound instance much beyond just playing it to make sure it works, so expect more work there.
Client
import { Buffer } from "buffer"; // Buffer polyfill, e.g. from the "buffer" or "react-native-buffer" package
import * as FileSystem from "expo-file-system";
import { Audio } from "expo-av";

// Convert a Blob to a Buffer by reading it as a data URI and decoding the base64 payload.
const toBuffer = async (blob) => {
  const uri = await toDataURI(blob);
  const base64 = uri.replace(/^.*,/g, "");
  return Buffer.from(base64, "base64");
};

// Read a Blob into a base64 data URI with FileReader.
const toDataURI = (blob) =>
  new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.readAsDataURL(blob);
    reader.onloadend = () => {
      const uri = reader.result?.toString();
      resolve(uri);
    };
    reader.onerror = reject;
  });

// Write the audio to a temp file in the cache directory so expo-av can play it from a file:// URI.
const constructTempFilePath = async (buffer) => {
  const tempFilePath = FileSystem.cacheDirectory + "speech.mp3";
  await FileSystem.writeAsStringAsync(tempFilePath, buffer.toString("base64"), {
    encoding: FileSystem.EncodingType.Base64,
  });
  return tempFilePath;
};
and then use them like this:
const blob = await response.blob();
const buffer = await toBuffer(blob);
const tempFilePath = await constructTempFilePath(buffer);
const { sound } = await Audio.Sound.createAsync({ uri: tempFilePath });
await sound.playAsync();
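Since I haven't done much with the sound instance yet, here is a minimal sketch of the cleanup I have in mind, assuming you want to unload the sound and delete the temp file once playback finishes. It replaces the last two lines above; setOnPlaybackStatusUpdate and didJustFinish are expo-av APIs, and tempFilePath is the path from constructTempFilePath:

const { sound } = await Audio.Sound.createAsync({ uri: tempFilePath });

sound.setOnPlaybackStatusUpdate(async (status) => {
  // didJustFinish becomes true once, when playback reaches the end
  if (status.isLoaded && status.didJustFinish) {
    await sound.unloadAsync(); // release the native player
    await FileSystem.deleteAsync(tempFilePath, { idempotent: true }); // remove the cached mp3
  }
});

await sound.playAsync();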
Server
const mp3 = await openai.audio.speech.create({
  model: "tts-1",
  voice: "alloy",
  input: "Hello World",
});

const mp3Stream = new PassThrough();
mp3Stream.end(Buffer.from(await mp3.arrayBuffer()));

res.setHeader("Content-Type", "audio/mpeg");
res.setHeader("Content-Disposition", 'attachment; filename="audio.mp3"');
mp3Stream.pipe(res);
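In case it helps, here is the same server snippet wrapped in a complete Express route. The /tts path, the JSON body handling, and the OpenAI client setup are my assumptions, not part of the snippet above:

import express from "express";
import OpenAI from "openai";
import { PassThrough } from "stream";

const app = express();
app.use(express.json());

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical route: accepts { "text": "..." } and streams back the mp3.
app.post("/tts", async (req, res) => {
  const mp3 = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy",
    input: req.body.text ?? "Hello World",
  });

  const mp3Stream = new PassThrough();
  mp3Stream.end(Buffer.from(await mp3.arrayBuffer()));

  res.setHeader("Content-Type", "audio/mpeg");
  res.setHeader("Content-Disposition", 'attachment; filename="audio.mp3"');
  mp3Stream.pipe(res);
});

app.listen(3000);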