I am trying to use the Azure Text to Speech service (Microsoft.CognitiveServices.Speech) to convert text to audio, and then convert the audio to another format using NAudio.
I already have the NAudio part working with an MP3 file, but I cannot get any output from SpeakTextAsync that works with NAudio.
This is the code where I try to play the audio using NAudio (as a temporary test), but it doesn't play anything valid:
    var waveStream = new RawSourceWaveStream(azureStream, new WaveFormat());
    using (var waveOut = new WaveOutEvent())
    {
        waveOut.Init(waveStream);
        Log.Logger.Debug("Playing sounds...");
        waveOut.Play();
        while (waveOut.PlaybackState == PlaybackState.Playing)
        {
            Thread.Sleep(1000);
        }
    }
These are the two possible outputs I found, but I am probably missing something important:
Option 1 (AudioDataStream):
    using var synthesizer = new SpeechSynthesizer(_config, null);
    using var result = await synthesizer.SpeakTextAsync(text);
    switch (result.Reason)
    {
        case ResultReason.SynthesizingAudioCompleted:
            Console.WriteLine($"Speech synthesized to speaker for text [{text}]");
            return AudioDataStream.FromResult(result);
        case ResultReason.Canceled:
        {
            var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
            Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
            if (cancellation.Reason == CancellationReason.Error)
            {
                Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                Console.WriteLine($"CANCELED: Did you update the subscription info?");
            }
            return null;
        }
        default:
            return null;
    }
Option 2 (PullAudioOutputStream):
    PullAudioOutputStream stream = new PullAudioOutputStream();
    AudioConfig config = AudioConfig.FromStreamOutput(stream);
    using var synthesizer = new SpeechSynthesizer(_config, null);
    using var result = await synthesizer.SpeakTextAsync(text);
    switch (result.Reason)
    {
        case ResultReason.SynthesizingAudioCompleted:
            Console.WriteLine($"Speech synthesized to speaker for text [{text}]");
            return stream;
        case ResultReason.Canceled:
        {
            var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
            Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
            if (cancellation.Reason == CancellationReason.Error)
            {
                Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                Console.WriteLine($"CANCELED: Did you update the subscription info?");
            }
            return null;
        }
        default:
            return null;
    }
So how do I convert the text-to-speech output to a format that NAudio can read?
Kevin,
Why do you need NAudio? If it's only for playback, it isn't necessary. When the synthesizer is created with the default speaker output (rather than with a null AudioConfig), the following line plays the text out loud:
await synthesizer.SpeakTextAsync(text);
If you need the result of the speech synthesis in NAudio for any other reason, wrap result.AudioData in a MemoryStream and read it with a WaveFileReader:
    if (result.Reason == ResultReason.SynthesizingAudioCompleted)
    {
        using var stream = new MemoryStream(result.AudioData);
        using var reader = new WaveFileReader(stream);
        using var player = new WaveOutEvent();
        player.Init(reader);
        player.Play();
        while (player.PlaybackState == PlaybackState.Playing)
        {
            Thread.Sleep(500);
        }
    }
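To illustrate why WaveFileReader helps here, below is a minimal sketch that needs only NAudio and no Azure subscription. It assumes that result.AudioData holds RIFF/WAV bytes at 16 kHz, 16-bit, mono; that format is an assumption, and you can pin it explicitly with SpeechConfig.SetSpeechSynthesisOutputFormat. The names WavBytesDemo, MakeFakeAudioData and DescribeFormat are hypothetical, and the generated silence is only a stand-in for the real synthesis result. WaveFileReader takes the format from the WAV header, whereas the parameterless new WaveFormat() used in the question describes 44.1 kHz, 16-bit stereo, which would not match such data.

```csharp
using System;
using System.IO;
using NAudio.Wave;

public static class WavBytesDemo
{
    // Stand-in for result.AudioData: one second of silence as RIFF/WAV bytes.
    // The 16 kHz / 16-bit / mono format is an assumption for illustration.
    public static byte[] MakeFakeAudioData()
    {
        var format = new WaveFormat(16000, 16, 1);
        var memory = new MemoryStream();
        using (var writer = new WaveFileWriter(memory, format))
        {
            var silence = new byte[format.AverageBytesPerSecond];
            writer.Write(silence, 0, silence.Length);
        }
        // Disposing the writer finalizes the header and closes the stream;
        // MemoryStream.ToArray still works on a closed stream.
        return memory.ToArray();
    }

    // Reads the format from the WAV header instead of guessing it.
    public static WaveFormat DescribeFormat(byte[] wavBytes)
    {
        using var stream = new MemoryStream(wavBytes);
        using var reader = new WaveFileReader(stream);
        return reader.WaveFormat;
    }

    public static void Main()
    {
        var fromHeader = DescribeFormat(MakeFakeAudioData());
        var guessed = new WaveFormat(); // NAudio default: 44.1 kHz, 16-bit, stereo

        Console.WriteLine($"From header: {fromHeader.SampleRate} Hz, {fromHeader.Channels} channel(s)");
        Console.WriteLine($"Default:     {guessed.SampleRate} Hz, {guessed.Channels} channel(s)");
    }
}
```

Once the reader reports the right format, the same reader can be passed to whatever NAudio conversion step already works for the MP3 input.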