I am trying to use AWS Polly (for TTS) through the AWS JavaScript SDK from an AWS Lambda function, which is exposed as a REST API through API Gateway. Getting the PCM output from Polly is not a problem. Here is the call flow in brief:
.NET application --> REST API (API gateway) --> AWS Lambda (JS SDK) --> AWS Polly
The .NET application (I am also using Postman for testing) receives the audio stream buffer in the following format:
{"type":"Buffer","data":[255,255,0,0,0,0,255,255,255,255,0,0,0,0,0,0,255,255,255,255,0,0,0,0,255,255,255,255,255,255,255,255,0,0,255,255,255,255,0,0,0,0,255,255,255,255,0,0,255,255,255, more such data]
Now I don't know how to convert it back to raw PCM. I would like to send this data back as raw PCM, but I am unable to find a way to do it. I also cannot understand why AWS would send data back in such a format. Using their console, one can get audio in raw PCM format (which I can then feed to Audacity), but it is not so simple with the SDK. Or am I missing something really basic?
Any suggestions/tips on this? Thanks.
As Michael mentioned (in the comment), passing the Polly response straight back through the Lambda callback causes the audio stream (a Node.js Buffer) to be serialized as a JSON object. Encoding the buffer received from Polly as base64 before returning it fixes this. Here's what the code sample now looks like:
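polly.synthesizeSpeech(params, function(err, data) {
    if (err) {
        console.log(err, err.stack); // an error occurred
        return callback(err);        // report the failure instead of leaving the caller waiting
    }
    console.log(data); // successful response
    // Old code: passing the Buffer directly gets it serialized as a JSON object
    // callback(null, data.AudioStream);
    // Use this instead: return the audio as a base64-encoded string
    if (data && data.AudioStream instanceof Buffer) {
        var buf = data.AudioStream.toString('base64');
        callback(null, buf);
    } else {
        callback(new Error('Polly response did not contain an audio buffer'));
    }
});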
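To illustrate why the original response looked the way it did, here is a minimal sketch that runs in plain Node.js without Polly. The four-byte `audio` buffer is a made-up stand-in for `data.AudioStream`, not real Polly output. It shows that serializing a Buffer to JSON yields the `{"type":"Buffer","data":[...]}` shape, while a base64 string passes through JSON as ordinary text.

```javascript
// Hypothetical stand-in for the first few bytes of data.AudioStream.
const audio = Buffer.from([255, 255, 0, 0]);

// JSON.stringify calls Buffer.prototype.toJSON(), which produces the
// {"type":"Buffer","data":[...]} object seen in the question.
const asJson = JSON.stringify(audio);

// A base64 string is plain text, so it survives JSON serialization as-is.
const asBase64 = audio.toString('base64');

console.log(asJson);
console.log(asBase64);
```

The base64 form is also considerably more compact than the array of decimal byte values.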
PS: I am using the AWS SDK on AWS Lambda.
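On the receiving side, the payload still has to be turned back into raw bytes. Below is a rough sketch in Node.js; `toPcmBuffer` and `toSamples` are hypothetical helper names, not part of the AWS SDK. It assumes the client receives either the base64 string from the fixed Lambda or the older `{"type":"Buffer","data":[...]}` object, and that the audio is Polly's `pcm` output format, which is documented as 16-bit signed little-endian mono.

```javascript
// Hypothetical helper: accepts either a base64 string or the
// {"type":"Buffer","data":[...]} object and returns the raw PCM bytes.
function toPcmBuffer(payload) {
  if (typeof payload === 'string') {
    return Buffer.from(payload, 'base64');
  }
  if (payload && payload.type === 'Buffer' && Array.isArray(payload.data)) {
    return Buffer.from(payload.data);
  }
  throw new TypeError('Unrecognized audio payload');
}

// Hypothetical helper: reads the bytes as 16-bit signed little-endian
// samples (assumes Polly's "pcm" output format).
function toSamples(pcm) {
  const samples = new Int16Array(Math.floor(pcm.length / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = pcm.readInt16LE(i * 2);
  }
  return samples;
}

// Example with the first four bytes from the question's payload.
const pcm = toPcmBuffer({ type: 'Buffer', data: [255, 255, 0, 0] });
console.log(toSamples(pcm));
```

Writing the buffer returned by `toPcmBuffer` to a file should give something that can be imported into Audacity as raw data. In the .NET client, `Convert.FromBase64String` performs the same base64 decoding step.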