When the webhook is called multiple times in one scene and each call sends a simple response, the simple responses are merged incorrectly.
Prompt from the first webhook call:
{
  "override": false,
  "firstSimple": {
    "speech": "<speak><audio src=\"https://www.example.com/audio/file1.mp3\"></speak>",
    "text": "Text 1"
  }
}
Prompt from the second webhook call:
{
  "override": false,
  "firstSimple": {
    "speech": "<speak><audio src=\"https://www.example.com/audio/file2.mp3\"></audio> <audio src=\"https://www.example.com/audio/file3.mp3\"></audio></speak>",
    "text": " Text 2"
  }
}
Merged prompt in the response sent to the user:
{
  "firstSimple": {
    "speech": "<speak><speak><audio src=\"https://www.example.com/audio/file1.mp3\"></speak> <audio src=\"https://www.example.com/audio/file2.mp3\"/> <audio src=\"https://www.example.com/audio/file3.mp3\"/></speak>",
    "text": "Text 1 Text2"
  }
}
With the two opening <speak> tags, the merged SSML is invalid and is not spoken.
Sometimes the speech object is completely missing.
I have already created a GitHub issue for that.
I found out that the merging bug is related to invalid SSML (in the example above, the audio tag in the first prompt is never closed). Unfortunately, Google returns no error message for SSML errors.
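Since SSML errors are not reported, a check in the webhook before sending the prompt may help catch them early. The following is only a minimal sketch in plain JavaScript: isBalancedSsml is a hypothetical helper, not part of any Google library, and it checks only that tags are balanced, which is far from full SSML validation.

```javascript
// Hypothetical helper (an assumption, not a Google API):
// returns true if every opened SSML tag is closed in the right order.
function isBalancedSsml(ssml) {
  // Matches <name ...>, </name> and <name .../>; quoted attribute
  // values are skipped so that "/" or ">" inside URLs do not confuse it.
  const tagPattern = /<(\/?)([A-Za-z][\w:-]*)(?:"[^"]*"|'[^']*'|[^'">])*?(\/?)>/g;
  const openTags = [];
  let match;
  while ((match = tagPattern.exec(ssml)) !== null) {
    const [, closingSlash, name, selfClosingSlash] = match;
    if (selfClosingSlash) continue; // <audio .../> needs no closing tag
    if (closingSlash) {
      // A closing tag must match the most recently opened tag.
      if (openTags.pop() !== name) return false;
    } else {
      openTags.push(name);
    }
  }
  return openTags.length === 0;
}

// The speech strings from the two webhook calls above.
const firstSpeech =
  '<speak><audio src="https://www.example.com/audio/file1.mp3"></speak>';
const secondSpeech =
  '<speak><audio src="https://www.example.com/audio/file2.mp3"></audio> ' +
  '<audio src="https://www.example.com/audio/file3.mp3"></audio></speak>';

console.log(isBalancedSsml(firstSpeech));  // false: <audio> is never closed
console.log(isBalancedSsml(secondSpeech)); // true
```

Running such a check on each speech string before it is handed to the response at least makes the unclosed audio tag visible in the webhook logs.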
As a workaround for the problem that the speech object is sometimes completely missing, I changed conv.add(new Simple('Text')) to conv.prompt.firstSimple = new Simple('Text') or conv.prompt.lastSimple = new Simple('Text').