I am using a phone number voice webhook that returns a TwiML response like this:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Connect>
<Stream url="wss://..."/>
</Connect>
<Gather speechTimeout="auto" speechModel="phone_call" enhanced="true" input="speech" action="/respond"/>
</Response>
It starts the bidirectional voice Stream properly, no issues there. It is able to connect, send data and disconnect. But it's not making any request to '/respond' from the Gather part. If I remove the Stream connect part and update the TwiML to this:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Gather speechTimeout="auto" speechModel="phone_call" enhanced="true" input="speech" action="/respond"/>
</Response>
then Gather is being called. But why is it not being called with the bidirectional Stream?
Either:
1. Do it completely via streams? Here I am having trouble getting the StreamId, ConnectionId and CallId in one place.
2. Use Gather like I am trying to do? Here, with the bidirectional Stream, Gather is not even being called for some reason.
Why Gather? Currently, we rely on its already-trained speechTimeout and speech model to get a cue on when the user has stopped speaking. In the Gather step, we make a request to another API endpoint, where, with the help of 'StreamId', 'ConnectionId' and 'CallId', we send the voice response as streaming output.
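The behavior you describe, where the Stream works but the Gather never runs, follows from the TwiML you are using. Twilio processes TwiML verbs in order and does not move on to the next one until the current verb finishes. The verbs in your TwiML are Connect and Gather, and Connect with a Stream holds the call until the stream's WebSocket connection is closed, so the Gather that follows it is not reached while the stream is active. You have the Gather after the Stream: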
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Connect>
<Stream url="wss://..."/>
</Connect>
<Gather speechTimeout="auto" speechModel="phone_call" enhanced="true" input="speech" action="/respond"/>
</Response>
An alternative is to use just the Gather TwiML and then start the Media Stream through the Twilio REST API:
using System;
using Twilio;
using Twilio.Rest.Api.V2010.Account.Call;

string accountSid = Environment.GetEnvironmentVariable("TWILIO_ACCOUNT_SID");
string authToken = Environment.GetEnvironmentVariable("TWILIO_AUTH_TOKEN");
TwilioClient.Init(accountSid, authToken);

// Start a media stream on the live call identified by pathCallSid.
var stream = StreamResource.Create(
    url: new Uri("wss://example.com/"),
    pathCallSid: "CAXXXXXXXXXXXXXXXXXXXXXXXXXXX"
);
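Your application would need to obtain the CallSid of the call that is in "Gather" mode (Twilio sends it as the CallSid parameter of its webhook requests), pass it as pathCallSid, and then use the Twilio REST API to start the media stream for that call. One problem with this approach is that Gather seems best suited for short segments of the call. Also, as far as I can tell from the docs, a stream started through the REST API is unidirectional, so check whether that covers your need to send audio back.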
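As a rough illustration of where the CallSid comes from, here is a minimal Python sketch that pulls it out of the form-encoded body Twilio posts to the Gather action URL ('/respond' in your TwiML). The function name parse_gather_request and the sample body are made up for the example; CallSid and SpeechResult are the request parameters I would expect for a speech Gather, but verify them against what your endpoint actually receives.

```python
from urllib.parse import parse_qs


def parse_gather_request(body):
    """Extract the call SID and recognized speech from a form-encoded body.

    `body` is the raw application/x-www-form-urlencoded payload of the
    request made to the Gather action URL (hypothetical helper).
    """
    fields = parse_qs(body, keep_blank_values=True)

    def first(name):
        # parse_qs returns a list per key; take the first value or None.
        values = fields.get(name)
        return values[0] if values else None

    return {
        "call_sid": first("CallSid"),
        "speech_result": first("SpeechResult"),
    }


if __name__ == "__main__":
    # Made-up sample payload, for illustration only.
    sample = "CallSid=CA123&SpeechResult=hello+world&Confidence=0.9"
    print(parse_gather_request(sample))
```

The call_sid value is what you would then pass as pathCallSid when creating the stream.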
To address another question you asked:
Do it completely via streams?
Here I am getting issues in getting StreamId, ConnectionId, CallId at one place.
Check out the statusCallback attribute when you create the stream. From the docs:
"The statusCallback attribute takes an absolute or relative URL as value. Whenever a stream is started or stopped, Twilio will make a request to this URL."
For example:
<Stream url="wss://..." statusCallback="http://yourapi.com..." />
The parameters sent to the statusCallback URL include the StreamSid and the CallSid, so that request is one place where both IDs arrive together.
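To keep the IDs in one place, the statusCallback handler could record them as they arrive. Below is a minimal Python sketch under a few assumptions: handle_stream_status and active_streams are hypothetical names, the in-memory dict stands in for whatever shared storage you use, and the StreamEvent parameter with the values "stream-started", "stream-stopped" and "stream-error" is how I understand the callback to report state, so confirm it against a real request.

```python
from urllib.parse import parse_qs

# In-memory map of CallSid -> StreamSid (hypothetical; a real service
# would likely use storage shared between its endpoints).
active_streams = {}


def handle_stream_status(body):
    """Update the CallSid -> StreamSid map from a statusCallback body.

    `body` is the raw form-encoded payload of the statusCallback request.
    """
    fields = {key: values[0] for key, values in parse_qs(body).items()}
    call_sid = fields.get("CallSid")
    stream_sid = fields.get("StreamSid")
    event = fields.get("StreamEvent")  # assumed parameter name

    if not call_sid or not stream_sid:
        return  # not a payload we can index

    if event == "stream-started":
        active_streams[call_sid] = stream_sid
    elif event in ("stream-stopped", "stream-error"):
        active_streams.pop(call_sid, None)


if __name__ == "__main__":
    # Made-up sample payload, for illustration only.
    handle_stream_status("CallSid=CA1&StreamSid=MZ1&StreamEvent=stream-started")
    print(active_streams)
```

Any other endpoint that knows the CallSid can then look up the matching StreamSid in the same map.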