python twilio google-speech-to-text-api

How do I retrieve audio from an outbound Twilio call with Python and send it to Google Speech-to-Text?


I'm new to Twilio, so I don't yet understand how voice call streaming works. I have one task: call a number using Twilio and send the person's speech to Google Speech-to-Text once they stop talking. I don't understand how to implement this or how to connect the pieces. I tried using Django to redirect the call to Google, but I couldn't work out how to set up a webhook. I also tried sending the audio directly to Google Speech-to-Text, but I couldn't figure out how. I'm at a dead end: how do I get the live audio from the call, and how do I send it to Google?

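from flask import Flask
from twilio.twiml.voice_response import Gather, VoiceResponse

app = Flask(__name__)

@app.route("/call", methods=['GET', 'POST'])
def gather_call():
    # Build TwiML that prompts the caller and collects speech input.
    resp = VoiceResponse()
    # After 5 seconds of silence, Twilio sends the result to /com.
    gather = Gather(input='speech', speech_timeout=5, action='/com')
    gather.say('Say something')
    resp.append(gather)
    return str(resp)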

Solution

  • Twilio developer evangelist here.

    It sounds like you need to slow down a touch and understand how Twilio voice calls work first. I recommend you go through the Programmable Voice Quickstart for Python, which teaches you how to make and receive calls, handle webhooks, and control voice calls in Python using Flask.

    Once you have done that you will have a better understanding of how Twilio, TwiML and webhooks work.

    Then, if you are looking to convert speech to text, I'll first direct you to <Gather>. <Gather> lets you take user input on a phone call. You can take that input from the dial pad or, by setting the input attribute to "speech", from the caller's voice. This actually uses Google Cloud Speech-to-Text behind the scenes.
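To make the <Gather> flow concrete: when the caller stops speaking, Twilio requests the action URL and includes the transcription in form parameters such as SpeechResult and Confidence. The sketch below parses such a request body using only the standard library. The function name parse_gather_result and the sample body are hypothetical; in a Flask app you would normally read the same fields from request.form instead.

```python
from typing import Tuple
from urllib.parse import parse_qs


def parse_gather_result(body: str) -> Tuple[str, float]:
    """Extract the transcript and confidence from a form-encoded webhook body.

    Assumes the body carries the SpeechResult and Confidence parameters that
    Twilio sends to the <Gather> action URL; missing values fall back to an
    empty transcript and a confidence of 0.0.
    """
    params = parse_qs(body)
    transcript = params.get("SpeechResult", [""])[0]
    confidence = float(params.get("Confidence", ["0.0"])[0])
    return transcript, confidence


# Hypothetical body, shaped like what Twilio would POST to the action URL.
sample_body = "CallSid=CA123&SpeechResult=hello+world&Confidence=0.92"
transcript, confidence = parse_gather_result(sample_body)
print(transcript, confidence)
```

If this is all you need, the text is already transcribed by the time your handler runs, so no separate call to Google is required.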

    If you really need to stream the voice audio directly to Google Speech-to-Text, then you can use <Stream>. This requires a WebSocket connection to stream the audio. There is an example Python application that shows you how to do real-time transcription using <Stream> and the Google Speech-to-Text API.
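As a rough illustration of what arrives over that WebSocket: Twilio sends JSON text messages, and the ones with event "media" carry a base64-encoded chunk of call audio (8 kHz mu-law) in media.payload. The sketch below only decodes those messages with the standard library. The function name extract_audio and the sample frames are hypothetical, and both the WebSocket server itself and the call to Google's streaming recognition API are left out.

```python
import base64
import json
from typing import Optional


def extract_audio(message: str) -> Optional[bytes]:
    """Return the raw audio bytes from a media message, or None otherwise.

    Assumes the message is a JSON text frame of the kind Twilio sends on a
    <Stream> WebSocket, where "media" events hold base64 audio in
    media.payload. Other events ("connected", "start", "stop") carry no audio.
    """
    data = json.loads(message)
    if data.get("event") != "media":
        return None
    return base64.b64decode(data["media"]["payload"])


# Hypothetical frames shaped like stream messages; the payload is fake audio.
fake_chunk = base64.b64encode(b"\xff\x7f\xff\x7f").decode("ascii")
frames = [
    json.dumps({"event": "start", "streamSid": "MZ123"}),
    json.dumps({"event": "media", "media": {"payload": fake_chunk}}),
    json.dumps({"event": "stop", "streamSid": "MZ123"}),
]

# Collect the decoded audio; these bytes are what you would forward to a
# streaming recognizer configured for 8 kHz mu-law audio.
audio = b"".join(chunk for chunk in map(extract_audio, frames) if chunk)
print(len(audio))
```

In a real server, each decoded chunk would be passed on as it arrives rather than collected into one buffer.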

    But get comfortable with how Twilio works first before digging into the more complicated parts.