
Speech to Text on Android


I am looking to create an app that has speech-to-text.

I am aware of this kind of ability using the RecognizerIntent: http://android-developers.blogspot.com/search/label/Speech%20Input

However, I do not want a new Intent to pop up. I want to do the analysis at certain points in my current app, and I don't want it to show anything stating that it is currently attempting to record your voice.

Does anybody have any ideas on how best to do this? I was thinking of trying Sphinx 4, but I don't know whether it would be able to run on Android. Does anyone have any advice or experience?

I was wondering if I could alter the code here so that it does not show the UI or button and just does the processing: http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/VoiceRecognition.html


Solution

  • If you don't want to use the RecognizerIntent to do speech recognition, you can still use the SpeechRecognizer class directly. However, using that class is a little trickier than using the intent. As a final note, I would strongly suggest letting users know when they are being recorded; otherwise they might be very upset when they finally find out.

    Edit: A small example adapted (with changes) from the question "SpeechRecognizer causes ANR... I need help with Android speech API":

    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
            "com.domain.app");
    
    SpeechRecognizer recognizer = SpeechRecognizer
            .createSpeechRecognizer(this.getApplicationContext());
    RecognitionListener listener = new RecognitionListener() {
        @Override
        public void onResults(Bundle results) {
            ArrayList<String> voiceResults = results
                    .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            if (voiceResults == null) {
                System.out.println("No voice results");
            } else {
                System.out.println("Printing matches: ");
                for (String match : voiceResults) {
                    System.out.println(match);
                }
            }
        }
    
        @Override
        public void onReadyForSpeech(Bundle params) {
            System.out.println("Ready for speech");
        }
    
        /**
         *  ERROR_NETWORK_TIMEOUT = 1;
         *  ERROR_NETWORK = 2;
         *  ERROR_AUDIO = 3;
         *  ERROR_SERVER = 4;
         *  ERROR_CLIENT = 5;
         *  ERROR_SPEECH_TIMEOUT = 6;
         *  ERROR_NO_MATCH = 7;
         *  ERROR_RECOGNIZER_BUSY = 8;
         *  ERROR_INSUFFICIENT_PERMISSIONS = 9;
         *
         * @param error code is defined in SpeechRecognizer
         */
        @Override
        public void onError(int error) {
            System.err.println("Error listening for speech: " + error);
        }
    
        @Override
        public void onBeginningOfSpeech() {
            System.out.println("Speech starting");
        }
    
        @Override
        public void onBufferReceived(byte[] buffer) {
            // More audio has been received; not needed for this example.
        }
    
        @Override
        public void onEndOfSpeech() {
            // The user has stopped speaking; results or an error follow.
        }
    
        @Override
        public void onEvent(int eventType, Bundle params) {
            // Reserved for future events; intentionally empty.
        }
    
        @Override
        public void onPartialResults(Bundle partialResults) {
            // Partial results are ignored here; only onResults is handled.
        }
    
        @Override
        public void onRmsChanged(float rmsdB) {
            // Input sound level changed; could drive a "recording" indicator.
        }
    };
    recognizer.setRecognitionListener(listener);
    recognizer.startListening(intent);
    

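    Important: Run this code on the UI (main) thread, and make sure the app declares the required permission in its manifest. On Android 6.0 and later, RECORD_AUDIO must also be granted by the user at runtime.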

    <uses-permission android:name="android.permission.RECORD_AUDIO" />
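
    The onError callback above prints only the raw integer code. As a minimal sketch, the codes listed in that comment could be mapped to readable names with a small helper. The class name SpeechErrorNames and the method describe are hypothetical, introduced here only for illustration; the numeric values are the ones from the comment in the listener. The helper uses plain Java with no Android imports, so the numbers are hard-coded rather than read from SpeechRecognizer.

```java
/**
 * Hypothetical helper (not part of the Android SDK) that turns the
 * integer passed to RecognitionListener.onError into a readable name.
 * The values mirror the list in the onError comment above.
 */
final class SpeechErrorNames {

    private SpeechErrorNames() {
        // Utility class; no instances.
    }

    /** Returns the constant name for a known code, or a fallback string. */
    static String describe(int code) {
        switch (code) {
            case 1: return "ERROR_NETWORK_TIMEOUT";
            case 2: return "ERROR_NETWORK";
            case 3: return "ERROR_AUDIO";
            case 4: return "ERROR_SERVER";
            case 5: return "ERROR_CLIENT";
            case 6: return "ERROR_SPEECH_TIMEOUT";
            case 7: return "ERROR_NO_MATCH";
            case 8: return "ERROR_RECOGNIZER_BUSY";
            case 9: return "ERROR_INSUFFICIENT_PERMISSIONS";
            default: return "UNKNOWN_ERROR(" + code + ")";
        }
    }
}
```

    Inside onError, the log line could then read System.err.println("Error listening for speech: " + SpeechErrorNames.describe(error)); which makes cases such as a missing permission or a busy recognizer easier to spot in the log.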