Jetpack Compose - Speech Recognition

Do you know how to apply Speech Recognition (SpeechRecognizer) in Jetpack Compose?

Something like this, but in Compose.

I followed the steps in this video:

Added these permissions in the manifest:

<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.RECORD_AUDIO"/>

Wrote this code in MainActivity:

class MainActivity : ComponentActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContent {
            PageUi()
        }
    }
}

@Composable
fun PageUi() {
    val context = LocalContext.current
    val talk by remember { mutableStateOf("Speech text should come here") }

    Column(
        modifier = Modifier.fillMaxSize(),
        horizontalAlignment = Alignment.CenterHorizontally,
        verticalArrangement = Arrangement.Center
    ) {
        Text(
            text = talk,
            style = MaterialTheme.typography.h4,
            modifier = Modifier
                .fillMaxSize(0.85f)
                .padding(16.dp)
                .background(Color.LightGray)
        )
        Button(onClick = { askSpeechInput(context) }) {
            Text(
                text = "Talk", style = MaterialTheme.typography.h3
            )
        }
    }
}

fun askSpeechInput(context: Context) {
    if (!SpeechRecognizer.isRecognitionAvailable(context)) {
        Toast.makeText(context, "Speech not available", Toast.LENGTH_SHORT).show()
    } else {
        val i = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        i.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        i.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())
        i.putExtra(RecognizerIntent.EXTRA_PROMPT, "Talk")

        //startActivityForResult(MainActivity(),i,102)
    }
}

@Preview(showBackground = true)
@Composable
fun PageShow() {
    PageUi()
}

But I have no idea how to use startActivityForResult in Compose, or how to do the rest. Also, when I test what I have so far on my phone (or an emulator), it always ends up showing the toast message!

Solution

I will describe my own implementation: first the general idea, then each step. You check the RECORD_AUDIO permission every time the user wants to speak, and request it if needed. Once the permission is granted, you start a SpeechRecognizer with a recognition intent to hear what the user says. The recognized text is stored in a variable in a ViewModel, and the composable observes that variable to get the data.
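To make the order of events concrete, the flow can be modelled as a small state machine in plain Kotlin. This is only a sketch: VoiceState, VoiceEvent, reduce and demoFlow are hypothetical names used for illustration and are not part of the implementation in the steps that follow.

```kotlin
// Hypothetical model of the flow: permission -> listening -> result.
sealed interface VoiceState {
    object Idle : VoiceState
    object AskingPermission : VoiceState
    object Listening : VoiceState
    data class Done(val text: String?) : VoiceState
}

sealed interface VoiceEvent {
    object MicClicked : VoiceEvent
    data class PermissionResult(val granted: Boolean) : VoiceEvent
    data class SpeechResult(val text: String?) : VoiceEvent
}

// Returns the next state for a given event.
fun reduce(state: VoiceState, event: VoiceEvent): VoiceState = when (event) {
    // Every click on the mic starts with a permission check
    is VoiceEvent.MicClicked -> VoiceState.AskingPermission
    // Only a granted permission lets us start listening
    is VoiceEvent.PermissionResult ->
        if (state is VoiceState.AskingPermission && event.granted) VoiceState.Listening
        else VoiceState.Idle
    // A result only counts while we are listening
    is VoiceEvent.SpeechResult ->
        if (state is VoiceState.Listening) VoiceState.Done(event.text) else state
}

// Replays the happy path: click, grant, recognized text.
fun demoFlow(): VoiceState {
    val events = listOf(
        VoiceEvent.MicClicked,
        VoiceEvent.PermissionResult(granted = true),
        VoiceEvent.SpeechResult("hello"),
    )
    return events.fold<VoiceEvent, VoiceState>(VoiceState.Idle) { s, e -> reduce(s, e) }
}
```

In the real code the "state" is spread over the permission state, a Boolean flag in the screen and the ViewModel variable, but the transitions are the same.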

1) Add this to your Manifest file:

<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    package="your.package">

    <!-- Add uses-permission -->
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />

   [...]
   [...]
   [...]

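    <!-- Add just above the closing </manifest> tag, like so.
         On Android 11 (API 30) and later, package visibility rules hide the
         recognition service unless it is declared here, which makes
         SpeechRecognizer.isRecognitionAvailable() return false. -->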
    <queries>
        <intent>
            <action android:name="android.speech.RecognitionService" />
        </intent>
    </queries>

</manifest>

2) Create a ViewModel

class ScreenViewModel : ViewModel() {

    var textFromSpeech: String? by mutableStateOf(null)

}

You need the ViewModel so that the composable can observe the variable, and so that your logic stays out of the UI code (clean architecture).

3) Implement asking for permission

In build.gradle add the following:

implementation "com.google.accompanist:accompanist-permissions:$accompanist_version"

Then create a composable like so:

@ExperimentalPermissionsApi
@Composable
fun OpenVoiceWithPermission(
    onDismiss: () -> Unit,
    vm: ScreenViewModel,
    ctxFromScreen: Context,
    finished: () -> Unit
) {

    val voicePermissionState = rememberPermissionState(android.Manifest.permission.RECORD_AUDIO)
    val ctx = LocalContext.current

    // Opens this app's page in the system settings so the user
    // can grant the permission manually
    fun newIntent(ctx: Context) {
        val intent = Intent()
        intent.action = Settings.ACTION_APPLICATION_DETAILS_SETTINGS
        val uri = Uri.fromParts(
            "package",
            BuildConfig.APPLICATION_ID, null
        )
        intent.data = uri
        intent.flags = Intent.FLAG_ACTIVITY_NEW_TASK
        ctx.startActivity(intent)
    }

    PermissionRequired(
        permissionState = voicePermissionState,
        permissionNotGrantedContent = {
            DialogCustomBox(
                onDismiss = onDismiss,
                dialogBoxState = DialogLogInState.REQUEST_VOICE,
                onRequestPermission = { voicePermissionState.launchPermissionRequest() }
            )
        },
        permissionNotAvailableContent = {
            DialogCustomBox(
                onDismiss = onDismiss,
                dialogBoxState = DialogLogInState.VOICE_OPEN_SYSTEM_SETTINGS,
                onOpenSystemSettings = { newIntent(ctx) }
            )
        }
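    ) {
        // Start listening once when this branch enters the composition,
        // instead of on every recomposition
        LaunchedEffect(Unit) {
            startSpeechToText(vm, ctxFromScreen, finished = finished)
        }
    }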
}

DialogCustomBox is my own dialog. You can create a custom one as I have done or use a standard dialog; that is up to you and outside the scope of this answer.

In the code above, once the permission is granted, execution moves on automatically to startSpeechToText(vm, ctxFromScreen, finished = finished), which you have to implement next.
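One caveat: more recent Accompanist releases no longer ship PermissionRequired and expose the result through permissionState.status instead. If the composable above does not compile against your Accompanist version, something along these lines may work. It is a rough sketch that assumes the newer status-based API and Material 2; VoicePermissionGate and its button labels are hypothetical, and a plain Button stands in for the custom dialogs.

```kotlin
import android.Manifest
import androidx.compose.material.Button
import androidx.compose.material.Text
import androidx.compose.runtime.Composable
import com.google.accompanist.permissions.ExperimentalPermissionsApi
import com.google.accompanist.permissions.isGranted
import com.google.accompanist.permissions.rememberPermissionState
import com.google.accompanist.permissions.shouldShowRationale

// Hypothetical replacement for the PermissionRequired wrapper.
@OptIn(ExperimentalPermissionsApi::class)
@Composable
fun VoicePermissionGate(onGranted: @Composable () -> Unit) {
    val permissionState = rememberPermissionState(Manifest.permission.RECORD_AUDIO)

    if (permissionState.status.isGranted) {
        // Permission is already there: show the content that starts listening
        onGranted()
    } else {
        // Explain a bit more if the user has already declined once
        val label = if (permissionState.status.shouldShowRationale) {
            "The microphone is needed for voice input. Allow it?"
        } else {
            "Allow microphone"
        }
        Button(onClick = { permissionState.launchPermissionRequest() }) {
            Text(label)
        }
    }
}
```

The onGranted slot is where the call to startSpeechToText would go.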

4) Implementing Speech Recognizer

fun startSpeechToText(vm: ScreenViewModel, ctx: Context, finished: ()-> Unit) {
    val speechRecognizer = SpeechRecognizer.createSpeechRecognizer(ctx)
    val speechRecognizerIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    speechRecognizerIntent.putExtra(
        RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM,
    )

    // Optionally I have added my mother language
    // (EXTRA_LANGUAGE expects an IETF BCP 47 tag such as "el-GR")
    speechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "el-GR")

    speechRecognizer.setRecognitionListener(object : RecognitionListener {
        override fun onReadyForSpeech(bundle: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(v: Float) {}
        override fun onBufferReceived(bytes: ByteArray?) {}
        override fun onEndOfSpeech() {
            finished()
            // changing the color of your mic icon to
            // gray to indicate it is not listening or do something you want
        }

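        override fun onError(i: Int) {
            // Recognition failed (for example no match or a timeout):
            // release the recognizer and let the caller reset its state
            speechRecognizer.destroy()
            finished()
        }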

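        override fun onResults(bundle: Bundle) {
            val result = bundle.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
            // The first element is the most likely candidate;
            // attach it to our ViewModel if there is one
            result?.firstOrNull()?.let { vm.textFromSpeech = it }
            // Release the recognizer now that we are done with it
            speechRecognizer.destroy()
        }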

        override fun onPartialResults(bundle: Bundle) {}
        override fun onEvent(i: Int, bundle: Bundle?) {}

    })
    speechRecognizer.startListening(speechRecognizerIntent)
}
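A note on the language extra: RecognizerIntent.EXTRA_LANGUAGE is documented as taking an IETF BCP 47 language tag (for example "en-US"), which is not the same as the underscore form you get from Locale.toString(). A small helper can do the conversion if you want to pass the device locale. The name languageTagFor is hypothetical; only java.util.Locale is used.

```kotlin
import java.util.Locale

// Hypothetical helper: converts a Locale to the BCP 47 tag form
// (hyphen-separated, e.g. "el-GR") rather than Locale.toString()'s "el_GR".
fun languageTagFor(locale: Locale): String = locale.toLanguageTag()

// Builds the Greek (Greece) locale used in this answer.
fun greekLocale(): Locale =
    Locale.Builder().setLanguage("el").setRegion("GR").build()
```

With that in place, the extra could be set as speechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, languageTagFor(Locale.getDefault())).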

This implementation is very customizable, and you do not get the pop-up from Google. So you can tell the user that the device is listening in your own unique way!
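If you would rather keep the stock Google dialog from the question, the Compose counterpart of startActivityForResult is rememberLauncherForActivityResult. The following is a minimal sketch assuming the androidx.activity:activity-compose dependency and Material 2; SystemSpeechButton and onResult are hypothetical names. It is separate from the SpeechRecognizer approach above.

```kotlin
import android.app.Activity
import android.content.Intent
import android.speech.RecognizerIntent
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.material.Button
import androidx.compose.material.Text
import androidx.compose.runtime.Composable

// Hypothetical composable: opens the system speech dialog and
// reports the most likely transcription through onResult.
@Composable
fun SystemSpeechButton(onResult: (String) -> Unit) {
    val launcher = rememberLauncherForActivityResult(
        ActivityResultContracts.StartActivityForResult()
    ) { result ->
        if (result.resultCode == Activity.RESULT_OK) {
            result.data
                ?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
                ?.firstOrNull()
                ?.let(onResult)
        }
    }

    Button(onClick = {
        val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
            putExtra(
                RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
            )
            putExtra(RecognizerIntent.EXTRA_PROMPT, "Talk")
        }
        // May throw ActivityNotFoundException if no app handles the intent
        launcher.launch(intent)
    }) {
        Text("Talk")
    }
}
```

In PageUi from the question, talk would then need to be a var, with SystemSpeechButton(onResult = { talk = it }) in place of the original Button.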

5) Call from your composable the function to start listening:

@ExperimentalPermissionsApi
@Composable
fun YourScreen() {

    val ctx = LocalContext.current
    val vm: ScreenViewModel = viewModel()
    var clickToShowPermission by rememberSaveable { mutableStateOf(false) }

    if (clickToShowPermission) {
        OpenVoiceWithPermission(
            onDismiss = { clickToShowPermission = false },
            vm = vm,
            ctxFromScreen = ctx
        ) {
            // Do anything you want when the voice has finished and do
            // not forget to return clickToShowPermission to false!!
            clickToShowPermission = false
        }
    }
}

So in your code, every time you set clickToShowPermission = true, you start listening to what the user says.
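To close the loop with the question, here is a sketch of a screen that triggers listening from a button and shows the recognized text. It assumes ScreenViewModel and OpenVoiceWithPermission exactly as defined in the steps above, plus Material 2; VoiceScreen and listening are hypothetical names.

```kotlin
import androidx.compose.foundation.layout.Column
import androidx.compose.material.Button
import androidx.compose.material.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.saveable.rememberSaveable
import androidx.compose.runtime.setValue
import androidx.compose.ui.platform.LocalContext
import androidx.lifecycle.viewmodel.compose.viewModel
import com.google.accompanist.permissions.ExperimentalPermissionsApi

// Hypothetical screen built on ScreenViewModel and OpenVoiceWithPermission.
@OptIn(ExperimentalPermissionsApi::class)
@Composable
fun VoiceScreen(vm: ScreenViewModel = viewModel()) {
    val ctx = LocalContext.current
    var listening by rememberSaveable { mutableStateOf(false) }

    Column {
        // Reading the ViewModel state here makes the Text recompose
        // as soon as onResults() stores a new value
        Text(text = vm.textFromSpeech ?: "Speech text should come here")
        Button(onClick = { listening = true }, enabled = !listening) {
            Text("Talk")
        }
    }

    if (listening) {
        OpenVoiceWithPermission(
            onDismiss = { listening = false },
            vm = vm,
            ctxFromScreen = ctx
        ) {
            // Called when the recognizer has finished
            listening = false
        }
    }
}
```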