ios ios-keyboard-extension

How does Google's custom iOS keyboard, Gboard, programmatically dismiss the frontmost app?


Google's custom iOS keyboard, Gboard, has an interesting feature that can't be accomplished using public APIs in the iOS SDK (as of iOS 10). I'd like to know exactly how Gboard programmatically pops back one app in the App Switching stack.

Custom iOS keyboards have two major components: the container app and the keyboard app extension. The keyboard app extension runs in a separate OS process that is started up whenever a user is in any app on their phone that requires text input.

These are the approximate steps that can be followed, using Gboard, to see the effect of programmatically returning to a previous app:

  1. A user starts the Apple Messages app on their iPhone and taps a text field to begin entering text.
  2. The Gboard keyboard extension is launched and the user sees the Gboard custom keyboard (while they are still in the Apple Messages app).
  3. The user taps the microphone key inside the Gboard keyboard extension to do voice-to-text input.
  4. Gboard uses a custom URL scheme to launch the Gboard container app. The Gboard keyboard and Apple Messages app are pushed down one layer in the App stack and the Gboard container app is now the frontmost app. The Gboard container app uses the microphone to listen to the user's speech and transcribes it into text, which it displays on the screen.
  5. The user taps the "Done" button when they are satisfied with the text input they see on the screen.
  6. This is where the magic happens… as the text input screen is dismissed, the Gboard container app is also dismissed automatically. The Gboard container app goes away and is replaced by the Apple Messages app (sometimes the Gboard keyboard extension process is still alive, sometimes it is relaunched automatically, and sometimes it has to be relaunched manually by tapping inside a text field). How does Google accomplish this?
  7. Finally, the user sees the transcribed text inserted automatically into the text input field. Presumably Google accomplishes this by sharing data between the Gboard container app and the keyboard extension.
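
The data sharing presumed in step 7 could be done with an App Group. The following is only a minimal sketch under that assumption; it is not taken from Gboard. The group identifier, the key name, and both helper functions are hypothetical, and both targets would need the same App Group entitlement (as far as I know, the keyboard would also need open access enabled to use a shared container).

```objc
@import Foundation;

// Hypothetical identifiers: placeholders, not values taken from Gboard.
static NSString *const kExampleAppGroup = @"group.com.example.keyboard";
static NSString *const kExampleTranscriptKey = @"pendingTranscript";

// Container app side: store the recognized text before returning to the host app.
static void ExampleStoreTranscript(NSString *text) {
    NSUserDefaults *shared = [[NSUserDefaults alloc] initWithSuiteName:kExampleAppGroup];
    [shared setObject:text forKey:kExampleTranscriptKey];
}

// Keyboard extension side: read the text once, then clear it so that it is
// not inserted a second time.
static NSString * _Nullable ExampleTakeTranscript(void) {
    NSUserDefaults *shared = [[NSUserDefaults alloc] initWithSuiteName:kExampleAppGroup];
    NSString *text = [shared stringForKey:kExampleTranscriptKey];
    [shared removeObjectForKey:kExampleTranscriptKey];
    return text;
}
```

Inside the keyboard's UIInputViewController subclass, the returned string could then be passed to `[self.textDocumentProxy insertText:text]` when the keyboard reappears.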

I assume that Google is using private APIs, exploring the status bar's view hierarchy with Objective-C runtime introspection and somehow synthesizing tap events or calling an exposed target/action. I've explored this a little and have found interesting UIView subclasses inside the status bar, such as UIStatusBarBreadcrumbItemView, which contains an array of UISystemNavigationActions. I'm continuing to explore these classes in the hope of finding some way to replicate the user interaction.
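
For reference, this kind of exploration can be done with the public Objective-C runtime functions. The helper below is a rough sketch with a hypothetical name (`ExampleDumpIvars`); it only lists the instance variables a class declares and says nothing about how Gboard works.

```objc
@import Foundation;
@import ObjectiveC.runtime;

// Logs the name and type encoding of every instance variable that a class
// declares itself (ivars inherited from superclasses are not included).
static void ExampleDumpIvars(NSString *className) {
    Class cls = NSClassFromString(className);
    if (cls == Nil) {
        NSLog(@"Class %@ is not loaded in this process", className);
        return;
    }
    unsigned int count = 0;
    Ivar *ivars = class_copyIvarList(cls, &count);
    for (unsigned int i = 0; i < count; i++) {
        const char *name = ivar_getName(ivars[i]);
        const char *type = ivar_getTypeEncoding(ivars[i]);
        NSLog(@"%@: %s (%s)", className, name ?: "?", type ?: "?");
    }
    free(ivars); // class_copyIvarList returns a malloc'd buffer (or NULL)
}
```

Calling `ExampleDumpIvars(@"UIStatusBarBreadcrumbItemView")` inside an app on iOS 10 should list that class's ivars; on other iOS versions the class may not exist, in which case the function just logs that it is not loaded.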

I understand that using private APIs is a good way to get your app submission rejected from the App Store; that isn't a concern I'd like addressed in the answer. I'm looking primarily for specifics on exactly how Google accomplishes the task of programmatically popping back one app in the App Switching stack in Gboard.


Solution

  • Your guess is correct: Gboard is using a private API to do it.

    … though not through exploring view hierarchy or event injection.

    When the voice-to-text action finishes, the syslog (viewable in Xcode or Console) shows that Gboard calls the -[AVAudioSession setActive:withOptions:error:] method. So I reverse-engineered the Gboard app and looked for the stack trace leading to that call.

    Climbing up the call stack we can find the -[GKBVoiceRecognitionViewController navigateBackToPreviousApp] method, and…

    [screenshot of the disassembled method]

    _systemNavigationAction? Yep, definitely private API.

    Since class_getInstanceVariable is a public API and "_systemNavigationAction" is just a string literal, the automated checker cannot detect the private API usage, and the human reviewers probably don't see anything wrong with the "jump back to the previous app" behavior. Or perhaps it is because they are Google and you are not…


    The code that performs the "jump back to previous app" action looks like this:

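    @import UIKit;
    @import ObjectiveC.runtime;
    
    // Private UIKit class; redeclared here so the compiler knows the selectors.
    @interface UISystemNavigationAction : NSObject
    @property(nonatomic, readonly, nonnull) NSArray<NSNumber*>* destinations;
    -(BOOL)sendResponseForDestination:(NSUInteger)destination;
    @end
    
    // static: a plain C99 `inline` definition emits no external symbol and can fail to link.
    static inline BOOL jumpBackToPreviousApp(void) {
        // Look up the private ivar by name, so no private symbol is referenced directly.
        Ivar sysNavIvar = class_getInstanceVariable(UIApplication.class, "_systemNavigationAction");
        UIApplication* app = UIApplication.sharedApplication;
        UISystemNavigationAction* action = object_getIvar(app, sysNavIvar);
        if (!action) {
            // No navigation action is set (presumably there is no previous app to return to).
            return NO;
        }
        NSUInteger destination = action.destinations.firstObject.unsignedIntegerValue;
        return [action sendResponseForDestination:destination];
    }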
    

    In particular, the -sendResponseForDestination: method performs the actual "go back" action.

    (Since the API is undocumented, Gboard actually calls it incorrectly: it uses the wrong signature, -(void)sendResponseForDestination:(id)destination. It just happens that every value other than 1 behaves the same way, so the Google developers got lucky this time.)