macosswiftmedia-player

Capture OSX media control buttons in Swift


I would like my app to respond to the F7, F8 and F9 keyboard media control buttons.

I am aware of this lovely library but it is not working in combination with Swift: https://github.com/nevyn/SPMediaKeyTap


Solution

  • I actually solved this problem myself just the other day. I wrote a blog post on it, as well as a Gist

    I'll embed the blog post and final code just in case the blog or Gist ever go away. Note: This is a very long post that goes into detail about how the class in constructed and what you can do to call other methods in your App's delegate. If all you want is the finished product (the MediaApplication class), head towards the bottom. It's just above the XML and the Info.plist informaton.


    For starters, to get the key events from the media keys you need to create a class that extends NSApplication. This is as simple as

    import Cocoa
    
    class MediaApplication: NSApplication {
    }
    

    Next, we need to override the sendEvent() function

    override func sendEvent(event: NSEvent) {
        if (event.type == .SystemDefined && event.subtype.rawValue == 8) {
            let keyCode = ((event.data1 & 0xFFFF0000) >> 16)
            let keyFlags = (event.data1 & 0x0000FFFF)
            // Get the key state. 0xA is KeyDown, OxB is KeyUp
            let keyState = (((keyFlags & 0xFF00) >> 8)) == 0xA
            let keyRepeat = (keyFlags & 0x1)
            mediaKeyEvent(Int32(keyCode), state: keyState, keyRepeat: Bool(keyRepeat))
        }
    
        super.sendEvent(event)
    }
    

    Now, I don't pretend to entirely understand what is going on here, but I think I have a decent idea. NSEvent objects contain several key properties: type, subtype, data1, and data2. Type and subtype are fairly self-explanatory, but data1 and data2 are extremely vague. Since the code only uses data1, that's what we'll be looking at. From what I can tell, data1 contains all of the data surrounding a key event. That means it contains the key code and any key flags. It appears that key flags contain information about the key's state (Is the key pressed down? Has the key been released?) as well as whether or not the key is being held down and repeating the signal. I'm also guessing that the key code and key flags both take up half of the data contained in data1 and the bitwise operations are separating that data out into appropriate variables. After we get the values we need, we call mediaKeyEvent() which I will get to in a moment. Regardless of what events get sent to our MediaApplication, we do want the default NSApplication to handle all events as well. To do this, we call super.sendEvent(event) at the end of the function. Now, let's take a look at mediaKeyEvent().

    func mediaKeyEvent(key: Int32, state: Bool, keyRepeat: Bool) {
        // Only send events on KeyDown. Without this check, these events will happen twice
        if (state) {
            switch(key) {
            case NX_KEYTYPE_PLAY:
                // Do work
                break
            case NX_KEYTYPE_FAST:
                // Do work
                break
            case NX_KEYTYPE_REWIND:
                // Do work
                break
            default:
                break
            }
        }
    }
    

    This is where things start to get fun. First things first, we only want to check what key is being pressed if state is true, which in this case is whenever the key is pressed down. Once we get into checking the keys, we look for NX_KEYTYPE_PLAY, NX_KEYTYPE_FAST, and NX_KEYTYPE_REWIND. If their functions aren't obvious, NX_KEYTYPE_PLAY is the play/pause key, NX_KEYTYPE_FAST is the next key, and NX_KEYTYPE_REWIND is the previous key. Right now, nothing happens when any of those keys is pressed down, so lets go over some possible logic. We'll start with a simple scenario.

    case NX_KEYTYPE_PLAY:
        print("Play")
        break
    

    With this code in place, when your application detects that the play/pause key has been pressed you will see "Play" printed out to the console. Simple, right? Let's up the ante by calling functions in your application's NSApplicationDelegate. First we will assume that your NSApplicationDelegate has a function called printMessage. We will be modifying it as we go, so pay close attention to the changes. They will be minor, but the changes will impact how you call them from mediaEventKey.

    func printMessage() {
        print("Hello World")
    }
    

    This is the simplest case. When printMessage() is called, you will see "Hello World" in your console. You can call this by calling performSelector on your NSApplicationDelegate which is accessible through the MediaApplication. performSelector takes in a Selector which is just the name of the function in your NSApplicationDelegate.

    case NX_KEYTYPE_PLAY:
        delegate!.performSelector("printMessage")
        break
    

    Now, when your application detects that the play/pause key has been pressed, you will see "Hello World" printed to the console. Let's kick things up a notch with a new version of printMessage that takes in a parameter.

    func printMessage(arg: String) {
        print(arg)
    }
    

    The idea is now that if printMessage("Hello World") is called, you will see "Hello World" in your console. We can now modify the performSelector call to handle passing in a parameter.

    case NX_KEYTYPE_PLAY:
        delegate!.performSelector("printMessage:", withObject: "Hello World")
        break
    

    There are a few things to note about this change. First, it's important to notice the : that was added to the Selector. This separates the function name from the parameter when it gets sent to the delegate. How it works isn't too important to remember, but it's something along the lines of the delegate calling printMessage:"Hello World". I'm fairly certain that is not 100% correct as it would likely use an object ID of some sort, but I haven't done any extensive digging into the specifics. Either way, the important thing to remember is to add : when passing in a parameter.. The second thing to note is that we added a withObject parameter. withObject takes an AnyObject? as a value. In this case, we just pass in a String because that's what printMessage is looking for. When your application detects that the play/pause key has been pressed, you should still see "Hello World" in the console. Let's look at one final use-case: a version of printMessage that takes in not one, but two parameters.

    func printMessage(arg: String, _ arg2: String) {
        print(arg)
    }
    

    Now, if printMessage("Hello", "World") is called, you will see "Hello World" in your console. We can now modify the performSelector call to handle passing in two parameters.

    case NX_KEYTYPE_PLAY:
        delegate!.performSelector("printMessage::", withObject: "Hello", withObject: "World")
        break
    

    As before, there are two things to notice here. First, we now add two : to the end of the Selector. Like before, this is so that the delegate can pass information along that contains the parameters. At a very basic level, it would look something like printMessage:"Hello":"World", but again I don't know what it really looks like at a deeper level. The second thing to notice is that we have added a second withObject parameter to the performSelector call. Like before, this withObject takes an AnyObject? as a value and we're passing in a String because that's what printMessage wants. When your application detects that the play/pause key has been pressed, you should still see "Hello World" in the console.

    One final thing to note is that performSelector can only accept up to two parameters. I'd really like to see Swift add concepts like splatting or varargs so that this limitation eventually goes away, but for now just avoid trying to call functions that require more than two parameters.

    This is what a very simple MediaApplication class that just prints out some text would look like once you are done with everything above:

    import Cocoa
    
    class MediaApplication: NSApplication {
        override func sendEvent(event: NSEvent) {
            if (event.type == .SystemDefined && event.subtype.rawValue == 8) {
                let keyCode = ((event.data1 & 0xFFFF0000) >> 16)
                let keyFlags = (event.data1 & 0x0000FFFF)
                // Get the key state. 0xA is KeyDown, OxB is KeyUp
                let keyState = (((keyFlags & 0xFF00) >> 8)) == 0xA
                let keyRepeat = (keyFlags & 0x1)
                mediaKeyEvent(Int32(keyCode), state: keyState, keyRepeat: Bool(keyRepeat))
            }
    
            super.sendEvent(event)
        }
    
        func mediaKeyEvent(key: Int32, state: Bool, keyRepeat: Bool) {
            // Only send events on KeyDown. Without this check, these events will happen twice
            if (state) {
                switch(key) {
                case NX_KEYTYPE_PLAY:
                    print("Play")
                    break
                case NX_KEYTYPE_FAST:
                    print("Next")
                    break
                case NX_KEYTYPE_REWIND:
                    print("Prev")
                    break
                default:
                    break
                }
            }
        }
    }
    

    Now, I should also add that, by default, your application is going to use the standard NSApplication when it runs. If you want to use the MediaApplication that this whole post is about, you'll need to go ahead and modify your application's Info.plist file. If you're in the graphical view, it will look something like this:

    Info.plist
    (source: sernprogramming.com)

    Otherwise, it will look something like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>CFBundleDevelopmentRegion</key>
      <string>en</string>
      <key>CFBundleExecutable</key>
      <string>$(EXECUTABLE_NAME)</string>
      <key>CFBundleIconFile</key>
      <string></string>
      <key>CFBundleIdentifier</key>
      <string>$(PRODUCT_BUNDLE_IDENTIFIER)</string>
      <key>CFBundleInfoDictionaryVersion</key>
      <string>6.0</string>
      <key>CFBundleName</key>
      <string>$(PRODUCT_NAME)</string>
      <key>CFBundlePackageType</key>
      <string>APPL</string>
      <key>CFBundleShortVersionString</key>
      <string>1.0</string>
      <key>CFBundleSignature</key>
      <string>????</string>
      <key>CFBundleVersion</key>
      <string>1</string>
      <key>LSApplicationCategoryType</key>
      <string>public.app-category.utilities</string>
      <key>LSMinimumSystemVersion</key>
      <string>$(MACOSX_DEPLOYMENT_TARGET)</string>
      <key>LSUIElement</key>
      <true/>
      <key>NSHumanReadableCopyright</key>
      <string>Copyright © 2015 Chris Rees. All rights reserved.</string>
      <key>NSMainNibFile</key>
      <string>MainMenu</string>
      <key>NSPrincipalClass</key>
      <string>NSApplication</string>
    </dict>
    </plist>
    

    In either case, you will want to change the NSPrincipalClass property. The new value will include you project's name, so it will be something like Notify.MediaApplication. Once you make the change, run your application and use those media keys!