I'm trying to tap the currently selected output audio device on macOS, so I basically have a pass through listener that can monitor the audio stream currently being output without affecting it.
I want to copy this data to a ring buffer in real time so I can operate on it separately.
The combination of Apple docs and (outdated?) SO answers is confusing as to whether I need to write a hacky kernel extension, can utilise CoreAudio for this, or need to interface with the HAL.
I would like to work in Swift if possible.
Many thanks
I don't know about kernel extensions - Apple's use of special "call us" signing certificates and the necessity of turning off SIP discourages casual exploration.
However, you can use a combination of CoreAudio and HAL AudioServer plugins to do what you want, and you don't even need to write the plugin yourself; there are several open source versions to choose from.

CoreAudio doesn't give you a way to record from (or "tap") output devices - you can only record from input devices. The way to get around this is to create a virtual "pass through" device (an `AudioServerPlugin`), not associated with any hardware, that copies output through to input, then set this pass-through device as the default output and record from its input. I've done this using open source AudioServer plugins like BackgroundMusic, BlackHole and, more recently, the fantastic MIT-licensed libASPL.
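Setting the pass-through device as the default output looks something like this. A minimal sketch, assuming you've already found the virtual device's `AudioObjectID` (e.g. by matching its UID); error handling is left to you:

```swift
import CoreAudio

// Make the given device the system default output device.
// `deviceID` is assumed to be the virtual pass-through device's ID.
func setDefaultOutputDevice(_ deviceID: AudioObjectID) -> OSStatus {
    var device = deviceID
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDefaultOutputDevice,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain)
    return AudioObjectSetPropertyData(
        AudioObjectID(kAudioObjectSystemObject),  // the system object owns this property
        &address,
        0, nil,                                   // no qualifier data
        UInt32(MemoryLayout<AudioObjectID>.size),
        &device)
}
```

The same `AudioObjectSetPropertyData` call with `kAudioHardwarePropertyDefaultInputDevice` works for the input side if you need it.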
To tap/record from the resulting device you can simply use the existing recording APIs, e.g. add an `AudioDeviceIOProc` callback to it, set the device as the `kAudioOutputUnitProperty_CurrentDevice` of a `kAudioUnitSubType_HALOutput` `AudioUnit`, or create an `AVCaptureDevice` if you're using `AVCaptureSession`.
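The `AudioDeviceIOProc` route can be sketched as follows, using the block-based variant. `deviceID` is assumed to be the virtual pass-through device; the ring-buffer write is left as a comment because that part must obey the real-time rules discussed below:

```swift
import CoreAudio

// Attach an IOProc to the device and start it. The block is called on
// the audio thread with the device's captured input buffers.
func startTap(on deviceID: AudioObjectID) -> AudioDeviceIOProcID? {
    var procID: AudioDeviceIOProcID?
    let status = AudioDeviceCreateIOProcIDWithBlock(&procID, deviceID, nil) {
        _, inInputData, _, _, _ in
        // Real-time context: no locks, no allocation, no Objective-C/Swift
        // runtime calls. Copy samples into a pre-allocated ring buffer here.
        let buffers = UnsafeMutableAudioBufferListPointer(
            UnsafeMutablePointer(mutating: inInputData))
        for buffer in buffers {
            _ = buffer.mData  // push buffer.mDataByteSize bytes to the ring buffer
        }
    }
    guard status == noErr, let procID else { return nil }
    AudioDeviceStart(deviceID, procID)
    return procID
}
```

Stop with `AudioDeviceStop` and `AudioDeviceDestroyIOProcID` when you're done.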
There are two problems with the above virtual pass through device approach:

1. you can no longer hear the audio, because the pass through device isn't attached to any real hardware
2. if the user changes the default output device, your tap stops capturing what they hear

If 1. is a problem, then a simple fix is to create a Multi-Output device containing the pass through device and a real output device (see screenshot) & set this as the default output device. Volume controls stop working, but you can still change the real output device's volume in Audio MIDI Setup.app.
For 2. you can add a listener to the default output device and update the multi-output device above when it changes.
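A sketch of that listener, using the block-based property listener API. `rebuildMultiOutputDevice` is a hypothetical callback you supply that recreates the multi-output device around the new default:

```swift
import CoreAudio

// Be notified whenever the user (or anything else) changes the
// system default output device.
func watchDefaultOutputDevice(onChange rebuildMultiOutputDevice: @escaping () -> Void) {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDefaultOutputDevice,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain)
    AudioObjectAddPropertyListenerBlock(
        AudioObjectID(kAudioObjectSystemObject),
        &address,
        DispatchQueue.main) { _, _ in
            // Dispatched off the audio thread, so it's safe to do
            // non-realtime work (device creation, UI, etc.) here.
            rebuildMultiOutputDevice()
        }
}
```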
You can do most of the above in Swift, although for ringbuffer-stowing from the buffer delivery callbacks you'll have to use C or some other language that can respect the realtime audio rules (no locks, no memory allocation, etc.). You could maybe try `AVAudioEngine` to do the tap, but IIRC changing the input device is a vale of tears.
Update:
The `ScreenCaptureKit` API, which appeared in macOS Monterey, adds a flag that lets you capture output audio buffers. Last I checked, you had to also capture the screen too (it's in the API name, I guess), but it's a whole lot easier than what I describe above.
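A rough sketch of the `ScreenCaptureKit` route, assuming `SCStreamConfiguration.capturesAudio` is available (it requires a newer macOS than the framework's debut); the delegate wiring is minimal and error handling is omitted:

```swift
import ScreenCaptureKit
import CoreMedia

// Receives the captured system-audio sample buffers.
final class AudioTap: NSObject, SCStreamOutput {
    func stream(_ stream: SCStream,
                didOutputSampleBuffer sampleBuffer: CMSampleBuffer,
                of type: SCStreamOutputType) {
        guard type == .audio else { return }
        // sampleBuffer holds the output audio; copy it to your ring buffer.
    }
}

let audioTap = AudioTap()  // keep a strong reference for the stream's lifetime

func startCapture() async throws -> SCStream {
    let content = try await SCShareableContent.current
    guard let display = content.displays.first else { throw CancellationError() }
    // A display-based filter is still required even if you only want audio.
    let filter = SCContentFilter(display: display, excludingWindows: [])
    let config = SCStreamConfiguration()
    config.capturesAudio = true  // the output-audio flag mentioned above
    let stream = SCStream(filter: filter, configuration: config, delegate: nil)
    try stream.addStreamOutput(audioTap, type: .audio,
                               sampleHandlerQueue: DispatchQueue(label: "audio.tap"))
    try await stream.startCapture()
    return stream
}
```

Note the user will be prompted for screen-recording permission the first time.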
Update 2:
As of macOS Sonoma 14.2, there is now `CATap`, which allows you to capture (tap) application and output device audio.
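The bare bones of the `CATap` route look roughly like this. A heavily hedged sketch: the initializer shown creates a system-wide tap, and the aggregate-device plumbing needed to actually read the tapped buffers is omitted:

```swift
import CoreAudio

// macOS 14.2+: describe a stereo mixdown tap of all output audio,
// excluding no processes.
let description = CATapDescription(stereoGlobalTapButExcludeProcesses: [])

// Create the tap object; its ID can then be referenced from an
// aggregate device, whose IOProc receives the tapped audio.
var tapID = AudioObjectID(kAudioObjectUnknown)
let status = AudioHardwareCreateProcessTap(description, &tapID)
// Destroy with AudioHardwareDestroyProcessTap(tapID) when finished.
```

There are also `CATapDescription` initializers for tapping specific processes rather than everything.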