ios, avfoundation, audiounit, audiotoolbox, auv3

How to schedule MIDI events at "sample accurate" times?


I'm trying to build a sequencer app on iOS. There's a sample on the Apple Developer website that makes an audio unit play a repeating scale, here:

https://developer.apple.com/documentation/audiotoolbox/incorporating_audio_effects_and_instruments

In the sample code, there's a file "SimplePlayEngine.swift", with a class "InstrumentPlayer" which handles sending MIDI events to the selected audio unit. It spawns a thread with a loop that iterates through the scale. It sends a MIDI Note On message by calling the audio unit's AUScheduleMIDIEventBlock, sleeps the thread for a short time, sends a Note Off, and repeats.

Here's an abridged version:

DispatchQueue.global(qos: .default).async {
    ...
    while self.isPlaying {
        // cbytes is set to MIDI Note On message
        ...
        self.audioUnit.scheduleMIDIEventBlock!(AUEventSampleTimeImmediate, 0, 3, cbytes)
        usleep(useconds_t(0.2 * 1e6))
        ...
        // cbytes is now MIDI Note Off message
        self.noteBlock(AUEventSampleTimeImmediate, 0, 3, cbytes)
        ...
    }
    ...
}

This works well enough for a demonstration, but it doesn't enforce strict timing, since the events will be scheduled whenever the thread wakes up.

How can I modify it to play the scale at a certain tempo with sample-accurate timing?

My assumption is that I need a way to make the synthesizer audio unit call a callback in my code before each render with the number of frames that are about to be rendered. Then I can schedule a MIDI event every "x" number of frames. You can add an offset, up to the size of the buffer, to the first parameter to scheduleMIDIEventBlock, so I could use that to schedule the event at exactly the right frame in a given render cycle.
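
For example, here's a rough sketch of what I mean (this would live inside InstrumentPlayer, where self.audioUnit is the instrument's AUAudioUnit; the note bytes and the 128-frame offset are just placeholders):

// Placeholder: schedule a Note On 128 frames into the current render cycle.
let noteOn: [UInt8] = [0x90, 60, 100]      // Note On, channel 0, middle C, velocity 100
let offsetFrames: AUEventSampleTime = 128  // must be less than the render buffer size
noteOn.withUnsafeBufferPointer { bytes in
    self.audioUnit.scheduleMIDIEventBlock?(AUEventSampleTimeImmediate + offsetFrames, 0, 3, bytes.baseAddress!)
}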

I tried using audioUnit.token(byAddingRenderObserver: AURenderObserver), but the callback I gave it was never called, even though the app was making sound. That method appears to be the Swift equivalent of AudioUnitAddRenderNotify, and based on this answer it sounds like what I need: https://stackoverflow.com/a/46869149/11924045. Why wouldn't it be called? Is it even possible to make this "sample accurate" using Swift, or do I need to use C for that?
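
For reference, this is roughly how I'm registering the observer (simplified from my actual code):

let _ = self.audioUnit.token(byAddingRenderObserver: { actionFlags, timestamp, frameCount, outputBusNumber in
    // Never reached, even while the app is producing sound.
    print("render observer: \(frameCount) frames")
})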

Am I on the right track? Thanks for your help!


Solution

  • You're on the right track. MIDI events can be scheduled with sample-accuracy in a render callback:

    let sampler = AVAudioUnitSampler()
    
    ...
    
    let renderCallback: AURenderCallback = {
        (inRefCon: UnsafeMutableRawPointer,
        ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
        inTimeStamp: UnsafePointer<AudioTimeStamp>,
        inBusNumber: UInt32,
        inNumberFrames: UInt32,
        ioData: UnsafeMutablePointer<AudioBufferList>?) -> OSStatus in
    
        // Schedule only during the pre-render phase, before this buffer is produced.
        if ioActionFlags.pointee.contains(.unitRenderAction_PreRender) {
            let sampler = Unmanaged<AVAudioUnitSampler>.fromOpaque(inRefCon).takeUnretainedValue()

            let bpm = 960.0
            let samples = UInt64(44100 * 60.0 / bpm)  // samples per beat, assuming a 44.1 kHz output sample rate
            let sampleTime = UInt64(inTimeStamp.pointee.mSampleTime)
            let cbytes = UnsafeMutablePointer<UInt8>.allocate(capacity: 3)
            cbytes[0] = 0x90  // Note On, channel 0
            cbytes[1] = 64    // note number
            cbytes[2] = 127   // velocity
            for i: UInt64 in 0..<UInt64(inNumberFrames) {
                if (sampleTime + i) % samples == 0 {
                    // AUEventSampleTimeImmediate plus a frame offset schedules the event
                    // at that exact frame within the current render cycle.
                    sampler.auAudioUnit.scheduleMIDIEventBlock!(AUEventSampleTimeImmediate + Int64(i), 0, 3, cbytes)
                }
            }
            cbytes.deallocate()  // the bytes are copied when the event is scheduled
        }
    
        return noErr
    }
    
    // Register for render notifications on the sampler's underlying AudioUnit,
    // passing the sampler itself through as the refCon.
    AudioUnitAddRenderNotify(sampler.audioUnit,
                             renderCallback,
                             Unmanaged.passUnretained(sampler).toOpaque()
    )
    

    That example used AURenderCallback and scheduleMIDIEventBlock. You can swap in AURenderObserver and MusicDeviceMIDIEvent, respectively, with similar sample-accurate results:

    let audioUnit = sampler.audioUnit
    
    let renderObserver: AURenderObserver = {
        (actionFlags: AudioUnitRenderActionFlags,
        timestamp: UnsafePointer<AudioTimeStamp>,
        frameCount: AUAudioFrameCount,
        outputBusNumber: Int) -> Void in
    
        // Schedule only during the pre-render phase of each cycle.
        if actionFlags.contains(.unitRenderAction_PreRender) {
            let bpm = 240.0
            let samples = UInt64(44100 * 60.0 / bpm)  // samples per beat, assuming a 44.1 kHz output sample rate
            let sampleTime = UInt64(timestamp.pointee.mSampleTime)

            for i: UInt64 in 0..<UInt64(frameCount) {
                if (sampleTime + i) % samples == 0 {
                    // 144 = 0x90, a Note On on channel 0; the last argument is the
                    // frame offset within the current buffer.
                    MusicDeviceMIDIEvent(audioUnit, 144, 64, 127, UInt32(i))
                }
            }
        }
    }
    
    // Keep the returned token if you later need to call removeRenderObserver(_:).
    let _ = sampler.auAudioUnit.token(byAddingRenderObserver: renderObserver)
    

    Note that these are just examples of how sample-accurate MIDI sequencing can be done on the fly. To make these patterns reliable, you still need to follow the usual rules of real-time rendering: avoid memory allocation, locks, file I/O, and other unbounded work on the render thread.
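
    For instance, here's a sketch of the same AURenderObserver approach adjusted to respect those rules: the beat length is computed once outside the render path, and the observer token is kept so the observer can be removed later. It assumes the same sampler, attached to a running engine at a 44.1 kHz output sample rate:

    let audioUnit = sampler.audioUnit
    let samplesPerBeat = UInt64(44100 * 60.0 / 240.0)  // 240 bpm, computed once, off the render thread

    let observer: AURenderObserver = { actionFlags, timestamp, frameCount, _ in
        guard actionFlags.contains(.unitRenderAction_PreRender) else { return }
        let sampleTime = UInt64(timestamp.pointee.mSampleTime)
        for i in 0..<UInt64(frameCount) where (sampleTime + i) % samplesPerBeat == 0 {
            // Nothing here allocates or locks; just the C call with a frame offset.
            MusicDeviceMIDIEvent(audioUnit, 144, 64, 127, UInt32(i))
        }
    }

    // Keep the token so the observer can be removed when playback stops:
    // sampler.auAudioUnit.removeRenderObserver(observerToken)
    let observerToken = sampler.auAudioUnit.token(byAddingRenderObserver: observer)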