I'm looking to create an app that takes input from the microphone, performs processing on the input audio vector, and immediately plays it back through the output. I want to eventually perform voice processing on this, so I need a low latency solution. However, my current solution calls back around every 100ms, which is too slow for my use case. I'm hoping to access the buffer for playback every 8ms instead.
My current solution is:
var audioEngine: AVAudioEngine
var inputNode: AVAudioInputNode
var playerNode: AVAudioPlayerNode
var bufferDuration: AVAudioFrameCount

init() {
    audioEngine = AVAudioEngine()
    inputNode = audioEngine.inputNode
    playerNode = AVAudioPlayerNode()
    bufferDuration = 960 // AVAudioFrameCount(352)
}
func startStreaming() -> Void {
    // Configure the session
    do {
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker])
        try audioSession.setPreferredSampleRate(96000)
        try audioSession.setPreferredIOBufferDuration(0.008)
        try audioSession.setActive(true)
        try audioSession.overrideOutputAudioPort(.speaker)
    } catch {
        print("Audio Session error: \(error)")
    }

    let fmt = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: AVAudioSession.sharedInstance().sampleRate, channels: 2, interleaved: false)

    // Set the playerNode to immediately queue/play the recorded buffer
    inputNode.installTap(onBus: 0, bufferSize: bufferDuration, format: fmt) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
        // Schedule the buffer for playback
        playerNode.scheduleBuffer(buffer, at: nil, options: [], completionHandler: nil)
    }

    // Start the engine
    do {
        audioEngine.attach(playerNode)
        audioEngine.connect(playerNode, to: audioEngine.outputNode, format: fmt /* inputNode.inputFormat(forBus: 0) */)
        try audioEngine.start()
        playerNode.play()
    } catch {
        print("Audio Engine start error: \(error)")
    }
}
I have tried options such as setting buffer.frameLength, but nothing seems to change how often the callback fires. It's unclear to me whether the framework simply doesn't allow buffers this small or whether I'm missing something. Other answers on this site argue that small buffer sizes are not required, but I do need a very low-latency solution. If this is not possible with AVFAudio, is there a potential solution in AudioKit.io or the Core Audio C API?
Try connecting the input node directly to the output node and discarding both the tap and the AVAudioPlayerNode altogether. (What's AVAudioPlayerNode good for anyway? Not time-sensitive stuff.)
This works for me:
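let engine = AVAudioEngine()

init() {
    let session = AVAudioSession.sharedInstance()
    try! session.setCategory(.playAndRecord)
    // Request an 8 ms hardware I/O buffer; this is a preference, not a guarantee
    try! session.setPreferredIOBufferDuration(0.008)
    try! session.setActive(true)
    // Route the microphone straight to the output, with no tap or player node
    engine.connect(engine.inputNode, to: engine.outputNode, format: nil)
    try! engine.start()
}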
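
As a rough check on the numbers involved, buffer durations and frame counts convert into each other through the sample rate. The sketch below is illustrative only: the helper names `frameCount(forDuration:sampleRate:)` and `duration(ofFrames:sampleRate:)` are hypothetical, and the 48 kHz rate is an assumption (the session may not grant the 96 kHz requested in the question). On a device, the granted values can be read back from `session.ioBufferDuration` and `session.sampleRate` after activating the session.

```swift
// Hypothetical helpers for converting between buffer duration and frame count.

/// Number of frames in a buffer lasting `duration` seconds at `sampleRate` Hz,
/// rounded to the nearest whole frame.
func frameCount(forDuration duration: Double, sampleRate: Double) -> Int {
    return Int((duration * sampleRate).rounded())
}

/// Duration in seconds of `frames` frames at `sampleRate` Hz.
func duration(ofFrames frames: Int, sampleRate: Double) -> Double {
    return Double(frames) / sampleRate
}

// Assumption: the hardware runs at 48 kHz.
let assumedSampleRate = 48_000.0

// Frames in the 8 ms I/O buffer requested from the session.
let requestedFrames = frameCount(forDuration: 0.008, sampleRate: assumedSampleRate)

// Duration of the 960-frame tap buffer requested in the question.
let tapDuration = duration(ofFrames: 960, sampleRate: assumedSampleRate)

print("8 ms at 48 kHz is \(requestedFrames) frames")
print("960 frames at 48 kHz last \(tapDuration * 1000) ms")
```

Under these assumptions an 8 ms buffer is 384 frames, and the question's 960-frame tap request would be 20 ms. A callback interval near 100 ms therefore suggests the tap is not using the requested size; the documentation for `installTap(onBus:bufferSize:format:block:)` does note that the implementation may choose another buffer size.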