Tags: ios, swift, avfoundation, avaudiosession, avaudioengine

Choose a specific input channel as a mono input from a USB device in AVAudioSession/AVAudioEngine


I'm working on an audio recording app which uses an external USB audio interface (e.g., a Focusrite Scarlett Solo) connected to an iPhone.

When I run AVAudioSession.sharedInstance().currentRoute.inputs it returns the interface correctly.

1 element
  - 0 : <AVAudioSessionPortDescription: 0x28307c650, type = USBAudio; name = Scarlett Solo USB; UID = AppleUSBAudioEngine:Focusrite:Scarlett Solo USB:130000:1,2; selectedDataSource = (null)>

Channels are returned correctly as well.

po AVAudioSession.sharedInstance().currentRoute.inputs.first?.channels
▿ Optional<Array<AVAudioSessionChannelDescription>>
  ▿ some : 2 elements
    - 0 : <AVAudioSessionChannelDescription: 0x283070b60, name = Scarlett Solo USB 1; label = 4294967295 (0xffffffff); number = 1; port UID = AppleUSBAudioEngine:Focusrite:Scarlett Solo USB:130000:1,2>
    - 1 : <AVAudioSessionChannelDescription: 0x283070b70, name = Scarlett Solo USB 2; label = 4294967295 (0xffffffff); number = 2; port UID = AppleUSBAudioEngine:Focusrite:Scarlett Solo USB:130000:1,2>

When I connect the inputNode to the mainMixerNode in AVAudioEngine, it uses the multi-channel input, so the Line/Instrument input plays on the right channel and the Microphone input on the left.

How can I make it so that only the 2nd channel (guitar) is used, as a mono signal played back through both speakers?

I've been looking through some docs and discussions but could not find the answer.

I tried changing channels to 1 in the audio format, but, as expected, it plays the first channel in mono; I can't select the 2nd channel to be played instead.

let input = engine.inputNode
let inputFormat = input.inputFormat(forBus: 0)

// Same format as the hardware input, but forced down to 1 channel
let preferredFormat = AVAudioFormat(
    commonFormat: inputFormat.commonFormat,
    sampleRate: inputFormat.sampleRate,
    channels: 1,
    interleaved: false
)!

engine.connect(input, to: engine.mainMixerNode, format: preferredFormat)

EDIT: As asked, I'm adding the channel mapping code I tried:

let input = engine.inputNode
let inputFormat = input.inputFormat(forBus: 0)

// Attempt: set the channel map on the mixer's AUAudioUnit
engine.mainMixerNode.auAudioUnit.channelMap = [1]
engine.connect(input, to: engine.mainMixerNode, format: inputFormat)

EDIT 2:

The answer from bugix does work up to a point, but I had issues with panning. This is how I initially fixed panning, but I then noticed it was passing all the inputs through as mono. I've tested it in an empty project in the app delegate, and I'm posting the whole code so you can just run it and see it in action.

import UIKit
import AVFoundation

@main
class AppDelegate: UIResponder, UIApplicationDelegate {
    let engine = AVAudioEngine()
    var inputNode: AVAudioInputNode { engine.inputNode }
    var inputFormat: AVAudioFormat { inputNode.inputFormat(forBus: 0) }
    var outputNode: AVAudioOutputNode { engine.outputNode }
    var outputFormat: AVAudioFormat { outputNode.outputFormat(forBus: 0) }
    var mainMixerNode: AVAudioMixerNode { engine.mainMixerNode }
    
    let mixerNode: AVAudioMixerNode = .init()


    func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
        // Note: AVAudioSession configuration (.playAndRecord) and microphone
        // permission handling are omitted for brevity.
        engine.attach(mixerNode)
        
        let layoutTag: AudioChannelLayoutTag = kAudioChannelLayoutTag_Mono
        let layout = AVAudioChannelLayout(layoutTag: layoutTag)!
        let monoFormat: AVAudioFormat = .init(standardFormatWithSampleRate: inputFormat.sampleRate, channelLayout: layout)
       
        engine.connect(inputNode, to: mixerNode, format: inputFormat)
        engine.connect(mixerNode, to: mainMixerNode, format: monoFormat)
        
        try! engine.start()
        
        mixerNode.pan = -0.4
        
        return true
    }
}

Also, it didn't work when I tested with a 10-input-channel audio interface: it was only streaming the first two signals. I tested the same interface with GarageBand and it works there. I wonder how Apple does it while not providing us with a straightforward interface to do the same. Ideally it would be possible to switch channels from AVAudioSession, but it doesn't have that feature. I've looked through all the docs of AVAudioSession, AVAudioEngine, and whatnot, but couldn't find anything there.
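
One thing that might be related to the 10-channel issue (an assumption on my part, I haven't verified it against that interface): AVAudioSession can negotiate fewer input channels than the hardware offers, and it does let you ask for more via setPreferredInputNumberOfChannels. Something along these lines, before starting the engine:

let session = AVAudioSession.sharedInstance()
try session.setCategory(.playAndRecord)
try session.setActive(true)

// Ask for as many input channels as the current route allows; the session
// may otherwise default to fewer than the hardware provides.
let maxChannels = session.maximumInputNumberOfChannels
if session.inputNumberOfChannels < maxChannels {
    try session.setPreferredInputNumberOfChannels(maxChannels)
}
print("input channels:", session.inputNumberOfChannels)

Even if that helps with the channel count, though, it still doesn't let you pick which channels, only how many.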


Solution

  • This technical note is the documentation that helped me understand many things. I'm not going to write out the details myself, as it's better if you also read it: https://developer.apple.com/library/archive/technotes/tn2091/_index.html

    I can't post pictures yet, so I'd suggest checking out the signal flow of AUHAL (on iOS it would be AURemoteIO) in the link above. In short, the I/O unit has two elements: element 1 handles input (hardware comes in on its input scope, and your code pulls from its output scope), and element 0 handles output (your code feeds its input scope, and hardware is driven from its output scope).

    Apple states:

    When you want to get the audio device's input data, the connection should be Source: AUHAL (output scope, element 1) and Destination: destination unit (input scope, input element)

    We can skip most of the steps, since AVAudioEngine does them at a higher level, but let's have a look at the Channel Mapping section.

    We need to set the channel map on engine.inputNode.audioUnit, which is the unit closest to the hardware, hence lower level than auAudioUnit and avAudioUnit. In fact, auAudioUnit may still report a channel map of [0, 1], but once we set the property on audioUnit, slots 0 and 1 can both draw from channel 1 (or, if you've got more channels, from 2, 3, or even [4, 6]) under the hood.
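
    For reference, a quick way to sanity-check what audioUnit actually holds is to read the property back with AudioUnitGetProperty. A minimal sketch, assuming you query the same scope/element (output scope, element 1) the map is set on; currentChannelMap is my own helper name:

    func currentChannelMap(of inputUnit: AudioUnit, channelCount: Int) -> [Int32]? {
        // One Int32 per hardware channel; the size is in bytes.
        var size = UInt32(MemoryLayout<Int32>.size * channelCount)
        var map = [Int32](repeating: -1, count: channelCount)
        let status = AudioUnitGetProperty(
            inputUnit,
            kAudioOutputUnitProperty_ChannelMap,
            kAudioUnitScope_Output,
            1, // element 1 = input element
            &map,
            &size
        )
        return status == noErr ? map : nil
    }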

    import AVFoundation
    import AudioToolbox

    enum MyError: Error {
        case sourceChannelCountIsGreaterThanDestChannelCount
        case invalidChannelIndexSent
        case failedToSetChannelMap
    }

    typealias AudioChannelCount = UInt32
    typealias AudioChannelsResolver = (AudioChannelCount) -> [UInt32]

    func mapChannels(
        _ channels: [UInt32],
        inputUnit: AudioUnit,
        format: AVAudioFormat,
        toChannels destinationChannelsResolver: AudioChannelsResolver
    ) throws {
        let numberOfChannels = format.channelCount
        // One Int32 slot per hardware channel; the property size is in bytes.
        let mapSize = UInt32(MemoryLayout<Int32>.size * Int(numberOfChannels))
        let channelMap = UnsafeMutablePointer<Int32>.allocate(capacity: Int(numberOfChannels))
        // -1 means "no source": any slot left at -1 stays silent.
        channelMap.initialize(repeating: -1, count: Int(numberOfChannels))
        defer { channelMap.deallocate() }

        let destinationChannels = destinationChannelsResolver(numberOfChannels)

        guard channels.count <= destinationChannels.count else {
            throw MyError.sourceChannelCountIsGreaterThanDestChannelCount
        }

        for (index, channel) in destinationChannels.enumerated() {
            guard numberOfChannels > channel else {
                throw MyError.invalidChannelIndexSent
            }
            // Route source channel channels[index] into destination slot `channel`.
            if channels.indices.contains(index) {
                channelMap[Int(channel)] = Int32(channels[index])
            } else {
                channelMap[Int(channel)] = -1
            }
        }

        // Assign the channel map to the audio unit's ChannelMap property.
        // When you want to get the audio device's input data, the connection should be:
        // AUHAL (output scope, element 1) -> destination unit (input scope, input element)
        // Input element = 1, output element = 0
        // https://developer.apple.com/library/archive/technotes/tn2091/_index.html
        let status = AudioUnitSetProperty(
            inputUnit,
            kAudioOutputUnitProperty_ChannelMap,
            kAudioUnitScope_Output,
            1,
            channelMap,
            mapSize
        )

        if status != noErr {
            throw MyError.failedToSetChannelMap
        }
    }
    
    // macOS example; the same works on iOS with a UIApplicationDelegate.
    import Cocoa
    import AVFoundation

    @main
    class AppDelegate: NSObject, NSApplicationDelegate {
        let engine = AVAudioEngine()

        func applicationDidFinishLaunching(_ notification: Notification) {
            let format = engine.inputNode.outputFormat(forBus: 0)
            let desiredFormat = AVAudioFormat(
                commonFormat: format.commonFormat,
                sampleRate: format.sampleRate,
                channels: 1,
                interleaved: format.isInterleaved
            )
            guard let inputUnit = engine.inputNode.audioUnit else { return }

            try! mapChannels(
                // The 2nd element doesn't really matter when we use `desiredFormat`,
                // which has 1 channel, so [1] would work as well. Or you could keep
                // the format stereo and both channels would still produce the same output.
                [1, 1],
                inputUnit: inputUnit,
                format: engine.inputNode.inputFormat(forBus: 0),
                toChannels: { count in
                    .init(0 ..< count)
                }
            )

            engine.connect(engine.inputNode, to: engine.mainMixerNode, format: desiredFormat)
            try! engine.start()
        }
    }
    

    Pan works in both cases: with a stereo format and a mapping like [1, 1] or [0, 0], or with a mono format.

    engine.inputNode.pan = -0.7, for example.

    And, most importantly, only the channels you map will produce sound. For example, with a [1, 1] channel map, if I move the cable from channel 1 to channel 0, I don't get any sound.
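
    If you then wanted to follow the cable to the first input, you could remap to hardware channel 0 with the same mapChannels helper from above (I haven't tested whether remapping applies while the engine is running; you may need to restart it):

    try mapChannels(
        [0, 0], // route hardware channel 0 into both output slots
        inputUnit: inputUnit,
        format: engine.inputNode.inputFormat(forBus: 0),
        toChannels: { count in .init(0 ..< count) }
    )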

    macOS Note

    Make sure you enable Audio Input in the App Sandbox capability (com.apple.security.device.audio-input), alongside a microphone usage description (NSMicrophoneUsageDescription) in Info.plist, for the engine to produce sound.