I'm writing a program that records from my speaker output using pyaudio. I am on a Raspberry Pi. I built the program while using the audio jack to play audio through some speakers, but have recently switched to the speakers in my monitor, through HDMI. Suddenly, the program records silence.
from pyaudio import PyAudio
p = PyAudio()
print(p.get_default_input_device_info()['index'], '\n')
print(*[p.get_device_info_by_index(i) for i in range(p.get_device_count())], sep='\n\n')
The above code first prints the index of pyaudio's default input device, then the available devices. See the results below.
Case A:
2
{'index': 0, 'structVersion': 2, 'name': 'bcm2835 Headphones: - (hw:2,0)', 'hostApi': 0, 'maxInputChannels': 0, 'maxOutputChannels': 8, 'defaultLowInputLatency': -1.0, 'defaultLowOutputLatency': 0.0016099773242630386, 'defaultHighInputLatency': -1.0, 'defaultHighOutputLatency': 0.034829931972789115, 'defaultSampleRate': 44100.0}
{'index': 1, 'structVersion': 2, 'name': 'pulse', 'hostApi': 0, 'maxInputChannels': 32, 'maxOutputChannels': 32, 'defaultLowInputLatency': 0.008684807256235827, 'defaultLowOutputLatency': 0.008684807256235827, 'defaultHighInputLatency': 0.034807256235827665, 'defaultHighOutputLatency': 0.034807256235827665, 'defaultSampleRate': 44100.0}
{'index': 2, 'structVersion': 2, 'name': 'default', 'hostApi': 0, 'maxInputChannels': 32, 'maxOutputChannels': 32, 'defaultLowInputLatency': 0.008684807256235827, 'defaultLowOutputLatency': 0.008684807256235827, 'defaultHighInputLatency': 0.034807256235827665, 'defaultHighOutputLatency': 0.034807256235827665, 'defaultSampleRate': 44100.0}
If I then go into the terminal, run sudo raspi-config, and change the audio output to the headphone jack, I get an actual recording, not silence, and the code above gives a different output.
Case B:
5
{'index': 0, 'structVersion': 2, 'name': 'vc4-hdmi-0: MAI PCM i2s-hifi-0 (hw:0,0)', 'hostApi': 0, 'maxInputChannels': 0, 'maxOutputChannels': 2, 'defaultLowInputLatency': -1.0, 'defaultLowOutputLatency': 0.005804988662131519, 'defaultHighInputLatency': -1.0, 'defaultHighOutputLatency': 0.034829931972789115, 'defaultSampleRate': 44100.0}
{'index': 1, 'structVersion': 2, 'name': 'bcm2835 Headphones: - (hw:2,0)', 'hostApi': 0, 'maxInputChannels': 0, 'maxOutputChannels': 8, 'defaultLowInputLatency': -1.0, 'defaultLowOutputLatency': 0.0016099773242630386, 'defaultHighInputLatency': -1.0, 'defaultHighOutputLatency': 0.034829931972789115, 'defaultSampleRate': 44100.0}
{'index': 2, 'structVersion': 2, 'name': 'sysdefault', 'hostApi': 0, 'maxInputChannels': 0, 'maxOutputChannels': 128, 'defaultLowInputLatency': -1.0, 'defaultLowOutputLatency': 0.005804988662131519, 'defaultHighInputLatency': -1.0, 'defaultHighOutputLatency': 0.034829931972789115, 'defaultSampleRate': 44100.0}
{'index': 3, 'structVersion': 2, 'name': 'hdmi', 'hostApi': 0, 'maxInputChannels': 0, 'maxOutputChannels': 2, 'defaultLowInputLatency': -1.0, 'defaultLowOutputLatency': 0.005804988662131519, 'defaultHighInputLatency': -1.0, 'defaultHighOutputLatency': 0.034829931972789115, 'defaultSampleRate': 44100.0}
{'index': 4, 'structVersion': 2, 'name': 'pulse', 'hostApi': 0, 'maxInputChannels': 32, 'maxOutputChannels': 32, 'defaultLowInputLatency': 0.008684807256235827, 'defaultLowOutputLatency': 0.008684807256235827, 'defaultHighInputLatency': 0.034807256235827665, 'defaultHighOutputLatency': 0.034807256235827665, 'defaultSampleRate': 44100.0}
{'index': 5, 'structVersion': 2, 'name': 'default', 'hostApi': 0, 'maxInputChannels': 32, 'maxOutputChannels': 32, 'defaultLowInputLatency': 0.008684807256235827, 'defaultLowOutputLatency': 0.008684807256235827, 'defaultHighInputLatency': 0.034807256235827665, 'defaultHighOutputLatency': 0.034807256235827665, 'defaultSampleRate': 44100.0}
You can see in case B that I now have access to many more devices. I've attempted recording from all three devices listed in case A, not just the default, and both #0 and #1 fail as well: #1 also records silence, and #0 raises OSError: [Errno -9998] Invalid number of channels. If you look closely at case A, you'll see that #0 has ['maxInputChannels'] = 0, which explains that error.
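The maxInputChannels check can be sketched as a small filter. This is only an illustration: the helper name input_capable is made up here, and the sample list is a trimmed copy of the case A entries rather than live output from pyaudio.

```python
from typing import Any, Dict, List


def input_capable(devices: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only devices that expose at least one input channel."""
    return [d for d in devices if d.get('maxInputChannels', 0) > 0]


# Trimmed copies of the case A entries (only the keys needed here).
case_a = [
    {'index': 0, 'name': 'bcm2835 Headphones: - (hw:2,0)', 'maxInputChannels': 0},
    {'index': 1, 'name': 'pulse', 'maxInputChannels': 32},
    {'index': 2, 'name': 'default', 'maxInputChannels': 32},
]

usable = input_capable(case_a)
print([d['index'] for d in usable])  # [1, 2]
```

With real hardware, the same filter could be fed the dicts returned by p.get_device_info_by_index(i). It only rules out devices that cannot be opened for input at all; it says nothing about whether the remaining ones carry the speaker audio.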
I've attempted to create loopback devices that read from the sound output and expose it as another input. I would then record from that input, as it would have input channels. I've looked through this thread here, but the only solution given is for Windows.
I have also attempted to create a loopback device using the pulseaudio utility pactl. This link here demonstrates what I have tried. Upon successfully creating a loopback, I'm unable to plug into it using pyaudio; it doesn't show up in the list of devices.
Does anybody know...
- ... pulseaudio loopback using pyaudio?
- ... pyaudio to solve my problem?

Thanks very much.
This problem took a while. It turns out pyaudio is pretty useless for recording system audio, so I switched to pasimple, which has all of the benefits of pyaudio and, gasp, actually works. By benefits, I mean it is A) simple and B) free of other Python dependencies (it does require pulseaudio on the system).
Below you will find my Recorder object. Keep in mind that I am on a Raspberry Pi, so my means of finding the correct output device to listen in on may not work on other systems.
pasimple works super well. Check out the documentation here. The tlength argument is worth looking into.
import json
import subprocess
import wave
from threading import Thread, Event

import pasimple as pa


class Recorder(Thread):
    def __init__(self) -> None:
        super().__init__()
        default_sink = subprocess.check_output('pactl get-default-sink', shell = True)
        self.device = '{}.monitor'.format(default_sink.decode().rstrip())
        devices = json.loads(subprocess.check_output('pactl --format="json" list sinks', shell = True))
        device = [device for device in devices if device['monitor_source'] == self.device][0]
        specs = device['sample_specification'].split()
        self.audio = {}
        self.audio['format'] = getattr(pa, 'PA_SAMPLE_{}'.format(specs[0].upper()))
        self.audio['channels'] = int(specs[1][:-2])
        self.audio['rate'] = int(specs[2][:-2])
        self.audio['sample-width'] = pa.format2width(self.audio['format'])
        self.is_recording = Event()
        self.kill = Event()

    def _get_sample_length(self, seconds: int) -> int:
        return self.audio['channels'] * self.audio['sample-width'] * self.audio['rate'] * seconds

    def _read_audio_data(self, seconds: int) -> bytes:
        return self.stream.read(self._get_sample_length(seconds))

    def record_to_file(self, file: str, seconds: int) -> None:
        data = self._read_audio_data(seconds)
        with wave.open(file, 'wb') as f:
            f.setnchannels(self.audio['channels'])
            f.setsampwidth(self.audio['sample-width'])
            f.setframerate(self.audio['rate'])
            f.writeframes(data)

    def run(self) -> None:
        self.stream = pa.PaSimple(
            direction = pa.PA_STREAM_RECORD,
            format = self.audio['format'],
            channels = self.audio['channels'],
            rate = self.audio['rate'],
            device_name = self.device,
            stream_name = 'thingamajiggy'
        )
        self.is_recording.set() # change state upon stream initialisation
        self.kill.wait() # await program end
        self.stream.flush() # release resources
        self.stream.close()


if __name__ == "__main__":
    recorder = Recorder()
    recorder.start()
    recorder.is_recording.wait() # wait for stream to be established
    recorder.record_to_file('example.wav', 10)
    recorder.kill.set() # kill thread, free resources
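For anyone wondering what the string slicing in __init__ and the arithmetic in _get_sample_length are doing, here is a rough standalone sketch that runs without pasimple or pulseaudio. The example string 's16le 2ch 44100Hz' is an assumption about what pactl typically reports in sample_specification, and parse_sample_spec, bytes_for and the SAMPLE_WIDTHS table are names invented for this illustration (the Recorder above gets the width from pa.format2width instead).

```python
from typing import Any, Dict

# Assumed byte widths for a few common PulseAudio sample formats
# (hypothetical stand-in for pa.format2width in the Recorder above).
SAMPLE_WIDTHS: Dict[str, int] = {
    'u8': 1,
    's16le': 2,
    's24le': 3,
    's32le': 4,
    'float32le': 4,
}


def parse_sample_spec(spec: str) -> Dict[str, Any]:
    """Split a spec such as 's16le 2ch 44100Hz' into its three parts."""
    fmt, channels, rate = spec.split()
    return {
        'format': fmt,
        'channels': int(channels[:-2]),  # drop the trailing 'ch'
        'rate': int(rate[:-2]),          # drop the trailing 'Hz'
        'sample-width': SAMPLE_WIDTHS[fmt],
    }


def bytes_for(audio: Dict[str, Any], seconds: int) -> int:
    """Bytes needed for `seconds` of audio: channels * width * rate * seconds."""
    return audio['channels'] * audio['sample-width'] * audio['rate'] * seconds


audio = parse_sample_spec('s16le 2ch 44100Hz')
print(audio['channels'], audio['rate'], audio['sample-width'])  # 2 44100 2
print(bytes_for(audio, 10))  # 2 * 2 * 44100 * 10 = 1764000
```

So a 10 second recording at that spec means reading about 1.76 MB from the stream in one call, which is what record_to_file does.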