I am trying to use pyAudioAnalysis to analyse an audio stream in real-time from a HTTP stream. My goal is to use the Zero Crossing Rate (ZCR) and other methods in this library to identify events in the stream.
pyAudioAnalysis only supports input from a file but converting a http stream to a .wav will create a large overhead and temporary file management I would like to avoid.
My method is as follows:
Using ffmpeg I was able to get the raw audio bytes into a subprocess pipe.
try:
song = subprocess.Popen(["ffmpeg", "-i", "https://media-url/example", "-acodec", "pcm_s16le", "-ac", "1", "-f", "wav", "pipe:1"],
stdout=subprocess.PIPE)
I then buffered this data using pyAudio with the hope of being able to use the bytes in pyAudioAnalysis
CHUNK = 65536
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
channels=1,
rate=44100,
output=True)
data = song.stdout.read(CHUNK)
while len(data) > 0:
stream.write(data)
data = song.stdout.read(CHUNK)
However, inputting this data output into AudioBasicIO.read_audio_generic() produces an empty numpy array.
Is there a valid solution to this problem without temporary file creation?
You can try my ffmpegio
package:
pip install ffmpegio
import ffmpegio
# read entire stream
fs, x = ffmpegio.audio.read("https://media-url/example", ac=1, sample_fmt='s16')
# fs - sampling rate
# x - [nx1] numpy array
# or read a block at a time:
with ffmpegio.open(["https://media-url/example", "ra", blocksize=1024, ac=1, sample_fmt='s16') as f:
fs = f.rate
for x in f:
# x: [1024x1] numpy array (or shorter for the last block)
process_data(x)
Note that if you need normalized samples, you can set sample_fmt
to 'flt'
'dbl'
.
If you prefer to keep dependency low, the key in calling ffmpeg subprocess is to use raw output format:
import subprocess as sp
import numpy as np
song = sp.Popen(["ffmpeg", "-i", "https://media-url/example", "-f", "s16le","-c:a", "pcm_s16le", "-ac", "1", "pipe:1"], stdout=sp.PIPE)
CHUNK = 65536
n = CHUNK/2 # 2 bytes/sample
data = np.frombuffer(song.stdout.read(CHUNK),np.int16)
while len(data) > 0:
data = np.frombuffer(song.stdout.read(CHUNK),np.int16)
I cannot speak of pyAudioAnalysis
but I suspect it expects samples and not bytes.