python audio ffmpeg pyaudio pyaudioanalysis

Can pyAudioAnalysis be used on a live HTTP audio stream?

I am trying to use pyAudioAnalysis to analyse an audio stream from an HTTP source in real time. My goal is to use the Zero Crossing Rate (ZCR) and other methods in this library to identify events in the stream.

pyAudioAnalysis only supports input from a file, but converting an HTTP stream to a .wav file would add significant overhead and temporary-file management that I would like to avoid.

My method is as follows:

Using ffmpeg I was able to get the raw audio bytes into a subprocess pipe.

import subprocess

song = subprocess.Popen(["ffmpeg", "-i", "https://media-url/example", "-acodec", "pcm_s16le", "-ac", "1", "-f", "wav", "pipe:1"],
                        stdout=subprocess.PIPE)

I then buffered this data using PyAudio, hoping to be able to pass the bytes to pyAudioAnalysis:

import pyaudio

CHUNK = 65536  # bytes read from the ffmpeg pipe per iteration

p = pyaudio.PyAudio()

stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                output=True)

data = song.stdout.read(CHUNK)

while len(data) > 0:
    stream.write(data)
    data = song.stdout.read(CHUNK)

However, passing this data to AudioBasicIO.read_audio_generic() produces an empty NumPy array.

Is there a valid solution to this problem without temporary file creation?

Solution

You can try my ffmpegio package:

pip install ffmpegio


import ffmpegio

# read entire stream
fs, x = ffmpegio.audio.read("https://media-url/example", ac=1, sample_fmt='s16')
# fs - sampling rate
# x - [nx1] numpy array

# or read a block at a time:
with ffmpegio.open("https://media-url/example", "ra", blocksize=1024, ac=1, sample_fmt='s16') as f:
    fs = f.rate
    for x in f:
        # x: [1024x1] numpy array (or shorter for the last block)
        process_data(x)

Note that if you need normalized samples, you can set sample_fmt to 'flt' or 'dbl'.
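For reference, here is a rough sketch of what "normalized" means if you already have 's16' samples in hand. The helper name normalize_s16 is made up for illustration and is not part of ffmpegio; it assumes the usual convention of scaling 16-bit samples by 32768 so that values fall in [-1.0, 1.0).

```python
import numpy as np

def normalize_s16(x: np.ndarray) -> np.ndarray:
    """Scale int16 PCM samples to float32 values in [-1.0, 1.0).

    Hypothetical helper: assumes full-scale int16 maps to +/-1.0 via 1/32768.
    """
    return x.astype(np.float32) / 32768.0

# Example: full-scale negative, silence, and half-scale positive samples.
block = np.array([-32768, 0, 16384], dtype=np.int16)
normalized = normalize_s16(block)
```

Requesting 'flt' or 'dbl' from ffmpeg directly avoids this extra step.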

If you prefer to keep dependencies low, the key when calling ffmpeg as a subprocess is to use a raw output format (here "-f s16le") rather than a container such as wav:


import subprocess as sp
import numpy as np

song = sp.Popen(["ffmpeg", "-i", "https://media-url/example", "-f", "s16le","-c:a", "pcm_s16le", "-ac", "1", "pipe:1"], stdout=sp.PIPE)

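CHUNK = 65536   # bytes per read
n = CHUNK // 2  # samples per chunk (2 bytes/sample)

# Interpret each chunk of raw bytes as int16 samples
data = np.frombuffer(song.stdout.read(CHUNK), np.int16)
while len(data) > 0:
    process_data(data)  # placeholder for your own analysis, as in the ffmpegio example
    data = np.frombuffer(song.stdout.read(CHUNK), np.int16)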

I cannot speak for pyAudioAnalysis, but I suspect it expects samples and not bytes.
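To illustrate the bytes-versus-samples point, here is a minimal sketch that turns a raw s16le chunk into samples and computes a zero crossing rate with plain NumPy. Both function names are invented for this example, and the ZCR here is a simple textbook definition (fraction of adjacent sample pairs whose sign differs), not pyAudioAnalysis's own implementation, which may normalize differently.

```python
import numpy as np

def bytes_to_samples(raw: bytes) -> np.ndarray:
    """Interpret raw little-endian 16-bit PCM bytes as int16 samples."""
    usable = len(raw) - (len(raw) % 2)  # ignore a trailing odd byte, if any
    return np.frombuffer(raw[:usable], dtype="<i2")

def zero_crossing_rate(frame: np.ndarray) -> float:
    """Fraction of adjacent sample pairs where the sign changes."""
    if len(frame) < 2:
        return 0.0
    negative = frame < 0  # True where the sample is below zero
    crossings = np.count_nonzero(negative[1:] != negative[:-1])
    return crossings / (len(frame) - 1)

# Example: a chunk that alternates sign on every sample.
raw_chunk = np.array([1000, -1000, 1000, -1000], dtype="<i2").tobytes()
samples = bytes_to_samples(raw_chunk)
zcr = zero_crossing_rate(samples)
```

In the loop above, process_data could call zero_crossing_rate on each chunk, or slice the chunk into shorter frames first.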