python audio ffmpeg pyaudio pyaudioanalysis

Can pyAudioAnalysis be used on a live http audio stream?


I am trying to use pyAudioAnalysis to analyse an audio stream in real time from an HTTP stream. My goal is to use the Zero Crossing Rate (ZCR) and other methods in this library to identify events in the stream.

pyAudioAnalysis only supports input from a file, but converting an HTTP stream to a .wav file would add significant overhead and temporary-file management that I would like to avoid.

My method is as follows:

Using ffmpeg, I was able to get the raw audio bytes into a subprocess pipe:

import subprocess

song = subprocess.Popen(["ffmpeg", "-i", "https://media-url/example", "-acodec", "pcm_s16le", "-ac", "1", "-f", "wav", "pipe:1"],
                        stdout=subprocess.PIPE)

I then buffered this data using PyAudio, hoping to be able to pass the bytes to pyAudioAnalysis:

import pyaudio

CHUNK = 65536  # bytes per read

p = pyaudio.PyAudio()

stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                output=True)

data = song.stdout.read(CHUNK)

while len(data) > 0:
    stream.write(data)
    data = song.stdout.read(CHUNK)

However, passing this data to AudioBasicIO.read_audio_generic() produces an empty NumPy array.

Is there a valid solution to this problem without temporary file creation?


Solution

  • You can try my ffmpegio package:

    pip install ffmpegio
    
    
    import ffmpegio
    
    # read entire stream
    fs, x = ffmpegio.audio.read("https://media-url/example", ac=1, sample_fmt='s16')
    # fs - sampling rate
    # x - [nx1] numpy array
    
    # or read a block at a time:
    with ffmpegio.open("https://media-url/example", "ra", blocksize=1024, ac=1, sample_fmt='s16') as f:
        fs = f.rate
        for x in f:
            # x: [1024x1] numpy array (or shorter for the last block)
            process_data(x)
    

    Note that if you need normalized samples, you can set sample_fmt to 'flt' or 'dbl'.

    If you prefer to keep dependencies to a minimum, the key when calling ffmpeg as a subprocess is to use a raw output format, so that the pipe carries only sample data and no WAV header:

    
    import subprocess as sp
    import numpy as np
    
    # "-f s16le" emits headerless 16-bit little-endian PCM
    song = sp.Popen(["ffmpeg", "-i", "https://media-url/example", "-f", "s16le", "-c:a", "pcm_s16le", "-ac", "1", "pipe:1"], stdout=sp.PIPE)
    
    CHUNK = 65536   # bytes per read
    n = CHUNK // 2  # samples per read (2 bytes/sample)
    
    raw = song.stdout.read(CHUNK)
    while len(raw) > 0:
        data = np.frombuffer(raw, np.int16)  # 1-D array of up to n samples
        # process data here
        raw = song.stdout.read(CHUNK)
    

    I cannot speak for pyAudioAnalysis, but I suspect it expects samples and not bytes.
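
To illustrate the bytes-versus-samples point, here is a minimal NumPy-only sketch that turns a chunk of raw s16le bytes into normalized samples and computes a zero crossing rate per frame. The helper names (pcm16_to_float, zero_crossing_rate, frame_zcr) are hypothetical and not part of pyAudioAnalysis or ffmpegio, and the ZCR formula is one common definition that may differ slightly from the one pyAudioAnalysis uses. A synthetic chunk stands in for the data read from the ffmpeg pipe.

```python
import numpy as np


def pcm16_to_float(raw: bytes) -> np.ndarray:
    """Interpret headerless s16le bytes as samples scaled to [-1.0, 1.0)."""
    usable = len(raw) - (len(raw) % 2)  # ignore a trailing odd byte, if any
    samples = np.frombuffer(raw[:usable], dtype="<i2")
    return samples.astype(np.float32) / 32768.0


def zero_crossing_rate(frame: np.ndarray) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    signs = np.signbit(frame)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    return float(crossings) / (len(frame) - 1)


def frame_zcr(samples: np.ndarray, frame_len: int) -> list:
    """ZCR of each full, non-overlapping frame of frame_len samples."""
    n_frames = len(samples) // frame_len
    return [
        zero_crossing_rate(samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


# Synthetic stand-in for one chunk read from the pipe:
# 8 samples that flip sign every sample, followed by 8 samples of silence.
demo_chunk = np.array([1000, -1000] * 4 + [0] * 8, dtype="<i2").tobytes()
rates = frame_zcr(pcm16_to_float(demo_chunk), frame_len=8)
print(rates)
```

In the read loop above, each `raw` chunk could be passed through pcm16_to_float in the same way. Frames here do not overlap and do not span chunk boundaries; if that matters, the samples would need to be accumulated in a buffer before framing.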