python  signal-processing  pitch-tracking

Harmonic product spectrum for single guitar note Python


I am trying to detect the pitch of a B3 note played with a guitar. The audio can be found here.

This is the spectrogram: [spectrogram of the recording]

As you can see, the fundamental is at about 250 Hz, which corresponds to the B3 note.

It also contains a good amount of harmonics, which is why I chose to use the harmonic product spectrum (HPS) from here. This is the code I am using to detect the pitch:

from numpy import mean, log, copy, arange, argmax, kaiser
from numpy.fft import rfft
from scipy.signal import decimate
from parabolic import parabolic  # helper module shipped with the same repo

def freq_from_hps(signal, fs):
    """Estimate frequency using harmonic product spectrum
    Low frequency noise piles up and overwhelms the desired peaks
    """
    N = len(signal)
    signal -= mean(signal)  # Remove DC offset

    # Compute Fourier transform of windowed signal
    windowed = signal * kaiser(N, 100)

    # Get spectrum
    X = log(abs(rfft(windowed)))

    # Downsample sum logs of spectra instead of multiplying
    hps = copy(X)
    for h in arange(2, 9): # TODO: choose a smarter upper limit
        dec = decimate(X, h)
        hps[:len(dec)] += dec

    # Find the peak and interpolate to get a more accurate peak
    i_peak = argmax(hps[:len(dec)])
    i_interp = parabolic(hps, i_peak)[0]

    # Convert to equivalent frequency
    return fs * i_interp / N  # Hz

My sampling rate is 40000 Hz. However, instead of getting a result close to 250 Hz (the B3 note), I am getting 0.66 Hz. How is this possible?
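For reference, this is roughly how I call the function (the loading code here is a sketch; the wav file is the recording linked above):

from scipy.io import wavfile

# Load the recording and convert to float so the in-place
# DC-offset subtraction in freq_from_hps works
fs, audio = wavfile.read('Vocaroo_s1KZzNZLtg3c.wav')
audio = audio.astype(float)

print(freq_from_hps(audio, fs))  # this prints about 0.66 for me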

I also tried the autocorrelation method from the same repo, but it also gives bad results, such as 10000 Hz.

Thanks to an answer, I understand that I have to apply a high-pass filter to remove the low frequencies from the signal. How do I do that? Are there multiple methods for doing it, and which one is recommended?
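For example, would a zero-phase Butterworth high-pass from scipy.signal be a reasonable choice? This is only a sketch; the cutoff and order are guesses on my part:

from scipy import signal

def highpass(audio, fs, cutoff_hz=65.0, order=4):
    """Zero-phase Butterworth high-pass (sketch; cutoff and order are guesses)."""
    sos = signal.butter(order, cutoff_hz, btype='highpass', fs=fs, output='sos')
    return signal.sosfiltfilt(sos, audio)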

STATUS UPDATE:

The high-pass filter proposed in the answer is working. If I apply the function from the answer to my audio signal, it correctly reports about 245 Hz. However, I would like to filter the whole signal, not only a part of it. A note could lie in the middle of the signal, or a signal could contain more than one note (I know one solution is onset detection, but I am curious to know why this isn't working). That is why I edited the code to return filtered_audio.

The problem is that if I do that, even though the noise has been correctly removed (see the screenshot below), I get 0.05 Hz as a result.

[spectrogram of the filtered signal]


Solution

  • Based on the distances between the harmonics in the spectrogram, I would estimate the pitch to be about 150-200 Hz. So, why doesn't the pitch detection algorithm detect the pitch that we can see by eye in the spectrogram? I have a few guesses:

    The note only lasts for a few seconds. At the beginning, there is a beautiful harmonic stack with 10 or more harmonics! These quickly fade away and are not even visible after 5 seconds. If you are trying to estimate the pitch of the entire signal, your estimate might be contaminated by the "pitch" of the sound from 5-12 seconds. Try computing the pitch only for the first 1-2 seconds.

    There is too much low frequency noise. In the spectrogram, you can see a lot of power between 0 and 64 Hz. This is not part of the harmonics, so you could try removing it with a high-pass filter.

    Here is some code that does the job:

    import numpy as np
    from scipy.io import wavfile
    from scipy import signal
    import matplotlib.pyplot as plt
    
    from frequency_estimator import freq_from_hps
    # downloaded from https://github.com/endolith/waveform-analyzer/
    
    filename = 'Vocaroo_s1KZzNZLtg3c.wav'
    # downloaded from http://vocaroo.com/i/s1KZzNZLtg3c
    
    # Parameters
    time_start = 0  # seconds
    time_end = 1  # seconds
    filter_stop_freq = 70  # Hz
    filter_pass_freq = 100  # Hz
    filter_order = 1001
    
    # Load data
    fs, audio = wavfile.read(filename)
    audio = audio.astype(float)
    
    # High-pass filter
    nyquist_rate = fs / 2.
    desired = (0, 0, 1, 1)
    bands = (0, filter_stop_freq, filter_pass_freq, nyquist_rate)
    filter_coefs = signal.firls(filter_order, bands, desired, fs=fs)
    
    # Examine our high pass filter
    w, h = signal.freqz(filter_coefs)
    f = w / 2 / np.pi * fs  # convert radians/sample to cycles/second
    plt.plot(f, 20 * np.log10(abs(h)), 'b')
    plt.ylabel('Amplitude [dB]', color='b')
    plt.xlabel('Frequency [Hz]')
    plt.xlim((0, 300))
    
    # Apply high-pass filter
    filtered_audio = signal.filtfilt(filter_coefs, [1], audio)
    
    # Only analyze the audio between time_start and time_end
    time_seconds = np.arange(filtered_audio.size, dtype=float) / fs
    audio_to_analyze = filtered_audio[(time_seconds >= time_start) &
                                      (time_seconds <= time_end)]
    
    fundamental_frequency = freq_from_hps(audio_to_analyze, fs)
    print('Fundamental frequency is {} Hz'.format(fundamental_frequency))
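    If you want to double-check the filtering visually, you can continue the script above and plot a spectrogram of filtered_audio (the segment length below is an arbitrary choice, not the one used for the screenshots in the question):

    # Spectrogram of the filtered audio, for visual comparison with the
    # screenshots in the question (nperseg is an arbitrary choice)
    f_spec, t_spec, Sxx = signal.spectrogram(filtered_audio, fs=fs, nperseg=4096)
    plt.figure()
    plt.pcolormesh(t_spec, f_spec, 10 * np.log10(Sxx + 1e-12))
    plt.ylim(0, 1000)
    plt.xlabel('Time [s]')
    plt.ylabel('Frequency [Hz]')
    plt.show()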