I am trying to generate random signals with the following two properties:
The values should be approximately log-normally distributed (any long-tailed distribution bounded form below with non-zero mode would do).
The power spectral density (PSD) of the signal should have low power in the lower frequencies and increase in power with increasing frequency.
Generating samples from a log-normal distribution is straightforward: sample from a normal distribution and apply the exponential function 1.
Python code:
import numpy as np
import matplotlib.pyplot as plt
size = 100_000
samples = np.exp(np.random.randn(size))
plt.hist(samples, bins=np.linspace(0, 10, 101), histtype="step", density=True)
plt.show()
Generating signals with normally distributed values and a defined power spectra is covered well in this SO answer.
Combining both approaches works well for white noise and pink noise. However, when applying the exponential to violet or blue noise, the PSD shape is not preserved. Instead, the PSDs closely resemble the PSD of white noise.
Any insights or suggestions for alternative approaches would be much appreciated.
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import welch
# from https://stackoverflow.com/a/67127726/2912349
def noise_psd(N, psd = lambda f: 1):
X_white = np.fft.rfft(np.random.randn(N));
S = psd(np.fft.rfftfreq(N))
# Normalize S
S = S / np.sqrt(np.mean(S**2))
X_shaped = X_white * S;
return np.fft.irfft(X_shaped);
def PSDGenerator(f):
return lambda N: noise_psd(N, f)
@PSDGenerator
def get_white_noise(f):
return 1;
@PSDGenerator
def get_blue_noise(f):
return np.sqrt(f);
@PSDGenerator
def get_violet_noise(f):
return f;
@PSDGenerator
def get_brownian_noise(f):
return 1/np.where(f == 0, float('inf'), f)
@PSDGenerator
def get_pink_noise(f):
return 1/np.where(f == 0, float('inf'), np.sqrt(f))
def get_colored_noise(size, color="white"):
if color == "white":
return get_white_noise(size)
elif color == "blue":
return get_blue_noise(size)
elif color == "violet":
return get_violet_noise(size)
elif color == "brownian":
return get_brownian_noise(size)
elif color == "pink":
return get_pink_noise(size)
# --------------------------------------------------------------------------------
# Sample the different noise distributions, and determine the spectral density of each series.
size = 100_000
fig, axes = plt.subplots(2, 2, sharex=False, sharey="row")
bins = np.linspace(-5, 5, 101)
noise_types = ["white", "pink", "blue", "violet"]
colors = ["lightgray", "magenta", "blue", "violet"]
for noise_type, color in zip(noise_types, colors):
samples = get_colored_noise(size, noise_type)
axes[0, 0].hist(samples, bins=bins, color=color, label=noise_type, histtype="step", density=True)
f, p = welch(samples)
axes[1, 0].loglog(f, p, color=color, label=noise_type)
# --------------------------------------------------------------------------------
# The samples above are normally distributed. Applying the exponential function yields lognormally distributed samples.
# However, the shape of the power spectrum is not preserved in every case.
# Specifically, noise with a low power in the lower frequencies and high power in high frequencies (blue, violet) appears indistinguishable from white noise.
#
# [1] https://en.wikipedia.org/wiki/Log-normal_distribution
bins = np.linspace(0, 10, 101)
for noise_type, color in zip(noise_types, colors):
samples = np.exp(get_colored_noise(size, noise_type))
axes[0, 1].hist(samples, bins=bins, color=color, label=noise_type, histtype="step", density=True)
f, p = welch(samples)
axes[1, 1].loglog(f, p, color=color, label=noise_type)
# --------------------------------------------------------------------------------
# decorate axes
axes[0, 0].set_title("Normally distributed samples")
axes[0, 1].set_title("Lognormally distributed samples")
axes[0, 0].set_ylabel("Density")
axes[0, 0].set_xlabel("Value")
axes[0, 1].set_xlabel("Value")
axes[1, 0].set_ylabel("Power")
axes[1, 0].set_xlabel("Frequency [Hz]")
axes[1, 1].set_xlabel("Frequency [Hz]")
axes[0, 0].legend(loc="upper left")
fig.tight_layout()
plt.show()
@jlandercy's answer below matches the PSDs remarkably well. However, at least in the case of lognormal target distributions with non-white noise PSD, the distribution of the values is no longer matched very well. While the resulting distributions are long-tailed, the mode appears to be zero (even though it should be around 1).
import numpy as np
import matplotlib.pyplot as plt
from scipy import fft, stats
def get_colored_noise(size, power=lambda f: 1., distribution=stats.norm):
X = fft.rfft(distribution.rvs(size=size))
S = power(fft.rfftfreq(size))
S /= np.sqrt(np.mean(S ** 2))
X *= S
return fft.irfft(X)
powers = {
"white" : lambda f: 1.,
"blue" : lambda f: np.sqrt(f),
"violet" : lambda f: f,
"brown" : lambda f: 1. / np.where(f == 0, float('inf'), f),
"pink" : lambda f: 1. / np.where(f == 0, float('inf'), np.sqrt(f))
}
# --------------------------------------------------------------------------------
# Test with a normal target distribution.
size = 100_000
noise_types = ["white", "pink", "blue", "violet"]
colors = ["lightgray", "magenta", "blue", "violet"]
fig, axes = plt.subplots(2, 2, sharex=False, sharey="row")
target_distribution = stats.norm(loc=0, scale=0.5)
bins = np.linspace(-2.5, 2.5, 101)
for noise_type, color in zip(noise_types, colors):
samples = get_colored_noise(size, powers[noise_type], target_distribution)
axes[0, 0].hist(samples, bins=bins, color=color, label=noise_type, histtype="step", density=True)
f, p = welch(samples)
axes[1, 0].loglog(f, p, color=color, label=noise_type)
# plot expected distribution of values
expected_values = target_distribution.rvs(size)
axes[0, 0].hist(expected_values, bins=bins, color="black", linestyle="--", label="expectation", histtype="step", density=True)
# --------------------------------------------------------------------------------
# Test with a lognormal target distribution.
target_distribution = stats.lognorm(loc=0, s=0.5)
bins = np.linspace(0, 5, 101)
for noise_type, color in zip(noise_types, colors):
samples = get_colored_noise(size, powers[noise_type], stats.lognorm(loc=0, s=0.5))
axes[0, 1].hist(samples, bins=bins, color=color, label=noise_type, histtype="step", density=True)
f, p = welch(samples)
axes[1, 1].loglog(f, p, color=color, label=noise_type)
# plot expected distribution of values
expected_values = target_distribution.rvs(size)
axes[0, 1].hist(expected_values, bins=bins, color="black", linestyle="--", label="expectation", histtype="step", density=True)
# --------------------------------------------------------------------------------
# decorate axes
axes[0, 0].set_title("Normally distributed samples")
axes[0, 1].set_title("Lognormally distributed samples")
axes[0, 0].set_ylabel("Density")
axes[0, 0].set_xlabel("Value")
axes[0, 1].set_xlabel("Value")
axes[1, 0].set_ylabel("Power")
axes[1, 0].set_xlabel("Frequency [Hz]")
axes[1, 1].set_xlabel("Frequency [Hz]")
axes[0, 0].legend(loc="upper left")
fig.tight_layout()
plt.show()
IAAFT Method
To achieve the OP's objectives, I used a non-linear approach based on the Iterative Amplitude Adjusted Fourier Transform (IAAFT) Algorithm. This method is described in Section 2.1 in this paper by Venema, Ament and Simmer, Nonlin. Processes Geophys., 13, 321–328, 2006. This paper introduces a stochastic extension of the IAAFT method originally proposed (I believe) by Schreiber and Schmitz (1996, 2000), referenced therein.
IAAFT alternates between two steps: enforcing the desired spectral shape and restoring the target amplitude distribution (here log-normal) via rank remapping. See Section 2.1 of the Venema paper for a detailed explanation.
Further below I include my prior original attempts based on filtering the OP's log normal samples with +10 dB/decade noise shaping filters. I'll keep those in case there is interest (as they do clearly still meet the OP's more general objectives), but after comparing all three approaches, I see no advantage of pursuing the linear filtering route: linear filtering distorts the amplitude distribution: the output in this case did remain strictly positive and right-skewed, but is no longer log-normal - a consequence of central limit effects.
The figure below shows the resulting log-normal amplitude distribution and +10 dB/decade (blue noise) PSD achieved using the IAAFT method. A Python script follows, which can generate samples with both a log-normal distribution and any desired spectral shape.
One key implementation detail not found in the literature for the IAAFT implementation (to my knowledge) is the use of piecewise linear interpolation in log-log space when constructing the target DFT magnitudes. This technique, also used in the filtering methods described futher below, is essential for faithfully reproducing PSDs defined on log-log axis (as they often are) and dramatically improves the accuracy of the spectral shaping.
import numpy as np
from scipy import signal as sig
from scipy.stats import rankdata
from numpy.fft import rfft, irfft
# Parameters
nsamps = 2**16
fs = 1.0
iters = 50 # upper bound; will break earlier
tol = 1e-6 # relative PSD‑error threshold
# Target marginal (white log‑normal)
x0 = np.exp(np.random.randn(nsamps))
ref_sorted = np.sort(x0) # store once
# Target magnitude spectrum (+10 dB/decade) via log‑log interp
freq_bp = np.array([1e-5*fs, fs/2]) # break‑points (Hz)
psd_bp = 10*np.log10(freq_bp) # 10 dB/decade slope
fft_bins = np.fft.rfftfreq(nsamps, d=1/fs)
# insert first point to avoid log10(0)
freq_bp = np.insert(freq_bp, 0, fs/nsamps)
psd_bp = np.insert(psd_bp, 0, psd_bp[0])
# linear interp on log–log axes
new_psd = np.interp(np.log10(fft_bins[1:]),
np.log10(freq_bp[1:]), psd_bp[1:])
target_mag = np.empty_like(fft_bins)
target_mag[0] = 0.0 # Set DC to zero
target_mag[1:] = 10**(new_psd/20) # amplitude spectrum
# Rank‑remap helper
def rank_remap(src, ref_sorted):
ranks = rankdata(src, method='ordinal') - 1
return ref_sorted[ranks]
# IAAFT loop
x = x0.copy()
last_err = np.inf
for i in range(iters):
# (a) impose spectrum
X = rfft(x)
X = target_mag * np.exp(1j*np.angle(X))
y = irfft(X, n=nsamps)
# (b) impose marginal
x = rank_remap(y, ref_sorted)
# convergence check every few iters (to exit early if converged)
if (i % 5 == 0) or (i == iters-1):
err = np.linalg.norm(np.abs(rfft(x)) - target_mag) / np.linalg.norm(target_mag)
print(f"iter {i:2d}: PSD rel‑err = {err:.3e}")
if err > last_err - tol:
print("converged.")
break
last_err = err
The above details are complete as far as the recommended solution to meet the OP's objectives. I am retaining my prior attempts below due to possible interest in these approaches for generating colored noise in general.
These include:
Filtering of the log normal distributed samples with a +10 dB / decade FIR noise shaping filter: Excellent PSD match as +10 dB / decade blue noise, but filtering does alter the log-normal distribution (still strictly positive and right-skewed but not log-normal). Computation intensive (many filter coefficients) to extend PSD performance to low frequency offsets.
Filtering of the log normal distributed samples with a +10 dB / decade IIR noise shaping filter. Reasonable PSD match as +10 dB / decade blue noise, same convergence issue with distribution as FIR approach, but more efficient in computational resources.
The resulting distributions and PSD's are given below and then the details for each including Python code below that.
FIR Method
The Blue noise shaping filter can be created directly from an IFFT as an FIR filter. This requires a significantly large number of coefficients to extend the performance down to low frequency offsets, but makes for a relatively simple and high performance blue noise filter (or to any noise shape desired) and can be easily modified to any power spectral density shape. If a higher low frequency limit is acceptable, the number of coefficients required can be significantly reduced (the low frequency limit in Hz is approximately the inverse of the total time duration of the filter's impulse response (as given by the number of coefficients).
# simple blue noise filter as the inverse FFT of properly placed
# spectral values in an FFT grid
fs=1.
imp_duration = 20001 # (in samples, will be filter length, must be odd)
freq = np.array([1e-5*fs, fs/2])
psd = 10*np.log10(freq) # desired PSD
# ------The following allows for arbitrary PSD shapes ------------
# interpolate on linear frequency scale to Nyquist (DC excluded)
# consistent with piece-wise linear on a loglog plot
new_freq = np.arange(imp_duration // 2 + 1) * fs / imp_duration
psd_bin1 = psd[0]
# insert first values outside of interpolation (avoid log0 errors)
freq = np.insert(freq, 0, fs / imp_duration)
psd = np.insert(psd, 0, psd_bin1)
logf = np.log10(freq[1:])
logf_new = np.log10(new_freq[1:])
new_psd = np.interp(logf_new, logf, psd[1:]) # interpolated psd
# ---------------------------------------------------------------
fft_out = np.empty(imp_duration, dtype='complex')
fft_out[1:imp_duration // 2 + 1] = 10**(new_psd / 20)
fft_out[:-imp_duration // 2:-1] = 10**(new_psd / 20)
# fill DC Bin
fft_out[0] = 0 # np.abs(fft_out[1])
coeff = np.real(fft.fftshift(fft.ifft(fft_out)))
# simple normalization at Nyquist for high pass responses
__, normalize = sig.freqz(coeff, 1, worN = .5)
# FIR Numerator Coefficients
num = coeff / np.abs(normalize)
The magnitude frequency response of this filter (for comparison with IIR approach) is shown below:
IIR Method
Alternatively an IIR filter can be used to create the Blue Noise shaping filter. Here many approaches can be used to create a filter with a frequency response increasing +10 dB/decade (to meet the OP's "Blue Noise" requirements). The approach I took was to cascade a pink noise filter (which has a frequency response of -10 dB/decade) with my "World's Best Full Band Differentiator" which as a differentiator has a frequency response of +20 dB/decade extending to nearly Nyquist. Classical differentiators based on 2 sample differences (Backward Euler) or Taylor series three sample approximations fail to have the necessary response close to Nyquist. The 30 sample FIR filter used is provided in the code below (this too can be shortened if needed at the expense of performance close to Nyquist). This is the best 30-tap differentiator (based on my own opinion). For the "best" differentiator with only 5 taps (insufficient for purposes here but useful elsewhere), see Rick Lyon's variant at here: https://dsprelated.com/showarticle/814.php. For a more extensive list of common pink noise filters and other arbitrary psd noise generators, see DSP.SE #21048. I used the "Simple enough" one from Julius Smith to demonstrate the process here.
# "World's best full band differentiator"
coeff = array([ 2.82112362e-05, -1.11647072e-04, 2.91526927e-04, -6.37274523e-04,
1.25673341e-03, -2.31844664e-03, 4.09244466e-03, -7.03092714e-03,
1.19420035e-02, -2.04036794e-02, 3.58878504e-02, -6.74174135e-02,
1.45214857e-01, -4.29264611e-01, 3.98461826e+00, -3.98461826e+00,
4.29264611e-01, -1.45214857e-01, 6.74174135e-02, -3.58878504e-02,
2.04036794e-02, -1.19420035e-02, 7.03092714e-03, -4.09244466e-03,
2.31844664e-03, -1.25673341e-03, 6.37274523e-04, -2.91526927e-04,
1.11647072e-04, -2.82112362e-05])
# Pink noise shaping filter from https://ccrma.stanford.edu/~jos/sasp/Example_Synthesis_1_F_Noise.html
# IIR Numerator coefficients
B = [0.049922035, -0.095993537, 0.050612699, -0.004408786]
# IIR Denominator coefficients
A = [1, -2.494956002, 2.017265875, -0.522189400]
imp_duration = round(np.log(1000)/(1-max(abs(np.roots(A))))) + coeff.size # impulse response duration
# cascade pink noise filter with differentiator
b = np.convolve(B, diff)
# simple normalization at Nyquist for high pass responses
__, normalize = sig.freqz(b, A, worN = .5)
# Final IIR filter numerator
num = b / np.abs(normalize)
# Final IIR filter denominator (no change)
den = A
The resulting desired noise samples is then created by cascading the noise (with any distribution) with these filters as follows:
The time domain samples are generated as in the OP, except I generate a larger sample set to allow for removing filter transients, and then processed with either the FIR or IIR filter as follows:
nsamps = 2**18
tsamps = np.exp(np.random.randn(nsamps//2))
# den = 1 for FIR filter method
filtered_tsamps = sig.lfilter(num, den, tsamps)[imp_duration:]