I have a 1D function that looks like this:
i.e., a pronounced drop towards a stable value around 0. The function is written as:
((1. / np.sqrt(1. + x ** 2)) - (1. / np.sqrt(1. + C ** 2))) ** 2
where C is the parameter I'm trying to explore with emcee. The problem is that emcee (with a uniform prior) refuses to explore the region of high likelihood and instead wanders over the entire allowed range for this parameter, seemingly at random. Here's what the traces look like (full code is below):
where the true value is shown with a red line. Whereas scipy.optimize.minimize easily zooms in on the true value, emcee apparently cannot.
Am I doing something wrong, or is this function simply not explorable with a uniform prior like the one I'm using?
import numpy as np
import matplotlib.pyplot as plt
import emcee


def main():
    # Set true value for the variable
    C_true = 27.

    # Generate synthetic data
    x = np.arange(.1, 100)
    y_true = func(x, C_true)
    noise = 0.01
    y_obs = np.random.normal(y_true, noise)

    # Set up the MCMC
    nwalkers = 4
    ndim = 1
    nburn = 500
    nsteps = 5000
    # Maximum value for the 'C' parameter
    C_max = 5 * C_true
    # Use a 10% STDDEV around the true value for the initial state
    p0 = [np.random.normal(C_true, C_true * .1, nwalkers)]
    p0 = np.array(p0).T

    # Run the MCMC
    print("Running emcee...")
    sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, args=(x, y_obs, C_max))
    # Burn-in
    state = sampler.run_mcmc(p0, nburn)
    sampler.reset()
    sampler.run_mcmc(state, nsteps)
    samples = sampler.chain.reshape((-1, ndim))

    # Print the median and 1-sigma uncertainty of the parameters
    C_median = np.median(samples)
    C_percnt = np.percentile(samples, [16, 84])
    print(f'C = {C_median:.2f} ({C_percnt[0]:.2f}, {C_percnt[1]:.2f})')

    # Chains
    plt.plot(sampler.chain[:, :, 0].T, c='k', alpha=0.1)
    plt.axhline(C_true, color='r')
    plt.ylabel('C')
    plt.xlabel('Step')
    plt.tight_layout()
    plt.show()

    # Fitted func
    plt.scatter(x, y_obs)
    y_emcee = func(x, C_median)
    plt.scatter(x, y_emcee)
    plt.show()


def func(x, C):
    x_C = ((1. / np.sqrt(1. + x ** 2)) - (1. / np.sqrt(1. + C ** 2))) ** 2
    # Beyond C, the function is fixed to 0
    return np.where(x < C, x_C, 0)


def lnlike(C, x, y_obs):
    model = func(x, C)
    lkl = -np.sum((y_obs - model) ** 2)
    return lkl


def lnprior(C, C_max):
    if 0 < C < C_max:
        return 0.0
    return -np.inf


def lnprob(C, x, y_obs, C_max):
    lp = lnprior(C, C_max)
    if not np.isfinite(lp):
        return -np.inf
    return lp + lnlike(C, x, y_obs)


if __name__ == '__main__':
    main()
In your log-likelihood function, include the noise standard deviation and the factor of 2 in the denominator. Without the 1 / (2 * sigma**2) scaling (sigma = 0.01 here, so the factor is 5000), the log-likelihood surface is thousands of times flatter than it should be, which is why the walkers wander instead of settling near the peak. Change it to:
def lnlike(C, x, y_obs, sigma):
    model = func(x, C)
    lkl = -np.sum(0.5 * (y_obs - model) ** 2 / sigma**2)
    return lkl


def lnprob(C, x, y_obs, C_max, sigma):
    lp = lnprior(C, C_max)
    if not np.isfinite(lp):
        return -np.inf
    return lp + lnlike(C, x, y_obs, sigma)
and your sampler run to:
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, args=(x, y_obs, C_max, noise))
and you should get what you expect:
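To see why the scaling matters, you can evaluate the log-probability on a grid of C values and compare the corrected and original versions. This is a minimal diagnostic sketch (not from the original post), reusing func, the fixed lnprob, and the variables from main():

C_grid = np.linspace(1., 130., 500)  # grid just inside the prior bounds (C_max = 135)
lnp_scaled = [lnprob(C, x, y_obs, C_max, noise) for C in C_grid]  # corrected likelihood
lnp_flat = [-np.sum((y_obs - func(x, C)) ** 2) for C in C_grid]   # original, unscaled
plt.plot(C_grid, lnp_scaled, label='with 1 / (2 * sigma**2)')
plt.plot(C_grid, lnp_flat, label='original (unscaled)')
plt.axvline(C_true, color='r')
plt.xlabel('C')
plt.ylabel('ln probability')
plt.legend()
plt.show()

The corrected curve is sharply peaked at C_true, while the original one varies so little across the prior range that the walkers have essentially no peak to find.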
Alternatively, if you don't expect to know the noise standard deviation, you could use a Student's t-like likelihood: start from a Gaussian likelihood and marginalise over the unknown sigma (see, e.g., Section 3.3 of this book), which gives ln(likelihood) = -(N / 2) * ln(sum((y_obs - model)**2)) up to an additive constant, e.g.,
def lnlike(C, x, y_obs):
    model = func(x, C)
    lkl = -(len(x) / 2) * np.log(np.sum((y_obs - model) ** 2))
    return lkl
which should also work.
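If you go this route, sigma drops out of the wiring entirely; here is a minimal sketch of the matching lnprob and sampler call (identical to the original question's versions, just using the likelihood above):

def lnprob(C, x, y_obs, C_max):
    lp = lnprior(C, C_max)
    if not np.isfinite(lp):
        return -np.inf
    # the sigma-marginalised likelihood above; no noise estimate needed
    return lp + lnlike(C, x, y_obs)

sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, args=(x, y_obs, C_max))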