I have data x and y which are noisy evaluations of a function f:[0,alpha] -> [0,1]. I know very little about my function except that f(0) = 0 and f(alpha) = 1.
Is there any way to enforce these boundary conditions when fitting a spline?
Here is a picture, where one sees that the spline fits nicely, but the approximation is bad around 0, where it takes a value around 0.08:
I am aware of bc_type for various splines, but as far as I can tell this only allows one to specify 1st and 2nd derivative, and not fix boundary values.
(Probably this question betrays my ignorance of how splines are fitted, and that what I am asking for is not possible. I just want to make sure I'm not missing something obvious.)
Here is a toy example:
import numpy as np
from scipy.interpolate import UnivariateSpline
import matplotlib.pyplot as plt
## Generate noisy evaluations of the square root function.
## (In my exmaple, I don't have a closed form expression for my function)
x = np.linspace(0,1,1000)
y = np.sqrt(x) + (np.random.random(1000)-0.5)*0.05
## Fit spline
spline = UnivariateSpline(x,y)
plt.figure(figsize=(7,7))
plt.scatter(x,y,s=1,label="Monte Carlo samples")
plt.plot(x,spline(x),color='red',label="spline")
plt.legend()
plt.title("Noisy evaluations of sqrt")
plt.grid()
plt.show()
And the resulting plot, where one sees rather nicely that the spline provides a poor approximation around zero:
What you're looking for is a function that constructs a clamped B-Spline that interpolates the given endpoints. Unfortunately, to the best of my knowledge, no scipy spline interface implements this exactly.
Still, Unit 9 of this online course describes an algorithm to solve exactly this problem denoted "Curve Global Approximation". Below I implement a python version of this algorithm, which follows the description in the website.
The function bspline_least_square_with_clamped_endpoints(x, y, n_ctrl, p)
returns a scipy.interpolate.BSpline
object with n_ctrl
coefficients, that interpolates the first and last input points as you requested.
The x,y
inputs are as in your example, and the p
parameter designates the spline degree, which is 3 by default.
The parameter n_ctrl
is the number of coefficients in the resulting B-Spline, which you can adjust to your needs. Note that as it increases the result may overfit.
Calling the function like below on your input, with the additional points (0,0)
and (1,1)
at the beginning and end, and with n_ctrl=6
, results in the following figure.
spline = bspline_least_square_with_clamped_endpoints(x, y, n_ctrl=6, p=3)
The implementation follows the explanation in the website.
from scipy.interpolate import BSpline
from scipy.linalg import solve
def bspline_least_square_with_clamped_endpoints(x, y, n_ctrl, p=3):
"""
Build a clamped BSpline least square approximation of the points (x, y) with
number of coefficients=n_ctrl and spline_degree=p.
The resulting BSpline will satisfy BSpline(x[0]) == y[0] and BSpline(x[-1]) == y[-1].
Based on the algorithm presented in: https://pages.mtu.edu/~shene/COURSES/cs3621/NOTES/INT-APP/CURVE-APP-global.html
"""
# Build clamped knot vector between x[0] and x[-1], (simple) implementation here builds uniform inner knots
num_inner_knots_inc_end_pts = n_ctrl - p + 1 # including knots[p] and knots[-p-1]
inner_knots = np.linspace(x[0], x[-1], num_inner_knots_inc_end_pts)
knots = np.concatenate([np.array([x[0]] * p), inner_knots, np.array([x[-1]] * p)])
N = build_basis_functions_eval_matrix(knots, x, p)
Q = y - N[:, 0] * y[0] - N[:, -1] * y[-1]
# now remove first and last rows and columns since we only need rows 1 to n-1, columns 1 to h-1
N = N[1:-1, 1:-1]
Q = Q[1:-1]
c = solve((N.T).dot(N), (N.T).dot(Q))
c = np.concatenate([[y[0]], c, [y[-1]]]) # append the first and last values (as the interpolating end coefficients)
bs = BSpline(knots, c, p)
return bs
The current implementation constructs a clamped knot vector with inner knots uniformly spaced, there are other options (which can result in different basis functions and therefore not exactly the same resulting curve).
The main algorithmic work is done in the function build_basis_functions_eval_matrix(knots, x, p)
below, which constructs the matrix N
of B-Spline basis function evaluations at the inputs. The matrix shape is (len(x), n_ctrl)
and the entry N[k, i]
contains the evaluation of the basis function N_i(x_k)
.
In my implementation I use the fact that the i'th B-Spline basis functions can be reconstructed from scipy.interpolate.BSpline
by setting the coefficients to [0,..,0,1,0,...,0]
- see my answer here for a discussion of how this is done.
def build_basis_functions_eval_matrix(knots, x, p=3):
""" Build N matrix, where N[k, i] contains the evaluation of basis function N_i(x_k)."""
n = len(knots) - p - 1 # number of ctrls == number of columns
m = len(x) # number of samples == number of rows
# Using the hack that scipy.interpolate.BSpline with coefficients [0,...,0, 1, 0,...,0]
# reconstructs the i'th basis function (see https://stackoverflow.com/a/77962609/9702190).
cols = []
for i in range(n):
c = np.zeros((n,))
c[i] = 1
N_i = BSpline(knots, c, p)
col = N_i(x)
cols.append(col.reshape((m, 1)))
N = np.hstack(cols)
return N