gaussian-process, gpytorch

How to Model Different Noise Levels for Velocity and Acceleration in Gaussian Process Regression using GPyTorch?


I'm currently working on a project where I'm using Gaussian Process Regression with GPyTorch to model the velocity and acceleration of a vehicle.

My data consists of three columns: time, velocity and acceleration.

Since acceleration is the derivative of velocity, I'm fitting both velocity and its derivative (acceleration) using GPyTorch as shown here.

Here's a simplified overview of my code:

import torch
import gpytorch

# Extract necessary columns ('data' is a pandas DataFrame with 'time', 'vel', 'accel' columns)
train_x = torch.tensor(data['time'].values, dtype=torch.float32).unsqueeze(-1)
train_vx = torch.tensor(data['vel'].values, dtype=torch.float32).unsqueeze(-1)
train_ax = torch.tensor(data['accel'].values, dtype=torch.float32).unsqueeze(-1)
train_y = torch.cat([train_vx, train_ax], dim=-1)

# Defining GP model with derivatives
class GPModelWithDerivatives(gpytorch.models.ExactGP):
    # ... (model definition) ...

likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=2)
model = GPModelWithDerivatives(train_x, train_y, likelihood)
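
The elided model definition follows the linked tutorial; for context, a minimal sketch (assuming the tutorial's derivative-aware mean and kernel, ConstantMeanGrad and RBFKernelGrad wrapped in a ScaleKernel) looks like this:

class GPModelWithDerivatives(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMeanGrad()
        self.base_kernel = gpytorch.kernels.RBFKernelGrad()
        self.covar_module = gpytorch.kernels.ScaleKernel(self.base_kernel)

    def forward(self, x):
        # The gradient-enabled mean and kernel yield a joint distribution
        # over (velocity, acceleration) at each input time
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)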

This is working pretty well. However, although MultitaskGaussianLikelihood models the noise for vel and accel separately, I've heard that it assumes the noise level is about the same in each task. For my data this is definitely not the case: the acceleration data is much noisier than the velocity data.

Is there a way to specify different noise levels for the velocity and acceleration data within GPyTorch's MultitaskGaussianLikelihood class or any other suitable method?


Solution

  • I just tested the code from the tutorial you linked, but with separate noise levels for the velocity and acceleration, e.g.

    from torch.distributions.multivariate_normal import MultivariateNormal

    # Add synthetic noise with a different variance per task:
    # 0.05 for velocity, 0.5 for acceleration
    noise = MultivariateNormal(torch.zeros(2), torch.diag(torch.tensor([0.05, 0.5])))
    train_y += noise.sample((n,))
    

    With a slightly modified likelihood, because having both global and per-task noise seems to introduce a redundant fitting parameter:

    from gpytorch.likelihoods import MultitaskGaussianLikelihood

    likelihood = MultitaskGaussianLikelihood(num_tasks=2, has_global_noise=False)
    

    After training the model, it was able to capture the different noise levels in the noise model (0.0439, 0.3795); the fitted values can be accessed via likelihood.task_noises.
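
    For completeness, the training loop is just the standard ExactGP setup (a sketch; the optimizer settings and iteration count here are arbitrary choices):

    model.train()
    likelihood.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
    for i in range(200):
        optimizer.zero_grad()
        output = model(train_x)
        loss = -mll(output, train_y)  # minimize the negative marginal log likelihood
        loss.backward()
        optimizer.step()
    print(likelihood.task_noises)  # learned per-task noise variances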

    Interestingly, when I add the extra argument rank=2 to the likelihood, which introduces an extra parameter to account for correlation in the noise:

    likelihood = MultitaskGaussianLikelihood(num_tasks=2, has_global_noise=False, rank=2)
    

    the final noise model did an even better job of reproducing the noise levels (0.0594, 0.5139). However, due to a rather confusing design choice, the noise values have to be accessed via likelihood.task_noise_covar when the rank is increased. The fitted cross-task covariance was also quite large, probably reflecting the small number of samples in this example.
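
    In other words, how you read out the learned noise depends on the rank (a sketch; the attribute names are as discussed above, and the exact shape returned may differ between GPyTorch versions):

    # With the default rank=0 likelihood, the per-task noise variances live in:
    #     likelihood.task_noises
    # With rank=2, they sit on the diagonal of the full task-noise covariance:
    cov = likelihood.task_noise_covar
    print(cov)  # diagonal: per-task noise variances; off-diagonal: cross-task covariance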