matlabellipseweighted

How to plot ellipse centered at weighted mean incorporating 95% of data mass - ideally in Matlab


I'm using Matlab 2022a to solve this problem, but if anyone has general equations/algorithms or a good solution in another program (e.g. R), I'd be interested in hearing about that too.

I have a set of spatial data (mock data generated and plotted below) with x,y, and a third column of data values at each location.

num = 100;
P = mvnrnd([0,0,20], [ 11.3669 -2.9268 -2.0568;
-2.9268 1.3372 0.4483;
-2.0568 0.4483 51.9697],num);
CenterofMass=sum(P(:,1:2).*P(:,3))/sum(P(:,3)); %weighted mean
scatter(P(:,1),P(:,2),abs(P(:,3))), hold on
plot(CenterofMass(:,1),CenterofMass(:,2),'r+')

mock data I'm trying to

  1. Calculate and plot an ellipse centered at the center of mass (weighted mean), with axes that incorporate a statistical percent (e.g. 95%) of the data mass (in the third column).

I've looked at this discussion, but it doesn't deal with having data values at locations. Calculating the center of mass is relatively easy, but I'm struggling with how to deal with the spatial covariance (of whatever equivalent is needed) of data with weights.


Solution

  • The idea is to tweak your data so that you can use the excellent answer from the post you linked to.

    Instead of having a set of points with different masses, you can create a set of points with equal mass, by scaling the number of points:

    num = 100;
    P = mvnrnd([0,0,20], [ 11.3669 -2.9268 -2.0568;
        -2.9268 1.3372 0.4483;
        -2.0568 0.4483 51.9697],num);
    CenterofMass=sum(P(:,1:2).*P(:,3))/sum(P(:,3)); %weighted mean
    scatter(P(:,1),P(:,2),abs(P(:,3))), hold on
    plot(CenterofMass(:,1),CenterofMass(:,2),'r+')
    
    % Create array X
    % digits kept (mass is not an integer)
    prec = 2;
    nX = round(10^prec*P(:,3));
    X = [];
    
    for ii=1:size(P,1)
    
        % Each point is repeated nX(ii) times to get correct mass
        X = [X;[P(ii,1),P(ii,2)].*ones(nX(ii),1)];
    
    end
    

    Now, you can just use the code from the answer, using the EDIT part as well:

    Mu = mean( X );
    X0 = bsxfun(@minus, X, Mu);
    
    STD = 2;                     %# 2 standard deviations
    conf = 2*normcdf(STD)-1;     %# covers around 95% of population
    % --> In the tweaked dataset X, this is equivalent to 95% of the mass
    scale = chi2inv(conf,2); 
    
    %# eigen decomposition [sorted by eigen values]
    [V,D] = eig(scale* (X0'*X0) ./ (size(X,1)-1) );     %#' cov(X0)
    [D,order] = sort(diag(D), 'descend');
    D = diag(D);
    V = V(:, order);
    
    t = linspace(0,2*pi,100);
    e = [cos(t) ; sin(t)];        %# unit circle
    VV = V*sqrt(D);               %# scale eigenvectors
    e = bsxfun(@plus, VV*e, Mu'); %#' project circle back to orig space
    
    %# plot cov and major/minor axes
    plot(e(1,:), e(2,:), 'Color','k');
    

    enter image description here