This question has been raised several times already, but mine differs a little from the earlier ones. I have a table (x values and relative frequencies):
x | 150 | 250 | 350 | 450 | 550 | 650 | 750 |
---|---|---|---|---|---|---|---|
y | 1 | 2 | 8 | 30 | 18 | 16 | 5 |
I don't really understand what the built-in MATLAB function `[f,x] = ecdf(y)` does. I use it to estimate and plot an empirical distribution function, but the result is clearly not correct: if you build a histogram from the same data (x and y), the resulting ECDF does not describe that distribution.
Hence my question: how do I correctly construct the ECDF from the table (the x values and an array of relative frequencies), and then estimate and plot the cumulative distribution function directly from it?
My code for plotting the histogram and the ECDF:
%% data
y = [1; 2; 8; 30; 18; 16; 5];
x = [150; 250; 350; 450; 550; 650; 750];
%% hist and polygon
figure(1)
bar(x,y,'LineWidth',1,...
'FaceColor',[0.0745098039215686 0.623529411764706 1],...
'EdgeColor',[0.149019607843137 0.149019607843137 0.149019607843137],...
'BarWidth',1,...
'BarLayout','stacked');
hold on
plot(x,y,'-o','Color','red','LineWidth',1)
hold off
%% ecdf
[ff,x] = ecdf(y);
x_e = [0;x];
figure(2)
stairs(x_e,ff,'Marker','o','LineWidth',1,'Color',[0.0745098039215686 0.623529411764706 1]);
set(gca,'GridAlpha',0.25,'GridLineStyle','--','MinorGridLineStyle','--',...
'XGrid','on','XMinorGrid','on','YGrid','on');
xlim([0 780]);
You should not use the `ecdf` function, because it takes the data values as input. Your inputs, on the other hand, seem to be the population values and their absolute frequencies. So you only need to normalize the frequencies so that they sum to 1, and then compute their cumulative sum.
When plotting, I suggest you include some initial and final population values with respective normalized frequencies 0 and 1 for a clearer graph.
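Written out, the normalize-then-accumulate idea looks roughly like this. The symbols $F$, $N$, $x_k$ and $y_i$ are my own notation for the table rows, not something taken from the question:

```latex
F(x_k) = \frac{1}{N} \sum_{i=1}^{k} y_i ,
\qquad
N = \sum_{j=1}^{7} y_j = 1 + 2 + 8 + 30 + 18 + 16 + 5 = 80
```

For the table in the question the running counts are 1, 3, 11, 41, 59, 75, 80, so $F$ takes the values 0.0125, 0.0375, 0.1375, 0.5125, 0.7375, 0.9375 and 1 at x = 150, 250, ..., 750.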
x = [150; 250; 350; 450; 550; 650; 750];
y = [1; 2; 8; 30; 18; 16; 5]; % example data
cdf = cumsum(y./sum(y)); % normalize, then compute cumulative sum
stairs([100; x; 900], [0; cdf; 1], 'linewidth', .8), grid on % note two extra values
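If you want to cross-check this against `ecdf`, it has to be given data values rather than frequencies. The sketch below shows two ways to do that, assuming the Statistics and Machine Learning Toolbox is installed and the frequencies are nonnegative integer counts. The variable names `f_freq`, `x_freq`, `raw`, `f_raw` and `x_raw` are just illustrative, and the `'Frequency'` name-value argument is used as I understand it from the `ecdf` documentation, so verify it for your release.

```matlab
x = [150; 250; 350; 450; 550; 650; 750];
y = [1; 2; 8; 30; 18; 16; 5];

% Manual version: normalize, then accumulate
cdf = cumsum(y ./ sum(y));

% Option 1 (assumes the 'Frequency' argument of ecdf):
% pass the population values as data and the counts as frequencies
[f_freq, x_freq] = ecdf(x, 'Frequency', y);

% Option 2: expand the table into raw observations, then call ecdf
raw = repelem(x, y);            % 150 once, 250 twice, 350 eight times, ...
[f_raw, x_raw] = ecdf(raw);

% ecdf prepends a starting point with f = 0, so skip the first entry
% when comparing with the manual cumulative sum
diff_freq = max(abs(f_freq(2:end) - cdf));
diff_raw  = max(abs(f_raw(2:end)  - cdf));
```

Both differences should be zero up to floating-point rounding. The manual `cumsum` version has no toolbox dependency, which is one reason to prefer it for tabulated data like this.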