matlab, probability, discrete-space

Determining probability mass function of random variable


Given a discrete random variable x and observed data for it in X(n), how can we determine the probability mass function pmf(X) in MATLAB?


Solution

  • You can do this in at least eight different ways (some of them were already mentioned in the other solutions).

    Say we have a sample from a discrete random variable:

    X = randi([-9 9], [100 1]);
    

    Consider these equivalent solutions (note that I don't assume anything about the range of possible values, just that they are integers):

    [V,~,labels] = grp2idx(X);
    mx = max(V);
    
    %# TABULATE (internally uses HIST)
    t = tabulate(V);
    pmf1 = t(:, 3) ./ 100;                                %# column 3 holds percentages, so divide by 100
    
    %# HIST (internally uses HISTC)
    pmf2 = hist(V, mx)' ./ numel(V);                      %#'
    
    %# HISTC
    pmf3 = histc(V, 1:mx) ./ numel(V);
    
    %# ACCUMARRAY
    pmf4 = accumarray(V, 1) ./ numel(V);
    
    %# SORT/FIND/DIFF
    pmf5 = diff( find( [diff([0;sort(V)]) ; 1] ) ) ./ numel(V);
    
    %# SORT/UNIQUE/DIFF
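    %# request the LAST occurrence explicitly: since R2013a UNIQUE returns
    %# the first occurrence by default, which would break the DIFF below
    [~,idx] = unique( sort(V), 'last' );
    pmf6 = diff([0;idx]) ./ numel(V);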
    
    %# ARRAYFUN
    pmf7 = arrayfun(@(x) sum(V==x), 1:mx)' ./ numel(V);   %#'
    
    %# BSXFUN
    pmf8 = sum( bsxfun(@eq, V, 1:mx) )' ./ numel(V);      %#'
    

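    Note that GRP2IDX (which, like TABULATE, requires the Statistics Toolbox) was used to map the observed values to indices starting at 1, so that each index corresponds to one entry of the pmf vector; the mapping back to the original values is given by labels. The result of the above is: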

    >> [labels pmf1]
    ans =
               -9         0.03
               -8         0.07
               -7         0.04
               -6         0.07
               -5         0.03
               -4         0.06
               -3         0.05
               -2         0.05
               -1         0.06
                0         0.05
                1         0.04
                2         0.07
                3         0.03
                4         0.09
                5         0.08
                6         0.02
                7         0.03
                8         0.08
                9         0.05
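
    If the Statistics Toolbox is not available, the index mapping that GRP2IDX provides can probably be obtained from the third output of UNIQUE instead. The following is only a minimal sketch under that assumption; the variable names vals, idx and pmf are illustrative and not part of the code above:

    ```matlab
    % Sample from a discrete random variable (same setup as above)
    X = randi([-9 9], [100 1]);

    % vals: sorted distinct values that occur in X
    % idx:  for every element of X, its position in vals (1..numel(vals))
    [vals, ~, idx] = unique(X);

    % Count how often each index occurs, then normalise to relative frequencies
    pmf = accumarray(idx(:), 1) ./ numel(X);

    % Show value/probability pairs side by side
    disp([vals pmf])
    ```

    As with the GRP2IDX version, values that never occur in the sample get no row at all rather than a row with probability 0. In R2014b and later, HISTCOUNTS with the 'Normalization','probability' option is another route that may be worth considering.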