matlabmatrixcombinatoricsdiophantine

Finding all combinations of multiple variables summing to 1


I am trying to solve the equation

x1 + x2 + x3 + .... + xn = 1

where the values of all xi are restricted to [0, 0.1, 0.2, ..., 0.9, 1].


Currently, I am solving the problem by first generating an n-dimensional array mat, where at each element location the value is the sum of the axis values, which vary in axisValues = 0:0.1:1:

mat(i,j,k,...,q) = axisValues(i) + axisValues(j) + ... + axisValues(q).

Then I search for all entries of the resulting array that are equal to one. The code (shown below for further clarification) is working fine and has been tested for up to 5 dimensions. The problem is, that the run time increases exponentially and I need the script to work for more than a few dimensions.

clear all
dim = 2; % The dimension of the matrix is defined here. The script has been tested for dim ≤ 5
fractiles(:,1) = [0:0.1:1]; % Produces a vector containing the initial axis elements, which will be used to calculate the matrix elements
fractiles = repmat(fractiles,1,dim); % multiplies the vector to supply dim rows with the axis elements 0:0.1:1. These elements will be changed later, but the symmetry will remain the same.
dim_len = repmat(size(fractiles,1),1,size(fractiles,2)); % Here, the length of the dimensions is checked, which his needed to initialize the matrix mat, which will be filled with the axis element sums
mat = zeros(dim_len); % Here the matrix mat is initialized
Sub=cell(1,dim);
mat_size = size(mat);
% The following for loop fills each matrix elements of the dim dimensional matrix mat with the sum of the corresponding dim axis elements.
for ind=1:numel(mat)
    [Sub{:}]=ind2sub(mat_size,ind);
    SubMatrix=cell2mat(Sub);
    sum_indices = 0;
    for j = 1:dim
        sum_indices = sum_indices+fractiles(SubMatrix(j),j);
    end
    mat(ind) = sum_indices;
end
Ind_ones = find(mat==1); % Finally, the matrix elements equal to one are found.

I have the feeling that the following idea using the symmetry of the problem might help to significantly reduce calculation time:

For a 2D matrix, all entries that fulfill the condition above lie on the diagonal from mat(1,11) to mat(11,1), i.e. from the maximal value of x1 to the maximal value of x2.

For a 3D Matrix, all entries fulfill the condition that lie on a diagonal plane through mat(1,1,11), mat(1,11,1), mat(11,1,1), i.e. from the maximal value of x1 and x2 to the maximal value of x3.

The same is true for higher dimensions: All matrix elements of interest lie on an n-1 dimensional hyper-plane fixed on the highest axis value in each dimension.


The question is: Is there a way to directly determine the indices of the elements on these n-1 dimensional hyper-plane? If so, the whole problem could be solved in one step and without needing to calculate all entries of the n-dimensional matrix and then searching for the entries of interest.


Solution

  • Math:

    Instead of going the hypercube-way, we solve the equation

    x(1) + x(2) + ... + x(n) = 1
    

    where each x(i) can vary in [0, 1/k, 2/k, ... (k-1)/k, 1] instead. In your case k will be 10, as this will then result in the percentages [0, 10, 20, ... 90, 100]. Multiplied by k this corresponds to the diophantine equation

    x(1) + x(2) + ... + x(n) = k,
    

    where all the x(i) vary in [0, 1, 2, ... k-1, k].

    We can build a bijection between this and the combinatorial concept of combinations with repetition.

    The wikipedia article even implicitly mentions the underlying bijection by the statement:

    The number of multisubsets of size k is then the number of nonnegative integer solutions of the Diophantine equation x1 + x2 + ... + xn = k.

    For a smaller example, say we are going with k=3 and percentages in [0, 33, 66, 100] instead. Given all k-multicombinations of the set {1,2,3}:

    RepCombs = 
         1     1     1
         1     1     2
         1     1     3
         1     2     2
         1     2     3
         1     3     3
         2     2     2
         2     2     3
         2     3     3
         3     3     3
    

    Then we map these to your percentages using the following rule: For every row i, if the entry is j, then add 1/3 of the percentage to the corresponding Matrix entry M(i,j). The first row will correspond to [1/3 + 1/3 + 1/3, 0, 0] = [1,0,0]. The overall matrix generated by this process will look like this:

    M =
        1.0000         0         0
        0.6667    0.3333         0
        0.6667         0    0.3333
        0.3333    0.6667         0
        0.3333    0.3333    0.3333
        0.3333         0    0.6667
             0    1.0000         0
             0    0.6667    0.3333
             0    0.3333    0.6667
             0         0    1.0000
    

    Code:

    And now for the MATLAB code that generates all this: I use the function nmultichoosek from this answer and accumarray to accomplish our goal:

    function M = possibleMixturesOfNSubstances(N, percentageSteps)
    RepCombs = nmultichoosek(1:N, percentageSteps);
    numCombs = size(RepCombs,1);
    M = accumarray([repmat((1:numCombs).', percentageSteps, 1), RepCombs(:)], 1/percentageSteps, [numCombs, N]);
    

    If you want percentages in [0, 10, ... 90, 100] and have 4 substances, call this function using possibleMixturesOfNSubstances(4,10)