pythonnumpymatrixsvdmatrix-decomposition

How to generate a Rank 5 matrix with entries Uniform?


I want to generate a rank 5 100x600 matrix in numpy with all the entries sampled from np.random.uniform(0, 20), so that all the entries will be uniformly distributed between [0, 20). What will be the best way to do so in python?

I see there is an SVD-inspired way to do so here (https://math.stackexchange.com/questions/3567510/how-to-generate-a-rank-r-matrix-with-entries-uniform), but I am not sure how to code it up. I am looking for a working example of this SVD-inspired way to get uniformly distributed entries.

I have actually managed to code up a rank 5 100x100 matrix by vertically stacking five 20x100 rank 1 matrices, then shuffling the vertical indices. However, the resulting 100x100 matrix does not have uniformly distributed entries [0, 20).

Here is my code (my best attempt):

import numpy as np
def randomMatrix(m, n, p, q):
    # creates an m x n matrix with lower bound p and upper bound q, randomly.
    count = np.random.uniform(p, q, size=(m, n))
    return count

Qs = []
my_rank = 5
for i in range(my_rank):
  L = randomMatrix(20, 1, 0, np.sqrt(20))
  # L is tall
  R = randomMatrix(1, 100, 0, np.sqrt(20)) 
  # R is long
  Q = np.outer(L, R)
  Qs.append(Q)

Q = np.vstack(Qs)
#shuffle (preserves rank 5 [confirmed])
np.random.shuffle(Q)


Solution

  • I just couldn't take the fact the my previous solution (the "selection" method) did not really produce strictly uniformly distributed entries, but only close enough to fool a statistical test sometimes. The asymptotical case however, will almost surely not be distributed uniformly. But I did dream up another crazy idea that's just as bad, but in another manner - it's not really random.
    In this solution, I do smth similar to OP's method of forming R matrices with rank 1 and then concatenating them but a little differently. I create each matrix by stacking a base vector on top of itself multiplied by 0.5 and then I stack those on the same base vector shifted by half the dynamic range of the uniform distribution. This process continues with multiplication by a third, two thirds and 1 and then shifting and so on until i have the number of required vectors in that part of the matrix.
    I know it sounds incomprehensible. But, unfortunately, I couldn't find a way to explain it better. Hopefully, reading the code would shed some more light.
    I hope this "staircase" method will be more reliable and useful.

    import numpy as np 
    from matplotlib import pyplot as plt
    
    '''
    params:
        N    - base dimention
        M    - matrix length
        R    - matrix rank
        high - max value of matrix
        low  - min value of the matrix
    '''
    N    = 100
    M    = 600
    R    = 5
    high = 20
    low  = 0
    
    # base vectors of the matrix
    base = low+np.random.rand(R-1, N)*(high-low)
    
    def build_staircase(base, num_stairs, low, high):
        '''
        create a uniformly distributed matrix with rank 2 'num_stairs' different 
        vectors whose elements are all uniformly distributed like the values of 
        'base'.
        '''
        l = levels(num_stairs)
        vectors = []
        for l_i in l:
            for i in range(l_i):
                vector_dynamic = (base-low)/l_i
                vector_bias    = low+np.ones_like(base)*i*((high-low)/l_i)
                vectors.append(vector_dynamic+vector_bias)
        return np.array(vectors)
    
    
    def levels(total):
        '''
        create a sequence of stritcly increasing numbers summing up to the total.
        '''
        l = []
        sum_l = 0
        i = 1
        while sum_l < total:
            l.append(i)
            i +=1
            sum_l = sum(l)
        i = 0
        while sum_l > total:
            l[i] -= 1
            if l[i] == 0:
                l.pop(i)
            else:
                i += 1
            if i == len(l):
                i = 0
            sum_l = sum(l)
        return l
            
    n_rm = R-1 # number of matrix subsections
    m_rm = M//n_rm
    len_rms = [ M//n_rm for i in range(n_rm)]
    len_rms[-1] += M%n_rm
    rm_list = []
    for len_rm in len_rms:
        # create a matrix with uniform entries with rank 2
        # out of the vector 'base[i]' and a ones vector.
        rm_list.append(build_staircase(
            base = base[i], 
            num_stairs = len_rms[i], 
            low = low,
            high = high,
        ))
    
    rm = np.concatenate(rm_list)
    plt.hist(rm.flatten(), bins = 100)
    

    A few examples:
    enter image description here enter image description here enter image description here

    and now with N = 1000, M = 6000 to empirically demonstrate the nearly asymptotic behavior: enter image description here enter image description here enter image description here