python numpy downsampling

Python: Resizing an array by removing every nth element


I have some dynamically created arrays of varying lengths, and I would like to resize them all to the same length of 5000 elements by popping every nth element.

Here is what I have so far:

import numpy as np
random_array = np.random.rand(26975,3)

n = len(random_array) // 5000  # integer step between kept elements
print(n)

If I downsample with n (5), I get 5395 elements.

That gives 5395 / 5000 = 1.079, but I don't know how to calculate how often I should pop an additional element to get rid of the remaining 0.079 (the 395 extra elements).

Getting to a length between 5000 and 5050 would also be acceptable; the remainder could then be sacrificed with a simple .resize.
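
To make the numbers above concrete, here is a minimal sketch of the fixed-step approach followed by cutting off the tail. The names `stride`, `strided` and `truncated` are only illustrative, and plain slicing stands in for `.resize`.

```python
import numpy as np

random_array = np.random.rand(26975, 3)
target = 5000

# Integer step between kept rows: 26975 // 5000 == 5
stride = len(random_array) // target

# Keeping every 5th row leaves 5395 rows, which overshoots the target
strided = random_array[::stride]
print(strided.shape)

# Cutting off the tail reaches 5000 rows, but it throws away
# the last 395 samples instead of thinning the data evenly
truncated = strided[:target]
print(truncated.shape)
```

The overshoot of 395 rows is why a single integer step is not enough here.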

This is probably just a simple math question, but I couldn't seem to find an answer anywhere.

Any help is much appreciated.

Best regards

Martin


Solution

  • You can use something like np.linspace to space the indices you keep as uniformly as possible:

    subset = random_array[np.round(np.linspace(0, len(random_array), 5000, endpoint=False)).astype(int)]
    

    You don't always want to drop every nth element with a fixed n. Consider reducing a 5003-element array to 5000 elements versus a 50003-element array: the first only needs 3 elements dropped, the second 45003. The trick is to create a set of indices to keep or drop that is as linear as possible in the index, which is exactly what np.linspace does.

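    You could also delete the unwanted rows instead (axis=0 is needed so that np.delete removes whole rows rather than working on the flattened array):

    np.delete(random_array, np.round(np.linspace(0, len(random_array), len(random_array) - 5000, endpoint=False)).astype(int), axis=0)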
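
A minimal sketch comparing the keep and delete variants on a few array lengths, assuming each array is longer than the 5000-row target. The helper names `keep_indices` and `drop_indices` are made up for illustration and are not NumPy functions.

```python
import numpy as np


def keep_indices(n_rows, target):
    """Evenly spaced row indices to keep (hypothetical helper)."""
    return np.round(np.linspace(0, n_rows, target, endpoint=False)).astype(int)


def drop_indices(n_rows, target):
    """Evenly spaced row indices to delete (hypothetical helper)."""
    n_drop = n_rows - target
    return np.round(np.linspace(0, n_rows, n_drop, endpoint=False)).astype(int)


target = 5000
for n_rows in (5003, 26975, 50003):
    data = np.random.rand(n_rows, 3)

    # Variant 1: index the rows that should survive
    kept = data[keep_indices(n_rows, target)]

    # Variant 2: delete the rows that should go; axis=0 removes whole rows,
    # without it np.delete would operate on the flattened array
    trimmed = np.delete(data, drop_indices(n_rows, target), axis=0)

    print(n_rows, kept.shape, trimmed.shape)
```

Both variants should end up with 5000 rows here, because the spacing between consecutive indices is greater than 1 and rounding therefore cannot produce duplicates. They generally pick different rows, so choose whichever reads more naturally for your data.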