For example, I have two arrays: 'x' holds the actual values and 'I' holds their indices in the array.
x = [1, 2, 3, 3, 2, 4]
I = [0, 1, 2, 3, 4, 5]
In 'x', the 4th value is a duplicate of the 3rd value, and the 5th value is a duplicate of the 2nd value.
Therefore, I want to generate
y = [0, 1, 2, 2, 1, 5]
(containing, for each value, the index of its first occurrence in the original array).
How can I do this efficiently using Python NumPy methods?
You could do:
>>> import numpy as np
>>> u, idx, inv = np.unique(x, return_index=True, return_inverse=True)
>>> idx
array([0, 1, 2, 5], dtype=int64)
>>> inv
array([0, 1, 2, 2, 1, 3], dtype=int64)
>>> idx[inv]
array([0, 1, 2, 2, 1, 5], dtype=int64)
As the docs of np.unique explain: inv are the indices of the unique array that can be used to reconstruct x, and idx are the indices of x (first occurrences) that result in the unique array. So you just take idx, the indices of x that result in the unique array, and index it with inv, which reconstructs those indices instead of x.
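To make the explanation above concrete, here is a minimal sketch that wraps the idea in a function and checks the two relationships from the docs. The function name first_occurrence_indices is only an illustrative choice, not anything from NumPy, and the sketch assumes x is one-dimensional.

```python
import numpy as np

def first_occurrence_indices(x):
    """For each element of 1-D x, return the index of its first occurrence in x."""
    x = np.asarray(x)
    # u:   sorted unique values of x
    # idx: index in x of the first occurrence of each value in u
    # inv: index in u of each element of x
    u, idx, inv = np.unique(x, return_index=True, return_inverse=True)
    # The two relationships described in the np.unique docs:
    assert np.array_equal(u[inv], x)  # inv reconstructs x from u
    assert np.array_equal(x[idx], u)  # idx picks the unique values out of x
    # Indexing idx with inv gives first-occurrence indices instead of values.
    return idx[inv]

x = [1, 2, 3, 3, 2, 4]
y = first_occurrence_indices(x)
print(y.tolist())  # [0, 1, 2, 2, 1, 5]
```

Since np.unique sorts its input, this runs in O(n log n) time. The result does not depend on the values being sorted in x, because idx always refers to positions in the original array.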