I am working on large arrays, and trying to compute the "transition" array (please tell me if that is the right word for it). With a very simple example:
old_indices = [1, 2, 3, 0]
indices = [2, 3, 0, 1]
Since the element at index 0 (1) goes to 3, the first element of the transition is 3, the same for the element at index 1 (2) which goes to 0, so it would be 0, etc. The transition should then be:
transition = [3, 0, 1, 2]
Then, a "transition map" can be created, which is basically the opposite way. The new element at index 0 (2) comes from index 1 in the old list, so the first element of the transition map would be 1, the element at index 1 (3) comes from index 2 in the older list, etc.
transition_map = [1,2,3,0]
I implemented something like this to create these lists:
transition = np.empty(indices.shape, int)
transition_map = np.empty(indices.shape, int)
for i in range(len(old_indices)):
for j in range(len(indices)):
if old_indices[i] == indices[j]:
transition[i] = j
transition_map[transition[i]] = i
Which works properly to create these lists. However, if I have hundreds of thousands or millions of elements in each indices list (of the same size), it starts to take more time. I have 48 processors available for this calculation. I was wondering if parallelizing this and populating the array could be beneficial. However, I have no experience in parallelization with Python 3. For instance, I do not know which tool I should use, and if the fact that I populate the same array with multiple processors would make it possible (or even, if the communication time would be too high to synchronize the array every time).
Assuming that each element exists in both, this solution applies. To speed up the process, we can use a dictionary to keep track of the indices, where the key is the current value, which corresponds to the index. To make the naming easier to understand, I have called the variables before
the transition, and after
, so that we know how the transition maps it.
# Setup test cases
before = [1, 2, 3, 0]
after = [2, 3, 0, 1]
# Create the dictionary
for i, v in enumerate(after):
indices[v] = i
# Find the indices for the transition array
transitions = [indices[x] for x in before]
To find your so-called transition_map
, just reverse the names.
for i, v in enumerate(before):
indices[v] = i
transition_map = [indices[x] for x in after]