I developed an application which simulates N robots moving in a grid which try to maximize the amount of visited grid cells in a limited amount of steps, meeting in a goal point. It all works correctly, but is horrible slow. It is currently python+numpy+mathplotlib.
Maximum robots there can be have a soft limit of 100 (if it can get higher, it is nice to have).
To do that, I do the following, simplified:
while steps > 0:
for robot in robots:
agent.calc(robot,steps)
A robot is a 1x2 numpy array (x-and-y-coordinates).
The agent here decides what to do. Since I need to switch the tactic and strategy on the fly, I cannot move that logic.
agent.calc updates a robot in place, one after another.
cProfiling it returns the following. Extracting the top
39014272 function calls (39010490 primitive calls) in 150.314 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
12417735 62.807 0.000 62.807 0.000 distance.py:8(taxicab_distance)
124596 36.882 0.000 36.882 0.000 {numpy.core.multiarray.array}
113657 30.204 0.000 100.800 0.001 logical_agent.py:16(choose_max_distance_to...)
12417013 6.579 0.000 69.384 0.000 squaregrid.py:30(distance)
113700 2.900 0.000 109.769 0.001 logical_agent.py:73(calc)
11652363 2.625 0.000 2.625 0.000 {method 'append' of 'list' objects}
161849 1.653 0.000 1.653 0.000 distance.py:11(euclidean_distance)
113664 1.632 0.000 1.632 0.000 {sorted}
114834 1.185 0.000 1.185 0.000 {method 'keys' of 'dict' objects}
113700 0.695 0.000 1.134 0.000 squaregrid.py:19(neighbours)
I implemented different environments for the robots, the most important is squaregird. Every environment has its own distance function, since I intended to use different metrics, i.e. Manhattan/taxicab and euclidean. I extracted the distance function into an own distance.py file, since I use it in several occasions.
One can see that taxicab_distance is called alot, since the agent needs to evaluate the distances of a robots four neighbours and itself to a goal point to see whether the next position can still reach the goal and to maximize the distance to all other robots as a optimizing heuristics.
The function does not do anything fancy, just
def taxicab_distance(u, v):
return np.abs(u[0] - v[0]) + np.abs(u[1] - v[1])
I know that python has a pretty high function call overhead, and I assume that that hits the performance. The {numpy.core.multiarray.array} can be ignored, I think I know what I am doing wrong there.
Distance call chain: agent -> environment.distance -> taxicab_distance
The question is, how can I reduce the overhead of calling the function? I strongly considered using pythons c extensibility, cython, to be more concrete. Will it work? can there be another reason why it is so slow?
I benchmarked it with inlining, and it took off ~15sec. In the end, I rewrote the number crunching in C++ and used cython for integrating. After that, it took only 1 second.
EDIT: cpython -> cython