python-2.7 numerical-computing

Declaring variables in Python 2.7.x to avoid issues later


I am new to Python, coming from MATLAB and, long ago, from C. I have written a MATLAB script which simulates sediment transport in rivers as a Markov process. The code randomly places circles within a rectangular area of specified dimensions. The circles are non-uniform in size, with diameters drawn randomly from a specified range. I do not know in advance how many times I will step through the circle placement operation, so I use a while loop to complete the process.

In an attempt to be more community oriented, I am translating the MATLAB script to Python. I used the online tool OMPC to get started, and have been working through the auto-translated version manually (it was not that helpful, which is not surprising). To debug the code as I go, I compare the Python results against the results generated by MATLAB.

It seems clear to me that I have declared variables in a way that introduces problems as calculations proceed in the script. Here are two examples of problems that occur consistently across different runs. First, the code generated what I think are arrays within arrays, because the script returns results which look like:

This result was generated for the following code snippet at the overlap_logix operation:

CenterCoord_Array = np.asarray(CenterCoordinates)
Diameter_Array = np.asarray(Diameter)
dist_check = ((CenterCoord_Array[:,0] - x_Center) ** 2 + (CenterCoord_Array[:,1] - y_Center) ** 2) ** 0.5
radius_check = (Diameter_Array / 2) + radius
radius_check_update = np.reshape(radius_check,(len(radius_check),1))
radius_overlap = (radius_check_update >= dist_check)
    # Now actually check the overlap condition.
    if np.sum([radius_overlap]) == 0:
        # The new circle does not overlap so proceed.
        newCircle_Found = 1
        debug_value = 2
    elif np.sum([radius_overlap]) == 1:
        # The new circle overlaps with one other circle
        overlap = np.arange(0,len(radius_overlap), dtype=int)
        overlap_update = np.reshape(overlap,(len(overlap),1))
        overlap_logix = (radius_overlap == 1)
        idx_true = overlap_update[overlap_logix]
        radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)

A similar result for the same run was produced for variables:

Here is the same code snippet for the working MATLAB version (as requested):

distcheck = ((Circles.CenterCoordinates(1,:)-x_Center).^2 +  (Circles.CenterCoordinates(2,:)-y_Center).^2).^0.5;
radius_check = (Circles.Diameter ./ 2) + radius;
radius_overlap = (radius_check >= distcheck);
    % Now actually check the overlap condition.
    if sum(radius_overlap) == 0
        % The new circle does not overlap so proceed.
        newCircle_Found = 1;
        debug_value = 2;
    elseif sum(radius_overlap) == 1
        % The new circle overlaps with one other circle
        temp = 1:size(radius_overlap,2);
        idx_true = temp(radius_overlap == 1);
        radius = distcheck(1,idx_true) - (Circles.Diameter(1,idx_true)/2);

In the Python version I have created arrays from lists to operate on the contents more easily (the first two lines of the code snippet). The array-within-array result, and the need to create arrays just to access data, suggest to me that I have declared variable types incorrectly, but I am not sure. Furthermore, some variables have a size of, for example, (2L,), with no second dimension (the length changes as circles are placed). This produces obvious problems when I try to use such an array in an operation with another array of size (2L,1L). Because of these problems I started reshaping arrays, and then stopped because I decided the reshapes were hacks covering for one or more incorrectly declared variables.

Second, for the same run I encountered the following error:

for the operation:

radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)

which occurs at the bottom of the above code snippet. I have posted the entire script at the following link because it is probably more useful to execute the script for oneself:

https://github.com/smchartrand/MarkovProcess_Bedload

I have set up the code to run with some initial parameter values so that no decisions need to be made. These parameter values produce the expected results in the MATLAB-based script, which look something like this when plotted: [image]

So, I specifically seem to be having issues with the operations on lines 151-165, depending on the test value np.sum([radius_overlap]), and I think it is because I incorrectly declared variable types, but I am really not sure. I can say with confidence that the Python and MATLAB versions produce consistent output through the first step of the while loop and up to code line 127, which enters the second step of the while loop. Below this point in the code, the issues documented above eventually cause the script to crash. Sometimes the script runs to 15% complete, and sometimes it does not make it to 5%; this is due to the random nature of circle placement.

I am preparing the code in the Spyder (Python 2.7) IDE and will share the working code publicly as part of my research. I would greatly appreciate any help in identifying my mistakes and misapplications of Python coding practice.
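
For anyone who wants to see the shape problem described above without running the full script, here is a minimal sketch. The two arrays hold made-up stand-in values, not output from the actual simulation, and the names `overlap_2d`, `overlap_1d`, `count_2d`, and `count_1d` are introduced only for this illustration. It shows how comparing a column of shape (n, 1) against a one-dimensional array of shape (n,) broadcasts to an (n, n) result, which would also change what the sum of the overlap flags counts.

```python
import numpy as np

# Hypothetical stand-in values; not taken from the actual script.
dist_check = np.array([3.0, 5.0])      # one distance per existing circle
radius_check = np.array([4.0, 4.0])    # one radius sum per existing circle

# Reshaping one operand to a column, as in the snippet above, makes NumPy
# broadcast (2, 1) against (2,), so every row is compared with every entry.
radius_check_update = np.reshape(radius_check, (len(radius_check), 1))
overlap_2d = radius_check_update >= dist_check

# Keeping both operands one-dimensional gives one comparison per circle.
overlap_1d = radius_check >= dist_check

# The number of True entries differs between the two results.
count_2d = int(np.sum(overlap_2d))
count_1d = int(np.sum(overlap_1d))

print(overlap_2d.shape)
print(overlap_1d.shape)
```

With the two-dimensional result, a test such as `np.sum(...) == 1` no longer means "exactly one circle overlaps", which may be one reason the branches behave unexpectedly.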


Solution

  • I believe I have answered my own question, and maybe it will be of use for someone down the road. The main sources of instruction for me can be found at the following three web pages:

    Stack Overflow question 176011

    SciPy FAQ

    SciPy NumPy for Matlab users

    The third web page was very helpful for me coming from MATLAB. Here is the modified and working Python code snippet which corresponds to the original snippet provided above:

    dist_check = ((CenterCoordinates[0,:] - x_Center) ** 2 + (CenterCoordinates[1,:] - y_Center) ** 2) ** 0.5
    radius_check = (Diameter / 2) + radius
    radius_overlap = (radius_check >= dist_check)
    # Now actually check the overlap condition.
    if np.sum([radius_overlap]) == 0:
        # The new circle does not overlap so proceed.
        newCircle_Found = 1
        debug_value = 2
    elif np.sum([radius_overlap]) == 1:
        # The new circle overlaps with one other circle
        overlap = np.arange(0,len(radius_overlap[0]), dtype=int).reshape(1, len(radius_overlap[0]))
        overlap_logix = (radius_overlap == 1)
        idx_true = overlap[overlap_logix]
        radius = dist_check[idx_true] - (Diameter[0,idx_true] / 2)
    

    In the end it was clear to me that, for this example, it was more straightforward to use NumPy arrays rather than lists to store the results of each iteration of filling the rectangular area. For the corrected code snippet this means I initialized the variables:

    as NumPy arrays, whereas I initialized them as lists in the posted question. This made a few mathematical operations more straightforward. I was also incorrectly indexing into variables with parentheses (), as in MATLAB, instead of the brackets [] that Python requires. Here is an example of a correction I made which helped the code execute as envisioned:

    This example also shows that I had issues with array dimensions, which I corrected variable by variable. I am still not sure whether my working code is the most Pythonic or most efficient way to fill a rectangular area in a random fashion, but I have tested it about 100 times with success. The revised and working code can be downloaded here:

    Working Python Script to Randomly Fill Rectangular Area with Circles

    Here is an image of the final result of a successful run of the working code:

    [image]

    The main lessons for me were (1) NumPy arrays are more efficient for repetitive numerical calculations, and (2) the dimensionality of the arrays I created was not always what I expected, so care must be taken when creating arrays. Thanks to those who looked at my question and asked for clarification.
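
As an illustration of both lessons, the overlap check can also be written with one-dimensional arrays throughout, so that no reshaping is needed. This is only a sketch under stated assumptions, not the code from the linked script: the function `check_new_circle`, its argument names, and the sample values are all hypothetical, and it assumes centers are stored as a (2, n) array with diameters as a length-n array.

```python
import numpy as np


def check_new_circle(centers, diameters, x_center, y_center, radius):
    """Hypothetical helper, not part of the original script.

    centers   : array of shape (2, n); row 0 holds x, row 1 holds y
    diameters : array of shape (n,)
    Returns (radius, n_overlaps). The radius is shrunk to touch the
    existing circle when exactly one existing circle overlaps.
    """
    # Distance from the candidate center to every existing center, shape (n,).
    dist_check = np.hypot(centers[0, :] - x_center, centers[1, :] - y_center)
    # Dividing by 2.0 avoids integer division in Python 2 for integer inputs.
    radius_overlap = (diameters / 2.0 + radius) >= dist_check
    n_overlaps = int(np.count_nonzero(radius_overlap))
    if n_overlaps == 1:
        # Index with brackets; parentheses would try to call the array.
        idx_true = np.flatnonzero(radius_overlap)[0]
        radius = dist_check[idx_true] - diameters[idx_true] / 2.0
    return radius, n_overlaps


# Made-up example: two existing circles and one candidate circle.
centers = np.array([[0.0, 10.0],
                    [0.0, 0.0]])
diameters = np.array([2.0, 2.0])
new_radius, n_overlaps = check_new_circle(centers, diameters, 3.0, 0.0, 2.5)
print(new_radius, n_overlaps)
```

Here `np.flatnonzero` stands in for the `np.arange` plus boolean-mask step in the snippet above; whether that fits the rest of the script depends on how the other variables are shaped.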