pythonnumpymatplotlibconvex-hull

Drawing the outermost contour of a set of data points without losing resolution


I have a set of data points (as scattered data points in black) and I want to draw their outermost contour. I have tried to calculate the convex hull of my points (code below) but I lose too much resolution and the shape loses its nuances.

# load usual stuff
from __future__ import print_function
import sys, os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib import colors
from scipy.spatial import ConvexHull

# read input file
cbm_contour = sys.argv[1]


def parse_pdb_coords(file):
    f = open(file, "r")

    coords = X = np.empty(shape=[0, 3])

    while True:
        line = f.readline()
        if not line: break

        if line.split()[0] == "ATOM" or line.split()[0] == "HETATM":
            Xcoord = float(line[30:38])
            Ycoord = float(line[38:46])
            Zcoord = float(line[46:54])

            coords = np.append(coords, [[Xcoord, Ycoord, Zcoord]], axis=0)

    return coords

##########################################################################

plt.figure(figsize=(11, 10))

# parse input file
cbm_coords = parse_pdb_coords(cbm_contour)

# consider only x- and y-axis coordinates
flattened_points = cbm_coords[:, :2]
x = cbm_coords[:,0]
y = cbm_coords[:,1]

# Find the convex hull of the flattened points
hull = ConvexHull(flattened_points)

for simplex in hull.simplices:
    plt.plot(flattened_points[simplex, 0], flattened_points[simplex, 1], color='red', lw=2)

plt.scatter(cbm_coords[:,0], cbm_coords[:,1], s=1, c='black')

plt.xlabel('X-axis coordinate ($\mathrm{\AA} $)', size=16)
plt.ylabel('Y-axis distance ($\mathrm{\AA} $)', size=16)

plt.yticks(np.arange(-20, 24, 4),size=16)
plt.xticks(np.arange(-20, 24, 4),size=16)

plt.savefig("example.png", dpi=300, transparent=False)
plt.show()

Note that this can't be transformed to a 'minimal working example' due to the complexity of the data points, but my dataset can be downloaded here. The idea is to have a generalized solution for other datasets too.

Does anyone have a suggestion?

enter image description here


Solution

  • You could use an Alpha Shape instead of your Convex Hull, for instance using the alphashape library.

    You will have to test different values for alpha to find one that suits your needs. Here alpha=1 did a good job.

    # pip install alphashape
    # pip install descartes
    
    import matplotlib.pyplot as plt
    from alphashape import alphashape
    from descartes import PolygonPatch
    
    ax = plt.subplot()
    ax.scatter(arr[:, 0], arr[:, 1], s=1, color='k')
    ax.set_aspect(1)
    
    ashape = alphashape(arr[:, :2].tolist(), alpha=1)
    
    ax.add_patch(PolygonPatch(ashape, fc='none', ec='red'))
    

    Output:

    pandas alpha shape

    Assuming such input as arr (X, Y, Z):

    array([[-18.1  ,   1.418,   4.103],
           [-17.712,   2.202,   5.819],
           [-17.58 ,   1.998,   5.527],
           ...,
           [  9.472,  -1.901,  11.157],
           [  9.472,  -1.295,  11.615],
           [  9.472,  -0.895,  12.261]])
    

    NOTE

    If you get the following error:

    IndexError: too many indices for array: array is 0-dimensional, but 2 were indexed
    

    Then your version of shapely is too recent for descartes. You can fix the issue by following instructions here.