I have a set of data points (as scattered data points in black) and I want to draw their outermost contour. I have tried to calculate the convex hull of my points (code below) but I lose too much resolution and the shape loses its nuances.
# load usual stuff
from __future__ import print_function
import sys, os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib import colors
from scipy.spatial import ConvexHull
# read input file
cbm_contour = sys.argv[1]
def parse_pdb_coords(file):
f = open(file, "r")
coords = X = np.empty(shape=[0, 3])
while True:
line = f.readline()
if not line: break
if line.split()[0] == "ATOM" or line.split()[0] == "HETATM":
Xcoord = float(line[30:38])
Ycoord = float(line[38:46])
Zcoord = float(line[46:54])
coords = np.append(coords, [[Xcoord, Ycoord, Zcoord]], axis=0)
return coords
##########################################################################
plt.figure(figsize=(11, 10))
# parse input file
cbm_coords = parse_pdb_coords(cbm_contour)
# consider only x- and y-axis coordinates
flattened_points = cbm_coords[:, :2]
x = cbm_coords[:,0]
y = cbm_coords[:,1]
# Find the convex hull of the flattened points
hull = ConvexHull(flattened_points)
for simplex in hull.simplices:
plt.plot(flattened_points[simplex, 0], flattened_points[simplex, 1], color='red', lw=2)
plt.scatter(cbm_coords[:,0], cbm_coords[:,1], s=1, c='black')
plt.xlabel('X-axis coordinate ($\mathrm{\AA} $)', size=16)
plt.ylabel('Y-axis distance ($\mathrm{\AA} $)', size=16)
plt.yticks(np.arange(-20, 24, 4),size=16)
plt.xticks(np.arange(-20, 24, 4),size=16)
plt.savefig("example.png", dpi=300, transparent=False)
plt.show()
Note that this can't be transformed to a 'minimal working example' due to the complexity of the data points, but my dataset can be downloaded here. The idea is to have a generalized solution for other datasets too.
Does anyone have a suggestion?
You could use an Alpha Shape instead of your Convex Hull, for instance using the alphashape
library.
You will have to test different values for alpha
to find one that suits your needs. Here alpha=1
did a good job.
# pip install alphashape
# pip install descartes
import matplotlib.pyplot as plt
from alphashape import alphashape
from descartes import PolygonPatch
ax = plt.subplot()
ax.scatter(arr[:, 0], arr[:, 1], s=1, color='k')
ax.set_aspect(1)
ashape = alphashape(arr[:, :2].tolist(), alpha=1)
ax.add_patch(PolygonPatch(ashape, fc='none', ec='red'))
Output:
Assuming such input as arr
(X, Y, Z):
array([[-18.1 , 1.418, 4.103],
[-17.712, 2.202, 5.819],
[-17.58 , 1.998, 5.527],
...,
[ 9.472, -1.901, 11.157],
[ 9.472, -1.295, 11.615],
[ 9.472, -0.895, 12.261]])
If you get the following error:
IndexError: too many indices for array: array is 0-dimensional, but 2 were indexed
Then your version of shapely
is too recent for descartes
. You can fix the issue by following instructions here.