
Create line network from closest points with boundaries


I have a set of points and I want to create a line (road) network from them. First, I need to determine the closest point to each point. For that, I used a KD-tree and wrote the following code:

import pandas as pd
from scipy import spatial

def closestPoint(source, X=None, Y=None):
    # Ensure source is a DataFrame; work on a copy to keep the data source intact
    df = pd.DataFrame(source).copy(deep=True)

    if X is None or Y is None:
        raise ValueError("Please specify coordinate")
    elif X not in df.columns or Y not in df.columns:
        raise ValueError("X and/or Y is/are not in column names")
    else:
        df["coord"] = list(zip(df[X], df[Y]))  # one (x, y) tuple per row

    if df["coord"].duplicated().any():
        uniq = df.drop_duplicates("coord")["coord"]
        uniqval = uniq.tolist()
        dupl = df[df["coord"].duplicated()]["coord"]
        duplval = dupl.tolist()

        utree = spatial.KDTree(uniqval)  # build the tree once, not per point
        for kq, vq in uniq.items():
            clstu = utree.query(vq, k=3)[1]  # clstu[0] is the point itself
            df.at[kq, "coord"] = [vq, uniqval[clstu[1]]]
            # If the reversed line already exists, fall back to the second neighbor
            if [uniqval[clstu[1]], vq] in list(df["coord"]):
                df.at[kq, "coord"] = [vq, uniqval[clstu[2]]]

        dtree = spatial.KDTree(duplval)
        for kd, vd in dupl.items():
            clstd = dtree.query(vd, k=1)[1]
            df.at[kd, "coord"] = [vd, duplval[clstd]]
    else:
        val = df["coord"].tolist()
        tree = spatial.KDTree(val)
        for k, v in df["coord"].items():
            clst = tree.query(v, k=3)[1]
            df.at[k, "coord"] = [v, val[clst[1]]]
            if [val[clst[1]], v] in list(df["coord"]):
                df.at[k, "coord"] = [v, val[clst[2]]]

    return df["coord"]

The code returns the closest point for each point. However, I need to ensure that no double lines are created (e.g. (x,y) to (x1,y1) and (x1,y1) to (x,y)), and that each point is used at most once as the starting point of a line and at most once as the end point of a line, even if it is the closest point to several other points.

Below is a visualization of the result: [image: Result of the code]

What I want: [image: What I want]

I've also tried to separate the origin and target coordinates, like this:

df["coord"] = tuple(zip(df[X],df[Y])) #create a coordinate
df["target"] = "" #create a column for target points

count = 2 # create a count iteration
if (df["coord"].duplicated):
  uniq = df.drop_duplicates("coord")["coord"]
  uniqval = list(uniq.get_values())
  for kq,vq in uniq.items():
    clstu = spatial.KDTree(uniqval).query(vq, k = count)[1]
    while not vq in (list(df["target"]) and list(df["coord"])):
        clstu = spatial.KDTree(uniqval).query(vq, k = count)[1]
        df.set_value(kq, "target", uniqval[clstu[count-1]])
    else:
        count += 1
        clstu = spatial.KDTree(uniqval).query(vq, k = count)[1]
        df.set_value(kq, "target", uniqval[clstu[count-1]])

but this returns an error:

IndexError: list index out of range

Can anyone help me with this? Many thanks!


Solution
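
A note on the IndexError, as an inference from the second snippet as posted rather than something the asker confirmed: in `(list(df["target"]) and list(df["coord"]))` the `and` evaluates to the second list, so `vq` is always found and the `while` body never runs. The `else` branch of the `while` therefore runs for every point, and `count` grows by one each time. Once `count` exceeds the number of unique points, `KDTree.query` has no neighbor left to return, and SciPy marks such missing neighbors with an infinite distance and an index equal to the number of points, which is one past the end of `uniqval`. A minimal sketch of that behavior, with made-up sample points:

```python
import numpy as np
from scipy import spatial

points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]  # illustrative data, only three points
tree = spatial.KDTree(points)

# Ask for more neighbors than there are points.
dist, idx = tree.query(points[0], k=5)

# Slots that could not be filled carry an infinite distance and the
# index len(points), so points[idx[-1]] would raise IndexError.
missing = ~np.isfinite(dist)
valid_idx = idx[~missing]  # only these are safe to use as indices into points
```

Filtering on `np.isfinite(dist)`, or keeping `k` no larger than the number of points, avoids indexing past the end of the list.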

  • Answering now about the global strategy, here is what I would do (rough pseudo-algorithm):

    current_point = one starting point in uniqval
    remove current_point from uniqval
    while (uniqval not empty)
      construct KDTree from uniqval and use it for next line
      next_point = point in uniqval closest to current_point
      record next_point as target for current_point
      remove next_point from uniqval
      current_point = next_point
    

    What you will obtain is a linear graph (a single path) joining all your points, in which each point is linked to its closest not-yet-visited neighbor. I don't know if it will fit your needs; note that you would also obtain a linear graph by taking next_point at random.
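
The pseudo-algorithm above can be sketched in Python as follows. This is only one possible reading of it, not a definitive implementation: the function name `greedy_chain` and its `start` parameter are hypothetical, duplicate coordinates are dropped up front, and the current point is removed from the candidate list before each query so that it cannot be returned as its own nearest neighbor.

```python
from scipy import spatial

def greedy_chain(points, start=0):
    """Link points into one path by repeatedly jumping to the nearest unvisited point.

    Returns a list of (origin, target) coordinate pairs.
    """
    # Drop duplicate coordinates while keeping the original order.
    remaining = list(dict.fromkeys(map(tuple, points)))
    current = remaining.pop(start)
    segments = []
    while remaining:
        # The tree is rebuilt each time because the candidate set shrinks.
        tree = spatial.KDTree(remaining)
        _, i = tree.query(current, k=1)
        nxt = remaining.pop(int(i))
        segments.append((current, nxt))
        current = nxt
    return segments

# Illustrative data: the duplicate (0, 0) is ignored.
sample = [(0, 0), (5, 0), (1, 0), (2, 0), (0, 0)]
path = greedy_chain(sample)
```

Because every point is removed as soon as it is reached, each point is the origin of at most one segment and the target of at most one segment, and no segment can appear in both directions. Rebuilding the tree on every step is simple but wasteful for large inputs; the result also depends on the chosen starting point.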