sklearn KDTree giving incorrect output for nearest neighbours


I have two datasets containing points. I want to match the points to each other so that I can plot them with a line joining each pair. Using a KDTree seemed like the best way to find the nearest neighbour in each list, and it worked for some of my data but not for others.
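For reference, this is roughly the pairing pattern intended, sketched on small made-up arrays (the names A/B and the toy coordinates are placeholders, not the question's real data). The key detail is that `query(B)` on a tree built from `A` returns indices into `A`, one row per point of `B`:

```python
import numpy as np
from sklearn.neighbors import KDTree

# Toy stand-ins for the two point sets (hypothetical data).
A = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])   # e.g. centres
B = np.array([[0.1, 0.2], [9.8, 0.1], [5.2, 4.9]])    # e.g. tops

# Build the tree on A and query it with B:
# ind[i] is the row of A nearest to B[i] (not a row of B).
tree = KDTree(A, metric='euclidean')
dist, ind = tree.query(B)          # both have shape (len(B), 1)

# One (start, end) segment per pair, ready for LineCollection.
pairs = np.stack([B, A[ind[:, 0]]], axis=1)   # shape (len(B), 2, 2)
```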

[Image: example of how the plots should look]

Some of the plots come out looking like this instead, with lines connecting the wrong points even though it's clear there are closer points available:

[Image: example of a plot with lines connecting the wrong points]

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from sklearn.neighbors import KDTree

A = centres['arr_13b'].to_numpy()
B = tops['arr_13b'].to_numpy()
tree = KDTree(A, metric='euclidean')
dist_, ind_ = tree.query(B)

# Build one (start, end) segment per pair for LineCollection
coords = np.zeros((len(A), 2, 2))
for i, match in enumerate(coords):
    match[0] = A[i]
    match[1] = B[ind_[i]]

lines = LineCollection(coords, color='red')
print(coords)
print(ind_)
print(A)
print(B)

fig, ax = plt.subplots(dpi=200)
plt.scatter(centres['arr_13b']['X'], centres['arr_13b']['Y'], s=50, color='deepskyblue')
plt.scatter(tops['arr_13b']['X'], tops['arr_13b']['Y'], s=10, color='dodgerblue')

ax.add_artist(lines)
plt.show()

I originally assumed that the way I assigned coordinate values to the matrix was getting mixed up and causing the wrong points to be connected, but after looking at the output from KDTree.query() it appears that it's actually identifying the wrong points as nearest neighbours. Am I doing something wrong, or is there a better way to achieve what I want?
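One thing worth checking here is what KDTree.query() actually returns. A minimal sketch (the two-point arrays below are hypothetical) showing that the indices refer back to the array the tree was built on, and that the result has one row per query point with an extra k dimension:

```python
import numpy as np
from sklearn.neighbors import KDTree

# Hypothetical minimal data with one obvious nearest neighbour.
A = np.array([[0.0, 0.0], [100.0, 100.0]])   # tree data
B = np.array([[99.0, 99.0]])                 # query points

tree = KDTree(A, metric='euclidean')
dist_, ind_ = tree.query(B)

# query(B) returns one row per point of B, and ind_ indexes into A
# (the data the tree was built on), not into B.
print(ind_.shape)       # (len(B), k) with k=1 by default
print(A[ind_[0, 0]])    # the point in A nearest to B[0]
```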


There are 0 answers