I have a dataframe with tuples of latitudes and longitudes as below (sample of actual coordinates):
    id    latlon             
67  79    (39.1791764701497, -96.5772313693982)
68  17    (39.1765194942359, -96.5677757455844)
69  76    (39.1751440428827, -96.5772939901891)
70  58    (39.175359525189, -96.5691986655256)
71  50    (39.1770962912298, -96.5668107589661)
I want to find the id and the distance of the nearest latlon in the same dataframe (for illustration, I'm just making up numbers below in nearest_id and nearest_dist columns):
    id    latlon                                  nearest_id  nearest_dist
67  79    (39.1791764701497, -96.5772313693982)   17          37          
68  17    (39.1765194942359, -96.5677757455844)   58          150           
69  76    (39.1751440428827, -96.5772939901891)   50          900          
70  58    (39.175359525189, -96.5691986655256)    17          12          
71  50    (39.1770962912298, -96.5668107589661)   79          4      
I have a large number (45K+) of coordinates on which I want to perform this operation.
Here is my attempted solution, using great_circle from geopy.distance:
from geopy.distance import great_circle
def great_circle_dist(latlon1, latlon2):
    """Uses geopy to calculate the distance in meters between two (lat, lon) tuples"""
    return great_circle(latlon1, latlon2).meters
def find_nearest(x):
    """Finds the nearest neighbor of x; returns (id, distance)"""
    # Distance from x to every row: O(n) per call, O(n^2) over the whole frame
    distances = df['latlon'].apply(great_circle_dist, args=(x,))
    # The smallest distance is x to itself (0), so take the second smallest
    nearest = distances.sort_values().index[1]
    return (df.loc[nearest, 'id'], distances.loc[nearest])
df['nearest'] = df['latlon'].apply(find_nearest)
df['nearest_id'] = df['nearest'].apply(lambda x: x[0])
df['nearest_dist'] = df['nearest'].apply(lambda x: x[1])
del df['nearest']
How can I make this calculation more efficient?
                        
Spatial indexing should help.
You can achieve spatial indexing using a database (e.g. Postgres with the PostGIS extension), but you can also use an in-memory solution.
Have a look at the Rtree library. You will need to create an index, add all your points to the index, and then query the index using the nearest method.
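As a rough sketch of those three steps (create the index, insert the points, query nearest), something like the following might work. It assumes the rtree package is installed and uses the five sample coordinates from the question as plain (id, (lat, lon)) pairs. The helper names haversine_m, to_xy and nearest_neighbors are made up for this illustration. Rtree compares planar distances in whatever units it is given, so the sketch scales longitude by the cosine of the mean latitude before inserting. That is a local approximation that should be reasonable for points within one city but not for data spanning large areas. The reported distance is computed separately with a haversine formula and an assumed mean Earth radius, so it may differ slightly from geopy's great_circle.

```python
import math
from rtree import index

# Sample (id, (lat, lon)) pairs copied from the question.
points = [
    (79, (39.1791764701497, -96.5772313693982)),
    (17, (39.1765194942359, -96.5677757455844)),
    (76, (39.1751440428827, -96.5772939901891)),
    (58, (39.175359525189, -96.5691986655256)),
    (50, (39.1770962912298, -96.5668107589661)),
]

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) tuples."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def to_xy(latlon, lat0):
    """Scale longitude by cos(lat0) so planar distance roughly tracks ground distance."""
    lat, lon = latlon
    return (lon * math.cos(math.radians(lat0)), lat)


def nearest_neighbors(points):
    """Map each id to (nearest_id, distance_in_meters). Needs at least two points."""
    lat0 = sum(latlon[0] for _, latlon in points) / len(points)

    # Steps 1 and 2: create the index and add every point as a zero-area box.
    idx = index.Index()
    for pos, (_, latlon) in enumerate(points):
        x, y = to_xy(latlon, lat0)
        idx.insert(pos, (x, y, x, y))

    # Step 3: query the index once per point.
    result = {}
    for pos, (pid, latlon) in enumerate(points):
        x, y = to_xy(latlon, lat0)
        # Ask for 2 results because the query point itself comes back first.
        candidates = [c for c in idx.nearest((x, y, x, y), 2) if c != pos]
        best = min(candidates, key=lambda c: haversine_m(latlon, points[c][1]))
        result[pid] = (points[best][0], haversine_m(latlon, points[best][1]))
    return result


if __name__ == "__main__":
    for pid, (nearest_id, dist) in nearest_neighbors(points).items():
        print(pid, nearest_id, round(dist, 1))
```

With a dataframe, the pairs could be built with list(zip(df['id'], df['latlon'])) and the resulting dict mapped back onto the id column to fill nearest_id and nearest_dist. Each query then touches only a small part of the index instead of scanning all 45K rows.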