Shortest path from A to B


Background

I have a dataset containing n points on a plane. The dataset points are drawn from an unknown curve (for simplicity, a polygonal curve) and are corrupted by Gaussian noise. The following picture shows an example where the red curve represents the unknown curve and the blue dots represent the dataset points.

My end goal is to produce an estimate of the unknown curve given the dataset. My strategy to tackle the problem is the following:

  • STEP 1: find a "reasonable" ordering for the dataset points;
  • STEP 2: by least squares, fit a spline to the ordered dataset.
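Assuming STEP 1 has already produced an ordering, STEP 2 can be sketched with `scipy.interpolate.splprep`. The function name and the `smoothing` parameter below are my own choices for illustration; in practice the smoothing factor would have to be tuned to the noise level:

```python
import numpy as np
from scipy.interpolate import splprep, splev

def fit_spline(ordered_points, smoothing=None, n_samples=200):
    """Least-squares smoothing spline through points assumed already
    ordered (STEP 2). `smoothing=None` lets splprep pick a default
    trade-off between closeness and smoothness."""
    x, y = ordered_points[:, 0], ordered_points[:, 1]
    tck, u = splprep([x, y], s=smoothing)
    # Evaluate the parametric spline on a fine, uniform parameter grid.
    u_fine = np.linspace(0.0, 1.0, n_samples)
    xs, ys = splev(u_fine, tck)
    return np.column_stack([xs, ys])
```

With `smoothing=0` the spline interpolates the points exactly; larger values trade fidelity for smoothness, which is what you want with noisy data.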

Now, the main challenge is to establish a good ordering. Let's say that I have the starting point, A, and the ending point, B, of the ordering. In the previous example, A can be the closest point to [x=1, y=0] and B can be the point with the maximum y value. The problem now is to connect these two points. What does a "reasonable" order mean? In principle, the real order is the one in which we see the dataset points while moving from the starting point to the ending point of the unknown curve.

Note: I've seen that some people solve this problem by directly fitting a spline to the unordered dataset while working with footpoints. I want to avoid this approach because it seems to be computationally expensive: for each dataset point, we have to solve an optimization problem just to compute the cost used for fitting the spline. If n is large, it seems an infeasible approach for real-time applications.

Problem

The idea of ordering the points is to search for a path, i.e. the order in which I have to visit the dataset points, to reach point B from point A. So far, without providing additional information, the path that I'm searching for is not uniquely identified. To be more precise, my "ideal" path should satisfy the following constraints:

  1. The length of the path, i.e. the sum of the Euclidean distances between consecutive points, should be minimal;
  2. The path should visit each point of the dataset.

If I'm not mistaken, this problem is the travelling salesman problem (or something similar). The challenge is that in my case the dataset size n can be large (on the order of tens of thousands), and I need an algorithm that solves this problem in a reasonable amount of time, say less than 1 or 2 seconds on a common personal computer. The computational complexity plays a more crucial role than the accuracy of the output because, in the end, every 1 or 2 seconds (time scans) I have a new dataset (always referring to the same unknown curve) to "order" from a new point A to a new point B.

Due to these facts, in order to reduce the computational complexity of the problem, I can relax the second constraint defined above:

  1. The path visits a subset of the points in the dataset (possibly all of them).

The idea is that, since n is large, it is not a big deal to lose some points on the path from A to B. Moreover, another idea is to merge the estimates produced at each time scan with a Kalman filter. Hence, hopefully, the errors and imprecisions of the single-scan estimates will cancel out as more and more time scans are obtained. In other words, I'm searching for a fast ordering algorithm, not necessarily a super-accurate one.
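For reference, one way to formalize this relaxed problem is as a shortest path in a k-nearest-neighbor proximity graph: the A-to-B path then visits only a subset of the points, and Dijkstra on a sparse graph is fast even for tens of thousands of points. A sketch under my own assumptions (the function name and the choice of `k` are invented for illustration), using SciPy's KD-tree and sparse-graph routines:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def order_subset(points, a_idx, b_idx, k=10):
    """Shortest A->B path in a k-nearest-neighbor graph.
    Visits only a subset of the points, as in the relaxed constraint."""
    n = len(points)
    tree = cKDTree(points)
    dist, nbr = tree.query(points, k=k + 1)  # first neighbor is the point itself
    rows = np.repeat(np.arange(n), k)
    cols = nbr[:, 1:].ravel()
    graph = csr_matrix((dist[:, 1:].ravel(), (rows, cols)), shape=(n, n))
    d, pred = dijkstra(graph, directed=False, indices=a_idx,
                       return_predecessors=True)
    if not np.isfinite(d[b_idx]):
        return None  # B unreachable with this k; try a larger k
    # Walk the predecessor array back from B to A.
    path, node = [], b_idx
    while node != a_idx:
        path.append(node)
        node = pred[node]
    path.append(a_idx)
    return path[::-1]
```

Building the KD-tree and running Dijkstra are both roughly O(n log n) for fixed k, which should fit a 1–2 second budget for n in the tens of thousands.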

Questions

  1. I've heard about proximity graphs, but I'm not an expert on the subject. Are they related to my ordering problem? Is there some algorithm based on graphs that returns the shortest path from A to B in a reasonable amount of time?
  2. Besides my approach, which I recognize can be full of flaws, is there some fast algorithm to fit arbitrary polygonal curves to a given dataset where the order of the points is random?

What I've done

The simplest strategy that came to my mind was to use a Nearest Neighbor (NN) approach. Unfortunately, NN has the problem of not always progressing along the unknown curve: it can happen that NN tends to go back toward the starting point A. Then I tried a new heuristic strategy that, somewhat, tries to mimic what a human eye does to fit the dataset:

  • search for a direction;
  • go straight on until you have points in front of you;
  • if you don't have any points in front of you, search for a new direction and repeat.
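The steps above can be sketched as a greedy "directional" nearest-neighbor search. This is a deliberately stripped-down toy version, not my actual implementation: it keeps only two invented parameters, a forward-cone threshold `cone_cos` and a candidate count `k`:

```python
import numpy as np

def directional_order(points, start_idx, cone_cos=0.0, k=8):
    """Greedy ordering that tries to keep moving 'forward': among the k
    nearest unvisited points, prefer those in front of the current heading.
    If none are ahead, fall back to the closest point (new direction)."""
    n = len(points)
    visited = np.zeros(n, dtype=bool)
    order = [start_idx]
    visited[start_idx] = True
    heading = None
    for _ in range(n - 1):
        cur = points[order[-1]]
        cand = np.flatnonzero(~visited)
        d = np.linalg.norm(points[cand] - cur, axis=1)
        nearest = cand[np.argsort(d)[:k]]  # k closest unvisited, by distance
        step = None
        if heading is not None:
            # Keep only candidates inside the forward cone.
            v = points[nearest] - cur
            v = v / np.maximum(np.linalg.norm(v, axis=1, keepdims=True), 1e-12)
            ahead = nearest[v @ heading >= cone_cos]
            if ahead.size:
                step = ahead[0]        # closest forward candidate
        if step is None:
            step = nearest[0]          # no point ahead: pick a new direction
        step_vec = points[step] - cur
        norm = np.linalg.norm(step_vec)
        if norm > 0:
            heading = step_vec / norm  # update the current direction
        visited[step] = True
        order.append(step)
    return order
```

Even in this reduced form, `cone_cos` and `k` interact with the signal-to-noise ratio, which illustrates the tuning problem described below.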

It works pretty well, but the problem is that my algorithm is cumbersome to tune: it has almost 10 parameters that heavily depend on the signal-to-noise ratio of the dataset. I would like a solution that automatically calibrates itself, and my algorithm is poorly suited to achieve this goal.
