Is there any way of finding the longest common subsequence of two sequences in O(N log N) time?
I read somewhere that there is a way to achieve this using binary search.
I know the DP approach that takes O(N^2) time.
Computing the longest common subsequence of two sequences is essentially an n-squared problem.
Masek and Paterson (1980) made a minor improvement, to n-squared / log n, using the so-called "Four Russians" technique.
In most cases the extra complexity introduced by such convoluted approaches is not justified by the small gains. For practical purposes you can consider the n-squared approach as the reasonable optimum in typical applications.
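For reference, a minimal sketch of that standard quadratic DP (the method name lcsLength is my own, not from the answer):

static int lcsLength(String a, String b) {
    int n = a.length();
    int m = b.length();
    // dp[i][j] = length of the LCS of a[0..i) and b[0..j)
    int[][] dp = new int[n + 1][m + 1];
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            if (a.charAt(i - 1) == b.charAt(j - 1)) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
            } else {
                dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }
    return dp[n][m];
}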
For the general case, the O(N^2) dynamic programming algorithm is the best you can do. However, there are better algorithms in some special cases.
One special case is a small, fixed alphabet. This is a very common situation: sequences consisting of letters from some alphabet (e.g. English) fall into this category. For this case, the O(N*M) algorithm can be optimised to O(N^2 / log N) with the Method of Four Russians. I don't know the details exactly; you can search for it.
Another special case: "Given two permutations of the numbers from 1 to N, find their LCS". This one can be solved in O(N log N).
Let the sequences be A and B.
Define a sequence C where C[i] is the index of B[i] in A (so that A[C[i]] = B[i]).
The LCS of A and B is then the longest increasing subsequence of C.
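To make the reduction concrete, here is a minimal sketch; the method name lcsOfPermutations and the binary-search LIS routine are my own illustration, not part of the answer.

import java.util.Arrays;

// LCS of two permutations of 1..N in O(N log N):
//   pos[a[i]] = i records where each value sits in A,
//   c[j] = pos[b[j]] rewrites B in terms of positions in A,
//   and the LCS length equals the length of the longest increasing subsequence of c.
static int lcsOfPermutations(int[] a, int[] b) {
    int n = a.length;
    int[] pos = new int[n + 1];
    for (int i = 0; i < n; i++) {
        pos[a[i]] = i;
    }
    int[] c = new int[n];
    for (int j = 0; j < n; j++) {
        c[j] = pos[b[j]];
    }
    // tails[k] = smallest possible tail of an increasing subsequence of length k + 1
    int[] tails = new int[n];
    int len = 0;
    for (int x : c) {
        int idx = Arrays.binarySearch(tails, 0, len, x);
        if (idx < 0) {
            idx = -(idx + 1); // insertion point keeps tails sorted
        }
        tails[idx] = x;
        if (idx == len) {
            len++;
        }
    }
    return len;
}

For example, with A = {3, 1, 2, 4} and B = {1, 2, 3, 4} this returns 3 (one LCS is 1, 2, 4).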
Assuming the Strong Exponential Time Hypothesis (which is stricter than P != NP, but is still widely believed to be true), it is not possible to do it in time O(N^{2 - eps}) for any positive constant eps; see "Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping" by Karl Bringmann and Marvin Künnemann (a pre-print is available on arXiv).
Roughly speaking, this means that the general case of the problem cannot be solved in time much better than O(N^2 / log N), so if you want faster algorithms you have to consider additional constraints (some properties of the strings) or look for an approximate solution.
The following is an O(N log M) approach to the LCS problem that uses binary search:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

static String longestCommonSubsequence(String first, String second) {
    // If the strings are not the same length, take the longer one as 'first'
    // Map each letter to the list of indexes at which it occurs in 'first'
    // The lists are built in increasing index order, so they are sorted; this enables binary search later on
    Map<Character, List<Integer>> map = new HashMap<>();
    for (int i = 0; i < first.length(); i++) {
        map.computeIfAbsent(first.charAt(i), k -> new ArrayList<>()).add(i);
    }
    // Scan the second string, keeping an increasing index into 'first' so that matched
    // characters appear in the same order as in 'first' (otherwise it would not be a subsequence)
    int index = 0;
    StringBuilder lcs = new StringBuilder();
    for (int j = 0; j < second.length(); j++) {
        char c = second.charAt(j);
        if (map.containsKey(c)) {
            List<Integer> indexes = map.get(c);
            // Binary search for the first index that is bigger than or equal to 'index'
            // This is possible because the indexes list is sorted by construction
            int firstBigger = firstBiggerOrEqualIndex(indexes, index);
            if (firstBigger != -1) {
                index = firstBigger + 1;
                lcs.append(c);
            }
        }
    }
    return lcs.toString();
}
The method firstBiggerOrEqualIndex performs the binary search:
// Returns the smallest value in 'indexes' that is >= index, or -1 if there is none
private static int firstBiggerOrEqualIndex(List<Integer> indexes, int index) {
    int start = 0;
    int end = indexes.size() - 1;
    int res = -1;
    while (start <= end) {
        int mid = (start + end) / 2;
        int value = indexes.get(mid);
        if (value > index) {
            // Candidate found; keep searching to the left for a smaller one
            res = value;
            end = mid - 1;
        } else if (value == index) {
            // An exact match is the smallest possible answer
            res = value;
            break;
        } else {
            start = mid + 1;
        }
    }
    return res;
}
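Assuming the two methods above are placed in the same class, a quick sanity check (the input strings are my own example):

public static void main(String[] args) {
    // Prints "BDAB", which is a longest common subsequence of these two inputs
    System.out.println(longestCommonSubsequence("ABCBDAB", "BDCABA"));
}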
For the general case there is the dynamic programming approach, which is O(n^2). For certain special cases, there are lower-complexity algorithms:
For a fixed alphabet size (one that doesn't grow with n), the Method of Four Russians brings the time down to O(n^2 / log n).
There are further optimised special cases beyond that as well.