I'm trying to compute a diff (longest common subsequence) between two lists of strings. I'm guessing difflib could be useful here, but difflib.ndiff annotates the output with -, +, etc. For instance:
from difflib import ndiff
t1 = 'one 1\ntwo 2\nthree 3'.splitlines()
t2 = 'one 1\ntwo 29\nthree 3'.splitlines()
d = list(ndiff(t1, t2))
print(d)
['  one 1', '- two 2', '+ two 29', '?      +\n', '  three 3']
Is tokenising and removing the letter-codes in the output the right way? Is this the proper Pythonic way of diffing lists?
If all you want is the difference of the first list from the second, you can convert them to sets and take the set difference using the - operator. Example:
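A minimal sketch of that set-based approach, reusing the t1 and t2 lists from the question. The names only_in_t1, only_in_t2, and ordered are made up for illustration. Keep in mind that sets discard ordering and duplicates, so this is not a line-by-line diff:

```python
t1 = 'one 1\ntwo 2\nthree 3'.splitlines()
t2 = 'one 1\ntwo 29\nthree 3'.splitlines()

# Items that appear in one list but not the other.
# Converting to sets drops ordering and duplicate lines.
only_in_t1 = set(t1) - set(t2)
only_in_t2 = set(t2) - set(t1)

print(only_in_t1)  # {'two 2'}
print(only_in_t2)  # {'two 29'}

# If the original order matters, filter the list against a set instead.
t2_lines = set(t2)
ordered = [line for line in t1 if line not in t2_lines]
print(ordered)     # ['two 2']
```

If you need to know where in the sequence a change happened, the set approach probably won't be enough, and something from difflib is likely a better fit.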