I have two text files, each containing one word per line. For reference, the first file contains a limited list of unique words, and the second is a longer file with many words, many of which recur.
My goal is to keep all the words in the second file which exist in the first file (the wordlist), and remove all the others. The words in the wordlist are sorted, but the issue I am having is that I want the words in the second file to stay in the order they are in, which is unsorted. As far as I know this keeps me from using the 'comm' command, which seems to require sorted lists to find matches between the two files.
Is there another utility I can use which allows me to achieve my goals, or is there a way to use comm to actually output the joint words in the order they appear in the second file?
Yes, there are countless programming languages and utilities that can do this. Typically, in a Linux environment, one would write an awk script.
No, not with comm. But there's join. With it, you can number the lines (nl -w1) of the file whose order you want to preserve, then sort on the second field, then join -o1.1,1.2 on the second field against the sorted other file, then re-sort on the line numbers and remove the line numbers with cut. Because of the line numbers, the original line order is preserved.
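Both suggestions can be sketched roughly as follows. This is only an illustration under some assumptions: the file names wordlist.txt and text.txt, the sample words, and the function names filter_awk and filter_join are made up for the example, the wordlist is assumed to be non-empty, and neither file is assumed to contain blank lines. The awk one-liner is just one common way to write such a script, not necessarily what the answer had in mind. The join version forces LC_ALL=C and re-sorts the wordlist so that sort and join agree on the collation order.

```shell
#!/bin/sh
# Hypothetical sample data: a sorted, unique wordlist and an unsorted text with repeats.
tmp=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmp/wordlist.txt"
printf '%s\n' cherry dog apple cherry egg banana apple > "$tmp/text.txt"

# awk approach: while reading the first file (NR == FNR), remember each word;
# while reading the second file, print only the lines that were remembered.
filter_awk() {
    awk 'NR == FNR { keep[$0]; next } $0 in keep' "$1" "$2"
}

# join approach: number the lines, sort by word, join against the wordlist,
# restore the original order by line number, then cut the numbers away.
filter_join() {
    nl -w1 "$2" | LC_ALL=C sort -k2,2 > "$tmp/numbered"
    LC_ALL=C sort "$1" > "$tmp/words"
    LC_ALL=C join -1 2 -2 1 -o 1.1,1.2 "$tmp/numbered" "$tmp/words" |
        sort -n | cut -d' ' -f2
}

# Each call prints, one word per line: cherry apple cherry banana apple
filter_awk "$tmp/wordlist.txt" "$tmp/text.txt"
filter_join "$tmp/wordlist.txt" "$tmp/text.txt"
```

The awk version reads each file once and needs no sorting, which is probably why it is the usual choice. The join version is mostly of interest when the wordlist is too large to hold in memory.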