I have a column of lengthy strings, and I only need a portion of each one for my analysis. I need to extract a substring that contains the 5 sentences before and the 5 sentences after each occurrence of a keyword, without repeating sentences where those ranges overlap.
For example, the original cell might contain 4 mentions of the keyword:
This is one sentence. And another. And another. And another. And another. And another. And another. And another. And another. Here, there is a first mention to the keyword. Another sentence. And another. And another. And another. And another. A second mention to the keyword. This is another. And another. And another. And another. And another. And another. And another. And another. And another. And another. A third mention to the keyword. And a forth mention. Another sentence. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. And another. (...) And another. And another.
The expected new cell would be:
And another. And another. And another. And another. And another. Here, there is a first mention to the keyword. Another sentence. And another. And another. And another. And another. A second mention to the keyword. This is another. And another. And another. And another. And another. And another. And another. And another. And another. And another. A third mention to the keyword. And a forth mention. Another sentence. And another. And another. And another. And another.
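To make the expected behaviour concrete, here is a minimal Python sketch of the logic described above. It is only an illustration under some assumptions: sentences are taken to end with ".", "!" or "?" followed by whitespace, the keyword match is a case-insensitive substring test, and the names `extract_context` and `window` are hypothetical, not anything built into OpenRefine.

```python
import re

def extract_context(text, keyword, window=5):
    # Hypothetical helper: keep each keyword sentence plus `window`
    # sentences on either side, with overlapping ranges merged.
    # Naive split (assumption): a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    keep = set()
    for i, sentence in enumerate(sentences):
        if keyword.lower() in sentence.lower():
            start = max(0, i - window)
            end = min(len(sentences), i + window + 1)
            # Storing indices in a set means a sentence shared by two
            # ranges is kept only once.
            keep.update(range(start, end))
    # Rebuild the text in the original sentence order.
    return ' '.join(sentences[i] for i in sorted(keep))
```

Sentences are tracked by position rather than by their text, so identical sentences such as "And another." are not collapsed into one; only overlapping ranges are merged. Ranges that do not touch are joined with a single space and no separator. If the expression language in OpenRefine is set to Python / Jython, something like this could presumably be pasted into the expression box and followed by `return extract_context(value, "keyword")`.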
Is there a way to do this in OpenRefine with a GREL function, by adding a new column based on the existing column? Thanks for your help!