I'm using Lucene.Net and the Snowball analyzer in an ASP.NET application.
With the specific language I'm using, I have the following issue: two particular words with different meanings produce the same result after stemming, so a search for either of them returns results for both.
How can I teach the analyzer either not to stem these two words or, while still stemming them, to know that they have different meanings?
With Lucene 4.0, EnglishAnalyzer now has this ability, since it has a constructor which takes a stemExclusionSet. Of course, Lucene.Net isn't up to Lucene 4 yet, so fat lot of good that does.
However, EnglishAnalyzer does this by using a KeywordMarkerFilter. So you can create your own Analyzer, overriding the tokenStream method, and adding into the chain a KeywordMarkerFilter just before the SnowballFilter. Something like:
You'll need to construct your own stemExclusionSet (see CharArraySet).
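
To show what a stem exclusion set does without tying it to a particular Lucene or Lucene.Net version, here is a minimal plain-Java sketch. It uses no Lucene classes at all: StemExclusionSketch, toyStem and analyze are hypothetical names made up for this illustration, and the toy stemmer (strip a trailing "s") is only a stand-in for Snowball. The point is just the ordering: check the exclusion set first, and only stem tokens that are not in it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: plain Java, not the Lucene API.
final class StemExclusionSketch {

    // Hypothetical toy stemmer standing in for Snowball:
    // strips a trailing "s" from tokens longer than three characters.
    static String toyStem(String token) {
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Lowercases each token, then stems it unless it is in the exclusion set.
    // The exclusion check plays the role of the keyword-marking step placed
    // before the stemmer; entries in the set are assumed to be lowercase.
    static List<String> analyze(List<String> tokens, Set<String> stemExclusionSet) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            boolean isKeyword = stemExclusionSet.contains(lower);
            out.add(isKeyword ? lower : toyStem(lower));
        }
        return out;
    }
}
```

With an empty exclusion set, "news" and "new" both come out as "new" under this toy stemmer, which is the collision described in the question. Putting "news" in the set leaves it untouched while other words are still stemmed. Whatever form the real filter chain takes, the same analysis should be used at both index time and query time, otherwise the excluded words will not match what was indexed.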