QueryParser fails when AND is used in the query


I have a custom analyzer that parses keywords into n-grams:

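# Import paths assume PyLucene 8.x or later; adjust them for your version.
from org.apache.pylucene.analysis import PythonAnalyzer
from org.apache.lucene.analysis import LowerCaseFilter, StopFilter
from org.apache.lucene.analysis.core import LetterTokenizer
from org.apache.lucene.analysis.miscellaneous import ASCIIFoldingFilter
from org.apache.lucene.analysis.ngram import NGramTokenFilter

class Custom_Analyzer(PythonAnalyzer):

    def createComponents(self, fieldName):
        source = LetterTokenizer()
        filter = ASCIIFoldingFilter(source)
        # Chain each filter onto the previous one so ASCII folding is not discarded.
        filter = LowerCaseFilter(filter)
        filter = StopFilter(filter, StopFilter.makeStopSet(['and', 'or'], True))
        # Emit 5-character n-grams and keep the original token as well.
        filter = NGramTokenFilter(filter, 5, 5, True)
        return self.TokenStreamComponents(source, filter)

    def initReader(self, fieldName, reader):
        return reader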

If I use AND or OR in the query text, parsing fails:

QueryParser("field", Custom_Analyzer()).parse("SOMETHING AND")
QueryParser("field", Custom_Analyzer()).parse("SOMETHING OR")

I know AND is a keyword in Lucene for boolean search, but for some reason the stop filter isn't removing it. If the query text is lowercase, the QueryParser succeeds.

I know I can remove the AND and the OR before they reach the custom analyzer, but I feel like this should be handled within PyLucene. What am I doing wrong?
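For reference, here is a minimal sketch of the pre-processing workaround mentioned above. It assumes that the classic QueryParser recognizes the uppercase words AND and OR as operators in its own grammar before any text reaches the analyzer, which would explain why the stop filter never sees them. The helper name neutralize_operators is hypothetical and not part of PyLucene; it lowercases standalone AND/OR so they arrive as ordinary terms that the stop filter can then drop:

```python
import re

# Standalone uppercase AND / OR, which the query parser is assumed to treat
# as boolean operators. Word boundaries keep words like "BRAND" untouched.
_OPERATORS = re.compile(r"\b(AND|OR)\b")


def neutralize_operators(query_text):
    """Lowercase bare AND/OR so they are parsed as ordinary terms.

    Hypothetical helper for illustration; not part of PyLucene.
    """
    return _OPERATORS.sub(lambda match: match.group(0).lower(), query_text)
```

With this in place, the call would become QueryParser("field", Custom_Analyzer()).parse(neutralize_operators("SOMETHING AND")). The trade-off is that an intentional boolean query such as "A AND B" loses its operator semantics.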
