Solr subquery returns no results when every condition in the subquery is negated and it is combined with AND


Consider the following two queries:

#1: "-a:d +b:2 +(-c:[* TO *])" does not return any results.

#2: "-a:d +b:2 -c:[* TO *]" does return results.

For #1, it evaluates c, then a, then b.

For #2, it evaluates a, then b, then c.

I would expect the two queries to be equivalent. What is the difference between them?

I am using the standard query parser. Thanks in advance.

MatsLindh On BEST ANSWER

Think of each part of the query in Solr as representing a set of documents. A document is either part of the set or not, and you then perform operations between these sets as part of your query.

AND means that "documents that are in both these sets should be included" (an intersection), OR means "documents that are in either of these sets should be included" (a union) and NOT or - means "subtract this set from the other set".
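
As a rough illustration of this set model (not how Solr is implemented internally), the three operations map directly onto Python's set operators. The document IDs below are made up for the example.

```python
# Toy model: each clause is represented by the set of IDs of the documents it matches.
# The IDs are hypothetical and only serve to illustrate the three operations.
matches_x = {1, 2, 3}
matches_y = {2, 3, 4}

both = matches_x & matches_y        # AND: intersection
either = matches_x | matches_y      # OR: union
x_without_y = matches_x - matches_y # NOT / -: subtract one set from the other

print(both, either, x_without_y)
```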

Given your first query, it gets parsed as:

-a:d             # subtract the set of documents matched by `a:d` from the existing set
+b:2             # AND include the set of documents matched by `b:2` (+ => must be present)
+(-c:[* TO *])   # AND include the set of documents matched by:
                 #   the empty set minus the set of documents matching `c:[* TO *]`

Solr sometimes prefixes the set of all documents to your query, so that you start with a set representing "all documents" instead of "no documents". If we ignore that and assume that we always start with an empty set, we can see why we get no documents:

current: empty_set, statement: -a:d
  subtract the set matched by `a:d`
  # result is still an empty set
  
current: empty_set, statement: +b:2
  add the set matched by `b:2`
  # result is a set that contains documents with `b:2`

current: (b:2), statement: +(-c:[* TO *])
  INTERSECT it (since it's given with +) with the set represented by:
    empty_set
      subtract the set matched by `c:[* TO *]` (documents with a value in `c`) 
      # result is still an empty set
  # result is an empty set, since the internal part is an empty set and we require a match through `+`
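
The walkthrough above can be sketched in Python under the same simplifying assumption (every query or subquery starts from the empty set). The `evaluate` helper and the four-document index are hypothetical and only model the set logic; they are not Solr's actual implementation.

```python
def evaluate(required, prohibited):
    """Evaluate a boolean query in the simplified set model.

    required:   list of sets that must all match (the `+` clauses)
    prohibited: list of sets to subtract (the `-` clauses)
    With no required clause we start from the empty set, so the result is empty.
    """
    if not required:
        return set()
    result = set.intersection(*required)
    for excluded in prohibited:
        result = result - excluded
    return result

# Hypothetical index: document ID -> fields
docs = {
    1: {"a": "d", "b": "2"},
    2: {"a": "x", "b": "2"},
    3: {"a": "x", "b": "2", "c": "5"},
    4: {"a": "x", "b": "3"},
}
a_d = {i for i, f in docs.items() if f.get("a") == "d"}    # a:d
b_2 = {i for i, f in docs.items() if f.get("b") == "2"}    # b:2
has_c = {i for i, f in docs.items() if "c" in f}           # c:[* TO *]

# #1: -a:d +b:2 +(-c:[* TO *])
sub = evaluate(required=[], prohibited=[has_c])            # purely negative -> empty
query_1 = evaluate(required=[b_2, sub], prohibited=[a_d])

# #2: -a:d +b:2 -c:[* TO *]
query_2 = evaluate(required=[b_2], prohibited=[a_d, has_c])

print(query_1)  # set()
print(query_2)  # {2}
```

In this model the purely negative subquery has nothing to subtract from, so requiring it with `+` empties the whole result, while the flat form subtracts `c:[* TO *]` from the documents already matched by `b:2`.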

If we change the last step to include all documents first (*:*):

current: (b:2), statement: +(*:* -c:[* TO *])
  INTERSECT (since it's given with +) with the set represented by:
    all_documents
      subtract the set matched by `c:[* TO *]` (documents with a value in `c`)
      # result is all_documents except those with a value in `c`

  # resulting set: documents with `b:2` and without a value for `c`.
  #                                     ^- not "either set", but those that are in both sets - so an intersection
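
The same toy model shows the effect of adding `*:*` inside the parentheses. The index below is hypothetical, and the code is only a sketch of the set logic for `-a:d +b:2 +(*:* -c:[* TO *])`.

```python
# Hypothetical index: document ID -> fields
docs = {
    1: {"a": "d", "b": "2"},
    2: {"a": "x", "b": "2"},
    3: {"a": "x", "b": "2", "c": "5"},
    4: {"a": "x", "b": "3"},
}
all_docs = set(docs)                                       # *:*
a_d = {i for i, f in docs.items() if f.get("a") == "d"}    # a:d
b_2 = {i for i, f in docs.items() if f.get("b") == "2"}    # b:2
has_c = {i for i, f in docs.items() if "c" in f}           # c:[* TO *]

sub = all_docs - has_c          # (*:* -c:[* TO *]): documents without a value in c
result = (b_2 & sub) - a_d      # +b:2 +(...) -a:d

print(sub)     # {1, 2, 4}
print(result)  # {2}
```

Because the subquery now starts from all documents, the subtraction leaves a non-empty set, and intersecting it with `b:2` gives the same documents the flat query `-a:d +b:2 -c:[* TO *]` would return in this model.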