The Elasticsearch documentation on kNN retrieval says:
The filter is applied during the approximate kNN search to ensure that k matching documents are returned. This contrasts with a post-filtering approach, where the filter is applied after the approximate kNN search completes. Post-filtering has the downside that it sometimes returns fewer than k results, even when there are enough matching documents.
The typical way of doing attribute filtering with kNN retrieval is to do it either before kNN (attribute filtering is done first, and then vector similarity is computed by brute force over the matching documents) or after kNN (kNN retrieval is done first, efficiently, and then attribute filtering is applied to its results). Neither is perfect: the first approach is slow, and the second has low recall.
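To make the difference concrete, here is a minimal sketch in plain Python. It uses exact brute-force search on a made-up five-document corpus rather than an approximate index, and the document IDs, vectors, and `png`/`jpg` attribute are all assumptions for illustration only.

```python
import math

# Toy corpus: (doc_id, vector, attribute). All values are invented.
DOCS = [
    ("a", (0.0, 0.1), "png"),
    ("b", (0.15, 0.0), "jpg"),
    ("c", (0.2, 0.2), "jpg"),
    ("d", (0.9, 0.9), "png"),
    ("e", (1.0, 1.0), "png"),
]


def nearest(docs, query, k):
    """Return the k docs closest to query by Euclidean distance."""
    return sorted(docs, key=lambda d: math.dist(d[1], query))[:k]


def pre_filter_knn(docs, query, k, wanted):
    # Filter on the attribute first, then search only the survivors.
    matching = [d for d in docs if d[2] == wanted]
    return [d[0] for d in nearest(matching, query, k)]


def post_filter_knn(docs, query, k, wanted):
    # Search everything first, then drop hits that fail the filter.
    return [d[0] for d in nearest(docs, query, k) if d[2] == wanted]


QUERY = (0.0, 0.0)
pre = pre_filter_knn(DOCS, QUERY, k=2, wanted="png")
post = post_filter_knn(DOCS, QUERY, k=2, wanted="png")
print(pre, post)
```

With `k=2`, the pre-filtered search returns two `png` documents, while the post-filtered search returns only one, because one of the two nearest neighbours is a `jpg` and is discarded afterwards, even though three `png` documents exist.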
So how does Elasticsearch do attribute filtering during knn retrieval?
There are two different concepts for filtering:
* Hybrid search combines kNN search with lexical search.
* kNN filtering comes in two forms:
  * a `filter` inside of the kNN query (which is pre-filtering)
  * all other filters found in the Query DSL tree (which is post-filtering)

Pre-filtering: the filter is applied during the approximate kNN search to ensure that `k` matching documents are returned.

Post-filtering: the filter is applied after the approximate kNN search completes, which can result in fewer than `k` results, even when there are enough matching documents.

Yes, the first approach can be slow and the second approach can have low recall. You can increase the speed by decreasing the `num_candidates` value, and you can improve the search relevancy by increasing `num_candidates`. It's a trade-off between search speed and relevancy.

* pre-filtering kNN search (48 hits)
* post-filtering kNN search (24 hits)
* hybrid search (124 hits)
Note: The hit counts are shared only to give a rough idea. For these results, `k`, `num_candidates`, and `size` were all set to `100`.
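As a rough sketch, the three kinds of search could be written as the following request bodies for the `_search` endpoint, built here as plain Python dicts. The field names `image-vector`, `file-type`, and `title`, the query vector, and the match text are assumptions, not taken from the examples above. Whether the `knn` query accepts a `k` parameter depends on the Elasticsearch version, so it is left out of the `knn` query bodies here.

```python
import json

# Hypothetical mapping: "image-vector" is a dense_vector field,
# "file-type" a keyword field, "title" a text field.
QUERY_VECTOR = [-5, 9, -12]
FILE_TYPE_FILTER = {"term": {"file-type": "png"}}


def pre_filter_body(num_candidates=100, size=100):
    # The filter sits INSIDE the knn query, so it is applied
    # during the approximate search (pre-filtering).
    return {
        "size": size,
        "query": {
            "knn": {
                "field": "image-vector",
                "query_vector": QUERY_VECTOR,
                "num_candidates": num_candidates,
                "filter": FILE_TYPE_FILTER,
            }
        },
    }


def post_filter_body(num_candidates=100, size=100):
    # The filter sits elsewhere in the Query DSL tree (bool.filter),
    # so it is applied after the kNN search (post-filtering).
    return {
        "size": size,
        "query": {
            "bool": {
                "must": {
                    "knn": {
                        "field": "image-vector",
                        "query_vector": QUERY_VECTOR,
                        "num_candidates": num_candidates,
                    }
                },
                "filter": FILE_TYPE_FILTER,
            }
        },
    }


def hybrid_body(k=100, num_candidates=100, size=100):
    # Hybrid search: a lexical query and a top-level knn section
    # in the same request.
    return {
        "size": size,
        "query": {"match": {"title": "mountain lake"}},
        "knn": {
            "field": "image-vector",
            "query_vector": QUERY_VECTOR,
            "k": k,
            "num_candidates": num_candidates,
        },
    }


print(json.dumps(pre_filter_body(), indent=2))
```

Lowering `num_candidates` in any of these bodies should make the search faster at the cost of relevancy, and raising it should do the opposite.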