How to do a search with priority in Django?

I'm using djangorestframework with a MySQL database. I have a view that returns a list based on a search query param. I'm using rest_framework.filters.SearchFilter for the search-based filtering. Here's my view:

from rest_framework import filters
from rest_framework.generics import ListAPIView
...

class FooListView(ListAPIView):
    serializer_class = SymbolSerializer
    queryset = Symbol.objects.all()
    filter_backends = [filters.SearchFilter]
    search_fields = ['field_A', 'field_B', 'field_C']

An example URL to call is:

http://localhost:8000/symbols/symbols/?search=bird

Everything works fine, but I need a feature that filters.SearchFilter doesn't support: I want the search results to be ordered by the priority of search_fields, so that a match in an earlier field ranks above a match in a later one.

For example, here are two records:

foo1 : {"field_A": "any", "field_B": "many", "field_C": "bar", "id": 3}

foo2 : {"field_A": "many", "field_B": "any", "field_C": "bar", "id": 4}

Now when I search with search=many, I want the view to return a list in which foo2 ranks higher than foo1 (like this: [foo2, foo1]), because a match in field_A should take priority over a match in field_B. Instead, it just returns a list sorted by id ([foo1, foo2]).

Any help?

There are 2 answers

Maximiliano Castro

I just stumbled into this same exact problem.

My solution was to tweak the search logic of DRF's filters.SearchFilter, drawing on this response, and I ended up with the following custom filter class:

import operator
from functools import reduce

from django.db import models
from rest_framework import filters


class PrioritizedSearchFilter(filters.SearchFilter):
    def filter_queryset(self, request, queryset, view):
        """Override to return prioritized results."""
        # Copied from DRF
        search_fields = getattr(view, 'search_fields', None)
        search_terms = self.get_search_terms(request)

        if not search_fields or not search_terms:
            return queryset

        # Check construct_search's signature in your installed DRF version;
        # it is assumed here to take only the field name.
        orm_lookups = [
            self.construct_search(str(search_field))
            for search_field in search_fields
        ]

        # Will contain one queryset per search term
        querysets = []

        for search_term in search_terms:
            queries = [
                models.Q(**{orm_lookup: search_term})
                for orm_lookup in orm_lookups
            ]

            # Conditions for the annotated priority value.
            # Priority == reversed index of the search field.
            # Example:
            #   search_fields = ['field_A', 'field_B', 'field_C']
            #   Priorities are field_A = 2, field_B = 1, field_C = 0
            when_conditions = [
                models.When(query, then=models.Value(len(queries) - i - 1))
                for i, query in enumerate(queries)
            ]

            # Queryset for this search term, annotated with its priority
            querysets.append(
                queryset.filter(reduce(operator.or_, queries)).annotate(
                    priority=models.Case(
                        *when_conditions,
                        output_field=models.IntegerField(),
                        default=models.Value(-1),  # Lowest possible priority
                    )
                )
            )

        # Intersect all querysets and order by highest priority
        queryset = reduce(operator.and_, querysets).order_by('-priority')

        # Copied from DRF
        if self.must_call_distinct(queryset, search_fields):
            # Filtering against a many-to-many field requires us to
            # call queryset.distinct() in order to avoid duplicate items
            # in the resulting queryset.
            # We try to avoid this if possible, for performance reasons.
            queryset = queryset.distinct()
        return queryset

Use filter_backends = [PrioritizedSearchFilter] and you're set.
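To make the priority rule concrete, here is a plain-Python sketch (no Django involved) of what the Case/When annotation works out for the two records from the question. The helper names field_priority and prioritized_search are made up for illustration, and a case-insensitive substring test stands in for SearchFilter's default icontains lookup.

```python
SEARCH_FIELDS = ["field_A", "field_B", "field_C"]

foo1 = {"field_A": "any", "field_B": "many", "field_C": "bar", "id": 3}
foo2 = {"field_A": "many", "field_B": "any", "field_C": "bar", "id": 4}


def field_priority(record, term, search_fields):
    """Priority of the first matching field, or -1 if nothing matches.

    Mirrors Case(When(...), ...): the first field that matches wins, and
    earlier fields get higher values (len - index - 1).
    """
    count = len(search_fields)
    for index, field in enumerate(search_fields):
        if term.lower() in str(record.get(field, "")).lower():
            return count - index - 1
    return -1


def prioritized_search(records, term, search_fields):
    """Keep matching records and sort them by descending priority."""
    scored = [(field_priority(r, term, search_fields), r) for r in records]
    matches = [(score, r) for score, r in scored if score >= 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in matches]


result = prioritized_search([foo1, foo2], "many", SEARCH_FIELDS)
print([r["id"] for r in result])  # [4, 3], i.e. [foo2, foo1]
```

foo2 matches in field_A (priority 2) and foo1 only in field_B (priority 1), which gives the [foo2, foo1] order asked for in the question.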

João Seckler

Maximiliano's answer is almost perfect, but it has a bug that produces duplicate entries. If the search term matches an object in more than one field, the queryset's distinct will treat the same object annotated with different priorities as different objects. As a result, the search filter returns duplicated entries.

Since Django 3.2, the fix is as simple as replacing annotate with alias. From its documentation:

Same as annotate(), but instead of annotating objects in the QuerySet, saves the expression for later reuse with other QuerySet methods. (...) alias() can be used in conjunction with annotate(), exclude(), filter(), order_by(), and update().

In other words, alias solves the problem because it can be used with order_by but is disregarded by distinct.
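As a rough sketch of that swap, the ordering step could look like the helper below. The name order_by_priority is hypothetical, queryset is assumed to be a Django 3.2+ QuerySet, and priority_expression is assumed to be the Case/When expression built in the filter above. I have not tested it against every backend.

```python
def order_by_priority(queryset, priority_expression):
    """Order a queryset by a priority expression without selecting it.

    Assumes `queryset` is a Django QuerySet (3.2 or newer) and that
    `priority_expression` is a Case/When expression like the one used
    in the search filter.
    """
    # alias() stores the expression under the name "priority" so that
    # order_by() can refer to it, but unlike annotate() it does not
    # attach the value to the returned objects.
    return queryset.alias(priority=priority_expression).order_by("-priority")
```

Inside the filter, this amounts to writing .alias(priority=models.Case(...)) where the original code has .annotate(priority=models.Case(...)).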