SearchHitSupport returns hits instead of aggregated data


I am using spring-data-elasticsearch to run some aggregations. This is how I build the query:

NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder()
        .withQuery(query)
        .withPageable(pageRequest)
        .withAggregations(aggregation);

To get the results I tried a simple approach:

SearchPage<MyData> searchPage = SearchHitSupport.searchPageFor(
        elasticsearchTemplate.search(query,
                MyData.class,
                IndexCoordinates.of(myDataIndexName)),
        query.getPageable());

But as a result I get all hits (probably the ones returned by the query, before aggregation).

I get the correct results when I iterate over all buckets and pick the documents out myself. Isn't there an easier way?

Collecting results from buckets:

SearchHits<MyData> search = elasticsearchTemplate.search(query,
        MyData.class,
        IndexCoordinates.of(myData));

ElasticsearchAggregations aggregations = (ElasticsearchAggregations) search.getAggregations();
Iterator<Aggregation> iterator = aggregations.aggregations().iterator();

List<org.elasticsearch.search.SearchHit> results = new ArrayList<>();
while (iterator.hasNext()) {
    // The cast assumes every top-level aggregation is a terms aggregation
    ParsedTerms aggregation = (ParsedTerms) iterator.next();
    List<org.elasticsearch.search.SearchHit> collect = aggregation.getBuckets()
            .stream()
            // Sub-aggregations of each bucket
            .map(b -> b.getAggregations().asList())
            .flatMap(Collection::stream)
            // The cast assumes every sub-aggregation is a top_hits aggregation
            .flatMap(c -> Arrays.stream(((ParsedTopHits) c).getHits().getHits()))
            .collect(Collectors.toList());
    results.addAll(collect);
}

List<MyData> myDataList = getMyData(results);

The getMyData method, which converts the search hits to my custom POJO objects:

private List<MyData> getMyData(List<org.elasticsearch.search.SearchHit> results) {
    ObjectMapper objectMapper = new ObjectMapper();
    objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
    objectMapper.registerModule(new JavaTimeModule());
    return results.stream().map(e -> {
        String source = e.getSourceAsString();
        try {
            return objectMapper.readValue(source, MyData.class);
        } catch (JsonProcessingException ex) {
            // Note: a hit that fails to parse ends up as a null entry in the list
            log.error(ex.getMessage(), ex);
            return null;
        }
    }).collect(Collectors.toList());
}

There is 1 answer

P.J.Meisch

When you send a query together with aggregations to Elasticsearch, you always get both the query results and the aggregations back; that's the normal behaviour. You can prevent the query documents from being returned by calling setMaxResults(0) on your NativeQuery, but then you would not need the paging access.

So I don't see what your problem actually is. If you wrap the returned SearchHits in a SearchPage and want to get to the aggregations, use SearchPage.getSearchHits() and retrieve the aggregations from that, as you showed in your code.

If you are looking for an easier way to access the aggregations than iterating over them yourself: no, there isn't one.
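Since the iteration has to be written by hand anyway, it can at least be kept in one small helper. The following is only a library-free sketch of that idea: the Bucket class, flatten and parseOrNull are hypothetical stand-ins (not Spring Data Elasticsearch or Elasticsearch client types), with each bucket modelled as a key plus the raw sources of its top hits. In real code the mapper would be the ObjectMapper-based conversion from the question.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

class BucketFlattener {

    // Hypothetical stand-in for one terms bucket: its key and the raw
    // sources of the documents returned by its top_hits sub-aggregation.
    static final class Bucket {
        final String key;
        final List<String> topHitSources;

        Bucket(String key, List<String> topHitSources) {
            this.key = key;
            this.topHitSources = topHitSources;
        }
    }

    // Flattens the hits of all buckets into one list, converts each hit
    // with the given mapper and drops hits the mapper could not convert.
    static <T> List<T> flatten(List<Bucket> buckets, Function<String, T> mapper) {
        return buckets.stream()
                .flatMap(bucket -> bucket.topHitSources.stream())
                .map(mapper)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    // Example mapper: returns null on bad input, like getMyData does
    // when JSON parsing fails.
    static Integer parseOrNull(String source) {
        try {
            return Integer.valueOf(source);
        } catch (NumberFormatException ex) {
            return null;
        }
    }

    public static void main(String[] args) {
        List<Bucket> buckets = List.of(
                new Bucket("a", List.of("1", "2")),
                new Bucket("b", List.of("x", "3")));
        System.out.println(flatten(buckets, BucketFlattener::parseOrNull)); // prints [1, 2, 3]
    }
}
```

Filtering out the null entries is a design choice of this sketch; the code in the question keeps them in the returned list.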

A client-agnostic implementation for returned aggregations in Spring Data Elasticsearch would mean abstracting away the classes that the client implementations provide (the current Elasticsearch Java client has 226 classes in the co.elastic.clients.elasticsearch._types.aggregations package); that is not manageable. And there are other clients, like the one used in the Spring Data Opensearch project. Furthermore, when parsing an aggregations response it might be necessary to know which aggregations and types were in the request in order to match that information to the returned classes.