I am trying to query an AWS Elasticsearch domain from a Lambda worker.
To do so, I am using http-aws-es and the main JavaScript client for Elasticsearch.
I query documents with the following relevant fields:
- A `ref` field - String
- A `status` field - String ENUM (REMOVED, BLOCKED, PUBLISHED, PENDING, VERIFIED)
- A `field` field - String Array
- A `thematics` field - String Array
What I want to achieve is:
1. Filter out all documents that are not either `PUBLISHED` or `VERIFIED`, or where the `ref` field is set
2. Return the best matches for my `keywords` argument (string array) relative to the values in `field` and `thematics`
3. Sort to put documents with `PUBLISHED` status first
4. Limit the number of results to 20
I found the `more_like_this` query and gave it a try. I built my query step by step, and the current version at least doesn't return an error, but no documents are returned. It is still missing the `ref` filter as well as #3 and #4 from above. Here is the query:
const client = new elasticsearch.Client({
  host: ELASTICSEARCH_DOMAIN,
  connectionClass: httpAwsEs,
  amazonES: {
    region: AWS_REGION,
    credentials: new AWS.EnvironmentCredentials('AWS')
  }
})
let keywords = event.arguments.keywords
let rst = await client.search({
  body: {
    'query': {
      'bool': {
        'filter': {
          'bool': {
            'must_not': [
              {
                'term': {
                  'status': 'REMOVED'
                }
              },
              {
                'term': {
                  'status': 'PENDING'
                }
              },
              {
                'term': {
                  'status': 'BLOCKED'
                }
              }
            ]
          }
        },
        'must': {
          'more_like_this': {
            'fields': ['field', 'thematics'],
            'like': keywords,
            'min_term_freq': 1,
            'max_query_terms': 2
          },
          'should': [
            {
              'term': {
                'status': 'PUBLISHED'
              }
            }
          ]
        }
      }
    }
  }
})
console.log(rst)
return rst
I have to upload my Lambda code every time I want to test a change, which makes debugging slow. Since I have never written ES queries before, I would appreciate some hints on how to proceed, or on whether I am misusing the ES query syntax.
EDIT:
As requested, here is my index mapping (with JS type):
- city text (String)
- contact_email text (String)
- contact_entity text (String)
- contact_firstname text (String)
- contact_lastname text (String)
- contacts text (String list)
- country text (String)
- createdAt date (String)
- description text (String)
- editKey text (String)
- field text (String)
- id text (String)
- name text (String)
- pubId text (String)
- ref text (String)
- state text (String)
- status text (String)
- thematics text (String Array)
- type text (String Array)
- updatedAt (String)
- url text (String)
- verifyKey text (String)
- zone text (String Array)
Taken from the AWS Elasticsearch management console (index tabs > mappings).
There are one or two issues in your query (`should` inside `must` and `must_not` inside `filter`). Try the simplified query below instead:
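As a rough sketch of what such a simplified query could look like (not necessarily the exact query intended here), the body below keeps every clause directly under a single top-level `bool`. The helper name `buildSearchBody` is hypothetical. Several things are assumptions on my side: that documents with `ref` set should be excluded; that `status` is analyzed with the default standard analyzer, so the indexed terms are lowercase (`term`/`terms` do not analyze their input, which would explain why `'PUBLISHED'` matches nothing on a `text` field); and that `min_doc_freq: 1` is wanted, since its default of 5 can drop every keyword on a small index.

```javascript
// Hypothetical helper: builds the search body for a list of keywords.
function buildSearchBody (keywords) {
  return {
    // Goal #4: return at most 20 hits.
    size: 20,
    query: {
      bool: {
        // Goal #1a: keep only PUBLISHED or VERIFIED documents.
        // Lowercase values assume the standard analyzer on a `text` field.
        filter: [
          { terms: { status: ['published', 'verified'] } }
        ],
        // Goal #1b (assumed reading): drop documents where `ref` is set.
        must_not: [
          { exists: { field: 'ref' } }
        ],
        // Goal #2: score documents by similarity to the keywords.
        must: [
          {
            more_like_this: {
              fields: ['field', 'thematics'],
              like: keywords,
              min_term_freq: 1,
              min_doc_freq: 1
            }
          }
        ],
        // Goal #3: optional clause that raises the score of PUBLISHED
        // documents. This is a boost, not a strict sort.
        should: [
          { term: { status: 'published' } }
        ]
      }
    }
  }
}
```

It would be used as `client.search({ body: buildSearchBody(keywords) })`. Note that the `should` clause only nudges `PUBLISHED` documents upward; if they must always come first regardless of relevance, a sortable field (for example a `keyword` sub-field on `status`) would be needed instead.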