I have set up a CDC replication process that takes several tables from my PostgreSQL database and synchronises them to OpenSearch for efficient querying. The pipeline works like this:
AWS Aurora Postgres -> AWS DMS -> Kinesis DataStream -> Lambda -> OpenSearch.
The indexes are all in OpenSearch.
The problem with this approach is that my queries will need joins. I'd love to avoid that and instead build a denormalized, searchable document before the data reaches OpenSearch. However, I am working with streams and receive changes from different tables in separate events, so I am not sure how to achieve that in real time.
What's the right way to solve this? I've used ksqlDB in the past, and there I was able to join data streams and build the data I needed using the middleware database.
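To make the idea concrete, here is a minimal sketch of the kind of join described above: keep the latest row per table in a keyed store and rebuild the denormalized document whenever any side changes. Everything in it is an assumption for illustration only: the `orders` and `customers` tables, their columns, the `handle_change` and `build_document` helpers, and the in-memory dictionaries standing in for a real middleware store.

```python
from typing import Any, Dict, List

# Hypothetical in-memory stand-ins for a middleware store keyed by primary key.
customers: Dict[int, Dict[str, Any]] = {}
orders: Dict[int, Dict[str, Any]] = {}


def build_document(order: Dict[str, Any]) -> Dict[str, Any]:
    """Join one order with the latest known customer row."""
    customer = customers.get(order["customer_id"])
    return {
        "order_id": order["id"],
        "total": order["total"],
        "customer": customer,  # None until the customer row has arrived
    }


def handle_change(table: str, row: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Store the changed row and return every document that must be re-indexed."""
    if table == "customers":
        customers[row["id"]] = row
        # A customer change affects every order that references it.
        affected = [o for o in orders.values() if o["customer_id"] == row["id"]]
    elif table == "orders":
        orders[row["id"]] = row
        affected = [row]
    else:
        affected = []
    return [build_document(o) for o in affected]
```

In a pipeline like this one, each CDC event would call `handle_change` and the returned documents would be written to the search index. The linear scan over `orders` is only acceptable in a sketch; a real store would need a lookup by `customer_id`.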
Thanks.
I was finally able to solve this by changing the flow a bit:
This flow is very similar to (almost the same as) the one I used with ksqlDB.
Another option could be:
If anyone has a better idea, please add your comments.