I'm looking for a solution to aggregate data quickly with a regular delta load of any updates to the transactional database.
We currently have the transactional DB and then a read-only replica, which is being used for our reporting, by changing the connection string to read-only application intent. This isn't ideal as we have to aggregate on the fly with SQL queries and we also can't index separately to support the read-only queries.
In our old application we used to use CDC with a DW and SSAS cube, however, after moving to Azure the multideimensional cube is not supported. I have been looking at the possibility of Azure Sync to replicate the data to a second repository and then using databricks for the aggregation. I'm not sure if this is the right approach as I would then need to pass the aggregated data back to the reporting repository (whatever I decide it's going to be). I don't want to have to split the raw and aggregated data and ideally want a single data source for all of it.
I'm sure this is a common design challenge, so I would like to know
a) What are the best tools in Azure for replication / delta loads to a second repository?
b) What is the best tool for aggregating data quickly?