We are working in a project where we have multiple databases with same schemas and tables. Each gets its data from different activities performed and we want to consolidate everything into one database for reporting and dashboarding. Is there an efficient way to do it with Postgresql.
Example: I have 2 databases with a table named "messages" and columns "message_id", "message","created_date" like below.
DB1:
message_id message created_date
1 New message for A 2024-01-30 00:00:00
2 New message for B 2024-01-30 01:00:00
DB2:
message_id message created_date
1 New message for C 2024-01-30 00:00:01
2 New message for D 2024-01-30 01:00:02
Outcome expected is:
DB3:
message_id message created_date
DB1-1 New message for A 2024-01-30 00:00:00
DB1-2 New message for B 2024-01-30 01:00:00
DB2-1 New message for C 2024-01-30 00:00:01
DB2-2 New message for D 2024-01-30 01:00:02
The approach we are trying is to write some jobs to push the data from each postgres database to the parent / consolidated one. Looking for some better and efficient approaches.
you can use FDW postgres, extension available by default in your server. so you will resolve visibility problem between your clusters and/or databases.
after that you can create foreing tables to move needed data from child to parent.
you will need to build views on target schema to visualize consoledated data.
it depends on your performance and data freshness to choose between views or Materialize views.
https://www.postgresql.org/docs/current/postgres-fdw.html