Does anybody have prior experience setting up an Oracle Exadata Active-Active cluster across OCI regions? If so, could you share best practices and guiding principles?
The goal is to set up an Active-Active Oracle Exadata cluster across two OCI regions, so customer can readily access other region if one region goes down. It has to be spontaneous without any downtime. The second site should not be a read-only passive site; it must be usable in read-write mode at any given point. The requirement is NOT to waste infrastructure on a passive or standby site; all infrastructure is expected to be active and serving customers.
In other words, the customer should be able to access both regions at the same time, and a switch between regions has to happen automatically, without any downtime.
I know that Data Guard and GoldenGate are the usual options, but I am looking for specific implementation best practices and architectural principles, taking into account an application middle tier that has to reach the database cluster automatically.
You mentioned that "The goal is to set up an Active-Active Oracle Exadata cluster across two OCI regions, so customer can readily access other region if one region goes down. It has to be spontaneous without any downtime."
The terms Active-Active and Active-Standby describe database semantics rather than an Exadata cluster (DB System/VM Cluster). So I am going to take the question to mean that the goal is to design a DR solution for an Exadata database with stringent RTO goals, and that you want a solution that is automatic/spontaneous, without downtime.
a. With proper planning and execution, Oracle Data Guard and Active Data Guard role transitions can effectively minimize downtime and ensure that the database environment is restored with minimal impact on the business.
b. A failover is used when the primary database is deemed lost or unrecoverable, or when the expected time to repair exceeds the required recovery time objective (RTO). During a failover, the primary database is taken offline at one site and a standby database is brought online as the new primary database. Failover can be completely automated using Data Guard Fast-Start Failover, or it can be a manual, administrator-driven process. Fast-Start Failover eliminates the uncertainty inherent in a process that requires manual intervention, assuming similar measures have been taken to automate the failover of the application tier to the new primary database. Fast-Start Failover automatically starts a database failover within seconds of an outage being detected, and the failover itself can complete in seconds.
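To make the "automatic failover after an outage is detected" idea concrete, here is a minimal toy model in Python. It is not Oracle code and does not call any Data Guard API; the class name, the region names and the 30-second threshold are all assumptions chosen for illustration. It only shows the basic rule: if the observer has not heard from the primary for longer than a configured threshold, the roles are swapped.

```python
from dataclasses import dataclass


@dataclass
class ObserverModel:
    """Toy model of an observer watching a primary database (hypothetical, not Oracle code)."""
    threshold_s: float = 30.0      # assumed silence threshold before failing over
    last_contact_s: float = 0.0    # time of the last successful contact with the primary
    primary: str = "region-a"      # hypothetical site names
    standby: str = "region-b"

    def heartbeat(self, now_s: float) -> None:
        """Record that the primary answered at time now_s."""
        self.last_contact_s = now_s

    def check(self, now_s: float) -> bool:
        """Swap roles if the primary has been silent longer than the threshold."""
        if now_s - self.last_contact_s > self.threshold_s:
            self.primary, self.standby = self.standby, self.primary
            self.last_contact_s = now_s  # start timing the new primary
            return True
        return False
```

For example, after `heartbeat(10)` a `check(35)` does nothing because only 25 seconds have passed, while a `check(41)` swaps the roles. A real deployment has further conditions (for instance, how far the standby is allowed to lag) that this sketch leaves out.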
Please note that OCI/DBaaS has not implemented Fast-Start Failover yet, meaning that it cannot be configured via the console or the DBaaS APIs.
Please take a look at https://www.doag.org/formes/pubfiles/5256791/2013-DB-Larry_Carpenter-Session_Keynote__Best_Practices_for_Data_Availability_and_Disaster_Protection-Praesentation.pdf (see page 38 for more details on Fast-Start Failover).
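On the application-tier side, one common approach is to give the middle tier a single Oracle Net connect descriptor that lists the SCAN address of each region, so that a new connection attempt falls through to the other site when the first is unreachable. The Python helper below just builds such a descriptor string as a sketch. The host names, service name and retry values are hypothetical, and it assumes a database service that runs only on whichever database currently holds the primary role; please check the exact parameters against the Oracle Net documentation for your version.

```python
def build_descriptor(scan_hosts, service, port=1521, retry_count=3, retry_delay_s=3):
    """Build an Oracle Net connect descriptor with one ADDRESS_LIST per site.

    All values passed in are illustrative assumptions, not recommendations.
    """
    address_lists = "".join(
        f"(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        for host in scan_hosts
    )
    return (
        f"(DESCRIPTION=(FAILOVER=ON)(RETRY_COUNT={retry_count})(RETRY_DELAY={retry_delay_s})"
        f"{address_lists}"
        f"(CONNECT_DATA=(SERVICE_NAME={service})))"
    )


# Hypothetical SCAN names for the two regions and a hypothetical service name.
descriptor = build_descriptor(
    ["exa-scan.region-a.example.com", "exa-scan.region-b.example.com"],
    "app_rw.example.com",
)
```

This only covers new connections. Sessions that are already open when the primary is lost still need to be handled by the application or the connection pool.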
Using GoldenGate, customers can configure an Active-Active primary-standby setup in which both the primary and the standby are open in read-write mode. Please note that GoldenGate replication differs from Data Guard replication in that a GoldenGate standby is not an exact block-for-block copy of the primary. There may also be restrictions on the specific object datatypes that GoldenGate supports. For more details on configuring GoldenGate to maintain a live standby database, and on failover best practices, please refer to:
https://docs.oracle.com/en/middleware/goldengate/core/19.1/admin/configuring-oracle-goldengate-maintain-live-standby-database.html#GUID-6CE0810E-A681-4CCA-9BC8-539E8A364FD3
https://www.oracle.com/technetwork/database/availability/8399-goldengate-dataguard-1888654.pdf
Please note that there is currently no GoldenGate offering in OCI/DBaaS, meaning there is no console or DBaaS API support for configuring or setting up a GoldenGate standby.