I have to implement Change Data Capture (CDC) and deliver changes from a Postgres DB to a data lake (AWS S3). I want to implement CDC with Debezium and Kafka. This is the data flow: Postgres --> Debezium --> Kafka --> S3
About 5 GB of data per day (across roughly 90 tables) will be moved to Kafka.
- High availability is not an issue - if Kafka or the server fails, we will simply rerun.
- Scalability is not an issue - we don't have that big a load.
- Fault tolerance is not an issue either.
- Speed is also not important.
I want to run a standalone Kafka (1 broker) in Docker containers on production myself (AWS MSK is not an option because of its price) to deliver data to S3.
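To make the plan concrete, below is roughly the single-broker setup I have in mind, based on the Debezium example images (just a sketch - image tags, topic names, and the volume layout are placeholders, not a hardened production config):

```yaml
# docker-compose.yml - minimal single-broker Kafka + Kafka Connect (Debezium) sketch
services:
  zookeeper:
    image: debezium/zookeeper:2.1   # placeholder tag
    ports:
      - "2181:2181"

  kafka:
    image: debezium/kafka:2.1       # placeholder tag
    ports:
      - "9092:9092"
    environment:
      ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - kafka-data:/kafka/data      # persist the log so a container restart does not lose messages
    depends_on:
      - zookeeper

  connect:
    image: debezium/connect:2.1     # placeholder tag
    ports:
      - "8083:8083"
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: cdc-connect
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: connect_statuses
    depends_on:
      - kafka

volumes:
  kafka-data:
```

The Debezium Postgres source connector and an S3 sink connector would then be registered through the Connect REST API on port 8083.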
With that in mind, I have a few questions:
- Is my architecture OK for solving the CDC problem?
- Is it better to run Kafka in a Docker container or to install Kafka manually on a virtual server (EC2)?
- Is my solution OK for production?
- Data loss: If Kafka experiences a failure, will Debezium retain the captured changes and deliver them to Kafka once it is back online?
- Data loss: If Debezium experiences a failure, will the system resume reading changes from the point where it stopped before the failure? (not sure if this question even makes sense)
- Any solutions or recommendations for my problem?