The technical goal is to build a new flow based on streaming and DFI tools.
The documentation related to customer growth data ingestion is here
Data is exported with CDC (change data capture) tooling using the Debezium component. This component is available through the streaming team: CDC
The data source is a dedicated outbox table which stores data events about favorite sports.
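As an illustration of the outbox export described above, here is a minimal sketch of a Debezium source connector definition built in Python. It assumes the source database is PostgreSQL and that the outbox table follows Debezium's default outbox layout; the connector name, host, database, table, and topic prefix are hypothetical placeholders, and the real definition lives with the streaming team.

```python
import json


def build_outbox_connector(name, db_host, db_name, outbox_table, topic_prefix):
    """Return a Kafka Connect connector definition for a Debezium outbox source.

    Every argument is a hypothetical placeholder for illustration only.
    """
    return {
        "name": name,
        "config": {
            # Assumption: the source database is PostgreSQL.
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "database.hostname": db_host,
            "database.port": "5432",
            "database.user": "<db-user>",
            "database.password": "<db-password>",
            "database.dbname": db_name,
            "topic.prefix": topic_prefix,
            # Capture only the dedicated outbox table.
            "table.include.list": outbox_table,
            # Debezium's outbox event router turns outbox rows into events.
            "transforms": "outbox",
            "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
            # ${routedByValue} is resolved by the router, not by Python.
            "transforms.outbox.route.topic.replacement": topic_prefix + ".${routedByValue}",
        },
    }


if __name__ == "__main__":
    connector = build_outbox_connector(
        "favorite-sport-outbox", "db.example.internal", "sportdb", "public.outbox", "sport"
    )
    print(json.dumps(connector, indent=2))
```

The printed JSON has the name/config shape that the Kafka Connect REST API accepts for creating a connector. If the cluster is managed declaratively, the same keys would go into the corresponding manifest instead.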
Data streamed into the Kafka topic is formalised with Avro schemas.
The Sport Avro schemas are available in Github.
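To give an idea of what such a schema could look like, the sketch below declares a hypothetical Avro record for a favorite-sport event as a Python dictionary. The record name, namespace, and every field are assumptions made for illustration; the schemas in the repository remain the reference. The helper is a simple presence check, not a full Avro validator.

```python
import json

# Hypothetical schema: names and fields are illustrative, not the real Sport schema.
FAVORITE_SPORT_SCHEMA = {
    "type": "record",
    "name": "FavoriteSportEvent",
    "namespace": "com.example.sport",
    "fields": [
        {"name": "event_id", "type": "string"},
        {"name": "customer_id", "type": "string"},
        {"name": "sport_id", "type": "string"},
        {
            "name": "action",
            "type": {"type": "enum", "name": "Action", "symbols": ["ADDED", "REMOVED"]},
        },
        {"name": "occurred_at", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    ],
}


def missing_fields(schema, record):
    """List schema fields that have no default and are absent from the record."""
    return [
        field["name"]
        for field in schema["fields"]
        if "default" not in field and field["name"] not in record
    ]


if __name__ == "__main__":
    # Serialise the schema as it would appear in an .avsc file.
    print(json.dumps(FAVORITE_SPORT_SCHEMA, indent=2))
    print(missing_fields(FAVORITE_SPORT_SCHEMA, {"event_id": "e-1", "customer_id": "c-1"}))
```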
The sink to the S3 bucket is built with an S3 connector.
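For reference, a sink definition of this kind might look like the sketch below. It assumes the Confluent S3 sink connector with Avro output and the default partitioner; the connector name, topic, bucket, region, and flush size are hypothetical placeholders, and the connector actually used may differ.

```python
import json


def build_s3_sink(name, topic, bucket, region):
    """Return a Kafka Connect definition for an S3 sink (illustrative values only)."""
    return {
        "name": name,
        "config": {
            # Assumption: the Confluent S3 sink connector is the one in use.
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "topics": topic,
            "s3.bucket.name": bucket,
            "s3.region": region,
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            # Write objects in Avro format to match the topic's Avro records.
            "format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
            "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
            # Number of records written per S3 object (hypothetical value).
            "flush.size": "1000",
        },
    }


if __name__ == "__main__":
    sink = build_s3_sink("favorite-sport-s3-sink", "sport.favorite", "example-bucket", "eu-west-1")
    print(json.dumps(sink, indent=2))
```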
The CDC streaming is managed by a Kafka Connect cluster available in the prod-gke-eu flux repository.
This Kafka Connect cluster is shared by all Sport public streaming flows.