The overall design of the data collection architecture follows the reference model shown in the diagram below.
For each source database, data is extracted and transformed on the basis of write-ahead logging and delivered to a range of targets, including message queues, databases, big data platforms, and files. Data transmission has two modes: full and incremental. Full data is obtained by logical extraction from the database; incremental data is obtained by parsing the database logs. Applying full and then incremental data to a secondary database gives continuous data replication and so provides real-time protection of the database's data. Replicating full and incremental data to other kinds of targets through the synchronization software constitutes heterogeneous replication.
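The full-then-incremental flow can be illustrated with a minimal in-memory sketch. It is not the synchronization software itself: every name here (`SourceDatabase`, `ChangeEvent`, `SecondaryDatabase`, `MessageQueue`, `full_load`, `incremental_sync`) is hypothetical, the write-ahead log is simulated as a Python list, and the log sequence number (`lsn`) is assumed to be a simple counter.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass(frozen=True)
class ChangeEvent:
    lsn: int                      # position in the simulated write-ahead log
    op: str                       # "insert", "update", or "delete"
    key: str
    value: Optional[dict] = None  # row contents; None for deletes


class Target(Protocol):
    def apply(self, event: ChangeEvent) -> None: ...


@dataclass
class SecondaryDatabase:
    """Homogeneous target: keeps a row-for-row replica of the source."""
    rows: dict = field(default_factory=dict)

    def apply(self, event: ChangeEvent) -> None:
        if event.op == "delete":
            self.rows.pop(event.key, None)
        else:
            self.rows[event.key] = event.value


@dataclass
class MessageQueue:
    """Heterogeneous target: every change is appended as a message."""
    messages: list = field(default_factory=list)

    def apply(self, event: ChangeEvent) -> None:
        self.messages.append((event.lsn, event.op, event.key, event.value))


@dataclass
class SourceDatabase:
    """Toy source: each write updates the rows and appends to the log."""
    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, op: str, key: str, value: Optional[dict] = None) -> None:
        self.log.append(ChangeEvent(len(self.log) + 1, op, key, value))
        if op == "delete":
            self.rows.pop(key, None)
        else:
            self.rows[key] = value


def full_load(source: SourceDatabase, targets: list) -> int:
    """Logical extraction: copy the current rows, return the snapshot's log position."""
    snapshot_lsn = len(source.log)
    for key, value in source.rows.items():
        event = ChangeEvent(snapshot_lsn, "insert", key, value)
        for target in targets:
            target.apply(event)
    return snapshot_lsn


def incremental_sync(source: SourceDatabase, targets: list, after_lsn: int) -> int:
    """Log analysis: replay only the changes recorded after `after_lsn`."""
    last_lsn = after_lsn
    for event in source.log:
        if event.lsn > after_lsn:
            for target in targets:
                target.apply(event)
            last_lsn = event.lsn
    return last_lsn


def demo():
    source = SourceDatabase()
    source.write("insert", "1", {"name": "a"})
    source.write("insert", "2", {"name": "b"})

    replica, queue = SecondaryDatabase(), MessageQueue()
    targets = [replica, queue]

    position = full_load(source, targets)        # full phase
    source.write("update", "1", {"name": "a2"})  # changes after the snapshot
    source.write("delete", "2")
    position = incremental_sync(source, targets, position)  # incremental phase
    return source, replica, queue, position
```

The point of the sketch is the hand-off: `full_load` returns the log position at which the snapshot was taken, and `incremental_sync` resumes from exactly that position, so no change is skipped or applied twice. The same event stream feeds both the replica and the queue, which mirrors the distinction between continuous replication and heterogeneous replication.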
Supported source databases include traditional commercial databases such as Oracle, DB2, and SQL Server, as well as domestic databases such as Dameng, GaussDB, TDSQL, and OceanBase (OB). Supported heterogeneous targets include data warehouses, message queues, MPP databases, and NoSQL stores.