The Information2 high availability disaster recovery management software, i2Availability, captures production data in real time and replicates it to the disaster recovery server at the data layer. At the application layer, it monitors the running status of the business in real time; if an abnormality (such as an abnormal service stop, a network anomaly, a hardware failure, or a system crash) makes the business unreachable, it switches the application to the disaster recovery server, ensuring business continuity through application takeover.
The primary-backup high availability feature provides high availability services for a wide range of applications; the monitored objects can be the primary node, the secondary node, or both. Primary-backup high availability introduces an arbitration mechanism to avoid false switchovers between the primary and secondary nodes caused by network issues. Data synchronization is also integrated into high availability, so replication is managed together with the high availability rules.
When the primary node is operating normally and no abnormality occurs in the monitored objects, the primary node provides the service externally (e.g., SQL Server). Through association rules, changed data is replicated to the secondary node in real time, and the monitored objects configured in the high availability rules stay under real-time monitoring. When an abnormality does occur in a monitored object, a resource switchover script is executed: i2UP automatically shuts down the service on the primary node (e.g., SQL Server), stops the association rules, and switches from the primary node to the secondary node. The virtual IP address migrates to the secondary node, the secondary node starts the service, and the association rules are re-enabled. At this point, the secondary node becomes the primary node and continues to provide the service externally.
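The switchover order described above can be summarized in a short orchestration sketch. The helper functions, service name, and rule names below are hypothetical placeholders, not i2UP's actual interface; only the sequence of steps follows the description.

```python
def stop_service(node, name):      # placeholder: stop the database service on a node
    print(f"{node}: stop {name}")

def start_service(node, name):     # placeholder: start the database service on a node
    print(f"{node}: start {name}")

def stop_rule(rule):               # placeholder: stop an association (data sync) rule
    print(f"stop rule {rule}")

def start_rule(rule):              # placeholder: start an association rule
    print(f"start rule {rule}")

def move_vip(vip, src, dst):       # placeholder: migrate the virtual IP address
    print(f"move {vip}: {src} -> {dst}")

def failover(primary, standby, vip):
    """Order of operations when a monitored object reports an abnormality."""
    stop_service(primary, "SQL Server")      # shut down the service on the primary
    stop_rule("sqlserver-sync")              # stop shipping changes from the old primary
    move_vip(vip, src=primary, dst=standby)  # clients follow the virtual IP to the standby
    start_service(standby, "SQL Server")     # the standby now provides the service externally
    start_rule("sqlserver-sync-reverse")     # re-enable synchronization, now in reverse

failover("node-a", "node-b", "10.0.0.100")
```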
Without arbitration configured, if a network failure occurs between the primary and secondary nodes while other networks remain normal, two primary nodes may appear on the control machine interface, leading to IP conflicts (a situation known as split-brain). Arbitration settings are therefore required. Primary-backup high availability supports a multi-node arbitration mechanism, which avoids the situation where a single failed arbitration node leaves HA unable to switch.
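A majority vote across the arbitration nodes is one common way to implement such a check. The sketch below assumes that approach; the probe function and node names are hypothetical, not the vendor's actual arbitration logic.

```python
def can_reach(arbiter, primary):
    """Placeholder health probe; a real arbiter would ping or query the primary."""
    return False  # assume every arbiter sees the primary as down, for illustration

def should_take_over(primary, arbiters):
    """Switch only when a majority of arbitration nodes also cannot reach the
    primary, so a broken heartbeat link alone cannot create a second primary
    (split-brain), and a single failed arbiter cannot block a needed switchover."""
    down_votes = sum(1 for a in arbiters if not can_reach(a, primary))
    return down_votes > len(arbiters) // 2

print(should_take_over("node-a", ["arb-1", "arb-2", "arb-3"]))  # True: 3 of 3 vote "down"
```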
Application Switchover Mechanism
The high availability software needs to be deployed on both the production server and the takeover server and then configured (data synchronization paths, database service start/stop scripts). First, synchronize the business data between the two ends; then select the appropriate monitoring conditions (network, service, process, performance) for real-time monitoring of the business system and formulate the high availability switchover rules. Detailed configuration planning is provided in the application availability deployment planning. The topology diagram is as follows:

Normally, the working machine replicates data to the disaster recovery machine in real time through the high availability software, and the disaster recovery machine monitors the status of the working machine in real time over a heartbeat line.

When the working machine fails or other factors prevent it from providing services normally, the heartbeat line between the working machine and the disaster recovery machine will be disconnected. At this point, the disaster recovery machine initiates a judgment mechanism to determine whether to take over the business.

Once the judgment mechanism on the disaster recovery machine confirms the failure, the disaster recovery machine takes over the entire business service, ensuring business continuity.

After the working machine is repaired, the disaster recovery machine restores the incremental data generated during the takeover period to the working machine, enabling the business system to switch back.
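Taken together, the four steps above form a monitor/judge/take-over/fail-back cycle on the disaster recovery machine. The loop below is a simplified illustration of that cycle using assumed callback names, not the product's actual logic.

```python
import time

def standby_cycle(heartbeat_ok, confirm_outage, take_over, fail_back, interval=5):
    """heartbeat_ok/confirm_outage/take_over/fail_back are caller-supplied
    callables standing in for the real heartbeat, judgment mechanism,
    business takeover, and incremental fail-back described above."""
    serving = False
    while True:
        if not serving:
            if not heartbeat_ok() and confirm_outage():  # heartbeat lost and outage confirmed
                take_over()                              # DR machine runs the business
                serving = True
        elif heartbeat_ok():                             # working machine repaired
            fail_back()                                  # push incremental data back, switch back
            serving = False
        time.sleep(interval)
```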
The semantic-level database replication software enables real-time data replication, providing both full and incremental synchronization under high-concurrency transaction workloads. Through synchronization verification, it ensures transaction-level eventual consistency between the source and target databases. It also provides advanced features such as standby database takeover and incremental rollback, helping users complete data integration tasks such as disaster recovery backup, heterogeneous data migration, data distribution, and data warehouse construction in complex application environments.
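Synchronization verification can be pictured as comparing digests of the same table on the source and target sides. The sketch below assumes a simple (primary key, row value) representation and illustrates only the idea, not the product's verification method.

```python
import hashlib

def table_digest(rows):
    """Order-insensitive digest over (primary_key, row_value) pairs."""
    digest = 0
    for key, value in rows:
        digest ^= int.from_bytes(hashlib.sha256(f"{key}|{value}".encode()).digest()[:8], "big")
    return digest

def tables_match(source_rows, target_rows):
    """True when source and target hold the same rows, regardless of order."""
    return table_digest(source_rows) == table_digest(target_rows)

print(tables_match([(1, "alice"), (2, "bob")], [(2, "bob"), (1, "alice")]))  # True
```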
In dual-active database replication, the database applications on both sides run normally while database data is synchronized in real time from the production center to the disaster recovery center. Dual-active database replication is particularly valuable for databases in complex cluster environments, business-critical databases, and read-write-separated databases in heterogeneous environments.
Database replication demands extremely high real-time performance. Whether local or remote, changes at the source are transmitted asynchronously to the disaster recovery end in real time, with an RPO (Recovery Point Objective) measured in seconds. In the event of a disaster recovery switchover, the database switchover time is less than 60 seconds.
Database replication employs logical data disaster recovery technology, with granularity down to the user (schema) or table level. Since only committed transactions are transmitted, the amount of data transferred is small, allowing low-latency asynchronous replication even in low-bandwidth environments. This makes it an efficient and cost-effective method for database disaster recovery. Communication uses standard IP networks, so the disaster recovery database can be deployed locally or at a remote disaster recovery center without distance restrictions. Furthermore, the disaster recovery database remains open at all times, enabling front-end applications to switch rapidly and seamlessly to it in the event of planned or unplanned downtime of the production database. Compared with physical replication technologies based on disks or file systems, this approach not only eliminates the lengthy database recovery and startup time but also guarantees a 100% switchover success rate.
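The "committed transactions only, at schema or table granularity" idea can be sketched as a filter over a change log. The record layout below is an assumption made for illustration, not the product's internal log format.

```python
def committed_changes(log_records, schemas=None, tables=None):
    """Yield only the DML changes that belong to committed transactions and
    fall within the selected schemas/tables; everything else stays local."""
    committed = {r["xid"] for r in log_records if r["op"] == "COMMIT"}
    for r in log_records:
        if r["op"] not in ("INSERT", "UPDATE", "DELETE") or r["xid"] not in committed:
            continue
        if schemas and r["schema"] not in schemas:
            continue
        if tables and r["table"] not in tables:
            continue
        yield r

log = [
    {"xid": 1, "op": "INSERT", "schema": "sales", "table": "orders", "row": {"id": 7}},
    {"xid": 1, "op": "COMMIT"},
    {"xid": 2, "op": "UPDATE", "schema": "sales", "table": "orders", "row": {"id": 8}},  # never committed
]
print(list(committed_changes(log, schemas={"sales"})))  # only the committed INSERT is shipped
```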
i2CDP (Continuous Data Protection) offers enterprise-level continuous data backup and on-demand recovery services for application data, both structured and unstructured, in core application system environments. It lets you back up production data in real time to backup nodes in the disaster recovery center and quickly recover the required data on demand, ensuring business continuity of the production system.
On any virtual or physical machine, i2CDP monitors and captures all changes to protected files or directories and replicates the changed bytes over a standard IP network to any recovery site you choose, minimizing data loss. Files and directories in use can be processed and replicated without being closed; the related applications remain online and active, with no negative impact on your work. i2CDP protects your data continuously, at all times.
With bytes, rather than traditional files or blocks, as the smallest unit of data capture, the amount of data that needs to be replicated is greatly reduced, saving network bandwidth and improving the efficiency of the entire disaster recovery system. Data changes are intercepted from the production system in a bypass manner: after i2CDP intercepts a change, the changed data is buffered, compressed, encrypted, sent, and acknowledged at the application layer. Moreover, the maximum amount of system resources that i2CDP may use can be capped in advance, ensuring that it never affects the normal operation of the existing production system and that data remains secure throughout the process.
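The buffer/compress/encrypt/send/confirm path can be sketched at the application layer roughly as follows. The frame layout is invented for this sketch, and the XOR step is only a stand-in for a real cipher; none of this reflects i2CDP's actual wire format.

```python
import zlib

def ship_change(sock, path, offset, changed_bytes, key=b"demo-key"):
    """Send one intercepted byte-range change and wait for the receiver's
    acknowledgement. sock is any connected socket-like object with
    sendall()/recv()."""
    payload = zlib.compress(changed_bytes)                                 # shrink before the WAN hop
    payload = bytes(b ^ key[i % len(key)] for i, b in enumerate(payload))  # placeholder cipher, not real encryption
    header = f"{path}|{offset}|{len(payload)}".encode()
    sock.sendall(len(header).to_bytes(4, "big") + header + payload)        # frame: header length, header, payload
    return sock.recv(2) == b"OK"                                           # receiver confirms the change was applied
```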
Because it backs up at the byte level, i2CDP's optimized transmission method performs well in complex environments involving low bandwidth, long distances, and large data volumes, with efficiency far higher than traditional data transmission methods. The transmission bandwidth and usage time windows can also be set freely, so the resources of the whole system are allocated and used sensibly while bandwidth for the production system's business applications takes priority.
Its main features include:
·Supports local CDP or remote CDP
·Real-time transmission using byte-level change replication and storage technology
·Supports recovery at the volume, directory, or single-file level
·Storage-agnostic and supports heterogeneous storage
·No distance limitations and minimal impact on the host
·Application-agnostic with comprehensive support
·Ensures database consistency
·Data compression and encrypted transmission
·Graphical monitoring and management
Asynchronous Replication with Recovery Point Objective (RPO) ≈ 0
The disaster recovery management software enables real-time asynchronous replication of data between two servers; the disaster recovery end can be a local or remote disaster recovery site, and no adjustment to the existing network or operating environment is required. The entire process does not require stopping applications: deployment and data replication are performed online.
Byte-level data capture greatly reduces the amount of data transmitted, enabling data replication tasks to be performed well in limited bandwidth and harsh network environments.
On this basis, it can transmit changed data from the production end to the disaster recovery end within milliseconds, achieving asynchronous replication with an RPO ≈ 0. The impact on production performance is minimal, and the replication effect is close to that of synchronous replication.
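The "asynchronous yet near-synchronous" behavior can be illustrated with a small background-sender pattern: writes never wait on the network, and the unsent backlog (the effective RPO) stays near zero as long as the sender keeps up. The transport here is a placeholder, not the product's implementation.

```python
import queue
import threading

pending = queue.Queue()  # captured byte-level changes waiting to be shipped

def sender(transmit):
    """Forward each change to the disaster recovery end as soon as it is
    captured; 'transmit' stands in for the real network transport."""
    while True:
        change = pending.get()
        transmit(change)          # typically completes within milliseconds
        pending.task_done()

threading.Thread(target=sender, args=(print,), daemon=True).start()
pending.put(b"changed bytes")     # the application continues without waiting
pending.join()                    # here only to let the demo flush before exiting
```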
Data Recovery Process
Building on real-time, byte-level backup of the business system, i2CDP continuously captures and tracks data changes and records them as time-stamped logs stored independently of the business data. Each log entry records one data change and serves as a recovery point. Changes can be recorded with a resolution of one millionth of a second, and rollback and recovery operations are performed from these records. This provides recovery granularity fine enough to restore data to any point in the past, something system snapshot functionality cannot do. See the following diagram:
When a system operation error or a data anomaly at a specific point in time is detected, data rollback and recovery can be performed based on the CDP-recorded time points. See the following diagram:
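The rollback idea can be reduced to a toy journal: every change is appended with its timestamp, and recovery replays the journal only up to the chosen point in time. This is an illustrative data structure, not i2CDP's on-disk format.

```python
from dataclasses import dataclass, field

@dataclass
class CdpJournal:
    entries: list = field(default_factory=list)   # (timestamp, path, offset, data), appended in time order

    def record(self, ts, path, offset, data):
        """Each recorded change is an independent recovery point."""
        self.entries.append((ts, path, offset, data))

    def restore_to(self, point_in_time):
        """Rebuild file contents as of point_in_time, e.g. just before a fault."""
        files = {}
        for ts, path, offset, data in self.entries:
            if ts > point_in_time:                # ignore everything after the chosen recovery point
                break
            buf = files.setdefault(path, bytearray())
            if len(buf) < offset + len(data):
                buf.extend(b"\x00" * (offset + len(data) - len(buf)))
            buf[offset:offset + len(data)] = data
        return {p: bytes(b) for p, b in files.items()}

journal = CdpJournal()
journal.record(1.000000, "report.txt", 0, b"hello")
journal.record(2.000001, "report.txt", 0, b"HELLO")   # an unwanted change
print(journal.restore_to(1.5))                        # {'report.txt': b'hello'}
```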
Purpose of the Drill
The establishment of a disaster recovery center aims to enable business and data recovery through the center in the event of a disaster. This requires the disaster recovery center to be 100% reliable. If the data in the disaster recovery center cannot be guaranteed to be recoverable, then the disaster recovery has little value and meaning. Therefore, regular disaster recovery drills are a necessary method to test the reliability of the disaster recovery center.
The prevention of and drills for security incidents are crucial for IT systems. From a technical perspective, a disaster recovery system is measured by two main indicators: RPO and RTO (Recovery Time Objective). RPO represents the amount of data loss that can be tolerated when a disaster occurs, while RTO represents the time it takes for the system to recover.
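A small worked example of the two indicators (the numbers are illustrative only, not product specifications):

```python
last_replicated_at  = 1_700_000_000   # epoch seconds of the last change already applied at the DR end
disaster_at         = 1_700_000_004   # the production site fails
service_restored_at = 1_700_000_050   # the DR end is serving the business again

rpo = disaster_at - last_replicated_at    # changes from these 4 seconds are lost
rto = service_restored_at - disaster_at   # the business was down for 46 seconds
print(f"RPO = {rpo} s of data loss, RTO = {rto} s of downtime")
```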
Regular disaster recovery drills are essential to verify the effectiveness of the disaster recovery architecture and pre-planned recovery procedures, as well as actual execution capabilities. By identifying and improving issues discovered during the drills, the disaster recovery system can be further refined. Additionally, it enables relevant personnel from various departments to familiarize themselves with relevant strategies, processes, and methods, thereby enhancing their comprehensive execution capabilities for emergency response and disaster recovery.
Disaster recovery drills serve as a benchmark for assessing the success of disaster recovery projects and an important means for testing disaster recovery maintenance management processes and documentation. Through drills, problems can be promptly identified, and the coordination and accurate operations of personnel from various departments can be ensured.
Principle of the Drill
According to the priority order determined in the business impact analysis, the data, data processing systems, and network systems supporting critical business functions should be restored in the disaster recovery center. Detailed operational procedures should be formulated, specifying the time, location, personnel, and equipment for each step, along with instructions for coordination among teams in specific situations.
The system in the disaster recovery center replaces the production system to support the delivery of critical business functions. This stage covers the main tasks of operating and managing the production system, including all of the operational procedures and regulations for bringing it back into service.
Objectives of the Drill
·Verify the recovery process
·Validate the effectiveness of pre-planned procedures
·Verify the effectiveness of various equipment
·Validate data recovery