Tuesday, August 29, 2023

 

Apache Cassandra is an open-source NoSQL distributed database that is trusted by thousands of companies for scalability and high availability without compromising performance, and Azure Managed Instance for Apache Cassandra offers it as a managed service. Linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure make it a strong platform for mission-critical data. Although this is a self-contained distributed database environment, its data can be replicated to other environments, including Azure Cosmos DB for use with the Cassandra API.

The Database Migration Assistant has a preview feature to help with this database migration. The Azure Cosmos DB Cassandra connector performs live data migration from existing native Apache Cassandra workloads, running on-premises or in the Azure public cloud, to Azure Cosmos DB with zero application downtime. It does this with a replication agent that moves data from Apache Cassandra to Cosmos DB. The replication agent is a Java process that runs on the native Cassandra host(s) and uploads data from Cassandra via a managed pipeline. Customers need only download the agent onto the source Cassandra nodes and configure the target Azure Cosmos DB Cassandra API account information.

The replication agent runs on the native Cassandra cluster. Once it is installed, it takes a snapshot of the cluster and uploads the requisite files. After the initial snapshot, continuous ingestion proceeds in three steps. First, the agent connects to the replication metadata endpoint of the Cosmos Cassandra account and fetches the replication component information. Then it sends the commit logs to the replication component. Finally, the replication component replicates the mutations to the Cosmos DB Cassandra endpoint.

Customers can begin using the data in the Azure Cosmos DB Cassandra API account by first verifying which Cassandra features the API supports and then estimating the request units required. The estimate can be made at the granularity of each individual operation, which helps with capacity planning.
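As a minimal sketch of what using the migrated data could look like from a client, the snippet below builds connection settings for a Cosmos DB Cassandra API account and opens a session with the DataStax Python driver (cassandra-driver). The account name and key are placeholders, and the hostname pattern assumes the Azure public cloud; the actual contact point should be taken from the connection string shown for the account in the Azure portal.

```python
import ssl
from dataclasses import dataclass

# Port on which the Cosmos DB Cassandra endpoint listens (TLS is required).
COSMOS_CASSANDRA_PORT = 10350


@dataclass(frozen=True)
class CosmosCassandraSettings:
    """Placeholder settings; replace with the values of a real account."""
    account_name: str   # also used as the user name
    primary_key: str    # also used as the password

    @property
    def contact_point(self) -> str:
        # Assumed hostname pattern for the Azure public cloud.
        return f"{self.account_name}.cassandra.cosmos.azure.com"


def connect(settings: CosmosCassandraSettings):
    """Open a session; requires `pip install cassandra-driver`."""
    # Imported lazily so the settings above can be used without the driver.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    ssl_context = ssl.create_default_context()  # verifies the server certificate
    auth = PlainTextAuthProvider(
        username=settings.account_name,
        password=settings.primary_key,
    )
    cluster = Cluster(
        [settings.contact_point],
        port=COSMOS_CASSANDRA_PORT,
        auth_provider=auth,
        ssl_context=ssl_context,
    )
    return cluster.connect()
```

Calling connect(CosmosCassandraSettings("<account-name>", "<primary-key>")) would return a session on which ordinary CQL statements can be executed, which is consistent with the point that application code does not need to change.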

The benefits of this data migration from native Cassandra clusters to a Cosmos DB Cassandra API account include no downtime, no code changes, and no manual data migration. The configuration is simple and the replication is fast. It is also completely transparent to Cassandra and to the other workloads on the cluster.

The Cosmos DB Cassandra API account normalizes the cost of all database operations using Request Units. A Request Unit is a performance currency that abstracts the system resources, such as CPU, IOPS, and memory, required to perform a database operation. Because each unit has a price, Request Units also help with estimating cost in dollars.
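The per-operation estimate can be worked through with simple arithmetic. The sketch below is only illustrative: the operation names, the RU charge per call, the call rates, and the unit price are all made-up placeholders, and it assumes provisioned throughput priced per 100 RU/s per hour. Real charges should be measured on the account and real prices taken from the Azure pricing page.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Operation:
    name: str
    ru_per_call: float       # RU charge of one call (hypothetical values below)
    calls_per_second: float  # expected peak rate of this operation


def required_throughput(ops: Iterable[Operation]) -> float:
    """Total RU/s needed: sum over operations of (RU per call x calls per second)."""
    return sum(op.ru_per_call * op.calls_per_second for op in ops)


def monthly_cost(ru_per_second: float, price_per_100_ru_hour: float,
                 hours: float = 730) -> float:
    """Cost of provisioned throughput, assuming billing per 100 RU/s per hour."""
    return (ru_per_second / 100) * price_per_100_ru_hour * hours


# Hypothetical workload; none of these numbers come from a real account.
workload = [
    Operation("point read", ru_per_call=1.0, calls_per_second=500),
    Operation("insert", ru_per_call=5.0, calls_per_second=100),
    Operation("range query", ru_per_call=10.0, calls_per_second=20),
]

total_ru = required_throughput(workload)  # 500 + 500 + 200 = 1200 RU/s
PLACEHOLDER_PRICE = 0.01                  # dollars per 100 RU/s per hour (assumed)
print(f"{total_ru:.0f} RU/s, about ${monthly_cost(total_ru, PLACEHOLDER_PRICE):.2f} per month")
```

Breaking the workload down by operation makes it easy to see which calls dominate the throughput budget before any capacity is provisioned.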

Reference: This article is a continuation of articles on Azure Resources with the last one describing Cassandra Configuration: CassandraConnectivity-2.docx

 
