Tuesday, September 27, 2022

 

Recovery and Replication of Data in Multitenant Applications:

This is a continuation of the detailed article on Azure Monitor and Azure Monitor Logs. These are different services, and Logs is especially useful for troubleshooting and notifications. In this section, we discuss recovery options.

The example used here is a full disaster recovery scenario for a multitenant SaaS application implemented with the database-per-tenant model. The two concepts introduced here are geo-restore and geo-replication.

A geo-restore can be used to recover the catalog and tenant databases from automatically maintained geo-redundant backups into an alternate recovery region. After the outage is resolved, geo-replication can be used to repatriate the changed databases to their original region.

A database can be restored to an earlier point in time within its retention period. The restored database can use any service tier or compute size. If the database is restored into an elastic pool, the pool must have sufficient resources to accommodate it. No charge is incurred during the restoration itself; once the restore completes, the restored database is billed at normal rates.
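The two preconditions above (the restore point must fall inside the retention period, and an elastic pool must have room) can be checked before a restore is requested. The following is a minimal sketch, not an Azure API: the function name `can_restore`, its parameters, and the use of free storage as the only pool resource are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def can_restore(restore_point, now, retention_days, db_size_gb, pool_free_gb=None):
    """Return (ok, reason) for a proposed point-in-time restore.

    Hypothetical helper: it models only the two rules described in the text.
    pool_free_gb is None when the target is not an elastic pool.
    """
    earliest = now - timedelta(days=retention_days)
    if restore_point > now:
        return False, "restore point is in the future"
    if restore_point < earliest:
        return False, "restore point is older than the retention period"
    # Simplification: storage stands in for all pool resources.
    if pool_free_gb is not None and db_size_gb > pool_free_gb:
        return False, "elastic pool lacks space for the restored database"
    return True, "ok"


now = datetime(2022, 9, 27, 12, 0, tzinfo=timezone.utc)
print(can_restore(now - timedelta(days=3), now, retention_days=7,
                  db_size_gb=20, pool_free_gb=50))
print(can_restore(now - timedelta(days=10), now, retention_days=7,
                  db_size_gb=20))
```

A real pool also limits compute and the number of databases, so a production check would need to look at more than storage.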

A point-in-time restore does not support cross-server restoration, and it cannot restore a geo-secondary database. Hyperscale databases do not follow a fixed backup frequency and must be restored on demand. A restored database can replace the original database by being renamed to the original name. If the database is restored only to recover some of its data, a recovery script must extract that data and apply it to the original database.

A restore operation on a long-term backup can be performed from the logical server via the user interface, command line, programmability interface or scripts. It is not applicable to Hyperscale databases.

A deleted database can be restored to the deletion time or to an earlier point in time, on the same server. A geo-restore can perform a cross-server, cross-region restore from the most recent backups. It is typically done when the database or the entire region is inaccessible.

There is usually a delay between when a backup is taken and when it is geo-replicated, so the restored database can be up to one hour behind the original database. Geo-restore relies on automatically created geo-replicated backups with a recovery point objective (RPO) of up to 1 hour and an estimated recovery time objective (RTO) of up to 12 hours. It does not guarantee that the target region will have the capacity to restore the database after a regional outage, because a sharp increase in demand is likely. Therefore, it is mostly used for small databases. Business continuity for larger databases is ensured via auto-failover groups, which have a much lower RPO and RTO and guaranteed capacity.
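The trade-off described above can be written down as a simple decision rule. This is only a sketch: the constants come from the figures quoted in the text, while the function name `recovery_option` and its parameters are hypothetical and not part of any Azure SDK.

```python
GEO_RESTORE_RPO_HOURS = 1    # from the text: restored data can lag by up to 1 hour
GEO_RESTORE_RTO_HOURS = 12   # from the text: estimated recovery time of up to 12 hours


def recovery_option(max_data_loss_hours, max_downtime_hours,
                    needs_guaranteed_capacity=False):
    """Pick a recovery mechanism from the application's tolerances.

    Geo-restore is chosen only if the application can tolerate its RPO and
    RTO and does not require guaranteed capacity in the recovery region.
    """
    fits_geo_restore = (
        max_data_loss_hours >= GEO_RESTORE_RPO_HOURS
        and max_downtime_hours >= GEO_RESTORE_RTO_HOURS
        and not needs_guaranteed_capacity
    )
    return "geo-restore" if fits_geo_restore else "auto-failover group"


print(recovery_option(max_data_loss_hours=4, max_downtime_hours=24))
print(recovery_option(max_data_loss_hours=0.1, max_downtime_hours=1))
```

The rule deliberately ignores cost and database size, which would also influence the choice in practice.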

Geo-replication creates a continuously synchronized, readable secondary database for a primary database. It is preferable, although not required, to place the secondary database in a different region. Since this kind of secondary database is read-only, it is called a geo-replica. This option enables quick recovery of individual databases in case of a regional disaster or a large-scale outage. Once geo-replication is set up, a geo-failover helps maintain business continuity.
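The role swap performed by a geo-failover can be pictured with a toy model. This sketch does not call Azure: the `Database` class, the `geo_failover` function, and the database and region names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    region: str
    role: str  # "primary" (read-write) or "secondary" (read-only)


def geo_failover(primary, secondary):
    """Swap roles so the former secondary starts accepting writes.

    Returns the pair as (new_primary, new_secondary).
    """
    if primary.role != "primary" or secondary.role != "secondary":
        raise ValueError("expected a primary and a secondary database")
    primary.role, secondary.role = "secondary", "primary"
    return secondary, primary


# Hypothetical tenant database replicated across two regions.
origin = Database("tenant1", "region-a", "primary")
replica = Database("tenant1", "region-b", "secondary")
new_primary, new_secondary = geo_failover(origin, replica)
print(new_primary.region, new_primary.role)
```

After the swap, the application would direct writes to the database in the recovery region, and the original database can continue as the geo-replica once its region is available again.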

Geo-replication can also be used for database migration with minimal downtime, and for application upgrades by creating an extra secondary as a fail-back copy for the duration of the upgrade. An end-to-end recovery requires recovery of all components and dependent services. All components must be resilient to the same failures and become available within the recovery time objective of the application. Designing cloud solutions for disaster recovery includes scenarios such as using two Azure regions for business continuity with minimal downtime, using regions for maximum data preservation, or replicating an application to different geographies to follow demand.

