Monday, August 12, 2024

 Understanding Workloads for business continuity and disaster recovery (aka BCDR).

The Azure public cloud provides native capabilities in the cloud for the purposes of business continuity and disaster recovery, some of which are built into the features of the resource types used for the workload. Aside from features within the resource type to reduce RTO/RPO (for a discussion on terms used throughout the BCDR literature) please use the references), there are dedicated resources such as Azure Backup, Azure Site Recovery and various data migration services such as Azure Data Factory and Azure Database Migration Services that provided a wizard for configuring the BCDR policies which are usually specified in a file-and-forget way.  Finally, there are customizations possible outside of those available from the features of the resource types and BCDR resources which can be maintained by Azure DevOps. 

Organizations may find that they can be more efficient and cost-effective by taking a coarser approach at a deployment stamp level higher than the native cloud resource level and one that is tailored to their workload. This article explores some of those scenarios and the BCDR solutions that best serve them.

Scenario 1: Microservices framework: This form of deployment is preferred when the workload wants to update various services hosted as api/ui independently from others for their lifetime. Usually, there are many web applications, and a resource is dedicated to each of them in the form of an app service or  a container framework. The code is either deployed via a pipeline directly as source code or published to an image that the resource pulls. One of the most important aspects peculiar to this workload is the dependencies between various applications. When a disaster strikes the entire deployment, they won’t all work together even when restored individually in a different region without reestablishing these links. Take for example, the private endpoints that provide connectivity privately between caller-callee pairs of these services. Sometimes the callee is external to the network and even subscription and usually endpoint establishing the connectivity is manually registered. There is no single button or pipeline that can recreate the deployment stamp and certainly none that can replace the manual approval required to commission the private link. Since individual app services maintain their distinctive dependencies and fulfilment of functionality but cannot work without the whole set of app services, it is important to make them exportable and importable via Infrastructure-as-code aka IaC that takes into account parameters such as subscription, resource groups, virtual network, prefixes and suffixes in naming convention and recreates a stamp.

The second characteristic of this workload is that typically it will involve a diverse set of dependencies and stacks to host the various web applications  that it does. There won’t be any consistency, so the dependencies could range from a mysql database server to producing and consuming jobs on a databricks analytical workspace or an airflow automation. Consequently, the dependencies must be part of the BCDR story. Since this usually involves data and scripts, they should be migrated to the new instance. Migration and renaming are two pervasive activities for the BCDR of this workload type. Scripts that are registered in a source code repository like GitHub must be pulled and spun into an on-demand triggered or scheduled workflow.

Lastly, data used by these resources are usually proprietary and territorial in terms of ownership. This implies that the backup and restore of data might have to exist independently and as per the consensus with the owner and the users. A MySQL data can be transferred to and from another instance via the Azure Database Migration Service so as to avoid the use of mysqldump command line with credentials or via GitOps via the az command to the database server instance with implicit login An approach that suits the owner and users can be implemented outside the IaC.

Reference:

No comments:

Post a Comment