Cluster computing

Sunday, August 18, 2024

This is a continuation of the BCDR articles on strategies by workloads:

The Azure public cloud provides native capabilities in the cloud for the purposes of business continuity and disaster recovery, some of which are built into the features of the resource types used for the workload. Aside from features within the resource type to reduce RTO/RPO (for a discussion on terms used throughout the BCDR literature) please use the references), there are dedicated resources such as Azure Backup, Azure Site Recovery and various data migration services such as Azure Data Factory and Azure Database Migration Services that provided a wizard for configuring the BCDR policies which are usually specified in a file-and-forget way. Finally, there are customizations possible outside of those available from the features of the resource types and BCDR resources which can be maintained by Azure DevOps.

Organizations may find that they can be more efficient and cost-effective by taking a coarser approach at a deployment stamp level higher than the native cloud resource level and one that is tailored to their workload. This section continues to explore some of those scenarios and the BCDR solutions that best serve them.

Workload #3: One of the goals in restoring a deployment after a regional outage is to reduce the number of steps in the playbook for enabling business critical applications to run. Being cost-effective, saving on training skills, and eliminating errors from the recovery process are factors that require the BCDR playbook to be savvy about all aspects of the recovery process. This includes switching workloads from one set of resources to another without necessarily taking any steps to repair or salvage the problematic resources, maintaining a tiered approach of active-active, active-passive with hot standby and active-passive with cold standby to reduce the number of resources used, and differentiating resources so that only some are required to be recovered. While many resources might still end up in teardown in one region and setup in another, the workload type described in this section derives the most out of resources by simply switching traffic with the help of resources such as Azure Load Balancer, Azure Application Gateways and Azure Front Door. Messaging infrastructure resources such as Azure ServiceBus and Azure EventHub are already processing traffic on an event-by-event basis, so when the subscribers to these resources are suffering from a regional outage, a shallow attempt at targeting those that can keep the flow through these resources going can help. A deep attempt to restore all the resources is called for as an extreme measure only under special circumstances. This way, there is optimum use of time and effort in the recovery.

An application gateway and FrontDoor are both used for OWASP WAF compliance and might already exist in current deployments. With slight differences between the two, both can be leveraged to switch traffic to an alternate deployment but only one of them is preferred to switch to a different region. FrontDoor has the capability to register a unique domain per backend pool member so that the application received all traffic addressed to the domain at the root “/” path as if it was sent to it directly. It also comes with the ability to switch regions such as between centralus and east us 2. Application gateway, on the other hand, is pretty much regional with one instance per region. Both can be confined to a region by directing all traffic between their frontend and backends to go through the same virtual network. Networking infrastructure is probably the biggest investment that needs to be made up front for BCDR planning because each virtual network is specific to a region. Having the network up and running allows resources to be created on-demand so that the entire deployment for another region can be created only on-demand. As such an Azure Application Gateway or Front Door must be considered a part of the workload along with the other app services and planned for migration

Cluster computing

Sunday, August 18, 2024

No comments:

Post a Comment