This is a continuation of the series of BCDR articles on strategies by workload.
The Azure public cloud provides native capabilities for business
continuity and disaster recovery (BCDR), some of which are built into the
features of the resource types used for the workload. Aside from features
within the resource type that reduce RTO/RPO (for a discussion of terms used
throughout the BCDR literature, please see the references), there are
dedicated resources such as Azure Backup and Azure Site Recovery, and various
data migration services such as Azure Data Factory and Azure Database
Migration Service, that provide a wizard for configuring BCDR policies, which
are usually specified in a set-and-forget way. Finally, there are
customizations possible outside of those available from the features of the
resource types and BCDR resources, and these can be maintained with Azure DevOps.
Organizations may find that they can be more efficient and
cost-effective by taking a coarser approach, at a deployment stamp level that
sits above the native cloud resource level and is tailored to their
workload. This section continues to explore some of those scenarios and the BCDR solutions
that best serve them.
Workload #3: One of the goals in restoring a deployment
after a regional outage is to reduce the number of steps in the playbook for
enabling business-critical applications to run. Being cost-effective, saving on
training, and eliminating errors from the recovery process are factors
that require the BCDR playbook to account for all aspects of the recovery
process. This includes switching workloads from one set of resources to another
without necessarily taking any steps to repair or salvage the problematic
resources; maintaining a tiered approach of active-active, active-passive with
hot standby, and active-passive with cold standby to reduce the number of
resources used; and differentiating resources so that only some need to
be recovered. While many resources might still end up being torn down in one region
and set up in another, the workload type described in this section gets the
most out of its resources by simply switching traffic with the help of resources
such as Azure Load Balancer, Azure Application Gateway and Azure Front Door.
Messaging infrastructure resources such as Azure Service Bus and Azure Event Hubs
already process traffic on an event-by-event basis, so when the
subscribers to these resources suffer from a regional outage, a shallow
attempt that targets only those subscribers that can keep the flow through
these resources going can help. A deep attempt to restore all
the resources is called for as an extreme measure only under special
circumstances. This way, time and effort are put to optimum use in the
recovery.
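
The traffic-switching idea for this workload can be illustrated with a small
Python sketch. It is only a model of the decision a traffic-routing resource
makes, not a call into any Azure SDK: the names Tier, Backend, select_backend
and mark_region_down, as well as the backend names and regions, are
hypothetical and chosen purely for illustration. The sketch assumes each
backend carries one of the three tiers mentioned above and a health flag that
a health probe would normally set.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, List, Optional


class Tier(IntEnum):
    """Standby tiers from the text; a lower value is preferred for traffic."""
    ACTIVE = 0
    HOT_STANDBY = 1
    COLD_STANDBY = 2


@dataclass
class Backend:
    """Hypothetical stand-in for one regional deployment of the workload."""
    name: str
    region: str
    tier: Tier
    healthy: bool = True


def select_backend(backends: Iterable[Backend]) -> Optional[Backend]:
    """Return the healthy backend with the most preferred tier, or None."""
    candidates = [b for b in backends if b.healthy]
    return min(candidates, key=lambda b: b.tier, default=None)


def mark_region_down(backends: List[Backend], region: str) -> None:
    """Simulate a regional outage by failing every backend in that region."""
    for b in backends:
        if b.region == region:
            b.healthy = False


# Assumed layout: one active region, one hot standby, one cold standby.
backends = [
    Backend("app-east", "eastus", Tier.ACTIVE),
    Backend("app-west", "westus", Tier.HOT_STANDBY),
    Backend("app-central", "centralus", Tier.COLD_STANDBY),
]

before = select_backend(backends)
mark_region_down(backends, "eastus")   # nothing in eastus is repaired
after = select_backend(backends)
print(before.name, "->", after.name)   # app-east -> app-west
```

The failed backend is never repaired in this sketch; traffic simply moves to
the next healthy tier, which is the shallow recovery step the playbook aims
for before any deep restore is considered.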