Cluster computing

Monday, April 6, 2026

Continued from previous post

Across all four stamps, the architect’s job is to standardize the continuity process without standardizing away workload nuance. I define business impact tiers, map each stamp to specific RTO and RPO values, document the exact dependency graph, and automate the rebuild of identities, networking, secrets, and policy in the secondary region so the recovery is repeatable under pressure. Then I test failover routinely, including a true cutover rehearsal, because Microsoft’s guidance repeatedly emphasizes that backup alone is not enough and that recovery plans must be validated in practice.6
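To make the tier-to-objective mapping concrete, here is a minimal Python sketch of how impact tiers, per-stamp RTO/RPO targets, and a drill result could be compared. The tier names, the stamp name, the numeric targets, and the `drill_violations` helper are all hypothetical, chosen only for illustration; they are not taken from Microsoft guidance or any Azure API.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTarget:
    """Recovery objectives for one business impact tier."""
    rto: timedelta  # maximum tolerated time to restore service
    rpo: timedelta  # maximum tolerated window of lost data


# Hypothetical tiers and values, for illustration only.
TIERS = {
    "tier-1": RecoveryTarget(rto=timedelta(minutes=15), rpo=timedelta(minutes=5)),
    "tier-2": RecoveryTarget(rto=timedelta(minutes=60), rpo=timedelta(minutes=30)),
}

# Hypothetical stamp-to-tier assignment.
STAMP_TIER = {"web-api-stamp": "tier-1"}


def drill_violations(stamp: str, measured_rto: timedelta,
                     measured_rpo: timedelta) -> list[str]:
    """Return the objectives that a failover drill missed for a stamp."""
    target = TIERS[STAMP_TIER[stamp]]
    violations = []
    if measured_rto > target.rto:
        violations.append(f"RTO {measured_rto} exceeds target {target.rto}")
    if measured_rpo > target.rpo:
        violations.append(f"RPO {measured_rpo} exceeds target {target.rpo}")
    return violations
```

Feeding each rehearsal's measured recovery time and data-loss window into a check like this turns "validated in practice" into a pass/fail record per stamp, rather than a judgment made after the fact.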

The implementation for each deployment stamp follows:

1. Web apps + storage static site + API behind Application Gateway WAF

a. Recommended posture: active-passive in a paired secondary region, with the entire stamp reproducible from IaC and traffic shifted only after health checks pass. Azure’s DR guidance recommends cross-region data replication, automated provisioning, and preconfigured runbooks, while the deployment-stamp pattern emphasizes that identical stamps should be redeployable rather than manually repaired.

b. A good service mapping is: Azure App Service for UI/API, Application Gateway with WAF policy as the regional entry control, storage account with geo-redundant replication for static content and any blob assets, Key Vault for certificates/secrets, and Front Door or Traffic Manager if the client needs global traffic steering across regions. I keep the app stateless where possible, externalize session state, and make sure the secondary region has the same custom domains, certificates, private endpoints, managed identities, and network rules before I ever declare it ready.

c. Recommended RTO/RPO bands: RTO 15–60 minutes and RPO 5–30 minutes for most business web/API workloads; if the application is revenue-critical, I target the low end of that band and pre-provision more of the secondary stack. If the UI is mostly static and the APIs are modestly stateful, I usually push toward the lower RPO by using geo-redundant storage and keeping the app tier fully codified.

d. My failover checklist: I confirm that the secondary App Service, Application Gateway/WAF, storage, Key Vault, and DNS are deployed; validate replication and certificate availability; stop writes in the primary if needed; verify health probes, custom domain bindings, and backend pool health in the secondary; switch traffic; test login, static content, API calls, and WAF policy behavior; then monitor logs and error rates before resuming normal operations.
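The ordering in the checklist above, where traffic shifts only after the secondary passes its checks, can be sketched as a small gated runbook. This is a simplified illustration under stated assumptions: `run_failover` and the step names are hypothetical, and each check is a stub callable standing in for a real probe (deployment status, replication state, backend pool health), not a call into any Azure SDK.

```python
from typing import Callable, List, Tuple

# A named check: returns True when the condition is satisfied.
Step = Tuple[str, Callable[[], bool]]


def run_failover(pre_checks: List[Step],
                 switch_traffic: Callable[[], None],
                 post_checks: List[Step]) -> Tuple[bool, List[str]]:
    """Run pre-checks, switch traffic only if all pass, then run post-checks.

    Returns (switched, names_of_failed_steps).
    """
    # Evaluate every pre-check so the operator sees the full list of gaps.
    failed = [name for name, check in pre_checks if not check()]
    if failed:
        return False, failed  # gate: never shift traffic to an unready stamp

    switch_traffic()

    # Post-switch validation: login, static content, API calls, WAF behavior.
    failed = [name for name, check in post_checks if not check()]
    return True, failed


if __name__ == "__main__":
    # Hypothetical stub checks; real ones would query the secondary region.
    pre = [("secondary deployed", lambda: True),
           ("certificates available", lambda: True),
           ("backend pool healthy", lambda: True)]
    post = [("login works", lambda: True),
            ("static content served", lambda: True)]
    switched, failed = run_failover(pre, lambda: print("traffic switched"), post)
    print(switched, failed)
```

Keeping the traffic switch behind a single gate makes the rehearsal and the real cutover follow the same path, so a failed pre-check in a drill reports the same gap it would report during an incident.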
