Previous articles in this series discussed resolutions for shortcomings in the
use of Infrastructure-as-Code (IaC) in various scenarios. This section discusses
the resolution for the case where there are cascading resource locks.
Resources can be locked to prevent unexpected changes. A subscription, resource
group, or resource can be locked to prevent other users from accidentally
deleting or modifying critical resources. The lock overrides any permissions the
users may have. The lock level can be set to CannotDelete or ReadOnly, with
ReadOnly being the more restrictive. Locks are inherited: when a lock is applied
at a parent scope, all resources within that scope inherit the same lock. Some
considerations still apply after locking. For example, a CannotDelete lock on a
storage account does not prevent data within that account from being deleted. A
ReadOnly lock on an application gateway prevents you from getting the backend
health of the application gateway because that operation uses POST. Of the
built-in roles, only Owner and User Access Administrator are granted access to
Microsoft.Authorization/locks/* actions.
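The inheritance rule can be illustrated with a small model. This is only a
sketch and not an Azure API: the function name effective_lock and the
dictionary of scopes are hypothetical, and it assumes that a resource ID always
begins with the IDs of its parent scopes.

```python
from typing import Dict, Optional

# Lock levels ordered from least to most restrictive.
RESTRICTIVENESS = {"CannotDelete": 1, "ReadOnly": 2}


def effective_lock(resource_id: str, locks: Dict[str, str]) -> Optional[str]:
    """Return the most restrictive lock level that applies to resource_id.

    `locks` maps a scope (subscription, resource group, or resource ID)
    to the lock level set directly on that scope.
    """
    target = resource_id.lower().rstrip("/")
    applicable = []
    for scope, level in locks.items():
        s = scope.lower().rstrip("/")
        # A lock applies when it sits on the resource itself or on a parent scope.
        if target == s or target.startswith(s + "/"):
            applicable.append(level)
    if not applicable:
        return None
    return max(applicable, key=RESTRICTIVENESS.__getitem__)
```

With a CannotDelete lock on the subscription and a ReadOnly lock on one resource
group, a private endpoint inside that resource group resolves to ReadOnly, while
a resource in any other group resolves to CannotDelete.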
When the IaC is applied, it can be quite frustrating to find that resources are
locked in the public cloud and that the locks prevent the IaC actions from
completing. For example, a resource might have a private endpoint, which in turn
might be associated with a DNS zone and have a private network interface (NIC).
If these sub-resources are locked, the private endpoint cannot be deleted, which
in turn fails the IaC application. The resolution for the owner of the
subscription is to delete the lock from the said resource via the Azure Portal
or the command-line interface, then apply the IaC again, and to iterate over the
‘apply’ and the ‘unlock’ steps until there are no further obstructions.
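The iteration over ‘apply’ and ‘unlock’ can be sketched as a bounded loop. The
names apply_until_unblocked, apply and unlock are hypothetical: the sketch
assumes the caller supplies one function that runs the IaC and returns the IDs
of the locks that blocked it, and another that removes a lock by ID. How those
IDs are obtained from a failed run depends on the IaC tool and is not shown.

```python
from typing import Callable, List


def apply_until_unblocked(
    apply: Callable[[], List[str]],
    unlock: Callable[[str], None],
    max_rounds: int = 5,
) -> int:
    """Alternate 'apply' and 'unlock' until apply reports no blocking locks.

    `apply` runs the IaC and returns the IDs of locks that blocked it
    (an empty list on success). `unlock` removes one lock by ID.
    Returns the number of apply attempts that were needed.
    """
    for attempt in range(1, max_rounds + 1):
        blocking = apply()
        if not blocking:
            return attempt
        # Cascading locks surface one layer at a time, so remove what was
        # reported and try again.
        for lock_id in blocking:
            unlock(lock_id)
    raise RuntimeError(f"still blocked by locks after {max_rounds} attempts")
```

The upper bound on rounds is a design choice: if locks are re-created
automatically, an unbounded loop would never finish.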
While this works for a role with elevated privileges, many developers who use
the CI/CD pipeline's credentials to make changes to the subscription do not have
that privilege and might find the problem harrowing to resolve without external
intervention. One way to overcome this is to run the unlock commands in a
pipeline step prior to the application of the IaC. Fortunately, there are ways
to unlock at the subscription scope rather than resource by resource. Even so,
it might not be clear when the locks reappear, and the unlocking might need to
be repeated. It is good practice to check the policies to make sure that locking
is not enforced automatically, since that would interfere with infrastructure
changes made by code; the policies can also reveal the intent behind the
locking. If the locking is simply meant to prevent accidental deletions across a
broad range of resources, then unlocking before applying the changes is
straightforward.
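A pipeline step that unlocks ahead of the IaC run might look like the following
sketch. It assumes the Azure CLI is installed and signed in with an identity
that is allowed to manage locks, and that `az lock list` returns the locks in
the current subscription as JSON objects with an `id` field. The function names
run_az and remove_subscription_locks are hypothetical.

```python
import json
import subprocess
from typing import Callable, List


def run_az(args: List[str]) -> str:
    """Run an Azure CLI command and return its standard output."""
    result = subprocess.run(
        ["az", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def remove_subscription_locks(
    run: Callable[[List[str]], str] = run_az,
) -> List[str]:
    """Delete every lock that `az lock list` reports; return the deleted IDs."""
    locks = json.loads(run(["lock", "list", "--output", "json"]) or "[]")
    deleted = []
    for lock in locks:
        # Assumption: each entry carries the full lock resource ID in "id".
        run(["lock", "delete", "--ids", lock["id"]])
        deleted.append(lock["id"])
    return deleted
```

The command runner is passed in as a parameter so that the step can be
exercised without touching a real subscription. The returned IDs can be written
to the pipeline log as a record of what was removed.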
Consider a specific association, say between a firewall and a network resource
such as a gateway. The firewall must be associated with the gateway to block
unwanted traffic through that appliance. While they remain associated, each
remembers the identifier and the state of the other. Initially, the firewall may
remain in detection mode, where it is merely passive. It becomes active in
prevention mode. When an attempt is made to toggle the modes, the association
prevents it. Neither end of the association can tell what state to be in without
exchanging information, and when they are deployed or updated in place, neither
knows about nor informs the other.
The above resolutions are easy when the error messages are descriptive and
indicate that the failure of the IaC is exclusively due to locks. There are
other forms of errors where the cause may not be straightforward. In such cases,
the activity log on the resources or at the subscription level can be quite
helpful, because the JSON content of a logged event explains exactly what
happened. The activity log also helps to determine whether a change was made by
something other than the deployment of the infrastructure changes.
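As an illustration, the failed events can be pulled out of an exported activity
log with a few lines of Python. The field names used here (status, caller,
operationName, eventTimestamp, resourceId, properties.statusMessage) are
assumptions modeled on the JSON that the activity log typically exposes; the
actual export should be checked before relying on them. The function name
failed_events is hypothetical.

```python
import json
from typing import Dict, List


def failed_events(activity_log_json: str) -> List[Dict[str, str]]:
    """Summarize the failed events in a JSON array of activity log entries."""
    summaries = []
    for event in json.loads(activity_log_json):
        status = (event.get("status") or {}).get("value", "")
        if status != "Failed":
            continue
        props = event.get("properties") or {}
        summaries.append({
            "time": event.get("eventTimestamp", ""),
            # The caller shows whether the pipeline identity or someone
            # else performed the operation.
            "caller": event.get("caller", ""),
            "operation": (event.get("operationName") or {}).get("value", ""),
            "resource": event.get("resourceId", ""),
            "message": props.get("statusMessage", ""),
        })
    return summaries
```

Comparing the caller of each event against the identity used by the pipeline is
one way to tell deployment activity apart from changes made by other actors.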