This article continues a series on Infrastructure-as-Code (IaC) and covers
the locking of resources. Locks, policies and IaC all offer protection to
resources, and they can step on each other with conflicts that require
resolution. Each of them has a part to play and none can be done away with.
The hope is that each pipeline run is smooth and leaves a clean state
behind.
This might be wishful thinking when resources that need to
be created, modified or deleted have sub-resources or are associated with other
resources. With such dependencies, an operation might fail with an error
stating that one resource or another has a lock on it. Locking is essential to
prevent accidental modification of resources. The assumption is that an
authorized operation can unlock the resources before the change and lock them
again afterwards. Take the example of a private endpoint that gives an Azure
public cloud resource a private IP address for incoming traffic: the associated
resources, including the parent resource, the private links and the DNS zones,
might all be locked. The operation succeeds only when all of these locks are
released. This makes it hard to know upfront which locks to release and
restore.
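The unlock, change and lock-again sequence described above can be sketched as a
small Python context manager. This is only an illustration under stated
assumptions: `unlocked`, `remove_lock` and `restore_lock` are hypothetical names,
and the in-memory set stands in for whatever actually holds the locks.

```python
from contextlib import contextmanager
from typing import Callable, Iterable, Iterator, List


@contextmanager
def unlocked(
    lock_ids: Iterable[str],
    remove_lock: Callable[[str], None],   # hypothetical: releases one lock
    restore_lock: Callable[[str], None],  # hypothetical: puts one lock back
) -> Iterator[List[str]]:
    """Release every lock, yield for the change, then restore what was released."""
    released: List[str] = []
    try:
        for lock_id in lock_ids:
            remove_lock(lock_id)
            released.append(lock_id)
        yield released
    finally:
        # Restore in reverse order, even if the change or a later unlock failed.
        for lock_id in reversed(released):
            restore_lock(lock_id)


# Illustration with an in-memory stand-in for the real lock store.
active = {"parent-resource-lock", "private-link-lock", "dns-zone-lock"}
with unlocked(sorted(active), active.discard, active.add):
    locks_during_change = set(active)  # empty: every lock is released here
```

The `finally` clause is the point of the design: a failed change should not
leave the private endpoint, its parent or its DNS zone without their locks.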
One of the approaches with pipeline automation is the
cascaded unlocking of all resources in a resource hierarchy, such as a resource
group or a subscription. The identity with which the Azure operations are
performed must be privileged: only the Owner and User Access Administrator
built-in roles can create and delete management locks. The corresponding
permissions are the Microsoft.Authorization/* or
Microsoft.Authorization/locks/* actions, so a custom role that has these
actions could also be sufficient. It might be time consuming to go through all
the resources and sub-resources in a resource hierarchy to unlock them before
the operations begin and to lock them again at the end, and the script often
needs some wait time to be specified. But this leaves the resources in a clean
state for the changes to be propagated from the IaC to the management portal
for these resources. It is also possible to run these steps conditionally, for
changes that carry certain labels or distinguishing features such as a filter
on operations.
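As a rough sketch of the cascaded approach, the Python below turns a list of
locks into the commands that remove them and the commands that put them back.
It assumes the list has the JSON shape printed by
`az lock list --resource-group <name> --output json` (the fields `id`, `level`
and `notes`). The relock step uses `az rest` against the management locks REST
endpoint as one scope-agnostic option; the API version, the all-zero
subscription ID and the resource names are illustrative assumptions.

```python
import json
from typing import Any, Dict, List, Tuple

# Assumption: any supported management-lock API version would do here.
LOCK_API_VERSION = "2016-09-01"


def plan_unlock_relock(
    locks: List[Dict[str, Any]],
) -> Tuple[List[List[str]], List[List[str]]]:
    """Build (unlock, relock) command lists from exported lock records."""
    unlock = [["az", "lock", "delete", "--ids", lock["id"]] for lock in locks]
    relock = []
    for lock in locks:
        body: Dict[str, Any] = {"properties": {"level": lock["level"]}}
        if lock.get("notes"):
            body["properties"]["notes"] = lock["notes"]
        # The lock ID already contains its scope, so a PUT on the same ID
        # recreates the lock where it was.
        relock.append([
            "az", "rest", "--method", "put",
            "--url", f"{lock['id']}?api-version={LOCK_API_VERSION}",
            "--body", json.dumps(body),
        ])
    return unlock, relock


# Hypothetical sample: one group-level lock and one lock on a private DNS zone.
SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"
sample_locks = [
    {
        "id": f"{SUB}/resourceGroups/rg-demo/providers/Microsoft.Authorization/locks/rg-lock",
        "level": "CanNotDelete",
        "notes": None,
    },
    {
        "id": f"{SUB}/resourceGroups/rg-demo/providers/Microsoft.Network/privateDnsZones/"
              "privatelink.blob.core.windows.net/providers/Microsoft.Authorization/locks/dns-lock",
        "level": "ReadOnly",
        "notes": "managed by pipeline",
    },
]
unlock_cmds, relock_cmds = plan_unlock_relock(sample_locks)
```

Capturing the lock list before anything is deleted is what allows the pipeline
to restore exactly the locks it removed, with the wait time inserted between
the two phases.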
A policy might act like a catch-all that applies locking where
locks are missed out from resources, but a policy on the Azure public cloud has
a compliance interval of 24 hours. It is also a default-allow and explicit-deny
system. If a resource violates a policy, it is marked as non-compliant. The
effect that a policy takes is either detection, which flags the violation, or
prevention, which blocks it. The IaC code is the ultimate source of truth for
the resources, and there are ways to specify locks in the IaC for resources
that must behave independently from the collective approach taken by the
policy. Anytime a policy changes the locks and the IaC is unaware, there is a
conflict. It is preferable to keep locking as simple as possible, without any
customizations for any subset of resources, so that the pipeline automation is
sufficient to coordinate the locking and unlocking.
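One way to surface such a conflict is to compare the locks declared in the IaC
with the locks actually observed on the resources. The following Python is a
minimal sketch that assumes at most one lock per scope and represents each side
as a mapping from scope to lock level. The scope names and the `compare_locks`
helper are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class LockDrift(NamedTuple):
    missing: Dict[str, str]              # declared in IaC, absent on the resource
    unexpected: Dict[str, str]           # present on the resource, not declared
    changed: Dict[str, Tuple[str, str]]  # scope -> (declared level, observed level)


def compare_locks(declared: Dict[str, str], observed: Dict[str, str]) -> LockDrift:
    """Report where the observed locks differ from what the IaC declares."""
    missing = {s: lvl for s, lvl in declared.items() if s not in observed}
    unexpected = {s: lvl for s, lvl in observed.items() if s not in declared}
    changed = {
        s: (declared[s], observed[s])
        for s in declared.keys() & observed.keys()
        if declared[s] != observed[s]
    }
    return LockDrift(missing, unexpected, changed)


# Hypothetical scopes: the policy added a lock on the DNS zone and the
# storage lock no longer matches the level written in the IaC.
declared = {"rg-demo": "CanNotDelete", "rg-demo/storage": "ReadOnly"}
observed = {
    "rg-demo": "CanNotDelete",
    "rg-demo/storage": "CanNotDelete",
    "rg-demo/dns-zone": "CanNotDelete",
}
drift = compare_locks(declared, observed)
```

An empty `LockDrift` would mean that the policy and the IaC agree. Anything
else is a candidate for the conflict described above.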
Finally, it is much easier to do locking and unlocking with the command-line interface than to execute it elsewhere. Both pipeline scripts and public cloud automations can execute these commands, and although Runbooks might not be able to execute them in PowerShell, the az CLI can certainly be run via functions or other such resources. Invoking a script for locking or unlocking does not require a resource or its state to change.
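For a script that drives the az CLI, a thin wrapper with a dry-run default
makes it possible to review the lock commands before anything runs. This is a
sketch rather than a definitive implementation: `run_az` is a hypothetical
helper, the lock and resource group names are made up, and it assumes that `az`
is on the PATH and already logged in when `dry_run` is turned off.

```python
import shlex
import subprocess
from typing import Sequence


def run_az(args: Sequence[str], dry_run: bool = True) -> str:
    """Return the command line in dry-run mode, otherwise run it and return stdout."""
    command = ["az", *args]
    if dry_run:
        return shlex.join(command)
    completed = subprocess.run(command, check=True, capture_output=True, text=True)
    return completed.stdout


# Hypothetical lock and resource group names.
preview = run_az([
    "lock", "create",
    "--name", "no-delete",
    "--lock-type", "CanNotDelete",
    "--resource-group", "rg-demo",
])
print(preview)
# prints: az lock create --name no-delete --lock-type CanNotDelete --resource-group rg-demo
```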