A complex deployment is one that has multiple layers, resource groups, and resource types. Creating a complex deployment using IaC is fraught with errors at both the plan and execution stages. The IaC compiler can detect only those errors that can be statically determined from the IaC. Runtime execution errors are more common because policy violations are not known until the actual deployment, and given the diverse set of resources that must be deployed, the errors are not always well known. Name size limitations, invalid security principals, locked resources, mutual incompatibility of resource pairs, and conflicting settings between resources are just a few of them.
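
Some of these runtime errors, name size limitations in particular, can be caught before anything is deployed. The following is a minimal sketch in Python of such a pre-deployment check. The rule table, the resource type keys, and the `check_name` helper are all hypothetical and introduced only for illustration; the limits shown should be verified against the cloud provider's current naming documentation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRule:
    min_len: int
    max_len: int
    pattern: str  # regular expression that the whole name must match


# Hypothetical rule table (an assumption for illustration); confirm the
# actual limits for each resource type with the cloud provider.
RULES = {
    "storage_account": NameRule(3, 24, r"[a-z0-9]+"),
    "key_vault": NameRule(3, 24, r"[A-Za-z][A-Za-z0-9-]*"),
}


def check_name(resource_type: str, name: str) -> list:
    """Return a list of problems found; an empty list means the name passes."""
    rule = RULES.get(resource_type)
    if rule is None:
        return [f"no rule registered for resource type '{resource_type}'"]
    problems = []
    if not rule.min_len <= len(name) <= rule.max_len:
        problems.append(
            f"'{name}' has length {len(name)}, "
            f"expected {rule.min_len} to {rule.max_len}"
        )
    if re.fullmatch(rule.pattern, name) is None:
        problems.append(f"'{name}' does not match the pattern {rule.pattern}")
    return problems


if __name__ == "__main__":
    for resource_type, name in [
        ("storage_account", "mystorage01"),
        ("storage_account", "My-Storage-Account-Name-Too-Long"),
    ]:
        print(resource_type, name, check_name(resource_type, name))
```

A check like this covers only what can be known statically; policy violations, locked resources, and invalid principals still surface at deployment time.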
A realization dawns, as the size and scale of infrastructure grow, that the veritable tenets of IaC such as reproducibility, self-documentation, visibility, freedom from errors, lower TCO, drift prevention, joy of automation, and self-service somewhat diminish when the time and effort needed to overcome its brittleness increase exponentially. Packages go out of date, features become deprecated and stop working, backward compatibility is hard to maintain, and all existing resource definitions have a shelf life. Similarly, assumptions are challenged when the cloud provider and the IaC provider describe attributes differently. The information contained in IaC can be hard to summarize in an encompassing review unless we go block by block. It's also easy to shoot oneself in the foot by means of a typo or a command that creates and destroys instead of changing, especially when the state of the infrastructure recorded by the IaC disagrees with what the portal shows.
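
One guard against an accidental create-and-destroy is to scan the plan for delete actions before applying it. The sketch below is in Python and assumes a Terraform-style plan JSON, with a `resource_changes` list whose entries carry an `address` and a `change.actions` list, as produced by `terraform show -json`. The article does not name a specific tool, so this layout is an assumption, and the resource addresses in the sample are hypothetical.

```python
import json


def load_plan(path: str) -> dict:
    """Load a plan previously exported as JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_destructive_changes(plan: dict) -> list:
    """Return (address, actions) for every resource the plan would delete.

    A replacement shows up as both "delete" and "create" in the actions
    list, so it is flagged too.
    """
    flagged = []
    for resource_change in plan.get("resource_changes", []):
        actions = resource_change.get("change", {}).get("actions", [])
        if "delete" in actions:
            address = resource_change.get("address", "<unknown>")
            flagged.append((address, actions))
    return flagged


# Hypothetical sample plan, trimmed to the fields the check reads.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "example_network.core", "change": {"actions": ["update"]}},
        {"address": "example_storage.logs", "change": {"actions": ["delete", "create"]}},
        {"address": "example_vault.secrets", "change": {"actions": ["no-op"]}},
    ]
}

if __name__ == "__main__":
    for address, actions in find_destructive_changes(SAMPLE_PLAN):
        print(f"REVIEW BEFORE APPLY: {address} -> {actions}")
```

Run as a gate in the pipeline, a check like this turns a silent replacement into an explicit review step. It does not detect drift between the recorded state and the portal.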
The TCO of IaC for a complex deployment does not include the man-hours required to keep it in working condition and to assist with redeployments and syncing. One-off investigations are too many to count on one hand when deployments are large and complex. The sheer number of resources, and tracking them by names and identifiers, can be exhausting. A sophisticated CI/CD pipeline for managing accounts and deployments is good automation, but it is also likely to be run by several contributors. When edits are allowed and common automation accounts are used, it can be difficult to know who made a change and why.
Some flexibility is required to make judicious use of automation and manual intervention to keep deployments robust. Continuously updating the IaC, especially by the younger members of the team, is not only a comfort but also a necessity. The more mindshare a complex IaC gets, the more likely it is to reduce the costs associated with maintaining it and to dispel some of the limitations mentioned earlier.
As with all solutions, scope and boundaries apply. It is best not to let IaC spread out so much that high-priority, high-severity deployments are affected. It can also be treated like code, with its own index, model, and co-pilot.
References to build the first co-pilot:
1. https://github.com/raja0034/azureml-examples
2. https://github.com/raja0034/openaidemo/blob/main/copilot.py
References: previous articles on IaC