Wednesday, September 6, 2023

 

Access control is notorious in Infrastructure-as-Code (IaC) for quickly growing out of bounds and for being rather brittle. Whether it is expressed in declarative templates or in scripts, the number of role assignments grows rapidly with the number of related resources.

A role assignment consists, at a minimum, of a role, an assignee, and a scope. The role can be given either as the fully qualified identifier of its definition or simply by name. The assignee can be one of several principal types, such as User, Group, ServicePrincipal, or ForeignGroup. With each of these types, the role assignment must specify the principal identifier, which is a GUID. Alternatively, the type and the identifier can be replaced with a name for the assignee. A scope must be specified, and it is preferable to give the entire resource ID. A role assignment seems simple, with values passed for just three parameters, but the string of cryptic errors encountered in practice makes it brittle.
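As a rough illustration of those three parameters, the sketch below assembles the argument list for the Azure CLI command `az role assignment create`. The helper name `role_assignment_args` and its parameters are hypothetical; the flags `--role`, `--scope`, `--assignee`, `--assignee-object-id`, and `--assignee-principal-type` are the ones the CLI exposes. It only builds the arguments and does not run anything.

```python
import uuid

# Principal types named in the text above.
PRINCIPAL_TYPES = {"User", "Group", "ServicePrincipal", "ForeignGroup"}


def role_assignment_args(role, scope, principal_id=None,
                         principal_type=None, assignee_name=None):
    """Build (but do not run) the arguments for `az role assignment create`.

    Hypothetical helper: pass either assignee_name, or both
    principal_id (a GUID) and principal_type.
    """
    if not role or not scope:
        raise ValueError("role and scope are both required")
    args = ["az", "role", "assignment", "create",
            "--role", role, "--scope", scope]
    if assignee_name is not None:
        # A name stands in for the type and the identifier.
        args += ["--assignee", assignee_name]
        return args
    if principal_id is None or principal_type not in PRINCIPAL_TYPES:
        raise ValueError("a principal id and a known principal type are required")
    uuid.UUID(principal_id)  # raises ValueError if this is not a GUID
    args += ["--assignee-object-id", principal_id,
             "--assignee-principal-type", principal_type]
    return args
```

Validating the GUID before the command is issued turns a cryptic service-side error into an immediate local one.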

For example, one way the assignee is obtained during IaC is by provisioning the resource with a system-assigned managed identity and then retrieving its identifier by querying the provisioned resource. A simple variable substitution into the role assignment is then attempted, and it fails with a message that the principal ID provided as the assignee must be a valid GUID. The first step is to inspect the resource manually and make sure that the object ID, not the app ID, was used. But the source of the error might be something as inconspicuous as double quotes on either end of the value, which are added as part of the query results and easily escape attention. The quotes around the principal ID must be stripped for the role assignment command to recognize it.
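A minimal sketch of that cleanup step, assuming the query result arrives as a string that may still carry surrounding quotes and whitespace. The function name `clean_principal_id` is hypothetical; `uuid.UUID` from the Python standard library does the validation.

```python
import uuid


def clean_principal_id(raw: str) -> str:
    """Strip whitespace and stray quotes from a queried principal id,
    then verify that what remains is a GUID.

    Raises ValueError when the cleaned value is not a valid GUID.
    """
    candidate = raw.strip().strip("\"'")
    # uuid.UUID rejects anything that is not a well-formed GUID and
    # str() returns it in canonical lowercase, hyphenated form.
    return str(uuid.UUID(candidate))
```

When the identifier comes from the Azure CLI, requesting tab-separated output (`--output tsv`) instead of the default JSON also avoids the quotes in the first place.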

Similarly, scope is often constructed from its elements, but it is best treated as an opaque value and taken directly from the source. Any kind of parsing or reconstruction is prone to scripting errors and to errors in the syntax of the scope itself.
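One way to keep the scope opaque is to read the `id` field from the JSON that a resource query returns and pass it along unchanged. The sketch below assumes the query output is a JSON object with a top-level `id` property, as Azure resource queries typically return; the function name `scope_from_resource` is hypothetical.

```python
import json


def scope_from_resource(resource_json: str) -> str:
    """Return the resource id exactly as reported by the source.

    The value is not split, rebuilt, or otherwise normalized.
    """
    resource = json.loads(resource_json)
    scope = resource.get("id")
    if not isinstance(scope, str) or not scope:
        raise ValueError("query result has no usable 'id' field")
    return scope
```

Because `json.loads` decodes the string value, this also sidesteps the stray-quote problem that plain text substitution invites.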

More importantly, the number of role assignments can be reduced by targeting a higher scope, but the diligence required to group and organize role assignments, so that redundant ones can be avoided or replaced with higher-scoped or more specific assignments, can be daunting. Technical debt is incurred precisely when multiple participants overlook this organization, and as role assignments proliferate, that debt quickly comes back to haunt them.
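A sketch of how such candidates for consolidation might be found, relying on the fact that a role assignment made at a scope is inherited by its child scopes. The `Assignment` type and both function names are hypothetical, and the comparison assumes that scopes are resource IDs whose hierarchy follows their path segments and whose casing is not significant. The scopes are only compared by prefix, never rebuilt.

```python
from typing import List, NamedTuple


class Assignment(NamedTuple):
    principal_id: str
    role: str
    scope: str


def is_ancestor(parent: str, child: str) -> bool:
    """True when `parent` is a strictly higher scope than `child`."""
    p = parent.rstrip("/").lower()
    c = child.rstrip("/").lower()
    # The trailing "/" prevents ".../rg1" from matching ".../rg10".
    return c != p and c.startswith(p + "/")


def redundant_assignments(assignments: List[Assignment]) -> List[Assignment]:
    """Assignments already implied by the same principal holding
    the same role at an ancestor scope."""
    return [
        a for a in assignments
        if any(
            b.principal_id == a.principal_id
            and b.role == a.role
            and is_ancestor(b.scope, a.scope)
            for b in assignments
        )
    ]
```

The result is a list to review rather than to delete blindly, since a narrower assignment may be kept deliberately in anticipation of the broader one being removed.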

Finally, role assignments and network rules are hard to debug when they go missing, so it is in the best interest of the code maintainers to specify these associations right at the time of resource creation. The symptoms of missing rules and assignments are not only difficult to diagnose but also tend to surface first with customers and end users and then work their way backwards. Even with a role assignment properly applied, a request might still return the dreaded 403 Forbidden HTTP status code and message when the root cause is merely cross-network permissions that went missing when the resources were created.

Authentication, authorization, and auditing are the final proof of which declarations work and which do not. Unnecessary declarations must be removed just as diligently as incorrect ones.

IaC state deserves special mention because it sits between the code as the cause and the resource as the effect, and it is easy to overlook. Changes must be propagated forward from the code, written through the state, and applied to the resources. Likewise, modified resources must be propagated backward by importing them into the state and updating the IaC. Both directions must be fully traversed to keep the code, the state, and the resources in sync. In practice, the changes made to keep all three in sync are often spread out over time and distributed among authors, which introduces errors and discrepancies. A baseline in which state, IaC, and the corresponding resources agree is necessary before making incremental changes, and it is equally important to keep them in sync going forward. The best way to do this is to close the gap by enumerating all discrepancies to establish that baseline, and then to put in place the process and the practice that keep the three from drifting apart.
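Enumerating the discrepancies can be sketched as a set comparison across the three views. The function `drift_report` and its inputs are hypothetical: each argument is assumed to be a collection of resource identifiers gathered separately from the code, the state, and the deployed environment. The sketch compares only which resources exist in each view, not their properties.

```python
from typing import Dict, Iterable, List


def drift_report(in_code: Iterable[str],
                 in_state: Iterable[str],
                 in_cloud: Iterable[str]) -> Dict[str, List[str]]:
    """List identifiers on which code, state, and resources disagree."""
    code, state, cloud = set(in_code), set(in_state), set(in_cloud)
    return {
        # Declared in code but never written through to the state.
        "declared_not_in_state": sorted(code - state),
        # Tracked in state but no longer declared in code.
        "in_state_not_declared": sorted(state - code),
        # Tracked in state but missing from the environment.
        "in_state_not_deployed": sorted(state - cloud),
        # Deployed out of band; candidates for import into the state.
        "deployed_not_in_state": sorted(cloud - state),
    }
```

A baseline is reached when every list in the report is empty; running the same comparison on a schedule is one way to enforce that it stays that way.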

 
