Monday, January 30, 2023

 

Zero Cost for IT Cloud Automation and Managed Services:

Introduction: Analogies from the open source total cost of ownership (TCO) debate apply directly to the inventory of an IT provider in any organization. Costs such as acquisition, operations and personnel are not only amplified at the cloud level for an IT provider but are also compounded by the costs of increased workflow automation and complexity. Although the term TCO was coined around the era of the war between managed software and open source, it continues to draw parallels and remains just as relevant in the new IT world dominated by public and private cloud inventories. In this writeup, we review some of these costs in detail.

 

Description: An IT Organization provider has the following costs:

Costs of resource instances: Today we rely more and more on virtual and shared resources, and we now also have more granular and fragmented resources such as containers, virtual networks and migratory applications and services. There is no longer a concept of ownership so much as a weak reference established via tenancy.

Cost of operations and services: Erstwhile notions of patch management, backup and security no longer apply on a periodic assessment basis, and instances are more short-lived than ever. Furthermore, resources are now managed via dedicated agents and a central command that transcend cloud and on-premise boundaries. These services are no longer triggered by hand so much as they are run automatically in response to alerts and notifications.

Cost of manpower: With the increased complexity of cloud computing, more manpower is needed than before. This runs contrary to the general belief that cloud services are self-serve and increasingly automated.
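
To make the three cost categories above concrete, here is a minimal sketch of how they might be summed into a total cost of ownership over a planning period. The class name, the function name and every figure are hypothetical placeholders chosen for illustration; they are not measurements from any real inventory.

```python
from dataclasses import dataclass


@dataclass
class CostProfile:
    """Annual costs for one offering. All figures are hypothetical."""
    resource_instances: float   # compute, storage and network instances
    operations_services: float  # patching, backup, security, monitoring
    manpower: float             # staff needed to run and support the offering

    def annual_total(self) -> float:
        # Sum of the three recurring cost categories for one year.
        return self.resource_instances + self.operations_services + self.manpower


def total_cost_of_ownership(profile: CostProfile, years: int, upfront: float = 0.0) -> float:
    """One-time acquisition cost plus the recurring annual costs over the period."""
    return upfront + years * profile.annual_total()


# Hypothetical private cloud offering: an upfront purchase plus three years of running costs.
private_offering = CostProfile(
    resource_instances=100_000,
    operations_services=60_000,
    manpower=150_000,
)
tco = total_cost_of_ownership(private_offering, years=3, upfront=50_000)
print(f"Three-year TCO: {tco}")
```

Keeping the categories as separate fields makes it easy to see which one dominates once real numbers are plugged in.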

 

Detail: We can reduce the cost in each of the categories above.

Today, when private cloud customers request compute and storage, we add self-service automation and maintenance workflows by default so that the resources are leased and serviced routinely. However, these workflows all use existing infrastructure that has either reached end of life or has not kept pace with the public cloud. Moreover, the discrepancy between the services offered in a private cloud and those offered in the public cloud only grows with the emphasis on legacy tools and platforms. Take examples such as Infoblox and Zabbix, used for our network addressing and monitoring utilities, and the evidence becomes clearer. If we rely on static IP addressing rather than DHCP, the workflows and automations built on these services stack up in layers, and each addition is costly because the foundation is not right. There is no evidence of static IP addressing being used as the primary mode of address assignment in the public cloud. Furthermore, the public cloud is thought through in how it will scale on many fronts, whereas the private cloud groans under scale because of bottlenecks in how these products handle cloud loads. Monitoring with clunky products like Zabbix is another example: a product that merely bundles a UI, a CLI and an SDK does not compare with cloud-scale services such as AWS CloudWatch or Azure Monitor. Automations and scripts are relatively inexpensive compared to products, but misguided automation and misplaced emphasis only lead to more and more technical debt, which may not only weigh down capabilities at some point but also sink the offerings if users find the ubiquity and norm of cloud computing more appealing. This write-up does not belabor the point that betting against the public cloud is foolish; instead, it draws attention to the investments being made in the short term on the private cloud versus the onboarding for the public cloud.
We will look at these differences in investment and come up with an onboarding strategy for the public cloud in a separate section; for now we simply make the case that the differences between the technology stack on the private cloud and its counterparts in the public cloud should be few. For the sake of clarity, we refrain from discussing hybrid cloud, because the state of affairs in the private cloud deserves precedence in this discussion and the differences are much better called out between private and public than between hybrid and anything else.
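
To illustrate the earlier point about static IP addressing versus DHCP, the following is a toy sketch of lease-based assignment, where an address returns to the pool on expiry without any cleanup workflow. The `LeasePool` class, the instance names and the address range are all hypothetical; this is not how Infoblox or any DHCP server is implemented, only a model of the idea.

```python
import ipaddress


class LeasePool:
    """Toy DHCP-like pool: addresses are leased for a fixed time and reclaimed on expiry."""

    def __init__(self, cidr: str, lease_seconds: float):
        # Usable host addresses in the network, in ascending order.
        self.free = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self.lease_seconds = lease_seconds
        self.leases = {}  # instance id -> (address, expiry time)

    def reclaim(self, now: float) -> None:
        # Return every expired address to the free list.
        for instance_id, (address, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[instance_id]
                self.free.append(address)

    def acquire(self, instance_id: str, now: float) -> str:
        # Renew an existing lease, or hand out the next free address.
        self.reclaim(now)
        if instance_id in self.leases:
            address, _ = self.leases[instance_id]
        elif self.free:
            address = self.free.pop(0)
        else:
            raise RuntimeError("address pool exhausted")
        self.leases[instance_id] = (address, now + self.lease_seconds)
        return address


# A /30 network has two usable host addresses.
pool = LeasePool("10.0.0.0/30", lease_seconds=60)
first = pool.acquire("vm-a", now=0)
second = pool.acquire("vm-b", now=0)
# After both leases expire, a new short-lived instance reuses a reclaimed address.
reused = pool.acquire("vm-c", now=61)
print(first, second, reused)
```

With static assignment, the reclaim step would instead be a separate workflow that someone has to build, schedule and maintain, which is the layering cost described above.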

Moreover, our workflows are not that simple. We repackage existing functionality from out-of-the-box providers that are largely on-premise in their deployments, sometimes with a false claim of being cloud enabled. There is a reason why no third-party applications and appliances are put into the mix of resources or services available from a public cloud provider. With a native stack and homogeneous resources, the public cloud can clone regions and avoid the significant costs and labor associated with external purchases, maintenance, relicensing and support tools. These public clouds are betting on people and their deliverables to grow and expand their capabilities, with very few dependencies or churn thrown their way. Naturally, the number and types of services in the public cloud have grown significantly.

The private cloud technology stack must not recreate services from scratch to reach parity with the public cloud; it should be designed to fit over public cloud services in the first place, before extending to legacy hardware and private cloud infrastructure. Infrastructure as a service (IaaS) and platform as a service (PaaS) must be differentiated. Today we go to one or the other based on whether we are requesting resources or software stacks. I believe the right choice is to differentiate the offerings based on usage requests. For example, managed clusters with Marathon stacks should come from IaaS, and the programming stack for deploying applications should come from PaaS. In the former case, we can rely on computing resources from the public cloud.

Some private cloud offerings are hesitant to let go of resources and usages, citing that the equivalent is not available in the public cloud. For example, they say that the operating system flavors and images offered by the private cloud are not available in the same form in the public cloud. In such cases, the argument goes, the private cloud has the advantage that it can offer better variations and customized flavors for the customer. In addition, the machines can be joined to a domain.

This sort of reasoning is fallacious, for two reasons. First, if customers cannot make do with public cloud offerings, they are digging themselves into a rut from which they will find it difficult to climb out later. A private cloud is offered not merely as an alternative in the portfolio of cloud-based investments but, more importantly, because the stack is completely managed by the organization. As such, this ownership-based slice must be a smaller percentage than the leased public cloud investments. Second, the private cloud is not homogeneous. It is based on either OpenStack or VMware stacks, and the tools and services are separate for each. Very few private cloud features make use of both equally.

The differences between private cloud and public cloud also stem from existing and legacy inventory, both of which have their own limitations that require more workarounds and workflow rework, which are costly to build and operate. The rollover and upgrade from existing to new resources become more drawn out as the systems age, because applications written at the time, or customers using those resources, may not move as quickly or nimbly as new applications being written and deployed on, say, PaaS. Customers are slow to act on their compute resources when notified by an email with a link to self-help.

As a use case, take the example of co-ordination software such as Chef, Puppet, Ansible, Salt and BladeLogic, used to manage all the systems whether on-premise or in the cloud. Each of these systems has a purchase cost, relicensing cost, training cost and operational cost, and each adds its own complexity to workflows and automations through what it supports and what it does not. On the other hand, there are tools from the public cloud that span both public cloud and on-premise assets by virtue of an agent installed on the machine that talks over HTTP. These System Center tools from either public cloud are designed for organization-wide asset management and chores, and they have been widely accepted by many companies.
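
The agent model mentioned above can be sketched in a few lines: a small process on each machine gathers basic facts and posts them to a central service over HTTP. The endpoint URL, the payload fields and the function names below are assumptions made for illustration; they do not correspond to the API of any real cloud management product.

```python
import json
import platform
import urllib.request

# Hypothetical central endpoint; a real service defines its own URL and schema.
INVENTORY_ENDPOINT = "https://inventory.example.invalid/api/assets"


def collect_inventory() -> dict:
    """Gather a few basic facts about the machine the agent runs on."""
    return {
        "hostname": platform.node(),
        "os": platform.system(),
        "os_release": platform.release(),
    }


def build_report_request(endpoint: str, inventory: dict) -> urllib.request.Request:
    """Package the inventory as a JSON POST request without sending it."""
    body = json.dumps(inventory).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_report(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the report and return the HTTP status code (not called in this sketch)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_report_request(INVENTORY_ENDPOINT, collect_inventory())
print(request.get_method(), request.full_url)
```

Because the agent only needs outbound HTTP, the same code path can serve a machine on-premise or in the cloud, which is what lets one tool span both.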

Conclusion: The adage of total cost of ownership holds true even in the new IT world, although not by the same name. Managed and unmanaged services show a clear differentiation, and the value lies in favor of managed services anywhere.

 
