Tuesday, November 9, 2021

Cost comparisons between standalone products and cloud native solutions

 

Introduction:

This article works as a total cost of ownership (TCO) calculator: it compares the cost of an isolated storage appliance with that of a storage solution native to the public cloud.

Description:

Many datacenter products are sold as standalone appliances. They start out lean enough to fit on a single host and eventually justify their own expansion to several racks. The backend processing for many IT operations is delegated to these appliances. Object storage is one example, where each organization can choose to run its own private cloud storage.

The comparison below lists each feature or subsystem, followed by how a standalone appliance and a cloud native DIY solution handle it, with the relative cost of each rated High or Low:

Feature/Subsystem

Standalone appliance

Cloud native DIY solution

Organization

A multi-layered, multi-component monolithic application that requires significant bare-metal libraries. Cost: High

Staged, pipelined execution built from several pre-built Azure resources. Cost: Low

Cluster based architecture for scale out

Involves deploying specific component types to control nodes and data nodes, with additional cost for a coordinator. Cost: High

State-based reconciliation of control plane resources, including scale out and replicas. Cost: Low

Microservices for each of the components for ease of integration, testing and programmability

Each component targets the same core storage layer, which, if distributed across clusters, relies on message-based consistency algorithms. Depending on code organization, maintenance and the health of individual components, the cost of shipping software releases accumulates over time. Cost: High

Each service can be hosted in an app service with its own plan, while components are replaced by efficient use of existing resources. The independent layers for packing and unpacking multi-layer blobs and for user access resolution are replaced by pipelined services that add minimal code to existing resources. Custom message broker, message passing and pub-sub routines are eliminated in favor of dedicated products like Service Bus, while the algorithm remains the same. Code reduction and independent releases result in cost savings. Cost: Low

Because the user namespace hierarchy, user object management, web user interface and virtual data centers are implemented as independent layers, changes that add business functionality can stay shallow and restricted to the upper layers or the frontend

Behind the scenes, the system architecture allows changes to be restricted to the frontend or the middle tier, including data access. Most features can be delivered in a single release, but the cost often includes metadata changes that might also be persisted to the store. Most features that require persistence reuse the store. Cost: High

Behind the staged pipeline and region-based storage accounts, feature implementations rely on nothing more than a message queue and a database. Custom logic can be added via extensions and functions without impacting the rest of the organization. Cost: Low

DIY libraries and code

Significant investment. Cost: High

Little or no investment, because available resources are leveraged. Cost: Low

Objects owned by a virtual data center within a replication group will need to be replicated.

Code must be written to replicate readable objects from one virtual data center (VDC) to another. Three nodes might be chosen from a pool of cluster nodes for the writes. For example, a chunk is written to three different disks/nodes, and the storage engine records those disk locations in a chunk location index. The index locations are chosen independently from the object chunk locations. The VDC needs to know the location of the object. Directories, such as the one for object locations, might be designated for different purposes. Cost: High

Syncing across availability zones is built into the Azure resources. Although this might not be exposed to callers of the resource, they can still configure regions as read-write or read-only. Cosmos DB, for instance, supports automatic replication across regions. If a storage engine layer must be written on top of the cloud resources, it may still have to implement its own replication, but use cases built on existing data stores can leverage an Azure store, cache or CDN with automatic replication. Cost: Low

Query execution engine

A storage engine could offer standard query operators for the query language if the entire data set were treated as enumerable. To avoid enumerating everything, efficient lookup data structures such as B+ trees are used. These indexes can be saved in the storage itself to enable faster lookups later. Cost: High

Instead of requiring query preparation, name resolution, compilation, plan creation, plan optimization and caching of plans, objects and their heuristics, cloud services provide simpler indexing and search capabilities that span document types, not just individual documents. Besides the operational advantages of using these services from the cloud, this simplifies the search experience. Cost: Low

Analysis engine

The reporting stack has always been read-only, which makes it possible to swap analysis stacks independently of whether writes are strictly or eventually consistent.

 

A storage engine with its own reporting stack is a significant investment for that product, even if the query interfaces are exposed as standard query operators. Cost: High

Many analytical stacks can connect to the storage through existing connectors, reducing the need for integration work. Analysis services from the public cloud are rich, robust and flexible to work with. Cost: Low
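To make the replication cost of the standalone appliance concrete, here is a minimal sketch of the replica placement described in the replication row above: three nodes are picked from a pool and the choice is recorded in a chunk location index. The names ChunkLocationIndex and place_chunk are hypothetical, and random selection is an assumption, since the article does not say how nodes are chosen.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ChunkLocationIndex:
    """Hypothetical index: maps a chunk id to the nodes holding its copies."""
    locations: dict = field(default_factory=dict)

    def record(self, chunk_id, nodes):
        self.locations[chunk_id] = list(nodes)

    def lookup(self, chunk_id):
        # Unknown chunks have no recorded locations.
        return self.locations.get(chunk_id, [])


def place_chunk(chunk_id, pool, index, replicas=3, rng=None):
    """Pick `replicas` distinct nodes from the pool and record them in the index.

    Random choice is an assumption made for illustration only.
    """
    if len(pool) < replicas:
        raise ValueError("not enough nodes for the requested replica count")
    rng = rng or random.Random()
    chosen = rng.sample(pool, replicas)  # sample() never repeats a node
    index.record(chunk_id, chosen)
    return chosen
```

Even this toy version has to deal with pool sizing and index bookkeeping, which is the kind of code the cloud native column avoids by relying on built-in replication.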

 

Conclusion:

A TCO calculator makes the case for reimagining the storage appliance as one built for the cloud, so that the on-premises footprint of individual organizations is minimized.
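As a rough illustration of how the High and Low ratings in the comparison could feed a calculator, the sketch below tallies them per column. The ratings are taken from the comparison above; the short row labels and the numeric weights (Low = 1, High = 3) are assumptions chosen only for this example, since the article rates costs qualitatively.

```python
from dataclasses import dataclass

# Hypothetical weights: the article only says "High" or "Low".
RATING_WEIGHT = {"Low": 1, "High": 3}


@dataclass(frozen=True)
class Row:
    feature: str
    standalone: str
    cloud_native: str


# Ratings copied from the comparison; feature labels are shortened.
ROWS = [
    Row("Organization", "High", "Low"),
    Row("Cluster based scale out", "High", "Low"),
    Row("Microservices per component", "High", "Low"),
    Row("Layered business functionality", "High", "Low"),
    Row("DIY libraries and code", "High", "Low"),
    Row("Replication", "High", "Low"),
    Row("Query execution engine", "High", "Low"),
    Row("Analysis engine", "High", "Low"),
]


def tally(rows, weights=RATING_WEIGHT):
    """Sum the weighted ratings for each column."""
    return {
        "standalone": sum(weights[r.standalone] for r in rows),
        "cloud_native": sum(weights[r.cloud_native] for r in rows),
    }


if __name__ == "__main__":
    print(tally(ROWS))
```

With these assumed weights the standalone column totals 24 and the cloud native column 8. Real figures would need actual pricing, so the totals only show the direction of the comparison.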
