Introduction:
This article serves as a total cost of ownership (TCO) calculator that compares the cost of a standalone storage appliance with that of one built natively on the public cloud.
Description:
Many datacenter products are sold as standalone appliances that start out
lean enough to fit on a single host and eventually grow to occupy several
racks. The backend processing for many IT operations is delegated to these
appliances. Object storage is one example: each organization can choose to
run its own private cloud storage.
The following table compares the features and rates the relative cost of
each as Low or High:
| Feature/Subsystem | Standalone appliance | Cloud-native DIY solution |
|---|---|---|
| Organization | Multi-layered, multi-component monolithic application that requires significant bare-metal libraries. Cost: High | Staged and pipelined execution that includes several pre-built Azure resources. Cost: Low |
| Cluster-based architecture for scale-out | Involves deploying specific types of components to control and data nodes, with added costs for a coordinator. Cost: High | State-based reconciliation of control plane resources, including scale-out and replicas. Cost: Low |
| Microservices for each of the components for ease of integration, testing and programmability | Each component targets the same core storage layer, which, if distributed between clusters, relies on message-based consistency algorithms. Depending on code organization, maintenance and individual component health, the costs of shipping software releases accumulate over time. Cost: High | Each service can be included in an app service and a plan, while components are replaced by efficient use of resources. Packing and unpacking of multi-layer blobs and the independent user-access-resolution layers are replaced by pipelined services that add minimal code to existing resources. Message brokering, message passing, pub-sub and other routines are eliminated in favor of dedicated products like Service Bus, while the algorithm remains the same. Code reduction and independent releases result in cost savings. Cost: Low |
| Since the user namespace hierarchy, user object management, web user interface and virtual data centers are implemented independently as layers, the flexibility to provide business functionality can remain shallow and restricted to the upper layers or frontend | Behind the scenes, the system architecture allows changes to be restricted to the frontend or middle tier, including data access. Most features can be added in a single-shot feature delivery, but the cost often includes metadata changes that might also be persisted to the store. Most features that require persistence reuse the store. Cost: High | Behind the staged pipeline and region-based storage accounts, feature implementations rely on nothing more than a message queue and a database. Custom logic can be added via extensions and functions that are easy to add without impacting the rest of the organization. Cost: Low |
| DIY libraries and code | Significant investment. Cost: High | Little or no investment, leveraging available resources. Cost: Low |
| Objects owned by a virtual data center within a replication group need to be replicated | Code must be written to replicate readable objects from one virtual data center (VDC) to another. Three nodes might be chosen from a pool of cluster nodes for the writes. For example, the storage engine records the disk locations of a chunk in a chunk location index, and the chunk is written to three different disks/nodes. The index locations are chosen independently from the object chunk locations. The VDC needs to know the location of the object. Directories, such as those for object locations, might be designated for different purposes. Cost: High | Syncing across availability zones is built into the Azure resources. Although this might not be exposed to those who invoke the resources, they can designate regions as read-write or read-only. Azure Cosmos DB, for instance, supports automatic replication across regions. If a storage engine layer must be written on top of the cloud resources, it may still have to implement its own replication, but usages involving existing data stores can leverage an Azure store, cache or CDN with automatic replication. Cost: Low |
| Query execution engine | A storage engine could offer standard query operators for the query language if the entire data set is treated as enumerable. To avoid enumerating everything, efficient lookup data structures such as B+ trees are used. These indexes can be saved in the storage itself to enable faster lookups later. Cost: High | Instead of requiring preparation, resolving, compilation, plan creation, plan optimization and caching of plans, objects and their heuristics, the cloud services provide simpler indexing and searching capabilities that span document types, not just individual documents. Besides the operational advantages of using these services from the cloud, this simplifies the search experience. Cost: Low |
| Analysis engine. The reporting stack has always been a read-only stack, which made it possible to interchange analysis stacks independently of strict or eventually consistent writes. | A storage engine with its own reporting stack is a significant investment for that product, even if the query interfaces are exposed as standard query operators. Cost: High | Many analytical stacks can easily connect to the storage via existing connectors, reducing the need for integration. Analysis services from the public cloud are rich, robust and flexible to work with. Cost: Low |
Conclusion: 
A TCO calculator makes the case for reimagining the storage appliance as one
built for the cloud, so that the on-premises footprint of individual
organizations is minimized.
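
As a rough illustration of how the Low/High ratings from the comparison could be rolled up into a single figure, the Python sketch below tallies them per column. The ratings are transcribed from the table, with shortened row labels; the numeric weights in `COST_WEIGHT` are purely an assumption made for this example and do not represent real prices.

```python
# Ratings transcribed from the comparison table:
# (feature, standalone appliance, cloud-native DIY solution)
RATINGS = [
    ("Organization", "High", "Low"),
    ("Cluster-based scale-out", "High", "Low"),
    ("Microservices", "High", "Low"),
    ("Layered business functionality", "High", "Low"),
    ("DIY libraries and code", "High", "Low"),
    ("Replication", "High", "Low"),
    ("Query execution engine", "High", "Low"),
    ("Analysis engine", "High", "Low"),
]

# Hypothetical relative weights, NOT from the article: tune to your own estimates.
COST_WEIGHT = {"Low": 1, "High": 3}


def relative_cost(ratings, column):
    """Sum the weighted ratings found in the given column (1 or 2) of each row."""
    return sum(COST_WEIGHT[row[column]] for row in ratings)


standalone_total = relative_cost(RATINGS, 1)
cloud_native_total = relative_cost(RATINGS, 2)

print(f"Standalone appliance:      {standalone_total}")
print(f"Cloud-native DIY solution: {cloud_native_total}")
```

Replacing the weights with per-feature cost estimates in a real currency would turn this qualitative tally into a more conventional TCO figure.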
