Introduction:
This article serves as a TCO calculator, comparing the cost of an isolated storage appliance with that of one built natively on the public cloud.
Description:
Many datacenter products are sold as isolated standalone appliances that
start out lean enough to fit on a single host and eventually justify their
own expansion to several racks. The backend processing for many IT
operations is delegated to these appliances. Object storage is one example,
where each organization can choose to run its own private cloud storage.
The table below compares the features of each approach and rates their
relative cost as Low or High:
Feature/Subsystem | Standalone appliance | Cloud native DIY solution |
Organization | Multi-layered, multi-component monolithic application that requires significant bare-metal libraries. Cost: High | Staged and pipelined execution that includes several pre-built Azure resources. Cost: Low |
Cluster-based architecture for scale-out | Involves deploying specific types of components to control and data nodes, with added costs for a coordinator. Cost: High | State-based reconciliation of control plane resources, including scale-out and replicas. Cost: Low |
Microservices for each of the components for ease of integration, testing and programmability | Each component targets the same core storage layer, which, if distributed between clusters, relies on message-based consistency algorithms. Depending on code organization, maintenance and individual component health, the costs of shipping software releases accumulate over time. Cost: High | Each service can be included in an app service and a plan, while components are replaced by efficient use of resources. Packing and unpacking of multi-layer blobs and the independent user-access-resolution layers are replaced by pipelined services that add minimal code to existing resources. Message brokering, message passing, pub-sub and other routines are eliminated in favor of dedicated products like Service Bus, while the algorithm remains the same. Code reduction and independent releases result in cost savings. Cost: Low |
The user namespace hierarchy, user object management, web user interface and virtual data centers are implemented independently as layers, so changes that add business functionality can remain shallow and restricted to the upper layers or the frontend | Behind the scenes, the system architecture allows changes to be restricted to the frontend or the middle tier, including data access. Most features can be added in a single-shot feature delivery. However, the cost often includes metadata changes that might also be persisted to the store. Most features that require persistence reuse the store. Cost: High | Behind the staged pipeline and region-based storage accounts, the feature implementations rely on nothing more than a message queue and a database. Custom logic can be added via extensions and functions that are easy to add without impacting the rest of the organization. Cost: Low |
DIY libraries and code | Significant investment. Cost: High | Little or no investment, leveraging available resources. Cost: Low |
Objects owned by a virtual data center within a replication group must be replicated | Code must be written to replicate readable objects from one virtual data center (VDC) to another. Three nodes might be chosen from a pool of cluster nodes for the writes. For example, the storage engine records the disk locations of the chunk in a chunk location index, and the disk locations corresponding to the chunk are written to three different disks/nodes. The index locations are chosen independently of the object chunk locations. The VDC needs to know the location of the object. Directories, such as those for object locations, might be designated for different purposes. Cost: High | Syncing across availability zones is built into the Azure resources. Although this might not be exposed to the resource invokers, they are free to create regions for read-write and read-only access. Cosmos DB, for instance, supports automatic replication across regions. If a storage engine layer must be written on top of the cloud resources, it may still have to implement its own replication, but usages involving existing data stores can leverage an Azure store, cache or CDN with automatic replication. Cost: Low |
Query execution engine | A storage engine could offer standard query operators for the query language if the entire data set were treated as enumerable. To collapse the enumeration, efficient lookup data structures such as B+ trees are used. These indexes can be saved in the storage itself to enable faster lookups later. Cost: High | In contrast to the preparation, resolution, compilation, plan creation, plan optimization and caching of plans, objects and their heuristics that a query engine requires, cloud services provide simpler indexing and searching capabilities that span document types, not just individual documents. Besides the operational advantages of using these services from the cloud, this simplifies the search experience. Cost: Low |
Analysis engine: the reporting stack has always been a read-only stack, which makes it possible to interchange analysis stacks independently of whether writes are strictly or eventually consistent | A storage engine with its own reporting stack is a significant investment for that product, even if the query interfaces are exposed as standard query operators. Cost: High | Many analytical stacks can easily connect to the storage via existing, readily available connectors, reducing the need for integration. Analysis services from the public cloud are rich, robust and very flexible to work with. Cost: Low |
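To make the replication row of the table more concrete, here is a minimal sketch of the kind of code a standalone appliance would have to own: picking three distinct nodes from a pool for a chunk write and recording them in a chunk location index. The names place_chunk, locate_chunk and the plain dictionary used as the index are hypothetical and chosen only for illustration; a real storage engine would persist the index and handle node failures, which this sketch does not attempt.

```python
import random


def place_chunk(chunk_id, nodes, chunk_index, replicas=3, rng=None):
    """Pick `replicas` distinct nodes for a chunk and record them in the index.

    `chunk_index` is a plain dict standing in for the chunk location index
    (an assumption for illustration, not the appliance's real structure).
    """
    if len(nodes) < replicas:
        raise ValueError("not enough nodes for the requested replica count")
    rng = rng or random.Random()
    # random.sample returns distinct elements, so no node is chosen twice.
    chosen = rng.sample(nodes, replicas)
    chunk_index[chunk_id] = chosen
    return chosen


def locate_chunk(chunk_id, chunk_index):
    """Return the nodes recorded for a chunk, or an empty list if unknown."""
    return chunk_index.get(chunk_id, [])


if __name__ == "__main__":
    index = {}
    pool = ["node-%d" % i for i in range(5)]
    place_chunk("chunk-1", pool, index)
    print(locate_chunk("chunk-1", index))
```

Even this toy version hints at why the row is rated High: placement, indexing and lookup all become code that the appliance vendor must write, test and ship, whereas the cloud native column delegates the same concern to the platform.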
Conclusion:
A TCO calculator makes the case for reimagining the storage appliance as
one built for the cloud, so that the on-premises footprint of individual
organizations is minimized.
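As a rough illustration of how the Low and High ratings could be rolled up into a single comparison, the sketch below tallies them per column. The row labels are shortened from the table, and the numeric weights (Low = 1, High = 3) are purely hypothetical placeholders, since the article assigns no figures; substitute real cost estimates before drawing any conclusions.

```python
# Ratings transcribed from the comparison table as (standalone, cloud native).
# Row labels are abbreviated here for readability.
RATINGS = {
    "Organization": ("High", "Low"),
    "Cluster-based scale-out": ("High", "Low"),
    "Microservices": ("High", "Low"),
    "Independent layers": ("High", "Low"),
    "DIY libraries and code": ("High", "Low"),
    "Replication": ("High", "Low"),
    "Query execution engine": ("High", "Low"),
    "Analysis engine": ("High", "Low"),
}

# Hypothetical weights: the article gives no numbers, so these are placeholders.
WEIGHTS = {"Low": 1, "High": 3}


def tally(ratings, weights=WEIGHTS):
    """Sum the weighted ratings for each column of the table."""
    standalone = sum(weights[s] for s, _ in ratings.values())
    cloud = sum(weights[c] for _, c in ratings.values())
    return standalone, cloud


if __name__ == "__main__":
    standalone_total, cloud_total = tally(RATINGS)
    print("Standalone appliance:", standalone_total)
    print("Cloud native DIY solution:", cloud_total)
```

With these placeholder weights the tally comes to 24 for the standalone appliance and 8 for the cloud native solution, which only restates that every row of the table rates the appliance High and the cloud option Low.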