Introduction:
Many partner teams want access to the data that a service team
maintains in its inventory. One popular technique for opening up data about the
resources provisioned by the service is a shared Kusto database and
cluster. APIs provide real-time access to the data, whereas Kusto provides
continuously replicated data. The shared Kusto database can lag behind the
source, and the refresh lag varies from table to table depending on table
size and use.
Apart from this lag, the data is expected to be the same in both, and
this article explores when to use one rather than the other.
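
The effect of the refresh lag on the choice of source can be sketched as
follows. This is a minimal illustration and not part of either service: the
function name `choose_source`, the labels "kusto" and "api", and the premise
that the caller knows when a table was last refreshed are all assumptions made
for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def choose_source(last_refresh: datetime,
                  max_staleness: timedelta,
                  now: Optional[datetime] = None) -> str:
    """Pick a data source based on how stale the replicated table is.

    All names here are hypothetical; how the last refresh time of a
    Kusto table is obtained is left to the caller.
    """
    now = now or datetime.now(timezone.utc)
    lag = now - last_refresh
    # Replicated data is acceptable only while its lag stays within what
    # the caller can tolerate; otherwise fall back to the real-time API.
    return "kusto" if lag <= max_staleness else "api"


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = choose_source(now - timedelta(minutes=5), timedelta(minutes=15), now)
stale = choose_source(now - timedelta(hours=2), timedelta(minutes=15), now)
print(fresh, stale)
```

A reporting job that tolerates day-old data would pass a large `max_staleness`,
while a workflow that acts on current state would pass a small one and end up
on the API most of the time.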
Description:
Kusto data access is an option for reading the data directly
from a database. It is very helpful for browsing through the data or
exploring it. APIs encapsulate logic together with access to the data and provide validation,
error translation, and response formatting, along with diagnosability, telemetry,
and troubleshooting help. They are owned by the service team that also owns
the data. The APIs are also versioned, which gives the applications that use them
some reassurance about compatibility and a migration path. APIs are also
performant and already have a fast path for critical scenarios. They are
continuously maintained for streamlined access to the data, with proper controls
and overrides for desired custom behavior. APIs guarantee robustness, predictability,
and SLAs for existing and new capabilities that the service team authors. In
this sense, an API is managed access to the data.
Compare this with Kusto data access, where the
compute required to extract, transform, and load the data is similar in nature
to the implementation of the API but now becomes the do-it-yourself responsibility of
the client team. If client teams invest in building access to the data
via Kusto queries, they also own the maintenance and the total cost of
ownership, which accumulates substantially over extended periods of time. One of the
unaccounted costs of Kusto comes from its fragility. The queries, the formatting
of the data, and the semantics and schema of the data, including deprecations,
are all susceptible to change without any notification to the applications and
their authors. Even the values of the data can change, and assumptions made about
them can break. This implies that the long-term cost of data access is higher for Kusto.
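
One way a client team might contain this fragility is to check every query
result against the columns and types it depends on, so that a silent schema
change fails loudly instead of corrupting downstream logic. The sketch below
assumes rows arrive as dictionaries; the column names, `SchemaDriftError`, and
`validate_rows` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical columns and types that a client query depends on.
EXPECTED_COLUMNS: Dict[str, type] = {
    "ResourceId": str,
    "Region": str,
    "CoreCount": int,
}


class SchemaDriftError(Exception):
    """Raised when a row no longer matches what the client assumed."""


def validate_rows(rows: Iterable[Dict[str, Any]],
                  expected: Dict[str, type] = EXPECTED_COLUMNS
                  ) -> List[Dict[str, Any]]:
    """Return the rows unchanged, or raise on a missing or retyped column."""
    checked = []
    for index, row in enumerate(rows):
        missing = sorted(expected.keys() - row.keys())
        if missing:
            raise SchemaDriftError(f"row {index}: missing columns {missing}")
        for column, expected_type in expected.items():
            if not isinstance(row[column], expected_type):
                raise SchemaDriftError(
                    f"row {index}: column {column!r} is "
                    f"{type(row[column]).__name__}, "
                    f"expected {expected_type.__name__}"
                )
        checked.append(row)
    return checked


good_rows = [{"ResourceId": "r1", "Region": "westus", "CoreCount": 4}]
validated = validate_rows(good_rows)
```

This kind of check is part of the maintenance cost described above: with an
API, the equivalent validation is done once by the service team behind a
versioned contract.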
Another dimension of comparison is storage. The APIs provide
both read and write capabilities, whereas Kusto is essentially read-only,
which mandates local storage for stashed results and transformations during
vectorized executions. The size of that storage also becomes a consideration when the
frequency of access is high. An application that wishes to
enrich the data must make a copy of the original, transform it, and save
it in a local or remote store, but not at the source. If the data supports
user-defined objects and dictionaries, then the APIs provide a way to enhance
them so that the next data access returns the additional state persisted with
the data.
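
The copy-transform-save pattern that read-only access imposes can be sketched
as below. The row shape, the `Tags` field, and the helper names `enrich_copy`
and `save_local` are assumptions for illustration; the point is only that the
enrichment lands in a store the client owns, never at the source.

```python
import copy
import json
import os
import tempfile
from typing import Any, Dict, List


def enrich_copy(rows: List[Dict[str, Any]],
                tags_by_id: Dict[str, Dict[str, str]]) -> List[Dict[str, Any]]:
    """Return an enriched copy; the rows read from the source stay untouched."""
    enriched = copy.deepcopy(rows)
    for row in enriched:
        # Hypothetical enrichment: attach client-side tags keyed by resource.
        row["Tags"] = tags_by_id.get(row["ResourceId"], {})
    return enriched


def save_local(rows: List[Dict[str, Any]], path: str) -> None:
    """Persist the enriched copy in a client-owned local store."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(rows, handle, indent=2)


source_rows = [{"ResourceId": "r1", "Region": "westus"}]
enriched_rows = enrich_copy(source_rows, {"r1": {"owner": "team-a"}})
local_path = os.path.join(tempfile.gettempdir(), "enriched_resources.json")
save_local(enriched_rows, local_path)
```

Every such local copy has to be sized, refreshed, and eventually cleaned up by
the client, which is the storage cost this comparison refers to.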
APIs are therefore the first choice for accessing data, but Kusto
can be useful in automations that cannot wait for the functionality to become
available via APIs published by the service team. Kusto queries are also very
useful for writing one-off automations that are special-purpose or dedicated
and have no impact on customers. Most commercial systems rely on APIs for
interaction between services, especially in production environments and at cloud
scale. In-house projects and reporting dashboards can make use of Kusto
directly, of Azure Data Explorer, or of automations based on them. Kusto queries can also be quite elaborate or
custom defined to suit specific needs, and they are faster to deliver and more lightweight compared
to the staged management, release pipelines, and scheduled delivery of features
introduced into the APIs. Including such queries in background
automation is sometimes useful because that automation does not interact with the
customer and can be kicked off periodically or on demand.
Both Kusto and API data access can be programmatic, involving
the use of a query provider and an HTTP client respectively. But the code for
Kusto data access will likely involve more packing and unpacking into
objects, as well as type conversions, whereas the requests and responses for the API
arrive already composed and versioned along with their corresponding APIs.
The investment in Kusto can be made if the query language provides usefulness
that is not available otherwise or that would require much more code to be written on the
API side. No-code or low-code scenarios prefer the Kusto approach, but those
scenarios do not include cases where the transfer of data must be made formal.
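
The difference in packing and unpacking can be illustrated with in-memory
sample data. This sketch makes no calls to a real query provider or HTTP
client: the tabular shape (column names plus row values), the JSON payload,
its "api_version" value, and the `Resource` type are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass
class Resource:
    resource_id: str
    region: str
    core_count: int


def from_kusto_table(columns: Sequence[str],
                     rows: Sequence[Sequence[Any]]) -> List[Resource]:
    """Unpack a tabular result into objects, converting types by hand."""
    position = {name: index for index, name in enumerate(columns)}
    return [
        Resource(
            resource_id=str(row[position["ResourceId"]]),
            region=str(row[position["Region"]]),
            core_count=int(row[position["CoreCount"]]),
        )
        for row in rows
    ]


def from_api_response(body: str) -> List[Resource]:
    """Map an already composed response onto the same objects."""
    payload = json.loads(body)
    return [Resource(**item) for item in payload["resources"]]


# Tabular result: column order and value types are whatever the query returns.
kusto_columns = ["Region", "ResourceId", "CoreCount"]
kusto_rows = [["westus", "r1", "4"]]

# API response: field names and types follow the (hypothetical) version.
api_body = json.dumps({
    "api_version": "hypothetical-v1",
    "resources": [{"resource_id": "r1", "region": "westus", "core_count": 4}],
})

from_kusto = from_kusto_table(kusto_columns, kusto_rows)
from_api = from_api_response(api_body)
print(from_kusto == from_api)
```

Both paths end at the same objects, but the Kusto path carries the column
lookup and the type conversion in client code, while the API path leans on the
shape that the versioned response already guarantees.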
Conclusion:
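Data access, its mode, and its delivery are governed by factors such as
freshness, cost of ownership, read-write needs, and the formality of the data
transfer, which together weigh in favor of one option over the other.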