Access to data in Azure Kusto (Azure Data Explorer) differs between producers and consumers. This article explains how to ingest data into Kusto as a producer and how to retrieve data as a consumer.
Producers:
There are a few products for publishing data to Kusto (a minimal ingestion sketch follows this list). These include:
· Metadata Mover: Cosmos DB changes produce a change feed that can be consumed to ingest data into Kusto tables.
· Azure Data Factory (ADF): a service designed to bridge disparate data sources. A preconfigured data pipeline is quick and easy to use, connects to both a SQL DB and a KQL (Kusto Query Language) cluster, and allows the creation of scheduled triggers. A pipeline runs as a whole; it will not run an individual activity on its own. ADF requires a system-assigned managed identity (formerly called Managed Service Identity, MSI) and does not support user-assigned managed identities.
· Studio Jobs: replicates the source to the destination in full on every run, including new columns.
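For concreteness, here is a minimal sketch of queued ingestion with the Kusto Python SDK (azure-kusto-ingest). It is a sketch under assumptions, not any of these products' actual implementation: the cluster URL, database, table, and file name are placeholders.

```python
# Minimal queued-ingestion sketch using the azure-kusto-ingest SDK.
# All names below (cluster URL, database, table, file) are placeholders.
from azure.kusto.data import KustoConnectionStringBuilder
from azure.kusto.data.data_format import DataFormat
from azure.kusto.ingest import QueuedIngestClient, IngestionProperties

# Queued ingestion goes through the cluster's ingestion endpoint
# (note the "ingest-" prefix on the URI).
kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(
    "https://ingest-mycluster.westus.kusto.windows.net"
)
client = QueuedIngestClient(kcsb)

props = IngestionProperties(
    database="MyDatabase",       # placeholder database
    table="MyTable",             # placeholder destination table
    data_format=DataFormat.CSV,  # format of the staged file below
)

# Hand the file to the ingestion queue. Kusto pulls and commits it
# asynchronously, so a successful call does not mean the rows are
# queryable yet.
client.ingest_from_file("changes.csv", ingestion_properties=props)
```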
These products primarily take one of two approaches:
· Change tracking: The source must support change tracking and publish changes through a change feed. Each delta should be described in terms of its scope, type, and nature, and must carry a version so that every change can be referred to by its version. The changes can then be applied to Kusto.
· E2E workflow: This is two-stage publishing.
o The first stage performs an initial load from the source to the destination, recording a watermark such as a version or a timestamp for the data transferred.
o The second stage performs periodic incremental loads.
Incremental updates need some progress indicator. Where changes must be applied, prefer overwriting the destination to reading and merging the changes (see the sketch below).
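As an illustration of that guidance, here is a hedged sketch of watermark-driven incremental loading. The source and destination objects and the fetch_changes_since call are hypothetical placeholders; the pattern is what matters: persist the last applied version, request only newer deltas, apply them by overwriting, then advance the watermark.

```python
# Watermark-driven incremental load (sketch). The source/destination
# objects and fetch_changes_since() are hypothetical placeholders.
import json
from pathlib import Path

WATERMARK_FILE = Path("watermark.json")  # placeholder progress store

def load_watermark() -> int:
    """Return the last version applied to the destination (0 = never ran,
    which makes the first run the initial full load)."""
    if WATERMARK_FILE.exists():
        return json.loads(WATERMARK_FILE.read_text())["version"]
    return 0

def save_watermark(version: int) -> None:
    """Persist the progress indicator only after a successful apply."""
    WATERMARK_FILE.write_text(json.dumps({"version": version}))

def run_incremental_load(source, destination) -> None:
    since = load_watermark()
    # Hypothetical source API: each delta carries a scope, type,
    # description, and a monotonically increasing version.
    changes = source.fetch_changes_since(since)
    if not changes:
        return
    # Overwrite the affected rows rather than reading and merging,
    # per the guidance above.
    destination.overwrite(changes)
    save_watermark(max(c["version"] for c in changes))
```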
There can also be a hybrid man-in-the-middle implementation that acts as a consumer for the source and a producer for the destination, but that means implementing both the producer and the consumer in code rather than leveraging the capabilities of these technologies in an ADO pipeline.
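Here is a hedged sketch of that hybrid shape with the Kusto Python SDKs: one process queries the source cluster as a consumer (azure-kusto-data), stages the rows, and queues them into the destination cluster as a producer (azure-kusto-ingest). The cluster URLs, database and table names, and the query are placeholder assumptions.

```python
# Hybrid consumer/producer sketch: query a source Kusto cluster and
# ingest the results into a destination cluster. All URLs, names, and
# the query itself are placeholders.
import csv
import tempfile

from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
from azure.kusto.data.data_format import DataFormat
from azure.kusto.ingest import QueuedIngestClient, IngestionProperties

# Consumer side: run a query against the source cluster.
source_kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(
    "https://source-cluster.kusto.windows.net"
)
consumer = KustoClient(source_kcsb)
response = consumer.execute("SourceDb", "SourceTable | where Version > 42")

# Stage the rows as a CSV file, since queued ingestion works on
# files and blobs rather than in-memory row sets.
with tempfile.NamedTemporaryFile(
    mode="w", suffix=".csv", delete=False, newline=""
) as tmp:
    writer = csv.writer(tmp)
    for row in response.primary_results[0]:
        writer.writerow(row.to_list())
    staged_path = tmp.name

# Producer side: hand the staged file to the destination's ingestion
# endpoint (note the "ingest-" prefix).
dest_kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(
    "https://ingest-dest-cluster.kusto.windows.net"
)
producer = QueuedIngestClient(dest_kcsb)
producer.ingest_from_file(
    staged_path,
    ingestion_properties=IngestionProperties(
        database="DestDb", table="DestTable", data_format=DataFormat.CSV
    ),
)
```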
ADO (Azure DevOps) is a widely used way to create pipelines in Azure, and the task of adding data to Kusto requires connecting your data source to Kusto.
The Azure DevOps project is the fundamental container where data is stored when it is added to Azure DevOps. Since it is both a repository for packages and a place for users to plan, track progress, and collaborate on building workflows, it must scale with the organization. When a project is created, a team with the same name is created alongside it. For an enterprise, a collection-project-team structure is preferable: it gives teams a high level of autonomy and allows administrative tasks to occur at the appropriate level.