Wednesday, June 16, 2021

 Let us take a closer look at how the monitoring data is gathered and analyzed internally within the cloud. The architecture behind Azure Monitoring is a diagnostics pipeline.  This pipeline consists of an ingestion gateway, a delivery service, a distributed graph, a normalizer service and scrubber, logs to metrics converter, and an uploader to a global database. The pipeline support ingestion, streaming, transformations, and querying. This is its hallmark. All these paths are supported end-to-end via the pipeline without any interference from each other. 

The idea behind the monitoring pipeline is one of queuing and pub-sub mechanisms. Logs and metrics flow from gateways to storage queues, where blobs are listened for, scrubbed, forwarded to event hubs, and uploaded to different destinations such as CosmosDB, Azure Data Lake Storage (ADLS), and delivery services. The rate of flow to the queues can be throttled and the schema hints can be propagated to the storage where the schema and notifications power the analytics. The metrics accumulation in an MDM facilitates the logic for throttling and rate adjustments while the schemas are mostly published and queried via Kusto. 

Configurations for different storage containers, queues, and hubs are defined between the collection and the delivery services. These are called Monikers and it is a pairing of Event hub and storage account. The ingestion service is responsible to connect the monitoring agent with its storage account. The use of this service reduces the number of monikers, the number of blob writes to storage, and the complexity of the distributed graph representation. The storage is billed in terms of transactions and what would earlier take hundreds of transactions and blob writes, would require only tens of transactions using the ingestion or ingress service. It can also aggregate the blobs before writing them to the storage account. 

The corresponding egress service is the delivery service and can be considered an equivalent of Apache Kafka. It comes with a set of producer and consumer definitions and this pub-sub service operates at the event level. There is an application programmability interface provided for consumers who would like to define the monikers instead of the control on the events. The setting up of monikers determines where and how the data is delivered and the use of monikers reduces the bill in an equivalent way to how the ingress did. The destinations are usually Kusto clusters and event hubs. The delivery service forms the core of the pipeline with agents and ingestion pouring data to storage defined by monikers. At the other end of the pipeline are the event hubs and Kusto clusters. 

Collection and Storage have pre-requisites. For example, when virtual machines are created, they automatically have a monitoring agent (MA) installed. This agent reaches out to a collection service with an intent to write and define a namespace. The handshake between the monitor and the agent gives the agent the configuration necessary to direct its data to a destination Moniker which can scale automatically for the storage account. 

Unlike the collection and the storage, which are automatically provisioned, the delivery and the paths are set up by the customer using the application programmability interfaces in the extensibility SDK associated with the delivery services. The delivery service then concerns itself merely with the resolving of monikers, the listening on the monikers, the filtering of the events, and its delivery to the Kusto clusters and event hubs. If the destination is unreachable or unavailable, the data is handed off to the snapshot delivery service which reads the delivery service namespaces for retries. The data is never put in memory when the delivery service forwards the data to a cache under a namespace key. The snapshot delivery service acts as the standby destination in place of the unreachable one. 


No comments:

Post a Comment