Wednesday, September 13, 2023

Overwatch organization

 

Overwatch can be thought of as an analytics project over Databricks. It collects data from multiple sources such as REST APIs and cluster logs, enriches and aggregates that data, and comes with little or no cost. Audit logs and cluster logs are the primary data sources. Databricks monitors and logs cluster metrics such as CPU utilization, memory usage, network I/O and storage, along with job-related telemetry such as scheduled-job run history, execution times and resource utilization. It also captures notebook execution metrics for individual notebook runs, including execution time, data read/write and memory usage; logging and metrics export, including data from application monitoring tools like DataDog or New Relic, for deeper insight into performance alongside other applications and services; and SQL Analytics monitoring, including query performance and resource utilization.
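As a rough sketch of the consumer side, the enriched and aggregated data can be queried from a notebook like any other Databricks database. The database name overwatch and the view clusterstatefact below are assumptions for illustration only; the actual names are whatever was chosen at deployment.

// Minimal sketch, assuming a consumer database named "overwatch" with a
// clusterstatefact view; substitute the names from your own deployment.
val clusterStates = spark.sql("SELECT * FROM overwatch.clusterstatefact LIMIT 20")
clusterStates.show(truncate = false)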

The Deployment runners used for Overwatch take the following parameters:

ETL Storage prefix

ETL database name

Consumer DB Name

Secret Scope

Secret Key for Databricks PAT Token

Secret Key for EventHub

Event Hub Topic Name

Primordial Date

Max Days

Scopes

These parameters are stored in a CSV file in the deployment folder of the storage account associated with Overwatch and mounted via the ETL storage prefix.
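A minimal sketch of picking that configuration up from the deployment folder is shown below. The storage account, folder and file names are assumptions, and the exact CSV column set varies by Overwatch release, so check the deployment template that ships with your version.

// Minimal sketch, assuming the deployment config CSV lives under the ETL storage prefix.
// Path and file name are assumptions; the columns roughly mirror the parameters listed
// above (storage prefix, database names, secret scope and keys, Event Hub topic,
// primordial date, max days, scopes).
val etlStoragePrefix = "abfss://overwatch@mystorage.dfs.core.windows.net"   // assumed
val configDf = spark.read
  .option("header", "true")
  .csv(s"$etlStoragePrefix/deployment/overwatch_deployment_config.csv")     // assumed file name
configDf.show(truncate = false)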

It would seem, then, that the storage account used with the Overwatch notebook jobs serves both reads and writes: cluster logs are collected for reading, say from a cluster-logs directory, and the corresponding calculations are written to, say, a reports folder within the same account, as <etl_storage_prefix>/cluster-logs and <etl_storage_prefix>/reports. However, the JSON configuration for Overwatch jobs that run for a long time and parse large, plentiful logs is dedicated to each job. It is possible to configure the reads to be served from a location different from the writes by injecting the separate locations into the Overwatch jobs; the default storage-account-qualified locations of the cluster-logs folder and the reports folder are both configurable.
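A minimal sketch of that separation, with the read side and the write side injected as two different paths, might look like the following. Both paths, the glob over the event logs and the aggregation are assumptions standing in for <etl_storage_prefix>/cluster-logs and <etl_storage_prefix>/reports.

// Minimal sketch, assuming cluster logs are read from one location and the derived
// report is written to another. Both paths and the directory layout under
// cluster-logs are assumptions; adjust them to your log delivery configuration.
val clusterLogsPath = "abfss://logs@mystorage.dfs.core.windows.net/cluster-logs"   // assumed read location
val reportsPath     = "abfss://reports@mystorage.dfs.core.windows.net/reports"     // assumed write location

// Spark event logs are newline-delimited JSON; count events per type and persist
// the summary to the reports folder.
val events = spark.read.json(s"$clusterLogsPath/*/eventlog/*/*/*")
events.groupBy("Event").count()
  .write.mode("overwrite")
  .json(s"$reportsPath/event_counts")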

With the newer versions, etl_storage_prefix has been renamed to storage_prefix to indicate that it is just the working directory for Overwatch, and the logs are accessed via the mount_mapping_path variable, which lists the remote locations of log storage as paths different from the ones storage_prefix points to. Therefore, the reports can be written to a location such as abfss://container@account.dfs.core.windows.net on Azure Data Lake Storage, while the cluster logs are read from mounts such as dbfs:/mnt/logs.
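A minimal sketch of building such a mount-mapping file is below. The column names mountPoint and source, the mount point itself and the output path are all assumptions to be confirmed against the documentation for your Overwatch version.

// Minimal sketch, assuming a mount-mapping CSV that maps a mounted log path back
// to its remote storage location. Column names, the mount point and all paths are
// assumptions; verify them against your Overwatch version's documentation.
import spark.implicits._

val mountMapping = Seq(
  ("/mnt/logs", "abfss://logs@mystorage.dfs.core.windows.net/")   // assumed mount and remote source
).toDF("mountPoint", "source")

mountMapping.write
  .option("header", "true")
  .mode("overwrite")
  .csv("abfss://overwatch@mystorage.dfs.core.windows.net/deployment/mount_mapping")  // assumed location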


