Monday, June 8, 2020

Application troubleshooting continued

Application Monitoring:

Flink runtime monitoring has a dedicated solution in the Prometheus stack. This stack comprises metrics, which are time-series; labels, which are key-value pairs attached to those series; scraping, which fetches metrics from targets; the TSDB, which is the Prometheus storage layer; and PromQL, which is the query language used for charts, graphs and alerts. Dashboards are available via Grafana. All it takes to set up this stack is to drop the reporter jar into Flink's lib directory and to configure conf/flink-conf.yaml. The Prometheus service can be configured via its prometheus.yml configuration or by service discovery. The stack is helpful even when we want to define custom metrics, since Flink natively supports gathering and exposing metrics to external systems.
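As a sketch, the two configuration changes might look like the following. The reporter settings are from Flink's documented Prometheus reporter; the hostnames and port range in the scrape configuration are placeholders that would vary per deployment.

```yaml
# conf/flink-conf.yaml -- enable the Prometheus reporter
# (requires the flink-metrics-prometheus reporter jar in Flink's lib/ directory)
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9250-9260
```

```yaml
# prometheus.yml -- scrape the Flink reporter endpoints (targets are illustrative)
scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['flink-jobmanager:9250', 'flink-taskmanager:9250']
```

With this in place, Flink's built-in and custom metrics become queryable in PromQL and chartable in Grafana without any further plumbing.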

 

Kubernetes provides events, logs, metrics and audit information for all actors and their activities on the assets it maintains. All of this data may be collected centrally in the form of JSON text and destined for services that are dedicated to improving analysis of and insights into the machine data.
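A minimal sketch of what central collection could look like: serializing a Kubernetes-style event into a single JSON line suitable for a log pipeline. The field names here are illustrative, not a fixed Kubernetes schema.

```python
import json
from datetime import datetime, timezone

def to_json_line(event):
    """Serialize a Kubernetes-style event into one JSON line for central collection.

    The envelope fields are illustrative assumptions, not an official schema.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "kind": "Event",
        "namespace": event.get("namespace", "default"),
        "reason": event.get("reason"),
        "message": event.get("message"),
    }
    return json.dumps(entry)

line = to_json_line({"namespace": "prod", "reason": "Scheduled",
                     "message": "Successfully assigned prod/web-1 to node-3"})
print(line)
```

One JSON object per line keeps each entry independently parseable by whatever downstream analysis service consumes the stream.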

Content can be wrapped with an augmented set of key-value pairs that can be used in queries for filtering, transforming, mapping and reducing the operational machine data into more meaningful reports. Kubernetes is well-positioned to wrap individual data entries with metadata, not only enhancing the content but doing so authoritatively and irrespective of downstream destination systems.

This envelope of metadata surrounding each entry may consist of predefined labels and annotations, timestamps or attributes of the point of origin, and any additional extractable key-value pairs from the data itself. Since Kubernetes is the source of truth for the runtime operations associated with hosting the applications, these wrappings need to be done only once per data entry.
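The envelope idea above can be sketched as follows. The envelope shape and the sample entries are assumptions for illustration; the point is that downstream queries filter on the envelope instead of parsing raw content.

```python
from datetime import datetime, timezone

def wrap(entry, labels=None, annotations=None, origin=None):
    """Wrap a raw data entry in a metadata envelope, applied once per entry."""
    return {
        "labels": labels or {},
        "annotations": annotations or {},
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "origin": origin or {},
        "data": entry,
    }

enveloped = [
    wrap({"msg": "GET /health 200"}, labels={"app": "web"}, origin={"node": "node-1"}),
    wrap({"msg": "OOMKilled"}, labels={"app": "worker"}, origin={"node": "node-2"}),
]

# Downstream: filter on envelope labels rather than inspecting raw content.
web_entries = [e for e in enveloped if e["labels"].get("app") == "web"]
print(len(web_entries))
```

Because the envelope is attached once at the source, every downstream system sees the same authoritative metadata regardless of where the data is ultimately routed.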

However, if system administrators are allowed to write rules with which to inject custom key-value pairs into the labels and annotations surrounding each entry, then the querying associated with the data may improve, since the input comes not just from the system but also from the rules defined by the system administrator. This set of rules is evaluated by a classifier that executes on every data entry exactly once. The rules may have intrinsics and operators that evaluate against, say, day of week, peak versus non-peak hour periods, and traffic characteristics such as the five-tuple attributes of a flow, and so on.
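Such a classifier might be sketched like this, with the rule set, rule shape, and sample entry all being hypothetical. Each rule pairs an administrator-written predicate with the label it injects, and the classifier evaluates all rules exactly once per entry.

```python
from datetime import datetime

# Each rule is a (predicate, key, value) triple authored by an administrator.
RULES = [
    (lambda e: e["time"].weekday() >= 5, "period", "weekend"),
    (lambda e: 9 <= e["time"].hour < 17, "load", "peak"),
    (lambda e: e.get("flow", {}).get("dst_port") == 443, "traffic", "https"),
]

def classify(entry):
    """Evaluate the full rule set once against an entry; return injected labels."""
    labels = {}
    for predicate, key, value in RULES:
        if predicate(entry):
            labels[key] = value
    return labels

entry = {"time": datetime(2020, 6, 8, 10, 30),  # a Monday, mid-morning
         "flow": {"src_ip": "10.0.0.5", "dst_port": 443}}
print(classify(entry))  # {'load': 'peak', 'traffic': 'https'}
```

Keeping the rules as data rather than code paths lets administrators add or retire classifications without touching the pipeline itself.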

By enhancing the envelope as well as the evaluator that wraps each entry, the downstream systems are guaranteed multiple perspectives on individual entries that were simply not possible with the native Kubernetes framework alone.

A classifier that adds labels and annotations within the Kubernetes framework to enrich the native events will significantly improve the capabilities of downstream listeners and their alerts and reports.

 

