Monday, September 14, 2020

Metrics

We were discussing a set of features for the stream store that brings the notion of accessing events in sorted order with skip-level traversal. The events can be considered to be in some predetermined sequence in the event stream, whether by offset or by timestamp, and these sequence numbers are in sorted order. Accessing any event in the stream, when it is treated as a batch bounded by a head and a tail StreamCut that occur immediately before and after the event respectively, is now faster than a linear traversal to read the event. This makes access to an event from the historical set of events in the stream O(log N). The skip-level access links, in the form of head and tail StreamCuts, can easily be built into the metadata on a catch-up basis after the events have accrued in the stream.
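To make the skip-level idea concrete, here is a minimal sketch in Java, assuming a catalog that maps an event's sequence number to its pair of bounding StreamCuts; the CutPair class and the string tokens standing in for StreamCuts are placeholders for illustration, not the stream store's actual API. A sorted map gives the O(log N) lookup mentioned above.

import java.util.Map;
import java.util.TreeMap;

// Hypothetical catalog entry: head and tail StreamCut tokens bounding one event.
class CutPair {
    final String headCut;
    final String tailCut;
    CutPair(String headCut, String tailCut) { this.headCut = headCut; this.tailCut = tailCut; }
}

public class SkipLevelIndex {
    // Sequence number (offset or timestamp) -> bounding StreamCuts.
    private final TreeMap<Long, CutPair> catalog = new TreeMap<>();

    // Built on a catch-up basis as events accrue in the stream.
    public void record(long sequenceNumber, String headCut, String tailCut) {
        catalog.put(sequenceNumber, new CutPair(headCut, tailCut));
    }

    // O(log N) lookup of the bounded batch that contains the event,
    // instead of a linear traversal from the start of the stream.
    public CutPair locate(long sequenceNumber) {
        Map.Entry<Long, CutPair> entry = catalog.floorEntry(sequenceNumber);
        return entry == null ? null : entry.getValue();
    }
}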

In addition, we have the opportunity to collect the fields and the possible values that occur in the events so that they can be leveraged in queries later. This enrichment of metadata from events in the stream becomes useful for finding similar events.
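As a rough illustration of this enrichment, the following sketch accumulates, for every field seen in an event, the set of values it has taken; it assumes the events have already been deserialized into flat key-value maps, which is an assumption for this example rather than part of the stream store.

import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FieldCatalog {
    // Field name -> distinct values observed across events.
    private final Map<String, Set<String>> fieldValues = new HashMap<>();

    // Called for every event read from the stream; the event is assumed
    // to have been deserialized into a flat key-value map beforehand.
    public void enrich(Map<String, String> event) {
        for (Map.Entry<String, String> kv : event.entrySet()) {
            fieldValues.computeIfAbsent(kv.getKey(), k -> new HashSet<>())
                       .add(kv.getValue());
        }
    }

    // Values seen for a field, useful for finding similar events later.
    public Set<String> valuesOf(String field) {
        return fieldValues.getOrDefault(field, Collections.emptySet());
    }
}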

The use of standard query operators over the events in the stream has been made possible by the Flink programming library, but the logic written with those operators is usually not aware of all the fields that have been extracted from the events. By closing the gap between field extraction and the availability of new fields in query logic, applications can not only improve existing logic but also write new logic.
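A small example of what such query logic might look like with the Flink DataStream API follows; the in-memory sample events and the severity= field convention are stand-ins for a real connector source and for the extraction step described above.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SeverityQuery {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for events read from the stream store; in practice this
        // would be a connector source, and the "severity=" field would come
        // from the extraction step described above.
        DataStream<String> events = env.fromElements(
                "severity=ERROR msg=disk failure",
                "severity=INFO msg=heartbeat",
                "severity=ERROR msg=timeout");

        // Standard query operators can now key off the extracted field.
        events.filter(line -> line.contains("severity=ERROR"))
              .print();

        env.execute("severity query");
    }
}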

The extraction of fields and their values provides an opportunity to discover not only the range of values that certain keys can take across all the events in the stream but also their distribution. Events are numerous, and there is no go-to source for statistics about events, especially similar-looking events. Two streams holding similar events may have them in very different orders and with very different arrival times. If the stream store is unaware of the contents of the events, it can report only the number of events and their total size. But with some insight into the events, such as information about their source, a whole new set of metrics becomes available that can help with summary information, point-of-origin troubleshooting, contribution/spread calculation, and better resource allocation.
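As an illustration of the kind of metric that insight into event contents enables, the sketch below keeps per-source counts and byte totals and derives a contribution figure from them; the source field and the observe method are hypothetical names used only for this example.

import java.util.HashMap;
import java.util.Map;

public class SourceMetrics {
    // Source -> number of events and total payload size contributed.
    private final Map<String, Long> countBySource = new HashMap<>();
    private final Map<String, Long> bytesBySource = new HashMap<>();

    // Called once per event with the source extracted from its contents.
    public void observe(String source, int payloadSize) {
        countBySource.merge(source, 1L, Long::sum);
        bytesBySource.merge(source, (long) payloadSize, Long::sum);
    }

    // Contribution/spread: fraction of the stream attributable to one source.
    public double contribution(String source) {
        long total = countBySource.values().stream().mapToLong(Long::longValue).sum();
        return total == 0 ? 0.0 : (double) countBySource.getOrDefault(source, 0L) / total;
    }
}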

As with many metrics, there is a sliding window for timestamp-based datapoint collection for the same metric. Although metrics support flexible naming conventions, prefixes, and paths, the same metric may accumulate a lot of datapoints over time. A sliding window presents a limited range that supports aggregation functions such as latest, maximum, minimum, and count over that range for subsequent trend analysis. Queries on metrics are facilitated with the help of annotations on the metric data and pre-defined metadata. These queries can use any language, but queries using search operators are preferred for their similarity to shell-based execution environments. In this way metrics provide handy information about the stream that would otherwise have to be obtained by running offline analysis on logs. Summary statistics from metrics can now be saved with the metadata.
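The sliding-window aggregation can be sketched as follows: datapoints older than the window are dropped on every insert, and latest, maximum, minimum, and count are computed over what remains. The class and method names are illustrative only.

import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowMetric {
    private static class Point {
        final long timestamp;
        final double value;
        Point(long timestamp, double value) { this.timestamp = timestamp; this.value = value; }
    }

    private final long windowMillis;
    private final Deque<Point> points = new ArrayDeque<>();

    public SlidingWindowMetric(long windowMillis) { this.windowMillis = windowMillis; }

    // Add a datapoint and drop everything that fell out of the window.
    public void add(long timestamp, double value) {
        points.addLast(new Point(timestamp, value));
        while (!points.isEmpty() && points.peekFirst().timestamp < timestamp - windowMillis) {
            points.removeFirst();
        }
    }

    // Aggregations over the current window for trend analysis.
    public long count()    { return points.size(); }
    public double latest() { return points.isEmpty() ? Double.NaN : points.peekLast().value; }
    public double max()    { return points.stream().mapToDouble(p -> p.value).max().orElse(Double.NaN); }
    public double min()    { return points.stream().mapToDouble(p -> p.value).min().orElse(Double.NaN); }
}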

Having described the collection of metrics from streams with only primitive information about events, let us now see how to boost the variety and customization of metrics. In this regard, the field extraction performed by parsing historical events provides additional datapoints that become helpful in generating metrics. For example, metrics can now be based on field names and their values as they occur throughout the stream, giving information on the priority and severity of certain events. The summary information about the stream now includes metrics about criteria pertaining to the events.
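One way such field-derived metrics could be organized is sketched below: a counter is kept per combination of extracted field name and value, so that, for example, a name like events.severity.ERROR tracks how many events carried that severity. The naming convention and the record method are assumptions for this illustration.

import java.util.HashMap;
import java.util.Map;

public class FieldMetrics {
    // Metric name derived from an extracted field and its value,
    // e.g. "events.severity.ERROR" -> running count.
    private final Map<String, Long> counters = new HashMap<>();

    // Called once per event with the fields pulled out by the extraction step.
    public void record(Map<String, String> extractedFields) {
        for (Map.Entry<String, String> kv : extractedFields.entrySet()) {
            String metric = "events." + kv.getKey() + "." + kv.getValue();
            counters.merge(metric, 1L, Long::sum);
        }
    }

    public long value(String metricName) {
        return counters.getOrDefault(metricName, 0L);
    }
}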

