Wednesday, April 17, 2019

The metrics for Sequence analysis:

Sequences can be generated in large numbers. Their processing can take an arbitrary amount of time. Therefore, there is a need to monitor and report progress on all activities associated with sequences.

Metrics for sequences can be duration-based, such as elapsed time. For example, if clustering a million records takes longer with one algorithm than with another, the elapsed time helps determine the right choice.
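The elapsed-time comparison can be sketched as follows. This is a minimal illustration, not a definitive implementation: the two grouping functions are hypothetical stand-ins for real clustering algorithms, and the names measure_elapsed, compare_algorithms and fastest are assumptions made for this example.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

Record = Sequence[str]
Algorithm = Callable[[List[Record]], Dict]


def group_by_first_item(records: List[Record]) -> Dict:
    """Hypothetical stand-in for a clustering algorithm: group by first item."""
    groups = defaultdict(list)
    for record in records:
        groups[record[0] if record else None].append(record)
    return dict(groups)


def group_by_length(records: List[Record]) -> Dict:
    """Hypothetical stand-in for a second algorithm: group by sequence length."""
    groups = defaultdict(list)
    for record in records:
        groups[len(record)].append(record)
    return dict(groups)


def measure_elapsed(algorithm: Algorithm, records: List[Record]) -> float:
    """Return the wall-clock seconds the algorithm takes on the records."""
    start = time.perf_counter()
    algorithm(records)
    return time.perf_counter() - start


def compare_algorithms(candidates: Dict[str, Algorithm],
                       records: List[Record]) -> Dict[str, float]:
    """Record one elapsed-time metric per candidate algorithm."""
    return {name: measure_elapsed(fn, records) for name, fn in candidates.items()}


def fastest(elapsed: Dict[str, float]) -> str:
    """Pick the candidate with the smallest elapsed time."""
    return min(elapsed, key=elapsed.get)
```

A single run is noisy, so in practice the measurement would likely be repeated and the timings averaged before choosing an algorithm.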

Metrics for sequences can also include the count of sequences processed. If that count stalls, or if sequences are processed far too quickly to have produced meaningful results, it points to an inconsistency. In this case the metrics help to diagnose and troubleshoot the problem.
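One way to turn the count into a diagnostic is to watch the processing rate between two observations. The sketch below assumes a cumulative count sampled at known times; the class name CountMonitor and the thresholds min_rate and max_rate are hypothetical and would need to be tuned for a real workload.

```python
from dataclasses import dataclass


@dataclass
class CountMonitor:
    """Flags a processed-sequence count that stalls or advances suspiciously fast."""
    min_rate: float          # sequences per second below which we report a stall
    max_rate: float          # sequences per second above which results are suspect
    last_count: int = 0      # cumulative count at the previous observation
    last_time: float = 0.0   # time of the previous observation, in seconds

    def observe(self, count: int, now: float) -> str:
        """Compare the rate since the previous observation against the thresholds."""
        interval = now - self.last_time
        if interval <= 0:
            raise ValueError("observations must move forward in time")
        rate = (count - self.last_count) / interval
        self.last_count, self.last_time = count, now
        if rate < self.min_rate:
            return "stalled"
        if rate > self.max_rate:
            return "too_fast"
        return "ok"
```

For example, with min_rate=1 and max_rate=100, a count that goes from 0 to 50 in one second is reported as "ok", while a count that does not move during the next second is reported as "stalled".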

Metrics can also be scoped to partitions, while global ones are maintained separately. Metrics can also have tags or namespaces associated with the same physical resource.
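A minimal sketch of partition-scoped and global metrics kept side by side might look like this. The class ScopedCounters, its method names and the use of a namespace string as part of the key are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


class ScopedCounters:
    """Keeps per-partition counters and a separately maintained global counter."""

    def __init__(self) -> None:
        # Key: (namespace, metric name, partition id)
        self._by_partition: Dict[Tuple[str, str, int], int] = defaultdict(int)
        # Key: (namespace, metric name)
        self._global: Dict[Tuple[str, str], int] = defaultdict(int)

    def increment(self, namespace: str, name: str, partition: int,
                  amount: int = 1) -> None:
        """Update the partition-scoped value and the global value together."""
        self._by_partition[(namespace, name, partition)] += amount
        self._global[(namespace, name)] += amount

    def value(self, namespace: str, name: str,
              partition: Optional[int] = None) -> int:
        """Read the global value, or one partition's value when a partition is given."""
        if partition is None:
            return self._global.get((namespace, name), 0)
        return self._by_partition.get((namespace, name, partition), 0)
```

Here two namespaces can report the same metric name against the same resource without colliding, since the namespace is part of every key.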

Metrics can support a variety of aggregations such as sum(), average() and so on. These can be executed at different scopes as well as globally. Metrics may be passed as parameters in the form of a time series array.
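As a rough sketch of these aggregations, assume a time series is an array of (timestamp, value) pairs and that each scope has its own series. The names Series, AGGREGATIONS, aggregate and aggregate_global are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# A time series array: (timestamp in seconds, metric value) pairs.
Series = List[Tuple[float, float]]

AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "average": mean,
    "min": min,
    "max": max,
}


def aggregate(series: Series, how: str) -> float:
    """Apply the named aggregation to the values of one scope's time series."""
    values = [value for _, value in series]
    if not values:
        raise ValueError("cannot aggregate an empty time series")
    return AGGREGATIONS[how](values)


def aggregate_global(per_scope: Dict[str, Series], how: str) -> float:
    """Apply the same aggregation across every scope's data points."""
    merged = [point for series in per_scope.values() for point in series]
    return aggregate(merged, how)
```

Note that the global average here is taken over all data points, not over the per-scope averages, so scopes with more points carry more weight.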

When sequence rules are discovered, they are listed one after the other. There is no effort to normalize them as they are inserted. The work of canonicalizing the groups can be taken on by background tasks. Together, the online and offline data modifications may run only as part of an intermediate processing stage, where the preprocessing and postprocessing steps involve cleaning and prefix generation. Metrics give visibility into these operations.
