Design of an alerts and notifications system for any storage engineering product.
Storage engineering products publish alerts and notifications that interested parties can subscribe to so that they can act on specific conditions. This lets them remain hands-off with the product as it continues to serve the entire organization.
The notifications can be purely a client-side concept, because a periodic polling agent that watches for a number of events and conditions can raise them. In that case the store does not have to persist any information. Yet the notification state associated with a storage container can also be persisted instead of being retained purely as an in-memory data structure. This relieves users of having to keep track of the notifications themselves. Even though the last notification is typically the most actionable one, the historical trend from persisted notifications indicates how frequently adjustments are needed.
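As a minimal sketch of the polling approach described above, the following Python example shows a client-side agent that checks readings against thresholds and raises notifications, optionally appending them to a history list that stands in for persistence. The names `PollingAgent`, `Notification`, `probe`, and the metric names are hypothetical and chosen only for illustration; they do not correspond to any particular product API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Notification:
    container: str    # name of the storage container being watched
    condition: str    # reading whose threshold was reached
    value: float      # observed value at poll time
    timestamp: float  # poll time, supplied by the caller


class PollingAgent:
    """Hypothetical client-side agent; not tied to any real product API."""

    def __init__(self, probe: Callable[[], Dict[str, float]],
                 thresholds: Dict[str, float], container: str,
                 history: Optional[List[Notification]] = None):
        self.probe = probe            # callable returning current readings
        self.thresholds = thresholds  # reading name -> limit that triggers a notification
        self.container = container
        self.history = history        # None means nothing is persisted

    def poll_once(self, now: float) -> List[Notification]:
        """Take one set of readings and raise a notification per reached threshold."""
        readings = self.probe()
        raised = [
            Notification(self.container, name, readings[name], now)
            for name, limit in self.thresholds.items()
            if name in readings and readings[name] >= limit
        ]
        if self.history is not None:
            # Stand-in for persistence: keep every notification for trend analysis.
            self.history.extend(raised)
        return raised
```

A scheduler or a simple loop with a sleep would call `poll_once` periodically. Passing a `history` list illustrates the persisted variant, where the trend of past notifications remains available after the latest one has been acted on.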
The notifications can also be improved to come from various control plane activities, as these are the most authoritative source on when certain conditions occur within the storage system. These notifications can then be evaluated against the rules specified by the user or administrator so that only a filtered subset is brought to the attention of the user.
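The rule evaluation described above could look roughly like the following Python sketch. It assumes that notifications are plain dictionaries and that rules are predicates combined with a logical AND; the field names `severity` and `container` and the helpers `min_severity` and `from_container` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]
Rule = Callable[[Event], bool]

# Assumed severity ordering, used only for this illustration.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


def min_severity(level: str) -> Rule:
    """Rule that passes events at or above the given severity."""
    floor = SEVERITY_ORDER[level]
    return lambda event: SEVERITY_ORDER.get(str(event.get("severity")), -1) >= floor


def from_container(name: str) -> Rule:
    """Rule that passes events raised for one named container."""
    return lambda event: event.get("container") == name


def filter_notifications(events: Iterable[Event], rules: List[Rule]) -> List[Event]:
    """Keep only the events that satisfy every rule (an empty rule list keeps all)."""
    return [event for event in events if all(rule(event) for rule in rules)]
```

For example, `filter_notifications(events, [min_severity("warning"), from_container("bucket-a")])` would surface only warning-or-worse events for that one container. Whether rules combine with AND or OR is a design choice; AND is assumed here.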
Notifications could also be improved to come from the entire hierarchy of containers within a storage product, for all operations associated with them. They need not be limited to user interactions. They can be made available via a subscriber interface that can be accessed globally.
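One way to picture a subscriber interface over a container hierarchy is the Python sketch below, in which a subscription on a container also receives events from everything nested beneath it. The class `HierarchyBus`, the slash-separated path convention, and the operation names are assumptions made for illustration, not part of any specific product.

```python
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]  # called with (path, operation)


def _parts(path: str) -> List[str]:
    """Split a slash-separated path into its non-empty components."""
    return [part for part in path.split("/") if part]


class HierarchyBus:
    """Hypothetical in-process bus: subscribers on a container see all descendants."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = {}

    def subscribe(self, path: str, callback: Callback) -> None:
        key = "/" + "/".join(_parts(path))  # normalized; "/" is the root
        self._subscribers.setdefault(key, []).append(callback)

    def publish(self, path: str, operation: str) -> None:
        parts = _parts(path)
        # Walk from the container itself up to the root, notifying each level.
        for depth in range(len(parts), -1, -1):
            ancestor = "/" + "/".join(parts[:depth])
            for callback in self._subscribers.get(ancestor, []):
                callback(path, operation)
```

Subscribing at `/` gives a global view of every operation, while subscribing at a path such as `/tenant/bucket` narrows the stream to that container and its children.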
There may be questions on why information on the storage engineering product needs to come from notifications as opposed to metrics, which are suited for charts and graphs via an existing time-series database and reporting stack. The answer is quite simple. The storage engineering product is a veritable storage and time-series database in its own right and should be capable of storing both metrics and notifications. All the notifications are events, and those events are as continuous as the data that generates them. They can be persisted in the store itself. The data does not become redundant when it is stored in both formats. Instead, one system caters to the in-store evaluation of rules that trigger only the alerts necessary for humans, and the other is more continuous machine data that can be offloaded for persistence and analysis to external dedicated metrics stacks.
When the events are persisted in the store itself, the publisher-subscriber interface becomes similar to the writer-reader interface that the store already supports. The stack that analyzes and reports the data can read directly from the store. Should the store container for internal events be hidden from the public, a publisher-subscriber interface would still be helpful. The ability to keep the notification container internal enables the store to clean up as necessary. Persistence of events also helps with offline validation and introspection for assistance with product support.
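The correspondence between publisher-subscriber and writer-reader can be sketched as an append-only log in which publishing is a write and subscribing is a read from a remembered offset. The `EventLog` class below is a hypothetical in-memory stand-in for a store container; `truncate` illustrates how the store could clean up an internal container without disturbing the sequence numbers that readers hold.

```python
from typing import List, Tuple


class EventLog:
    """Hypothetical append-only event container with stable sequence numbers."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._base = 0  # sequence number of the oldest retained event

    def append(self, event: str) -> int:
        """Publisher side: write an event and return its sequence number."""
        self._events.append(event)
        return self._base + len(self._events) - 1

    def read(self, offset: int) -> Tuple[List[str], int]:
        """Subscriber side: return events from offset onward and the next offset."""
        start = max(offset, self._base) - self._base
        return self._events[start:], self._base + len(self._events)

    def truncate(self, upto: int) -> None:
        """Store-side cleanup: drop events with sequence numbers below upto."""
        drop = min(max(upto - self._base, 0), len(self._events))
        del self._events[:drop]
        self._base += drop
```

A subscriber keeps only the offset returned by its last `read` and passes it back on the next call, so it behaves like any other reader of the store. A reader that asks for an offset older than the retained range simply resumes from the oldest event still available.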
The notification system is complementary to the health reporting stack but not necessarily a substitute for it. This document positions the notification system as a must-have component of the product, one that should work well with existing subscriber plugins. It also exists side by side with metrics publishers and subscribers.