Cluster computing

Thursday, June 13, 2019

A few other troubleshooting mechanisms can be grouped together as supportability measures. I’m listing them here.
Monitoring and Alerts – System events are of particular interest to administrators and users. A way to register monitors and their policies gives a comprehensive view of the product’s operations. Different kinds of sensors may be written for this purpose and packaged with the product.
Counters – Performance counters for the various operations of the product help diagnose what takes a lot of time. When elapsed time and processing time are measured separately, they help tremendously in finding bottlenecks and long-running tasks.
Dynamic views – If the operational data is persisted, the current window of activity can be viewed with built-in queries. After all, the storage product stores streams, so it can take all activity data as append-only data.
User Interface – Pages in the product can help with remote monitoring and troubleshooting by showing logs or by setting up back channels of communication with the hosts of the application. Such an interface is very helpful for remote troubleshooting on customer deployments.
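To make the counters and dynamic views above more concrete, here is a minimal sketch in Python. It assumes an in-memory append-only log standing in for a stream; the names Sample, ActivityLog, measure and the operation "read_segment" are hypothetical and not part of any product. It records elapsed (wall-clock) time and processing (CPU) time separately, then queries the most recent window of activity.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    operation: str
    timestamp: float   # wall-clock time when the operation finished
    elapsed: float     # wall-clock seconds, including time spent waiting
    processing: float  # CPU seconds consumed by this process


class ActivityLog:
    """Append-only list of samples; a hypothetical stand-in for a stream."""

    def __init__(self):
        self._samples = []

    def append(self, sample):
        self._samples.append(sample)

    def window(self, seconds, now=None):
        """Return the samples recorded within the last `seconds`."""
        now = time.time() if now is None else now
        return [s for s in self._samples if now - s.timestamp <= seconds]


@contextmanager
def measure(log, operation):
    """Time the enclosed block and append one sample to the log."""
    start_wall = time.perf_counter()
    start_cpu = time.process_time()
    try:
        yield
    finally:
        log.append(Sample(
            operation=operation,
            timestamp=time.time(),
            elapsed=time.perf_counter() - start_wall,
            processing=time.process_time() - start_cpu,
        ))


log = ActivityLog()
with measure(log, "read_segment"):
    time.sleep(0.05)  # waiting counts toward elapsed, barely toward processing

for s in log.window(seconds=60):
    print(f"{s.operation}: elapsed={s.elapsed:.3f}s processing={s.processing:.3f}s")
```

A large gap between the two numbers suggests the operation spends its time waiting (on I/O, locks or the network) rather than computing, which is the kind of bottleneck the separate measurements are meant to expose.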
APIs for collecting metrics from the system will prove very helpful to other applications, which can then roll these operational monitoring workflows into their own without resorting to other means of access. API invocations decouple technology stacks and help with independent monitoring. Since the APIs are published over the web, they are usable across networks.
Virtually all APIs can be packaged into an SDK for developer convenience. An SDK tremendously improves the possibilities for application development and opens up the boundaries for custom usages. Such expanded possibilities mean the product will endear itself to developers and their organizations.
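As a rough illustration of a metrics API and a thin SDK wrapper around it, the sketch below uses only the Python standard library. The /metrics route, the counter names and the MetricsClient class are assumptions made for the example, not an existing interface of the product.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical counters that the product would maintain internally.
METRICS = {"reads": 120, "writes": 45}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":  # hypothetical route
            self.send_error(404)
            return
        body = json.dumps(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


class MetricsClient:
    """Tiny SDK-style wrapper so callers never build URLs by hand."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        # An empty ProxyHandler bypasses any proxy configured in the environment.
        self._opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def get_metrics(self):
        with self._opener.open(f"{self.base_url}/metrics", timeout=5) as resp:
            return json.load(resp)


# Port 0 asks the operating system for any free port.
server = ThreadingHTTPServer(("127.0.0.1", 0), MetricsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

client = MetricsClient(f"http://127.0.0.1:{server.server_address[1]}")
metrics = client.get_metrics()
print(metrics)

server.shutdown()
server.server_close()
```

The caller only depends on HTTP and JSON, so the monitoring application can be written in any stack, while the client class shows how an SDK hides the URL and parsing details from developers.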
