Today I want to take a short break to discuss an interesting idea for monitoring API calls. APIs are ubiquitous, and their usage matters to operations, development and test teams. Various kinds of monitoring tools are available, and some come with facilitators such as Mashery and Splunk. These are usually based on non-intrusive capture, index and search mechanisms, or a subset of them. They are often cumbersome to use, involve heavy investment and don't provide real-time information. I want to talk about a narrower use case: something like a real-time chart of call volumes. The APIs I want to monitor keep a log counter, i.e. they record the number of calls to each API, identified by its name, in a database table that everyone can read. In this case, I'm interested in charts similar to the performance counters on desktops.
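To make the log counter concrete, here is a minimal sketch of reading such a table. The table name `api_call_log` and its columns `api_name` and `call_count` are my own assumptions for illustration, and an in-memory SQLite database stands in for whatever database actually holds the counters.

```python
import sqlite3


def read_counters(conn):
    """Return {api_name: call_count} from the (hypothetical) api_call_log table."""
    rows = conn.execute(
        "SELECT api_name, call_count FROM api_call_log ORDER BY api_name"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    # Stand-in database with made-up counter values, for demonstration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE api_call_log ("
        "api_name TEXT PRIMARY KEY, call_count INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO api_call_log VALUES (?, ?)",
        [("login", 120), ("search", 45)],
    )
    print(read_counters(conn))
```

A monitor would call `read_counters` on a timer and keep each snapshot together with the time it was taken, which is the raw material for the metrics discussed here.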
There are a few metrics I'm interested in. The first is a graph drawn on a grid, with call volume on the y-axis and time on the x-axis. The peaks and valleys in the graph can correspond to alerts.
Another metric is a histogram of the log counters, which shows how the APIs compare with one another. It may show similar patterns over time, but the idea is to view the first metric alongside the others at the same time.
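As a rough sketch of the histogram idea, the snippet below turns one snapshot of counters into relative shares and a plain-text bar chart. The function names and the sample counter values are made up for illustration; a real monitor would feed in the values read from the counter table.

```python
def relative_shares(counters):
    """Return each API's share of the total call count (0.0 to 1.0)."""
    total = sum(counters.values())
    if total == 0:
        return {name: 0.0 for name in counters}
    return {name: count / total for name, count in counters.items()}


def text_histogram(counters, width=40):
    """Return one text bar per API, longest first, scaled to the busiest API."""
    peak = max(counters.values(), default=0)
    lines = []
    ranked = sorted(counters.items(), key=lambda item: item[1], reverse=True)
    for name, count in ranked:
        bar_length = round(width * count / peak) if peak else 0
        lines.append(f"{name:<10} {'#' * bar_length} {count}")
    return lines


if __name__ == "__main__":
    snapshot = {"login": 50, "search": 25, "logout": 25}  # made-up values
    print(relative_shares(snapshot))
    print("\n".join(text_histogram(snapshot, width=20)))
```

Scaling the bars to the busiest API keeps the comparison readable even when one API dominates the call volume.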
A third metric is the average over time slices. This can be useful for detecting the time periods in which bursts occur.
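Averaging over time slices could look something like the sketch below. It assumes the monitor already has a list of `(timestamp, calls)` readings, where `calls` is the number of calls seen in one polling interval; that input shape and the function name are my own assumptions.

```python
from collections import defaultdict


def slice_averages(readings, slice_seconds):
    """Average the per-interval call counts within fixed-width time slices.

    readings: iterable of (timestamp_in_seconds, calls_in_that_interval).
    Returns {slice_start_time: average_calls}, ordered by slice start.
    """
    buckets = defaultdict(list)
    for timestamp, calls in readings:
        # Round the timestamp down to the start of its slice.
        slice_start = int(timestamp // slice_seconds) * slice_seconds
        buckets[slice_start].append(calls)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    readings = [(0, 10), (30, 20), (60, 100), (90, 300)]  # made-up values
    print(slice_averages(readings, slice_seconds=60))
```

A slice whose average sits well above its neighbours is a candidate burst period.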
Another metric is the rate of change, or velocity, i.e. the number of calls per second. This can be useful for spotting and thwarting possible denial-of-service attacks, because an attack is often preceded by a rising slope. It could be particularly useful for login calls.
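One way to get the velocity is to difference successive readings of the cumulative counter. The sketch below assumes samples of the form `(timestamp, cumulative_count)` for a single API; the function names and the threshold are illustrative, and a real alert threshold would have to come from observed traffic.

```python
def call_rates(samples):
    """Return (timestamp, calls_per_second) for each pair of consecutive samples.

    samples: list of (timestamp_in_seconds, cumulative_call_count),
    ordered by time.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        delta = c1 - c0
        if elapsed <= 0 or delta < 0:
            # Skip duplicate or out-of-order readings and counter resets.
            continue
        rates.append((t1, delta / elapsed))
    return rates


def spikes(rates, threshold):
    """Return the timestamps at which the rate exceeds the threshold."""
    return [timestamp for timestamp, rate in rates if rate > threshold]


if __name__ == "__main__":
    login_samples = [(0.0, 100), (10.0, 150), (20.0, 950)]  # made-up values
    rates = call_rates(login_samples)
    print(rates)
    print(spikes(rates, threshold=50.0))
```

With the made-up login samples, the rate jumps from 5 to 80 calls per second in the second interval, which is the kind of slope worth an alert.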
Data may also be stored for later analysis, for studies such as event correlation and capacity planning.
This tool is expected to serve a specific purpose. It's not meant as a general-purpose tool that spans audit, compliance, governance and reporting. The data set is limited. The intended audience is targeted. The expected actionable items for the alerts are few but critical. This is more for the development and test teams to know what's going on without reaching out.
These monitors are for internal use, so private and confidential information can also be monitored. One more important point is that the instrumentation is added according to what the development team is interested in.