Cluster computing

Friday, March 28, 2014

Today I will continue my discussion on thread level software instrumentation
On window s there thread local storage available. Here we create a buffer and keep all our stack based data we collect. To collect CPU ticks, we read the system clock in the constructor and again in the destructor of an object we put on the stack. The diff is the time spent executing code and can be cumulated with the running total on the thread to see how much time each takes. Note that its ideal to keep this counter on the scheduler. since threads could attach and detach from the currently executing run.
Similarly it is easier to have a base method for allocations and deletions and they can be take the stack trace when the corresponding system calls are made. This then gives us a histogram of stack traces for allocations and we can use that to find the biggest users of memory or the most frequent users.
Having a thread local storage is desirable on any platform especially given the functionality available via language specific provisions such as C++ 11 that introduces thread_local. Creating a pointer sized variable enables allocation of an arbitrary buffer reserved for this thread which allows thread specific instrumentation that is not normally covered from routines that place a variable on the stack. Undeniably the latter is required too. For example, crash logs are written this way. But here we specify what else we can add to crash log notifications via thread local storage.
The first thing we said is historical stack traces that are not captured by as-of-this-point-of-time stack trace by stack local variables.
When do we take this stack trace, for every exception encountered. Even if not all of them are crash notations.The suggestion is that by keeping track of the last few exceptions at any given time, when we get a dump, it gives us another data point of interest.
We add an exception handler to the chain of handlers that each thread maintains. The first of these we will customize to take a stack trace, put it on our thread local ring buffer, and return 'keep looking' so that other handlers including the most recent frame's own handler to get a chance to handle the exception.
Why is this better than logging ? It's complimentary to logging. Dumps and logs are both relevant for diagnosis and often every bit of information helps.
One consideration for the ring buffer is that it could be implemented as a skip list for fast traversal.
Creating a ring buffer on each thread is one part. It could be used for different purposes.
using it for storing exceptions encountered us one such purpose. Usually the operating systems meet this purpose and using it for applications specific tasks is preferable. One such purpose could be to list the last user query or commands
For now we will just focus on creating a ring buffer for every thread on all platforms.
We do this whenever we start a thread or have a callback for initialization routines. There should be fewer of those through out the source code.

Cluster computing

Friday, March 28, 2014

No comments:

Post a Comment