Monday, May 13, 2013

To get stack frames from streams instead of dump files

Dump files can be arbitrarily large, and they are generally stored in compressed form along with other satellite files. File operations such as extraction and copying over a remote network can be expensive. If we are interested only in a stack trace, we would rather avoid these operations. Besides, we rely on the debuggers to give us the stack trace. The debuggers can attach to a process, launch an executable, or open the three different kinds of dump files to give you the stack trace, but they don't work with compressed files or sections of them. While the debuggers have to support a lot of commands from the user, retrieving a specific stack trace requires access only to specific ranges of offsets in the crash dump file. Moreover, the stack trace comes from a single thread. Unless all the thread stacks have to be analyzed, we will look at how to retrieve a specific stack trace using streams instead of files.
Note that getting a stack trace as described here does not require symbols. Symbols only make the frames user friendly, and that step can be done separately from getting the stack trace. Program database (PDB) files and raw stack frames are sufficient to pretty-print a stack.
The dump files we are talking about are in a Microsoft proprietary format, but the format is helpful for debugging. Retrieving a physical address in a memory dump is easy. The TEB (Thread Environment Block) holds the top and bottom of the stack, and a memory dump of the range between them gives us the stack.
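As a rough illustration of reading only the offsets that matter, here is a minimal sketch in Python. It assumes a user-mode minidump (the "MDMP" format, whose header, stream directory and thread list layouts are publicly documented), which may not be the kind of dump discussed above. The helper names read_at and read_thread_stack are made up for this example. Each thread record carries the TEB address and a descriptor for the saved stack memory, so a handful of small reads on a seekable stream is enough to pull out one thread's stack.

```python
import io
import struct

MDMP_SIGNATURE = 0x504D444D  # the bytes b"MDMP" read as a little-endian uint32
THREAD_LIST_STREAM = 3       # ThreadListStream in MINIDUMP_STREAM_TYPE

HEADER = struct.Struct("<IIIIIIQ")     # MINIDUMP_HEADER, 32 bytes
DIRECTORY = struct.Struct("<III")      # StreamType, DataSize, Rva
THREAD = struct.Struct("<IIIIQQIIII")  # MINIDUMP_THREAD, 48 bytes


def read_at(stream: io.BufferedIOBase, offset: int, size: int) -> bytes:
    """Read exactly `size` bytes at `offset` from a seekable binary stream."""
    stream.seek(offset)
    data = stream.read(size)
    if len(data) != size:
        raise EOFError(f"wanted {size} bytes at offset {offset}, got {len(data)}")
    return data


def read_thread_stack(stream: io.BufferedIOBase, thread_id: int):
    """Return (teb_address, stack_start_address, stack_bytes) for one thread."""
    signature, _ver, n_streams, dir_rva, _sum, _time, _flags = HEADER.unpack(
        read_at(stream, 0, HEADER.size))
    if signature != MDMP_SIGNATURE:
        raise ValueError("not a user-mode minidump")

    # Walk the stream directory until the thread list is found.
    for i in range(n_streams):
        stream_type, _size, rva = DIRECTORY.unpack(
            read_at(stream, dir_rva + i * DIRECTORY.size, DIRECTORY.size))
        if stream_type != THREAD_LIST_STREAM:
            continue
        # The thread list is a uint32 count followed by MINIDUMP_THREAD records.
        (n_threads,) = struct.unpack("<I", read_at(stream, rva, 4))
        for t in range(n_threads):
            (tid, _suspend, _pclass, _prio, teb, stack_start,
             stack_size, stack_rva, _ctx_size, _ctx_rva) = THREAD.unpack(
                read_at(stream, rva + 4 + t * THREAD.size, THREAD.size))
            if tid == thread_id:
                # Only now is the stack memory itself read.
                return teb, stack_start, read_at(stream, stack_rva, stack_size)
    raise KeyError(f"thread {thread_id} not found")
```

Any seekable binary stream works here, for example io.BytesIO over bytes already in memory. Whether a compressed or remote stream can seek cheaply depends on how it is stored, which this sketch does not address.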
Using streams is an improvement over using files for retrieving this information.
Streams can be written to a local file so we don't lose any feature we currently have.
Streams allow you to work with specific ranges of offsets so you don't need the whole file.
With a stream,
The debugger SDK that ships with the debugging tools has both managed and unmanaged APIs to get a stack trace. These APIs instantiate a debugging client, which can give a stack trace. However, there is no API that supports a stream yet. This is probably because most debuggers prefer to work on local files: round trips for an entire debugging session over a low-bandwidth, high-latency network are not desirable. For specific operations such as getting a stack trace, however, a stream is not a bad idea. In fact, what stream support for GetStackTrace buys us is the ability to save a few round trips for extraction, save on local storage and on creating archive locations, and reduce the file and database footprint.
Both 32-bit and 64-bit dumps require similar operations to retrieve the stack trace. There is additional information in the 64-bit dump files that helps with parsing.
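To make the pointer-size difference concrete, here is a minimal sketch in Python of one classic way to turn raw stack bytes into frames: following the chain of saved frame pointers. This is an assumption for illustration, not necessarily what the debuggers do. It only works when the code keeps a frame pointer (common in unoptimized 32-bit x86 builds), whereas 64-bit Windows code is normally unwound using unwind metadata from the modules. The function name walk_frame_chain and its parameters are hypothetical.

```python
import struct


def walk_frame_chain(stack, stack_start, frame_pointer, pointer_size=4, max_frames=64):
    """Follow saved frame pointers through raw stack bytes.

    stack         -- raw bytes of the thread stack
    stack_start   -- virtual address of stack[0] (the lowest address)
    frame_pointer -- frame pointer register value at the time of the dump
    Returns the list of return addresses, innermost frame first.
    """
    fmt = "<I" if pointer_size == 4 else "<Q"
    stack_end = stack_start + len(stack)
    return_addresses = []
    while len(return_addresses) < max_frames:
        # Each frame needs two slots: the saved frame pointer, then the return address.
        if not (stack_start <= frame_pointer <= stack_end - 2 * pointer_size):
            break
        offset = frame_pointer - stack_start
        (saved_fp,) = struct.unpack_from(fmt, stack, offset)
        (return_address,) = struct.unpack_from(fmt, stack, offset + pointer_size)
        if return_address == 0:
            break
        return_addresses.append(return_address)
        # The stack grows down, so a caller's frame must sit at a higher address.
        if saved_fp <= frame_pointer:
            break
        frame_pointer = saved_fp
    return return_addresses
```

The same loop serves both dump flavors; only the slot width changes (4 bytes for 32-bit, 8 bytes for 64-bit).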
The stack trace, once retrieved, can be made user friendly by looking up the symbols. These symbols are parsed from the program database (PDB) files. Modules and offsets are matched with the symbol text, and then the stack can be printed in a more readable form. This information need not be retrieved from these files by hand; it can be retrieved with the Debug Interface Access (DIA) SDK, which is available on MSDN.
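The first half of that lookup, turning a raw address into a module and an offset, needs no symbol files at all. Here is a minimal sketch in Python, assuming we already have the list of loaded modules with their base addresses and sizes. The Module type and describe_addresses are hypothetical names for this example. Mapping the offset to a function name would still need the PDB, for instance through DIA, and is not shown.

```python
import bisect
from typing import NamedTuple


class Module(NamedTuple):
    name: str
    base: int  # load address of the module
    size: int  # size of the mapped image in bytes


def describe_addresses(modules, addresses):
    """Format each address as 'module+0xoffset', or as bare hex if no module matches."""
    ordered = sorted(modules, key=lambda m: m.base)
    bases = [m.base for m in ordered]
    frames = []
    for address in addresses:
        # Find the last module whose base is at or below the address.
        i = bisect.bisect_right(bases, address) - 1
        if i >= 0 and address < ordered[i].base + ordered[i].size:
            frames.append(f"{ordered[i].name}+0x{address - ordered[i].base:x}")
        else:
            frames.append(f"0x{address:x}")
    return frames
```

Sorting once and using a binary search keeps the lookup cheap even when a process has hundreds of modules loaded.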
Lastly, with a streamlined, read-only retrieval of the stack trace that needs no file copy and no local maintenance of data or metadata, stack trace parsing and reporting can be an entirely in-memory operation.
