Sunday, December 7, 2014

In today's post we continue discussing the WRL trace generation system. We discussed how the correctness of the trace data was verified. First, the simulation results were compared with those of an existing instruction-level simulator. Second, to check the correctness of the information created by the trace code, synchronization had to be verified. Two techniques were used for this purpose. In the first, a program ran a simple tight infinite loop to generate a recognizable pattern, and a variant of the analysis program then checked whether anything was missing or out of place. In the second, the trace entry generated by the operating system on entry to the kernel was extended to include a sequence number that was incremented on each kernel entry.
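To make the second technique concrete, here is a minimal sketch of a sequence-number check. It assumes the sequence numbers have already been extracted from the kernel-entry trace records into a list; the function name `find_sequence_gaps` and the sample values are hypothetical and not taken from the WRL system.

```python
def find_sequence_gaps(sequence_numbers):
    """Return (expected, actual) pairs wherever consecutive kernel-entry
    sequence numbers do not increase by exactly one."""
    gaps = []
    previous = None
    for current in sequence_numbers:
        if previous is not None and current != previous + 1:
            gaps.append((previous + 1, current))
        previous = current
    return gaps


# Hypothetical sequence numbers: 103 is missing and 105 is repeated.
observed = [100, 101, 102, 104, 105, 105, 106]
print(find_sequence_gaps(observed))  # [(103, 104), (106, 105)]
```

An empty result means every kernel entry was seen exactly once and in order; any pair in the result points to a lost or duplicated trace entry.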
We next look at a cache analysis program. The requirements were that it be flexible enough to be changed from run to run, that it not take up so much memory that it causes paging overhead, and that it be as fast as possible, given that analysis takes much more time than trace generation. Constants whose values are specified at compile time define the number of cache levels, whether the data and instruction caches are split or integrated, the degree of associativity, the number of cache lines, the line size, the write policy, and the cost of a hit and a miss (in units of processor cycles). A version of this program is compiled for each cache configuration, and there can even be a fully associative version of the second-level cache.
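As an illustration of how such constants could drive a simulator, here is a small sketch of a single cache level. Everything in it is an assumption made for the example: the constant names and values, the `Cache` class, and in particular the LRU replacement policy, which the post does not specify. Python has no compile-time constants, so module-level constants stand in for them.

```python
from collections import OrderedDict

# Hypothetical configuration constants for one cache level.
LINE_SIZE = 16      # bytes per cache line
NUM_LINES = 8       # total number of lines in the cache
ASSOCIATIVITY = 2   # lines per set; set to NUM_LINES for fully associative
HIT_COST = 1        # processor cycles charged for a hit
MISS_COST = 10      # processor cycles charged for a miss

NUM_SETS = NUM_LINES // ASSOCIATIVITY


class Cache:
    """One cache level with LRU replacement (LRU is an assumption)."""

    def __init__(self):
        # Each set maps tag -> True, ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]
        self.references = 0
        self.misses = 0
        self.cycles = 0

    def access(self, address):
        """Simulate one memory reference; return True on a hit."""
        line = address // LINE_SIZE
        index = line % NUM_SETS
        tag = line // NUM_SETS
        lines = self.sets[index]
        self.references += 1
        if tag in lines:
            lines.move_to_end(tag)      # mark as most recently used
            self.cycles += HIT_COST
            return True
        self.misses += 1
        self.cycles += MISS_COST
        if len(lines) >= ASSOCIATIVITY:
            lines.popitem(last=False)   # evict the least recently used line
        lines[tag] = True
        return False


cache = Cache()
for address in [0, 4, 16, 64, 128, 0]:
    cache.access(address)
print(cache.references, cache.misses, cache.cycles)  # 6 5 51
```

Changing the constants and rerunning plays the role of recompiling the program for a different cache configuration.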
The program writes out its data at intervals that can be based on the number of instructions, trace entries, or memory references encountered. The data written includes both interval and cumulative summaries. For each cache, these cover the number of references, the miss ratio, the contribution to the cycles per instruction (CPI), and the percentage of misses that resulted from user/kernel conflicts. The summaries also include the contribution to the CPI by second-level compulsory misses (first-time references) and the second-level cache's contribution to the CPI if it were fully associative. Finally, for both user mode and kernel mode, they report the number of cycles executed and the CPI, with the CPI broken down into instruction, load, and store contributions.
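The interval and cumulative summaries can be sketched roughly as follows. This is only an illustration under stated assumptions: the interval is counted in memory references, each reference is a hypothetical `(kind, hit)` pair where `kind` is `"I"` for an instruction fetch, `"L"` for a load, or `"S"` for a store, and the CPI contribution is taken to be miss cycles divided by instructions, with a made-up `MISS_PENALTY`.

```python
MISS_PENALTY = 10  # hypothetical miss cost in processor cycles


def stats(counts):
    """Turn raw counters into the figures reported in a summary."""
    instructions = counts["instructions"]
    return {
        "references": counts["refs"],
        "miss_ratio": counts["misses"] / counts["refs"],
        "cpi_contribution": (
            counts["misses"] * MISS_PENALTY / instructions if instructions else 0.0
        ),
    }


def summarize(references, interval):
    """Return a list of (interval_summary, cumulative_summary) pairs,
    one pair per completed interval of `interval` memory references.
    A trailing partial interval is not reported."""
    reports = []
    cumulative = {"refs": 0, "misses": 0, "instructions": 0}
    current = dict(cumulative)
    for kind, hit in references:
        current["refs"] += 1
        current["misses"] += 0 if hit else 1
        current["instructions"] += 1 if kind == "I" else 0
        if current["refs"] == interval:
            for key in cumulative:
                cumulative[key] += current[key]
            reports.append((stats(current), stats(cumulative)))
            current = dict.fromkeys(current, 0)  # start the next interval
    return reports


# Hypothetical trace of eight references, summarized every four.
trace = [("I", False), ("L", True), ("I", True), ("S", False),
         ("I", True), ("L", True), ("I", True), ("I", True)]
for interval_summary, cumulative_summary in summarize(trace, 4):
    print(interval_summary, cumulative_summary)
```

Keeping the interval counters separate from the cumulative ones makes it easy to see how behavior changes over the course of a trace, rather than only the final average.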
#coding exercise
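// Requires: using System.Linq;
decimal getMedian(decimal[] a)
{
    if (a == null || a.Length == 0) return 0;
    var sorted = a.OrderBy(x => x).ToArray();
    int mid = sorted.Length / 2;
    // Odd length: middle element; even length: mean of the two middle elements.
    return sorted.Length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}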
#coding exercise
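// Requires: using System.Linq;
decimal getMode(decimal[] a)
{
    if (a == null || a.Length == 0) return 0;
    // Group equal values and return the most frequent one (first seen wins ties).
    return a.GroupBy(x => x)
            .OrderByDescending(g => g.Count())
            .First().Key;
}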

