Thursday, January 8, 2015

Today we continue to discuss the Shasta distributed shared memory protocol. We review the effects of upgrades and data sharing. The runs were repeated just as in the previous discussion. The execution time for the base run was measured for each application, and the other times were normalized to this time. The base set of runs uses upgrade requests and does not use sharing writeback messages. The former means that no data is fetched on a store if the processor already has a shared copy; the latter means that the home node is not updated on 3-hop read operations. The base run was compared to a run that does not support upgrade messages, in which the processor generates a read-exclusive request whether or not there is a local shared copy of the line. The base run was also compared to a run where sharing writeback messages were added. Finally, the runs were repeated with varying block sizes.
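As a rough sketch of the upgrade-versus-read-exclusive decision on a store miss, consider the following; the state names and return values here are illustrative assumptions, not Shasta's actual identifiers:

enum LineState { Invalid, Shared, Exclusive }

static class StoreMissHandler
{
    public static string OnStoreMiss(LineState state, bool upgradesSupported)
    {
        // With upgrade support, a store to a line already held shared only
        // needs ownership; the data is not fetched again.
        if (state == LineState.Shared && upgradesSupported)
            return "UPGRADE";
        // Otherwise the data and ownership are fetched together.
        return "READ_EXCLUSIVE";
    }
}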
The results showed that support for upgrade messages is important for a number of applications. Sharing writeback messages, on the other hand, typically hurt performance, which is why they were not used in the base set of runs. One application was an exception, however, because several processors read the data produced by another processor. It was also seen that larger block sizes can exacerbate the cost of the writebacks. In all these cases, we see that supporting a dirty sharing protocol is important for achieving higher performance in Shasta.
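The tradeoff can be sketched roughly as follows; the Node class and method names are hypothetical stand-ins for the protocol actions described above, not Shasta's implementation:

class Node
{
    public string Name;
    public Node(string name) { Name = name; }
    public void SendDataTo(Node dest) =>
        System.Console.WriteLine(Name + " sends the line directly to " + dest.Name);
    public void WriteBackTo(Node home) =>
        System.Console.WriteLine(Name + " writes the line back to home " + home.Name);
}

static class DirtySharingDemo
{
    // A read miss to a line that is dirty at a remote owner: the home forwards
    // the request and the owner replies directly to the requester (3 hops).
    public static void ServeThreeHopRead(Node home, Node owner, Node requester,
                                         bool sharingWriteback)
    {
        owner.SendDataTo(requester);
        if (sharingWriteback)
            owner.WriteBackTo(home);  // extra message that refreshes the home copy
        // Without the writeback the line stays dirty at the owner (dirty
        // sharing), and the owner keeps servicing later read requests.
    }
}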
We now look at the effects of the migratory optimizations. The execution time for the base runs was recorded and the other times were normalized to this time. A comparison was made with a run involving migratory optimizations, and the set of runs was then repeated for varying block sizes. It was seen that the migratory optimization provided no improvement and even degraded performance in some cases. The degradations are slight and could have been worse were it not for the revert mechanism and hysteresis built into the protocol. This is because migratory patterns are either not present at granularities of 64 bytes and larger, or the patterns are unstable. Only one application detected a large number of stable patterns; in fact, the number of upgrade misses was reduced by over 90% for this application. At larger block sizes, the same application ends up having fewer and less stable patterns, resulting in a slight loss of performance. Thus migratory optimizations in Shasta have had little or no benefit. Several other factors contribute to this. First, the use of upgrade messages already reduces the cost of the store misses that the migratory optimization would eliminate. Second, exploiting release consistency is effective in hiding the latency of the upgrades. Finally, the batching optimization also leads to the merging of load and store misses to the same line within a single batch.
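A counter with hysteresis and a revert path might be sketched as follows; the threshold values, field names, and doubling backoff are assumptions for illustration, not Shasta's actual parameters:

class MigratoryDetector
{
    int hits;                      // consecutive observations of the pattern
    int threshold = 2;             // repetitions required before converting (hysteresis)
    public bool Migratory { get; private set; }

    // Called when a processor takes a read miss and then writes the same line.
    public void ObserveReadThenWrite(int reader, int lastWriter)
    {
        if (reader != lastWriter)
        {
            hits++;
            if (hits >= threshold)
                Migratory = true;  // convert: grant exclusive ownership on the read
        }
        else
        {
            Revert();
        }
    }

    // Called when the migratory assumption is violated, e.g. another processor
    // reads the line while it is held exclusive under the optimization.
    public void Revert()
    {
        Migratory = false;
        hits = 0;
        threshold *= 2;            // back off so unstable lines are not reconverted quickly
    }
}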
To summarize the results overall, support for variable-granularity communication is by far the most important optimization in Shasta. Support for upgrade messages and a dirty sharing protocol is also important for achieving higher performance. The optimizations from the release consistency models provide smaller performance gains because the processors are busy handling messages while they wait for their own requests to complete. And lastly, migratory optimizations turn out not to be useful in the context of Shasta.
#codingexercise
Double GetAlternateEvenNumberRangeAvg(Double[] A)
{
    // Guard against a null input before delegating to the extension method.
    if (A == null) return 0;
    return A.AlternateEvenNumberRangeAvg();
}
#codingexercise
Double GetAlternateEvenNumberRangeMode(Double[] A)
{
    // Same null guard; the mode computation is delegated to the extension method.
    if (A == null) return 0;
    return A.AlternateEvenNumberRangeMode();
}
