Cluster computing

Saturday, December 20, 2014

Continuing our discussion from the previous post on sequential consistency in the WRL research report on shared memory consistency models, let us now look at the implementation considerations for it.
We first look at architectures without caches. The key design principle here is to enforce and maintain program order of execution. We also keep track of the memory operations and make sure that they are sequential.
Let us take a look at write buffers with bypassing capability. We have a bus-based shared memory system with no caches. A processor issues memory operations in program order. The write buffer lets the processor continue without waiting for a write to finish. Reads are allowed to bypass previous writes that are still sitting in the buffer, as long as the read address is different from every address in the write buffer. This optimization hides the write latency. To see how this can violate sequential consistency, consider the example of Dekker's algorithm mentioned earlier, where we maintained that the two reads of the flags cannot both return 0. With bypassing, this outcome is now permitted: each processor buffers the write to its own flag and then reads the other flag from memory before either write has reached memory. This problem doesn't occur with uniprocessors.
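The bypassing behavior can be sketched with a small deterministic simulation. This is only an illustration under stated assumptions: the Processor class, the names flag1 and flag2, and the choice to drain the buffer when a read matches a buffered address are all hypothetical and are not taken from the report.

```python
class Processor:
    """A processor with a FIFO write buffer in front of shared memory."""

    def __init__(self, memory):
        self.memory = memory        # shared dict: address -> value
        self.write_buffer = []      # pending (address, value) writes, oldest first

    def write(self, addr, value):
        # The write is only buffered; the processor does not wait for memory.
        self.write_buffer.append((addr, value))

    def read(self, addr):
        # Assumption: a read that matches a buffered address waits for the
        # buffer to drain. Any other read bypasses the pending writes.
        if any(a == addr for a, _ in self.write_buffer):
            self.drain()
        return self.memory[addr]

    def drain(self):
        # Retire buffered writes to memory in FIFO order.
        for addr, value in self.write_buffer:
            self.memory[addr] = value
        self.write_buffer.clear()


def dekker_with_bypass():
    memory = {"flag1": 0, "flag2": 0}
    p1, p2 = Processor(memory), Processor(memory)

    p1.write("flag1", 1)       # P1: Flag1 = 1 (still in P1's buffer)
    p2.write("flag2", 1)       # P2: Flag2 = 1 (still in P2's buffer)
    r1 = p1.read("flag2")      # P1 reads Flag2, bypassing its own pending write
    r2 = p2.read("flag1")      # P2 reads Flag1, bypassing its own pending write

    p1.drain()
    p2.drain()
    return r1, r2, memory


r1, r2, memory = dekker_with_bypass()
print(r1, r2)
```

Both reads see 0 even though each processor issued its operations in program order, so both would enter the critical section. No interleaving of the four operations in a sequentially consistent execution produces that result.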
Now let us take a look at overlapping write operations. Here we use a general interconnection network that alleviates the bottleneck of a bus-based design. We still assume processors issue memory operations in program order. Unlike the previous example, multiple write operations issued by the same processor may be serviced simultaneously by different memory modules. For sequential consistency, we maintained that a processor's read of the data should return the value written by the other processor. That guarantee is now violated, because a later write may be injected into the network and reach its memory module before an earlier write has made its way to its own module. This highlights the importance of maintaining program order between memory operations. Coalescing writes to the same cache line in a write buffer can cause a similar inconsistency.
One way to remedy this is to wait for a write to reach memory before allowing the next one into the network. Enforcing this requires an acknowledgement sent back to the processor that issued the write. These acknowledgements help maintain program order.
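A rough sketch of the overlapped-write problem and the acknowledgement remedy follows. The variable names data and head, the value 2000, and the per-module latencies are assumptions chosen for illustration; reads by the second processor are treated as instantaneous to keep the model small.

```python
def run(wait_for_ack):
    # Hypothetical cycles for a write to reach each memory module.
    latency = {"data": 5, "head": 1}
    memory = {"data": 0, "head": 0}

    # P1 issues its writes in program order: data = 2000, then head = 1.
    program = [("data", 2000), ("head", 1)]
    arrivals = []                      # (arrival_time, addr, value)
    issue_time = 0
    for addr, value in program:
        arrival = issue_time + latency[addr]
        arrivals.append((arrival, addr, value))
        # With acknowledgements, the next write is injected only once this
        # one has reached memory. Without them it follows one cycle later.
        issue_time = arrival if wait_for_ack else issue_time + 1

    # P2 spins on head and reads data as soon as it observes head == 1.
    last_arrival = max(t for t, _, _ in arrivals)
    for now in range(last_arrival + 1):
        for t, addr, value in arrivals:
            if t == now:
                memory[addr] = value
        if memory["head"] == 1:
            return memory["data"]
    return None


stale = run(wait_for_ack=False)
fresh = run(wait_for_ack=True)
print(stale, fresh)
```

Without acknowledgements the write to head overtakes the write to data, so the second processor sees the flag set but reads the old data. Holding the second write back until the first is acknowledged restores the expected value, at the cost of serializing the writes.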
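Next we consider non-blocking read operations on the same example as above. Here the writing processor ensures that its writes arrive at memory in program order. If the other processor is allowed to issue its reads in an overlapped fashion, without waiting for an earlier read to return, a later read might arrive at memory before the write it should observe, while the earlier read arrives after its write. The reader then sees a combination of new and old values, which is inconsistent.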
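The same kind of toy timeline can illustrate non-blocking reads. Again this is a sketch under assumptions: the names data and head, the value 2000, and all cycle counts are made up, and the memory modules are modeled as a single dictionary updated in arrival order.

```python
def run(blocking_reads):
    memory = {"data": 0, "head": 0}

    # P1's writes reach memory in program order (hypothetical cycle numbers).
    events = [(2, "write", "data", 2000), (4, "write", "head", 1)]

    # P2's program order is read(head), then read(data). The read of head is
    # assumed to be slow and reaches its module at cycle 5.
    head_read_arrival = 5
    if blocking_reads:
        # The read of data is issued only after the read of head returns.
        data_read_arrival = head_read_arrival + 1
    else:
        # The read of data is issued right away and overtakes the read of head.
        data_read_arrival = 1
    events += [
        (head_read_arrival, "read", "head", None),
        (data_read_arrival, "read", "data", None),
    ]

    # Apply every event to memory in order of arrival time.
    observed = {}
    for _, kind, addr, value in sorted(events, key=lambda e: e[0]):
        if kind == "write":
            memory[addr] = value
        else:
            observed[addr] = memory[addr]
    return observed


overlapped = run(blocking_reads=False)
in_order = run(blocking_reads=True)
print(overlapped, in_order)
```

With overlapped reads, the second processor observes head as 1 but data as 0, even though the writer kept its writes in order. Making each read wait for the previous one removes the anomaly in this model.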
#codingexercise
// Assumes OddNumberRangeMode is an extension method on Decimal[] defined elsewhere.
Decimal GetOddNumberRangeMode(Decimal[] A)
{
    if (A == null) return 0;
    return A.OddNumberRangeMode();
}
