Today I'm going to continue discussing the WRL research report on Shared memory consistency models. Most programmers assume sequential semantics for memory operations: a read returns the value of the last write to the same location. As long as the uniprocessor data and control dependences are respected, the illusion of sequentiality can be maintained, so the compiler and hardware are free to reorder operations to different locations. Compiler optimizations that reorder memory operations include register allocation, code motion, and loop transformations; hardware optimizations include pipelining, multiple issue, write buffer bypassing and forwarding, and lockup-free caches.
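To make one of these reorderings concrete, here is a small C# sketch (my own illustration, not code from the report) of how register allocation alone can break the sequential intuition: if the shared flag below is kept in a register, the worker's loop may never see the main thread's write. Whether this actually happens depends on the JIT; marking the field volatile restores the intended behaviour.

using System;
using System.Threading;

class RegisterAllocationExample
{
    // Shared flag; deliberately not volatile, so the compiler may keep it in a register.
    static bool done = false;

    static void Worker()
    {
        // The read of 'done' can be hoisted out of the loop, turning this into
        // an infinite spin even after the main thread sets the flag.
        while (!done) { }
        Console.WriteLine("Worker observed done = true");
    }

    static void Main()
    {
        var t = new Thread(Worker);
        t.Start();
        Thread.Sleep(100);
        done = true;    // may never be observed by Worker without volatile or a fence
        t.Join();
    }
}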
The following are some of the myths about memory consistency models:
1) It only applies to systems that allow multiple copies of shared data. This is not true; even without caching, optimizations such as overlapped writes and non-blocking reads can violate sequential consistency.
2) Most current systems are sequentially consistent. On the contrary, Cray computers and Alpha computers relax this consistency.
3) The memory consistency model only affects the design of the hardware. But it actually affects the compiler and other parts of the system design as well.
4) A cache coherence protocol inherently supports sequential consistency. In reality, coherence is only part of what is needed. A related myth is that the memory consistency model depends on whether the system uses an invalidate or update based coherence protocol. But a given model can be supported with either.
5) The memory model may be defined solely by specifying the behaviour of the processor. But it is affected by both the processor and memory system.
6) Relaxed memory models may not be used to hide read latency. The reality is they can hide both read and write latency.
7) Relaxed consistency models require the use of extra synchronization. There is no synchronization required for some of the relaxed models.
8) Relaxed consistency models do not allow asynchronous algorithms. This can actually be supported.
#codingexercise
Decimal GetOddNumberRangeMean(Decimal[] A)
{
    // Guard against a null input, then delegate to the OddNumberRangeMean
    // extension method used throughout this series.
    if (A == null) return 0;
    return A.OddNumberRangeMean();
}
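The OddNumberRangeMean extension method is not shown in this post; as a rough sketch, assuming it is meant to return the mean of the odd values in the array, it could look like this:

using System;
using System.Linq;

static class DecimalArrayExtensions
{
    // Hypothetical implementation; the actual body is defined elsewhere in this series.
    public static Decimal OddNumberRangeMean(this Decimal[] A)
    {
        var odds = A.Where(x => x % 2 != 0).ToArray();
        return odds.Length == 0 ? 0 : odds.Average();
    }
}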
Tonight we will continue to discuss this WRL research report on Shared memory consistency models. In particular, we will be discussing sequential consistency. This is the most commonly assumed memory consistency model. The definition is that the result of any execution is the same as if the operations of all the processors were executed in some single sequential order, and the operations of each individual processor appear in this sequence in program order. Maintaining program order within each processor, together with maintaining a single sequential order among all the processors, is the core tenet of this model. The latter requirement effectively means that memory is updated atomically with respect to other operations. Let us take two examples for this model. The first uses a code segment from Dekker's algorithm for critical sections, involving two processors and two flags that are initialized to zero. When a processor wants to enter the critical section, it updates its own flag and then checks the other processor's flag, relying on the other flag being read only after its own flag has been written. Maintaining program order here ensures that at most one processor enters the critical section; if the write of a processor's own flag could be reordered after its read of the other flag, both processors could read zero and enter the critical section at the same time. To illustrate the importance of atomic execution, let us instead consider an example where three processors share variables A and B, both initialized to zero. If one processor writes A, a second processor reads A and then writes B, and a third processor reads B and then reads A, atomicity requires that the write to A become visible to all processors at once; otherwise the third processor could see the new value of B while still seeing the old value of A.
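To make the Dekker fragment concrete, here is a minimal C# sketch (my own illustration; the names and structure are mine, not the report's). Two tasks stand in for the two processors, and the flags are plain shared fields with no volatile annotations or fences. Under sequential consistency at least one task must observe the other's write, so both reads returning zero is impossible; with compiler reordering or hardware write buffering that outcome can occur, and depending on the machine and JIT this program may actually observe it.

using System;
using System.Threading.Tasks;

class DekkerFragment
{
    // Shared flags from the Dekker example; both start at zero for each trial.
    static int flag1, flag2;

    static void Main()
    {
        int bothEntered = 0;
        for (int trial = 0; trial < 100000; trial++)
        {
            flag1 = 0; flag2 = 0;
            int r1 = -1, r2 = -1;

            // "Processor 1": set own flag, then check the other flag.
            var t1 = Task.Run(() => { flag1 = 1; r1 = flag2; });
            // "Processor 2": set own flag, then check the other flag.
            var t2 = Task.Run(() => { flag2 = 1; r2 = flag1; });
            Task.WaitAll(t1, t2);

            // r1 == 0 && r2 == 0 means both "processors" would have entered
            // the critical section; sequential consistency forbids this outcome.
            if (r1 == 0 && r2 == 0) bothEntered++;
        }
        Console.WriteLine($"Both entered the critical section in {bothEntered} trials");
    }
}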