Cluster computing

Thursday, February 12, 2015

Today we continue reading from the WRL Research Report.
The fourth benchmark uses the bcopy (block copy) procedure to transfer large blocks of data from one area of memory to another. This does not necessarily involve the operating system, but each system may have its own implementation of the procedure. The machines generally differ in cache organization and memory bandwidth, so this is a good metric for evaluating performance. The tests were run on the configurations with two different block sizes. In the first case, the blocks were large enough and properly aligned to use bcopy in the most efficient way, but small enough that both the source and the destination fit in the cache. In the second case, the transfer size was bigger than the cache size, so cache misses would occur continuously. In each case, several transfers were made between the same source and destination, and the average copy bandwidth was measured.
The results showed that the cached bandwidth was largest for the M2000 RISC/os and the 8800 Ultrix, and it progressively reduced for the Sun4 and Sun3 machines. The uncached bandwidth shows the same gradation. This implies that faster processors did not bring a corresponding improvement in memory bandwidth. Thus memory-intensive applications are not likely to scale on these machines. In fact, the relative performance of memory copying drops with faster processors across both RISC and CISC machines.
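The block copy measurement described above can be sketched roughly as follows. This is only an illustration in Python, not the report's benchmark code: the function name copy_bandwidth, the repeat counts, and the two block sizes are assumptions chosen for the example, and the sizes would need to be adjusted to the cache of the machine being tested.

```python
import time


def copy_bandwidth(block_size, repeats=20):
    """Average copy bandwidth in bytes/second for one block size.

    Hypothetical helper: repeatedly copies the same source buffer into
    the same destination buffer, as the benchmark description suggests.
    """
    src = bytearray(block_size)
    dst = bytearray(block_size)
    start = time.perf_counter()
    for _ in range(repeats):
        dst[:] = src  # bulk copy of the whole block, no per-byte loop
    elapsed = time.perf_counter() - start
    return block_size * repeats / elapsed


if __name__ == "__main__":
    # Assumed sizes: one block small enough to stay in cache,
    # one far larger than a typical cache.
    small = 64 * 1024
    large = 32 * 1024 * 1024
    print("cached   MB/s:", copy_bandwidth(small, repeats=1000) / 1e6)
    print("uncached MB/s:", copy_bandwidth(large, repeats=10) / 1e6)
```

Comparing the two numbers on different machines gives a rough picture of how the cached and uncached copy bandwidth differ, though interpreter overhead makes this far less precise than a native bcopy loop.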
The next benchmark we consider is the read from the file cache. The benchmark consists of a program that opens a large file and reads it repeatedly in 16 kbyte blocks. For each configuration, a file size was chosen such that it would fit in the main-memory file cache. Thus the benchmark measures the cost of entering the kernel and copying the data from the kernel's file cache back to a buffer in the benchmark's address space.
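A rough sketch of this kind of measurement is shown below, again in Python and not taken from the report. The name cached_read_bandwidth, the temporary file, and the number of passes are assumptions for illustration. It also assumes that a freshly written file small enough for memory is served from the operating system's file cache, which is typical but not guaranteed.

```python
import os
import tempfile
import time


def cached_read_bandwidth(file_size, block_size=16 * 1024, passes=5):
    """Read a file repeatedly in fixed-size blocks.

    Hypothetical helper: returns (total bytes read, bytes per second).
    Each os.read call enters the kernel and copies data into a new
    buffer in this process's address space.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(bytes(file_size))
        tmp.flush()  # push the data out of Python's own buffer
        fd = tmp.fileno()
        total = 0
        start = time.perf_counter()
        for _ in range(passes):
            os.lseek(fd, 0, os.SEEK_SET)  # rewind for the next pass
            while True:
                chunk = os.read(fd, block_size)
                if not chunk:
                    break  # end of file
                total += len(chunk)
        elapsed = time.perf_counter() - start
    return total, total / elapsed


if __name__ == "__main__":
    total, bw = cached_read_bandwidth(8 * 1024 * 1024)
    print("read", total, "bytes at", bw / 1e6, "MB/s")
```

Because the file is chosen to fit in memory, the time measured is dominated by the system call and the kernel-to-user copy, not by disk access.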

We will continue with this post shortly.
