Cluster computing

Friday, October 18, 2013

In today's post we will talk about one more distributed computing before we move on to the next topic. We will talk about time synchronization of physical clocks. We mentioned that virtual clocks and logical time can capture a partial order of events in a distributed system. But in this post, we will talk about say a real clock. Such a clock would advance at an appropriate rate. eg. one tick per millisecond. Each process may have its own clock.
The issue here is that the current time displayed by the clock is not a shared state. Moreover, a watch that ticks accurately is not enough. It can not be used to tell time though it can be used to time the duration of some interval. The clocks may need to be synchronized to tell time and even then they can drift over time requiring synchronization again.
Consider a process receiving a time notification from another process. The receiving process has no way of knowing the delay from the transmission One solution could involve sending a message first and then waiting for an echo. The receiving process can then know the time elapsed.
Since the process knows the timestamp on the echo, it can split the elapsed time to know what the delay was in receiving the echo.
Sending a request and receiving an echo can be repeated with the hope to improve accuracy. If the delays are the same, there is no improvement. If the delays vary, then the repetitions narrow the possible interval. This improves accuracy of the clock.
As an example, lets say the timestamp of the first echo is 3 and that of the second echo is 3:30. Since the elapsed time for the earlier was ten minutes, and the elapsed time for the other is twelve minutes, we have two intervals for the current time at the other process. The interval intersecting these two intervals is now much smaller thus improving the accuracy in predicting the time at the other process.
When this is repeated with several servers, and each server finds the interval its working on, the intersection of all these intervals will be even narrower thus increasing the accuracy of the time.
It might happen that the intersection for the intervals could come out to be empty.
This means that one or more of the servers have a wider drift. So we take each interval as a likely vote and go with the majority.
One way to do such a tally is as follows is to see how many intervals are satisfied by a given time. We increment a counter whenever we are within an interval and decrement the counter whenever we leave the interval. We also update our minimum and maximum bound on each entry and exit. At the maximum count of overlapping intervals, we know what intersection majority have.

Cluster computing

Friday, October 18, 2013

No comments:

Post a Comment