Monday, April 20, 2015

Today we continue to discuss the streaming proxy design. We were discussing how to improve the lazy segmentation strategy. The replacement policy evicts the least useful media object segments, and the eviction happens in one of three phases: the first phase applies when the object is fully cached, the second when the object is partially cached with more than one segment, and the third when the object is partially cached with only one segment. The utility value of each object is updated after each replacement, and this process repeats iteratively until the required space is found.
We now discuss the hyper proxy system, where the authors coordinate the two schemes discussed earlier.
For every media object accessed through the proxy, a data structure called the access log of the object is maintained. This data structure has fields for such things as the length of the base segment, the number of cached segments of the object, the threshold length used in the replacement policy, and the average access duration of the object. The threshold length is calculated for each object as per the three-phase technique above. The system maintains two lists: the basic list contains all objects whose cached data length is larger than their threshold length, while the premium list contains those objects whose cached data length is equal to or less than the threshold length. A flag is used to determine the priority at admission. The admission policy dictates that the object is fully cached until it is chosen as an eviction victim; at that time, some of its segments are evicted and the object is transferred to the premium list. When the object is accessed again, its segments are adaptively admitted by the admission policy or adaptively replaced by the replacement policy.
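As a rough sketch of this bookkeeping (the field and list names here are illustrative, not the paper's exact identifiers), the access log and the two lists might look like:

```python
from dataclasses import dataclass

@dataclass
class AccessLog:
    """Per-object access log; field names are illustrative."""
    base_segment_length: float
    num_cached_segments: int
    threshold_length: float       # from the three-phase technique
    avg_access_duration: float
    cached_length: float = 0.0    # total data currently cached

def classify(logs):
    """Split objects between the basic and premium lists."""
    basic, premium = [], []
    for obj_id, log in logs.items():
        if log.cached_length > log.threshold_length:
            basic.append(obj_id)      # cached length exceeds threshold
        else:
            premium.append(obj_id)    # cached length <= threshold
    return basic, premium
```

When an object is chosen for eviction and trimmed, its log's cached_length shrinks and it migrates from the basic list to the premium list on the next classification.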

Sunday, April 19, 2015

As we read up on this streaming proxy design, we continue to review the analytical results on the tradeoff between the delayed startup ratio and the byte hit ratio. What we had said so far was that the change in the delay with respect to the change in the byte hit ratio is always greater than 1 when the startup length percentage and the reserved cache space percentage are less than 50%. This gives a solid basis for giving a higher priority to caching the startup length of objects in the replacement policy. We now study the proposed replacement policy design.
This technique is called the improved adaptive lazy segmentation strategy. In order to significantly reduce the startup latency with a small decrease of the byte hit ratio, a three-phase iterative replacement policy is considered. The media object with the least utility value is chosen as the victim for eviction from the cache, and its segments are evicted in one of three phases as follows:
In the first phase, if the object is fully cached, the object is segmented by the lazy segmentation method. The first two segments are kept and the remaining segments are evicted right after the segmentation is completed.
In the second phase, if the object is partially cached with more than one segment, the last cached segment of this object is evicted.
In the third phase, if the victim has only the first segment and is to be replaced, its startup length and base segment length are compared. If the startup length is less than the base segment length, the base segment length is kept and the rest is replaced; otherwise the object is totally replaced. The utility value of the media object is updated after each replacement, and this process repeats iteratively until the required space is found.
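A minimal sketch of one iteration of this three-phase step, assuming segments are modeled as a list of segment lengths and using illustrative field names:

```python
def evict_step(obj):
    """Apply one replacement phase to the chosen victim,
    per the three phases described above."""
    if obj["fully_cached"]:
        # Phase 1: lazily segment, keep the first two segments,
        # evict the rest.
        obj["fully_cached"] = False
        obj["segments"] = obj["segments"][:2]
    elif len(obj["segments"]) > 1:
        # Phase 2: evict the last cached segment.
        obj["segments"].pop()
    elif obj["startup_length"] < obj["base_length"]:
        # Phase 3: keep a base segment's worth of data,
        # replace the rest.
        obj["segments"] = [obj["base_length"]]
    else:
        # Phase 3: otherwise totally replace the object.
        obj["segments"] = []
    return obj
```

In the full policy this step repeats, updating each object's utility value, until the required space is freed.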

#codingexercise

GetOddNumberRangeSumFifthRootSumEighteen (Double [] A)
{
    if (A == null) return 0;
    return A.OddNumberSumFifthRootSumEighteen();
}

Saturday, April 18, 2015

Today we continue our discussion of the streaming proxy design. We were finding the upper bound on the byte hit ratio and the delayed startup ratio in the ideal case. Let us look at some analytical results to understand these ratios in the ideal case. Given a total of 10000 original media objects and a cache whose size is 20% of the total object size, the number of objects that fit in the cache, given the average object length, is 2000. The skew ratio is set at 0.47 and the startup length percentage at 5%. It is observed that the byte hit ratio decreases much more slowly than the delayed startup ratio as Beta increases. This implies that by reducing the byte hit ratio a small amount, we can trade for a significantly larger reduction in the delayed startup ratio, which has been an objective of the proxy design principles. To quantify this tradeoff, the partial derivative of the upper bound on the delayed startup ratio with respect to the reserved cache space percentage is calculated; this yields the change in the delayed startup ratio. Similarly, the partial derivative of the upper bound on the byte hit ratio with respect to the reserved cache space percentage yields the change in the byte hit ratio. From the ratio of the change in the delayed startup ratio to the change in the byte hit ratio, it can be shown that delta delay over delta hit is always greater than 1 when alpha and Beta are less than 50%.
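A rough numeric sketch of this experiment, assuming uniform object lengths and the parameters quoted above (N = 10000 objects, a cache holding 2000 average-length objects, skew 0.47, alpha = 5%); t and q follow the ideal-case allocation from the earlier discussion and the formulas here are an illustration, not the paper's exact derivation:

```python
def zipf(n, theta):
    """Zipf-like popularity: p_i proportional to 1 / i^theta."""
    f = [1.0 / (i ** theta) for i in range(1, n + 1)]
    total = sum(f)
    return [x / total for x in f]

def ratios(beta, n=10000, theta=0.47, cache_objects=2000, alpha=0.05):
    """Delayed startup ratio and byte hit ratio in the ideal case."""
    p = zipf(n, theta)
    # t objects get their startup prefix into the reserved space
    t = min(n, int(beta * cache_objects / alpha))
    # q objects get their beyond-startup body into the rest of the cache
    q = min(n, int((1 - beta) * cache_objects / (1 - alpha)))
    tail = lambda k: sum(p[k:])
    delayed = tail(t)
    byte_hit = 1 - (alpha * tail(t) + (1 - alpha) * tail(q))
    return delayed, byte_hit
```

Sweeping beta from 0 to 0.2 shows the delayed startup ratio falling far faster than the byte hit ratio changes, consistent with delta delay over delta hit being greater than 1.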
#codingexercise

GetOddNumberRangeSumCubeRootSumEighteen (Double [] A)
{
    if (A == null) return 0;
    return A.OddNumberSumCubeRootSumEighteen();
}

Friday, April 17, 2015

Today we continue our discussion on streaming proxy design. We were discussing the ideal case for the analytical model. We said that it corresponds to the cache always being used to hold the most popular media objects. In this case, the delayed startup ratio was the ratio of the mean arrival rate of the t+1th to Nth media objects over that of all the media objects. Representing this in terms of the skew ratio, we notice that the upper bound applies when the skew factor is not equal to 1. We can now find the byte hit ratio in this case. Again, we find the upper bound by using t = Beta · U / alpha, where U is the total cache size divided by the average length of objects, and t is the number of most popular objects, sorted by popularity, whose segments within the startup length would fill the reserved portion of the cache. The maximum delayed startup ratio occurs when the skew factor is not equal to 1. We can express this as the same ratio of arrival rates as in the definition of the delayed startup ratio.
Next, to derive an upper bound on the byte hit ratio in the ideal case, let us consider the misses when an object is accessed for the first time. The byte hit ratio in this case is expressed as 1 minus a ratio whose numerator is the startup length of the t+1th to Nth objects times their average arrival rate, plus the beyond-startup-length portion of the q+1th to Nth media objects times their average arrival rate, and whose denominator is the full length of all N media objects times their average arrival rate.
We subtract from one because the numerator accounts for the bytes that miss the cache; the remainder corresponds to the bytes served from the cached data.
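Written out in symbols (a hedged reconstruction from the description above, with lambda_i the mean arrival rate of object i, L_i its full length, and L_i^0 its startup length):

```latex
\text{byte hit ratio} \;=\; 1 \;-\;
\frac{\sum_{i=t+1}^{N} \lambda_i L_i^{0}
      \;+\; \sum_{i=q+1}^{N} \lambda_i \left(L_i - L_i^{0}\right)}
     {\sum_{i=1}^{N} \lambda_i L_i}
```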
#codingexercise

GetOddNumberRangeSumCubeRootPowerEighteen (Double [] A)
{
    if (A == null) return 0;
    return A.OddNumberSumCubeRootPowerEighteen();
}


Thursday, April 16, 2015

Today we continue our discussion on streaming proxy design. We were reviewing the assumptions for building an analytical model. The first assumption is that the popularity of media objects follows a Zipf-like distribution. The second assumption was that the request arrival process occurs at a known average rate, independent of time. The third assumption was that clients view the media objects completely. The first assumption gives us a probability set pi. The second assumption gives us a sampling process where the sum of pi equals 1. The third assumption gives us a mean arrival rate of lambda times pi. We also reviewed the added definition of the startup length: this is the percentage alpha of the full length of the media object, and Beta is the percentage of the total cache space reserved for all the startup lengths.
This helps us determine the startup ratio. To build the model, we consider the ideal case where the cache space is allocated to the most popular objects; then the startup lengths of the first t most popular objects can remain in the reserved portion of the cache. This can be extended to imply that the rest of the cache is used for the q most popular objects beyond their startup lengths. Therefore the delayed startup ratio can be expressed as the ratio of the sum of the mean arrival rates of the t+1th to Nth media objects over that of all N media objects.
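As a small sketch (illustrative only, assuming mean arrival rates for objects sorted by popularity), the delayed startup ratio for a given t is just the tail fraction of the arrival rates:

```python
def delayed_startup_ratio(arrival_rates, t):
    """Fraction of requests going to objects t+1..N, whose startup
    prefixes do not fit in the reserved cache space
    (arrival_rates is sorted by popularity, most popular first)."""
    return sum(arrival_rates[t:]) / sum(arrival_rates)
```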

#codingexercise
GetOddNumberRangeSumCubeRootPowerSixteen (Double [] A)
{
    if (A == null) return 0;
    return A.OddNumberSumCubeRootPowerSixteen();
}

Tuesday, April 14, 2015

Today we continue our discussion on streaming proxy design. We were saying that in reality the cache size is bounded; we may not be able to cache the number of segments required for continuous delivery of a media object when there are many media objects. One way to overcome this is to determine the priority of media objects. Given a higher priority to reducing proxy jitter, the proxy can choose to evict segments of objects whose cached data length is larger than their prefetching length, so that the prefetching of the uncached segments is always in time. Even if segments of popular objects are evicted, the overall proxy jitter is reduced at the cost of a little byte hit ratio. Thus we have seen that the byte hit ratio can be traded for less proxy jitter.
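A tiny sketch of this priority rule, with illustrative names (each object carries its cached data length and its prefetching length):

```python
def eviction_candidates(objects):
    """Pick objects whose cached data length exceeds their prefetching
    length; trimming these keeps prefetching in time for every object."""
    return [name for name, (cached_len, prefetch_len) in objects.items()
            if cached_len > prefetch_len]
```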
This conflict between improving the byte hit ratio and reducing the proxy jitter led the authors to revise the principle of designing a better proxy around proxy jitter. They noted that segment-based proxy caching strategies always perform well on the byte hit ratio but not so well on the delayed startup ratio. This is further explained when evaluating the adaptive lazy segmentation based scheme.
We also look at the tradeoff between the byte hit ratio and the delayed startup ratio. We saw some conflicting interests in this tradeoff in the previous discussion, but now we build an analytical model. This analytical model is based on the following assumptions:
1) The popularity of the objects follows a Zipf-like distribution.
2) The request arrival interval process follows a Poisson distribution with a mean arrival rate lambda.
3) The clients view the requested objects completely. This is to simplify the analysis and does not affect the conclusion.
After we build the model, we will evaluate it with analytical results.
Then we review the author's improved Adaptive-Lazy Segmentation Strategy.
Assumption 1) describes the probability set pi as the fraction of fi to the total of all fi, where fi is the inverse of i raised to the power theta, i varies from 1 to N (the total number of objects), and theta, the skew factor, is positive.
Assumption 2) describes the sampling process as independent, sampled from the aggregate arrival interval process based on the probability set pi, where the sum of pi equals 1.
Assumption 3) lets us calculate the mean arrival rate as lambda times pi.
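The three assumptions can be sketched numerically (a minimal illustration; the lambda and theta values used below are arbitrary, not from the paper):

```python
def zipf_probabilities(n, theta):
    """Assumption 1: p_i = f_i / sum(f), with f_i = 1 / i^theta."""
    f = [1.0 / (i ** theta) for i in range(1, n + 1)]
    total = sum(f)
    return [x / total for x in f]

def mean_arrival_rates(lam, probabilities):
    """Assumptions 2 and 3: the mean arrival rate for object i
    is lambda * p_i."""
    return [lam * p for p in probabilities]
```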
In addition to these assumptions, the following definitions make it easier to calculate the delayed startup ratio.
The startup length is the length of the beginning part of an object. If this part is cached, there is no startup delay. Let alpha be this percentage of the full object length, and let Beta be the percentage of the total cache space reserved for caching the startup lengths of objects.

Monday, April 13, 2015

Today we continue our discussion on the design of the streaming proxy. We were calculating the number of segments needed for continuous delivery, considering both the uniformly segmented media and the exponentially segmented media. We now review the tradeoff between low proxy jitter and high byte hit ratio, and then the byte hit ratio versus the delayed startup ratio. In theory, we know the minimum number of segments that must always be cached in the proxy to guarantee proxy-jitter-free delivery. In practice, this is difficult for each and every media object because the cache size is limited. One way to overcome this is to determine whether a media object is popular. Popular objects are always cached to reduce network traffic and server load. If an object is popular enough, all its segments can be cached in the proxy, even beyond the prefetching length. On the other hand, if objects are not popular enough, some segments may get evicted and only a few of their segments cached; this can contribute to proxy jitter. Given a higher priority to reducing proxy jitter, the proxy can choose to evict segments of objects whose cached data length is larger than their prefetching length, so that the prefetching of its uncached segments can always be in time. If popular objects do get evicted, this comes at the cost of the byte hit ratio. Thus we can simulate the availability of all segments for media objects that are popular.