Thursday, April 23, 2015

We discussed the HyperProxy modules for active prefetching and the lazy segmentation strategy, and we continue with the remaining two modules. We briefly looked at the replacement policy and the maintenance of two lists - the basic and premium lists. The utility value of each object is updated after each replacement. Even after an object is fully evicted, the system keeps its access log. If it did not, the object would be fully cached again the next time it is accessed. One characteristic of media objects is that their popularity diminishes as time goes by, so re-caching the full length of the object is wasteful. Consequently keeping the access log is relevant and recommended.
To keep access logs from proliferating, a sufficiently large timeout threshold is set so that the proxy eventually deletes them.
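As a side note, a minimal sketch of how such timeout-based pruning of access logs might look; the threshold value, names and structures here are my own assumptions, not from the paper:

using System;
using System.Collections.Generic;

static class AccessLogPruning
{
    // Assumed timeout; the post only says it is "large enough".
    static readonly TimeSpan AccessLogTimeout = TimeSpan.FromDays(30);

    // Drop the access log of any object that has not been requested within the timeout.
    public static void Prune(Dictionary<string, DateTime> lastAccessTime, DateTime now)
    {
        foreach (var key in new List<string>(lastAccessTime.Keys))
        {
            if (now - lastAccessTime[key] > AccessLogTimeout)
                lastAccessTime.Remove(key);
        }
    }
}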
We now look at performance evaluation. To evaluate the performance of the proxy, it was tested on several workloads - both real and synthetic. In all the workloads, accesses to the media objects follow a Zipf-like distribution with a skew factor theta, and the request inter-arrival times follow a Poisson distribution with a mean interval lambda.
The first synthetic workload simulates accesses to media objects in a web environment, where the media objects vary from short to long. All other parameters are kept the same.
The second workload simulates web accesses in which clients abort: a started session terminates before the full media object is delivered. Nearly 80% of the sessions terminate before 20% of the object is delivered.
The third workload is a real capture of a workload covering a period of 10 days from an actual server.
There are a total of 403 objects, and the unique objects account for about 20 GB. A total of 9000 requests were made during the 10-day period of this real workload.
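A rough sketch of how such a synthetic workload could be generated. The generator is my own, not the paper's, and the concrete numbers plugged in below are simply figures mentioned in these posts, used purely for illustration:

using System;
using System.Linq;

class WorkloadSketch
{
    static readonly Random rng = new Random();

    // Zipf-like popularity: probability of rank i is proportional to 1 / i^theta.
    static int SampleZipf(int n, double theta)
    {
        double[] weights = Enumerable.Range(1, n)
                                     .Select(i => 1.0 / Math.Pow(i, theta))
                                     .ToArray();
        double u = rng.NextDouble() * weights.Sum();
        for (int i = 0; i < n; i++)
        {
            u -= weights[i];
            if (u <= 0) return i + 1;   // 1-based popularity rank of the requested object
        }
        return n;
    }

    // Poisson arrivals: exponentially distributed inter-request times with mean lambda.
    static double SampleInterArrival(double lambda)
    {
        return -lambda * Math.Log(1.0 - rng.NextDouble());
    }

    static void Main()
    {
        double clock = 0;
        for (int r = 0; r < 10; r++)
        {
            clock += SampleInterArrival(60.0);    // assumed mean interval of 60 seconds
            int rank = SampleZipf(403, 0.47);     // 403 objects and skew 0.47, as mentioned in these posts
            Console.WriteLine($"t={clock:F1}s request for object ranked {rank}");
        }
    }
}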

Wednesday, April 22, 2015

We discuss the remaining three major modules in the HyperProxy design. Active prefetching kicks in on every subsequent access to a media object; it determines when to prefetch which segment.
In the case when no segment is cached, prefetching of the segment at position Bs/Bt is considered, and the new segment admission is marked as priority.
If the number of cached segments is smaller than the Bs/Bt threshold, proxy jitter is unavoidable, and the new segment admission is again marked as priority.
If more segments are cached than the threshold, prefetching of the next segment starts at an offset of Bs/Bt segments from this candidate, and the new segment admission is marked as non-priority.
Here Bs is the encoding rate of the media object and Bt is the network bandwidth of the proxy-server link.
A similar analysis applies to the active prefetching of an exponentially segmented object.
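A schematic sketch of this prefetch decision, assuming the Bs/Bt ratio is used literally as a segment-count threshold as worded above; the class and field names are mine, not the paper's:

using System;

static class ActivePrefetching
{
    public enum Mark { Priority, NonPriority }

    public class Decision
    {
        public int PrefetchFrom;        // index of the first segment to prefetch
        public Mark Admission;          // how the newly admitted segments are marked
        public bool JitterUnavoidable;
    }

    // cachedSegments: how many uniform segments of the object are already cached
    // Bs: encoding rate of the object; Bt: bandwidth of the proxy-server link
    public static Decision Decide(int cachedSegments, double Bs, double Bt)
    {
        double threshold = Bs / Bt;

        if (cachedSegments == 0)
        {
            // No segment cached: consider prefetching the Bs/Bt-th segment; mark as priority.
            return new Decision { PrefetchFrom = (int)Math.Ceiling(threshold),
                                  Admission = Mark.Priority, JitterUnavoidable = true };
        }
        if (cachedSegments < threshold)
        {
            // Fewer cached segments than the threshold: jitter cannot be hidden; still priority.
            return new Decision { PrefetchFrom = cachedSegments,
                                  Admission = Mark.Priority, JitterUnavoidable = true };
        }
        // Enough segments cached: start prefetching at an offset of Bs/Bt segments ahead, non-priority.
        return new Decision { PrefetchFrom = cachedSegments + (int)Math.Ceiling(threshold),
                              Admission = Mark.NonPriority, JitterUnavoidable = false };
    }
}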
Next we consider the lazy segmentation strategy. When no cache space is available and a replacement is needed, the replacement policy kicks in and calculates the caching utility of each cached object; the objects with the smallest utility values are evicted first. If the victim is fully cached, it is segmented uniformly based on its average access duration at that time. Then a certain number of segments are retained and the rest are evicted.
The replacement policy uses a formula to determine the utility value, and eviction is based on the maintenance of two lists - the basic list and the premium list.
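A minimal sketch of the lazy segmentation step described above; the caching utility formula is not spelled out in the post, so it is left out, and the field names are assumptions:

using System;

static class LazySegmentation
{
    public class CachedObject
    {
        public double AverageAccessDuration; // average time clients spend playing the object
        public double ThresholdLength;       // threshold length from the access log
        public double CachedLength;          // media time currently held in cache
        public bool FullyCached;
    }

    // Segment the object only when it is first chosen as a victim (hence "lazy").
    public static void Segment(CachedObject victim)
    {
        if (!victim.FullyCached) return;

        // The base segment length is decided now, from the access history,
        // rather than when the object was first admitted.
        double baseLength = victim.AverageAccessDuration;

        // Retain threshold-length / base-length segments; evict the rest.
        int keep = (int)Math.Ceiling(victim.ThresholdLength / baseLength);
        victim.CachedLength = Math.Min(victim.CachedLength, keep * baseLength);
        victim.FullyCached = false;
    }
}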

Tuesday, April 21, 2015

Today we continue discussing the hyper proxy from the paper on streaming proxy design. We saw that it maintains a data structure that keeps the threshold length, the average access duration, the access frequency F etc. The system maintains two media object lists - one premium list and one basic list. These lists help to find a victim and evict some of its segments. Then the segments of new objects are adaptively admitted by the admission policy.
There are four major modules in the Hyper Proxy caching system.  These are:
1) Priority-based admission policy
2) Active prefetching
3) Lazy segmentation policy
4) Differentiated replacement policy
We discuss these in detail in the next few posts.


#codingexercise

double GetOddNumberRangeProductFifthRootSumEighteen(double[] A)
{
    if (A == null) return 0;
    // extension method assumed to be defined in an earlier post
    return A.OddNumberProductFifthRootSumEighteen();
}

We will continue our discussion about the major modules in the Hyper Proxy system.
Let's begin with the cache admission system. As mentioned before, when a media object is requested for the first time, it is cached in full. The replacement policy is activated if there is not sufficient space: from the two lists we look at the premium list first, pick an object that has no priority, and if no such object is found, pick one with priority. The fully cached object is linked to the basic list and an access log is created for it. If an existing access log already indicates that the object is fully cached, the log is merely updated to handle this robustness case.
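A small sketch of the victim-selection order during admission, combining the no-priority-first rule above with the smallest-utility rule from the replacement discussion; the types and names are illustrative only:

using System.Collections.Generic;
using System.Linq;

static class AdmissionSketch
{
    public class Entry
    {
        public string Name;
        public bool Priority;
        public double Utility;
    }

    // Pick the eviction victim from the premium list: smallest-utility non-priority
    // entry first, falling back to a priority entry if every object is priority.
    public static Entry PickVictim(List<Entry> premiumList)
    {
        var candidate = premiumList.Where(e => !e.Priority)
                                   .OrderBy(e => e.Utility)
                                   .FirstOrDefault();
        return candidate ?? premiumList.OrderBy(e => e.Utility).FirstOrDefault();
    }
}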

The other three major modules are active prefetching, the lazy segmentation policy, and the differentiated replacement policy.

In the active prefetching module, partially cached objects (so called because not all of their segments are in the cache) are examined to determine which segments should be prefetched.

In the lazy segmentation module, the average access duration at the current time instance is calculated and used as the length of the base segment; the object is then segmented uniformly. A number of segments determined by the ratio of the threshold length to the base segment length is retained, while the rest are evicted.

The differentiated replacement policy gives higher priority to reducing proxy jitter, reduces erroneous replacement decisions, and gives evicted segments a fair chance to be cached back into the proxy by the admission policy should their media objects be accessed again.


Monday, April 20, 2015

Today we continue to discuss the streaming proxy design. We were discussing how to improve the lazy segmentation strategy. The replacement policy evicts the least useful media object segments, and this can happen in one of three different phases: the first phase is when the object is fully cached, the second is when the object is partially cached with more than one segment, and the third is when the object is partially cached with only one segment. The utility value of each object is updated after each replacement, and this process repeats iteratively until the required space is found.
We now discuss the hyper proxy system where the authors co-ordinate the two schemes discussed earlier.
For every media object accessed through the proxy, a data structure called the access log of the object is maintained. It has fields for such things as the length of the base segment, the number of cached segments of the object, the threshold length used in the replacement policy, and the average access duration of the object. The threshold length is calculated for each object as per the three-phase technique above. The system maintains two lists: the basic list contains all objects whose cached segment length is larger than the threshold length, while the premium list contains those objects whose cached data length is equal to or less than the threshold length. A flag is used to determine the priority at admission. The admission policy dictates that the object is fully cached until it is chosen as an eviction victim; at that time some of its segments are evicted and the object is transferred to the premium list. When the object is accessed again, its segments are adaptively admitted by the admission policy or adaptively replaced by the replacement policy.
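A rough sketch of the per-object access log and the two lists as described above; the field and class names are my own, chosen for illustration:

using System.Collections.Generic;

class AccessLog
{
    public double BaseSegmentLength;     // length of the base segment (set lazily)
    public int CachedSegments;           // number of segments currently cached
    public double ThresholdLength;       // threshold length used by the replacement policy
    public double AverageAccessDuration; // running average of client access durations
    public int AccessFrequency;          // F: how often the object has been requested
    public bool Priority;                // flag used to differentiate admissions
}

class ObjectLists
{
    // Basic list: cached length larger than the threshold length.
    public readonly List<AccessLog> Basic = new List<AccessLog>();
    // Premium list: cached length equal to or less than the threshold length.
    public readonly List<AccessLog> Premium = new List<AccessLog>();

    public void Classify(AccessLog log)
    {
        Basic.Remove(log);
        Premium.Remove(log);
        double cachedLength = log.CachedSegments * log.BaseSegmentLength;
        if (cachedLength > log.ThresholdLength) Basic.Add(log);
        else Premium.Add(log);
    }
}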

Sunday, April 19, 2015

As we read up on this streaming proxy design, we continue to review the analytical results on the tradeoff between the delayed startup ratio and the byte hit ratio. What we have said so far is that the change in the delay with respect to the change in the byte hit ratio is always greater than 1 when the startup length percentage and the reserved cache space percentage are less than 50%. This gives a solid basis for giving higher priority to caching the startup length of objects in the replacement policy. We now study the proposed replacement policy design.
This technique is called the improved adaptive lazy segmentation strategy. In order to significantly reduce the startup latency with only a small decrease in byte hit ratio, a three-phase iterative replacement policy is used. The media object with the least utility value is chosen as the victim for eviction from the cache, and its segments are evicted in one of three phases as follows:
In the first phase, if the object is fully cached, the object is segmented by the lazy segmentation method. The first two segments are kept and the remaining segments are evicted right after the segmentation is completed.
In the second phase, if the object is partially cached with more than one segment, the last cached segment of this object is evicted.
In the third phase, if the victim has only the first segment cached and is to be replaced, its startup length and base segment length are compared. If the startup length is less than the base segment length, the data within the startup length is kept and the rest of the segment is replaced; otherwise the segment is totally replaced. The utility value of the media object is updated after each replacement, and this process repeats iteratively until the required space is found.
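A sketch of the three-phase eviction applied to a chosen victim, following the description above; lengths are treated as seconds of media data and the field names are assumptions:

static class ThreePhaseReplacement
{
    public class Victim
    {
        public bool FullyCached;
        public int CachedSegments;        // number of cached segments
        public double BaseSegmentLength;  // set by lazy segmentation
        public double StartupLength;
        public double CachedLength;
    }

    // Returns the amount of media time freed from this victim in one pass.
    public static double EvictFrom(Victim v)
    {
        if (v.FullyCached)
        {
            // Phase 1: lazily segment the object, keep the first two segments,
            // evict everything beyond them.
            double keep = 2 * v.BaseSegmentLength;
            double freed = v.CachedLength - keep;
            v.CachedLength = keep;
            v.CachedSegments = 2;
            v.FullyCached = false;
            return freed;
        }
        if (v.CachedSegments > 1)
        {
            // Phase 2: evict the last cached segment.
            v.CachedSegments -= 1;
            v.CachedLength -= v.BaseSegmentLength;
            return v.BaseSegmentLength;
        }
        // Phase 3: only the first segment remains; keep just the startup length
        // if it is smaller than the base segment, otherwise evict it entirely.
        if (v.StartupLength < v.BaseSegmentLength)
        {
            double partial = v.CachedLength - v.StartupLength;
            v.CachedLength = v.StartupLength;
            return partial;
        }
        double all = v.CachedLength;
        v.CachedLength = 0;
        v.CachedSegments = 0;
        return all;
    }
}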

#codingexercise

double GetOddNumberRangeSumFifthRootSumEighteen(double[] A)
{
    if (A == null) return 0;
    // extension method assumed to be defined in an earlier post
    return A.OddNumberSumFifthRootSumEighteen();
}

Saturday, April 18, 2015

Today we continue our discussion of the streaming proxy design. We were finding the upper bound on the byte hit ratio and the delayed startup ratio in the ideal case. Let us look at some analytical results to understand these ratios. Given a total of 10000 original media objects and a cache whose size is 20% of the total object size, the number of objects the cache can hold (the total cache size divided by the average object length) is 2000. The skew factor is set at 0.47 and the startup length percentage at 5%. It is observed that the byte hit ratio decreases much more slowly than the delayed startup ratio as the reserved cache space percentage increases. This implies that by reducing the byte hit ratio a little, we can trade for a significantly larger reduction in the delayed startup ratio, which has been an objective of the proxy design principles. To quantify this tradeoff, the partial derivative of the upper bound on the delayed startup ratio with respect to the reserved cache space percentage is calculated; this yields the change in the delayed startup ratio. Similarly, the partial derivative of the upper bound on the byte hit ratio with respect to the reserved cache space percentage yields the change in the byte hit ratio. From the ratio of the change in delayed startup ratio to the change in byte hit ratio, it can be shown that the delta delay over the delta hit is always greater than 1 when alpha and beta are less than 50%.
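Written symbolically (my own notation, not necessarily the paper's), with D and H denoting the upper bounds on the delayed startup ratio and the byte hit ratio, alpha the startup length percentage and beta the reserved cache space percentage, the observation above is:

\[ \frac{\partial D / \partial \beta}{\partial H / \partial \beta} > 1 \quad \text{whenever } \alpha < 50\% \text{ and } \beta < 50\%. \]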
#codingexercise

double GetOddNumberRangeSumCubeRootSumEighteen(double[] A)
{
    if (A == null) return 0;
    // extension method assumed to be defined in an earlier post
    return A.OddNumberSumCubeRootSumEighteen();
}

Friday, April 17, 2015

Today we continue our discussion on streaming proxy design. We were discussing the ideal case for the analytical model, which corresponds to the cache always being populated with the most popular media objects. In this case, the delayed startup ratio is the ratio of the mean arrival rate of the (t+1)th to Nth media objects over that of all the media objects. Representing this in terms of the skew factor, we notice that the upper bound is attained when the skew factor is not equal to 1. We can now find the byte hit ratio in this case. Again, we find the upper bound by using t = beta * U / alpha, where U is the total cache size divided by the average length of the objects, and t is the number of most popular objects, sorted by popularity, whose segments within the startup length would fill the reserved portion of the cache. The maximum delayed startup ratio occurs when the skew factor is not equal to 1, and we can express it in the same ratio of arrival rates as the definition of the delayed startup ratio.
Next, to derive an upper bound on the byte hit ratio in the ideal case, consider the misses that occur when an object is accessed for the first time. The byte hit ratio is expressed as 1 minus a ratio whose numerator is the startup length of the (t+1)th to Nth objects times the average arrival rate, plus the beyond-startup-length portion of the (q+1)th to Nth media objects times the average arrival rate, and whose denominator is the full length of all N media objects times the average arrival rate.
We subtract this ratio from one because it accounts for the bytes that miss the cache; the remainder corresponds to the bytes served from the cache.
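In symbols (my notation; lambda_i is the mean arrival rate of object i, L_i its full length, L_{s,i} its startup length, and t and q as defined above), the described bound reads roughly:

\[ \text{byte hit ratio} \;\le\; 1 - \frac{\sum_{i=t+1}^{N} \lambda_i L_{s,i} \;+\; \sum_{i=q+1}^{N} \lambda_i \,(L_i - L_{s,i})}{\sum_{i=1}^{N} \lambda_i L_i}. \]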
#codingexercise

double GetOddNumberRangeSumCubeRootPowerEighteen(double[] A)
{
    if (A == null) return 0;
    // extension method assumed to be defined in an earlier post
    return A.OddNumberSumCubeRootPowerEighteen();
}