Cluster computing

Monday, April 27, 2015

Today we start reading up on the Azure AD Graph API. It lets web clients, applications and services perform CRUD operations on AD artifacts directly. We will also review some sample code that connects to the Graph API. Graph APIs are as relevant to a cloud API provider as a portal is to the user. In this section, we cover the usability aspect of the Graph APIs.
First of all, the Azure AD Graph API lists entities and types and the operations that can be performed on them. These are REST-based APIs, so they follow the common REST conventions. The entities exposed by the APIs include Application, AppRoleAssignment, Contact, Device, DirectoryLinkChange, DirectoryObject, DirectoryRole, DirectoryRoleTemplate, ExtensionProperty, Group, OAuth2PermissionGrant, ServicePrincipal, SubscribedSKU, TenantDetail, User etc.
The complex types exposed by this API include AlternativeSecurityId, AppRole, AssignedLicense, AssignedPlan, KeyCredential, LicenseUnitsDetail, OAuth2Permission, PasswordCredential, PasswordProfile, ProvisionedPlan, ProvisioningError, RequiredResourceAccess, ResourceAccess, ServicePlanInfo, ServicePrincipalAuthenticationPolicy, VerifiedDomain etc.
In addition, the API is popular for its operations on AD artifacts such as Users, Groups, Roles and Contacts.
The operations on Users include CreateUser, GetUser(s), UpdateUser, DeleteUser, GetUserDirectReports, Get User's GroupMemberships, Get User's Manager, Update User's Manager and Reset User's Password.
The operations on Groups include CreateGroup, GetGroup(s), UpdateGroup, DeleteGroup, GetGroupMembers, AddaMemberToAGroup, RemoveaMemberFromAGroup, CheckGroupMembership, CheckGroupMembershipInList and GetAllGroupMemberships (transitive).
Roles and Contacts can only be listed with these Graph APIs.
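Since these are REST APIs, an operation such as GetUsers comes down to an HTTP GET on a collection URL. The following is a minimal sketch in Go of how such a URL could be put together. The host graph.windows.net, the api-version query parameter, the version value 1.5 and the tenant name contoso.onmicrosoft.com are assumptions made for illustration, and graphURL is a hypothetical helper, not part of any SDK.

```go
package main

import (
	"fmt"
	"net/url"
)

// graphURL builds the REST URL for a collection of directory entities.
// The host and the api-version parameter are assumptions of this sketch.
func graphURL(tenant, entity, apiVersion string) string {
	u := url.URL{
		Scheme: "https",
		Host:   "graph.windows.net",
		Path:   "/" + tenant + "/" + entity,
	}
	q := url.Values{}
	q.Set("api-version", apiVersion)
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	// The tenant name and version below are placeholders.
	fmt.Println(graphURL("contoso.onmicrosoft.com", "users", "1.5"))
}
```

A real call would also need an OAuth bearer token in the Authorization header, which is left out here.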
DirectoryExtensions can also be created, updated and deleted with these APIs. Directory extensions enable application developers to extend the directory and develop richer applications without worrying about the limitations imposed by an external store. For example, User, Group, TenantDetail, Device, Application and ServicePrincipal can be extended with single-valued attributes of String or Binary type. A single string attribute can take, say, 256 characters, and 100 such extension values are permitted on a single object. A SkypeID for a user is one example of such an extension.

#codingexercise

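Double GetOddNumberRangeSumSeventhRootPowerTwenty (Double [] A)
{
// OddNumberSumSeventhRootPowerTwenty is assumed to be an extension method defined elsewhere.
if (A == null) return 0;
return A.OddNumberSumSeventhRootPowerTwenty();
}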

Sunday, April 26, 2015

Today we take a break to cover the Go language. It is an open source programming language and works on Linux, Mac, Windows and more. Code is brief and explains itself succinctly. Code is organized into packages. Functions, types, constants and variables have to be exported from a package before they can be used elsewhere with their package qualifiers.
In Go code, errors are values, checked with if err != nil syntax.
As an example:
token, err := scanner.Scan()
if err != nil {
    return err
}
Multi-value returns from functions are common. Most library code has only one or two checks per file. Types keep the data, operation and error together, so there is no need for callers to check after each call on a type. They can make multiple calls and assume that an earlier error would have caused the subsequent calls to fail.
Code is organized by convention into bin, pkg and src directories and referenced via import paths.
Testing is available via a lightweight test framework in which test functions take the form func TestFooBar(t *testing.T). The tests call a failure function such as t.Error or t.Fail to signal a problem.
Formatting follows a simple C-like syntax, with fmt.Printf or log.Println statements.
Interfaces are named by convention with an -er suffix after their primary behaviour.
Allocation primitives are new and make. new does not initialize memory; it only zeroes it and returns the address of the zeroed value. A zero value for a mutex is an unlocked mutex. A zero value of a Buffer means it is empty and ready to use. Consequently, a type composed of buffers and mutexes is ready to use immediately after allocation with new.
make creates slices, maps and channels only, and it returns an initialized (not zeroed) value of type T, not *T. These three types are references to data structures that must be initialized before use.
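The difference between the two primitives is easier to see side by side. Below is a small sketch: the SyncedBuffer type and its Append method are made up for this example, while sync.Mutex and bytes.Buffer are the standard library types whose zero values are usable.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// SyncedBuffer is usable as soon as it is allocated: the zero Mutex is
// unlocked and the zero Buffer is empty.
type SyncedBuffer struct {
	lock   sync.Mutex
	buffer bytes.Buffer
}

// Append writes s under the lock and returns the new buffer length.
func (b *SyncedBuffer) Append(s string) int {
	b.lock.Lock()
	defer b.lock.Unlock()
	b.buffer.WriteString(s)
	return b.buffer.Len()
}

func main() {
	p := new(SyncedBuffer) // *SyncedBuffer pointing at zeroed storage
	fmt.Println(p.Append("go"))

	s := make([]int, 3, 10)   // initialized slice: length 3, capacity 10
	m := make(map[string]int) // initialized map, ready for writes
	m["a"] = 1
	fmt.Println(len(s), cap(s), m["a"])

	ps := new([]int) // pointer to a nil slice, rarely what is wanted
	fmt.Println(*ps == nil)
}
```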
Data is represented as arrays, slices and maps. Arrays and slices are one-dimensional. Slices can be of variable length. Maps associate values of one type with values of another type.
Methods can be defined for any named type except pointers or interfaces, and the receiver doesn't have to be a struct.
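As a quick illustration of methods on non-struct receivers, here is a sketch with two named types chosen for this example, one based on float64 and one on a slice.

```go
package main

import "fmt"

// Celsius is a named type whose underlying type is float64, not a struct.
type Celsius float64

// Fahrenheit converts the temperature using a value receiver.
func (c Celsius) Fahrenheit() float64 {
	return float64(c)*9/5 + 32
}

// IntList is a named slice type.
type IntList []int

// Sum adds up the elements of the list.
func (l IntList) Sum() int {
	total := 0
	for _, v := range l {
		total += v
	}
	return total
}

func main() {
	fmt.Println(Celsius(100).Fahrenheit())
	fmt.Println(IntList{1, 2, 3}.Sum())
}
```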

Saturday, April 25, 2015

We continue the review of the performance results from the study of the hyper proxy streaming proxy design. We saw that with the workload we labeled 'web', the hyper proxy provides the best continuous streaming service to the client as compared to the proxy hit and the proxy startup hit schemes. Hyper proxy can reduce jitter by nearly 50% when the cache size is nearly 20% of the total object size.
Similar results are observed for the PART workload as shown in Figure 3. When the cache size is nearly 20% of the object size, hyper proxy reduces proxy jitter by 50% while giving up less than 5% in the byte hit ratio. For reducing the delayed startup ratio, the proxy startup hit scheme achieves the best performance. The result is somewhat expected because the scheme targets the reduction of the delayed startup ratio. In contrast, hyper proxy aggressively reduces proxy jitter by keeping more segments, so cache space may be taken up by media objects whose sessions are terminated early. This lowers the effectiveness of hyper proxy on the delayed startup ratio. Lastly, with the real workload, hyper proxy works best for each metric individually and overall. It performs better in reducing proxy jitter and delayed startup while keeping the degradation in byte hit ratio within tolerable limits.
In conclusion, we see that proxy designs that target byte hit ratio can be improved by targeting proxy jitter instead, because byte hit ratio does not address continuous media delivery, which is more important for streaming purposes. The authors of this paper are credited with an optimization model that improves performance against proxy jitter with a small tradeoff: a slight decrease in byte hit ratio. This tradeoff has been elaborated in the previous discussions. Using this model, the authors have proposed an active prefetching method that determines which segment to bring into the cache and when. Lastly, by combining prefetching with proxy caching schemes, the authors have proposed a hyper proxy system that performs well across all the performance metrics mentioned.

Friday, April 24, 2015

Today we continue discussing the remaining modules of the Hyper proxy system and the results. There were two kinds of synthetic workloads used. The first varied the lengths of the media objects, and the second varied the access times of media objects such that the session would close before the full object is downloaded. In addition, a third workload involving a capture from real traffic on a server was also used. These three showed different characteristics. We use two metrics to evaluate the workloads: the delayed startup ratio and the byte hit ratio. The first is the total number of startup-delayed requests normalized by the total number of requests. The second is the total amount of data delivered to clients from the proxy cache divided by the total amount demanded by all the clients. We also want to reduce the jitter byte ratio.
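The two metrics are simple ratios over the request log. Here is a rough sketch of how they might be computed. The Request record and its fields are hypothetical, and the byte hit ratio is taken here as bytes served from the proxy cache over bytes demanded, which is an assumption of this sketch.

```go
package main

import "fmt"

// Request is a hypothetical record of one client request.
type Request struct {
	StartupDelayed bool  // true if playback could not start right away
	BytesDemanded  int64 // bytes the client asked for
	BytesFromCache int64 // bytes served out of the proxy cache (assumed meaning)
}

// delayedStartupRatio is the number of startup-delayed requests
// normalized by the total number of requests.
func delayedStartupRatio(reqs []Request) float64 {
	if len(reqs) == 0 {
		return 0
	}
	delayed := 0
	for _, r := range reqs {
		if r.StartupDelayed {
			delayed++
		}
	}
	return float64(delayed) / float64(len(reqs))
}

// byteHitRatio is the bytes served from the cache divided by the
// bytes demanded by all clients.
func byteHitRatio(reqs []Request) float64 {
	var fromCache, demanded int64
	for _, r := range reqs {
		fromCache += r.BytesFromCache
		demanded += r.BytesDemanded
	}
	if demanded == 0 {
		return 0
	}
	return float64(fromCache) / float64(demanded)
}

func main() {
	reqs := []Request{
		{false, 100, 50},
		{true, 100, 0},
		{false, 100, 100},
		{false, 100, 50},
	}
	fmt.Println(delayedStartupRatio(reqs), byteHitRatio(reqs))
}
```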
We now evaluate the performance on the workloads, which we label Web for the first, Part for the second and Real for the third. The proxy cache system was also varied across three different schemes. The Proxy Hit scheme represents adaptive lazy segmentation with active prefetching. The Proxy Startup Hit scheme represents the improved lazy segmentation scheme with active prefetching. Lastly, the Proxy Jitter scheme represents the hyper proxy system.
For the web workload, the Hyper Proxy provides the best continuous streaming service to the clients, while the Proxy Hit scheme performs worst since it aims at increasing the byte hit ratio. This is most notable when the cache size is 20% of the total object size, in which case the reduction in proxy jitter is nearly 50% with the hyper proxy.
Hyper proxy achieves the lowest delayed startup ratio, followed closely by the proxy startup hit scheme.
The hyper proxy achieves a relatively low byte hit ratio, which means a smaller reduction of network traffic.

Thursday, April 23, 2015

We discussed the hyper proxy modules for active prefetching and the lazy segmentation strategy. We continue discussing the remaining two modules. We briefly looked at the replacement policy and the maintenance of two lists - the basic and premium lists. The utility value of each object is updated after each replacement. Even after an object is fully evicted, the system keeps its access log. Otherwise, when the object is accessed again, it would be fully cached again. One characteristic of media objects is that their popularity diminishes as time goes by, so recaching the full length of the object is wasteful. Consequently, keeping the access log is relevant and recommended.
To keep access logs from proliferating, a large enough timeout threshold is set so that the proxy eventually deletes them.
We now look at performance evaluation. To evaluate the performance of the proxy, it was tested on several workloads - both real and synthetic. All the workloads assumed that the popularity of media objects follows a Zipf-like distribution with a skew factor theta, and that request arrivals follow a Poisson distribution with a mean interval lambda.
The first synthetic workload simulates accesses to media objects in the web environment, in which media lengths vary from short to long. All other parameters are kept the same.
The second workload simulates web access in which clients abort: a started session terminates before the full media object is delivered. Nearly 80% of the sessions terminated before 20% of the object was delivered.
The third workload is a real capture covering a period of 10 days from an actual server.
There are a total of 403 objects, and the unique object size amounts to 20GB. A total of 9000 requests were made during the period mentioned for the real workload.

Wednesday, April 22, 2015

We discuss the remaining three major modules in the HyperProxy design. Active prefetching helps with every subsequent access to a media object. It helps determine when to prefetch which segment.
In the case when no segment is cached, the prefetching of the Bs/Bt segment is considered. The new segment admission is marked with priority.
If the number of cached segments is less than the Bs/Bt threshold, then proxy jitter is unavoidable. The new segment admission is marked with priority.
If there are more segments cached than the said threshold, the prefetching of the next segment starts at an offset of Bs/Bt segments from this candidate. The new segment admission is marked with non-priority.
Bs is the encoding rate of the segment and Bt is the network bandwidth of the proxy server link.
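The three cases above can be summarized in code. This is only a sketch of the case analysis as described in this post, not the paper's algorithm: prefetchPlan and planPrefetch are hypothetical names, and the handling of a count exactly equal to Bs/Bt is an assumption since the description does not cover it.

```go
package main

import "fmt"

// prefetchPlan is a hypothetical summary of what the proxy decides.
type prefetchPlan struct {
	Case     string
	Priority bool // whether newly admitted segments are marked with priority
}

// planPrefetch compares the number of cached segments with the
// threshold Bs/Bt (encoding rate over proxy-server link bandwidth).
func planPrefetch(cached int, bs, bt float64) prefetchPlan {
	threshold := bs / bt
	switch {
	case cached == 0:
		return prefetchPlan{"nothing cached", true}
	case float64(cached) < threshold:
		return prefetchPlan{"below threshold, jitter unavoidable", true}
	default:
		// Assumption: a count equal to the threshold is grouped with
		// the "more than threshold" case.
		return prefetchPlan{"at or above threshold", false}
	}
}

func main() {
	// Illustrative rates: 300 kbps encoding, 100 kbps link, threshold 3.
	for _, n := range []int{0, 2, 5} {
		fmt.Printf("%d cached: %+v\n", n, planPrefetch(n, 300, 100))
	}
}
```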
Active prefetching of exponentially segmented object.
Next we consider the lazy segmentation strategy. When there is no cache space available and replacement is needed, the replacement policy kicks in and calculates the caching utility of each cached object. Items with the smallest utility value are evicted first. If the object is fully cached, the object is segmented uniformly based on the average access duration at the time. Then a certain number of segments are retained and the rest are evicted.
The replacement policy uses a formula to determine the utility value, and the eviction is based on the maintenance of two lists - the basic list and the premium list.

Tuesday, April 21, 2015

Today we continue discussing the hyper proxy from the paper on streaming proxy design. We saw that it maintains a data structure that keeps the threshold length, the average access duration, the access frequency F etc. The system maintains two media object lists - one premium list and one basic list. These lists help to find a victim and evict some of its segments. Then the segments of new objects are adaptively admitted by the admission policy.
There are four major modules in the Hyper Proxy caching system. These are:
1) Priority based admission policy
2) Active prefetching
3) Lazy segmentation policy
4) Differentiated replacement policy
We discuss these in detail in the next few posts.

#codingexercise

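Double GetOddNumberRangeProductFifthRootSumEighteen (Double [] A)
{
// OddNumberProductFifthRootSumEighteen is assumed to be an extension method defined elsewhere.
if (A == null) return 0;
return A.OddNumberProductFifthRootSumEighteen();
}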

We will continue our discussion about the major modules in the Hyper Proxy system.
Let's begin with the cache admission system. As we mentioned before, if a media object is requested for the first time, the object is cached in full. The replacement policy is activated if there is not sufficient space. Of the two lists, we look at the premium list first. We pick out an object that has no priority, and if no such object is found, we look at one with priority. The fully cached object is linked to the basic list and an access log is created. If the access log indicates that the object is already fully cached, the access log is merely updated; this case is handled for robustness.

The other three major modules are Active Prefetching, Lazy segmentation policy, differentiated replacement policy.

In the active prefetching module, partially cached objects, so called because they don't have all their segments in the cache, are examined to determine which segments should be prefetched.

In the lazy segmentation module, the average access duration at the current time instance is calculated. It is used as the length of the base segment, and the object is then segmented uniformly. A number of segments determined by the ratio of the threshold length to the base length is retained, while the rest are evicted.
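The arithmetic behind lazy segmentation can be sketched in a few lines. This is an illustration under assumptions: lazySegment is a hypothetical helper, lengths are in seconds, and rounding up to whole segments is a choice made here since the description above does not specify it.

```go
package main

import (
	"fmt"
	"math"
)

// lazySegment splits an object into uniform segments whose length equals
// the average access duration (the base segment length) and reports how
// many segments exist and how many are retained. Rounding up is an
// assumption of this sketch.
func lazySegment(objectLen, avgAccess, thresholdLen float64) (total, kept int) {
	total = int(math.Ceil(objectLen / avgAccess))
	kept = int(math.Ceil(thresholdLen / avgAccess))
	if kept > total {
		kept = total
	}
	return total, kept
}

func main() {
	// A 600 s object, 60 s average access duration, 150 s threshold length.
	total, kept := lazySegment(600, 60, 150)
	fmt.Printf("segments=%d retained=%d evicted=%d\n", total, kept, total-kept)
}
```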

The differentiated replacement policy gives a higher priority to reducing proxy jitter, reduces erroneous replacement decisions, and gives replaced segments a fair chance to be cached back into the proxy by the admission policy should the media objects be accessed again.