Thursday, April 30, 2015

Today we continue reading the paper on DBPowder. In the previous post we were reviewing related technologies, and today we continue that discussion. Hibernate is mentioned as an ORM that handles complex correspondences. Here developers describe the mappings between the persistent classes and the relational schema in hbm files, and these files are used to generate code and tables. However, each time an hbm file needs to be updated, the change is quite disruptive. Even with annotations, the expressibility of the mapping is not improved. While we talk about ORM frameworks, we do not talk about the mark and sweep required for the Graph API; we read about those in distributed computing. Here we read the paper for improvements to the mapping framework. We were also hinting at writing a Graph API for CRUD on Active Directory objects. That cannot work unless CRUD on AD objects works via the LDAP protocol. We need proper credentials and a reserved OU, and even so, it will most likely require a ticketing framework so that changes are captured in an audit trail.

Wednesday, April 29, 2015

Today we continue reading the paper introducing DBPowder, as we did in the previous post. We said that for complex correspondences, the developer needs to specify the complete correspondence between persistent classes and tables, which is not always possible because they are subject to change. With DBPowder, simple and complex correspondences are supported through the collaboration of a conceptual model and a navigational description of data usage. During the initial rapid development, developers describe an Extended Entity-Relationship (EER) model and DBPowder generates the schema and classes from this model. During the later spiral development, DBPowder introduces ObjectView, a graph-based object description form over the EER model. Developers can refine their EER models and add ObjectViews, and DBPowder then refines the existing relational schema and persistent classes and adds persistent classes for the ObjectViews.
In the discussion of related work, the authors mention ActiveRecord, which is a one-to-one mapping between a class and a table: the attributes of an ActiveRecord class are defined by the columns of the table. The advantage of this approach is that it is simple, but it does not work for complex correspondences.
Another approach uses a conceptual model, e.g. an ER model, as the basis, as in WebML and the Enterprise Objects Framework. Here the schema and classes are generated from the conceptual model.
A third approach was introduced by Thiran et al., which uses a wrapper-based ORM and applies eight transformation operators, one after another, to a target relational schema. The descriptive power is limited to what these operators can express.
#codingexercise


using System;
using System.Linq;

double GetOddNumberRangePowerSeventhRootPowerTwenty(double[] A)
{
    if (A == null) return 0;
    // For each odd element, raise its seventh root to the twentieth power and
    // sum the results (one plausible reading of the exercise name).
    return A.Where(x => x % 2 != 0)
            .Sum(x => Math.Pow(Math.Pow(x, 1.0 / 7.0), 20.0));
}

Tuesday, April 28, 2015

Today we will start reading another research paper. This one is titled "DBPowder: A flexible object-relational mapping framework based on a conceptual model", written by Murakami, Amagasa, Kitagawa et al. Object-relational mapping (ORM) frameworks are well known to web application developers for the convenience they provide in programming data objects and persisting them to a database. The relational schema is abstracted by an object graph that enables seamless data persistence. The authors begin by recognizing that there are many such ORM frameworks and most of them have to compromise between two contradictory requirements: 1) support for persistent classes that are directly mapped to relational tables, and 2) support for the complicated compositions of base classes that the application requires. These two strong requirements become clearer as the application evolves. Initially a one-to-one mapping between each persistent class and its respective table is required, and as development proceeds, more complicated correspondences between persistent classes and relational tables are needed. For this reason, it is desirable for an ORM framework to be flexible enough to cover both ends of this spectrum at different development stages.
In this paper, the authors do that by proposing a framework called DBPowder that addresses the difficulty of handling simple and complex correspondences. DBPowder supports direct mapping to tables with an Extended Entity-Relationship (EER) model, and it supports complicated compositions with ObjectView, a graph-based object description form over the EER model. The EER model and ObjectView together provide the flexibility that is required at different development stages.
Throughout the paper the authors use the terms simple correspondence and complex correspondence for 1) and 2) above. They describe the concerns of an ORM as: a) persistent classes that deal with the persistent data in an object-oriented language, b) a relational schema to manage the persistent data in the RDB, and c) data conversion between object states and RDB queries and responses.
If we take the example of users and hosts, which have a many-to-many relationship captured in a third join table, then a simple correspondence maps each class to its own table, while more complicated mappings are possible by defining the correspondence by hand, as in the sketch below.
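
As a rough illustration of the simple correspondence, here is a sketch in Go (the struct and field names are our own, not the paper's): each persistent class maps one-to-one to a table, and the many-to-many relationship is carried by a third join-table class.

package main

import "fmt"

// User maps one-to-one to the user table (simple correspondence).
type User struct {
        UserID int
        Name   string
}

// Host maps one-to-one to the host table.
type Host struct {
        HostID   int
        Hostname string
}

// UserHost maps to the join table that carries the many-to-many
// relationship. A complex correspondence would instead expose a
// composed class, e.g. a User together with all of its Hosts,
// mapped by hand over all three tables.
type UserHost struct {
        UserID int
        HostID int
}

func main() {
        link := UserHost{UserID: 1, HostID: 2}
        fmt.Println(link)
}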

Monday, April 27, 2015

Today we start reading up on the Azure AD Graph API. It lets different web clients, applications, and services perform CRUD on AD artifacts directly. We will also review some sample code that connects to this Graph API. A Graph API is as relevant to a cloud API provider as a portal is to the user. In this section, we cover the usability aspects of the Graph API.
First of all, the Azure AD Graph API lists the entities and types and the operations that can be performed on them. These are REST-based APIs, so they follow common REST conventions. The entities exposed by the API include Application, AppRoleAssignment, Contact, Device, DirectoryLinkChange, DirectoryObject, DirectoryRole, DirectoryRoleTemplate, ExtensionProperty, Group, OAuth2PermissionGrant, ServicePrincipal, SubscribedSKU, TenantDetail, User, etc.
The complex types exposed by the API include AlternativeSecurityId, AppRole, AssignedLicense, AssignedPlan, KeyCredential, LicenseUnitsDetail, OAuth2Permission, PasswordCredential, PasswordProfile, ProvisionedPlan, ProvisioningError, RequiredResourceAccess, ResourceAccess, ServicePlanInfo, ServicePrincipalAuthenticationPolicy, VerifiedDomain, etc.
In addition, the API is popular for the operations on the AD artifacts such as Users, Groups, Roles and Contacts.
The operations on Users include CreateUser, GetUser(s), UpdateUser, DeleteUser, GetUserDirectReports, GetUserGroupMemberships, GetUserManager, UpdateUserManager, and ResetUserPassword.
The operations on Groups include CreateGroup, GetGroup(s), UpdateGroup, DeleteGroup, GetGroupMembers, AddMemberToGroup, RemoveMemberFromGroup, CheckGroupMembership, CheckGroupMembershipInList, and GetAllGroupMemberships (transitive).
Roles and Contacts can only be listed with the Graph API.
DirectoryExtensions can also be created, updated, and deleted with these APIs. Directory extensions enable application developers to extend the directory and develop richer applications without worrying about the limitations of an external store. For example, User, Group, TenantDetail, Device, Application, and ServicePrincipal can be extended with single-valued attributes of String or Binary type. A single string attribute can take, say, 256 characters, and 100 such extension values are permitted on a single object. A SkypeID for a user is one example of such an extension.
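
As a usability sketch, a GET on the users collection might look like the following in Go. This is a minimal sketch: the tenant domain and bearer token are placeholders, a real call needs a valid OAuth2 access token acquired from Azure AD, and the api-version value shown is only indicative.

package main

import (
        "fmt"
        "io/ioutil"
        "net/http"
)

func main() {
        // Placeholder tenant; substitute your own tenant domain or tenant ID.
        url := "https://graph.windows.net/contoso.onmicrosoft.com/users?api-version=1.5"

        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
                panic(err)
        }
        // Placeholder token; obtain a real one via the OAuth2 flow.
        req.Header.Set("Authorization", "Bearer <access-token>")

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
                panic(err)
        }
        defer resp.Body.Close()

        body, err := ioutil.ReadAll(resp.Body)
        if err != nil {
                panic(err)
        }
        fmt.Println(resp.Status)
        fmt.Println(string(body)) // JSON array of user entities
}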

#codingexercise


using System;
using System.Linq;

double GetOddNumberRangeSumSeventhRootPowerTwenty(double[] A)
{
    if (A == null) return 0;
    // Sum, over the odd elements, of the seventh root raised to the
    // twentieth power (one plausible reading of the exercise name).
    return A.Where(x => x % 2 != 0)
            .Sum(x => Math.Pow(Math.Pow(x, 1.0 / 7.0), 20.0));
}

Sunday, April 26, 2015

Today we take a break to cover the Go language. It is an open-source programming language and works on Linux, Mac, Windows, and more. Code is brief and explains itself succinctly. Code is organized into packages. Functions, types, constants, and variables have to be exported from a package (by capitalizing their first letter) before they can be used with their package qualifiers.
In Go code, errors are values, checked with if err != nil syntax.
As an example (assuming a scanner whose Scan returns a token and an error):
token, err := scanner.Scan()
if err != nil {
        return err
}
Multivalue returns from functions are common. Most library code has only one or two error checks per file. Types can keep the data, the operation, and the error together, so there is no need for callers to check after each call on such a type: they can make multiple calls and let an earlier error cause subsequent calls to become no-ops, checking once at the end, as in the sketch below.
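
A minimal sketch of that pattern (the safeWriter name is ours; bufio.Writer in the standard library behaves this way):

package main

import (
        "io"
        "os"
)

// safeWriter keeps the data, the operation, and the error together:
// the first error is recorded and later writes become no-ops.
type safeWriter struct {
        w   io.Writer
        err error
}

func (sw *safeWriter) write(p []byte) {
        if sw.err != nil {
                return // an earlier write already failed
        }
        _, sw.err = sw.w.Write(p)
}

func main() {
        sw := &safeWriter{w: os.Stdout}
        sw.write([]byte("hello, "))
        sw.write([]byte("world\n"))
        if sw.err != nil { // single check for the whole sequence
                panic(sw.err)
        }
}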
Code is automatically organized into bin, pkg, src and referenced via paths.
Testing is available via a lightweight test framework whose test functions have the signature func (t *testing.T) and are named TestFoo, TestBar, etc. The tests report failure through functions such as t.Error or t.Fail.
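
A minimal test file might look like this (the sum function and package are made up for illustration); saved as sum_test.go, it runs with go test:

package sum

import "testing"

func sum(a, b int) int { return a + b }

// TestSum follows the TestXxx naming convention and reports
// failure through the *testing.T parameter.
func TestSum(t *testing.T) {
        if got := sum(1, 2); got != 3 {
                t.Errorf("sum(1, 2) = %d; want 3", got)
        }
}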
Formatted output follows a simple C-like syntax, with fmt.Printf statements or log.Println.
Interfaces are named by convention with an -er suffix after their primary behavior: Reader, Writer, Stringer.
The allocation primitives are new and make. new does not initialize memory beyond zeroing it; it returns the address of a zeroed value. A zero value is often directly usable: the zero value of a Mutex is an unlocked mutex, and the zero value of a Buffer is an empty buffer ready to use. Consequently, types built from buffers and mutexes can be allocated with plain new and used immediately.
make creates slices, maps, and channels only, and it returns an initialized (not zeroed) value of the type. References to other data structures must be initialized separately, as the contrast below shows.
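
A quick contrast of the two:

package main

import "fmt"

func main() {
        p := new([]int)           // p has type *[]int; *p is a zeroed (nil) slice
        s := make([]int, 0, 10)   // an initialized slice: length 0, capacity 10
        m := make(map[string]int) // an initialized, empty map, safe to write to

        *p = append(*p, 1) // append works even on the zero-valued slice
        m["one"] = 1
        fmt.Println(*p, s, m)
}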
Data is represented with arrays, slices, and maps. Arrays and slices are one-dimensional; slices can be of variable length. Maps associate values of one type with values of another type.
Methods can be defined for any named type except pointers and interfaces, and the receiver doesn't have to be a struct.
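
For example, a method on a named numeric type:

package main

import "fmt"

// Celsius is a named type whose underlying type is float64, not a struct.
type Celsius float64

// Fahrenheit is a method with a non-struct receiver.
func (c Celsius) Fahrenheit() float64 {
        return float64(c)*9/5 + 32
}

func main() {
        fmt.Println(Celsius(100).Fahrenheit()) // 212
}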

Saturday, April 25, 2015

We continue the review of the performance results from the study of the Hyper Proxy streaming proxy design. We saw that with the workload we labeled 'web', Hyper Proxy provides the best continuous streaming service to the client compared to the proxy hit and proxy startup hit schemes. Hyper Proxy can reduce proxy jitter by nearly 50% when the cache size is about 20% of the total object size.
Similar results are observed for the PART workload, as shown in Figure 3. When the cache size is about 20% of the object size, Hyper Proxy reduces proxy jitter by 50% while giving up less than 5% in byte hit ratio. For reducing the delayed startup ratio, the proxy startup hit scheme achieves the best performance. This result is expected because that scheme targets the reduction of the delayed startup ratio. In contrast, Hyper Proxy aggressively reduces proxy jitter by keeping more segments, so its cache space may be taken by media objects whose sessions terminate early; this lowers its effectiveness on the delayed startup ratio. Lastly, with the real workload, Hyper Proxy works best both on each metric individually and overall: it performs better in reducing proxy jitter and delayed startup while keeping the degradation in byte hit ratio within tolerable limits.
In conclusion, we see that proxy designs that targeted the byte hit ratio can be improved by targeting proxy jitter instead, because the byte hit ratio does not account for continuous media delivery, which matters more for streaming. The authors of this paper are credited with an optimization model that improves performance against proxy jitter with a small tradeoff in byte hit ratio; this tradeoff was elaborated in the previous discussions. Using this model, the authors propose an active prefetching method that determines which segment to bring into the cache and when. Lastly, by combining prefetching with the proxy caching schemes, the authors propose a Hyper Proxy system that performs well in all of the performance studies mentioned.
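
A minimal sketch of the scheduling idea behind active prefetching (this is our paraphrase, not the paper's exact algorithm; the function and parameter names are ours): start fetching the next uncached segment just early enough that it arrives before playback needs it.

package main

import "fmt"

// prefetchStart returns how many seconds into playback of the cached
// prefix the proxy should begin fetching the next segment so that the
// fetch completes just in time for continuous delivery.
func prefetchStart(cachedSeconds, segmentBytes, bandwidthBytesPerSec float64) float64 {
        fetchSeconds := segmentBytes / bandwidthBytesPerSec
        start := cachedSeconds - fetchSeconds
        if start < 0 {
                return 0 // bandwidth too low: fetch immediately; some jitter is unavoidable
        }
        return start
}

func main() {
        // 60 s cached, next segment 15 MB, bandwidth 1 MB/s: start fetching at t = 45 s.
        fmt.Println(prefetchStart(60, 15e6, 1e6))
}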

Friday, April 24, 2015

Today we continue discussing the remaining modules of the Hyper Proxy system and the results. Two kinds of synthetic workloads were used. The first varied the lengths of the media objects, and the second varied the access times of media objects so that a session could end before the full object was downloaded. In addition, a third workload captured from real traffic on a server was also used. These three show different characteristics. Two metrics are used to evaluate the workloads: the delayed startup ratio and the byte hit ratio. The first is the total number of startup-delayed requests normalized by the total number of requests. The second is the total amount of data delivered from the proxy cache divided by the total amount demanded by all the clients. We also want to reduce the proxy jitter bytes.
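
Restating those two definitions as formulas:

\[
\text{delayed startup ratio} = \frac{\text{number of requests with a delayed startup}}{\text{total number of requests}},
\qquad
\text{byte hit ratio} = \frac{\text{bytes delivered from the proxy cache}}{\text{bytes demanded by all clients}}
\]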
We now evaluate the performance on the workloads, which we label Web for the first, Part for the second, and Real for the third. The proxy cache system was also varied across three schemes: proxy hit, which represents adaptive lazy segmentation with active prefetching; proxy startup hit, which represents the improved lazy segmentation scheme with active prefetching; and proxy jitter, which represents the Hyper Proxy system.
For the web workload, Hyper Proxy provides the best continuous streaming service to the clients, while the proxy hit scheme performs worst since it optimizes for the byte hit ratio. This is most notable when the cache size is 20% of the total object size, in which case the reduction in proxy jitter with Hyper Proxy is nearly 50%.
Hyper proxy achieves the lowest delayed startup ratio followed closely by the proxy startup hit scheme.
The hyper proxy achieves a relatively low byte hit ratio, which means a smaller reduction in network traffic.