Sunday, May 31, 2015

We were discussing Riverbed Operating System (RiOS) technical concepts. We now review connection pooling and SSL acceleration. Applications open many connections during their execution. Many of these are short-lived, yet each requires significant overhead to initiate communications. If, for example, loading a webpage requires a client to open many TCP connections, the setup cost of each one slows down the application. RiOS can maintain a pool of open connections for these purposes. Here the connections are already open and there is no setup overhead associated with using one. Each such pre-opened connection is ready for data transfer and is not considered dirty from previous use. Connection pooling can reduce connection-setup overhead by as much as 50%.
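To make the idea concrete, here is a minimal sketch of a client-side connection pool, written in Java. The class name, the pool size, and the target address are illustrative assumptions and not RiOS code; the point is simply that a borrowed socket is already open, so a request skips the TCP handshake.

import java.io.IOException;
import java.net.Socket;
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal, illustrative client-side connection pool (not RiOS code).
public class PooledConnector {
    private final String host;
    private final int port;
    private final Deque<Socket> idle = new ArrayDeque<>();

    public PooledConnector(String host, int port, int warmCount) throws IOException {
        this.host = host;
        this.port = port;
        // Pre-open connections so that later requests skip the TCP handshake.
        for (int i = 0; i < warmCount; i++) {
            idle.push(new Socket(host, port));
        }
    }

    // Hand out a pre-opened connection; fall back to a fresh one if the pool is empty.
    public synchronized Socket borrow() throws IOException {
        Socket s = idle.poll();
        return (s != null && s.isConnected() && !s.isClosed()) ? s : new Socket(host, port);
    }

    // Return a still-usable ("clean") connection to the pool for reuse.
    public synchronized void release(Socket s) {
        if (!s.isClosed()) {
            idle.push(s);
        }
    }
}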
In addition to connection pooling, another common practice is the use of SSL on application connections. SSL comes with different trust models. RiOS can accelerate SSL traffic while letting private keys be held in the data center, without requiring fake certificates. Both the RiOS appliance and client can auto-discover their peers and begin optimizing SSL traffic, and RiOS provides central management of SSL acceleration capabilities. Allowing the organization to use its own certificates for SSL connections, without keeping fake certificates or server private keys in branch offices, makes RiOS flexible and improves security. RiOS distributes only temporary session keys to branch office appliances.
RiOS optimizes both HTTP and HTTPS traffic. For static web content, a "learning mechanism" allows RiOS clients to track the objects requested for a particular web page and accelerate future requests by using the learned information to pre-fetch associated content. In addition, content that would normally be fetched sequentially is sent in parallel, creating additional optimization benefits.
#codingexercise
using System; using System.Linq;
double GetNthRootSumOddRaisedPDividedQAndEvenRaisedPTimesQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // One reading of the name: raise odd values to p/q and even values to p*q,
    // sum the results, then take the A.Length-th root of the sum.
    double sum = A.Sum(x => Math.Pow(x, x % 2 != 0 ? p / q : p * q));
    return Math.Pow(sum, 1.0 / A.Length);
}

Saturday, May 30, 2015

We were discussing Riverbed Operating System (RiOS) technical concepts. We now look at the use of Window Scaling and Virtual Window Expansion. Window Scaling is a standard TCP technique to increase net throughput when the TCP window is the bottleneck. The maximum amount of data per round trip goes up because the number of bytes that can be "in flight" without being acknowledged is increased. Although window scaling is part of standard TCP implementations, the configuration lies with the user and often requires esoteric settings to be tweaked.
This user configuration is avoided by RiOS with automatic window scaling. RiOS virtually expands the TCP windows and enables capacity that is hundreds of times greater than basic TCP payloads. As a TCP proxy, RiOS effectively repacks TCP payloads with references instead of the actual data. Since, as noted earlier, the use of references avoids a lot of data duplication and a single proprietary hierarchical reference can stand in for several indexed segments, the TCP frame is virtually expanded, often by a factor of several hundred or more. Consequently, fewer round trips are needed to deliver a given amount of data. This use of references to repack the TCP data is referred to as Virtual Window Expansion.
RiOS also makes use of High Speed TCP and Max Speed TCP. These are techniques used for links with high latency or high packet loss. They can accelerate TCP-based applications so that a single connection runs at hundreds of Mbps even when round-trip latencies are high. The potential benefits include higher throughput; faster replication, backup, and mirroring; and better utilization of links. These techniques do not need the bandwidth to be pre-determined; they self-adjust the transmission for appropriate throughput. The difference between the two is that High Speed TCP backs down in speed as a result of packet loss or congestion, whereas Max Speed TCP is designed to use a set amount of bandwidth regardless of congestion or packet loss.

#codingexercise

using System; using System.Linq;
double GetNthRootSumOddRaisedPTimesQAndEvenRaisedPDividedQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // Mirror of the previous exercise: odd values raised to p*q, even values to p/q,
    // summed, then the A.Length-th root of the sum.
    double sum = A.Sum(x => Math.Pow(x, x % 2 != 0 ? p * q : p / q));
    return Math.Pow(sum, 1.0 / A.Length);
}


Friday, May 29, 2015

We were discussing Riverbed Operating System (RiOS) technical concepts. For data deduplication, RiOS intercepts and analyzes TCP traffic, segmenting the data and indexing it. It replaces duplicate segments with proprietary hierarchical references that save massive amounts of data. The segments are compressed using a Lempel-Ziv based algorithm that achieves good peak compression ratios. The size of the segments stored on disk is approximately 100 bytes, allowing detection of fine-grained changes in the traffic. Since the comparison is at the byte level, the same approach works for both encrypted and unencrypted traffic. This scalable data reduction (SDR) technique can be performed interchangeably on disk or in memory, or adaptively in either or both locations; the best fit for particular connections or the overall workload determines the strategy. Since disk accesses are involved, the appliances are fitted with solid state disks for better performance.
It is important to note that segments are byte chunks, agnostic of the metadata on the corresponding data transfer. Filenames, e-mails, objects, or whatever else a segment belongs to are not recognized and do not affect segmentation. This is both a convenience and a limitation. The convenience is that changes are detected even when fine grained, and handled consistently regardless of file renaming, object cloning, or e-mail replies. The limitation is that an application protocol tagging the traffic with time- or space-based metadata could have supplied more information to SDR. In any case, SDR has to work across all TCP traffic, and it therefore differs from cache-based solutions that cannot recognize data as the same when tracking file- or object-based entities.
RiOS overcomes the chattiness of transport protocols through transport streamlining, a set of techniques that reduce the number of round trips necessary to transport information across the WAN while maintaining the reliability and resiliency of the transport. The techniques involve window scaling, intelligent repacking of payloads, connection management, and other protocol optimizations. These are standard techniques, so RiOS remains true to fundamentals such as congestion control, error detection, and good-neighbor connection management without adding extra requirements such as a tunnel, proprietary protocols, or other non-standard protocol optimizations. Consequently, this native transport streamlining design avoids issues such as traffic mixing, MTU sizes, or TCP-over-TCP that affect alternative, tunnel-based designs.

#codingexercise
using System; using System.Linq;
double GetNthRootSumOddRaisedPMinusQAndEvenRaisedPPlusQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // Assumed reading: odd values raised to (p - q), even values to (p + q),
    // summed, then the A.Length-th root of the sum.
    double sum = A.Sum(x => Math.Pow(x, x % 2 != 0 ? p - q : p + q));
    return Math.Pow(sum, 1.0 / A.Length);
}

Thursday, May 28, 2015

We were discussing Riverbed Operating System (RiOS) technical concepts. We know that RiOS can accelerate TCP-based connections. These include, but are not limited to, CIFS, NFS, FTP, HTTP, HTTPS, and database connections. Data streamlining reduces bandwidth consumption by 60 to 95%.
We now look at the specifics of data deduplication. RiOS intercepts and analyzes TCP traffic, segmenting the data and indexing it. The indexed data is resident on disk and can be compared. If a segment of data has been seen before, it is not transferred across the WAN; instead a reference is sent in its place. RiOS has patented this reference scheme: it includes a hierarchical structure in which a single reference can represent many segments, and therefore the ability to deduplicate a large amount of data. If the data has not been seen before, the segments are compressed using a Lempel-Ziv based algorithm and sent across to the RiOS appliance on the other side of the WAN, where they are also stored. The original data is reconstructed using the new data plus references to the existing data and passed through to the client. LZ compression allows peak compression ratios of 100:1 to be achieved.
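A minimal sketch of the segment-and-reference idea follows, in Java. The fixed 100-byte segmentation and the SHA-256 digests used as references are assumptions made for illustration; RiOS's actual segmentation and hierarchical references are proprietary.

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: replace previously seen segments with short references.
public class SegmentStore {
    private static final int SEGMENT_SIZE = 100; // roughly the segment size cited above
    private final Map<String, byte[]> seen = new HashMap<>();

    // Split a payload into fixed-size segments (a real system segments content-aware).
    public static byte[][] segment(byte[] payload) {
        int n = (payload.length + SEGMENT_SIZE - 1) / SEGMENT_SIZE;
        byte[][] out = new byte[n][];
        for (int i = 0; i < n; i++) {
            out[i] = Arrays.copyOfRange(payload, i * SEGMENT_SIZE,
                    Math.min((i + 1) * SEGMENT_SIZE, payload.length));
        }
        return out;
    }

    // Returns true if the segment was already known, in which case only its
    // reference needs to cross the WAN; otherwise stores it and returns false,
    // meaning the (LZ-compressed) bytes must be sent and stored on the far side.
    public boolean isDuplicate(byte[] segment) throws NoSuchAlgorithmException {
        String ref = hash(segment);
        if (seen.containsKey(ref)) {
            return true;
        }
        seen.put(ref, segment);
        return false;
    }

    private static String hash(byte[] segment) throws NoSuchAlgorithmException {
        byte[] d = MessageDigest.getInstance("SHA-256").digest(segment);
        StringBuilder sb = new StringBuilder();
        for (byte b : d) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}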

Wednesday, May 27, 2015

We were discussing Riverbed Operating System (RiOS) technical concepts. Let us look into network integration. The RiOS solution does not demand anything from the network topology. It also does not require the use of tunnels. Since there are no tunnels involved, peers can be auto-discovered, full-mesh environments can be supported, and there is no limit to scale. There is no hindrance to the use of web cache communication protocols, policy-based routing, or other out-of-path deployment options.
The purpose of mentioning this non-invasive, application-neutral acceleration with little or no change to the network is to say that this approach differs from an overarching, one-size-fits-all strategy. It is also not quite like a solution that requires different vendors to "bolt and glue" their products together. That said, WAN optimization can proceed through multiple vendors who can combine or reuse some of the patterns introduced with RiOS.
The use of minimal configurations helps with the following in these ways:
Data Streamlining:
reduce WAN bandwidth utilization by 60-95%,
eliminate redundant transfers,
perform cross-application optimization,
and provide quality of service for all the applications.
The support for data streamlining includes features for rule-based policy administration of optimization classes, packet marking, enforcement for QoS and route control.
These enable tremendous ease of use with most QoS measures.
#codingexercise

using System; using System.Linq;
double GetNthRootProductOddRaisedPMinusQAndEvenRaisedPPlusQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // Product variant: odd values raised to (p - q), even values to (p + q),
    // multiplied together, then the A.Length-th root of the product.
    double product = A.Aggregate(1.0, (acc, x) => acc * Math.Pow(x, x % 2 != 0 ? p - q : p + q));
    return Math.Pow(product, 1.0 / A.Length);
}


Tuesday, May 26, 2015

We continue to read up on the Riverbed Optimization System (RiOS) technical overview. In the previous post, we discussed RiOS as a transparent TCP proxy in which an innovative three-channel mechanism replaces the logical single end-to-end connection. With RiOS, all TCP traffic can be intercepted and accelerated regardless of the application that generates it. This means there is no application category or bracket to which RiOS is specific. In addition, transport streamlining optimizes the behavior of TCP on the WAN. And these can be applied to both encrypted and clear data.
Application caching, which is application-specific and provides only marginal benefits, is therefore not required.
That said, there are application-specific optimizations. For some applications, like Windows file sharing or Exchange e-mail, the application protocol matters. Aside from data and transport streamlining, application streamlining enables RiOS to alleviate application-specific behavior, where generic TCP optimizers or data compression devices have shown only limited performance gains.
Having discussed traffic, we now shift attention to data and disk usage. RiOS implements a universal data store. This lets it reduce data efficiently across multiple peers because a single store is used. The significance of this improvement shows up as savings over per-peer data stores. Per-peer segmentation often reduces usable disk space to very small portions, which manifests in performance bottlenecks such as data store "misses" or "cold hits". With a universal data store and accelerated traffic, RiOS can support efficient sharing in a large-scale enterprise environment.
The data is guaranteed to be coherent because the single copy is accessed through the server. The server handles permissions and file locking. The client requests happen as if the intermediary accelerator device were not there. Hence, the data can be considered original by branch offices. It may be interesting to note that in the absence of the single server copy, obsolete and stale versions of the master have a chance to proliferate. It is also worth mentioning that the locking available from the server includes OpLocks, which enable latency optimizations over CIFS that can help reduce contention for the same file.
#codingexercise answer
In response to the question asked in the previous post, something like this could work (a SortedList would reject duplicate keys, so a plain list sorted in descending order is safer; Items stands for the linked-list entries):
var l = new List<int>();
Items.ForEach(x => l.Add(x));
l.Sort((a, b) => b.CompareTo(a)); // descending, duplicates preserved

Sunday, May 24, 2015

We continue our discussion on the Riverbed Optimization System. We noted that it is a powerful WAN accelerator, and we now review some of the technical concepts. The Riverbed Optimization System (RiOS) operates as a transparent TCP proxy. RiOS implements the logical single end-to-end TCP connection with three back-to-back TCP connections. These connections are established in a one-to-one ratio with no encapsulation, traffic mixing, or tunnel configuration. The two outer connections look the same as a simple logical end-to-end connection to both the client and the server, while the inner connection is invisible. This inner connection lets RiOS perform a variety of performance improvements for transmissions across the WAN. Because of this design, there is no disruption or reconfiguration of clients, servers, or routers. For each of the software clients on branch office computers, RiOS replaces the original end-to-end TCP connection with two or three back-to-back connections. The server-side connections appear the same as the original connections, while the RiOS-optimized connection accelerates WAN traffic directly from the remote compute device.
By referring to the proxy as transparent, it is implied that the source and destination IP header information is maintained as the optimized traffic flows through the device. In the event that this does not fit the various network rules and firewalls, RiOS provides three visibility modes: Correct Addressing, Correct Addressing plus Port Transparency, and Full IP Address and Port Transparency.
Correct Addressing refers to a mode in which RiOS addresses optimized traffic across the WAN to accurately reflect the source, destination, and nature of each packet. These IP addresses are meaningful only to the appliances, while the addressing of unoptimized traffic is relevant only to the LAN.
Correct Addressing plus Port Transparency introduces "spoofing" for the traffic sent over the WAN. The WAN traffic is still addressed to and from the appliances' IP addresses - only the port information is spoofed.
With Full IP Address and Port Transparency, RiOS offers complete address spoofing. Here the optimized traffic is addressed identically to unoptimized traffic on the LAN.

#codingexercise




using System; using System.Linq;
double GetNthRootProductOddRaisedPPlusQAndEvenRaisedPMinusQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // Product variant: odd values raised to (p + q), even values to (p - q),
    // multiplied together, then the A.Length-th root of the product.
    double product = A.Aggregate(1.0, (acc, x) => acc * Math.Pow(x, x % 2 != 0 ? p + q : p - q));
    return Math.Pow(product, 1.0 / A.Length);
}

In light of the coding exercises, perhaps I want to draw attention to an actual interview question.
This interview question makes you think about whether you want a different data structure.
Entries are placed in a singly linked list that may have duplicates but can be compared. We want the most efficient algorithm to sort them in descending order.
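One standard answer for a singly linked list is merge sort, which runs in O(n log n) without the random access that quicksort or heapsort would want. A sketch follows, assuming an int payload (the Node type is illustrative); the comparison in merge() keeps duplicates and produces descending order directly.

// Illustrative node type; the real entries only need to be comparable.
class Node {
    int value;
    Node next;
    Node(int value) { this.value = value; }
}

class LinkedListSorter {
    // Merge sort a singly linked list into descending order.
    static Node sortDescending(Node head) {
        if (head == null || head.next == null) return head;
        // Split the list in half with slow/fast pointers.
        Node slow = head, fast = head.next;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
        }
        Node second = slow.next;
        slow.next = null;
        return merge(sortDescending(head), sortDescending(second));
    }

    // Merge two descending lists; duplicates are preserved.
    private static Node merge(Node a, Node b) {
        Node dummy = new Node(0), tail = dummy;
        while (a != null && b != null) {
            if (a.value >= b.value) { tail.next = a; a = a.next; }
            else { tail.next = b; b = b.next; }
            tail = tail.next;
        }
        tail.next = (a != null) ? a : b;
        return dummy.next;
    }
}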





  
In today's post we discuss wide-area data services.
This term describes the set of issues that enterprise applications face in terms of network slowness, application performance, data coherency, and other such data issues when the same applications must be shared by various branch offices and personnel. Think of a hypothetical example where CAD software is being used to design a bridge by offices across continents. The bridge is modeled completely, to the finest detail, at a pace where the engineers do not feel limited by the data they collaborate with to build the model. There is only one instance of the bridge at any given time, and it is electronic.
Although the example serves to illustrate the context, wide-area data services are not restricted to one application. They address data traffic issues for such things as file sharing, e-mail, backup, document management systems, IT tools, as well as ERP and CRM solutions.
Together, the set of techniques used to achieve this speedup in the use of WAN applications is referred to as WAN optimization.
Riverbed is a pioneer in this area. We review the technical information on its optimization system as available on the Internet.
Riverbed's optimization System has four major components that address different technical concerns.
1) Data Streamlining - This looks into data deduplication so that the WAN bandwidth is reduced and applications can be prioritized by bandwidth and latency.
2) Transport Streamlining - This looks into different transport issues that removes inefficiencies when a data transfer must happen.
3) Application Streamlining - This looks into application protocol performance so that unnecessary round trips are avoided. Note that there is a line between mutating an existing application for performance improvement and reworking the application-level protocol behavior. The former is neither allowed nor viable without source code, and mere reverse engineering does not suffice; the latter is non-invasive and more traffic-oriented.
4) Management Streamlining - This looks into deployment and management and virtualizing branch office services.
Usually this solution is available in the form of a hardware appliance and a software client. The former works at the traffic level and the latter is installed at individual workstations.
By the way, wide-area data services improvements are not restricted to applications but can also be applied to cloud technologies, including VMware vSphere datacenter applications.


#codingexercise



using System; using System.Linq;
double GetNthRootProductOddAndEvenRaisedPPlusQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // Every value, odd or even, raised to (p + q); then the A.Length-th root of the product.
    double product = A.Aggregate(1.0, (acc, x) => acc * Math.Pow(x, p + q));
    return Math.Pow(product, 1.0 / A.Length);
}


Friday, May 22, 2015

Today I had the opportunity to use my hospital's patient information access portal. I found the way information is organized on the portal quite interesting. A few of the observations include:
1) patient access security
2) invitation only links
3) portal multi factor security
4) information organized by the type
5) information as deep as reports
#codingexercise


using System; using System.Linq;
double GetNthRootProductOddRaisedPEvenRaisedQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // Odd values raised to p, even values to q; then the A.Length-th root of the product.
    double product = A.Aggregate(1.0, (acc, x) => acc * Math.Pow(x, x % 2 != 0 ? p : q));
    return Math.Pow(product, 1.0 / A.Length);
}


Tuesday, May 19, 2015

In the storage industry, file system protocols and their access are very important to products and day-to-day usage. Storage vendors push these features down into their products, but cloud service providers have to wire up ways to expose them at the user level. If there is more than one vendor, and each vendor has different degrees of interpretation, there are even more variations. Even with a single vendor, these variations are a helpful study, both from a survey perspective and in evaluating the maturity of the product. It is in this connection that the change log between versions, and the components affected, become all the more interesting. The savviness to use the right product, the right version, and the right features enables ease of use in accomplishing a task.
As an aside, we began a questionnaire-and-answer set for this kind of evaluation.
Examples include:
How does this fit our need now?
How does this fit our need down the line?
How does this compare with the alternatives?
How much effort is involved?
How many moving parts are involved?
What is the resiliency?
And the TCO?
This builds a scorecard for each option.

#codingexercise

using System; using System.Linq;
double GetNthRootSumOddRaisedPEvenRaisedQ(double[] A, double p, double q)
{
    if (A == null || A.Length == 0) return 0;
    // Sum of odd values raised to p and even values raised to q; then the A.Length-th root.
    double sum = A.Sum(x => Math.Pow(x, x % 2 != 0 ? p : q));
    return Math.Pow(sum, 1.0 / A.Length);
}

Monday, May 18, 2015

I've injured my hand and the swelling has increased even more today. In the previous post we were discussing dynamic pricing models. It is important to note that such a model is applicable only in a small percentage of cases, specifically when supply is less than demand. Uber's research has shown that both the supply and demand curves are elastic here: higher prices increase supply. The Boston experiment confirmed that the number of fulfilled requests goes up when that happens. On the demand side there are two noteworthy areas; I referred to the corresponding post by Bill Gurley. There is a mention of UberX as an alternative to the black car service, and this has caught on.

#codingexercise


using System; using System.Linq;
double GetAllNumberRangeProductSeventhRootPowerTwelve(double[] A)
{
    if (A == null || A.Length == 0) return 0;
    // One reading: product of all values, its seventh root, raised to the twelfth power.
    double product = A.Aggregate(1.0, (acc, x) => acc * x);
    return Math.Pow(Math.Pow(product, 1.0 / 7), 12);
}




Sunday, May 17, 2015

In the previous post we reviewed the pricing models. To summarize, we have the advertising model, which needs deep pockets and critical mass; large companies can engage in this. Then there is the free product with subscription services for sale, which is generally not liked by investors. There is a model to enter existing distributions, especially with a markup of 2 to 5 times the cost of the software. If we do not want this cost model, we could pursue a value model, where the offering is clear to the customer and the conversion from basic services to premium services uses a factor that is simple and clean. Of course, the pricing model may be allowed to be governed by outside factors; this is the market model, where competitive forces shape the pricing. Taking it a step further, we can have pricing models that are dynamic. Take Uber surge pricing, for instance. This is an example of a dynamic model, which we had not covered so far. There is actually an interesting incident that led to this model: in Boston, Uber detected a problem where, as drivers signed off at 1am, a lot of unfulfilled requests piled up. To make the supply elastic, drivers were given the option of more money for working longer.

Saturday, May 16, 2015

Today we review the different types of pricing models for software. Pricing models are complex decisions; there are a lot of intertwined factors at play, such as strategy and customer. Let's use this blog post to come up to speed with at least a few.
First, give it away for free and make money on advertising. Facebook, Twitter, and Pinterest follow this model. It requires deep pockets and is very hard to measure. There is said to be a critical mass required for this model to succeed.
Second, the free product bundled with paid services model. Red Hat Linux follows this model. Customers pay subscription fees if they want support or other services from the service offerings. A positive benefit of this model is that it generates cash flow, although on the flip side it turns investors away.
The third pricing model is the freemium model. Dropbox and LinkedIn offer just enough product for free to gain a regular audience and then convert them to paid services. The pricing has to be a function of the perceived incremental value. This is typically made with simple conversion factors, so the equation may look like 10,000 more at $10 each.
Next is the cost-based model, where the product is sold at two to five times the cost incurred. Multiples are used because the middlemen in the distribution channels, as well as the end retailers, work with fixed markups. This is the most common method for entering into existing distributions.
The value model is another. This articulates the value of the offering, be it monetary or otherwise. In all these cases the value has to be perceived as compelling.
Portfolio pricing expands on this by offering a variety of products and services. More mix and match is possible now.
Tiered or volume pricing is used if the product is purchased in different quantities by different users.
Market pricing is another model, used in highly competitive and minimally differentiated markets. Strong players like Amazon can ask for and get a premium based on better services or better timeliness of services. However, businesses have to watch out that they do not compete themselves away.
Feature pricing is another example, where the pricing starts from a baseline of features or a bare-bones version. This model often suffers from a lot of customer complaints.
Another extension of a pricing model, based on reusability, is the razor-and-blade model. Here the base component is sold cheaply or even given away, and the consumable portion is charged as used, or more often. To be able to provide the base units, there must be a deep pocket to begin with.
Generally speaking, pricing models can get more and more complex, but the right use of metrics and simpler conversions can overcome the shortcomings of an overly simplistic model.
Courtesy Cayenne Consulting blog





Friday, May 15, 2015

A quick look at cascade styles in the Hibernate ORM framework. By default there is no cascading of state between one associated entity and another; Hibernate does not implement persistence by reachability by default. If we want a cascade style along an association, we must specify it explicitly in the mapping file. Cascading options can be described as persist, delete, lock, and so on, and these can be combined. For these to work, the associations are typically parent/child, in that what affects the parent cascades to the child.
The cascading operations for a parent child relationship are as follows
If the parent is passed to persist, all children are passed to persist
If the parent is passed to merge, all children are passed to merge.
If the parent is passed to save, update, or saveOrUpdate, all the children are passed to the same.
If a transient or detached child becomes referenced by a persistent parent, saveOrUpdate is called on it.
If a parent is passed to delete, all the children are passed to delete.
If a child becomes dereferenced from a persistent parent, nothing special happens; cascade="delete-orphan" can handle this case, as sketched below.
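The post above describes cascades in the XML mapping file; the same parent/child cascades can be expressed with Hibernate/JPA annotations, roughly as below. The Parent and Child entities are illustrative, and orphanRemoval plays the role of delete-orphan for the last item in the list.

import javax.persistence.*;
import java.util.ArrayList;
import java.util.List;

@Entity
public class Parent {
    @Id
    @GeneratedValue
    private Long id;

    // Cascade every operation from the parent to the children; orphanRemoval
    // deletes a child that becomes dereferenced from its persistent parent.
    @OneToMany(mappedBy = "parent", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Child> children = new ArrayList<>();
}

@Entity
public class Child {
    @Id
    @GeneratedValue
    private Long id;

    @ManyToOne
    private Parent parent;
}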

Thursday, May 14, 2015

Hibernate and deleting persistent objects:
When deleting objects with Hibernate, it is best to think of it as making an object transient. The state of the object is erased from the database, but it can still be referenced. The order of deleting the objects is generally not important because there is no risk of violating a foreign key constraint; however, a NOT NULL constraint on a foreign key column may still be violated if the order is incorrect.
This is resolved by cascade options.
When flushing the session, the following order is important:
All entity insertions, in the same order that save() was called
All entity updates
All collection deletions
All collection element deletions, updates, and insertions
All collection insertions
All entity deletions, in the same order that delete() was called
While this order is guaranteed, there are no guarantees on when these JDBC calls happen unless explicitly invoked with flush(). That said, there are different FlushModes available to make flushes less frequent.
These modes include :
- flush at commit time when the Hibernate Transaction API is used
- flush automatically using the explained routine
- or never flush unless flush is explicitly called.
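For instance, a sketch with the classic Hibernate API (the session factory is assumed to be configured elsewhere):

import org.hibernate.FlushMode;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

class FlushModeDemo {
    // Open a session that flushes only at commit time; FlushMode.MANUAL would
    // defer flushing until flush() is called explicitly.
    static Session openCommitFlushSession(SessionFactory factory) {
        Session session = factory.openSession();
        session.setFlushMode(FlushMode.COMMIT);
        return session;
    }
}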
In addition to the flush modes and cascade styles offered, Hibernate also makes its metadata available to applications. This way, applications can choose to implement deep copy algorithms, for example for mutable value types but not for immutables or associated entities. Hibernate exposes metadata via the SessionFactory. Instances of the metadata interfaces include ClassMetadata, CollectionMetadata, and the Type hierarchy.




Sunday, May 10, 2015

Persisting objects with Hibernate.
Hibernate defines three object states: transient, persistent, and detached.
The transient state applies, for example, to a new object. It has no ID associated and consequently no representation in the database. Such objects will be collected by the garbage collector. The session helper methods can be used to save the object.
A persistent state is one where there is a database representation available. When we load an object, we put it in this state. The object is now in the scope of the session, and any changes made to it will be detected.
A detached state is one where the object goes out of the scope of the session. Changes can still be made because the reference is valid. When the object is reattached to a session, it and all its changes become persistent again. This enables a way to work with long-running units of work, which are called application transactions.
The save method on the session makes an object persistent. It does not guarantee to return an identifier. The assignment might happen at flush time.
The load method of a Session provides a way of retrieving a persistent instance based on its identifier.
Load can work through a proxy. If the class is mapped with a proxy, load returns an uninitialized proxy that does not actually hit the database. This helps with creating associations or working with batches. If the load has to reach the database and there is no matching row, there is an unrecoverable exception. It is better to detect the database record with the get method, which returns null if there is none. Objects can be loaded for update only when there is a LockMode specified or a cascade style specified.
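A sketch of the load/get distinction (the User entity and its mapping are assumed):

import org.hibernate.Session;

class User { } // illustrative entity; real mapping omitted

class LoadVsGet {
    // get() hits the database and returns null when there is no matching row.
    static User find(Session session, Long id) {
        return (User) session.get(User.class, id);
    }

    // load() may return an uninitialized proxy without hitting the database;
    // an exception surfaces later if no matching row actually exists.
    static User reference(Session session, Long id) {
        return (User) session.load(User.class, id);
    }
}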
Reload of an object can work with refresh in the sequence save()->flush()->refresh()
Objects saved by the session are all transactional persistent instances and any changes to the persistent state will be persisted when the session is flushed. Therefore there is no need to call a method such as update between load and flush.
Hibernate works with states only; it does not work with SQL statements from users. That approach is better served by JDBC. Further, batch processing conflicts with online transaction processing, so prefer not to do batching with Hibernate, although some options are available.
When an object is loaded but then presented to a higher layer, where it may spend an inordinate amount of time, separate transactions may be required for retrieving and saving it. Such "long" units of work therefore have to work with versioned data.
To reattach this detached instance, we call update if the session does not already contain a persistent instance with the same identifier. We use merge when we want to merge the modifications at any time without consideration of the state of the session.

If there are associated items to the entity we want to save, they too can be persisted in any order as long as there are no NOT NULL constraints on a foreign key column. There is never a risk of violating a foreign key constraint; on the other hand, there is a risk of violating a NOT NULL foreign key constraint if the order is not maintained.

As an aside, the lock method can be used to reassociate a detached object. However the detached instance has to be unmodified. There are several LockMode available.

We now look at automatic state detection, which is enabled via the saveOrUpdate method. This method either generates a new identifier or reattaches with an existing identifier. But first, note that the update, merge, and saveOrUpdate methods are not to be called if we are using the same session. They are typically used when going from an older session to a newer session.

SaveOrUpdate uses object versioning, either with a version or a timestamp: if the version matches the unsaved value, the object is saved; if the version is different, it is updated.

Merge is very different. If there is a persistent entity with the same identifier in the session, merge will update it. If there is no persistent entity, merge will load one from the database or create a new one. The passed-in entity does not become part of the session; it remains detached, while the returned instance is the managed one.
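A sketch of the reattachment options discussed above (again, the User entity and its mapping are assumed):

import org.hibernate.Session;

class ReattachDemo {
    // update(): reattach, assuming no persistent instance with the same
    // identifier is already associated with the new session.
    static void reattach(Session newSession, User detached) {
        newSession.update(detached);
    }

    // merge(): copy the detached state onto a persistent instance, loading it
    // if needed; the returned instance is managed, the argument stays detached.
    static User mergeChanges(Session newSession, User detached) {
        return (User) newSession.merge(detached);
    }
}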

#codingexercise


using System; using System.Linq;
double GetAllNumberRangeProductNinthRootPowerTen(double[] A)
{
    if (A == null || A.Length == 0) return 0;
    // One reading: product of all values, its ninth root, raised to the tenth power.
    double product = A.Aggregate(1.0, (acc, x) => acc * x);
    return Math.Pow(Math.Pow(product, 1.0 / 9), 10);
}



In continuation of our discussion on an example of a Graph API, let us look into generic methods of graph-related computing. Take, for example, persistence of a graph by reachability. How do we find nodes in the graph that correspond to entities that should be affected when a start node, or pivot entity, is to be modified and saved? This is done with shared references. We find that a shared reference prevents us from deleting an entity, even when the pivot entity is being deleted, simply because another entity is using the shared reference.
If all the nodes are connected by a root entity, meaning all entities derive from a root object type, does it help to have a process that can find and collect these dirty elements? Taking it more generally: given a connected graph where entities are constantly getting dirtied, how do we non-invasively differentiate them and connect a subgraph that we need to work on?
One such method is a garbage-collection approach using the gossip algorithm from the textbook. Here is an example:
We first need to find the dirty objects in the graph. There are two processes involved - a marker process that marks objects as dirty or clean, and a mutator process that is responsible for connecting nodes that are already in sync with the store and need no further action.
The marker need not be exact; it can be conservative.
The mutator and the marker do not interfere with each other's running because they work on the basis of superposition, that is, they touch data that are not related to each other.
We begin with the root and simply mark vertices that can be reached from the already marked ones. This behaves just like the gossip program. However, the mutator can still frustrate the marker, so we add a second rule: we mark both x and y when an edge is added between x and y while y is clean, and we delete the edge between x and y when either of them is unmarked. This way we build the sub-graph we are interested in.
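A sketch of the marking pass, assuming a simple adjacency-list graph; the Vertex type and the dirty flag are illustrative:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Vertex {
    boolean dirty;                                // set conservatively by the marker
    List<Vertex> neighbors = new ArrayList<>();
}

class Marker {
    // Mark every vertex reachable from the root, gossip-style: marked
    // vertices spread the mark to their neighbors until nothing changes.
    static Set<Vertex> markReachable(Vertex root) {
        Set<Vertex> marked = new HashSet<>();
        Deque<Vertex> frontier = new ArrayDeque<>();
        frontier.push(root);
        while (!frontier.isEmpty()) {
            Vertex v = frontier.pop();
            if (marked.add(v)) {
                v.dirty = true;
                frontier.addAll(v.neighbors);
            }
        }
        return marked; // the sub-graph of interest
    }
}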

Saturday, May 9, 2015

In today's post we review the Open Graph protocol as described online. We will review what is mentioned on the web and then explore some of their GitHub. Graphs provide a very intuitive and wholesome representation of data such as web pages, particularly in social media such as Facebook. It is not that relational, NoSQL, or object databases don't serve as a good store for social media computations; it is that they don't provide a single technology for developers to interact with. Consequently the graph protocol and its API enter the picture. By converting the data in web pages and social accounts into a graph, we can enable such functionalities as multiple posts and multiple comments. The relationships between the graph objects allow us to do CRUD operations on those objects.
Facebook Graph API comprises of the following:
1) nodes - these pertain to resources such as a User or a Photo, a page or a Comment
2) edges - the connections between those resources such as a Page's photos or a Page's comments.
3) fields - the things such as the birthday of the User
The Graph API is HTTP-based and follows the REST paradigm and conventions. For example, GET /me will return your profile information. The available APIs are comprehensive and divided into core and extended sets. APIs are also authorized and require an access_token.
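For instance, a read call is just an HTTP GET; a sketch follows using java.net.http, with a placeholder token since the API requires authorization:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class GraphApiDemo {
    public static void main(String[] args) throws Exception {
        String accessToken = "REPLACE_WITH_TOKEN"; // placeholder; obtain via Facebook login
        // GET /me returns the profile information of the token's owner.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://graph.facebook.com/me?access_token=" + accessToken))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // profile fields as JSON
    }
}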
Subsequently the objects were versioned with /v2.1/{object}
Core APIs and SDKs are central to the Facebook Platform. These elements are subject to a versioning system and a guarantee that anything considered a core API node, field, edge, dialog, SDK, or SDK method will remain available and unchanged for a span of two years from the version release. Breaking changes come via new versions.
Core Elements include Facebook login, share dialog, requests dialog, the like button, the Facebook SDK for iOS, the Facebook SDK for Android, some methods of the Facebook SDK for JavaScript and Some Graph API fields and endpoints.
Extended is everything beyond that, in APIs and SDKs. Extended elements are subject to change with 90 days' notice.
APIs can be simply read-based. In addition, we can choose fields to make a query.
Objects can also be discovered by their IDs. But sometimes this is not possible. In such cases, we can use the URL node to return the IDs of Open Graph Object URLs or find data associated with an App Link URL.  We can even retrieve multiple objects with object type such as
GET /me/{action-type}/{object-type}
App Link URLs take the form :
GET graph.facebook.com?ids=http://fb.me/<id>/&fields=app_links&access_token=<access_token>

While we continue our discussion on Graph API of Facebook, we can also take a look at some of the simple Graph methods as implemented here:

https://github.com/ravibeta/csharpexamples/tree/master/CodingExercises/CodingExercises
And how graph APIs work at: http://1drv.ms/1PxFU

Coming back to our discussion of the Facebook Graph API, let us explain App Links a bit more, since it is a Facebook concept and not one of the general concepts we have been discussing so far. App Links enable deep linking to content in our Facebook app. With this kind of sharing, it is possible to jump to and from the Facebook application and the App Links app; moreover, the content is brought into the Facebook app. It works by adding metadata to existing URLs on the web so that they can be consumed by our Facebook app.

Friday, May 8, 2015

Today we wrap up our discussion on DBPowder. We saw the ORM framework and its processes, and the flexibility for both simple and complex correspondences. We saw the use of ActiveRecord for the data wrapper and data classes. When the developer describes the EER model, the relational tables and classes along with the ObjectView are generated. The process kicks off by mapping tables to entities with a one-to-one mapping. The attributes of the conceptual model become attributes of the tables and the entities. Then the relationships are established and their cardinality is specified. With this information from the conceptual model, we can add the bidirectional attributes to tables and classes. Sometimes it is easier to promote the relationship into a new entity, and this is one of the flexibilities offered by DBPowder in its code generation. For example, a user table where many different users register different hosts has many registration dates that cannot be attributes of the user or hosts table. Next we add hierarchy information, and this is possible in one of three different ways.
The complex correspondences are described using ObjectView, where a graph-based object is used to generate the complex-persistent-class equivalences with the relational schema. ObjectView takes application logic into account as well. A pivot entity is chosen as the starting point; connectivity for the edges is established, along with that of the relationships, and the cardinality is set. The nodes and edges for a hierarchy are defined, along with a direction from parent entity to child entity. Then the nodes within the subgraph of the directed graph which have one-connectivity are combined to form a group. Class definitions are generated from the grouped nodes.
The code generator generates the source code for the relational schema, with the relational tables so far and their corresponding simple persistent classes, for the simple correspondences. It also generates the source code for the complex persistent classes from the EER model and the specific ObjectViews. With the help of ActiveRecords, we move away from many-to-many correspondences between persistent classes and tables to one-to-one correspondences between ActiveRecords and tables.

#codingexercise

using System; using System.Linq;
double GetAllNumberRangeProductSeventhRootPowerTen(double[] A)
{
    if (A == null || A.Length == 0) return 0;
    // One reading: product of all values, its seventh root, raised to the tenth power.
    double product = A.Aggregate(1.0, (acc, x) => acc * x);
    return Math.Pow(Math.Pow(product, 1.0 / 7), 10);
}


Thursday, May 7, 2015

Today we continue our discussion on DBPowder. Using DBPowder and ObjectView, developers can come up with a number of persistent classes, along with each part of the application logic. In this case, a table might be used by more than one persistent class: there are many-to-many correspondences among persistent classes and tables, and the persistent classes hold the values. Now if two persistent objects in the same session hold the same attribute value from the same tuple, they cannot recognize that the attribute value is the same. To solve this issue, DBPowder leverages ActiveRecord. The data wrapper classes and the data classes access tables using ActiveRecords. Because the values of data wrappers are held in ActiveRecords, and each data wrapper is a wrapper class of an ActiveRecord, values in a data wrapper have a single correspondence with those in the tables. As a result, the persistent classes can recognize that the attribute value is the same.
We now look at a prototype system of DBPowder and its production use. The first version of the prototype system was implemented in Java SE in 2006. In this paper, EER model and ObjectView support were added. The prototype includes a description language, DBPowder-mdl, and has a parser and code generator to interpret DBPowder-mdl and generate source code. The prototype also generates simple web pages for create, retrieve, update, and delete operations. DBPowder-mdl, the parser, and the code generator (GEN) were refined in this paper.
The set of web applications and the DBPowder system were applied in production for a security administration task and refined. The same set of applications was deployed to another site, and the number of users across the two sites was 130.

Wednesday, May 6, 2015

Today we continue the discussion of the DBPowder description language, DBPowder-mdl. The goal of this language is to allow developers to describe the conceptual model. As we have seen in the previous posts, DBPowder uses a process to generate code and tables from this model. The description style of DBPowder-mdl is either EER-style or ObjectView-style. The EER-style primitives are entity E, attribute A, and linked-entity L. The primitives are described in a hierarchical structure. Primary keys can be omitted because DBPowder assigns surrogate keys.
To extend the hierarchical structure into a graph, E and L can be described more than once. We saw that generalization hierarchies are supported by Single Relation (SR), Class Relation Inheritance (CR), and Concrete Class Relation Inheritance (CCR).
The ObjectView-style primitives are the pivot-entity PE and the member-entity ME.
The code generation produces source code using the EER model or the ObjectView. Developers add application code using the generated classes and a built-in session class. The code generator realizes the EER, ObjectView, and ORM processes, and this helps generate the relational schema and persistent classes. The latter include a data wrapper class, a logic class, and a session class. A sample usage of the application code has developers hold values in the data wrappers using the getter/setter methods and then persist the values through the session class using CRUD methods such as insert, find, update, and delete. This is achieved because the data classes use ActiveRecords and the data wrapper has a correspondence with tables via the ActiveRecords.
#codingexercise



using System; using System.Linq;
double GetAllNumberRangeSumSeventhRootPowerTen(double[] A)
{
    if (A == null || A.Length == 0) return 0;
    // One reading: sum of all values, its seventh root, raised to the tenth power.
    double sum = A.Sum();
    return Math.Pow(Math.Pow(sum, 1.0 / 7), 10);
}

Monday, May 4, 2015

Today we continue the discussion on DBPowder. We noted earlier the process involved in code generation, specifically, given a conceptual model, how to map the relational and object models. When the EER is modeled using a directed graph, the starting entity is called the pivot entity. Connectivity on an edge is defined along with that of the corresponding relationship. Hierarchy is represented by directed edges from parent entity to child entity. When a relationship has attributes, they belong to the connected node to which the connectivity is many. If there is more than one path from the starting point to an entity E, E has to fulfill all the conditions that correspond to the incoming edges. If this is not preferable, ObjectView can reduce the number of incoming edges by introducing another corresponding node and modifying one of the paths to use this node. When grouping nodes within a sub-graph, we group those where the connectivity of the corresponding relationship is one. ObjectView arranges the class definitions generated from the grouped nodes using the keywords structured, literal, and interface. A structured literal is defined as a fixed number of literals and can be used as a user-defined literal; an interface is defined as the abstract behavior of an object type. The interface is independent of the directed graph and applied to the grouped node. Multiple interfaces are possible for a grouped node, in which case all the operation definitions are defined in the corresponding class.
In the example we took earlier with users and hosts, the application logic has to start from the user, and hence that pivot entity is chosen. The nodes register and host are grouped, and the generated classes are user-oriented. It is also possible to use host as the pivot entity, in which case the sub-classes of user are not required.
By using a different ObjectView pivot entity, we can form the attributes from the grouped nodes differently. A practical generation of source code would involve code that corresponds to ActiveRecords holding attribute values, setter/getter methods for data classes, and persistent classes as wrappers of the data wrapper classes. Together with the ActiveRecord and the logic classes, data persistence can proceed with the application code defined by the developer calling persistence methods on the session class, which translate to SQL DDL. Developers on the .NET framework can quickly relate these pieces of generated code to the templates generated from, say, the EDM.

Sunday, May 3, 2015

Today we continue our discussion on DBPowder. We briefly reviewed the data models, namely the EER model, the object model, and the relational model. We now see how to use the EER model for simple correspondences and ORM processes. When the developer describes the EER model, DBPowder generates the corresponding spc object model and the rs relational model by the following process:
1) In eer, a surrogate key is added to each entity where the primary key may have been omitted.
2) Each table and attribute in rs is generated with a one-to-one correspondence to an entity and an attribute in eer. The relationships are used to add foreign keys to either side of the tables. Hierarchies are handled later.
3) For each table, attribute, and foreign key in rs, a class, attribute, and relationship in spc is generated. Each relationship is converted into a bidirectional one by adding an inverse relationship. The cardinality is also added by looking it up in the eer.
The hierarchies were reserved for after step 3 because they are not directly mapped to the rs. Instead, three methods are used:
1) Single Relational Inheritance - all of the classes in a generalization hierarchy are mapped onto one table.
2) Class Relational Inheritance - each class in the hierarchy is mapped one-to-one onto one table.
3) Concrete Class Relational Inheritance - each concrete class in a generalization hierarchy is mapped one-to-one onto one table.
In the example describing the user generalization earlier, SR was used.
The reg_date attribute of the register entity proves tricky: a user has a many-to-many relationship to a host, and an SR relationship between the persistent classes User and Host cannot handle the attribute reg_date because reg_date is unidirectional.
For complex correspondences DBPowder introduces ObjectView. The process then continues:
4) The ObjectView descriptions are denoted as follows:
    1) entities and relationships in the EER model are specified using a directed graph.
    2) nodes are grouped within a subgraph of the directed graph by combining the nodes which have relationships of one-connectivity.
    3) class definitions are then generated from the grouped nodes using structured literal and interface.

Saturday, May 2, 2015

Today we will continue our discussion on DBPowder. We will look into the conceptual model introduced by this framework. The paper proposes the following data models:
1) Object model: This is based on a subset of the ODMG 3.0 standard, with some Java-based extensions. An object is called persistent if all its property values are saved into storage so that they may be retrieved later. Let us take an example with a class User that has an attribute user_name and a relationship register to another class Register; the Register class in turn has a relationship user to the class User, so the two classes have a bidirectional relationship with cardinality 1:n. The class User has two subclasses, AdmUser and GuestUser (see the sketch at the end of this post).
2) Relational model: The relational model has a table user with an attribute user_name and primary key user_id. A table register has a foreign key user_id to the table user.
3) EER model in DBPowder: The EER model describes primitives such as attributes, entities, relationships, and connectivity. In the notation used, a fact e has one or more attributes A1, A2, ..., An, and an entity E comprises a set of facts together with their attributes. Relationships are expressed with a connectivity that represents cardinality. A relationship among three or more entities is described by another entity E' connected to each of the entities E1, E2, E3.
In the example taken earlier, an entity user (p) has an attribute user_name (q) and a relationship (r) to the entity register; r has connectivity 1:n. For the entity user, the occurrence of the entity register is mandatory, a constraint represented by (s). The entities user, adm_user, and guest_user form a hierarchy with user as the base. Thus the EER model can be used to generate both the relational tables and the simple correspondences.
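A plausible shape of the object model from point 1, sketched in plain Java; the names follow the example, but this is not DBPowder's generated code:

import java.util.ArrayList;
import java.util.List;

// The 1:n bidirectional relationship between User and Register from the example.
class User {
    String userName;
    List<Register> registers = new ArrayList<>(); // the register occurrence is mandatory
}

class Register {
    User user;        // inverse side of the bidirectional relationship
    String regDate;   // the relationship attribute discussed in the May 3 post
}

class AdmUser extends User { }    // generalization hierarchy with User as the base
class GuestUser extends User { }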

Friday, May 1, 2015

Today we continue reading the paper on DBPowder. We brought up the point that Hibernate allows multiple complex correspondences with HBM files. Using Hibernate Annotations, a supplemental tool, the developer can describe classes with annotations, but this too does not improve Hibernate's handling of complex correspondences.
The ADO.NET entity model is another popular method; it has a conceptual schema definition language, a store schema definition language, and a mapping specification language, all represented in XML. As part of the MSL, Melnik proposed a method with bidirectional views, which describe a set of constraints between a group of tables and persistent classes. The compiler compiles the constraints to generate the query views and update views for complex correspondences.
Msquare-ORMsquare (M2ORM2) is another method, but it requires a relational schema and persistent classes in advance.
A relational view extends the original table, but views and persistent classes still need to be described to utilize ORM. Moreover, when data manipulations are performed on the views, they may not be properly reflected on the original table. Also, views have to be unfolded to the original tables with complex queries, causing poorer performance. For object-oriented software design, UML is popular.
In this paper, while the authors adopted an EER model to clarify the issues, a design in the EER model can be easily converted into one in the UML class diagram. This enables comparison between simple and complex correspondences, because simple correspondences use the EER model while complex ones use ObjectView in addition to the EER model.
#codingexercise


using System; using System.Linq;
double GetOddNumberRangePowerTenthRootPowerTwenty(double[] A)
{
    if (A == null || A.Length == 0) return 0;
    // One reading: product of the odd values, its tenth root, raised to the twentieth power.
    double product = A.Where(x => x % 2 != 0).Aggregate(1.0, (acc, x) => acc * x);
    return Math.Pow(Math.Pow(product, 1.0 / 10), 20);
}