Thursday, December 7, 2017

We continue with our discussion of the personal coder. We browse our email inbox from any computer or kiosk. Interacting with our own personal assistant through any speech recognition medium could become just as easy to bring up. With a standardized protocol between the speech recognizing hardware and the personal assistant, the two can be separated and allowed to vary independently, giving rise to the notion of a personal assistant conjured up on a kiosk. 
The personal assistant can also improve human interactions with the help of emotions and feelings. Personal robots such as the retail-available Jibo already demonstrate this, but we have cited a deeper and broader set of skills in our write-up so far. If the form and the functionality described here can be combined, it will lead to pervasive use of the personal assistant. 
The personal assistant can not only be interactive, humorous and informative, it can also help with visualization and drawing through verbal commands, using a projector lamp and a wall. This capability is often touted in movies but not yet facilitated on personal assistant devices. 
The personal assistant might also be paired with a virtual reality headset so that the assistant and the owner can follow the same canvas. This lets the owner share content with the assistant for replay on the projector mentioned above. Both the projector and the canvas are specialized functions of the personal assistant and not necessarily something that needs to be taken out of a notebook or desktop. Moreover, enabling a tracking device with the assistant opens up many more possibilities beyond the owner's commands.  
There are two advantages of having the personal assistant maintain an activity log: 
First, the personal assistant is uniquely positioned to be a trusted, confidential, passive associate. 
Second, the collection of activities helps reinforce different traits and habits of the owner, leading to a positive lifestyle.  
Together the personal assistant and the owner might achieve singularity as man and machine increasing the impact that the individual makes. 
#codingexercise
We were discussing finding the closest numbers between the elements of three sorted integer arrays. We don't have to sweep all three arrays if we only need to determine the relative range of each array. We take the sentinel values of a given array and find their closest numbers in the other arrays. Then we compare the sentinels of the other arrays with the closest numbers found in that array. This tells us how wide or narrow the other arrays are compared to the chosen one. It gives us an idea of the relative positioning of the arrays within each other, whereas the sentinels alone could only tell their position on the number line. 
Another use of finding the closest numbers in these arrays to a given input value is to discover the overlapping range of sorted numbers. To do this on the number line, we could choose max(start1, start2, start3) and min(end1, end2, end3) as the (start, end) pair across the arrays. But by finding the closest elements to those min and max values, we can determine the projection of the overlapping range on any one of the three arrays. 
Finally, we can choose the projection with the smallest number of elements. 
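A minimal sketch of both ideas in Python; the helper names `closest` and `overlap_range` are ours, and binary search is an assumed implementation choice, not spelled out in the text:

```python
from bisect import bisect_left

def closest(arr, x):
    """Element of the sorted list arr closest to x, via binary search."""
    i = bisect_left(arr, x)
    if i == 0:
        return arr[0]
    if i == len(arr):
        return arr[-1]
    # pick the nearer of the two neighbors around the insertion point
    return arr[i] if arr[i] - x < x - arr[i - 1] else arr[i - 1]

def overlap_range(*arrays):
    """(start, end) of the overlapping range of sorted arrays on the number line."""
    start = max(a[0] for a in arrays)
    end = min(a[-1] for a in arrays)
    return (start, end) if start <= end else None
```

Projecting the (start, end) pair onto any one array is then just `closest(arr, start)` and `closest(arr, end)`.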

Wednesday, December 6, 2017

We continue with our discussion of the personal coder. Let us now consider the first differentiation. If the assistant is mobile, it is likely to be classified as a robot. The movement of the device is mostly with respect to an object. When we introduce motors and an ability to govern them by way of a coordinate space, we are working with robotics. The assistant targets only the owner. It does not have to move around to translate commands into tasks. This is why we separate robotics from the capabilities an assistant has.
Second, the assistant is not an appliance but software over an appliance. When an appliance collects and sends data over the internet, it can be considered an internet-of-things device. In the case of an assistant, the data pertaining to the owner is maintained by the assistant and actively indexed and worked with. Moreover, the assistant need not be globally unique or have its own IPv6 address to do its job. This is where we differentiate it from the lower-level sensors and devices that are otherwise also connected to the internet.
There are two different kinds of pressure exerted on this middle layer for a "personal assistant". The pressure from the layer of IoT devices is that they can proliferate to include dedicated assistant responsibilities, so the owner may no longer need to talk to one assistant only. This avoids having to relay commands and keeps the assistant from being a single point of failure. At the same time, it makes it natural for the owner to turn to the function-specific device and command it in just the same way as they would an assistant working with their data locally. It is also more intuitive for the user to know a device by its function. Moreover, elaborate setup of applications and devices with the assistant is now avoided. In other words, the personal assistant has to remain smarter and differentiate itself from the device-specific functions that those devices specialize in. One way to do this is to provide more intelligent services over a unification platform for seamless experience, relay and consistency. If the assistant could treat these devices as plugins, the ecosystem for the assistant could grow better.
Similar arguments come from the top layer, which includes mobile robots and which the assistant does not enter. The robot may assume the responsibilities of the "personal assistant" given the hardware and support it gets. Since it is mobile, it is the equivalent of butler services for the customer. However, the digital assistant can continue to grow its presence by being more affordable and comprehensive in its support, even taking commands from a robot. These kinds of improvements are easy to incorporate into the personal assistant while enjoying the power of cloud computing available on a voice-activated command.
Another area of improvement in building the jargon for the personal assistant is to create associations with proper or common names for devices that are not necessarily on the internet. For example, it should be easy for the assistant to pair with headphones over Bluetooth and let the user listen to music on them. The personal assistant merely has to recognize that the word headphone refers to a specific device that it can scan the Bluetooth network for and connect to. Most desktop computers already have a hardware profile and can scan for hardware changes. They are also able to automatically download the necessary drivers to let the device function properly. In a way, the personal assistant can become a headless desktop so long as the essence of communication remains voice-based. There is also a demarcation of concerns between the portable laptop and the portable personal assistant: one is general-purpose computing and the other is merely customer-facing. In a way, the personal assistant would benefit immensely from communicating with a desktop for all computing tasks.
Sometimes the harder question is whether desktop-based computing is going to be demarcated from the personal assistant. The personal assistant is not the same as the desktop computer because it derives its value from being closest to the owner. It can be considered a frontend while the desktop may act as the backend. This connection between the desktop and the personal assistant is not just an extension of desktop computing but a protocol that spans different hardware, platforms and computing.
In its virtual form, a personal assistant may also be given any avatar. If we were to visualize the assistant, it could be a photo frame on the desk that shows different personas as suited to the end user. By making the speech recognition software as thin as possible on the front end and backing it up with heavy-duty cloud services, we can make the assistant appear differently at any time or place.
#codingexercise
We were discussing finding the closest numbers between the elements of three sorted integer arrays. We don't have to sweep all three arrays if we only need to determine the relative range of each array. We take the sentinel values of a given array and find their closest numbers in the other arrays. Then we compare the sentinels of the other arrays with the closest numbers found in that array. This tells us how wide or narrow the other arrays are compared to the chosen one. It gives us an idea of the relative positioning of the arrays within each other, whereas the sentinels alone could only tell their position on the number line. 

Tuesday, December 5, 2017

We were discussing the personal coder here. In another extension of a similar idea, the assistant can also become the store for personal data, be it identity, images, video or documents. The notion that the assistant has unlimited storage is equivalent to connecting cloud drives to the assistant to store and retrieve personal data. This includes the ability to connect, save, extract, transform and load user data from a variety of stores and automatically back it up to the assistant's personal drive space for the user. 
Once we have the notion of a personal assistant, it is easy to view other roles suitable for it. For example, automation in the cloud proceeds with the installation of an agent on any device, even across platforms or companies. It may therefore be imagined that the assistant can also manage other devices within the premises. This helps make the assistant the point of contact for the user. 
The expansion of roles for the personal assistant leads us into the robotics and computer vision space as well. From butlers to life savers, robots are popular in the imaginations presented in the movies. To some degree, the addition of a simple camera to a standalone desktop personal assistant improves its capability beyond that of a voice-activated assistant. We use face recognition, for example, with laptops to allow users to sign in. This works well as one of the sign-in options when there is sufficient visibility and light. It may not be sufficient for more complex image recognition algorithms such as the Active Contour Model across images, but the need to track movable objects is reduced when we have a stationary observer in addition to a listener. Moreover, the primary role of the assistant is to focus on the owner rather than on objects and other devices with which it can talk via the programmability options mentioned earlier. 
There are essentially two points of consideration where we demarcate robots from assistants: the first is whether the assistant is mobile or not. The second is whether the assistant is connected seamlessly to computing at cloud scale. In this way the assistant separates itself from robots as well as from devices that claim connectivity via the internet of things. The expansion of roles and the introduction of capabilities does not take away the identity of the assistant as the one responsible for the user's commands and data. Eventually the need to carry a phone could be replaced with being able to summon an assistant at the nearest available device.

Monday, December 4, 2017

We were discussing the argument for standard query operators here. Standard query operators have an interesting attribute: they are often used in lambda expressions. Lambda expressions are local functions that can be passed around and returned like any first-class objects of the programming world. Software developers love lambdas because they make expressions of logic so much more fun and convenient. They use them to create delegates and expression trees. The former treats logic as a data type and the latter represents logic in a tree-like data structure where each node is an expression. Both can be compiled and run, and they even allow dynamic modification of executable code. 
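The first-class nature of lambdas can be shown with a short sketch (in Python here rather than C#, purely for illustration):

```python
# Lambdas are values: they can be stored, passed as arguments and returned.
def make_adder(n):
    # the returned lambda closes over n, much like a delegate capturing state
    return lambda x: x + n

add_five = make_adder(5)                          # a function built at runtime
doubled = list(map(lambda x: x * 2, [1, 2, 3]))   # a lambda passed as data
```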
When data is processed by logic written in lambdas, the results are predictable, consistent and free from any resource onus. Considerations such as where the logic is deployed, how it is hosted, the topology of the resources and the chores that go with maintaining them no longer weigh on the developer. Even business owners appreciate it when their logic is no longer held ransom to technology, infrastructure or vendors. Furthermore, with the organization or re-organization of code by way of delegates and expression trees, in addition to and not excluding the well-accepted object-oriented programming concepts, the business is now able to move swiftly to market to deliver new and better proposals. 
It is also important to observe here that data technologies and processing never exist in a vacuum. They work within the parameters of their ecosystem by way of a feedback cycle. Lambda expressions have lately demonstrated acceptance even in the cloud computing world in the form of serverless computing.  
In a database, the query optimizer does something very similar to this. It decides the set of tree transformations that are applied to the abstract syntax tree after it has been annotated and bound. These transformations include choosing the join order, rewriting selects, and so on. It not only chooses the order in which the operations are applied but also introduces new operators. Therefore it seems a good candidate for what we have described above. 
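A toy illustration of such a tree transformation is constant folding, where a subtree made entirely of constants is rewritten into a single node. The node shape here is our own, not any real optimizer's:

```python
class Node:
    """A tiny expression-tree node: either a constant or a binary operator."""
    def __init__(self, op, left=None, right=None, value=None):
        self.op, self.left, self.right, self.value = op, left, right, value

def fold(node):
    """Rewrite subtrees made entirely of constants into single constant nodes."""
    if node.op == "const":
        return node
    left, right = fold(node.left), fold(node.right)
    if left.op == "const" and right.op == "const":
        value = left.value + right.value if node.op == "+" else left.value * right.value
        return Node("const", value=value)
    return Node(node.op, left, right)

# (2 + 3) * 4 folds down to the constant 20
tree = Node("*", Node("+", Node("const", value=2), Node("const", value=3)),
            Node("const", value=4))
folded = fold(tree)
```

A real optimizer applies many such rewrites (join reordering, select rewriting) over the annotated and bound tree, but the mechanism is the same: transform the tree, keep the semantics.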
It’s important to note the distinction between databases and serverless computing this way. One tries to push the computations as close to the data as possible while the other tries to distance the computations from the resources used. Traditionally, databases have had to scale up, at least for online processing, while the alternatives have gone for scale-out via batch-oriented processing. Databases have required large memory and continue to take up as much memory as is added. The alternatives have distributed the computations via commodity servers. 
#codingexercise
Yesterday we were given three sorted arrays and we wanted to find one element from each array such that they are closest to each other. One of the ways to do this was explained this way: We could also traverse all three arrays while keeping track of maximum and minimum difference encountered with the candidate set of three elements. We traverse by choosing one element at a time in any one array by incrementing the index of the array with the minimum value.
By advancing only the minimum element, we make sure the sweep is progressive and exhaustive.
We don't have to sweep for the mode that spans all three arrays because, whenever the three elements are identical, the maximum and minimum difference is guaranteed to be zero. For every such occurrence, we then count the number of identical elements in each array.
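The zero-range observation can be sketched directly: whenever the three pointers land on identical values, count the run of that value in each array (the function name is ours):

```python
def common_with_counts(a, b, c):
    """Values present in all three sorted lists, with per-list counts.
       Whenever the three pointers agree (range zero), count the run in each list."""
    i = j = k = 0
    result = {}
    while i < len(a) and j < len(b) and k < len(c):
        lo = min(a[i], b[j], c[k])
        if a[i] == b[j] == c[k]:
            v = a[i]
            ca = cb = cc = 0
            while i < len(a) and a[i] == v: i += 1; ca += 1
            while j < len(b) and b[j] == v: j += 1; cb += 1
            while k < len(c) and c[k] == v: k += 1; cc += 1
            result[v] = (ca, cb, cc)
        elif a[i] == lo:        # otherwise advance the minimum pointer
            i += 1
        elif b[j] == lo:
            j += 1
        else:
            k += 1
    return result
```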

Sunday, December 3, 2017

I was looking for a programmatic way to get an authorization code to log in with Amazon, also called LWA. An authorization code is merely the first step in the authorization code grant as documented here: https://developer.amazon.com/docs/login-with-amazon/authorization-code-grant.html
However, it seems the implementation of OAuth differs between AWS and Alexa. When it is OAuth for a device, the REST APIs are helpful to get a code and a token. The implementation here is geared towards loading the Login With Amazon SDK. There is no denying that OAuth requires a user interface to accept credentials from the user.
http://httpunit.sourceforge.net/doc/api/com/meterware/httpunit/WebResponse.html
 WebConversation wc = new WebConversation();             // httpunit client
 WebResponse resp = wc.getResponse("https://www.amazon.com/ap/signin"); // sign-in page; URL illustrative
 WebForm form = resp.getForms()[0];                      // the login form
 WebResponse response = form.submit();                   // submit the form (credentials would be set on it first)
 String atMainToken = response.getCookieValue("at-main");
The at-main token is not the same as oa2.access_token, but it helps, especially if there is a utility for translation.
However, client interactions could also be tolerated in the absence of a user. While this may not be required by the RFC, it does provide a convenience. The client redirects the user to the authorization server with the following parameters: response_type, client_id, redirect_uri, scope and state. When the authorization server is satisfied, the user is asked to log in to the authorization server and approve the client. The user is then redirected to the redirect_uri along with code and state. This code can then be swapped for a token by the client using the OAuth endpoints.
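The redirect in that flow is just URL construction; a sketch follows, where the endpoint is the one in the LWA documentation and the client values are placeholders:

```python
from urllib.parse import urlencode

# Endpoint per the Login with Amazon docs; client values are placeholders.
AUTHORIZE_ENDPOINT = "https://www.amazon.com/ap/oa"

def build_authorize_url(client_id, redirect_uri, scope, state):
    """Build the redirect that sends the user to the authorization server."""
    params = {
        "response_type": "code",   # authorization code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,            # echoed back to guard against CSRF
    }
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params)

example_url = build_authorize_url(
    "my-client-id", "https://example.com/cb", "profile", "abc123")
```

On return, the client posts the received code (with client credentials and the same redirect_uri) to the token endpoint to obtain the access token.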
The access token can be decoded and validated using the OAuth endpoints. They can also be used to refresh the tokens. To get the access tokens when we load the LWA SDK on a page, the page calling the SDK must be in the same domain as the redirect_uri specified. Anything else would be a cross-domain request and will not be allowed. Even if we set the Origin and Referer header fields with the help of an interceptor to overcome the CORS policy of the browser, the server may reject the request. Moreover, the page making the request with the JavaScript SDK must be secured with https.
#codingexercise
Yesterday we were given three sorted arrays and we wanted to find one element from each array such that they are closest to each other. One of the ways to do this was explained this way: We could also traverse all three arrays while keeping track of maximum and minimum difference encountered with the candidate set of three elements. We traverse by choosing one element at a time in any one array by incrementing the index of the array with the minimum value.
By advancing only the minimum element, we make sure the sweep is progressive and exhaustive.
We don't have to sweep to find the average because we can enumerate all the elements of the arrays to compute the global average. Then we find the elements closest to it in the three arrays.
List<int> GetAveragesFromThreeSortedArrays(List<int> A, List<int> B, List<int> C)
{
    // Concat keeps duplicates; Union would drop them and skew the average
    var combined = A.Concat(B).Concat(C).ToList();
    int average = (int)combined.Average();
    return GetClosestToGiven(A, B, C, average);
}
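GetClosestToGiven is referenced but not listed; one plausible sketch in Python, using binary search per array (the names are ours):

```python
from bisect import bisect_left

def closest_in(arr, x):
    """Element of the sorted list arr closest to x."""
    i = bisect_left(arr, x)
    if i == 0:
        return arr[0]
    if i == len(arr):
        return arr[-1]
    return arr[i] if arr[i] - x < x - arr[i - 1] else arr[i - 1]

def get_closest_to_given(a, b, c, x):
    """One element from each sorted list, each closest to the value x."""
    return [closest_in(arr, x) for arr in (a, b, c)]
```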


Saturday, December 2, 2017

We were discussing the argument for standard query operators. Today I want to contrast this with OData. While services provide scrubbing and analysis over data from tables, OData exposes the entire database to the web so it may be accessed by REST APIs. The caller can then use the database just like any other browsable API and from any device. It uses the well-known HTTP methods, query parameters and request body to carry the web conversation. The difference between standard query operators and this API is that the former standardizes the programming needs across applications while the latter serves to open up a data source to the web. One may even be considered a layer on top of the other, and there is no denying that the former has a lot more flexibility as we mix and match even collections across data sources. The former plays an important role in data virtualization while the latter plays an important role in connecting a data source. Still, they are both services.
There was not much difference between the two when we don't worry about the syntax of the query and we view the results as an enumerable. Even popular relational databases are hosted as a service with programmability features so you can leverage them in your code. Similarly, standard query operators may be implemented entirely in an ORM. With the introduction of microservices, it became easy to host not only a dedicated database but also a dedicated database server instance. Using microservices with Mesos-based clusters and shared volumes, we now have many copies of the server for high availability and failover. This is possibly great for small and segregated data, but larger companies often make massive investments in their data, often standardizing tools, processes and workflows to better manage it. In such cases consumers of the data don't talk to the database directly but via a service that may even sit behind a message bus.
If the consumers proliferate, they end up creating and sharing many different instances of services for the same data, each with its own view rather than the actual table. APIs for these services are more domain-based rather than implementing a query-friendly interface that lets you directly work with the data. As services are organized, data may get translated or massaged as it makes its way from one to another. I have seen several forms of organizing the services, from service-oriented architecture at the enterprise level to fine-grained individual microservices. It is possible to have a bouquet of microservices that can take care of most data processing for the business requirements. Data may even be at most one or two fields of an entity along with its identifier for such services. This works very well to alleviate the onus and rigidity that come with organization, the interactions between the components and the various chores that need to be taken care of to keep it flexible to suit changing business needs. 
The flat ring of services, on the other hand, is already business-friendly to begin with, letting services do their work. The graph of service dependencies may get heavily connected, but at least it becomes better understood, with very little of the stickiness that comes with ownership of data. Therefore, a vast majority of services may now be decoupled from any data ownership considerations, and those that do own data may find it convenient not to remain database-specific and can even form a chain if necessary. 
#codingexercise
Yesterday we were given three sorted arrays and we wanted to find one element from each array such that they are closest to each other. One of the ways to do this was explained this way: We could also traverse all three arrays while keeping track of maximum and minimum difference encountered with the candidate set of three elements. We traverse by choosing one element at a time in any one array by incrementing the index of the array with the minimum value.
By advancing only the minimum element, we make sure the sweep is progressive and exhaustive.
List<int> GetClosest(List<int> A, List<int> B, List<int> C)
{
    var ret = new List<int>();
    int i = 0;
    int j = 0;
    int k = 0;
    int diff = int.MaxValue;
    while (i < A.Count && j < B.Count && k < C.Count)
    {
        var candidates = new List<int>() { A[i], B[j], C[k] };
        int range = candidates.Max() - candidates.Min();
        if (range < diff)
        {
            diff = range;
            ret = candidates.ToList();
        }
        if (range == 0) return ret; // identical elements, cannot do better
        // advance the index of the array holding the minimum element
        if (candidates.Min() == A[i])
        {
            i++;
        } else if (candidates.Min() == B[j])
        {
            j++;
        } else {
            k++;
        }
    }
    return ret;
}
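The same sweep in Python, for quick verification (a direct translation of the C# above):

```python
def get_closest(a, b, c):
    """One element from each sorted list so the three are closest together."""
    i = j = k = 0
    best, best_range = [], float("inf")
    while i < len(a) and j < len(b) and k < len(c):
        cand = [a[i], b[j], c[k]]
        rng = max(cand) - min(cand)
        if rng < best_range:
            best_range, best = rng, cand[:]
        if rng == 0:
            return best  # identical elements, cannot do better
        # advance the pointer at the minimum value
        if cand[0] == min(cand):
            i += 1
        elif cand[1] == min(cand):
            j += 1
        else:
            k += 1
    return best
```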

Friday, December 1, 2017

The argument for standard query operators. 
Recently I came across a mindset among the folks of a company that databases are bad and services are good. There is not much difference between the two when we don't worry about the syntax of the query and we view the results as an enumerable. Even popular relational databases are hosted as a service with programmability features so you can leverage them in your code. With the introduction of microservices, it became easy to host not only a dedicated database but also a dedicated database server instance. Using microservices with Mesos-based clusters and shared volumes, we now have many copies of the server for high availability and failover. This is possibly great for small and segregated data, but larger companies often make massive investments in their data, often standardizing tools, processes and workflows to better manage it. In such cases consumers of the data don't talk to the database directly but via a service that may even sit behind a message bus. If the consumers proliferate, they end up creating and sharing many different instances of services for the same data, each with its own view rather than the actual table. APIs for these services are more domain-based rather than implementing a query-friendly interface that lets you directly work with the data. As services are organized, data may get translated or massaged as it makes its way from one to another. I have seen several forms of organizing the services, from service-oriented architecture at the enterprise level to fine-grained individual microservices. It is possible to have a bouquet of microservices that can take care of most data processing for the business requirements. Data may even be at most one or two fields of an entity along with its identifier for such services. 
This works very well to alleviate the onus and rigidity that come with organization, the interactions between the components and the various chores that need to be taken care of to keep it flexible to suit changing business needs. The flat ring of services, on the other hand, is already business-friendly to begin with, letting services do their work. The graph of service dependencies may get heavily connected, but at least it becomes better understood, with very little of the stickiness that comes with ownership of data. Therefore, a vast majority of services may now be decoupled from any data ownership considerations, and those that do own data may find it convenient not to remain database-specific and can even form a chain if necessary. 
Enterprise architects strive to lay down the rules for different services, but most teams are more than willing to embrace their company's initiatives, including investments in the cloud or making their services more consistent with the others. Unless a team specifically asks for one-off treatment by way of non-traditional databases or special requirements, they are happy to use cookie cutters or corral the processing into a service. Instead, if these same architects were also to take on the responsibility of opening up some services with APIs implementing standard query operators on their data, akin to what a well-known managed language does or what web developers practice with their REST APIs using standard query parameters, they would do away with much of the case-by-case needs that come their way. In essence, promoting standard query operators for data over and on top of business interactions with the service seems a win-win for everyone. 
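A minimal sketch of what such standard query operators over a service could look like, with REST-style parameters applied uniformly to any collection (the parameter names here are illustrative, not OData's):

```python
# Filter, order and limit any collection of dicts, driven purely by the
# query parameters a REST caller would pass in the URL.
def apply_query(rows, params):
    out = list(rows)
    if "filter_field" in params:
        out = [r for r in out if r.get(params["filter_field"]) == params["filter_value"]]
    if "orderby" in params:
        out = sorted(out, key=lambda r: r[params["orderby"]])
    if "top" in params:
        out = out[: int(params["top"])]
    return out
```

Because the operators are uniform, any service can expose them over its data without bespoke, domain-only endpoints.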
#codingexercise
Yesterday we were given three sorted arrays and we were finding one element from each array such that the element is closest to a given value, taking one element from each of the arrays.
Now if we want to find one element from each array such that they are closest to each other, we can reuse the GetClosest method from earlier, iterating over every element of one of the arrays until the criterion is satisfied; we check the absolute value of the difference to the candidate value. Alternatively, we could traverse all three arrays while keeping track of the maximum and minimum difference encountered with the candidate set of three elements. We traverse by choosing one element at a time in any one array, incrementing the index of the array with the minimum value.

By advancing only the minimum element, we make sure the sweep is progressive and exhaustive.