Cluster computing

Saturday, March 31, 2018

Driver-Rider Real-Time Monitoring
Introduction:
Problem Statement: When parents carpool their kids for daily commutes, the location of the vehicle and its riders is generally unknown to everyone else for the duration of the ride. Map-based mobile applications may enable driver-rider location sharing, but their real-time notifications to stationary observers are expensive for both the device in the moving vehicle and the handheld of the stationary observer. This document proposes an efficient solution that scales elastically with the number of observers and rides, and that is not only performant but also convenient, requiring nothing more than a browser interface for the publisher and the subscriber.

Differentiation:
Google Maps and Uber/Lyft applications enable location sharing, but they heat up the mobile device when turned on. Moreover, neither kind of application lets an observer follow more than the single ride shared by a particular participant.
Design:
Native mobile applications are not needed: a browser-based application that can query the mobile operating system's location-sharing primitives is sufficient to publish data to a cloud-based service.
A message queue broker in the cloud handles the exchange of messages required to share location information between a publisher and an elastic group of observers. A global cloud database keeps track of all the information about rides and observers.
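The publisher and observer roles described above can be sketched with a small in-memory stand-in for the cloud broker. This is only an illustration under assumptions: the names `RideBroker`, `LocationUpdate`, `subscribe` and `publish` are hypothetical, and a real deployment would use a managed queue service instead of Python callbacks.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class LocationUpdate:
    """One location sample published by the device in the vehicle (hypothetical shape)."""
    ride_id: str
    latitude: float
    longitude: float


class RideBroker:
    """In-memory stand-in for a cloud message queue with one topic per ride."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[LocationUpdate], None]]] = defaultdict(list)

    def subscribe(self, ride_id: str, callback: Callable[[LocationUpdate], None]) -> None:
        # Any number of observers can join a ride, and one observer can join many rides.
        self._subscribers[ride_id].append(callback)

    def publish(self, update: LocationUpdate) -> int:
        # Fan the update out to every observer of this ride; return how many received it.
        callbacks = list(self._subscribers.get(update.ride_id, []))
        for callback in callbacks:
            callback(update)
        return len(callbacks)


broker = RideBroker()
parent_a: List[LocationUpdate] = []
parent_b: List[LocationUpdate] = []
broker.subscribe("ride-1", parent_a.append)
broker.subscribe("ride-1", parent_b.append)
broker.subscribe("ride-2", parent_b.append)  # one observer watching two rides
delivered = broker.publish(LocationUpdate("ride-1", 47.61, -122.33))
```

The publisher only ever talks to the broker, so adding or removing observers does not change the work done on the moving device.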
Architecture:
Cloud-based technologies are sufficient for this purpose, and the web interface for browser display can be based on server-side page rendering.
Performance:
Cloud-based technologies such as a queue service and a cloud database are sufficient to deliver the required performance. Standalone, superfast Erlang-based applications are not required.
Security:
Access control is based on row-level security in the database, enabling granular control over all necessary assets.
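The effect of row-level access control can be emulated in a few lines. This sketch is an assumption-laden illustration, not the actual design: SQLite has no built-in row-level security, so the check is expressed as a join against a hypothetical `ride_access` table, and the table and column names are made up for the example. A cloud database with native row-level security would enforce the same predicate inside the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: which observer may see which ride, and the location samples.
conn.execute("CREATE TABLE ride_access (ride_id TEXT, observer_id TEXT)")
conn.execute("CREATE TABLE locations (ride_id TEXT, latitude REAL, longitude REAL)")
conn.executemany(
    "INSERT INTO ride_access VALUES (?, ?)",
    [("ride-1", "alice"), ("ride-2", "bob")],
)
conn.executemany(
    "INSERT INTO locations VALUES (?, ?, ?)",
    [("ride-1", 47.61, -122.33), ("ride-2", 47.67, -122.12)],
)


def visible_locations(connection: sqlite3.Connection, observer_id: str) -> list:
    """Return only the location rows of rides the observer has been granted."""
    cursor = connection.execute(
        "SELECT l.ride_id, l.latitude, l.longitude "
        "FROM locations AS l "
        "JOIN ride_access AS a ON a.ride_id = l.ride_id "
        "WHERE a.observer_id = ? "
        "ORDER BY l.ride_id",
        (observer_id,),
    )
    return cursor.fetchall()
```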
Testing:
All server-side code, regardless of the tier involved, can be unit-tested, and integration-tested with Selenium browser-based tests.
Conclusion:
A publisher-subscriber application for location-sharing mobile units can be applied to a variety of domains.

#codingexercise

Given a triangular structure of members, find the minimum sum path from top to bottom:

Solution: Since each level has to be represented exactly once, and assuming any member of a level may follow any member of the level above it, pick the minimum of each level and add it to the desired sum.

Otherwise, the exhaustive case is:

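// Requires: using System.Collections.Generic; using System.Diagnostics; using System.Linq;
// Assumes every row of A holds cols valid members.
int GetMinSumPathTopToBottom(int[,] A, int rows, int cols, int i, int j)
{
    // Past the last level there is nothing left to add.
    if (i == rows) { return 0; }
    Debug.Assert(0 <= i && i < rows && 0 <= j && j < cols);
    int sum = A[i, j];
    // Try every member of the next level and keep the cheapest continuation.
    var locals = new List<int>();
    for (int k = 0; k < cols; k++)
    {
        locals.Add(GetMinSumPathTopToBottom(A, rows, cols, i + 1, k));
    }
    sum += locals.Min();
    return sum;
}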
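
As a cross-check of the two approaches, here is a small Python sketch that uses a jagged list for the triangle. The function names are hypothetical, and it keeps the same assumption as the solution above: any member of a level may follow any member of the previous level, so the per-level minimum and the exhaustive search should agree.

```python
from typing import List


def min_sum_greedy(triangle: List[List[int]]) -> int:
    """Pick the smallest member of every level and add them up."""
    return sum(min(level) for level in triangle)


def min_sum_exhaustive(triangle: List[List[int]], i: int = 0, j: int = 0) -> int:
    """Try every member of the next level recursively, starting from triangle[i][j]."""
    if i == len(triangle):
        return 0
    total = triangle[i][j]
    if i + 1 < len(triangle):
        total += min(
            min_sum_exhaustive(triangle, i + 1, k)
            for k in range(len(triangle[i + 1]))
        )
    return total


triangle = [[2], [3, 4], [6, 5, 7], [4, 1, 8, 3]]
greedy_result = min_sum_greedy(triangle)
exhaustive_result = min_sum_exhaustive(triangle)
```

The greedy version visits each member once, while the exhaustive version revisits lower levels repeatedly, which is why the per-level minimum is preferred when the assumption holds.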

Friday, March 30, 2018

Starting today, we will discuss Microsoft Dynamics AX from the Dynamics suite of products. Until now, we have talked about reporting solutions and dashboards from a variety of technology stacks such as Grafana, Splunk and SSRS, but it would be interesting to see such techniques applied to customer relationship management. Dynamics 365 is more than a personal book of customers: it provides professionals with tools for managing their data and for updating records and status, both online and offline.
Dynamics AX enables us to capture our expense transactions and receipt information. It also helps us create and submit timesheets. Before Dynamics AX came along, we had to rely on expensive ERP software.
Some of the major benefits from AX include the following:
1) While most enterprise resource planning software consisted of traditional enterprise products, Dynamics AX is built with cloud services. This facilitates ubiquity and reachability from any device anywhere. Not only does this simplify production deployments of the product offerings, it also enables continuous availability and updates. Along with the usual benefits of cloud computing such as elastic scaling, load balancing and regional availability, the services powering Dynamics AX also bring improved handling of data. Improvements in data handling were previously mentioned here: https://goo.gl/n4G2TU
2) Another advantage of Dynamics AX is the integrations it can perform via connectors, plugins and data sources. More data generally means better reports and more meaningful insight, not just from a statistics perspective but also from analysis techniques such as those in Power BI.
3) Faster development and deployment - The entire software development life cycle with off-the-shelf products such as AX becomes far easier and continuous without any compromise in control, governance and compliance. This is a big win for Common Criteria certification.
4) Improved development environment - The development environment could hardly be better than the Visual Studio and .NET Framework integration, along with the rich set of tools that come with that IDE. The entire Application Object Tree (AOT) can be browsed in Visual Studio. Development is done in the X++ language, which is specific to accounting and business management systems. It is considered on par with managed languages such as C#.
5) Scoping - Earlier, users were granted access via role centers. Dynamics AX uses workspaces, the equivalent of namespaces, which enable users to focus on the most important aspects of their tasks.
6) Web interface - Cloud services power a rich user interface in the web browser, which makes it nearly ubiquitous to access and perform daily activities.

Thursday, March 29, 2018

Software application metric for aging

Introduction:

The software development life cycle is the ritual of planning, creating, testing and deploying an information system. As the cycle repeats, a lot of time is spent on maintenance. As the software ages, it spends more time in this stage. This essay tries to articulate a metric for determining the age of software and how to keep it fresh.

Description:

Software quality metrics generally fall into three categories. They are either:

Product based – These capture the characteristics of the product such as size, complexity, design features, performance and quality level.

Process based – These attempt to track the activities taken for development and maintenance of software.

Project based – These attempt to describe the resources, timeline, cost, schedule and productivity associated with the software project.

The metrics associated with software quality are more typically product based and process based than project based. Yet they do not indicate whether the software has matured, a term loosely used to describe the point where fixing defects becomes so costly that continued maintenance costs more than writing new software.

Periodically refactoring and rewriting modules and components of the software alleviates the need for massive rewrites by replacing smaller chunks of the overall code. Most applications are well organized and written from the start to allow flexibility with few or controlled changes to the code. Yet a single method might become overwhelmingly complex over time with the addition of more and more branching logic. For example, a method to draw a shape might require different handling depending on the parameters passed to it. Consequently, the code becomes so convoluted with handling these different cases that it is termed spaghetti code. A metric for branching is cyclomatic complexity, which counts the number of linearly independent paths through the code, roughly the number of decision points plus one. Metrics such as cyclomatic complexity describe the current state of the software, but there is no metric that tracks how this complexity progresses over time.
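
To make the cyclomatic complexity figure concrete, here is a rough sketch that approximates it for Python source by counting decision points with the standard `ast` module and adding one. The function name and the choice of which nodes count as decisions are assumptions of this sketch; dedicated tools handle more constructs (comprehensions, match statements and so on).

```python
import ast

# Node types treated as decision points in this simplified count.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as the number of decision points plus one."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


draw_shape_source = '''
def draw(shape, filled):
    if shape == "circle":
        if filled:
            return "filled circle"
        return "circle"
    elif shape == "square":
        return "square"
    return "unknown"
'''
draw_complexity = cyclomatic_complexity(draw_shape_source)
```

Running the same count on every revision of a method would give the kind of over-time trend that the paragraph above says is missing.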

In this regard, a metric known as entropy can be used to indicate the level of maintenance involved in any module or organizational unit of software. A metric that progresses monotonically over time, remains stable and cannot be compromised becomes a great indicator of aging. With the help of vectors and features, software organizational units can be represented and classified with the same rigor as in the vector space model. A multidimensional metric involving multiple scalars then becomes a convenient indicator of the age of the software.
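
The essay does not define entropy precisely, so the following is only one possible reading: the Shannon entropy of how maintenance changes are distributed across modules in a given period. The function name and the input format (a mapping from module name to change count) are assumptions made for this sketch. Note that this particular quantity is not monotonic by itself; it would be one scalar among the several that make up the multidimensional metric.

```python
import math
from typing import Mapping


def change_entropy(changes_per_module: Mapping[str, int]) -> float:
    """Shannon entropy (in bits) of the distribution of changes across modules."""
    total = sum(changes_per_module.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in changes_per_module.values():
        if count > 0:
            probability = count / total
            entropy -= probability * math.log2(probability)
    return entropy


focused = change_entropy({"billing": 10, "reports": 0})    # all work in one module
scattered = change_entropy({"billing": 4, "reports": 4})   # work spread evenly
```

A low value means maintenance is concentrated in one place, while a high value means changes are scattered across the code base.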

Conclusion:

A software metric for aging continues to be a challenge, but advances in using the vector space model along with neural nets can help identify the pain points better, allowing the overall software to remain young.

Wednesday, March 28, 2018

We were discussing the advantages of a JavaScript SDK over JSP pagelets. The separation of an SDK from an API only helps client-side development. It does not pose any disruption to the existing services and models of the service provider. The SDK forms a separate layer on top of the services so that business clients can choose to use the existing services or develop newer modes of content display using the JavaScript SDK. Powerful jQuery plugins, together with those exported by the service provider, form a strong combination that not only offloads the customization of display but also improves integration into business workflows by replacing the existing interruption mechanisms with seamless ones. The SDK need not be only in JavaScript; other languages can be used, facilitating a broader ecosystem. Moreover, a command-line interface may also be built on the same REST APIs that enable an SDK. Samples or even exportable SDKs can be developed by the service provider by consuming the same APIs.
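Since the SDK need not be in JavaScript, the layering can be sketched in Python. Everything here is hypothetical: the `ContentSdk` class, the `/widgets/{id}` endpoint and the response fields are invented for illustration, and the HTTP call is injected as a function so that the same SDK could back a browser widget, a command-line tool or a test without touching the service.

```python
import json
from typing import Callable, Optional

# A transport takes (method, url, payload) and returns the response body as text.
Transport = Callable[[str, str, Optional[dict]], str]


class ContentSdk:
    """Thin client-side layer over a REST API (endpoint names are made up)."""

    def __init__(self, base_url: str, transport: Transport) -> None:
        self._base_url = base_url.rstrip("/")
        self._transport = transport

    def get_widget(self, widget_id: str) -> dict:
        # The SDK only wraps the existing API; the service itself is unchanged.
        body = self._transport("GET", f"{self._base_url}/widgets/{widget_id}", None)
        return json.loads(body)

    def render_widget(self, widget_id: str, template: str = "{title}: {text}") -> str:
        # Display customization lives with the client, not with the service provider.
        return template.format(**self.get_widget(widget_id))


def fake_transport(method: str, url: str, payload: Optional[dict]) -> str:
    """Stand-in for a real HTTP call so the sketch runs without a network."""
    return json.dumps({"title": "Welcome", "text": f"{method} {url}"})


sdk = ContentSdk("https://api.example.test/", fake_transport)
default_view = sdk.render_widget("42")
custom_view = sdk.render_widget("42", template="<h1>{title}</h1>")
```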
The service provider may also maintain its own user interface, likewise powered by the REST APIs and the JavaScript SDK that it exports to its clients. In other words, the native user interface from the service provider and the self-customized interface from the client can exist side by side, serving disjoint audiences but maintained together as identity resources with the service provider.
Moreover, with the adoption of cloud-based technologies, PaaS platforms and containers, the client-side technologies may be freed up and made popular with their developer community. This lets the service provider develop more and more widgets and consolidate best practices that others may not want to invest in. Furthermore, the service provider may embrace partnerships with vendors for different workflows and interfaces while consolidating the server-side APIs across data types.
The disadvantages of a JavaScript SDK compared with JSP are that it gives up the following benefits of server-side pages:
1) single point maintenance facilitated by all server side code
2) consolidation and consistency in views and all customizations via parameters
3) tight control of client side displays and customizations
4) arguably improved security through less surface area.

Tuesday, March 27, 2018

Trade-offs between a JavaScript SDK and Java pagelets from a service provider:
Service providers can ship a JavaScript SDK to improve customization and programmability.
This does not pose any disruption to the existing services and models of the service provider. The SDK forms a separate layer on top of the services so that business clients can choose to use the existing services or develop newer modes of content display using the JavaScript SDK. Powerful jQuery plugins, together with those exported by the service provider, form a strong combination that not only offloads the customization of display but also improves integration into business workflows by replacing the existing interruption mechanisms with seamless ones.
There is a tradeoff in exporting a JavaScript SDK from the service provider instead of having the service provider supply the client-side display: the provider has to rely on the clients to send the data securely and without compromise. This is generally difficult to do without an all-in approach. However, the way the service provider passes data between its own services is similar to how an external service might send credentials to the service provider. Therefore, these services and the UI can also be hosted as a single origin for the JavaScript SDK while the service provider remains exclusively API based.
The service provider may also maintain its own user interface, likewise powered by the REST APIs and the JavaScript SDK that it exports to its clients. In other words, the native user interface from the service provider and the self-customized interface from the client can exist side by side, serving disjoint audiences but maintained together as identity resources with the service provider.
Is it safe for the data to be sent over HTTPS across several external networks? This question is not really solved by the pagelet technology from the service provider. That said, it is true that data can be compromised when transferred from network to network. It is also true that procuring and processing data only with server-side technology reduces surface area and client involvement. However, pagelet technologies do not decentralize the development of the interface or the technologies that are used to deploy it. Moreover, with the adoption of cloud-based technologies, PaaS platforms and containers, the client-side technologies may be freed up and made popular with their developer community. This lets the service provider develop more and more widgets and consolidate best practices that others may not want to invest in. Furthermore, the identity provider may embrace partnerships with vendors for different workflows and interfaces while consolidating the server-side APIs across data types.

To list the advantages of JavaScript over JSP, we can include:

1) writing and debugging via browser is easier

2) no more compilation required

3) performance on par or even better with modular and refactored code

4) standard REST interface adoption

5) JsUnit is available for unit-testing so any existing language features are not lost with JavaScript.
Login screen enhancement: https://1drv.ms/w/s!Ashlm-Nw-wnWtWiupMGBf_WbEJxS

Monday, March 26, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
We were discussing the Euclidean distance and the distance matrix. With Euclidean distance, the chi-square measure, which is a sum of squared errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to the embedding space. By using a notion of error, we can make sure that the shapes and images embedded in the space do not violate the intra-member distances. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods during embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.
The Sammon error is a weighted sum of squared differences between the original pairwise distances and the embedding pairwise distances. Each term is divided by the original distance, so pairs of dissimilar shapes, which lie far apart, are weighted down.
The embedding of shapes and images proceeds by minimizing the Sammon error using non-linear multidimensional scaling (MDS). MDS is a means of visualizing the level of similarity of individual cases in a dataset by placing the items in an N-dimensional space so that the intra-member distances are preserved as well as possible.
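For reference, the standard form of Sammon's error can be computed directly from two sets of points, as in the following sketch. The function name is hypothetical, plain Euclidean distance is assumed in both spaces, and pairs that coincide in the original space are skipped to avoid dividing by zero, which is a choice of this sketch rather than something stated in the discussion.

```python
import math
from itertools import combinations
from typing import Sequence


def sammon_error(original: Sequence[Sequence[float]],
                 embedded: Sequence[Sequence[float]]) -> float:
    """Sammon's error between points in the original space and their embedding."""
    if len(original) != len(embedded):
        raise ValueError("both point sets must have the same number of points")
    scale = 0.0      # sum of original pairwise distances (normalization term)
    weighted = 0.0   # sum of squared differences, each divided by the original distance
    for i, j in combinations(range(len(original)), 2):
        d_original = math.dist(original[i], original[j])
        if d_original == 0.0:
            continue  # coincident points carry no distance information
        d_embedded = math.dist(embedded[i], embedded[j])
        scale += d_original
        weighted += (d_original - d_embedded) ** 2 / d_original
    return weighted / scale if scale else 0.0


high_dim = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]         # original distance 5
faithful = sammon_error(high_dim, [(0.0, 0.0), (5.0, 0.0)])
distorted = sammon_error(high_dim, [(0.0, 0.0), (4.0, 0.0)])
```

Because each term is divided by the original distance, an error of one unit on a nearby pair costs more than the same error on a distant pair, which is what preserves local neighborhoods.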
We saw how the embedding space is created. Mapping new shapes takes slightly more effort. The space was originally constructed with a set of 3D shapes that were jointly embedded. Introducing a new shape requires us to find an embedding point for it. The steps for this include:
First, a feature vector is computed.
Second, pairwise distances are computed.
Third, we minimize the Sammon error, this time applying the Liu-Nocedal method, a large-scale optimization method that combines BFGS steps and conjugate direction steps. BFGS is an iterative method for solving unconstrained non-linear optimization problems.
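The three steps can be sketched on a toy problem. To keep the example short and dependency-free, this sketch swaps the Liu-Nocedal method for plain gradient descent with step halving, and it works in a 2D embedding space; the function names, the starting point (the centroid of the existing points) and the iteration count are all assumptions. Only the new point moves, while the already embedded shapes stay fixed.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def point_error(y: Point, anchors: Sequence[Point], targets: Sequence[float]) -> float:
    """Sammon-style error of one new point against fixed anchors (constant factor dropped)."""
    return sum((t - math.dist(y, a)) ** 2 / t for a, t in zip(anchors, targets))


def point_gradient(y: Point, anchors: Sequence[Point], targets: Sequence[float]) -> Point:
    gx = gy = 0.0
    for a, t in zip(anchors, targets):
        d = math.dist(y, a)
        if d == 0.0:
            continue  # gradient is undefined exactly on an anchor; skip that term
        coeff = 2.0 * (d - t) / (t * d)
        gx += coeff * (y[0] - a[0])
        gy += coeff * (y[1] - a[1])
    return gx, gy


def embed_new_point(anchors: Sequence[Point], targets: Sequence[float],
                    iterations: int = 200) -> Point:
    """Place a new point so its distances to the anchors match the target distances.

    anchors: embedded positions of existing shapes.
    targets: positive distances from the new shape to those shapes in the original space.
    """
    y = (sum(a[0] for a in anchors) / len(anchors),
         sum(a[1] for a in anchors) / len(anchors))
    error = point_error(y, anchors, targets)
    step = 1.0
    for _ in range(iterations):
        gx, gy = point_gradient(y, anchors, targets)
        candidate = (y[0] - step * gx, y[1] - step * gy)
        candidate_error = point_error(candidate, anchors, targets)
        if candidate_error < error:
            y, error = candidate, candidate_error
        else:
            step *= 0.5  # overshoot: retry with a smaller step
    return y


existing = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
distances_to_new_shape = [5.0, 3.0, 4.0]  # consistent with a point at (4, 3)
new_point = embed_new_point(existing, distances_to_new_shape)
```

With many shapes and a higher-dimensional space, a limited-memory quasi-Newton method such as the one named above would be the more practical choice than this simple descent.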
The 3D shapes in the embedding space have abundant information to train the CNN and also to perform data generation. The shapes are represented as clean and complete meshes, which allows control and flexibility. Many images can be generated from the shapes using a rendering process. This is called image setting. In the embedding space, a shape is mapped to a point. For each image, its association with a shape is automatically known. The collection of images and shapes forms the training data for the CNN.
CNN models can approximate high-dimensional, non-linear functions. Recall that the feature vector has a large number of attributes and the Sammon error minimization objective is a non-linear function. A CNN can learn millions of parameters, so it can be precise and informative once it is trained on a large amount of data. If the data is insufficient, the CNN cannot learn enough latent information and the results suffer from overfitting. When the images are generated with rich variation in lighting and viewpoint and superimposed on random backgrounds, the CNN has sufficient data. Approximately 1 million images are synthesized per category.