Cluster computing

Thursday, March 29, 2018

Software application metric for aging

Introduction:

The software development life cycle is the routine of planning, creating, testing and deploying an information system. As the cycle repeats, a growing share of the time goes to maintenance, and the older the software gets, the more time it spends in that stage. This essay tries to articulate a metric for determining the age of software and suggests how to keep it fresh.

Description:

Software quality metrics generally fall into three categories:

Product based – These capture the characteristics of the product such as size, complexity, design features, performance and quality level.

Process based – These attempt to track the activities taken for development and maintenance of software.

Project based – These attempt to describe the resources, timeline, cost, schedule and productivity associated with the software project.

The metrics associated with software quality are typically product based or process based rather than project based. Yet they do not indicate whether the software has matured, a term loosely used to describe the point at which fixing defects costs so much that continued maintenance is more expensive than newly written software.

Periodically refactoring and rewriting modules and components of the software helps avoid massive rewrites by replacing smaller chunks of the overall code. Most applications are well organized from the start and are written to allow flexibility with few or controlled changes to the code. Yet a single method might become overwhelmingly complex over time as more and more branches of logic are added. For example, a method that draws a shape might require different handling depending on the parameters passed to it. The code can become so convoluted from handling these different cases that it is termed spaghetti code. A metric for branching is cyclomatic complexity, which counts the number of independent paths through a piece of code, roughly the number of decision points plus one. Metrics such as cyclomatic complexity describe the current state of the software, but there is no metric that keeps track of how this complexity progresses over time.
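As a rough illustration of the branching metric mentioned above, the following sketch approximates cyclomatic complexity for Python source by counting decision points with the standard `ast` module and adding one. The function name, the choice of which node types count as decisions, and the `draw_shape` sample are assumptions made for this example, not a standard tool.

```python
import ast

# Node types treated as decision points (an assumption of this sketch).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate complexity: 1 + number of decision points in the source."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands, hence two extra branches.
            decisions += len(node.values) - 1
    return 1 + decisions


# Hypothetical shape-drawing method with case-by-case handling.
DRAW_SHAPE = '''
def draw_shape(kind, filled):
    if kind == "circle":
        return "filled circle" if filled else "circle"
    elif kind == "square":
        return "square"
    return "unknown"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(DRAW_SHAPE))  # two ifs + one conditional expression + 1 = 4
```

Recording this number for the same method at every release would give the kind of history that a single snapshot lacks.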

In this regard, a metric known as entropy can indicate the level of maintenance involved in any module or organizational unit of software. A metric that progresses monotonically over time, remains stable and cannot be compromised then becomes a good indicator of aging. With the help of vectors and features, software organizational units can be represented and classified with the same rigor as in the vector space model. A multidimensional metric made up of multiple scalars is then a convenient way to indicate the age of the software.
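The post does not define the entropy formula, so the sketch below is only one possible reading: the Shannon entropy of how changes in a given period are spread across modules, normalized to the range 0 to 1. The function name and the sample change counts are hypothetical.

```python
import math
from typing import Dict


def change_entropy(changes_per_module: Dict[str, int]) -> float:
    """Normalized Shannon entropy (0..1) of how changes spread over modules.

    0 means all changes landed in one module; 1 means they were spread evenly.
    This definition is an assumption of the sketch, not taken from the post.
    """
    total = sum(changes_per_module.values())
    if total == 0 or len(changes_per_module) < 2:
        return 0.0
    active = [count for count in changes_per_module.values() if count > 0]
    entropy = -sum((c / total) * math.log2(c / total) for c in active)
    return entropy / math.log2(len(changes_per_module))


if __name__ == "__main__":
    # Hypothetical change counts for one release period.
    print(change_entropy({"auth": 12, "billing": 3, "ui": 1}))
```

Computed per period and stored alongside other scalars such as size or complexity, a value like this could serve as one coordinate of the multidimensional metric described above.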

Conclusion:

Software metric for aging continues to be a challenge but advances in using vector space model along with neural net can help determine the pain points better allowing the overall software to remain young.

Wednesday, March 28, 2018

We were discussing the advantages of a JavaScript SDK over JSP pagelets. The separation of an SDK from an API only helps client-side development. This does not pose any disruption to the existing services and models of the service provider. The SDK forms a separate layer on top of the services so that the business clients can choose to use the existing services or develop newer modes of content display using the JavaScript SDK. Powerful jQuery plugins, together with those exported by the service provider, not only offload the customization of display but also improve integration into business workflows by replacing the existing interruption mechanisms with seamless ones. The SDK need not be just in JavaScript; other languages can be used, which facilitates a broader ecosystem. Moreover, a command line interface may also be built on the same REST APIs that enable an SDK. Samples or even exportable SDKs can be developed by the service provider by consuming the same APIs.
The service provider may also maintain its own user interface, also powered by consuming the REST APIs and the JavaScript SDK that it exports to its clients. In other words, the native user interface from the service provider and the self-customized interface from the client can exist side by side, serving disjoint audiences but maintained together as identity resources with the service provider.
Moreover, with the adoption of cloud-based technologies, PaaS platforms and containers, the client-side technologies may be freed up and made popular with their developer community. This lets the service provider develop more and more widgets and consolidate best practices that others may not want to invest in. Furthermore, the service provider may embrace partnerships with vendors for different workflows and interfaces while consolidating the server-side APIs across data types.
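Since the SDK need not be in JavaScript, here is a minimal sketch of the same layering in Python: a thin client that only wraps REST calls, so a command line tool or another interface could reuse it. The `ContentClient` class, the `/widgets/{id}` path and the fake transport are all hypothetical; a real SDK would call the service provider's actual endpoints over HTTPS.

```python
import json
from typing import Any, Callable, Dict

# A transport takes (method, path, body) and returns the response body as text.
Transport = Callable[[str, str, str], str]


class ContentClient:
    """Hypothetical thin SDK layer over a service provider's REST API."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def get_widget(self, widget_id: str) -> Dict[str, Any]:
        """Fetch one widget description; the path is an assumed example."""
        raw = self._transport("GET", f"/widgets/{widget_id}", "")
        return json.loads(raw)


def fake_transport(method: str, path: str, body: str) -> str:
    """Stand-in for an HTTP call so the sketch runs offline."""
    return json.dumps({"method": method, "path": path, "title": "Login"})


if __name__ == "__main__":
    client = ContentClient(fake_transport)
    print(client.get_widget("login")["title"])
```

Because the transport is injected, the service layer stays untouched and the same client can back a user interface, a sample or a command line tool.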
The disadvantages of JavaScript relative to JSP are the benefits of server-side pagelets that it gives up:
1) single point maintenance facilitated by all server side code
2) consolidation and consistency in views and all customizations via parameters
3) tight control of client side displays and customizations
4) arguably improved security through less surface area.

Tuesday, March 27, 2018

Trade-offs between a JavaScript SDK and Java pagelets from a service provider:
Service providers can ship a JavaScript SDK to improve customization and programmability.
This does not pose any disruption to the existing services and models of the service provider. The SDK forms a separate layer over and on top of the services so that the business clients can choose to use the existing services or develop newer modes of content display using the JavaScript SDK. Powerful jQuery plugins and those exported by the service provider provide an immense combination to not only offload the customization of display but also enhance their integration into business workflows by replacing the existing interruption mechanisms with seamless business workflows.
There is a tradeoff in exporting a JavaScript SDK instead of having the service provider render the client-side display itself. The service provider has to rely on the clients to send the data securely without compromise. This is generally difficult to do without an all-in approach. However, the way the service provider passes data between its own services is similar to how an external service sends credentials to the service provider. Therefore, these services and the UI can also be hosted as a single origin for the JavaScript SDK while the service provider remains exclusively API based.
The service provider may also maintain its own user interface, also powered by consuming the REST APIs and the JavaScript SDK that it exports to its clients. In other words, the native user interface from the service provider and the self-customized interface from the client can exist side by side, serving disjoint audiences but maintained together as identity resources with the service provider.
Is it safe for the data to be sent over HTTPS across several external networks? This question is not really solved by the pagelet technology from the service provider. That said, it's true that data can be compromised when transferred from network to network. It's also true that procuring and processing data only with server-side technology reduces surface area and client involvement. However, pagelet technologies do not decentralize the development of the interface or the technologies that are used to deploy it. Moreover, with the adoption of cloud-based technologies, PaaS platforms and containers, the client-side technologies may be freed up and made popular with their developer community. This lets the service provider develop more and more widgets and consolidate best practices that others may not want to invest in. Furthermore, the identity provider may embrace partnerships with vendors for different workflows and interfaces while consolidating the server-side APIs across data types.

To list the advantages of JavaScript over JSP, we can include:

1) writing and debugging via browser is easier

2) no more compilation required

3) performance on par or even better with modular and refactored code

4) standard REST interface adoption

5) JsUnit is available for unit-testing so any existing language features are not lost with JavaScript.
Login screen enhancement: https://1drv.ms/w/s!Ashlm-Nw-wnWtWiupMGBf_WbEJxS

Monday, March 26, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for embedding images and objects in a shared space, shape signatures and image retrieval.
We were discussing the Euclidean distance and the distance matrix. With Euclidean distance, a chi-square style measure, which is a sum of squares of errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to an embedding space. By using a notion of errors, we can make sure that the shapes and images embedded in the space do not violate the intra-member distances. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods while embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.
The Sammon error is a weighted sum of squared differences between the original pairwise distances and the embedding pairwise distances. Each term is divided by the original distance, so pairs of dissimilar shapes, which are farther apart, are weighted down.
The embedding of shapes and images proceeds by minimizing the Sammon error using non-linear multi-dimensional scaling (MDS). MDS is a means of visualizing the level of similarity of individual cases in a dataset by placing the items in an N-dimensional space so that the intra-member distances are preserved as well as possible.
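To make the weighting concrete, here is a minimal sketch of the standard Sammon stress: each squared difference is divided by the original distance and the sum is normalized by the total of the original distances. The function names are hypothetical, and the sketch assumes all original distances between distinct items are positive.

```python
import math
from itertools import combinations
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sammon_error(original: List[List[float]],
                 embedded: List[Sequence[float]]) -> float:
    """Sammon stress between original distances and embedded points.

    original[i][j] is the distance in the high-dimensional space;
    embedded[i] is the low-dimensional point for item i.
    """
    total = 0.0
    stress = 0.0
    for i, j in combinations(range(len(embedded)), 2):
        d_orig = original[i][j]
        d_emb = euclidean(embedded[i], embedded[j])
        total += d_orig
        # Dividing by d_orig weights down pairs that are far apart.
        stress += (d_orig - d_emb) ** 2 / d_orig
    return stress / total


if __name__ == "__main__":
    distances = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
    print(sammon_error(distances, [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]))
```

An embedding that reproduces every original distance gives an error of zero; any distortion gives a positive value.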
We saw how the embedding space is created. Mapping new shapes is slightly more effort. The space was originally constructed with a set of 3D shapes. They were jointly embedded. Introducing a new shape requires us to find an embedding point. The steps for this include:
First, a feature vector is computed.
Second, pairwise distances are computed.
Third, we minimize the Sammon error, this time applying the Liu-Nocedal method, a large-scale optimization method that combines BFGS steps and conjugate direction steps. BFGS is an iterative method for solving unconstrained non-linear optimization problems.
The 3D shapes in the embedding space have abundant information to train the CNN and also to perform data generation. The shapes are represented as clean and complete meshes, which allows control and flexibility. Many images can be generated from the shapes using a rendering process. This is called image setting. In the embedding space, a shape is mapped to a point. For each image, its association with a shape is automatically known. The collection of images and shapes forms the training data for the CNN.
CNN models can approximate high-dimensional and non-linear functions. Recall that the feature vector has a large number of attributes and that the Sammon error minimization objective is a non-linear function. A CNN can learn millions of parameters, so it can be precise and informative once it is trained on a large amount of data. If the data is insufficient, the CNN cannot learn enough latent information and the results suffer from overfitting. When the images are generated with rich variation in lighting and viewpoint and superimposed on random backgrounds, the CNN has sufficient data. Approximately 1 million images are synthesized per category.
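The three steps above can be sketched for a single new point as follows. This is not the Liu-Nocedal method: to stay within the standard library, the sketch swaps in plain gradient descent with a fixed step size, and it drops the constant normalization factor of the Sammon error since that does not move the minimum. The function name, step count and learning rate are assumptions.

```python
import math
from typing import List, Sequence


def place_new_point(anchors: List[Sequence[float]],
                    targets: Sequence[float],
                    steps: int = 500,
                    rate: float = 0.1) -> List[float]:
    """Find a point whose distances to the fixed anchors match the targets.

    anchors: already-embedded points, which are kept fixed.
    targets: original-space distances from the new shape to each anchor.
    Minimizes sum((t - d)^2 / t) by gradient descent (not Liu-Nocedal).
    """
    dim = len(anchors[0])
    # Start from the centroid of the existing embedded points.
    x = [sum(p[k] for p in anchors) / len(anchors) for k in range(dim)]
    for _ in range(steps):
        grad = [0.0] * dim
        for p, t in zip(anchors, targets):
            dist = math.sqrt(sum((x[k] - p[k]) ** 2 for k in range(dim)))
            if dist == 0.0:
                continue  # direction undefined exactly on an anchor; skip term
            scale = 2.0 * (dist - t) / (t * dist)
            for k in range(dim):
                grad[k] += scale * (x[k] - p[k])
        x = [x[k] - rate * grad[k] for k in range(dim)]
    return x


if __name__ == "__main__":
    anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
    # Distances from a hypothetical new shape that belongs at (1, 1).
    targets = [math.sqrt(2.0), math.sqrt(10.0), math.sqrt(10.0)]
    print(place_new_point(anchors, targets))
```

A quasi-Newton optimizer would normally converge in far fewer iterations; the fixed-step loop is only meant to show the objective being minimized while the existing points stay in place.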

Sunday, March 25, 2018

Saturday, March 24, 2018

Friday, March 23, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for embedding images and objects in a shared space, shape signatures and image retrieval.
A CNN has the ability to separate an image into various layers of abstraction while capturing different features and elements. This lets a CNN be used for different learning tasks, which may differ in the focus they require. It is this adaptive ability of the CNN that is leveraged for joint embedding. The CNN is first trained to map an image depicting an object similar to a shape to a corresponding point in the embedding space, such that the point for the image lies close to the point for the shape. During this training, the CNN discovers the latent connection that exists between an image and the object it features. Then, when a test image is presented, the latent connection helps to place that image in the embedding space close to the object it features.
Moreover, a CNN can generalize from different tasks. This makes it useful to repurpose a well-trained network. Since it learns from a high-dimensional space, a CNN can differentiate even similar images for a variety of tasks.
We were discussing the Euclidean distance and the distance matrix. With Euclidean distance, a chi-square style measure, which is a sum of squares of errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to an embedding space. By using a notion of errors, we can make sure that the shapes and images embedded in the space do not violate the intra-member distances. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods while embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.
The Sammon error is a weighted sum of squared differences between the original pairwise distances and the embedding pairwise distances. Each term is divided by the original distance, so pairs of dissimilar shapes, which are farther apart, are weighted down.
The embedding of shapes and images proceeds by minimizing the Sammon error using non-linear multi-dimensional scaling (MDS). MDS is a means of visualizing the level of similarity of individual cases in a dataset by placing the items in an N-dimensional space so that the intra-member distances are preserved as well as possible. The goal of MDS is to get the co-ordinate matrix. It uses the notion that the co-ordinate matrix can easily be formed from the eigenvalue decomposition of the scalar product matrix B. The steps of a classical MDS algorithm are as follows:
1. Form the pairwise distance matrix D that gives the proximity between pairs
2. Double-center the squared proximities to get matrix B: B = -1/2 J D^2 J, where D^2 holds the squared distances and J = I - (1/n) 1 1^T is the centering matrix, a modified identity matrix
3. Extract the m largest positive eigenvalues and the corresponding m eigenvectors
4. Form the co-ordinate matrix from the eigenvectors and the square root of the diagonal matrix of the eigenvalues
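The four steps above can be sketched with the standard library alone. In place of a full eigenvalue decomposition, this sketch uses power iteration with deflation to pull out the leading eigenpairs, which is adequate when B is positive semi-definite (as it is for true Euclidean distances). All function names and the iteration count are assumptions.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def double_center(dist: Matrix) -> Matrix:
    """Step 2: B = -1/2 * J * D^2 * J with J = I - (1/n) * ones."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]  # row means equal column means (symmetric)
    grand = sum(row) / n
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
            for i in range(n)]


def top_eigenpair(b: Matrix, iterations: int = 500) -> Tuple[float, List[float]]:
    """Dominant eigenpair of a symmetric matrix by power iteration."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # fixed, non-constant start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    value = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return value, v


def classical_mds(dist: Matrix, m: int) -> Matrix:
    """Steps 1-4: return an n x m co-ordinate matrix for the distance matrix."""
    n = len(dist)
    b = double_center(dist)
    coords = [[0.0] * m for _ in range(n)]
    for k in range(m):
        value, vec = top_eigenpair(b)  # step 3
        if value <= 1e-12:
            break  # no positive eigenvalues left
        for i in range(n):
            coords[i][k] = math.sqrt(value) * vec[i]  # step 4
        # Deflate so the next pass finds the next-largest eigenvalue.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


if __name__ == "__main__":
    distances = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
    for point in classical_mds(distances, 2):
        print(point)
```

The recovered co-ordinates reproduce the input distances up to rotation and reflection, which is all that MDS promises.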

#proposal for login screens ui enhancements: https://1drv.ms/w/s!Ashlm-Nw-wnWtWrw5g03hYU5CBKL