Cluster computing

Friday, April 20, 2018

Challenges cited from health information collaboration in Information Technology projects
The design of IT systems for collaboration in data sharing is itself quite a challenge given that publishers and subscribers do not adhere to any standard. In the health information industry, such designs are considered even more difficult because they also have to satisfy the demands of sponsors and users.
In the law enforcement industry, legal requests and responses may even have a uniform standard across participating companies and law enforcement agencies. The goal there is timeliness and availability of data. Health information sharing standards have a slightly different emphasis. There is plenty of data in the collections, but privacy and boundary rules have many manifestations. Although health records may not be subject to the same rules as personally identifiable information (PII), they have their own conformance requirements, which are interpreted and implemented in different ways.
The techniques for information sharing include:
1) Gateways between two heterogeneous systems, such as across companies
2) Message buses between enterprise-wide organizations with different areas of emphasis
3) Web services within an organization for sharing data that is not exposed directly from data stores
4) Exposed data sources such as databases and their connectors

The techniques are important only for what can be applied in a given context. However, the time-honored tradition has been to store the data in a database, or archive it in a data warehouse, and connect those data stores via web services.
The pull and push between publishers and subscribers do not necessarily come with a protocol. Participating systems can choose whatever works for them.
In the absence of any protocol or enforcement, information sharing depends largely on the level of collaboration between participants. Such collaboration is particularly relevant in coalescing health information across boundaries.
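
The push side of a message bus between publishers and subscribers could be sketched as below. This is only a minimal in-process illustration: the `MessageBus` class, the `lab-results` topic and the sample message are all hypothetical names invented here, and a real enterprise bus would add persistence, delivery guarantees and access control.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class MessageBus:
    """Hypothetical in-process publish/subscribe bus keyed by topic name."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        # Subscribers register a callback; no message format is enforced.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Push the message to every handler on the topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = MessageBus()
received: List[Any] = []
bus.subscribe("lab-results", received.append)  # hypothetical topic name
delivered = bus.publish("lab-results", {"patient": "p-001", "test": "A1C"})
```

Nothing in the bus constrains what a message looks like, which mirrors the point above: without an agreed protocol, publishers and subscribers have to coordinate the payload between themselves.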

#codingexercise
https://ideone.com/XHJ2II

Thursday, April 19, 2018

Standard Query Operators in all programming languages
We were discussing standard query operations.
There may be some new queries in practice today that were not as prevalent earlier. These queries may originate from machine learning, where they augment the current data mining practice of grouping, ranking, searching and sorting. Such queries call for some new algorithms, which are expressed in the form of code. Such logic, together with the standard query operators, is augmenting the query language to form a newer language.
A programming language may be missing the standard query operators while another language has taken the lead with expanded libraries that cover the newer querying techniques. Some plugins may provide these operators, but such plugins often narrow the common denominator of environments in which the original code could run. We have two ways to overcome this.
First, write the code in the existing language, which has the broader reach, with the same kind of primitives in a shared module so that they can be replaced with whatever comes next in the language revisions.
Second, adopt the plugins and upgrade your language and toolset to take advantage of better expressions, testability and brevity in the logic artifacts that we keep in version control.
Both these techniques require extra work over the development that we currently do for the sake of meeting business goals. The use of standard query operators might therefore incur a cost, but the benefits far outweigh it since we are aligning ourselves with the general direction in which queries are being expressed.
It may be interesting to note that queries get a lot more variety from systems other than online transactional processing. Analytical processing, Business Intelligence stacks and reporting systems come with far more complex queries. Data warehouses provide immense datasets to query. Data mining and other techniques are consistently expanding the definition of queries. In addition, the notions of batch processing and stream processing have been introduced. Finally, clustering algorithms, decision trees, SVMs and neural-net based machine learning techniques are adding query logic that was not anticipated in the SQL query language. Universal query language is now trying to broaden the breadth of existing query languages and standardize it. With these emerging trends, the standard query operators do not appear to be anywhere close to being discontinued and instead are likely to be expanded where possible.

Wednesday, April 18, 2018

Standard Query Operators in all programming languages
Let us say we have a list of records. We want to perform some queries on the list to find information such as which records match given criteria. Other well-known queries include grouping, eliminating duplicates, aggregating, finding one record, counting records, etc. If we have two lists, we may want to join the lists, find the intersection or find the records that appear in one but not the other.
These are called standard query operations. They are standard because databases have traditionally described these operations in great detail. The operators in their query language are not only thorough but are also emulated up the layers of the software stack, where the database usually powers all the layers above it.
Many complex queries are often translated to these standard query operators to make them simpler to understand. Consequently these standard query operators become primitives and a convenience to standardize across several business areas where queries are used.
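
As an illustration, the operations listed above could look like this in Python using only built-in collections. The `orders` and `customers` records are hypothetical sample data invented for this sketch, and the SQL keywords in the comments are only the nearest analogues.

```python
from collections import Counter, defaultdict

# Hypothetical sample records, invented for illustration.
orders = [
    {"id": 1, "customer": "ann", "amount": 30},
    {"id": 2, "customer": "bob", "amount": 45},
    {"id": 3, "customer": "ann", "amount": 25},
]
customers = [
    {"name": "ann", "city": "Seattle"},
    {"name": "cid", "city": "Austin"},
]

# Filter (WHERE): records matching a criterion.
large = [o for o in orders if o["amount"] >= 30]

# Group and aggregate (GROUP BY with SUM).
totals = defaultdict(int)
for o in orders:
    totals[o["customer"]] += o["amount"]

# Eliminate duplicates (DISTINCT) while keeping first-seen order.
distinct_customers = list(dict.fromkeys(o["customer"] for o in orders))

# Find one record, or None when nothing matches.
first_bob = next((o for o in orders if o["customer"] == "bob"), None)

# Count records (COUNT) per customer.
counts = Counter(o["customer"] for o in orders)

# Join two lists on a key (INNER JOIN).
joined = [
    {**o, "city": c["city"]}
    for o in orders
    for c in customers
    if o["customer"] == c["name"]
]

# Intersection and difference (INTERSECT, EXCEPT) on the key sets.
order_names = {o["customer"] for o in orders}
customer_names = {c["name"] for c in customers}
in_both = order_names & customer_names
only_in_orders = order_names - customer_names
```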
There may be some new queries in practice today that were not as prevalent earlier. These queries may originate from machine learning, where they augment the current data mining practice of grouping, ranking, searching and sorting. Such queries call for some new algorithms, which are expressed in the form of code. Such logic, together with the query operators above, is augmenting the query language to form a newer language.
Yet not all programming languages come with libraries to facilitate even the standard query operators, while a certain language has taken the lead with expanded libraries that cover the newer querying techniques. Some plugins may provide these operators, but such plugins often narrow the common denominator of environments in which the original code could run.
There are two ways to mitigate such discordance:
First, write the code in the existing language, which has the broader reach, with the same kind of primitives in a shared module so that they can be replaced with whatever comes next in the language revisions.
Second, adopt the plugins and upgrade your language and toolset to take advantage of better expressions, testability and brevity in the logic artifacts that we keep in version control.
Both these techniques require extra work over the development that we currently do for the sake of meeting business goals. The use of standard query operators might therefore incur a cost, but the benefits far outweigh it since we are aligning ourselves with the general direction in which queries are being expressed.
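
The first option, keeping the primitives in one shared module so that they can be swapped out later, might be sketched as follows. The `Query` class and its method names are hypothetical, chosen here only to resemble common query operators. They are not taken from any particular library.

```python
from typing import Any, Callable, Dict, Iterable, List


class Query:
    """Hypothetical chainable wrapper; each operator returns a new Query."""

    def __init__(self, items: Iterable[Any]) -> None:
        self._items = items

    def where(self, predicate: Callable[[Any], bool]) -> "Query":
        return Query(x for x in self._items if predicate(x))

    def select(self, projection: Callable[[Any], Any]) -> "Query":
        return Query(projection(x) for x in self._items)

    def distinct(self) -> "Query":
        # dict.fromkeys keeps first-seen order; items must be hashable.
        return Query(dict.fromkeys(self._items))

    def group_by(self, key: Callable[[Any], Any]) -> Dict[Any, List[Any]]:
        groups: Dict[Any, List[Any]] = {}
        for x in self._items:
            groups.setdefault(key(x), []).append(x)
        return groups

    def to_list(self) -> List[Any]:
        return list(self._items)


# Callers depend only on the operator names, not on how they are implemented.
even_squares = (
    Query(range(10))
    .where(lambda n: n % 2 == 0)
    .select(lambda n: n * n)
    .to_list()
)
```

Because application code touches only `where`, `select` and the other operator names, the module body could later be replaced by a native language feature or a plugin without changing the callers.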
#codingexercise http://js.do/code/209859

Tuesday, April 17, 2018

We were discussing why interface description languages do not become as popular as their alternatives. We said that interfaces or contracts give participants the comfort to work independently, offload validation and determine functional and non-functional requirements, but the alternative of working with granular, well-documented stateless requests is a lot more appealing.
We also discussed the trade-offs between state-driven sessions and stateless APIs when discussing client-side application development. We noted that there is a wonderful ecosystem of browser-based client-side software development using standard jargon and tools.
Today we see that there are differences between client-side and application-side software development. For example, client-side JavaScript applications must mitigate security threats that are fundamentally different from those facing server-side code. Client scripts can run anywhere and on any device. Threats such as cross-site scripting, man-in-the-middle attacks, SQL injection and abuse of cross-origin resource sharing are all exploited from the client side.
Consistency in the way we describe our markup, stylesheets and scripts helps smooth out the organization of code on the client side. However, it helps only so much. There is nothing preventing the developer from moving logic forward from the server to the client side. When we did have fat clients, they were touted as sophisticated tools. With the help of the Model-View-Controller architecture, we seemed to move to different view-models for tabs on the same web page and remove the restriction between rendering and content for the page. UI developers suggested that view-models make a page rather limited and highly restrictive in workflows, requiring more navigation and reducing usability in general. The logic on the client side can also be further simplified without compromising rendering or usability. For example, we could make one call instead of several calls to the API and data source. The entire page may also be loaded directly as a response from the backend. While the API may define how the user interface is implemented, there is little need for the user interface to depend on any state-based interface. Therefore the argument for stateless requests holds even in front-end development.

#codingexercise : water retained in an elevation map : https://ideone.com/HBEn3F

Monday, April 16, 2018

Contracts are also static and binding for the producer and consumer. Changes to the contract are going to involve escalations and involvement.

Contracts, whether for describing services or for component interactions, are generally replaced by technologies where we use pre-determined and well-accepted verbs and stateless design.

The most important advantage of stateless design and syntax of calls is that each request is granularly authenticated, authorized and audited. By keeping a one-to-one relationship between a request and a response, and not having the client span multiple requests with a shared state, we avoid dangling resources on the server side and state management on clients. Clients must do more preparation for each request, but this becomes a convenience when all the requests share the same chores.
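
A minimal sketch of this per-request handling is shown below, assuming an HMAC-signed token as the credential. The `SECRET` key, the `PERMISSIONS` table, the `sign` and `handle` functions and the HTTP-style status codes are hypothetical choices made for illustration, not a description of any specific product.

```python
import hashlib
import hmac
from typing import Dict, List, Tuple

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only
PERMISSIONS = {"alice": {"read"}, "bob": {"read", "write"}}  # hypothetical
AUDIT_LOG: List[Tuple[str, str, int]] = []


def sign(user: str) -> str:
    """Issue the token a client must attach to every request."""
    return hmac.new(SECRET, user.encode(), hashlib.sha256).hexdigest()


def handle(request: Dict[str, str]) -> int:
    """Process one self-contained request and return an HTTP-style status."""
    user = request.get("user", "")
    action = request.get("action", "")
    token = request.get("token", "")
    # Authenticate: the request carries its own credential, no session lookup.
    if not hmac.compare_digest(sign(user).encode(), token.encode()):
        status = 401
    # Authorize: checked on every request, not once per session.
    elif action not in PERMISSIONS.get(user, set()):
        status = 403
    else:
        status = 200
    # Audit: one log entry per request, whatever the outcome.
    AUDIT_LOG.append((user, action, status))
    return status


status = handle({"user": "alice", "action": "read", "token": sign("alice")})
```

The server keeps nothing between calls, so there is no server-side resource to dangle if a client disappears. The cost is that the client repeats the same preparation (attaching the token) on every request.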

The other advantage of stateless design is that many of the requests now follow well-established protocols. These protocols come with important benefits. First, they are widely accepted in the industry. Second, they move away from company-specific proprietary contracts. Third, the wide acceptance fosters a community of developers, tools and ecosystems.

The popularity of a technology is determined not only by the simplicity of the product but also by the audience using the technology. When tools, development environments and products promote stateless design over stateful contracts, no matter how much we invest in an interface specification language, its adoption will depend very much on the audience, who will generally go with what is convenient.

Protocols can be stateful or stateless. There is no doubt that either may prove useful under different circumstances. Contracts still prove useful as a descriptor for a service in its three-part description - Address, Binding and Contract. But when each service behaves the same as every other cookie-cutter service, with a well-known binding and contract, nothing more than the address may be needed.

Leaving the communications aside, let us look at the proposed benefit of client-side state management. Certainly, it improves the performance of numerous intermediary requests if the setup and teardown are the steps taking the biggest performance hit. Also, the client has an opportunity to keep the reference to the server-side resource for as long as it wants and the underlying communication channel holds. Proponents argue that a stateful design may result in higher performance for trunk or bulk callers instead of retail individual callers, since this reduces the number of clients to a manageable few. While this may be true, it marks the stateful design as the exception rather than the norm.
#codingexercise: https://ideone.com/3ZY49O

Sunday, April 15, 2018

Why don’t interface description languages become popular?

Software is designed with contracts primarily because it is difficult to build it all at once. Interfaces form a scaffolding that lets components be designed independently. A language used to define interfaces is enhanced with many keywords to derive the maximum benefit from describing the contract. Software engineers love the validation that can be offloaded once the criteria are declared this way. Yet interfaces or contracts do not become popular. Technologies that used such contracts, such as Component Object Model (COM) and Windows Communication Foundation (WCF), were replaced with simpler, more granular and stateless mechanisms. This writeup tries to enumerate the reasons for this unpopularity.

Contracts are verbose. They take time to write. They are also brittle when business needs change and the contract requirements change with them. Moreover, they become difficult to read and orthogonal to the software engineer's efforts on the component implementation. On the other hand, tests are improved because the success and the failure of the components, as well as their functional and non-functional requirements, can now be determined. If the software does not work as expected, the cause might turn out to be a missing or incorrect specification in the contract.

Contracts are also static and binding for the producer and consumer. Because they are static, it is easy to ship them to the consumer for offline software development. At the same time, the consumer might need to articulate changes to the contract if the interface is not sufficient. This brings us to the second drawback: changes to the contract are going to involve escalations and involvement.

Contracts, whether for describing services or for component interactions, are generally replaced by technologies where we use pre-determined and well-accepted verbs so that the lingo is the same but the payloads differ. Here we can even browse and determine the sequence of operations based on the requests made to the server. This dynamic discoverability of the necessary interactions helps eliminate the need for a contract. Documentation also reduces the need to have explicit contracts and the chores needed to maintain them.
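
The idea of a fixed set of verbs with varying payloads and discoverable interactions could be sketched as follows. The `dispatch` function, the `/items` path, the `links` field and the in-memory `ITEMS` store are hypothetical, invented for this illustration, and the status codes simply follow common HTTP conventions.

```python
from typing import Any, Dict, Tuple

ITEMS: Dict[str, Any] = {}  # hypothetical in-memory resource store


def dispatch(verb: str, path: str, payload: Any = None) -> Tuple[int, Any]:
    """Route a request using only well-known verbs; return (status, body)."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        # The root advertises where to go next, so no contract file is needed.
        return (200, {"links": {"items": "/items"}}) if verb == "GET" else (405, None)
    if parts[0] != "items" or len(parts) > 2:
        return 404, None
    if len(parts) == 1:
        if verb == "GET":
            return 200, {"links": {key: f"/items/{key}" for key in ITEMS}}
        return 405, None
    key = parts[1]
    if verb == "PUT":
        ITEMS[key] = payload  # same verb for every resource, only the payload differs
        return 200, payload
    if key not in ITEMS:
        return 404, None
    if verb == "GET":
        return 200, ITEMS[key]
    if verb == "DELETE":
        del ITEMS[key]
        return 204, None
    return 405, None


root_status, root_body = dispatch("GET", "/")
```

A client that knows only the verbs can start at the root, follow the returned links and learn the available operations from the responses themselves.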

Conclusion: Contracts give participants the comfort to work independently, offload validation and determine functional and non-functional requirements, but the alternative of working with granular, well-documented stateless requests is a lot more appealing.

#sqlexercise
Consider a set of players, each of whom belongs to one and only one league. Each player may have several wins and losses as part of the games between leagues. A league may have any number of players.
Define a suitable schema and list all those players who have more wins than losses.
Write a table-valued function A for SELECT PlayerID, count(*) as wins from GAMES where result='win' GROUP BY PlayerID
Write a table-valued function B for SELECT PlayerID, count(*) as losses from GAMES where result='loss' GROUP BY PlayerID
Use the results A and B from above to compare the counts:
SELECT A.PlayerID, A.wins, COALESCE(B.losses, 0) as losses from A LEFT JOIN B on A.PlayerID = B.PlayerID where A.wins - COALESCE(B.losses, 0) > 0;
The LEFT JOIN keeps players who have wins but no recorded losses, whom an INNER JOIN would drop.
Alternatively, we could use the PIVOT operator with the sum aggregate in this case.
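
One possible schema and query for this exercise can be tried out with Python's built-in `sqlite3` module, as sketched below. The table and column names (`League`, `Player`, `Games`, `Result`) and the sample rows are assumptions made for this sketch. It uses conditional aggregation in a single query rather than two table-valued functions or PIVOT, since SQLite supports neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE League (LeagueID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Player (
    PlayerID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    LeagueID INTEGER NOT NULL REFERENCES League(LeagueID)  -- exactly one league
);
CREATE TABLE Games (
    GameID INTEGER PRIMARY KEY,
    PlayerID INTEGER NOT NULL REFERENCES Player(PlayerID),
    Result TEXT NOT NULL CHECK (Result IN ('win', 'loss'))
);
-- Hypothetical sample data.
INSERT INTO League VALUES (1, 'East'), (2, 'West');
INSERT INTO Player VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cid', 2);
INSERT INTO Games (PlayerID, Result) VALUES
    (1, 'win'), (1, 'win'), (1, 'loss'),
    (2, 'win'), (2, 'loss'), (2, 'loss'),
    (3, 'win');
""")

# Count wins and losses per player in one pass, then keep wins > losses.
rows = conn.execute("""
SELECT p.Name,
       SUM(CASE WHEN g.Result = 'win' THEN 1 ELSE 0 END) AS wins,
       SUM(CASE WHEN g.Result = 'loss' THEN 1 ELSE 0 END) AS losses
FROM Player p
JOIN Games g ON g.PlayerID = p.PlayerID
GROUP BY p.PlayerID, p.Name
HAVING SUM(CASE WHEN g.Result = 'win' THEN 1 ELSE 0 END)
     > SUM(CASE WHEN g.Result = 'loss' THEN 1 ELSE 0 END)
ORDER BY p.Name
""").fetchall()
```

With the sample data, Cid has one win and no losses and still appears in the result, which is the case a plain inner join between separate win and loss counts would miss.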

Saturday, April 14, 2018

Today I'm taking a break from my previous post below on Java Interfaces to discuss a small coding question I came across and post it here for thoughts:
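/*
Implement a class with following methods
put( key, value );
delete( key );
getRandom(); // returns one of the values added randomly
Additional Requirements : No duplicates:
- update existing
*/
using System;
using System.Collections.Generic;

public class MyNoDuplicateContainerWithGetRandom
{
    // for map based access
    private readonly Dictionary<object, object> dict;
    // array of keys, so that a random index can pick an entry
    private readonly List<object> keys;
    private readonly Random random = new Random();
    // TransactionScope does not roll back in-memory collections,
    // so a lock keeps the dictionary and the key list in step instead.
    private readonly object sync = new object();

    public MyNoDuplicateContainerWithGetRandom()
    {
        dict = new Dictionary<object, object>();
        keys = new List<object>();
    }

    public void put(object key, object value)
    {
        lock (sync)
        {
            if (dict.ContainsKey(key))
            {
                dict[key] = value; // no duplicates: update existing
            }
            else
            {
                dict.Add(key, value);
                keys.Add(key);
            }
        }
    }

    public object getRandom()
    {
        lock (sync)
        {
            if (keys.Count == 0)
                throw new InvalidOperationException("The container is empty.");
            int index = random.Next(keys.Count);
            return dict[keys[index]];
        }
    }

    // Returns the removed value, or null when the key is not present.
    public object delete(object key)
    {
        lock (sync)
        {
            if (dict.TryGetValue(key, out object value))
            {
                dict.Remove(key);
                keys.Remove(key); // linear scan of the key list
                return value;
            }
            return null;
        }
    }
}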
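
One variation worth considering, sketched here in Python, avoids scanning the key list on delete by tracking each key's position and swapping the last key into the vacated slot. The `RandomAccessMap` class and its method names are hypothetical, and this is a single-threaded sketch of the idea rather than a drop-in replacement for the class above.

```python
import random
from typing import Any, Dict, Hashable, List, Optional


class RandomAccessMap:
    """Hypothetical map with constant-time put, delete and random lookup on average."""

    def __init__(self) -> None:
        self._values: Dict[Hashable, Any] = {}
        self._keys: List[Hashable] = []       # dense list for random indexing
        self._index: Dict[Hashable, int] = {}  # key -> position in _keys

    def __len__(self) -> int:
        return len(self._keys)

    def put(self, key: Hashable, value: Any) -> None:
        if key not in self._values:
            self._index[key] = len(self._keys)
            self._keys.append(key)
        self._values[key] = value  # no duplicates: an existing key is updated

    def delete(self, key: Hashable) -> Optional[Any]:
        if key not in self._values:
            return None
        pos = self._index.pop(key)
        last = self._keys.pop()
        if pos < len(self._keys):
            # The deleted key was not at the end: move the last key into its slot.
            self._keys[pos] = last
            self._index[last] = pos
        return self._values.pop(key)

    def get_random(self) -> Any:
        if not self._keys:
            raise KeyError("container is empty")
        return self._values[random.choice(self._keys)]


container = RandomAccessMap()
container.put("a", 1)
container.put("b", 2)
container.put("a", 3)  # updates the existing entry instead of adding a duplicate
```

The trade-off is one extra dictionary to maintain in exchange for removing the linear scan during delete.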

https://1drv.ms/w/s!Ashlm-Nw-wnWtheIEHpU4Jua0V79