Monday, January 22, 2018

Identity – A score for who you are, what you are and where you are

Introduction:
Identity management is a necessity for every online retail business, but it brings chores: providing various sign-in options so that users may be authenticated and authorized, complying with standards, and offering the utmost convenience, all of which can distract from the core line of business. Federated identity management stepped in to consolidate these activities: you could now sign in to different retail domains and subsidiaries with a single account. Moreover, protocols were developed so that identity could be deferred to dedicated providers. Interestingly, in recent years social networks have increasingly become identity providers in their own right. This write-up introduces the notion of a score for an identity, an attribute that may be passed along with the identity to subscribing identity consumers. As more and more businesses participate, the score becomes more meaningful metadata about the customer.
Description:
Using scores to represent consumers probably began more than half a century ago, when Fair, Isaac and Co used statistical analysis to translate financial history into a simple score. We may have come a long way in how credit scores are measured, but the data belonged to the credit bureaus. Credit card companies became authorities on how consumers spend their money, and their customers started carrying cards instead of cash. With the rise of mobile phones, mobile payment methods gained popularity, and online retail companies want a share of that spend. The only way they can authenticate and authorize a user is with identity management. They therefore shared the umbrella of identity management while maintaining their own siloed data, whether they were in the travel, transportation, or insurance industry. They could tell what the user did on the last vacation, the ride he took when he was there, or the claim he made when he was in trouble, but nothing requires them to share this data with an identity provider. Social networks and mobile applications became smart enough to know the tastes a user may have or acquire, and they can personalize ads with recommendations, but there is no federation of trends and history pertaining to a user across these purchases.
On the other hand, the classic problem of identity and access management has been to connect trusted users to sensitive resources irrespective of where the users come from and where the resources are hosted. The term classic here indicates what does not change; in contrast, the business models for making these connections have changed. Tokens were invented to represent user access to a client's resources so that the identity provider does not have to know where the resources are. Moreover, tokens were issued not only to users but also to devices and applications on behalf of the user, so that they may have access to different scopes for a limited time. Other access models that treat tokens as a form of identity are mentioned here. When domains and web sites integrate with the same identity provider, the data pertaining to a customer only increases with each addition. An identity provider merely has to accumulate scores from all these retailers to derive a more general score associated with the user. This way, existing retail companies maintain their own data while the identity provider keeps a score for the user.
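As a hedged sketch of that accumulation (the class, method names, and weighting here are invented for illustration, not a prescribed API), an identity provider might fold per-retailer scores into one attribute on the account:
using System.Collections.Generic;
using System.Linq;

// Hypothetical aggregate: a weighted average of the scores reported by
// each participating retailer, kept by the identity provider as metadata.
class IdentityScore
{
    private readonly Dictionary<string, (double Score, double Weight)> byRetailer =
        new Dictionary<string, (double Score, double Weight)>();

    public void Report(string retailer, double score, double weight = 1.0)
    {
        byRetailer[retailer] = (score, weight); // latest report wins per retailer
    }

    public double Aggregate()
    {
        if (byRetailer.Count == 0) return 0.0;
        double total = byRetailer.Values.Sum(v => v.Score * v.Weight);
        return total / byRetailer.Values.Sum(v => v.Weight);
    }
}
The aggregate can then travel with the identity, much like any other claim, to subscribing identity consumers.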
Conclusion:
An identity and access management solution can look forward to more integrated collaboration with participating clients in order to increase the pie of meaningful information associated with an account holder.

Sunday, January 21, 2018

Today we continue our discussion on the AWS whitepapers on software architecture, which suggest five pillars:
- Operational Excellence, for running and monitoring business-critical systems.
- Security, to protect information, systems, and assets with risk assessments and mitigation strategies.
- Reliability, to recover from infrastructure or service disruptions.
- Performance Efficiency, to use computing resources efficiently.
- Cost Optimization, to help eliminate unneeded cost and keep the system trimmed and lean.
The guidelines to achieve the above pillars include:
1. Infrastructure capacity should be estimated, not guessed
2. Systems should be tested on production scale to eliminate surprises
3. Architectural experimentation should be made easier with automation
4. There should be flexibility to evolve architectures
5. Changes to the architecture should be driven by data
6. Plan for peak days and test at these loads to observe areas of improvement
We looked at the security pillar and reviewed its best practices.
They include identity and access management, monitoring controls, infrastructure protection, data protection, and incident response.
Identity and access management ensures that only authenticated and authorized users access resources. In AWS, the dedicated IAM service supports multi-factor authentication.
Monitoring controls are used to identify a potential security incident. In AWS, CloudTrail logs the AWS API calls, and CloudWatch provides monitoring of metrics with alarms.
Infrastructure protection includes defense-in-depth control methodologies. In AWS, this is enforced in the Elastic Compute Cloud, the Container Service, and Beanstalk with Amazon Machine Images.
Data protection involves securing data, encrypting it, and putting access controls in place. In AWS, Amazon S3 provides exceptional resiliency.
Incident response means putting controls and prevention in place to mitigate security incidents. In AWS, logging and events provide this service, and AWS CloudFormation can be used to recreate an environment for study in a sandbox.
IAM is the AWS service that is essential to security and enables this pillar of software architecture.

#codingexercise
// Binary chop over a sorted list of squares to find the element closest to number.
// Assumes sortedSquares is non-empty and sorted ascending.
int GetClosest(List<int> sortedSquares, int number)
{
    int start = 0;
    int end = sortedSquares.Count - 1;
    int closest = sortedSquares[start];
    while (start < end)
    {
        // The closer of the two endpoints is the candidate answer so far.
        closest = Math.Abs(sortedSquares[start] - number) < Math.Abs(sortedSquares[end] - number)
            ? sortedSquares[start]
            : sortedSquares[end];
        int mid = (start + end) / 2;
        if (mid == start) return closest; // start and end are adjacent; done.
        if (sortedSquares[mid] == number) return number; // exact match
        if (sortedSquares[mid] < number)
            start = mid;
        else
            end = mid;
    }
    return closest;
}
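A quick usage sketch, assuming System.Collections.Generic is imported and the list holds precomputed squares (the values here are illustrative):
var squares = new List<int> { 1, 4, 9, 16, 25, 36, 49 };
int nearest = GetClosest(squares, 20); // 16, since |16 - 20| < |25 - 20|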

Saturday, January 20, 2018

Today we resume our discussion on the AWS whitepapers on software architecture, which suggest five pillars:
- Operational Excellence, for running and monitoring business-critical systems.
- Security, to protect information, systems, and assets with risk assessments and mitigation strategies.
- Reliability, to recover from infrastructure or service disruptions.
- Performance Efficiency, to use computing resources efficiently.
- Cost Optimization, to help eliminate unneeded cost and keep the system trimmed and lean.
The guidelines to achieve the above pillars include:
1. Infrastructure capacity should be estimated, not guessed
2. Systems should be tested on production scale to eliminate surprises
3. Architectural experimentation should be made easier with automation
4. There should be flexibility to evolve architectures
5. Changes to the architecture should be driven by data
6. Plan for peak days and test at these loads to observe areas of improvement
We looked at the security pillar; now we review its best practices.
They include identity and access management, monitoring controls, infrastructure protection, data protection, and incident response.
Identity and access management ensures that only authenticated and authorized users access resources.
Monitoring controls are used to identify a potential security incident.
Infrastructure protection includes defense-in-depth control methodologies.
Data protection involves securing data, encrypting it, and putting access controls in place.
Incident response means putting controls and prevention in place to mitigate security incidents.
IAM is the AWS service that is essential to security and enables this pillar of software architecture.
#codingexercise
We were discussing how to check if a number is Fibonacci: a positive integer n is Fibonacci if and only if 5*n*n + 4 or 5*n*n - 4 is a perfect square.
bool IsFibonacci(uint n)
{
    // 5UL forces the arithmetic into ulong, avoiding uint overflow for large n.
    return IsSquare(5UL * n * n + 4) || IsSquare(5UL * n * n - 4);
}
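The IsSquare helper is assumed above; a minimal sketch using Math.Sqrt (floating-point precision makes the neighbor check worthwhile for large inputs):
bool IsSquare(ulong x)
{
    ulong root = (ulong)Math.Sqrt(x);
    // Math.Sqrt is approximate, so test the truncated root and its neighbor.
    return root * root == x || (root + 1) * (root + 1) == x;
}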
Another way to test for Fibonacci is to binary chop Fibonacci numbers until we get close to the given number.
We discussed this binary chop method here : http://ravinote.blogspot.com/2017/11/we-resume-our-discussion-about.html



Friday, January 19, 2018

Reducing the cyclomatic complexity in software
Introduction: Mature software applications often end up as spaghetti code, a term for tangled control flow and nested conditionals. This not only makes the software hard to read but also invites unexpected behavior. Code written in programming languages like C# and Java can be checked with tools such as NDepend and CheckStyle, respectively, to measure this complexity. The following are some suggestions to mitigate it.
Description:
1: Refactor into smaller methods or abstractions, especially those that have grown cumbersome.
2: Increase the use of boolean variables to store intermediary state or the results of evaluating multiple conditions; these booleans can then be reused in subsequent statements (see the sketch after this list).
Test code representing a test matrix usually has very little cyclomatic complexity because each test case can be represented by its own if condition with multiple conditionals. While such repetitive if conditions are avoided in development code in favor of set-once, check-once conditions, the latter contribute to cyclomatic complexity. Unpacking the conditions into repetitive but separate lines avoids unnecessary branching and missed test cases.
3: Use inheritance, encapsulation, or design patterns such as Factory or Strategy so that the logic can be reorganized, not just refactored. For example, multiple throw and catch statements increase cyclomatic complexity, but if the catch statements are all in one place and have smaller blocks of code, they help the same way switch statements do. In the C language, the use of preprocessors was prevalent because it made such repetitions easier to write. A catch statement, for example, may have to add a log entry, update counters, and do other chores that bloat the code; this may be tolerable, but putting logic into the catch handler increases the complexity.
4: Designate private methods for validations and separate methods for business operations, and treat dependency calls as merely a method and its parameters.
5: Do not overdo the practice of reducing this complexity. Many organizations take pride in code that reads like a textbook; this emphasizes that code is of the people, by the people, and for the people.
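As a sketch of tips 1 and 2 (the rule and the names here are made up for illustration), guard clauses and named boolean intermediaries flatten nested conditionals without changing behavior:
// Before: three levels of nesting, each adding a branch to the complexity count.
string Classify(int age, bool hasConsent)
{
    if (age >= 0)
    {
        if (age < 18)
        {
            if (hasConsent) return "minor-with-consent";
            return "minor";
        }
        return "adult";
    }
    return "invalid";
}

// After: a guard clause plus a named boolean; each path reads as a straight line.
string ClassifyFlat(int age, bool hasConsent)
{
    if (age < 0) return "invalid";
    bool isMinor = age < 18;
    if (!isMinor) return "adult";
    return hasConsent ? "minor-with-consent" : "minor";
}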
Conclusion: Cyclomatic complexity can be a desirable metric in an organization's static code analysis practices and is worth addressing at check-in time.




Thursday, January 18, 2018

Today we resume our discussion on the AWS whitepapers on software architecture, which suggest five pillars:
- Operational Excellence, for running and monitoring business-critical systems.
- Security, to protect information, systems, and assets with risk assessments and mitigation strategies.
- Reliability, to recover from infrastructure or service disruptions.
- Performance Efficiency, to use computing resources efficiently.
- Cost Optimization, to help eliminate unneeded cost and keep the system trimmed and lean.
The guidelines to achieve the above pillars include:
1. Infrastructure capacity should be estimated, not guessed
2. Systems should be tested on production scale to eliminate surprises
3. Architectural experimentation should be made easier with automation
4. There should be flexibility to evolve architectures
5. Changes to the architecture should be driven by data
6. Plan for peak days and test at these loads to observe areas of improvement
We look at the security pillar today:
The security pillar emphasizes protection of information, systems, and assets. There are six design principles for security in the cloud:
It implements a strong identity foundation, where privileges are granted on an as-needed basis and duties are separated. It centralizes privilege management and reduces the use of long-term credentials.
It monitors, alerts, and audits actions so that teams can respond and take action.
It applies security at all layers, not just at the edge, so that the full impact radius is covered.
It automates the mechanisms necessary for controls and restrictions.
It protects data in transit and at rest with access tokens and encryption.
It prepares for security events so that the response is ready even when many are affected.

#codingexercise
Check if a number is Fibonacci: a positive integer n is Fibonacci if and only if 5*n*n + 4 or 5*n*n - 4 is a perfect square.
bool IsFibonacci(uint n)
{
    // 5UL forces the arithmetic into ulong, avoiding uint overflow for large n.
    return IsSquare(5UL * n * n + 4) || IsSquare(5UL * n * n - 4);
}
A way to test for squares is to binary chop the squares until we find the one closest to either or both of the required values.
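A sketch of that binary chop over candidate roots (this IsSquare is an assumed helper, the counterpart of the one used by IsFibonacci above):
bool IsSquare(ulong x)
{
    ulong lo = 0, hi = x;
    while (lo <= hi)
    {
        ulong mid = lo + (hi - lo) / 2;
        // Assumes x is small enough that mid * mid does not overflow.
        ulong square = mid * mid;
        if (square == x) return true;
        if (square < x)
            lo = mid + 1;
        else if (mid == 0)
            break;
        else
            hi = mid - 1;
    }
    return false;
}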

Wednesday, January 17, 2018

File descriptors on steroids continued
Design:

While the file descriptor was inherently local to a process, distributed file systems allowed it to point to a file on a remote computer. Likewise, file system protocols such as CIFS allowed remote servers to be connected through Active Directory, with access granted to the users registered there. Deduplication worked on identical segments so that space could be conserved. The rsync protocol helped replicate between a source and a destination regardless of whether the destination was a file system or an S3 endpoint. In all these tasks, much of the work is asynchronous and involves a source and a destination. This library utilizes the ZeroMQ messaging library for file system operations.
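A minimal request-reply sketch of the shape such a library might take, using the NetMQ binding for C# (the endpoint and the "path|data" message format are assumptions, not part of any established protocol; the two halves would run in separate processes):
using NetMQ;
using NetMQ.Sockets;

// Server: accepts one file-append request and acknowledges it.
using (var server = new ResponseSocket())
{
    server.Bind("tcp://*:5555");                  // hypothetical endpoint
    string request = server.ReceiveFrameString(); // e.g. "/tmp/file1|hello"
    // ... parse the request and append the data to the target file ...
    server.SendFrame("OK");
}

// Client: sends one write request and waits for the acknowledgment.
using (var client = new RequestSocket())
{
    client.Connect("tcp://localhost:5555");
    client.SendFrame("/tmp/file1|hello");
    string ack = client.ReceiveFrameString();
}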
Performance:
ZeroMQ has demonstrated performance for communications. Stacking file operations over this library, from a storage-solution perspective, can also meet the stringent requirements of cloud-level operations. The implementation might vary in purpose, scope, scale, and management as we add plugins for a client, but the assumption that asynchronous operations on a remote file will not be hampered by ZeroMQ remains sound.

Security:
Enhanced file descriptors are inherently as secure as sockets. Moreover, the file system utilities that secure files continue to work, because these descriptors behave the same as regular ones to the layers above.

Testing:
The implementation of this storage framework must be able to process a hundred thousand requests per second, with a mix of message sizes from 0 to 64 KB, for a duration of one hour, with little or no degradation in write latency across writes to a million files. Checksums may be used to verify that the files are correct. Testing might require supportability features in addition to random file writes. The statistics, audit log, history, and other management aspects of the queue should be available for pull via web APIs.

Conclusion:
With smarter operating system primitives, we can enhance each process to give more power to individual businesses.
#codingexercise
Get the Fibonacci number at a given position.
We compared the following tail-recursive implementation:
uint GetTailRecursiveFibonacci(uint n, uint a = 0, uint b = 1)
{
    if (n == 0)
        return a;
    if (n == 1)
        return b;
    return GetTailRecursiveFibonacci(n-1, b, a+b);
}
with the conventional:
uint GetFibonacci (uint n)
{
    if (n == 0)
        return 0;
    if (n == 1)
        return 1;
    return GetFibonacci(n-1) + GetFibonacci(n-2);
}
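A quick driver (hypothetical, assuming both methods are in scope and System is imported) confirms the two agree on the first ten values:
for (uint i = 0; i < 10; i++)
{
    if (GetFibonacci(i) != GetTailRecursiveFibonacci(i))
        throw new Exception("mismatch at " + i);
    Console.Write(GetFibonacci(i) + " ");
}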
Output: 0 1 1 2 3 5 8 13 21 34