Cluster computing

Thursday, September 7, 2017

Today we continue reviewing Bing Maps API

These APIs include map controls and services that can be used to make maps part of your application. They are an authoritative source for geospatial features and offer static and interactive maps, geocoding, and route and traffic data. Any spatial data, such as store locations, can be stored and queried with these APIs. Typically an account and a key are needed to use the APIs. The key serves as a license to use the services. Keys are classified by intended use: public website, private website, or enterprise assets.

The Bing Maps Dev Center provides account management functionality. An account is needed to create a key and to manage data sources.

The API options from Bing Maps platform can be listed as follows:

1) The V8 Web Control: one of the most universal mapping controls on the Bing Maps platform, with support for almost every type of browser and web application.

2) The Windows 10 Universal Windows Platform (UWP), which helps us build map apps for a variety of Windows devices.

3) Windows Presentation Foundation (WPF), which enables a rich user experience in desktop applications, including touch controls.

4) The REST-based services, which facilitate tasks such as geocoding, reverse geocoding, routing, and static imagery.

5) The spatial data services, which offer three key functionalities: batch geocoding, point-of-interest data, and the ability to store and expose spatial data. Imagine geographical data columns added to almost all points of interest on the world map.

6) The developer resources, such as documentation on APIs and SDKs.

#codingexercise

Rearrange the characters of a string so that no two adjacent characters are the same.

Solution: Any data structure that stores the distinct letters and their frequencies can help here. It helps further if the letters can be retrieved in decreasing order of frequency, so a priority queue (max-heap) is a natural fit.

Build a priority queue of letters keyed by their counts.

Take the most frequent letter, write it down, decrement its count, and hold it out of the queue until the next iteration completes.

After the next letter has been written, add the held-out letter and its remaining count back to the priority queue, then repeat. If at any point no letter can be written without violating the constraint, report the input as invalid.
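
The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name rearrange and the choice to return None for invalid input are assumptions, and Python's heapq is a min-heap, so counts are stored negated.

```python
import heapq
from collections import Counter

def rearrange(s):
    """Return a rearrangement of s with no equal adjacent characters, or None."""
    # Max-heap by frequency, simulated with negated counts.
    heap = [(-count, ch) for ch, count in Counter(s).items()]
    heapq.heapify(heap)
    out = []
    held = None  # letter written in the previous step, kept out for one round
    while heap:
        neg_count, ch = heapq.heappop(heap)
        out.append(ch)
        # The previously written letter may rejoin the queue now.
        if held is not None:
            heapq.heappush(heap, held)
        # Hold the current letter out if it still has occurrences left.
        held = (neg_count + 1, ch) if neg_count + 1 < 0 else None
    # A letter still held with nothing else to separate it means no valid answer.
    if held is not None:
        return None
    return "".join(out)
```

For example, rearrange("aab") yields "aba", while rearrange("aaab") yields None because three a's cannot be separated by a single b.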

Wednesday, September 6, 2017

Today we continue reviewing Bing Maps API

The Bing Maps Dev Center provides account management functionality. An account is needed to create a key and to manage data sources.

The Bing Maps services include REST services and spatial data services. The REST services provide geocoding, routing, elevation, imagery, and traffic incident information. The data services provide batch geocoding as well as spatial data source query and management.

The controls available from Bing Maps include support for WPF as well as native libraries for Android and perhaps iOS. The geocoding API helps geocode a large batch of addresses. The FindBy APIs help with querying.

The REST services automatically come with transaction accounting that helps determine billable and non-billable transactions.

The API options from Bing Maps platform can be listed as follows:

1) The V8 Web Control: one of the most universal mapping controls on the Bing Maps platform, with support for almost every type of browser and web application.

2) The Windows 10 Universal Windows Platform (UWP), which helps us build map apps for a variety of Windows devices.

3) Windows Presentation Foundation (WPF), which enables a rich user experience in desktop applications, including touch controls.

4) The REST-based services, which facilitate tasks such as geocoding, reverse geocoding, routing, and static imagery.

5) The developer resources, such as documentation on APIs and SDKs.

#codingexercise
Given a range [L, R], find the count of numbers that have a prime number of set bits in their binary representation.

Solution:

Initialize a small set of prime numbers against which the count of set bits can be checked. If the numbers fit in 32 bits, only the primes up to 31 are needed.

Iterate through the numbers from L to R inclusive.

For each number, count its set bits and check whether that count is one of the primes.
Since the numbers are contiguous, the count for the current number can be derived from the previous count instead of being recomputed: incrementing n clears its t trailing 1 bits and sets the next 0 bit, so the new count is the old count minus t plus 1.
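
A minimal Python sketch of this exercise, assuming non-negative integers below 2^32 so that the set-bit count never exceeds 32. The function name and the PRIMES set are illustrative. Rather than recounting bits for every number, it updates the previous count: incrementing n clears its t trailing 1 bits and sets one 0 bit, so the new count is the old count minus t plus 1.

```python
# Primes that a set-bit count can take for values below 2**32 (assumption).
PRIMES = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31}

def count_prime_set_bits(lo, hi):
    """Count n in [lo, hi] whose number of set bits is prime."""
    if lo > hi:
        return 0
    bits = bin(lo).count("1")  # full count only for the first number
    total = 0
    n = lo
    while True:
        if bits in PRIMES:
            total += 1
        if n == hi:
            break
        # Count trailing 1 bits of n; they all flip to 0 when we add 1.
        t = 0
        m = n
        while m & 1:
            t += 1
            m >>= 1
        bits = bits - t + 1
        n += 1
    return total
```

For the range [6, 10], the numbers 6, 7, 9, and 10 have 2, 3, 2, and 2 set bits respectively, while 8 has only 1, so the function returns 4.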

Tuesday, September 5, 2017

Today we start reviewing Bing Maps API
These APIs include map controls and services that can be used to make maps part of your application. They are an authoritative source for geospatial features and offer static and interactive maps, geocoding, and route and traffic data. Any spatial data, such as store locations, can be stored and queried with these APIs. Typically an account and a key are needed to use the APIs. The key serves as a license to use the services. Keys are classified by intended use: public website, private website, or enterprise assets.
The Bing Maps Dev Center provides account management functionality. An account is needed to create a key and to manage data sources.
The Bing Maps services include REST services and spatial data services. The REST services provide geocoding, routing, elevation, imagery, and traffic incident information. The data services provide batch geocoding as well as spatial data source query and management.
The controls available from Bing Maps include support for WPF as well as native libraries for Android and perhaps iOS. The geocoding API helps geocode a large batch of addresses. The FindBy APIs help with querying.
The REST services automatically come with transaction accounting that helps determine billable and non-billable transactions.
#codingexercise
For each element in an integer array, find the next element to its right that is smaller than twice that element.
int[] GetNextSmaller2XElements(List<int> A)
{
    // result[i] is the first A[j] (j > i) with A[j] < 2 * A[i], or -1 if there is none
    var result = new int[A.Count];
    for (int i = 0; i < A.Count; i++)
    {
        int next = -1;
        for (int j = i + 1; j < A.Count; j++)
        {
            if (A[j] < 2 * A[i])
            {
                next = A[j];
                break;
            }
        }
        result[i] = next;
    }
    return result;
}

We could also do this with a stack that holds all the elements seen so far that do not yet have a next 2X smaller element.

We push the first element onto the stack. For each subsequent item in the array, we compare it with the elements waiting on the stack: whenever the item is smaller than twice a waiting element, we print the tuple and remove that element; otherwise the element stays on the stack because no answer has been found for it yet. We then push the item itself so it can participate in matches going forward. This is still O(N^2) in the worst case, but instead of looking ahead through all the elements we look back at the collection of elements unmatched so far. In the worst case, this stack grows to the length of the array. The stack holds the unmatched elements from the portion of the array covered so far, in reverse order.
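
The look-back idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name next_smaller_2x is hypothetical, the waiting elements are kept as a list of indices rather than a strict stack, and -1 marks elements with no answer, as in the loop version above.

```python
def next_smaller_2x(values):
    """For each element, find the next element to its right that is < 2 * element."""
    result = [-1] * len(values)
    pending = []  # indices of elements still waiting for an answer
    for j, x in enumerate(values):
        still_waiting = []
        for i in pending:
            if x < 2 * values[i]:
                result[i] = x            # x answers the waiting element
            else:
                still_waiting.append(i)  # keep waiting
        still_waiting.append(j)          # x itself now waits for its answer
        pending = still_waiting
    return result
```

For the input [4, 10, 3, 7], the value 3 is the first later element below 8 and below 20, so the result is [3, 3, -1, -1].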

Monday, September 4, 2017

Implementing shared collaborative maps
Introduction: Maps such as Bing Maps or Google Maps help us visualize our driving routes and locations. When the same map is used to share the relative locations of two or more cars on the road, it becomes a shared collaborative map. This is like a multiplayer board game where participants update their locations against the same background in near real time while working their way to their destinations. If the destination is common to multiple players, they can each see the distances and travel times of the others.
Implementation: Such a shared collaborative map can be maintained with a shared database that each participant updates individually. The data needs to persist only for the lifetime of the map. The database stores only static information: the origin-destination tuple, the route information, and the URI used to query the location of each participant along their route. In a sense, it is no different from a shared collaborative online document such as Google Docs or a spreadsheet, except that the updates happen automatically from each participant. Collaborative chat and message board servers are good examples of implementations, typically (though not necessarily) TCP based, that show how collaboration is achieved among multiple participants. The key differences are that the updates here do not affect each other, and are therefore already lock-free, and that the shared map shows arrows or pins for the locations of the cars against the same map. The dynamic location updates come from querying the location REST API of each individual participant. Layers can be drawn over the map to render one or more participants against the same backdrop, in the same way that landmarks and traffic are annotated on the map. Moreover, if an initial URI can be set up for all the shared metadata and static data for all the participants, then it becomes possible to do away with the database and use a service-only implementation.
Conclusion: Users may not always find this feature as a menu option in their maps application, but the implementation is doable with the programmatic APIs.
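
As a rough illustration of the implementation idea, the Python sketch below keeps only static per-participant data and resolves live positions through a caller-supplied lookup. Everything here is hypothetical: the Participant and SharedMap names, the fields, and the fetch_location callback stand in for whatever store and location REST API an actual implementation would use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

@dataclass
class Participant:
    name: str
    origin: str
    destination: str
    location_uri: str  # hypothetical URI queried for the current position

@dataclass
class SharedMap:
    # Static data only; each participant writes just their own entry,
    # so updates never conflict with one another.
    participants: Dict[str, Participant] = field(default_factory=dict)

    def join(self, participant: Participant) -> None:
        self.participants[participant.name] = participant

    def render_layer(
        self, fetch_location: Callable[[str], Tuple[float, float]]
    ) -> Dict[str, Tuple[float, float]]:
        # Dynamic positions come from querying each participant's URI.
        return {
            name: fetch_location(p.location_uri)
            for name, p in self.participants.items()
        }
```

A caller would pass a function that performs the actual location query; the returned dictionary of names to coordinates is what a map layer would draw as pins.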

#codingexercise
For each element in an integer array, find the next element to its right that is greater than twice that element.
int[] GetNextLarger2XElements(List<int> A)
{
    // result[i] is the first A[j] (j > i) with A[j] > 2 * A[i], or -1 if there is none
    var result = new int[A.Count];
    for (int i = 0; i < A.Count; i++)
    {
        int next = -1;
        for (int j = i + 1; j < A.Count; j++)
        {
            if (A[j] > 2 * A[i])
            {
                next = A[j];
                break;
            }
        }
        result[i] = next;
    }
    return result;
}

We could also do this with the help of a stack which we keep for all the elements that do not have a next 2X larger element.

Sunday, September 3, 2017

#codingexercise
For each element in an integer array, find the next smaller element to its right.
int[] GetNextSmallerElements(List<int> A)
{
    // result[i] is the first A[j] (j > i) with A[j] < A[i], or -1 if there is none
    var result = new int[A.Count];
    for (int i = 0; i < A.Count; i++)
    {
        int next = -1;
        for (int j = i + 1; j < A.Count; j++)
        {
            if (A[j] < A[i])
            {
                next = A[j];
                break;
            }
        }
        result[i] = next;
    }
    return result;
}
We could also do this with a stack that holds all the elements that do not yet have a next smaller element.
We push the first element onto the stack. For each subsequent item in the array, while the item is smaller than the element on top of the stack, we print the tuple and pop that element; any element that remains stays on the stack because no answer has been found for it yet. We then push the item itself so it can participate in matches going forward. Instead of looking ahead through all the elements, we look back at the collection of elements unmatched so far. The unmatched elements are always in non-decreasing order from the bottom of the stack to the top, so we can stop at the first element that does not match. Since each element is pushed and popped at most once, this runs in O(N) time rather than the O(N^2) of the nested loops. In the worst case, the stack grows to the length of the array. The stack holds the unmatched elements from the portion of the array covered so far, in reverse order.
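
A minimal Python sketch of the stack approach, assuming the same convention as the loop version (-1 when no smaller element follows). The function name next_smaller is illustrative. The stack stores indices rather than values so that answers can be written into the right slots; each index is pushed once and popped at most once, so the loop does a linear amount of work overall.

```python
def next_smaller(values):
    """For each element, find the next smaller element to its right, or -1."""
    result = [-1] * len(values)
    stack = []  # indices of elements still waiting for a smaller element
    for j, x in enumerate(values):
        # x answers every waiting element on top of the stack that is larger.
        while stack and x < values[stack[-1]]:
            result[stack.pop()] = x
        stack.append(j)
    return result
```

For the input [13, 7, 6, 12], the result is [7, 6, -1, -1].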

Saturday, September 2, 2017

We continue reading "Modern data Fraud Prevention at Big Data Scale". Feedzai enables companies to move from broad, segment-based scoring of transactions to individual-oriented scoring with machine-learning-based techniques. Feedzai claims to use a new technology on a new platform and to have the highest fraud detection rates with the lowest false positives. Feedzai uses real-time behavioral profiling as well as historical profiling that has been proven to detect 61% more fraud. They claim true real-time processing and true machine learning capabilities. Feedzai relies on Big Data and therefore runs on commodity hardware. The historical data goes as far back as three years. In addition, Feedzai processes real-time data against vast amounts of data in 25 milliseconds at the 99th percentile. This enables fraud to be detected almost as soon as it is committed.
The machine learning algorithms used include Random Forests and Support Vector Machines. The former is helpful because it is an ensemble of decision trees, which makes it more robust across the different kinds of transactions subjected to fraud detection. The latter is helpful because it can form more sophisticated models.

#codingexercise
For each element in an integer array, find the next greater element to its right.
int[] GetNextGreaterElements(List<int> A)
{
    // result[i] is the first A[j] (j > i) with A[j] > A[i], or -1 if there is none
    var result = new int[A.Count];
    for (int i = 0; i < A.Count; i++)
    {
        int next = -1;
        for (int j = i + 1; j < A.Count; j++)
        {
            if (A[j] > A[i])
            {
                next = A[j];
                break;
            }
        }
        result[i] = next;
    }
    return result;
}
We could also do this with a stack that holds all the elements that do not yet have a next greater element.
We push the first element onto the stack. For each subsequent item in the array, while the item is greater than the element on top of the stack, we print the tuple and pop that element; any element that remains stays on the stack because no answer has been found for it yet. We then push the item itself so it can participate in matches going forward. Instead of looking ahead through all the elements, we look back at the collection of elements unmatched so far. The unmatched elements are always in non-increasing order from the bottom of the stack to the top, so we can stop at the first element that does not match. Since each element is pushed and popped at most once, this runs in O(N) time rather than the O(N^2) of the nested loops. In the worst case, the stack grows to the length of the array. The stack holds the unmatched elements from the portion of the array covered so far, in reverse order.

Friday, September 1, 2017

We continue reading "Modern data Fraud Prevention at Big Data Scale". Feedzai enables companies to move from broad, segment-based scoring of transactions to individual-oriented scoring with machine-learning-based techniques. Feedzai claims to use a new technology on a new platform and to have the highest fraud detection rates with the lowest false positives. Feedzai uses real-time behavioral profiling as well as historical profiling that has been proven to detect 61% more fraud. They claim true real-time processing and true machine learning capabilities. Feedzai relies on Big Data and therefore runs on commodity hardware. The historical data goes as far back as three years. In addition, Feedzai processes real-time data against vast amounts of data in 25 milliseconds at the 99th percentile. This enables fraud to be detected almost as soon as it is committed.
The machine learning algorithms used include Random Forests and Support Vector Machines. The former is helpful because it is an ensemble of decision trees, which makes it more robust across the different kinds of transactions subjected to fraud detection. In addition, Random Forests handle noise and outliers better. Microsoft's R package sets the standard for these types of algorithms.
The rxFastForest function in MicrosoftML is a fast forest algorithm that is also used for binary classification or regression. It can be used for churn prediction. It builds several decision trees using the regression tree learner in rxFastTrees. An aggregation over the resulting trees then finds a Gaussian distribution closest to the combined distribution for all trees in the model. This helps to generalize fraud detection patterns well and is fast and easy to train and score.
Support Vector Machines, on the other hand, are able to detect non-linear and complex patterns with good predictive power. These are sophisticated classifiers. They build a predictive model by finding the dividing line between two categories. Many lines may separate the data; the one chosen as the best is the line that lies farthest from the nearest points of each category. The points closest to the line are the ones that determine it and are called support vectors. Once the line is found, classifying a new point is just a matter of determining which side of the line it falls on.
#codingexercise
QuickSort partition
Partition(A, p, r)
    x = A[r]
    i = p - 1
    for j = p to r-1
        if A[j] <= x
            i = i + 1
            exchange A[i] with A[j]
    exchange A[i+1] with A[r]
    return i + 1
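
The pseudocode above translates almost line for line into Python. The sketch below is a minimal runnable version using zero-based, inclusive indices; the quicksort wrapper is an addition not shown in the pseudocode, included only to illustrate how the partition step is typically used.

```python
def partition(a, p, r):
    """Partition a[p..r] in place around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place by recursively partitioning (illustrative wrapper)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
```

Partitioning [2, 8, 7, 1, 3, 5, 6, 4] around the last element 4 rearranges the list to [2, 1, 3, 4, 7, 5, 6, 8] and returns index 3.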