Sunday, October 15, 2017

Today we continue to look at emerging trends in Data storage. Retail companies are increasingly relying on data access services rather than databases. This lets them decouple their business layer from the data access and persistence layer. In this domain, they can quickly implement new business requirements, seasonal campaigns and recommendation algorithms. While traditionally these data access and database services have been hosted right out of the relational stores, companies have started adopting big data.  Moreover the data access services to big data are also no longer necessarily with the retail companies as consolidators and even database offerings have improved to take advantage of this emerging trend. MongoDB is one such data services provider that powers the customer facing companies in a comprehensive way. We take a look at this big picture.
Customer facing companies have comparmentalized the handling of user data from channels, stores, mobile devices, website and contact center.  The channels include marketplace such as online retail companies. The stores include point of sale registers and kiosks. The mobile devices include smartphone and tablets. The website includes a variety of landing pages for customers. The contact center includes customer support options. Furthermore, when users login, they do so with the logins from their social engineering applications and are not necessarily bound to the accounts of a specific store.
MongoDB enables this integration with social engineering applications with application servers while also enabling data and service integration for the said user data inflow with APIs for data and services. Web servers may also be made available for any siloed or sweeping data access. At the heart of their store, they can categorize data based on merchandising, content, inventory, customer, channel, sales and fulfilment, insight and social. Out of these merchanding, inventory, customer and insight are the most heavily used for peak holiday sales season.  The supply chain management system and data warehouse require data flows from this backend. The former is used by suppliers for and necessary for the smooth operation of data access and persistence for holiday shopping. The latter is used to power analytics. The merchandising, inventory, customer and insight information management systems enable critical business workflows in the areas of market/offer, sell/fulfill and insight. The market/offer functions includes guide, offer, semantic search, recommend, rule based decisions, pricing and coupons. The sell/fulfill include orders, payments, fraud detection, fulfilment and business rules.  The insight involves session capture and activity monitoring. We review some of these in detail in the next few posts.
Courtesy: MongoDB.

Saturday, October 14, 2017

#codingexercise
In a rowwise and columnwise sorted matrix, enumerate the elements in ascending order


There is a front that moves from the top left corner to the bottom right. Element along the front are guaranteed to include the minimum. Since the front only moves one cell down or to the right, the element along the zig zag shape of the front are easy to enumerate to pick out the minimum at any step.
void PrintSerialized(int[ , ] A, int rows, int cols)
{
if ( A== null || rows == 0 || cols == 0) return;

var items = new List<Tuple<int,int>>();
items.Add(new Tuple<int,int>(0,0));

while ( items.Count != 0 )
{

// print current min
var item = RemoveMinValue(items);
Console.WriteLine(item.GetValue());

// add next candidates
var down = new Tuple<int,int>(item.first+1, item.second);
var right = new Tuple<int,int>(item.first,item.second+1);
if (IsValid(down) && items.Contains(down) == false)
    items.Add(down);
if (IsValid(right) && items.Contains(right) == false)
    items.Add(right);
}
Console.WriteLine(A[r,c]);
}


5  10   12  14
6  11  13   19
7  15  22  29
9  25  32  39

Friday, October 13, 2017

#puzzle
Without computer assistance, find five different sets of three positive integers {k, m, n}
such that k < m < n and
(1/k)+(1/m)+(1/n) = 19/84

Since this involves fractions the gcd of the denominators k,m,n can be considered to be 84.
So multiplying by 84 we can write this as a + b +c = 19 where
a = 84/k
b = 84/m
c = 84/n
and a, b, c must lie in the factors
1,2,3,4,6,7,12,14,21,28,42,84
Moreover a > b > c for k < m < n
By eliminating a few of these, we can say 'a' could start from 7,12 or 14.
We find two such pairs  for (a,b,c)
12, 6, 1  which implies k,m,n is 7,14,84
12, 4, 3  which implies k,m,n is 7, 21, 28
Similarly when a is 14, we get
14, 4, 1  which implies k,m,n is 6, 21, 84
14, 3, 2  which implies k,m,n is 6, 28, 42
Since we tried k = 6, 7, we can try k =5 and use a gcd of 420
which gives us k,m,n as 5, 60, 105

#codingexercise
In a rowwise and columnwise sorted matrix, enumerate the elements in ascending order
There is a front that moves from the top left corner to the bottom right. Element along the front are guaranteed to include the minimum. Since the front only moves one cell down or to the right, the element along the zig zag shape of the front are easy to enumerate to pick out the minimum at any step.

Thursday, October 12, 2017

#puzzle
 A rearrangement of the letters of a word has no fixed letters if, when the rearrangement is placed directly below the word, no column has the same letter repeated. For instance, the blocks of letters below show that ESARET is a rearrangement with no fixed letters of T E R E S A, but R E A S T E is not.
T E R E S A              T E R E S A
E S A R E T              R E A S T E
How many distinguishable rearrangements with no fixed letters does T E R E S A have? (The two E’s are considered identical.)
Since we have been given a small search word, we could easily enumerate the combinations. For example, we can place the 2 E's in different position as follows:
E ? E ? ? ?
E ? ? ? E ?
E ? ? ? ? E
? ? E ? E ?
? ? E ? ? E
? ? ? ? E E

This yields 6 ways to do so. One way to think about this is that we are moving each of the E's to the remaining four positions and therefore there are 4-Choose-2 ways = 6 ways.
Taking one of the six arrangements above, we can also enumerate the possible arrangements of the remaining letters. Let us say E now occupies new positions X and Y. Neither X nor Y can be in their original position no matter where we put them.  So we have to place the last two letters which we call M and N and there are only 4 positions to choose from.  M can either be moved to where N was or it can be moved to one of the original positions of E. In the former case there are three positions remaining for N so there are 3 ways to do so. In the latter case there are two E's so there are two ways each for M and for N. There fore there are a total of 3 + 2 x 2 = 7 ways to place the last two letters. Moreover X and Y can also be interchanged so we have 7 times 2 equals 14 ways. Since we are doing this for each of the 6 ways to place E's to start with, we have a total of 6 x 14 = 84 ways. It is possible to enumerate all these 14 ways for a given starting arrangement of E and see that we can repeat the same for any of the remaining  arrangements of E.

Wednesday, October 11, 2017

codingexercise
Find the sub-tree with maximum  color difference in  a 2 colored tree
We define color difference as the magnitude of the number of vertices of one color minus the number of other color vertices where each number is a positive count.
Since it is not convenient to return all the subtrees and exhaust the search space,
We will keep track of minimum and maximum color difference of two subtrees and we will color 1 or –1 so that the equal number of both balance each other out.
When we perform a depth first search, we can find the maximum of any color by assigning  1 to that color. Since depth first traversal finds subtrees we have the maximum available for a sub tree.
When we invert the color, we can find the maximum of the opposite color with a similar depth first search.
During the depth first search for every node as the starting point, to find
1) max sum sub tree, we take a node if its value is greater than 0 => sum(parent) += max(0, sum(child))
2) min sum sub tree, we take a node if its value is lesser than 0 => sum(parent) += min(0, sum(child))
Thus for each node we have a color count array . Previously visited nodes already have this counted so this is a form of overlapping subproblems which we alleviate by memoization
After each depth first search we now have a color count array of a sub-tree for which we can find the maximum
When we get the maximum of both colors this way from the color count array, we know that the maximum color count difference is between this high number and nothing. Therefore, we have the color difference we are looking for.

int maxDiff(Node root, List<int> colors, int N)
{
   var counts = new List<int>();
   dfs(root, colors, ref counts);
   int max = counts.Max();
   colors.ForEach (x => Invert(x));
   dfs(root, colors, ref counts);
   max = Math.max(max, counts.Max());
   return max;
}

Tuesday, October 10, 2017

An essay in the system design of SignIn workflows. 
Most web applications require to authenticate a user via signing in. Such applications of all sizes usually delegate the task to a membership provider which takes different forms. We talk about some of these manifestations and their pros and cons. The task inherently may involve something as simple as looking up a user account in the database, or looking it up in the Active Directory or looking it up using API calls for industry accepted authentication and authorization protocols
Fundamentally the account is needed for a session on the application so that the user can seamlessly visit one or more pages without having to authenticate on each page.  When the identity is linked to an account that spans one or more business, then the notion of a single user account is lost and we involve a pool of accounts all pertaining to the same user.
User account when stored and indexed is easy to lookup and results in faster retrieval. Social engineering applications use a mix of relational as well as NoSQL stores for the purposes of storing accounts and their resources respectively. Typically they have a sql connector over their NoSQL so the querying can be expressed in the same language. They need to store user resources such as chat, images and logs in big data because the volume is large and data processing is batch oriented. With user accounts, even the logs are interesting for audit purposes and consequently important to store.
SignIn Applications can be categorized as front-end oriented, middle-tier motivated or back-end performance driven although they may use all three to some degree. When these applications are front-end oriented, we often see the use of Model-View-Controller kind of patterns that evolve to include viewmodels. Since there are re-usable parts of workflows for signin across applications like registering, login, password retrieval, using one-time passcodes, verifying with the help of email or phone and many  more, they often take the shape of widgets where workflows are described in the form of widgets which can be composed or standalone.  Front-end heavy applications pay special attention to consistency and user experience across different use cases.
Middletier motivated applications can be recognized by their dependencies on services or service oriented architecture to be precise. Every task small or big is now a service call away and may exist even behind a gateway or an information bus. These applications were earlier designed with a hierarchy of individual services organized in object-oriented terms while each retaining its independence with its own address binding and contract definitions. More recently, this organization has become flat, fine grained and with the notion of microservices and perhaps serverless computing. When services are involved, applications tend to bring standard enterprise practices such as those describe by the enterprise application blocks where logging and metrics are part of the service implementations. Services also enable independence from data store concerns which brings us to the third category.
Indeed some of the sign-in applications especially for web-applications implemented using well known frameworks make use of data stores in the form of membership providers where the store itself performs search and retrieval using a standard interface that web-application frameworks expect. This has made it convenient for the application to reduce its footprint of signin logic so that the data is kept organized and there is little code involved. Such stores come with the benefit of being the single source of truth at any time.
Thus we see that there are different forms of SignInApplications and their implementations.



#puzzle
How many positive five-digit integers are there consisting of the digits 1, 2, 3, 4, 5, 6, 7, 8, 9, in which one digit appears once and two digits appear twice? For example, 41174 is one such number, while 75355 is not.
If one number appears once, it can be chosen in 9Choose1 ways.
The other two digits that appear twice can be chosen in 8Choose2 ways.
The number of permutations of these selections is given by
permutations based on number of digits  divided by permutations based on repetitions

= 5! / (2!2!1!) = 30

Therefore the total number of numbers is 9Choose1 x 8Choose2 x 30 = 7560 possible numbers that satisfy the given criteria

Monday, October 9, 2017

#puzzle
The integer 72 is the first of three consecutive integers 72, 73, and 74, that can each be expressed as the sum of the squares of two positive integers. The integers 72, 288, and 800 are the first three members of an infinite increasing sequence of integers with the above property. Find a function that generates the sequence and give the next three members.  
We notice that 72 is 6^2 + 6^2 
288 is 12 ^12 
800 is 20^2 + 20^2 
We also note that the next to next number for 72 is merely (6-1)^2 + (6+1)^2 because that will add two 
Similarly the next to next number for 288 is (12-1)^2+(12+1)^2 because that will add two 
And the next to next number for 800 is (20-1)^2+(20+1)^2 
So the middle number must also have a pattern because the function exists. Given that we have three integers to see the pattern, we can generalize it with n 
Consequently the function can be expressed with patterns for all three terms. 
Let us take a closer look at the middle number
73 = 3^2+8^2 = (6-3)^2+(6+2)^2
289 = 8^2 + 15^2 = (12-4)^2+(12+3)^2
801 = 15^2 + 24^2 = (20-5)^2 + (20+4)^2

Therefore given 2n^2 as the first number, the second number is (n-k-2)^2+(n+k+1)^2 and the third number is (n-1)^2 and (n+1)^2 where k is the count of sequence printed starting with 1.
Since the second number is the first number plus 1 we have an equation in n and k
as 2n^2+1 = (n-k-2)^2+(n+k+1)^2
   n = k^2+3k+2
Given k = 4, 5, 6 the next three numbers are 1800, 3528, 6272