Cluster computing

Saturday, July 4, 2015

Matrix-chain-order problem.

A matrix chain order determines the optimal number of scalar multiplications needed to compute a matrix chain product. Even though matrix multiplication is associative, the order of parenthesizing the product impacts the cost of the product. This is because the order determines the number of scalar multiplications to be performed because a matrix multiplication cost is based on the columns of the preceding matrix to be multiplied with the number of rows from the succeeding matrix. Therefore, by fully parenthesizing the matrix product of A1, A2, … An matrices we can compute the total cost and consequently the optimal solution will have an order of multiplication corresponding to the lowest cost.

To solve this problem, we apply dynamic programming by following the four-step sequence:

1. Characterize the structure of an optimal solution

2. Recursively define the value of an optimal solution

3. Compute the value of an optimal solution

4. Construct an optimal solution from computed information

The first step can be performed by arguing that finding the optimal solution of the product of A1, A2 … An is the same as finding the optimal solution of two subproblems obtained by splitting the range into two involving A1…Ak and Ak+1 …An and then combining these optimal subproblem solutions.

Next the recurrence is described with a termination condition as zero cost multiplication when the chain consists of only one matrix. The progress condition can be described as the minimum cost for the range i to j as the minimum of the sum of the cost associated with the sub problems plus the scalar multiplication cost of the results where this sum is calculated one for each chosen k before taking the minimum.

Step 3 can be performed by using a bottom up approach. Given that each matrix Ai has pi-1 x pi dimensions, a bottom up approach solves incremental dimensions so that the solution of the matrix involving previous dimensions can be readily looked up. There are two tables maintained – one that stores the cost of computing a matrix chain product and another auxiliary table that records which index k achieved the optimal cost in computing the cost stored.

Finally we can compute the optimal solution using the auxiliary index table above. We can find the k from the auxiliary table that stores it. We can determine the earlier matrix multiplications recursively since s[1,s[1,n]] determines the last matrix multiplication when computing the preceding component and the s[s[1,n]+1,n] determines the last matrix multiplication when computing the succeeding component of the product of the two subproblems. Again the termination condition is the single element chain.

Moreover, as with dynamic programming problems with bottom-up approach, there is also a corresponding recursive solution using a top down strategy and optional memorization to improve efficiency . That top down stragey is based directly on the recurrence mentioned above. We still maintain a table of costs but the order of filling the table is more like the order of recursion.

The bottom up strategy:
Matrix-Chain-Order(p)
n = p.length - 1
let m[1..n, 1..n] and s[1 .. n-1, 2..n] be new tables for cost and index of optimal
for i = 1 to n
m[i,i] = 0
for l = 2 to n // chain length
for i = 1 to n-l+1
j = i+l-1
m[i,j] = infinity
for k = i to j-1
q = m[i,k] + m[k+1,j] + pi-1pkpj
if q < m[i,j]
m[i,j] = q
s[i,j] = k
return m and s

The top down strategy:
Recursive-Matrix-Chain(p, i, j)
if i == j
return 0
m[i,j] = infinity
for k = i to j - 1
q = recursive-matrix-chain(p,i,k) + recursive-matrix-chain(p, k+1,j) + pi-1pkpj
if q < m[i,j]
m[i,j] = q
return m[i,j]

Friday, July 3, 2015

#codingexercise

/***

Add 2 numbers represented as linked list and return the resulting sum as a linked list.

How are numbers represented as linked list?

Suppose we have 459, it is represented as:

4->5->9->null

So 4 is the head and 9 is the tail.

Say we are given 2 numbers like: 459 and 891, their sum is 1350. That will be represented as:

1->3->5->0.

Implement the function that takes head pointer of 2 numbers as linked list and

returns a head pointer to a new linked list, which is basically the sum of both the numbers.

***/

Node GetSum(Node root1, Node root2)

{

// read the value from root 1

int val1 = GetValue(root1);

// read the value from root 2

int val2 = GetValue(root2);

// write the value of root1 + root2

if ( val1 + val2 < val1 || val1 + val2 < val2) // overflow

return NULL; // or raise exception

return WriteValue(val1 + val2);

}

Node WriteValue(int val)

{

Node root = NULL;

int v = val;

int len = 0;

while (v != 0 ) {

len = len + 1;

v = v / 10;

}

v = val;

Node Prev = null;

for ( i = 0; i < len ; i++)

{

var n = new Node();

n.Data = (Math.floor(v / Power(10, len - i - 1))) % 10;

n.Next = Null;

if (Prev) {

prev.next = n;

prev = n;

}

else

root = n;

}

return root;

}

int GetValue(Node root1)

{

Node cur = root1;

int val1 = 0;

int len = 0;

while(cur)

{

len = len + 1;

cur = cur.Next;

}

// len = 2

cur = root1;

for ( i = 0; i < len; i++)

{

val1 += cur.Data * Power(10, len - i - 1);

cur = cur.Next;

}

// val1 = 29

}

NULL list2

list1 NULL

UnitValue List2

list1 UnitValue

ArbitraryValue List2

List1 ...

Overflow & Underflow

Thursday, July 2, 2015

Codingexercise:
Given a linked list say
LL = 1-2-3-4 and the position k = 2, swap the nodes k and n-k such that the output now becomes:
1-3-2-4

Node Swap(Node left, Node right, Node prevleft, Node prevright, Node root)
{
Assert(left != NULL && right != NULL)
// prevleft left leftnext prevright right nextright
// prevleft right leftnext prevright left nextleft
if (left == right) return root;
Node leftnext = left.next;
Node rightnext = right.next;
left.next = rightnext;
if(prevright && prevright != left) { Prevright.next = left; }
// if (prevright && prevright == left) {left.next = rightnext;}
if (leftnext != right) right.next= leftnext;
else {right.next = left;}
if(prevleft) { Prevleft.next = right;}
else return right;
return root;
}

Node SwapKth(Node root, int K);
{
int len = 0;
Node cur = root;
while(cur){
len = len + 1;
cur = cur.next;
}
Node left = root;
Node Prevleft = root;
Node right = root;
Node Prevright = root;
if (K < 0) return root;
if (K > len) return root;
for (int i = 0; i < K-1; i++){
Prevleft = left;
left = left->next;
}
for (int i = 0; i < len-K; i++){
Prevright = right;
right = right->next;
}
if (K > len - K)
{
temp = left;
left = right;
right = temp;
temp=prevleft;
prevleft = prevright;
prevright = temp;
}
if (left && right)
return Swap(left,right, prevleft, prevright, root);
return root;
}
Test cases:
Empty Set
NULL 1
1 NULL
1 2 ( checks in the swap function)
1 2 K = 0
1 2 K = 1
1 2 K = 2
1234 K = 2
Prevleft left right PrevRight
1 2 3 2
1 2 2 1
1 2 5 4 len = 6 K = 2
3 4 1 NULL len = 4 K = 4
NULL 1 4 3

This was an actual interview question. How do you improve high availability and low latency. My answer was you improve HA with fault tolerance and at least one fail over server. For low latency, you don't do anything as an application programmer. You delegate it to Web accelerators and data deduplicators. If the application was talking to another instance of itself, then it could make its protocol less chatty.

Wednesday, July 1, 2015

Today we discuss another LeetCode challenge. This is a problem on Surrounded regions Give a 2D board containing X and O, capture all regions surrounded by X. A region is captured by flipping all O's in Xs
Here the O that lies on the boundary and those that are connected to it are exempt. Everything else can be captured. Consequently we work our way from outwards boundary and distinguish these Os by calling them say Y.After each boundary is considered, we eliminate it and repeat the step for the inner 2D board.

We use a method IsAdjacentAMatch(Position p, Piece item) that tests each of the eight positions adjacent to the current position for a match with the given item - say an X or a Y or a O.
For each of the elements on the border, we consider the adjacents and if they are O, we convert them to Y.
Finally for all the non-Ys we make a pass through the board and convert them to X.
void IsAdjancentAMatch( Position p, Piece item)
{ int x = P.x;
int y = P.y;

// test x-1,y-1
// test x-1,y
// test x-1, y+1

// test x, y-1
// test x, y+1

// test x+1, y-1
// test x+1, y
// test x+1,y+1

// In each of the test above for a match with Y or O, if its within the Board, mark it with Y
}

public class Solution {
public void Solve(ref char[,] board, int top_left_x, int top_left_y, int bottom_right_x, int bottom_right_y) {
// termination condition
if (bottom_right > top_left) return
// Take the elements along the border of the board
// 0th row
// last col
// n-1th row
// first col
If any element has an adjacency with Y, mark it Y
// Recurse with the inner board ( current board without the border)
Solve(ref b, top_left_x + 1, top_left_y + 1, bottom_right_x - 1, bottom_right_y -1)

}
}

#codingexercise
Double GetNthRootProductAlternateTermRaisedPDividedQoverPTimesQ PminusQ(Double [] A,Double p, Double q)
{
If ( A== null) return 0;
Return A.NthRootProductAlternateTermRaisedPDividedQoverPTimesQPminusQ(p, q);
}

Tuesday, June 30, 2015

Django middleware vs Openresty:

Django is a python framework suitable for building and deploying production quality web application software. It enables model, view controller framework out of the box for fast web application development. In addition, it integrates seamlessly with rest_framework and documentation support to provide documented REST based APIs. Django also provides what is called a middleware to enable http modules or plugins to be written that can modify the request or response or mutate the View parameters. This is defined as a sequence of classes in the settings file. The hooks that middleware classes uses for their processing include request , response, view, template_response and exception. Django middleware are typically used for sessions, authentication, CSRF protection and GZipping content .

OpenResty is a full-fledged web application server that bundles the core and modules of Nginx that opens it up for developers to configure via Lua programming/scripting language. These modules and the core are written in C and are therefore highly performant and cross platform compatible. This web server is able to handle 10K plus connections. Openresty enables a developers server side web application to be run entirely within Nginx server. Nginx is known to support asynchronous IO not only with http clients but also with database backends.

To compare the two, we review a common use case – the ability to route REST API calls to different API providers. This functionality is often referred to as an API gateway.

Only some of the tradeoffs are presented as follows:
Features   Django Middleware      OpenResty bundle
Support request and
response modification      Yes         Yes

Support authentication and
authorization including OAuth    Yes         Yes

Support large number
of connections    Yes         Yes, but this is faster

Support on a wide    More popular by
variety of hardware    virtue of LAMP stack      Not as popular

Production quality and
quality control      More tested in commercial    Less tested

Memory and Disk I/O      Can be large    Aims to be more
                    performant with little
memory footprint

Software support      Generally not needed. Some   Requires workarounds
          legacy platforms are still in use and fixes
          without issues, more reliable

Today we continue to do some more coding exercises.
Let us look at a Skyline problem. This appeared in Leetcode challenge. A city's skyline is the outer contour formed by the buildings. The buildings are given by (Li, Ri, Hi) where Li is the x-coordiante of the building's left edge and Ri is the x-coordinate of the building's ending edge. Hi is the height of the building. The buildings are arranged left to right in increasing order of their left edges.
The output is a list of key points where the contour changes direction. The number of building is guaranteed to be less than 10000. There must be no consecutive horizontal line of equal height and should appear merged instead.

public class Solution {
public IList<int[]> GetSkyline(int[,] buildings) {
var ret = new List<int[]>();
int x = 0;
int y = 0;
var stk = new Stack<int>();
for (int i = 0; i < buildings.Length; i++){
if (buildings[i][2] > y) {
if (stk.isEmpty() == false) {
int j = stk.Pop();
if (building[i][1] < building[j][1])
x = building[j][1];
AddToList(ref ret, x, y);
y = 0;
AddToList(ref ret, x, y);
}
stk.Push(i);
x = buildings[i][0];
AddToList(ref ret, x, y);
y = buildings[i][2];
AddToList(ref ret, x,y);
}
else
{
if (buildings[i][1] < buildings[stk.Peek()][1))
continue;
int j = stk.Pop();
x = buildings[j][1];
AddToList(ref ret, x, y);
y = buildings[i][2];
AddToList(ref ref, x, y);
stk.push(i);
}
}

}
private static void AddToList(ref List<int[]> ret, int x, int y)
{
var current = new int[2];
current[0] = x;
current[1] = y;
ret.Add(current);
}
}

Monday, June 29, 2015

Trie data structure:
A trie is also referred to as a prefix tree or radix tree and unlike B-Trees which store the keys in the node, a trie uses the position of the node to construct the prefix. All the descendants of a node have a common prefix associated with that node. The prefix starts out empty at the root and gets appended as we descend down the children. The term 'trie' comes from retrieval. A trie comes in very useful for auto-complete and spell-checker.
A trie stores words with the letters on the edges and as we walk down, we read one letter at a time. Words can end or be a prefix of others. To do a lookup, we just walk down the tree letter by letter and then see if the node has an ending $ to the node. If we get to a null pointer, we know that the word is not there. To efficiently save the prefixes, paths that don't branch into a single edge are compressed and only split when needed.
To lookup the words, we use radix sort.
void RadixSort(Array A, int digits)
{
for (int i = 1; i <= digits; i++)
      InsertionSort(A, i);
}

void InsertionSort(A, d) {
for (int i = 1; i <= n; i++)
{
var x = A[i];
int j = i -1;
while (j >= 0 && A[j] > x)
{
// like arranging a hand of playing cards.
A[j + 1] = A[j];
j = j - 1;
}
A[j + 1] = x;
}
}

We now look at inserting words into a trie. When we add a word to a vertex, we will add word to the corresponding branch of vertex cutting the leftmost character of the word. If the needed branch does not exist, we will have to create it.

class TrieNode:
         def addword(self, word):
                if len(word) == 0:
                    self.isWord = True
                    return
                 key = word[0]
                 word = word [1:]
                 if self.next.has_key[key]:
                     self.next[key].add_item(input)
                 else:
                      node = TrieNode()
                      self.next[key] = node
                      node.add_item(string)

         def dfs (self, sofar = None):
                # we print in two cases
                if self.next.keys() == []:
                    print 'Match:'+sofar
                       # this could be modified to look for similar words by backtracking to the parent and looking for siblings
                if self.isWord:
                    print 'Match:' + sofar
                for key in self.next.keys():
                    self.next[key].dfs(sofar)

        def autocomplete(self, word, sofar=""):
                if len(word) == 0:
                    if self.isWord:
                        print 'Match:', sofar
                    for key in self.next.keys():
                        self.next[key].dfs(sofar)
               else:
                    key = word[0]
                    word = word[1:]
                    if self.next.has_key(key):
                        sofar = sofar + key
                        self.next[key].autocomplete(word, sofar)



courtesy:sarathlakshman.com

WebAPIs often use various authentication methods such as basic authentication, token authentication, or even session authentication. However, applications using the webAPIs may also require sessions to associate the resources for the same user using the same user-agent* (defined in RFC 2616). For example, consider the Cart API. To get the items in the shopping cart of a user the application may have to use the GET method on the Cart API as follows:
GET Cart/ API
“X-Merchant-ID: client_key”
“X-Session: <user session identifier>”
The application can also add or edit items in the shopping cart or delete it. This example illustrates the need for the application to associate an identifier for the scope of a set of webAPI calls.
This session identifier is usually obtained as a hash of the session cookie. And is provided by the authentication and authorization server. The session identifier can be requested as part of the login process or by a separate API Call to a session endpoint. An active session removes the need to re-authenticate. It provides a familiar end-user experience and functionality. The session can also be used with user-agent features or extensions to assist with authentication such as password-manager or 2-factor device reader.
The session identifier is usually a hash of the session. It is dynamic in nature, for external use and useful only for identifying something that is temporary. It can be requested based on predetermined client_secrets.
Sessions can time out or be explicitly torn down – either by the user or by the system forcing re-authentication. Therefore session management must expire/clear the associated cookie and identifiers.
Sessions will need to be protected against csrf attack and clickjacking just as they are for other resources.
Sessions are treated the same as credentials in terms of their lifetime.The user for the session can be looked up as follows:
{HttpCookie cookie = HttpContext.Current.Request.Cookies[FormsAuthentication.FormsCookieName];
FormsAuthenticationTicket ticket = FormsAuthentication.Decrypt(cookie.Value);
SampleIdentity id = new SampleIdentity(ticket);
GenericPrincipal prin = new GenericPrinicipal(id, null);
HttpContext.Current.User = prin;}