Cluster computing

Thursday, May 8, 2014

In this post, we will quickly review the design patterns:
First is the singleton pattern. Some of the core system functions are commonly accessed via a singleton pattern. Remember to use static, remember to hide the constructor, remember to count instances and remember to make it thread safe.
Builder is another pattern. The builder creates an object on your behalf in a step wise manner without knowing how it is created. This is very common to see in object relational mapping where we use a construct like
new Table().AddColumn().AddColumn().AddColumn(). .. etc
instead of Table(col1, col2, col3);
Another advantage is that it can create a hierarchy of builders.
Factory method pattern applies a concept of a factory method which is one that creates and returns an object.
An Abstract factory is one which creates other objects. This helps use many factories to create objects usually when they are of the same kind. There is a separation between the concrete classes and the callers of the factory.
An Observer pattern is one that is also called a Publish-Subscribe pattern or pub-sub pattern for short. Observers register themselves with the subject using an interface for update notifications. The model view controller is an example of this pattern because changes to the model automatically update the views.
One caveat in the observer pattern is that when there are frequent updates, the object may have to spend a lot of time notifying its observers. This is particularly true when there are multiple properties getting updated. It would then be better to consolidate the updates. Also the subject could pass the information on what has changed.
Decorator patterns are often applied in wrappers. The behavior of the underlying object is changed by the decorator.
A decorator is very different from an interface. While provide methods for certain behavior from the underlying objects. In one the object is wrapped and presented as another object while in the latter the object itself provides the desired behavior. Typically a decorator is used when the behavior has to be radically changed.
Using the decorator also helps avoid class explosion when there are many different behaviors added to an object.

In today's post we look at a few more coding questions:
How do we convert a binary tree to a heap ?
There is a shortcut answer for this.
Remember that both can be serialized into arrays.
So we simply sort the array
Then we can deserialize them into the respective data structures.
In the case of the heap, the left and right nodes are at 2i and 2i+1
For the binary trees, the deserialization depends on whether the elements are listed in pre-order, inorder and postorder traversal.
Also, when we sort the array, remember that we can used different comparators.
We discuss the pancake problem next.
The pancake problem has a stack of n pancakes each with a different diameter. When you flip the pancakes, you can flip more than one by inserting your pancake flipper at any level into the stack and reversing it. The question is how many flips will it take to sort all the pancakes by size with the bottom pancake being the largest. We are looking for something that will work for in most cases of different orders of pancakes.
The first thing to observe here is that this is way different from sorting. In sorting we can exchange elements in a collection. This is not the same for this pancake stack where we must lift all the pancakes above the point where we insert the pancake flipper.
Consider three cases for the largest pancake.
If the largest pancake is already at the bottom, we don't need to do anything.
If the largest pancake is at the top, we flip the entire stack so the largest one lands at the bottom.
If the largest pancake is in the middle, we split the stack at that point. Put the pancake at the top of one stack and then invert it.
Because the pancakes at the bottom are unaffected by the flips above, we proceed from bottom to top in this way.
Since it takes two flips for the bottom pancake, we can assume that this takes the same number of flips for each pancake resulting in 2n-2 flips in worst case.
The pancakes problem has an interesting history. Bill Gates published a paper that was the most efficient for almost 30 years [Courtesy programming interviews exposed book]

Wednesday, May 7, 2014

We look at a few more coding questions today. First we look at breadth first traversal and depth First traversal. Depth first traversal we can write recursively as follows with no additional data structure.
DFS-Visit(Root)
{
mark root as partially visited
for each vertex v adjacent to Root
if color of v = unvisited
set parent of v
DFS-visit(v)
mark root as visited
}
Next we look at breadth first search
In BFS, we maintain an additional data structure to store the siblings at any level.
BFS(G, s)
{
// initialize
ENQUEUE(Q,s)
While Q is not null
u = DEQUEUE(Q)
for each v belonging to Adj[u]
do if color of v = unvisited
then color it partially visited and
ENQUEUE(Q)
mark u as visited
}

I'm going to cover a few more questions but first I wanted to take a minute. In particular I want to cover graph based coding questions and revisit to distributed computing.

I will take a small example for edit distance first :
Find the minimum edit distance between two input strings. Edit distance is the minimal number of edit operations to modify a string from one state to the other.
For eg : Saturday can be changed to Sunday using the following transformations:
Saturday -> Sturday
Sturday -> Surday
Surday -> Sunday
if we denote the edit distance by a function f(i,j) between the substring of the first string ending with the j'th character and the substring of the second string ending with the i'th character. Then we know that f(i,0) = i since the because we need to delete that many characters to get an empty string. Similarly f(0,j) = j because the empty string is at the beginning of that same substring ending with the 0th character.
If the jth character of the first string is different from the ith character of the second string, then
there are three different techniques involved in the transformations and finding the edit distances. These are insertion, deletion and substitution.
We note that we can make the lengths equal by deleting the characters from the longer string or inserting the characters into the shorter string. We will shortly see how.
If the jth character of the first string is the same as the ith character of the second string, then the edit distance f(i,j) is the same as f(i-1,j-1)
If there is substitution involved then the edit distance f(i,j) = f(i-1,j-1) + 1
if we insert the ith character of the second string into the first, then edit distance f(i,j) = f(i - 1, j) + 1
If we delete the jth character of the first string, then the edit distance f(i,j) = f(i, j-1) + 1
Thus if the i != 0 and j != 0 and string [i] != string [j], then the edit distance is the minimum of
f (i-1,j) + 1 , f(i, j - 1) + 1 and f(i - 1, j - 1) + 1
With the above formula comprising of case i = 0 case j = 0 case string[i] == string[j] and string[i] != string[i,j] we can create a table of edit distances by enumerating the characters of the first string as columns and those of the second string as rows and using the substrings to compute edit distances for any pair of substrings. Note that we introduce a column at the beginning of all rows and we introduce a row at the beginning of all columns to hold the position for the empty string and we initialize the value of this first cell as zero. we can computer the edit distance for other pairs of substrings starting from here. At each additional row or column, we use the edit distance from the previous row or column and we find the minumum value from the three operations listed above .At the diagonal end of the first cell which is the last cell we will compute the edit distance of the entire first string and the entire second string.

Algorithm: The function to implement the above has the following loops
first loop to set the value of the first row ( corresponding to empty string )
second loop to set the value of the first column ( corresponding to the empty string)
then for each pair of characters from the first and the second string,
if they are equal we set it to the previous diagonal value
else (they are different and) we find the costs for insertion, deletion and substitution and choose the minimum with which we update the corresponding position in the table

We quickly review some graph algorithms next:

Minimum spanning trees:
Kruskal's algorithm:

// Make-set of all vertices
sort the edges of E into non-decreasing order by weight w
for each edge (u,v) belonging to E taken in that order
connect the edge if they belong to different sets

Prim's algorithm:
Put a vertex in the min-priority Q sorted by the weights
While Q is not empty,
extract the minimum
for each vertex v belonging to adjacents of u
if v belongs to Q and w(u,v) < key(v)
connect v to u and set its key

Shortest paths are found by relaxing the edges

Djikstra's shortest path:
Initialize-single-source(G,s)
Initialize Q with V[G]
While Q is not empty
u = extract-min(Q)
add u to shortest path vertices
for each vertex adjacent to u
relax (u,v,w)

In this post we look at a few more coding questions,. In particular we look at a technique called backtracking.
Backtracking helps over brute force method.
Take the alphabet matrix as follows:
a b c e
s f c s
a d e e
There is a path for the string 'bcced' but not for path abcb. Re-entering is not possible
This technique is called backtracking because it returns back to the previous state and uses pop operations from a stack. That is when it finds a character it pushes it on to the stack and for all the rejection where the character is not the same as the corresponding character, it backtracks and pops from the stack.
Since all entries can be entered only once, we keep a boolean flag pertaining to each entry.
bool hasPath(char* matrix, int rows, int cols, char* str)
{
// parameter validation
// bool* visited for the matrix of used flags
int pathLength = 0;
for (int row = 0; row < rows; row++){
for (int col = 0; col < cols; col++){
if (hasPathStack(matrix, rows, cols, row, col, str, pathLength, visited))
return true;
}
}
//cleanup
return false;
}

bool hasPathStack(char*matrix, int rows, int cols, char* str, int& pathLength, bool* visited)
{
// if the boundary conditions are satisfied &&
// if the path has not been found yet &&
// if the element has not been visited already
++ pathLength;
// visited set to true
// visit one of the four neighboring cells
bool hasPath = hasPathStack(matrix, rows, cols, row, col - 1, str, pathLength, visited) ||
hasPathStack(matrix, rows, cols, row - 1, col, str, pathLength, visited) ||
hasPathStack(matrix, rows, cols row, col + 1, str, pathLength, visited) ||
hasPathStack(matrix, rows, cols row + 1, col, str, pathLength, visited) ;
if (!hasPathStack) {
--pathLength; // backtracking
// visited set to false
}
}
[from the coding interviews book]

Tuesday, May 6, 2014

In this post we cover some coding questions based on string operations, bitwise manipulation, recursion, permutations and combinations.
Suppose we were to write down all the combinations of the indices of numbers in a given list of numbers that add up to a particular sum:
The algorithm would look something like this assuming we have a sorted list of numbers and we declare a type IndexedNumber that stores the value and the index corresponding to each entry in the input list
public static void combine(ref List<IndexedNumber> numbers, ref List<IndexedNumber> candidate>, ref List<List<IndexedNumber>> sequences, int level, int start, int n)
{
for (int i = start; i < numbers.Count; i++)
{
if(candidate.Contains(numbers[i]) == false)
{
candidate[level] = numbers[i];
if (candidate.Sum() == n)
sequences.Add(new List<IndexedNumber>(candidate));
if (i < numbers.Count - 1)
combine (ref numbers, ref candidate, ref sequences, level + 1, start + 1, n);
candidate[level] = new IndexedNumber() { Number = 0, Index = -1};
}
}
}
Let's now look at a bit manipulation problem.
We define the problem this way: Write a function to convert an integer x into an integer y and find the number of bits that will need to be changed. For example, given 12 and 16, it should return 3
because 12 is 0xC or 1100 and 16 is 0x10 or 10000 and we have to change 3 bits
int ConvertXtoY(int x, int y)
{
int a = x;
int b = y;
int count = 0;
while ( a || b)
{
if ( (a & 0x1) ^ (b& 0x1)) count ++;
a >>= 1;
b >>= 1;
}
return count;
}
String operation question:
find the longest repeating substring:
char* getMatch(const char* str)
{
char* match = NULL;
int len = 0;
char* first = str;
while (first)
{
char* second = str++;

// case 2
while (second)
{
    char* lMatch = NULL; //local match
    int llen = 0; // local match len
    char* orig = first;
    char* local = second;

    //case 0
    while ( local && *local == *orig)
    {
      if (llen == 0) lMatch = orig;
      llen++;
      local++;
      orig++;
    }

    if (llen > len ) { match = lMatch; len = llen;}

    //case 1
    second++;
}
first++;
}

return match;
}

Monday, May 5, 2014

Today we discuss some more of the .Net APIs again from the Framework class library. Here the we first take a brief look at System.Timers The System.Timer namespace provides a component that allows us to raise an event on a specified interval. This is a server based timer that allows us to specify a recurring interval at which the Elapsed event is raised. Then this event can be used to provide regular processing. Where these server timers differ from windows timers is that these timers can move among threads to handle the Elapsed event. The arguments to the Elapsed event provide data for handling it.
When we look at the implementation of this, it uses the underlying System.Threading.Timer class and GetSystemTimeAsFileTime method for the time arithmetic. On each update of the enabled, it updates the existing underlying timer. If a zero value was provided, it disposes the existing timer and sets the enabled to value. By updating the timer, it changes the value of the interval.
Let us now look at the System.Threading.Timer.
The System.Threading.Timer takes as initialization parameters : callback, state, dueTime and period. For initialization it makes use of an object called the TimerHolder. On each update of the timer it changes the dueTime and period. The TimerHolder in turn takes a TimerQueueTimer with the same initialization parameters for its initialization. The TimerQueue maintains a list of active timers in the AppDomain. It uses a single native timer supplied by the VM to actively manage all the timers.
The data structure used to maintain all the active timers is a doubly linked list because it requires fast insert and delete while traversal can be linear. The FireNextTimers method on this queue traverses each timer to see whether the elapsed is greater than the dueTime and deletes it if it's not periodic. If its the first time, it will be fired on this thread otherwise it will be fired on the thread pool. If the timer hasn't fired yet, it just updates the next time the native timer fires. By updating a timer, not only is its dueTime, period and startTicks updated but also its prev and next in the list.
Given the ease of programmability in C# with the Framework APIs, writing or even rewriting an entire application in C# would be interesting.
We will continue to review the rest of the Framework APIs but I want to take a brief break to solve a puzzle.
How do we decode Kanjii characters ? Let's treat Kanji as Unicode. We know that Unicode gives a codepoint to each of the characters represented in the standard. UTF-16 encoding uses the two 8bit bytes for the regular ISO characters and upto two 16 bit units to handle each additional characters. Essentially a text represented in UTF-16 encoding could have variable number of bytes for each character. We therefore walk the text a character at a time by identifying the byte boundaries much the same way as we carefully walk t the headers of a packet. In code this could translate as an iterator over the bytes that appends to a text a character deciphered by reading one or more bytes at a time. Each character could be represented as a codepoint in the usual notation of U+hex or U+four_digit_hex or U+five_digit_hex or more depending on the blocks and planes used as defined in the Unicode standard.
We will also briefly review another problem in which given an inorder and a preorder traversal of a tree, we will construct the tree and return the root.
So given an inorder traversal of 1,2,3,4,5,6,7
and a preorder traversal of 4,2,1,3,6,5,7.
This has already been treated earlier but the idea is reinforced with the following principle.
Given a node in the preorder list we find the root and the size of the left subtree by simply counting the number of nodes that appear before it in the inorder. So take 4 and the number of nodes before it is three in inorder. Therefore the fourth node from 4 in the pre-order is the right child. To determine the left child, take the immediately adjacent node in the pre-order.
In the above example, if we take a tree with the nodes 3 and 5 missing,
we have the inorder as 1,2,4,6,7
and preorder as 4,2,1,6,7
Here the left child of 6 is null and the right child of 2 is null.
We determine the missing left child by noticing that the item before in the inroder is the parent or null
We determine the missing right child by noticing that the item after in the inorder is the parent or null.
Always review and test all code.
We now look at some more .NET APIs
System.Threading also has other primitives typically available from the operating system.
These include locking and thread c intro primitive
While we are on the topic of semaphores and threads, let us look at a producer consumer problem. A producer consumer can use a semaphore for buffering a count of jobs whereas they can use a mutex if there is only one job to mutually exclude.

Today we discuss some more of the .Net APIs from the framework class library. Specifically we discuss the task parallel library. We first look at data and task parallelism and then we discuss some of the potential pitfalls we need to watch out for.
Data parallelism refers to scenarios in which the same operation is performed concurrently on say elements in a source collection. The collection is partitioned so that multiple threads can work on it. The for loop can be partitioned with a lambda expression.
Parallel.ForEach( sourceCollection, item => Process(item));
There are options available to stop or break loop execution, monitor the state of the loop, maintain thread local state, finalize thread-local objects, control the degree of concurrency and so on.
Task Parallelism is based on the concept of a task which represents an asynchronous operation and resembles a thread or thread pool without the corresponding OS primitives but as an abstraction
This enables more efficient and more scalable use of system resources.
It also enables more programmatic control than is possible in a thread or work item.
Parallel.Invoke() method enables creating and running tasks implicitly. As an example:
Parallel.Invoke(() => DoSomeWork(), () => DoSomeOtherWork());
The number of task instances for the above statement does not have to match the number of parameters to Invoke method.
For greater control over the tasks, we can use Task.Wait and Task.Run method.
The potential pitfalls in Data and Task Parallelism are:
we cannot assume that parallel is always faster. If the parallel loops have few iterations and fast user delegates, then it is unlikely to speed up much.
We should not write to shared memory locations because they increase the race conditions and contention.
We should avoid doing over-parallelization. Each partitioning incurs a cost and there is cost to synchronize the worker threads.
We should avoid calls to non-thread safe methods because it can lead to data corruption which may go undetected.
We should also limit the calls to thread safe methods because there can be significant synchronization involved.
Delegates used with Parallel.Invoke should not have waiting because it affects the execution of others and can result in deadlocks when the tasks are inlined to the currently executing thread.
We should also not assume that iterations in the loop specified always execute in parallel.
We should avoid executing parallel tasks on UI threads.