Monday, April 11, 2016

Given an array with n objects colored red, white or blue, sort them so that objects of the same color are adjacent, with the colors in the order red, white and blue.

Here, we will use the integers 0, 1, and 2 to represent the color red, white, and blue respectively.

We make two passes

First pass put all the reds together by exchanging contiguous positions with red only

Second pass with all whites together

The rest automatically follow.

Void sortColors ( ref list <int> colors )
{
For ( int color = 0; color < 2; color ++;)
For (Int src = 0; src < colors.count (); src++)
{
If ( color > 0 && colors [src] <= color) continue;
Int d = src +1;
If (colors [src] != color){
While ( d < colors.count () && colors [d] != color) d++;
If ( d != colors.count()) swap colors [src], colors [d]
}
}

#coding exercise
Write a program to find the nth ugly number. An ugly number is one that has prime factors of only two, three or five
Solution an ugly number must be formed by multiplying 2,3 or 5 from a smaller number.

List<int> getUgly(int n)
{
var ret = new List<int>() {1};
// indexes for last candidates for corresponding multiples
int I2 = 0;
int I3 = 0;
int I5 = 0;
for (int i = 0; i < ret.Count() && ret.Last() < n;i++)
{

int min = min(GetProduct(ref ret, 2, I2 ),
GetProduct(ref ret, 3, I3),
GetProduct(ref ret, 5, I5));
if (min == GetProduct(ref ret, 2, I2))
I2 = I2 + 1;
if (min == GetProduct(ref ret, 3, I3))
I3 = I3 + 1;
if (min == GetProduct(ref ret, 5, I5))
I5 = I5 + 1;
if (ret.Contains(min) == false) // this check can be commented out
ret.Add(min);
}
return ret;
}
int GetProduct(ref ret, int multiplier, int index)
{
return ret[index] * multiplier;
}
For example:
ret 1,2, 3, 4, 5, 6 ,8,9, 10, 12, 15
I2 6
I3 5
I5 3

Sunday, April 10, 2016

Enterprise scale message queues

Public clouds offer message queue services as a background task processor. Messages may be sent to queues that are set up with a processor to handle the messages. Since the messages are consumed offline, they are ideal for background tasks that do not otherwise fall under the scope of the online services or their Application Programming Interface (API).

This post describes how message queues may be made available in private cloud for end user consumption. If we choose full service message queue software, we will inevitably find a distributed cluster as the topology to deploy. The distributed cluster works even on the same physical machine and is already geared towards performance and scale-ability even on a limited resource. However, the choice of software and hardware and their configurations is important to study so that a suitable arrangement can then be made to handle different sizes of workloads. As we will shortly see, there are many options for rolling out message queue services. Moreover, the message queue services are considered here to be deployed in the cloud. If it were personal to each cloud user, then we could have treated it as an application and rolled it out with each server that is issued to customers.

Let us therefore see how the enterprise wide message queue services needs to be laid out to the customers of a private cloud.

First we choose the number of instances of the message queue cluster and the cluster size. We have already discussed that being as granular as one instance for every server does not offer the power that cloud computing users exist. On the other end of the spectrum having a large cluster for a single dedicated instance to private cloud users has the benefit that it can be made organization specific and at the same time be centrally managed and handled for all users. Moreover, this logical instance can be internally represented by a syndication of clusters across different regions hiding the complexity from the user. After all, the user expects an endpoint and access to the cluster regardless of how it is setup

Second the cluster size can vary depending on the loads because most message queue software allow the addition of nodes to increase the computing power. This means that if we go with a single instance then we can arbitrarily scale out the cluster to meet the traffic. This is also more efficient when it is offered as a managed service

Lastly, we consider the nature of this software and provision accordingly. Customers are likely to use the queues in one of the following ways each of which implies a different layout for the same instance and cluster size.

1. The queues may be mirrored across multiple nodes – This results in a choice of using clusters

2. The queues may be chained where one feeds into the other – This results in a choice of using federation

3. The queues may be arbitrary depending on application needs – This results in a build your own or the shovel work

Conclusion: A private cloud message queue offering has to be carefully planned and could begin with a single managed organization wide distributed instance.
#codingexercise
Given a non negative integer number num. For every numbers i in the range 0 ≤ i ≤ num calculate the number of 1's in their binary representation and return them as an array.
int GetBits(int num)
{
int count = 0;
for (int i = 0; i < num; i++)
{
List<Bit> bits = num.toBits();
count += bitx.Count( x => x & 0x1);
}
return count;
}

Saturday, April 9, 2016

In yesterday's problem solving, we did not really do dynamic programming as much as we did generation of paths.
Since the overlapping subproblems can also be expressed in a relationship between the longest paths of the neighbors to that of the current, we could also solve the problem without redundancy.
for example, we can say that longest_path_at current = max(longest_path_of_neighbors) + 1 and the longest path for each cell of the matrix as the origin can be computed and memoized so that we don't repeat the calculations there again.
Another coding interview question:
Given a list of unique words. Find all pairs of distinct indices (i, j) in the given list, so that the concatenation of the two words, i.e. words[i] + words[j] is a palindrome.
The naive approach would be to iterate the list to enumerate the pairs. This is O(N^2) and then concatenate the pairs and test for palindrome.
something like
List<Tuple<int,int>> GetPalindromeIndices(List<string> words)
{
var ret = new List<Tuple<int,int>>();
for (int i = 0; i < words.Count(); i++)
{
for (int j = 0; j < words.Count(); j++)
{
if (i != j && IsPalindrome(words[i] + words[j]))
{
var index = new Tuple<int,int>() { i, j };
ret.Add(index);
}
}
}
return ret;
}
An improvement would be to sort the words in the list and then if any of the words has a midpoint for reverse within it find the complimentary word or the reverse of the current word in the list. This is a binary search since we know what we are looking for. Note that the word pair lengths need not be the same and hence the midpoint to be contained in one of the words. After we have found all the pairs of words, we then iterate the list once to lookup their corresponding indices.

Detecting themidpoint of a word is only possible when the letters start to appear reverse, therefore we know what the remainder of the word should be. Otherwise if the midpoint is not in the word, the other word is a complete reverse of this one.

Friday, April 8, 2016

Yesterday we were solving a puzzle as below.
You are given an array x of n positive numbers. You start at point (0,0) and moves x[0] metres to the north, then x[1] metres to the west, x[2] metres to the south,x[3] metres to the east and so on. In other words, after each move your direction changes counter-clockwise. Find out if the paths cross.
Today we discuss the space and time complexity of the solution. We use a set of start and end points for both current and previous axis segments. We update it as we linearly traverse the list of displacements until either we cross the path or we reach the end of the given list. Thus the space complexity is O (1) and time complexity is O (n).
We also review another coding problem.
This is the longest increasing path problem.
Given an integer matrix we have to find the length of the longest increasing path. From any cell, the path is formed by traversing left right up or down but not diagonally or outside the boundary. Since the path consists of strictly increasing sequence of selected elements from the matrix, it is a sorted sequence. We are only interested in the number of elements.
IF we start with a cell, we will have one or more choices to pick the next element for a path because only a few of the elements, if any, will be bigger than the current element. for each of these elements we have to repeat the same problem again.
There are three key aspects to recognize
1) the problem has a termination condition. we know we stop when we cannot add another element in the path because every element adjacent to the current element is of lesser or equal value or the current element is at the boundary or both.
2) each element participates as an origin of the path or as an interim element of some other element's path. This is called overlapping sub-problems
3) for all possible choices of next steps from the current element, we have the same problem all over again. This condition is for making progress in our path and we do it with recursion. Recursion means we form a solution that take care of termination and progress to the next step at which time we reenter the solution with the next element as the current.
Therefore the solution looks like the following:

void GetLongestPath(int[,] matrix, int rows, int cols, int i, int j, ref List<int> path)
{
path.Add(matrix[i,j]);
bool progress = false;
if (j-1 > 0 && matrix[i,j-1] > matrix[i,j]) { progress = true; GetLongestPath(matrix, rows, cols, i, j-1, path);}
if (i-1 > 0 && matrix[i-1,j] > matrix[i,j]) { progress = true; GetLongestPath(matrix, rows, cols, i-1, j, path);}

if (j+1 < cols && matrix[i,j+1] > matrix[i,j]) { progress = true; GetLongestPath(matrix, rows, cols, i, j+1, path); }

if (i+1 < rows && matrix[i+1,j] > matrix[i,j]) { progress = true; GetLongestPath(matrix, rows, cols, i+1, j, path); }
if (progress == false) Console.WriteLine("{0} items in {1}", path.Count(), path.toString());

path.Remove(matrix[i,j]);

}

Thursday, April 7, 2016

#puzzle
You are given an array x of n positive numbers. You start at point (0,0) and moves x[0] metres to the north, then x[1] metres to the west, x[2] metres to the south,x[3] metres to the east and so on. In other words, after each move your direction changes counter-clockwise. Find out if the paths cross.
Solution
Keep track of the axis segments from four steps behind
if the current axes segment is perpendicular (directions are repeating patterns) to the one four step behind and has a point from the tracked axes segment on the current axes segment (either the x or y are same, we know whether it is x or y because they repeat alternatively), then they cross.
bool axis_segments_cross( int x1, int y1, int x2, int y2,
int x3, int y3, int x4, int y4)
{
if (x1 == x2)
{
assert(y3 == y4);
if (y1 <= y3 && y3<= y2) return true;
else return false;
}
if (y1 == y2)
{
assert(x3 == x4);
if (x1 <= x3 && x3 <= x2) return true;
else return false;
}
if (x3 == x4)
{
assert(y1 == y2);
if (y3 <= y2 && y2<= y4) return true;
else return false;
}
if (y3 == y4)
{
assert(x1 == x2);
if (x3 <= x1 && x1 <= x4) return true;
else return false;
}
return false;
}

bool update_axis_segments(List<int>x)
{
Point curr_start, curr_end; // current displacement
Point prev_start, prev_end; // 4 displacements back
for (int i = 0; i < x.Length; i++)
{
//update
switch(i%4)
{
case 0: // North curr_end.y+=x[i] also snapshot previous
case 1: // West curr_end.x-=x[i]
case 2: // South curr_end.y -= x[i];
case 3: // East curr_end.x += x[i];
}
// check
if (axis_segments_cross(curr_start.x, curr_start.y, curr_end.x, curr_end.y,
prev_start.x, prev_start.y, prev_end.x, prev_end.y) == true) return true;
}
return false;
}

Wednesday, April 6, 2016

Today we start reading the paper Pelican: a building block for exascale cold data storage by Balakrishnan et al. Cold data is the data that is rarely accessed. Pelican is a rack-scale hard-disk based storage unit designed as the basic building block for Exabyte scale storage for cold data. A rack-scale set of computers is a class of hardware with pooled storage, compute and network such that thousands of cores are interconnected at high speed. A rack is increasingly replacing the individual server as the basic building block of modern data centers. At Microsoft, the rack-scale computing, the Rack-scale computing project aims to redesign hardware, storage, networking and OS stacks in a cross layer co-design manner so as to improve efficiencies. By choosing a building block of this size, Pelican aims to make a building block of this size for cold data workloads. The provisioning is considered just right when the total cost of ownership reduces as compared to traditional disk-based storage. For example, only 8% of the drives can be concurrently spinning in this building block. This introduces complex resource management to be handled by Pelican.
This is met in the following way. Resource restrictions are expressed as constraints over the harddrive. The data layer and IO scheduling ensure that these constraints are not violated. Pelican and conventional storage are compared side by side using a cross validated simulator. Pelican demonstrates high throughput with acceptable latency for cold workloads.
The cost of storing data appears as a sliding scale. The storage in Amazon S3 for example is near the high end. But cold data is also growing phenomenally. Amazon Glacier and Facebook cold data storage are just a few examples where S3 doesn't suffice and cloud scale tiers need to be provisioned. Moreover these systems have to minimize up front capital costs of buying the storage as well as the costs of running the storage. By co-designing hardware and software stacks, Pelican aims to allow the entire rack's resources to be carefully balanced to support cold data workloads.
#coding exercise
Given a sorted positive integer array nums and an integer n, add/patch elements to the array such that any number in range [1, n] inclusive can be formed by the sum of some elements in the array. Return the minimum number of patches required.
This has a dynamic programming approach
If the range of sums that can be generated using the first k numbers in nums is from 0 to r, and the next number has value x, then either:
• x <= r+1 and it is possible to generate any sum from 0 to r+x using the first k+1 numbers, or
• x > r + 1 and it is not possible to generate the sum r + 1, since x and all the following numbers have greater values.
We keep adding x=r+1 and counting such additions until r meets or exceeds n.

int GetMinPatches(List<int> nums, int n)
{
int count = 0;
Int index = 0;
int r = MaxSumImpossible(nums, 0, index);
while ( index < nums.count && r < n)
{
Index++;
int x = MaxSumImpossible(nums, 0, index+1)
If ( nums.contains(x) == false){
count ++;
Nums.add (x, index);
Index++
}
r = r +x;
}
return count;
}

int MaxSumImpossible(Numbers, I, j)
if len(Numbers) == 0:
return 0;
If len(Numbers) == 1:
Return Numbers[0]
m[I,j] = 0
For k = 0 to j-I-1
if (jth coin <= MaxSumImpossible(Numbers Choose K, I,j-1)
q = MaxSumImpossible(Numbers Choose K, I,j-1) + MaxSumImpossible(Numbers with jth Number only, j,j)
If q > m[I,j]
M[I,j] = q
return m[I,j]