Sunday, May 22, 2016

Backup and Recovery – why existing products ? 
Introduction : Digital Data can get erased, lost or corrupted. Virtual Machines are no different in that respect. To enable disaster recovery, virtual machines are often backed up. Previously data was important and stored outside machines on shares, files and repositories. But object storage changed that where the virtual machines was used as stores. While object storage tolerates failures from participating machines, it relies on keeping copies of data but backups are also copies of data.  A word document can get saved continuously. Why can’t all the virtual machines in a data center get saved periodically for the duration that the machines are active? 
Why a new product? It’s true that backup and recovery is becoming increasingly difficult to manage. Products have become smarter by keeping agents in the operating systems of the virtual machines that can detect what to backup and when. On the other hand, snapshots which are not that intelligent and require merely a point of time capture at the storage level has become increasingly popular among cloud providers. Existing products have improved offerings with deduplication, manageability, maintenance and even come with their own appliances. As these businesses compete, there is seldom any attention paid to efficiencies in terms of disk I/Os and network bandwidth in a vendor –heterogenous-deployments in the cloud. Consequently, many cloud providers either do not leverage the products to their full abilities or have to explicitly disable some features to manage the overall operations of the data center.  
Gartner report on backup and recovery actually mentions less than a handful of visionaries who are even poised to make greater impact. Yet their technologies are leaving a lot to be configured by the administrators in a datacenter. What if we had a dedicated pool offering to take a round robin of snapshots periodically for every virtual machine in a datacenter? At this point, we are planning to offer a service and not a product to add value where none existed earlier for the cloud providers. Moreover given the large number of virtual machines in a data center, these snapshots can be better organized and automated so that neither the user of the virtual machines nor the administrators need to take any additional actions.  Most backup programs already have a service or a daemon running in their client-server solutions anyways. These agents that run locally on a client or in a central server target individual flavors of the operating systems.  But when we run a service outside the virtual machines and on a pool of servers such that they can handle the load of a datacenter, we are talking about automating at a scale larger than ever before. Now we can consistently provide many more features in this managed service such as aging policies and cleanups. 

#codingexercise
Find the smallest subarray that needs to be sorted to sort the full array
Tuple<int,int> GetSmallerSort(List<int> nums)
{
int min = nums.min();
int start = nums.indexOf(min);
int max = nums.max();
int end = nums.indexOf(max);
for ( int i = start-1; i>= 0; i--)
      if (nums[i] > nums[start]){
          start = i;
          break;
        }
for ( int j = end+1; j < nums.Length; j++)
      if (nums[j] < nums[end]){
          end = j;
          break;
        }
return new Tuple<int, int>() { start, end };
}

Saturday, May 21, 2016

Today we discover backup and recovery products for datacenters. Cloud providers like VMWare have their own backup products such as Veeam and their vCenter platform also provides ability to take snapshots. These compete directly with third party backup products such as datadomain which is known for deduplication and commvault which is known to be a leader in VM backups. Performance and operations are affected when the backup techniques are competing against each other. Some other site replication techniques include:
Hot Backup - This means maintaining an additional site which operates in parallel to the main installation and immediately takes over in the case of the failure of the main center.
Warm Backup - This site is inactive but ready to become operational within a matter of hours.
Split Site - Two installations each somewhat smaller than a hot backup are used. In case of emergency one center can keep the organization running by performing only jobs with high priority.
Cold Backup - This is used when the recovery time is mainly affected by the need to install the infrastructure and support systems ( electricity, communications, air conditioning, etc) whereas the cost is mainly affected by the cost of the hardware.
Mutual Backup - This is a mutual agreement between two computer centers generally operated by different firms. They decide to assist each other in the case of emergency on the basis of best efforts. While this may look like it is the least expensive, but it is not efficient and particularly when both firms operate with similar peak loads and where each has a limited excess capacity.
Pooling - This is an arrangement where few members join into mutual agreement concerning a computer center, which is standing by idle to offer a service to any member suffering from interruption in its computer center. The idle computer center may use cold or warm backup. The center serves a pool of user who share the owner ship or pay a certain membership fee.
#codingquestion
You are given a Text, where all space, full stop and all punctuation mark is removed. You want to reconstruct the text by putting spaces between words.
A dict is given and following API < bool isInDect(word) > is also given.
a) Decide if the text can be converted a sentence with valid words or NOT.
b) Find how many way you can do the reconstruction of the text.
c) Find what is the minimum number of space can be used for this reconstruction.
d) For case (c) find out the indexes where you suppose to put a space.
e) Now recover the text to sentence in place .
Essentially this is the same as wordBreak problem but the first match alone may not be sufficient depending on the given dictionary.
bool wordBreak(string str, List<string> dictionary)
{
if (str.length == 0) return true;
for (int i =1; i < str.length; i++)
{
if (dictionary.contains(str.substring(0,i) && wordBreak(str.substring(i, len-i), dictionary) return true;
}
return false;
}


Question: Can we setup dedicated pooling for periodic automatic backup services of active data center virtual machines?

Friday, May 20, 2016

This is an analysis of integrating an Order Management System with a third party older system that does not meet current requirements such as omnichannel customer experience. We decide to put the legacy system downstream from the user interface and behind the new order management system. We explore options to leverage existing without writing code such as with data migrations to the legacy system so that the bulk of the processing continues to be done by the legacy. In this exercise, we get to know the shortcomings of the older system that we can mitigate with the newer system. The older system may impose restrictions in how the data is handled and may have quirks that will need to be scoped before the integration. Some edge cases such as a single order with large number of items and a large collection of smaller orders might serve to study the old system’s behavior. These can be alleviated in the new system but the integration can happen across any or all of the three tiers of database, services and user interfaces. It would be prudent to integrate them at the service layer so that the customer experience does not suffer from any of the legacy interactions. Even if the cost is slightly more for UI development, it will help to give a newer more studied experience to the user than anything from the legacy system because we will find it hard to change the experience given that the legacy is third party. In this regard, a service based integration of different domains seems prudent. We will discuss this shortly but let us take a moment to see data integrations. The legacy third party system should be checked for bulk upload and processing of data. If such an option, were available, we can perform extract, transform and load operations that will remove the burden from the online expectations of the customer and the delays and latencies imposed by the third party legacy system. Offline or batch processing options with the legacy system may also help with reducing cost and streamlining the user online interactions. Wherever possible, even the new system could handle asynchronous processing to provide better online Service Level Agreement to the customers of the Order management system. Since the service provides a convenient business layer for the integration, all interactions between the the third party and the Order Management System may be captured at this layer. A microservice model that targets different aspects of the third party services such as inventory, fulfilment and payment separately, will help with the integration testing at the interfaces. This lets the services be hosted separately so that the services can be cloud-ready. The service layer may use these microservices to compete the user requests. This is also helpful to the user interface which can be written new and made available on the browser or devices because they now interact over the http with a single service endpoint. A sample order for an item from the user may involve the following steps: the item will be queried from the inventory to see if it is in stock. The order may then be fulfilled by locating the item by its identifier and changed to sold. When all parts of the order are completed, the order may be marked as completed and closed. Given that we have to integrate with a third party system that can inject failures at anytime during the above workflow, we discussed two options to process the orders. Each order may be put in a transactional scope so that it fails entirely when any of the failures occur. The second option was to allow incremental progress with retries by maintain state during the workflow. Both options do not necessarily change the customer expectations because the failures can be managed internally as much as possible. That said, the customer experience might improve with the system giving more status information on the order placed. When the customer wants to check back on the status, the downstream systems activity updates would have found its way to the customer. As long as the states gets updated in the forward direction from the initialized, pending, fulfilled, acknowledged and closed/canceled, there will be minimal errors introduced. Moreover offline processing for those states may also be enabled. The application may require offline processing for such things as email notifications to the customer. Even though the user interface may be user driven targeted for 100% satisfaction, the offline notifications such as emails will bolster the customer experience for easy reminders and follow-ups. Emails also help with continuous acknowledgement throughout the fulfillment process. Customers orders are also useful history to have so if the legacy system is not handling these operations then the data may be captured in transit and augmented with statistics. Components such as payment services may be streamlined and externalized so that the role of the legacy system is reduced and it may gradually be phased out or replaced. Payment services also need to adhere to compliance policies because the cloud is generally not used for sensitive data until safeguards and compliance checks are complete. Vendor Management services may also be enhanced over time separate from the existing ones supported by the legacy system. The design would be more flexible when the dependency on legacy system for vendors is reduced and the corresponding mapping is maintained in the order management system. Inventory and catalog service may require a database different from the orders in that the catalog may be very large and the operations on it may be very different from the order management database. The interactions with the other systems may be better managed in the service layer without directly querying the data sources. Services may also be across gateway or enterprise service bus especially if they are legacy. Address, binding and contract for these services may need to be defined for the integration. A microservice model instead of service oriented architecture for all components will greatly help with deploying the services in the cloud. The ability to load balance and rotate the hosts for the applications will help with resilience on server failures and it will be seamless to the users. Each service logs may flow via syslog to an index which can tremendously help with troubleshooting and diagnostics. Similarly, authentication, authorization and auditing need to be enabled for all services. This is different when the order management system is external rather than internal because there may be a different authentication and authorization service. Also, the external facing applications will have more stringent compliance than internal systems. Cloud based services in general lend themselves to be easily moved from internal to external.
#coding exercise
Find the diameter of a tree
In the diameter (node root )
{
If (root == null) return null;
Int lh = height(root. Left);
Int rh = height(root.right);
Int ld = diameter (root.left);
Int rd = diameter (root.right);
Return max (lh + rh +1, ld, rd);
}

Int multiply_with_minimum_additions (int a1, int a2)
{
Int res =0;
Int count = 0;
While (a2)
{
Int add = 0;
If (a2 & 0x1){
Add = a1;
}
For (int I =0; I < count; i++)
{
Add <<= 1;
}
Count++;
Res+= add;
A2 >>= 1;
}
Return res;
}

Thursday, May 19, 2016

#codingexercise
Find minimum number of coins to make a sum
int GetMin(List<int> coins, int sum) // sorted coins
{
if (sum == 0) return 0;
if (coins.sum() == sum) return coins.count();
if (coins.sum()  < sum ) return INT_MAX;
return min(GetMin(coins.Range(1, coins.count-1), sum),  GetMin(coins.Range(1, coins.count-1), sum-coins[coins.count-1])+1);
}
Find maximum number of ways to make a sum from coins with infinite supply
int GetMaxWays(List<int> denominations, int sum)
{
if (sum == 0) return 0;
if (denominations.count() == 0) return 0;
return GetMaxWays(denominations.Range(1,denominations.count-1), sum) +
          GetMaxWays(denominations.Range(1,denominations.count-1), sum-denominations[denominations.Count-1]);
}
void TREE-DELETE(Node root, Node z)
{
if (z.left == null)
      TRANSPLANT(T,z,z.right);
else if (z.right == null)
      TRANSPLANT(T,z,z.left)
else{
      y = TREE-MINIMUM(z,right);
      if (y.parent != z)
           TRANSPLANT(T, y, y.right)
           y.right = z.right;
           y.right.parent = y;
      TRANSPLANT(T,z,y)
      y.left = z.left;
      y.left.p = y;
}
}

Wednesday, May 18, 2016

Given arrival and departure times, find the maximum overlap 
Int maxOverlap(List<Tuple<int,int>> timings, int n) 
{ 
Timings.sortByArrivals(); 
Var overlap = List<Tuple<int,int>> (); 
Int max = 0; 
Int current = 0; 
for (int I = 0; I < n; i++) 
{ 
   Current = overlaps[i].first; 
   Overlap.add(timings[i]); 
   For (int I =0; I < overlap.Count; i++) 
   { 
         if (overlaps[i].second < current) 
            overlaps.RemoveAt[i];  
   } 
   If (overlaps.count> max) 
        Max = overlaps.count; 
} 
Return max; 
} 
#Find max area under the histogram 
Int MaxAreaOfARectangle(List<int>histogram) 
{ 
Int max = 0; 
For(int I = 0; I < histogram.count; i++){ 
Int area = GetArea(histogram,i); 
If (area > max) 
    max = area; 
} 
Return max; 
} 
Int GetArea(List<int>histogram, int center) 
{ 
Int area = histogram[center]*1; 
For (int I = center-1; 
         i>=0 && histogram[i] > histogram[center]; i++){ 
         area += histogram[center]*I; 
} 
For (int I = center+1; 
         I<histogram[count] && histogram[i] > histogram[center]; i++){ 
         area += histogram[center]*I; 
} 
return area; 
} 


#codingexercise
Given a positive integer N, count all possible distinct binary strings of length N such that there are no consecutive 1’s
void GetString(int N, ref stringbuilder candidate)
{
if (N==0) { print candidate; return;}
if (candidate.Count>0 && candidate[candidate.Count-1] == "1") {
candidate.add("0");
GetString(N-1, ref candidate);
candidate.RemoveLast();
return;
}
candidate.Add("0")
GetString(N-1, ref candidate);
candidate.RemoveLast();
candidate.Add("1")
GetString(N-1, ref candidate);
candidate.RemoveLast();
}

int GetNonRepeatedElement(List<int> nums)
{
assert(nums.Count > 0);
int x  = nums[0];
for (int i = 1; i  < nums.Count; i++)
      x^= nums[i];
return x;
}

for finding two elements
List<int> GetTwoElements(List<int> nums)
{
assert(nums.Count > 0);
int x = nums[0];
for (int i =1; i < nums.Count; i++)
{
x^= nums[i];
}
int div = x &^(x-1);
int a;
int b;
for (int i = 0; i < nums.Count; i++)
{
if (nums[i] ^ div)
    a ^= nums[i];
else
    b ^= nums[i];
}
var items = new List<int>;
items.add(a);
items.add(b);
return items;
}

Tuesday, May 17, 2016

#codingexercise
find the least common ancestor of two nodes
Node GetLCA (Node root, int n1, int n2)
{
if (root == null) return null;
if (root.data == n1|| root.data == n2) return root;
var left = GetLCA(root.left, n1, n2);
var right = GetLCA(root.right, n1, n2);
if (left && right ) return root;
if (left != null) return left;
return right;
}

Find middle of linked list
void GetMid(Node root)
{
var slow = root;
var fast = root;
while( fast && fast.next && fast.next.next)
{
slow = slow.next;
fast = fast.next.next;
}
return slow;
}

Print Left View of a tree
void LeftView(Node root)
{
int max = 0;
LeftViewHelper(root, 1, ref max);
}

void LeftViewHelper(Node root, int level, ref int max)
{
if (root == null) return;
if (level > max){
print root.data;
max = level;
}
LeftViewHelper(root.left, level+1, max);
LeftViewHelper(root.right, level+1, max);
}


Subjective  question : Are solaris zones good or bad ? Give examples
Solaris zones or containers are easy and lightweight. They are not like running a full OS on VMWare. It is just enough for OS to be separate but lightweight but we can run dozens even hundreds.
Applications that had virtually unrestricted physical resource in the native solaris OS do not nearly have the same on a  solaris zone so they are not able to transition without breaks in many cases.

#sorted array to a balanced binary search tree
Node GetTree(List<int> A, int start, int end)
{
if (start > end) return NULL;
int n = end-start+1;
int median = (start+end)/2;
var root = new Node();
root.data = A[median];
root.left =  GetTree(A, start, median-1);
root.right = GetTree(A, median+1, end);
return root;
}

Monday, May 16, 2016

#codingexercise

Given n non-negative integers in array representing an elevation map where the width of each bar is 1, compute how much water it is able to trap after raining.

int GetArea(List<int> height)
{
int result = 0;
int n = height.Count
if (height == null || n <=2)
    return height;
var left = new List<int>();
var right = new List<int>();
// scan left
int max = height[0];
left[0] = height[0];
for (int i = 1; i < n; i++)
{
if (height[i] < max){
left[i] = max;
}else{
left[i] = height[i];
max = height[i];
}
}
// scan right
max = height[n - 1];
right[n-1] = height[n-1];
for (int i = n-2; i>=0; i--)
{
if (height[i] < max){
right[i] = max;
}else{
right[i] = height[i];
max = height[i];
}
}
for (int i =0; i < n ; i++)
{
result += Math.min(left[i],right[i]) - height[i];
}
return result;
}

int match(char* regexp, char* text)
{
if (regexp[0] == '^')
    return matchHere(regexp+1, text);
do {
if (matchHere(regexp, text)
    return 1;
} while( *text++ != '/0');
return 0;
}

int matchHere(char* regexp, char* text)
{
if (regexp[0] == '/0')
   return 1;
if (regexp[0]  == '*')
   return matchStar(regexp[0], regexp+2, text);
if (regexp[0] == '$' || regexp[1] == '/0')
  return *text == '/0';
if ('*text != '/0' && (regexp[0] == '.' && *regexp == *text))
   return matchHere(regexp+1, text+1);
return 0;
}
int matchStar(int c, char* regexp, char* text)
{
do
{
if (matchHere(regexp, text))
    return 1;
}while(*text != '/0' && (c == '.' || *text++ == c);
return 0;
}
Courtesy: kerninghan & poker