Sunday, May 22, 2016

Backup and Recovery – why existing products?
Introduction: Digital data can get erased, lost or corrupted, and virtual machines are no different in that respect. To enable disaster recovery, virtual machines are often backed up. Previously, important data was stored outside the machines on shares, files and repositories. Object storage changed that, because the virtual machines themselves came to be used as stores. Object storage tolerates failures of participating machines by keeping copies of data, but backups are also copies of data. A word document can get saved continuously. Why can’t all the virtual machines in a data center get saved periodically for as long as the machines are active?
Why a new product? It’s true that backup and recovery is becoming increasingly difficult to manage. Products have become smarter by keeping agents in the operating systems of the virtual machines that can detect what to back up and when. On the other hand, snapshots, which are not that intelligent and require merely a point-in-time capture at the storage level, have become increasingly popular among cloud providers. Existing products have improved their offerings with deduplication, manageability and maintenance, and some even come with their own appliances. As these businesses compete, little attention is paid to efficiency in terms of disk I/O and network bandwidth in vendor-heterogeneous deployments in the cloud. Consequently, many cloud providers either do not leverage the products to their full abilities or have to explicitly disable some features to manage the overall operations of the data center.
The Gartner report on backup and recovery mentions fewer than a handful of visionaries who are poised to make a greater impact. Yet their technologies leave a lot to be configured by the administrators of a data center. What if we had a dedicated pool of servers that took snapshots of every virtual machine in a data center periodically, in round-robin fashion? Here we are planning to offer a service, not a product, to add value where none existed earlier for the cloud providers. Moreover, given the large number of virtual machines in a data center, these snapshots can be better organized and automated so that neither the users of the virtual machines nor the administrators need to take any additional action. Most backup programs already have a service or a daemon running in their client-server solutions anyway. These agents, whether they run locally on a client or on a central server, target individual flavors of operating systems. But when we run a service outside the virtual machines, on a pool of servers sized to handle the load of a data center, we are automating at a much larger scale. We can then consistently provide many more features in this managed service, such as aging policies and cleanups.
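To make the idea a little more concrete, here is a minimal sketch of how such a service might rotate virtual machines across a pool of servers and apply an aging policy. Everything in it is an assumption made for illustration: the class name SnapshotScheduler, the methods RunCycle and Expire, and the in-memory catalog are hypothetical, and the point where a real service would call the storage layer to capture a snapshot is only marked by a comment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: names and structure are illustrative, not an existing product's API.
public class SnapshotScheduler
{
    private readonly List<string> workers;   // pool of servers that take the snapshots
    private readonly TimeSpan maxAge;        // aging policy: keep snapshots no older than this
    private readonly List<(string VmId, DateTime TakenAt)> catalog =
        new List<(string VmId, DateTime TakenAt)>();
    private int next;                        // round-robin cursor into the worker pool

    public SnapshotScheduler(IEnumerable<string> workers, TimeSpan maxAge)
    {
        this.workers = workers.ToList();
        if (this.workers.Count == 0)
            throw new ArgumentException("The worker pool must not be empty.");
        this.maxAge = maxAge;
    }

    // Hands each VM to the next worker in rotation and records a catalog entry.
    // A real service would ask the storage layer for a point-in-time capture here.
    public Dictionary<string, string> RunCycle(IEnumerable<string> vmIds, DateTime now)
    {
        var assignment = new Dictionary<string, string>();
        foreach (var vm in vmIds)
        {
            assignment[vm] = workers[next];
            next = (next + 1) % workers.Count;
            catalog.Add((vm, now));
        }
        return assignment;
    }

    // Cleanup: removes catalog entries older than maxAge and returns how many were removed.
    public int Expire(DateTime now)
    {
        return catalog.RemoveAll(s => now - s.TakenAt > maxAge);
    }

    // Number of snapshots currently retained.
    public int Count => catalog.Count;
}
```

With a pool of two workers and three virtual machines, one call to RunCycle sends the first and third machine to the first worker and the second machine to the second worker. Calling Expire with a later timestamp then drops every entry older than the retention window, which is the kind of cleanup neither the user nor the administrator would have to trigger by hand.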

#codingexercise
Find the smallest subarray that needs to be sorted to sort the full array. For example, in [1, 2, 6, 4, 5, 7] only indices 2 through 4 need to be sorted.
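using System;
using System.Collections.Generic;

// Returns the (start, end) indices of the shortest subarray that, once sorted,
// leaves the whole list sorted. Returns (-1, -1) if the list is already sorted.
// Uses two linear scans with a running maximum and a running minimum.
Tuple<int, int> GetSmallerSort(List<int> nums)
{
    int start = -1, end = -1;
    // Left to right: the last element smaller than the running maximum marks the end.
    int max = int.MinValue;
    for (int i = 0; i < nums.Count; i++)
    {
        if (nums[i] < max) end = i;
        else max = nums[i];
    }
    // Right to left: the last element larger than the running minimum marks the start.
    int min = int.MaxValue;
    for (int j = nums.Count - 1; j >= 0; j--)
    {
        if (nums[j] > min) start = j;
        else min = nums[j];
    }
    return Tuple.Create(start, end);
}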
