Monday, November 9, 2015

Today we start reviewing the paper "Map-Reduce for Machine Learning on Multicore" by Chu et al. In this paper, the authors take advantage of MapReduce methods to scale machine learning algorithms. Just as we had read in Boyle's discussion that summation is an integral part of Statistical Query models, this paper too takes advantage of summation forms that can be computed on multiple CPUs using MapReduce. The authors show how this can be done for a variety of learning algorithms such as locally weighted linear regression, k-means, logistic regression, naive Bayes, principal component analysis, independent component analysis, expectation maximization, and support vector machines. Most of these algorithms involve a summation, and when an algorithm computes a sum over the data, the calculation can be distributed over multiple cores: the data is divided into as many pieces as there are cores, each CPU computes a partial result over its local data, and the partial results are then aggregated. In a cluster framework, a master divides the data among several mappers, and a single reducer aggregates the results, so the approach scales horizontally to arbitrary data sizes.

Some mapper and reducer functions require additional scalar information from the algorithm. To support these operations, the mapper or reducer can obtain this information from the query_info interface, which can be customized for each algorithm. Moreover, if an algorithm requires feedback from previous operations, the map-reduce cycle can be repeated over and over again.
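To make the summation form concrete, here is a minimal sketch (not from the paper) that splits an array into one shard per core, has each "mapper" compute a partial sum over its shard, and lets a single "reducer" aggregate the partials. The shard layout, the MapPartialSum name, and the choice of x*x as the per-record term (as in the X^T X sum of linear regression) are illustrative assumptions.

using System;
using System.Linq;

class SummationFormDemo
{
    // "Map" step: each core computes a partial sum over its local shard.
    // The per-record quantity here is x*x, standing in for one term of X^T X.
    static double MapPartialSum(double[] shard)
    {
        double partial = 0.0;
        foreach (var x in shard)
        {
            partial += x * x;
        }
        return partial;
    }

    static void Main()
    {
        var data = Enumerable.Range(1, 1000).Select(i => (double)i).ToArray();
        int cores = Environment.ProcessorCount;

        // Divide the data into as many shards as there are cores.
        var shards = Enumerable.Range(0, cores)
            .Select(c => data.Where((x, i) => i % cores == c).ToArray())
            .ToArray();

        // Map: compute the partial sums in parallel, one per shard.
        var partials = shards.AsParallel().Select(MapPartialSum).ToArray();

        // Reduce: a single reducer aggregates the partial results.
        double total = partials.Sum();
        Console.WriteLine("Sum of squares: " + total);
    }
}

Because addition is associative and commutative, the final total does not depend on how the data was sharded, which is exactly why the summation form parallelizes so cleanly.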
#codingexercise
Given a dictionary containing a limited set of words, check whether a given string is a composite of two words that are both present in the dictionary.

// Returns true if 'word' can be split into two words that both appear in the dictionary.
bool IsComposite(List<string> dictionary, string word)
{
    // Try every split point that leaves both halves non-empty.
    for (int i = 1; i < word.Length; i++)
    {
        var word1 = word.Substring(0, i);  // prefix
        var word2 = word.Substring(i);     // suffix
        if (dictionary.Contains(word1) && dictionary.Contains(word2))
        {
            return true;
        }
    }
    return false;
}
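For example, with the dictionary { "foot", "ball" }, IsComposite returns true for "football" (the split at i = 4 yields "foot" and "ball") and false for "footballer", since no split produces two dictionary words.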
