Friday, December 4, 2015

The authors of the paper Optimal Distributed Online Prediction Using Mini-Batches present what they call the distributed mini-batch algorithm, a method for converting any serial gradient-based online prediction algorithm into a parallel or distributed one. This method has two important properties:
It can use any gradient-based update rule for serial online prediction as a black box and convert it into a parallel or distributed online prediction algorithm, as sketched after this list.
If the loss function f(w,z) is smooth in w, the predictor, then this method attains an asymptotically optimal regret bound of O(√m), where m is the total number of examples. Moreover, the coefficient of the dominant √m term is the same as in the serial bound, and is independent of the number of nodes k and of the network topology.
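A minimal sketch of one round of that conversion, as I understand it, is below. The names DmbRound, serialUpdate and allReduceAverage are my own placeholders, not the paper's: serialUpdate stands for whichever serial gradient-based rule is plugged in as the black box, and allReduceAverage stands for whatever primitive the network provides for averaging a vector across the k nodes.

using System;
using System.Collections.Generic;

// Sketch of one mini-batch round as seen from a single node (assumed helper names).
static double[] DmbRound(
    double[] w,                                      // current predictor, identical at every node
    List<double[]> localExamples,                    // this node's share (b/k) of the current mini-batch
    Func<double[], double[], double[]> gradient,     // stochastic gradient g(w, z) of the loss f
    Func<double[], double[]> allReduceAverage,       // hypothetical: averages a vector over all k nodes
    Func<double[], double[], double[]> serialUpdate) // the serial update rule, used as a black box
{
    var localSum = new double[w.Length];
    foreach (var z in localExamples)                 // predict with the same w on every example in the batch
    {
        var g = gradient(w, z);
        for (int d = 0; d < w.Length; d++) localSum[d] += g[d];
    }
    for (int d = 0; d < w.Length; d++) localSum[d] /= localExamples.Count;
    var batchGradient = allReduceAverage(localSum);  // network-wide average = the mini-batch gradient
    return serialUpdate(w, batchGradient);           // exactly one update of the serial algorithm
}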
The notion of mini-batches is that we do not process the entire data set before updating: the gradient step is taken after every batch of b examples rather than after a full pass over the data. It is also not the same as the stochastic method, which updates after every single example. Hence it is a cross between the batch and stochastic gradient methods.
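Written out in my own notation (not the paper's), the three update schedules differ only in how many examples feed each gradient step:

\[
\begin{aligned}
\text{batch:} \quad & w \leftarrow w - \eta\,\tfrac{1}{m}\textstyle\sum_{i=1}^{m} \nabla_w f(w, z_i) \\
\text{stochastic:} \quad & w \leftarrow w - \eta\,\nabla_w f(w, z_i) \\
\text{mini-batch:} \quad & w \leftarrow w - \eta\,\tfrac{1}{b}\textstyle\sum_{i \in B} \nabla_w f(w, z_i), \qquad |B| = b
\end{aligned}
\]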
They average a mini-batch of stochastic gradients computed at each predictor. The standard deviation of those stochastic gradients is what appears in the coefficient of the dominant √m term of the regret bound.
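The reason averaging helps is the standard variance-reduction fact: if each stochastic gradient is an independent, unbiased estimate of the true gradient with variance at most σ², then the average of b of them has variance at most σ²/b, so the noise in a mini-batch gradient shrinks as the batch grows:

\[
\mathbb{E}\Big\| \tfrac{1}{b}\textstyle\sum_{i=1}^{b} \hat g_i - \nabla_w f(w) \Big\|^2 \le \frac{\sigma^2}{b}
\quad \text{when } \mathbb{E}[\hat g_i] = \nabla_w f(w) \text{ and } \mathbb{E}\big\|\hat g_i - \nabla_w f(w)\big\|^2 \le \sigma^2 .
\]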
#codingexercise
Given two arrays of single-digit integers, where the elements of the first taken together form a number, maximize that number by replacing its digits with elements of the second array, each used at most once.
void MaximizeWithReplace(List<int> input, List<int> replace)
{
    // Try the largest replacement digits first.
    replace.Sort();
    replace.Reverse();
    for (int i = 0; i < replace.Count; i++)
    {
        bool replaced = false;
        // Replace the leftmost digit that is smaller than the current candidate.
        for (int j = 0; j < input.Count; j++)
        {
            if (input[j] < replace[i])
            {
                input[j] = replace[i];
                replaced = true;
                break;
            }
        }
        if (!replaced) break; // no digit can be improved by any remaining (smaller) candidate
    }
}
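For example, calling it as below (assuming the method is accessible from the call site):

var input = new List<int> { 2, 5, 8 };
var replace = new List<int> { 9, 3 };
MaximizeWithReplace(input, replace);
// input is now { 9, 5, 8 }: the 9 replaces the leftmost smaller digit (2),
// the 3 improves nothing, so 258 becomes 958.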
