Tuesday, December 8, 2015

Today we continue discussing the paper by Dekel, Gilad-Bachrach, Shamir and Xiao, which describes optimal distributed online prediction using mini-batches. Their method can use any gradient-based update rule: it treats the serial online prediction algorithm as a black box and converts it into a parallel or distributed online prediction algorithm. Their regret bounds for smooth loss functions are variance-based and are stated in a refined form.
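The black-box idea can be illustrated with a minimal sketch in Python. The names below (average_gradient, sgd_update, dmb_round) and the choice of plain gradient descent as the serial rule are my own illustrative assumptions, not the paper's code: the point is only that the distributed part averages the gradients of one mini-batch and then hands a single averaged gradient to an unmodified serial update.

```python
def average_gradient(gradients):
    # Average the per-example gradients of one mini-batch; in the
    # distributed setting these would be computed in parallel by workers.
    b = len(gradients)
    return [sum(col) / b for col in zip(*gradients)]

def sgd_update(w, g, eta):
    # Any serial gradient-based update rule can go here (black box);
    # plain gradient descent is used only as an illustrative stand-in.
    return [wi - eta * gi for wi, gi in zip(w, g)]

def dmb_round(w, gradients, eta):
    # One round: aggregate the batch, then one serial update on the average.
    return sgd_update(w, average_gradient(gradients), eta)
```

Note that the serial rule never learns it was fed an averaged gradient, which is why any gradient-based update fits the template.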
They provide two examples of update rules that fit this template.
The first is the projected gradient descent update rule, where the next prediction is obtained by subtracting a scaled gradient from the current prediction and applying the Euclidean projection onto the set W. The scaling factor is a decaying learning rate, the inverse of an auxiliary scalar alpha-j that summarizes all of the necessary information about the past; alpha-j is typically chosen to grow on the order of sqrt(j), where j is the current iteration. In this example the auxiliary state is simply the current prediction wj, which the update rule uses directly to produce the next prediction.
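A minimal sketch of this rule in Python, assuming projection onto a Euclidean ball of a given radius and alpha_j = sqrt(j); both choices are illustrative, not taken from the paper:

```python
import math

def project_to_ball(w, radius):
    # Euclidean projection onto {w : ||w|| <= radius}: rescale if outside.
    norm = math.sqrt(sum(x * x for x in w))
    if norm <= radius:
        return w
    return [x * radius / norm for x in w]

def pgd_step(w, g, j, radius=1.0):
    # Decaying learning rate 1/alpha_j with alpha_j = sqrt(j) (illustrative).
    eta = 1.0 / math.sqrt(j)
    moved = [wi - eta * gi for wi, gi in zip(w, g)]
    return project_to_ball(moved, radius)
```

The only state carried between steps is the prediction itself, which is what makes this rule such a simple instance of the template.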
The other example given is the dual averaging method. Its update rule chooses the next prediction to minimize the inner product of the sum of past gradients with the candidate prediction, plus a monotonically increasing sequence of positive numbers times a strongly convex auxiliary function.
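As a sketch, take the auxiliary function h(w) = ||w||^2 / 2 and the increasing sequence beta_j = sqrt(j); both are illustrative assumptions here, as is the unconstrained domain. For this h, minimizing the inner product of the gradient sum with w plus beta_j * h(w) has the closed form w = -grad_sum / beta_j:

```python
import math

def dual_averaging_step(grad_sum, j):
    # Minimize <grad_sum, w> + beta_j * ||w||^2 / 2 over unconstrained w;
    # setting the derivative to zero gives w = -grad_sum / beta_j.
    beta = math.sqrt(j)  # an illustrative monotonically increasing sequence
    return [-g / beta for g in grad_sum]
```

Unlike projected gradient descent, the state carried here is the running sum of gradients rather than the prediction itself.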
#codingexercise

Given two arrays, the first containing a number sequence and the second containing numbers that can be found in the sequence, find the smallest window in the number sequence that contains all of the elements of the second array:
int FindSmallestWindow(List<int> haystack, List<int> needle)
{
// Assumes no duplicates in haystack or needle, and that every
// needle element occurs in haystack.
var indexes = new Dictionary<int, int>();
for (int i = 0; i < haystack.Count; i++)
    indexes.Add(haystack[i], i);
// With no duplicates, each needle element has a fixed position in
// haystack, so the smallest window simply spans the minimum and
// maximum of those positions.
int min = haystack.Count - 1;
int max = 0;
for (int i = 0; i < needle.Count; i++)
{
    int index = indexes[needle[i]];
    if (index < min) min = index;
    if (index > max) max = index;
}
return max - min + 1;
}
int GetWindow(int[] haystack, int[] needle)
{
// Brute force, allowing duplicates in haystack (needle elements still
// assumed distinct): for each start position, scan forward until every
// needle element has been seen, then measure that window.
// Requires System.Linq for Min() and Max().
int minrange = haystack.Length; // default: the whole array
for (int start = 0; start < haystack.Length; start++)
{
    var found = new Dictionary<int, int>();
    for (int i = start; i < haystack.Length; i++)
    {
        if (needle.Contains(haystack[i]))
        {
            found[haystack[i]] = i;
            if (found.Count == needle.Length)
            {
                int min = found.Values.Min();
                int max = found.Values.Max();
                if (max - min + 1 < minrange)
                    minrange = max - min + 1;
                break;
            }
        }
    }
}
return minrange;
}
