Thursday, November 24, 2016


We continue discussing the paper "Nested sampling for general Bayesian computation" by Skilling, with an emphasis on understanding the transformation that turns the evidence computation over the parameters into one over the unit prior mass, which is what makes this technique work. Nested sampling estimates directly how the likelihood function relates to prior mass. It can be used to compare two models of the same data set by comparing the conditional probability assigned to each after the relevant evidence is taken into account.
This method computes the evidence directly. It simplifies the evidence calculation by not summing over the parameters themselves but over the cumulated prior mass X that covers likelihood values greater than some threshold lambda. As lambda increases, the enclosed mass X decreases from one to zero. Writing the inverse function as L(X), where L is the likelihood function and L(X(lambda)) = lambda, the evidence transforms from an integral of the likelihood L(theta) over elements of prior mass dX = pi(theta) d-theta, where theta is the parameter, to an integral of L over the prior mass directly, as L(X) dX. The summation therefore simplifies to a one-dimensional integral over the unit range.
We are able to redefine the evidence from a variation over the parameters theta to a variation over the prior mass X by dividing the unit prior mass into tiny elements and sorting them by likelihood.
Let's picture it this way. The evidence is the area under the curve of a plot of likelihood against prior mass, and the likelihood decreases as the prior mass increases from 0 to 1. The evidence is therefore found by integrating over this curve. We obtain this simpler transformation by sorting the prior mass elements by their likelihood values and arranging them as tiny strips of width dX, which works because X is cumulative.
This is easier to see with an example where the parameter is two dimensional and the likelihood values fill a matrix. If we take a four by four grid, there are sixteen likelihood values, one per cell, and each cell has equal prior mass 1/16. They are not sorted, because they correspond to positions of the parameter. But if we sort them into a descending sequence, then the likelihood corresponding to X = 1/5 lies one fifth of the way along this sorted sequence, at about the fourth sorted cell out of sixteen, and can be read off directly at that X. These sorted descending values trace the curve L(X) which we integrate to find the evidence.
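This sorting step can be sketched as follows; the grid values below are made up purely for illustration:

```python
# A hypothetical 4x4 grid of likelihood values over a two-dimensional
# parameter. Each cell carries equal prior mass 1/16.
likelihoods = [
    [0.10, 0.50, 0.90, 0.30],
    [0.70, 1.20, 0.80, 0.20],
    [0.40, 1.50, 1.10, 0.60],
    [0.05, 0.35, 0.45, 0.25],
]

# Flatten and sort in descending order of likelihood; the cells no longer
# correspond to parameter positions, only to enclosed prior mass X.
sorted_L = sorted((v for row in likelihoods for v in row), reverse=True)

dX = 1.0 / 16  # equal prior mass per cell

# Evidence as a one-dimensional sum of L(X) dX over the unit range.
evidence = sum(L * dX for L in sorted_L)

# The likelihood at X = 1/5 is read off one fifth of the way along the
# sorted sequence, i.e. around the fourth cell of sixteen.
L_at_one_fifth = sorted_L[int(16 / 5)]
```

The evidence comes out the same as summing the grid directly; only the ordering changed, which is what lets us treat it as a one-dimensional integral.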


#codingexercise
Consider a row of n coins of values v1 . . . vn, where n is even. We play a game against an opponent by alternating turns. In each turn, a player selects either the first or last coin from the row, removes it from the row permanently, and receives the value of the coin. Determine the maximum possible amount of money we can definitely win if we move first.
For example, in the sequence 8, 15, 3, 7:
if we choose 7, the opponent chooses 8, and we then take 15 to win with a total value of 22.
There are two ways to choose in a series i to j:
1) Choose the ith coin with value Vi; the opponent then leaves us the minimum of the remaining subgames:
Vi + min(F(i+2, j), F(i+1, j-1))
2) Choose the jth coin with value Vj; again the opponent leaves us the minimum:
Vj + min(F(i+1, j-1), F(i, j-2))

Therefore the solution looks like this:
int F(List<int> V, int i, int j)
{
    if (i > j || i < 0 || j >= V.Count) return 0;
    if (i == j) return V[i];
    if (i + 1 == j) return Math.Max(V[i], V[j]);
    return Math.Max(V[i] + Math.Min(F(V, i + 2, j), F(V, i + 1, j - 1)),
                    V[j] + Math.Min(F(V, i + 1, j - 1), F(V, i, j - 2)));
}
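The recurrence can be cross-checked with a short memoized brute force; this Python sketch is only a translation of the same F, for verification:

```python
from functools import lru_cache

def max_win(coins):
    """Maximum amount the first player can guarantee."""
    @lru_cache(maxsize=None)
    def F(i, j):
        if i > j:
            return 0
        if i == j:
            return coins[i]
        # Take coins[i]; the opponent then leaves us the worse of the
        # two remaining subgames. Likewise for taking coins[j].
        take_left = coins[i] + min(F(i + 2, j), F(i + 1, j - 1))
        take_right = coins[j] + min(F(i + 1, j - 1), F(i, j - 2))
        return max(take_left, take_right)
    return F(0, len(coins) - 1)
```

For the example above, max_win([8, 15, 3, 7]) gives 22, matching the hand trace.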

Wednesday, November 23, 2016

We started discussing the paper "Nested sampling for general Bayesian computation" by Skilling. Nested sampling estimates directly how the likelihood function relates to prior mass. It can be used to compare two models of the same data set by comparing the conditional probability assigned after the relevant evidence is taken into account.
This method directly computes the evidence. The calculation of the evidence allows different model assumptions to be compared through the ratios of evidence values, known as Bayes factors.
This is an improvement over the Markov Chain Monte Carlo methods discussed in earlier posts, because those yielded a set of samples representing the normalized posterior while the evidence was of secondary importance. Here the evidence is the direct result, and it is calculated with far greater ease than with MCMC methods. The parameters are sorted by their likelihood values and then summed to give the evidence. Since there are many data points, the method performs nested sampling to simulate this operation statistically. The evidence is then reported with a corresponding numerical uncertainty. Skilling states that this nested sampling is Bayesian in nature. Furthermore, with the nested samples we can estimate the density of states, obtain samples from the posterior, and quantify arbitrary properties of the parameter.
This technique simplifies the evidence calculation by not summing over the parameters directly but instead performing it on the cumulated prior mass  that cover likelihood values greater than say lambda. As Lambda increases, the enclosed mass decreases from one to zero.  Therefore the summation simplifies to a one dimensional integral over unit range.
This is the primary intuition behind this paper. We will also consider how to make it online instead of batch.

This method is implemented in Python here and is not part of the standard numpy package yet.
http://www.inference.phy.cam.ac.uk/bayesys/python/mininest.py

#codingexercise
Find the number of BSTs possible for numbers ranging from 1 to n.
For example, n = 2, # = 2
n= 3, #=5
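These counts are the Catalan numbers. A quick dynamic-programming cross-check, as a Python sketch separate from the tree construction below:

```python
def count_bsts(n):
    """Number of distinct BSTs on keys 1..n (the nth Catalan number)."""
    counts = [0] * (n + 1)
    counts[0] = 1  # the empty tree
    for total in range(1, n + 1):
        # choose each key in turn as the root; the left and right
        # subtrees are independent, so their counts multiply
        for root in range(1, total + 1):
            counts[total] += counts[root - 1] * counts[total - root]
    return counts[n]
```

This reproduces the examples: count_bsts(2) is 2 and count_bsts(3) is 5.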
        static List<Node> makeBSTs(int start, int end)
        {
            var items = new List<Node>();
            if (start > end)
            {
                // a null entry stands for the empty subtree, so the
                // cross product below always has something to pair with
                items.Add(null);
                return items;
            }
            for (int i = start; i <= end; i++)
            {
                var left = makeBSTs(start, i - 1);
                var right = makeBSTs(i + 1, end);
                // every combination of a left subtree and a right subtree
                // yields a distinct BST rooted at i
                foreach (var leftnode in left)
                {
                    foreach (var rightnode in right)
                    {
                        var node = new Node();
                        node.data = i;
                        node.left = leftnode;
                        node.right = rightnode;
                        items.Add(node);
                    }
                }
            }
            return items;
        }

Note that the recursive calls always return at least one element (null for an empty range), so no special cases for empty lists are needed.

We can verify the BSTs formed with
foreach (var root in items)
{
    printBST(root);
    Console.WriteLine();
}
which should print in ascending order
where the printBST is inorder traversal as follows:
void printBST(Node root)
{
    if (root != null)
    {
        printBST(root.left);
        Console.Write("{0} ", root.data);
        printBST(root.right);
    }
}
which in this case verifies the above code.

Tuesday, November 22, 2016

We started discussing the paper "Nested sampling for general Bayesian computation" by Skilling. Nested sampling estimates directly how the likelihood function relates to prior mass. It can be used to compare two models of the same data set by comparing the conditional probability assigned after the relevant evidence is taken into account.
This method computes the marginal likelihood directly by integration. Moreover, we can optionally get samples of the unobserved data as conditionals on the observed. The sampling proceeds based on the nested contours of the likelihood function and not on their values. This allows the method to overcome the limitations that creep into annealing methods.
We looked at the direct summation performed by this method by writing Bayes' theorem in the product form
Likelihood x Prior = Evidence x Posterior. The evidence can then be written as a summation over the prior mass elements.
The calculation of the evidence allows different model assumptions to be compared through ratios of evidence values, known as Bayes factors. This works well even for future models, because the evidence does not have to be recalculated for the current one. In fact the evidence and the posterior are valuable information for anybody who wants to compare models: the evidence indicates the strength of the model, and the posterior shows how the observed data modulates the prior beliefs. Together they convey not only how the likelihood is constructed but also how it is used.
This is an improvement over the Markov Chain Monte Carlo methods discussed in earlier posts, because those yielded a set of samples representing the normalized posterior. Here we get the evidence as well, which was not always easy before because it was found via intermediary distributions that bridge prior and posterior. Annealing methods took direct likelihood values as intermediary steps; the evidence was expensive to compute, arrived as a by-product, and had secondary importance relative to the posterior. This approach reverses that emphasis and calculates the evidence directly. Here too the parameters are sorted by their likelihood values and summed over the function to give the evidence. Since there are usually far too many points to do this exactly, the sampling is nested so that it proceeds in steps.
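The stepwise operation can be sketched as follows, assuming the standard deterministic shrinkage X_k = exp(-k/N) and a toy likelihood L(X) = 1 - X, chosen only because its true evidence is exactly 1/2:

```python
import math

def nested_evidence(L, N=100, steps=2000):
    """Estimate Z = integral of L(X) dX over [0, 1] by shrinking the
    enclosed prior mass geometrically, X_k = exp(-k/N)."""
    Z = 0.0
    X_prev = 1.0
    for k in range(1, steps + 1):
        X = math.exp(-k / N)
        Z += L(X) * (X_prev - X)  # strip of width dX at likelihood L(X)
        X_prev = X
    return Z

# toy likelihood with known evidence 1/2
Z = nested_evidence(lambda X: 1.0 - X)
```

The estimate lands close to the true value 0.5; the real algorithm replaces the fixed likelihood curve with the lowest-likelihood point among N live samples at each step.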
#codingexercise
Find the lexicographic minimum in a circular array, e.g. for the array BCABDADAB, the lexicographic minimum is ABBCABDAD.
static string GetMinRotation(string input)
{
    string combined = input + input;
    string min = input;
    for (int i = 0; i < input.Length; i++)
    {
        // each rotation is a window of length input.Length in the doubled string
        string candidate = combined.Substring(i, input.Length);
        if (string.Compare(candidate, min, StringComparison.Ordinal) < 0)
            min = candidate;
    }
    return min;
}
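The same doubled-string idea can be cross-checked with a brute-force Python sketch:

```python
def min_rotation(s):
    """Lexicographically smallest rotation of s."""
    doubled = s + s
    # every rotation of s is a length-len(s) window of the doubled string
    return min(doubled[i:i + len(s)] for i in range(len(s)))
```

For the example above, min_rotation("BCABDADAB") returns "ABBCABDAD".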

Get the count of rotations to reach a desired output:
static int GetCount(string input, string output)
{
    string combined = input + input;
    for (int i = 0; i < input.Length; i++)
    {
        string candidate = combined.Substring(i, input.Length);
        if (candidate == output)
            return i; // number of rotations applied
    }
    return -1;
}

Monday, November 21, 2016

Today we start discussing the paper "Nested sampling for general Bayesian computation" by Skilling. Nested sampling estimates directly how the likelihood function relates to prior mass. It can be used to compare two models of the same data set by comparing the conditional probability assigned after the relevant evidence is taken into account.
This has several advantages. First, the method computes the marginal likelihood directly by integration. Moreover, samples of the unobserved data, conditioned on the observed data, can optionally be obtained as well. The method relies on sampling within a hard constraint on the likelihood value, as opposed to the softened likelihood of annealing methods. The sampling proceeds based on the shape of the nested contours of likelihood, not on the likelihood values themselves. This allows the method to overcome the limitations that creep into annealing methods.
From Bayes' theorem, we often write the model in the product form
Likelihood x Prior  = Evidence x Posterior
which are expressed using parameters of the model.
The likelihood is the probability of the acquired data given the parameters and the model assumptions.
The prior represents the uncertainty over the unknown parameters given the model assumptions, and is assessed before any data has been sampled.
The posterior represents the uncertainty over the unknown parameters after the data has been sampled. The posterior therefore involves the sampled data D, which was not considered in the prior. The prior and the posterior let us start with a few beliefs about the world, interact with it, and then update the beliefs. The computation of the posterior with sampled data is in fact an update of our beliefs about the world. The posterior is a conditional distribution on the sampled data which lets us modulate the prior. The prior and the posterior are usually normalized to unit total. With the likelihood function and these beliefs, we can now estimate the marginal likelihoods from the observed data and their probabilities.
When the equation is written in the product form, it lets us find the evidence as  a summation over the prior mass elements.
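In symbols, restating the text above:

```latex
% Bayes' theorem in product form:
%   Likelihood x Prior = Evidence x Posterior
L(\theta)\,\pi(\theta) = Z\,p(\theta \mid D)

% Evidence as a sum over prior mass elements dX = \pi(\theta)\,d\theta,
% after sorting the elements by likelihood:
Z = \int L(\theta)\,\pi(\theta)\,d\theta = \int_0^1 L(X)\,dX
```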
#codingexercise
Find the maximum area of a rectangle under a histogram of unit-width bars
public static int MaxArea(List<int> h)
{
    if (h == null || h.Count == 0) return 0;
    int max = 0;
    for (int i = 0; i < h.Count; i++)
    {
        // widest rectangle whose height is h[i]: extend both ways
        int area = h[i];
        for (int k = i + 1; k < h.Count && h[k] >= h[i]; k++)
            area += h[i];
        for (int k = i - 1; k >= 0 && h[k] >= h[i]; k--)
            area += h[i];
        max = Math.Max(max, area);
    }
    return max;
}
#Find max area under the histogram
static int MaxAreaOfARectangle(List<int> histogram)
{
    int max = 0;
    for (int i = 0; i < histogram.Count; i++)
    {
        int area = GetArea(histogram, i);
        if (area > max)
            max = area;
    }
    return max;
}

static int GetArea(List<int> histogram, int center)
{
    // widest rectangle with height histogram[center]
    int area = histogram[center];
    for (int i = center - 1; i >= 0 && histogram[i] >= histogram[center]; i--)
        area += histogram[center];
    for (int i = center + 1; i < histogram.Count && histogram[i] >= histogram[center]; i++)
        area += histogram[center];
    return area;
}

Alternatively, divide and conquer by splitting at the lowest bar:
static int MaxArea(int[] h, int start, int end)
{
    if (start > end) return 0;
    // the rectangle spanning the whole range has the lowest bar as its height
    int minIndex = start;
    for (int i = start + 1; i <= end; i++)
        if (h[i] < h[minIndex])
            minIndex = i;
    int spanning = h[minIndex] * (end - start + 1);
    // the best rectangle either spans the range or avoids the lowest bar
    int left = MaxArea(h, start, minIndex - 1);
    int right = MaxArea(h, minIndex + 1, end);
    return Math.Max(spanning, Math.Max(left, right));
}
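For comparison, the standard linear-time method keeps a stack of bar indices with non-decreasing heights; this Python sketch is a common alternative, not taken from the post:

```python
def max_rectangle(histogram):
    """Largest rectangle under a histogram of unit-width bars, O(n)."""
    stack = []  # indices of bars with non-decreasing heights
    best = 0
    for i, h in enumerate(histogram + [0]):  # sentinel flushes the stack
        while stack and histogram[stack[-1]] > h:
            height = histogram[stack.pop()]
            # width spans from the previous lower bar to i, exclusive
            width = i - stack[-1] - 1 if stack else i
            best = max(best, height * width)
        stack.append(i)
    return best
```

Each bar is pushed and popped once, so the whole scan is linear; for [2, 1, 5, 6, 2, 3] the answer is 10 (heights 5 and 6 over width 2).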

Sunday, November 20, 2016

Today we continue discussing websockets. We said that websockets facilitate duplex communication and are independent of HTTP, and we showed how the client and the server can each be both producer and consumer.
The frontend updates the page with the statistics on each event that an event listener on the page receives. The frontend is different from the clients that gather the statistics per machine, usually with a collection tool. Each client is responsible for its own host.
The client is in an endless loop, gathering data, publishing data and waiting for some time. The publishing is done with a post request to the server at the endpoint corresponding to the model used by the server for the client.
The server is written in Django. It uses signals for creating and destroying model objects pertaining to the client statistics. Whenever a stat needs to be saved, the server uses a signal to notify the endpoint set up by Crossbar.io, which then dispatches a pubsub event via a POST request to the endpoint at a path such as '/notify'.
The HTTP requests and the publisher-subscriber events on the websockets are not mutually exclusive; they are complementary to each other. Without the pub-sub in this architecture we would be left with a restrictive polling mechanism.
To summarize:
Frontend:
register an event listener
Client:
publish to server
Server:
notify the subscriber
#codingexercise
We were referring to an example of finding the next palindrome of a number with a very large count of digits in earlier exercises. In order to find the next palindrome, we need to be able to compare such numbers. Here is one way to do it, in the style of a .NET comparator.
        static int compareTo(List<int> current, List<int> digits)
        {
            int ret = 0;
            if (current.Count < digits.Count) return -1;
            if (current.Count > digits.Count) return 1;
            for (int i = 0; i < current.Count && i < digits.Count; i++)
            {
                if (current[i] < digits[i])
                    return -1;
                if (current[i] > digits[i])
                    return 1;
            }
            return ret;
        }

Given a string, its value is defined as the sum of the squares of the frequencies of its letters. Find the minimum value after removing k characters

static int GetValue(string input, int k)
{
    var freq = new Dictionary<char, int>();
    foreach (char c in input)
    {
        if (!freq.ContainsKey(c)) freq[c] = 0;
        freq[c]++;
    }
    for (int j = 0; j < k && freq.Count > 0; j++)
    {
        // remove one occurrence of the most frequent letter
        char candidate = freq.Keys.First();
        foreach (char c in freq.Keys)
            if (freq[c] > freq[candidate])
                candidate = c;
        freq[candidate]--;
        if (freq[candidate] == 0)
            freq.Remove(candidate);
    }
    int val = 0;
    foreach (var f in freq.Values)
        val += f * f; // square, not XOR
    return val;
}
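A heap-based sketch of the same greedy idea in Python (the function name is mine, for illustration):

```python
import heapq
from collections import Counter

def min_value_after_removals(s, k):
    """Sum of squared letter frequencies after greedily removing k
    characters, always taking one from the most frequent letter."""
    heap = [-count for count in Counter(s).values()]  # max-heap via negation
    heapq.heapify(heap)
    for _ in range(k):
        if not heap:
            break
        top = heapq.heappop(heap)
        if top < -1:  # frequency above 1: decrement and push back
            heapq.heappush(heap, top + 1)
    return sum(c * c for c in heap)
```

For "aaab" with k = 1, the greedy step removes an 'a', leaving frequencies 2 and 1 and a value of 5. The heap makes each of the k removals logarithmic instead of a linear scan.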

Saturday, November 19, 2016

Yesterday, we were discussing websockets. We said they facilitate duplex communication and are independent of HTTP. As an example use case, we considered displaying real-time monitoring of CPU and memory metrics on a dashboard using a Python Django server. The clients publish data from each monitored machine to a central Django server that in turn publishes to the frontend. The frontend updates on each new statistic because it listens for those events with, say, Autobahn. This highlights the difference from the traditional model where the frontend pulls the information from a time-series database by running a query every once in a while.
The frontend therefore looks something like this:
<code>
      window.addEventListener("load", function(){
        var connection = new autobahn.Connection({
           url: 'ws://127.0.0.1:8080/ws',
           realm: 'realm1'
        });
        connection.onopen = function(session) {
          var clients = document.getElementById("clients");
          session.subscribe('clientstats', function(args){
            var stats = args[0];
            var serverNode = document.getElementById(stats.ip);
            // update
          });
        };
        connection.open();
      });
</code>

The clients push the data, and this can be done with a statsd client which runs on every source of metrics.
The server, however, is different from the traditional one.
@receiver(post_save, sender=Client, dispatch_uid="server_post_save")
def notify_server_config_changed(sender, instance, **kwargs):
    requests.post("http://127.0.0.1:8080/notify",
                  json={
                      'topic': 'clientconfig',
                      'args': [model_to_dict(instance)]
                  })
The 'notify' is the url configured for the push service.
#codingexercise
We discussed the next palindrome problem yesterday. It involved an addition and a carry-over, which we handle this way:
        static void AddOne(ref List<int> digits, int start)
        {
            if (start >= digits.Count) return;
            if (digits[start] + 1 <= 9)
            {
                digits[start] += 1;
                return;
            }
            // propagate the carry towards the most significant digit
            int carry_over = 1;
            while (start >= 0 && carry_over != 0)
            {
                int sum = digits[start] + carry_over;
                digits[start] = sum % 10;
                carry_over = sum / 10;
                if (start == 0 && carry_over != 0)
                {
                    digits.Insert(0, carry_over);
                }
                start--;
            }
        }

Friday, November 18, 2016

Today we take a break to discuss websockets.
Websockets come in helpful where previously we were doing polling over HTTP. Here we maintain a connection, separate from HTTP and over TCP, where both client and server can push events. Metrics events are a good example of a use for web sockets because they form time series and are numerous. Websockets are full duplex connections, so both sides can act as producer and consumer.
Some examples of Client side libraries include :
socket.io and
web-socket-js
while python implementations include
Autobahn
and crossbar.io
Crossbar.io is particularly helpful in synchronous Python stacks that want near-realtime behavior in Django applications.
It enables us to trigger notifications from our synchronous Python web stack because it comes with a pusher service that can be configured in the JSON configuration file.
As an example use case, consider that we want to display real-time monitoring of CPU and memory metrics on a dashboard using a Python Django server. The clients publish data from each monitored machine to a central Django server that in turn publishes to the frontend. The frontend updates on each new statistic because it listens for those events with, say, Autobahn. This highlights the difference from the traditional model where the frontend pulls the information from a time-series database by running a query every once in a while.
<pre>
<code>
#codingexercise
Print the next greater palindrome of a given array, with each element denoting a single digit of an integer
        static List<int> GetNextPalin(List<int> digits)
        {
            var current = digits.Select(x => x).ToList();
            int n = current.Count;
            // mirror the left half onto the right half
            for (int i = 0, j = n - 1; i < j; i++, j--)
                current[j] = current[i];
            // if the mirrored value falls short of the original,
            // increment the middle digit (with carry) and mirror again
            if (compareTo(current, digits) < 0) // current < digits
            {
                AddOne(ref current, (n - 1) / 2);
                for (int i = 0, j = current.Count - 1; i < j; i++, j--)
                    current[j] = current[i];
            }
            return current;
        }
</code>
</pre>
 Int = 123456, Palindrome = 124421
Int = 1, Palindrome = 1
Int = 12, Palindrome = 22
Int = 92, Palindrome = 99
Int = 31, Palindrome = 33
Int = 13, Palindrome = 22
Int = 83422469, Palindrome = 83433438
Int = 12921, Palindrome = 12921
Int = 125322, Palindrome = 125521
Int = 14587678322, Palindrome = 14587678541
Int = 94187978322, Palindrome = 94188088149
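The table above can be cross-checked with a compact Python translation of the same mirror-and-increment idea:

```python
def next_palindrome(n):
    """Smallest palindrome >= n: mirror the left half onto the right,
    and if that falls short, bump the middle digit and mirror again."""
    digits = list(str(n))
    size = len(digits)

    def mirror(d):
        out = d[:]
        for i in range(size // 2):
            out[size - 1 - i] = out[i]
        return out

    candidate = mirror(digits)
    if int("".join(candidate)) < n:
        # increment the left half (including the middle digit, with carry)
        half = (size + 1) // 2
        left = str(int("".join(candidate[:half])) + 1)
        candidate = mirror(list(left) + candidate[half:])
    return int("".join(candidate))
```

This reproduces the table, e.g. next_palindrome(123456) gives 124421 and next_palindrome(94187978322) gives 94188088149.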

bool isPalindrome(List<int> digits, int start, int end)
{
    while (start < end)
    {
        if (digits[start] != digits[end]) return false;
        start++;
        end--;
    }
    return true;
}