We continue discussing the paper "Nested sampling for general Bayesian computation" by Skilling, with an emphasis on understanding the transformation that underlies the technique: recasting the evidence computation over the parameters as a computation over the unit of prior mass. Nested sampling directly estimates how the likelihood relates to the enclosed prior mass, and the evidence it produces can be used to compare two models of the same data set by comparing the probability assigned to each after the relevant evidence is taken into account.
This method computes the evidence directly. It simplifies the calculation by not summing over the parameters themselves but instead integrating over the cumulated prior mass that covers likelihood values greater than some threshold lambda. As lambda increases, the enclosed mass X decreases from one to zero. Writing the inverse function as L(X), where L is the likelihood and L(X(lambda)) = lambda, the evidence transforms from an integral of the likelihood L(theta) over elements of prior mass dX = pi(theta) dtheta, where theta is the parameter, into an integral of L(X) over the prior mass directly. The summation therefore simplifies to a one-dimensional integral over the unit range.
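In symbols (just restating the definitions above, with Z denoting the evidence and pi the prior):

X(\lambda) = \int_{L(\theta) > \lambda} \pi(\theta)\, d\theta

Z = \int L(\theta)\, \pi(\theta)\, d\theta = \int_0^1 L(X)\, dX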
In other words, we redefine the evidence from a variation over the parameters theta to a variation over the prior mass X by dividing the unit prior mass into tiny elements and sorting them by likelihood.
Let's picture it this way. The evidence is the area under the curve in a plot of likelihood against prior mass, and the likelihood decreases as the prior mass increases from 0 to 1. The evidence is therefore found by integrating over this curve. We obtain this simpler form by sorting the prior-mass elements by their likelihood values and arranging them as tiny strips of width dX, and because X is cumulative the strips line up in order.
This is clearer to see with an example where the parameter is two-dimensional and the likelihood values fill a matrix. On a four-by-four grid there are sixteen likelihood values, one per cell, and each cell carries an equal prior mass of 1/16. They arrive in no particular order, because they are indexed by the parameter values. But if we sort them into a single descending sequence, the likelihood corresponding to X = 1/5 lies one fifth of the way along it, which falls in the fourth sorted cell of the sixteen (since 3/16 < 1/5 <= 4/16), and L at that X can be read off directly. These sorted descending values trace out the curve L(X) that we integrate to find the evidence.
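A small numerical sketch of this sorting-and-summing idea in C# (the 4x4 grid and the 1/16 masses come from the example above; the sixteen likelihood values themselves are made up for illustration):

using System;
using System.Collections.Generic;
using System.Linq;

class NestedSamplingToy
{
    static void Main()
    {
        // Sixteen illustrative likelihood values, one per cell of a 4x4 parameter grid.
        var likelihoods = new List<double>
        {
            0.10, 0.30, 0.20, 0.05,
            0.70, 1.20, 0.90, 0.40,
            0.60, 1.00, 0.80, 0.30,
            0.20, 0.50, 0.40, 0.10
        };

        // Sort descending: the k-th value (1-based) now sits at enclosed prior mass X = k/16.
        likelihoods.Sort((a, b) => b.CompareTo(a));

        // Each cell carries equal prior mass dX = 1/16; the evidence is the Riemann sum of the strips.
        double dX = 1.0 / likelihoods.Count;
        double evidence = likelihoods.Sum(L => L * dX);
        Console.WriteLine($"Evidence ~ {evidence:F4}");

        // L at X = 1/5 lies in the fourth sorted strip, since 3/16 < 1/5 <= 4/16.
        int k = (int)Math.Ceiling(0.2 * likelihoods.Count);
        Console.WriteLine($"L(X = 1/5) = {likelihoods[k - 1]}");
    }
}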
#codingexercise
Consider a row of n coins of values v1 . . . vn, where n is even. We play a game against an opponent by alternating turns. In each turn, a player selects either the first or last coin from the row, removes it from the row permanently, and receives the value of the coin. Determine the maximum possible amount of money we can definitely win if we move first.
For example, in the sequence 8, 15, 3, 7:
if we choose 7, the opponent chooses 8, and we then choose 15, winning with a total of 22.
Let F(i, j) denote the maximum value we can collect from the coins i through j when both players play optimally. There are two ways to choose:
1) Choose the ith coin with value Vi; the opponent then moves so as to leave us the worse of the two remaining subgames:
Vi + min(F(i+2, j), F(i+1, j-1))
2) Choose the jth coin with value Vj; likewise:
Vj + min(F(i+1, j-1), F(i, j-2))
F(i, j) is the maximum of these two options.
Therefore the solution looks like this:
static int F(List<int> V, int i, int j)
{
    // Empty or invalid interval: nothing left to collect.
    if (i > j || i < 0 || j >= V.Count) return 0;
    // One coin left: take it.
    if (i == j) return V[i];
    // Two coins left: take the larger.
    if (j == i + 1) return Math.Max(V[i], V[j]);
    // Take an end coin; the opponent then leaves us the worse of the two subgames.
    int takeFirst = V[i] + Math.Min(F(V, i + 2, j), F(V, i + 1, j - 1));
    int takeLast = V[j] + Math.Min(F(V, i + 1, j - 1), F(V, i, j - 2));
    return Math.Max(takeFirst, takeLast);
}
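As a quick sanity check on the walkthrough above, assuming F is in scope (e.g., as a static method of the enclosing class):

var coins = new List<int> { 8, 15, 3, 7 };
Console.WriteLine(F(coins, 0, coins.Count - 1)); // prints 22: choose 7, then 15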