Cluster computing

Metropolis Hastings algorithm helps with performing independent sampling in a high dimensional distribution. In smaller dimensions, it is easier to tell apart samples by segregating batches that are not dependent on a factor. For example, if we were to find the effectiveness of a medicine, we would create two groups - one where the medicine is administered and one where it isn't. When there are many many factors, the samples are difficult to draw.
This algorithm generates samples iteratively with the desired distribution. In each iteration, more and more samples are produced where the next sample depends only on the current in a Markov chain like manner. hence it is also called Markov Chain Monte Carlo method. The number of samples is proportional to the number of iterations. In each iteration it picks a sample based on the current value. then it measures how close this sample is to the desired distribution. if the sample is accepted, then the new sample is used as the current value.if it is rejected, the current value is reused.
The idea is that we generate a sequence of x values one after the other say x0, x1, x2, .. xn such that as we increase their number to a large value, we can see that the last sample is the one with the Probability distribution P(x).
The method uses two steps to generate the next in the sequence. The first stage is to generate a candidate from the proposal distribution which depends on the current state of the Markov chain. The proposal distribution is usually defined with a standard deviation. The second stage is the accept-reject step.
In this step we calculate acceptance probability which depends on the ratios of the probabilities for the candidate in the given and proposal distribution taken together. Given the current state of the Markov chain as xn, the first ratio is that of the probability of the candidate to the probability of the current state. The second ratio is that of the conditional probabilities of the current state given the candidate and that of the candidate given the current state. The first ratio does not depend on the normalizing constant for the distribution P(x) which makes it easier to calculate. The second ratio is a correcting factor for any biases that the proposal distribution might induce. Since the second ratio involves the conditional distribution of the candidate given the current state as the denominator, it is a measure of what happened. The opposite is true for the numerator in the second ration which is therefore a measure of what would have occurred. The second ratio turns out to be 1 when the proposal distribution is normal. This is a special case of Metropolis Hastings algorithm which is then called the Metropolis algorithm
#codingexercise
Yesterday we were discussing an iterative solution for finding non-repeating character sequence in a string. Here is the recursive version:

static void GetMaxRecursive(String input, int start, int curLen, ref int max, ref List<int> h)

        {

               if (start > input.Length - 1)
               {
                   if (curLen >= max)
                   {
                       max = curLen;
                       Console.WriteLine(
                       "One Non repeating character substring found at index {0} is {1}",
                       input.Length - max,
                       input.Substring(input.Length - max, max));
                   }
                    return;
               }

                int prev = h[input[start]];
                if (prev == -1 || start - curLen > prev)
                {
                    curLen++;
                }
                else
                {
                    if (curLen > max)
                    {
                        max = curLen;
                        Console.WriteLine(
                        "One Non repeating character substring found at index {0} is {1}",
                        start - max,
                        input.Substring(start - max, max));
                    }
                    curLen = start - prev;
                }
                h[input[start]] = start;
                if (input.Length - max >= 0 && curLen >= max)
                {
                    max = curLen;
                    Console.WriteLine(
                    "One Non repeating character substring found at index {0} is {1}",
                    input.Length - max,
                    input.Substring(input.Length - max, max));
                }

            GetMaxRecursive(input, start+1, curLen, ref max, ref h);
        }

Puzzle : we have five pieces of chains each three links long. Breaking open a link costs 1 dollar and resealing costs 3 dollars. Can we form one long chain with fifteen dollars.
Answer Yes, break open all links of one chain. Use these three as connectors for the remaining pieces. Cost = 3 x 1 + 3 x 3 = 12.

Cluster computing

Monday, November 14, 2016

No comments:

Post a Comment