Cluster computing

Tuesday, November 15, 2016

Yesterday we were discussing Metropolis Hastings algorithm. It helps with drawing independent samples from a high dimensional distribution. This algorithm is iterative. In each iteration, more and more samples are produced where the next sample depends only on the current in a Markov chain like manner.The number of samples is proportional to the number of iterations. In each iteration it picks a sample based on the current value. then it measures how close this sample is to the desired distribution. if the sample is accepted, then the new sample is used as the current value.if it is rejected, the current value is reused. The method uses two steps to generate the next in the sequence. The first stage is to generate a candidate from the proposal distribution which depends on the current state of the Markov chain. The proposal distribution is usually defined with a standard deviation. The second stage is the accept-reject step. We saw that the acceptance probability depends on the ratios of the probabilities for the candidate in the given and proposal distribution taken together. The first ratio does not depend on the normalizing factor which makes the calculation easier and the second ratio is a correcting factor which can be simplified for a special case of this algorithm.
With the candidate generated and the acceptance probability calculated, the candidate is either accepted or rejected. To make this distinction, a uniformly distributed random number is generated between 0 and 1 and denoted by u. Then if u is less than or equal to the acceptance probability, we accept the candidate and if it isn't, then we reject the candidate. We start the next iteration.
The parameters to the algorithm are the number of samples to draw denoted by n and the standard deviation denoted by sigma. When the standard deviation is too small, the proposal distribution is set too narrow. The acceptance rate is high so the chain doesn't get stuck in any one spot but it doesn't cover a wide range. When the standard deviation is very large, the proposal distribution is set too wide, the acceptance rate is low and the distribution is highly irregular. The chain gets stuck in one spot for long periods of time but it does manage to make big jumps, covering the whole range.
There are two parameters that can be introduced to help with improving the sampling. The first is called the burnin by which we delay the actual collection until the algorithm has run for a while. The other parameter is called the lag by which we allow several iterations of the sampler to elapse in between successive samples which comes in helpful when the Markov chain gets stuck in one location for long periods of time.
Courtesy : Navarro and Perfors

#codingexercise
Yesterday we were discussing a solution for finding non-repeating character sequence in a string. Here's another.
static int GetMaxFreq(string input)
{
Console.WriteLine("input={0}", input);
int count = 0;
int max = 0;
for (int i = 0; i < input.Length; i++)
{
var freq = new List<int>(256);
freq.AddRange(Enumerable.Repeat(0, 256));
freq[input[i]]++;
count = 1;
if (count > max)
max = count;
for (int j = i + 1; j < input.Length; j++)
{
if (freq[input[j]] != 0)
{
break;
}
else
{
count++;
}
if (count > max)
max = count;
freq[input[j]]++;
}
}
return max;
}

#puzzle given three multiple choices for a question as below, find the right answer:

A. Answer A
B. Answer A or B
C. Answer B or C

by process of mutial exclusion, the answer is C. if it were A or B there would be more than one choice possible.

Cluster computing

Tuesday, November 15, 2016

No comments:

Post a Comment