Cluster computing

Thursday, December 21, 2017

We were discussing the bootstrap method, confidence intervals, and the accuracy of model parameters, especially when non-linear models are linearized.
A model may have parameters that correspond to M dimensions. Since each of these dimensions can vary, the probability distribution of the parameters is a function defined on an M-dimensional space. With this probability distribution, we choose a region around the selected model parameters that contains a high percentage of the total distribution. This region is called the confidence region; for a single parameter it reduces to the confidence interval.
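As a rough illustration of the one-parameter case, the sketch below builds a percentile bootstrap interval. The data values, the function name bootstrap_interval and its defaults are all made up for this example, and the "model" is nothing more than a sample mean.

```python
import random
import statistics

def bootstrap_interval(data, estimator, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for a single parameter estimate."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Resample the data with replacement and re-estimate the parameter.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(sample))
    estimates.sort()
    # Keep the central `level` fraction of the bootstrap distribution.
    tail = (1.0 - level) / 2.0
    lo = estimates[int(tail * n_resamples)]
    hi = estimates[int((1.0 - tail) * n_resamples) - 1]
    return lo, hi

# Hypothetical measurements, assumed only for this illustration.
data = [4.8, 5.1, 5.3, 4.9, 5.0, 5.4, 4.7, 5.2, 5.1, 4.9]
low, high = bootstrap_interval(data, statistics.mean)
print("95% interval for the mean:", low, high)
```

With M parameters the same resampling yields a cloud of points in M-dimensional space, and the region is chosen to enclose the desired fraction of that cloud.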
We were saying that a model that is made linear looks easy to interpret, but it is not accurate, especially when it is obtained by transforming a non-linear model. We will talk more about this process, but first let us consider the M-dimensional space.
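One way to see the loss of accuracy is to fit an exponential model through a log transform and then score it in the original units. The sketch below assumes a model y = a * exp(b * x) and invented data; the coarse grid search is only a stand-in for a proper non-linear fit. Because the transform minimizes the error in ln(y) rather than in y, a nearby parameter pair can have a smaller squared error on the untransformed data.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = c0 + c1 * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    c1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    return my - c1 * mx, c1

def sse(a, b, xs, ys):
    """Sum of squared errors of y = a * exp(b * x) in the original units."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, assumed for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.3, 3.0, 5.9, 8.5, 15.4, 23.9]

# Linearized fit: ln(y) = ln(a) + b * x.
ln_a, b_lin = fit_line(xs, [math.log(y) for y in ys])
a_lin = math.exp(ln_a)
sse_lin = sse(a_lin, b_lin, xs, ys)

# Coarse grid search around the linearized estimate, scored in original units.
best = (sse_lin, a_lin, b_lin)
for i in range(-20, 21):
    for j in range(-20, 21):
        a = a_lin * (1 + 0.01 * i)
        b = b_lin * (1 + 0.005 * j)
        best = min(best, (sse(a, b, xs, ys), a, b))

print("linearized:", a_lin, b_lin, sse_lin)
print("refined:   ", best[1], best[2], best[0])
```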
Usually non-linear models are solved by minimizing an objective function such as the chi-square error function. The gradients of the chi-square function with respect to the parameters must approach zero at the minimum of the objective function. In the steepest descent method, the minimization proceeds iteratively, stepping against the gradient, until the objective function stops decreasing.
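A minimal sketch of steepest descent on a chi-square function follows. The model a * exp(-b * x), the noise-free data, the unit measurement error and the backtracking rule for the step size are all assumptions made for this example, not a prescribed method.

```python
import math

def model(x, a, b):
    return a * math.exp(-b * x)

def chi_square(params, xs, ys, sigma=1.0):
    a, b = params
    return sum(((y - model(x, a, b)) / sigma) ** 2 for x, y in zip(xs, ys))

def gradient(params, xs, ys, sigma=1.0):
    """Partial derivatives of chi-square with respect to a and b."""
    a, b = params
    ga = gb = 0.0
    for x, y in zip(xs, ys):
        e = math.exp(-b * x)
        r = (y - a * e) / sigma ** 2
        ga += -2.0 * r * e
        gb += 2.0 * r * a * x * e
    return ga, gb

def steepest_descent(start, xs, ys, step=0.1, tol=1e-12, max_iter=20000):
    params = start
    current = chi_square(params, xs, ys)
    for _ in range(max_iter):
        ga, gb = gradient(params, xs, ys)
        g2 = ga * ga + gb * gb
        # Backtrack: halve the step until chi-square decreases sufficiently.
        t = step
        while t > 1e-15:
            trial = (params[0] - t * ga, params[1] - t * gb)
            value = chi_square(trial, xs, ys)
            if value <= current - 1e-4 * t * g2:
                break
            t *= 0.5
        else:
            break  # no acceptable step found: treat as converged
        improvement = current - value
        params, current = trial, value
        if improvement < tol:
            break  # chi-square has stopped decreasing
    return params, current

# Hypothetical noise-free data generated from a = 3.0, b = 0.7.
xs = [0.5 * i for i in range(11)]
ys = [model(x, 3.0, 0.7) for x in xs]
(a_fit, b_fit), chi2 = steepest_descent((1.0, 0.1), xs, ys)
print(a_fit, b_fit, chi2)
```

The loop stops either when no step along the negative gradient lowers chi-square or when the improvement falls below tol, which is the "stops decreasing" condition in code form.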

The error function depends on the model parameters. Over the M-dimensional parameter space, the error function can be considered a surface of which we seek a minimum. If the number of dimensions is large, the model may be complex and the surface may have more than one minimum. An optimal minimization will strive to find the true global minimum of the error surface and not just a local minimum. To improve the chance of converging to the global minimum, one technique is to vary the starting points, or initial guesses, and compare the resulting model parameters.
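The effect of the starting point can be sketched with a one-parameter error surface that has two minima. The surface, the step size and the list of starting points below are invented for the illustration.

```python
def error(p):
    # Hypothetical error surface with a local and a global minimum.
    return (p * p - 1.0) ** 2 + 0.3 * p

def slope(p):
    # Derivative of error(p) with respect to p.
    return 4.0 * p * (p * p - 1.0) + 0.3

def descend(p, step=0.01, iterations=5000):
    """Plain gradient descent with a fixed step from one starting point."""
    for _ in range(iterations):
        p -= step * slope(p)
    return p

starts = [-2.0, -0.5, 0.5, 2.0]
results = []
for p0 in starts:
    p_end = descend(p0)
    results.append((error(p_end), p_end, p0))
    print("start", p0, "-> p =", round(p_end, 4), "error =", round(error(p_end), 4))

# Compare the runs and keep the one with the lowest error.
best_error, best_p, best_start = min(results)
print("best:", best_p, best_error)
```

Starts on the positive side settle in the shallower minimum, while starts on the negative side reach the deeper one, so comparing the final errors across runs is what identifies the better solution. Multiple starts raise the odds of finding the global minimum but do not guarantee it.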
The goodness of fit and the residuals plot are useful indicators along with the error function. Each gives helpful information. A model may be inaccurate because it systematically differs from the data, and yet the correlation between the model and the data can still be high. A high correlation suggests that the fit is good, but for the model to be accurate the fit must also produce a specific distribution of residuals. The residuals, which are the differences between the data points and the estimates, should be scattered randomly along the independent axis and normally distributed around zero with no systematic trends. The latter condition is what makes the residuals acceptable as random scatter rather than a sign of a misspecified model.
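A small sketch of this situation: a straight line fitted to data that is actually quadratic. The data and helper names are assumptions for the example. The correlation between the data and the fitted values is high, yet the residual signs form long runs instead of scattering around zero.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = c0 + c1 * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    c1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    return my - c1 * mx, c1

def pearson(us, vs):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(us)
    mu = sum(us) / n
    mv = sum(vs) / n
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    su = math.sqrt(sum((u - mu) ** 2 for u in us))
    sv = math.sqrt(sum((v - mv) ** 2 for v in vs))
    return cov / (su * sv)

# Hypothetical data that is truly quadratic, fitted with a straight line.
xs = [float(i) for i in range(11)]
ys = [x * x for x in xs]

c0, c1 = fit_line(xs, ys)
fitted = [c0 + c1 * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

r = pearson(ys, fitted)
signs = "".join("+" if e > 0 else "-" for e in residuals)
print("correlation:", round(r, 4))
print("residual signs along x:", signs)
```

Here the correlation is about 0.96, but the residuals are positive at both ends and negative in the middle, which is the kind of systematic trend a residuals plot exposes.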

Find the maximum product of an increasing subsequence
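using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Returns the maximum product over all strictly increasing subsequences
// of the first n elements of A.
double GetProductIncreasingSubsequence(List<double> A, int n)
{
    Debug.Assert(n > 0 && n <= A.Count);
    Debug.Assert(A.All(x => x >= 0)); // the recurrence assumes non-negative values

    // products[i] holds the best product of an increasing subsequence ending at A[i];
    // initially each element forms a subsequence on its own.
    var products = new List<double>();
    for (int i = 0; i < n; i++)
        products.Add(A[i]);
    Debug.Assert(products.Count == n);

    // Extend every earlier subsequence that ends in a smaller value.
    for (int i = 1; i < n; i++)
        for (int j = 0; j < i; j++)
            if (A[i] > A[j] &&
                products[i] < (products[j] * A[i]))
                products[i] = products[j] * A[i];

    return products.Max();
}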
Example:
A: 2 0 4
P: 2 0 8
The function returns 8, the product of the increasing subsequence 2, 4.
