Cluster computing

Saturday, December 23, 2017

We were discussing the bootstrap method, confidence intervals, and the accuracy of model parameters, especially in the linearization of non-linear models.
The goodness of fit and the residual plot are useful indicators along with the error function. Each gives helpful information.
A high correlation suggests that the fit is good, but for the fit to be trusted it must also produce a specific distribution of residuals. The residuals should be spread evenly along the independent axis and normally distributed around zero with no systematic trends. The latter condition makes the residual, which is the difference between the data point and the estimate, more acceptable.
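To make the "no systematic trends" condition concrete, here is a minimal sketch in plain Python. The data is made up for illustration (a deliberately curved set of points), and the helper names fit_line and count_sign_changes are hypothetical. It fits a straight line and counts how often consecutive residuals change sign. A least-squares line with an intercept always has residuals that sum to zero, so a zero mean alone does not prove the fit is good; long runs of same-signed residuals are what reveal the trend.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def count_sign_changes(values):
    """Number of times consecutive non-zero values switch sign."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Made-up, deliberately curved data: a straight line is the wrong model here.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
sign_changes = count_sign_changes(residuals)

print("residuals:", residuals)
print("sum of residuals:", sum(residuals))
print("sign changes:", sign_changes)
```

The residuals are positive at both ends and negative in the middle, so there are only a couple of sign changes across ten points. Residuals scattered randomly around zero would switch sign far more often.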
I want to mention that curve fitting improves with higher degrees, such as a quadratic over a linear fit, but this does not mean we go to as high a degree as possible. The model we build tries to minimize the residuals. If we can do this with a lower degree, then that is acceptable.
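As a rough illustration of this point, the sketch below fits polynomials of degree 1, 2 and 3 to a small made-up data set that is roughly quadratic, and prints the sum of squared residuals for each. The helper names (fit_polynomial, sum_squared_residuals) and the data values are assumptions for this example only. The fit is done with the normal equations and Gauss-Jordan elimination, which is adequate for a handful of points and low degrees but is not how a numerical library would do it.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y], where A[i][j] = xs[i] ** j.
    aug = [
        [sum(x ** (r + c) for x in xs) for c in range(m)]
        + [sum(y * x ** r for x, y in zip(xs, ys))]
        for r in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][m] / aug[r][r] for r in range(m)]


def sum_squared_residuals(xs, ys, degree):
    coeffs = fit_polynomial(xs, ys, degree)
    predict = lambda x: sum(c * x ** k for k, c in enumerate(coeffs))
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Made-up data, roughly y = x^2 + 1 with small hand-picked perturbations.
xs = [float(i) for i in range(8)]
ys = [1.1, 1.9, 5.2, 9.8, 17.1, 26.0, 36.8, 50.1]

ssr = {d: sum_squared_residuals(xs, ys, d) for d in (1, 2, 3)}
for d, value in ssr.items():
    print(f"degree {d}: sum of squared residuals = {value:.4f}")
```

Going from degree 1 to degree 2 removes almost all of the residual, while degree 3 can only shave off a little more. That small extra gain is the signal that the lower degree is already enough.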
When we compare two models, we follow certain rules. For example, the number of data points must be larger than the number of parameters. If we increase the number of parameters, the goodness of fit improves and the chi-square decreases, so on chi-square alone the model with the higher number of parameters generally appears to do better. To account for this, we use a metric called the mean squared error (MSE), which is the chi-square divided by the degrees of freedom, that is, the number of data points minus the number of parameters. Because adding parameters lowers both the numerator and the denominator, the MSE represents a tradeoff between an over-parameterized model and a lower chi-square. A model that achieves a low chi-square with fewer parameters is preferred.
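The tradeoff can be sketched in a few lines of Python. The function name reduced_chi_square and the chi-square values below are made up purely for illustration; they do not come from a real fit. The degrees of freedom are taken as the number of data points minus the number of parameters, and the function refuses to compute anything when that rule from above is violated.

```python
def reduced_chi_square(chi_square, n_points, n_params):
    """Chi-square divided by the degrees of freedom (n_points - n_params)."""
    dof = n_points - n_params
    if dof <= 0:
        raise ValueError("need more data points than parameters")
    return chi_square / dof


# Hypothetical numbers: two models fitted to the same 10 data points.
n_points = 10
model_a = reduced_chi_square(8.0, n_points, n_params=2)  # fewer parameters
model_b = reduced_chi_square(6.0, n_points, n_params=6)  # lower raw chi-square

print("model A:", model_a)
print("model B:", model_b)
```

With these numbers, model B has the lower raw chi-square, but once we divide by the degrees of freedom model A comes out ahead, which is the sense in which the simpler model is preferred.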
