Tuesday, December 26, 2017

We were discussing how to compare two models. For example, the number of data points must be larger than the number of parameters. If we increase the number of parameters, the goodness of fit improves and the chi-square drops, so by chi-square alone the model with more parameters would appear to do better. This is addressed by a metric called the mean squared error, which is the chi-square divided by the degrees of freedom. Because MSE has them as numerator and denominator, it represents a trade-off between an over-parameterized model and a lower chi-square. A model with fewer parameters and a lower chi-square is preferred.
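
To make the trade-off concrete, here is a minimal C# sketch; the chi-square values, data size, and parameter counts are hypothetical numbers chosen only to illustrate the arithmetic.

// Hypothetical comparison of two fitted models by mean squared error.
// MSE = chi-square / degrees of freedom, where dof = dataPoints - parameters.
double ModelMSE(double chiSquare, int dataPoints, int parameters)
{
    int dof = dataPoints - parameters;   // requires dataPoints > parameters
    return chiSquare / dof;
}

// Model B has more parameters and a lower chi-square, yet a worse MSE:
double mseA = ModelMSE(12.0, 20, 2);   // 12.0 / 18 = 0.67
double mseB = ModelMSE(10.5, 20, 8);   // 10.5 / 12 = 0.875
// The model with the lower MSE, here model A, is preferred.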

#codingexercise
Find the Eulerian number E(n, m). This is the number of permutations of the numbers from 1 to n in which exactly m elements are greater than the element immediately before them.
For example, for n = 3 and m = 1 there are 4 permutations in which exactly 1 element is greater than the previous element. These are:
1 2 3 => 1,2 and 2,3 resulting in count 2 (invalid)
1 3 2 => 1,3 resulting in count 1 (valid)
2 1 3 => 1,3 resulting in count 1 (valid)
2 3 1 => 2,3 resulting in count 1 (valid)
3 1 2 => 1,2 resulting in count 1 (valid)

int GetEulerian(int n, int m)
{
    // Out of range: a permutation of n elements has at most n-1 ascents.
    if (m >= n || n == 0)
        return 0;

    // Only the descending permutation has zero ascents.
    if (m == 0)
        return 1;

    // Inserting n into a permutation of n-1 elements either creates
    // one more ascent or leaves the count unchanged.
    return (n - m) * GetEulerian(n - 1, m - 1) +
           (m + 1) * GetEulerian(n - 1, m);
}

We can easily trace this for n = 3, m = 1:
E(3,1) = 2 x E(2,0) + 2 x E(2,1)
       = 2 + 2 x (1 x E(1,0) + 2 x E(1,1))
       = 2 + 2 x (1 + 0)
       = 4
The recurrence rests on the observation that, as n decreases by one, the number of ascents m either decreases by one or stays the same.
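
The plain recursion recomputes the same subproblems, so a bottom-up table is a natural refinement. Here is a minimal sketch under the same base cases; the table layout and names are my own:

int GetEulerianDP(int n, int m)
{
    if (n == 0 || m >= n) return 0;
    // dp[i, j] holds E(i, j); entries outside 0 <= j < i stay 0 by default.
    var dp = new int[n + 1, n];
    for (int i = 1; i <= n; i++)
    {
        dp[i, 0] = 1;  // only the descending permutation has zero ascents
        for (int j = 1; j < i; j++)
            dp[i, j] = (i - j) * dp[i - 1, j - 1] + (j + 1) * dp[i - 1, j];
    }
    return dp[n, m];  // GetEulerianDP(3, 1) == 4, matching the trace above
}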

Monday, December 25, 2017

We were discussing how to compare two models. For example, the number of data points must be larger than the number of parameters. If we increase the number of parameters, the goodness of fit improves and the chi-square drops, so by chi-square alone the model with more parameters would appear to do better. This is addressed by a metric called the mean squared error, which is the chi-square divided by the degrees of freedom. Because MSE has them as numerator and denominator, it represents a trade-off between an over-parameterized model and a lower chi-square. A model with fewer parameters and a lower chi-square is preferred.
#codingexercise
Find the Entringer number for a given n, k.
The Entringer number E(n, k) counts the permutations of the digits 1 to n+1 that begin with k+1 and then alternately fall and rise, starting with a fall. For example, the E(4,2) = 4 permutations are:
32415
32514
31425
31524
int GetCountEntringer(int n, int k)
{
    // Base cases: E(0,0) = 1 and E(n,0) = 0 for n > 0.
    if (n == 0 && k == 0) return 1;
    if (k == 0) return 0;
    // Standard Entringer recurrence: E(n,k) = E(n,k-1) + E(n-1,n-k).
    return GetCountEntringer(n, k - 1) + GetCountEntringer(n - 1, n - k);
}
Expanding the recurrence (following the Wolfram definition of the Entringer numbers) for E(4,2):
E(4,2) = E(4,1) + E(3,2)
E(4,1) = E(4,0) + E(3,3) = 0 + E(3,3)
E(3,3) = E(3,2) + E(2,0) = E(3,2) + 0
E(3,2) = E(3,1) + E(2,1)
E(3,1) = E(3,0) + E(2,2) = 0 + E(2,2)
E(2,2) = E(2,1) + E(1,0) = E(2,1) + 0
E(2,1) = E(2,0) + E(1,1) = 0 + E(1,1)
E(1,1) = E(1,0) + E(0,0) = 0 + 1 = 1
Working back up: E(2,1) = 1, E(2,2) = 1, E(3,1) = 1, E(3,2) = 2, E(3,3) = 2, E(4,1) = 2, and E(4,2) = 2 + 2 = 4, matching the four permutations above.
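
The expansion shows the same subproblems recurring, E(3,2) for instance appears twice, so caching helps. A minimal memoized variant, with a dictionary cache of my own naming:

// Requires System.Collections.Generic.
Dictionary<(int n, int k), int> cache = new Dictionary<(int n, int k), int>();

int GetEntringerMemo(int n, int k)
{
    if (n == 0 && k == 0) return 1;
    if (k == 0) return 0;
    if (cache.TryGetValue((n, k), out int cached)) return cached;
    int result = GetEntringerMemo(n, k - 1) + GetEntringerMemo(n - 1, n - k);
    cache[(n, k)] = result;
    return result;
}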

Sunday, December 24, 2017

We were discussing how to compare two models. For example, the number of data points must be larger than the number of parameters. If we increase the number of parameters, the goodness of fit improves and the chi-square drops, so by chi-square alone the model with more parameters would appear to do better. This is addressed by a metric called the mean squared error, which is the chi-square divided by the degrees of freedom. Because MSE has them as numerator and denominator, it represents a trade-off between an over-parameterized model and a lower chi-square. A model with fewer parameters and a lower chi-square is preferred.
#codingexercise
Find the number of increasing subsequences of size k in a given array.
For example, for 1 2 3 4, the increasing subsequences of size 2 are 1 2, 1 3, 1 4, 2 3, 2 4, 3 4 (six in all).
The recurrence is therefore
dp[i][j] = 1, where i = 1 and 1 <= j <= n
dp[i][j] = sum over m of dp[i-1][m], where 1 < i <= k, i <= j <= n, (i-1) <= m < j, and A[m] < A[j]

int GetCountIncreasingSubsequences(int[] A, int n, int k)
{
    // dp[size, i] = number of increasing subsequences of length size+1
    // that end at index i.
    int[,] dp = new int[k, n];
    int sum = 0;

    // Every element by itself is an increasing subsequence of size 1.
    for (int i = 0; i < n; i++)
        dp[0, i] = 1;

    // Extend subsequences of the previous size with a larger element.
    for (int size = 1; size < k; size++) {
        for (int i = size; i < n; i++) {
            dp[size, i] = 0;
            for (int j = size - 1; j < i; j++) {
                if (A[j] < A[i])
                    dp[size, i] += dp[size - 1, j];
            }
        }
    }

    // Total the subsequences of size k over their possible ending indexes.
    for (int i = k - 1; i < n; i++)
        sum += dp[k - 1, i];

    return sum;
}

The above method also memoizes the counts computed for sizes smaller than k, so we can look up the result for any size less than k from the same table.
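As a quick check against the example above, the call below should count six subsequences (the driver is my own):

var A = new int[] { 1, 2, 3, 4 };
int count = GetCountIncreasingSubsequences(A, A.Length, 2);
// count == 6: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)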

Saturday, December 23, 2017

We were discussing the bootstrap method, confidence intervals, and the accuracy of model parameters, especially under linearization of non-linear models.
The goodness of fit and the residuals plot are useful indicators along with the error function; each gives helpful information.
A high correlation means the fit is good, but for the model to be accurate it must also produce a specific distribution of residuals: the residuals should be spread along the independent axis and normally distributed around zero, with no systematic trends. The latter condition makes the residual, the difference between the data point and the estimate, more acceptable.
I want to mention that curve fitting improves with higher degrees, such as a quadratic over a linear fit, but this does not mean we go to as high a degree as possible. The model we build tries to minimize the residuals; if we can do that with a lower degree, that is acceptable.
When we compare two models, we follow certain rules. For example, the number of data points must be larger than the number of parameters. If we increase the number of parameters, the goodness of fit improves and the chi-square drops, so by chi-square alone the model with more parameters would appear to do better. This is addressed by a metric called the mean squared error, which is the chi-square divided by the degrees of freedom. Because MSE has them as numerator and denominator, it represents a trade-off between an over-parameterized model and a lower chi-square. A model with fewer parameters and a lower chi-square is preferred.

Friday, December 22, 2017

We were discussing the bootstrap method, confidence intervals, and the accuracy of model parameters, especially under linearization of non-linear models.
The goodness of fit and the residuals plot are useful indicators along with the error function; each gives helpful information.
A high correlation means the fit is good, but for the model to be accurate it must also produce a specific distribution of residuals: the residuals should be spread along the independent axis and normally distributed around zero, with no systematic trends. The latter condition makes the residual, the difference between the data point and the estimate, more acceptable.
I want to mention that curve fitting improves with higher degrees, such as a quadratic over a linear fit, but this does not mean we go to as high a degree as possible. The model we build tries to minimize the residuals; if we can do that with a lower degree, that is acceptable.
#codingexercise
Print Hosoya's triangle
Hosoya's triangle is a triangular arrangement of numbers based on the Fibonacci numbers, in which each entry is the sum of the two entries above it along its left diagonal, or equivalently the two above it along its right diagonal.
The recurrence relation is
H(0,0) = H(1,0) = H(1,1) = H(2,1) = 1
and
H(n,j) = H(n-1,j) + H(n-2,j)
       = H(n-1,j-1) + H(n-2,j-2)
and in closed form H(n,i) = F(i+1) x F(n-i+1), where F is the Fibonacci sequence.

int GetHosoya(int r, int c)
{
    // Seed values from the recurrence.
    if (r == 0 && c == 0) return 1;
    if (r == 1 && c == 0) return 1;
    if (r == 1 && c == 1) return 1;
    if (r == 2 && c == 1) return 1;
    // Interior and left-edge entries: sum the two entries above in the same column.
    if (r > c)
        return GetHosoya(r - 1, c) + GetHosoya(r - 2, c);
    // Right-edge entries: follow the diagonal.
    if (r == c)
        return GetHosoya(r - 1, c - 1) + GetHosoya(r - 2, c - 2);
    return 0;  // outside the triangle
}
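
Since the exercise asks to print the triangle, a small driver over GetHosoya completes it; this loop is my own sketch:

void PrintHosoya(int rows)
{
    for (int r = 0; r < rows; r++)
    {
        for (int c = 0; c <= r; c++)
            Console.Write(GetHosoya(r, c) + " ");
        Console.WriteLine();
    }
}
// PrintHosoya(5) prints:
// 1
// 1 1
// 2 1 2
// 3 2 2 3
// 5 3 4 3 5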

Thursday, December 21, 2017

We were discussing the bootstrap method, confidence intervals, and the accuracy of model parameters, especially under linearization of non-linear models.
A model may have parameters that correspond to M dimensions. Since each of these dimensions allows variation, the probability distribution is a function defined on the M-dimensional space. With this probability distribution, we choose a region that contains a high percentage of the total distribution relative to the selected model parameters. This region is called the confidence interval.
We were saying that a model made linear looks easy to interpret but is not accurate, especially when it results from transforming a non-linear model. We will talk more about this process, but first let us consider the M-dimensional space.
Usually non-linear models are solved with an objective function such as the chi-square error function. The gradients of the chi-square function with respect to the parameters must approach zero at the minimum of the objective function. In the steepest descent method, the minimization proceeds iteratively until the objective stops decreasing.
The error function depends on the model parameters, so across the M dimensions it forms a surface on which we seek a minimum. If the dimensions are many, the model may be complex and the surface may have more than one minimum. Optimal minimization strives to find the true global minimum of the error surface, not just a local minimum. One technique for converging to the global minimum is to vary the starting points, that is, the initial guesses, and compare the resulting model parameters.
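
To make the iteration concrete, here is a minimal steepest-descent sketch in C#; the quadratic objective stands in for a chi-square surface, and the step size and stopping rule are illustrative choices of my own:

// Minimize f by stepping against its gradient until f stops decreasing.
// f(p) = (p0 - 3)^2 + (p1 + 1)^2 is a stand-in for a chi-square function.
double[] SteepestDescent(double[] start, double learningRate, int maxIterations)
{
    Func<double[], double> f = q => (q[0] - 3) * (q[0] - 3) + (q[1] + 1) * (q[1] + 1);
    Func<double[], double[]> grad = q => new[] { 2 * (q[0] - 3), 2 * (q[1] + 1) };

    var p = (double[])start.Clone();
    double previous = f(p);
    for (int iter = 0; iter < maxIterations; iter++)
    {
        var g = grad(p);
        for (int i = 0; i < p.Length; i++)
            p[i] -= learningRate * g[i];        // step downhill
        double current = f(p);
        if (previous - current < 1e-12) break;  // stopped decreasing
        previous = current;
    }
    return p;  // converges toward (3, -1) for a small enough learningRate
}

Restarting this from several different initial guesses and comparing the answers is the simple guard against settling into a local minimum.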
The goodness of fit and the residuals plot are useful indicators along with the error function; each gives helpful information. A model may track the data overall and yet differ from it systematically; in that case the correlation between the model and the data will still be high. High correlation means the fit is good, but for the model to be accurate it must also produce a specific distribution of residuals: they should be spread along the independent axis and normally distributed around zero, with no systematic trends. The latter condition makes the residual, the difference between the data point and the estimate, more acceptable.
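
A crude sanity check of that residual condition, sketched with hypothetical observed and fitted values:

// Requires System.Linq. The arrays here are hypothetical illustrations.
double[] observed = { 2.1, 3.9, 6.2, 7.8, 10.1 };
double[] fitted   = { 2.0, 4.0, 6.0, 8.0, 10.0 };
double[] residuals = observed.Zip(fitted, (y, f) => y - f).ToArray();
double mean = residuals.Average();  // should be near zero
// Plotting residuals against the independent variable should show no trend.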

#codingexercise
Find the maximum product of an increasing subsequence
double GetProductIncreasingSubsequence(List<double> A, int n)
{
    // Assumes non-negative values, per the check below.
    Debug.Assert(A.All(x => x >= 0));

    // products[i] = maximum product of an increasing subsequence ending at i.
    var products = new List<double>();
    for (int i = 0; i < n; i++)
        products.Add(A[i]);
    Debug.Assert(products.Count == A.Count);

    for (int i = 1; i < n; i++)
        for (int j = 0; j < i; j++)
            if (A[i] > A[j] &&
                products[i] < (products[j] * A[i]))
                products[i] = products[j] * A[i];

    return products.Max();
}
A:        2 0 4
products: 2 0 8, so the maximum product is 8.

Wednesday, December 20, 2017

We were discussing the bootstrap method, confidence intervals, and the accuracy of model parameters. A model may have parameters that correspond to M dimensions. Since each of these dimensions allows variation, the probability distribution is a function defined on the M-dimensional space. With this probability distribution, we choose a region that contains a high percentage of the total distribution relative to the selected model parameters. This region is called the confidence interval. Confidence intervals may be quoted as levels, in percentages, depending on how much of the distribution is chosen, and the shape of the region may also be stated, for example as ellipsoids.
As we work with model parameters, some rules of thumb come into view. For example, if we define a progressive rate constant, it cannot turn out to be negative. Facts like these should not be violated.
It should be noted that the confidence interval is tied to the model: if the model is transformed from non-linear to linear, the confidence interval changes. Such linearization is often cited as an oversimplification because it violates assumptions, which leads to more error in the model parameter values.


We were saying that a model based on linearization is easy to work with but not accurate, especially when it comes from transforming a non-linear model.
Usually non-linear models are solved with an objective function such as the chi-square error function. The gradients of the chi-square function with respect to the parameters must approach zero at the minimum of the objective function. In the steepest descent method, the minimization proceeds iteratively until the objective stops decreasing.

#codingexercise 
Find the square root of a number.

double Sqrt(double n)
{
    // Babylonian (Heron's) method: repeatedly average x and n/x;
    // x converges to the square root of n.
    double x = n;
    double y = 1;
    double threshold = 1E-5;
    while (Math.Abs(x - y) > threshold)  // Abs also handles 0 < n < 1
    {
        x = (x + y) / 2;
        y = n / x;
    }
    return x;
}
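For example, with this method Sqrt(2.0) settles to about 1.41421 within a handful of iterations.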