Friday, November 14, 2014

#codingexercise

int GetCountDuplicatesMin(int[] A)
{
    if (A == null) return 0;
    // CountDuplicatesMin is an extension method on int[]; see the sketch below.
    return A.CountDuplicatesMin();
}
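
The CountDuplicatesMin call above is not defined in this post. As a minimal sketch, assuming it is meant to be an extension method that counts how many times the smallest value occurs in the array, it could look like this:

static class ArrayExtensions
{
    // Hypothetical helper for the exercise above: counts how many
    // times the minimum value appears in the array.
    public static int CountDuplicatesMin(this int[] A)
    {
        if (A == null || A.Length == 0) return 0;
        int min = A[0];
        foreach (int x in A)
            if (x < min) min = x;
        int count = 0;
        foreach (int x in A)
            if (x == min) count++;
        return count;
    }
}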

We continue to look at picking perfect polynomials for the convergence of conjugate gradients. We have seen that at each step of CG we can express the error term as a polynomial in the matrix (or, per eigenvalue, a scalar) applied to the initial error term, and that, as with the Steepest Descent method, the initial error term is a linear combination of orthogonal unit eigenvectors. We also noticed that with each iteration the convergence is only as good as the polynomial's value at the worst eigenvalue. For the first iteration the polynomial is the constant 1, a horizontal line; suppose the matrix has two distinct eigenvalues. In the second iteration the polynomial is a line whose slope is determined by the two eigenvalues. In the third iteration the polynomial is a parabola of degree two that equals 1 at zero and 0 at both eigenvalues, so the error vanishes and the iterations terminate; the parabola is fit to these three points. The computed error terms may be plotted close to these eigenvalues on these polynomial representations.
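As a concrete restatement of this argument (a sketch in standard CG notation; the symbols $e_{(i)}$, $P_i$, $\xi_j$, $v_j$, and $\lambda_j$ are introduced here for illustration and are not from the original post): the error after $i$ iterations can be written as

$$e_{(i)} = P_i(A)\, e_{(0)}, \qquad P_i(0) = 1,$$

where $P_i$ is a polynomial of degree $i$. Writing the initial error as $e_{(0)} = \sum_j \xi_j v_j$ over unit eigenvectors $v_j$ with eigenvalues $\lambda_j$ gives

$$e_{(i)} = \sum_j \xi_j\, P_i(\lambda_j)\, v_j,$$

so each component is damped by the factor $P_i(\lambda_j)$, and the overall error is only as small as the worst of these factors. With two distinct eigenvalues $\lambda_1$ and $\lambda_2$, the degree-two polynomial with $P_2(0) = 1$ and $P_2(\lambda_1) = P_2(\lambda_2) = 0$ drives the error to zero, which is why the iterations terminate at that step.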
We see that the method converges after n iterations, and more quickly if there are duplicated eigenvalues. With infinite floating-point precision, it would take at most n iterations to converge to an exact solution. In fact, it might be quicker if the initial error term is already A-orthogonal to some of the eigenvectors of A, but floating-point roundoff errors may re-introduce these eigenvectors. We also find that convergence is quicker when the eigenvalues are clustered together. If we know something about these eigenvalues, we can construct a polynomial tailored to them and converge faster. The more general case is when the eigenvalues are evenly distributed between Lambda min and Lambda max, the number of distinct eigenvalues is large, and floating-point roundoff occurs.
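The usual way to make this precise (again a sketch in standard notation, not taken verbatim from the post) is the energy-norm bound

$$\|e_{(i)}\|_A^2 \;\le\; \min_{P_i,\; P_i(0)=1} \;\max_{\lambda \in \Lambda(A)} \big(P_i(\lambda)\big)^2 \;\|e_{(0)}\|_A^2,$$

where $\Lambda(A)$ is the set of eigenvalues of $A$. If $A$ has only $k$ distinct (or tightly clustered) eigenvalues, a polynomial of degree $k$ that is zero (or nearly zero) at each of them makes the right-hand side vanish (or nearly so), which is why duplicated or clustered eigenvalues give faster convergence.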
This range leads to another interesting idea: the Chebyshev polynomial. This is a polynomial that minimizes the worst-case (maximum) value of the error bound over the whole range from Lambda min to Lambda max rather than at a finite number of points. On the interval from -1 to 1 the Chebyshev polynomial oscillates between -1 and 1, so its magnitude never exceeds one there; outside that interval it grows as fast as possible, making it the largest such polynomial of its degree. When we use a shifted and scaled Chebyshev polynomial to bound the error of CG, we get a convergence bound in terms of the condition number, which can be plotted as a curve that rises with the condition number. This further shows that the first step of CG is identical to a step of Steepest Descent. After that, convergence is usually faster than Steepest Descent because of a good eigenvalue distribution or a good starting point, though this does not mean that every individual iteration converges faster. That said, the rate of convergence is typically much higher initially and more gradual later.
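For reference, the Chebyshev polynomial of degree $i$ can be written (a standard formula, with $\kappa = \lambda_{max}/\lambda_{min}$ denoting the condition number; none of these symbols appear in the original post):

$$T_i(\omega) = \tfrac{1}{2}\left[\big(\omega + \sqrt{\omega^2 - 1}\big)^i + \big(\omega - \sqrt{\omega^2 - 1}\big)^i\right],$$

which stays between $-1$ and $1$ on the interval $[-1, 1]$ and grows as quickly as possible outside it. Choosing

$$P_i(\lambda) = \frac{T_i\!\left(\dfrac{\lambda_{max} + \lambda_{min} - 2\lambda}{\lambda_{max} - \lambda_{min}}\right)}{T_i\!\left(\dfrac{\lambda_{max} + \lambda_{min}}{\lambda_{max} - \lambda_{min}}\right)}$$

in the bound above leads to the familiar estimate

$$\|e_{(i)}\|_A \;\le\; 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^i \|e_{(0)}\|_A,$$

so convergence per iteration slows as the condition number $\kappa$ grows, which is the curve described above.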
