When discussing the conjugate gradient method in the previous post, we mentioned several items. Let's put them together now. First, conjugate gradient is in fact a conjugate direction method, yet we don't have to keep track of the previous search vectors. Each subspace is built incrementally by applying the matrix to a vector. The name of the method is something of a misnomer, since the search directions are not all gradients.
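As a rough sketch of that incremental construction (the helper names, dense arrays, and a symmetric positive-definite A are all assumptions here, not code from the earlier post), each new basis vector of the underlying subspace comes from one matrix-vector product with the previous one, so nothing older than the current vector ever needs to be stored:

// Builds the first k basis vectors {r0, A*r0, A*A*r0, ...} of the expanding subspace.
// CG itself never materializes this list; it only keeps the current direction.
static double[][] KrylovBasis(double[][] A, double[] r0, int k)
{
    var basis = new double[k][];
    basis[0] = (double[])r0.Clone();
    for (int i = 1; i < k; i++)
        basis[i] = MatVec(A, basis[i - 1]);   // one A*v per new direction
    return basis;
}

static double[] MatVec(double[][] A, double[] v)
{
    var result = new double[A.Length];
    for (int i = 0; i < A.Length; i++)
        for (int j = 0; j < v.Length; j++)
            result[i] += A[i][j] * v[j];
    return result;
}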
We will next evaluate the convergence of the conjugate gradient method. We saw that the method progresses linearly, so why should we care about convergence? Because a few kinds of error creep in as the iterations proceed and impede it. One is floating-point error; another is that the search vectors gradually lose their A-orthogonality. The first problem can occur just as well in Steepest Descent, and there are ways to overcome it. The second is not as easy, but there are ways to run the method as an iterative procedure in which corrections are applied. The convergence analysis therefore helps us evaluate running the method on sample datasets where it is not even possible to complete all n iterations.

The first iteration of CG is identical to the first iteration of Steepest Descent, so the same conditions apply here too. Looking at the iteration equation, the initial search direction is simply the initial residual, obtained by applying A to x0 and subtracting the result from b; every subsequent direction is built from the new residual and the current search vector. For convergence, start with the initial direction along the axes: the residual then points to the center of the ellipsoid, giving one-step convergence to the solution. For more general convergence, we rely on the fact that we can find n A-orthogonal search vectors and express the error term as a linear combination of them. With that, we can find the next residual, with some compromise. In fact, the error term is a weighted average of the step lengths of the previous residuals. Because it is weighted, some of the contributions from residuals with shorter step lengths might increase, which is why this method is called a rougher. By contrast, the Jacobi method is a smoother, because every eigenvector component is reduced on every iteration.
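Putting the whole iteration together, here is a minimal CG sketch (a dense, array-based illustration under the assumption that A is symmetric positive definite; it is not the code from the earlier post). The first direction d0 is just the first residual b - A*x0, so the very first step coincides with Steepest Descent, and the A-orthogonality of the later directions is exactly what floating-point error slowly erodes:

using System;

static class ConjugateGradientSketch
{
    // Solves A*x = b for a symmetric positive-definite A, starting from x0.
    public static double[] Solve(double[][] A, double[] b, double[] x0, int maxIter, double tol)
    {
        int n = b.Length;
        var x = (double[])x0.Clone();
        var Ax = MatVec(A, x);
        var r = new double[n];
        for (int j = 0; j < n; j++) r[j] = b[j] - Ax[j];    // r0 = b - A*x0
        var d = (double[])r.Clone();                        // d0 = r0: first step is Steepest Descent
        double rsOld = Dot(r, r);

        for (int i = 0; i < maxIter && Math.Sqrt(rsOld) > tol; i++)
        {
            var Ad = MatVec(A, d);
            double alpha = rsOld / Dot(d, Ad);              // step length along the current direction
            for (int j = 0; j < n; j++)
            {
                x[j] += alpha * d[j];
                r[j] -= alpha * Ad[j];                      // residual updated incrementally
            }
            double rsNew = Dot(r, r);
            double beta = rsNew / rsOld;                    // keeps d(i+1) A-orthogonal to d(i)
            for (int j = 0; j < n; j++)
                d[j] = r[j] + beta * d[j];                  // only the current direction is stored
            rsOld = rsNew;
        }
        return x;
    }

    static double[] MatVec(double[][] A, double[] v)
    {
        var result = new double[A.Length];
        for (int i = 0; i < A.Length; i++)
            for (int j = 0; j < v.Length; j++)
                result[i] += A[i][j] * v[j];
        return result;
    }

    static double Dot(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++) sum += a[i] * b[i];
        return sum;
    }
}

In exact arithmetic the loop terminates in at most n iterations, which is the guarantee the convergence analysis above is really about; in floating point the loss of A-orthogonality is what makes the error bounds worth studying.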
#codingexercise
// requires: using System.Linq;
int GetCount(int[] A)
{
    if (A == null) return 0;
    return A.Length;                                    // number of elements
}

int GetCountDuplicates(int[] A)
{
    if (A == null) return 0;
    return A.GroupBy(x => x).Count(g => g.Count() > 1); // values that appear more than once
}