In today's post we discuss the conjugate gradient (CG) method, a very popular technique for solving systems of the form Ax = b, where x is an unknown vector, b is a known vector, and A is a known square matrix. Iterative methods like CG are best suited to sparse matrices. If A is dense, it is usually better to factor A and then solve the equation by back substitution, because for dense matrices factorization and iteration cost roughly the same. For sparse matrices, on the other hand, factorization is much more expensive, and so is the back-solving step when it has to be repeated for different values of b.
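As a quick illustration (not from the original post), here is a minimal sketch using SciPy's built-in conjugate gradient solver on a small sparse, symmetric positive-definite system; the tridiagonal matrix and the size n are made up for the example:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# Hypothetical example: a sparse, symmetric positive-definite tridiagonal matrix.
n = 200
A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# CG only needs matrix-vector products with A, so A is never factored.
x, info = cg(A, b)
print("converged:", info == 0)

# For a dense A, factoring and back-substituting (here via np.linalg.solve) is the usual route.
x_direct = np.linalg.solve(A.toarray(), b)
# The difference is small but nonzero, since CG stops at a residual tolerance.
print("max difference vs direct solve:", np.abs(x - x_direct).max())
```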
Equations of the form Ax = b are linear, so the solution is the point where n hyperplanes, each of dimension n-1, intersect. The equation can also be recast as a quadratic form, f(x) = (1/2) x^T A x - b^T x + c, which we can view as a surface; when A is symmetric and positive-definite, the minimum point of this surface is the solution to Ax = b. Drawn as contours, the level sets of the quadratic form are concentric ellipsoidal curves, each of constant f(x). The gradient of the quadratic form points in the direction of steepest increase of f(x) and is orthogonal to the contour lines. Gradients come in useful because setting the gradient to zero locates the minimum of the function.
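To make this concrete, here is a small sketch (the 2x2 matrix A, the vector b, and c are chosen only for illustration) that evaluates the quadratic form and checks that its gradient, Ax - b for symmetric A, vanishes at the solution of Ax = b:

```python
import numpy as np

# Hypothetical 2x2 example: A is symmetric positive-definite, b and c are arbitrary.
A = np.array([[3.0, 2.0],
              [2.0, 6.0]])
b = np.array([2.0, -8.0])
c = 0.0

def f(x):
    """Quadratic form f(x) = 1/2 x^T A x - b^T x + c."""
    return 0.5 * x @ A @ x - b @ x + c

def grad_f(x):
    """Gradient of the quadratic form; for symmetric A this is Ax - b."""
    return A @ x - b

x_star = np.linalg.solve(A, b)                      # solution of Ax = b
print("gradient at the solution:", grad_f(x_star))  # ~ [0, 0]: f is minimized here
print("f at the solution:", f(x_star))
```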
If we wonder why we transform the simpler-looking equation Ax = b into the tougher-looking quadratic form, it is because the methods we are looking at were developed and are intuitively understood in terms of minimization problems, not in terms of intersecting hyperplanes.
It also helps to think about these problems in terms of eigenvectors and eigenvalues, because they serve as analysis tools; the steepest descent and CG methods do not calculate eigenvectors as part of their algorithms. We might recall from previous posts that an eigenvector v of a matrix B is a nonzero vector that does not rotate when B is applied to it, except perhaps to point in precisely the opposite direction. v may change length or reverse its direction, but it won't turn sideways. In other words, there is some scalar constant lambda such that Bv = lambda v, and this lambda is the eigenvalue.
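To make the relation Bv = lambda v concrete, here is a short sketch (the matrix B is made up for illustration) that checks it numerically with NumPy:

```python
import numpy as np

# Hypothetical symmetric matrix B; eigenvectors and eigenvalues are used only as analysis tools.
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(B)

# Each column v satisfies B v = lambda v: B only scales v, it never turns it sideways.
for lam, v in zip(eigenvalues, eigenvectors.T):
    print("B @ v      :", B @ v)
    print("lambda * v :", lam * v)
```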
Courtesy: Shewchuk (1994), An Introduction to the Conjugate Gradient Method Without the Agonizing Pain.