Eigenvalue Problems

We come across eigenvalue problems in the form of matrix equations or in the form of differential equations. These two are treated in different ways. Sometimes it is possible to convert a differential equation problem into a matrix problem by choosing a part of the differential operator which has a complete set of solutions. We can then write the complete differential operator as a matrix in this basis, and the eigenvalue problem reduces to a matrix problem. However, the complete set of solutions is usually infinite in number, so the matrix is infinite dimensional and we need to use a truncation scheme to solve the problem.
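
As an illustration of this truncation scheme, the sketch below (in Python with NumPy) treats the operator -d^2/dx^2 + x^2 + g x^4 in the basis of eigenfunctions of -d^2/dx^2 + x^2, for which the eigenvalues 2n+1 and the matrix elements of x are known. The basis size N and the coupling g are arbitrary choices made here for the example.

    import numpy as np

    N = 40          # number of basis functions kept (the truncation)
    g = 0.1         # strength of the x^4 term added to the oscillator operator

    # Matrix of x in the basis of eigenfunctions of -d^2/dx^2 + x^2:
    # <n|x|n+1> = <n+1|x|n> = sqrt((n+1)/2)
    n = np.arange(N - 1)
    X = np.zeros((N, N))
    X[n, n + 1] = X[n + 1, n] = np.sqrt((n + 1) / 2.0)

    # The unperturbed part is diagonal with eigenvalues 2n + 1; the x^4 part is
    # approximated by the fourth power of the truncated x matrix.
    H = np.diag(2.0 * np.arange(N) + 1.0) + g * np.linalg.matrix_power(X, 4)

    # Diagonalize the truncated matrix; the lowest eigenvalues converge as N grows.
    e = np.linalg.eigvalsh(H)
    print(e[:5])

Increasing N and watching the lowest eigenvalues stop changing is the usual check that the truncation is adequate.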

When the eigenvalue problem involves a differential operator, one usually has to specify boundary conditions on the solution. The problem is therefore more difficult than an initial value problem, for which the associated differential equation can be solved by one of the standard methods (see here for details). Solving an eigenvalue problem in differential equation form yields a value of a parameter appearing in the differential equation together with a solution function, and there is generally more than one such value and function. The values obtained are called eigenvalues and the corresponding solutions are called eigenfunctions.

The matrix eigenvalue problem is defined as

MX = λX

where M is (usually) a square matrix of dimension (say) n, λ is a number and X is a column vector having n elements. The eigenvalue problem can be written in the form

\left ( M - \lambda I \right ) X = 0

If a solution exists, there are n values of λ for which the equation can be solved, and correspondingly n column vectors X. Some of the eigenvalues may be degenerate, i.e. they may have the same value.
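
In practice such problems are solved with standard linear algebra routines. Below is a minimal sketch with NumPy; the 3 by 3 example matrix is arbitrary, and two of its eigenvalues happen to be degenerate, illustrating the remark above.

    import numpy as np

    M = np.array([[2.0, 0.0, 1.0],
                  [0.0, 3.0, 0.0],
                  [1.0, 0.0, 2.0]])

    # Returns the n eigenvalues and the corresponding column vectors X.
    lam, X = np.linalg.eig(M)

    # Each pair satisfies M X = lambda X, i.e. (M - lambda I) X = 0.
    for i in range(len(lam)):
        print(lam[i], np.allclose(M @ X[:, i], lam[i] * X[:, i]))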

Below we shall discuss the methods of solving these two classes of eigenvalue problems.

Differential Equations

To be specific, we shall consider the solution of the differential equation

- \frac{d^2}{dx^2} f(x) + V(x) f(x) = e f(x)

with the condition that the function vanishes as the magnitude of x approaches infinity. Here V(x) is a known function and e is the eigenvalue, which is unknown and is to be determined as part of the solution of the equation. This is actually the Schrödinger equation with V(x) being the potential. If we choose V(x) = x^2, it is the Schrödinger equation for the harmonic oscillator potential, for which all the eigenvalues (e's) and the corresponding eigenfunctions (f(x)'s) are known.
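
For this potential the eigenvalues are known to be e = 1, 3, 5, .... One quick way to check this numerically is a sketch on a simple finite-difference grid, which already turns the problem into a matrix one; the grid extent and spacing used here are arbitrary choices.

    import numpy as np

    # Discretize -f'' + x^2 f = e f on a uniform grid with f = 0 at the ends.
    L, n = 8.0, 801
    x = np.linspace(-L, L, n)
    h = x[1] - x[0]

    main = 2.0 / h**2 + x**2              # diagonal of the difference operator
    off = -np.ones(n - 1) / h**2          # sub/super-diagonals from -f''
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

    e = np.linalg.eigvalsh(H)
    print(e[:4])      # should be close to the known values 1, 3, 5, 7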

If the potential is symmetric, the problem becomes a bit simpler. In this case the problem is symmetric about x = 0, so one can argue that the solution has to be either symmetric or antisymmetric. Further, assuming a power series solution near x = 0, f(x) \approx c for even solutions and f(x) \approx c\,x for odd solutions, where c is a constant. Starting with this ansatz, we can integrate the equation, for a given value of e, using one of the standard methods (see here for the methods of solving differential equations). The problem is that there is no guarantee that the solution will approach zero at large x (the boundary condition at infinity); in fact, for most values of e it will not. So obtaining the solution boils down to determining e as well as the function which satisfies the boundary conditions at infinity as well as at x = 0. Below we shall discuss some of the methods of solving such problems.

Shooting Method

In the shooting method, we start from a trial value of e at x = 0. Let us consider that we are looking for a symmetric solution. Then, near x = 0, we can consider a power series solution f_e(x) = 1 + \alpha_2 x^2 + \cdots. Putting this in the differential equation and comparing the terms independent of x (with V(x) = x^2, which vanishes at x = 0), we get 2\alpha_2 = -e. Using this trial solution we can then integrate the differential equation to a large value of x. Generally, the solution will not approach zero but will be an increasing or decreasing function of x. We then adjust the value of e till we get the right behaviour of f(x). This is basically the shooting method. The name comes from the method used by gunners in shooting at a target: a trial shot is fired with a trial value of the elevation of the gun, and the distance by which the shot misses the target is determined. The elevation is then corrected accordingly and another shot is fired, and the process is repeated till the shot hits the target.

What remains is to devise an algorithm for changing the trial value of e so as to reach the correct e as well as the correct f_e(x). For the harmonic oscillator problem (and generally for one dimensional quantum mechanics problems), we know some general properties of the eigenfunctions. For example, the ground state has no nodes, the first excited state has one node (that is, the eigenfunction vanishes at exactly one point), and so on. Using this property, we can zero in on the ground state by demanding that the eigenfunction has no node at any finite value of x, and similarly for the higher states.

Let us discuss a little the general methodology of determining the correct eigenvalue. Consider a set of N differential equations such that n boundary conditions are specified at a point x and (N-n) are specified at a point y. We want to obtain a solution which satisfies these conditions and, at the same time, determine the value of the eigenvalue e. In the shooting method, we start with N+1 quantities at x (N boundary values, of which n are given and N-n are chosen, and one trial value of e) and obtain the solution at y by integrating the equations. In general the resulting values at y will not match the N-n required boundary conditions there. We can consider an (N-n)-dimensional space in which the chosen boundary values at x denote a point. The process of integrating the differential equations to y moves this point to another point in this space, and we have to ensure that the final point coincides with the point required by the boundary conditions at y. This problem is quite similar to the problem of finding the zero of a function.

The method above solves the problem from one end and tries to obtain the correct boundary conditions at the other end.
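
A minimal sketch of this procedure for the symmetric (even) states of the harmonic oscillator example, assuming Python with SciPy; the integration range x_max, the bracketing interval for e and the tolerances are arbitrary choices.

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import brentq

    def endpoint(e, x_max=6.0):
        """Integrate -f'' + x^2 f = e f from x = 0 with the even-state start
        f(0) = 1, f'(0) = 0, and return the value of f at x_max."""
        def rhs(x, y):
            f, fp = y
            return [fp, (x**2 - e) * f]
        sol = solve_ivp(rhs, (0.0, x_max), [1.0, 0.0], rtol=1e-9, atol=1e-12)
        return sol.y[0, -1]

    # f(x_max) changes sign as e crosses an even eigenvalue, so the "shooting"
    # reduces to bracketing that sign change and finding the zero.
    e0 = brentq(endpoint, 0.5, 1.5)
    print(e0)   # should come out close to the known ground-state value e = 1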

Relaxation Method

In the relaxation method one starts with a trial solution which satisfies the boundary conditions to be imposed but is otherwise only approximate; that is, it does not satisfy the given differential equation. Typically, one introduces a number of points in the configuration space and expresses the differential equation as a set of difference equations to be satisfied at these points. This reduces the differential equation to a set of linear equations. These are then solved iteratively (relaxing the solution), insisting that the boundary conditions are always satisfied.
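
The sketch below is one simple relaxation-style variant for the harmonic oscillator example, not necessarily the Newton-type iteration used in standard relaxation codes: the grid values are repeatedly nudged against the residual of the difference equations, with e estimated from the current solution. The grid, the step size tau and the iteration count are arbitrary choices.

    import numpy as np

    # Grid and potential for the harmonic oscillator example, V(x) = x^2.
    L, n = 8.0, 401
    x = np.linspace(-L, L, n)
    h = x[1] - x[0]
    V = x**2

    def apply_H(f):
        """Second-difference form of -f'' + V f on the interior points;
        the end points stay pinned to zero (the boundary condition)."""
        Hf = np.zeros_like(f)
        Hf[1:-1] = (-f[2:] + 2.0 * f[1:-1] - f[:-2]) / h**2 + V[1:-1] * f[1:-1]
        return Hf

    # Trial solution: satisfies the boundary conditions but not the equation.
    f = np.exp(-x**2)
    f[0] = f[-1] = 0.0
    tau = 0.4 * h**2                          # small step, kept below the stability limit
    for _ in range(20000):
        Hf = apply_H(f)
        e = np.dot(f, Hf) / np.dot(f, f)      # current estimate of the eigenvalue
        f = f - tau * (Hf - e * f)            # relax the grid values towards the lowest state
        f[0] = f[-1] = 0.0                    # keep the boundary condition exact
        f /= np.sqrt(np.dot(f, f) * h)        # renormalize
    print(e)   # should relax towards the lowest eigenvalue, e = 1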

Matrix Eigenvalue Problem

As discussed earlier, in the matrix eigenvalue problem AX = λX we are interested in computing the values of λ which solve the eigenvalue equation and then computing the corresponding eigenvectors X.

Some Definitions and Results about Matrices

If the matrix has dimension n, the eigenvalues are determined by solving the equation | A − λI | = 0, where I is the identity matrix. The determinant in the equation above is an nth degree polynomial in λ, and therefore the equation has, in general, n complex roots.

If the matrix A is real, the roots are either real or occur in complex conjugate pairs. Further, if the matrix A is symmetric, A^T = A, the roots are all real. For complex matrices, if A is Hermitian or self adjoint, A^\dagger = A, then also all the eigenvalues are real.
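
Both statements are easy to check numerically. The sketch below, with an arbitrary symmetric example matrix, forms the characteristic polynomial | A − λI |, finds its roots and compares them with a standard eigensolver; all the roots come out real, as expected for a symmetric matrix.

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

    coeffs = np.poly(A)        # coefficients of the characteristic polynomial |A - lambda I|
    roots = np.roots(coeffs)   # its n roots, i.e. the eigenvalues
    print(np.sort(roots))
    print(np.sort(np.linalg.eigvals(A).real))   # the same values from a standard eigensolver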

If the inverse of a real matrix A is its transpose, A^T A = A A^T = I, the matrix is called orthogonal. Similarly, if the inverse of a complex matrix is its Hermitian adjoint, A A^\dagger = A^\dagger A = I, the matrix is called unitary.

In the equation AX = λX we have defined the right eigenfunctions X. Equivalently, we can define the left eigenfunctions Y (which are row vectors, as opposed to the column vectors X) by the equation YA = λY. Since there are n eigenvalues λ, there are n left and n right eigenfunctions, one of each associated with every eigenvalue. One can show that these eigenvectors form an orthogonal set. That is, Y_{\lambda'} X_{\lambda} = \delta_{\lambda' \lambda} C_{\lambda}.

For symmetric (Hermitian) matrices, the left and right eigenfunctions are transposes (adjoints) of each other.

Consider two matrices Z_R and Z_L formed from the right and left eigenfunctions: Z_R is formed by placing the right eigenfunctions X_{\lambda_i} (i going from 1 to n) side by side as columns, and Z_L is formed by stacking the corresponding left eigenfunctions Y_{\lambda_i} as rows. We then have Z_L A Z_R = \mathrm{diag}(\lambda_i)\, Z_L Z_R = \mathrm{diag}(\lambda_i) if we normalize the left and right eigenfunctions (i.e. ensure that Y_{\lambda'} X_{\lambda} = \delta_{\lambda' \lambda}). Here \mathrm{diag}(\lambda_i) is a diagonal matrix with the diagonal elements being the eigenvalues.

The result above is important because it says that

  1. Z_R and Z_L are inverses of each other.
  2. Multiplying A by Z_R from the right and by Z_L from the left diagonalizes the matrix.
  3. The diagonalized matrix gives the eigenvalues of the matrix.
  4. And, of course, Z_R and Z_L give the right and left eigenfunctions, respectively.

This means that the problem of finding the eigenvalues and eigenfunctions of a matrix boils down to finding the matrices by which A is to be multiplied from the left and the right so that it is diagonalized. Further, because Z_L is the inverse of Z_R, the diagonalization is a similarity transformation.
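
A small numerical illustration of these statements, assuming Python with NumPy and SciPy; the non-symmetric example matrix is arbitrary.

    import numpy as np
    from scipy.linalg import eig

    A = np.array([[1.0, 2.0, 0.0],
                  [0.5, 1.0, 1.0],
                  [0.0, 1.0, 2.0]])

    # Right eigenvectors X as columns of vr, left eigenvectors Y from vl.
    w, vl, vr = eig(A, left=True, right=True)
    Z_R = vr
    Z_L = vl.conj().T                          # rows are the left eigenfunctions Y
    Z_L = Z_L / np.diag(Z_L @ Z_R)[:, None]    # normalize so that Y_i X_j = delta_ij

    print(np.round(Z_L @ A @ Z_R, 10))         # diagonal matrix of the eigenvalues
    print(w)                                   # eigenvalues, for comparison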

Methodology of Diagonalizing Matrix

We have seen that the problem of finding the eigenvalues and eigenfunctions of a matrix A is the same as that of finding a matrix Z_R such that Z_R^{-1} A Z_R = \mathrm{diag}(\lambda_i), that is, finding a matrix Z_R which diagonalizes A.

The strategy used in most matrix diagonalization codes is to construct the matrix which diagonalizes A as a product of a number of matrices, Z_R = P_1 P_2 \cdots. That is, the diagonalization is achieved by a series of similarity transformations, with each transformation 'nudging' the matrix towards diagonal form.

One set of methods attempts to diagonalize the matrix by making some of its off-diagonal elements vanish. Repeating this procedure on different elements each time, the matrix approaches a diagonal matrix after a sequence of similarity transformations. Unfortunately this procedure, although simple and straightforward, does not reduce the matrix to diagonal form in a finite number of steps. This is because matrix elements which were made zero in earlier similarity transformations may become non-zero in later ones, although they do not grow. Repeated application of such methods does eventually yield a diagonalized matrix, but it may take a large number of steps.

The other set of methods are factorization methods, in which the matrix is factored into left and right factors, A = F_L F_R or F_L^{-1} A = F_R. Then F_R F_L = F_L^{-1} A F_L, which is a similarity transformation on A.
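
A well-known factorization method of this kind is the QR iteration. The sketch below applies the plain (unshifted) version to an arbitrary symmetric example matrix; practical codes add shifts and first reduce the matrix to tridiagonal or Hessenberg form, which is omitted here.

    import numpy as np

    A = np.array([[4.0, 1.0, 2.0],
                  [1.0, 3.0, 0.0],
                  [2.0, 0.0, 1.0]])

    T = A.copy()
    for _ in range(200):
        # Factor into left and right parts, then swap the order:
        # T = Q R  ->  R Q = Q^T T Q, a similarity transformation on T.
        Q, R = np.linalg.qr(T)
        T = R @ Q

    print(np.round(T, 6))                    # approaches a diagonal matrix (A is symmetric here)
    print(np.sort(np.linalg.eigvalsh(A)))    # eigenvalues from a library routine, for comparison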

Jacobi Transformations

The Jacobi transformation is essentially a rotation in two dimensions; each rotation makes one pair of off-diagonal elements zero. For simplicity, consider a symmetric 2 by 2 matrix

\left (  \begin{array}{cc} a & b \\ b & c \\ \end{array} \right ).

Consider a transformation which is a rotation of angle θ in two dimensions. Then the rotation matrix is

\left ( \begin{array}{cc} \cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \\ \end{array}\right ).

After the similarity transformation P^T A P, where P denotes the rotation matrix above, we get

\left ( \begin{array}{cc} a' & b' \\ b' & c' \end{array} \right )

with

a' = a \cos^2\theta - 2 b \sin\theta \cos\theta + c \sin^2\theta, \qquad b' = \frac{a - c}{2} \sin 2\theta + b \cos 2\theta, \qquad c' = a \sin^2\theta + 2 b \sin\theta \cos\theta + c \cos^2\theta.

The angle θ is chosen so that b' vanishes, i.e. \tan 2\theta = 2b/(c - a). Repeating such rotations on the off-diagonal elements of a larger symmetric matrix drives it towards diagonal form, as described above.
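
A short numerical check of such a rotation, assuming the P^T A P convention used above; the entries a, b, c are arbitrary.

    import numpy as np

    a, b, c = 3.0, 1.0, 2.0
    A = np.array([[a, b],
                  [b, c]])

    # Choose the rotation angle so that the off-diagonal element b' vanishes:
    # tan(2*theta) = 2b / (c - a).
    theta = 0.5 * np.arctan2(2.0 * b, c - a)
    P = np.array([[np.cos(theta),  np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])

    A_prime = P.T @ A @ P
    print(np.round(A_prime, 12))    # diagonal; the entries are the two eigenvalues
    print(np.linalg.eigvalsh(A))    # check against a library eigensolver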
