Jordan normal form

The Jordan normal form theorem states that every square matrix over an algebraically closed field (for instance, over the complex numbers) is similar to a Jordan normal form matrix, one of the form

J = \begin{bmatrix}J_1 & \; & \; \\ \; & \ddots & \; \\\; & \; & J_p \\ \end{bmatrix}

where each J_i is a square block of the form

J_i = \begin{bmatrix}\lambda_i & 1 & \; & \; \\ \; & \lambda_i & \ddots & \; \\ \; & \; & \ddots & 1 \\ \; & \; & \; & \lambda_i \end{bmatrix}.
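As a concrete sanity check, one can compute a Jordan form symbolically. A minimal SymPy sketch (the matrices below are made-up examples): we conjugate a known Jordan matrix by an invertible P0, then ask SymPy to recover the Jordan form of the result.

```python
from sympy import Matrix

# A known Jordan matrix: a 2x2 block for eigenvalue 2, plus eigenvalue 3.
J_true = Matrix([[2, 1, 0],
                 [0, 2, 0],
                 [0, 0, 3]])
P0 = Matrix([[1, 1, 0],
             [0, 1, 1],
             [1, 0, 1]])      # det = 2, so invertible
A = P0 * J_true * P0.inv()    # similar to J_true, but not obviously so
P, J = A.jordan_form()        # recovers J with A == P * J * P**-1
print(J)
```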

The form is constructed via generalized eigenvectors. Each Jordan block corresponds to an invariant subspace, and the generalized eigenvectors of a matrix organize into chains, each of which spans its own invariant subspace.

Let A be a linear transformation from V to V, where V is a finite-dimensional vector space.

Recall that the eigenvectors for \lambda are the nonzero elements of Ker(A - \lambda I). First observe that if v \in Ker(A - \lambda I), then also Av \in Ker(A - \lambda I), since A commutes with A - \lambda I; the same argument shows that Ker(A - \lambda I)^t is invariant under A for any natural number t. This gives us a way to identify larger invariant subspaces.

Let W_i = Ker(A - \lambda I)^i. Clearly W_i \subset W_{i+1}, and since V is finite-dimensional there is a smallest t at which W_t = W_{t+1}. From then on, the W_i must all be equal: if some v were in W_{s+1} but not W_s for s > t, then (A - \lambda I)^{s-t} v would lie in W_{t+1} but not in W_t, contradicting W_t = W_{t+1}.
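This stabilization can be watched happening in a small SymPy sketch (the matrix is a made-up example, a single Jordan block with eigenvalue 2):

```python
from sympy import Matrix, eye

# Single 3x3 Jordan block with eigenvalue 2: dim Ker(A - 2I)^i grows
# by one each step until it stabilizes at t = 3.
A = Matrix([[2, 1, 0],
            [0, 2, 1],
            [0, 0, 2]])
N = A - 2 * eye(3)
dims = [len((N**i).nullspace()) for i in range(1, 5)]
print(dims)  # [1, 2, 3, 3] -- the W_i stop growing at i = 3
```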

Next, we observe that Ker(A - \lambda I)^t \cap Im(A - \lambda I)^t = \{0\}. Suppose not: take a nonzero w in the intersection, say w = (A - \lambda I)^t u. Then (A - \lambda I)^{2t} u = 0 while (A - \lambda I)^t u = w \neq 0, so u \in W_{2t} but u \notin W_t, contradicting the stabilization at t.

The rank-nullity theorem says that, after pulling out Ker(A - \lambda I)^t for some eigenvalue \lambda, what remains is Im(A - \lambda I)^t: the two have trivial intersection and their dimensions sum to \dim V, so V is their direct sum. Since Im(A - \lambda I)^t is also invariant under A, we can run the same algorithm on it for another eigenvalue, and the full decomposition follows by induction.

The result is that if A has distinct eigenvalues \lambda_1, \lambda_2, \ldots, \lambda_k, there are a_1, a_2, \ldots, a_k such that the domain of A decomposes as

Ker(A - \lambda_1 I)^{a_1} \oplus Ker(A - \lambda_2 I)^{a_2} \oplus \cdots \oplus Ker(A - \lambda_k I)^{a_k}.
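The trivial intersection and the rank-nullity bookkeeping can be checked concretely. A SymPy sketch with a made-up matrix (eigenvalue 2 carrying a 2x2 Jordan block, plus eigenvalue 5, so t = 2 suffices for \lambda = 2):

```python
from sympy import Matrix, eye

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 5]])
N = (A - 2 * eye(3))**2
ker = N.nullspace()       # basis of Ker(A - 2I)^2, the generalized eigenspace
im = N.columnspace()      # basis of Im(A - 2I)^2
print(len(ker), len(im))  # 2 1 -- dimensions add up to dim V = 3
# Trivial intersection: the kernel and image bases are jointly independent.
print(Matrix.hstack(*(ker + im)).rank())  # 3
```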

Now, does Ker(A - \lambda I)^t necessarily correspond to an irreducible invariant subspace? No: the generalized eigenspace may split further into several Jordan blocks, which is the difference between algebraic and geometric multiplicity.

Now we show, as the second part of the proof (to be invoked on the components of the direct sum decomposition from the first part), that if T is nilpotent, meaning that T^n = 0 for some n, then there are v_1, \ldots, v_k and a_1, \ldots, a_k such that \{v_1, Tv_1, \ldots, T^{a_1-1}v_1, v_2, \ldots, v_k, Tv_k, \ldots, T^{a_k-1}v_k\} is a basis (so in particular linearly independent) for the domain of T, with \sum_i a_i = \dim V and \max(a_1, \ldots, a_k) = n. (Note that here n is taken smallest with T^n = 0.)

Start with the eigenvector space Ker T, with basis u_1, \ldots, u_k. Take preimages under T successively: at each iteration, for each chain whose top element has a nonempty preimage, extend the chain by a preimage vector with its Ker T component projected out. This accumulates a set of vectors of the format specified, terminating by the (n-1)th iteration since T^n = 0; nilpotence also guarantees the accumulated vectors exhaust the vector space.

It remains to show that these vectors are linearly independent. Induct on the iteration, with the basis of Ker T as the base case and the inductive hypothesis that the vectors accumulated before the current iteration are linearly independent. We must check that the set stays independent after adding the new preimages. First, the newly added vectors are linearly independent among themselves: if a nontrivial linear combination of them were zero, applying T would yield a nontrivial dependence among previously accumulated vectors, violating the hypothesis. Second, a nontrivial linear combination of the new vectors cannot equal a linear combination of the old ones: assume otherwise, and apply T just enough times for the old side to vanish; the new side then becomes a nontrivial linear combination of our designated basis of Ker T, which cannot vanish. This concludes the construction.
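The chain construction for a nilpotent T can be illustrated directly. A SymPy sketch with a made-up T (a single 3x3 Jordan block at 0, so n = 3):

```python
from sympy import Matrix, zeros

T = Matrix([[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]])
assert T**3 == zeros(3, 3) and T**2 != zeros(3, 3)  # nilpotent with n = 3
# Pick v outside Ker(T**2); its chain v, Tv, T**2 v then spans everything.
v = Matrix([0, 0, 1])
chain = Matrix.hstack(v, T * v, T**2 * v)
print(chain.rank())  # 3, so the chain is linearly independent: a basis
```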

Essentially we have chains of vectors, each terminating when the preimage operation no longer yields a new element. Applying this with T = A - \lambda I, we see that for each element u in a chain, Au = \lambda u + v, where v is the previous element of the chain, with v = 0 signifying that u is at the front of the chain, i.e., a genuine (non-generalized) eigenvector. The 1s along the superdiagonal of each Jordan block correspond to the coefficient 1 of v here.
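The relation Au = \lambda u + v along a chain can be verified element by element. A SymPy sketch with a made-up single Jordan block, \lambda = 2, walking the chain from the top:

```python
from sympy import Matrix, eye

A = Matrix([[2, 1, 0],
            [0, 2, 1],
            [0, 0, 2]])
lam = 2
u3 = Matrix([0, 0, 1])        # top of the chain
u2 = (A - lam * eye(3)) * u3
u1 = (A - lam * eye(3)) * u2  # front of the chain: a genuine eigenvector
assert A * u1 == lam * u1             # v = 0 at the front
assert A * u2 == lam * u2 + u1        # A u = lambda u + (previous element)
assert A * u3 == lam * u3 + u2
print("chain relations hold")
```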


