This post proves the existence of the Jordan form by decomposing a vector space $V$ into a direct sum of generalized eigenspaces.

In this post we will work over algebraically closed fields.

The diagonalization problem asks whether a linear transformation admits an eigenbasis. Equivalently, it asks when an $n \times n$ matrix $A$ is similar to a diagonal matrix, i.e. $A = PDP^{-1}$ for some invertible matrix $P$ and diagonal matrix $D$.
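As a quick illustration (a toy symmetric matrix of my own choosing, not from the post), sympy's `diagonalize` produces exactly the pair $P, D$ above:

```python
import sympy as sp

# Hypothetical 2x2 example: eigenvalues 1 and 3 are distinct,
# so an eigenbasis exists and A is diagonalizable.
A = sp.Matrix([[2, 1],
               [1, 2]])

P, D = A.diagonalize()       # returns P, D with A = P * D * P**-1
assert A == P * D * P.inv()  # the similarity A = P D P^{-1}
assert D.is_diagonal()
print(D)                     # diagonal matrix of the eigenvalues
```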

Let the characteristic polynomial of $A$ be $p_A(t) = \prod_{i=1}^{k} (t - \lambda_i)^{n_i}$, where $\lambda_i$ are the distinct eigenvalues of $A$ and $n_i$ are the respective algebraic multiplicities.
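For a concrete instance (a hypothetical upper-triangular matrix, so the eigenvalues can be read off the diagonal), sympy computes and factors the characteristic polynomial into the form above:

```python
import sympy as sp

t = sp.symbols('t')
# Toy example: eigenvalues 2 (algebraic multiplicity 2) and 5.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 5]])

p = A.charpoly(t).as_expr()
# p factors as (t - 2)^2 * (t - 5), so n_1 = 2 and n_2 = 1.
assert sp.expand(p - (t - 2)**2 * (t - 5)) == 0
print(sp.factor(p))
```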

Theorem 1 Let $T:V \rightarrow V$ be an endomorphism of a finite-dimensional vector space $V$ (over an algebraically closed field). Let its characteristic polynomial be $p_T(t) = \prod_{i=1}^k \left (t - \lambda_i \right)^{n_i}$. Then $V = \bigoplus_{i=1}^k \mathrm{Ker} \, \left(T - \lambda_i I\right)^{n_i}$.

Proof

1. The sum part

Notice that the polynomials $\prod_{i=2}^k \left(t - \lambda_i \right)^{n_i}$, $\left(t - \lambda_1\right)^{n_1} \prod_{i=3}^k \left(t - \lambda_i \right)^{n_i}$, …, $\prod_{i=1}^{k-1} \left(t - \lambda_i \right)^{n_i}$ have greatest common divisor $1$: the $j$-th of them omits exactly the factor $(t - \lambda_j)^{n_j}$, so no linear factor divides them all. Therefore we can find polynomials $q_1(t), \cdots, q_k(t)$ such that

$\displaystyle q_1(t) \prod_{i=2}^k \left(t - \lambda_i \right)^{n_i} + q_2(t) \left(t - \lambda_1\right)^{n_1} \prod_{i=3}^k \left(t - \lambda_i \right)^{n_i} + \cdots + q_k(t) \prod_{i=1}^{k-1} \left(t - \lambda_i \right)^{n_i} = 1\;\;\;\;\;\; (1)$

For any vector $v \in V$, notice that $\prod_{i=2}^k \left(T - \lambda_i I \right)^{n_i}v \in \mathrm{Ker} \, \left(T - \lambda_1 I\right)^{n_1}$, by the Cayley-Hamilton theorem. Analogously we have the other inclusions. Therefore, applying (1) to $v$ (with $t$ replaced by $T$), we have

$\displaystyle q_1(T) \prod_{i=2}^k \left(T - \lambda_i I \right)^{n_i}v + q_2(T) \left(T - \lambda_1 I\right)^{n_1} \prod_{i=3}^k \left(T - \lambda_i I \right)^{n_i}v + \cdots + q_k(T) \prod_{i=1}^{k-1} \left(T - \lambda_i I \right)^{n_i}v = v\;\;\;\;\;\;(2)$

This gives the required decomposition.
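The identity (2) can be checked numerically on a small example. In the sketch below (a toy matrix of my own choosing with characteristic polynomial $t^2(t-1)$, and Bézout coefficients $q_1 = -(t+1)$, $q_2 = 1$ found by hand), the two terms of (2) act as projections onto the two generalized eigenspaces:

```python
import numpy as np

# Toy example (not from the post): char poly of A is t^2 * (t - 1),
# so the eigenvalues are 0 (n_1 = 2) and 1 (n_2 = 1).
A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 1.]])
I = np.eye(3)

# Bezout identity for f_1 = (t - 1) and f_2 = t^2:
#   -(t + 1)(t - 1) + 1 * t^2 = 1,  i.e. q_1 = -(t + 1), q_2 = 1.
P1 = -(A + I) @ (A - I)   # q_1(A) f_1(A): projects onto Ker A^2
P2 = A @ A                # q_2(A) f_2(A): projects onto Ker (A - I)

assert np.allclose(P1 + P2, I)        # identity (2): the terms sum to v
assert np.allclose(A @ A @ P1, 0)     # image of P1 lies in Ker A^2
assert np.allclose((A - I) @ P2, 0)   # image of P2 lies in Ker (A - I)
```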

2. The direct part

Suppose for contradiction that

$\displaystyle a_1v_1 + \cdots + a_mv_m = 0\;\;\;\;\;\;(3)$

is the shortest possible relation in which every $v_j \in \mathrm{Ker} \, \left(T - \lambda_j I\right)^{n_j}$ is nonzero and every $a_j \neq 0$, with $m \geq 2$ (and $m \leq k$, after relabeling the eigenvalues). Applying $\left(T - \lambda_1 I\right)^{n_1}$ to both sides kills the first term and gives

$\displaystyle a_2 \left(T - \lambda_1 I\right)^{n_1} v_2 + \cdots + a_m \left(T - \lambda_1 I\right)^{n_1} v_m = 0\;\;\;\;\;\;(4)$

(Note that we cannot simply apply $T$, since the $v_j$ are generalized eigenvectors, not eigenvectors.) Each $\mathrm{Ker} \, \left(T - \lambda_j I\right)^{n_j}$ is invariant under $T - \lambda_1 I$, and on it $T - \lambda_1 I = (T - \lambda_j I) + (\lambda_j - \lambda_1) I$ is a nilpotent map plus a nonzero scalar, hence invertible; here we use that $\lambda_1, \cdots, \lambda_k$ are distinct. Therefore each vector $\left(T - \lambda_1 I\right)^{n_1} v_j$ appearing in (4) is again a nonzero element of $\mathrm{Ker} \, \left(T - \lambda_j I\right)^{n_j}$, so (4) is a shorter relation of the same kind. This contradicts the minimality of $m$, so the sum is direct. $\Box$

Corollary 1 $T$ is diagonalizable if and only if for each $\lambda_i$, $\mathrm{Ker} \, (T - \lambda_i I)^{n_i} = \mathrm{Ker} \, (T - \lambda_i I)$.

This is because being able to find an eigenbasis means exactly that $V = \bigoplus_{i=1}^k \mathrm{Ker} \, \left(T - \lambda_i I\right)$.

Proposition 2 $\mathrm{dim} \, \mathrm{Ker} \, \left(T - \lambda_i I\right)^{n_i} = n_i$.

This will be clear using Jordan form.

Corollary 2 $T$ is diagonalizable if and only if for each $\lambda_i$, $n_i = \mathrm{dim} \,\mathrm{Ker} \, (T - \lambda_i I)$.

$\mathrm{dim} \,\mathrm{Ker} \, (T - \lambda_i I)$ is usually called the geometric multiplicity of the eigenvalue $\lambda_i$.  Thus the above corollary may be rephrased as

Corollary 2′ $T$ is diagonalizable if and only if for each $\lambda_i$, the algebraic multiplicity coincides with the geometric multiplicity.
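The criterion of Corollary 2′ is easy to test in sympy, whose `eigenvects` returns each eigenvalue together with its algebraic multiplicity and a basis of the corresponding eigenspace. A minimal sketch with two hypothetical matrices sharing the characteristic polynomial $(t-1)^2$:

```python
import sympy as sp

D = sp.Matrix([[1, 0], [0, 1]])   # diagonalizable (it is diagonal)
J = sp.Matrix([[1, 1], [0, 1]])   # a Jordan block, not diagonalizable

for M in (D, J):
    for eigval, alg_mult, basis in M.eigenvects():
        geo_mult = len(basis)     # dimension of Ker(M - eigval * I)
        print(eigval, alg_mult, geo_mult)
# For D: eigenvalue 1 has algebraic = geometric multiplicity = 2.
# For J: eigenvalue 1 has algebraic multiplicity 2 but geometric 1.
assert D.is_diagonalizable()
assert not J.is_diagonalizable()
```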

Proposition 2 also gives the following quick corollary.

Corollary 3 The algebraic multiplicity is always no less than the geometric multiplicity.

So the diagonalization problem boils down to whether $\mathrm{Ker} \, (T - \lambda_i I)^{n_i} = \mathrm{Ker} \, (T - \lambda_i I)$. This equality can fail. The natural question is then: can we choose a nice basis of each $\mathrm{Ker} \, (T - \lambda_i I)^{n_i}$ so that, when $T$ is diagonalizable, the chosen basis is an eigenbasis?

One answer to this question is the Jordan form. The Jordan canonical form picks a basis in which the matrix of $T$ is built out of Jordan blocks of the form $\left(\begin{array}{ccccc}\lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda \\ \end{array}\right)$
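For computations one can ask sympy for the Jordan form directly; a sketch with a hypothetical non-diagonalizable matrix:

```python
import sympy as sp

# Toy example: one 2x2 Jordan block for eigenvalue 2 and
# one 1x1 block for eigenvalue 3.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])

P, J = A.jordan_form()       # A = P * J * P**-1 with J in Jordan form
assert A == P * J * P.inv()
print(J)
```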

Theorem 2 (Existence of Jordan form) Let $T:V \rightarrow V$ be an endomorphism of a finite-dimensional vector space $V$ (over an algebraically closed field). Let its characteristic polynomial be $p_T(t) = \prod_{i=1}^k \left (t - \lambda_i \right)^{n_i}$. Then there exists a basis of $V$ in which the matrix representation of $T$ is a direct sum of Jordan blocks.

Furthermore, the number of Jordan blocks and the sizes of the blocks are invariants of $T$.

Proof

Since we have already shown $V = \bigoplus_{i=1}^k \mathrm{Ker} \, \left(T - \lambda_i I\right)^{n_i}$, it suffices to choose a nice basis in each $\mathrm{Ker} \, \left(T - \lambda_i I\right)^{n_i}$.

Let $W = \mathrm{Ker} \, \left(T - \lambda_i I \right)^{n_i}$. Then $W$ is invariant under $T - \lambda_i I$, so we may regard $W$ as a module over the polynomial ring $F[x]$, with $x$ acting as $T - \lambda_i I$. Since $F[x]$ is a principal ideal domain, we may invoke the structure theorem for finitely generated modules over a PID to see that

$\displaystyle W = \bigoplus_{j=1}^m W_j$

where each $W_j$ is a cyclic subspace. Take $W_1$ as an example. Let $v$ be its generator, and let $p$ be the smallest number such that $(T-\lambda_i I)^p v = 0$. Then it is immediate that $\{ (T-\lambda_i I)^{p-1} v, (T-\lambda_i I)^{p-2} v, \cdots, v\}$ forms a basis for $W_1$. It is then routine to check that with respect to this basis, $T$ restricted to $W_1$ takes exactly the form of a $p \times p$ Jordan block with eigenvalue $\lambda_i$.

Similarly, one can choose bases for the other $W_j$ to get the required basis for $\mathrm{Ker} \, \left(T - \lambda_i I \right)^{n_i}$ for each eigenvalue $\lambda_i$. Putting them together, we obtain a basis with the desired matrix representation.
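The chain construction above can be carried out numerically. In the sketch below (a hypothetical $A$ built by conjugating a single $3 \times 3$ Jordan block by an explicit invertible $P$, so the answer is known in advance), the basis $\{ (T-\lambda I)^{2} v, (T-\lambda I) v, v \}$ brings $A$ back to the block:

```python
import numpy as np

# A single 3x3 Jordan block for eigenvalue 2, and an explicit
# invertible change of basis (det P = 1).
J = np.array([[2., 1., 0.],
              [0., 2., 1.],
              [0., 0., 2.]])
P = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])
A = P @ J @ np.linalg.inv(P)

# W = Ker(A - 2I)^3 is all of F^3 here; v = P e_3 is a cyclic
# generator with (A - 2I)^2 v != 0.  Form the chain basis.
N = A - 2 * np.eye(3)
v = P[:, 2]
basis = np.column_stack([N @ N @ v, N @ v, v])

# With respect to this basis, T is exactly the Jordan block J.
assert np.allclose(np.linalg.inv(basis) @ A @ basis, J)
```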

Notice that the number of Jordan blocks is determined by the number of distinct eigenvalues and the number of cyclic modules in each decomposition, and the sizes of the blocks are determined by the invariants attached to the decomposition into cyclic modules. The uniqueness thus follows. $\Box$

Although the above proof does not help with actual computations, it has theoretical implications, because the Jordan form is easier to manipulate.

Proposition 3 (Minimal Polynomial)

For each eigenvalue $\lambda_i$, let $r_i$ be the size of the largest Jordan block corresponding to $\lambda_i$. Then the minimal polynomial of $T$ is

$\displaystyle m_T(t) = \prod_{i=1}^k \left (t - \lambda_i \right)^{r_i}$
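One can verify the proposition on a small hypothetical example: a matrix with one $2 \times 2$ and one $1 \times 1$ Jordan block for $\lambda = 5$ has characteristic polynomial $(t-5)^3$, but the largest block has size $r = 2$, so the minimal polynomial is $(t-5)^2$:

```python
import numpy as np

# Block-diagonal toy example: Jordan blocks of sizes 2 and 1 for lambda = 5.
A = np.array([[5., 1., 0.],
              [0., 5., 0.],
              [0., 0., 5.]])
N = A - 5 * np.eye(3)

assert not np.allclose(N, 0)   # (t - 5)^1 does not annihilate A ...
assert np.allclose(N @ N, 0)   # ... but (t - 5)^2 does, so m_T = (t - 5)^2
```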

Remark

1. In fact one can apply the structure theorem for finitely generated modules over a PID to the whole space $V$ to get the Jordan form directly, skipping the proof of Theorem 1. To me, a constructive proof is better though.
2. The algebraically closed field condition can be replaced by “the characteristic polynomial splits”, because all we used is that eigenvalues exist.
Comment (July 26, 2009):

I recall struggling some time back trying to understand the proof of the Jordan normal form from a pure linear algebra textbook; it was only when, later on, I re-learned it from Lang’s Algebra using the structure theorem that it actually made much more sense.

Then again, I also think that where the Jordan form is often used (e.g. in the theory of Lie algebras), the main idea is not so much the form itself, but the fact that one gets to represent a matrix as the sum of a semisimple and a nilpotent element, each of which is a polynomial in the initial matrix.