Skip to content

This post proves the existence of Smith normal form. The proof is pretty much the same as the one of reduced row echelon form.

Smith normal form is a generalization of Gaussian elimination to matrices with coefficients coming from PID. This can be used, for example, in finding a $\mathbb{Z}$-basis for the kernel/image of a matrix of integers, or proving the often-cited structure theorem for finitely generated module over PID. The procedure is essentially the same as Gaussian elimination as well.

Theorem 1 (Existence of Smith Normal form) Let $A$ be an $m \times n$ matrix with entries in a PID $R$. Then there exists $m \times m$ matrix $P$ and $n \times n$ matrix $Q$, both invertible, such that $PAQ$ is of the form $\left(\begin{array}{cccccc} a_1 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 & \cdots & 0 \\ \vdots & \vdots & \cdots & \vdots & \cdots & \vdots \\ 0 & 0 & \cdots & a_k & \cdots & 0 \\ 0 & & \cdots & & & 0 \\ \vdots & & \cdots & & & \vdots \\ 0 & & \cdots & & & 0 \end{array}\right)$

with $a_1 | a_2 | \cdots | a_k$.

Proof

The proof is similar to the Gaussian elimination process.

WLOG, assume that the first column is nonzero. Let us consider three operations:

1. Switching two rows

2. Adding some multiple of one row to another

3. For two relatively prime elements $a,b$, we can find $p,q$ such that $pa+qb = 1$. For two rows 1 and 2, we may

• replace row 1 by $a$ row 1 + $b$ row 2
• replace row 2 by $-q$ row 1 + $p$ row2

Only the last operation is different from the usual row operations. This is because multiplying a row by a constant involves scaling the determinant up by that constant. If we are working over a PID, that constant may not be a unit and thus is not what we want. The new operation is a replacement of scaling up two rows and replace one row by their sum, whose determinant is a unit.

Analogously we also have the column operations.

1. Assume that $A \neq 0$. WLOG, assume that the first column is nonzero (switching columns if needed), and actually the upper left element $a_{11}$is nonzero (switching rows if needed). Take this as the pivot.

If the matrix is over a field, we can just eliminate every other entry in the first column. However in our case, division is not allowed, so the best we can do is somehow make the upper left entry the g.c.d. of all the entries in the first column, then eliminate.

Consider $a_{21}$. If it is zero, leave it there. Otherwise, let the g.c.d. of $a_{11}$ and $a_{21}$ be $\beta$. Then we can find $p,q$ such that $p a_{11} + q a_{21} = \beta$. Apply the third type of row operation, we can replace $a_{11}$ by $\beta$, and $a_{21}$ by a multiple of $\beta$. Subtract the second row by a that multiple times the first row, we can make $a_{21}$ zero.

Repeat this, and we can get every element in the first column, except $a_{11}$, to be zero. This finishes one round of elimination.

2. Repeat this again for the first row using column operations. A bad thing may happen: entries in the first column may become nonzero again. We can go back to step 1, and use row operations to clean up the first column. Of course the same question for the first row will arise. Does this loop terminate? The answer is yes, because after one round of elimination, the new $a_{11}$ divides the previous $a_{11}$. Consider the chain of ideals generated by $a_{11}$. It is an ascending chain, which stabilizes since a PID is Noetherian. That means eventually, $a_{11}$ stops changing – that only happens when we no longer need to use the third type of operation, i.e. every element in the row is divisible by $a_{11}$. Using the second type of operation does NOT affect a cleaned-up row/column.

This means that by multiplying approriate invertible matrices on the left and on the right, $A$ is now of the form $\left(\begin{array}{cccc} * & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & * & \cdots & * \end{array}\right)$

3. Repeat for the lower right submatrix. Iterations will eventually give invertible matrices $P'$ and $Q'$ such that $P'AQ'$ is diagonal. By switching columns if needed, we can assume that all the nonzero columns are aligned to the left, i.e. it now takes the form $\left(\begin{array}{cccccc} a_1 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 & \cdots & 0 \\ \vdots & \vdots & \cdots & \vdots & \cdots & \vdots \\ 0 & 0 & \cdots & a_k & \cdots & 0 \\ 0 & & \cdots & & & 0 \\ \vdots & & \cdots & & & \vdots \\ 0 & & \cdots & & & 0 \end{array}\right)$

The final question is how to make $a_1 | \cdots |a_k$, as needed.

Let us focus on the upper left 2 by 2 block $\left(\begin{array}{cc} a_1 & \\ & a_2 \end{array}\right)$ only. We show that after row/column operations we can make $a_1 | a_2$.

• Add the second column to the first. $\Rightarrow \left(\begin{array}{cc} a_1 & \\ a_2 & a_2 \end{array}\right)$
• Using the third type of row operation, we make the upper left entry to be $c$, the g.c.d. of $a_1, a_2$. It is clear that the lower right entry is unchanged, and the other two entries are multiples of $c$.
• Subtract the second column by the first with appropriate multiple such that the upper right entry is 0. Analogously make the lower left entry 0. The lower right entry may be changed, but it is definitely a multiple of the upper left one.

We may then repeat this process so that $a_1 | a_2, \cdots , a_k$, and then $a_2 | a_3, \cdots, a_k$ and so forth. Repeat this and done. $\Box$

Remark

1. This proof is essentially the same as Gaussian elimination.
2. The $a_1 | \cdots | a_k$ is needed to make the form canonical.

Example 1

Question

If we change the PID condition to a dimension condition, can we get anything similar? Maybe should try $\mathbb{Z}[T]$ to see what happens.

Advertisements
4 Comments leave one →
1. ptp permalink
June 10, 2011 1:03 am

Clarification: Shouldn’t the second allowed row operation include the adjective integer?

2. Adding some integer multiple of one row to another

• soarerz permalink*
June 15, 2011 11:46 am

we are working over the fixed PID $R$ here. Any entry is in $R$.

2. messi permalink
May 23, 2013 3:06 pm

Why is operation 3 allowed(or valid), it is not an elementary operation. You try to explain it in a couple of lines by saying something about the determinant, can you say something more? What is the connection with the determinant?

• soarerz permalink*
June 11, 2013 12:16 am

It is allowed since it’s the same as multiplying by a diagonal block matrix, where the first 2 by 2 block is $ $\begin{pmatrix} a & b \\ -p & q \end{pmatrix}$$ and other diagonal entries are 1.

Determinant is mentioned because the usual third operation is mutliplying a row by a constant, which may not be invertible. Invertibility is definitely a wanted condition since the point of Smith Normal Form is to have a nice expression of the matrix after we do a change of coordinates. Therefore we look for a substitute of “multiplying a row by a constant” which is always invertible, and one alternative is the third operation I mentioned.