Linear Algebra Proofs

A list of useful linear algebra properties and their proofs. Some of them are also very useful in machine learning. Lengthy proofs are out of the scope of this post, but references are given for readers who are interested.

0. Basics

Proof 0.1

Let $A$ and $B$ be two $N \times N$ square matrices. Then $\det(AB) = \det(A) \cdot \det(B)$.

Proof: https://proofwiki.org/wiki/Determinant_of_Matrix_Product
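
As a quick numerical sanity check (a minimal NumPy sketch; the size and seed are arbitrary):

```python
# Verify det(AB) = det(A) * det(B) on random matrices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

lhs = np.linalg.det(A @ B)
rhs = np.linalg.det(A) * np.linalg.det(B)
assert np.isclose(lhs, rhs)  # equal up to floating-point error
```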

Proof 0.2

The determinant of an orthogonal matrix must be $\pm 1$.

Because $1 = \det(I) = \det\left(Q^{\mathrm{T}} Q\right) = \det\left(Q^{\mathrm{T}}\right) \det(Q) = (\det(Q))^{2} \Rightarrow \det(Q) = \pm 1$.
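
A quick NumPy check using the orthogonal factor of a QR decomposition (an illustrative sketch, not part of the proof):

```python
# The Q factor of a QR decomposition is orthogonal; its determinant is ±1.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

assert np.allclose(Q.T @ Q, np.eye(5))         # Q^T Q = I
assert np.isclose(abs(np.linalg.det(Q)), 1.0)  # det(Q) = ±1
```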

Proof 0.3

Let $A$ be an $n \times n$ matrix and let $\lambda_1, \dots, \lambda_n$ be its eigenvalues. Then

$$\det(A) = \prod_{i=1}^{n} \lambda_{i}$$

$$\operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_{i}$$

Proof: https://yutsumura.com/determinant-trace-and-eigenvalues-of-a-matrix/
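
Both identities are easy to verify numerically. A minimal NumPy sketch (a real matrix may have complex eigenvalues in conjugate pairs, so the product and sum are real up to rounding):

```python
# Verify det(A) = product of eigenvalues and tr(A) = sum of eigenvalues.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
eigvals = np.linalg.eigvals(A)  # complex in general

assert np.isclose(np.prod(eigvals).real, np.linalg.det(A))
assert np.isclose(np.sum(eigvals).real, np.trace(A))
```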

1. Symmetric Matrix

Definition

Matrix $A$ is symmetric if $A = A^{\mathrm{T}}$. Throughout this section, $A$ is a real matrix.

Proof 1.1

If $\lambda$ is an eigenvalue of $A$, so is the conjugate of $\lambda$, denoted $\overline{\lambda}$.
If $\mathbf{x}$ is an eigenvector of $A$, so is the conjugate of $\mathbf{x}$, denoted $\overline{\mathbf{x}}$.

Since $A$ is a real matrix,

$$A = \overline{A}$$

Knowing that $A\mathbf{x} = \lambda \mathbf{x}$,

$$A \overline{\mathbf{x}} = \overline{A}\, \overline{\mathbf{x}} = \overline{A \mathbf{x}} = \overline{\lambda \mathbf{x}} = \overline{\lambda}\, \overline{\mathbf{x}}$$

Proof 1.2

$A$ has only real eigenvalues.

Let $\lambda$ and $\mathbf{x}$ be an eigenvalue and a corresponding eigenvector of $A$, respectively. Based on Proof 1.1,

$$\overline{\mathbf{x}}^{\mathrm{T}} A \mathbf{x} = \overline{\mathbf{x}}^{\mathrm{T}} \lambda \mathbf{x} = \lambda \overline{\mathbf{x}}^{\mathrm{T}} \mathbf{x}$$

Since $A\mathbf{x}$ is a vector and $\overline{\mathbf{x}}^{\mathrm{T}} A \mathbf{x} = (A\mathbf{x})^{\mathrm{T}} \overline{\mathbf{x}}$,

$$\overline{\mathbf{x}}^{\mathrm{T}} A \mathbf{x} = \mathbf{x}^{\mathrm{T}} A^{\mathrm{T}} \overline{\mathbf{x}} = \mathbf{x}^{\mathrm{T}} A \overline{\mathbf{x}} = \overline{\lambda}\, \mathbf{x}^{\mathrm{T}} \overline{\mathbf{x}}$$

Since $\overline{\mathbf{x}}^{\mathrm{T}} \mathbf{x} = \mathbf{x}^{\mathrm{T}} \overline{\mathbf{x}} = \sum_i |x_i|^2 > 0$, dividing both sides by it gives

$$\lambda = \overline{\lambda}$$
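
A small NumPy check (symmetrizing a random matrix via `B + B.T` is just one convenient way to obtain a symmetric matrix):

```python
# The eigenvalues of a real symmetric matrix have no imaginary part.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T  # A is symmetric by construction

eigvals = np.linalg.eigvals(A)
assert np.allclose(np.imag(eigvals), 0)  # all eigenvalues are real
```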

Proof 1.3

$A$ is diagonalizable by an orthogonal matrix.

Schur decomposition:
Every square matrix factors into $A = QTQ^{-1}$ where $T$ is upper triangular and $\overline{Q}^{\mathrm{T}} = Q^{-1}$. If $A$ has real eigenvalues, then $Q$ and $T$ can be chosen real: $Q^{\mathrm{T}}Q = I$ (i.e., $Q$ is an orthogonal matrix).

Based on Proof 1.2, all the eigenvalues of $A$ are real.
Based on the Schur decomposition, $Q^{\mathrm{T}} A Q = T$.
Then,

$$T^{\mathrm{T}} = (Q^{\mathrm{T}} A Q)^{\mathrm{T}} = (AQ)^{\mathrm{T}} Q = Q^{\mathrm{T}} A^{\mathrm{T}} Q = Q^{\mathrm{T}} A Q = T$$

Since $T$ is upper triangular and $T^{\mathrm{T}} = T$, it must be diagonal. Denoting this diagonal matrix $T$ as $D$, we have

$$A = QDQ^{\mathrm{T}}$$
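
`np.linalg.eigh`, which is specialized for symmetric matrices, computes exactly this factorization; a minimal sketch:

```python
# Verify the orthogonal diagonalization A = Q D Q^T of a symmetric matrix.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T  # symmetric

eigvals, Q = np.linalg.eigh(A)  # returns eigenvalues and orthonormal eigenvectors
D = np.diag(eigvals)

assert np.allclose(Q.T @ Q, np.eye(5))  # Q is orthogonal
assert np.allclose(Q @ D @ Q.T, A)      # A = Q D Q^T
```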

Proof 1.4

If $A$ is nonsingular, $A^{-1}$ is symmetric.

Since $A$ is invertible,

$$A^{-1}A = I$$

Taking the transpose, we have

$$I = I^{\mathrm{T}} = \left(A^{-1} A\right)^{\mathrm{T}} = A^{\mathrm{T}}\left(A^{-1}\right)^{\mathrm{T}} = A\left(A^{-1}\right)^{\mathrm{T}}$$

Hence, multiplying both sides on the left by $A^{-1}$, we get $A^{-1} = \left(A^{-1}\right)^{\mathrm{T}}$.
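
A quick numerical check (here `B @ B.T + I` is just one convenient way to build a nonsingular symmetric matrix):

```python
# The inverse of a nonsingular symmetric matrix is symmetric.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B @ B.T + np.eye(4)  # symmetric and positive definite, hence nonsingular

A_inv = np.linalg.inv(A)
assert np.allclose(A_inv, A_inv.T)
```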

2. Positive definite symmetric matrix

Definition

A real symmetric $n \times n$ matrix $A$ is called positive definite if $\mathbf{x}^{\mathrm{T}}A\mathbf{x} > 0$ for all non-zero vectors $\mathbf{x} \in \mathbb{R}^n$.

Proof 2.1

The eigenvalues of a real symmetric positive-definite matrix $A$ are all positive.

Let $\lambda$ be a (real) eigenvalue of $A$ and let $\mathbf{x}$ be a corresponding real eigenvector. That is, we have

$$A\mathbf{x} = \lambda \mathbf{x}$$

Multiplying by $\mathbf{x}^{\mathrm{T}}$ on the left, we obtain

$$\mathbf{x}^{\mathrm{T}} A \mathbf{x} = \lambda \mathbf{x}^{\mathrm{T}} \mathbf{x} = \lambda\|\mathbf{x}\|^{2}$$

The left-hand side is positive because $A$ is positive definite and $\mathbf{x}$, being an eigenvector, is a nonzero vector.
Since the norm $\|\mathbf{x}\|^2$ is positive, $\lambda$ must be positive.
It follows that every eigenvalue $\lambda$ of $A$ is positive.
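
A quick NumPy check (the construction `B @ B.T + I` is one arbitrary way to obtain a positive-definite matrix):

```python
# The eigenvalues of a positive-definite matrix are all positive.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + np.eye(5)  # B B^T is positive semi-definite; adding I makes it definite

eigvals = np.linalg.eigvalsh(A)  # eigvalsh: eigenvalues of a symmetric matrix
assert np.all(eigvals > 0)
```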

Proof 2.2

If the eigenvalues of a real symmetric matrix $A$ are all positive, then $A$ is positive definite.

By Proof 1.3, $A = QDQ^{\mathrm{T}}$ where $Q^{-1} = Q^{\mathrm{T}}$, so we have

$$\mathbf{x}^{\mathrm{T}}A\mathbf{x} = \mathbf{x}^{\mathrm{T}}QDQ^{\mathrm{T}}\mathbf{x}$$

where

$$D = \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{n} \end{bmatrix}$$

Putting $\mathbf{y} = Q^{\mathrm{T}}\mathbf{x}$, we can rewrite the above equation as

$$\mathbf{x}^{\mathrm{T}}A\mathbf{x} = \mathbf{y}^{\mathrm{T}}D\mathbf{y}$$

Let

$$\mathbf{y} = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}$$

Then we have

$$\mathbf{x}^{\mathrm{T}} A \mathbf{x} = \mathbf{y}^{\mathrm{T}} D \mathbf{y} = \lambda_{1} y_{1}^{2} + \lambda_{2} y_{2}^{2} + \cdots + \lambda_{n} y_{n}^{2}$$

By assumption, the eigenvalues $\lambda_i$ are positive.
Also, since $\mathbf{x}$ is a nonzero vector and $Q$ is invertible, $\mathbf{y} = Q^{\mathrm{T}}\mathbf{x}$ is not the zero vector.
Thus the sum above is positive, hence $\mathbf{x}^{\mathrm{T}}A\mathbf{x}$ is positive for any nonzero vector $\mathbf{x}$.
Therefore, the matrix $A$ is positive definite.
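
This can be spot-checked numerically by building $A = QDQ^{\mathrm{T}}$ from a random orthogonal $Q$ and positive eigenvalues, then testing random vectors (an illustrative sketch; the sampling range is arbitrary):

```python
# Build A from positive eigenvalues and check x^T A x > 0 on random vectors.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal matrix
D = np.diag(rng.uniform(0.1, 10.0, size=5))       # positive eigenvalues
A = Q @ D @ Q.T

for _ in range(1000):
    x = rng.standard_normal(5)
    assert x @ A @ x > 0
```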

Proof 2.3

$A$ is invertible.

Method 1

By Proof 2.1, the matrix $A$ does not have $0$ as an eigenvalue.

We can prove this by contradiction:
if $A$ were singular, then $A\mathbf{x} = 0 \cdot \mathbf{x}$ for some $\mathbf{x} \neq \mathbf{0}$, so by the definition of eigenvalues, $\mathbf{x}$ would be an eigenvector with eigenvalue $\lambda = 0$, contradicting Proof 2.1.

Method 2

We can also prove this using the determinant:

$$\det A = \prod_{i=1}^{n} \lambda_{i} > 0$$

Since $\det A \neq 0$, $A$ is invertible.

Proof 2.4

The inverse of $A$ is positive definite.

Based on Proof 2.1 and Proof 2.2, we know that a symmetric matrix is positive definite if and only if its eigenvalues are all positive.

All eigenvalues of $A^{-1}$ are of the form $1/\lambda$, where $\lambda$ is an eigenvalue of $A$.
Since $A$ is positive definite, each eigenvalue $\lambda$ is positive, hence $1/\lambda$ is positive.

So all eigenvalues of $A^{-1}$ are positive, and since $A^{-1}$ is symmetric by Proof 1.4, it follows that $A^{-1}$ is positive definite.
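
A quick NumPy check tying Proofs 1.4, 2.1, and 2.4 together (sketch only):

```python
# The inverse of a positive-definite matrix is symmetric and positive definite.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + np.eye(5)  # positive definite

A_inv = np.linalg.inv(A)
assert np.allclose(A_inv, A_inv.T)            # symmetric (Proof 1.4)
assert np.all(np.linalg.eigvalsh(A_inv) > 0)  # positive eigenvalues (Proof 2.4)
```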

3. Matrix calculus

Definition

https://en.wikipedia.org/wiki/Matrix_calculus

Reference

[1] The Matrix Cookbook
[2] Matrix identities

Derivatives of Matrices, Vectors and Scalar Form

$a$ is a scalar and $\mathbf{x}$ is a column vector $\begin{bmatrix} x_{1} & x_{2} & \cdots & x_{n} \end{bmatrix}^{\mathrm{T}}$

$$\frac{\partial a}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial a}{\partial x_1} & \frac{\partial a}{\partial x_2} & \cdots & \frac{\partial a}{\partial x_n} \end{bmatrix}$$

$\mathbf{a}$ and $\mathbf{x}$ are column vectors

$$\frac{\partial \mathbf{a}^{\mathrm{T}} \mathbf{x}}{\partial \mathbf{x}} = \frac{\partial \mathbf{x}^{\mathrm{T}} \mathbf{a}}{\partial \mathbf{x}} = \mathbf{a}$$

$A$ is a matrix, $\mathbf{x}$ is a column vector, and $\mathbf{x}^{\mathrm{T}} A \mathbf{x}$ is a scalar

$$\frac{\partial \mathbf{x}^{\mathrm{T}} A \mathbf{x}}{\partial \mathbf{x}} = (A + A^{\mathrm{T}}) \mathbf{x}$$

If $A$ is symmetric, then

$$\frac{\partial \mathbf{x}^{\mathrm{T}} A \mathbf{x}}{\partial \mathbf{x}} = 2A\mathbf{x}$$

$\mathbf{a}$ and $\mathbf{b}$ are column vectors, $X$ is a matrix, and $\mathbf{a}^{\mathrm{T}} X \mathbf{b}$ is a scalar

$$\frac{\partial \mathbf{a}^{\mathrm{T}} X \mathbf{b}}{\partial X} = \mathbf{a}\mathbf{b}^{\mathrm{T}}$$
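
These identities can be spot-checked against finite differences. A minimal NumPy sketch for the quadratic-form gradient (step size and tolerance are arbitrary choices):

```python
# Finite-difference check of d(x^T A x)/dx = (A + A^T) x.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
eps = 1e-6

grad = np.zeros(n)
for i in range(n):
    e = np.zeros(n)
    e[i] = eps
    # Central difference of f(x) = x^T A x along coordinate i.
    grad[i] = ((x + e) @ A @ (x + e) - (x - e) @ A @ (x - e)) / (2 * eps)

assert np.allclose(grad, (A + A.T) @ x, atol=1e-4)
```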

Derivatives of a Determinant

$X$ is a matrix

$$\frac{\partial \det(X)}{\partial X} = \det(X) \cdot (X^{-1})^{\mathrm{T}}$$
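
This is Jacobi's formula; a finite-difference spot check (the shift `3 * np.eye(3)` merely keeps $X$ well-conditioned):

```python
# Finite-difference check of d det(X)/dX = det(X) * (X^{-1})^T.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3)) + 3 * np.eye(3)
eps = 1e-6

num_grad = np.zeros_like(X)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(X)
        E[i, j] = eps
        num_grad[i, j] = (np.linalg.det(X + E) - np.linalg.det(X - E)) / (2 * eps)

assert np.allclose(num_grad, np.linalg.det(X) * np.linalg.inv(X).T, atol=1e-5)
```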

Derivatives of an Inverse

$A$ is a matrix and $A$ depends on the scalar $x$

$$\frac{\partial A^{-1}}{\partial x} = -A^{-1} \frac{\partial A}{\partial x} A^{-1}$$

In the scalar case where $A = x$ (so $\partial A / \partial x = 1$), this reduces to

$$\frac{\partial A^{-1}}{\partial x} = -A^{-2}$$
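
A finite-difference check of the general identity, assuming the illustrative parameterization $A(x) = A_0 + xC$ so that $\partial A / \partial x = C$:

```python
# Finite-difference check of dA^{-1}/dx = -A^{-1} (dA/dx) A^{-1}.
import numpy as np

rng = np.random.default_rng(0)
A0 = rng.standard_normal((3, 3)) + 5 * np.eye(3)  # keep A(x) safely invertible
C = rng.standard_normal((3, 3))
x, eps = 0.5, 1e-6

A = lambda t: A0 + t * C  # so dA/dx = C

num = (np.linalg.inv(A(x + eps)) - np.linalg.inv(A(x - eps))) / (2 * eps)
ref = -np.linalg.inv(A(x)) @ C @ np.linalg.inv(A(x))
assert np.allclose(num, ref, atol=1e-5)
```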
