Linear Algebra Proofs

A list of useful linear algebra properties and their proofs. Some of them are also very useful in machine learning. Lengthy proofs are out of the scope of this post, but references are given for readers who are interested.

0. Basics

Proof 0.1

Let $A$ and $B$ be two $N \times N$ square matrices. Then $\det(AB) = \det(A) \cdot \det(B)$.

Proof: https://proofwiki.org/wiki/Determinant_of_Matrix_Product
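
As a quick numerical sanity check (a minimal NumPy sketch; the size and seed are arbitrary):

```python
# Verify det(AB) = det(A) * det(B) on random matrices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

lhs = np.linalg.det(A @ B)
rhs = np.linalg.det(A) * np.linalg.det(B)
assert np.isclose(lhs, rhs)  # equal up to floating-point error
```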

Proof 0.2

The determinant of an orthogonal matrix must be $\pm 1$.

Because $1 = \det(I) = \det\left(Q^{\mathrm{T}} Q\right) = \det\left(Q^{\mathrm{T}}\right) \det(Q) = (\det(Q))^{2} \Rightarrow \det(Q) = \pm 1$.
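
A quick NumPy check using the orthogonal factor of a QR decomposition (an illustrative sketch, not part of the proof):

```python
# The Q factor of a QR decomposition is orthogonal; its determinant is ±1.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

assert np.allclose(Q.T @ Q, np.eye(5))         # Q^T Q = I
assert np.isclose(abs(np.linalg.det(Q)), 1.0)  # det(Q) = ±1
```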

Proof 0.3

Let $A$ be an $n \times n$ matrix and let $\lambda_1, \dots, \lambda_n$ be its eigenvalues. Then

$$\det(A) = \prod_{i=1}^{n} \lambda_{i}$$

$$\operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_{i}$$

Proof: https://yutsumura.com/determinant-trace-and-eigenvalues-of-a-matrix/
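
Both identities are easy to verify numerically. A minimal NumPy sketch (a real matrix may have complex eigenvalues in conjugate pairs, so the product and sum are real up to rounding):

```python
# Verify det(A) = product of eigenvalues and tr(A) = sum of eigenvalues.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
eigvals = np.linalg.eigvals(A)  # complex in general

assert np.isclose(np.prod(eigvals).real, np.linalg.det(A))
assert np.isclose(np.sum(eigvals).real, np.trace(A))
```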

1. Symmetric Matrix

Definition

Matrix $A$ is symmetric if $A = A^{\mathrm{T}}$. Throughout this section, $A$ is a real matrix.

Proof 1.1

If $\lambda$ is an eigenvalue of $A$, so is the conjugate of $\lambda$, denoted $\overline{\lambda}$.
If $\mathbf{x}$ is an eigenvector of $A$, so is the conjugate of $\mathbf{x}$, denoted $\overline{\mathbf{x}}$.

Since $A$ is a real matrix,

$$A = \overline{A}$$

Knowing that $A\mathbf{x} = \lambda \mathbf{x}$,

$$A \overline{\mathbf{x}} = \overline{A}\, \overline{\mathbf{x}} = \overline{A \mathbf{x}} = \overline{\lambda \mathbf{x}} = \overline{\lambda}\, \overline{\mathbf{x}}$$

Proof 1.2

$A$ has only real eigenvalues.

Let $\lambda$ and $\mathbf{x}$ be an eigenvalue and a corresponding eigenvector of $A$, respectively. Based on Proof 1.1,

$$\overline{\mathbf{x}}^{\mathrm{T}} A \mathbf{x} = \overline{\mathbf{x}}^{\mathrm{T}} \lambda \mathbf{x} = \lambda \overline{\mathbf{x}}^{\mathrm{T}} \mathbf{x}$$

Since $A\mathbf{x}$ is a vector and $\overline{\mathbf{x}}^{\mathrm{T}} A \mathbf{x} = (A\mathbf{x})^{\mathrm{T}} \overline{\mathbf{x}}$,

$$\overline{\mathbf{x}}^{\mathrm{T}} A \mathbf{x} = \mathbf{x}^{\mathrm{T}} A^{\mathrm{T}} \overline{\mathbf{x}} = \mathbf{x}^{\mathrm{T}} A \overline{\mathbf{x}} = \overline{\lambda}\, \mathbf{x}^{\mathrm{T}} \overline{\mathbf{x}}$$

Since $\overline{\mathbf{x}}^{\mathrm{T}} \mathbf{x} = \mathbf{x}^{\mathrm{T}} \overline{\mathbf{x}} = \sum_i |x_i|^2 > 0$, dividing both sides by it gives

$$\lambda = \overline{\lambda}$$
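
A small NumPy check (symmetrizing a random matrix via `B + B.T` is just one convenient way to obtain a symmetric matrix):

```python
# The eigenvalues of a real symmetric matrix have no imaginary part.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T  # A is symmetric by construction

eigvals = np.linalg.eigvals(A)
assert np.allclose(np.imag(eigvals), 0)  # all eigenvalues are real
```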

Proof 1.3

$A$ is diagonalizable by an orthogonal matrix.

Schur decomposition:
Every square matrix factors into $A = QTQ^{-1}$ where $T$ is upper triangular and $\overline{Q}^{\mathrm{T}} = Q^{-1}$. If $A$ has real eigenvalues, then $Q$ and $T$ can be chosen real: $Q^{\mathrm{T}}Q = I$ (i.e., $Q$ is an orthogonal matrix).

Based on Proof 1.2, all the eigenvalues of $A$ are real.
Based on the Schur decomposition, $Q^{\mathrm{T}} A Q = T$.
Then,

$$T^{\mathrm{T}} = (Q^{\mathrm{T}} A Q)^{\mathrm{T}} = (AQ)^{\mathrm{T}} Q = Q^{\mathrm{T}} A^{\mathrm{T}} Q = Q^{\mathrm{T}} A Q = T$$

Since $T$ is upper triangular and $T^{\mathrm{T}} = T$, it must be diagonal. Denoting this diagonal matrix $T$ as $D$, we have

$$A = QDQ^{\mathrm{T}}$$
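
`np.linalg.eigh`, which is specialized for symmetric matrices, computes exactly this factorization; a minimal sketch:

```python
# Verify the orthogonal diagonalization A = Q D Q^T of a symmetric matrix.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T  # symmetric

eigvals, Q = np.linalg.eigh(A)  # returns eigenvalues and orthonormal eigenvectors
D = np.diag(eigvals)

assert np.allclose(Q.T @ Q, np.eye(5))  # Q is orthogonal
assert np.allclose(Q @ D @ Q.T, A)      # A = Q D Q^T
```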

Proof 1.4

If $A$ is nonsingular, $A^{-1}$ is symmetric.

Since $A$ is invertible,

$$A^{-1}A = I$$

Taking the transpose, we have

$$I = I^{\mathrm{T}} = \left(A^{-1} A\right)^{\mathrm{T}} = A^{\mathrm{T}}\left(A^{-1}\right)^{\mathrm{T}} = A\left(A^{-1}\right)^{\mathrm{T}}$$

Hence, multiplying both sides on the left by $A^{-1}$, we get $A^{-1} = \left(A^{-1}\right)^{\mathrm{T}}$.
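
A quick numerical check (here `B @ B.T + I` is just one convenient way to build a nonsingular symmetric matrix):

```python
# The inverse of a nonsingular symmetric matrix is symmetric.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B @ B.T + np.eye(4)  # symmetric and positive definite, hence nonsingular

A_inv = np.linalg.inv(A)
assert np.allclose(A_inv, A_inv.T)
```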

2. Positive definite symmetric matrix

Definition

A real symmetric $n \times n$ matrix $A$ is called positive definite if $\mathbf{x}^{\mathrm{T}}A\mathbf{x} > 0$ for all non-zero vectors $\mathbf{x} \in \mathbb{R}^n$.

Proof 2.1

The eigenvalues of a real symmetric positive-definite matrix $A$ are all positive.

Let $\lambda$ be a (real) eigenvalue of $A$ and let $\mathbf{x}$ be a corresponding real eigenvector. That is, we have

$$A\mathbf{x} = \lambda \mathbf{x}$$

Multiplying by $\mathbf{x}^{\mathrm{T}}$ on the left, we obtain

$$\mathbf{x}^{\mathrm{T}} A \mathbf{x} = \lambda \mathbf{x}^{\mathrm{T}} \mathbf{x} = \lambda\|\mathbf{x}\|^{2}$$

The left-hand side is positive because $A$ is positive definite and $\mathbf{x}$, being an eigenvector, is a nonzero vector.
Since the norm $\|\mathbf{x}\|^2$ is positive, $\lambda$ must be positive.
It follows that every eigenvalue $\lambda$ of $A$ is positive.
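
A quick NumPy check (the construction `B @ B.T + I` is one arbitrary way to obtain a positive-definite matrix):

```python
# The eigenvalues of a positive-definite matrix are all positive.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + np.eye(5)  # B B^T is positive semi-definite; adding I makes it definite

eigvals = np.linalg.eigvalsh(A)  # eigvalsh: eigenvalues of a symmetric matrix
assert np.all(eigvals > 0)
```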

Proof 2.2

If the eigenvalues of a real symmetric matrix $A$ are all positive, then $A$ is positive definite.

By Proof 1.3, $A = QDQ^{\mathrm{T}}$ where $Q^{-1} = Q^{\mathrm{T}}$, so we have

$$\mathbf{x}^{\mathrm{T}}A\mathbf{x} = \mathbf{x}^{\mathrm{T}}QDQ^{\mathrm{T}}\mathbf{x}$$

where

$$D = \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{n} \end{bmatrix}$$

Putting $\mathbf{y} = Q^{\mathrm{T}}\mathbf{x}$, we can rewrite the above equation as

$$\mathbf{x}^{\mathrm{T}}A\mathbf{x} = \mathbf{y}^{\mathrm{T}}D\mathbf{y}$$

Let

$$\mathbf{y} = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}$$

Then we have

$$\mathbf{x}^{\mathrm{T}} A \mathbf{x} = \mathbf{y}^{\mathrm{T}} D \mathbf{y} = \lambda_{1} y_{1}^{2} + \lambda_{2} y_{2}^{2} + \cdots + \lambda_{n} y_{n}^{2}$$

By assumption, the eigenvalues $\lambda_i$ are positive.
Also, since $\mathbf{x}$ is a nonzero vector and $Q$ is invertible, $\mathbf{y} = Q^{\mathrm{T}}\mathbf{x}$ is not the zero vector.
Thus the sum above is positive, hence $\mathbf{x}^{\mathrm{T}}A\mathbf{x}$ is positive for any nonzero vector $\mathbf{x}$.
Therefore, the matrix $A$ is positive definite.
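
This can be spot-checked numerically by building $A = QDQ^{\mathrm{T}}$ from a random orthogonal $Q$ and positive eigenvalues, then testing random vectors (an illustrative sketch; the sampling range is arbitrary):

```python
# Build A from positive eigenvalues and check x^T A x > 0 on random vectors.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal matrix
D = np.diag(rng.uniform(0.1, 10.0, size=5))       # positive eigenvalues
A = Q @ D @ Q.T

for _ in range(1000):
    x = rng.standard_normal(5)
    assert x @ A @ x > 0
```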

Proof 2.3

$A$ is invertible.

Method 1

By Proof 2.1, the matrix $A$ does not have $0$ as an eigenvalue.

We can prove this by contradiction:
if $A$ were singular, then $A\mathbf{x} = 0 \cdot \mathbf{x}$ for some $\mathbf{x} \neq \mathbf{0}$, so by the definition of eigenvalues, $\mathbf{x}$ would be an eigenvector with eigenvalue $\lambda = 0$, contradicting Proof 2.1.

Method 2

We can also prove this using the determinant:

$$\det A = \prod_{i=1}^{n} \lambda_{i} > 0$$

Since $\det A \neq 0$, $A$ is invertible.

Proof 2.4

The inverse of $A$ is positive definite.

Based on Proof 2.1 and Proof 2.2, we know that a symmetric matrix is positive definite if and only if its eigenvalues are all positive.

All eigenvalues of $A^{-1}$ are of the form $1/\lambda$, where $\lambda$ is an eigenvalue of $A$.
Since $A$ is positive definite, each eigenvalue $\lambda$ is positive, hence $1/\lambda$ is positive.

So all eigenvalues of $A^{-1}$ are positive, and since $A^{-1}$ is symmetric by Proof 1.4, it follows that $A^{-1}$ is positive definite.
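
A quick NumPy check tying Proofs 1.4, 2.1, and 2.4 together (sketch only):

```python
# The inverse of a positive-definite matrix is symmetric and positive definite.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + np.eye(5)  # positive definite

A_inv = np.linalg.inv(A)
assert np.allclose(A_inv, A_inv.T)            # symmetric (Proof 1.4)
assert np.all(np.linalg.eigvalsh(A_inv) > 0)  # positive eigenvalues (Proof 2.4)
```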

3. Matrix calculus

Definition

https://en.wikipedia.org/wiki/Matrix_calculus

Reference

[1] The Matrix Cookbook
[2] Matrix identities

Derivatives of Matrices, Vectors and Scalar Form

$a$ is a scalar and $\mathbf{x}$ is a column vector $\begin{bmatrix} x_{1} & x_{2} & \cdots & x_{n} \end{bmatrix}^{\mathrm{T}}$

$$\frac{\partial a}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial a}{\partial x_1} & \frac{\partial a}{\partial x_2} & \cdots & \frac{\partial a}{\partial x_n} \end{bmatrix}$$

$\mathbf{a}$ and $\mathbf{x}$ are column vectors

$$\frac{\partial \mathbf{a}^{\mathrm{T}} \mathbf{x}}{\partial \mathbf{x}} = \frac{\partial \mathbf{x}^{\mathrm{T}} \mathbf{a}}{\partial \mathbf{x}} = \mathbf{a}$$

$A$ is a matrix, $\mathbf{x}$ is a column vector, and $\mathbf{x}^{\mathrm{T}} A \mathbf{x}$ is a scalar

$$\frac{\partial \mathbf{x}^{\mathrm{T}} A \mathbf{x}}{\partial \mathbf{x}} = (A + A^{\mathrm{T}}) \mathbf{x}$$

If $A$ is symmetric, then

$$\frac{\partial \mathbf{x}^{\mathrm{T}} A \mathbf{x}}{\partial \mathbf{x}} = 2A\mathbf{x}$$

$\mathbf{a}$ and $\mathbf{b}$ are column vectors, $X$ is a matrix, and $\mathbf{a}^{\mathrm{T}} X \mathbf{b}$ is a scalar

$$\frac{\partial \mathbf{a}^{\mathrm{T}} X \mathbf{b}}{\partial X} = \mathbf{a}\mathbf{b}^{\mathrm{T}}$$
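
These identities can be spot-checked against finite differences. A minimal NumPy sketch for the quadratic-form gradient (step size and tolerance are arbitrary choices):

```python
# Finite-difference check of d(x^T A x)/dx = (A + A^T) x.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
eps = 1e-6

grad = np.zeros(n)
for i in range(n):
    e = np.zeros(n)
    e[i] = eps
    # Central difference of f(x) = x^T A x along coordinate i.
    grad[i] = ((x + e) @ A @ (x + e) - (x - e) @ A @ (x - e)) / (2 * eps)

assert np.allclose(grad, (A + A.T) @ x, atol=1e-4)
```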

Derivatives of a Determinant

$X$ is a matrix

$$\frac{\partial \det(X)}{\partial X} = \det(X) \cdot (X^{-1})^{\mathrm{T}}$$
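
This is Jacobi's formula; a finite-difference spot check (the shift `3 * np.eye(3)` merely keeps $X$ well-conditioned):

```python
# Finite-difference check of d det(X)/dX = det(X) * (X^{-1})^T.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3)) + 3 * np.eye(3)
eps = 1e-6

num_grad = np.zeros_like(X)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(X)
        E[i, j] = eps
        num_grad[i, j] = (np.linalg.det(X + E) - np.linalg.det(X - E)) / (2 * eps)

assert np.allclose(num_grad, np.linalg.det(X) * np.linalg.inv(X).T, atol=1e-5)
```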

Derivatives of an Inverse

$A$ is a matrix and $A$ depends on the scalar $x$

$$\frac{\partial A^{-1}}{\partial x} = -A^{-1} \frac{\partial A}{\partial x} A^{-1}$$

In the scalar case where $A = x$ (so $\partial A / \partial x = 1$), this reduces to

$$\frac{\partial A^{-1}}{\partial x} = -A^{-2}$$
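
A finite-difference check of the general identity, assuming the illustrative parameterization $A(x) = A_0 + xC$ so that $\partial A / \partial x = C$:

```python
# Finite-difference check of dA^{-1}/dx = -A^{-1} (dA/dx) A^{-1}.
import numpy as np

rng = np.random.default_rng(0)
A0 = rng.standard_normal((3, 3)) + 5 * np.eye(3)  # keep A(x) safely invertible
C = rng.standard_normal((3, 3))
x, eps = 0.5, 1e-6

A = lambda t: A0 + t * C  # so dA/dx = C

num = (np.linalg.inv(A(x + eps)) - np.linalg.inv(A(x - eps))) / (2 * eps)
ref = -np.linalg.inv(A(x)) @ C @ np.linalg.inv(A(x))
assert np.allclose(num, ref, atol=1e-5)
```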
