Cholesky Decomposition

4 minute read


Cholesky decomposition is the factorization of a positive-definite, Hermitian matrix into the product of a lower triangular matrix and its conjugate transpose. It is useful in numerous applications including solving linear systems, optimization, and monte-carlo simulations.


The formulation of a Cholesky decomposition is fairly straight-forward. Let \(A\) be a positive-definite, Hermitian matrix. Then

\[A = LL^*\]

is the Cholesky decomposition, where \(L\) is lower-triangular and \(L^*\) is the conjugate transpose of \(L\).

If \(A\) is in fact positive-definite, then \(L\) is unique.


For the sake of simplicity assume, \(A\) is a real, symmetric \(3\times 3\) matrix. Then we can write out the equation \(A = LL^{T}\) element-wise,

\[\begin{aligned} A &= \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix} \cdot \begin{bmatrix} l_{11} & l_{21} & l_{31} \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{bmatrix} \\ &= \begin{bmatrix} l_{11}^2 & l_{21}l_{11} & l_{31}l_{11} \\ l_{21}l_{11} & l_{21}^2 + l_{22}^2 & l_{31}l_{21} + l_{32}l_{22} \\ l_{31}l_{11} & l_{31}l_{21} + l_{32}l_{22} & l_{31}^2 + l_{32}^2 + l_{33}^2 \end{bmatrix} \end{aligned}\]

Now that we have some representation of \(A\) we can derive \(L\) from the values of \(A\), which gives

\[L = \begin{bmatrix} \sqrt{a_{11}} & 0 & 0 \\ a_{21}/l_{11} & \sqrt{a_{22} - l_{21}^2} & 0 \\ a_{31}/l_{11} & (a_{32} - l_{31}l_{21})/l_{22} & \sqrt{a_{33} - l_{31}^2 - l_{32}^2} \end{bmatrix}\]

From this you can see a pattern in the elements of \(L\). For instance, the diagonals are the square root of that element of \(A\) minus the squares of all the elements before it in the row. Using these patterns we can define each value of \(L\) by its index.

\[\begin{aligned} l_{j,j} =& (\pm) \sqrt{ a_{j,j} - \sum_{k=1}^{j-1} l_{j,k}^2 } \\ l_{i,j} =& \frac{1}{l_{j,j}} \left( a_{i,j} - \sum_{k=1}^{j-1} l_{i,k}l_{j,k} \right) \end{aligned}\]

Each value of \(L\) is dependent of the values above and to the left. Likewise computations must be done descending from the top left corner of the matrix.

It is quite trivial to derive an \(\mathcal{O}(n^3)\) algorithm from these update rules.

for k=0...N

  A[k,k] = sqrt(A[k,k])
  for i=k+1...n
    A[k,i] /= A[k,k]
  for j=k+1...n
    for i=j...n
      A[j,i] -= A[k,j] * A[k,i]


Cholesky decomposition actually has many applications including linear systems, linear least-squares, monte-carlo simulations, Kalman filters, and matrix inversion.

Linear Systems

The most common use of decomposition is solving the linear system \(A \bm{x}=\bm{b}\). If \(A\) is real symmetric, positive-definite then we can use Cholesky to solve.

First, let \(A\bm{x}=\bm{b}\). Cholesky decomposition gives us \(A=LL^T\). Now, using forward substitution, we can solve \(L\bm{y}=\bm{b}\) for \(\bm{y}\) and, using back substitution, solve \(L^T\bm{x}=\bm{y}\) for \(\bm{x}\).


Linear least squares problems are formulated as finding \(\hat{\bm{\theta}} = \argmin_{\bm \theta}{\vert|\bm{y}-X\bm{\theta} \vert|^2}\) where \(\hat{\bm{y}}=X\hat{\bm{\theta}}\) and \(X\in\mathbb{R}^{n\times n}\), \(\bm{\theta},\bm{y}\in\mathbb{R}^n\). A solution exists for this using the normal equations. Namely,

\[\begin{aligned} (X^TX)\hat{\bm{\theta}} = X^T\bm{y} \\ \implies \hat{\bm{\theta}} = (X^TX)^{-1}X^T\bm{y} . \end{aligned}\]

Now \(X^TX\) is symmetric positive definite so it can be factored into \(X^TX = LL^T\). Then it just comes down to solving \(L^T\bm{y}=X^T\bm{\theta}\) using forward substitution and \(L\bm{\theta}=\bm{y}\) using back substitution.



Translating the pseudocode algorithm into C/C++ is fairly straightforward. Here’s an example C++ implementation:

template <typename T>
void serial_potrf(la::Matrix<T> &A)

    for (std::size_t k = 0; k < A.rows(); k++)
    {, k) = std::sqrt(, k));
        for (std::size_t i = k + 1; i < A.rows(); i++)
  , k) /=, k);

        for (std::size_t j = k + 1; j < A.rows(); j++)
            for (std::size_t i = j; i < A.rows(); i++)
      , j) -=, k) *, k);

You’ll note that this writes \(L\) in-place into \(A\). It also only computes \(L\). The upper-right triangle (excluding diagonal) are all junk values.

Fine-Grained Parallelism

Course-Grained Parallelism

Further Reading

1 -

2 -