# System Solution

The approximate solution to the original PDE is obtained by solving a linear system of equations

A*x = b
where A is a square coefficients matrix of size n*n (system matrix), x is the solution vector of size n, and b is the right-hand side vector of size n. The solution vector x could be obtained by a direct computation of the inverse of the matrix A as

x = A-1*b

However, this approach is not practical. The matrix A is sparse, i.e., the fraction of non-zero entries is low. On the other hand, the inverse matrix A-1 is a dense matrix even for the simplest tri-diagonal system of equations. Hence, the large memory requirements severely limit the systems that can be solved in practice. A system of 1 million degrees of freedom (unknowns) would require around 8 terabytes of RAM to store the inverse matrix (1e6^2*8).

There are numerous efficient algorithms that limit the number of arithmetic instructions and bytes of memory required to solve sparse systems of linear equations on modern computers . Possibly the most basic algorithm is the LU factorization – a variant of the Guassian elimination. The system matrix A is represented as a product of two matrices L and U

A = L*U
such that L is a lower-triangular matrix, and U is an upper-triangular matrix. The original system of equations is now expressed as

(L*U)*x = b
and the solution vector x can be obtained by

x = U-1*(L-1*b)
Since the LU factors are triangular, the solution vector can be easily obtained by forward and backward substitution instead of explicitly computing the inverse matrices. Using the MATLAB notation we write

x = U\(L\b)
where the backslash operator (\) denotes the forward/backward substitution.

Compared to the naive approach of computing A-1 sparse factorization can only perform better if that LU factors are sparse. MILAMIN renumbers the unknowns and the matrix to minimize the number of non-zero entries in the factor using various fill-reducing reordering techniques. Essentially, the unknowns are renumbered and the matrix is permuted using a permutation vector

perm = f(A)
A = A(perm,perm)
Further savings on memory and operation requirements can be made for special types of matrices. For example, symmetric positive definite matrices can be factorized using the Cholesky factorization

A = L*L’
where L is a lower triangular factor and L’ is its transpose. Computing only one lower triangular factor roughly halves the number of operations and the amount of memory required.

In short, the algorithm for the solution of the global system of linear equations used in MILAMIN consists of four major steps:

We present several approaches to perform the above steps using MATLAB and the MILAMIN software package. Our major interest is the efficiency of the implementation, and the memory consumption. We show how to solve large systems in a reasonable time on modern multi-core shared memory CPU architectures.