### The FastMat2 Matrix Library. Description and Parallel Implementation

#### Abstract

Finite element codes usually have two levels of programming. At the outer level, a series of nested loops for time stepping or Newton iteration leads to a sequence of linear problems. Each requires assembling a residual and a matrix, and the resulting system is then solved with an iterative solver such as CG (Conjugate Gradient) or GMRES (Generalized Minimal Residual). At the inner level, these assemblies are performed with a loop over all the elements in the mesh: the residual and matrix contributions are computed for each element and assembled into the global vector and matrix. The FastMat2 class is the main matrix library used in the PETSc-FEM code (http://www.cimec.org.ar/petscfem) for computations at the element level. It is fully multi-indexed and provides the standard matrix operations, such as addition, tensor contraction (in the matrix sense and element by element), eigenvalue decomposition, and inversion. A key point in its design is that many logical operations (row or column selections, for instance) are the same for every element, so they are cached in a Directed Acyclic Graph (DAG) structure. Its implementation is discussed, in particular for a shared-memory environment using OpenMP (Open Multi-Processing).
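To make the caching idea concrete, the following is a minimal sketch and not the FastMat2 API. It assumes the simplest case, in which the operations applied to each element form a fixed linear sequence, so a plain list stands in for the DAG. The names `OpCache`, `CachedOp`, `sum_row` and `row_sums_over_elements` are hypothetical and chosen only for this illustration. The index computation for a row selection is done once, on the first element, and replayed for all later elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; none of these names belong to FastMat2.
// Cached result of one "logical" operation: the flat offsets of the
// selected entries in a row-major matrix.
struct CachedOp {
    std::vector<std::size_t> offsets;
};

// Replays a fixed sequence of operations once per element. On the first
// pass each operation computes and stores its offsets; on later passes
// the stored offsets are reused instead of being recomputed.
class OpCache {
public:
    // Call at the start of every element to restart the sequence.
    void rewind() { pos_ = 0; }

    // Sum the entries of row `row` of a row-major matrix with `ncols` columns.
    double sum_row(const std::vector<double>& a, std::size_t ncols,
                   std::size_t row) {
        assert((row + 1) * ncols <= a.size());
        if (pos_ == ops_.size()) {  // first pass: build the cache entry
            CachedOp op;
            for (std::size_t j = 0; j < ncols; ++j)
                op.offsets.push_back(row * ncols + j);
            ops_.push_back(op);
            ++misses_;
        }
        const CachedOp& op = ops_[pos_++];
        double s = 0.0;
        for (std::size_t k : op.offsets) s += a[k];
        return s;
    }

    // Number of times the offsets had to be computed from scratch.
    std::size_t misses() const { return misses_; }

private:
    std::vector<CachedOp> ops_;
    std::size_t pos_ = 0;
    std::size_t misses_ = 0;
};

// Element loop: the same operation sequence is applied to every element
// matrix, so only the first element pays for the index computation.
std::vector<double> row_sums_over_elements(
    const std::vector<std::vector<double>>& elems, std::size_t ncols,
    std::size_t row, OpCache& cache) {
    std::vector<double> out;
    for (const auto& a : elems) {
        cache.rewind();
        out.push_back(cache.sum_row(a, ncols, row));
    }
    return out;
}
```

In this sketch the cache is only valid if every element issues the same operations with the same arguments in the same order. It also holds mutable state (`pos_`), so in a threaded element loop one simple option would be a separate `OpCache` per thread; whether FastMat2 takes that route is not stated in this abstract.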