A GPU-based caching strategy for multi-material linear elastic FEM on regular grids

In this study, we present a novel strategy for applying the finite element method (FEM) to linear elastic problems of very high resolution on graphics processing units (GPUs). The approach exploits regularities in the system matrix that occur in regular hexahedral grids to achieve cache-friendly matrix-free FEM. The node-by-node method lies in the class of block-iterative Gauss-Seidel multigrid solvers. Our method significantly improves convergence times in cases where an ordered distribution of distinct materials is present in the dataset. The method was evaluated on three real-world datasets: an aluminum-silicon (AlSi) alloy and a dual-phase steel material sample, both captured by scanning electron tomography, and a clinical computed tomography (CT) scan of a tibia. The caching scheme leads to a speed-up factor of ×2-×4 compared to the same code without the caching scheme. Additionally, it facilitates the computation of high-resolution problems that cannot be computed otherwise due to memory consumption.


Major issues
1. Presenting a novel numerical method for a PDE problem without stating the PDE is not good practice, even in an engineering context where linear elasticity is taken for granted. Moreover, the local matrices mentioned in Table 1 should be defined. To a cross-disciplinary audience, concepts such as "linear elasticity" or "stiffness matrix" are too general and may not lend themselves to a unique interpretation. The presentation of the numerical method should be as self-contained as possible.
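
As an indication of the level of detail meant in this point, the problem statement could take a form such as the one below. This is only a sketch under assumptions that the manuscript does not confirm: a static, small-strain, isotropic formulation with Lamé parameters \lambda and \mu, body force f, Dirichlet data u_D on \Gamma_D, traction t on \Gamma_N, and a local stiffness matrix K_e built from a strain-displacement matrix B and an elasticity matrix D. The authors' actual formulation may differ, for instance if inertial terms are included.

```latex
\begin{align}
  -\nabla \cdot \boldsymbol{\sigma}(\mathbf{u}) &= \mathbf{f} && \text{in } \Omega, \\
  \boldsymbol{\sigma}(\mathbf{u}) &= \lambda \, \operatorname{tr}\bigl(\boldsymbol{\varepsilon}(\mathbf{u})\bigr) \mathbf{I} + 2 \mu \, \boldsymbol{\varepsilon}(\mathbf{u}), \\
  \boldsymbol{\varepsilon}(\mathbf{u}) &= \tfrac{1}{2} \bigl( \nabla \mathbf{u} + \nabla \mathbf{u}^{\mathsf{T}} \bigr), \\
  \mathbf{u} &= \mathbf{u}_D && \text{on } \Gamma_D, \\
  \boldsymbol{\sigma}(\mathbf{u}) \, \mathbf{n} &= \mathbf{t} && \text{on } \Gamma_N, \\
  \mathbf{K}_e &= \int_{\Omega_e} \mathbf{B}^{\mathsf{T}} \mathbf{D} \, \mathbf{B} \, \mathrm{d}\Omega .
\end{align}
```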

Minor issues
1. Page 8, lines 26-27. "The mesh cells can have arbitrary connectivity in order to express a wide range of shapes". The connectivity is not truly arbitrary: in classical FEM, the intersection of two elements cannot be a portion of a face, because hanging nodes are not allowed, as correctly explained later in the manuscript. The term "arbitrary" should be qualified, perhaps by invoking the notion of face matching.
2. Page 9. Some bibliographical references on multigrid solvers should be added at the end of the first paragraph.
3. Page 10, lines 72-79. Is this the first example of a matrix-free FEM for linear elasticity problems, or is this an improvement on existing studies on the topic? This is well explained later in the State-of-the-Art section, but a flavour of the exact novelty of the work should be given as early as possible.
4. Page 12, line 127. "...the updated displacements of their 27 direct neighbors". The shape of elements (tetrahedral or hexahedral) being considered and even the number of space dimensions should be mentioned. Otherwise, the reader might struggle to understand why the direct neighbours are exactly 27. This is made explicit only at a later stage. Moreover, are nodes counted as "direct neighbours" of themselves?
5. Page 12, lines 129-130. "...enhancing cache efficiency by several orders of magnitude". Cache efficiency is actually discussed in the numerical experiments, but a direct comparison with the method in [34] showing the "improvement by several orders of magnitude" appears to be missing.
6. Page 12, lines 133-134. "In most problems with a limited number of discrete materials". Please provide bibliographic reference(s) to such problems.
7. Page 13, line 142. "Two distinct materials are represented with gray and blue". The only visible colors in Fig. 1a) are red and blue.
8. Page 15, equation 1. This equation should be referenced in the text. Moreover, the whole paragraph above equation (1) (starting from "the mass matrix M") is hard to hold in short-term memory. Hence, this discussion should be presented as a comment below equation (1), not as a preamble.
11. Page 17, Figure 2. The pseudocode text is cut off. To avoid this, the pseudocode should be inserted as text rather than as a figure.
12. Page 29, lines 443-445. "While the experiments on the increasing numbers of materials are somewhat artificial, they serve to show that the caching strategy continues to be beneficial in cases where the original assumption of an ordered distribution of materials holds". Perhaps the authors mean "...an ordered distribution of materials does not hold anymore"?
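
Regarding minor issue 4, the count of 27 can be reproduced only under specific assumptions, which is why they should be stated early. The sketch below assumes a regular hexahedral grid in three dimensions and defines a "direct neighbour" as any node sharing at least one element with the given node; the function names are illustrative and do not come from the manuscript.

```python
from itertools import product


def stencil_offsets(dim=3, include_self=True):
    """Index offsets of all nodes sharing an element with a node on a regular grid."""
    offsets = list(product((-1, 0, 1), repeat=dim))
    if not include_self:
        # Drop the all-zero offset, i.e. the node itself.
        offsets = [o for o in offsets if any(o)]
    return offsets


def neighbors(node, shape, include_self=True):
    """Neighbouring node indices that lie inside a grid with `shape` nodes per axis."""
    result = []
    for offset in stencil_offsets(len(shape), include_self):
        candidate = tuple(n + o for n, o in zip(node, offset))
        if all(0 <= c < s for c, s in zip(candidate, shape)):
            result.append(candidate)
    return result


if __name__ == "__main__":
    print(len(stencil_offsets(3, include_self=True)))   # interior node, self counted
    print(len(stencil_offsets(3, include_self=False)))  # interior node, self excluded
    print(len(neighbors((0, 0, 0), (4, 4, 4))))         # corner node of a 4x4x4 node grid
```

Under these assumptions an interior node has 27 neighbours only if it is counted as its own neighbour (26 otherwise), and nodes on the domain boundary have fewer, so the manuscript should say which convention is used.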