Tags: performance, software-design, scientific-computing, mesh, flexibility

Memory layout for mesh data in scientific computing


I am developing a finite element system. As usual, the simulation consists of a set of mesh nodes, each with a set of floating-point properties, for example several material properties, the coordinates, or physical quantities that evolve over time.

You can employ either of two extreme approaches:

  • Array of structures: define one node type that holds all properties of a node, and store the mesh as a single array of such node objects.
  • Structure of arrays: keep one separate array per property, so that e.g. all x coordinates lie contiguously in memory.

Between these, one could mix, for example applying the second approach only to the coordinates x, y, z and using the first approach for the remaining properties. The more properties your simulation maintains for each mesh node, the more possibilities there are to mix.
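For concreteness, here is a minimal C++ sketch of the two extremes and one such mix (the property names conductivity, e_field, and b_field are borrowed from the example discussed below; the exact set of properties is illustrative):

```cpp
#include <vector>

// Approach 1: array of structures -- one struct per mesh node.
struct Node {
    double x, y, z;
    double conductivity;
    double e_field, b_field;
};
using MeshAoS = std::vector<Node>;

// Approach 2: structure of arrays -- one array per property.
struct MeshSoA {
    std::vector<double> x, y, z;
    std::vector<double> conductivity;
    std::vector<double> e_field, b_field;
};

// One possible mix: separate arrays for the coordinates,
// one struct for all remaining per-node properties.
struct NodeData {
    double conductivity;
    double e_field, b_field;
};
struct MeshMixed {
    std::vector<double> x, y, z;       // second approach for coordinates
    std::vector<NodeData> properties;  // first approach for the rest
};
```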

On the one hand, there is the classic question of which of these approaches (and their mixes) is best for scientific computing with regard to performance and maintainability of the code. On the other hand, I wonder how to implement the code in such a way that migrating between the different approaches becomes easy. Furthermore, it might even make sense to use different memory layouts in different parts of the program.
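To make the kind of flexibility I am after concrete, here is a rough sketch (all class and member names are made up) of routing property access through accessor functions, so that the internal layout can change without touching client code:

```cpp
#include <cstddef>
#include <vector>

// Only this class knows whether properties are stored as one array per
// property (as here) or as an array of node structs.
class Mesh {
public:
    explicit Mesh(std::size_t num_nodes)
        : x_(num_nodes), y_(num_nodes), z_(num_nodes),
          e_field_(num_nodes), b_field_(num_nodes) {}

    double& x(std::size_t node)       { return x_[node]; }
    double& y(std::size_t node)       { return y_[node]; }
    double& z(std::size_t node)       { return z_[node]; }
    double& e_field(std::size_t node) { return e_field_[node]; }
    double& b_field(std::size_t node) { return b_field_[node]; }

private:
    std::vector<double> x_, y_, z_, e_field_, b_field_;
};

// Client code written against x(i), e_field(i), ... stays unchanged
// if the private storage is later switched to a different layout.
```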

To come to the point:

  • Which of these approaches (or which mix) is advisable for scientific computing, regarding performance and maintainability?
  • How can the code be structured so that switching between layouts, possibly per program part, remains easy?


Solution

  • Your question actually boils down to the question of how to design flexible and fast finite element software, which is an area of active research. I have worked on this kind of software, and I think the answers to your questions really depend on what kind of features you would like to support. For example, do you need adaptive grid refinement and coarsening? Do you solve systems of PDEs? Should your code run on a cluster? How many unknowns will you have? Without knowing your goals, I will try to make some general points anyway.

    1. There are oodles of FE frameworks and grid managers out there -- use one of them!

    2. Grid managers are more complex than many people think. As long as it is only tets in 3D on a single computer, it might still be manageable. A grid manager with mixed element types in 3D with local refinement and parallel load-balancing is an enormous task. Only develop your own grid manager if you know what you want to improve over those that are readily available.

    3. Memory layout: Now we get to the actual question :)
      Separate grid data and problem data. This will make your code more maintainable. The grid manager can be used for different kinds of problems, and each problem has its own set of data attached to each element (see the first sketch after this list).

    4. Whether you store things like the conductivity within the grid or outside of it shouldn't matter much as far as performance is concerned. I would separate it for design reasons, not for performance reasons.

    5. Store your unknowns (probably e_field and b_field in your example) in one contiguous block of memory. This will massively improve the performance of your linear solver. An iterative linear solver is usually limited by memory bandwidth for large matrices.

    6. The inner structure of your vector of unknowns might be grouped or block-wise (basically the two approaches you described yourself). Which of these two numbering schemes to choose for the unknowns depends on many factors, including the type of PDE you solve and the kind of linear solver you use. There are many scientific publications on this problem alone; the second sketch after this list illustrates both numbering schemes within one contiguous block of unknowns.
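Regarding point 3, a minimal sketch of what separating grid data from problem data could look like (all type and member names here are hypothetical):

```cpp
#include <array>
#include <cstddef>
#include <vector>

// The grid manager knows only geometry and connectivity ...
struct Grid {
    std::vector<std::array<double, 3>> node_coords;
    std::vector<std::array<std::size_t, 4>> tets;  // node ids of each tetrahedron
};

// ... while each problem attaches its own data, indexed by node/element id.
struct EMProblemData {
    std::vector<double> conductivity;      // one value per element
    std::vector<double> e_field, b_field;  // one value per node
};
```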
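Regarding points 5 and 6, this sketch shows both numbering schemes inside one contiguous block of unknowns, assuming two fields per node as in the e_field/b_field example (names are illustrative):

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kFields = 2;  // e.g. e_field and b_field

// Grouped numbering: all e_field unknowns first, then all b_field unknowns.
std::size_t grouped(std::size_t field, std::size_t node, std::size_t num_nodes) {
    return field * num_nodes + node;
}

// Block-wise numbering: the unknowns of each node stay adjacent.
std::size_t blockwise(std::size_t field, std::size_t node) {
    return node * kFields + field;
}

int main() {
    const std::size_t num_nodes = 1000;
    // Either way, all unknowns live in one contiguous allocation,
    // which is what a bandwidth-bound iterative solver streams over.
    std::vector<double> unknowns(kFields * num_nodes, 0.0);
    unknowns[grouped(1, 42, num_nodes)] = 1.0;  // b_field of node 42 (grouped)
    unknowns[blockwise(1, 42)] = 1.0;           // b_field of node 42 (block-wise)
    return 0;
}
```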