simulation physics mathematical-lattices

Explanation for magic numbers in lattice Boltzmann simulation?

I recently came across this fluid dynamics simulation and have been trying to read up on the lattice Boltzmann method in order to better understand it. Most of the code there is pretty transparent, so between just reading the code and reading Wikipedia I have pretty much figured what everything does... except for the kinematic calculations in the core ``collide` function.

I've been able to figure out that the factors of 1/9, 4/9, and 1/36 are related to the lengths of the vectors connecting cell centers along different lattice directions, and I can even find resources that explain what different factors to use for different lattice types (with a D2Q9 lattice being used in this code). But I haven't been able to find anything that explains how you go from the generic vector equation which defines the collision step of the Lattice Boltzmann Algorithm to the specific nine lines of implementation arithmetic shown below. In particular, where do those factors of 3, 1.5, and 4.5 comes from?

The code used in the linked web page (with minor edits to remove free variables and improve readability) is as follows:

function collide(viscosity) { // kinematic viscosity coefficient in natural units
  var omega = 1 / (3*viscosity + 0.5); // reciprocal of relaxation time
  for (var y=1; y<ydim-1; y++) {
    for (var x=1; x<xdim-1; x++) {
      var i = x + y*xdim; // array index for this lattice site
      var thisrho = n0[i]+nN[i]+nS[i]+nE[i]+nW[i]+nNW[i]+nNE[i]+nSW[i]+nSE[i];

      // macroscopic horizontal velocity
      var thisux = (nE[i]+nNE[i]+nSE[i]-nW[i]-nNW[i]-nSW[i])/thisrho;

      // macroscopic vertical velocity
      var thisuy = (nN[i]+nNE[i]+nNW[i]-nS[i]-nSE[i]-nSW[i])/thisrho;

      var one9thrho = thisrho / 9;
      var one36thrho = thisrho / 36;
      var ux3 = 3 * thisux;
      var uy3 = 3 * thisuy;
      var ux2 = thisux * thisux;
      var uy2 = thisuy * thisuy;
      var uxuy2 = 2 * thisux * thisuy;
      var u2 = ux2 + uy2;
      var u215 = 1.5 * u2;

      n0[i]  += omega * (4/9*thisrho * (1                              - u215) - n0[i]);
      nE[i]  += omega * (  one9thrho * (1 + ux3       + 4.5*ux2        - u215) - nE[i]);
      nW[i]  += omega * (  one9thrho * (1 - ux3       + 4.5*ux2        - u215) - nW[i]);
      nN[i]  += omega * (  one9thrho * (1 + uy3       + 4.5*uy2        - u215) - nN[i]);
      nS[i]  += omega * (  one9thrho * (1 - uy3       + 4.5*uy2        - u215) - nS[i]);
      nNE[i] += omega * ( one36thrho * (1 + ux3 + uy3 + 4.5*(u2+uxuy2) - u215) - nNE[i]);
      nSE[i] += omega * ( one36thrho * (1 + ux3 - uy3 + 4.5*(u2-uxuy2) - u215) - nSE[i]);
      nNW[i] += omega * ( one36thrho * (1 - ux3 + uy3 + 4.5*(u2-uxuy2) - u215) - nNW[i]);
      nSW[i] += omega * ( one36thrho * (1 - ux3 - uy3 + 4.5*(u2+uxuy2) - u215) - nSW[i]);
    }
  }
}

Solution

The code (which seems to be written in C#) starts by calculating the relaxation frequency omega from the viscosity that is set by enforcing a certain Reynolds number Re:= U L / nu. (Generally this is a constant and could thus also be calculated in advance instead of in every single loop.)

var omega = 1 / (3*viscosity + 0.5);

Then the code loops over the entire domain leaving out the first and last nodes of every direction which will be the boundary conditions which will be handled by a different function.

for (var y=1; y<ydim-1; y++) {
   for (var x=1; x<xdim-1; x++) {`

For the storage of the populations it relies on global 1D arrays for every lattice direction \alpha and thus turns to linear indexing to select the correct node. The direction in this case are named 0 for the rest node and according to the corresponding direction (nord, east and so on) for the other populations. Generally codes use a numbering convention instead.

var i = x + y*xdim;

It reads the populations with linear indexing and sums them up to obtain density (zeroth-order momentum).

var thisrho = n0[i]+nN[i]+nS[i]+nE[i]+nW[i]+nNW[i]+nNE[i]+nSW[i]+nSE[i];

Then it determines the velocity by enforcing the first-order momentum \rho*u_i = \sum_\alpha e_{i \alpha} f_\alpha for the current lattice. In this case x points in horizontal direction (east west) and y points in vertical direction (north south), where east and north are the positive directions. This means \alpha = {north, north-east, north-west, south, south-east, south-west}.

var thisux = (nE[i]+nNE[i]+nSE[i]-nW[i]-nNW[i]-nSW[i])/thisrho;

var thisuy = (nN[i]+nNE[i]+nNW[i]-nS[i]-nSE[i]-nSW[i])/thisrho;

Then several variables are declared that will be re-used in the next section: one9thrho and one36thrho are a combination of weights and density for the horizontal/vertical and diagonal populations respectively. The confusing thing here is that ux3 means 3*ux while ux2 is intended to represent ux^2.

var one9thrho = thisrho / 9;
var one36thrho = thisrho / 36;
var ux3 = 3 * thisux;
var uy3 = 3 * thisuy;
var ux2 = thisux * thisux;
var uy2 = thisuy * thisuy;
var uxuy2 = 2 * thisux * thisuy;
var u2 = ux2 + uy2;
var u215 = 1.5 * u2;

The last step handles collision f_\alpha^t (where t stands for temporary as it will be followed up by streaming) = f_\alpha + \omega*(f_\alpha^{eq} - f_\alpha). Every single line is responsible for one discrete direction \alpha, so for a single entry of the vector equation. Note:
- In this case f_\alpha^t is stored in f_\alpha again and therefore it is sufficient to increment (+=) f_\alpha (the arrays nNWSE) by \omega*(f_\alpha^{eq} - f_\alpha). The first term in the brackets is the equilibrium function while the last term corresponds to the current f_\alpha.
- The incompressible equilibrium function is given by f_\alpha^{eq} = w_\alpha \rho (1 + e_{i \alpha} u_i/c_s^2 + (e_i \alpha u_i)^2/(2 c_s^4) - u_i^2/(2 c_s^2)) where we have to sum over all terms containing the index i twice. The lattice speed of sound c_s is given by 1/\sqrt(3)*dx/dx and therefore f_\alpha^{eq} = w_\alpha \rho (1 + 3 e_{i \alpha} u_i + 4.5 (e_i \alpha u_i)^2 - 1.5 u_i^2). The terms one9thrho and one36thrho correspond to the term before the brackets. The sum of ux3 and uy3 correspond to the second term inside the brackets 3 e_{i \alpha}. The second last term 4.5 (e_i \alpha u_i)^2 corresponds to 4.5 u_x^2 or 4.5 u_y^2 for the horizontal or vertical direction and to 4.5 (+-u_x +- uy)^2 for the diagonal direction as both components are present and thus leads to 4.5 (u_x^2 + u_y^2 +- u_x u_y). When subtracted u215 = 1.5*u2 = 1.5*(ux+uy) corresponds to the last term. You will have to take into account if the corresponding velocity projection e_{i \alpha} of lattice velocity \vec e_\alpha in direction i is either 0 or +-1. Latter is responsible for the signs

n0[i]  += omega * (4/9*thisrho * (1                              - u215) - n0[i]);
nE[i]  += omega * (  one9thrho * (1 + ux3       + 4.5*ux2        - u215) - nE[i]);
nW[i]  += omega * (  one9thrho * (1 - ux3       + 4.5*ux2        - u215) - nW[i]);
nN[i]  += omega * (  one9thrho * (1 + uy3       + 4.5*uy2        - u215) - nN[i]);
nS[i]  += omega * (  one9thrho * (1 - uy3       + 4.5*uy2        - u215) - nS[i]);
nNE[i] += omega * ( one36thrho * (1 + ux3 + uy3 + 4.5*(u2+uxuy2) - u215) - nNE[i]);
nSE[i] += omega * ( one36thrho * (1 + ux3 - uy3 + 4.5*(u2-uxuy2) - u215) - nSE[i]);
nNW[i] += omega * ( one36thrho * (1 - ux3 + uy3 + 4.5*(u2-uxuy2) - u215) - nNW[i]);
nSW[i] += omega * ( one36thrho * (1 - ux3 - uy3 + 4.5*(u2+uxuy2) - u215) - nSW[i]);

A code like this is unlikely to be very efficient. To speed the code up you should use two arrays for f_\alpha and f_\alpha^t and perform collision and streaming in one step instead of two. Additionally the equilibrium function can be rewritten further to recalculate as few terms as possible by combining omega and oneXXthrho and further rewrite the quadratic terms. Refer to the code that accompanies "The Lattice Boltzmann Method: Principles and Practice" for better written code. This should already increase performance by at least a factor of two. In order to simulate 3D on your machine you will have to apply several more difficult optimisations. Additionally there are better forums for this topic, for instance Palabos (University of Geneva) and OpenLB (Karlsruhe Institute of Technology).