I recently came across this fluid dynamics simulation and have been trying to read up on the lattice Boltzmann method in order to better understand it. Most of the code there is pretty transparent, so between just reading the code and reading Wikipedia I have pretty much figured what everything does... except for the kinematic calculations in the core ``collide` function.
I've been able to figure out that the factors of 1/9, 4/9, and 1/36 are related to the lengths of the vectors connecting cell centers along different lattice directions, and I can even find resources that explain what different factors to use for different lattice types (with a D2Q9 lattice being used in this code). But I haven't been able to find anything that explains how you go from the generic vector equation which defines the collision step of the Lattice Boltzmann Algorithm to the specific nine lines of implementation arithmetic shown below. In particular, where do those factors of 3, 1.5, and 4.5 comes from?
The code used in the linked web page (with minor edits to remove free variables and improve readability) is as follows:
function collide(viscosity) { // kinematic viscosity coefficient in natural units
var omega = 1 / (3*viscosity + 0.5); // reciprocal of relaxation time
for (var y=1; y<ydim-1; y++) {
for (var x=1; x<xdim-1; x++) {
var i = x + y*xdim; // array index for this lattice site
var thisrho = n0[i]+nN[i]+nS[i]+nE[i]+nW[i]+nNW[i]+nNE[i]+nSW[i]+nSE[i];
// macroscopic horizontal velocity
var thisux = (nE[i]+nNE[i]+nSE[i]-nW[i]-nNW[i]-nSW[i])/thisrho;
// macroscopic vertical velocity
var thisuy = (nN[i]+nNE[i]+nNW[i]-nS[i]-nSE[i]-nSW[i])/thisrho;
var one9thrho = thisrho / 9;
var one36thrho = thisrho / 36;
var ux3 = 3 * thisux;
var uy3 = 3 * thisuy;
var ux2 = thisux * thisux;
var uy2 = thisuy * thisuy;
var uxuy2 = 2 * thisux * thisuy;
var u2 = ux2 + uy2;
var u215 = 1.5 * u2;
n0[i] += omega * (4/9*thisrho * (1 - u215) - n0[i]);
nE[i] += omega * ( one9thrho * (1 + ux3 + 4.5*ux2 - u215) - nE[i]);
nW[i] += omega * ( one9thrho * (1 - ux3 + 4.5*ux2 - u215) - nW[i]);
nN[i] += omega * ( one9thrho * (1 + uy3 + 4.5*uy2 - u215) - nN[i]);
nS[i] += omega * ( one9thrho * (1 - uy3 + 4.5*uy2 - u215) - nS[i]);
nNE[i] += omega * ( one36thrho * (1 + ux3 + uy3 + 4.5*(u2+uxuy2) - u215) - nNE[i]);
nSE[i] += omega * ( one36thrho * (1 + ux3 - uy3 + 4.5*(u2-uxuy2) - u215) - nSE[i]);
nNW[i] += omega * ( one36thrho * (1 - ux3 + uy3 + 4.5*(u2-uxuy2) - u215) - nNW[i]);
nSW[i] += omega * ( one36thrho * (1 - ux3 - uy3 + 4.5*(u2+uxuy2) - u215) - nSW[i]);
}
}
}
var omega = 1 / (3*viscosity + 0.5);
for (var y=1; y<ydim-1; y++) {
for (var x=1; x<xdim-1; x++) {`
var i = x + y*xdim;
var thisrho = n0[i]+nN[i]+nS[i]+nE[i]+nW[i]+nNW[i]+nNE[i]+nSW[i]+nSE[i];
var thisux = (nE[i]+nNE[i]+nSE[i]-nW[i]-nNW[i]-nSW[i])/thisrho;
var thisuy = (nN[i]+nNE[i]+nNW[i]-nS[i]-nSE[i]-nSW[i])/thisrho;
var one9thrho = thisrho / 9;
var one36thrho = thisrho / 36;
var ux3 = 3 * thisux;
var uy3 = 3 * thisuy;
var ux2 = thisux * thisux;
var uy2 = thisuy * thisuy;
var uxuy2 = 2 * thisux * thisuy;
var u2 = ux2 + uy2;
var u215 = 1.5 * u2;
The last step handles collision f_\alpha^t (where t stands for temporary as it will be followed up by streaming) = f_\alpha + \omega*(f_\alpha^{eq} - f_\alpha). Every single line is responsible for one discrete direction \alpha, so for a single entry of the vector equation. Note:
In this case f_\alpha^t is stored in f_\alpha again and therefore it is sufficient to increment (+=) f_\alpha (the arrays nNWSE) by \omega*(f_\alpha^{eq} - f_\alpha). The first term in the brackets is the equilibrium function while the last term corresponds to the current f_\alpha.
The incompressible equilibrium function is given by f_\alpha^{eq} = w_\alpha \rho (1 + e_{i \alpha} u_i/c_s^2 + (e_i \alpha u_i)^2/(2 c_s^4) - u_i^2/(2 c_s^2)) where we have to sum over all terms containing the index i twice. The lattice speed of sound c_s is given by 1/\sqrt(3)*dx/dx and therefore f_\alpha^{eq} = w_\alpha \rho (1 + 3 e_{i \alpha} u_i + 4.5 (e_i \alpha u_i)^2 - 1.5 u_i^2). The terms one9thrho and one36thrho correspond to the term before the brackets. The sum of ux3 and uy3 correspond to the second term inside the brackets 3 e_{i \alpha}. The second last term 4.5 (e_i \alpha u_i)^2 corresponds to 4.5 u_x^2 or 4.5 u_y^2 for the horizontal or vertical direction and to 4.5 (+-u_x +- uy)^2 for the diagonal direction as both components are present and thus leads to 4.5 (u_x^2 + u_y^2 +- u_x u_y). When subtracted u215 = 1.5*u2 = 1.5*(ux+uy) corresponds to the last term. You will have to take into account if the corresponding velocity projection e_{i \alpha} of lattice velocity \vec e_\alpha in direction i is either 0 or +-1. Latter is responsible for the signs
n0[i] += omega * (4/9*thisrho * (1 - u215) - n0[i]);
nE[i] += omega * ( one9thrho * (1 + ux3 + 4.5*ux2 - u215) - nE[i]);
nW[i] += omega * ( one9thrho * (1 - ux3 + 4.5*ux2 - u215) - nW[i]);
nN[i] += omega * ( one9thrho * (1 + uy3 + 4.5*uy2 - u215) - nN[i]);
nS[i] += omega * ( one9thrho * (1 - uy3 + 4.5*uy2 - u215) - nS[i]);
nNE[i] += omega * ( one36thrho * (1 + ux3 + uy3 + 4.5*(u2+uxuy2) - u215) - nNE[i]);
nSE[i] += omega * ( one36thrho * (1 + ux3 - uy3 + 4.5*(u2-uxuy2) - u215) - nSE[i]);
nNW[i] += omega * ( one36thrho * (1 - ux3 + uy3 + 4.5*(u2-uxuy2) - u215) - nNW[i]);
nSW[i] += omega * ( one36thrho * (1 - ux3 - uy3 + 4.5*(u2+uxuy2) - u215) - nSW[i]);
A code like this is unlikely to be very efficient. To speed the code up you should use two arrays for f_\alpha and f_\alpha^t and perform collision and streaming in one step instead of two. Additionally the equilibrium function can be rewritten further to recalculate as few terms as possible by combining omega and oneXXthrho and further rewrite the quadratic terms. Refer to the code that accompanies "The Lattice Boltzmann Method: Principles and Practice" for better written code. This should already increase performance by at least a factor of two. In order to simulate 3D on your machine you will have to apply several more difficult optimisations. Additionally there are better forums for this topic, for instance Palabos (University of Geneva) and OpenLB (Karlsruhe Institute of Technology).