Multiphysics flow simulations using D3Q19 lattice Boltzmann methods based on central moments

In a recent work [A. De Rosis, R. Huang, and C. Coreixas,"Universal formulation of central-moments-based lattice Boltzmann method with external forcing for the simulation of multiphysics phenomena", Phys. Fluids 31, 117102 (2019)], a multiple-relaxation-time lattice Boltzmann method (LBM) has been proposed by means of the D3Q27 discretization, where the collision stage is performed in the space of central moments (CMs). These quantities relax towards an elegant Galilean invariant equilibrium, and can also include the effect of external accelerations. Here, we investigate the possibility to adopt a coarser lattice composed of 19 discrete velocities only. The consequences of such a choice are evaluated in terms of accuracy and stability through multiphysics benchmark problems based on single-, multi-phase and magnetohydrodynamics flow simulations. In the end, it is shown that the reduction from 27 to 19 discrete velocities have only little impact on the accuracy and stability of the CM-LBM for moderate Reynolds number flows in the weakly compressible regime.


I. MOTIVATION
Numerical simulations of viscous fluid flows are routinely performed by scientists involved in both the academic and industrial sectors. These simulations can be approached by different viewpoints. The most common one is the so-called macroscopic or continuumbased level, with the problem being governed by the (in)compressible Navier-Stokes equations (NSEs). As a drastic alternative, molecular dynamics (MD) idealizes a certain volume of fluid by a finite set of particles obeying Newton's law. Interestingly, MD possesses the great advantage to introduce physics properties at the microscopic level. However, if one wants to simulate a relatively large-scale realistic fluid problem, the number of required particles becomes very high, thus dramatically increasing the computational cost. Between MD and NSEs there is another level, called mesoscopic or kinetic, and its practical implementation is represented by the lattice Boltzmann method (LBM) [1][2][3] .
Roughly speaking, the LBM recovers the behavior of fluid dynamics from the motion of populations (or distribution functions) of fictitious particles, that collide and stream along the links of a fixed Cartesian lattice, representing the fluid domain. The collision stage is the core of any LB algorithm because it retains the whole flow physics. The so-called BGK collision operator (named after its authors Bhatnagar Gross and Krook 4 ) represents the simplest, yet effective, and most popular approximation, where all the populations relax with the same common rate towards a discrete equilibrium state. The latter is usually derived by applying a Gauss-Hermite a) Electronic mail: alessandro.derosis@manchester.ac.uk b) Electronic mail: christophe.coreixas@unige.ch quadrature to the continuous Maxwellian distribution 5,6 . Despite its simplicity and popularity, the BGK-LBM is unsuitable for the prediction of turbulent flows as it becomes rapidly unstable in the limit of vanishing viscosity. This is mainly (though not exclusively) due to the intrinsic unavoidable presence of non-hydrodynamic ghost modes, which undermine the stability of the algorithm through spurious couplings with hydrodynamic ones [7][8][9][10] . Originally proposed to introduce free parameters that could be used to either increase the stability (by damping non-hydrodynamic modes) and/or the validity range of LBMs (variable Prandtl number, etc), the multiplerelaxation-time (MRT) LBM introduced the idea to perform the collision in a space of raw (or absolute) moments of different order [11][12][13] . While second-order ones entail the flow and relax with a certain frequency directly linked to the fluid kinematic and bulk viscosities, higher-order ones address the above-mentioned ghost modes and the corresponding frequencies can be (almost) freely tuned to improve the stability of the resulting LBM [14][15][16] .
Another problem affecting the LBM is the lack of Galilean invariance at all the orders, which results in velocity-dependent transport coefficients [17][18][19] . For standard lattices, this is due to the fact that populations (for the BGK) and moments (for the MRT) relax to equilibrium states derived through a second-order truncated Taylor expansion in the local Mach number of the above-mentioned continuous Maxwellian distribution 18 . In 2006, Geier et al. 20 argued that this problem could be solved by performing the collision stage in the space of central moments (CMs), obtained by shifting the lattice directions by the local fluid velocity. Moreover, they imposed the match between the CMs of the continuous Maxwellian distribution and those of the discrete counterpart. Consequently, this methodology assumes that equilibrium CMs are unchanged by the velocity discretization of the Boltzmann equation, which might not be true depending on both equilibrium CMs of interest and on the considered lattice of discrete velocities. The resulting scheme, called cascaded LBM, remarkably outperformed both the BGK-and MRT-LBMs in terms of stability thanks to: (1) the implicit use of an extended equilibrium, and (2) the equilibration of CMs related to high-order moments and bulk viscosity 21 .
More recently, it was demonstrated that it is possible to adopt a CMs-based procedure where moments relax to whatever discrete equilibrium state and for whatever lattice discretization [22][23][24][25][26] . It is possible to switch from populations to CMs (and vice versa) by simply multiplying (or dividing) by a transformation matrix that depends on the adopted lattice and the local fluid velocity. These early attempts relied on a second-order equilibrium state in the D2Q9 and D3Q27 lattice velocity spaces. The adoption of such a simple (yet naive and incomplete) equilibrium distribution generates a non-negligible number of non-zero velocity-dependent equilibrium CMs. As a consequence, post-collision populations are functions of all these terms, which leads to a huge computational overhead 27 . However, by including the correct higher-order terms in the definition of the discrete equilibrium [28][29][30][31] , the methodology outlined in Ref. 23 was proven to lead to Galilean invariant CMs with the D3Q27 discretization 32 . This D3Q27-CM-LBM was further shown to recover the behavior of the original cascaded LBM by only relying on the correct set of Hermite polynomials. Since this family of polynomials is tightly linked to the velocity discretization of the Boltzmann equation 5,6,33 , the correct set of Hermite polynomials is known in an a priori way, hence ensuring its consistency for any kind of lattices. The derivation of D3Q19-LBMs based on Galilean invariants CMs makes no exception, and only requires to apply simple pruning rules on its D3Q27 counterpart 31 .
During the past three decades, numerous velocity discretizations have been proposed for the simulation of isothermal and weakly compressible flows. For example, lattices based on 15 and 19 discrete velocities were introduced by Qian et al. 34 , while Ladd 35 discarded the velocity corresponding to particles at rest, hence, ending up with a D3Q18 lattice. These three velocity discretizations directly flow from the more general D3Q27 lattice through prunning 2 . This is one of the common way to reduce the number of discrete velocities, while keeping most of the macroscopic properties intact for LBMs based on second-order equilibria. The other popular strategy is the moment-matching approach which leads, e.g., to the smallest lattice for the simulation of fluid flows, namely, the D3Q13 lattice 36 .
The D3Q19 has been the default choice for a long time, notably due to checkerboard instabilities arising for smaller lattices 2,13,37 . There has been sporadic interest using the D3Q27 lattice for problems where the rotational invariance of the numerical solution is of particular concern [38][39][40][41][42][43] . However, it was recently proven that numerical errors -notably those induced by the equilibriumare at the origin of the latter rotational problem in-stead of the lattice itself 44 . This is in agreement with the fact that several commercial solvers (e.g., Power-FLOW and ProLB) rely on D3Q19 formulations -with regularized/filtered collision models whose equilibria include high-order velocity terms-without suffering from such anistropic issues [45][46][47][48][49] .
Surprisingly, most research works rely on D3Q27-CM-LBMs. This is also the case of XFlow software which is a CM-LB solver dedicated to industry-oriented flow simulations 50,51 . Nevertheless, it is possible to find in the literature formulations of CM-LBMs based on D3Q15 and D3Q19 lattices, even though the latter are pretty rare. As an example, Premnath & Banerjee 52 showed a comparison between the D3Q27 and D3Q15 lattices using these techniques, and they obtained similar results for low Reynolds and Mach number flow simulations. More recently, Fei et al. 27 have presented an improved implementation of the cascaded scheme for both D3Q27 and D3Q19 lattices, even though, no comparative study was performed by the authors. In the end, it is still not clear if one can reduce the lattice size in the context of CM-LBMs without deteriorating the accuracy of the solver. This is even more true in the context of multiphysics flow simulations, for which, to the best of the authors' knowledge, no systematic accuracy/stability comparison study can be found in the literature.
In this paper, we then aim at deriving a D3Q19-CM-LBM for the simulation of multiphysics flows with or without external acceleration. To make sure this is not done at the expense of accuracy and/or stability, both points will be at the center of the comparative study that will be carried out throughout the paper. The rest of the paper is organized as follows. Sec. II presents a detailed derivation of our D3Q19-CM-LBM which is based on extended formualtions of the equilibrium and forcing terms. This approach is thoroughly tested against several well-defined and consolidated benchmark problems in Sec. III. Some conclusions are drawn in Sec. IV. Finally, additional details are given in Appendices: (A) impact of extended equilibria on stability and accuracy, (B) derivation of the extended forcing term, (C) raw moment formulation of our approach, (D) recalls on the D3Q27-CM formulation, and (E) color-gradient method algorithm.

II. METHODOLOGY
In this Section, we first recall the classic D3Q19-BGK-LBM. Then, we derive a collision operator in the space of CMs and present the D3Q19-CM-LBM. Finally, we demonstrate that the classical D3Q19-MRT-LBM based on the relaxation of raw (or absolute) moments can be interpreted as a particular case of our D3Q19-CM-LBM. If not otherwise stated, the LB unit system will be used henceforth, where the grid spacing and the time step are both equal to 1.
predicts the space and time evolution of the particle distribution functions |f i = [f 0 , . . . , f i , . . . , f 18 ] that collide and stream along the generic link i = 0 . . . 18 with the discrete velocities c i = [|c ix , |c iy , |c iz ] defined as As usual, this numerical scheme can be divided into two parts, i.e., collision: and streaming: where the superscript denotes post-collision quantities here and henceforth. Let us implicitly assume the dependence on the spatial position x and the time t in the following. Within the BGK approximation, the collision operator Ω i can be written as a relaxation of the populations towards an equilibrium state f eq i , i.e.
where ω is a relaxation frequency that is linked to the fluid kinematic viscosity ν as c s = 1/ √ 3 being the lattice sound speed of the D3Q19 velocity discretization 2,3 . The source term F i is usually treated according to the popular model by Guo et al. 53 . The choice of the equilibrium populations is instrumental to recover the correct physics of a phenomenon. At a first glance, one might be tempted to use the popular secondorder truncated expression 34 ρ and u = [u x , u y , u z ] being the mass density and the flow velocity, respectively, and the weights are w = [w 0 , w s , w s , w s , w l , w l , w l , w l , w l , w l , w s , w s , w s , w l , w l , w l , w l , w l , w l ] , with w 0 = 1/3, w s = 1/18 and w l = 1/36. Lattice directions c i and weights w i are defined according to Ref. 54 , in order to further reduce memory consumption thanks to the "swap trick". However, several authors demonstrated that the full potential of any LB discretization (in terms of physical and numerical properties) can only be achieved by using the complete allowable set of Hermite polynomials 9,21,28,29,31 . This is true for all lattices derived through a tensor-product of lower-order ones (e.g., D2Q9 and D3Q27), but with the D3Q19 lattice a pruning strategy must be adopted. Using the latter strategy, Coreixas et al. proposed a derivation of the equilibrium that is compliant with all collision models (Appendix H of Ref. 31 ), and the corresponding expressions are f eq 10 = ρ As usual, macroscopic variables are computed as the zeroth-and first-order moments of the populations: It should be noted that the u 3 terms restore the Galilean invariance for shear flows aligned with the coordinate axes 55 . However, a complete restoration is impossible for standard lattices because one cannot add the diagonal terms u 3 x , u 3 y and u 3 z to the components of the third moment of f eq i . The partial restoration of Galilean invariance leads to an anisotropic stress-strain relation that can increase the errors for shear flows inclined to axes 56 , that can only be removed through correction terms [57][58][59][60] . Nevertheless, the extended equilibrium (9) should be preferred to its second-order counter part (7), as it allows for better stability for simulations at moderate Mach numbers, and in the low viscosity regime (see App. A for more details).

B. General D3Q19-MRT-LBM
The lattice Boltzmann equation (LBE) with the forcing term can be generally expressed as 61,62 where |• denotes a row vector.Notice that Eq. (11) collapses into the aforementioned BGK-LBM if the collision matrix is set to Λ = ωI, where I is the unit tensor. The term F i accounts for external body forces F = [F x , F y , F z ] and its role will be elucidated later.
Its prefactor accounts for discrete effects originating from the change of variables that aims at obtaining a numerical scheme explicit in time 53 . Again, the LBE can be divided into two steps, i.e., collision and streaming Let us first neglect the presence of F i . In order to build a CMs-based collision operator, the lattice directions are shifted by the local fluid velocity 20 . These shifted or peculiar discrete velocitiesc i = [ c ix | , c iy | , c iz |] are defined as where •| denotes a column vector. In order to apply the collision step in the CM space, one must choose a suitable basis. In the present case, the following transformation matrix (from populations to CMs) is proposed: This basis directly flows from its D3Q27 counterpart where monomials related to discrete velocities (±1, ±1, ±1) are discarded 31,63 . In addition, by decoupling moments related to compression/dilation phenomena (trace of the second-order-moment tensor) from those controlling shear phenomena (off-diagonal terms), it is possible to adjust the bulk viscosity independently from its shear counterpart thanks to a diagonal relaxation matrix 2,31,63 . The relaxation matrix in the populations space then reads Λ = T −1 KT, where in the present case K = diag[1, 1, 1, 1, 1, ω, ω, ω, ω, ω, 1, . . . , 1] is the 19 × 19 relaxation matrix in the CMs space. The latter has been chosen in order to (1) take into account external forces (if needed be) through the non-zero first four relaxation frequencies, (2) impose the kinematic viscosity ν thanks to ω = 1/(ν/c 2 s +1/2), and (3) improve the numerical stability via the equilibration of bulk and high-order CMs.
( 16) respectively. The first two quantities are evaluated by applying the matrix T to the corresponding distribution, that is Interestingly, applying the transformation matrix T to equilibrium populations in Eqs. (9) generates the following equilibrium CMs: while the remaining terms are equal to zero. It is of interest to notice that the equilibrium CMs are Galilean invariant, as no dependence on the fluid velocity is present. This is consistent with the theoretical findings in Ref. 32 , where it has been demonstrated that, for tensor-productbased lattices (D2Q9 and D3Q27), Galilean invariant equilibrium CMs are found if the transformation matrix T is applied to discrete equilibrium populations accounting for the correct high-order Hermite polynomials -those based on tensor products of second-order Hermite polynomials. It is worth noting that one could further discard the remaining lattice-dependent terms (those proportional to the lattice constant c s ) thanks to the central-Hermite moment approach, as explained in Ref. 31 . The collision process takes place as After the collision, non-zero CMs read as follows: where pre-collision CMs are Now, we are in the position to reconstruct post-collision populations Eventually, populations are streamed (see Eq. (4)) and macroscopic variables are computed by Eq. (10). In the present model, the forcing term F i is accounted for through the collision step in Eq. (11). The latter is applied in the CM space, and consequently, it requires the computation of the forcing term CMs, R i , following In presence of external forces, post-collision CMs then read as with k 0 = ρ, if one assumes that CMs of the forcing term are Galileant invariant, which can be enforced paying attention to the particular nature of the D3Q19 lattice. A detailed derivation of the forcing term expressed in the velocity space, i.e., F i , is provided in Appendix B. Deriving it in the velocity space is of paramount importance because it allows its extension to any kind of moment space in a straightforward manner, as already demonstrated for both D2Q9 and D3Q27 lattices in our previous work 64 . Interestingly, accounting for external forces does not modify the rest of the procedure, which will henceforth be referred to as the D3Q19-CM-LBM. The interested reader may also refer to Appendix C for the raw moment (RM) for- mulation of the present algorithm. Moreover, the script D3Q19CentralMoments.m in the supplementary material allows the reader to perform all the symbolic manipulations to derive the proposed methodology.

C. Some computational details
It is worth to highlight the benefits, in terms of computational cost and memory consumption, coming from the adoption of the present D3Q19-CM-LBM rather than the more standard D3Q27-CM-LBM 64 . Let us denote as ∆ the number of lattice sites characterizing a certain LB simulation. The reduced memory requirements of the former model clearly stem when populations are considered. Indeed, one can save (19/27) × 100 ≈ 30% when the simplest lattice model is considered (see Table I).
Now, let us consider the number of involved floating point operations. Firstly, one can immediately observe that the computation of macroscopic variables (10) and pre-collision CMs (21) needs to span a different number of directions (19 vs. 27). Hence, the simplest discretization allows us to reduce of approximately 30% the computational cost involved in the computation of ρ, u and k 5...9 . Moreover, the computation of postcollision moments and populations is drastically lighter when the 19-velocities discretization is adopted. In fact, the D3Q19-CM-LBM needs to evaluate the expressions in Eqs. (C5,C7). One can immediately observe that for the D3Q27-CM-LBM [see Eqs. (D5) and the script D3Q27CentralMoments.m attached to the Supplemental Material], a larger number of floating point operations is required, hence, drastically increasing the computational cost of the CM-LBM as compared to the present D3Q19 formulation.
In the next section, we compare the numerical properties of the present D3Q19-CM-LBM against its D3Q27 counterpart 64 for the simulation of multiphysics flows. The interested reader can refer to Appendix D for further details regarding the D3Q27-CM-LBMs.

III. NUMERICAL TESTS
We compare the numerical properties of the D3Q19-CM-LBM with those of its D3Q27 counterpart through eight well-defined consolidated benchmark tests. The first five problems focus on the simulation of single phase flows in absence of external forces: • Taylor-Green vortex, • double shear layer, • lid-driven cavity, • dipole-wall collision, • three-dimensional Taylor-Green vortex.
The sixth one, i.e. Hartmann flow, introduce the Lorentz force in the resulting magnetohydrodynamic system. The interested reader can refer to the work by Dellar 65 for further details regarding the computation of the magnetic field. This section ends with two cases dealing with multiphase flows, i.e., • a static bubble of a certain fluid immersed in another one is considered by means of the well-known Shan-Chen pseudopotential force; • Rayleigh-Taylor instability mechanism is simulated hereafter by adopting the color-gradient method.
If not otherwise stated, populations will be initialized by assuming they are at equilibrium (the latter being computed thanks to initial macroscopic fields) and boundary conditions are imposed by the regularized technique 66 .

A. Taylor-Green vortex
We test the convergence properties of the adopted approach against the popular Taylor-Green vortex benchmark problem 67 . Let us consider a square periodic domain of length 2π with the following initial conditions: with ξ = 2π/N . The domain is idealized by N × N grid points in the x − y plane, whereas only 1 point is adopted in the z direction. The time evolution of the fluid velocity computed by our algorithm is compared to the analytical prediction where the characteristic time is T = 2ξ 2 ν −1 . Specifically, the numerical and analytical solutions are collected in the vectors σ n and σ a , respectively, at t = T . Then, the relative error between the two is evaluated as • denoting the L2-norm. A convergence analysis is carried out by varying the number of lattice points, N , discretizing each side of the domain, i.e. N = 8, 16, 32, 64, 128, 256, 512. Moreover, we investigate For the lowest value of u 0 , an optimal convergence value equal to 2 is found. As u 0 increases, the accuracy and convergence properties of the method deteriorate as well. This behavior should be addressed to the impossibility to add the diagonal terms u 3 x , u 3 y and u 3 z to the components of the third moments of the equilibrium populations 56 .These findings are consistent with those obtained by the D3Q27 lattice discretization. Figure 1(b) shows the results of the convergence analysis at u 0 = 0.3 by adopting the D3Q19-and D3Q27-CM-LBMs. Except for the coarsest resolution, curves are well overlapped, meaning that the accuracy loss observed for higher values of u 0 is not related the reduction of discrete velocities from 27 to 19. In fact, for both cases, the aliasing defect c 3 iζ = c iζ (ζ = x, y or z) is at the origin of the velocity-dependent error terms, that are related to compression/dilation phenomena (trace of the viscous stress tensor), and which can only be dealt with using correction terms 18,58,60 .

B. Double shear layer
An excellent candidate to evaluate the stability of any numerical scheme is represented by the double shear layer test 69,70 . By considering a two-dimensional periodic domain with (x, y) ∈ [0, L] 2 , initial conditions are given by two longitudinal shear layers and a superimposed trans-  verse perturbation, i.e., and where κ = 80 and δ = 0.05. The Reynolds and Mach numbers are Re = u 0 L/ν = 3 × 10 4 and Ma = u 0 /c s = 0.57, respectively, with L = 256. Only one point is considered in the direction z. Figure 2 sketches the (normalized) vorticity field at t/t 0 = 1 (with t 0 = L/u 0 ) and confirms the rise of a Kelvin-Helmholtz instability mechanism, where the flow physics manifests the roll-up of the shear layers and the generation of two counter-rotating vortices.
In Figure 3 percentage relative discrepancy of ∼ 0.009%. These findings are confirmed by the plot of the kinetic enstrophy (normalized by the initial value) in Figure 3(b), where curves obtained by the two approaches are, again, overlapped.
In Appendix A, this test is further used to demonstrate the stability improvement induced by the extended equilibrium (9) as compared to its second-order counterpart.

C. Lid-driven cavity
The lid-diven cavity 68,71 represents one of the most canonical problem to evaluate the accuracy of numerical schemes. Let us consider a square domain of length L = 201. At the top section, a constant uniform rightward velocity u lid = 0.01 is imposed, while the no-slip condition is enforced at the remaining edges. The initial conditions are ρ(x, t = 0) = 1 and u(x, t = 0) = 0. velocity field at the end of each simulation is reported in Figure 5. Again, the contour plot is in full agreement with those drawn in Ref. 68 .

D. Dipole-wall collision
We further evaluate the numerical performance of the D3Q19-CM-LBM by examining the flow physics gener-ated by a dipole-wall collision 72,73 . Let us consider a square domain (x, y) ∈ [−1 : 1] 2 , enclosed by no-slip walls at each side. The velocity is initialized as  Table II, the values of the energy E and enstrophy Ψ at salient time instants are reported. Findings from the D3Q19-CM-LBM run are compared to the reference solution in Ref. 73 and to a recent LB effort 72 .
One can immediately observe that the D3Q19-and the D3Q27-CM-LBMs produce identical results. In turn, they show a slight mismatch (up to 3%) with respect to the LB study by Mohammed et al. 72 . It should be noted that findings in Ref. 72 are closer to the reference ones by Clercx & Bruneau 73 . We address this behavior to the adoption in Ref. 72 of (i) a more accurate boundary condition and (ii) a lower Mach number. The time evolutions of the energy and enstrophy are reported in Figure 6.
Furthermore, we investigate a configuration where an inclined collision is present. Specifically, we ro-   tate the dipole by 30 degrees counter-clockwise by setting (x 1 , y 1 ) = (0.0839, 0.0866) and (x 2 , y 2 ) = (0.1839, −0.0866). In Table III we compare the kinetic energy computed by the proposed approach at different time instants and Reynolds number against recent findings in Ref. 72 . Again, a very good agreement is found with a slight mismatch up to 1.2%. The accuracy of the method is further highlighted in Table IV, where the time instants corresponding to the rise of the the first and second maxima of the enstrophy agree very well with those in Ref. 72 . In Figure 7, the vorticity magnitude is sketched at salient time instants for the afore-mentioned configuration. Present findings corroborate those in Ref. 72 . In particular, both studies show that the vortex hits the left wall at t = 1 if the normal collision is considered, with progressively smaller-scale structures arising as Re increases.

E. Three-dimensional Taylor-Green vortex
We investigate the numerical performance of the proposed methodology against a three-dimensional Taylor-Green vortex 74,75 . Let us consider a cubic periodic domain with edge length D. The flow develops due to the following initial conditions:       u z (x, t = 0) = − u 0 2 sin x sin y cos z.
By setting D = 128, we run several simulations by varying the Reynolds number Re = u 0 D/ν = 1600, 30000 and Mach number Ma = u 0 /c s = 0.2, 0.4, 0.6. In Figure 8, results from all the above mentioned cases are reported in terms of the evolution of the kinetic energy normalized by its initial value. For the lowest value of Re, findings are substantially insensitive to Ma. Moreover, results obtained by the adoption of the D3Q19-CM-LBM overlap very well those provided by its D3Q27 counterpart, with a relative discrepancy of ∼ 0.8%. However, the behavior becomes more interesting when a higher value of Re is considered. Indeed, the adoption of the D3Q19-CM-LBM leads to a stable simulation only for Ma = 0.2, where diffusive phenomena seem to be underestimated due to remaining velocity-dependent errors in the viscous stress tensor. The latter issue eventually leads to stability issues, at t/t 0 ∼ 1 with t 0 = D/u 0 , for higher Mach numbers. On the other hand, the D3Q27-CM-LBM does not undergo any instability. Notably, the relative difference between the solutions provided by the two algorithms is now more prominent and equal to ∼ 7.9%. Finally, the second invariant of the velocity gradient tensor, also know as Q-criterion, is depicted in Figure 9 at t = 5, where the turbulent behavior of the flow can be appreciated especially at Re = 30000. An animation of the vorticity field is also available at https://www. youtube.com/watch?v=QfQ_CpN1CV4, together with one of the Q-criterion https://www.youtube.com/watch?v= OKTh4YWjZ6g. As a conclusion, this testcase shows that one condition to move from the D3Q27 formulation to its D3Q19 counterpart would be to ensure that Ma ≤ 0.2 for under-resolved conditions, in order to keep good stability and accuracy properties.

F. Hartmann flow
In order to test the capability of the present model with forcing, we investigate the so-called Hartmann flow, that is the analogous of the Poiseuille flow for an electrically conductive fluid of magnetic resistivity η = ν. Here, the forcing scheme (24) is based on the new formulation (B4). The latter is used to account for the Lorentz force F = j × b, j, where j is the electric current that is computed directly from the populations 76 where y = 2y − L and the Hartmann number is defined as The convergence properties of the proposed approach are sketched in Figure 10 is found. Moreover, and again in agreement with Ref. 64 , a poor convergence is experienced for low values of L and high values of Ha, where the presence of thin Hartmann layers requires a larger number of grid points to be successfully and accurately reconstructed. The accuracy of the proposed approach is compared to the solution provided by the D3Q27-CM-LBM. Specifically, we re-run the simulation at Ha = 20 by using the finer lattice discretization and the results are summarized in Table V. One can immediately appreciate that the two schemes exhibit very similar accuracy.

G. Static bubble
Now, the accuracy of the present approach is tested in the context of a multiphase flow simulated through the popular Shan-Chen model 77 . By introducing the socalled pseudo-potential ψ = 1 − exp (−ρ), an interaction force is used to mimic the molecular interactions leading to phase segregation, where G is a parameter controlling the strength of the interaction. Let us consider a periodic domain consisting of 100 lattice points in each direction. A droplet of variable radius R ∈ [15 : 30] and density equal to 1.95 is placed in the center of the domain, while the density is set to 0.15 elsewhere. The kinematic viscosity is ν = 0.0333. The parameter G is set equal to -5. In Figure 11, the pressure jump across the interface is plotted against the inverse of the bubble radius. The linear evolution of the pressure jump with respect to the bubble radius proves that the present approach is able  to account for surface tension as described by Young-Laplace's law. Furthermore, we use this test to evaluate the ability of the present model to tackle spurious currents at the interface, that are well-known numerical artifacts affecting the Shan-Chen model. By considering R = 20, this quantity is sketched in Figure 12 together with findings from simulations carried out by using the D3Q19-BGK-LBM with extended equilibrium (9). The simplest collision model clearly exhibits strong anisotropic artifacts that undermine the stability and the accuracy of the run, and which are most likely related to poor spectral properties (dissipation and dispersion). Interestingly, this is drastically alleviated by the adoption of CMs with equilibration of CMs related to high-order kinetics and bulk viscosity. In fact, the resultant velocity map is considerably smoother and it shows a reduced magnitude. This might be related to the introduction of numerical (hyper-)viscosity induced by the equilibration of high-order moments 14,16 as well as the increased bulk viscosity 78 .

H. Rayleigh-Taylor instability
We conclude the numerical campaign by simulating the Rayleigh-Taylor instability mechanism with a D3Q19-CM implementation of the color-gradient method (CGM) [83][84][85] . The interested reader can refer to App. E for further details. Let us consider a three-dimensional domain of size W × 4W × W , with W = 64, where a fluid of density ρ h = 3 is placed over a lighter one of density ρ l = 1. The fluid is initially at rest and initial conditions in terms of density read as follows The domain is periodic at every side, except for the top and bottom sections where the no-slip boundary condition is assigned. The flow is driven by a gravitational body force, that is with g = (0, −g, 0), and g chosen so that t 0 = √ gW = 0.04 83 . The problem is governed by two dimensionless parameters, that are the Reynolds number Re = W gW /ν = 512, and Atwood number, At = (ρ h − ρ l )/(ρ h + ρ l ) = 0.5. In Figure 13, the evolution of the interface between the two fluids is sketched at salient time instants. Notice that the interface is identified as the set of lattice points where Table VI. Present findings are compared to several models to assess its accuracy: (i) the D3Q27-CGM-CM-LBM recently proposed in Ref. 64 , (ii) a D3Q27-CGM-MRT study 79 , (iii) a D3Q15-BGK LB model for multiphase flows 80 , (iv) a D3Q19-phasefield MRT LB scheme 81 and (v) a solution of the coupled Navier-Stokes-Cahn-Hilliard equations 82 . From this, the present method shows a pretty good agreement with data from the literature, even though some discrepancies are also observed as t/t 0 grows, the latter being related to the equilibration of high-order moments as well as the increased bulk viscosity of both CMs-based algorithms. In any case, the reduction of the number of discrete velocities does not deteriorate the accuracy of the CM-LBM, and this confirms the good numerical properties of the proposed approach, as well as, its universality.

IV. CONCLUSIONS
In this paper, a three-dimensional lattice Boltzmann method has been proposed for the simulation of multiphysics phenomena (single phase, multiphase and magnetohydrodynamic flows). By adopting the D3Q19 velocity discretization instead of the more standard -but also more computationally intensive-D3Q27 formulation, non-negligible gains are obtained in terms of both wall-clock time and memory consumption. In order to understand the limitations of such a choice, a large number of validation testcases were considered with a special    Taylor  focus on accuracy and stability discrepancies that would emerge from the adoption of the D3Q19 lattice. Most of them confirmed that the latter discretization is pretty similar to its D3Q27 counterpart in terms of accuracy convergence and stability. In fact, it is only for finite Mach numbers ( Ma ≥ 0.2) and under-resolved conditions that one should not reduce the number of discrete velocities in order not to face stability and accuracy issues. Apart from that, it seems quite safe to adopt the D3Q19 formulation in order to speed up the simulation of multiphysics flows.
In parallel, we proved that by including up to fourthorder velocity terms in the equilibrium (as it is naturally the case for D3Q19-CM-LBMs), one can improve the stability of LBMs for the simulation of single phase flows in the low viscosity regime, and at moderate Mach numbers. One may then wonder if such a result can be extended to more complex flows. Corresponding investigations will be presented elsewhere.
Eventually, as a possible extension to this work, one could include correction terms for velocity-dependent errors (those that are still present in the viscous stress tensor) in order to further improve the accuracy and stability of the present approach. This is motivated by the fact that these errors terms are usually non-negligible in under-resolved conditions (since they are proportional to the space step) and for finite values of the Mach number (because they depend on the velocity field). Hence, corrections terms employed in the context of compressible LBMs 57-60 might further improve the present model for moderate Mach number flow simulations.

SUPPLEMENTARY MATERIAL
The script D3Q19CentralMoments.m allows us to perform all the symbolic manipulations to re-build the model proposed in this paper.
The script D3Q27CentralMoments.m allows us to derive the centralmoments-based scheme in the 27-velocities lattice discretization.

DATA AVAILABILITY STATEMENT
The data that supports the findings of this study are available within the article [and its supplementary material].
Appendix A: Impact of the extended equilibrium The test in Sec. III B is here adopted to demonstrate the impact of the extended equilibrium. In Figure 14, the time evolution of the normalized kinetic energy is plotted for different values of the Mach number, Ma = 0.2, 0.35, 0.4 by using four different approaches: (i) the present D3Q19-CM-LBM, the BGK-LBM with velocity terms up to the (ii) second order, (iii) third order, and (iv) fourth order. For the lowest value of Ma, all the runs can successfully simulate the whole desired time span and they produce similar results. More intriguing results are achieved as Ma grows. In fact, the poorer BGK-LBM blows up at t/t 0 ∼ 0.65 when Ma = 0.35, while the BGK-LBMs with higher velocity terms and the D3Q19-CM-LBM are still stable. Therefore, we can assess that the introduction of higher-order velocity terms in the equilibrium population leads to an increase of the stability of the single-relaxation-time LBM. Then, when Ma = 0.4 all the BGK-LBMs fail, with the BGK-LBM with second-order equilibrium undergoes instabilities even before (t/t 0 ∼ 0.5). The adoption of a more sophisticated equilibrium slightly alleviates this problem, as it is able to reach t/t 0 ∼ 0.86 before blowing up. In the latter case, both third-and fourth-order equilibria lead to almost identical results.
In the end, the extended equilibrium seems to be one of the stabilizating mechanism of the present LBM -in addition to the equilibration of high-order and bulk viscosity related moments. This is in accordance with results obtained through linear stability analyses and numerical simulations for various collision models in the context of D2Q9-LBMs 14,21,30,86 .

Appendix B: Galilean invariant forcing scheme
While the derivation of Galilean invariant forcing terms (for any kind of moment space) is rather straightforward in the context of multiphysics flow simulations based on the D3Q27 velocity discretization 64 , it is slightly more complex for the D3Q19 lattice since the latter is not built through tensor products of D1Q3 lattices in each direction x, y and z. Nonetheless, by relying on the formulation based on the raw moment (RM) space, a general strategy can be proposed to construct D3Q19 formulations thanks to their D3Q27 counterparts 31 . The question is then: how can we move from the Gauss-Hermite formulation -that was proposed for the D3Q27 lattice in our previous study 64 -to the RM formulation of interest? The naive manner would be to rewrite it in terms of Hermite moments (HMs) by replacing weights by their values (since this is a non-weighted formulation), and further converting it to a RM formulation using relationships between HMs and RMs. Nevertheless, there is a more straightforward way to achieve the same goal thanks to the moment matching approach. Indeed, assuming that the form of the forcing RMs (R RM pqr ) is known, then the moment matching condition in the velocity space reads, ∀p, q, r ≤ 2 which leads for the D3Q27 lattice to where, for the sake of compactness, the tensor product notation has been adopted with (σ, λ, χ) ∈ {±1} 3 . By discarding discrete velocities (±1, ±1, ±1), the following constraints are obtained and compute R RM pqr through relationships between HMs and RMs. Another way to do it is by starting from central moments (CMs) of the forcing term 63 and compute R RM pqr through relationships between CMs and RMs. One can even start from the central Hermite moment (CHM) framework which is the only one lead-ing to velocity-and lattice-independent moments of the forcing term, i.e., and the other CHMs equal to zero, and eventually coming back to RMs through formulas provided in 31 . Whatever the approach considered, RMs of the forcing term eventually read Interestingly, due to relationships between all moment spaces, R HM pqr can be obtained from R RM pqr by neglecting lattice-dependent terms (those proportional to c s ), while R CM pqr is obtained by discarding velocity-dependent terms. Eventually, R CHM pqr is derived by neglecting both latticeand velocity-depend terms in R RM pqr .
Finally, it should be noted that, when N = I, particular attention should be paid to the computation of pre-collision and equilibrium moments, as the un-shifted transformation matrix T = M is used. Indeed, equilibrium moments read as follows z k eq 7 = ρu x u y , k eq 8 = ρu x u z , k eq 9 = ρu y u z , k eq 10 = ρu y u 2 x + c 2 s , k eq 11 = ρu x u 2 y + c 2 s , k eq 12 = ρu z u 2 x + c 2 s , k eq 13 = ρu x u 2 z + c 2 s , k eq 14 = ρu z u 2 y + c 2 s , k eq 15 = ρu y u 2 z + c 2 s , k eq 16 = ρ u 2 x + c 2 and the resultant post-collision state is with One can immediately observe that the RM implementation simply reduces at posing |r i = |k i , because N = I in Eq. (C4). A comparison between raw and central moments in terms of involved CPU time is given in Appendix D.  Eventually, we use the test case in Section III E to compare the CPU time required by the D3Q19 and D3Q27 lattice discretizations, as well as, the collision models (raw and central moments). In addition, we carry out different simulations by varying the number of grid points in each direction (D = 32, 48, 64, 96 and 128), and the corresponding CPU time is recorded for each run. All measurements are performed on a iMac 27 equipped with Intel Core i5 6-core 3.3GHz and 8GB of RAM. Before moving to the results, it is worth noting that in the present context, collision models are based on the equilibration of (1) bulk-viscosity-related and (2) higher-order moments, which allows for a non-negligible reduction of the CPU time as compared to their full MRT formulation. This will be discussed in more details in another study.
In Figure 15, the normalized CPU time is plotted as a function of D 3 . Globally speaking, all configurations show that the involved computational time grow with the fourth power of the total number of points. This is in accordance with the fact that the CPU time is proportional to D 3 ×T ite , where the number of time iterations T ite linearly depends on D when the time step is computed via an acoustic scaling (i.e., T ite ∝ D). Quantitavely speaking, the adoption of CMs increases the CPU time of ∼ 7% for the D3Q19-LBM, as compared to its RMs counterpart (see Table VII). Moving to 27 discrete velocities, this gap further increases between the RM and CM formulations, and it reaches ∼ 20%. Moreover, we find that the D3Q19-CM-LBM allows us to save a considerable amount of computational time (∼ 70%), as compared to the more general D3Q27 formulation. The color-gradient method (CGM) was originally proposed by Rothman & Keller 89 , but their model was not able to account for density and contrasts. This feature was developed by Grunau et al. 90 in 1993 by means of a hexagonal lattice and later by Reis (D3Q19 and D3Q27), as well as, two collision models (RMs and CMs). In the present context, higher-order and bulk-viscosity-reltated moments are equilibrated for all configurations.
be note that the original Rothman & Keller CGM has been successfully employed to simulate flows in heterogeneous porous media 91,92 . Hereafter, we will recall the basic features of the D3Q19 formulation of the CGM.
Let us consider two immiscible fluids, namely, red and blue. The evolution of populations f k i is where k = r for the red fluid, and k = b for the blue one. Moreover, it is possible to define the total distribution functions as f i = f r i + f b i . The collision process Ω k i is composed of three sub-stages: where Ω k i (1) , Ω k i (2) and Ω k i (3) are the singlephase, perturbation and recoloring operators, respectively. Macroscopic variables are given by where ρ k is the density of the fluid k, ρ is the total mass density, u is the total momentum and F is a body force. The single-phase collision operator is where h eq i is the equilibrium used by the color-gradient method: with f eq i the standard version of the equilibrium (9), and α, |c i | 2 = 0, (1 − α)/12, |c i | 2 = 1, (1 − α)/24, |c i | 2 = 2.
(E6) Notice that φ = 0 for the original single phase collision operator. To distinguish the two components, the order parameter φ is introduced, that is The values φ = 1, −1, and 0 correspond to a purely red fluid, a purely blue fluid, and the interface, respectively. To obtain a stable interface, the density ratio between the fluids must be taken into account as follows to obtain a stable interface 90 : where the superscript "0" indicates the initial value of the density at the beginning of the simulation 93 . The pressure of the fluid is given as an isothermal equation of state: for the D3Q19 lattice, where c k s is the speed of sound of the fluid k 94 , α is interpolated by with α b = 1/3 and c b s = 1/ √ 3 95 . Following 84,94,96,97 , the interfacial tension is modeled by the so-called perturbation operator: where B i =      −1/3, |c i | 2 = 0, +1/18, |c i | 2 = 1, +1/36, |c i | 2 = 2.