Mathematical Problems in Engineering, Hindawi Publishing Corporation, Volume 2015, Article ID 956827, doi:10.1155/2015/956827

Research Article

High Performance Computation of a Jet in Crossflow by Lattice Boltzmann Based Parallel Direct Numerical Simulation

Jiang Lei,1 Wang Xian,1 and Xie Gongnan2

1 State Key Laboratory for Strength and Vibration of Mechanical Structures, Xi'an Jiaotong University, Xi'an 710049, China
2 School of Mechanical Engineering, Northwestern Polytechnical University, Xi'an 710049, China

Academic Editor: Reyolando M. Brasil

Received 19 September 2014; Accepted 26 January 2015; Published 22 April 2015

Copyright © 2015 Jiang Lei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Direct numerical simulation (DNS) of a round jet in crossflow based on the lattice Boltzmann method (LBM) is carried out on a multi-GPU cluster. The data-parallel SIMT (single instruction, multiple thread) characteristic of the GPU matches the parallelism of the LBM well, which makes the LBM solver highly efficient on GPUs. With the present GPU setting (6 Nvidia Tesla K20M), the DNS can be completed in several hours. A grid system of 1.5 × 10^8 points is adopted, and the largest jet Reynolds number reaches 3000. The jet-to-free-stream velocity ratio is set to 3.3, and the jet is orthogonal to the mainstream flow direction. The validated code shows good agreement with experiments. The vortical structures, including the CRVP, shear-layer vortices, and horseshoe vortices, are presented and analyzed based on velocity fields and vorticity distributions. Turbulent statistics of the Reynolds stress are also displayed. Coherent structures are revealed at a very fine resolution based on the second invariant of the velocity gradient.

1. Introduction

A jet in crossflow (JICF), also known as a transverse jet, describes a jet of fluid that enters and interacts with a crossflow, and the resulting flow field. JICF has wide applications in many engineering fields, such as gas turbine impingement cooling and film cooling, fuel injection in combustors, thrust vectoring in turbojet propulsion, and reaction control for missiles. For a traditional JICF problem, in the near field of the transverse jet, four types of vortical structures can be observed due to the interaction between the mainstream and the jet, as shown in Figure 1: (1) the counter-rotating vortex pair (CRVP), also known as kidney vortices; (2) the horseshoe vortex system; (3) the jet shear-layer vortices due to Kelvin-Helmholtz instabilities; (4) the wake vortices. The CRVP and the horseshoe vortices are normally defined in the mean flow even though they may have an unsteady part. The shear-layer vortices and the wake vortices are inherently unsteady.

Four vortical structures associated with the JICF.

These vortical structures are very important to the fluid flow and heat transfer behavior in the fields where JICF is applied. Taking gas turbine film cooling as an example, film coverage and cooling effectiveness are closely related to those large structures, whether they are defined as steady or inherently unsteady. The "steady" CRVP lifts the coolant up and mixes it with the mainstream, weakening the film coverage and cooling effectiveness. The "unsteady" shear-layer vortices have different orientations determined by the shape of the ejection hole and consequently either strengthen or suppress the CRVP. To accurately predict these large structures, in which unsteadiness and anisotropy are inherent, a time-resolved scheme with fine spatial discretization is required in the flow field calculation.

The flow field of JICF is characterized by anisotropic large-scale structures that break down into smaller scales. The resulting small-scale structures are isotropic and dissipation-dominated. The typical energy spectrum of the JICF thus requires capturing flow structures across all scales, so a time- and space-accurate calculation is needed. Direct numerical simulation (DNS) and large eddy simulation (LES) are commonly used to resolve the turbulent characteristics of JICF on large-scale parallel computing devices. However, inherent features of the Navier-Stokes or transport equations make their parallel processing inefficient. Thus, a highly parallel scheme is extremely helpful for such large-scale simulations of turbulent flow fields.

The lattice Boltzmann method (LBM) has been regarded as a promising candidate for years due to its major merit of being a fully parallel algorithm. Unlike conventional numerical schemes based on assumptions about and discretizations of the macroscopic continuum, the LBM adopts microscopic models. Other advantages of LBM include (1) easy treatment of boundary conditions and (2) easy programming. As a result, the LBM has wide application in scientific and engineering research, such as biofluids and porous media. However, to resolve turbulent flow, LBM-based DNS and LES require a very high grid resolution, and thus enormous computational resources are needed.

The Graphics Processing Unit (GPU) provides excellent hardware conditions for these simulations. With the development of computing platforms such as CUDA and OpenCL, the use of GPUs to accelerate nongraphic computations has drawn increasing attention [8, 9]. Due to its high floating-point performance, together with wide memory bandwidth and good programmability, the general-purpose GPU (GPGPU) has a huge advantage over the CPU in turbulent flow simulation and thus has been applied in fields such as weather prediction, crystal growth, and urban air flow.

The feasibility of LBM-based DNS has received some preliminary validation for turbulent flow between parallel plates; verification for JICF is still lacking. In this paper, lattice-Boltzmann-based DNS of a JICF model is performed using the D3Q19 model on multiple GPUs. With this further-validated code, the paper aims to resolve the JICF flow field, including the time-averaged and instantaneous velocities, vorticities, and turbulent Reynolds stresses. The characteristic coherent hairpin structures are also revealed and discussed.

2. Lattice Boltzmann Equations

The lattice Boltzmann equation is a special form of the Boltzmann-BGK equation, in which velocity, time, and space are discretized. The Boltzmann equation for the discrete velocity distribution on a discrete lattice reads

(1) ∂f_i/∂t + e_i · ∇f_i = Ω_i,

where f_i is the particle velocity distribution function and e_i is the particle velocity in the i-th direction. An LBM scheme is denoted DmQn: "Dm" means "m dimensions" and "Qn" stands for "n lattice speeds." For two-dimensional problems, the D2Q9 model (2 dimensions, 9 speeds) is most commonly used. For three-dimensional problems, several cubic models are used, such as D3Q13, D3Q15, D3Q19, and D3Q27 (n = 13, 15, 19, or 27). Ω_i is the collision operator. With the Boltzmann-BGK approximation,

(2) Ω_i = −(1/τ)(f_i − f_i^eq).

Combining (1) and (2) gives

(3) ∂f_i/∂t + e_i · ∇f_i = −(1/τ)(f_i − f_i^eq),

where f_i^eq is the local equilibrium distribution and τ is the relaxation time. The equilibrium distribution must be chosen correctly so that the Navier-Stokes equations are recovered. For 3D models with the 13-, 15-, 19-, and 27-speed lattices, an appropriate distribution function is

(4) f_i^eq = ρ ϖ_i [1 + 3 e_i·u + (9/2)(e_i·u)² − (3/2) u·u].

In the present paper, the D3Q19 scheme is used, with the particle velocities shown in Figure 2:

e_0 = (0,0,0); e_1 = (1,0,0), e_2 = (−1,0,0), e_3 = (0,1,0), e_4 = (0,−1,0), e_5 = (0,0,1), e_6 = (0,0,−1); e_7 = (1,1,0), e_8 = (−1,1,0), e_9 = (1,−1,0), e_10 = (−1,−1,0), e_11 = (1,0,1), e_12 = (−1,0,1), e_13 = (1,0,−1), e_14 = (−1,0,−1), e_15 = (0,1,1), e_16 = (0,−1,1), e_17 = (0,1,−1), e_18 = (0,−1,−1).

The corresponding weighting factors are ϖ_0 = 1/3, ϖ_1–ϖ_6 = 1/18, and ϖ_7–ϖ_18 = 1/36. If (3) is further discretized in both space x and time t, the discretized form can be written as

(5) f_i(x + e_i δt, t + δt) − f_i(x, t) = −(1/τ)[f_i(x, t) − f_i^eq(x, t)].

This is the LBGK model, in which τ = λ/δt is the nondimensional relaxation time. The viscosity in the macroscopic Navier-Stokes equation can be expressed as ν = (τ − 1/2) C_s² δt. Equation (5) is usually solved in a standard way, assuming δt = 1, and is divided into the following two steps.

D 3 Q 19 LBM model.

Collision step:

(6) f̃_i(x, t) = f_i(x, t) − (1/τ)[f_i(x, t) − f_i^eq(x, t)];

streaming step:

(7) f_i(x + e_i, t + 1) = f̃_i(x, t),

where f_i and f̃_i denote the pre- and postcollision states of the distribution function. Note that in the collision step f̃_i is computed entirely locally, and in the streaming step f_i depends only on its neighboring nodes. Thus, the LBM code itself is highly suitable for parallel processing and capable of high efficiency. Macroscopic quantities, such as ρ and u, are calculated as

(8) ρ = Σ_{i=0}^{N} f_i = Σ_{i=0}^{N} f_i^eq,  ρu = Σ_{i=0}^{N} f_i e_i = Σ_{i=0}^{N} f_i^eq e_i.

The lattice speed of sound is C_s = 1/√3, and pressure is obtained from the equation of state of the ideal gas:

(9) p = ρ C_s² = ρ/3.

In the transverse direction (front and back boundaries), periodic boundary conditions are used in this paper. For solid walls, the following nonequilibrium extrapolation boundary condition is applied:

(10) f̃_i = f_i^eq + f_i^neq = f_i^eq + f_{i-neighbor}^neq = f_i^eq + f̃_{i-neighbor} − f_{i-neighbor}^eq,

in which f_i^eq is calculated from the macroscopic u and ρ on the boundary by (4). The nonequilibrium part of the distribution on the boundary, f_i^neq, is assumed to take the same value as at the neighboring inner node, f_{i-neighbor}^neq. f̃_{i-neighbor} and f_{i-neighbor}^eq at the inner node are computed by (6) and (4), respectively.
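The collision-streaming split of (6) and (7) can be sketched in a few lines. The following minimal NumPy implementation (the array shapes and the fully periodic treatment of all boundaries are illustrative assumptions, not the paper's production CUDA code) shows how the collision acts purely locally while streaming touches only nearest neighbors:

```python
import numpy as np

# D3Q19 lattice velocities e_i and weights w_i, as listed in the text.
E = np.array([[0,0,0],
              [1,0,0],[-1,0,0],[0,1,0],[0,-1,0],[0,0,1],[0,0,-1],
              [1,1,0],[-1,1,0],[1,-1,0],[-1,-1,0],
              [1,0,1],[-1,0,1],[1,0,-1],[-1,0,-1],
              [0,1,1],[0,-1,1],[0,1,-1],[0,-1,-1]])
W = np.array([1/3] + [1/18]*6 + [1/36]*12)

def equilibrium(rho, u):
    """Eq. (4): f_i^eq = rho*w_i*(1 + 3 e.u + 4.5 (e.u)^2 - 1.5 u.u)."""
    eu = np.einsum('id,xyzd->ixyz', E, u)       # e_i . u
    uu = np.einsum('xyzd,xyzd->xyz', u, u)      # u . u
    return rho[None] * W[:, None, None, None] * (1 + 3*eu + 4.5*eu**2 - 1.5*uu)

def collide_stream(f, tau):
    """One LBGK step: collision (Eq. 6) then streaming (Eq. 7, periodic)."""
    rho = f.sum(axis=0)                                   # Eq. (8)
    u = np.einsum('ixyz,id->xyzd', f, E) / rho[..., None]
    f_post = f - (f - equilibrium(rho, u)) / tau          # local collision
    for i in range(19):                                   # shift along e_i
        f_post[i] = np.roll(f_post[i], shift=E[i], axis=(0, 1, 2))
    return f_post
```

Because the collision touches only one node and the streaming only nearest neighbors, each GPU thread can own one lattice node, which is exactly the SIMT data parallelism the paper exploits.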

3. GPU Settings

A well-known inherent advantage of the lattice Boltzmann method is its parallelism. A previous study compared the acceleration performance of a Navier-Stokes solver and the current LBM solver, in which incompressible flow around a cylinder was simulated on a GPU (GeForce GTX 280) and a CPU (Xeon E5420, 2.5 GHz) separately. For the Navier-Stokes solver, in which a Red-Black scheme and multigrid were applied, the GPU speedup over the CPU was 13.7. By comparison, for the LBM on the same grid scale, the speedup was 87.4. Besides this acceleration performance, coding the LBM on a GPU is much easier than coding a Navier-Stokes solver.

However, the price of LBM's parallelism is also obvious: the method carries many variables and thus consumes a huge amount of memory. For the D3Q19 model in single precision, the theoretical GPU memory requirement on 1.5 × 10^8 grid points (1024 × 256 × 600) is over 30 GB. Data transfer between GPUs is realized by MPI (message passing interface) together with cudaMemcpy() calls in CUDA. From previous experience, if the number of GPUs is less than 10, 1D domain partitioning is more efficient than 2D or 3D partitioning. In the current research, 6 Nvidia Tesla K20M GPUs are used, and the 1D partitioning across the GPUs is illustrated in Figure 3.

1D partitioning in z -direction.
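The 1D decomposition of Figure 3 amounts to splitting the z-extent of the grid into contiguous slabs, one per GPU, with each GPU exchanging one ghost layer with its neighbors via MPI. A hedged sketch of the slab bookkeeping (the function name and the load-balancing rule are our own, not from the paper):

```python
def slab_partition(nz, ngpu):
    """Split nz lattice layers into ngpu contiguous z-slabs (1D partitioning).

    Returns a list of half-open (z_start, z_end) ranges whose sizes differ
    by at most one layer, so the per-GPU workload is balanced."""
    base, extra = divmod(nz, ngpu)
    bounds, z0 = [], 0
    for g in range(ngpu):
        size = base + (1 if g < extra else 0)
        bounds.append((z0, z0 + size))
        z0 += size
    return bounds
```

For the present case, slab_partition(600, 6) assigns 100 z-layers to each of the 6 GPUs; only the two boundary layers of each slab need to cross the PCIe/MPI boundary per time step.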

4. Physical and Numerical Models

The physical model of the jet in crossflow problem is shown in Figure 4. The computational domain is hexahedral. The mainstream flows in the x-direction, while the jet is orthogonal to the mainstream in the z-direction and is ejected uniformly from the bottom plane through a hole of diameter D. The origin of the coordinates is located at the center of the round jet exit. Zero-gradient pressure and no-slip boundary conditions are applied on the bottom plane, excluding the jet exit. On the transverse planes (with normal in the y-direction), periodic boundary conditions are applied. The inlet turbulent velocity profile is generated through a calculation procedure similar to the so-called "rescaling process" described in . The results presented in this paper are based on the flow parameters and boundary conditions listed as follows.

Ejection hole diameter: D = 75 (lattice units).

Domain length: L_x/D = (L_x1 + L_x2)/D = 5 + 7 = 12.

Domain span: L_y/D = 3.4.

Domain height: L_z/D = 6.4.

Free-stream turbulence intensity: TI = 3%.

Boundary layer height: δ/D = 0.2.

Velocity ratio: VR = U_j/U_∞ = 3.3.

Physical model of jet in crossflow.

The Reynolds number Re_j is defined based on the jet velocity U_j and the hole diameter D as Re_j = U_j D/ν. The largest Reynolds number Re_j reaches 3,000 on the 1024 × 256 × 600 grid, and several calculations are performed with different dimensions, grids, and jet locations, as shown in Table 1. The discussion is based on the first case in the table. The flow stays in the "low-speed" range, with velocity magnitudes smaller than 0.3 times the lattice sound speed C_s; thus, the LBGK model used in the current study is suitable.
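In lattice units (δt = δx = 1), the relaxation time τ follows directly from Re_j through ν = (τ − 1/2)C_s² with C_s² = 1/3. A short sketch of this conversion, assuming a hypothetical lattice jet velocity U_j = 0.1 (the paper states only that velocities stay below 0.3 C_s, not the exact value used):

```python
def tau_from_re(u_jet, d_nodes, re):
    """Relaxation time from Re_j = U_j * D / nu and nu = (tau - 1/2) / 3,
    valid in lattice units where delta_t = delta_x = 1."""
    nu = u_jet * d_nodes / re     # lattice viscosity from the Reynolds number
    return 3.0 * nu + 0.5         # invert nu = (tau - 1/2) * C_s^2

# Hypothetical example: U_j = 0.1 (below 0.3 * C_s ~ 0.173), D = 75 nodes
tau = tau_from_re(u_jet=0.1, d_nodes=75, re=3000)
```

Note that τ approaches the stability limit 1/2 from above as Re_j grows at fixed resolution, which is why a high Reynolds number forces the fine grids used here.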

Testing cases matrix.

Jet diameter (D)   Domain (L_x × L_y × L_z)   Jet location (L_x1)   Mesh
75                 12D × 3.4D × 8D            2.6D                  896 × 256 × 600
75                 13.6D × 3.4D × 8D          4D                    1024 × 256 × 600
100                10D × 5D × 8D              4D                    1024 × 512 × 600
75                 8.5D × 3.4D × 6.4D         2.6D                  640 × 256 × 480
75                 8.5D × 4D × 6.4D           2.6D                  640 × 300 × 480
5. Model Validation

The nondimensional time-averaged streamwise velocity U+ = U/u_τ versus z+ = z u_τ/ν is presented in Figure 5. The location, x = −2D and y = N_Y/2, is chosen upstream of the jet to eliminate its influence. A turbulent boundary layer profile is obtained, consisting of a laminar sublayer and a log layer, which is consistent with the von Kármán mixing-length theory.

Streamwise velocity boundary layer profile.
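The reference profile against which such a boundary layer is usually compared is the classical law of the wall; a minimal sketch follows (the von Kármán constant κ = 0.41, the additive constant B = 5.0, and the crude switch-over point at z+ = 11 are conventional textbook values, not parameters from the paper):

```python
import numpy as np

def law_of_the_wall(z_plus, kappa=0.41, B=5.0):
    """Classical turbulent boundary layer profile:
    viscous sublayer U+ = z+ (small z+), log layer U+ = ln(z+)/kappa + B."""
    z_plus = np.asarray(z_plus, dtype=float)
    return np.where(z_plus < 11.0,           # crude switch near the buffer layer
                    z_plus,                   # linear sublayer
                    np.log(z_plus) / kappa + B)  # logarithmic layer
```

Plotting this alongside the DNS data reproduces the two-layer shape seen in Figure 5: U+ = z+ very near the wall and a straight line in semilog coordinates farther out.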

Further comparisons with experiments are made in regions where the mainstream is disturbed by the jet, as shown in Figure 6. The experiments were carried out by Meyer et al. [23, 24] using stereoscopic PIV on a JICF configuration. Time-averaged streamwise and vertical velocities (U/U_∞ and W/U_∞) are presented versus z/D at x = 0.5D, y = N_Y/2 and x = 3D, y = N_Y/2. The results show good agreement with the experimental data. In the current simulation, a uniform velocity profile is set at the jet exit, whereas in the experiment the jet was supplied through a tube inside which a turbulent profile had developed. Thus, an overestimation of the jet's momentum near the edge of the jet hole can be expected, and it is indeed seen in Figure 6.

Streamwise and vertical velocity profiles after the jet.

6. Results and Analysis 6.1. Energy Spectra

A fast Fourier transform (FFT) is applied to a time series of the turbulent energy E_tur, sampled at a frequency of 2000, at x = 6D, z = 4D, and y = N_Y/2. The turbulent energy is defined as E_tur = (1/2)(u² + v² + w²). The corresponding turbulent power spectrum is presented versus frequency in Figure 7. The calculated turbulence decay rate is close to the canonical theoretical value of −5/3, and no obvious spectral pile-up is observed, which indicates that the current mesh resolution is adequate.

Spectral power density of velocity fluctuations.
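A spectrum check of this kind can be reproduced with a simple FFT-based estimator. The following sketch (the function names and the log-log fitting window are our own choices, not the paper's post-processing code) estimates the decay rate that should approach −5/3 in a resolved inertial range:

```python
import numpy as np

def power_spectrum(signal, fs):
    """One-sided power spectrum of a fluctuating time series sampled at fs."""
    n = len(signal)
    s = signal - np.mean(signal)           # remove the mean before the FFT
    spec = np.abs(np.fft.rfft(s))**2 / n
    freq = np.fft.rfftfreq(n, d=1.0 / fs)
    return freq[1:], spec[1:]              # drop the zero-frequency bin

def inertial_slope(freq, spec, fmin, fmax):
    """Least-squares slope of log(spec) vs log(freq) over [fmin, fmax];
    a resolved inertial range gives a slope close to -5/3."""
    m = (freq >= fmin) & (freq <= fmax)
    return np.polyfit(np.log(freq[m]), np.log(spec[m]), 1)[0]
```

Applied to the E_tur time series sampled at 2000, the fitted slope in the inertial band is the quantity compared against −5/3 in Figure 7.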

6.2. Velocity Field and Vorticity

The 2D time-averaged velocity (U, W) and instantaneous velocity (u, w) fields in the y-direction midplane (y = N_Y/2) are plotted in Figures 8(a), 8(b), and 8(c), respectively, together with the corresponding vorticity distributions based on the time-averaged and instantaneous velocities. In Figure 8(a), the vertical jet is bent over by the mainstream, and a flow pattern resembling flow past a solid obstacle is observed in the wake region. As shown in the enlarged plot of the flow field, the approaching wall boundary layer meets an adverse pressure gradient upstream of the jet and consequently separates and forms the horseshoe vortex. In the vorticity field, negative and positive values are observed on the leading and trailing edges along the jet trajectory, and only trivial values exist in the wake region. Instantaneous velocity and vorticity fields at time steps 80,000 and 120,000 are shown in Figures 8(b) and 8(c), respectively. Generation and shedding of Kelvin-Helmholtz vortices are induced by the shear layers between the mainstream and the jet on both the leading and trailing edges. The shed vortices mix with the mainstream and dissipate quickly in the downstream region. In the wake region of the jet, wake vortices are revealed by nontrivial vorticity values; however, no vortex tube is directly observed in this cross-sectional view. Comparing the time-averaged fields with their instantaneous counterparts, the Kelvin-Helmholtz vortices disappear after averaging, which confirms that the shear-layer-induced vortices are inherently unsteady and periodic.

Time-averaged and instantaneous 2 D velocity vector and vorticity at y = N Y / 2 plane.

Time-averaged

Time step = 80000

Time step = 120000

The 2D instantaneous velocity (u, v) and time-averaged velocity (U, V) fields at z = 0.5D, 1D, and 2D are displayed in Figure 9, together with the corresponding vorticity distributions. In Figure 9(a), the instantaneous velocity vectors and vorticity distributions at time step 80,000 are shown. Separation of the mainstream occurs near the mid-chord portion of the jet; farther downstream, vortex shedding and dissipation are observed. A wake region forms at the back side of the jet. The characteristic wake pattern of the transverse jet is significantly different from that of a solid cylinder as described in . In Figures 9(b) and 9(c), the time-averaged velocity vectors and vorticity distributions are displayed. Similar to flow past a solid cylinder, positive and negative vorticity values appear along the streamwise direction where the mainstream starts to separate. In the downstream region, the vorticity returns to trivial values owing to the dissipation of the periodically shed vortices.

2 D velocity vector and vorticity at planes of z = 0.5 D , 1 D , and 2 D .

z = 0.5 D

z = 1 D

z = 2 D

The 2D time-averaged velocity (V, W) fields in several streamwise-normal planes, x = 1D, 1.5D, and 3D, are presented in Figures 10(a), 10(b), and 10(c), respectively, with vorticity distributions. As shown in Figures 10(a) and 10(b), the CRVP system consists of two layers, an upper one and a lower one. For a round hole, as used in the current study, the rotating directions of the two pairs are the same. This "two-deck" vortex structure is very similar to that observed in  by laser-induced fluorescence. Comparing Figures 10(a), 10(b), and 10(c), it is found that, moving downstream, the vertical component (W) of the 2D velocity decreases because of the jet's bending. At the same time, the lower-layer vortices rise in the vertical direction and weaken in strength. Finally, in the far downstream region (x = 3D), the W-component is at the same level as the V-component, since the jet is almost fully bent, and the two layers fully merge with their strength further weakened.

Time-averaged 2 D velocity vector and vorticity at planes of x = 1 D , 1.5 D , and 3 D .

x = 1 D

x = 1.5 D

x = 3 D

6.3. Turbulent Statistics

Reynolds stress distributions of u′w′, u′u′, and w′w′ are plotted in the y-direction midplane (y = N_Y/2) in Figures 11(a), 11(b), and 11(c), respectively. The shear component of the Reynolds stress tensor is shown in Figure 11(a). Strong anisotropy of the flow field can be observed. Negative values of u′w′ appear at the leading edge of the jet, with positive values in the trailing portion. In the wake region, both positive and negative values exist with strong nonuniformity. In Figures 11(b) and 11(c), the normal Reynolds stresses are displayed. For both u′u′ and w′w′, the highest values appear in the leading and trailing portions of the jet due to large velocity gradients. Right behind the jet, in the recirculation zone, both components maintain nontrivial values, especially w′w′, indicating the wrapping motion of the flow around the jet.

Reynolds stress distribution at y = N Y / 2 plane.
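At each point, the statistics above reduce to time averages of velocity-fluctuation products; a minimal sketch of that reduction (the sampling arrangement is our illustrative assumption):

```python
import numpy as np

def reynolds_stresses(u_samples, w_samples):
    """Time-averaged Reynolds stresses at one point from velocity samples:
    returns (u'w' shear component, u'u' and w'w' normal components)."""
    up = u_samples - np.mean(u_samples)   # streamwise fluctuation u'
    wp = w_samples - np.mean(w_samples)   # vertical fluctuation w'
    return np.mean(up * wp), np.mean(up * up), np.mean(wp * wp)
```

Evaluating this at every node of the y = N_Y/2 plane over the stored time history yields the fields plotted in Figure 11.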

6.4. Coherent Structure

The isosurfaces of the second invariant of the velocity gradient, Q, at time steps 80,000 and 120,000 up to a domain height of N_Z/2 are plotted in Figure 12. In tensor form, Q is defined by

(11) Q = (1/2)(W_ij W_ij − S_ij S_ij) = −(1/2)(∂u_j/∂x_i)(∂u_i/∂x_j),
W_ij = (1/2)(∂u_i/∂x_j − ∂u_j/∂x_i),  S_ij = (1/2)(∂u_i/∂x_j + ∂u_j/∂x_i),

where W_ij and S_ij denote the rate of rotation and the rate of strain, respectively. The hairpin-shaped coherent structures originate from the sides and trailing edge of the ejection hole and wrap around the jet. In the downstream region, the hairpin structures weaken and dissipate into the mainstream.

Isosurfaces of second invariant of velocity gradient.

Time step = 80000

Time step = 120000
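Equation (11) can be evaluated numerically from a velocity snapshot by finite differences. The following is a hedged NumPy sketch (with the conventional 1/2 factors in S_ij and W_ij), not the paper's actual post-processing code:

```python
import numpy as np

def q_criterion(u, v, w, dx=1.0):
    """Second invariant Q = 1/2 (W_ij W_ij - S_ij S_ij), Eq. (11),
    with finite-difference velocity gradients on a uniform grid."""
    # grads[..., i, j] = du_i / dx_j
    grads = np.stack([np.stack(np.gradient(c, dx), axis=-1)
                      for c in (u, v, w)], axis=-2)
    S = 0.5 * (grads + np.swapaxes(grads, -1, -2))   # rate of strain
    W = 0.5 * (grads - np.swapaxes(grads, -1, -2))   # rate of rotation
    return 0.5 * (np.einsum('...ij,...ij->...', W, W)
                  - np.einsum('...ij,...ij->...', S, S))
```

Thresholding the resulting Q field and extracting isosurfaces (positive Q marks rotation-dominated regions) reproduces hairpin visualizations of the kind shown in Figure 12.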

7. Conclusion

A direct numerical simulation of a jet in crossflow based on the lattice Boltzmann method has been performed on multiple GPUs. The simulation takes about 10 hours on the largest grid of 1.5 × 10^8 points. Several points can be summarized and concluded as follows.

A turbulent boundary layer is observed upstream of the jet (x = −2D, y = N_Y/2), consisting of a laminar sublayer and a log layer. Downstream of the jet, at x = 0.5D, y = N_Y/2 and x = 3D, y = N_Y/2, the time-averaged streamwise and vertical velocity components (U/U_∞ and W/U_∞) are compared with experiments, and good agreement is obtained.

The turbulent energy spectrum is plotted versus frequency at x = 6 D , z = 4 D , and y = N Y / 2 . A decay rate close to the theoretical value of −5/3 is observed.

2D velocity vectors and vorticity fields are plotted in planes normal to the y-, z-, and x-directions. Unsteady shear-layer vortices form and shed along the leading and trailing sides of the jet trajectory. The horseshoe vortex is generated in the plate boundary layer upstream of the jet. The two-deck structure of the CRVP is captured, and the lower pair is observed to rise along the streamwise direction.

The normal components (u′u′ and w′w′) and the shear component (u′w′) of the Reynolds stress are examined in the y = N_Y/2 plane. Strong anisotropy of the flow field is observed due to the shear layers and the wake.

The coherent structures represented by the second invariant of the velocity gradient (Q) are plotted at different time steps. The characteristic hairpin vortices are observed.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The work is supported by the 973 Project of China (2013CB035702), the National Natural Science Foundation of China (11302165 and 11402191), and the Postdoctoral Science Foundation of China (2013M540741).

References

1. Fric T. F., Roshko A., "Vortical structure in the wake of a transverse jet," Journal of Fluid Mechanics, vol. 279, pp. 1–47, 1994. doi:10.1017/s0022112094003800
2. Haven B. A., Kurosaka M., "Kidney and anti-kidney vortices in crossflow jets," Journal of Fluid Mechanics, vol. 352, pp. 27–64, 1997. doi:10.1017/s0022112097007271
3. Chen S. Y., Doolen G. D., "Lattice Boltzmann method for fluid flows," Annual Review of Fluid Mechanics, vol. 30, pp. 329–364, 1998. doi:10.1146/annurev.fluid.30.1.329
4. He Y. L., Wang Y., Li Q., Lattice Boltzmann Method: Theory and Applications, Science Press, Beijing, China, 2009.
5. Chen L., Kang Q., Mu Y., He Y.-L., Tao W.-Q., "A critical review of the pseudopotential multiphase lattice Boltzmann model: methods and applications," International Journal of Heat and Mass Transfer, vol. 76, pp. 210–236, 2014. doi:10.1016/j.ijheatmasstransfer.2014.04.032
6. Aidun C. K., Clausen J. R., "Lattice-Boltzmann method for complex flows," Annual Review of Fluid Mechanics, vol. 42, pp. 439–472, 2010. doi:10.1146/annurev-fluid-121108-145519
7. nVIDIA, NVIDIA CUDA Compute Unified Device Architecture Programming Guide, Version 2.0, nVIDIA, 2008.
8. Buck I., Foley T., Horn D., "Brook for GPUs: stream computing on graphics hardware," ACM Transactions on Graphics, vol. 23, pp. 777–786, 2004.
9. Krüger J., Westermann R., "Linear algebra operators for GPU implementation of numerical algorithms," ACM Transactions on Graphics, vol. 22, no. 3, pp. 908–916, 2003. doi:10.1145/882262.882363
10. Ogawa S., Aoki T., "GPU computing for 2-dimensional incompressible-flow simulation based on multi-grid method," Transactions of JSCES, 20090021, 2009.
11. Xian W., Takayuki A., "Multi-GPU performance of incompressible flow computation by lattice Boltzmann method on GPU cluster," Parallel Computing, vol. 37, no. 9, pp. 521–535, 2011. doi:10.1016/j.parco.2011.02.007
12. Rossinelli D., Bergdorf M., Cottet G.-H., Koumoutsakos P., "GPU accelerated simulations of bluff body flows using vortex particle methods," Journal of Computational Physics, vol. 229, no. 9, pp. 3316–3333, 2010. doi:10.1016/j.jcp.2010.01.004
13. Shimokawabe T., Aoki T., Ishida J., Kawano K., Muroi C., "145 TFlops performance on 3990 GPUs of TSUBAME 2.0 supercomputer for an operational weather prediction," in Proceedings of the 11th International Conference on Computational Science (ICCS '11), vol. 4, pp. 1535–1544, June 2011. doi:10.1016/j.procs.2011.04.166
14. Wang X., Aoki T., "High performance computation by multi-node GPU cluster TSUBAME 2.0 on the air flow in an urban city using lattice Boltzmann method," International Journal of Aerospace and Lightweight Structures, vol. 2, no. 1, pp. 77–86, 2012. doi:10.3850/s2010428612000232
15. Miki T., Wang X., Aoki T., Imai Y., Ishikawa T., Takase K., Yamaguchi T., "Patient-specific modelling of pulmonary airflow using GPU cluster for the application in medical practice," Computer Methods in Biomechanics and Biomedical Engineering, vol. 15, no. 7, pp. 771–778, 2012. doi:10.1080/10255842.2011.560842
16. Wang X., Shangguan Y., Onodera N., Kobayashi H., Aoki T., "Direct numerical simulation and large eddy simulation on a turbulent wall-bounded flow using lattice Boltzmann method and multiple GPUs," Mathematical Problems in Engineering, vol. 2014, Article ID 742432, 2014. doi:10.1155/2014/742432
17. Bhatnagar P. L., Gross E. P., Krook M., "A model for collision processes in gases," Physical Review, vol. 94, no. 3, pp. 511–525, 1954. doi:10.1103/physrev.94.511
18. Qian Y. H., d'Humières D., Lallemand P., "Lattice BGK models for Navier-Stokes equation," Europhysics Letters, vol. 17, no. 6, pp. 479–484, 1992. doi:10.1209/0295-5075/17/6/001
19. Guo Z.-L., Zheng C.-G., Shi B.-C., "Non-equilibrium extrapolation method for velocity and pressure boundary conditions in the lattice Boltzmann method," Chinese Physics, vol. 11, no. 4, pp. 366–374, 2002. doi:10.1088/1009-1963/11/4/310
20. Xu D., Chen G., Wang X., Li Y., "Direct numerical simulation of the wall-bounded turbulent flow by lattice Boltzmann method based on multi-GPUs," Applied Mathematics and Mechanics, vol. 34, no. 9, pp. 1–9, 2013.
21. Muldoon F., Acharya S., "Direct numerical simulation of pulsed jets-in-crossflow," Computers & Fluids, vol. 39, no. 10, pp. 1745–1773, 2010. doi:10.1016/j.compfluid.2010.04.008
22. Guo Z. L., Zheng C. G., Theory and Applications of Lattice Boltzmann Method, Science Press, Beijing, China, 2010.
23. Meyer K. E., Özcan O., Larsen P. S., Westergaard C. H., "Stereoscopic PIV measurements in a jet in crossflow," in Proceedings of the 2nd International Symposium on Turbulence and Shear Flow Phenomena, Stockholm, Sweden, June 2001.
24. Meyer K. E., Özcan O., Larsen P. S., Westergaard C. H., "Flow mapping of a jet in crossflow with stereoscopic PIV," Journal of Visualization, vol. 5, no. 3, pp. 225–231, 2002. doi:10.1007/bf03182330