An Efficient Scheme for Coupling OpenMC and FLUENT with Adaptive Load Balancing

This paper develops a multi-physics interface code MC-FLUENT to couple the Monte Carlo code OpenMC with the commercial computational ﬂuid dynamics code ANSYS FLUENT. The implementations and parallel performances of block Gauss–Seidel-type and block Jacobi-type Picard iterative algorithms have been investigated. In addition, this paper introduces two adaptive load-balancing algorithms into the neutronics and thermal-hydraulics coupled simulation to reduce the time cost of computation. Considering that the diﬀerent scalability of OpenMC and FLUENT limits the performance of block Gauss–Seidel algorithm, an adaptive load-balancing algorithm that can increase the number of nodes dynamically is proposed to improve its eﬃciency. Moreover, with the natural parallelism of block Jacobi algorithm, another adaptive load-balancing algorithm is proposed to improve its performance. A 3x3 PWR fuel pin model and a 1000MWt ABR metallic benchmark core were used to compare the performances of the two algorithms and verify the eﬀectiveness of the two adaptive load-balancing algorithms. The results show that the adaptive load-balancing algorithms proposed in this paper can greatly improve the computing eﬃciency of block Jacobi algorithm and improve the performance of block Gauss–Seidel algorithm when the number of nodes is large. In addition, the adaptive load-balancing algorithms are especially eﬀective when a case demands diﬀerent computational power of OpenMC and FLUENT.


Introduction
Although nuclear energy has been contributing to sustainable development goals for decades, its economic feasibility and safety remain the major issues widely concerned [1,2]. Meanwhile, computer science (CS) and information and communications technologies (ICTs) have also demonstrated their key roles in environmental sustainability [3,4]. Combining CS and ICTs with state-of-the-art physics codes is a cutting edge research in the field of engineering and scientific simulations [5][6][7]. Existing studies show that the use of high-performance computers to perform high accuracy multi-physics simulations can reduce design costs and thereby enhance the safety and economic feasibility of nuclear energy [7,8].
Inherent feedback mechanisms in reactors necessitate a multi-physics approach, particularly between neutronics and thermal-hydraulics [9,10]. Numerous neutronics and thermal-hydraulics coupling research studies have been done with simplified physical models, such as nodal diffusion theory and subchannel or single-channel thermal hydraulic codes [11][12][13][14]. Although such models are effective, the assumptions made in their formulation limit the accuracy of solution. Consequently, three-dimensional (3D) neutron transport codes and computational fluid dynamics (CFD) codes have received significant attention because of their detailed model and accurate solutions. Bousbia-Salah et al. discussed the main features and limitations of coupled code techniques in 2007 [15]. ey pointed out that the capabilities of computers should be properly utilized and summed up two schemes that can be utilized when performing the coupling. One is the internal coupling, which provides better convergence but still faces some numerical challenges [16]. e other is the external coupling. is scheme allows neutron transport codes and CFD codes to run separately and exchange data through interface codes [17], and most importantly, it is efficient on massive parallel computer systems [18]. Because of the minimal changes to individual codes, many researchers choose to use the external coupling scheme for multi-physics coupled work [19]. Marzano discussed about the methodology and used an interface code to exchange data between the individual codes for coupling 3D neutron transport calculation and full-field CFD calculation [10]. In the coupled calculation procedures of Marzano's methodology, PENTRAN and STAR-CCM + are calculated in turn during each iteration. Hsingtzu and Rizwan-uddin verified the integrated tight coupling method and implemented it using DRAGON and OpenFOAM and adopted a similar iterative method. Since then, more and more researchers have applied this method, which is known as block Gauss-Seidel-type Picard iteration, in coupled simulation [20]. Gurecky and Schneider coupled MCNP and FLUENT with this method [21]. In 2017, Daeubler et al. developed a CFD/Monte Carlo coupling scheme with this iteration method and applied it to TRIGA reactor [14]. More related works can be found in [16,[22][23][24][25][26]. It seems that the current research studies on coupling 3D neutron transport codes and CFD codes rely on the use of a block Gauss-Seidel-type Picard iteration.
It is well known that coupled studies based on 3D neutron transport codes and CFD codes demand a lot of computational power, and thus high-performance computers are usually needed. Block Gauss-Seidel-type Picard iteration offers a simple path to couple different physics codes with minimal code interaction required. However, because the physics codes have hardly been modified, their parallel performances and scalability are usually different.
is means that there may be load-balancing problems when applying it to high-performance computers. Moreover, Kelley studied iterative methods for linear and nonlinear equations and mentioned block Jacobi algorithm [27]. is method seems naturally parallelized since each block of the iteration can be performed independently of the others. Cervera et al. researched the computing efficiency and implementation of block-iterative algorithms for nonlinear coupled problems and pointed out that block Jacobi algorithm can be satisfactory when the problems to be solved are strongly nonlinear on their own [28]. Matthies et al. took block Jacobi method into account very naturally when mentioned about block Gauss-Seidel methods while considering the algorithms for computing the response of a coupled problem [29]. However, as far as we know, few researchers have applied the block Jacobi method on coupling 3D neutron transport codes and CFD codes. e primary objective of the present research is to introduce computer technology into the neutronics and thermal-hydraulics coupled program to improve its performance in massively parallel computing so as to reduce the time cost of simulation. Firstly, we develop a multi-physics interface code MC-FLUENT to couple the Monte Carlo code OpenMC with the commercial computational fluid dynamics code ANSYS FLUENT. en, we propose an adaptive load-balancing algorithm to improve the computing efficiency of block Gauss-Seidel-type Picard iterative method on high-performance computers. In addition, although some scholars believe that the block Jacobi method has a poor convergence [27,28], its natural parallelism makes it more suitable in high-performance computers. e implementation and performance of the block Jacobi-type Picard iterative method have been studied in this paper, and another adaptive load-balancing algorithm is proposed to improve its performance. e paper is organised as follows. Section 2 describes the codes used for coupling. Section 3 is devoted to introducing the coupling schemes, adaptive load-balancing algorithms, and data exchange method. Section 4 focuses on the models simulated in this paper. Section 5 contains numerical results and discussion. Section 6 presents the conclusion and the future working plan.

Code Descriptions
OpenMC is an open source Monte Carlo transport code that uses the portable and more "readable" XML file format instead of the arbitrary format ASCII file for input, which helps to increase the efficiency of data exchange between programs and interfaces more [30]. HDF5 files are used to store continuous-energy particle interaction, which can be converted from ACE files produced by NJOY. High-performance parallel algorithms in OpenMC are enabled via a MPI and OpenMP programming model [31]. ese features of OpenMC provide a lot of convenience for the code to couple with others. Equally important, ANSYS FLUENT is a popular commercial CFD code which can achieve the best convergence speed and solution accuracy [32]. FLUENT can improve the performance of the solver by linking the userdefined function (UDF) dynamically. UDF allows users to write information to a case or read information from a data file and extend the initialization and postprocessing capabilities of the solver. Moreover, a journal file which contains TUI and/or scheme commands allows FLUENT to automate setup process for the tasks within the same model and control the solving behavior of a model in addition to the setup. e diverse numerical models and flexible user-defined functions of FLUENT allow it to work well with other software.
On account of the functional properties of the aforementioned two codes, an interface code named MC-FLUENT has been developed to couple OpenMC and ANSYS FLUENT in order to study the implementation and optimization method of iterative algorithms in 3D neutron transport code and CFD code coupling. Written in C++ language, the code can submit the running instructions of OpenMC and ANSYS FLUENT to Linux system and Slurm Workload Manager through shell script. A fixed-format input card containing problem information and iterative information must be filled out before running MC-FLUENT. rough this input card, the interface code can obtain the corresponding relationship between OpenMC cells and FLUENTgrid domain. en, MC-FLUENTenabled to collect the cells and materials information by parsing XML format files. After each iteration, the interface code will be able to extract the temperature and density of materials from the output file of FLUENT and modify the material parameters in XML format files. Meanwhile, it can also extract the fission rate distribution calculated by OpenMC to calculate the power density and write it into UDF files. Moreover, MC-FLUENT can adaptively balance the load through dynamic scheduling compute nodes according to the time consumed by each iteration. e specific coupling format and adaptive load-balancing algorithms will be introduced in Section 3. e code procedure with block Gauss-Seidel-type Picard algorithm and block Jacobi-type Picard algorithm is shown in Figures 1 and 2, respectively. Anyone can download the source code for this interface by connecting to https:// github.com/DHMephisto/MC-FLUENT.

Methodology
In this section, we investigate the method for coupling OpenMC and FLUENT. e coupling schemes, adaptive load-balancing algorithms, and data exchange strategy are proposed.

Coupling Scheme.
e coupling framework in this paper is designed by the external coupling method, which solves physical fields by passing data repeatedly between independent codes. Two types of Picard iterative algorithms are used to calculate the coupled problems.

Block Gauss-Seidel-Type Picard Scheme.
e first one is known as block Gauss-Seidel-type Picard iteration method. e method helps to execute any two codes being adopted in MC/CFD coupling in a serial way. During each iteration, OpenMC first calculates a new fission rate distribution (ψ) according to the density (ρ) and temperature (T) calculated in the previous iteration. en, MC-FLUENT will be responsible for data passing. It can calculate the new source term (S) of energy equation at each domain according to the new fission rate distribution and write the power density to the UDF files. After that, FLUENT will calculate a new density distribution and a temperature distribution according to the new power density. At last, MC-FLUENT will rewrite the XML files with new temperature and density. e simulation is stopped according to the criterion shown below: where φ is variable of interest (in this paper, temperature and power density), i is the ith cell, N is the number of cells, and n is the nth iteration. When ε reaches a certain value, the algorithm is stopped. e outline of block Gauss-Seidel-type Picard coupling algorithm is depicted in Algorithm 1 [21,28,33]. e superscript k of the letters means that this is the result of the kth iteration. With this scheme, the second physics (FLU-ENT in this paper) works with a more up-to-date linearization of the coupling term as compared to block Jacobi algorithm, which tends to accelerate convergence. However, it does not seem to be friendly to parallel processing. Additionally, the scalability of OpenMC and FLUENT dealing with one same coupling problem can differ from each other, which means that when the number of computing nodes increases, the computing efficiency of one code may start to decline while that of another code keeps on improving. In view of this, an adaptive load-balancing algorithm is proposed by present research to improve the efficiency of the coupled program. e procedure of MC-FLUENT with this scheme is presented in Figure 1.

Block Jacobi-Type Picard Scheme.
e second alternative, block Jacobi-type Picard scheme, can deal with OpenMC and FLUENT in a parallel manner. During each iteration, OpenMC and FLUENT are calculating with the coupling terms updated in the last iteration at the same time. MC-FLUENTonly needs to collect the results and update the coupling items after each iteration. e algorithm is also stopped according to the criterion shown in equation (1). e outline of the block Jacobi-type Picard coupling algorithm is shown in Algorithm 2 [28]. e superscript k-1 indicates that neither of the two codes needs the results calculated by each other at current iteration step. In this case, both codes would be running in parallel and fully exploiting the computational resources. e procedure of MC-FLUENT with block Jacobi-type Picard algorithm is presented in Figure 2. With this scheme, OpenMC and FLUENT will be running in a parallel manner and each code will use half of the available compute nodes. Nevertheless, the computing efficiency of OpenMC and FLUENT dealing with the same coupling problems differs from each other. e equal number of compute nodes is not the optimal solution, which may lead to longer waiting time. erefore, the resource allocation strategy can exert a great influence on the computational performance of block Jacobi scheme. To increase computing efficiency, an adaptive load-balancing algorithm is proposed by present research. Based on the respective calculating time of the two codes in the last iteration, this algorithm is able to make dynamic adjustments to their respective number of nodes in order to reduce waiting time and improve overall computational performance of the algorithm.

e Adaptive Load-Balancing Algorithms for MC and CFD
Coupling Simulations. According to the shortcomings of the two algorithms stated in Section 3.1, two adaptive loadbalancing algorithms are proposed in this paper.

For Block Gauss-Seidel-Type Picard Scheme. For block
Gauss-Seidel scheme, OpenMC and FLUENT are calculated in turn, and each code will use all the compute nodes assigned to the coupled program. Consequently, there will be no load-balancing problem. However, when the number of computing nodes increases, the computing efficiency of one code may start to decline while that of another code keeps on improving, which is not conducive to the improvement of overall efficiency. Considering that different coupling problems have different requirements for scalability, an adaptive load-balancing algorithm that can adjust the number of nodes dynamically is proposed. With this algorithm, the total number of compute nodes (N total ) and the minimum number of nodes used for MC code (N mc ) and CFD code (N cfd ) should be set before running the coupled program. en, MC-FLUENT will try to increase the number of nodes after each iteration step. In this paper, it will start with the minimum number of nodes and multiply by 2 every iteration. In addition, MC-FLUENT will count the running time of MC code (t k mc ), CFD code (t k mc ), and the total running time (t k total ) during each iteration step. If the calculation time increases, it will stop adjusting and reuse the last number of nodes. e outline of this algorithm is illustrated in Algorithm 3 with the superscript k of letters indicating the kth iteration.

For Block Jacobi-Type Picard Scheme. With block
Jacobi scheme, the MC code and the CFD code will be running in a parallel manner. By default, each code will use half of the compute nodes assigned to the coupled program. However, since the resources required by OpenMC and FLUENT are usually unequal, this solution can be less than the best. Furthermore, as the coupled problems met vary, the corresponding resource allocation schemes are different. erefore, an adaptive load-balancing algorithm has been proposed for the block Jacobi scheme. With this algorithm, the MC-FLUENT will allocate resources according to the total number of nodes (which must be larger than 2). During initialization, nodes are allocated according to the ratio of 1 : 1. en, the running time of each code during each iteration is counted, respectively. If the running time of one code is shorter than another one in the current iteration, the number of nodes used by this code will be halved during next iteration. However, if the total time of current iteration is longer than the last one, MC-FLUENT will stop adjusting and use the last allocative decision. Ideally, the adjustment will stop when the running time of the two codes is equal. e outline of the algorithm for block Jacobi scheme is depicted in Algorithm 4. e meaning of letters and superscripts is the same as Algorithm 3.

Data Exchange.
One major challenge to the coupling codes is the communication of the variables, which, in this paper, refer to temperature, density, and power density. Since the number of cells in OpenMC is smaller than the number of volumes in the FLUNET model, the decision of a suitable spatial mapping is important. During coupling, OpenMC needs a discrete set of temperatures and density for each cell while FLUENT needs correct power density for each volume element. In this case, the data exchange method used in Henry et al. [34] is adopted for reference in present research: an averaging operation is necessary on the FLU-NET volume elements corresponding to the OpenMC cell while transmitting the temperature and density. In addition, the cell of OpenMC model corresponds to the domains of the FLUENT model which is regarded as the coarser mesh, and interpolation can be done with FLUENT UDF.
(1) Requires: initial guess for ψ 0 , T 0 , ρ 0 , v 0 and S 0 (2) k � 0 (3) while Not Converged do: Science and Technology of Nuclear Installations 5 e third technique for updating the cross sections used in Seker et al. [35] is adopted for reference. With this approach, a library would be pregenerated for each nuclide with a proper temperature increment. en, with the use of the interpolation method in OpenMC manual [30], the cross sections lying between the temperature intervals would be almost accurate [35].

Computational Models
A single pressurized water reactor (PWR) cell model is simulated to verify the validity of the MC-FLUENT. In addition, as is mentioned in Section 3, the computing efficiency and scalability of OpenMC and FLUENT are observed to show a tendency to disaccord, which is likely to reduce the overall efficiency of the iteration algorithms. erefore, a 3 x 3 fuel pin example with similar requirements for the OpenMC and FLUENT and a 1000 MWt ABR metallic benchmark core example which centers on the performance of MC code are, respectively, adopted to study the computing efficiency of the two different schemes.

Single PWR Cell Model.
Although MC-FLUENT aims to analyze complex models with massively parallel computing, a single PWR cell model can verify the validity of the coupled program. Cardoni and Rizwan-uddin developed a coupled simulation tool MULTINUKE and performed a steady-state PWR simulation in 2011 [7,8]. eir work has been studied in many papers, and their results are considered to agree with PWR values typically reported in literature [36][37][38][39]. In the present study, we simulated the same PWR cell model in [7] and compared our results with theirs. As shown in Figure 3, this model consists of a fuel rod surrounded by cladding and water and has a height of 20 cm. Figure 4 shows the cell division of OpenMC and meshing for FLUENT in this paper.

3 X 3 Fuel Pins of PWR.
e second example is a 3 x 3 array of PWR pins, in which eight fuel pins were modeled around a central guide. e fuel pellet and the cladding were, respectively, made of UO2 and Zr. e gap between pellet and cladding has not been taken into consideration since this example does not emphasize the temperature distribution (1) Requires: initial set for N mc , N cfd and N total (2) bool update mc , update cfd � True (3) Get t 0 mc and t 0 cfd in the 0th iteration (4) while Iteration Not End do: (5) Get t k mc and t k cfd in the kth iteration (6) If update mc : (1) Requires: initial set for N total (2) bool update � True (3) Set N mc � N total /2, N cfd � N total − N mc (4) Get t 0 mc , t 0 cfd and t 0 total in the 0th iteration (5) while Iteration Not End && update do: (6) Get t k mc , t k cfd and t k total in the kth iteration (7) If (t k total < t k−1 total && t k mc < t k cfd ): inside the cladding; however, the equivalent thermal conductivity was set for the cladding. Figure 5 shows the schematic of the model and the number of fuel pins. e design parameters of the fuel pin and central guide are provided in Table 1. e entire OpenMC model was divided into 10 layers along the axis. For each layer, every single fuel pellet was trisected, and each of its three layers was divided into 4 cells, along with which the cladding was quartered and altogether the water was divided into 40 cells as is schematically illustrated in Figure 6(a). Reflective boundary conditions were set to the surface around the rod bundles. Considering that the model is strictly centrosymmetric in space, a 1/4 model was established for FLUENT to reduce the cost of computation. At the end of the calculation, the results were extended to the whole geometric region for data exchange using symmetry. ree varisized meshes were applied to verify the mesh sensitivity, which, respectively, contained 14000 K cells, 16000 K cells, and 18000 K cells. Each of the three meshes was employed in calculating the total pressure loss, and their errors from one another observed ranged acceptably less than 2%. erefore, the mesh containing 16,726,000 cells was taken as the final choice for FLUENT as is shown in Figure 6(b).

1000 MWt ABR Metallic Benchmark Core.
e third example is a 1000 MWt advanced burner reactor (ABR) metallic benchmark core developed by Organisation for Economic Co-operation and Development (OECD) [40]. e nominal power is 1000 MWt, and the core is divided into the inner and the outer core zones, with 78 and 102 assemblies, respectively.
ere are 271 fuel pins wrapped with wire in each driver subassembly. In order to simplify the geometry, the wire has been smeared with cladding during the modeling, and thus the cladding thickness has been slightly increased. In addition, the gap between fuel pellet and cladding was filled with bond sodium. e model built for OpenMC is presented in Figure 7.
Uniform temperature values in the coolant, structural materials, and fuel have been assumed in this benchmark problem which are shown in Table 2. In practice, however, the core temperature distribution should be dependent on the power distribution. erefore, we hope to provide a reasonable temperature distribution for OpenMC simulation by coupling with FLUENT.
In CFD calculation, due to the noticeably complex reactor structure, the computing scale corresponding to detailed geometric model is unsupportable. Considering that the example here focuses much more on the temperature and density distributions of the reactor rather than on flow details in the bundle, the porous model can be used to replace the detailed bundle structure during CFD calculation for simplicity. As shown in Figure 8(a), according to the division of regions in the benchmark, a quarter model was constructed according to the principle of volume equivalence and the reactor was divided into four parts: inner core, outer core, reflector region, and shield region. e porous media model has been used to simulate the influence of solid region on the flow. e detailed setting method of porous media model is described in FLUENT Users Guide [32]. For bundles of rods with wire wrap, Darcy friction factor can be calculated based on the empirical formula proposed by Cheng and Todreas [41,42]. e nonthermal balance model has been used in FLUENT to evaluate the heat transfer between fluid and solid in porous regions. With this model, the CFD code is enabled to generate overlapped solid regions in the porous regions, so that the temperatures of solid and liquid can be obtained, respectively, which will be transmitted to the MC program for further computing. e heat transfer coefficient of solid material can be set based on the formula proposed by Borishanski et al. [43,44].
After verifying the mesh sensitivity, the mesh containing 1,975,640 cells was taken as the final choice for FLUENT as shown in Figure 8(b). At the end of the calculation, the results were extended to the whole geometric region for data exchange using symmetry.

Single PWR Cell Model.
e purpose of the single PWR cell model is to verify the validity of the coupling program. e converged results are compared with reference [7] and presented in Table 3. Figure 9 shows the converged relative axial power distribution and the axial fuel temperature distribution at different radial locations (r). e axial distribution of average coolant temperature and density is shown in Figure 10. It can be found that our results are in good agreement with those in reference [7].

3 X 3 Fuel Pins of PWR.
MC-FLUENT has been performed on a high-performance computing system with the 64-bit CentOS Linux operating system. Each compute node of the cluster consists of one Intel Xeon E5-2620 CPU with 12 cores and a frequency of 2 GHz. In order to make a full test on the parallel performances of the two schemes, 1/2/4/ 8/16/32 compute nodes were used, respectively, which are    In the cases of this paper, different numbers of nodes are utilized to make a comparative study of computational performance. However, as data transmission remains the same way all along, changes in the number of nodes do not lead to any change in the number of iterations. Only 9 iteration steps were required for the block Gauss-Seidel scheme to converge while 19 were required for the block Jacobi scheme in the 3 X 3 fuel pin example. Figure 11 represents the total calculation time and speedup of the two algorithm using different number of nodes. e speedup is calculated by the following formula: where S p is the speedup, t is the total calculation time, and t 0 is the total calculation time of the block Gauss-Seidel scheme using one compute node. As shown in Figure 11(a), the block Gauss-Seidel scheme demonstrates a higher convergence speed than block Jacobi scheme, which results in less total computation time. However, due to the poor scalability of a certain code, the speedup of block Gauss-Seidel begins to experience a downward trend after the nodes used peaks at 16. erefore, we add the adaptive load-balancing algorithm described in Section 3.2 to block Gauss-Seidel algorithm. For this example, we set the algorithm of increasing from half of the total number of nodes. Because our adaptive load-balancing algorithm cannot make full use of all resources at the beginning of the iteration, the computing time is slightly higher than the original algorithm while number of nodes is less than 16. In fact, the decrease of efficiency can be avoided by setting a larger number of nodes at beginning. Nevertheless, the adaptive load-balancing algorithm successfully prevented the decrease of speedup when the number of nodes reaches 32. erefore, applying our algorithm when the number of nodes is larger can improve the performance of the block Gauss-Seidel scheme.
As shown in Figure 11(b), the total running time of the block Jacobi scheme is generally longer than that of block Gauss-Seidel algorithm because of its poor convergence. However, by adding the adaptive load-balancing algorithm described in Section 3.2, resources can be allocated more reasonably, and in this way, the performance of this algorithm can be optimized. However, when there are only two nodes, there are no resources to allocate. When the number of nodes reaches 32, computing resources are sufficient for this case, so the load-balancing algorithm cannot further improve the computing efficiency. Although the demand for computational power of OpenMC and FLUENT is similar in this example, our adaptive load-balancing algorithms can still improve the efficiency of the two schemes. e errors of effective neutron multiplication coefficients (K eff ) during the iterations are presented in Figure 12. e temperature of materials is 300 K for the initial calculation. 1000 batches and 100,000 particles per batch were specified to OpenMC simulation. Both schemes converge to a unified solution. e combined K eff is 1.26133 ± 0.0001. e power and temperature distributions along the axis of the second fuel pin are shown in Figure 13.

1000 MWt Metallic Fuel Core Benchmark.
Before the coupling iteration begins, core multiplication factor (K eff ), sodium void worth (Δρ Na ), Doppler constant (Δρ Doppler ), control rod worth (Δρ CR ), inlet and outlet temperature of coolant, and the average fuel temperature of the benchmark are calculated to verify the correctness of the model. e formula of these parameters and the related works can be found in the benchmark for neutronic analysis of sodiumcooled fast reactor cores with various fuel types and core sizes [40]. e results before coupling have been verified and are summarized in Table 4. 10 5 is reactivity difference in units of pcm. It is observed that there are no statistical differences from the eigenvalues obtained by other researchers [40].

Science and Technology of Nuclear Installations
For the coupling simulation, MC-FLUENT has been performed on the same high-performance computing system as the one employed in the preceding 3 x 3 fuel pin example. 1/2/4/8/16/32 compute nodes were used, respectively, to test the parallel performances of two schemes, which are equal to 12/24/48/96/192/384 CPU cores. 1000 batches and 500,000 particles per batch were specified during OpenMC simulation while RNG k-ε turbulence model and COUPLED discrete format were adopted during FLUENT calculation. In this case, the porous model is used in FLUENT while OpenMC utilizes the detailed model, so the computational resources required by OpenMC are much more than those required by FLUENT.
Four iteration steps were required for the block Gauss-Seidel scheme to converge, while 6 were required for the block Jacobi scheme. Figure 14 illustrates the calculation   Figure 9: Comparison of (a) the relative axial power distribution and (b) the axial fuel temperature distribution at different radial locations. time and speedup of the two schemes using different number of nodes. For the block Gauss-Seidel scheme, since OpenMC needs a lot of resources, only the number of nodes for FLUENT is limited while applying our adaptive loadbalancing algorithms. e number of nodes for FLUENT grows from 1 according to the algorithm. As shown in Figure 14(a), our algorithm hardly increases the running time while the number of nodes is small. Moreover, compared with the original block Gauss-Seidel algorithm, when there are more nodes, our algorithm obtains a higher speedup. e main reason is that the scalability of FLUENT in this case limits the speedup growth of the original algorithm. For block Jacobi algorithm, as represented in Figure 14(b), our adaptive load-balancing algorithms bring remarkable improvement in computing performance. e reasonable allocation of resources improves the computing efficiency of block Jacobi algorithm a lot. It is worth noting that when there are 32 nodes, the performance of the block Jacobi scheme while adding adaptive load-balancing algorithm is almost equal to that of the original block Gauss-Seidel scheme. It can be seen that for this example which demands different computational power of OpenMC and FLUENT, our algorithms can obviously improve the efficiency of the two schemes.
Monte Carlo results after OpenMC and FLUENT coupling are represented in Table 5. e eigenvalue calculated    after OpenMC and FLUENT coupling featured 54 pcm higher than the one obtained with OpenMC alone at a constant temperature. Besides, the average control rod worth slightly rose after coupling by 10 pcm. Unlike the Doppler constant, which has changed little, the average sodium void worth declined considerably due to the temperature and density distribution change of coolant. e axial distribution of power and core fuel temperature is shown in Figure 15 while the final temperature distribution of sodium on the symmetry section and z-axis intermediate section can be seen in Figure 16. In the calculation process of FLUENT, the density is usually set to have a polynomial correlation to temperature.

Conclusions
In this paper, an interface code MC-FLUENT has been developed to couple the Monte Carlo code OpenMC with the computational fluid dynamics code ANSYS FLUENT. Block Gauss-Seidel-type Picard algorithm and block Jacobitype Picard algorithm were researched to control the calculation of OpenMC and FLUENT. Moreover, in order to improve the performance of multi-physics simulation in massive parallel computing, two adaptive load-balancing algorithms are proposed to reduce the time cost of two algorithms when coupling OpenMC and FLUENT on the high-performance computer. A single PWR cell model is simulated, and the simulation results are compared with the values in literature. A 3 x 3 fuel pin example with similar requirements for the OpenMC and FLUENT and a 1000 MWt ABR metallic benchmark core example which centers on the performance of MC code are used to compare the performance of the two algorithms and verify the effectiveness of the adaptive load-balancing algorithms.
Overall, it was found that the single PWR cell simulation results in this paper are in good agreement with the values in literature, which verifies the validity of MC-FLUENT. Compared with block Jacobi algorithm, the block Gauss-Seidel algorithm has a higher efficiency, but the growth rate of its speedup may limited by the different scalability of OpenMC and FLUENT. e adaptive load-balancing algorithm for block Gauss-Seidel algorithm proposed in this paper can improve its performance when there are many nodes. e adaptive load-balancing algorithm for block Jacobi algorithm proposed in this paper can greatly improve its performance. In addition, with the adaptive load-balancing algorithm, block Jacobi algorithm can achieve similar performance to the original block Gauss-Seidel algorithm when the resources are sufficient. e adaptive load-balancing algorithms are more effective when the case demands different computational power of OpenMC and FLUENT. e future work will move focus from the conventional multi-core CPU platforms to the heterogeneous system architecture (HSA) which can be more cost-effective as an alternation. With the HSA, the coupling simulations may be able to expect better performances.

Data Availability
e data used to support the findings of this study are included within the article.

Conflicts of Interest
e authors declare that they have no conflicts of interest.