Some Progress on Parallel Modal and Vibration Analysis Using the JAUMIN Framework

In the development of large and complex equipment, a large-scale finite element analysis (FEA) with high efficiency is often strongly required.This paper provides someprogress onparallel solution of large-scalemodal and vibration FEproblems. Somepredominant algorithms for modal and vibration analysis are firstly reviewed and studied. Based on the newly developed JAUMIN framework, the corresponding procedures are developed and integrated to form a parallel modal and vibration solution system; the details of parallel implementation are given. Numerical experiments are carried out to evaluate the parallel scalability of our procedures, and the results show that the maximum solution scale attains ninety million degrees of freedom (DOFs) and the maximum parallel CPU processors attain 8192 with favorable computing efficiency.


Introduction
In many engineering applications such as aerospace, automotive, and numerous other large equipment, the FE modal and vibration analysis are important ways to characterize their dynamic properties.However, the FE computing scale and the efficiency of commercial software often become a bottle to solve these complex problems.Therefore, developing largescale and high efficient FE procedures is strongly required.
Mathematically, modal analysis is treated as computing a number of lower eigenpairs of the generalized eigenvalue problems: where K and M are stiffness and mass matrix, respectively.
For the FE problems of small DOFs (below one million, for example), the equation can be easily solved via traditional methods provided by commercial software, but for larger scale cases, the commercial software is inaccessible.Both algorithms and parallel computing techniques should be therefore considered.Modal analysis takes an important role in dynamic computation, which is also the base of many other vibration analysis types [1,2].For large scale vibration analysis problems, the mode superposition method [3] is usually a feasible and effective way to obtain the dynamic response of structures.We ourselves focus on the random vibration analysis in this paper.
Considering the parallel implementation of these methods for modal and vibration analysis, the design of framework is very important.Currently, many frameworks, such as SIERRA [4], PHG [5], and JASMIN [6], have been successfully developed or are being actively developed.The basic idea of these frameworks is providing common data structures, parallel strategies, and interface for various application programs.Framework has been proven to be an effective approach to develop parallel programs for different massive computers.
Aimed at large scale parallel computation of modal and vibration analysis, this paper gives some progress which is based on our newly developed JAUMIN (J Adaptive Unstructured Mesh application INfrastructure) framework.The rest of the paper is organized as follows.Section 2 gives a brief review of the eigenvalue algorithms and mode-based random vibration theory.Section 3 introduces the parallel implementation, including the data structures, the parallel strategy, and programming interface of JAUMIN framework.
A representative example with different scales is given to verify the parallel scalability of our application procedures and JAUMIN framework in Section 4. Finally, the paper ends with a brief conclusion in Section 5.

Algorithm and Theory for Modal and Vibration Analysis
Since only a few lower eigenpairs are of interest for modal analysis, subspace-based algorithms are usually the best choice for large sparse eigenvalue problems.Subspace-based methods are mainly based on matrix-vector multiplications using the original sparse matrix so that the sparse matrix storage and structures can be used to advantage [7].When modal analysis is finished, the mode superposition method can be adopted to carry out the following vibration analysis.

A Brief Survey of Subspace-Based Methods for Modal
Analysis.Subspace-based methods differ from each other in the ways the subspaces are generated.The dimension of subspaces can be either fixed or variable.The classical algorithms working with fixed dimensions mainly include power method, Rayleigh quotient iteration, and subspace iteration [7,8].
A further class of subspace methods is the class of those whose dimensions increase as the iteration proceeds.Usually one starts with a subspace of dimension one and increases the dimension at each iteration step.These methods are in general more efficient than fixed-dimension ones and have become the mainstream for large-scale eigenvalue problems.This class mainly includes two subclasses, the Krylov-based subspace methods and the Davidson-based subspace methods [7][8][9].
The Krylov subspace method can be traced back to the Lanczos method [10] for symmetric matrices and the Arnoldi method [11] for nonsymmetric matrices.Since Lanczos and Arnoldi use subspaces of increasing dimension, storage and computing time is increased during the algorithm.In order to overcome this disadvantage, many restarted Lanczos and Arnoldi methods have been developed.A significant improvement of these methods appertains to Sorensen's implicitly restarted Arnoldi/Lanczos (IRA) method [12].Later, Stewart [13] proposed the Krylov-Schur (K-S) method by expanding Arnoldi decomposition to a general Krylov decomposition.The implicitly restarted Arnoldi/Lanczos method and the Krylov-Schur method are mathematically equivalent, which are recognized as the most successful algorithms in Krylov subspace methods.
Instead of using Krylov subspace, the Davidson-based methods increase subspace dimension with a Newton iteration step or an approximate Newton iteration step.Based on the standard Davidson method [14], some generalized Davidson methods [7,9,15] were presented by using different preconditioning.In 1996, Sleijpen and Van der Vorst [16] presented a Jacobi-Davidson (J-D) algorithm, which speeded up the development of Davidson-based methods.So far, the Jacobi-Davidson method has been extended to various eigenvalue problems.For more about subspace-based methods in modal analysis, we refer to [17].
2.2.Some Discussion on Subspace-Based Methods.For the Krylov subspace methods, such as IRA and K-S, a spectral transformation is needed to get faster convergence to lower eigenpairs.With a Shift-Invert spectral transformation, (1) will become where  denotes a user-defined shift value.Due to the spectral transformation, the solution of a linear system with the form (K − M)y = Mz is needed in each subspace iteration [17].The coefficient matrix K − M is usually illconditioned or even nearly singular because of the shift value .Thus an iterative solution to (2) becomes difficult [7,9]; instead a direct method is often recommended.The direct matrix decomposition (such as LU or Cholesky) of K − M brings both merits and shortcomings.Firstly, the matrix decomposition gives a fast computation of modal analysis with fewer subspace iterative steps.Secondly, the maximum matrix scale that can be decomposed is limited (about ten million DOFs [17]), and the matrix factorization needs a vast demand of memory storage and parallel communication, which results in a relatively poor parallel scalability.For the Davidson-based methods, especially for the J-D method, the corresponding solution of modal analysis becomes an inner-outer iteration.The inner iteration, which is the kernel of J-D method, becomes the iterative solving of a "correction equation": where   and u  denote the Ritz value and Ritz vector, respectively, t is an orthogonal component vector to be solved, * denotes conjugate transpose, and r  is the corresponding residual vector; that is, r  = (K −   M)u  .Thus, the vector t in (3) is adopted to expand the search subspace (i.e., outer iteration).
Using inner-outer iterations, the J-D method takes lower memory storage since no matrix decomposition like Krylov space methods is carried out.Furthermore, the main mathematical operations in J-D are matrix-vector multiplications and vector inner products, which maintain the sparse characteristics of original matrices and greatly reduce the demand of parallel communication.

Mode-Based Random Vibration Theory.
Once the required eigenpairs for modal analysis are obtained, various following vibration analyses can be carried out using mode superposition method.Here we ourselves focus on the random vibration analysis of single-point support motion; other kinds of vibration analysis will be carried out in future step by step.
The dynamic equation of random vibration analysis for support excitation via FE method can be described as where C denotes damping matrix, x  denotes the relative displacement against support motion, d is a direction vector composed by the direction cosine between the motion direction and the three coordinate axes for each node, and  is the displacement value of support motion.Thus, the real displacement x can be expressed as Using the mode superposition method [3], (4) can be decoupled by the  eigenpairs (  ,   ) obtained by modal analysis (the eigenvectors are normalized with M), which yields where   is the th modal displacement,   is the th modal damping ratio, and   is the th modal participation factor with the definition   =    Md.Modal participation factors are indicators of the possible contributions of particular modes towards the stochastic response of the structure, and the corresponding computation concerns operations of matrix-vector multiplications and vector inner products, which can be implemented with parallel data structures.
For ( 6), the corresponding modal power spectral density (PSD)      () can be given as where and  ü ü() denotes PSD of acceleration, which is the input load on the base.Equation (7) builds the relationship between modal PSD and input PSD.
Applying the mode combination in mode superposition method, we get the PSD of x  for the th DOF as where  is the modal orders for combination and   denotes the value of the th DOF in the th modal shape   .
Considering relationship between the relative displacement and the absolute displacement in (5), we deduce the expression of PSD of x for the th DOF as where   denotes the value of the th DOF in direction vector d.Equation (10) gives the relationship between response PSD of the th freedom and the input PSD of base motion.Equation ( 10) is a perfect expression.Firstly, the implementation of (10) mainly becomes computing a component or subclass: which physically represents the transfer function between output relative displacement and input base acceleration; secondly, the response of the th DOF is absolutely independent of others.Thus, the PSD computation of each DOF in ( 10) is a natural parallel pattern, without any parallel communication occurring.In other words, the parallel communication for random vibration analysis only occurs at the stage of computing modal participation factors.

Parallel Implementation in JAUMIN Framework
The algorithm and theory mentioned in Section 2 are the kernel to realize the modal and vibration analysis of large engineering structures.The parallel implementation of modal analysis mainly depends on the eigenvalue solver package SLEPc [18] integrated in JAUMIN [19] framework, and the parallel implementation codes of random vibration are developed by ourselves.

JAUMIN Framework.
JAUMIN is a parallel computing framework which has been developed by China Academy of Engineering Physics since 2011.JAUMIN framework supports the large scale parallel simulations on adaptive unstructured meshes using massively parallel processing (MPP) machines for the FEA of engineering structures; it is a complement to the formerly developed JASMIN [6] framework, which aims at structured mesh applications in fluid mechanics.The software architecture of JAUMIN is based on multilayered, modularized, and object-oriented design thoughts, mainly being written with C++ and MPI, including about 12 thousand of lines so far.JAUMIN can be installed on personal computer, cluster, and MPP machines with a UNIX or LINUX system.The architecture of JAUMIN framework can be described in Figure 1.
In JAUMIN framework, the bottom layer is high performance computation supporting layer, which encapsulates geometry, mesh, and field data structures, providing parallel communication, load balancing, adaptive mesh refinement, and basic mathematical operations; the middle layer is numerical general layer, which provides fast solvers, mature numerical methods and time integrators, and so forth.The top layer provides programming interfaces for different applications.for two-dimensions and most of polyhedrons for threedimensions, such as triangle, quadrangle, tetrahedron, pyramid, and hexahedron.Furthermore, hybrid meshes are also supported in JAUMIN.

Data Structures and Parallel Communications. The element types in JAUMIN framework include most of polygons
Considering parallel computing environment, JAUMIN adopts a two-level partitioning strategy to partition the FEM, shown as in Figure 2. Firstly, the FEM is partitioned into submeshes considering load balance; each CPU processor is assigned with a submesh.Secondly, each submesh is again partitioned into a group of smaller submeshes, named as "patch." Instead of element, patch is the basic object for memory management and numerical computation in JAUMIN, which make it easier to implement MPI + OpenMP hybrid parallel programming and computing.On the other hand, the hierarchical data structure can be well matched with the predominant multicore computer architectures.The first level (P0, P1 in Figure 2) is partitioned and the corresponding information is sent to each CPU processor; and the second level (i.e., patch) is portioned based on the first level and corresponding information can be sent to the inner cores of each CPU processor if necessary.Each patch includes some basic information such as patch geometry, patch topology, and patch data, which constitutes the core data structures of the framework.
According to the data structures in JAMIN framework, the parallel communication module is designed with two kinds of communication modes: one is the data transferring among the first levels, which can be implemented by MPI; the other is the data transferring among the patches for each level, which can be implemented by OpenMP.In our implementation of mode and vibration analysis, only MPI is used for the parallel communication among levels, and the patch data are used only to form the distributing FE matrices for each level.

Interface Service and Parallel Implementation for Dynamic
Applications.Besides the basic data structures and parallel communication, the JAUMIN frame provides some interfaces to pre/postprocessing and the individual solvers.Using the preprocessing interface, the FEM of engineering structures is read to the JAUMIN framework.Currently, we build the FEM of engineering structures via a self-developed procedure, named as SuperMesh, which can implement the FE meshing and support adaptive refinement.
Based on the FEM information, a two-level domain decomposition mentioned above is implemented in JAUMIN framework to partition the FEM into some smaller subdomains.The information of each subdomain is delivered to different processors for FE discretization and numerical integral, until forming the distributing matrices K and M.
Another important interface in JAUMIN framework is the solver interface for different applications.As far as FEA is concerned, JAUMIN provides interfaces to linear/nonlinear static analysis, modal analysis, harmonic response analysis, random vibration response analysis, and some kinds of shock response analysis currently.We ourselves focus on the modal and vibration analysis in this paper.
The implementation of modal analysis mainly depends on an important software package, that is, SLEPc [18], the Scalable Library for Eigenvalue Problem Computations, which is taken as a third solver library.SLEPc provides a collection of eigensolvers, including Krylov-Schur, Jacobi-Davidson, Arnoldi/Lanczos, and some other subspace methods.Most of the parallel implementations in SLEPc are vector operations, the matrix-vector product and linear equation solvers, which are supported by another numerical toolbox PETSc [20].The JAUMIN framework generates parallel distributing stiffness matrix K and mass matrix M and makes them as the input for SLEPc to finish the parallel solution of modal analysis.The JAUMIN framework is responsible for the calling and managing of these software packages.
The codes for mode-based random vibration analysis are developed by ourselves using the corresponding theory shown in (4)∼ (11), which can be treated as a postcomputing process of modal analysis.A crucial step in random vibration is calculating the modal participation factors, which requires matrix-vector products and vector operations with the JAU-MIN parallel data structures.Another parallel process in random vibration analysis lies in the computation of PSD for different nodes.As mentioned in Section 2.3, this process is a natural parallel one without any communication.Each processor is responsible for the corresponding computing nodes that locate in it.Besides the PSD of relative displacement and absolute displacement as shown in ( 9) and (10), PSD of velocity (denoted by  ẋ  ẋ  ) and that of acceleration (denoted as  ẍ  ẍ  ) can also be calculated by relationships: which have been implemented in our codes.We want to emphasize that the vibration analysis in this paper is limited to the type of single-point support random excitations.The case of singe-point nodal load excitation is similar to the calculation of PSD of relative displacement in  (9), where the input PSD  ü ü() on the right side is replaced with the PSD of force   (), and the response PSD      () on the left side is replaced with      () since no relative displacement occurs.In this case, the implementation of singe-point nodal load excitation is easier than single-point support excitations.For the implementations of some other cases such as multipoint excitation calculations [21,22] are being developed in JAUMIN framework.

Numerical Experiments
Many simple engineering examples have been adopted to validate the rightness of our programs, and some of them were compared with commercial software ANSYS or MSC.Nastran, in which the same computing results are obtained.Here we give a representative example in engineering structures to evaluate the rightness and parallel scalability of the modal and vibration analysis in JAUMIN framework.The computations were carried out on a supercomputer, which include thousands of blade computing nodes, each of them with twelve Intel processors and 48 GB of shared memory.
The engineering example is the target ball structure of a laser equipment, whose FEM is shown in Figure 3.The model is meshed with tetrahedron elements, and the node displacements at the bottom of the model are fixed as a boundary constraint.The initial DOFs of the model are about 1.1 million, and an adaptive mesh refinement technique is adopted to increase the numbers of DOFs.After the first refinement, the DOFs attain 12, 214, 455, and after the second refinement, the DOFs attain 91, 207, 359.
As there are a large number of similar pipes in the structures, there exist numerous dense modes in the modal analysis, which obviously enhances the difficulty of numerical computation.Here, the J-D method is adopted for modal  analysis; the lowest 100 eigenpairs are desired for modal and random vibration analysis.For random vibration analysis, an acceleration excitation of base motion in the  direction is adopted, with the corresponding PSD shown in Figure 4.

Analysis and Validation of the Initial Model.
For the initial meshes (1.1 million DOFs) of the model, the modal and vibration analysis are also carried out by ANSYS to validate the rightness of our programs.The results show that the 100 orders of eigenpairs calculated in JAUMIN framework are coincident with that in ANSYS; the largest relative error for eigenvalues is below 0.05%, and the modal shapes are identical.Figure 5 gives a comparison of the first-order modal shape calculated in ANSYS and JAUMIN.Furthermore, using the results of modal analysis, the following mode-based random vibration analysis is also carried out and compared with ANSYS. Figure 6 gives the comparison of PSD of displacement in the  direction of the apogee.We can see from Figure 6 that the two curves are absolutely coincident in the range of analysis frequencies, which validates the rightness of both modal and vibration analysis from another point of view.

Testing on Parallel Scalability.
Based on the initial FEM, the target ball is refined twice to increase the mesh density.As mentioned above, the total DOFs attain 12, 214, 455 after the first refinement (Case 1) and attain 91, 207, 359 after the second refinement (Case 2).For these two scales, the traditional commercial software is incapable of action.The modal and random vibration analyses are carried out for Cases 1 and 2 with a maximum number of processors of 8192.Since the modal analysis is included in the random vibration analysis, we count the total computing time for random vibration analysis.Tables 1 and 2 gave the parallel computing time and efficiency with different numbers of processors for Cases 1 and 2.
From Tables 1 and 2 we can find that the parallel computing time continuously decreases as the increase of  processors, without computing inflexion, emerges.Based on the JAUMIN framework, our application programs behave a favorable parallel scalability.Furthermore, we believe that solving such a modal and vibration analysis problem near 100 million of DOFs is very challenging so far.

Conclusions
In this paper, we introduce some progress for the large-scale modal and vibration analysis based on the JAUMIN framework.The predominant algorithms and theory for modal and vibration analysis are firstly reviewed and deduced.A detailed introduction of JAUMIN framework is given and, based on which, an integrated parallel computing system for modal and vibration analysis is presented.
The numerical experiments via a typical engineering structure validate the rightness and scalability of our procedures for large-scale modal and vibration analysis.In virtue of the developed parallel solving system, we have elevated the computing scale of modal and vibration analysis to a new level with favorable parallel performance.Moreover, our procedures of the modal and vibration analysis fit most elastic structures coming from various engineering realms.

Figure 5 :
Figure 5: The comparison of the first-order modal shape.
PSD of absolute displacement

Figure 6 :
Figure 6: Comparison of PSD between ANSYS and JAUMIN framework.