Barium Titanate Semiconductor Band Gap Characterization through Gravitationally Optimized Support Vector Regression and Extreme Learning Machine Computational Methods

Barium titanate (BaTiO3) is a class of multifunctional ceramic materials with unique thermal stability, a prominent piezoelectric constant, an excellent dielectric constant, environmental friendliness, and excellent photocatalytic activity. These features have rendered barium titanate indispensable in many application areas such as electromechanical devices, thermistors, multilayer capacitors, and electro-optical devices. The photocatalytic activity of the barium titanate semiconductor is hindered by its large band gap and high rate of charge recombination. Doping of the parent barium titanate compound for band gap tuning is challenging and consumes appreciable time and other valuable resources. The present work relates the incorporation of foreign materials into the parent barium titanate to the corresponding energy band gap by developing extreme learning machine (ELM)-based models and a hybrid of support vector regression (SVR) with the gravitational search algorithm (GSA), using the structural lattice distortion that results from doping as the model descriptors. The developed gravitationally optimized SVR (GSVR) is characterized by low values of mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) of 0.036 eV, 1.145%, and 0.122 eV, respectively. The developed GSVR model outperforms the ELM-Sine and ELM-Sig models under various performance evaluators. The developed GSVR model is used to investigate the effect of iodine and samarium incorporation on the band gap of the parent barium titanate, and the estimated energy gaps conform excellently to the experimentally reported values. The demonstrated precision of the developed GSVR, as measured by the closeness of its estimates to the measured values, provides a quick and accurate method of energy gap characterization that circumvents experimental stress and conserves valuable time as well as other resources.


Introduction
Barium titanate is a ferroelectric multifunctional ceramic compound with promising potential in lead-free devices such as thermistors, electromechanical devices, multilayer capacitors, and electro-optical devices [1,2]. Thermal stability, a prominent piezoelectric constant, an excellent dielectric constant, and environmental friendliness are unique features that facilitate the application of barium titanate as an electronic component as well as a photocatalyst [3]. The choice of barium titanate semiconductor becomes necessary since the most widely explored titanium-based semiconductors are of low quantum efficiency and are restricted to UV illumination. In order to further enhance the photoresponse of barium titanate semiconductor under solar light irradiation, the construction of a heterojunction interface between two semiconductors has been identified as a unique method of enhancing charge separation and further optimizing band potentials in the compound [4,5]. Incorporation of nonmetallic elements into the parent crystal structure has also been reported as an effective method in the literature [6]. The purpose of this incorporation is to lower the band gap, alter the surface charge, and decrease the charge-carrier recombination rate. After doping, the incorporated dopants induce the following modifications in the barium titanate: (a) a negative and a positive shift of the conduction and valence band, respectively, which consequently narrows the band gap; (b) induction of high stability when doping with a nonmetallic heavy atom; (c) establishment of dopant localized states within the semiconductor band gap; (d) perturbation of intramolecular energy levels, which ultimately mixes pure triplet and singlet states; and (e) distortion of the crystal structure of the parent barium titanate semiconductor [6].
The extreme learning machine and a combination of the gravitational search algorithm (GSA) with support vector regression (SVR) are the computational methods proposed in this work, while the distortions caused by the incorporated dopants serve as the predictors for band gap estimation. In the crystal lattice of the host barium titanate, the Ti4+ ion sits at the center of an octahedron whose vertices are occupied by O2− ions [3]. The oxygen octahedra surround the void in which Ba2+ is situated. Doping, composite formation, or codoping of barium titanate strengthens the multifunctional capacities of the compound and promotes its application in photocatalysis through band gap tailoring and reduction of the charge recombination rate.
There are two substitution sites associated with barium titanate: the Ba2+ site (referred to as the A-site) and the Ti4+ site (called the B-site) [5,7]. Codoping and doping into these sites lead to improved ferromagnetic and optical properties. It is worth mentioning that substitution of foreign materials into these sites results in oxygen vacancies as a consequence of oxidation state differences, while the possible 3d electrons emanating from the dopants, coupled with the oxygen vacancies, strengthen the visible light absorption capacity of barium titanate. Additionally, the oxygen defects and the ions from the introduced dopants might reduce the electron-hole recombination rate by acting as complementary trap sites, which is one of the factors through which the photocatalytic activity could be enhanced [5]. The distortions due to the introduced dopants, as contained in the lattice constants of the doped barium titanate, are the inputs to the proposed hybrid GSVR and extreme learning machine based models in this contribution.
Machine learning techniques have recently gained attention for solving complex and real-life problems that cannot be handled by conventional methods [8][9][10][11][12]. Support vector regression belongs to a class of these intelligent techniques which tackles nonlinear real-life problems using kernel function where the input data is transformed into feature space characterized with high dimensionality [13].
The transformation is carried out with the aid of a kernel function, which can be a linear, sigmoid, polynomial, or Gaussian function. These unique characteristics have strengthened the application of SVR-based algorithms in many areas such as solid-state physics [14,15], laser spectroscopy [16,17], and others [18,19]. The parameters of the SVR algorithm control the precision of the model, and efficient tuning of these parameters is the bedrock for achieving excellent performance. The gravitational search algorithm (GSA), which is based on Newtonian physical principles, is implemented for parameter tuning in this work.
An extreme learning machine is a feedforward neural network with a single hidden layer. The merits of this network as compared with the conventional neural network include high learning speed, excellent generalization performance, and computation of the output weights during training using the linear ordinary least squares solution, which is noniterative in nature [14,20,21]. The learning capacity of ELM extends beyond regression problems and can effectively handle clustering as well as classification problems. The universal approximation strength of the ELM algorithm allows approximation of a continuous target function on a compact subset of the Euclidean space. These unique features of ELM-based models have rendered the algorithm indispensable in addressing many real-life problems [17,[22][23][24]. The remaining part of this work is arranged in the following order: Section 2 discusses the mathematical formulation and background of support vector regression, the gravitational search algorithm, and the extreme learning machine. Section 3 presents the physical description of the employed dataset and its sources. The step-by-step computational strategies are also highlighted in Section 3. Section 4 analyses the outcomes of the proposed models. Comparison of the estimates of the developed models with the experimentally determined values for characterizing the influence of different dopants on the band gap of barium titanate is also presented in Section 4. Section 5 discusses the conclusions of the research work.

Mathematical Formulations of the Implemented Algorithms
The formulations of the implemented support vector regression algorithm, as well as the extreme learning machine, are presented here. This section further presents the physical background of the implemented gravitational search optimization algorithm.

Background of the Support Vector Regression Algorithm.
SVR maps the crystal lattice parameters of doped BaTiO3 to a high-dimensional feature space through a nonlinear kernel mapping function $\beta(d)$. The algorithm then constructs a linear regression in the feature space on all training samples, where $d_j \in \mathbb{R}^m$ denotes the descriptors and $E_j^b$ the corresponding target [25][26][27]. The approximated regression function of the SVR algorithm is presented in equation (1), where $(\omega \cdot \beta(d))$ represents the dot product in the feature space and $\beta(d)$ stands for the mapping function.
The ε-insensitive loss function $L_f$ presented in equation (2) is minimized in equation (3) in order to minimize the Euclidean norm $\|\omega\|^2/2$ and thereby obtain a flat function. The parameter $C$ in equation (3) is known as the regularization factor; it trades off the flatness of the acquired function linking the descriptors with the target against the training error. This parameter is among the model hyperparameters that need to be well tuned to ensure a precise and robust model. Slack parameters $(\xi_j, \xi_j^*)$ are incorporated into the convex optimization problem to limit the effect of constraints that would otherwise prevent the actualization of the flat model. The optimization expression is simplified as shown in equation (4) [28,29]. The Lagrange multiplier coefficients ($\eta_j$ and $\eta_j^*$) are obtained by solving the optimization problem presented in equation (4) and are used to compute equation (5). The samples with nonzero coefficients, together with their crystal lattice parameters of doped BaTiO3, make up the support vectors (their number being $m_{SV}$), which can be applied to subsequent BaTiO3-based compounds for energy band gap computation. The final SVR approximated relation is presented in equation (6). The expression for the kernel function that performs the data transformation is defined in equation (7), where the kernel option is represented by $\lambda$. The epsilon, the kernel parameter, and the regularization factor are the hyperparameters of the algorithm that must be optimized to ensure a precise and robust SVR-based model. The gravitational search algorithm, which operates on Newtonian principles, has been employed in the present work because of its robustness, physically intuitive operational principle, quick convergence to optimality, and strong mathematical formulation.
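For reference, the standard ε-insensitive SVR formulation can be written with the symbols defined above as in the sketch below. This is the textbook form and is not a transcription of equations (1) to (7); the bias $b$, the function name $f$, the sample count $n$, and in particular the placement of the kernel option $\lambda$ inside the Gaussian kernel are assumptions.

```latex
% Textbook epsilon-insensitive SVR, written with the section's symbols.
% Assumed (not from the source): b, f, n, and the exact form of the kernel.
\begin{align}
f(d) &= \left(\omega \cdot \beta(d)\right) + b \\
L_f\left(E^b, f(d)\right) &=
  \begin{cases}
    0, & \left|E^b - f(d)\right| \le \varepsilon \\
    \left|E^b - f(d)\right| - \varepsilon, & \text{otherwise}
  \end{cases} \\
\min_{\omega,\, b,\, \xi,\, \xi^*}\quad & \frac{\|\omega\|^2}{2}
  + C \sum_{j=1}^{n} \left(\xi_j + \xi_j^*\right) \\
\text{subject to}\quad & E_j^b - f(d_j) \le \varepsilon + \xi_j, \quad
  f(d_j) - E_j^b \le \varepsilon + \xi_j^*, \quad
  \xi_j,\ \xi_j^* \ge 0 \\
\omega &= \sum_{j=1}^{n} \left(\eta_j - \eta_j^*\right) \beta(d_j) \\
f(d) &= \sum_{j=1}^{m_{SV}} \left(\eta_j - \eta_j^*\right) K(d_j, d) + b \\
K(d_i, d_j) &= \exp\left(-\frac{\left\|d_i - d_j\right\|^2}{2\lambda^2}\right)
\end{align}
```

Only samples whose prediction error reaches or exceeds ε receive nonzero $\eta_j - \eta_j^*$, which is why the final sum runs over the $m_{SV}$ support vectors rather than over all training samples.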

Operational Physical Principle of the Gravitational Search Algorithm.
The gravitational search algorithm is a heuristic optimization algorithm that navigates the search space of probable solutions using Newton's law of gravitation together with the concept of mass [30,31]. The candidate solutions in the defined space are a set of objects (called agents in this description) that are affected by the gravitational pull and consequently change their positions, as well as their distances from other agents, in accordance with their respective masses. The mass of each agent is determined by its fitness. Consider a search space containing $m$ agents, each with a distinct mass and position. Suppose that the position of the $j$th agent in dimension $d$ is denoted by $x_j^d$. The position vector $X_j$ of the $j$th agent is presented in equation (8) after taking the search space dimension $N$ into consideration. The gravitational force $F_{ji}^d(k)$ acting from agent $i$ with active gravitational mass $M_i^a$ on agent $j$ with passive gravitational mass $M_j^p$ at iteration time $k$ in dimension $d$, for a closed system of masses, is presented in equation (9) [14,32], where $\xi$ represents a small value that prevents a zero denominator and promotes stochastic search in the algorithm. The expression for the Euclidean distance $R_{ji}$ between the two agents is presented in equation (10) [33,34]. The gravitational constant $G(k)$ contained in equation (9) is expressed in equation (11), where $\mu$ and $G_0$ stand for a positive constant and the initial value of the gravitational constant, respectively. The maximum number of iterations is represented by $T$. The expression for the total resultant force $F_j^d(k)$ acting on the $j$th agent at iteration time $k$ along dimension $d$ is given in equation (12), where $\mathrm{rand}_i^d$ and Kbest, respectively, stand for a random number within the interval [0, 1] and the number of probable solutions generated within the search space.
At the commencement of the optimization procedure, Kbest equals the number of possible solutions and decreases gradually as the algorithm approaches the global solution. Equation (13) computes the acceleration $a_j^d(k)$ of the $j$th agent along dimension $d$ at time $k$, where $M_{jj}$ represents the inertial mass. The velocities of the agents are updated in accordance with equation (14), where $V_j^d(k)$ represents the velocity of the $j$th agent along dimension $d$ at time $k$.
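The quantities described above take the following form in the standard gravitational search algorithm; the sketch is written with the symbols of this section and is assumed, rather than confirmed, to correspond to equations (8) to (14). The random factor $\mathrm{rand}_j$ in the velocity update is a symbol introduced here for illustration.

```latex
% Standard GSA relations, in the notation of this section.
% Assumed: the correspondence with equations (8)-(14) and the symbol rand_j.
\begin{align}
X_j &= \left(x_j^1, \ldots, x_j^d, \ldots, x_j^N\right), \quad j = 1, 2, \ldots, m \\
F_{ji}^d(k) &= G(k)\, \frac{M_j^p(k)\, M_i^a(k)}{R_{ji}(k) + \xi}
  \left(x_i^d(k) - x_j^d(k)\right) \\
R_{ji}(k) &= \left\| X_j(k) - X_i(k) \right\|_2 \\
G(k) &= G_0 \exp\left(-\mu \frac{k}{T}\right) \\
F_j^d(k) &= \sum_{i \in Kbest,\ i \ne j} \mathrm{rand}_i^d\, F_{ji}^d(k) \\
a_j^d(k) &= \frac{F_j^d(k)}{M_{jj}(k)} \\
V_j^d(k+1) &= \mathrm{rand}_j\, V_j^d(k) + a_j^d(k)
\end{align}
```

Because $G(k)$ decays exponentially with $k$, the forces are large early on, which favors exploration, and small near iteration $T$, which favors exploitation around the best agents.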
Similarly, the position of the agent during the exploitation and exploration of the defined search space is updated using equation (15), while the assumptions contained in equation (16) hold [35,36]. The fitness of the agent in the search space is calculated by incorporating equation (17) into equation (18), where worst(k), $\mathrm{fit}_j(k)$, and best(k) represent the worst fitness (the highest value of root mean square error), the fitness of the $j$th agent (its root mean square error), and the best fitness (the lowest value of root mean square error), respectively. The mathematical expressions for the best and worst agents are given in equations (19) and (20), respectively.
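The update rules described in this subsection can be sketched in a few lines of Python. This is a minimal illustration of the standard GSA for a minimization problem, not the authors' MATLAB implementation; the function name `gsa_minimize`, the defaults `g0=100` and `mu=20`, the linear Kbest schedule, and the clipping of positions to the bounds are all assumptions made for the example.

```python
import math
import random


def gsa_minimize(objective, lower, upper, n_agents=20, n_iter=100,
                 g0=100.0, mu=20.0, eps=1e-12, seed=0):
    """Minimal gravitational search sketch (all defaults are assumed values)."""
    rng = random.Random(seed)
    dim = len(lower)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, float("inf")
    for k in range(n_iter):
        fit = [objective(x) for x in pos]
        for x, f in zip(pos, fit):
            if f < best_f:
                best_x, best_f = list(x), f
        best, worst = min(fit), max(fit)
        # Normalised masses: the lower the error, the heavier the agent.
        if worst == best:
            mass = [1.0 / n_agents] * n_agents
        else:
            raw = [(f - worst) / (best - worst) for f in fit]
            total = sum(raw)
            mass = [r / total for r in raw]
        # Exponentially decaying gravitational constant.
        g = g0 * math.exp(-mu * k / n_iter)
        # Kbest shrinks linearly from all agents down to one (assumed schedule).
        kbest = max(1, round(n_agents - (n_agents - 1) * k / max(1, n_iter - 1)))
        elite = sorted(range(n_agents), key=lambda i: fit[i])[:kbest]
        for j in range(n_agents):
            acc = [0.0] * dim
            for i in elite:
                if i == j:
                    continue
                dist = math.dist(pos[j], pos[i])
                for d in range(dim):
                    # Force divided by the agent's own mass, so M_j cancels.
                    acc[d] += (rng.random() * g * mass[i]
                               * (pos[i][d] - pos[j][d]) / (dist + eps))
            for d in range(dim):
                vel[j][d] = rng.random() * vel[j][d] + acc[d]
                # Keep the agent inside the search box (an added safeguard).
                pos[j][d] = min(upper[d], max(lower[d], pos[j][d] + vel[j][d]))
    return best_x, best_f


if __name__ == "__main__":
    # Toy objective standing in for the RMSE of an SVR model.
    sphere = lambda x: sum(v * v for v in x)
    print(gsa_minimize(sphere, [-5.0, -5.0], [5.0, 5.0]))
```

In the hybrid model the objective would be the root mean square error of an SVR trained with the hyperparameters encoded by each agent; the toy sphere function only stands in for that evaluation.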

Extreme Learning Machine Mathematical Background.
Typically, there are three distinct layers in the ELM operational principle: the input layer, the hidden layer, and the output layer [20]. The input biases and the input layer weights are randomly generated and take no further part in the training process. Consider the crystal lattice parameters $L_p \in \mathbb{R}^{n \times d}$, $L_p = (L_1, \ldots, L_n)^T$, as input data for modeling the band gap of doped barium titanate, where $d$ and $n$ are the feature and sample size, respectively. During training, the available experimental crystal lattice parameters of doped barium titanate are mapped to the ELM random space of dimensionality $N$. The output of the hidden layer is computed as presented in equation (21) after the nonlinear transformation $g$ [16,24]. The activation function $g$ might be a sine, sigmoid, linear, polynomial, or triangular basis function, among others. The output layer of the network is computed using equation (22). In order to determine the coefficients $\varphi$ of the output layer, the condition contained in equation (23) needs to be satisfied, where $E_g = (E_1, \ldots, E_m)^T$ is the energy band gap matrix of the input crystal lattice matrix $L_p$. The ELM approximated function $E_g^{\mathrm{predicted}}(L_p) = \tau$ represents the approximation of the measured band gap matrix $E_g$. Introduction of $H \in \mathbb{R}^{m \times N}$ further simplifies the problem as presented in equation (24). Therefore, equation (23) can be further minimized as presented in equation (25). Solving equation (25) through the generalized Moore-Penrose inverse completes the training process of the ELM algorithm.
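In compact form, the usual ELM training relations can be sketched as follows. The input weight matrix $W$, the bias vector $b$, and the all-ones column vector $\mathbf{1}$ are symbols assumed here for illustration, and the correspondence with equations (21) to (25) is likewise assumed.

```latex
% Usual ELM relations; W, b and the all-ones vector are assumed symbols.
\begin{align}
H &= g\left(L_p W + \mathbf{1}\, b^{T}\right) \\
\tau &= H \varphi \\
\hat{\varphi} &= \arg\min_{\varphi} \left\| H \varphi - E_g \right\|^2 \\
\hat{\varphi} &= H^{\dagger} E_g
\end{align}
```

Here $H^{\dagger}$ denotes the Moore-Penrose inverse of the hidden layer output matrix, so the output coefficients are obtained in a single noniterative least squares step.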

Computational Hybridization of the Algorithms and Data Acquisition
The computational strategies of the proposed hybrid model and those of the ELM-based models are discussed in this section. The acquisition of the employed dataset as well as the physical connection between the predictors and the energy gap of doped barium titanate is also presented.

Data Physical Description and Acquisition.
The structural lattice parameters of barium titanate semiconductors doped with different materials, and their corresponding band gaps, employed for developing the proposed GSVR model are extracted from the literature [2][3][4][5][6][7][37][38][39][40]. Replacement of Ti4+ ions in the parent barium titanate by equivalent ions (say Sn4+) from the dopants shifts the diffraction peaks to lower angles, with a characteristic unit cell expansion due to the difference in ionic radii [41]. The expansion and contraction of the lattice parameters due to incorporation at the octahedral coordination site serve as the predictors to the developed model. The predictors and the energy gap of the doped barium titanate are analyzed statistically, and the outcomes of the analysis are shown in Table 1. Inferences about the overall content of the dataset can be drawn from the obtained values of the minimum, maximum, and mean. Information regarding the consistency of the band gap from measurement to measurement can be inferred from the standard deviation presented in the table.
The correlation coefficients are also presented to assess whether linear modeling techniques could handle the band gap prediction, since the extent of a linear relationship is measured by the value of the correlation coefficient. From the obtained coefficients of correlation, it is observed that linear modeling methods would perform poorly in establishing a link between the energy gap of the barium titanate semiconductor and its structural parameters.
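The linear correlation coefficient referred to above is the Pearson coefficient, which the following short sketch computes from scratch. The function name `pearson` and the numbers in the usage lines are illustrative only and are not taken from Table 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # Undefined (division by zero) when either sequence is constant.
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical descriptor values with a purely quadratic response:
    # the dependence is exact, yet the linear correlation vanishes.
    distortion = [-2.0, -1.0, 0.0, 1.0, 2.0]
    response = [v * v for v in distortion]
    print(pearson(distortion, response))
```

A coefficient close to zero, as in this toy case, does not mean the descriptor is uninformative; it only indicates that a linear model would not capture the relationship, which is the motivation for the nonlinear SVR and ELM models.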

Computational Strategies of the Developed Hybrid Model.
Generation of support vectors from the support vector regression algorithm was conducted within the MATLAB computing environment, while the hybridized gravitational search algorithm optimizes the hyperparameters associated with SVR in order to attain a precise and robust model. The entire dataset, extracted from forty barium titanate based compounds, was partitioned into training and testing sets in the ratio of 8 : 2, where 80% of the dataset (thirty-two compounds) was employed for support vector acquisition and generation while the testing set takes the remaining 20% (equivalent to eight compounds). Before the described data partitioning, the dataset was randomized so that the data points are evenly and uniformly distributed. This step is necessary for unbiased and efficient computation. The procedures and methodologies for computational hybridization of SVR with the gravitational search algorithm are described in a stepwise manner as follows:
Step I. Definition of the search space and population initialization: the search space that houses the limits of the probable solutions to the hyperparameters (regularization factor, epsilon, and kernel option) is defined as [1000, 0.1, 0.1] and [1, 0.001, 0.001] for the upper and lower limits, respectively. This means the upper and lower limits of the regularization factor were set at 1000 and 1, respectively, while those of epsilon were set at 0.1 and 0.001, respectively. The epsilon and kernel option span the same search space. A definite number of agents was initialized within the defined search space, and the significance of this number for the convergence of the model was further investigated.
Step II. Completion of the support vector regression implementation cycle for fitness evaluation: the fitness of each of the agents was evaluated by implementing the support vector regression algorithm on the training set of data, while the root mean square error between the experimental band gap and the predicted value serves as the measure of fitness. The fitness evaluation is summarized as follows: (a) Kernel function selection: a function was selected for data transformation from the available pool of functions, such as the radial basis function (Gaussian function), linear, sigmoid, and polynomial functions. (b) Incorporation of GSA agents and the training dataset into the chosen transformation function: the contents of each GSA agent were combined with the mapping function to train the SVR algorithm using the training set of data. The number of agents in the initial population corresponds to the number of developed SVR-based models, while the root mean square error between the measured band gap and the predicted value determines the fitness of each agent (developed model). (c) Fitness evaluation: the agents were ranked by fitness, with the agent that has the lowest root mean square error as the best of the population and the agent with the highest root mean square error as the worst. Equation (17) computes the best and the worst of the agents in the population. This evaluation was carried out using a testing set of data through the implementation of the support vectors acquired during the training phase of model development.
Step III. Update of GSA important parameters: the gravitational constant, worst agent of the population, and best agent are updated using equations (11), (19), and (20), respectively.
Step IV. Computation of inertial mass and the acceleration with which each of the agents is approaching convergence: from the results of the fitness obtained in equation (17), the inertial mass of each of the agents was computed using equation (18). The acceleration of each of the agents was calculated using equation (13).
Step V. Updating the position and velocity of the agent for population replacement: after the selection of the best agent in the initial iteration, the entire population is replaced with the new population through velocity and position update using equations (14) and (15), respectively.
Step VI. Stopping conditions: the algorithm repeats the cycle from Step II to Step V until the number of iterations reaches 100 or the root mean square error attains a value of zero.
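The data handling in the steps above can be sketched as follows: a seeded shuffle, an 80/20 split of the forty compounds, and uniform initialization of the agents inside the stated search bounds. The function names `shuffle_split` and `init_agents`, the seeds, and the use of integer indices in place of the actual compounds are assumptions for illustration; the SVR training that turns each agent into a fitness value is not shown.

```python
import random

# Search bounds from Step I, ordered as
# (regularization factor, epsilon, kernel option).
LOWER = [1.0, 0.001, 0.001]
UPPER = [1000.0, 0.1, 0.1]


def shuffle_split(samples, train_fraction=0.8, seed=0):
    """Randomize the samples, then split them into training and testing sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def init_agents(n_agents, lower=LOWER, upper=UPPER, seed=0):
    """Draw each agent (one hyperparameter triple) uniformly within the bounds."""
    rng = random.Random(seed)
    return [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
            for _ in range(n_agents)]


if __name__ == "__main__":
    # Indices 0..39 stand in for the forty doped compounds.
    train, test = shuffle_split(range(40))
    print(len(train), len(test))
    print(init_agents(3))
```

Fixing the seed makes the partition reproducible, which matters when the same split is reused to compare the GSVR model with the ELM-based models.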

Extreme Learning Machine Based Model Computational Strategy.
The entire forty barium titanate based compounds doped with different materials were randomized and partitioned into training and testing sets in the ratio of 4 : 1. The following steps were taken after the randomization and partitioning stages of model development.
Step I: the biases (b) and input weights (φ) were randomly generated using a pseudorandom number generator through seeding in MATLAB environment.
Step II: an activation function was selected from a pool of functions such as the sine function, sigmoid function, linear function, and triangular basis function. The number of hidden neurons was initially set to one.
Step III: computation of hidden layer output matrix using equation (22) coupled with training set of data.
Step IV: determination of the coefficients of output layer through the implementation of equation (23).
Step V: computation of output layer weights through Moore-Penrose inverse implementation.
Step VI: for a given activation function, repeat Step II to Step V with the number of hidden neurons varying from two to one hundred. Compute the performance measuring parameters (root mean square error, mean absolute error, and correlation coefficient) for each of the models and save the parameters of the best model. The parameters of the best model include the input weights, biases, output weights, number of hidden neurons, and the activation function.
Step VII: repeat Step II to Step VI for each of the available activation functions.
Step VIII: evaluate each of the models using a testing set of data by computing the performance measuring parameters and save the model parameters. The computational flowchart for the developed ELM-based model is presented in Figure 1.
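The steps above can be sketched with NumPy as shown below. This is an illustrative outline under stated assumptions, not the authors' MATLAB code: the names `train_elm`, `predict_elm`, and `sweep`, the uniform range [-1, 1] for the random weights, the restriction to sine and sigmoid activations, and the use of training RMSE as the selection criterion are all choices made for the example.

```python
import numpy as np


def train_elm(X, y, n_hidden, activation, rng):
    """Random input weights and biases, output weights by pseudo-inverse."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = activation(X @ W + b)          # hidden layer output matrix
    beta = np.linalg.pinv(H) @ y       # Moore-Penrose least squares solution
    return W, b, beta


def predict_elm(X, model, activation):
    W, b, beta = model
    return activation(X @ W + b) @ beta


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Two of the activation functions named in the text.
ACTIVATIONS = {
    "sine": np.sin,
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
}


def sweep(X, y, max_hidden=100, seed=0):
    """Try every activation and 1..max_hidden neurons, keep the lowest RMSE."""
    best = None
    for name, g in ACTIVATIONS.items():
        for n_hidden in range(1, max_hidden + 1):
            rng = np.random.default_rng(seed)  # seeded for reproducibility
            model = train_elm(X, y, n_hidden, g, rng)
            err = rmse(y, predict_elm(X, model, g))
            if best is None or err < best["rmse"]:
                best = {"activation": name, "n_hidden": n_hidden,
                        "rmse": err, "model": model}
    return best


if __name__ == "__main__":
    # Synthetic stand-in data: 32 samples with 3 lattice-type descriptors.
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, size=(32, 3))
    y = np.sin(X.sum(axis=1)) + 3.0
    result = sweep(X, y, max_hidden=20)
    print(result["activation"], result["n_hidden"], result["rmse"])
```

Selecting on training error alone tends to favor the largest networks, so in practice the saved candidates would be compared on the held-out testing set, as Step VIII describes.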

Results and Discussion
The results of each of the developed models and their comparison are presented in this section. A comparison of the measured and estimated band gap is also presented.

Exploration and Exploitation of Search Space for Different Population Sizes.
The dependence of the exploitation and exploration capacities of the developed GSVR model on the number of agents within the search space is presented in Figure 2. The exploration capacity of the algorithm is significant before the most probable region of the search space is located, while the exploitation capacity measures the probing efficiency of the algorithm around the global solution. For different numbers of agents, the algorithm converges to the same global solution. This kind of convergence justifies the robustness of the developed hybrid model.
The optimum values of SVR hyperparameters as obtained from GSA are presented in Table 2. The table also presents the optimum hyperparameters of ELM-based models.

Performance Comparison between ELM-Based Models and GSVR Model.
The developed hybrid GSVR and ELM-based models were evaluated during the training phase of model development, and the results of the evaluation are presented in Figure 3. Figure 3(a) presents the evaluation on the basis of correlation coefficient (CC), Figure 3(b) compares the models using the mean absolute error (MAE) as the yardstick for performance evaluation, while Figure 3(c) presents the comparison using the root mean square error (RMSE) metric. The developed GSVR model outperforms the ELM-Sine and ELM-Sig models with performance improvements of 68.34% and 68.33%, respectively, using CC as the performance evaluator, as presented in Figure 3(a), while performance enhancements of 99.96% and 99.97% were, respectively, obtained when the developed GSVR model is compared with the ELM-Sine model on the basis of MAE and RMSE, as presented in Figures 3(b) and 3(c). The performance superiority of the developed GSVR model over the ELM-based models can be attributed to the uniqueness of the hybridization and the strong structural risk minimization principle upon which the SVR algorithm is built. The comparison of the performance of the developed models during the testing phase of model development is presented in Figure 4. The developed GSVR model performs better than the ELM-Sine and ELM-Sig models on the basis of CC, with performance improvements of 4.27% and 4.25%, respectively, as depicted in Figure 4(a). On the basis of RMSE, as shown in Figure 4(b), the developed GSVR model shows a performance superiority of 24.90% over the ELM-Sine model. The training phase of the GSVR model shows better performance than the testing phase by 7.92%, 99.96%, and 99.94% using CC, MAE, and RMSE as performance evaluators, respectively. The testing phases of the ELM-based models outperform the training phases, as can be observed in Figures 3 and 4.
Performance improvements of 64.37%, 29.41%, and 44.55% were obtained when comparing the testing phase of the ELM-Sine model with its training phase on the basis of CC, MAE, and RMSE, respectively. The overall comparison for the entire dataset is presented in Figure 5.
Performance improvements of 86.59% and 86.60% were obtained for the GSVR model over the ELM-Sine and ELM-Sig models, respectively, on the basis of CC, as presented in Figure 5(a). Performance enhancements of 54.60% and 63.02% were obtained for the GSVR model over the ELM-Sine model on the basis of MAE and RMSE, respectively, for the entire dataset, as depicted in Figures 5(b) and 5(c). The actual values obtained for each of the performance measuring parameters for each model are presented in Table 3. Figure 6 also depicts the cross-plot correlation between the experimental band gap energy and the values estimated using each of the developed models.
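The error metrics used in this comparison can be computed as in the sketch below. The text does not state how the percentage improvements were derived, so the `improvement` helper, which takes the relative change with respect to the baseline model, is an assumption, as are the function names and the toy numbers in the usage lines.

```python
import math


def mae(measured, predicted):
    """Mean absolute error, in the units of the band gap."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def mape(measured, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((m - p) / m)
                       for m, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean square error, in the units of the band gap."""
    return math.sqrt(sum((m - p) ** 2
                         for m, p in zip(measured, predicted)) / len(measured))


def improvement(baseline, candidate, lower_is_better=True):
    """Relative improvement of candidate over baseline in percent (assumed definition)."""
    if lower_is_better:  # error metrics such as MAE and RMSE
        return 100.0 * (baseline - candidate) / baseline
    return 100.0 * (candidate - baseline) / baseline  # e.g. correlation coefficient


if __name__ == "__main__":
    # Hypothetical measured and predicted band gaps in eV.
    measured = [3.0, 3.2, 2.8]
    predicted = [3.1, 3.0, 2.8]
    print(mae(measured, predicted), mape(measured, predicted),
          rmse(measured, predicted))
    print(improvement(0.2, 0.1))
```

Under this definition an improvement close to 100% in MAE or RMSE means the candidate's error is a small fraction of the baseline's error, not that the error has vanished.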

The alignment of the band gap energy estimates of the GSVR model with the measured values for doped barium titanate shows its superiority over the other developed models.

Band Gap Enhancement Effect of Iodine Ion Incorporation on Barium Titanate Semiconductor Using the Developed GSVR Model.
The effect of iodine ion incorporation into the crystal structure of barium titanate is presented in Figure 7. Substitution of the Ti4+ ion in the parent barium titanate by I5+ iodine ions changes the symmetry and dynamics of the system. Apart from the photoresponse enhancement, the efficiency of charge transfer also improves significantly. Figure 7 also compares the estimates of the developed GSVR model with the measured values [6]. The estimates of the developed GSVR model align closely with the measured values.
The observed photocatalytic enhancement can be attributed to the formation of dopant energy states near the conduction band edge of the parent barium titanate, a consequence of the mixing of iodine 5p states with Ti 3d states, which shifts the oxygen 2p states upward to higher energy in the parent compound.

[...] cation (Ba2+). This ionic state difference changes the dipole region carrier concentration and also leads to the formation of vacancies as well as defects. The developed GSVR model captures the trend of band gap enhancement effectively due to its hybridization capacity for hyperparameter optimization and the structural risk minimization principle upon which the SVR algorithm is built.

Conclusion
The band gap of barium titanate semiconductor doped with foreign materials is modeled using extreme learning machine and gravitationally optimized support vector regression computational methods. Since incorporation of dopants into the host structure of barium titanate creates vacancies, changes the dipole region carrier concentration, and ultimately distorts the lattice structure, the crystal lattice parameters of the doped semiconductor serve as the descriptors to the developed models. The developed GSVR model outperforms both ELM-based models under three different performance evaluators. The estimates of the developed GSVR model agree excellently with the experimental band gap when investigating the influence of iodine and samarium dopants on the photocatalytic strength of barium titanate. The precision of the developed GSVR model in band gap characterization would ultimately facilitate quick and precise band gap determination of barium titanate with incorporated dopants for photocatalytic applications.
Data Availability
The raw data required to reproduce these findings are available in the cited references in Section 3.1 of the manuscript.

Additional Points
Recommendation. Other machine learning techniques can be explored on this material for the estimation of band gap and other physical properties of barium titanate semiconductor.