A Review on Machine Learning Application in Biodiesel Production Studies

The consumption of fossil fuels has increased exponentially in recent decades, despite significant air pollution, environmental deterioration, health problems, and limited resources. Biofuel can be used instead of fossil fuel owing to its environmental benefits and its availability for producing various forms of energy, such as electricity, power, and heating, or for sustaining transportation fuels. Biodiesel production is an intricate process that requires identifying unknown nonlinear relationships between the system input and output data; therefore, accurate and swift modeling tools like machine learning (ML) or artificial intelligence (AI) are necessary to design, handle, control, optimize, and monitor the system. Among the biodiesel production modeling methods, machine learning, inspired by the brain's autolearning and self-improving capability to solve complicated questions, provides the most accurate predictions; therefore, it is beneficial for modeling (trans)esterification processes and physicochemical properties and for monitoring biodiesel systems in real time. Machine learning applications in the production phase include quality optimization and estimation, process conditions, and quantity. Emissions composition and temperature estimation and motor performance analysis are investigated in the consumption phase. Fatty acid methyl ester stands as the output parameter, and the input parameters include oil and catalyst type, methanol-to-oil ratio, catalyst concentration, reaction time, domain, and frequency. This paper reviews and discusses the advantages, disadvantages, and applications of various ML technologies in biodiesel production, mainly focusing on articles published from 2010 to 2021, to support decision-making and to optimize, model, control, monitor, and forecast biodiesel production.


Introduction
Fossil fuel, the most popular fuel, plays an essential role in the economy and politics of both established and developing countries and has been a common industrial energy source for several decades because of its favorable combination of properties such as easy transportability, versatility, accessibility, and cost [1][2][3]. Although many undiscovered oil reserves remain in geological structures, and rich unconventional reservoirs like tar sands, heavy oil, and oil shale may prove commercially viable, these resources are nonrenewable and limited. The world energy demand is projected to grow by 56% between 2010 and 2040; hence there is a dire need for a sustainable alternative energy resource [4][5][6]. In addition to limited resources, fossil fuel consumption for economic and industrial activities causes many challenges worldwide, such as air pollution, global warming, environmental deterioration, health problems, climate change, and greenhouse gas (GHG) emissions [7]. The energy crisis, together with high dependence on fossil fuels, increasing resource fluctuation, and environmental challenges, has exacerbated concern over resource depletion and is leading the world towards eco-friendly energy resources that assure a sustainable energy supply and meet escalating energy requirements from renewable sources [8][9][10][11][12]. Fossil fuel production will not suddenly stop, and fossil fuel remains a universal energy resource, but scientists are trying to obtain energy with a low carbon footprint [13]. Biofuels, hydrogen, compressed natural gas, liquefied petroleum gas, and alcohol have enough potential to become alternative energy sources [14][15][16].
A study was performed on renewables to choose the best alternative energy, in which bioenergy, today's largest renewable energy resource, showed great potential in addressing climate change and global energy issues [17]. Biofuel includes biodiesel, bioethanol, and biogas, obtained from biomass resources. It can be applied instead of fossil fuels because it combines enhanced energy security, environmental benefits, availability, renewability, and sustainability, and it can produce various forms of energy, such as electricity, power, and heating, or sustain transportation fuels [6,16,18-22]. Figure 1 illustrates the research trend in the biofuels field. The number of published documents increased sharply from 2002 to 2020. Since 2016 the growth in the number of articles has slowed; however, the number is still increasing.
Among all sustainable alternatives to fossil fuel, biodiesel is a suitable choice for diesel engines due to lower engine emissions (41% less greenhouse gas emission), advantageous physical and chemical properties, and no need for significant engine modifications [23][24][25][26]. Biodiesel and petrodiesel are miscible in any ratio, which leads to the use of their blends rather than pure biodiesel, not only in developed countries such as the United States, France, Italy, and Germany, but also in developing countries such as Malaysia, Brazil, Indonesia, and Argentina [7,27-29]. Biodiesel production capacity shows an attractive growing trend, and the automotive biofuels market is growing dramatically; this has engaged many scientists and researchers in satisfying the ever-rising energy demand by producing alternative fuels [25,30]. As shown in Figure 2, the share of renewable energy in generating power is expected to have a 23% increase by 2030.
The challenge is to identify the relationship between the biofuel production process outputs and the process parameters, and then to maintain the effective parameters in an optimum range to ensure high quality and productivity [13,32]. Various raw material parameters and reaction conditions associated with transesterification, like temperature, oil and catalyst type, reaction duration, oil-to-alcohol molar ratio, and catalyst concentration, can affect productivity and the response features of the production process, which are estimated through physical experiments [13,33-37]. Despite the necessity of experiments, they do not successfully predict the effects of the factors, because of the underlying nonlinear relations between the responses and parameters and the large number of process parameters; therefore, highly accurate modeling methods like machine learning-based prediction and artificial intelligence (AI) techniques are beneficial for overcoming the limitations of experimental methods and the challenges of traditional computing techniques [13,38-40]. They provide mathematical models or independent modeling approaches according to the nature of the process, which prevents waste of time and money and, furthermore, makes it possible to study a wide range of physical and chemical process parameters separately and to generate experimentally inaccessible details [10, 12, 41-44].

An Introduction to AI and ML
AI is the ability of machines to simulate the human brain activities, applied through different computer science techniques, like heuristic algorithms, machine learning, and fuzzy logic [45][46][47]. It is chiefly employed to predict biomass and biofuel properties, bioenergy end-use systems performance, conversion process performance, supply chain modeling, and optimization. Recommended optimization methods are response surface methodology (RSM), genetic algorithm, and Taguchi method; in the meantime, artificial neural network (ANN), regression, and analytical methods are trending modeling methods in internal combustion engine research [48][49][50][51].
ML algorithms that evolved with deep learning, reinforcement learning, transfer learning, and extreme learning are utilized in industrial processes to optimize, monitor, and control systems, forecast maintenance, diagnose faults, and flag process attacks [2,52-56]. Linear regression, Principal Component Analysis (PCA), Decision Trees (DT), Genetic Algorithms (GA), K-nearest Neighbor Classifier (KNN), Random Forests regression (RF), Artificial Neural Networks (ANN), and Support Vector Machines (SVM) are some powerful machine learning algorithms [57]. Machine learning refers to a programmed process that uses consecutive iterations over external input variables, gradually updating its problem-solving capability and improving itself to solve complicated questions [57,58]. AI applications to bioenergy systems are limited; however, studies indicate its great potential in addressing bioenergy development obstacles. Former reviews have separately focused on either a single AI approach or a part of bioenergy systems [2,44,48,49,59]. Due to the wide variety of AI techniques, conversion technologies, bioenergy products, biomass types, and supply chain designs, a comprehensive review of AI applications from biomass agriculture through to the consumption phase is necessary. This review intends to recommend a confluence of advanced statistical methods and currently popular machine learning algorithms for obtaining overall pragmatic models as an experiential agreement.

An Introduction to Biodiesel Production
Biodiesel is a clean, aromatic, biodegradable fatty acid methyl ester derived from waste oils, edible and nonedible vegetable oils, and animal fat (e.g., chicken and mutton tallow). It serves as an alternative fuel source for diesel engines to reduce engine emissions and is becoming a global mainstream fuel for transportation [34,45,51,60-62]. In addition to its use as an alternative transport fuel, biodiesel has other potential uses such as heating oil, plasticizers, power production, high-boiling absorbents for cleaning gaseous industrial emissions, lubricants, and various solvent applications. Biodiesel has properties similar to those of diesel fuel, for instance, cetane number, viscosity, energy content, and phase variations. Biofuels can provide a new business for agricultural products and revitalize rural areas [63].

Advantages
(i) Sulfur-free
(ii) Releasing fewer emissions
(iii) Favorable physicochemical properties such as density, cetane number, flash point, viscosity, and lubrication
(iv) More complete combustion because it is highly oxygenated
(v) Promoting energy sufficiency [42]

Disadvantages
(i) Less energy content
(ii) Releasing more nitrogen oxides
(iii) Higher maintenance cost
(iv) High cost of establishment
(v) Separation and purification stage for product
(vi) Undesirable side reactions [51,64]
Easy production from available renewable feedstock makes biodiesel more attractive. Nonedible tree seed oil resources are easily found everywhere, even on land unsuitable for food crops. Pure biodiesel, or a mixture of commercial diesel and biofuel, can be used in unmodified diesel engines, with environmental sustainability advantages [51,65]. Several countries mandate adding biodiesel to all diesel fuels to encourage people to use biodiesel [63,66]. The most common reaction in the biodiesel production process is transesterification, which uses heterogeneous or homogeneous acid and base catalysts so that the reaction proceeds under mild conditions. Sodium hydroxide and potassium hydroxide (NaOH, KOH) are regular alkaline catalysts that can provide higher biodiesel yield [67][68][69][70]. The transesterification reactions between the oil (e.g., canola oil, Simarouba glauca oil, soybean oil, sunflower seed oil, Thevetia peruviana seed oil, palm oil, etc.) and alcohols (e.g., methanol, ethanol) produce biodiesel [62,71-76]. It is a costly, energy-consuming production process because of product purification and separation, and it requires a lengthy pretreatment step to reduce water and free fatty acids [2]. Low esterification efficiency arises from undesired side reactions. Figure 3 illustrates the transesterification reaction for biodiesel production and the input and output variables.
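The overall stoichiometry of the transesterification described above can be written compactly. The form below is the standard textbook representation rather than an equation taken from the reviewed studies; R stands for the fatty acid chains of the triglyceride (which generally differ from one another) and R'OH for the alcohol, with R' = CH3 in the case of methanol.

```latex
\[
\underbrace{\mathrm{C_3H_5(OOCR)_3}}_{\text{triglyceride}}
\;+\; 3\,\mathrm{R'OH}
\;\overset{\text{catalyst}}{\rightleftharpoons}\;
\underbrace{3\,\mathrm{RCOOR'}}_{\text{alkyl esters (biodiesel)}}
\;+\; \underbrace{\mathrm{C_3H_5(OH)_3}}_{\text{glycerol}}
\]
```

Because the reaction is reversible, alcohol is usually supplied in excess of the 3:1 stoichiometric molar ratio to shift the equilibrium toward the esters, which is why the alcohol-to-oil molar ratio appears as an input variable in the models reviewed here.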
Various transesterification-associated parameters and reaction conditions, like temperature, oil and catalyst type, reaction duration, oil-to-alcohol molar ratio, and catalyst concentration, significantly affect the transesterification reaction, its productivity, and the response features of the production process [37,78]. Statistical tools and many physical experiments are necessary to predict the reaction responses and their interactions with each parameter in order to optimize transesterification [36,42].

ML Methods Application in Biodiesel Life Cycle
Producing biodiesel from renewables includes the following steps: extracting oil, pretreating feedstock, transesterification reaction, separating products, recovering unreacted alcohol, neutralizing glycerin, washing, and purification of biodiesel [70,79]. In this section, we categorize and review ML technology applications in five crucial stages of the biodiesel life cycle: soil, feedstock, production, consumption, and emissions [57,80]. ML technology can be beneficial in all five stages to enhance the quality of estimations. There are plenty of research reviews on applications of machine learning technology in modeling biodiesel-fueled engines and combustion approaches; therefore, this study mainly focuses on the first three stages. Figure 4 shows an overview of the biodiesel production trend, inspired by Aghbashlo et al. [79] and Ahmad et al. [57]. Sorghum crop is beneficial for producing health-promoting food from seeds, and fodder and biofuels from aboveground biomass [81]. To predict future trends in Sorghum bicolor yield,

International Journal of Chemical Engineering
Huntington et al. [82] used the RF approach under four greenhouse gas (GHG) emission scenarios and two different watering regimes. The most valuable sorghum productivity predictors were vapor pressure deficit, time, and irrigation practices. The RF model obtained a reasonable prediction accuracy by uniquely training and classifying data samples by year and country. Habyarimana et al. [81] performed a study based on satellite imaging of sorghum fields to predict sorghum biomass yield using various ML methods like radial basis kernel (
Gleason et al. [83] compared the Linear Mixed-effects Regression (LME), Cubist, Support Vector Regression (SVR), and Random Forest (RF) methods to predict biomass in a moderately dense forest with 40
Figure 3: Biodiesel production procedure, transesterification reaction outline [77].

[Figure residue: Quality control; Washing and drying produced biodiesel]
Table 2 provides a summary of feedstock phase studies [86][87][88], classifying the efficient methods and study purposes.

ML Applications in Production.
In the production stage, choosing a proper ML method depends on the type of biofuel produced (i.e., biodiesel, biogas, or biohydrogen). Based on the studies, machine learning applications in biodiesel research can be organized into four sections: estimating quality, estimating yield, optimizing both quality and yield, and estimating and optimizing process conditions and efficiency [57].

Quality Prediction.
The prevailing ML method for quality prediction is an ANN developed with a regression model, using reaction temperature, reaction time, calcination temperature, pressure, and flow rate as input variables and FAME (fatty acid methyl ester) content, viscosity, composition, quantity, cetane number, and density as output variables.
Soltani et al. [89] used an artificial neural network (ANN) to model the effects of various reaction parameters, i.e., calcination temperature, metal ratio, reaction time, and reaction temperature, in the conversion of palm fatty acid distillate (PFAD) to esters using a sulfonated mesoporous zinc oxide (SO3H-ZnO) catalyst. The assessed optimum conditions for predicting a 56.41 nm SO3H-ZnO nanocrystalline catalyst size were a 160°C reaction temperature, a 700°C calcination temperature, and 0.004 mole of Zn concentration during an 18 min reaction time. Zinc concentration and reaction time were recognized as the most and least effective parameters, respectively.
Ahmad et al. [90] used an ensemble learning method, Least Squares Boosting (LSBoost), integrated with the polynomial chaos expansion (PCE) method to predict the quantity, quality, flow rate, cetane number, and composition of fatty acid methyl esters (FAME) in the vegetable oil-based biodiesel production process. Predicted values showed 1% uncertainty in all process parameters in terms of mean absolute deviation percent (MADP), indicating the high accuracy of the proposed model in predicting outcomes and quantifying the effect of uncertainty in the process. During the biodiesel production process from vegetable oil, the PCA method was applied to estimate relative density, viscosity, and the percentage of vegetable oil converted to methyl esters. PCA is an effective technique to differentiate and discriminate between pure biodiesel, pure diesel, waste oil, and their mixtures.
Sarve et al. [91] used an artificial neural network (ANN) and response surface methodology (RSM) based on a central composite design (CCD) to predict fatty acid methyl ester (FAME) content in biodiesel production from sesame oil, using barium hydroxide as a basic catalyst. The best combination of optimum condition values was a methanol-to-oil molar ratio of 6.69 : 1, a reaction time of 40.30 min, a catalyst concentration of 1.79 wt.%, and a temperature of 31.92°C, which resulted in 98.6% FAME content. The study revealed that catalyst concentration has the main influence on the FAME content of the final product. ANN predicted the FAME content better than RSM, as shown by its better correlation coefficient (R2), root mean square error (RMSE), standard error of prediction (SEP), and relative percent deviation (RPD) values.
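The goodness-of-fit measures used to compare ANN and RSM above can be computed from paired experimental and predicted values. The following is a minimal sketch in Python, not the procedure of the cited study. It assumes the definitions commonly used in ANN-versus-RSM comparisons: R2 as the coefficient of determination, SEP as the RMSE expressed as a percentage of the mean experimental value, and RPD as the mean absolute relative deviation in percent. The function name `fit_metrics` and all data values are hypothetical.

```python
import math


def fit_metrics(y_exp, y_pred):
    """Return R2, RMSE, SEP (%) and RPD (%) for paired experimental/predicted values.

    The SEP and RPD definitions are assumptions (see the lead-in), not taken
    from the reviewed paper.
    """
    n = len(y_exp)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_exp and y_pred must be non-empty and of equal length")
    mean_exp = sum(y_exp) / n
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))  # residual sum of squares
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)           # total sum of squares
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "sep": 100.0 * rmse / mean_exp,
        "rpd": 100.0 / n * sum(abs(e - p) / e for e, p in zip(y_exp, y_pred)),
    }


# Hypothetical FAME contents (%) for illustration only.
y_exp = [90.0, 95.0, 98.0, 85.0]   # experimental values
y_ann = [90.5, 94.6, 97.8, 85.3]   # made-up ANN predictions
y_rsm = [91.5, 93.5, 96.5, 87.0]   # made-up RSM predictions

ann = fit_metrics(y_exp, y_ann)
rsm = fit_metrics(y_exp, y_rsm)
print("ANN:", ann)
print("RSM:", rsm)
```

With these made-up numbers the ANN predictions give the higher R2 and the lower RMSE, SEP, and RPD, which is the pattern a study would report when it concludes that one model fits better than the other.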

Yield Estimation.
Several studies concentrated on applying ML methods to predict biodiesel synthesis from nonedible feedstocks like anaerobic sludge, castor oil, and jatropha-algae oil.
Kumar et al. [92] trained an ANN model with the Levenberg-Marquardt (LM) algorithm and a backpropagation learning algorithm to predict biodiesel yield in the transesterification process, using jatropha-algae oil blends as inputs.
The R-square value of 0.9976 obtained against the experimental results confirmed the competency of the ANN technique.

Banerjee et al. [93] used ANN and CCD models in the transesterification of castor oil with methanol, using an H2SO4 acid catalyst, to predict the % fatty acid methyl ester content. They also devised a kinetic model whose rate constants were estimated from the ANN-based predicted data and the experimental outputs. The temperature, catalyst concentration, and methanol-to-oil molar ratio were the input parameters. The ANN model predicted the % fatty acid methyl ester yield with an 8% deviation.
Kanat et al. [94] used the ANN method with a multilayer neural network topology to model and estimate the biodiesel and biogas production rate of an anaerobic thermophilic upflow sludge blanket digester. The experimental data used for training and testing were evaluated under both steady and abnormal conditions; a high correlation coefficient showed that ANN gives promising results for online monitoring of thermophilic reactors. In a jatropha-algae oil blend study, ANN performed better than RSM [95].
A biodiesel synthesis process from waste goat tallow containing remarkable free fatty acids (FFAs) has been modeled by RSM and ANN to identify optimum parametric values that resulted in maximum FA conversion. Under optimal conditions, response surface methodology (RSM) and ANN presented similar predictability performance [96].
In another study, linear regression (LR) and an ANN model based on the Levenberg-Marquardt learning algorithm were developed for predicting soybean oil-based biodiesel transesterification yield, where the ANN performed better than LR [97]. Various conditions of the soybean oil to biodiesel transesterification process have been studied to predict biodiesel yield [39]. In that study, a multilayer feedforward neural network was applied together with kinetic models. The results showed the superiority, accuracy, and clarity of the ANN model over the kinetic modeling method.
Guo et al. [98] used an adaptive neurofuzzy inference system (ANFIS) method, based on statistical learning theory, to estimate the biodiesel production yield as a function of methanol/oil ratio, pressure, reaction time, and temperature in the noncatalytic supercritical methanol (SCM) method. The high R-squared value indicates the suitability of the ANFIS model for biodiesel yield prediction.
Mostafa et al. [35] compared the adaptive neurofuzzy inference system (ANFIS) and response surface methodology (RSM) in terms of their efficiency in predicting and simulating the transesterification yield. A Box-Behnken design of RSM and two ANFIS approaches (hybrid and backpropagation optimization methods) were used to investigate the impact of the independent variables on the conversion of fatty acid methyl esters (FAME). The R2 value was 0.9669 for RSM compared with 0.9812 and 0.9808 for the two ANFIS models, indicating the superiority of the ANFIS models over the RSM model for modeling and optimizing.
Maran et al. [49] compared the efficiencies of an artificial neural network (ANN) and response surface methodology (RSM) in predicting and simulating muskmelon oil-based biodiesel yield. A central composite rotatable design (CCRD) was used to compare the ANN model against the RSM model. The effects of catalyst concentration, reaction time, reaction temperature, and methanol-to-oil molar ratio on FAME conversion were modeled by a Multilayer Perceptron (MLP) neural network and RSM. The R2 value was 0.869 for the RSM model and 0.991 for the ANN model, showing the superiority of the ANN model over RSM for modeling and optimizing FAME production.
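RSM, the baseline in most of the comparisons above, fits a second-order polynomial to designed-experiment data and then locates the stationary point of the fitted surface. The sketch below illustrates the idea for a single coded factor using ordinary least squares through the normal equations. It is a simplified illustration under stated assumptions, not the procedure of any cited study: a real RSM model over several factors also contains interaction terms, and the data here are synthetic (generated from an exact quadratic with its optimum placed at a coded level of 0.5). The helper names `solve_linear` and `fit_quadratic` are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear(ata, aty)


# Synthetic data (assumption): coded temperature levels and FAME yield (%).
coded_temp = [-1.5, -1.0, 0.0, 1.0, 1.5]
yield_pct = [90.0 - 4.0 * (x - 0.5) ** 2 for x in coded_temp]

b0, b1, b2 = fit_quadratic(coded_temp, yield_pct)
x_opt = -b1 / (2.0 * b2)                      # stationary point of the fitted parabola
y_opt = b0 + b1 * x_opt + b2 * x_opt ** 2     # predicted yield at that point
print(b0, b1, b2, x_opt, y_opt)
```

Working in coded (centered and scaled) factor levels, as is usual in RSM designs, keeps the normal equations well conditioned. An ANN replaces the fixed polynomial form with a more flexible function, which is one plausible reason the reviewed studies often report higher R2 values for ANN than for RSM.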

Quality and Yield Estimation.
Numerous studies have focused on biodiesel quality and yield optimization. Bobadilla et al. [77] used a set of Support Vector Machines (based on radial basis function, linear, and polynomial kernels) and linear regression methods to predict and improve biodiesel yield and particular properties like turbidity and higher heating value (HHV), with decreased viscosity and density. Applying genetic algorithms to the regression models produced more accurate biodiesel optimization scenarios for identifying the best combination of independent and dependent variables. Cheng et al. [99] developed a GA-ESIM method, which combines the Evolutionary Support Vector Machine Inference Model (ESIM) and the K-means Chaotic Genetic Algorithm (KCGA), to precisely predict and optimize biodiesel mixture properties. They found that GA-ESIM predicts more accurately than ANN-GA, SVM, and other AI-based tools.

Table 1: Various ML applications in the soil phase of biodiesel production outline.
Reference | Applied models | Field | Results
[81] | GBL, GBD, GBT, ANN, RF, SVR, SVM, SVM-P, SVM-R, SVM-G, PCA-DA, PLS-DA | Predict sorghum crop yield | GBT
[82] | RF | Predict sorghum crop yield | RF
[83] | LME, SVR, RF | Predict biomass yield in forest | SVR
[84] | BRT | Estimate corn production environmental impacts | BRT
[85] | GPM, RF | Land productivity | GPM
Sivamani et al. [100] used ANN-GA-based and RSM models to predict and optimize the biodiesel yield in Simarouba glauca transesterification. They used gas chromatography-mass spectroscopic (GC-MS) analysis of the oil to observe the free fatty acid (FFA) level; alcohol ratio, reaction time, and reaction temperature were the input variables.
Ighose et al. [101] focused on an RSM optimization tool alongside the ANFIS model to predict and optimize the biodiesel yield in the Thevetia peruviana seed oil transesterification process. In addition to the ANFIS and RSM models, using GA resulted in a higher Thevetia peruviana methyl esters (TPME) yield in less time. The results established the superiority of the ANFIS prediction capability over the RSM model.
Dhingra et al. [102] applied a combination of ANN and GA in polanga oil-based biodiesel production to predict and optimize the reaction variables and maximize the output of the transesterification process. The input variables were the ethanol-to-oil molar ratio, the reaction temperature, the catalyst concentration, the reaction time, and the stirring speed. The ANN outputs were combined with GA to optimize the reaction conditions, resulting in a biodiesel yield of 92% by weight.
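The ANN-GA and RSM-GA combinations described in this section share one pattern: a fitted model acts as a cheap surrogate for the experiment, and a genetic algorithm searches the space of reaction conditions for the predicted maximum. The sketch below shows that pattern with a real-coded GA using tournament selection, blend crossover, Gaussian mutation, and elitism. Everything specific in it is an assumption made for illustration: the quadratic `surrogate_yield` function merely stands in for a trained ANN, and its optimum (60°C, 9:1 molar ratio), the search bounds, and the GA settings are invented rather than taken from any cited study.

```python
import random


def surrogate_yield(temp_c, molar_ratio):
    """Hypothetical surrogate for biodiesel yield (%); stands in for a trained ANN."""
    return 92.0 - 0.02 * (temp_c - 60.0) ** 2 - 1.5 * (molar_ratio - 9.0) ** 2


# Assumed search ranges: temperature (deg C) and alcohol-to-oil molar ratio.
BOUNDS = [(40.0, 80.0), (3.0, 15.0)]


def fitness(individual):
    return surrogate_yield(*individual)


def clip(value, low, high):
    return max(low, min(high, value))


def run_ga(pop_size=30, generations=60, seed=0):
    """Maximize the surrogate with a simple real-coded genetic algorithm."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(pop, key=fitness)[:]]  # elitism: carry the best forward
        while len(children) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Blend crossover: random convex combination of the two parents.
            w = rng.random()
            child = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
            # Gaussian mutation with a step of 5% of each range, kept in bounds.
            child = [clip(g + rng.gauss(0.0, 0.05 * (hi - lo)), lo, hi)
                     for g, (lo, hi) in zip(child, BOUNDS)]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)


best_conditions, best_yield = run_ga()
print(best_conditions, best_yield)
```

Elitism guarantees that the best predicted yield never decreases from one generation to the next. The quality of the result still depends entirely on how well the surrogate represents the real process, so an optimum found this way would normally be confirmed by experiment.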

Estimation and Optimization of Process Conditions and Efficiency.
Karimi et al. [103] implemented a multiobjective analysis, using RSM and ANN to estimate FAME content and exergetic efficiency in waste cooking oil (WCO) transesterification for biodiesel production. Water concentration, reaction time, immobilized lipase, and methanol concentration were optimized to achieve 95.7% predicted FAME content. The corresponding input conditions, namely a 35% catalyst concentration, 12% water content, and a methanol-to-WCO molar ratio of 6.7 over 20 hours, produced 86% FAME content and 80.1% exergy efficiency.
Patle et al. [104] used nondominated sorting GA-II (NSGA-II) multiobjective optimization to simulate and compare the esterification and transesterification reactions of palm waste cooking oil and to optimize heat duty, profit, and organic waste. As the heat duty increased, the profit improved, but the amount of organic waste also increased. Rouchi et al. [105] used Multivariate Curve Resolution Alternating Least Squares (MCR-ALS) for process analysis and for steering the reaction parameters along the desired path. The Multiplicative Scatter Correction preprocessing technique and MCR-ALS were used to evaluate the concentrations, types, and spectra of the components in biodiesel production from soybean oil. The correlation coefficient and the standard deviation of residuals demonstrated the suitability of the MCR-ALS method. Shukri et al. [106] used ANN to optimize engine performance, using a mixture of palm oil methyl ester and diesel as fuel in a diesel engine. Both the experimental results and the ANN model showed better engine performance for the biodiesel 10 percent blend (B10) of diesel fuel and palm oil, due to its higher heating value and cetane number.
Aghbashlo et al. [107] developed an ANFIS model integrated with linear interdependent fuzzy multiobjective (ALIFMO) approaches and the nondominated sorting genetic algorithm (NSGA-II) to optimize operating conditions as a function of the inputs. The input parameters were reaction temperature, methanol/oil molar ratio, and residence time. The optimization minimized normalized exergy destruction (NED) and maximized functional exergy efficiency (FEE) and universal exergy efficiency (UEE) while achieving the best conversion efficiency (CE), that is, a biodiesel content of more than 96.5%. The applied ANFIS models estimated the FEE, UEE, NED, and CE parameters almost perfectly, with R2 ≈ 1.0.
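The multiobjective studies above rely on the notion of nondominated (Pareto-optimal) solutions: a candidate is kept if no other candidate is at least as good in every objective and strictly better in at least one. The sketch below extracts that first nondominated front from a small list of candidates. It covers only this one building block of NSGA-II (no ranking of later fronts, crowding distance, or evolution), and the candidate values are made up. Each tuple holds an exergy efficiency in percent and the negated normalized exergy destruction, so that both objectives are maximized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are treated as quantities to maximize.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (exergy efficiency %, -normalized exergy destruction).
candidates = [
    (80.0, -0.30),
    (85.0, -0.35),
    (78.0, -0.25),
    (84.0, -0.40),  # dominated by (85.0, -0.35)
    (80.0, -0.32),  # dominated by (80.0, -0.30)
]

front = pareto_front(candidates)
print(front)
```

The surviving points represent trade-offs: moving along the front improves one objective only at the expense of the other, which mirrors the conflict between heat duty, profit, and organic waste reported by Patle et al. [104].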
Sarve et al. [108] compared ANN and RSM in biodiesel production optimization with respect to their sensitivity analysis, predictive and generalization capability, and parametric effects. A fatty acid ethyl ester (FAEE) content of 97.42% was obtained at the optimized ethanol-to-oil molar ratio, initial CO2 pressure, reaction time, and temperature, where the temperature was the most effective parameter. The ANN model performed better than RSM in predicting mahua oil FAEE content and in data fitting.
In a biodiesel production process from vegetable oil, Nicola et al. [80] employed a multiobjective GA optimization to maximize the purity of the important compounds and minimize energy requirements by optimizing the main process parameters. The input parameters to the process model were the reflux ratio, the mass flow rate of water, the water temperature, the flash temperature, the number of trays, and the dryer temperature. Among all optimized configurations, the one that achieves the minimum specific energy consumption while meeting the required biodiesel quality standards was identified. Noriega et al. [109] used group interaction parameters (GIP) to predict and validate all two-phase liquid-liquid equilibria present in the biodiesel production system, which includes glycerol, low molecular weight alcohols, water, fatty acids, and biodiesel. The results demonstrated that the number of carbons, hydroxyl groups, and unsaturated bonds affects the liquid-liquid equilibrium; the most influential parameter was the overall mass fraction of the distributed component, followed by the length of the alcohol chain.
López-Zapata et al. [110] used an Extended Kalman Filter (EKF) and virtual sensors to measure and estimate operating condition variables, control performance, and monitor the reaction. Because only a small number of variables, like pH and temperature, are measurable, the performance analysis used the estimated alcohol, triglycerides (TG), methyl ester, diglycerides (DG), glycerol (GL), and monoglycerides (MG) concentrations to evaluate jatropha oil-based biodiesel. Fahmi and Cremaschi [111] developed an ANN superstructure model to identify the optimum biodiesel production plant and the best operating conditions. The ANN model was an effective alternative for thermodynamics, unit operation, and mixing

Conclusions
According to the machine learning applications reviewed in this study, the most common ML methods in the soil stage are Random Forest, Gaussian Process Model, and Support Vector Machines. In the feedstock phase studies, ANN, multiple linear regression, statistical regression, and multiple nonlinear regression models are the most popular methods. Blend composition, temperature, mixing speed, and mixing time are typical input variables, and the output variables are viscosity, flash point, oxidation stability, density, methane fraction, higher heating value, and cetane number. The prevailing ML method for quality prediction is an ANN developed with a regression model, using reaction temperature, reaction time, calcination temperature, pressure, and flow rate as input variables and FAME content, viscosity, composition, quantity, cetane number, and density as output variables. The prevailing ML method for yield estimation is ANN accompanied by ANFIS, using methanol-to-oil molar ratio, reaction time, catalyst concentration, total volatile fatty acid of the effluent, and temperature as input variables, while % FAME yield, biogas production rate, biodiesel yield, and biodiesel production are the regular output variables. The prevailing ML method in the yield and quality optimization section is ANN accompanied by GA-based ANFIS and SVM. The five most frequently used input variables are methanol-to-oil molar ratio, stirring speed, catalyst concentration, reaction time, and reaction temperature. The most common output variables are FAME yield, biodiesel yield, higher heating value, density, and the oil's final acid value. The dominant ML method in the process efficiency and optimization section is ANN accompanied by ANFIS. Frequently used input variables are reaction time, concentration, water content, methanol-to-oil molar ratio, and temperature, while CE, universal exergy efficiency (UEE), FAME content, biodiesel yield, and functional exergy efficiency (FEE) are the output variables.
ANN, ANFIS, ELM, and SVM Machine Learning methods were employed to study consumption, engine performance, and emission.

Nomenclature
ALIFMO: Artificial linear interdependent fuzzy multiobjective optimization
AI: Artificial intelligence
ANFIS: Adaptive neurofuzzy inference system
ANN: Artificial neural networks
ALS: Alternating least squares
B10: Biodiesel 10 percent blend
BRT: Boosted regression tree
CCD: Central composite design
CE: Conversion efficiency
CN: Cetane number
DA: Discriminant analysis
ELM: Extreme learning machine
FAME: Fatty acid methyl ester
FAs: Fatty acids
FEE: Functional exergy efficiency
FP: Flash point
GA: Genetic algorithm
GBD: eXtreme Gradient Boosting-xgbDART
GBL: eXtreme Gradient Boosting-xgbLinear
GBP: eXtreme Gradient Boosting-xgbtree
GBT: Gene expression programming
GHG: Greenhouse gas
GIP: Group interaction parameters
GPM: Gaussian process model
HC: Hydrocarbon
IAV: Initial acid value of vegetable oil
K-ELM: Kernel-based extreme learning machine
KV: Kinematic viscosity
LLE: Liquid-liquid equilibrium
LME: Linear mixed-effects
LR: Linear regression
LS: Least square
MAPE: Mean absolute percentage error
MCR: Multivariate curve resolution
ML: Machine learning
MNLR: Multiple nonlinear regression
MO: Mustard oil
MSE: Mean squared error
PU/MU: Mono- and polyunsaturated fatty acids balance
NED: Normalized exergy destruction
PAT: Process analytical technologies
PCA: Principal component analysis
PLS: Partial least square
RB-FNN: Radial basis function neural network
RF: Random forest
RFM: Random forest model
RLS: Recursive least squares
RSM: Response surface methodology
SVM: Support Vector Machines
SVR: Support vector regression
UEE: Universal exergy efficiency
UHC: Unburned hydrocarbons
VCR: Variable compression ratio

Data Availability
The data used to support the findings of this study are provided within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.