Artificial Intelligence Application in Bioethanol Production

Energy consumption from biofuels relative to fossil fuels has increased over the past few years. This is due to the availability of these resources for the production of different forms of energy and the environmental benefit of utilizing them. Ethanol fuel production from biomass is a complex process with known challenges in handling, optimization, and forecasting. Modelling techniques such as artificial intelligence (AI) are, therefore, necessary in the design, handling, and optimization of bioethanol production. The flexibility and high accuracy of the artificial neural network (ANN), a machine learning technique, in solving intricate processes is beneficial in modelling the pretreatment, fermentation, and conversion stages of a bioethanol production system. This paper reviews various AI techniques in bioethanol production, with emphasis on articles published in the past decade.


Introduction
Energy demand continues to increase annually, with heavy dependence on fossil resources, especially crude oil and natural gas, for most chemicals and products on the market. These fossil resources are reported to be limited for production and processing [1,2]. These concerns, coupled with the negative environmental effects of producing and using fossil resources, drive the search for sustainable, greener, and renewable resources such as biofuels as alternative energy sources [1,3-5]. Biofuels are fuels derived from renewable resources. Examples include bioethanol, biodiesel, and biogas. These biofuels offer benefits comparable to those of fossil fuels; however, they are preferred over fossil fuels in addressing global warming, energy, and environmental issues. Research in this field is reported to be increasing continually owing to knowledge of the sustainability, availability, and ecofriendly benefits of these fuels [1,6,7].
Bioethanol, an alternative biofuel, is recognised for its low automotive emissions and high compression ratios for effective spark-ignition engine performance. Its high octane number classifies it as a green gasoline blending additive or component for engine efficiency [8-10]. Although past and recent studies acknowledge that this sustainable fuel can be produced from different biomass, its pretreatment and conversion routes depend on several factors that affect yields. These processes employ hydrolysis and fermentation routes, which are mostly affected by parameters such as the type of feedstock, time, and temperature, among others [11,12]. Modelling bioprocesses on the basis of these parameters and routes remains a challenge that cannot be solved by experiment alone and requires simple and efficient prediction techniques such as artificial intelligence (AI) [13]. Several researchers [14-18] highlight that the application of AI to these bioenergy systems is limited. This translates into a need to use these efficient mathematical and statistical models to estimate and analyse the biomass feedstock, production routes, expenditures, and key parameters in order to ensure robust and efficient bioethanol production.

Bioethanol Production
Bioethanol is an alcohol produced from plants, wastes, and algae, that is, first-, second-, and third-generation biomass, respectively [9,19]. First-generation biomass consists mostly of edible crops such as sugarcane, wheat, and corn. These feedstocks are relatively easy to grow and harvest, which makes them a popular choice for bioethanol production. However, the use of food crops for bioethanol production has been criticized for contributing to rising food prices and food insecurity in some regions of the world [20]. Second-generation biomass is considered a feedstock with great potential for a high bioethanol production rate, owing to its lower cost, availability, and sustainability [21,22]. However, these feedstocks (lignocellulosic biomasses) require more complex processing techniques to extract the sugars needed for bioethanol production, which makes their processing more expensive. Examples include grass, cornstalks, sugarcane bagasse, wood, and wastes. Third-generation biomass, on the other hand, comes from algae species. These are highly sustainable feedstocks since they can be grown in saline water and harsh conditions, reducing the competition for land and other resources. Although they are reported to be harvested multiple times per year, making them highly productive, their cultivation process is expensive [23,24].
Bioethanol, with its high octane number, is an alternative blending component for gasoline engines that reduces emissions and protects the environment, contributing to the achievement of sustainable development goal 13. Common blends include gasohol (E10) and E85, which contain bioethanol and gasoline in percentage ratios of 10 : 90 and 85 : 15, respectively [25,26]. According to [27], more than one billion gallons of bioethanol are blended with gasoline and utilized in countries like the United States of America (USA) and Brazil. Apart from its use as a blend component, it is also used in its pure anhydrous form in flexible fuel vehicles in Brazil [28]. Moreover, [21] recognised bioethanol as the most produced biofuel in these countries.
The production of bioethanol from the various biomass feedstocks is generally categorised into pretreatment, hydrolysis, fermentation, and recovery stages. Pretreatment is mostly done to expose the tissues/cells of these feedstocks by decreasing their size or by using chemicals and microorganisms for biomass degradation to release sugars [24]. That is, most lignocellulosic feedstocks are chemically, physically, biologically, or physicochemically pretreated to increase the yield of fermentable sugars after enzymatic hydrolysis. Examples of physical pretreatment include milling, mechanical extrusion, and ultrasonication. Chemical pretreatment is either acidic or alkaline. However, there also exist "green chemical pretreatments", which employ ionic liquids and deep eutectic solvents such as choline chloride-lactic acid (ChCl-LA). Steam, carbon dioxide, and ammonia fibre explosion are some of the physicochemical methods [29,30]. Biological pretreatments, which are currently drawing attention and are characterised by short reaction times, employ bacterial or fungal strains in the process [31]. In acid- or enzyme-catalysed hydrolysis, concentrated/dilute acids or enzymes are employed to break cellulose down into simple sugars, which are then converted into ethanol and other products through fermentation employing yeast, bacteria, or fungi [19,21,26,32]. Enzymes employed for the hydrolysis of these feedstocks include glycosyl hydrolases (GHs) such as cellulases and hemicellulases, auxiliary activity (AA) proteins, and carbohydrate esterases (CEs) [29,33]. Among the various fermentation microorganisms, the bacterium Zymomonas mobilis and the yeast Saccharomyces cerevisiae are the most commonly used species. This is due to their exceptional ethanol yield and high tolerance limits [34,35].
Several routes and reactions are known to occur in the pretreatment, hydrolysis, fermentation, and recovery stages. Some stages of production lack robustness and are sensitive to the type of biomass and the prevailing conditions. The associated conditions and parameters, which may include time, temperature, biomass type, chemical composition, type of enzyme, and pH, affect the feasibility of the process and the yields obtained. There is, therefore, a need to predict their interactions and effects using efficient mathematical tools, which would help optimise the entire production process to increase yields while reducing costs.

Artificial Intelligence (AI)
Artificial intelligence is the process whereby the activities of human intelligence are simulated using algorithms and computer science techniques. The term was coined by John McCarthy in 1956 [36,37]. These computer-based techniques can broadly be classified into four categories/subfields. AI applications vary greatly across areas such as process system engineering, modelling, bioenergy systems, and the optimisation of complex systems. The branch of AI that uses symbols for logical deduction is referred to as symbolic AI. Heuristic algorithms, a powerful statistical method, on the other hand, find solutions to complex problems using either evolutionary or swarm intelligence. The use of connectionism and statistical learning techniques to improve the efficiency of a system or task is known as machine learning (ML). According to [38], ML continues to be a highly recognised AI technique in the field of engineering due to its effectiveness [5]. A common tool under this technique, built from neurons and known for its efficiency even with little or no process/system information, is the artificial neural network (ANN). It mainly consists of input, hidden, and output layers. The hidden layers form the connections between the input and output layers. ANN is known to exhibit exceptional information-processing capabilities with an adaptive approach that is resilient to errors. Common ANNs include feedforward backpropagation, counter propagation, and radial basis function (RBF) networks, of which the first is the most extensively used [13]. Moreover, hybrid techniques are also used to increase estimation performance for better simulation and modelling of a process/system [17,18,36]. Owing to the limited application of AI in addressing bioenergy system challenges, this study presents a review of AI techniques in the modelling, prediction, and optimization of bioethanol production. The major AI techniques and a comparison of various algorithms are shown in Figure 1 and Table 1, respectively.
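The layered structure and backpropagation training described above can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the function names, the single hidden layer with tanh activation, the learning rate, and the five data points (scaled temperature and pH mapped to a scaled ethanol yield) are hypothetical and are not taken from any of the reviewed studies.

```python
import math
import random


def init_mlp(n_inputs, n_hidden, seed=0):
    """Create a network with one hidden layer and a single output neuron."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Feedforward pass: inputs -> tanh hidden layer -> linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def train_step(net, x, target, lr=0.05):
    """One backpropagation update for the loss 0.5 * (output - target)**2."""
    hidden, output = forward(net, x)
    err = output - target  # derivative of the loss with respect to the output
    for j, h in enumerate(hidden):
        # Error propagated back through tanh (derivative 1 - h**2),
        # computed with the output weight before that weight is updated.
        grad_h = err * net["w2"][j] * (1.0 - h * h)
        net["w2"][j] -= lr * err * h
        net["b1"][j] -= lr * grad_h
        for i, xi in enumerate(x):
            net["w1"][j][i] -= lr * grad_h * xi
    net["b2"] -= lr * err


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)


# Hypothetical scaled (temperature, pH) -> ethanol yield points, for illustration only.
DATA = [([0.0, 0.0], 0.1), ([0.0, 1.0], 0.4), ([1.0, 0.0], 0.5),
        ([1.0, 1.0], 0.9), ([0.5, 0.5], 0.5)]


def train(data, n_hidden=3, epochs=2000, seed=0):
    """Train on the data and return the network with the MSE before and after."""
    net = init_mlp(len(data[0][0]), n_hidden, seed)
    before = mean_squared_error(net, data)
    for _ in range(epochs):
        for x, t in data:
            train_step(net, x, t)
    return net, before, mean_squared_error(net, data)
```

Calling `train(DATA)` returns the trained network together with the mean squared error before and after training, which gives a quick check that the weights are actually being adjusted toward the data. Published studies typically rely on established libraries and more elaborate training algorithms than this plain gradient descent.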

AI Applications in Bioethanol Production Cycle
Generally, the production cycle of bioethanol involves the following steps: pretreatment, hydrolysis, fermentation, and recovery. Hydrolysis and fermentation are known as the critical stages of this production [6,22,40]. They are mostly considered dynamic, nonlinear, and complicated.
International Journal of Energy Research
For instance, in enzymatic hydrolysis of lignocellulosic biomass, enzymes break down polymeric carbohydrates into sugar monomers under mild operating conditions. This method exhibits high efficiency, leading to substantial sugar recovery, while avoiding the formation of inhibitors and minimizing the risk of corrosion. Furthermore, appropriate temperatures, pH values, and loading times are necessary to achieve optimal performance. The choice of enzyme also plays a significant role, depending on the substrate. Acid hydrolysis, on the other hand, is a convenient and widely employed method that may require varying temperatures and concentrations for higher yields. In concentrated acid-catalysed hydrolysis, lower temperatures and higher acid concentrations result in high sugar recovery. However, this method suffers from the high production costs associated with acid recovery, disposal, concentration control, and recycling. Additionally, concentrated acid-catalysed hydrolysis poses a risk of degrading sugar monomers because of the prevailing acidic environment, which can negatively affect fermentation [29].
Temperature and fermentation pathways also play important roles in the transformation of these monomeric sugars into ethanol. Common fermentation routes reported in the literature include separate hydrolysis and fermentation (SHF), simultaneous saccharification and fermentation (SSF), simultaneous saccharification and cofermentation (SSCF), and consolidated bioprocessing (CBP) [23]. SHF is the most extensively used two-stage process, in which enzymatic hydrolysis is conducted separately from fermentation, enabling the enzymes to function at elevated temperatures while the fermentation microorganisms operate at moderate temperatures to achieve optimal performance. Although SHF of lignocellulosic biomass ensures good enzymatic hydrolysis and fermentation, it is mostly challenged by lower yields, high production cost, and contamination [29,41]. In SSF, cellulose saccharification and monomeric sugar fermentation occur concurrently in the same reactor, whereas the SSCF approach conducts hydrolysis saccharification within the same unit with fermentation taking place simultaneously. The cost-effective CBP process also has enzyme production, hydrolysis, and fermentation occurring in a single vessel for bioethanol production. These latter three integrated processes (SSF, SSCF, and CBP) were introduced to help overcome the limitations associated with SHF, as they exhibit less contamination and easier process design with higher yields [20,21,29,42].
Typically, after the fermentation of monomeric sugars, the next step involves recovering ethanol from the fermented broth. To achieve this, the water content of the broth is usually reduced, allowing the production of anhydrous ethanol. This process encounters challenges due to the azeotropic nature of the ethanol-water solution. However, distillation techniques, which exploit the difference in boiling points of the solution components, can be employed to overcome this limitation. The azeotrope problem can be resolved by introducing a separating agent that modifies the relative volatility of the key component [43]. Various techniques are used to recover pure ethanol from the fermentation broth, including adsorption distillation, extractive distillation, vacuum distillation, membrane distillation, and chemical dehydration. Among the conventional methods, azeotropic distillation, liquid-liquid extraction, and extractive distillation are commonly employed. Extractive distillation stands out as the most extensively used technique for large-scale operations. However, emerging techniques like pervaporation and salt distillation are gaining attention for their potential in future applications, particularly due to their lower energy requirements [21].
AI technologies applied in the stages of this cycle are discussed in this section. Researchers have reported various studies on the application of AI systems, especially ML and hybrid techniques, in the production of bioethanol. Although these studies are few, they show that these techniques are capable of solving complex problems involving large sets of variables and are able to model and optimise various parameters around these processes.

AI Applications in Pretreatment and Hydrolysis.
Enzymatic hydrolysis (EH), seen as an obstacle in the production of bioethanol because of the vast range of enzymes involved, was reported to be effectively optimized simultaneously with fermentation using two ML technologies: random forest (RF) algorithms and the artificial neural network (ANN). This optimisation was successful in predicting the effect of temperature, feedstock, and enzyme load on the production of ethanol from sugarcane biomass. The author confirms that the results were close to the experimental ones and concludes that the modelling is effective for feasible bioethanol production [11]. Empirical data were also modelled to predict the production of bioethanol from second-generation feedstock [12]. The output variable (bioethanol concentration) from these biomasses was modelled and predicted with ANN and RF algorithms, using ionic liquids as the input variables. These algorithms were selected for their capability not only to predict the bioethanol concentration but also to facilitate the selection of suitable ionic liquids for the process. The authors emphasised that these ML algorithms, used in hybrid modelling of multistage hydrolysis and fermentation, were excellent and in agreement with experimental results in predicting the bioethanol concentration, with a coefficient of determination (R²) of 0.961.
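Because the coefficient of determination is the main benchmark quoted in this review, a short sketch of how it is usually computed may help in reading the reported values. The function name is hypothetical, and the standard definition R² = 1 - SS_res / SS_tot is assumed here; the reviewed studies do not state which variant they used.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the standard definition; `observed` must not be constant,
    otherwise the total sum of squares is zero.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Under this definition a model that reproduces every observation exactly scores 1, and a model that always predicts the mean of the observations scores 0, so values such as 0.961 indicate that most of the variance in the measured concentration is captured by the model.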
In another study, response surface methodology (RSM) and ANN were compared in predicting the various components in the hydrolysis stage of producing oligosaccharide mixtures from sugar beet pulp. RSM is a statistical technique mostly associated with principles of randomness. The multilayer perceptron (MLP) is a feedforward ANN characterised by a number of neurons that work together to generate a set of outputs for complex nonlinear processes. This perceptron has three layers, namely, input, hidden, and output layers. Enzymatic hydrolysis of sugar beet pulp is affected by the polygalacturonase to solid ratio, the cellulase activity to polygalacturonase activity ratio, and the reaction time. The authors of [15], therefore, assessed these factors in relation to oligosaccharides using the stated tools. This approach involved varying the number of neurons across different training configurations and hidden layers. The comparative assessment showed the MLP to be a valid tool for modelling oligosaccharide production from the pulp through enzymatic hydrolysis. Similarly, [44] assessed RSM and ANN techniques for the maximisation of reducing sugars in the enzymatic hydrolysis stage of bioethanol production from water hyacinth biomass. This assessment was a comparative study of these techniques for optimized enzymatic saccharification. MLP once again proved to be more effective than RSM, with average and optimum prediction errors of 3.08 and 0.95, respectively.
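For comparison with the ANN approach, RSM is commonly implemented by fitting a second-order polynomial to experimental responses and locating its stationary point. The sketch below does this for a single factor by ordinary least squares. It is an illustrative assumption rather than the procedure of any cited study: real RSM designs usually involve several factors and interaction terms, and all function names here are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear(ata, aty)


def stationary_point(coeffs):
    """Factor level where the fitted quadratic has zero slope (requires b2 != 0)."""
    _, b1, b2 = coeffs
    return -b1 / (2.0 * b2)
```

When the fitted b2 is negative, the stationary point is a maximum of the fitted response, which is how an optimum operating condition is read off a response surface. An MLP, by contrast, does not assume a quadratic form, which is one reason it can outperform RSM on strongly nonlinear data.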
The coefficient of determination (R²) was used as a benchmark in examining the effectiveness of a hybrid model (PSO-ANN) and RSM for xylose and glucose production. These techniques were used to model the pretreatment and enzymatic hydrolysis of lignocellulosic biomass to improve its estimated yields. For the same xylose and glucose yields, the analysis revealed that the hybrid model was more accurate than RSM in the production stages of these sugars. The R² values of the hybrid model for glucose and xylose were 0.9939 and 0.9479, and those of the RSM model were 0.8901 and 0.8439 [45]. Experimental data from the pretreatment and hydrolysis of sugarcane bagasse using dilute acid and combined dilute acid ozonolysis were compared with modelled results (glucose concentrations) from a trained ANN for bioethanol production. The trained ANN model, a multilayer perceptron (MLP), was reported to exhibit good estimation capabilities and agreement with the experimental data [46]. Using an MLP, [47] predicted and concluded that the sensitive operational conditions for high glucose yields in the pretreatment and hydrolysis of sugarcane bagasse are low initial biomass concentration and acid concentration, high enzyme concentration, and an enzymatic hydrolysis duration of 72 hours. In addition, [8] modelled the enzymatic hydrolysis stage of bioethanol production from sugarcane bagasse using MLP. The neural network combined the effects of the cellulase and β-glucosidase loads and successfully predicted and optimized the glucose concentration and yield. The accuracy and efficiency of an MLP model with the Levenberg-Marquardt backpropagation algorithm was also reported by [48] in predicting the effect of substrate particle size, biomass loading, and reaction time on glucose and xylose production during the enzymatic hydrolysis of rice straw biomass.

AI in Fermentation.
Ahmadian-Moghadam et al. [14] examined the ability of MLP to estimate bioethanol concentration using sugar concentration and live and dead yeast as the input variables for a batch industrial fermentation process. This was done by training the neural network with 30 sets of data and testing it against an existing database. The results of the comparison, as stated by the authors, agreed with the existing data. The input variables were strongly related to the output variable (bioethanol concentration), giving low errors and high R² values. The conclusion drawn from the prediction was that the MLP is an accurate, simple, and powerful tool for modelling cost-effective bioethanol production from sugarcane molasses. Industrial bioethanol production was also modelled and simulated from fermentation variables using a multilayer feedforward neural network and the particle swarm optimisation (PSO) algorithm. PSO is recognised as an effective and robust optimisation tool, especially when the modelling of a targeted output variable is of high interest. Using 3400 data values from a Brazilian company's fermentation unit, an MLP was modelled and optimized with PSO to enhance production levels on an industrial scale. The coefficient of determination of 0.91 indicated that these combined trained models and algorithms can predict concentrations accurately and effectively maximise bioethanol production and concentration levels. This was supported by the 10% increment obtained with this new approach on an industrial scale [49]. In assessing the fermentation stage for the production of bioethanol (output variable) from watermelon waste, three different amounts of yeast and fermenter agitator speeds (input variables) were employed. Modelling and prediction were done using an MLP trained with the Levenberg-Marquardt algorithm, which is known for its precision, and an adaptive neurofuzzy inference system (ANFIS). The MLP, which uses the backpropagation training method, yielded a mean square error (MSE) and R² of 0.0089 and 0.9895, whereas those of ANFIS were 0.3129 and 0.9993, respectively. These results confirmed the effectiveness of these models in assessing and predicting bioethanol production from this waste [50]. Esfahanian et al. [51] evaluated the batch fermentation stage for the production of bioethanol from glucose using the yeast Saccharomyces cerevisiae. Three input variables that affect production, that is, temperature, pH, and glucose concentration, were modelled and optimized using RSM and MLP. Although the results of the optimisation were fairly close, the precision of MLP was higher than that of RSM in the prediction. This was backed by R² values of 0.9975 and 0.9965 for MLP and RSM, respectively. To analyse the economic feasibility of optimal bioethanol production, an MLP trained with the Levenberg-Marquardt algorithm was used to model and predict bioethanol content, number of yeast cells, and reducing sugars from the intermediates and byproducts of sugar beet in a yeast batch fermentation process. The computed values showed the predictive capabilities of this kind of ANN (MLP) in decision-making for biotechnological processes [52]. Furthermore, optimal process control for the fermentation of "ricotta cheese whey" was achieved with a hybrid neural model (HNM). The successful predictive capability of this model, which yielded an average percentage error of less than 10%, was attained by coupling neural networks for lactose, biomass, and bioethanol to mass balance equations [53].
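Particle swarm optimisation, used in several of the studies above to tune fermentation conditions, can be sketched in a few lines of plain Python. This is a generic textbook variant with an inertia weight, not the implementation of any cited work. The response surface `hypothetical_yield`, with its peak at 32 °C and pH 5, is invented purely for illustration, as are the parameter values and function names.

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iter=100,
                            inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimise `objective` inside box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]          # personal best positions
    best_val = [objective(p) for p in positions]  # personal best values
    g = min(range(n_particles), key=lambda k: best_val[k])
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(n_iter):
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, pull to personal best, pull to global best.
                velocities[k][d] = (
                    inertia * velocities[k][d]
                    + c_personal * r1 * (best_pos[k][d] - positions[k][d])
                    + c_global * r2 * (global_pos[d] - positions[k][d])
                )
                # Move and clamp the particle to the allowed range.
                positions[k][d] = min(hi, max(lo, positions[k][d] + velocities[k][d]))
            value = objective(positions[k])
            if value < best_val[k]:
                best_pos[k], best_val[k] = positions[k][:], value
                if value < global_val:
                    global_pos, global_val = positions[k][:], value
    return global_pos, global_val


def hypothetical_yield(temperature, ph):
    """Made-up ethanol yield surface, used only to exercise the optimiser."""
    return 80.0 - 0.1 * (temperature - 32.0) ** 2 - 4.0 * (ph - 5.0) ** 2


def negative_yield(x):
    """PSO minimises, so maximising yield means minimising its negative."""
    return -hypothetical_yield(x[0], x[1])
```

A call such as `particle_swarm_minimize(negative_yield, [(20.0, 45.0), (3.0, 7.0)])` searches temperature and pH within the given bounds. In a hybrid scheme such as PSO-ANN, the invented surface would be replaced by a trained network's prediction.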
In the application of AI to the combined critical stages (hydrolysis and fermentation), [40] concluded that backpropagation ANN is efficient in optimisation and effective in cost and time when predicting reducing sugars and ethanol concentration from an enzymatic hydrolysis and fermentation process. This was based on the close average values obtained. That is, the predicted and experimental values of reducing sugars were 175.94 g/L and 174.29 g/L, and those of ethanol concentration were 82.11 g/L and 81.52 g/L, respectively. The MLP architecture for both stages is presented in Figure 2. The architecture of a model depicts the structure of connections/arrangement of neurons in the network. In the above study, a three-layered feedforward architecture was used. The input parameters for the hydrolysis process were substrate loading, α-amylase enzyme concentration, amyloglucosidase enzyme concentration, and stroke speed. The inputs for the fermentation process, on the other hand, included reaction temperature, agitation speed, and yeast concentration. An intelligence technique (ant colony optimisation) was then integrated with the ANN model to optimise both stages for reducing sugars and ethanol concentrations.
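The closeness of the predicted and experimental values reported by [40] can be checked with a simple relative error calculation. The helper below is a hypothetical illustration, and the relative error measure is an assumption, since the study's own error metric is not stated here.

```python
def percent_error(predicted, experimental):
    """Absolute relative error of a prediction, as a percentage of the experimental value."""
    return abs(predicted - experimental) / experimental * 100.0


# Values quoted from [40], in g/L (predicted, experimental).
reducing_sugar_error = percent_error(175.94, 174.29)
ethanol_error = percent_error(82.11, 81.52)
```

With the quoted values, both errors come out below 1% (roughly 0.95% for reducing sugars and 0.72% for ethanol concentration), which is consistent with the authors' description of the predictions as close to experiment.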
Talebnia et al. [54] developed and combined two multilayer perceptron feedforward network models to predict time course and bioethanol concentrations. The models were applied to the enzymatic hydrolysis and fermentation stages to model the entire bioethanol production from steam-exploded rapeseed straw. Betiku and Taiwo [55] evaluated a trained multilayer feedforward neural network and RSM on the effect of hydraulic retention time, breadfruit hydrolysate concentration, and pH in bioethanol production. The absolute average deviation between the experimental and predicted values was 0.09% for MLP and 1.67% for RSM. On the basis of these results, the authors confirmed that ANN was more accurate than RSM. A summary of the hydrolysis and fermentation studies in bioethanol production is presented in Table 2.
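The absolute average deviation (AAD) quoted above is commonly defined as the mean of the absolute relative errors, expressed as a percentage. The sketch below assumes that definition, which may differ in detail from the one used in [55], and the function name is hypothetical.

```python
def absolute_average_deviation(experimental, predicted):
    """AAD (%) = 100/n * sum(|y_exp - y_pred| / |y_exp|).

    Assumes no experimental value is zero.
    """
    terms = [abs(e - p) / abs(e) for e, p in zip(experimental, predicted)]
    return 100.0 * sum(terms) / len(terms)
```

For example, experimental values of 10 and 20 with predictions of 9 and 22 each deviate by 10%, so the AAD is 10%. Lower values indicate closer agreement, which is the sense in which 0.09% for MLP is better than 1.67% for RSM.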

Conclusions
Past and recent studies show that the ANN tool is of great importance in the modelling and optimization of bioethanol production. This is attributed to the algorithms' flexibility and high error tolerance for the nonlinear and complex stages of production. The frequently used input variables include number of yeast cells, fermentation time, pH, and biomass type. Common output variables were bioethanol concentration, bioethanol production, reducing sugars, and yields. From the trend seen in the various studies of this review, ANNs keep exhibiting higher prediction accuracy and efficiency than other AI techniques/models, with R² in the range of 0.91-0.99. Therefore, implementing these AI techniques in future studies on bioethanol production processes will not only ensure robustness but also reduce costs and time during process development.