Metaheuristics Method for Classification and Prediction of Student Performance Using Machine Learning Predictors

Department of Basic Sciences, College of Science and Theoretical Studies, Saudi Electronic University, Dammam 32256, Saudi Arabia; Chettinad Hospital and Research Institute, Chettinad Academy of Research and Education, Kelambakkam, Tamil Nadu, India; Universidad Señor de Sipán, Chiclayo, Peru; Universidad Nacional Santiago Antunez de Mayolo, Huaraz, Peru; College of Science and Humanities at Sulail, Prince Sattam Bin Abdulaziz University, Al Kharj 11942, Saudi Arabia; Cebu Technological University NEC City of Naga, Cebu, Philippines; Isteqlal Institute of Higher Education, Kabul, Afghanistan


Introduction
Data mining (DM) is the extraction and processing of valuable information from a large data warehouse. The first step in DM is to look at the data in various ways and find the most valuable information in the most summarized form. In marketing strategy, DM approaches are extremely beneficial since they minimize unnecessary data and save resources. They also help to discover consumer behavior patterns and are practical because the knowledge they produce is simple to interpret. For forecasting and prediction, DM techniques cover a populated region more quickly than previous methods. Despite the hype surrounding it, this discipline is having a significant influence in the education, industry, and science sectors, and new methodological advancements might be made as a result. DM is closely linked to statistical and mathematical data analysis, and most approaches utilized in DM have so far arisen from the subject of statistics. As part of our investigation, we look at some of the latest educational models and practices [1].
All colleges and universities are primarily concerned with improving the quality of managerial choices and educating students. High-quality higher education may be achieved in a number of ways. One is through accurate predictions of students' success in their chosen educational setting. There are a variety of prediction models to choose from. While it is not clear whether there are any indicators that can reliably forecast whether a student will be an academic genius, a dropout, or an ordinary performer, the research literature does report on student achievement. Existing methods have not met the evolving needs of higher and professional education. The goal of this study is to identify issues and potential solutions related to the quality of education provided by universities and other higher learning institutions, and to provide a framework for doing so [2,3].
In higher education, students and alumni are dealing with serious issues. A school would want to determine who will enroll in a specific program and who will require extra help to get their degree. It would also want to know which students are more likely to switch schools than others. These and other difficulties, such as student enrollment management and the time it takes to complete a degree program, keep higher education institutions on their toes. DM, the analysis and presentation of data, may be an effective means of addressing these issues for students and alumni. With DM, organizations can now use their existing reporting tools to find and analyze patterns in massive databases. Individuals' behavior may be predicted with great accuracy using these patterns in data mining models. Schools can better distribute resources and employees as a result [4]. An institution's DM might, for example, provide the information needed to take action before a student drops out, or provide an accurate forecast of how many students will attend a certain course [5,6].
This article presents a metaheuristics and machine learning-based method for the classification and prediction of student performance. First, the input data set is preprocessed using the AMF algorithm. Then, features are selected using a relief algorithm. Finally, machine learning classifiers, namely the back propagation neural network (BPNN), random forest (RF), and naive Bayes (NB), are used to classify student academic performance data. The remainder of the article is organized as follows. The literature survey section reviews existing work in education technology. The methodology section presents the proposed method. The result analysis section contains the experimental setup details and the results of the various algorithms. The conclusion section summarizes the contribution of the research article.

Literature Survey
This section surveys work on evaluating and forecasting student performance in higher education institutions using DM technology.
Data from a preliminary evaluation were used to forecast whether a student will pass or fail a course, and Bayesian, decision tree, and neural network algorithms were compared in terms of prediction accuracy, ease of learning, and user-friendliness. The researchers found that, by taking the proper measures at the right time, such a system can help students and teachers improve student achievement and reduce the failure ratio [7].
To forecast students' grades, Jayasingh et al. [8] utilized a Bayesian classification algorithm that relied on the previous year's performance data. According to their findings, teachers and students alike stand to benefit from the research. The research also aids in identifying those students who require additional attention in order to lower the failure rate. To create a multiclassifier, they compared the performance of four distinct classifiers. A genetic algorithm (GA) was used to reduce the error rate by at least 10% by classifying the data into three separate groups based on the attributes that were most important to the prediction.
Researchers found that it is feasible to estimate student performance automatically. The Bayesian network, an extensible classification framework, makes the incorporation of this information into the learning process simple and uniform.
This study demonstrates that strategies for performance prediction and further investigation of learning algorithms are both necessary and desirable [9].
These data demonstrated a strong correlation between a student's current performance and their performance in earlier courses (most likely prerequisite courses). Classification trees are popular, according to [10], because their classification criteria are easily comprehensible. To discover the best classifier for student data and forecast students' performance on the end-of-semester assessment, researchers compared commonly used decision tree classifiers. According to the experimental findings, CART was the best method for classifying the data.
It was shown that the decision tree method may be used to accurately predict students' academic achievement. The authors of [11] claimed that DM might be used in higher education, namely to predict students' final achievement. Researchers used questionnaires to gather information from students on their attitudes toward learning and their academic achievement.
This was followed by the implementation of DM strategies. Students' final grades were predicted using a decision tree-based model and an SVM algorithm that implemented the model's criteria for prediction. The students were then grouped using kernel k-means clustering.
To forecast the final exam scores of engineering students, it was important to develop prediction models that took into account all of a student's individual characteristics, as well as the social, psychological, and other influences on their performance [12,13]. Compared with the ID3 and CART algorithms, the C4.5 approach had the greatest accuracy, at 67.7778 percent. Bharadwaj and Pal [14] examined several criteria to derive the performance prediction indicators that instructors need in order to measure, monitor, and evaluate performance. Naive Bayes classification was shown to be the most effective method on their data. Another study used educational data mining to forecast the likelihood that students would stay in school [15]. In that study, machine learning algorithms (ID3, C4.5, and ADT) were used to evaluate and extract knowledge from previously collected student records. ADT was shown to be able to build predictive models from the previous year's retention data.
Rossi [16] was the first to propose an algorithm and system architecture suited for anticipating instructors' performance, as well as for recommending critical actions to assist school administrators in making decisions, motivated by the limitations of conventional approaches. According to [16], school districts that use this technique to aid administrators in making better decisions and teachers in improving their performance may see an increase in students' academic achievement. This is the method via which the goals will be met. Hemaid and Halees [12] carried out a similar research project in order to better understand the aspects that influence the appraisal of instructors' performance. Teachers from Gaza City's Ministry of Education and Higher Education participated in this survey, which was conducted in English.
In each activity, they spent a significant amount of time discussing the relationship between their outcomes and the teacher's performance. The technique developed by Agaoglu et al. [17] was designed with the main goal of improving student performance. As part of the course requirements, students were given a questionnaire on how they were taught and how they interacted with one another. The performance of the staff members who taught the relevant courses was assessed using an educational data mining-based classification approach. According to the findings, the C4.5 classifier outperformed the other classifiers. That investigation also found that a substantial number of the survey questions used to assess student satisfaction with the course were erroneous. According to Tripti et al., students' social integration, emotional capacity, and intellectual accomplishment are all important factors in their development. Students in their third semester were examined using two separate classification approaches: j4.8 and the random tree method. When compared side by side, the random tree outperformed the j4.8 algorithm.

Methodology
This section presents a metaheuristics and machine learning-based method for the classification and prediction of student performance. First, features are selected using a relief algorithm. Machine learning classifiers such as BPNN, RF, and NB are then used to classify student academic performance data. The flowchart for the proposed methodology is shown in Figure 1.
In 1992, Kira and Rendell [18] created the relief algorithm, which is based on an instance-based learning methodology. It is a filter-type method for discovering how relevant each feature is. In computing the feature statistics, the nearest neighbors method is employed to account for the interactions between variables. This approach does not handle data that has missing values or a large number of classes.
The back propagation approach, as described by Haykin and Anderson, is one of the most widely used learning techniques. A back propagation network (BPN) is an excellent choice for simple pattern recognition and mapping tasks. Backpropagation is a method of learning rather than a network. Given examples, the algorithm trains a network to provide the proper output for each input pattern presented to it. As a result, the network's weights are recalculated. A training pair comprises an input and a target [19].
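To make the relief feature weighting described above more concrete, the following is a minimal sketch rather than the implementation used in this study. It assumes numeric features, exactly two classes, at least two instances per class, and an absolute (not squared) feature difference scaled by the feature's range. The function name `relief` and the toy data are hypothetical. For each instance, the sketch finds the nearest neighbor of the same class (near hit) and of the other class (near miss), and rewards features that differ on the miss but not on the hit.

```python
import random

def relief(X, y, n_iter=None, seed=0):
    """Return one relevance weight per feature (two-class relief sketch)."""
    n, d = len(X), len(X[0])
    # Scale each feature difference by the feature's range so weights are comparable.
    ranges = []
    for j in range(d):
        col = [row[j] for row in X]
        ranges.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / ranges[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(d))

    rng = random.Random(seed)
    # Visit every instance once by default, or sample n_iter instances at random.
    if n_iter is None:
        indices = list(range(n))
    else:
        indices = [rng.randrange(n) for _ in range(n_iter)]

    weights = [0.0] * d
    for i in indices:
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        near_hit = min(hits, key=lambda k: distance(X[i], X[k]))
        near_miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(d):
            delta = diff(j, X[i], X[near_miss]) - diff(j, X[i], X[near_hit])
            weights[j] += delta / len(indices)
    return weights

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.2], [1.0, 0.8], [0.8, 0.4]]
y = [0, 0, 0, 1, 1, 1]
weights = relief(X, y)
print(weights)
```

Features with the largest weights would be retained for the classifiers; the cut-off is a design choice that this sketch leaves open.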
The naive Bayes classifier takes its name from Bayes' theorem, on which it is based, together with the "naive" assumption that the features are independent of one another given the class. It is a widely used classification mechanism; thus, a great deal of attention has been paid to it. Naive Bayesian models are straightforward to build because no iterative parameter estimation is required. By Bayes' theorem, the posterior probability P(c|x) is obtained from the class prior P(c), the likelihood P(x|c), and the evidence P(x): P(c|x) = P(x|c) P(c) / P(x). An estimate of the posterior probability may be obtained by first constructing a frequency table from the data. From the frequency tables, the naive Bayesian approach calculates the probability associated with each class in the data set. After determining the probability of each class, the class with the highest posterior probability is chosen as the classification [20].
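The frequency-table procedure described above can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' code: the features are categorical, Laplace smoothing with `alpha=1.0` is added to avoid zero counts, and the attribute values and labels (attendance, assignments, pass/fail) are hypothetical and not drawn from the data set used in this study.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels):
    """Build the frequency tables for a categorical naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, c)][v] += 1
            values_seen[j].add(v)
    return class_counts, value_counts, values_seen

def posterior(model, row, alpha=1.0):
    """Return P(c | x) for every class c using Bayes' theorem."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        p = n_c / total  # prior P(c)
        for j, v in enumerate(row):
            # Laplace-smoothed likelihood P(x_j | c) from the frequency table.
            count = value_counts[(j, c)][v]
            p *= (count + alpha) / (n_c + alpha * len(values_seen[j]))
        scores[c] = p
    evidence = sum(scores.values())  # P(x), the normalizing constant
    return {c: s / evidence for c, s in scores.items()}

# Hypothetical records: (attendance, assignments) -> outcome.
rows = [("high", "done"), ("high", "done"), ("high", "missed"),
        ("low", "done"), ("low", "missed"), ("low", "missed")]
labels = ["pass", "pass", "pass", "fail", "fail", "fail"]

model = fit_naive_bayes(rows, labels)
post = posterior(model, ("high", "done"))
prediction = max(post, key=post.get)
print(prediction, post)
```

The predicted class is simply the one with the largest posterior, which matches the final selection step described in the text.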
The ID3 decision tree algorithm is used to construct a tree reflecting multiple conditions and the potential values of the target-class labels. The resulting tree can be readily implemented in any programming language as nested if-else statements. ID3 uses entropy and information gain to examine and describe the statistics of the training data. It is a greedy algorithm that does not backtrack in order to improve its decisions. As far as tree size is concerned, standard ID3 does not use any kind of pruning or optimization. To determine how homogeneous a subset of the data is, the entropy of that subset is computed. The entropy is zero if all of its values are the same. The entropy is equal to one if the values are evenly divided between two classes. Entropy must be determined at the level of a single attribute or of a combination of two attributes.
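The entropy and information gain computations that ID3 relies on can be illustrated with a short sketch. The function names and the toy records below are hypothetical; the sketch assumes categorical attributes addressed by position and uses base-2 logarithms, so a two-class subset that is evenly split has entropy one.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting on one attribute index."""
    n = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical records: (attendance, has_part_time_job) -> outcome.
rows = [("high", "yes"), ("high", "no"), ("low", "yes"), ("low", "no")]
labels = ["pass", "pass", "fail", "fail"]

gains = [information_gain(rows, labels, a) for a in range(2)]
best_attribute = max(range(2), key=lambda a: gains[a])
print(gains, best_attribute)
```

At each node, ID3 greedily selects the attribute with the highest gain, here the first one, and then repeats the procedure on each resulting subset.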

Results and Discussion
The university data set (http://archive.ics.uci.edu/ml/datasets/university) is used for the experimental study. This data set consists of 285 instances and contains seventeen attributes. Of these, 240 instances are used for training the machine learning classifiers, and the remaining 45 instances are used for testing. Results are shown in Figures 2-4.
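For illustration, a hold-out split with the sizes stated above (240 training and 45 test instances) and an accuracy measure could be sketched as follows. The text does not state how the instances were assigned to the two sets, so the random shuffle, the seed, and the placeholder record identifiers used here are assumptions.

```python
import random

def holdout_split(records, n_train, seed=42):
    """Shuffle a copy of the records and split it into train and test parts."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Placeholder identifiers standing in for the 285 instances.
records = list(range(285))
train, test = holdout_split(records, 240)
print(len(train), len(test))
```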
Parameters used in the experimental study are discussed:

Conclusion
There has been a gradual decline in higher education over the last few decades, both in the academic setting (faculty and students) and in research and development output (including graduates). In essence, all colleges and universities are devoted to the improvement of management decision-making and the education of their students. A high-quality higher education can be obtained through a variety of different means and formats. One technique is to use predictive analytics to accurately predict students' performance in their chosen educational context. There are a plethora of different prediction models to choose from. It remains uncertain whether there are any signs that may indicate whether a student will be an academic genius, a dropout, or an ordinary performer. To classify students and make accurate projections about their academic success, this article makes use of machine learning and metaheuristics. In the initial phase of the process, a relief strategy reduces the number of features that need to be considered. The information on the academic achievement of the students is then classified using three different machine learning classifiers: BPNN, RF, and NB. The accuracy of BPNN as a tool for classifying and predicting the level of academic accomplishment attained by students is continually improving.
Data Availability
The data shall be made available on request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Acknowledgments
This research work is self-funded.