University Classroom Teaching Model Based on Decision Tree Analysis and Machine Learning



Introduction
Since the reform and opening-up, teaching reform has always occupied the most important part of higher education reform. In particular, when the planned economic system is transformed to a market economic system and higher education has moved from an elite education stage to a mass education stage, changes in educational concepts, educational values, and society's demand for talents have put forward new requirements for college education and teaching.
This urges college education and teaching to continuously deepen reforms in concepts, content, methods, and other aspects and to adapt to the new needs of economic and social change. Since the implementation of the "Higher Education Quality and Teaching Reform Project," the Ministry of Education has focused on improving teaching quality every year and will also launch a larger "Teaching Quality Project." The emphasis on improving the quality of teaching in higher education requires that the reform of teaching, especially the reform of teaching mode, should move towards improving the quality of talent training [1].
Teaching mode is the reflection of teaching theory in educational practice and the main way to achieve teaching goals. Therefore, different teaching models reflect different teaching theories and cultivate different types of talents. As important institutions for cultivating talents, colleges and universities strongly influence the type of talent they cultivate through their teaching mode. In order to achieve the goal of cultivating high-quality talents, colleges and universities need to carry out teaching reforms and explore teaching models that are consistent with their teaching goals.
With the implementation of our country's strategy of rejuvenating the country through science and education, education moving towards the digital information age has put forward new and high requirements for college classrooms and also posed severe challenges to traditional classroom education.
There are many problems in traditional classrooms. For example, traditional classrooms are teacher-centered: students passively accept knowledge, focus on examinations, and cannot think independently or explore and acquire knowledge actively. Such a model cannot mobilize learners' initiative and enthusiasm, so learning efficiency is naturally low. The talents that today's society needs are not only those who master theoretical knowledge and basic skills but also those who have an awareness of active participation, unity and cooperation, challenge, and proactiveness. They should also have a spirit of scientific innovation and strong ability in innovative practice. So, the purpose of our education is to train such innovative talents through a teaching mode that meets these needs [2]. The development of Internet technology and big data has enabled people to acquire knowledge through different channels, and knowledge transfer has changed from one-way transmission to interaction. The information age has thus brought emerging teaching models such as MOOCs and flipped classrooms into people's sight. Schools and teachers are no longer the only channel through which people receive education, and the status of higher education has been shaken.
This has also had a huge impact on the internal operating mechanism of colleges and universities. Therefore, we must reform the traditional classroom teaching mode, creatively use modern educational technology to guide students to learn actively, give full play to the role of students as the main body of learning, promote interaction and communication between teachers and students, and achieve personalized teaching to cultivate creative talents.
Traditional high-efficiency classroom teaching methods are mostly multimedia-assisted teaching methods, with which it is difficult to find problems in teaching. Therefore, this article uses decision tree algorithms to explore problems in college classroom teaching, address the problems existing in traditional teaching models, and make targeted improvements.
In this paper, the decision tree algorithm is improved on the basis of existing research so that it can meet the knowledge discovery needs of an efficient teaching mode. The model in this paper is then used for effect evaluation.

Related Work
Because traditional rational decision-making models cannot fully explain the many anomalies of individuals in actual decision-making behavior, economists and management scientists began to focus on the "decision-making process" and established a research framework based on cognitive psychology [3]. The core of the theory in [4] is to impose a series of constraints on the preference relationship, so that the representation of the preference relationship is a real-valued mapping from the state space to the result space. In other words, every decision-maker has his own real-valued utility function, which is called the VNM expected utility function. Decision-makers only face the uncertainty of "risk"; that is, the decision-makers know the probability distribution of the results of various actions and assign a specific value to each result as the utility of an action. The decision-maker chooses among multiple actions in accordance with the principle of maximizing the expected value of the utility function [5]. The classic view of the theory in [6] is that the utility function is linear in probabilities. The literature points out that the preference relationship is a binary relationship that satisfies the independence axiom and the Archimedean axiom. The study in [7] combined the expected utility theory and the definition of subjective probability to propose seven axioms that systematically summarize the expected utility theory of subjective probability. The study in [8] started from two clues to concisely and completely summarize the expected utility theory. Starting from the risk decision preference axiom system and Savage's decision theory in the context of uncertainty, the study in [9] proved and summarized the axioms and numerical expressions of preference. With the birth of the expected utility theory, various application fields have obtained clear judgment standards and research systems.
For example, after companies' investment decision-making, financing, asset pricing, and other fields are incorporated into the expected utility theory system, they have achieved rapid development [10]. Similarly, under the framework of expected utility theory, financial scientific research has also made remarkable achievements [11].
Many anomalies often appear in decision-making research based on the expected utility theory, such as the isolation effect, reflection effect, certainty effect, common ratio effect, and common result effect. Therefore, the improvement direction of the expected utility theory is to revise or abandon the independence axiom and the principle of certainty and to use another set of axioms to form a new theoretical system that extends the expected utility theory, that is, various non-expected utility theories [12].
Decision trees are used to graphically represent available decisions, random factors, and their consequences. A decision tree is usually drawn from left to right and consists of decision nodes (squares), chance nodes (circles), end nodes (triangles), and branches (arrows) from one node to another. Moreover, all decision trees start with a decision node called the root node [13]. The branches that emerge from the root node and other decision nodes represent the set of available decisions (alternatives), only one of which can be selected. The realization of a decision may lead to a final result (outcome node, also called the end point), a random event (chance node), or another decision problem (decision node) [14]. The branches that emerge from a chance node represent the possible realizations of the corresponding random event and their probabilities [15]. After a random event is realized, three types of situations may occur: the decision-maker must make another decision, another random event occurs, or the final result is obtained. Before giving the mathematical definition of the decision tree, we first distinguish between the concepts of dynamic selection and static selection [16]. Static selection: if one or more final decisions must be made (in the irrevocable sense) before any lottery (or any stage of a composite lottery) is resolved, then a static selection is involved. In other words, "nature" will not take any action until the decision-maker has taken all of his own actions without reservation [17]. In terms of the decision tree, a given tree represents a static selection problem if and only if no selection node follows a chance node. Dynamic selection: if a decision needs to be made after certain uncertainties have been resolved, then such a decision is called dynamic selection. There may be several reasons for dynamic selection.
One reason is that the decision-maker may not need (or may even be unable) to make a decision until some uncertainties are resolved. Another reason may be that the available selection set depends on the result of the uncertainty [18]. In any case, the dynamic selection situation will include at least some choices that the decision-maker can (or must) postpone until nature has made some moves. In terms of the decision tree, a given tree represents a dynamic selection problem if and only if there is a selection node after at least one chance node [19]. The difference between static selection and dynamic selection is as follows. The static selection scenario means that all decisions are made irrevocably before any uncertainty is resolved. The dynamic selection situation, however, distinguishes the decision-maker's planned choice at each decision node, made at the beginning of the decision problem (that is, at the root node), from the actual choice made when a given decision node is reached [20].
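The static/dynamic distinction can be checked mechanically on a tree. The following is a minimal Python sketch, not part of the original study; the `Node` class, the `kind` strings, and the function name `is_dynamic` are hypothetical names introduced only for illustration. It reports a tree as dynamic exactly when some decision node appears below a chance node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # kind is one of "decision", "chance", or "end" (illustrative encoding)
    kind: str
    children: List["Node"] = field(default_factory=list)


def is_dynamic(node: Node, seen_chance: bool = False) -> bool:
    """Return True if a decision node follows at least one chance node."""
    if node.kind == "decision" and seen_chance:
        return True
    below_chance = seen_chance or node.kind == "chance"
    return any(is_dynamic(child, below_chance) for child in node.children)


# Static problem: every decision is made before any uncertainty is resolved.
static_tree = Node("decision", [
    Node("chance", [Node("end"), Node("end")]),
    Node("end"),
])

# Dynamic problem: a decision is postponed until after a chance node.
dynamic_tree = Node("decision", [
    Node("chance", [
        Node("decision", [Node("end"), Node("end")]),
        Node("end"),
    ]),
])

print(is_dynamic(static_tree), is_dynamic(dynamic_tree))
```

A tree for which `is_dynamic` returns `False` corresponds to the static selection case, since no choice remains once nature has moved.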

Decision Tree Algorithm Learning Environment
Evasion of blur, that is, perturbing an input sample so that the classifier assigns it a different label, is the most common type of blur in various applications. Its goal can be expressed as the following optimization problem:

$$\min_{x'} \; d(x, x') \quad \text{s.t.} \quad f(x') \neq f(x). \tag{1}$$

Here, $d$ is the distance function measuring the distance between the original sample $x$ and the fuzzy sample $x'$, and $f: X \to Y$ is the classifier decision function that outputs the classification result for the input sample $x$. The goal of this optimization problem is to find the minimum perturbation added to $x$ that changes the output of $f$. The commonly used distance measurement functions are $L_0$, $L_1$, $L_2$, and $L_\infty$, which are defined as [21]

$$L_p(x, x') = \left( \sum_{i=1}^{m} \left| x_i - x'_i \right|^{p} \right)^{1/p},$$

where $x_i$ is the $i$-th feature of the sample $x$, $m$ represents the number of features of the sample, $x$ is the original sample, and $x'$ is the blurred sample. When $p$ is 0, the distance function $L_p$ counts the number of modified features of the sample. In the experiment, the distance function is chosen according to the actual application.

The adversarial sample fuzzy neural network model is constructed by optimizing the following problem; that is, under the premise of satisfying the image pixel value constraint, the smallest disturbance $r$ that makes the generated adversarial sample be misclassified into class $l$ is found:

$$\min_{r} \; c\,|r| + \mathrm{loss}_f(x + r, l) \quad \text{s.t.} \quad x + r \in [0, 1]^{m},$$

where $\mathrm{loss}_f$ refers to the loss function of the neural network, $l$ is the label of the target category, that is, $f(x + r) = l$, and $c$ is the penalty parameter, which is used to control the size of $|r|$. L-BFGS was used to solve this problem. Although this method has good performance, computing adversarial samples with it is expensive.
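The distance functions named above can be computed directly. The sketch below is illustrative only: the function name `lp_distance` and the sample vectors are assumptions, with $p = 0$ handled as a count of changed features and $p = \infty$ as the largest single change.

```python
import math
from typing import Sequence


def lp_distance(x: Sequence[float], x_adv: Sequence[float], p: float) -> float:
    """L_p distance between an original sample and a perturbed sample."""
    if len(x) != len(x_adv):
        raise ValueError("samples must have the same number of features")
    diffs = [abs(a - b) for a, b in zip(x, x_adv)]
    if p == 0:
        # L_0: number of features that were modified
        return float(sum(1 for d in diffs if d != 0))
    if math.isinf(p):
        # L_inf: largest change made to any single feature
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


x = [0.0, 1.0, 2.0, 3.0]          # illustrative original sample
x_adv = [0.0, 1.5, 2.0, 1.0]      # illustrative perturbed sample

for p in (0, 1, 2, math.inf):
    print(p, lp_distance(x, x_adv, p))
```

Which norm is appropriate depends on the application: $L_0$ suits settings where changing few features matters, while $L_\infty$ bounds the change to every feature.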
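This paper proposes a method based on the fast gradient sign, which constructs adversarial samples by calculating the gradient of the cost function of the model with respect to the input of the neural network:

$$x' = x + \varepsilon \cdot \operatorname{sign}\left( \nabla_x J(\theta, x, y) \right).$$

Here, $J$ is the cost function of the model with parameters $\theta$, $y$ is the correct label of $x$, and $\varepsilon$ is the size of the perturbation. By increasing the loss of the model when the sample $x$ is assigned to its correct class, the neural network is led to assign it to another category, which is also a kind of nondirectional blur (that is, one with no specific target).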
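To make the gradient-sign step concrete, here is a minimal sketch on a two-feature logistic model rather than a neural network. This is a simplifying assumption chosen so that the input gradient of the cross-entropy loss has the closed form $(p - y)\,w$; the weights, the sample, and the step size are invented for illustration.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x: List[float], w: List[float], b: float) -> int:
    """Class 1 if the model probability is at least 0.5, else class 0."""
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(x: List[float], y: int, w: List[float], b: float, eps: float) -> List[float]:
    """One gradient-sign step that increases the loss of the correct label y."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # Gradient of the cross-entropy loss with respect to the input x
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


w, b = [2.0, -1.0], 0.0        # illustrative model parameters
x, y = [0.3, 0.1], 1           # illustrative sample with correct label 1
x_adv = fgsm(x, y, w, b, eps=0.2)

print(predict(x, w, b), predict(x_adv, w, b))
```

Because the step follows only the sign of the gradient, every feature moves by exactly `eps`, which is why this construction is naturally bounded in the $L_\infty$ sense.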
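Label smoothing modifies the label $y$ of a training sample during model training. Usually, $y$ is represented by the standard basis vector

$$\mathbf{y} = \mathbf{e}_{y}.$$

Label smoothing does not use this one-hot encoding. Instead, the components of the label vector are set to

$$y_i = \begin{cases} y_{\max}, & i = t, \\ \dfrac{1 - y_{\max}}{k - 1}, & i \neq t. \end{cases}$$

Here, $k$ represents the number of categories, $t$ is the subscript of the correct category, and $y_{\max}$ is a parameter set by the user. When $y_{\max}$ is 1, the category is one-hot encoded.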
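Label smoothing can be sketched in a few lines, assuming the common form in which the correct category receives $y_{\max}$ and the remaining probability mass is split evenly over the other $k - 1$ categories. The function name `smooth_label` is hypothetical.

```python
from typing import List


def smooth_label(t: int, k: int, y_max: float) -> List[float]:
    """Smoothed label vector for correct category t out of k categories."""
    if k < 2 or not 0 <= t < k:
        raise ValueError("need k >= 2 and 0 <= t < k")
    # Assumed form: leftover mass is shared equally by the incorrect categories
    off_value = (1.0 - y_max) / (k - 1)
    return [y_max if i == t else off_value for i in range(k)]


print(smooth_label(t=1, k=4, y_max=1.0))   # reduces to one-hot encoding
print(smooth_label(t=0, k=3, y_max=0.8))   # softened target
```

With `y_max` equal to 1 the function returns the one-hot vector, matching the limiting case described in the text.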
This method helps the model resist gradient-based blurring. A model trained in this way reduces the gradient that the fuzzy person (the party constructing the fuzzy samples) can obtain, thereby reducing the success rate of the blur [22].
This article proposes a defensive distillation method to resist ambiguity. The original distillation technology is designed to compress a large model into a small model while maintaining the accuracy of the model. Defensive distillation, however, does not change the size of the model but produces a model with a smoother output surface and a lower sensitivity to disturbances, so it aims to improve the robustness of the model. As shown in Figure 1, it first trains an initial network $F$ with data $X$, with the temperature of the softmax set to $T$. Then, it uses the same data $X$, the probability vector $F(X)$ predicted by the initial network, and the same network settings to train a distilled network $F^{d}$. Compared with the original class label of a sample, the probability vector $F(X)$ contains more information about the classification. The results of the experiment show that the use of defensive distillation can reduce the success rate of hostile ambiguity by 90%. The work flowchart of Defense-GAN is shown in Figure 2. Although Defense-GAN has proven to be very effective in fuzzy defense, its success relies heavily on the expressiveness of the GAN and the ability of the generator. In addition, training a GAN is challenging. If proper training is not carried out, the defensive ability will be significantly reduced.
The process of building a decision tree is a process of continuously dividing the data set (Figure 3). Each time, a feature is selected according to a specific feature selection criterion, and the data set is divided according to the selected feature. The construction is complete when only one type of sample remains in the data set or another termination condition is met. Since the decision tree can clearly express its decision-making process through visualization, it is widely used in malware detection, network intrusion detection, and spam detection. Three commonly used decision trees are discussed in the following, including the clear decision tree ID3 [23].
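The role of the softmax temperature in distillation can be illustrated numerically. The sketch below is an assumption-laden illustration, not the training procedure itself: the logits and the function name `softmax_t` are invented, and only the computation of soft probability vectors is shown.

```python
import math
from typing import List


def softmax_t(logits: List[float], T: float = 1.0) -> List[float]:
    """Softmax with temperature T; a larger T gives a smoother distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]                  # illustrative network outputs
hard = softmax_t(logits, T=1.0)
soft = softmax_t(logits, T=20.0)

print([round(p, 3) for p in hard])
print([round(p, 3) for p in soft])
```

The softened vector keeps the ranking of the categories while spreading probability mass across them, which is the extra classification information that the second network is trained on.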
As a representative of a clear decision tree, ID3 uses information gain to measure the classification ability of each feature. We are given a data set $X$, where each sample $x$ has $n$ features, denoted as $x = (x_1, \cdots, x_n)$, and each feature $x_j$ has $e$ values, namely, $S(x_j) = \{x_j^{1}, \cdots, x_j^{e}\}$. The sample $x$ belongs to the category $y$, where $y \in \{1, 2, \cdots, c\}$ and $c$ is the total number of categories. Information gain is defined as [24]

$$\mathrm{Gain}(X, x_j) = E(X) - \sum_{v=1}^{e} \frac{\left| X_{x_j^{v}} \right|}{|X|} \, E\!\left( X_{x_j^{v}} \right).$$

Here, $X_{x_j^{v}}$ represents the set of samples in the data set whose feature $x_j$ takes the value $x_j^{v}$, $|\cdot|$ represents the size of a sample set, that is, the number of samples, and $E(X)$ represents the information entropy, which is defined as follows:

$$E(X) = -\sum_{k=1}^{c} P_k \log_b P_k.$$

Here, $P_k$ represents the proportion of samples of category $k$ in the data set and $b$ is the base of the logarithm. In the field of information technology, $b$ is often set to 2. If $x_j$ is a continuous variable, the information gain can be calculated after discretizing the feature.
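Entropy and information gain are straightforward to compute for discrete features. The following sketch uses a tiny invented data set (teaching method and medium as features, an evaluation grade as the label); the data and function names are assumptions for illustration and do not come from the study.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(labels: Sequence[Hashable], base: float = 2.0) -> float:
    """Information entropy of a list of category labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n, base) for c in Counter(labels).values())


def information_gain(rows: List[Sequence[Hashable]], labels: Sequence[Hashable],
                     j: int, base: float = 2.0) -> float:
    """Entropy reduction obtained by splitting the data set on feature j."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[j], []).append(y)
    n = len(rows)
    remainder = sum(len(g) / n * entropy(g, base) for g in groups.values())
    return entropy(labels, base) - remainder


# Invented example: (teaching method, medium) -> evaluation grade
rows = [("interactive", "slides"), ("interactive", "board"),
        ("lecture", "slides"), ("lecture", "board")]
labels = ["good", "good", "poor", "poor"]

print(information_gain(rows, labels, 0))   # teaching method
print(information_gain(rows, labels, 1))   # medium
```

ID3 would split on the feature with the larger gain, here the teaching method, because it separates the two grades completely while the medium carries no information about them.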
A fuzzy decision tree is a model that applies fuzzy logic, which allows the decision tree to handle imprecise knowledge. We assume that there is a fuzzy data set $X$. Here, for the feature $x_j$, the fuzzy set for its value $x_j^{v}$ is expressed as [25]

$$x_j^{v} = \left\{ \left( x, \mu_{x_j^{v}}(x) \right) \mid x \in X \right\}.$$

Here, the degree of membership $\mu_{x_j^{v}}(x)$ indicates the degree to which the feature $x_j$ of the sample $x$ belongs to $x_j^{v}$. A clear data set must first be fuzzified by the membership function before it can be processed by the fuzzy decision tree.
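Fuzzifying a clear value means replacing it with membership degrees in several fuzzy sets. The sketch below assumes triangular membership functions and three invented fuzzy sets over a 0 to 100 score; the source does not specify which membership function is used, so the shapes and breakpoints are purely illustrative.

```python
from typing import Dict


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(score: float) -> Dict[str, float]:
    """Membership of a crisp score in three assumed fuzzy sets."""
    return {
        "low": triangular(score, 0.0, 40.0, 70.0),
        "medium": triangular(score, 40.0, 70.0, 90.0),
        "high": triangular(score, 70.0, 90.0, 110.0),
    }


print(fuzzify(80.0))
print(fuzzify(70.0))
```

A score between two peaks belongs partly to both neighbouring sets, which is what lets a fuzzy decision tree follow several branches at once with different membership degrees.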
As one of the representatives of fuzzy decision trees, FID3 also uses information gain to select features; here, the size of a fuzzy data set is determined by the fuzzy degree of the data. YST uses ambiguity to do feature selection, and each time, the feature with the smallest ambiguity is selected as a subnode. For attribute $x_j$, its ambiguity is equal to the weighted ambiguity of its attribute values:

$$G(x_j) = \sum_{v=1}^{e} p_{jv} \, g\!\left( x_j^{v} \right).$$

Here, $p_{jv}$ is the proportion of samples whose feature $x_j$ takes the value $x_j^{v}$ and $g(x_j^{v})$ represents the ambiguity of the feature value $x_j^{v}$. The ambiguity $g(x_j^{v})$ is computed from the probability of classifying a sample into category $k$ when $x_j$ is $x_j^{v}$: $p_{jv}^{k}$ represents the proportion of samples belonging to category $k$ in that sample set, $\pi_{jv}^{c+1} = 0$, and when $s > w$, $\pi_{jv}^{s} > \pi_{jv}^{w}$.

For decision trees, there are currently two main black box fuzzy methods. One is to treat the target classifier as an oracle $O$ and to ask it for the classification label of a sample. The oracle accepts an input and returns the class label of the sample, which is the index with the largest classification probability:

$$O(x) = \arg\max_{j} O_j(x).$$

Here, $O(x)$ represents the classification label of the sample $x$ and $O_j$ refers to the $j$-th value in the probability vector output by the classifier $O$, that is, the probability of the sample being classified into category $j$. Based on the available information, and using collected sample sets that are sufficient to represent the task completed by the target classifier, an alternative model can be learned with the data generation method described in Figure 4 so that it gradually approaches the decision boundary of the target model $O$. Then, the fuzzy person can use the information of the learned alternative model to formulate a fuzzy strategy and then fuzz the target model. The replacement model and the target model can be either homogeneous or nonhomogeneous models.
At present, fuzzy methods usually use the gradient information of the model to modify the sample. However, the decision function of a decision tree (DT) is not differentiable, so no gradient information is available to guide the fuzzy operation. Therefore, this paper proposes a sensitivity-based measurement standard to solve this problem; that is, by calculating the influence of sample feature changes on successful fuzzing, the most important feature subset is iteratively selected for fuzzing. $M_f^{+}(x)$ and $M_f^{-}(x)$ represent the effect of increasing and decreasing the feature $x_f$ by $\Delta x_f$, respectively, and are defined as in [26]. Here, $C_t(x)$ is the confidence, output by the model, that the sample $x$ belongs to the $t$-th category, and $\Delta x_f$ is a perturbation of size $\varepsilon$ applied to the feature $x_f$.

Mobile Information Systems
The size of $\varepsilon$ is determined according to the value range of the sample feature and is generally set to a relatively small value. The algorithm starts from the original sample $x$ and then iteratively modifies the fuzzy sample $x'$ until a fuzzy sample is found that is misclassified by the decision tree. In each iteration, the feature with the largest $M_f^{+}(x)$ or $M_f^{-}(x)$ receives a disturbance of $\varepsilon$ or $-\varepsilon$, respectively. This process is repeated until the sample is misclassified, the change to the sample exceeds the maximum perturbation limit, or the number of iterations exceeds the maximum limit. The output of the ID3 decision tree is the category label, and no classification confidence is provided. Therefore, this paper adopts a method based on the path depth to calculate $C_t$. $H$ is the set of paths from the root node to the leaf nodes in the decision tree. The function $L(h)$ calculates the number of nodes in path $h$, and $D(h, x)$ calculates the number of conditions in path $h$ that the sample $x$ satisfies.
Therefore, reducing $D(h^{*}, x)$ can lower the probability that ID3 classifies the sample into its true category, thereby achieving the goal of successful fuzziness. With the help of these two pieces of information, we get the definition of $C_t$ for ID3 as given in [27]. Here, $h \in H$, and $H_t$ represents the set of paths in $H$ that assign a sample to its real category $t$. Because there may be more than one path in ID3 that outputs category $t$, the maximum value over these paths is selected. The goal of ambiguity is to weaken the relationship between the sample $x$ and its category $t$. The smaller $C_t^{\mathrm{ID3}}(x)$ is, the greater the chance of classifying the sample $x$ into other classes.
In FID3, for a path $h$, the membership degrees of the branches are multiplied to obtain $m(h, x)$, which is then multiplied by the category probability vector of the leaf node to obtain the category probability result of path $h$ for sample $x$. Finally, the classification results of all paths for $x$ are added to obtain the classification result of FID3 for sample $x$, and the classification label of FID3 for $x$ is the category with the highest probability in this result. Therefore, the definition of $C_t$ is obtained:

$$C_t^{\mathrm{FID3}}(x) = \sum_{h \in H} m(h, x) \, l_t(h, x).$$

Here, $l_t(h, x)$ represents the probability, stored in the leaf node of path $h$, of classifying the sample into category $t$, and $m(h, x)$ represents the degree to which the sample $x$ belongs to the path $h$. If modifying a feature can reduce $C_t(x)$ within the cost limit, the fuzzy objective can be achieved. From the above analysis, the classification confidence $C$ of YST can be obtained as

$$C_t^{\mathrm{YST}}(x) = \max_{h \in H_t} m(h, x).$$

That is, YST takes the path with the largest degree of membership as the classification confidence.
From a theoretical analysis, the time complexity of the proposed fuzzy algorithm is $O(i_{\max} \times n \times q)$, where $n$ represents the number of features, $q$ represents the time needed to ask the target to classify a sample, and $i_{\max}$ is the maximum number of iterations. The size of $n$ depends on the application scenario and $i_{\max}$ is related to the fuzzy cost, so neither factor depends on the DT. The only difference in the time complexity of the proposed algorithm when fuzzing a clear decision tree (CDT) and a fuzzy decision tree (FDT) is $q$. The higher the complexity of the decision tree, the greater $q$ is. Generally speaking, since the decision-making process of a CDT is usually relatively simple, the $q$ of a CDT is less than that of an FDT.
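The iterative, sensitivity-guided procedure and the path-based confidence for ID3 can be sketched as follows. Two assumptions are made because the exact formulas are not reproduced here: the confidence is taken as the largest ratio $D(h, x) / L(h)$ over paths ending in category $t$ (with $L(h)$ counted as the number of conditions on the path), and the sensitivity of a step is taken as the resulting drop in that confidence. The tree, thresholds, and all function names are invented for illustration.

```python
from typing import List, Optional, Tuple

Condition = Tuple[int, str, float]   # (feature index, "<=" or ">", threshold)
Path = Tuple[List[Condition], int]   # (conditions from root to leaf, leaf label)


def satisfied(cond: Condition, x: List[float]) -> bool:
    f, op, thr = cond
    return x[f] <= thr if op == "<=" else x[f] > thr


def predict(paths: List[Path], x: List[float]) -> int:
    """Label of the unique root-to-leaf path whose conditions all hold."""
    for conds, label in paths:
        if all(satisfied(c, x) for c in conds):
            return label
    raise ValueError("no path matches the sample")


def confidence(paths: List[Path], x: List[float], t: int) -> float:
    # ASSUMPTION: C_t(x) = max over paths ending in t of D(h, x) / L(h)
    ratios = [sum(satisfied(c, x) for c in conds) / len(conds)
              for conds, label in paths if label == t and conds]
    return max(ratios, default=0.0)


def perturb(paths: List[Path], x: List[float], t: int, eps: float,
            max_iter: int, budget: float) -> Optional[List[float]]:
    """Step one feature by +/- eps per iteration until the label changes."""
    x_adv = list(x)
    for _ in range(max_iter):
        if predict(paths, x_adv) != t:
            return x_adv
        base = confidence(paths, x_adv, t)
        best_gain, best_x = None, None
        for f in range(len(x_adv)):
            for step in (eps, -eps):
                cand = list(x_adv)
                cand[f] += step
                # ASSUMPTION: sensitivity = drop in confidence for category t
                gain = base - confidence(paths, cand, t)
                if best_gain is None or gain > best_gain:
                    best_gain, best_x = gain, cand
        if max(abs(a - b) for a, b in zip(best_x, x)) > budget:
            return None                  # maximum perturbation limit exceeded
        x_adv = best_x
    return x_adv if predict(paths, x_adv) != t else None


# Invented tree: x0 <= 0.5 then (x1 <= 0.5 -> class 0, else class 1); x0 > 0.5 -> class 1
paths = [([(0, "<=", 0.5), (1, "<=", 0.5)], 0),
         ([(0, "<=", 0.5), (1, ">", 0.5)], 1),
         ([(0, ">", 0.5)], 1)]
result = perturb(paths, [0.4, 0.45], t=0, eps=0.1, max_iter=10, budget=0.3)
print(result)
```

Each iteration queries the tree once per candidate step, that is, twice per feature, which is where a cost proportional to the number of features and the number of iterations comes from.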
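The FID3 confidence described above is a sum over all paths of the path membership times the leaf probability for the category of interest. A minimal sketch follows; the membership degrees and leaf probabilities are invented numbers, and the function name `fid3_confidence` is hypothetical.

```python
from typing import List, Tuple

# For one sample x, each path is (membership degree of each branch on the
# path, category probability vector stored in the leaf).
FuzzyPath = Tuple[List[float], List[float]]


def fid3_confidence(paths: List[FuzzyPath], t: int) -> float:
    """Sum over paths of m(h, x) * l_t(h, x) for category t."""
    total = 0.0
    for memberships, leaf_probs in paths:
        m = 1.0
        for mu in memberships:           # m(h, x): product of branch memberships
            m *= mu
        total += m * leaf_probs[t]
    return total


# Invented example with three paths and two categories
fuzzy_paths = [([0.5, 0.8], [0.9, 0.1]),
               ([0.5, 0.2], [0.2, 0.8]),
               ([0.5], [0.1, 0.9])]

scores = [fid3_confidence(fuzzy_paths, t) for t in (0, 1)]
print(scores, scores.index(max(scores)))
```

Unlike the clear tree, every path contributes to the result, so changing one feature can shift confidence gradually rather than flipping a single condition.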

Teaching Mode Evaluation System Based on Decision Tree
The teaching mode evaluation system mainly serves the demands of three types of users. The demand of the system administrator is to manage the system. The demand of teachers is to log in to view their own teaching evaluation results after the teaching evaluation is over and to adjust their teaching methods in time based on the evaluation results to improve teaching quality. The demand of students is to log in to the system to evaluate their own teachers after the teaching evaluation activity has started.
According to the demand analysis of the system, the use case diagram of the system demand is shown in Figure 5. The flowchart of the teaching evaluation system is shown in Figure 6. The flowchart is mainly composed of the following parts:
(1) Data collection and processing: basic teacher information, student evaluation information, and peer information are collected and processed in the database in a unified format.
(2) Teaching evaluation data mining system: the acquired data are collected into the database, and then, according to the results obtained after data mining, management decision-makers are provided with the latest and most valuable information or knowledge to help them make decisions quickly and correctly.
(3) Data mining: the task or purpose of mining is determined according to the characteristics of the problem raised by the decision-makers, the relevant data in the database are streamlined and preprocessed, and new and effective knowledge is then mined from the streamlined data. The data mining teaching evaluation system provides effective knowledge for decision-makers.
(4) Database: this mainly stores various data related to teacher information. The system stores the basic information of each teacher and various information about the teacher's teaching activities in the database.
(5) Knowledge base: this includes useful information obtained after data mining, that is, the rules extracted from it, which are used for decision-making by management decision-makers.
The functions completed by the system are as follows: the user authority setting ensures the security of the system; the database query function allows users to directly modify the data in the database without opening the database; the simplicity and speed of the decision tree algorithm is the biggest advantage of the system; and there are convenient database backup and recovery functions, as shown in Figure 7.

System Evaluation
This article uses a SQL Server 2020 database when designing the teaching evaluation system. In Visual Basic (VB), the mechanism of the program differs from that of traditional languages: there is no main program in the traditional sense, and program execution is driven by "events" that trigger subprograms (or procedures). For example, clicking a command button with the mouse generates the "click event" of the button, and a program is executed in response. In view of these characteristics of VB, this system chooses VB6.0 as its development tool.
On the basis of the above analysis, the system performance is verified. This article combines actual needs to verify the performance of the college classroom teaching system based on decision tree analysis and takes a university as an example to conduct experimental teaching.
Through this system, the teaching methods and student learning methods are analyzed and are quantified by manual scoring during system evaluation. After the system has been used in colleges and universities for a period of time, students and teachers evaluate the system. First of all, this paper verifies the effect of the system on teaching evaluation, and the results are shown in Table 1 and Figure 8.
From the above analysis, we can see that the analysis system of college classroom teaching model based on decision tree analysis constructed in this paper has a good performance in teaching evaluation. On this basis, this paper evaluates the effect of teaching strategy formulation of the system, and the results are shown in Table 2 and Figure 8.
It can be seen from Table 1 and Figure 9 that, compared with the literature [15], the research results of this article have advantages in teaching effect evaluation. Most of the scores obtained in this article are above 85 points, so the teaching effect can be called excellent.
Through the above experimental teaching and evaluation analysis, we can see that the effectiveness evaluation system of the college classroom teaching model based on decision tree analysis constructed in this paper has certain practical effects.
We compare the decision tree model in this paper with the neural network and K-means algorithms in problem mining for the high-efficiency teaching mode and compare the statistical results of problem mining. The results are shown in Table 3 and Figure 10.
From Table 2 and Figure 10, it can be seen that the evaluation results of teaching strategy formulation in this article are more reasonable than those in the literature [10] and that the formulation is targeted to teaching methods. Therefore, it can be considered that the system model of this article has certain advantages as a reference for the formulation of teaching strategies.

Conclusion
Teaching quality is the basis for the survival of distance education universities, and teaching evaluation is an effective means to comprehensively improve teaching quality, enhance teachers' teaching level, and regulate teachers' teaching behavior. The application of data mining to teaching evaluation makes this possible. By using data mining methods to analyze teaching information and evaluation data, we can extract potentially useful knowledge hidden in the data, help decision-makers find the rules, and explore the various factors that affect teachers' teaching effects, thereby improving teaching management, optimizing resource allocation, and improving teaching quality. This paper mainly uses the decision tree algorithm and the data mining technology of association rules to construct the effectiveness evaluation system of the college classroom teaching model based on decision tree analysis. After the system is constructed, its effectiveness is verified through experimental teaching.

Data Availability
The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest
The author declares no conflicts of interest.