Allocation of English Distance Teaching Resources based on Deep Reinforcement Learning and Multi-Objective Optimization

This work applies deep reinforcement learning and multi-objective optimization algorithms to the allocation of English distance teaching resources in order to increase allocation efficiency. Moreover, based on an analysis of current regression correction methods, this paper discusses the partition regression correction algorithm in depth and proposes two different neighborhood regression correction algorithms. The concept of a neighborhood extends the original concept of a partition and solves various problems in partition correction. To reduce the model complexity of the neighborhood regression algorithm, this paper proposes structural risk minimization and principal component extraction. The simulation results suggest that the proposed English distance teaching resource allocation approach, which is based on deep reinforcement learning and multi-objective optimization, can significantly improve the effect of English distance teaching resource allocation.


Introduction
Education assessment, fundamental education theory, and education development are recognized as the three most important subjects of study in education today. Among these, education assessment plays a crucial role in education growth and reform, as well as in education administration and decision-making, so it has received a great deal of attention from the relevant authorities of several countries. Educational assessment is also known as teaching evaluation or educational evaluation. It refers to the process of determining the value of education by seeking, sorting, processing, and evaluating educational material methodically, scientifically, and exhaustively according to particular value standards and educational goals.
Linked Open Data (LOD) based on Semantic Web and ontology technology has become one of the most important ways to publish high-quality linked semantic data, and it is widely used in intelligent services such as semantic search and personalized recommendation. Linked data connects resource objects described by RDF in the form of URLs, so that unstructured documents are marked up as structured data with semantics that both machines and users can understand and work with. People can obtain digital resources directly through the HTTP/URI mechanism. The resource objects released by linked data technology are sharable, reusable, structured, and standardized, which is conducive to integrating isolated teaching data, establishing links between course resources in the same and in different fields, and realizing cross-platform and cross-system communication.
Through ontology reasoning and semantic expansion, the semantics of query requests in a distributed environment are clarified, and the knowledge resources needed by users are retrieved from linked data using a semantic index structure, so that more linked data information can be discovered. The rich and extensive basic research in semantic science lays a solid foundation for the advanced analysis of semantic structure in various sciences and enables the Semantic Web to provide feasible ideas for managing all kinds of knowledge data. In studying the semantic data of the World Wide Web, researchers at home and abroad have covered disciplines such as biological science, information science, philosophy, geographic information, and art. Differences between fields, levels, regions, and ways of thinking lead to the emergence of large-scale heterogeneous structured semantic data. How to perform efficient data integration, data cleaning and expansion, data storage and indexing, as well as data query, search, browsing, and visualization for complex and heterogeneous knowledge data is the focus of knowledge management research on the Semantic Web.
In this paper, deep reinforcement learning and multi-objective optimization algorithms are applied to the allocation of English distance teaching resources, and an English distance teaching resource management system is constructed to improve the effect of English distance teaching.

Related Work
The core of a file management system is file storage. Researchers such as Patterson summarized several typical file storage systems, including direct storage, object storage, and disk arrays, compared the advantages and disadvantages of each storage method, and discussed the prospects of cloud storage [1]. The widespread use of file management systems has resulted in the accumulation of a large number of resources and materials on the Internet, and the rapidly increasing number of files places higher requirements on storage technology. How to store massive data has become a hot topic [2]. Distributed storage technology with high reliability and scalability emerged and spread rapidly, pushing the development of file management technology to a new level [3].
There are two very popular distributed file system frameworks, namely HDFS (Hadoop Distributed File System) and FastDFS (Fast Distributed File System) [3]. HDFS can meet the system design requirements of high throughput and large amounts of data. In comparison, FastDFS is a lightweight distributed file system [4]. These two distributed storage systems each have their own merits and should be chosen according to the specific application scenario. In recent years, cloud computing has become a research hotspot in the computer field [5]. As a new concept derived from cloud computing, cloud storage is also favored by researchers [6]. Cloud storage combines virtualization and distributed storage, so it can organically integrate a large number of storage devices and provide file storage services to the outside world [7].
Data can be classified as structured or unstructured according to how it is stored. Database systems perform admirably in storing and retrieving structured data. Unstructured data, on the other hand, cannot be stored, administered, and retrieved directly through a database system. In response to this need, many full-text retrieval methods have been developed. One of the key research paths in network retrieval is full-text retrieval technology [8]. Some mature technical frameworks have also emerged, and the most widely used is Lucene, a sub-project of the Jakarta project group of the Apache Software Foundation. It is an open source full-text search engine toolkit. The purpose of Lucene is to provide software developers with a simple and easy-to-use toolkit for implementing full-text search functions in target systems, or for building a complete full-text search engine on top of it. The online teaching system based on WebRTC and Node.js, starting from the fundamental mission of teaching, cultivates all-round talents [9]. Therefore, the original intention of this online teaching system is to cultivate students' learning ability, and the Internet is used as a means to guide students to recognize and understand the world, to learn in an all-round way, to stimulate students' initiative and interest in learning, and to encourage students to learn through exploration. This online teaching system not only helps teachers impart knowledge to students but, more importantly, improves students' communication and problem-solving skills [10]. Among them, WebRTC integrates the best audio/video engine. The ultimate purpose of the WebRTC project is to allow web developers to easily develop real-time multimedia applications in browsers. Web developers only need to write simple JavaScript code to implement the media stream processing [11].
Compared with other teaching systems, WebRTC has good real-time performance in audio and video, so it can better improve the communication experience between teachers and students during class.
Thanks to NPM, the powerful package manager of Node.js, building a web program takes less time than with Java, so Node.js development is more efficient. Node.js also runs on an event loop mechanism and, compared with Java, handles high concurrency and intensive I/O well, which makes it more suitable for dealing with the excessive server overhead caused by the rapid growth of access volume in current teaching systems [12].
At present, there are usually two ways to transmit multimedia information such as video and audio over the network: downloading and streaming. With the download method, the user inevitably faces two problems: the limited storage capacity of the client and playback delay. Streaming, which plays the media while it is being downloaded, overcomes the shortcomings of downloading first and then playing, saves time and storage space, and makes it possible to learn online through audio and video [13]. Streaming media technology is now more and more widely used and has become one of the most important technologies for network transmission of audio and video data, especially in network education systems that use courseware as the main teaching resource. Streaming media technology overcomes the problem of transmitting massive audio and video data over medium and low bandwidth, so it is increasingly used not only in online education but even in traditional education. In light of these benefits, a digital interactive learning platform based on network and streaming media technology, exploiting the flexibility, openness, breadth, and timeliness of the network, may allow students and instructors to engage with one another. There are different means of interactive learning over the network, including not just classic techniques such as e-mail and forums but also vivid real-time learning exchanges through audio and video [14], and not only streaming media courseware but also online tests that assess learning effects. This not only overcomes the limitations of time and space but also mobilizes students' interest and enthusiasm for learning, making personalized learning possible.
Online education enables students to choose the teaching resources they need most according to their own needs, combined with their own interests and existing knowledge structure, and to learn independently without being bound by time and space [15].

Image Color Correction based on Deep Reinforcement Learning
A new three-dimensional interpolation algorithm based on fuzzy logic is proposed, and a new interpolation range and new interpolation rules are defined. On the basis of information theory, the literature generalizes the tetrahedral interpolation algorithm and proposes a linear interpolation algorithm based on the maximization of probability entropy for color correction. The method is novel in design and achieves better results than the traditional tetrahedral interpolation algorithm. Furthermore, starting with the neighborhood of corrected color points, this part determines the k-nearest neighbor fuzzy entropy on the sample set. An interpolation approach that maximizes the fuzzy entropy estimate is provided based on an investigation of the physical properties of the device's color gamut. The interpolation algorithm uses multiple sample points and proposes constraints based on fuzzy logic. It does not need to locate the corresponding geometry, which overcomes the shortcomings of the tetrahedral interpolation algorithm.
We assume that the color points participating in the 3D interpolation are $(x_i, f(x_i))$, $i \in \{1, \ldots, k\}$, where $x_i$ is a color point in the source color space and $f(x_i)$ is the color point to which $x_i$ is mapped in the destination color space. Now, we assume that the color point of the source color space to be corrected is $p = (x_{p1}, x_{p2}, x_{p3})$; then the generalized process of three-dimensional interpolation is: (1) it finds weights $\mu_i$ that satisfy formula (1); (2) it substitutes $\mu_i$ into formula (2) to obtain the mapping estimate. Usually, the tetrahedra in the tetrahedral interpolation algorithm are obtained by dividing cubes. Simulation experiments in the literature show that the accuracy of tetrahedral interpolation and cube interpolation is not much different, but tetrahedral interpolation has advantages in execution cost and calculation speed compared with cube interpolation.
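The two steps above can be illustrated with a minimal Python sketch for the tetrahedral case ($k = 4$). It assumes that formula (1) requires the weights to reproduce the corrected point, $\sum_i \mu_i x_i = p$, together with $\sum_i \mu_i = 1$, and that formula (2) is the weighted sum $\sum_i \mu_i f(x_i)$. The function names and the sample data are hypothetical and are not taken from the paper.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("sample points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def interpolation_weights(vertices, p):
    """Step (1): weights mu_i with sum(mu_i * x_i) = p and sum(mu_i) = 1."""
    # Rows 0..2: one equation per color coordinate; row 3: weights sum to one.
    a = [[v[d] for v in vertices] for d in range(3)] + [[1.0] * 4]
    return solve_linear(a, list(p) + [1.0])


def interpolate(vertices, mapped, p):
    """Step (2): estimate f(p) as the weighted sum of the mapped colors f(x_i)."""
    mu = interpolation_weights(vertices, p)
    dims = len(mapped[0])
    return [sum(w * f[d] for w, f in zip(mu, mapped)) for d in range(dims)]


# Hypothetical data: a unit tetrahedron and an affine mapping f(x) = 2x + 1.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mapped = [tuple(2.0 * c + 1.0 for c in v) for v in vertices]
weights = interpolation_weights(vertices, (0.2, 0.3, 0.1))
estimate = interpolate(vertices, mapped, (0.2, 0.3, 0.1))
```

Because the weights sum to one, an affine mapping such as the one in this example is reproduced exactly; a real device mapping is nonlinear, so the estimate is only an approximation inside each tetrahedron.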
In information theory, entropy describes the average amount of information in an observation space, and choosing the distribution with the largest entropy among those consistent with the known constraints is called the principle of maximum entropy. We assume a discrete random variable $x$ with a probability distribution; the principle of maximum entropy is then expressed in terms of this distribution. The algorithm based on maximum probability entropy estimation mentioned above casts the weights $\mu_i$ in formula (1) in the form of a probability entropy, as shown in formula (3). Then the values $\mu_i$ that satisfy the constraints of formula (1) are obtained, and finally the estimate is computed. It is easy to see that formula (1) already contains the constraints in formula (3).
There are many definitions of fuzzy entropy; nearly 20 kinds are introduced in the literature alone.
The fuzzy entropy $H(\cdot)$ of a fuzzy set $A$ consisting of $x$ should satisfy five conditions. In these conditions, $A^*$ denotes the sharpening set of $A$, which satisfies two further conditions. Although many fuzzy entropies have been defined, not all of them satisfy the five conditions. The fuzzy entropy proposed in formula (4) satisfies all five.
Among them, C is a constant, which can usually be set to 1.
So far, the k-nearest neighbor fuzzy entropy has been defined, and formulas (1) and (4) can be combined to obtain a maximum fuzzy entropy algorithm, which is denoted as LIMFE-1 and is the initial stage of the LIIMFE algorithm.
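As a hedged illustration of a fuzzy entropy with a constant factor $C$, the sketch below uses the De Luca-Termini form $H(A) = -C \sum_i [\mu_i \ln \mu_i + (1 - \mu_i) \ln(1 - \mu_i)]$. This form is an assumption made for the example; it is one common definition, and the paper's formula (4) may differ. The function name and the membership values are hypothetical.

```python
import math


def fuzzy_entropy(memberships, c=1.0):
    """De Luca-Termini fuzzy entropy of membership degrees in [0, 1]."""
    total = 0.0
    for mu in memberships:
        if not 0.0 <= mu <= 1.0:
            raise ValueError("membership degrees must lie in [0, 1]")
        for q in (mu, 1.0 - mu):
            if q > 0.0:  # the limit of q * ln(q) as q -> 0 is 0
                total -= q * math.log(q)
    return c * total


crisp = fuzzy_entropy([0.0, 1.0, 1.0])      # a crisp set carries no fuzziness
fuzziest = fuzzy_entropy([0.5, 0.5, 0.5])   # memberships of 0.5 maximize the entropy
sharpened = fuzzy_entropy([0.2, 0.9, 0.1])  # memberships moved towards 0 or 1
```

Under this definition a crisp set has zero entropy, the entropy peaks when every membership equals 0.5, and sharpening the memberships towards 0 or 1 lowers it, which matches the role the sharpening set plays in the conditions above.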
After considering the interpolation coefficient $\mu_i$ as a degree of membership, the value of $\mu_i$ should be between 0 and 1, and it represents the similarity between the interpolation sample point and the corrected color point. The constraints of formula (3) require the sum of the $\mu_i$ to equal 1. This constraint can be interpreted as the corrected color point lying in the convex hull formed by its k-nearest-neighbor sample set. However, the color gamut surface is usually not convex; for example, the color gamut surface of the English teaching resource database is uneven. Figure 1 shows the mesh diagram of the color gamut of an English teaching resource database in Lab space.
We assume that a corrected color point lies on a concave part of the color gamut surface; the k-nearest-neighbor set formed by the interpolation candidate points is then clearly concave, which conflicts with the above assumption about the convex hull.
In summary, the restriction that the weights sum to 1 in formula (1) can be removed, that is, the constraints in formula (3) can be abandoned, and the new constraints are given in formula (5). Following the centroid defuzzification method in fuzzy logic, the estimate of formula (2) is changed to the form of formula (6). Here, λ 1. So far, the improved linear interpolation algorithm based on maximizing fuzzy entropy (LIIMFE) can be reduced to maximizing formula (4) subject to the constraints of formula (5). The final estimated form of LIIMFE is shown in formula (6).
For the selection of $C$ in formula (4), $C = 1$ can be selected in general. To enhance the final correction effect, structural risk minimization from statistical learning can be used. Structural risk minimization aims to minimize a risk functional that accounts for both the empirical risk and the confidence bound. The loss function in the least squares method is regarded as the empirical risk of regression, and the loss function in the maximum likelihood method is regarded as the empirical risk of density estimation. Here, formula (7) can be considered the empirical risk of the maximum fuzzy entropy algorithm.
The SRM model selection criterion in statistical learning shows that the estimation error is composed of the empirical risk and a penalty factor; the product form used in model selection is shown in formula (8). Therefore, the constant $C$ can be considered as this penalty factor.
Among them, $q = h/k$, $h$ is the model complexity (the VC dimension in statistical learning theory), and $k$ is the number of samples. The literature gives an estimation formula for the model complexity of k-nearest neighbor regression. The error is computed from the Lab value of the correction point and the value estimated by each algorithm, as shown in formula (9).
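The error between a measured Lab value and an estimated one can be sketched as follows. The sketch assumes that formula (9) is the Euclidean color difference in Lab space (the CIE76 $\Delta E^*_{ab}$), which is a common choice but is not confirmed by the text; the function names and sample values are hypothetical.

```python
import math


def delta_e_ab(lab_measured, lab_estimated):
    """Euclidean distance between two (L, a, b) triples (CIE76 color difference)."""
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(lab_measured, lab_estimated)))


def mean_delta_e(measured, estimated):
    """Average color difference over a set of correction points."""
    errors = [delta_e_ab(m, e) for m, e in zip(measured, estimated)]
    return sum(errors) / len(errors)


# Hypothetical measured values and algorithm estimates for two correction points.
measured = [(50.0, 10.0, -10.0), (70.0, 0.0, 5.0)]
estimated = [(53.0, 14.0, -10.0), (70.0, 0.0, 5.0)]
point_errors = [delta_e_ab(m, e) for m, e in zip(measured, estimated)]
average_error = mean_delta_e(measured, estimated)
```

Averaging the per-point differences gives one number per algorithm, which is a convenient way to compare correction methods on the same test set.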
When multiple regression is applied directly to the neighborhood correction, the result is the same as the partition regression correction with tiny partitions discussed above, and the accuracy is worse rather than better. This is because the components of the mapped data are highly correlated. Taking the Lab sample set data from a color correction of an English teaching resource database as an example, the correlation coefficient between the first component and the fourth component in formula (5) is 0.9996, and the correlation coefficient between the second component and the seventh component is 0.9945. When the variables in $X$ are highly correlated, the determinant $|X^T X|$ is close to zero, and the inverse of $X^T X$ contains serious rounding errors. The calculation of formula (6) is then unreliable, which is part of the reason why the correction accuracy of partition regression correction is poor when the partition is small. In addition, when the partition is small, the number of samples is relatively small, and a more complex regression model will overfit, which affects the prediction accuracy of the model on the test set.
To get a better approximation, we define a loss between the ideal response $y$ for a given input $X$ and the response $f(X, \beta)$ given by the learning machine, as shown in formula (10).
Considering the mathematical expectation of the loss, formula (10) is called the risk functional, where Loss(·) is the loss function and $F(X, y)$ is the joint probability distribution function. The goal of learning is to find a suitable function $f(X, \beta)$ that minimizes the risk functional. Specifically, for the least squares method in multiple regression, the squared error is used as the loss function to minimize the risk functional. Minimizing the risk functional in this way is usually called the empirical risk minimization (ERM) principle, and the empirical risk of multiple regression is given in formula (11). The algorithm in this section further adopts the principle of structural risk minimization in the neighborhood to obtain the regression coefficients, as given in formula (12). Among them, $R_{emp}(\beta)$ is the empirical risk, $\phi(h/k)$ is the confidence range, $h$ is the VC dimension, and $k$ is the number of samples. Combined with formula (11), it can be found in formula (12) that the confidence range $\phi(h/k)$ constrains the empirical risk of the least squares method, making $|X^T X|$ nonzero. Thus, the problem of correlated data components in smaller partitions or neighborhoods is avoided, and the complexity of the regression model is limited.
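The effect of correlated components on $|X^T X|$, and the way a penalty term keeps the system solvable, can be seen in a small two-predictor sketch. As an assumption made only for illustration, the confidence term is replaced here by a ridge-style penalty $\lambda \|\beta\|^2$, which turns the normal equations into $(X^T X + \lambda I)\beta = X^T y$; the paper's exact form of $\phi(h/k)$ is not reproduced. All names and data are hypothetical.

```python
def gram_entries(x1, x2):
    """Entries of X^T X for a two-column design matrix X = [x1 x2]."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    return s11, s12, s22


def determinant(x1, x2, lam=0.0):
    """det(X^T X + lam * I); lam = 0 gives the plain least squares case."""
    s11, s12, s22 = gram_entries(x1, x2)
    return (s11 + lam) * (s22 + lam) - s12 * s12


def ridge_coefficients(x1, x2, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y by Cramer's rule."""
    s11, s12, s22 = gram_entries(x1, x2)
    r1 = sum(a * t for a, t in zip(x1, y))
    r2 = sum(b * t for b, t in zip(x2, y))
    det = determinant(x1, x2, lam)
    beta1 = (r1 * (s22 + lam) - s12 * r2) / det
    beta2 = ((s11 + lam) * r2 - s12 * r1) / det
    return beta1, beta2


# Hypothetical, almost collinear predictors and a response y = x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.001, 1.999, 3.001, 3.999]
y = [a + b for a, b in zip(x1, x2)]

det_plain = determinant(x1, x2)            # nearly zero: ill-conditioned
det_shrunk = determinant(x1, x2, lam=1.0)  # clearly nonzero after shrinkage
beta1, beta2 = ridge_coefficients(x1, x2, y, lam=1.0)
```

With the penalty, the two coefficients come out almost equal and slightly shrunk, instead of being dominated by rounding error in the inverse of a nearly singular matrix.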
Another representation of least squares in multiple regression is as follows: the overdetermined system $X\beta = y$ is considered, and its solution $\beta_{LS}$ satisfies $\|y - X\beta_{LS}\| = \min_\beta \|y - X\beta\|$. That is to say, the least squares method assumes that there is an error $\Delta y$ in $y$, so that $X\beta = y + \Delta y$, and the solution minimizes the sum of squared errors of $\Delta y$, that is, $\beta_{LS} = \arg\min_\beta \|\Delta y\|_2^2$, s.t. $X\beta = y + \Delta y$. (13) The least squares method only considers the inaccuracy of $y$; however, $X$ also inevitably contains noise in practical problems. Specifically, in color calibration, both the spectrophotometer measurement and the printing paper can introduce noise.
The input $X$ is often not accurate, that is, $(X + \Delta X)\beta = y + \Delta y$. To solve this problem reasonably, the total least squares method is proposed. The solution $\beta_{TLS}$ of the total least squares method is $\beta_{TLS} = \arg\min_\beta \|[\Delta X \mid \Delta y]\|_2^2$, s.t. $(X + \Delta X)\beta = y + \Delta y$. (14)
Using the Lagrange multiplier method, the optimization problem in formula (14) can be transformed into formula (15). From formula (15), the empirical risk of the total least squares method is given by formula (16). Figure 2 shows the difference between least squares and total least squares in the one-dimensional case. It can be seen that, compared with the least squares solution, the total least squares solution has a shorter projection distance: its residual is perpendicular to the regression line and is composed of the errors of both $X$ and $y$. Therefore, measured by the perpendicular distance to the regression line, the regression error of the total least squares method is no larger than that of the least squares method.
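The one-dimensional comparison can be sketched numerically for a line through the origin, $y = \beta x$. For this special case the total least squares slope has a closed form: it minimizes $\sum_i (y_i - \beta x_i)^2 / (1 + \beta^2)$, the sum of squared perpendicular distances. The restriction to a line through the origin, the function names, and the data are assumptions made for illustration.

```python
import math


def ls_slope(xs, ys):
    """Ordinary least squares slope of y = beta * x (errors in y only)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def tls_slope(xs, ys):
    """Total least squares slope of y = beta * x (errors in both x and y)."""
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    if sxy == 0.0:
        raise ValueError("slope is undetermined when sum(x * y) is zero")
    d = syy - sxx
    # Root of sxy * b^2 - d * b - sxy = 0 that has the same sign as sxy.
    return (d + math.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy)


def vertical_rss(xs, ys, beta):
    """Sum of squared vertical residuals, the least squares criterion."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys))


def orthogonal_rss(xs, ys, beta):
    """Sum of squared perpendicular distances to the line y = beta * x."""
    return vertical_rss(xs, ys, beta) / (1.0 + beta * beta)


# Hypothetical noisy measurements of a relationship close to y = x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.2, 1.9, 3.2, 3.8]
b_ls = ls_slope(xs, ys)
b_tls = tls_slope(xs, ys)
```

Each estimator is best under its own criterion: the total least squares slope gives the smaller perpendicular residual, while the least squares slope gives the smaller vertical residual.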
Following the explanation above, it is recommended that, in color correction, the residual of the total least squares method be used as the empirical risk in structural risk minimization. On the one hand, the structural risk minimization principle decreases the true risk by removing the correlation between the data components in the neighborhood. On the other hand, the total least squares method accounts for errors in both the input and the output data, making the correction more accurate. That is, the algorithm in this section can be described by formula (17). We choose a penalty term with shrinking variables as the confidence term in structural risk minimization.
Then, formula (17) becomes formula (19). The output data $y$ of formula (19) (the target color space data in color correction) is one-dimensional, but the target color space data is often multi-dimensional. In actual calibration, a simple method is to use the algorithm to obtain the regression correction coefficients for each dimension of the target color space separately. The concept of feature space originates from classification algorithms. To generalize a linear classification algorithm to a nonlinear one, the points in the Euclidean space $R^N$ can be mapped into a high-dimensional Hilbert space $H$ (the feature space), that is, a complete normed linear space with a defined inner product, and the mapping is $\Phi(x)$.
For an algorithm in the space $R^N$, the calculation can be carried out with the mapped sample set in the new Hilbert space, as shown in formula (21).
It is generally difficult to estimate the specific form of $\Phi(x)$, so a kernel function is introduced for this purpose. The kernel function transforms the nonlinear problem of the original space into a linear operation that involves only inner products in the feature space. The process can be written as $k(x, x') = \langle \Phi(x), \Phi(x') \rangle$. (22) The trick of the kernel function is that it does not need to know the specific form of $\Phi(x)$; it calculates the inner product of $\Phi(x)$ and $\Phi(x')$ directly from $x$ and $x'$. Some commonly used kernel functions (where $d$, $\theta$, $c$ are all real constants) include the homogeneous polynomial kernel function $k(x, x') = \langle x, x' \rangle^d$. Kernel functions have a simple property: a linear combination of kernel functions with non-negative coefficients is still a kernel function.
In the process of color correction, all color spaces are Euclidean spaces. If the data of the source color space is mapped by the nonlinear mapping $\Phi(x)$, it can not only convert the nonlinearity of the source color space data into linearity but also provide additional correction information for color correction. Taking Lab data as an example, we use the above property of kernel functions and select the kernel function $k(x, x') = \langle x, x' \rangle^1 + \langle x, x' \rangle^2$; then the corresponding nonlinear mapping $\Phi(x)$ is $\Phi(L, a, b) = (L, a, b, L^2, a^2, b^2, \sqrt{2}La, \sqrt{2}Lb, \sqrt{2}ab)$. (23) Formula (23) is very similar to the expansion terms of the polynomial in the multiple regression color correction method. The terms of the polynomial in the multiple regression are shown in formula (24): $(L, a, b) \Rightarrow (L, a, b, L^2, a^2, b^2, La, Lb, ab)$. (24) For scanner calibration, the literature points out that the average error and standard error of the calibration decrease as the degree and number of terms of the regression polynomial increase.
This shows that the data after the $\Phi(x)$ mapping can provide more correction information than the data before mapping. In addition, from the point of view of the basis, it is still an open question whether a global polynomial is the best choice for describing the nonlinear correction, and kernel functions offer a wider choice, including both local and global kernels.
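The correspondence between a kernel and its explicit mapping can be checked numerically. The sketch below assumes the kernel is the sum of the first and second powers of the inner product, $k(x, x') = \langle x, x' \rangle + \langle x, x' \rangle^2$, for which an explicit feature map is $(L, a, b, L^2, a^2, b^2, \sqrt{2}La, \sqrt{2}Lb, \sqrt{2}ab)$. The function names and the two Lab values are hypothetical.

```python
import math


def inner(x, y):
    """Euclidean inner product of two equally long tuples."""
    return sum(p * q for p, q in zip(x, y))


def kernel(x, y):
    """k(x, y) = <x, y> + <x, y>^2, evaluated without any feature map."""
    s = inner(x, y)
    return s + s * s


def feature_map(lab):
    """Explicit map whose inner product reproduces the kernel above."""
    L, a, b = lab
    r2 = math.sqrt(2.0)
    return (L, a, b, L * L, a * a, b * b, r2 * L * a, r2 * L * b, r2 * a * b)


# Two hypothetical Lab colors.
x = (50.0, 12.0, -7.0)
x_prime = (62.0, -3.0, 20.0)
direct = kernel(x, x_prime)
explicit = inner(feature_map(x), feature_map(x_prime))
```

Both routes give the same number up to rounding, which is the point of the kernel trick: the nine-dimensional map never has to be built when only inner products are needed. The $\sqrt{2}$ factors on the cross terms are the only difference from the plain polynomial terms used in multiple regression.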
Because partial least squares regression is performed in the feature space after the kernel function mapping, it can be calculated directly by kernel partial least squares regression (KPLS regression). The steps of KPLS regression can be expressed as:
(1) The algorithm initializes the vector $u$;
(2) $t = \Phi\Phi^T u$, $t \leftarrow t/\|t\|$, where $\Phi$ is the matrix that maps the training data to the feature space;
(3) $c = Y^T t$, where $Y$ is the output data matrix;
(4) $u = Yc$, $u \leftarrow u/\|u\|$;
(5) The algorithm repeats steps (2) to (4) until convergence;
(6) The algorithm deflates $\Phi\Phi^T$ and $Y$ and returns to step (2) to extract the next component.
Among them, $u$ and $t$ are the principal components. According to formula (22), step (2) can be changed to $t = Ku$, and step (6) can be changed to $K \leftarrow (I - tt^T)K(I - tt^T)^T$, where $K$ is the Gram matrix expanded by formula (22). We collect $u$ and $t$ into the principal component matrices $U$ and $T$, respectively, and the KPLS regression coefficient can be obtained as $B = \Phi^T U (T^T K U)^{-1} T^T Y$. If $\Phi_t$ is the matrix that maps the test data to the feature space, then the estimated form of its KPLS regression is $\hat{Y}_t = \Phi_t B = K_t U (T^T K U)^{-1} T^T Y$, where $K_t = \Phi_t \Phi^T$.
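The iteration in steps (1) to (4) and the Gram matrix deflation can be sketched in plain Python for a single output dimension. Several choices here are assumptions made for illustration and are not taken from the paper: a Gaussian kernel, no kernel centering, a fixed number of inner iterations in place of a convergence test, and deflation of the output as $Y \leftarrow Y - tt^T Y$. The fit on the training data is computed as $TT^T y$, which relies on the score vectors being orthonormal; prediction for test data is not covered.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    return [dot(row, v) for row in m]


def normalized(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def rbf_gram(xs, width):
    """Gram matrix K[i][j] = exp(-||x_i - x_j||^2 / width) (assumed kernel)."""
    def k(a, b):
        return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / width)
    return [[k(a, b) for b in xs] for a in xs]


def kpls_scores(gram, y, n_components, n_iter=5):
    """Extract orthonormal score vectors t for a single-output y."""
    k = [row[:] for row in gram]
    y_res = list(y)
    n = len(y)
    scores = []
    for _component in range(n_components):
        u = normalized(y_res)                       # step (1): initialise u
        for _iteration in range(n_iter):            # step (5): repeat (2)-(4)
            t = normalized(matvec(k, u))            # step (2): t = K u, normalise
            c = dot(y_res, t)                       # step (3): c = Y^T t (scalar)
            u = normalized([c * v for v in y_res])  # step (4): u = Y c, normalise
        scores.append(t)
        # Deflation: K <- (I - t t^T) K (I - t t^T), y <- y - t (t^T y).
        kt = matvec(k, t)
        tkt = dot(t, kt)
        k = [[k[i][j] - t[i] * kt[j] - kt[i] * t[j] + t[i] * tkt * t[j]
              for j in range(n)] for i in range(n)]
        ty = dot(t, y_res)
        y_res = [v - ti * ty for v, ti in zip(y_res, t)]
    return scores


def fitted_values(scores, y):
    """Training-set fit T T^T y, valid because the scores are orthonormal."""
    fit = [0.0] * len(y)
    for t in scores:
        ty = dot(t, y)
        fit = [f + ti * ty for f, ti in zip(fit, t)]
    return fit


# Hypothetical one-dimensional inputs and one output dimension.
xs = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
y = [0.0, 0.8, 0.9, 0.1, -0.8]
gram = rbf_gram(xs, width=2.0)
scores = kpls_scores(gram, y, n_components=2)
fit_one = fitted_values(scores[:1], y)
fit_two = fitted_values(scores, y)
```

With a single output the inner loop settles after one pass, so the fixed iteration count is only a placeholder for a convergence check in the multi-output case. Each additional component can only reduce the training residual, so the number of components acts as the complexity control.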

Allocation of English Distance Teaching Resources based on Deep Reinforcement Learning and Multi-Objective Optimization
In the system platform design, unit management and subject management mainly refer to classified management: access is tracked so that teachers and logged-in users are guided to the resources of their unit and subject. For example, depending on the person, the system platform opens either all of its information or only part of it. From the perspective of managers of English distance teaching resources, classified management means classifying the resources in the system's English distance teaching resource database for the college English distance teaching resource management platform. It formulates the directory structure of the platform and then assigns different files to categories. On the one hand, if instructors are classified as managers, it is important to create English distance teaching resource monitoring records. On the other hand, as indicated in Figures 3 and 4, it is vital to assess the demand for English distance teaching resources and then compile statistics to better understand how the resources are used. Figure 5 is a schematic diagram of the directory structure; only teachers or managers have the right to change it, and the designer can change the directory tree structure at any time, so its nodes can change dynamically. English distance teaching resource management in colleges and universities also has some characteristics of its own.

Management for Instructors.
According to the characteristics of the remote teaching resource management platform for teaching English in colleges and universities, text-based and file-based English remote teaching resources are stored and managed with the help of the database and the file system. The storage structure of English remote teaching resources is shown in Figure 6. The main function of the input design of the system database is to define and input information, that is, to decide which English distance teaching resources are entered into the system and how. Moreover, the input design must pay attention to the quality of system data. In this context, setting the search term correctly is very important. If the entered search term is not related to the original data, then no matter how well the system performs, how advanced its technology is, or how appropriate the search tool is, the final search results are likely to be inconsistent with the user's requirements, which directly affects the functioning of the system. Figure 7 shows the flowchart of the query module of English distance teaching resources.
When the system is used, the first thing displayed is the user login module, which requires the user to enter information such as user name and password. The goal is to verify the system user's identity, increase system security via the login operation, and prevent unauthorised users from accessing the system. Therefore, the module is displayed as a login interface; only when both required fields, user name and password, are entered correctly can the user log in to the system, and the two identities of administrator and ordinary user are used to distinguish usage rights. The design interface of this module is shown in Figure 8. Figure 9 is a schematic diagram of the resource allocation of English distance teaching based on deep reinforcement learning and multi-objective optimization, expressed in the form of a simulation.
The English distance teaching resource allocation method based on deep reinforcement learning and multi-objective optimization proposed in this paper is simulated and evaluated in Matlab, and the results are shown in Table 1.
From the above research, we can see that the English distance teaching resource allocation method based on deep reinforcement learning and multi-objective optimization proposed in this paper can effectively improve the effect of English distance teaching resource allocation.
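A login check with the two identities mentioned above might look like the following minimal sketch. The paper does not describe its implementation, so everything here is an assumption made for illustration: the in-memory user store, the role names, the account names, and the use of salted PBKDF2 password hashes from the Python standard library.

```python
import hashlib
import hmac
import os

ROLES = ("administrator", "ordinary_user")  # the two identities named in the text


def hash_password(password, salt):
    """Derive a salted hash so that plain passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(users, name, password, role):
    """Add a user record to the (hypothetical) in-memory user store."""
    if role not in ROLES:
        raise ValueError("unknown role: " + role)
    salt = os.urandom(16)
    users[name] = {"salt": salt, "hash": hash_password(password, salt), "role": role}


def login(users, name, password):
    """Return the user's role on success, or None when either field is wrong."""
    record = users.get(name)
    if record is None:
        return None
    candidate = hash_password(password, record["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    if hmac.compare_digest(candidate, record["hash"]):
        return record["role"]
    return None


users = {}
register(users, "teacher01", "s3cret-pass", "administrator")
register(users, "student01", "another-pass", "ordinary_user")
```

The returned role can then be used to decide which parts of the resource directory a session may read or change, in line with the classified management described earlier.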

Conclusion
At present, the main form of online teaching is the sharing and use of teaching resources. However, sharing and using teaching resources depends on a unified management platform that integrates, classifies, and manages the teaching resources on the network. The research significance of the teaching resource file management system is therefore that it provides basic support for network teaching and contributes to the construction of educational informatization. In recent years, the Internet has become more and more popular, network teaching resources have accumulated, the use of teaching resources has become increasingly diversified, and research on the storage, classification, and utilization of teaching resource files has also increased. At the moment, China is still in the early stages of developing educational resource pools. Many excellent teaching materials have not been successfully integrated and utilized, and the distribution of teaching resources is excessively dispersed and fragmented. To some degree, this obstructs the promotion and growth of online education. In this paper, deep reinforcement learning and multi-objective optimization algorithms are applied to the allocation of English distance teaching resources, and an English distance teaching resource management system is constructed. The research shows that the English distance teaching resource allocation method based on deep reinforcement learning and multi-objective optimization proposed in this paper can effectively improve the resource allocation effect of English distance teaching.

Conflicts of Interest
The authors declare that they have no conflicts of interest.