Application of Virtual Reality Human-Computer Interaction Technology Based on the Sensor in English Teaching

To improve the effect of online English teaching, this paper applies sensor and human-computer interaction technology to English teaching. The paper improves sensor information with a Kalman filter, combines a sensor positioning algorithm to track students in online English teaching, and converts the key points produced by the skeleton algorithm into the coordinates of a space rectangular coordinate system with the waist as the origin, yielding a human-computer interaction skeleton model in virtual reality. According to the actual needs of human-computer interaction in English teaching, the paper builds a new English teaching system based on the sensor and human-computer interaction and tests its performance. The experiments suggest that the proposed smart system can effectively improve English teaching.


Introduction
The traditional English teaching mode can no longer meet the needs of English teaching in the information age, especially during the COVID-19 period, and online tutoring has become a new trend. It is therefore necessary to transform English teaching through information technology and smart technology [1]. With the extensive application of computers, the user base has gradually grown beyond the early computer professionals. Generally speaking, different users have different needs and opinions about data and different ways of solving problems, and even the same user may change across different periods [2]. The traditional fixed user interface cannot satisfy the specific needs of a user, let alone those needs as they change over time; the traditional human-computer interaction system has no "flexibility," and users must follow the prescribed way of interacting [3]. Sometimes the user only needs a simple operation whose purpose is obvious, especially when understood in the light of previous operations, but the system cannot recognize the real intention and treats it as an ordinary operation, so no smooth information exchange occurs between the system and the user. The user cannot interact freely with the computer, and the computer cannot handle the user's requests the way humans understand their friends. User-centered human-computer interaction technology is the most effective way to solve these problems. A human-computer interaction system can, to a certain degree, allow users to talk to the computer freely in their own familiar way; the computer has a certain ability to understand the users and can give them specific feedback even when the intention is not entirely clear [4].
A current research focus of human-computer interaction is to improve the performance of user interaction when users use computers to complete tasks. The improvement of user interaction performance involves a variety of human-computer interaction interface components, such as display media, display content, interface structure and style, and various input mechanisms. These human-computer interaction elements are closely related to the user's perception, cognition, processing of, and reaction to the computer. To achieve the best user interaction performance, the interactive system must be built on a full understanding of its users.
The first thing to study is how people conduct daily communication in real life, including the way they express intentions, the way they receive outside information, and the way they respond when asked questions. These expression and response models are related to the type of user and the user's experience, abilities, skills, and preferences in the field. At the same time, different users have different computer knowledge, comprehensive capabilities, and other factors that affect their interaction. It is necessary to understand the characteristics of all aspects of user behavior (from primary perception to behavior), establish the corresponding user model, then establish the user model of the corresponding application field, and improve the efficiency of interaction by improving the system's understanding of people.
Based on the above studies, this paper applies sensors and virtual reality to English teaching and builds a smart system to improve the English teaching effect.

Related Work
The ITS (intelligent tutoring system) is a multidisciplinary field drawing on traditional CAI (computer-aided instruction), AI (artificial intelligence), and cognitive science; internationally, ITS has become a major research area and achieved convincing results [5]. Especially over the last 10 years, ITS has grown fast abroad and moved from the research stage to the commercial stage [6]. An ITS can establish and maintain a student model; on one hand, the system can simulate teaching decision-making and provide real-time tailored tutoring to students; on the other, it can automatically analyze the historical data of a set of students in the student model and help teachers objectively understand students' overall condition so that teachers can adjust their teaching plans and contents in time. The student model captures the knowledge level, learning ability, and cognitive features of the students; it is essentially a program, based on an algorithm, that solves actual problems the way students do [7].
Human-computer interactive learning systems using multimedia are widely used at home and abroad. Human-computer interactive learning can provide students with a safe, predictable, repeatable, relaxed, and lively interactive learning environment. Such a system follows the basic principles of human-computer interaction, scientifically designs learning goals based on related theories, and establishes corresponding human-computer interactive learning activities [8]. At the same time, the human-computer interactive learning environment pays attention to the design of the interactive experience, is committed to letting students participate in interactive operations in an entertaining way, and focuses on the friendliness and fun of the game interface. The color matching and screen display are designed to attract the attention of different student groups; the system mobilizes students' interest in learning and guides them to learn actively [9]. Literature [10] designed and implemented a three-dimensional drawing model tool, which can outline and model the Lanwei model; users can interact with gestures and pens. Although its accuracy is not as good as professional modeling software, it can satisfy the needs of students and nonprofessional users. Literature [11] uses an infrared camera to obtain facial feature data of children, uses a pressure seat to obtain the child's body posture, and combines these with the learning behavior on the child's computer to identify the child's rational state. Literature [12] uses computer vision technology to study joint attention (that is, following and guiding others' attention); the experiment uses a web camera to detect the learner's head posture in real time to estimate the learner's attention. Literature [13] establishes a multichannel learning environment by using video and audio fusion technology: the video signal of the lips is obtained through the camera and merged with the voice signal obtained by the microphone.
The speech recognition result and the handwriting recognition result are combined for human-computer interaction.
The rapid development of multimedia technology has also greatly promoted the progress of the human-machine interface. Human-computer interaction gradually uses multimedia input and output devices such as microphones and cameras [14]. With the rise and development of emerging disciplines such as cognitive science, artificial intelligence, graphics, and image processing, multimedia-based multichannel human-computer interaction has gradually become a research hotspot. Virtual reality technology is a computer system that can create and experience a virtual world. It uses computer technology to generate a realistic virtual environment with multiple perceptions such as sight, hearing, and touch; the user employs various interactive devices to interact and exchange information with the entities in the virtual environment, producing an immersive feeling. It is an advanced digital human-computer interface technology, and its appearance will change the way humans live [15]. For research on virtual reality technology, literature [16] searched the WPI database and counted the country/region distribution of related patents. From the perspective of the development of human-computer interaction, the overall trend is towards naturalness and intelligence; it is believed that future human-computer interaction design will bring people a more relaxed and comfortable life [17].

Sensor Intelligent Fusion Algorithm
The extended Kalman filter (EKF) is a suboptimal filter that performs Kalman filtering after linearizing a nonlinear system. Assuming the system is described by a discrete, time-invariant stochastic nonlinear model, the equation of state and the sensor measurement equation of the moving student can be expressed as [18]

x(k + 1) = f(x(k), u(k)) + w(k),
z(k) = h(x(k)) + v(k),

where J_f(x) is the Jacobian matrix of the function f with respect to x(k) [19]. Similarly, if the Kalman filter incorporates several readings of the above ranging sensors, then H(x(k)) stacks the corresponding measurement rows, in which ρ_ij(k) is the distance from moving-student sensor i to a point on wall j in the environment; the line-segment features of the global coordinates in the environment can thus be expressed accordingly [20].

(1) Prediction phase. The Kalman algorithm predicts the state and its covariance as

x̂(k + 1 | k) = f(x̂(k | k), u(k)),
P(k + 1 | k) = F P(k | k) Fᵀ + Q(k),

where P is the covariance matrix of x and F is the Jacobian matrix of f.

(2) Observation phase. From the measurement equation, the predicted measurement is ẑ(k + 1) = h(x̂(k + 1 | k)). We use the error between the actual and predicted sensor measurements to correct the predicted state x̂(k + 1 | k); this prediction error (innovation) γ(k + 1) and its variance can be expressed as

γ(k + 1) = z(k + 1) − h(x̂(k + 1 | k)),
S(k + 1) = H P(k + 1 | k) Hᵀ + R(k + 1),

where H is the Jacobian matrix of h(x(k)) in the measurement equation [21].

(3) Estimation phase. It can be seen from the above two steps that x̂(k + 1 | k) is the predicted value of the state; the extended Kalman filter corrects it using the prediction error computed from the measured value, with the measurement variance S(k) acting as the weighting factor for the correction. If the extended Kalman filter gain matrix is K, the state estimate is x̂, and its covariance matrix is P, the estimation can be expressed as

K(k + 1) = P(k + 1 | k) Hᵀ S(k + 1)⁻¹,
x̂(k + 1 | k + 1) = x̂(k + 1 | k) + K(k + 1) γ(k + 1),
P(k + 1 | k + 1) = (I − K(k + 1) H) P(k + 1 | k).

In conclusion, the recursive formula of the EKF comprises the prediction, observation, and estimation phases. To obtain the optimal estimate for the system, we can assume that some quantities are uncorrelated with the system state, namely the process noise w(k) and the measurement noise v(k) generated during measurement; we can then treat them as uncorrelated Gaussian white noise.
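The three phases above form one recursive step. The following Python fragment is an illustrative sketch of that cycle; the function name `ekf_step` and its argument layout are our own illustration, not an implementation from the paper:

```python
import numpy as np

def ekf_step(x, P, u, z, f, h, F_jac, H_jac, Q, R):
    """One prediction/observation/estimation cycle of an extended Kalman filter.

    x, P : prior state estimate and its covariance
    u, z : control input and sensor measurement at this step
    f, h : nonlinear state-transition and measurement functions
    F_jac, H_jac : functions returning the Jacobians of f and h
    Q, R : process and measurement noise covariances
    """
    # (1) Prediction phase: propagate state and covariance
    x_pred = f(x, u)
    F = F_jac(x, u)
    P_pred = F @ P @ F.T + Q

    # (2) Observation phase: innovation gamma and its covariance S
    H = H_jac(x_pred)
    gamma = z - h(x_pred)
    S = H @ P_pred @ H.T + R

    # (3) Estimation phase: gain, corrected state, corrected covariance
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ gamma
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

For a linear system the same code reduces to the ordinary Kalman filter, since the Jacobians are then constant matrices.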
The sources of error in information fusion are summarized as follows:

(1) Odometer system. An odometer is a sensor that uses data obtained from the mobile sensor to estimate the change of an object's position over time, usually installed inside the body of the moving student. Its working principle is to first use the pattern or coding information on the code disk to obtain the rotation radians of the student's left and right wheels; the forward direction and speed changes of the student can then be calculated from this radian information.
According to the statements above, the odometer can provide the linear speed and angular speed of the moving student; the odometer movement model can thus be represented as shown in Figure 1, where we treat the student as a mobile robot. Assume that the wheel radius is r, the photoelectric code disk has p lines per turn, and in one sampling cycle ΔT the code disk outputs M pulses; the movement distance of the left wheel is Δs_l, that of the right wheel is Δs_r, and that of the car body is Δs. By nonlinear regression analysis, it can be assumed that within one sampling cycle ΔT the trajectory of the moving student is almost a straight line. If the distance between the left and right wheels is a, and the moving student moves from the position X(k − 1) = (x_{k−1}, y_{k−1}, θ_{k−1}) to the position X(k) = (x_k, y_k, θ_k), then the distance moved is Δs = (Δs_l + Δs_r)/2 and the rotation angle is Δθ = (Δs_r − Δs_l)/a. The odometer model input is μ(k) = (ΔD_k, Δθ_k)ᵀ, obtained from the control command (v, w): during one sampling cycle ΔT, the path length of the moving student is ΔD_k = v × ΔT, and the relative pose angle is Δθ_k = w × ΔT.
The odometer has two models: circular-arc and straight-line. Since the linear model is a simplification of the arc model, the arc model is generally used in practice, as it is more accurate.
(2) Arc model. Δθ_k in formula (17) indicates the rotation angle difference between the end point and the starting point of the moving student, and ΔD_k is the displacement of the moving student during the sampling period ΔT.
(3) Line model. Assuming that the moving student makes a small displacement in a very small period of time, the extremal method of nonlinear regression analysis can be used, and the model becomes a linear one represented by simplified straight lines, namely Δθ_k = 0. To calculate the position of the moving student, this paper uses both the linear model and the arc model: the displacement calculation adopts the straight-line model, and the arc model is used to calculate the change in direction angle. At the k-th moment, if the linear speed and angular speed of the moving student are known, formula (19) can be transformed accordingly. The errors of the moving-student odometer are of two types, namely nonsystematic errors and systematic errors (as shown in Figure 2). The systematic error always accumulates, while the nonsystematic error is random and indeterminable.
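The combined pose update described above (straight-line model for displacement, arc model for the heading change) can be sketched in a few lines; `odometry_update` is a hypothetical helper for illustration, not code from the paper:

```python
import math

def odometry_update(pose, ds_l, ds_r, a):
    """Dead-reckoning pose update for a differential-drive platform.

    pose       : (x, y, theta) at time k-1
    ds_l, ds_r : left/right wheel displacements over one sampling cycle
    a          : distance between the left and right wheels
    """
    x, y, theta = pose
    ds = (ds_l + ds_r) / 2.0      # body displacement (line model)
    dtheta = (ds_r - ds_l) / a    # heading change (arc model)
    # Apply the displacement along the mid-cycle heading, the usual
    # simplification when the path within one cycle is nearly straight.
    x += ds * math.cos(theta + dtheta / 2.0)
    y += ds * math.sin(theta + dtheta / 2.0)
    theta += dtheta
    return (x, y, theta)
```

Equal wheel displacements give a pure translation; opposite displacements give a rotation in place, matching the two limiting cases of the model.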

(4) Sonar calculation method
Generally, most sonar applications adopt the very cheap Polaroid 6500 sonar module. The environmental landmark features are used as follows: the sonar sensor scans the environmental information, extracts useful environmental feature information, and calculates the specific location of the environmental features. Assuming that the coordinate position of a line segment in the local coordinate system is (x_l, y_l), the predicted coordinate position (x_w, y_w) for observation can be estimated. The width of the sonar beam and the reflection of the sonar signal are the important factors affecting the sonar sensor. Figure 3 shows the coordinate system transformation diagram.
In Figure 3, the point O′ is the origin of the local coordinate system and represents the moving student, the point P represents the wall in the student's walking area, the point a is the sonar sensor, and θ represents the offset angle of the on-board coordinate system relative to the coordinate system OXY. The sonar reading is represented by the point S, the distance to the point P is d, and the relative position of each sonar is fixed. Figure 4 shows the interrelationship between the global and local coordinate systems for the moving student. The width of the sonar beam and the reflection of the sonar signal are the two factors affecting the principle of the sonar sensor.
Assume that in the moving student's environment the reflecting wall is represented by P, and the distance from the origin O to the plane and the distance between the plane and the sonar sensor are represented by P_r^j and P_n^j, respectively. Since the relative position of the sonar in the local coordinate system is known and can be represented by (x_i, y_i, θ_i), the corresponding measurement can be written down. Set P_n^j = θ_l and P_r^j = D; for P_n^j ∈ [θ_i(n) − δ/2, θ_i(n) + δ/2], where δ is the sonar beam width, d_i^j(k) can be expressed from the formula above. As shown in Figure 5, assuming that at sampling time k the position of the moving student is X(k) = (x(k), y(k), θ(k)), a rotating coordinate transformation converts the coordinates from the local to the global coordinate system, giving the model of the sonar sensor. Since the point P on the wall, the sonar a, and the origin O′ lie on the same line in the sonar coordinate system of Figure 5, the position detected by the moving student's odometer and sonar sensor can be represented accordingly.

Mobile students must rely on the collected external environment information to achieve more accurate navigation and positioning. However, the real environment contains various intertwined uncertainties, so theoretically reliable environmental information is unavailable. In addition, the data measured by the mobile student's own sensors are disturbed by the environment; the measured environmental information is therefore not ideal and often contains extremely complex uncertain factors. To reduce or even eliminate these interferences in the observations of mobile student sensors, we could establish a reliable noise simulation system to express the uncertain factors described above; however, this is only an ideal.
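The rotating coordinate transformation mentioned above is the standard 2D rigid transform from the student's local frame to the global frame. A minimal sketch, assuming the pose comes from the odometer and the sonar return is a point in the local frame (`local_to_global` is a hypothetical name):

```python
import math

def local_to_global(pose, p_local):
    """Rotate-and-translate a sonar return from the moving student's
    local frame into the global frame.

    pose    : (x, y, theta), the student's pose from the odometer
    p_local : (x_l, y_l), the detected point in the local frame
    """
    x, y, theta = pose
    xl, yl = p_local
    # 2D rotation by theta followed by translation to the student's position
    xw = x + xl * math.cos(theta) - yl * math.sin(theta)
    yw = y + xl * math.sin(theta) + yl * math.cos(theta)
    return (xw, yw)
```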
In fact, due to the nonlinear and nonholonomic constraints of the moving student, it is not easy to establish an efficient and accurate mathematical model. In this case, we approximate it with a model (also called an error model) that introduces noise obeying a Gaussian distribution.
In this paper, the sonar sensor is used to detect the landmarks of environmental information. Due to the characteristics of the sonar sensor, if an observation point is not selected correctly, the data cannot be correctly associated with the corresponding environmental feature information, which causes the filter to diverge; a divergent filter makes all predicted observations wrong. In order to match the landmark data scanned by the sonar sensor correctly, a data association model can be introduced to solve this problem.
The analysis shows that the position estimate inferred by the mobile student from the odometer and sonar sensor alone is inaccurate, so a correct position estimate cannot be obtained this way. In this paper, a two-dimensional vector landmark is selected as the observation of position in the global coordinate system. The environment in space is variable and complex, but there are certain correlations between the environmental features, so it can be estimated; how to eliminate the interference of the correlations between these variables is the problem we want to solve. A large number of studies have verified that the Mahalanobis distance can eliminate these interferences well, so the Mahalanobis distance is selected as the measure for data association in this article. In addition, it should be noted that the correlation gate used in this article is an elliptical one.
The Mahalanobis distance is defined by d_k²(x) = (x − m_k)ᵀ S_k⁻¹ (x − m_k), where d_k²(x) represents the squared distance of the measured value from group k, the vector m_k represents the mean of the variables in group k, x represents the value of the variable in the environment observed by the sensor model, and S_k represents the covariance matrix within the group.
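The definition above, together with an elliptical gate (a threshold on the squared distance), can be sketched as follows; the helper names `mahalanobis_sq` and `associate` and the gate handling are our own illustration of the technique, not code from the paper:

```python
import numpy as np

def mahalanobis_sq(x, m, S):
    """Squared Mahalanobis distance d^2 = (x - m)^T S^-1 (x - m)
    between an observation x and a landmark with mean m, covariance S."""
    d = np.asarray(x, float) - np.asarray(m, float)
    return float(d @ np.linalg.inv(S) @ d)

def associate(x, landmarks, covs, gate):
    """Elliptical-gate data association: return the index of the landmark
    with the smallest squared Mahalanobis distance below the gate
    threshold, or None if no landmark falls inside the gate."""
    best, best_d = None, gate
    for i, (m, S) in enumerate(zip(landmarks, covs)):
        d2 = mahalanobis_sq(x, m, S)
        if d2 < best_d:
            best, best_d = i, d2
    return best
```

With an identity covariance the measure reduces to the squared Euclidean distance; a non-diagonal covariance is what removes the inter-variable correlations discussed above.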

English Teaching System Based on the Sensor and Human-Computer Interaction
The key points extracted by the skeleton algorithm are uniformly converted into the coordinates of a space rectangular coordinate system with the waist as the origin: the x-axis points to the right of the body, the y-axis points vertically upward, and the z-axis points forward, perpendicular to the body, as shown in Figure 6. The paper proposes a human-computer interaction method based on a monocular camera: first extract the body skeleton from the video taken by the monocular camera, then complete the interaction with the virtual environment through the human body pose. The process is shown in Figure 7.
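The change of frame just described is a translation to the waist joint followed by a rotation into the body axes. A minimal sketch, assuming the right and up directions of the body are known unit vectors (the function `to_waist_frame` is a hypothetical name, not from the paper):

```python
import numpy as np

def to_waist_frame(joints, waist, right, up):
    """Re-express skeleton key points in a rectangular frame whose origin
    is the waist joint: x-axis = `right`, y-axis = `up`, and the z-axis
    (forward, perpendicular to the body) completes a right-handed frame.

    joints : (N, 3) array of joint positions in camera coordinates
    waist  : (3,) position of the waist joint in camera coordinates
    """
    right = np.asarray(right, float); right /= np.linalg.norm(right)
    up = np.asarray(up, float); up /= np.linalg.norm(up)
    forward = np.cross(right, up)          # z-axis of the body frame
    R = np.stack([right, up, forward])     # rows are the new basis vectors
    # Translate to the waist origin, then rotate into the body axes
    return (np.asarray(joints, float) - np.asarray(waist, float)) @ R.T
```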
After cutting out the current frame, the picture is first preprocessed. This is because the human figures in the video come in different sizes; if the body occupies only a small part of the frame, the precision of the later key-point detection drops greatly. At the same time, much irrelevant information in the frame would introduce useless information into the network training process and affect the precision of the identification result. Therefore, we need to find the boundary of the human silhouette and crop it out of the frame for identification, in order to improve the precision of the following stages. There are many ways to extract a silhouette from RGB images. As the video in this paper is shot by a fixed camera with only one person in the frame, the background is fixed, so we use background subtraction to extract the human silhouette from the video; the principle is shown in Figure 8.
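Background subtraction with a fixed camera amounts to thresholding the per-pixel difference against a stored background image. The following NumPy sketch illustrates the principle on grayscale frames (in practice a library such as OpenCV would be used; the function name, threshold value, and bounding-box return are our own illustration):

```python
import numpy as np

def extract_foreground(frame, background, thresh=30):
    """Background subtraction: pixels whose absolute grayscale difference
    from the static background exceeds `thresh` are marked foreground.
    Returns the foreground mask and the bounding box of the silhouette
    (x_min, y_min, x_max, y_max), or None if no foreground is found."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    mask = diff > thresh
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return mask, None
    # Bounding box of the person, used to crop the frame before detection
    bbox = (xs.min(), ys.min(), xs.max(), ys.max())
    return mask, bbox
```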
The paper builds an English teaching system based on the sensor and human-computer interaction technology and tests its performance. According to actual needs, the paper conducts a test of real-time information transmission in English teaching via the sensor and human-computer interaction. A simulation experiment is used to test the performance of the system and the transmission effect of the sensor and human-computer interaction (evaluated by expert appraisal); the statistics are shown in Table 1 and Figure 9.
From the above experiment, the English teaching system in this paper, based on the sensor and human-computer interaction, can effectively improve the English teaching effect and have a positive influence on the subsequent reform of English teaching.

Conclusion
In the teacher-centered link, teachers play the roles of traditional instructor, teaching monitor, after-school tutor, homework assigner and corrector, and document provider. With the further development of teaching theory and interaction design theory, human-computer interaction between teachers and students has attracted more and more attention, from the feasibility of human-computer interaction in English teaching in theory and practice, to the theoretical foundation of creating a learning environment. In order to improve the effect of human-computer interaction in English teaching, the paper studies sensor technology and applies a Kalman filter algorithm that stresses precision to solve the positioning problems in the sensor information used for human-computer interaction. Virtual reality is used to build an online English teaching interaction system. The system passes the performance test, and the study suggests that the proposed system has certain effects.

Data Availability
The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest
The authors declare no competing interests.