Study of Cognitive State Recognition and Assistive System for Overall Reading of Foreign Literature Based on Intelligent Sensors

Text is one of the main carriers of information: it carries human civilization, spreads knowledge, promotes culture, and records history. How to read more information in a limited time, that is, how to improve reading efficiency, has therefore become a problem for current technology to solve. The purpose of this paper is to integrate the existing wearable device concept with a wireless intelligent sensor system and to design a wearable reading assistance system intended to facilitate use by blind and partially sighted people. Based on a study and comparison of existing text recognition products, we improve their functions and implementation platform and, combined with a wireless network, design a wearable device that achieves foreign text recognition and reading assistance based on the reading cognitive state, thereby improving reading efficiency. This paper proposes a method to implement foreign text decoding on an embedded platform with relatively few resources. Image acquisition, binarization, and compressed storage are completed quickly through the bit-band storage area and the DMA (direct memory access) double buffering mechanism unique to the chip selected in this paper. The connected boundary tracking algorithm is used to find foreign text locators, which avoids a large number of floating-point operations. The image is not rotated; instead, it is sampled directly at the current rotation angle, and the foreign text bitstream information is then acquired, which realizes the decoding of foreign text on the embedded platform with relatively few resources.


Introduction
With the rapid development of sensors and artificial intelligence technology, auxiliary tools for the human brain are gradually becoming a reality. The scientific study of the brain is among the greatest challenges humans face and has long been at the forefront of science. Brain science has been included in the research programs of several countries, and reading cognitive state (RCS) identification is an important part of brain science research [1]. At present, the methods of reading cognitive state recognition fall into three categories: recognition based on wireless intelligent sensors, recognition based on EEG, and recognition based on functional magnetic resonance imaging [2]. As reading content becomes richer, readers cannot master all of it. For example, when reading English articles, we encounter words and sentences we do not understand, or we want to share content of interest; we then need to stop to look up the relevant information or share the relevant content, which is troublesome and time-consuming. When we encounter a word we do not understand, we need to type in the word or sentence to find its meaning; when we want to share content we are interested in, we need to type in the content and then share it. If we can identify the difficulties in the reading content based on the user's reading cognitive state and push the related information to the user's mobile terminal, the user can quickly access or share the relevant content through the mobile phone, thus improving reading efficiency. In addition, reading state is also one of the factors affecting reading efficiency [3].
When people are in a state of focused reading cognition, their reading efficiency is the highest. However, it is difficult for people to stay in a state of concentration all the time, and they will inevitably be distracted or sleepy. If we can detect users' reading status in real time and give warm reminders when they are distracted or stuck, readers can adjust their reading status in time and improve their reading efficiency. In summary, analyzing the user's cognitive state and providing the user with corresponding reading assistance can improve the user's reading efficiency and enhance the user's reading experience.
This paper studies a wireless smart sensor-based system for user cognitive state recognition and reading assistance. Using wireless smart sensors to collect information about users' reading trajectories and eye states, we design algorithms to identify users' reading states (concentration, distraction, and sleepiness) and analyze reading sequences to identify what interests users and what they find difficult in the text. We also propose a method to implement foreign text decoding on an embedded platform with relatively few resources.

Related Work
Since this paper focuses on cognitive analysis techniques, this section first summarizes the existing cognitive analysis methods. When eye-tracking technology was first developed, its recognition accuracy was relatively low because the user's cognitive state had to be inferred indirectly from the eye signal through several conversion steps. With the development of the technology, the recognition rate is gradually increasing. Compared to the eye-tracking method, the brainwave method and the functional MRI method can achieve higher recognition accuracy because both directly monitor brain activity, but they have drawbacks. The brainwave method requires an invasive device (a brainwave meter), which is very inconvenient to wear and cannot be used in everyday life. The functional MRI method uses equipment that is expensive (hundreds of thousands to millions) and large, and it requires that no metal objects be worn during use. In conclusion, to promote cognitive recognition technology for assisted reading of foreign literature, wireless smart sensors are the most practical choice.
The text recognition process includes image preprocessing, text region detection, character segmentation, feature extraction, character training, and character recognition. The main difficulties lie in text region detection, character training, and the recognition of characters in different languages. The literature [4] proposes a text region detection algorithm for network video based on Harris corner points; because its application scenario has drastic color changes, its detection principle relies on changes in RGB and is not well suited to detecting text on monotone color backgrounds. An SVM-based text detection algorithm is proposed in [5]; it is more targeted and handles only text with a baseline. The SVM-based text detection algorithm in [6] is likewise targeted at text in some languages with baseline features; it cannot detect text without baseline features in complex natural scenes, and it is slow and computationally expensive. Literature [7] also mentions some methods for text detection, but their applicability is more limited. Literature [8] developed an English literature-assisted reading system in which learners can translate English into Chinese while browsing English literature; it helps people with limited English reading ability read effectively and improves their reading speed, but some problems remain with long-distance collocations of verbs. So that Chinese learners can better browse Chinese web pages using browsers, literature [9] developed an assisted reading system based on word analysis units.
To make it easier to read foreign language materials in a browser, literature [10] developed a browser page-assisted reading system that filters, extracts, and annotates webpage text, so that users get timely and accurate reading assistance that helps them understand the information in the webpage; literature [11] likewise developed a browser page-assisted reading system to help users understand the information in the webpage. To address shallow screen reading, short attention span, and poor comprehension, literature [12] developed a collaborative reading annotation system based on reading annotation and interactive discussion scaffolding (CRAS-RAIDS). Literature [13] developed a computer-aided reading system for ethnic minority phonetic information resources, combining machine translation technology with a computer-aided cross-language reading system to translate ethnic minority languages and assist people in reading ethnic minority books. Literature [14] constructed a more complete text structure model and used it to develop a printed text recognition system. Literature [15] proposed an SVM-based handwritten character recognition method to improve the accuracy of character recognition. Literature [16] proposed an online handwritten character recognition system capable of regularizing the information.
Literature [17] proposed a symbol recognition method based on the K-L transform and SVM; the choice of the K value in this method strongly affects the symbol recognition rate. To recognize symbols effectively with a limited number of training samples, literature [18] combined the advantages of SVM and NN and proposed a two-layer classification method that trains quickly and effectively recognizes the handwritten symbols of the ATC, with a recognition rate of 99%.
With the popularization of information automation and office automation and the continuous improvement of the relevant hardware, OCR text recognition technology has developed rapidly in recognition rate, its recognition speed can meet the needs of most users, and it is now used in all walks of life.

A Study on Cognitive State Recognition and Assistive System for Overall Reading of Foreign Literature Based on Intelligent Sensors
3.1. Smart Sensor-Based Algorithm for Foreign Literature Recognition. Combining the wireless smart sensor with the handwritten symbol marker method for foreign text recognition, this paper proposes a cognitive assistance system for the user's overall reading state. It is implemented in the following steps. First, the input video image is preprocessed, which includes skew correction, image alignment, and image binarization; the purpose of preprocessing is to reduce the influence of external environmental factors on the extraction of image features [19]. Second, the handwritten symbols are extracted: the preprocessed image is differenced using the image differencing method to obtain the symbols with which the user marked problems encountered during reading; the connected domain removal method is then used for stain removal, eliminating factors such as camera jitter that affect the accuracy of symbol extraction; and each symbol is segmented using the horizontal-vertical projection method to obtain the locations of the individual handwritten symbols and symbol markers. Third, the position of each symbol marker is combined with the ROI extraction method in the OpenCV library to extract the text marked by the symbol. Finally, the extracted text and symbols are recognized using OCR text recognition and SVM handwritten symbol recognition, respectively, and the recognition results are output. The specific implementation flowchart is shown in Figure 1.
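The differencing and projection steps described above can be illustrated with a minimal sketch. It assumes already binarized and aligned images stored as nested lists (1 = ink, 0 = background); the function names `difference`, `runs`, and `segment_symbols` are hypothetical, and the sketch omits skew correction and connected domain removal, so it is not the paper's implementation.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def difference(marked: Image, clean: Image) -> Image:
    """Keep only pixels that are ink in the marked page but not in the clean page."""
    return [
        [1 if m == 1 and c == 0 else 0 for m, c in zip(mrow, crow)]
        for mrow, crow in zip(marked, clean)
    ]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open [start, end) index ranges where the projection is non-zero."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_symbols(img: Image) -> List[Tuple[int, int, int, int]]:
    """Split into bands by horizontal projection, then into symbols by vertical projection.

    Returns bounding boxes as (top, bottom, left, right), all half-open.
    """
    boxes = []
    row_profile = [sum(row) for row in img]
    for top, bottom in runs(row_profile):
        band = img[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]
        for left, right in runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes


# Toy page: one printed line (row 1) plus two handwritten marks added by the reader.
clean = [[0] * 8, [1] * 8, [0] * 8, [0] * 8, [0] * 8]
marked = [row[:] for row in clean]
for r, c in [(3, 1), (3, 2), (3, 5), (3, 6), (4, 1), (4, 2)]:
    marked[r][c] = 1

print(segment_symbols(difference(marked, clean)))  # [(3, 5, 1, 3), (3, 5, 5, 7)]
```

Each returned box could then be used as a region of interest for cropping the marked text before recognition.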
The frequency-domain method of load identification is usually divided into the frequency response function matrix inversion method and the modal coordinate transformation method. The frequency response function matrix inversion method identifies dynamic loads by establishing the relationship between the system response and the load input (the frequency response function matrix), which is a straightforward method. The modal coordinate transformation method transforms the system into the modal space, identifies the load information in the modal space, and then switches back to the physical space to obtain the dynamic loads. For a given system with a applied loads, b response signals, dynamic load c, and response signal d, the frequency-domain relationship between c and d is given by the frequency response function matrix e, and the load is recovered from the response through the pseudoinverse of e. Here ⊗ denotes the pseudoinverse and H denotes the conjugate transpose; U and V are two matrices and Σ is a diagonal matrix, all three obtained from the singular value decomposition of the frequency response function matrix. For the random response, the relationship between the load and the response signal is expressed in terms of the spectral density function, where M_AA and M_BB are the cross-power spectral density matrices of the measured response and the dynamic load. When k ≤ K, the auto-power spectral density of the load is found from the auto-power spectral density of the response. For a known structure, modal analysis is first performed to obtain its modal information. Then, the system is transformed to the modal space for load identification, which avoids the ill-conditioning caused by inverting the frequency response function and reduces the computational difficulty.
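For reference, the frequency response function matrix inversion step is conventionally written as below. This is a sketch in standard textbook form using the symbols named in the text, not a recovery of the paper's own displayed equations; the frequency argument ω, the thin singular value decomposition, and the full-column-rank condition (b ≥ a, so that Σ is invertible) are assumptions.

```latex
\begin{align}
\mathbf{d}(\omega) &= \mathbf{e}(\omega)\,\mathbf{c}(\omega), \\
\mathbf{c}(\omega) &= \mathbf{e}^{\otimes}(\omega)\,\mathbf{d}(\omega),
\qquad
\mathbf{e}^{\otimes} = \left(\mathbf{e}^{H}\mathbf{e}\right)^{-1}\mathbf{e}^{H}, \\
\mathbf{e} &= U \Sigma V^{H}
\;\Longrightarrow\;
\mathbf{e}^{\otimes} = V \Sigma^{-1} U^{H}, \\
M_{AA}(\omega) &= \mathbf{e}\, M_{BB}(\omega)\, \mathbf{e}^{H}
\;\Longrightarrow\;
M_{BB}(\omega) = \mathbf{e}^{\otimes}\, M_{AA}(\omega) \left(\mathbf{e}^{\otimes}\right)^{H}.
\end{align}
```

Small singular values in Σ are amplified by Σ^{-1}, which is the ill-conditioning that the modal coordinate transformation method is meant to avoid.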
For a linear system with proportional damping, N degrees of freedom, modal matrix P, and response spectral vector Q, the response is expressed in modal coordinates, where Q = (Q_1, Q_2, ⋯, Q_n) is the vector of modal coordinates in the frequency domain; the system equations are then converted to modal coordinates.

Journal of Sensors
Compared with the early stage of research, dynamic load recognition technology has now received extensive attention and development. Load recognition involves many factors, such as system modeling, sensor installation, signal test preprocessing, and algorithm research, and many defects remain in actual engineering applications. In practical engineering applications, the load recognition algorithm based on the Kalman filter has better system stability, suppresses system divergence well, and has good engineering value. For nonlinear systems, the cubature Kalman filter can handle most nonlinear cases, but its computational complexity is high and it places high demands on the system's computing capability. Load identification for nonlinear systems is more difficult than for linear systems, mainly because the system diverges easily. In actual applications, an appropriate Kalman filter must be selected, such as the robust Kalman filter, the iterative Kalman filter, or the square-root cubature Kalman filter; the specific choice depends on the actual situation and the debugging results. Research on the Kalman filter is still developing, and further work is needed on the load identification problem of nonlinear systems.
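To make the predict and update cycle behind these filters concrete, here is a minimal scalar Kalman filter. It is only a sketch under stated assumptions: a random-walk state model, hypothetical noise variances `q` and `r`, and the hypothetical function name `kalman_1d`. It does not perform load identification and is not one of the nonlinear variants named above.

```python
from typing import Iterable, List


def kalman_1d(measurements: Iterable[float], q: float, r: float,
              x0: float = 0.0, p0: float = 1.0) -> List[float]:
    """Scalar Kalman filter for x_k = x_{k-1} + w_k, z_k = x_k + v_k.

    q is the process-noise variance and r the measurement-noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the random-walk model keeps the state and inflates its variance.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# A constant signal observed 100 times; the estimate moves from x0 toward 5.0.
estimates = kalman_1d([5.0] * 100, q=1e-4, r=0.1)
print(estimates[0], estimates[-1])
```

A larger `q` makes the filter trust new measurements more, which tracks changes faster at the cost of a noisier estimate.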
A typical RNN structure is shown in Figure 2. A key concept for an RNN is the time step (moment): at each time step, the RNN produces an output from that step's input combined with the current state of the model. As can be seen from the figure, the input to the main structure A of the RNN comes not only from the input layer X but also from a looped edge that provides the state at the current time step. The state of A is also passed from the current step to the next step.
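The recurrence described above can be sketched in a few lines. This is an illustrative Elman-style cell with a tanh activation, which is an assumption; the names `rnn_step`, `run_rnn`, `w_xh`, `w_hh`, and `b` are hypothetical, and the weights are toy values rather than trained parameters.

```python
import math
from typing import List, Sequence


def rnn_step(x: Sequence[float], h: Sequence[float],
             w_xh: Sequence[Sequence[float]], w_hh: Sequence[Sequence[float]],
             b: Sequence[float]) -> List[float]:
    """One recurrence: h_new = tanh(W_xh x + W_hh h + b)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_xh[j], x))
            + sum(w * hi for w, hi in zip(w_hh[j], h))
            + b[j]
        )
        for j in range(len(b))
    ]


def run_rnn(inputs, h0, w_xh, w_hh, b) -> List[List[float]]:
    """Feed a sequence through the cell, carrying the state from step to step."""
    h, states = list(h0), []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b)  # the looped edge: the old h feeds the new h
        states.append(h)
    return states


# One hidden unit; the second input is zero, yet the state stays non-zero
# because it is carried over from the first step.
states = run_rnn([[1.0], [0.0]], h0=[0.0], w_xh=[[1.0]], w_hh=[[1.0]], b=[0.0])
print(states)
```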
This chained structure shows that RNNs are inherently suited to sequences; they are the most natural neural network architecture for this type of data. In the last few years, RNNs have been applied with some success to problems such as speech recognition, language modeling, translation, and picture description. Among recurrent neural networks, long short-term memory (LSTM) networks are widely used because of their good performance. LSTM is a special type of RNN characterized by the ability to learn long-term dependencies, and it has achieved considerable success in many problems.

3.2. Intelligent Sensor-Based Design of a Holistic Reading Cognitive State and Assistance System for Foreign Literature. The system framework of the reading cognitive state recognition and assistance system based on wireless intelligent sensors is shown in Figure 3. As the framework shows, the system mainly includes five parts: data collection, data preprocessing, reading content recognition, reading cognitive state recognition, and reading assistance. First, we collect users' reading track data through wireless smart sensors. Since the reading track data obtained from the wireless smart sensor contain anomalies and saccade (sweeping) points, we preprocess the data to remove them. After preprocessing, we identify the user's reading content and reading cognitive state and send the results to the reading assistance system [20]. The reading assistance system gives the user different reading assistance according to the recognition results, thus improving the user's reading efficiency and experience. In data acquisition, obtaining the user's eye movement trajectory in real time cannot be done with traditional wireless smart sensors alone, so we modify part of the code to obtain the user's gaze point coordinates in real time. In data preprocessing, the wireless smart sensor determines the location of the user's gaze point by locating the center of the user's pupil; when the sensor cannot obtain a clear image of the pupil, its estimate of the gaze point location is inaccurate, which leads to anomalies in the acquired gaze point sequence. Therefore, we need to perform an anomaly removal process on the acquired data. In addition, we also need to remove the saccade points from the data, since the user cannot take in information during a saccade.
After analysis, we use a k-means-based outlier and saccade (sweep) point removal algorithm for data preprocessing. Reading content recognition marks the user's reading interests and difficulties in response to the user's needs. If a user is interested in a part of the reading content or finds it difficult, the user will spend more time reading that part or even return to it several times. Therefore, we extract two features, the user's gaze duration and the number of returns, to identify whether the user is interested in the content or finds it difficult. After locating the reading content that the user is interested in or finds difficult, we need to convert that content from image to text [21]. Since the quality of the images obtained from the wireless smart sensor is affected by the reading environment, recognition accuracy is relatively low if OCR text recognition is applied directly to the original images, so we binarize the images and correct the recognized characters, thus improving the recognition accuracy.
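The two features named above, gaze duration and number of returns, can be computed from a fixation sequence as in the following sketch. It assumes each fixation has already been mapped to a text region; the region labels, the thresholds, and the names `gaze_features` and `flag_regions` are hypothetical, since the paper does not give its decision rule.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Fixation = Tuple[str, float]  # (region id, fixation duration in seconds)


def gaze_features(fixations: Iterable[Fixation]) -> Dict[str, Tuple[float, int]]:
    """Return {region: (total gaze duration, number of returns)}.

    A return is counted each time the gaze re-enters a region it has
    already visited after having left it.
    """
    duration = defaultdict(float)
    visits = defaultdict(int)
    previous = None
    for region, seconds in fixations:
        duration[region] += seconds
        if region != previous:
            visits[region] += 1
        previous = region
    return {r: (duration[r], visits[r] - 1) for r in duration}


def flag_regions(features: Dict[str, Tuple[float, int]],
                 min_duration: float, min_returns: int) -> List[str]:
    """Regions whose duration or return count reaches the (hypothetical) thresholds."""
    return sorted(
        r for r, (d, n) in features.items()
        if d >= min_duration or n >= min_returns
    )


fixations = [("s1", 0.3), ("s2", 0.4), ("s2", 0.5), ("s3", 0.2), ("s2", 0.6), ("s4", 0.25)]
features = gaze_features(fixations)
print(flag_regions(features, min_duration=1.0, min_returns=1))  # ['s2']
```

In practice the thresholds would need to be calibrated per user or learned from labeled data rather than fixed by hand.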
The purpose of reading cognitive state identification is to identify the user's reading cognitive state, including concentration, distraction, and sleepiness. When users are in different reading cognitive states, their reading trajectories show different characteristics. For example, when users are in a distracted reading state, their gaze depth exceeds the plane of the reading paper; when users are in a sleepy reading state, their blink frequency is higher than in the focused and distracted states. Therefore, we can identify the user's cognitive state by extracting multiple eye-tracking features and using multiple sensors. To improve the user's reading experience and efficiency, we designed a reading assistance system. When the reading content recognition system identifies content that the user is interested in or finds difficult, we send the content to the wireless smart sensor, which gives real-time feedback.
Binarization of video images takes advantage of the grayscale difference between printed (or handwritten) characters and their background to separate the characters from the background; at the same time, it removes some irrelevant information, which benefits later image processing. In this paper, we use the Otsu algorithm for binarization. The most critical step in the Otsu algorithm is the determination of the optimal threshold, which divides the image into foreground and background as black and white parts.
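As an illustration of how the optimal threshold is determined, the following sketch implements the standard Otsu criterion, which picks the gray level that maximizes the between-class variance. It works on a flat list of 8-bit gray values and treats dark pixels as ink; the function names `otsu_threshold` and `binarize` and the toy data are assumptions for this example.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0  # pixel count and intensity sum of the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # low class still empty
        w1 = total - w0
        if w1 == 0:
            break  # high class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(pixels: Sequence[int], t: int) -> List[int]:
    """Map dark pixels (<= t) to 1 (ink) and bright pixels to 0 (background)."""
    return [1 if p <= t else 0 for p in pixels]


gray = [10, 12, 11, 13, 200, 205, 198, 210]  # dark strokes on a bright page
t = otsu_threshold(gray)
print(binarize(gray, t))  # [1, 1, 1, 1, 0, 0, 0, 0]
```

For a two-dimensional image the same function can be applied to the flattened pixel list, since the criterion depends only on the histogram.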
Eye-movement technology provides a unique source of information about how humans and animals visually explore the world. Through eye-tracking, we can study the cognitive processes underlying visual experience (e.g., attention, preference, and discrimination) and quantify physiological-level parameters involved in eye-movement control (e.g., response latency and kinetic characteristics of eye movements). For these reasons, eye-movement techniques are increasingly used in research areas from neuroscience to psychology and have important scenario-based applications. In response to visual information, our eyes first notice the element of interest. Thus, if the mouse pointer is controlled by eye movements, the pointing time may be shorter than with a usual input device (a computer mouse). From a performance point of view, the eye-gaze input system is a candidate for a new interaction method. Since eye-gaze input systems use patterns of eye movements to determine the coordinates of the pointer on the display, they require information about the user's gaze. Specifically, the technology underlying eye-movement-based HCI is eye-tracking, which uses external devices (e.g., optical cameras and infrared transmitting and receiving devices) to capture electrical signals containing eye information and applies algorithms to these signals to extract characteristic signals of the eye, such as gaze, saccade (eye jump), and blink. Eye-movement human-computer interaction is achieved by converting these feature signals into on-screen cursor movement commands or control selection commands (e.g., click and long press).
Incorporating eye-movement technology into wireless intelligent sensors in the design of the overall reading cognitive state recognition system for foreign literature better captures the user's actual reading state, solves the problems users encounter in reading more conveniently, and thus achieves user-friendly reading assistance. The handwritten symbol marker text recognition module is a key module of the server-side subsystem for reading cognitive state recognition in complex situations; the results of handwritten symbol extraction, marker text extraction, handwritten symbol recognition, and text recognition directly determine the accuracy of the system's feedback to students' questions. In this module, the image is first preprocessed; then the handwritten symbols and marker text are extracted from the processed image, mainly using the image difference method and the horizontal-vertical projection segmentation method; finally, the handwritten symbols and marker text are recognized using a support vector machine and OCR text recognition, respectively.
Based on the different character features of English letters, Arabic numerals, and Chinese characters, features such as the size of the character's minimum bounding rectangle, the stroke width, and the character spacing are used to distinguish whether a character is a Chinese character. For characters that are not Chinese characters, a semisupervised learning approach is proposed to train the neural network on top of the traditional recognition algorithm. The composition of the training set, the training process, the architecture of the neural network, and the number of nodes in the hidden layer all need to be chosen reasonably. In actual testing, the recognition rate is low in the first few rounds because few valid samples and categories have been obtained and the number of characters per category is unequal; the recognition rate improves in later rounds as the number of samples increases. When collecting image samples, more images should be collected from printed documents to obtain more valid samples and a better recognition rate, which also reduces the number of training iterations. The number and types of characters in shop signs or road signs are limited, so collecting only such images inevitably leads to incomplete sample sets, a low recognition rate, and more training.

Experimental Design and Validation
The first experiment evaluates a classification model designed using the multihead attention mechanism. The results are shown in Figure 4, where two input sizes are used to test whether the length of the input text affects the classification accuracy. The vertical axis of the graph is the accuracy, and the horizontal axis is the number of training steps. Comparing the accuracy over the first 10 steps, the classification accuracy of the model with input size 10 is slightly higher than that of the model with input size 35; the analysis indicates that, in the classification task, the more features that can be referred to, the higher the classification accuracy. After 20 steps, the accuracy of both models reaches 160, which means that the two input sizes do not affect the final result, because the extractable features are sufficient to classify the text. After 30 steps, the accuracy index occasionally fluctuates between 160 and 180; these fluctuations are caused by the training set, in which the type of some texts is not clear. These samples cannot be classified accurately even by manual inspection, so they are ignored, and it can be seen that the model performs the classification task very accurately.
The second experiment seeks a more suitable neural network for the corresponding classification task, as shown in Figure 5, where the vertical axis is the accuracy and the horizontal axis is the number of training steps. The two training curves are very close. In the first stage, before 10,000 steps, the accuracy of GRU is slightly higher; compared to LSTM, GRU computes slightly faster, which is reflected in the accuracy. In the later stage, the accuracy of LSTM is slightly higher than that of GRU, because GRU has one less gate structure than LSTM in its network structure. In the end, the accuracy of both reaches 100%, which means that both neural networks can complete the classification task and either could be used; considering that more classes may be added at a later stage, the LSTM-based model is finally chosen for the classification task.
The third experiment examines the effect of summary models of different sizes on summary accuracy, as shown in Figure 6, where the vertical axis is the text similarity and the horizontal axis is the number of training steps. The figure shows a significant gap in final convergence accuracy between the smaller and the larger model structure, which means that the model with Heads = 8 and Block = 6 cannot complete the summarization task with an input size of 1024; at this model size, the information of many features in the text cannot be passed on, and a lot of important information is lost. The final accuracy of the summary model with Heads = 12 and Block = 8 is stable at 93%. Its accuracy is based on the similarity of text content between the generated summary and the correct headline, which cannot be identical; when the text similarity reaches 90%, the main information of the two sentences is already included. We verify this by observing the results on the test set and then manually judging whether each summary meets the criteria. After the algorithm optimization, most of the key contents, grammatical structures, and sentence information of the test results meet the requirements and can be used to assist users in reading state cognition. The fourth experiment tests whether different input sizes affect the final summary results by training the model at a fixed size, as shown in Figure 7, where the vertical axis is the text similarity and the horizontal axis is the number of training steps.
Thanks to pretraining, the trained model captures 60% of the main feature information in about 5000 steps; the observed results show that the summaries produced at this stage contain only the more important keyword information, are not rigorous in sentence structure, and are not highly readable. From 10000 to 45000 steps, the text similarity of the model with the smaller input size is consistently lower than that of the model with the larger input size. The gap between the two decreases as training continues, and it is reasonable to conclude from the analysis that, for a long text, the determinants of summary generation are distributed fairly evenly through the article rather than concentrated in a few positions, so the final outputs of the two models remain similar. The final stable text similarity of the model with input size 512 is still slightly lower than that of the model with input size 1024, because its referenceable feature information is relatively small; although the main features can be obtained, they are always less comprehensive than those available to the model with the larger input size.
Figure 7: Training walk of the summary generation model.

As can be seen in Figure 7, with the SVM classifier, participant P8 has the highest recognition accuracy of reading cognitive state at 91.7% and participant P6 has the lowest at 75.8%. With the RF classifier, participant P8 has the highest recognition accuracy at 88.3% and participant P7 has the lowest at 79.8%. To compare the performance of the classifiers, we calculated the mean and standard deviation of the recognition accuracy for each classifier. From Figure 7, the mean recognition accuracy of the SVM classifier (83.9%) is only slightly higher than that of the RF classifier (83.6%), but its standard deviation (6.3%) is much larger than that of the RF classifier (3.3%). Therefore, we consider the RF classifier more suitable than the SVM classifier for the real-time reading cognitive state recognition and assistance system, and we used the RF classifier. Figure 8 shows the participants' average reading time for different operations with different methods; within each type of operation, the reading time is shortest when using the reading assistance system with the wireless smart sensor. When searching for reading content, the time spent using the reading assistance system is 71.88 seconds less than with keyboard input, 22.62 seconds less than with speech recognition, and 38.03 seconds less than with OCR text recognition, and the user's reading efficiency increases by 40%, 17%, and 26%, respectively.
When the experimenter translated the reading content, the reading assistance system took 66.32 seconds less than keyboard input, 13.52 seconds less than speech recognition, and 14.58 seconds less than OCR text recognition, increasing the user's reading efficiency by 45%, 14%, and 15%, respectively. When the experimenter sent the reading content to a friend via communication tools, the reading assistance system took 48.88 seconds less than keyboard input, 8.26 seconds less than speech recognition, and 35.2 seconds less than OCR text recognition, increasing the user's reading efficiency by 29%, 6%, and 22%, respectively. When the experimenter read the content, the reading assistance system took 67.07 seconds less than keyboard input, 35.12 seconds less than speech recognition, and 55.82 seconds less than OCR text recognition, increasing the user's reading efficiency by 33%, 21%, and 29%, respectively. When the experimenter took notes on the reading content, the reading assistance system took 46.32 seconds less than keyboard input, 60.24 seconds less than speech recognition, and 32.61 seconds less than OCR text recognition, increasing the user's reading efficiency by 31%, 37%, and 24%, respectively. In every operation, using the reading assistance system improved the users' reading efficiency.
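The paper does not state how the efficiency percentages were derived from the time differences. One plausible reading, assumed here, is that each percentage is the time saved divided by the time the baseline method took. The Python sketch below uses only the search figures reported above to back out the baseline and assisted times that this assumption implies. The function name `implied_times` is hypothetical, and because the reported percentages are rounded, the results are approximate.

```python
def implied_times(saved_s, gain):
    """Return (baseline_s, assisted_s) implied by gain = saved_s / baseline_s.

    ASSUMPTION: the reported efficiency gain is the time saved as a fraction
    of the baseline method's time. The paper does not give this formula.
    """
    if not 0 < gain < 1:
        raise ValueError("gain must be a fraction between 0 and 1")
    baseline_s = saved_s / gain
    return baseline_s, baseline_s - saved_s


# Search operation: (seconds saved, reported efficiency gain) per baseline method.
search = {
    "keyboard input": (71.88, 0.40),
    "speech recognition": (22.62, 0.17),
    "OCR text recognition": (38.03, 0.26),
}

for method, (saved_s, gain) in search.items():
    baseline_s, assisted_s = implied_times(saved_s, gain)
    print(f"{method}: baseline ~{baseline_s:.1f} s, assisted ~{assisted_s:.1f} s")
```

Under this assumption, the three search comparisons imply assisted times within a few seconds of one another (roughly 108 to 110 seconds), which is what one would expect if a single assisted-system measurement were compared against three baselines.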
Experience is the unique combination of various elements, such as interactive objects and internal states of the user (e.g., emotions, expectations, and behavioral goals), that unfolds over time with a definite beginning and end. Compared with usability, the experiential component is more susceptible to momentary changes and, therefore, to temporal uncertainty in assessment. In the current single experiment, the eye-movement-assisted interaction system carried a certain learning cost in the short term, but the efficiency gained from improved usability gave users an efficient and novel experience. Users gave positive comments when evaluating the system, and both subjective ratings and objective data confirm that the user experience improved significantly. How the experience of using the system evolves over time deserves more in-depth study.

Conclusion
This paper presents research results and experience related to eye-movement-assisted human-computer interaction in terms of the cognitive processing characteristics of wireless smart sensor interfaces, the working principle of eye-movement instrumentation, and eye-movement characteristics during interface interaction. It proposes a method to distinguish Chinese characters from Western characters, trains neural networks on single English and Arabic numeral characters using semisupervised learning to classify Western characters, analyzes and discusses how the training sample set is produced, and compares the recognition results for different numbers of samples and different numbers of nodes in the hidden layer. The handwritten symbol marker text recognition method, which is the key part of the paper, is also investigated. First, the video image is processed by skew correction, image alignment, and image binarization. Then, handwritten symbols are extracted from the processed image using the image difference method and the projection segmentation method, and the symbol-tagged text is extracted using the ROI method in the OpenCV library. Finally, the extracted text and symbols are recognized using the OCR text recognition method and the SVM symbol recognition method. Experiments show that the method is effective in extracting and recognizing the text at the handwritten symbol markers in video images. This paper also proposes using the connected boundary tracking algorithm to find foreign text locators, which avoids a large number of floating-point operations; instead of rotating the image, we sample it directly at the current rotation angle and then obtain the foreign text bitstream information, which makes it possible to decode foreign text on an embedded platform with relatively few resources.
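The paper gives no implementation details for the projection segmentation step mentioned above. As an illustration only, the following pure-Python sketch shows a generic projection-profile segmentation on an already binarized image: it counts foreground pixels per row (or per column) and reports each run of occupied rows as one segment. The function name `projection_segments`, its parameters, and the toy image are hypothetical and are not taken from the authors' system.

```python
def projection_segments(binary, axis="rows", min_count=1):
    """Split a binarized image into segments using its projection profile.

    binary: list of equal-length rows; nonzero values are foreground pixels.
    axis: "rows" to find text lines, "cols" to find characters within a line.
    Returns (start, end) index pairs with end exclusive.
    """
    lines = binary if axis == "rows" else list(zip(*binary))
    # Projection profile: number of foreground pixels in each row or column.
    profile = [sum(1 for px in line if px) for line in lines]

    segments, start = [], None
    for i, count in enumerate(profile):
        if count >= min_count and start is None:
            start = i                      # a run of occupied lines begins
        elif count < min_count and start is not None:
            segments.append((start, i))    # the run ended at the previous line
            start = None
    if start is not None:                  # a run reaches the image border
        segments.append((start, len(profile)))
    return segments


# Toy 6x8 image with two horizontal bands of foreground pixels.
page = [[0] * 8 for _ in range(6)]
for r in (1, 2):
    page[r][1:4] = [1, 1, 1]
page[4][2:7] = [1] * 5

row_bands = projection_segments(page)                     # [(1, 3), (4, 5)]
col_bands = projection_segments(page[1:3], axis="cols")   # [(1, 4)]
```

In a pipeline like the one described, the row segments would typically delimit text lines and a second pass over columns within each line would delimit individual characters or symbols; a larger `min_count` can suppress isolated noise pixels left after binarization.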
The wireless intelligent sensor-based recognition and assistance system for the overall reading cognitive state of foreign literature designed in this paper realizes reading content recognition, reading cognitive state recognition, and reading assistance, helping users optimize their reading experience and read foreign literature as a whole more easily and immersively.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.