The study aims to explore the application of International Classification of Diseases (ICD) coding technology and an embedded electronic medical record (EMR) system. An EMR information knowledge system was established, and patient medical records and front-page disease diagnostic codes were collected from 8 clinical departments (endocrinology, oncology, obstetrics and gynecology, ophthalmology, orthopedics, neurosurgery, cardiovascular medicine, and burns surgery) for statistical analysis. A natural language processing-bidirectional recurrent neural network (NLP-BIRNN) algorithm was used to optimize medical records. The results showed that coders were unclear about the basic rules of main diagnosis selection and the classification of disease coding and did not code according to the main diagnosis principles. Diseases were not coded according to different conditions or specific classifications, the codes for postoperative complications were inaccurate, disease diagnoses were incomplete, and the selected codes were too general. The proposed solution is to strengthen communication and knowledge training for coders and medical personnel. BIRNN was compared with the convolutional neural network (CNN) and recurrent neural network (RNN) in accuracy, symptom accuracy, and symptom recall, and the results suggested that the proposed BIRNN achieved higher values. Reading pathological language with artificial intelligence algorithms provides some convenience for disease diagnosis and treatment.
With the improvement of living standards, people’s attention to their health is also increasing. At the same time, they are concerned about the medical conditions and facilities of the hospital. The hospital will conduct statistical analysis and comparison of the annual medical data to form an indicator of medical quality management [
Through the extraction and statistical analysis of these data, relevant indicators are formed that help medical personnel better understand the relevant medical conditions and improve the quality of medical care [
ICD coding technology is a relatively complete and mature disease-coding method. Almost every hospital unit uses its rules to code diseases on the front page of medical records [
In this experiment, a statistical analysis of the front page of medical records is conducted through a sample survey. The causes of errors in disease classification on the front page of the medical record are analyzed and summarized. Meanwhile, the embedded medical record information knowledge system is established, and the NLP-BIRNN algorithm is used to optimize the medical record text. Detailed improvement measures are also proposed to provide a basis for the investigation of disease classification.
The data were randomly selected from the EMR database of our hospital. Patient medical records and front-page disease diagnostic codes were collected from 8 clinical departments: endocrinology, oncology, obstetrics and gynecology, ophthalmology, orthopedics, neurosurgery, cardiovascular medicine, and burns surgery.
The medical record information is randomly extracted from the EMR management system for analysis. Statistics on diagnosis selection and the number of disease-coding errors are compiled, and the data of clinicians and medical coders before and after training are analyzed statistically. Through research, self-examination, training, and feedback, the disease-coding knowledge topics of the above eight clinical departments are explored. The coder calls up the medical record data of a given department and codes the diseases according to the ICD-10 disease-coding rules applicable to that department and the main diagnosis selection principles, drawing on clinical knowledge of the disease’s physiological mechanism, development process, clinical manifestations, and treatment methods. During the coding process, the coder must carefully read the case, especially the admission records (to judge the main diagnosis selection), surgical records (surgical name, postoperative method, surgical grade, surgical incision, and anesthesia method), course records (development of the disease, such as whether it is aggravated, improved, or treated), and discharge records (diagnosis and treatment process, and whether the diagnoses at admission and discharge are consistent). Pathology, imaging, ultrasound, and laboratory reports should also be reviewed.
According to the results, the coder reexamines the previous medical records, mainly checking the diagnosis selection and disease coding. Statistics on diagnosis selection errors and disease-coding errors are compiled. Then, ICD-10 training on the problems found in this examination is given to the relevant doctors and coders of the above eight clinical departments.
The coder compares and analyzes the data of main diagnosis selection errors and disease-coding errors before and after the training. The results are then shared with the relevant personnel as feedback, to explore rectification measures and improve the coding level.
The system is established under the following conditions. First, the content is written according to the writing standard for medical records stipulated by the national health department. The data content expressed in the medical record should use common medical terms without ambiguity. Second, the knowledge system must contain descriptive and conclusive knowledge about medical records. Descriptive knowledge refers to the detailed description of the disease. For example, the statement that the types of cold are wind chill and wind heat and that their common symptoms are headache and cough is descriptive knowledge. In contrast, judging the type of cold from the description of the disease is conclusive knowledge. The knowledge system needs to work with the structured input interface of medical records and provide selective input prompts based on the user input. Third, the medical record information is stored as an XML document, which has a hierarchical structure. The knowledge base is also represented as a hierarchical XML document that corresponds to the medical record information document, so that it can be synchronized efficiently with the medical record input interface. For simple input items that need no prompt, the node can be set to a null value. For items with multiple choices, multiple child nodes can be set; when a prompt is needed, they are read into memory and displayed in the medical record interface for users to choose from. Fourth, the semantics should be concise and include hints about limits on character input.
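The third and fourth conditions can be illustrated with a minimal sketch. The element names (`page`, `item`, `option`), the `max_chars` attribute, and the function names below are hypothetical and chosen only for illustration; they are not the schema used in the study. An item with no child nodes stands for a free-text field without prompts, and an item with several `option` children stands for a multiple-choice prompt.

```python
import xml.etree.ElementTree as ET

# Hypothetical knowledge-base fragment; element and attribute names are
# illustrative assumptions, not the schema used in the study.
KNOWLEDGE_XML = """
<page name="admission_record">
  <item name="chief_complaint"/>
  <item name="cold_type">
    <option>wind chill</option>
    <option>wind heat</option>
  </item>
  <item name="symptom" max_chars="20">
    <option>headache</option>
    <option>cough</option>
  </item>
</page>
"""


def load_prompts(xml_text):
    """Map each input item to its prompt options (empty list = no prompt)."""
    root = ET.fromstring(xml_text)
    prompts = {}
    for item in root.findall("item"):
        prompts[item.get("name")] = [opt.text for opt in item.findall("option")]
    return prompts


def char_limit(xml_text, item_name):
    """Return the character limit of an item, or None if no limit is set."""
    root = ET.fromstring(xml_text)
    for item in root.findall("item"):
        if item.get("name") == item_name:
            limit = item.get("max_chars")
            return int(limit) if limit is not None else None
    return None


prompts = load_prompts(KNOWLEDGE_XML)
```

Because the knowledge document mirrors the hierarchy of the medical record document, the interface can look up prompts by the same node path it uses to store the entered value.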
The information base of the embedded medical record knowledge system contains knowledge, as shown in Table
Content of database construction of the embedded medical record knowledge system.
Knowledge category | Content |
---|---|
Descriptive knowledge of disease | Terms of disease characteristics and selection of entry |
Conclusive knowledge of disease | Keywords given for pathological conclusion |
Normative data content | Age |
XML document input information | Restrictions on character entry |
There are seven pages in the embedded EMR input system, and seven XML documents of medical record knowledge base are established accordingly. When doctors create a new medical record page, the system will automatically call the knowledge base XML document according to the current page. When doctors input specific document node information, they can input medical record information according to prompt operation to improve the recording speed and reduce expression errors.
First, the data preprocessing of EMR is carried out, including data processing, cleaning, and screening.
Second, two NLP schemes are applied: NLP based on medical tagging (medical record tagging, character extraction, word vector transformation, deep neural network, automatic tagging, and feature vector splicing) and NLP without medical tagging (no medical record tagging, part-of-speech tagging, keyword selection, word vector transformation, and feature vector splicing).
Third, the calculation is performed. The symptom feature vectors produced by the NLP schemes are normalized so that the value at each position of the vector lies in [0, 1]. The ICD codes are one-hot encoded to serve as the labels for deep learning training. The normalized feature vectors and labels are fed into the deep learning model for training to obtain the auxiliary diagnosis model, and the test set is used to evaluate the model.
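The third step can be sketched as below. The text only states that values are limited to [0, 1]; min-max scaling is assumed here as one common way to achieve that. The function names, the toy feature values, and the two sample ICD-10 codes are illustrative and are not data from the study.

```python
def min_max_normalize(vectors):
    """Rescale each feature position to [0, 1] across all samples."""
    n_features = len(vectors[0])
    lows = [min(v[j] for v in vectors) for j in range(n_features)]
    highs = [max(v[j] for v in vectors) for j in range(n_features)]
    normalized = []
    for v in vectors:
        row = []
        for j in range(n_features):
            span = highs[j] - lows[j]
            # A constant feature has zero span; map it to 0.0 (assumption).
            row.append((v[j] - lows[j]) / span if span else 0.0)
        normalized.append(row)
    return normalized


def one_hot_labels(codes):
    """Return the sorted code vocabulary and one one-hot vector per sample."""
    vocab = sorted(set(codes))
    index = {code: i for i, code in enumerate(vocab)}
    labels = [
        [1 if index[code] == i else 0 for i in range(len(vocab))]
        for code in codes
    ]
    return vocab, labels


# Toy inputs, invented for illustration only.
features = [[2.0, 10.0], [4.0, 10.0], [6.0, 30.0]]
codes = ["E11.9", "I10", "E11.9"]

X = min_max_normalize(features)
vocab, y = one_hot_labels(codes)
```

In practice the minimum and maximum would be computed on the training set only and reused for the test set, so that the test data do not influence the scaling.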
The SPSS24.0 software was adopted for data statistics and analysis. The difference between the two groups of data was analyzed by the
At present, the diagnosis of 2019 novel coronavirus (2019-nCoV) infection is based mainly on nucleic acid testing and medical imaging, and combining the two is more conducive to diagnosis. Announcements of officially confirmed cases show that some patients tested positive only after more than 2 nucleic acid tests, and in some cases only after the fifth test. Therefore, lung CT imaging is carried out alongside nucleic acid testing. Patients whose lung CT shows signs of acute inflammation should be admitted as soon as possible in accordance with the principle of “suspected disease is always present.” The CT and ultrasound results of some diseases are shown in Figure
CT and ultrasound images. (a) The chest CT image of COVID-19 cases. (b) The segmented image of the pneumonia-infected area. (c) CT of the abdomen. (d) Ultrasound of the abdomen.
Results of main diagnosis selection errors and disease-coding errors of the obstetrics and gynecology department are shown in Figure
(a) Main diagnosis selection errors and (b) disease-coding errors of the obstetrics and gynecology department (
Results of the main diagnosis selection errors and disease-coding errors of the endocrinology department are shown in Figure
(a) Main diagnosis selection errors and (b) disease-coding errors of the endocrinology department (
Results of the main diagnosis selection errors and disease-coding errors of the orthopedics department are shown in Figure
(a) Main diagnosis selection errors and (b) disease-coding errors of the orthopedics department (
Results of the main diagnosis selection errors of the oncology and neurosurgery departments are shown in Figure
Main diagnosis selection errors of (a) oncology and (b) neurosurgery (
Results of disease-coding errors of cardiovascular medicine, burns surgery, and ophthalmology departments are shown in Figure
Disease-coding errors of (a) cardiovascular medicine, (b) burns surgery, and (c) ophthalmology departments (
The results of automatic annotation of the medical record language under different artificial intelligence algorithms are statistically analyzed (Figure
Performance comparison of different artificial intelligence algorithms.
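The three measures used in this comparison are not defined in the text. As a hedged illustration, the sketch below assumes that accuracy is the fraction of correctly predicted codes, that symptom accuracy is the precision of the extracted symptoms, and that symptom recall is the fraction of reference symptoms that were extracted. The function names and the example data are hypothetical.

```python
def code_accuracy(predicted_codes, true_codes):
    """Fraction of records whose predicted code equals the true code."""
    correct = sum(p == t for p, t in zip(predicted_codes, true_codes))
    return correct / len(true_codes)


def symptom_precision_recall(predicted, reference):
    """Micro-averaged precision and recall over per-record symptom sets."""
    tp = fp = fn = 0
    for pred, ref in zip(predicted, reference):
        pred, ref = set(pred), set(ref)
        tp += len(pred & ref)   # symptoms extracted and present in reference
        fp += len(pred - ref)   # symptoms extracted but not in reference
        fn += len(ref - pred)   # reference symptoms that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented example: two records with automatically tagged symptoms.
predicted = [{"headache", "cough"}, {"fever"}]
reference = [{"headache"}, {"fever", "cough"}]
precision, recall = symptom_precision_recall(predicted, reference)
```

Computing the same three numbers for each model on a shared test set gives a like-for-like comparison of BIRNN, CNN, and RNN.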
In medicine, disease classification reflects the medical level of a hospital to a certain extent. Disease classification refers to the scientific classification of various diseases through coding to provide a basis for clinical diagnosis and treatment. Hospitals' requirements for accurate disease classification have also raised the demands placed on disease coders [
The experiment analyzes the causes of disease-coding errors. One cause is that diseases that should be coded together are not given a combined code. Studies have found that coding undiagnosed diseases as diagnosed diseases is also one of the reasons for coding errors [
This experiment shows that the main reason for coders' main diagnosis selection errors and coding errors is that they do not fully understand the various diseases and their classification. Therefore, relevant training in this area should be strengthened.
The study analyzes the reasons for main diagnosis selection errors before and after training in obstetrics and gynecology, endocrinology, oncology, orthopedics, and neurosurgery, as well as the disease-coding errors in these departments. The numbers of errors before and after training differ significantly between the groups. This indicates that the coding personnel do not have a thorough understanding of various diseases and their specific classification, and that the medical staff do not fill in the medical records clearly. Therefore, the relevant training of coders and medical personnel should be strengthened. However, the study has some shortcomings, such as the small number of samples. In future work, the scope of sample collection can be expanded to provide further support for research on main diagnosis selection and disease coding.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare no conflicts of interest.