IoT Based System for Heart Monitoring and Arrhythmia Detection Using Machine Learning

Internet of Things (IoT) technologies allow building a digital representation of people, objects, or physical phenomena to be available on the Internet. Thus, stakeholders can access this information from remote places or computational systems could analyze this data to find patterns, make decisions, or execute actions. For instance, a doctor could diagnose patients by analyzing the received data from an IoT system even when patients are located in a remote place. This article proposes an IoT system for monitoring electrocardiogram (ECG) signal and processing heart data in order to generate an alert when an arrhythmia is present. This system involves a Polar H10 heart sensor, machine-learning models to classify heart events, and communication technology to share and store patient's information. In the first place, the architecture of the IoT monitoring system and the communication between the components are described by discussing the designing criteria. Second, the experimentation process performs the training and the assessment of three classification algorithms, random forest, convolutional neural network, and k-nearest neighbors. The results show that k-nearest neighbor has the best accuracy percentage classifying the arrhythmias under study (premature ventricular contraction 94%, fusion of ventricular beat 81%, and supraventricular premature beat 82%); also, it is able to discern normal and unclassifiable beats with 93% and 97%, respectively.


Introduction
Te Internet of Tings and artifcial intelligence are emergent felds where researchers are developing methods and techniques to reduce human efort [1,2]. While IoT is relying on communication technology, sensors, and data processing technics [3]; classifcation algorithms are used to recognize a hidden pattern in the input data and identify the class where it belongs [4].
It is possible to apply this technology in felds such as industry, farming, meteorology, or medicine. For instance, a medical cyber-physical system (MCPS) is a concept where software, sensors, and networks interacting with physical human signals in order to improve the efectiveness of the patient's medical care. In this way, sensors capture biological signals and send them through a network to be processed by software algorithms that extract useful information and take actions, such as emitting an alert, diagnosis, or even providing treatment in order to easily improve the patient's health care [5]. Due to the high costs associated with inhospital care, alternatives including home care, assisted living, telemedicine, and sport-activity monitoring, have drawn more interest. Likewise, mobile monitoring and home monitoring of vital signs and physical activities allow health to be assessed remotely at all times [5].
On the other hand, according to the World Health Organization, cardiovascular disease is the leading cause of death with an estimated 32% of all deaths worldwide [6]. Similarly, sudden cardiac death (SCD) and arrhythmia represent between 15% and 20% of all deaths in the world.
Sometimes arrhythmias are detected when complications appear; knowledge of the characteristic ECG changes may provide early clues to the presence of these disorders, the prompt recognition of which can be lifesaving [7]. Terefore, continuous heart monitoring is required for early detection in order to give a suitable treatment for patients with cardiac arrhythmias [8].
In order to detect and diagnose heart diseases, the electrocardiogram (ECG) is an indispensable tool for monitoring the electrical activity of the heart [9]. Te ECG has allowed for studying and classifying diferent arrhythmias types according to their particular characteristics. Based on this typology, diferent studies have proposed using machine learning in order to extract ECG data patterns for the early identifcation of arrhythmias. In this way, machine learning algorithms such as k-nearest neighbors (KNN), convolutional neural networks (CNNs), and random forest has shown interesting result in arrhythmia classifcation [10][11][12].
In this work, we propose a system for monitoring the patient's heart signal and analyzing the heart data by using machine learning for detecting arrhythmias. Tus, when the machine-learning algorithm detects an arrhythmia, it generates an alert. In order to allow caretakers to see the patient's heart activity, the system provides a web application that displays the ECG chart and the generated alerts.
For capturing the heart signal, the Polar H10 heart monitor is selected because of its good performance [13]. In addition, it provides an ECG sensor and software necessary for accessing the captured data and sending it to the mobile terminal (via Bluetooth). Tis last, send the heart data through the message broker, which delivers the published data to the classifcation component.
Te classifcation component includes a classifcation algorithm selected by testing three classifcation algorithms (CNN, KNN, and random forest) and assessing their performance. It is in charge of receiving the heart data and recognizing patterns in order to classify the input data as arrhythmia (premature ventricular contraction, fusion of ventricular beats, or supraventricular premature beats) or normal beats. If the input data are an arrhythmia, the system generates an alert to avoid complications in the patient's health.
Te goal of this IoTsystem is to improve the medical care service to people at potential risk for heart incidents located at remote places by providing a monitoring system that helps doctors in the patient's heart signal supervision. Besides, authorized people can supervise patients accessing their ECG via the web.
Te remainder of the article is organized as follows: Section 2 introduces previous works in reference to cardiovascular disease prediction and supervision systems based on IoT. Section 3 describes the MIT-Arrhythmia database used for training the classifer models. Section 4 shows the system's architecture describing each component proposed. Section 5 presents the experimentation for building the classifcation model. Section 6 presents the results and discussion. Finally, Section 7 resumes the considerations and conclusions of the work.

Related Work
Interest in machine learning applications such as health care and medical science is growing considerably [14]. For instance, Ellaji et al. design an architecture of a narrowband IoT system to integrate wearable devices to improve medical health care. Te main benefts of using narrowband IOT are its low and low power consumption. Terefore, it provides a novel mechanism for linking all the smart devices, which requires a very less amount of data and long-range intervals. Tis improves hospital data transmission and accomplishes data processing in real time. Te narrowband IoT system includes an application server, NB-IoT devices, cloud stage, and customer applications. First, NB-IoT devices send sensing data to the cloud platform and it stores the data. Diferent applications process data subsequently clients can access this information through a user interface [15].
Ahamed et al. [16] propose a cardiovascular disease prediction system applying fve machine-learning algorithms (random forest, decision tree, Naive based, k-nearest neighbors, and support vector machine). Teir methodology includes four tiers as follows: (1) Data collection: tier 1 collects data from IoT sensors or wearables devices using TingSpeak Cloud, which is an Internet of things analytics service that allows aggregating, visualizing, and analyzing live data streams in the cloud [17]. (2) Data storage: tier 2, preprocesses and stores data using Heroku's PostgreSQL cloud, an open-source database as a service that provides a trusted, secured, and scalable storage platform [18]. (3) Data Analysis on a cloud: this tier uses diferent machine learning models to predict the output class for heart input data. Tey choose the model based on the best performance. (4) User interface: this tier provides a graphical user interface to allow doctors and patients to access clinical results.
Te work [19] presents an architecture of an embedded platform for web-based monitoring and medicine schedule reminder. Tis platform allows patients and health professionals to access the data captured by the sensor through a web monitoring system and it generates an alert according to the medicine schedule. Sensors transmit data through ZigBee or Bluetooth to manage units and RFID allows linking patient records with unique identifcation numbers. Additionally, they use the MQTT protocol to share data between components.
Barbosa et al. designed a telemonitoring system for patients with cardiovascular disease [20]. Tis system allows caretakers to monitor the signals and symptoms of patients with heart disease while patients reside at home, in this sense, telemonitoring means monitoring from a remote place. On the other hand, this system follows the continua design guidelines (CDG) in order to provide security to medical data fow among sensors, gateways, and health information services. Te CDG is an open framework for secure and interoperable health data exchange in personal connected health. Te CDG provides a set of clearly defned interfaces that enable the secure fow of medical data among sensors, gateways, and end services, removing ambiguity in underlying standards to ensure a consistent and interoperable ecosystem of personal connected health devices [21].
Heart and breath data are telemonitoring in the system presented in [22]. A DatascopeTrio acquires heartbeat and breathing frequency then this data are stored in a database and a JSP-based web allows clients to monitor patient's data.
Works as [23] present a self-diagnosis system of cardiovascular disease based on IoT. Sensors transmit the data to an Android application via Bluetooth then, the application stores this data using a CRUD REST API. Finally, the heart rate data are analyzed using the theory of probability. Te previous system is used for studying stress' efects on cardiac dynamics [24].

Background
As follows, some necessary concepts and backgrounds related to ECG, arrhythmias, IoT, and machine learning are presented.
3.1. Electrocardiogram. Electrocardiogram (ECG) lead is a recording of the heart's electrical activity. Terefore, a 12lead ECG is a recording of cardiac electrical activity from 12 diferent perspectives [25]. Tis ECG confguration builds up a three-dimensional characterization of the heart's electrical activity. Figure 1. Shows the twelve lead groups, on the right the frontal leads, and the transverse leads on the right.
Vertical plane (frontal leads): six frontal leads provide information about the heart's vertical plane: lead I, lead II, lead III, augmented vector right (aVR), augmented vector left (aVL), and augmented vector foot (aVF). In this confguration, three electrodes are placed on the right arm, the left arm, and the left leg. Given the electrode placements, these leads detect electrical activity in the frontal plane.
Horizontal plane (transverse leads): this confguration provides information about the heart's horizontal plane: V1, V2, V3, V4, V5, and V6. On the right side of Figure 1, is the localization of the electrodes of transverse leads. Te transverse leads are unipolar and require only a positive electrode, the negative pole of all 6 leads is found at the center of the heart [27].

Modifed Limb Lead.
During exercise stress testing, the motion of the limbs can disrupt the ECG recordings and in conditions where limbs become clinically inaccessible, the modifed limb electrode position on the torso addresses the above problem [28]. In the modifed limb lead (MLL) confguration, the right arm electrode is placed on the subject's third right intercostal space, slightly to the left of the midclavicular line. Te left arm electrode is placed in the ffth right intercostal space, slightly to the right of the midclavicular line, and the left leg electrode is placed in the ffth right intercostal space, on the midclavicular line [29].

Ambulatory ECG.
Ambulatory ECG (AECG) devices allow the cardiac rhythm to be monitored and recorded over days, weeks, or years and are used primarily in the outpatient setting [30]. Recording may take place continuously or occur intermittently in response to patient activation or autosensing of rhythm disturbances. Multiple devices are available, being the Holter monitors, ambulatory telemetry monitors (ATMs), and patch electrode monitors (PEMs) are the most common [31]. Other ambulatory ECGs include an event monitor, external loop monitor, implantable loop monitor, and pacemaker [32]. Journal of Healthcare Engineering

Cardiac Arrhythmia.
Cardiac arrhythmias are disturbances in the rhythm of the heart, manifested by abnormally fast or slow rates [33]. Sinus rhythm represents the heart's normal rhythm and its rate is between 60 and 100 beats per minute (bpm) while you are resting. Sinus bradycardia occurs when the sinus rhythm is slower than 60 bpm, and if the sinus rhythm is faster than 100 bpm, it is called sinus tachycardia [34]. Te typical arrhythmia symptoms include palpitations (being aware of your heartbeat), weakness, shortness of breath, lightheadedness, dizziness, fainting (syncope), and, occasionally, chest pain. Te symptoms tend to be more severe when the rate is faster. However, sometimes there are no symptoms, and the condition is evident during a routine examination. A tachyarrhythmia that is rapid enough and lasts long enough can produce cardiomyopathy and congestive heart failure. In these cases, treatment of the arrhythmia can often return normal function to the ventricles. Although certain physical signs present during arrhythmias can help the physician make a correct diagnosis, electrocardiography is the standard method used for recognizing cardiac arrhythmias [33].
Typical sinus bradycardia causes are as follows: (i) Medicines such as beta-blockers, including eye drops, which contain beta-blockers (ii) An underactive thyroid gland (iii) Hypothermia (iv) Tachy-brady syndrome Typical sinus tachycardia causes are as follows: (i) Underlying health conditions such as an overactive thyroid gland (ii) Severe anemia (iii) Infections (iv) Severe blood loss (v) Stimulants such as cafeine, nicotine, and alcohol (vi) Prescribed medicines such as salbutamol (vii) Illegal drugs such as amphetamines, cocaine, ecstasy, and cannabis

Arrhythmia Types.
Tere are many diferent types of arrhythmia; the type depends on where the rhythm starts in the heart, and whether the heartbeat is too fast, or too slow. Premature heartbeat: A premature, or extra, beat is a common, usually harmless type of arrhythmia that typically does not cause symptoms. Most healthy people who experience an occasional extra beat do not need treatment. However, a premature heartbeat can lead to a longer-lasting arrhythmia in persons with previous heart disease.
Supraventricular arrhythmias (SVT): these arrhythmias are tachycardias that start from above the ventricles. SVTs are quite common and rarely life-threatening.
Te principal types of supraventricular arrhythmias include [35]the following: Inappropriate sinus tachycardia: this is a fast heart rhythm, which can happen suddenly, with no obvious cause. While resting, the heart rate can quickly rise to over 100 bpm and with a very small amount of activity; it can quickly rise to 150 bpm. It is more common in young women [34].
Atrial fbrillation (AF): this happens when many impulses begin and spread through the atria. Tis results in a disorganized rapid and irregular rhythm. Because the impulses are traveling through the atria in a disorderly fashion, there is a loss of coordinated atrial contraction.
Atrial futter: an atrial arrhythmia caused by one or more rapid circuits in the atrium. Atrial futter is usually more organized and regular than atrial fbrillation. Supraventricular premature beats (SPB): these represent premature activation of the atria from a site other than the sinus node and can originate from the atria or the atrioventricular node, though the vast majority are atrial in origin. Atrial premature beats (APBs) are also referred to as atrial premature complexes, premature atrial beats, and premature atrial complexes. Tis includes patients without structural heart disease and those with any form of cardiac disease, independent of severity [36].
Ventricular Arrhythmias: ventricular arrhythmias are fast, abnormal heart rhythms that start from the ventricles. Most ventricular arrhythmias are caused by underlying heart disease, and can often be life-threatening [35]. Types of ventricular arrhythmias include.
Fusion beats: it is the presence of fusion beats, which identify simultaneous depolarization of the ventricle by both the normal conduction system and an ectopic impulse originating in the ventricle.
A fusion beat occurs when electrical impulses from diferent sources act upon the same region of the ventricular chambers at the same time.
Premature ventricular contractions (PVCs): they are early, extra heartbeats that start out in the ventricles. Most of the time, PVCs do not cause any symptoms or require treatment [35].
Ventricular tachycardia (VT): it is a rapid heartbeat that begins in the ventricles. Te rapid rhythm keeps the heart from adequately flling with blood, and less blood is able to pump through the body. VT can be serious, especially in people with heart disease, and may be associated with more symptoms than other types of arrhythmia.
Ventricular fbrillation (VF): it is an erratic, disorganized fring of impulses from the ventricles. Te ventricles quiver and cannot generate an efective contraction, which results in a lack of blood being delivered to your body.

Medical Cyber-Physical
System. Medical cyber-physical systems (MCPS) are the combination of embedded software controlling devices, networking capabilities, and the complicated physical dynamics of the human body in order to improve the efectiveness of medical care by monitoring patients' physical signals and providing proper treatment while ensuring safety [5]. Depending on their main functionality, it is possible to categorize the MCPS devices into two large groups as follows: (i) Monitoring devices: they provide diferent kinds of clinic-relevant information about patients such as heart rate, oxygen level, blood pressure, and temperature (ii) Delivery devices: they provide an actuate therapy in order to change the patient's physiological state In MCPS, interconnected monitoring devices can feed collected for decision support. For example, caregivers can analyze the information provided by monitoring devices and then use delivery devices to initiate treatment. In order to illustrate applications of MPCS examples are presented as follows: Pumps are commonly used to deliver opioids for pain management but patients have diferent reactions to the medications and require distinct dosages and delivery schedules. A major problem with opioid medications in general is that an excessive dose can cause respiratory failure. An MCPS can be designed to deliver the doses according to the patient's profle in order to prevent an overdose. Additionally, it can monitor the oxygen level, if some complication appears it can emit an alert.
Sometimes is necessary to take X-ray images during surgical procedures under general anesthesia. During the procedure, the patient breathes with the help of a ventilator. Because patients on ventilators cannot hold their breath to avoid blur in the X-ray image caused by moving lungs, the ventilator has to be paused and later restarted. In some unfortunate cases, the ventilator is not restarted, leading to the death of the patient. A safer alternative is to let the ventilator transmit its internal state to the X-ray machine. In this approach, the X-ray machine receives the state of the breathing cycle in order to decide the convenient instance when it can take an X-ray image [5].

MQTT Protocol.
MQTT is a client-server publish/subscribe messaging transport protocol. It is lightweight, open, simple, and designed to be easy to implement. Tese characteristics make it ideal for use in many situations, including constrained environments such as for communication in machine-to-machine and Internet of Tings (IoT) contexts.
Te protocol runs over TCP/IP, or over other network protocols that provide ordered, lossless, bi-directional connections. Its features include; the use of the publish/ subscribe message pattern providing one-to-many message distribution and decoupling of applications [37]. Client: a program or device that uses MQTT and always establishes the network connection to the server. Server: a program or device that acts as an intermediary between clients, which publish application messages, and clients, which have made subscriptions.
Topic Name: it is the label attached to an application message, which matches the subscriptions known to the server. Te server sends a copy of the application message to each client that has a matching subscription [37].
EMQX is an open-source MQTT broker with a highperformance real-timemessage-processing engine, powering event streaming for IoT devices at a massive scale [38].
Te benefts of EMQX broker are as follows: (i) Massive Scale: scale to 100 million concurrent MQTT connections with a single cluster (ii) High performance: move and process millions of MQTT messages per second in a single broker (iii) Low latency: guarantee submillisecond latency in message delivery with the soft real-time runtime (iv) High scalability, security, and reliability 3.5. JavaScript Object Notation. JavaScript Object Notation (JSON): is a lightweight data-interchange format. It is written as such a property list enclosed in braces ({}), with name-value pairs separated by commas (), and with the name and value of each pair separated by a colon (). In JSON each property name and each string value must be enclosed in double quotation marks (") [39].
Te benefts of JSON are as follows: (i) JSON is a text format completely language independent (ii) It is easy for humans to read and write (iii) It is easy for machines to parse and generate 3.6. Machine Learning. Machine learning has three main learning approaches, supervised learning, unsupervised learning, and reinforcement learning [40]. Te most general problem approachable by a supervised method is predicting a variable (answer or label) when some others (predictors) are known. As an example, in medical diagnosis, the predictors could be clinical parameters such as temperature, blood pressure, and cholesterol, and the answer could be the knowledge if the patient is sick or not.
Supervised learning includes a frst stage called training, where the predictors and the answer (labeled data) are provided letting the system learn how to fnd an answer based on the predictors. If the training is successful, the algorithm is able to automatically predict the answer for any other datasets including the same predictors [41].
Some remarkable supervised learning algorithms are the decision tree, random forest, logistic regression, and KNN [42]. Regarding this project, the following algorithms are used: 3.6.1. Convolutional Neural Network. Convolutional neural network (CNN) is a deep learning model for handling information that has a grid pattern, such as images; however, they are also efcient with audio data, temporal series, and signals. Tis algorithm learns spatial hierarchies of features, from low-to-high-level patterns. CNN has typically three types of layers, convolution, pooling, and fully connected. Te convolutional layer is responsible for extracting features from the data, while the polling layer is in charge of reducing the dimensionality of the feature map keeping the essential elements. Once the features extracted by the convolution layers and downsampled by the pooling layers are created, they are mapped by a subset of fully connected layers to the fnal outputs of the network (classes) [43].
Te advantages of CNN are as follows: (i) It automatically detects important features without any human supervision (ii) High accuracy in image recognition and classifcation (iii) CNN is computationally efcient by minimizing computation operations

K-Nearest Neighbors.
Te classifcation approach of K-nearest neighbors [44] is an instance learning where the algorithm classifes new observations based on the similarity to the rest of the data points stored in the model. Te process takes a new observation and selects the K closest points to the new example then selects the most common class between them by the majority vote [45]. Te classifcation has two stages, frst determining the nearest neighbors and second, deciding the class based on these neighbors. Figure 2 illustrates the KNN process: in this case, the algorithm fnds the nearest neighbors, three triangles, one circle, and one plus. Te majority is for the triangle class with three votes; therefore, the algorithm classifes the element (?) as a triangle class.
Te advantages of KNN are as follows: (i) It has no training period, it just stores the dataset during the training period and learns from it when performs real-time predictions (ii) Since the KNN algorithm requires no training before making predictions, new data can be added and this will not afect the accuracy of the algorithm (iii) Low-complex implementation 3.6.3. Random Forest. Te random forests algorithm is a combination of multiple tree predictors where each tree depends on a random vector. Tis algorithm generates a large number of trees using the training dataset, then they vote for the most popular class [46]. Figure 3 represents a general schema of the random forest process as follows: Random forest procedure as follows: (i) Step 1: select random samples from the training dataset. (ii) Step 2: build a decision tree for the K data point. (iii) Step 3: chose the n-tree subset from the generated trees and execute step 1 and step 2. (iv) Step 4: the fnal class is the most voted one.
(v) Tey also ofer a superior method for working with missing data. Missing values are substituted by the variable appearing the most in a particular node. Among all the available classifcation methods, random forests provide the highest accuracy.
Te advantages of the random forest are as follows: (i) It can handle big data with numerous variables running into thousands (ii) It can automatically balance data sets when a class is more infrequent than other classes in the data Te method also handles variables fast, making it suitable for complicated tasks.

Confusion Matrix.
Te confusion matrix is a tool for measuring the performance classifcation of machine learning models. In addition, we can say confusion matrix is a summarized table of the number of correct and incorrect predictions generated by a classifer. Table 1 represents an example of a confusion matrix as follows: Accuracy is a measure of how many correct predictions the model made for the complete test dataset. Tis is a convenient metric to evaluate the performance of classifcation models [47].

MIT-BIH-Arrhythmia Database
Te MIT-BIH-Arrhythmia database is a set of standard test materials for the evaluation of arrhythmia detectors. Tese detectors are algorithms that identify normal or abnormal signals by using detection or classifcation techniques [48]. Te MIT-BIH-Arrhythmia is a convenient testing dataset since it provides a variety of and well-characterized arrhythmia data [49].
Tis database contains samples of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by Boston's Beth Israel Hospital Laboratory. In most records, one channel is a modifed limb lead II (MLII), obtained by placing the electrodes on the chest. and the other channel is usually V1 and, to a lesser extent, V2, V4, or V5 [50]. Tis work takes samples of fve classes from the MIT-BIH-Arrhythmia database as follows: (i) N: normal beat (ii) S: supraventricular premature beat (iii) V: premature ventricular contraction (iv) F: fusion of ventricular and normal beat (v) Q: unclassifable beat

System Architecture
Tis section presents the architectural design decisions to accomplish the requirements of the system. In this sense, four components interoperate to capture and process data to present useful information to the user, as shown in Figure 4.
Sensing manager: it is a mobile application that manages the sensing data that the ECG sensor sends via Bluetooth.
CRUD REST API: it provides create, read, update, and delete operations to other components in order to exchange information with the database.
Monitoring GUI: it is a web application where users can access for monitoring patients through a graphic user interface.
Data structures: in order to exchange information between components, every data structure is defned in JSON format given the benefts of easy parsing and reading data.
Data exchange: a message broker allows sharing between components.  Coupling is an undesirable interdependence between components or modules. It is suitable in software design to keep low coupling because it allows making changes in one component without afecting the others.
Given the performance of the EMQX broker for IoT devices, it is selected to manage the message exchange in the system.

Data Storage.
All data needed by the system is stored on a document database due to its capability of storing data structures such as JSON directly. Tis means it is not required to transform the JSON element data because document databases store data as JSON documents, instead of columns as needed by a relational database. Tis feature eases the data persistence implementation.
Patients' personal data (id and names), heart data events (id, ECG, and timestamps), and medical staf data (id, names, and credentials) are represented in JSON format.

CRUD Component.
Te purpose of the REST API is to provide the required CRUD operations (create, read, update, and delete) on the database. Tis implies just API REST has direct communication with the database; therefore, every component that needs to exchange data with the database has to do it by sending HTTP requests to the REST API. In this way, every component has to implement a RESTclient to use the operation that REST API exposes. Te detail of the CRUD operations are as follows: Create operation: the API provides endpoints for saving heart incidents when arrhythmia is detected, for this purpose, the REST client must send this data (patient Id, type of arrhythmia, and timestamp) in the request, and the API saves the register in the database.
Read operation: the API provides endpoints for reading heart data by sending the patient's Id. In the same way, medical stuf data (name, id, and credentials) are returned when the proper id is sent in the request.
Update operation: the API provides endpoints for updating personal data or credentials by sending the proper id and the data to be updated.
Delete operation: the API provides endpoints for deleting the register by sending the id. Tis operation is enabled just to unsubscribe medical stuf.
Although API REST can handle diferent formats to exchange data, JSON is suitable to meet the need of this system, so all data structures use JSON format.
Given that API REST provides the required operation on the database, all components must implement REST clients to perform these operations instead of implementing database clients it selves. Tis implies REST API depends on the database implementation but avoids coupling with other components. Tis means any change in the database implementation afects the API REST but the other components have no impact.

Arrhythmia Detection Component.
Tis component requires a machine learning model to recognize arrhythmia patterns, such as premature ventricular contraction, a fusion of ventricular beat, or supraventricular premature, on the heart data. For this purpose, an experimentation phase is performed in order to evaluate diferent classifcation algorithms and select the model with the best performance (see section 7). As a result of this experimentation, the selected model is integrated into the detection component.
Te detection component is a subscriber of the heart data published by the sensing component. Terefore, when heart data are published, the MQTT sends it to the arrhythmia detection component. Ten, the machine-learning model processes the information and performs the classifcation. If the algorithm classifes the data as any arrhythmia, the component registers the alert using the proper operation. In this way, API REST creates a new record of heart data incidents in the database.

Monitoring Component.
Te heart monitor is a web application that provides a graphical user interface where authorized people access patients' data. When a user tries to login into the system by sending his credentials, the web application read the corresponding staf data by using a proper read operation from API REST. If the credentials are correct, the login is valid and the user can access to the system. ECG data are received from the message broker in order to build the ECG chart in real time. Additionally, the alert list is disposed of in the web application; this data are obtained through the proper API REST read operation.

Experimentation
Tis section describes the arrhythmia dataset and explains the applied methodology in this work for building and choosing the classifcation model.

Model Building Process (Ofine).
Tis process split the dataset into two subsets, 70% of the data are for training and 30% for testing; this is a typical proportion used in machine learning [52].
Te algorithm learns from the training data where each sample has a feature vector and a known category so the model adjusts its parameters with every sample to improve the classifcation.
In the next step, the algorithm receives data from the testing group with unknown categories, and the algorithm has to classify each input element in the correct category or class. For measuring the model accuracy, the convolutional matrix is generated. Te classifcation accuracy is represented by the percentage of data labeled in the right category and misclassifcation take place when samples are classifed into categories they do not belong.
Training samples have the same quantity for each class in order to balance the training set as shown in Figure 5.
It is necessary to normalize the sample and take the same number of characteristics in the feature vector. Tereby, 180 features are included and they are normalized between zero and one.
Based on this dataset, three machine-learning algorithms receive this dataset to build the classifcation models. Ten, each model runs tests with the validation data to measure its classifcation performance. Tis test gives us the percentage of success and failure in the classifcation. Te model with the best classifcation performance is selected as a part of the detection component.

Arrhythmia Detection Component.
Te arrhythmia component integrates the best classifcation model from the ofine process. It receives data from the message broker to process and classify into the fve given classes. Each input data corresponds to the output label. When the output label is ventricular, supraventricular, or fusion beat, the component registers an alert in the database using the CRUD

. Results and Discussion
Tis section resumes the results and discussion about the main fndings for architectural decisions and experimentation.

Architectural Decisions.
Te proposed architecture allows the decoupling of each component of the system keeping the single principle responsibility [53] and they communicate with each other by sharing data in the message broker. Tis desegregation permits every component to take advantage of diferent technologies and programming languages, for example, python for data analysis, angular for graphical user interface, and spring boot for the API REST implementation.
By selecting JSON as a format for data exchange allows for simplifying data handling by every component. On the other hand, the document database let the direct storage of data in JSON format and transformation is not necessary.
Finally, instead of connecting directly with the database, every component uses the CRUD operations that API REST provides. Tis approach simplifes the implementation given every component that needs stored data to have to use the proper API REST operation instead of implementing database connection by itself.

Model Training and Performance Assessment.
Te classifer model is an essential part of the system because it is in charge of receiving heart data and categorizing in any of the fve classes. Tus, errors in classifcation could lead to a fake alert or hiding real heart incidents so it is necessary to test diferent classifer models and select the model with the best performance.
Te frst approach is a convolutional neural network (CNN). Te confusion matrix in Figure 6 shows the results of this algorithm. Tis reveals a high classifcation accuracy for normal beats, a fusion of ventricular, and unclassifable beats but low performance for other classes.
CNN classifes 83% of the normal beats correctly but 7% as a fusion of ventricular, 5% as premature ventricular contraction, 3% as unclassifed, and 2% as supraventricular. Fusion of ventricular class has 91% of data well classifed but 1% are misclassifed as ventricular and 9% as normal. Similarly, 89% of unclassifed class belongs to the correct category but 9% are labeled as normal and 1% as ventricular.
Instead of previous classes, supraventricular classifcation has a poor performance, just 48% of data are in the correct category, and almost half of them 47% are misclassifed as supraventricular, 2% as ventricular, 3% as fusion beats, and 1% as unclassifed. Likewise, Ventricular beats have 61% of accuracy, and 21% are labeled as normal, 2 as supraventricular, 7% as fusion beats, and 9% as unclassifed beats.
Te most remarkable results are 47% of supraventricular and 21% of ventricular data are classifed as normal beats. In practice, the system skips a high number of incidents of these classes so it never registers these events because it considers them as normal beats. For this reason, it is necessary to fnd a better performance in the classifcation.
Te second approach uses a random forest algorithm to perform the classifcation; Figure 7 shows the confusion matrix. Tis algorithm improves the classifcation accuracy for some classes but reduces it for others.
Classifcation accuracy increases by 9% for normal class and supraventricular. However, accuracy decreases by 8% for fusion beats and 4% for unclassifed beats.
Despite a better performance for normal and supraventricular classifcation, the algorithm reduces accuracy for fusion and unclassifed beats.    Focus on the misclassifcation percentage of supraventricular data, it is remarkable that the reduction of data labeled as normal being supraventricular. It implies this algorithm detects 6% additional incidents of this type but increments by 11% for ventricular, 6% for fusion beats, and 5% for unclassifed beats in the percentage of misclassifcation as normal.
Te results show the general performance of the random forest algorithm is worse than the convolutional neural network performance. Terefore, it is not suitable to use this algorithm for this work.
Te third approach is a classifer based on k-nearest neighbors, Figure 8. Tis algorithm achieves a considerable improvement in classifcation accuracy, especially for normal, supraventricular, and unclassifed classes.
It gets 93% of correct classifcation for normal class reducing the misclassifcation error as follows: supraventricular 4%, ventricular 2%, fusion beats 1%, and no errors in unclassifed.
In regards to supraventricular classifcation, random forest, and CNN algorithms have problems distinguishing supraventricular samples from normal samples so they reach errors greater than 40%. By contrast, the KNN algorithm gets the best performance in this topic with 82% of accuracy and 13% of data misclassifed as normal, 3% as ventricular, 1% as fusion beats, and 0% as unclassifed.
KNN has an excellent performance classifying ventricular samples with 94% of accuracy, 33% greater than CNN and random forest results. On the other hand, 2% of ventricular samples are labeled as normal, 1% as supraventricular, 2% as fusion beats, and 1% as unclassifed.
Te accuracy of KNN classifying fusion beats is worse than CNN and random forest where they have 10% and 2% higher accuracy, respectively, than KNN. Tis increment of the error is related to 11% of fusion beats samples misclassifed in the ventricular class. However, in practice, this behavior is better than having a high percentage in any class of arrhythmia that is considered a normal beat because it implies the system generates no alert.
KNN classifes 97% of unclassifed samples correctly labeling 2% of them as normal and 1% as fusion beats. Tis algorithm is able to diferentiate perfectly unclassifed samples from supraventricular and fusion beats.
Considering all of these, the KNN is the algorithm with the best performance result for this system; it improves the lowest scores obtained by CNN and random forest classifying supraventricular and ventricular samples. Tus, this model is selected to be part of the detection component.

Conclusions
IoT technologies solve problems in various felds such as health. In such a way, the remote monitoring system for cardiac signals, proposed in this work, enables remote supervision providing a support tool in the timely detection of problems related to arrhythmias such as ventricular, supraventricular, and fusion beats and improves medical service and patient's quality of life.
Architectural decisions presented in this work are useful in solving similar problems that require remote supervision of a given variable where IoT technology is suitable. For instance, separate components allow separating responsibilities and using diferent technologies according to the needs of every component. Tis approach helps in the decoupling, scalability, and maintenance of the system.
A nonrelational database provides a simple mechanism to store data such as JSON due to the fexibility it provides in the data structure. In addition, API REST simplifes database management providing CRUD operations instead of implementing a database client in every component. In this sense, required changes in the database imply modifcations only on the data management component.
On the other hand, it is evident that the KNN algorithm has superior performance over convolutional neural networks and the random forest, in the classifcation of supraventricular, ventricular, and fusion beats arrhythmias.

Data Availability
All data generated or analyzed during this study are included in the manuscript.

Conflicts of Interest
Te authors declare that they have no conficts of interest.