Predicting Asthma Outcome Using Partial Least Square Regression and Artificial Neural Networks

The long-term solution to the asthma epidemic is believed to be prevention and not treatment of the established disease. Most cases of asthma begin during the first years of life; thus, the early determination of which young children will have asthma later in life is an important priority. Artificial neural networks (ANNs) have already been used in medicine to improve the performance of clinical decision-making tools. In this study, a new computational intelligence technique for the prediction of persistent asthma in children is presented. By employing partial least square regression, 9 out of 48 prognostic factors correlated with persistent asthma have been chosen. Multilayer perceptron and probabilistic neural network topologies have been investigated in order to obtain the best prediction accuracy. Based on the results, it is shown that the proposed system is able to predict the asthma outcome with a success rate of 96.77%. The ANN with which this high reliability was obtained will help doctors to identify which young patients are at high risk of asthma disease progression. Moreover, this may lead to better treatment opportunities and hopefully better disease outcomes in adulthood.


Introduction
Artificial neural networks (ANNs) are one of the main constituents of artificial intelligence (AI) techniques. Besides their applications in many other areas, neural networks are also used in health and medicine, for example in biomedical signal processing, diagnosis of diseases, and medical decision making [1,2].
ANNs have an excellent capability of learning the input-output mapping from a given dataset without any prior knowledge or assumptions about the statistical distribution of the data [3]. This capability of learning from a dataset without any a priori knowledge makes neural networks suitable for classification and prediction tasks in practical situations. Furthermore, neural networks are inherently nonlinear, which makes them better suited to accurate modeling of complex data patterns than many traditional methods based on linear techniques. Due to their performance, they can be applied in a wide range of medical fields such as cardiology, gastroenterology, pulmonology, oncology, neurology, and pediatrics [1].
Several studies have proposed ANN models for the prediction of various diseases. The authors of [4] developed an ANN to determine whether patients had breast cancer or not. If they had, the type of cancer could be determined by using the ANN and BI-RADS evaluation, based on the age of the patient, mass shape, mass border, and mass density. In another study, an ANN model combined with six tumor markers was investigated as an auxiliary tool in the diagnosis of lung cancer, in order to differentiate lung cancer from benign lung disease, normal controls, and gastrointestinal cancers [5].
The most commonly used neural network for disease prognosis systems is the multilayer perceptron (MLP), due to its clear architecture and comparatively simple algorithm. The backpropagation algorithm is widely recognized as a powerful tool for training MLP structures. Even though MLPs have been successfully used in selected medical applications, they are still met with skepticism by many scientists in the medical community, due to the "black box" nature of the ANN procedure. Specht and Shapiro [6] have developed an alternative neural network, the probabilistic neural network (PNN), which uses Bayesian strategies for pattern classification, a process familiar to medical decision makers. PNNs are exceptionally fast, since their training phase requires only one pass through the training patterns. Because the PNN provides a general solution to pattern classification problems, it is suitable for disease diagnosis systems.
Asthma is a chronic inflammatory disorder of the airways characterized by an obstruction of airflow, which may be completely or partially reversed with or without specific therapy [7]. Airway inflammation is the result of interactions between various cells, cellular elements, and cytokines. In susceptible individuals, airway inflammation may cause recurrent or persistent bronchospasm, with symptoms like wheezing, breathlessness, chest tightness, and cough, particularly at night or after exercise. Most children who suffer from asthma develop their first symptoms before the 5th year of age. However, asthma diagnosis in children younger than five years old remains a challenge for clinicians [8][9][10].
Most of the time, it is difficult to discriminate asthma from other wheezing disorders of childhood because they may have similar symptoms. Thus, asthma in children may often be misdiagnosed as a common cold, bronchiolitis, or pneumonia. For the diagnosis of asthma, a detailed medical history and physical examination along with a lung function test are usually required. However, a lung function test is difficult to perform in children younger than six years old. Hence, the diagnosis in preschoolers is mainly based on clinical signs and symptoms and remains a challenge for the clinician. Finally, the main question is whether a patient with asthma symptoms before the 5th year will continue to have such symptoms or not. Asthma is a disease with a polymorphic phenotype affected by several genetic and environmental factors which play a key role in the development and persistence of the disease [11][12][13][14]. These factors include family history of asthma, presence of atopic dermatitis or allergic rhinitis, bronchiolitis episodes during childhood, maternal smoking during pregnancy, lower respiratory tract infections, the patient's diet, and several perinatal factors other than maternal smoking. Early identification of patients at risk for asthma disease progression may lead to better treatment opportunities and hopefully better disease outcomes in adulthood [15,16].
In preventive medicine, the value of a test lies in its ability to identify those individuals who are at high risk of an illness and who therefore require intervention, while excluding those who do not require such intervention. The accuracy of the risk classification is of particular relevance in the case of asthma. Due to the high prevalence of this condition, inaccurate risk prediction will lead to overtreatment of a large number of people and undertreatment of many others. In recent years, several large-scale studies have shown that in people at high risk of asthma the prevalence of asthma can be reduced if some common asthma triggers are avoided during the first years of life [17].
Several studies have used the Asthma Predictive Index (API) to answer the question of which young children with recurrent wheezing will have asthma at school age. The API was developed 12 years ago by using data from 1246 children in the Tucson Children's Respiratory Study [18]. A positive (stringent) API score requires frequent wheezing episodes during the first 3 years of life and either one of two major risk factors (parental history of asthma or eczema) or two of three minor risk factors (eosinophilia, wheezing without colds, and allergic rhinitis). A loose index requires any wheezing episodes during the first 3 years of life together with the same risk factors as the stringent index. A positive stringent API score by the age of 3 years was associated with a 77% chance of active asthma from the ages of 6 to 13 years, while over 95% of children with a negative API score never had active asthma during their school years. After the API, some other scoring systems were also developed in order to identify which young children will have asthma later in life [19]. To the knowledge of the authors, this is the first study in which ANNs are used for the prediction of persistent asthma.
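The API decision rule described above can be written down as a short sketch. The record type, field names, and function names below are hypothetical and introduced only for illustration; the rule itself follows the wording of the paragraph (frequent or any wheezing, plus one major or two minor risk factors) and is not a validated clinical implementation.

```python
from dataclasses import dataclass


@dataclass
class ChildRecord:
    """Hypothetical record holding the API items for one child."""
    frequent_wheezing: bool        # frequent wheezing in the first 3 years
    any_wheezing: bool             # any wheezing in the first 3 years
    parental_asthma: bool          # major risk factor
    eczema: bool                   # major risk factor
    eosinophilia: bool             # minor risk factor
    wheezing_without_colds: bool   # minor risk factor
    allergic_rhinitis: bool        # minor risk factor


def risk_criteria_met(child: ChildRecord) -> bool:
    """One of two major factors, or two of three minor factors."""
    major = sum([child.parental_asthma, child.eczema])
    minor = sum([child.eosinophilia,
                 child.wheezing_without_colds,
                 child.allergic_rhinitis])
    return major >= 1 or minor >= 2


def stringent_api_positive(child: ChildRecord) -> bool:
    """Stringent index: frequent wheezing plus the risk criteria."""
    return child.frequent_wheezing and risk_criteria_met(child)


def loose_api_positive(child: ChildRecord) -> bool:
    """Loose index: any wheezing plus the same risk criteria."""
    return child.any_wheezing and risk_criteria_met(child)
```

A child with occasional wheezing, eosinophilia, and allergic rhinitis would be positive on the loose index but negative on the stringent one under this reading.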
The paper is organized as follows: Section 2 presents the experimental material; Section 3 describes the feature selection method and the prognosis model; the results and the final conclusions are given in Sections 4 and 5, respectively.

Description of the Asthma Database
Data from 148 patients from the Pediatric Department of the University Hospital of Alexandroupolis, Greece, were collected and recorded during the period 2008-2010. A group of 148 patients who received a diagnosis of asthma were studied prospectively from the 7th to the 14th year of age. All patients with missing data were excluded, leaving a total of 112 patients. A case history, including data on asthma, allergic diseases, and lifestyle factors, was obtained by questionnaire. All participants, parents and their children, filled out a questionnaire about asthmatic and allergic symptoms, wheezing episodes until the 5th year, pet keeping, family members, parental history, and some other useful information. The prognostic factors that were used in the questionnaire have been described in previous studies [11][12][13][14][15].
All 48 prognostic factors are summarized in Table 1. Some of these factors must be encoded before they can be used efficiently; their encoding is presented in Table 2.

Partial Least Square Regression.
The selection of input features plays a very important role in the successful implementation of prediction problems [20]. It is, therefore, necessary to use the inputs that carry the maximum amount of information about the output. Redundant or uninformative inputs may degrade the performance of the ANNs. In addition, the detection of the essential diagnostic factors might support the use of smaller and simpler datasets for ANN training, as the number of input features is directly related to the dataset size. Reducing the dimension of the feature space could lead to a quicker and possibly more accurate classifier [21,22]. Partial least square (PLS) regression is applied for the selection of the most relevant input features among the preselected factors [23]. PLS regression is a technique used with data which contain correlated predictor variables. This technique constructs new predictor variables, the so-called components, as linear combinations of the original predictor variables. PLS constructs these components while considering the observed response values, leading to a parsimonious model with reliable predictive power.
Let X be the matrix whose rows represent the predictor variables, some of which are highly correlated, and whose columns represent the patients. Additionally, let Y be the matrix whose rows represent the asthma outcome and whose columns represent the patients. In PLS regression, matrices X and Y are decomposed into principal components and regression coefficients (loadings):
$$X = T W^{T}, \qquad Y = U Q^{T},$$
where T and U are the matrices of scores and W and Q are the matrices of loadings. PLS regression places two conditions on the decomposition of X and Y [21]. The first requires orthogonality of W and Q and the second requires maximal correlation between the columns of T and U. After decomposition, U is regressed on T as follows:
$$U = T B + E,$$
where B is the matrix of regression coefficients for T and E is an error (noise) term.
In order to choose the number of components, 10-fold cross-validation was used. Overfitting was avoided by not reusing the same data to fit a model and to estimate the prediction error. Thus, the estimate of the prediction error was not optimistically biased downwards. After choosing the number of components, the PLS weights, which are the linear combinations of the original variables that define the PLS components, were investigated. The PLS weights were used in order to select only those variables which contribute the most to each component. The best prediction can be obtained by using only 9 factors: wheezing episodes until 5th year, wheezing episodes between 3rd and 5th year, wheezing episodes until 3rd year, weight, waist's perimeter, seasonal symptoms, FEF 25/75, number of family members, and corticosteroids inhaled.
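The idea of ranking variables by their PLS weights can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not in the study: a single response (PLS1), only the first component, standardized predictors, and a data layout with one row per patient (the transpose of the matrix X defined above). The function names, factor names, and toy data are hypothetical.

```python
import math


def first_pls_weights(X, y):
    """Unit-norm weight vector of the first PLS1 component.

    X: list of rows (one per patient), y: list of outcomes (0 or 1).
    Each weight is proportional to the covariance between a
    standardized predictor column and the centered response.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]
    weights = []
    for j in range(p):
        column = [row[j] for row in X]
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        if std == 0.0:
            weights.append(0.0)  # a constant factor carries no information
            continue
        weights.append(sum(((v - mean) / std) * t
                           for v, t in zip(column, y_centered)))
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights] if norm > 0 else weights


def top_features(X, y, names, k):
    """Names of the k factors with the largest absolute weight."""
    w = first_pls_weights(X, y)
    ranked = sorted(zip(names, w), key=lambda item: abs(item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy data: only the first (hypothetical) factor tracks the outcome.
X = [[1, 5.0, 0], [2, 5.1, 1], [6, 5.1, 0], [7, 5.0, 1]]
y = [0, 0, 1, 1]
names = ["wheeze_count", "noise_a", "noise_b"]
print(top_features(X, y, names, 1))
```

In the study, further components and the cross-validated choice of their number would also enter the selection; this sketch only shows how a weight vector translates into a ranking.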

MLP and PNN Classifiers.
Several factors are crucial in designing a feedforward neural network topology for prediction problems. Such factors are the configuration of the input, hidden, and output layers as well as the training methodology. In practice, the neural network architecture is determined by experimentation. In this paper, the input layer has 48 neurons, corresponding to the input features in the original dataset. It has been shown by Cybenko [24] and Patuwo et al. [25] that neural networks with one hidden layer are generally sufficient for most problems. Thus, all the networks investigated in this study use one hidden layer. There are many choices for the number of neurons in the hidden layer. In order to find the best neural network configuration, the simulations started with a minimal MLP neural network (48-1-1 structure) and nodes were added to the hidden layer step by step.
A single binary output node is employed, corresponding to the two classes of either having persistent asthma or not. The target value for the node is either zero (absence of asthma) or one (existence of asthma), depending on the desired output class. The simulation of all the ANNs has been performed using the Matlab Neural Network Toolbox due to its user-friendly interface [26].
In order to find the best transfer functions for the hidden and output layers, a trial and error method was applied. The best result was obtained with a network with a tan-sigmoid transfer function in the hidden layer and a saturating linear function in the output layer.
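As a rough illustration of the selected transfer functions, the following sketch evaluates a one-hidden-layer network with a tan-sigmoid hidden layer and a saturating linear output. The definitions follow the usual ones for these functions (tan-sigmoid equal to the hyperbolic tangent, saturating linear clipped to the interval from 0 to 1); the function names, weights, and inputs are hypothetical and are not the trained values from the study.

```python
import math


def tansig(x):
    """Tan-sigmoid transfer function, mathematically equal to tanh(x)."""
    return math.tanh(x)


def satlin(x):
    """Saturating linear transfer function: 0 below 0, x in [0, 1], 1 above 1."""
    return min(1.0, max(0.0, x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of an n-h-1 network (one weight row per hidden neuron)."""
    hidden = [tansig(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(hidden_weights, hidden_biases)]
    return satlin(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical 2-2-1 network, just to show the call.
output = forward([1.0, 0.0],
                 [[0.5, -0.5], [1.0, 1.0]],
                 [0.0, 0.0],
                 [1.0, 1.0],
                 -1.0)
print(output)
```

Because the output is confined to the interval from 0 to 1, it can be compared directly with the 0/1 targets.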
Training a neural network involves modifying the weights and biases of the network in order to minimize a cost function [27,28]. The cost function always includes an error term, which indicates how close the network's predictions come to the class labels for the examples in the training set. One of the most widely used error functions is the mean squared error (MSE), while the most commonly used training algorithms are based on the backpropagation algorithm.
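For concreteness, the mean squared error between network outputs and 0/1 class labels can be computed as in this small sketch; the function name and the example values are illustrative only.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between outputs and targets."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# An undecided network that always outputs 0.5 on 0/1 targets.
print(mean_squared_error([0.5, 0.5], [0, 1]))
```

An error close to 0.25 on binary targets is therefore what a network that has learned nothing would produce.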
In such an algorithm, the synaptic weights and biases are adjusted by backpropagating the error signal through the different layers of the network in a chain form. During the learning process, the weights of the nodes are adjusted so as to minimize the overall error:
$$E = \frac{1}{N} \sum_{n=1}^{N} \bigl( y(n) - t(n) \bigr)^{2},$$
where N is the number of patterns, y(n) is the predicted output, and t(n) the target. The Levenberg-Marquardt backpropagation learning algorithm was selected for the training of the ANNs due to its faster convergence and better estimated results than other training algorithms. PNNs, a variant of radial basis function (RBF) neural networks, were also used in order to predict the childhood asthma outcome. Although PNNs have had few applications in medical science, they have shown satisfactory performance.
A PNN consists of an input layer followed by a radial basis layer (hidden layer) and a competitive layer (output layer). A PNN has only one hidden layer, and the number of neurons in this layer depends on the number of patterns used during the PNN's construction. Consequently, the proposed PNN has 112 neurons in the hidden layer, as the available dataset for the PNN implementation consists of 112 cases. The design of a PNN is straightforward and does not depend on a training process. Thus, no learning algorithm was selected during the PNN's implementation. The number of neurons in the input layer is 48, equal to the number of input variables, while the number of neurons in the output layer equals the number of outputs.
The determination of the PNN structure for asthma outcome prediction was based on the number of input patterns used, as well as the spread of the radial basis function. The spread was increased from 0.1 to 100, with a step of 0.1.
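The structure described above (one radial basis neuron per stored pattern, followed by a competitive layer) can be sketched as follows. This is a minimal textbook-style PNN with a Gaussian kernel and per-class averaging, which assumes equal class priors; it is not the Toolbox implementation used in the study, whose kernel scaling may differ. Function names and the toy data are hypothetical.

```python
import math


def pnn_predict(train_X, train_y, x, spread):
    """Classify x with a simple probabilistic neural network.

    Hidden layer: one Gaussian neuron per training pattern.
    Output layer: the class with the largest average activation wins.
    """
    sums, counts = {}, {}
    for pattern, label in zip(train_X, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(pattern, x))
        activation = math.exp(-sq_dist / (2.0 * spread ** 2))
        sums[label] = sums.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda label: sums[label] / counts[label])


# Candidate spreads from 0.1 to 100 in steps of 0.1, as in the text.
spreads = [round(0.1 * k, 1) for k in range(1, 1001)]

# Toy data with two well-separated classes.
train_X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
train_y = [0, 0, 1, 1]
print(pnn_predict(train_X, train_y, [0.05, 0.1], spread=0.3))
```

No iterative training takes place: storing the patterns is the whole construction, which is why only the spread needs to be tuned, for example by evaluating each candidate value under cross-validation.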

Performance Evaluation.
The performance of the neural networks is estimated using false positive (FP), false negative (FN), true positive (TP), and true negative (TN) values. Classification of normal data as abnormal is considered as FP and classification of abnormal data as normal is considered as FN. TP and TN are the cases where abnormal data are classified as abnormal and normal data are classified as normal, respectively. The accuracy, sensitivity, and specificity are given by the following equations:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}.$$
Sensitivity and specificity are statistical measures of the performance of a binary classification test [29][30][31]. Sensitivity measures the proportion of positive (asthmatic) people who have been correctly identified as having asthma. Specificity measures the proportion of negative (not asthmatic) people who have been correctly identified as not having asthma. The accuracy is the degree to which the predicted values agree with the actual ones [32].
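The three measures can be computed from the four counts as in the sketch below. The counts in the example are hypothetical: the paper does not report a confusion matrix, and these values were chosen only because they are consistent with the percentages reported in the Results for the reduced MLP classifier.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity in percent.

    Assumes at least one positive and one negative case,
    otherwise a division by zero occurs.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=25, fn=1, tn=5, fp=0)
print({name: round(value, 2) for name, value in metrics.items()})
# → {'accuracy': 96.77, 'sensitivity': 96.15, 'specificity': 100.0}
```

With unbalanced classes, accuracy alone can look acceptable even when one class is never recognized, which is why sensitivity and specificity are reported alongside it.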
In this study, a 10-fold cross-validation method was used in order to obtain a more reliable estimate of the model's performance. At first, the 112 patients were divided into 10 almost equal subgroups. One of the 10 subgroups was used as the evaluation data and the rest as the learning data for the classification. The evaluation data were changed 10 times, so that each subgroup was used once as evaluation data. The average of all accuracies obtained on the evaluation data was taken as the estimate of the model's predictive ability.
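The procedure can be sketched as follows. The shuffling seed, the function names, and the `evaluate` callback (which is expected to train on the learning indices and return the accuracy on the evaluation indices) are assumptions made for illustration; the study does not state how the subgroups were drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split shuffled sample indices into k almost equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # All folds except the i-th form the learning data.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k


# With 112 patients, two folds hold 12 cases and eight folds hold 11.
print(sorted(len(fold) for fold in k_fold_indices(112)))
```

Each patient appears in exactly one evaluation fold, so every case contributes once to the averaged accuracy.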

Results
The feature size of the MLP classifier, the MSE over the training and test set, and the training and the test success of the classifier are summarized in Table 3. The correct percentage (overall accuracy) of prediction is 83.87% in the test phase. The neural network statistics for the training set show a sensitivity and specificity of 100% and 0%, respectively. The MSE over the training set and over the test set equals 0.2494 and 0.2190, respectively.
With the 9 highly ranked features, the proposed MLP network is implemented once again. This time, the structure of the network is 9-6-1. Simulation results show that the new classifier has an average accuracy of 96.77%. Furthermore, the sensitivity and the specificity values are 96.15% and 100%, respectively. The MSE over the training set decreases to $1.0553 \times 10^{-4}$ and over the test set to 0.0326. Thus, the new classifier with 9 features performs much more efficiently than the previous one with 48 features.
The best implemented MLP and PNN classifiers, the number of neurons in the hidden and output layers, the transfer functions of the hidden and output layers for each architecture, and the test success of the classifiers are summarized in Table 4.
The PNNs correctly classified all the normal cases of the test set, and the original PNN classifier performs better than the MLP classifier on the negative cases. The results show that the reduced MLP and PNN classifiers achieve the best performance in terms of asthma outcome prediction.

Conclusion and Discussion
The use of ANNs in prognosis problems is well established in the human medical literature, due to their capacity to model complex and nonlinear relationships and their tolerance of missing data and input errors. From the results, it has been shown that the proposed medical decision support system can achieve very high prediction accuracy.
The goal of designing the new classifier is to maximize the classification accuracy and simultaneously minimize the size of the feature set. By selecting a small number of important features, the prediction performance of the constructed classifier has been improved. The improved performance may be attributed to the greater generalization capability of the classifier. After that, a comparison with a PNN classifier was made. It was found that the PNN networks had better sensitivity than the MLP neural networks. The value of specificity has shown that the MLP network classified abnormal data more accurately than the PNN network. Based on the obtained values for sensitivity, it is indicated that both networks have diagnosed the normal data more efficiently than the abnormal data.
Due to the fact that asthma is a serious condition, the various models that have been used to detect it must have high sensitivity so that patients with asthma are not overlooked. An ANN that has been trained to predict 96.77% of patients with asthma may be very useful to physicians.
Moreover, this is the only study that has evaluated the diagnostic accuracy of 48 clinical factors through feature selection, and it is concluded that a set of only 9 factors is the most important for persistent asthma. The present study was also able to show the relative importance of each factor in asthma prediction. The most crucial factor in asthma outcome prediction is wheezing episodes until the 5th year of age. In particular, evidence from a large number of prospective case-control studies shows that wheezing until the 5th year of age is often associated with asthma during subsequent years.
In conclusion, this study will contribute to science by helping doctors to identify early which symptomatic young children will continue to develop asthma during their school years and thus to draw up a plan to change the natural course of the disease.

The Proposed Algorithm.
The prediction algorithm employed in this study consists of two stages: feature reduction through partial least square regression and classification by MLP and PNN classifiers. The flowchart of the system is shown in Figure 1.

Figure 1: Flowchart diagram for the asthma prediction system.

Table 2 .
All other factors are numerical.

Table 2: Encoding of prognostic factors.

Table 3: Comparison between the original and the optimal MLP classifier.

Table 4: Comparison between the MLP and PNN classifiers.