An Intelligent Diagnostic System to Analyze Early-Stage Chronic Kidney Disease for Clinical Application

Chronic kidney disease (CKD) is a progressive condition characterized by the gradual deterioration of kidney function, potentially leading to kidney failure if not promptly diagnosed and treated. Machine learning (ML) algorithms have shown significant promise in disease diagnosis, but clinical healthcare data pose challenges: missing values, noisy inputs, and redundant features, all of which affect early-stage CKD prediction. Thus, this study presents a novel, fully automated machine learning approach to tackle these complexities by incorporating feature selection (FS) and feature space reduction (FSR) techniques, leading to a substantial enhancement of the model's performance. A data balancing technique is also employed during preprocessing to address the data imbalance commonly encountered in clinical contexts. Finally, for reliable CKD classification, an ensemble characteristics-based classifier is recommended. The effectiveness of our approach is rigorously validated and assessed on multiple datasets, and the clinical relevance of the strategy is evaluated on real-world therapeutic data collected from Bangladeshi patients. The study establishes the dominance of the adaptive boosting, logistic regression, and passive aggressive ML classifiers, which achieve 96.48% accuracy in forecasting unseen therapeutic CKD data, particularly in early-stage cases. Furthermore, the effectiveness of the FSR technique in significantly reducing prediction time is demonstrated. The outstanding performance of the proposed model demonstrates its effectiveness in addressing the complexity of healthcare CKD data by incorporating the FS and FSR techniques. This highlights its potential as a promising computer-aided diagnosis tool for doctors, enabling early interventions and improving patient outcomes.


Introduction
The kidneys filter about 120 to 150 quarts of blood per day to generate approximately 1 to 2 quarts of urine [1,2]. The primary function of the kidneys is to remove waste from the body's fluids via urine. CKD starts with unexpected metabolic disorders that gradually lead to the loss of endocrine, excretory, and metabolic functions in the kidneys [3]. These abnormalities become evident as the signs and symptoms of renal damage. Although the underlying cause of the disorder remains unspecified in many patients, the most common causes are diabetes, hypertension, interstitial diseases, systemic inflammatory disorders, glomerular diseases, congenital conditions, and renovascular abnormalities [4].
In the absence of timely treatment, kidney disease progresses to end-stage renal failure (ESRF), which can cause coma and even death [5]. According to [6], approximately 750,000 patients are affected by renal failure annually in the United States, an estimated 2 million people globally suffer from kidney failure, and the rate of diagnosed patients rises by 5-7% annually. Over the past decade, the overall CKD mortality rate has increased substantially, by 31.7% [7]. Studies show that CKD is a more significant burden in low- and middle-income countries than in high-income countries [7-10]. The proportion of patients diagnosed with renal disease in South Asian cities is 7.2%-17.2% [11]. One report indicates that 13% of the population of Dhaka city aged 15 years or older is affected [12]. About one-third of Bangladesh's rural population is at risk of incurable renal failure, as suggested by another community-based report [13]. Hence, CKD poses a serious threat in a developing country like Bangladesh.
A computer-aided diagnosis process can enable effective CKD diagnosis with accurate detection at the primary stage. ML is now one of the most essential and prosperous areas in the healthcare sector for analyzing and making predictions for different diseases and stages [14]. ML models gain knowledge by exploring large datasets and their features, patterns, modes, and so on. In data analysis, the FS strategy is used to select a subset of the most relevant features in the dataset to improve the performance and interpretability of ML models, while the FSR technique simplifies the feature representation and overall complexity of the dataset by extracting the principal components [15].
Previous research has shown that choosing the most relevant and useful features can improve early-stage CKD detection. Some researchers have used FS techniques and others have used FSR techniques. However, combining both has not been fully explored, limiting the maximum achievable accuracy while preserving the ML model's generalization capabilities for clinical CKD diagnosis. Moreover, analyzing healthcare tabular data related to CKD is challenging due to missing or null attributes and categorical values in the dataset. A data encoding methodology is generally well-suited for categorical values, but a suitable strategy for addressing missing or null attribute values that takes the dataset's random nature into consideration is required. Though existing studies have used various methods to overcome these issues, their effectiveness in dealing with unseen clinical data has not been fully established. Moreover, a number of issues remain, such as the lack of standardization for CKD and of model interpretability, generalizability, and fairness, which must be addressed to ensure safe use in routine clinical trials [16,17].
Therefore, this work aims to extend renal disease diagnosis in a clinical setting by effectively utilizing computer intelligence. To achieve this goal, both the FS and FSR techniques are employed in the preprocessing phase. In addition, a data balancing strategy, as well as data encoding and cleaning, is used to account for clinically unseen data that is imbalanced, missing, or noisy. Finally, multiple classification models are incorporated, with adaptive boosting, logistic regression, and passive aggressive being the recommended ML models for CKD analysis due to their ensemble capabilities.
The effectiveness of the proposed intelligent diagnostic system is evaluated on multiple datasets separately. Finally, the clinical CKD detection performance is evaluated on unseen healthcare data collected from Bangladeshi patients. This study improved model performance in clinical CKD detection by handling missing values, imbalanced data, data encoding, feature selection, and dimension reduction effectively. To sum up, the most significant contributions of this work are as follows: (1) The datasets are analyzed to ensure that no data loss occurs, even in the case of missing values.
(2) The dimension reduction methodology is investigated in order to reduce the feature space; as a result, model training and testing time can be reduced while simultaneously improving the overall results.
(3) This study presents a generalized intelligent diagnostic system to analyze and predict renal disease at an early stage with unseen healthcare data. To the best of our knowledge, this is the first work on CKD prediction with clinically unseen data. (4) A comprehensive analysis was performed on four different datasets to find the best ML models for CKD analysis. (5) Adaptive boosting, logistic regression, and passive aggressive techniques are the recommended classifiers for CKD analysis on unseen real-life data due to their robust ensemble capabilities.
The rest of the paper is organized as follows: The related literature is reviewed in Section 2. The proposed methodology is categorized into subsections and briefly discussed in Section 3. Data encoding, balancing, cleaning, feature selection, and dimension reduction techniques are discussed in Section 3.1. Dataset collection and dataset descriptions are stated in Section 4.1. In Section 4.3, the experimental analysis, performance evaluation metrics, and experimental results are discussed with respect to different methods and datasets. Finally, the discussion and conclusion are delivered in Sections 4.4 and 5, respectively.

Literature Review
For effective disease classification and prediction, various methodologies have been designed and explored. The study [18] examined 12 ML classifiers across four distinct datasets: breast cancer, liver disorders, wine quality, and Indian liver patients. The evaluation primarily focused on accuracy and prediction speed. They concluded that classifier performance is disease specific. However, that study did not elaborate on how data complexity was handled, and clinical relevance was not discussed. As CKD is among the life-threatening diseases that necessitate early detection to enhance patient outcomes, researchers have explored numerous ML algorithms coupled with preprocessing techniques for efficient CKD prediction. A synthetic minority oversampling technique (SMOTE) is employed in [19] to balance the CKD-15 dataset. The authors tested three different FS methods: correlation-based feature selection (CFS) as a filter method, forward feature selection (FFS) as a wrapper method, and the least absolute shrinkage and selection operator (LASSO) as an embedded method. Data balancing with SMOTE and FS with LASSO resulted in a 1.39% accuracy increase compared to using a linear support vector machine (LSVM) with the original dataset. The authors in [20] performed an FS strategy using a genetic algorithm (GA). They achieved the highest accuracy of 99.75% with the multilayer perceptron (MLP) classifier. Different feature-based prediction models were suggested in [21] for detecting kidney disease, in which logistic regression with a Chi-square test-based model showed the highest accuracy (98.75%). Similar ML-based models with different applications were analyzed by the authors in [22,23]. In their work, a gradient boosting-based model was utilized, and their major finding was to utilize FS and sampling techniques (SMOTE, OneR, etc.)
for achieving favorable accuracy. A fuzzy-based intelligent system that incorporated fuzzification, implication, and defuzzification was proposed in [24] for CKD analysis. They modeled IF-THEN rules to develop the knowledge base for a fuzzy inference system. A summary of data imbalance analysis was presented by the authors of [25]. The study investigated 23 class imbalance techniques (resampling and hybrid systems) with three ML classifiers, random forest (RF), logistic regression (LR), and linear support vector classifier (LinearSVC), to identify the most suitable imbalance method for the medical dataset. They found that class imbalance learning can significantly improve classification, with random oversampling (ROS) and RF delivering the best results. Several other FS methods have been explored to identify the most relevant features. The L1-regularized FS technique has been explored in [26] to classify microarray cancer data with improved performance. The authors in [27] applied L1-norm-based and chi-square-based FS strategies to classify breast cancer. In other CKD studies [28-30], principal component analysis (PCA) is utilized to extract noteworthy features from the dataset. The authors of [28] extracted 19 features using PCA and achieved the highest accuracy of 98% using the support vector machine (SVM) classifier. Other classifiers such as LR, naive Bayes (NB), and k-nearest neighbor (KNN) also demonstrated noteworthy performance. The study [31] utilized PCA, discriminant analysis (DA), and LR to extract features from the breast cancer dataset. While achieving notable accuracy with a hybrid feature extraction technique, discriminant logistic (DA-LR), the study failed to discuss data complexity issues such as data balancing and cleaning. The authors in [32] performed their experimental analysis on the CKD-15 dataset without employing a feature optimization strategy. Despite this, they were able to attain an accuracy of 97.25% using MLP as the classifier, 96.5% using LR,
and 95.75% using NB. The highest accuracy of 98.25% was achieved using SVM as the classifier. Another study [30,33] handled nominal attributes and examined a feature selection strategy in performance analysis. The nominal attributes were transformed into binary attributes, and then a best-fit feature selection (BFFS) method was conducted. According to their findings, SVM and KNN outperformed the LR and decision tree (DT) classifiers, with accuracy rates of 98.3% and 98.1%, respectively. Non-numerical data of the CKD-15 dataset were transformed into binary data in the study [34]. The authors aimed to identify the most significant clinical test attributes by using SHapley Additive exPlanations (SHAP) values and to reduce the number of attributes to a minimum for optimal clinical testing and high CKD detection accuracy. Among the tested classifiers, RF achieved the highest accuracy of 99.5%, while gradient boosting (GB), extreme gradient boosting (XGB), LR, and SVM also performed well.
To handle the missing values in the CKD dataset, the authors in [28,30,34,35] replaced them with the mean value. The missing values are handled in [3] with the mean, median, and mode values of the attributes, with null values also dropped. The authors in [36] utilized mutual information measures (MIMs) for feature selection and replaced missing values through multiple imputation while analyzing kidney disease. The authors in [37] used the median to replace the missing values. Other studies [38,39] replaced the missing values with 0. The top accuracies of 99.1% with decision forest (DF) and 97.5% with a neural network (NN) were achieved with an arbitrary selection of 14 attributes [38]. Other authors [35] selected 13 out of 24 attributes for classification, and the results showed that adaptive boosting (ADAB) achieved a prediction accuracy of 99% while the extra-tree classifier (ETC) obtained 98%. The authors in [39] considered 21 attributes from the CKD-15 dataset. During the classification phase, the DF achieved the highest prediction accuracy (99.17%) in predicting three different potassium zones, compared with 89.17% for LR and 82.15% for NN. The authors in [28] handled the categorical variables by converting them to corresponding numerical values using the one-hot encoding technique. They found the best performance of 98.0% accuracy using an SVM classifier. In a study [1], attributes with more than 20% missing values were removed from the dataset, and the remaining missing values were filled using KNN imputation. The authors then selected features based on statistical significance, medical importance, and test data availability. Eleven ML algorithms were evaluated, and four classifiers (DT, RF, ETC, and the ADAB classifier) showed 100% accuracy.
The comprehensive literature review highlights the various techniques and approaches employed in disease prediction, particularly early CKD diagnosis, and reveals common data preprocessing techniques such as nominal-to-binary transformation and one-hot encoding for categorical variables. Handling missing data involves methods like mean, median, or mode imputation, multiple imputation, or replacement with 0. Existing studies often focus solely on feature selection or reduction techniques. For CKD prediction, popular methods include CFS, FFS, LASSO, GA, Chi-square, BestFit, SHAP, MIM, and PCA, while L1-regularized and L1-norm-based feature selection methods are used for efficient breast cancer classification. While these studies demonstrated high accuracy on the datasets used for training and testing by splitting them, a critical gap emerged. None of them combined feature selection and reduction methods to improve model performance, particularly for better handling of clinical CKD data complexity. In addition, they lack an assessment of their models' performance on real-world, unseen clinical CKD data to provide patient-centric CKD solutions at the initial phase. These gaps raise the need for an improved automated diagnostic system for CKD detection. This study aims to address them by introducing a novel methodology tailored to enhance CKD diagnostic accuracy and handle data complexity in a patient-centric manner.

Proposed Methodology
Preprocessing and classification are the two parts of the proposed methodology. In the preprocessing step, data encoding, balancing, cleaning, feature selection, and dimension reduction approaches were implemented to properly train the ML algorithms. The entire block diagram of the proposed methodology is shown in Figure 1.

Preprocessing.
The datasets contain a mixture of numerical, categorical, nominal, and missing values, so the data are preprocessed to address the issues with categorical, nominal, and missing data. Before the preprocessing phase begins, the "Affected" attribute is manually omitted from the processed data so that the processed data cannot be influenced by the class variable.

Categorical Variable.
Variables with two or more categories but without intrinsic ordering to the categories are known as categorical variables, also called nominal variables [40]. Categorical variables are data types that can be divided into groups; examples include sex, group, race, educational level, and age group.

Data Encoding.
Data encoding is the process of converting data, or a given sequence of characters, symbols, alphabets, etc., into a specific format that can be processed by a computer system or application. The purpose of data encoding is to transform the data into a standard format. This study utilized the label encoding (ordinal encoding) technique to complete this task. All the non-numerical (nominal categorical) labels are mapped to numerical labels using this encoding (Table 1).
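The mapping step above can be sketched in a few lines of pure Python; the category values below are hypothetical stand-ins, not the actual entries of Table 1:

```python
# Minimal sketch of label (ordinal) encoding for nominal attributes.
# Each distinct category is assigned an integer in order of first appearance.
def label_encode(column):
    mapping = {}
    encoded = []
    for value in column:
        if value not in mapping:
            mapping[value] = len(mapping)   # next unused integer code
        encoded.append(mapping[value])
    return encoded, mapping

# Illustrative nominal values (not the study's actual data).
codes, mapping = label_encode(["normal", "abnormal", "normal", "present"])
```

Library implementations such as scikit-learn's `LabelEncoder` perform the same task, although they assign codes in sorted-label order rather than order of appearance.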

Data Balancing.
Data balancing is a procedure in which the amount of data per class is equalized using different balancing techniques. This analysis used two datasets, CKD-15 and CKD-21; both were imbalanced. CKD-15 contains 250 CKD and 150 non-CKD instances, and CKD-21 contains 78 CKD and 122 non-CKD instances. To address this imbalance and prevent potential bias and poor model generalization, the ROS technique was employed to increase the number of minority-class instances. ROS randomly duplicates minority-class examples, ensuring an equal representation of CKD and non-CKD instances in both datasets.
Table 2 shows the class imbalance for both datasets. Table 3 shows the amount of data after balancing with the sampling technique.
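The ROS step can be sketched as follows; this is an illustrative toy implementation on made-up rows, not the study's pipeline (which could equally use `imbalanced-learn`'s `RandomOverSampler`):

```python
import random

# Minimal sketch of random oversampling (ROS): randomly chosen minority-class
# rows are duplicated until every class has as many rows as the largest class.
def random_oversample(rows, labels, seed=0):
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        # keep originals, then add random duplicates up to the target count
        resampled = members + [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(resampled)
        out_labels.extend([y] * target)
    return out_rows, out_labels

rows = [[i] for i in range(5)]
labels = [1, 1, 1, 0, 0]            # imbalanced: three vs. two
X_bal, y_bal = random_oversample(rows, labels)
```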

Data Cleaning. Missing entries are common in clinical CKD data due to the challenge medical assistants face in handling a large number of CKD patients within a limited time. However, simply removing instances with missing data can hinder accurate classification by ML models. In addition, to ensure accurate and reliable outcomes, it is crucial to avoid the bias and data distortion caused by incomplete or erroneous data. This necessitates a data imputation technique tailored to the specific disease characteristics. Here, the study addressed the missing data by filling in the mean value of the corresponding attribute, based on the fact that the missing values were distributed randomly. This preserves the statistical properties of the dataset while ensuring accuracy and reliability in subsequent analyses.
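Mean imputation can be sketched with NumPy; the small matrix below is illustrative, with `NaN` marking missing entries:

```python
import numpy as np

# Minimal sketch of mean imputation: each missing entry (NaN) is replaced by
# the mean of its attribute (column), computed over the observed values only.
def impute_mean(X):
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)              # per-attribute mean, ignoring NaNs
    missing = np.isnan(X)
    X[missing] = np.take(col_means, np.where(missing)[1])
    return X

X = [[1.0, np.nan],
     [3.0, 4.0],
     [np.nan, 8.0]]
X_filled = impute_mean(X)   # column means are 2.0 and 6.0
```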

Feature Selection.
As an increasing number of features creates computational overhead and increases the possibility of model overfitting, FS provides a solution [41]. The FS strategy reduces the input variables by using only relevant data and eliminating unnecessary and noisy data [42]. It is an automatic process for choosing relevant features. A significant advantage of this technique is that it reduces overfitting [43]. Regularization is a useful technique for reducing model complexity and performing feature selection [26]. The penalty "l1" (Lasso regularization) and solver "liblinear" are used here with the "LogisticRegression" method to select essential features based on the importance weights. It employs a shrinkage strategy by penalizing the least-square errors. To minimize the cost function, the model sets the weights of some features to zero, and a total of 13 features are chosen for the CKD-15, CKD-21, hybrid, and unseen clinical data.
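A minimal sketch of this L1-based selection step, assuming scikit-learn; the synthetic data, the number of features, and the `C` value are illustrative (only the first feature carries signal, so the L1 penalty is expected to drive most noise coefficients to zero):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: feature 0 determines the label; features 1-5 are pure noise.
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
noise = rng.normal(size=(n, 5))
X = np.column_stack([signal, noise])
y = (signal > 0).astype(int)

# L1-penalized logistic regression, as named in the text: penalty="l1",
# solver="liblinear". Small C = strong regularization = sparser weights.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(model.coef_[0])   # indices of surviving features
```

Features whose coefficients are shrunk exactly to zero are discarded; the remaining indices form the selected feature subset.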

Dimension Reduction.
Dimension reduction is a process that reduces the feature space to the most relevant feature space [15,44] while preserving the maximum amount of relevant information from the original data. This technique can enormously reduce the time complexity of the ML algorithm's training phase without degrading ML model performance [45]. Among dimension reduction techniques including PCA, singular value decomposition (SVD), linear discriminant analysis (LDA), and generalized discriminant analysis (GDA), the unsupervised ML technique PCA is employed here due to its effectiveness and popularity in feature reduction, particularly in CKD analysis. PCA employs mathematical principles to reduce a large number of potentially correlated variables to a smaller number of variables (a lower dimension), referred to as principal components [46]. This investigation utilized PCA as the dimension reduction strategy in four ways to prepare four categories of datasets.
For effective PCA analysis, the features of the data points X are standardized through mean removal and scaling to unit variance using equation (1), ensuring equal feature scaling:

z = (x − μ) / σ. (1)

To determine the directions in which the features are most correlated, the covariance matrix COV is calculated using equation (2). It is a square matrix with dimensions equal to the number of features, and each element in the matrix represents the covariance between two features, indicating their linear association:

COV = (1 / (N − 1)) Σ_{i=1}^{N} (x_i − x̄)(x_i − x̄)^T. (2)

Here, N = number of samples in the dataset. The eigenvectors and eigenvalues of the COV matrix are computed from equation (3). They determine the directions in which the features vary most and the amount of variance explained by each component. The eigenvalues and their corresponding eigenvectors are sorted in descending order, with the eigenvectors of the largest eigenvalues being considered the principal components for projecting the data onto a lower-dimensional space:

det(COV − λI) = 0, COV ν = λν. (3)

Here, λ is an eigenvalue, I is the identity matrix, and ν is the corresponding eigenvector.

Then, the first k largest eigenvalues and their corresponding eigenvectors are chosen to form the projection matrix used to reduce the data dimensionality:

W = [ν_1, ν_2, …, ν_k]. (4)

In the experiment, the study used k = 2 for CKD-15, k = 7 for CKD-21, k = 3 for the hybrid data, and k = 10 for the clinical unseen data. Finally, the new feature vectors are calculated from equation (5):

X_new = XW. (5)

In the experimental analysis, the k values are chosen such that they are minimal while outperforming the existing models.
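The PCA steps above (standardize, covariance, eigendecomposition, project) can be sketched with NumPy; the data shape and k are illustrative:

```python
import numpy as np

# Minimal PCA sketch following the steps in the text: standardize the
# features, compute the covariance matrix, eigendecompose it, and project
# the data onto the top-k eigenvectors (principal components).
def pca_project(X, k):
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardization
    cov = np.cov(Z, rowvar=False)              # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues / eigenvectors
    order = np.argsort(eigvals)[::-1]          # sort by descending variance
    W = eigvecs[:, order[:k]]                  # projection matrix of top-k vectors
    return Z @ W                               # new, lower-dimensional features

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 6))                  # toy data: 100 samples, 6 features
X_reduced = pca_project(X, k=2)
```

The first projected component carries at least as much variance as the second, since components are ordered by eigenvalue.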

Logistic Regression.
The logistic regression model estimates the probability of an event belonging to a particular class [47]. LR is commonly used for binary classification, although its name includes "regression." A decision boundary is a threshold set to predict the data class. The sigmoid activation function is used here to compute this classification probability. The mathematical model of the algorithm can be denoted as equation (6):

P_i = 1 / (1 + e^(−(β_0 + Σ_{j=1}^{M} β_j x_ij))), (6)

where i = 1 to N (number of observations), j = 1 to M (number of individual variables), P_i = probability of "1" at observation i, β_j = regression coefficient, and x_ij = the j-th variable at observation i.

Random Forest.
RF is a popular ensemble ML method [48]. This method builds decorrelated trees by constructing a substantial number of decision trees on bootstrapped samples from the training dataset, screening a random subset of the feature columns during each bootstrap. Gini impurity is used in the experiment with a maximum tree depth of ten, and each tree grows to at most ten leaf nodes. Predictions for unknown data after training can be defined as in equation (7):

f̂ = (1/B) Σ_{b=1}^{B} f_b(x′), (7)

where B = optimal number of trees and f_b(x′) = prediction from the b-th decision tree for the unknown sample x′. The uncertainty (σ) of the prediction is defined by equation (8):

σ = sqrt( Σ_{b=1}^{B} (f_b(x′) − f̂)² / (B − 1) ). (8)

Support Vector Machine.
SVM can categorize both linear and nonlinear datasets by using the kernel trick [49]. Because the decision function uses only a subset of the training points (called support vectors), it is also computationally efficient. The prediction function of an SVM classifier can be described by equation (9); the "rbf" kernel and a regularization parameter of 1 are used in the experiment:

f(x) = β_0 + Σ_{i∈S} a_i K(x, x_i), (9)

where x = new data point, β_0 = bias, S = set of support vectors, a_i = corresponding weights of the training data x_i, and the x_i are the support vectors in the training data.
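The logistic class probability can be sketched directly; the coefficient values below are illustrative, not fitted ones:

```python
import math

# Sketch of the logistic model's class probability: a sigmoid applied to a
# linear combination of the inputs (bias beta0 plus weighted features).
def predict_proba(x, beta0, betas):
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))          # P(y = 1 | x)

p = predict_proba([2.0, -1.0], beta0=0.5, betas=[1.2, 0.8])
label = 1 if p >= 0.5 else 0                   # decision boundary at 0.5
```

Here z = 0.5 + 2.4 − 0.8 = 2.1, so the sigmoid yields a probability of roughly 0.89 and the predicted class is 1.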

K-Nearest Neighbor.
KNN is the most straightforward supervised ML algorithm [50]. A distance is calculated to determine similarity with other instances; the closest data points to the point under observation are considered the most appropriate for classifying it. There are numerous distance metrics for calculating the nearest points, such as the Euclidean, Hamming, Manhattan, Cosine, Jaccard, and Minkowski distances. In the experiment, 7 neighbors are used with the Euclidean distance of equation (10):

d(p, q) = sqrt( Σ_{i=1}^{n} (p_i − q_i)² ), (10)

where p and q are two points in the space, p_i and q_i are the i-th dimensions of points p and q, and n is the number of dimensions.
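The distance metric and the neighbor vote can be sketched in pure Python; the two-cluster toy data and k = 3 are illustrative:

```python
import math
from collections import Counter

# Euclidean distance between two points of equal dimensionality.
def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Minimal KNN classifier: majority vote among the k nearest training points.
def knn_predict(train_X, train_y, x, k=3):
    nearest = sorted(zip(train_X, train_y), key=lambda t: euclidean(t[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

train_X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_y = [0, 0, 0, 1, 1, 1]
pred = knn_predict(train_X, train_y, (5.5, 5.5), k=3)
```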
Gradient Boosting.
This classifier [51] also operates as a boosting algorithm to improve prediction performance. The primary stages of a GB classifier are computing the error residual, learning a regression predictor, and learning to predict the residual. Additive models are usually utilized, and weak learners are added to optimize the loss function. Decision trees (regression trees) are employed as the weak learners.

Naive Bayes.
The NB is a probabilistic supervised algorithm that assumes feature independence while classifying data [52]. The method works effectively for datasets with a significant number of input variables. It uses all the available features, including weak ones, in the final prediction. The probabilistic naive Bayes model can be stated as equation (11), where A and B are two events.

P(A | B) = P(B | A) × P(A) / P(B). (11)
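Bayes' rule can be demonstrated with a small worked example; the numbers are illustrative only (a test with 90% sensitivity, 1% disease prevalence, and a 5% overall positive rate):

```python
# Worked example of Bayes' rule: posterior P(A | B) from the likelihood
# P(B | A), the prior P(A), and the evidence P(B).
def bayes_posterior(p_b_given_a, p_a, p_b):
    return p_b_given_a * p_a / p_b

posterior = bayes_posterior(p_b_given_a=0.90, p_a=0.01, p_b=0.05)
# posterior = 0.9 * 0.01 / 0.05 = 0.18
```

Even with a sensitive test, the low prior keeps the posterior probability of disease at only 18%.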
Stochastic Gradient Descent.
The word "stochastic" denotes a system or process connected with random probability. Hence, for each iteration, a few samples are selected randomly instead of the whole dataset in stochastic gradient descent (SGD). To perform each iteration, SGD uses only one sample, i.e., a batch size of one. The samples are shuffled randomly and picked for executing the iterations. To train the model, L1 regularization and 20 epochs are used.

Multilayer Perceptron.
A multilayer perceptron is considered the most significant class of feed-forward artificial neural networks (ANNs) and is made up of several layers of perceptrons [52]. The network contains at least three layers: the input layer, at least one hidden layer, and the output layer. This experiment used the sigmoid activation function and the "lbfgs" solver, an optimizer in the family of quasi-Newton methods.

Adaptive Boosting.
The adaptive boosting algorithm, also known as AdaBoost, is an ensemble ML technique that merges a number of weak classifiers to form a stronger classifier and increase classification performance [53]. The performance of this model is improved by fitting additional copies of the classifier on the same dataset, with the weights of incorrectly classified samples adjusted so that subsequent classifiers focus on them; the weighted combination represents the final output of the boosted classifier.
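A minimal AdaBoost sketch, assuming scikit-learn and illustrative, well-separated toy data (the cluster centers and estimator count are not the study's settings):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Two well-separated Gaussian clusters as a toy binary problem.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)),
               rng.normal(5, 1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# AdaBoost fits a sequence of weak learners (shallow trees by default),
# reweighting misclassified samples after each boosting round.
clf = AdaBoostClassifier(n_estimators=25, random_state=0).fit(X, y)
train_acc = clf.score(X, y)
```

On data this cleanly separable, the boosted ensemble should fit the training set almost perfectly.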

Extreme Learning Machine.
An extreme learning machine (ELM) is a single hidden layer feed-forward neural network that solves problems by finding the minimum norm least-square (MNLS) solution of a system [54]. It provides good generalization performance by solving problems in a single iteration at extremely fast speed. The Moore-Penrose generalized inverse is used to set its output weights. In this experiment, 150 hidden nodes with the sigmoid activation function are used. The output of this model is calculated using equation (12):

f(x) = Σ_{i=1}^{L} β_i g_i(x), (12)

where x represents the input feature vector and L is the number of hidden nodes; the prediction is made by summing the product of the weights β_i and the activation function g_i(x) for each hidden node i in the hidden layer.
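The ELM idea can be sketched with NumPy; the sizes (3 input features, 40 hidden nodes, regression targets) are illustrative, not the study's 150-node configuration:

```python
import numpy as np

# Minimal ELM sketch: random, untrained hidden-layer weights; sigmoid
# activations; output weights set in one step by the Moore-Penrose
# pseudoinverse (the minimum norm least-square solution).
def elm_fit(X, T, n_hidden=40, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # hidden-layer outputs g_i(x)
    beta = np.linalg.pinv(H) @ T                  # output weights, single step
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta                               # sum of beta_i * g_i(x)

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
T = rng.normal(size=20)                           # arbitrary regression targets
W, b, beta = elm_fit(X, T, n_hidden=40)
residual = np.max(np.abs(elm_predict(X, W, b, beta) - T))
```

With more hidden nodes than samples, the hidden-layer matrix is typically full rank and the pseudoinverse solution fits the training targets essentially exactly.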

Performance Evaluation
Statistically, finding the best ML classifiers is difficult because it depends on the type of application and the data format. Therefore, the focus of this work is on experimentally validating all ML models in terms of CKD analysis. Based on the data, both balanced and imbalanced, conclusions can be drawn about the most effective models for the application. Tables 4 and 5 contain descriptions of the attributes with the necessary information for the CKD-15 and CKD-21 datasets, respectively. To apply ML algorithms, data must be well structured and reliable.

Training and Testing.
To train and test the proposed model, two datasets, namely CKD-15 and the real-world clinical dataset CKD-21, are used in four ways. The extensive experimentation on different datasets ((i) CKD-15, (ii) CKD-21, (iii) hybrid, and (iv) unseen clinical cases) with a combination of multiple evaluation metrics strengthens the validity of the work and demonstrates the proposed model's generalization capability for early CKD prediction in a clinical setting. Furthermore, validating the model on clinically unseen data highlights its clinical relevance for CKD detection. In the CKD-21 dataset, the "Affected" and "Class" attributes have the same meaning.
As the two datasets have different dimensions, PCA is used here to bring them to the same number of dimensions for the hybrid and unseen cases.
(1) For both CKD-15 and CKD-21, the model was trained with 70% of the data and tested with the remaining 30%, as depicted in Figure 2(a). (2) A hybrid dataset is created by utilizing both the CKD-15 and CKD-21 datasets. To make a hybrid dataset, both datasets must lie in the same space. As the datasets have different feature spaces, this analysis transformed the dimensions of the two datasets into a common dimension utilizing PCA.
Here, for both datasets, a 3-dimensional feature space is chosen by configuring PCA. Then, vertical (row-wise) concatenation is performed on the transformed CKD-15 and CKD-21 datasets to create a new dataset. The diversity inherent in hybrid datasets significantly enhances the generalization capabilities of ML models, which is a crucial aspect when tackling real-world applications. The ML models are trained on 70% of the sample data and tested on the remaining 30%, as shown in Figure 2(a). (3) The study transformed the existing feature spaces of both datasets into 10-dimensional feature spaces using PCA to evaluate the ML models on clinically unknown patient data. As Figure 2(b) shows, in this experiment, the model is trained using the CKD-15 dataset (i.e., 503 samples) and tested with the real clinical dataset CKD-21 (i.e., 256 samples) for clinical analysis of the unseen data.
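The hybrid-dataset construction can be sketched as follows, assuming scikit-learn's `PCA`; the array shapes are illustrative, not the actual CKD-15/CKD-21 sizes:

```python
import numpy as np
from sklearn.decomposition import PCA

# Two toy datasets with different feature counts, standing in for CKD-15
# and CKD-21. Each is projected to the same k dimensions, then stacked.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 24))        # dataset 1: 24 features
B = rng.normal(size=(30, 20))        # dataset 2: 20 features

k = 3                                # shared feature-space dimensionality
A_k = PCA(n_components=k, random_state=0).fit_transform(A)
B_k = PCA(n_components=k, random_state=0).fit_transform(B)
hybrid = np.vstack([A_k, B_k])       # vertical (row-wise) concatenation
```

Note that each dataset gets its own PCA fit, so the shared space aligns only in dimensionality, not in component meaning; this mirrors the per-dataset transformation described above.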
All three datasets (CKD-15, CKD-21, and hybrid) were additionally split using a random state argument to ensure a nonoverlapping and unbiased evaluation of the proposed approach on all datasets. This helps maintain the integrity of the testing process and ensures the generalizability of the model's performance.

Experimental Analysis.
This work utilized PCA as a dimension reduction technique, which addresses overfitting in ML models, improves computational efficiency, and enhances the models' generalization capability, thereby reinforcing clinical relevance. The use of multiple metrics and datasets provides a holistic assessment and reduces the likelihood of biased results. Previous studies suggest evaluating multiple classifiers comprehensively on multiple datasets using suitable evaluation metrics; recall, true negative rate (TNR), positive predictive value (PPV), f1-score, area under the receiver operating characteristic curve (ROC-AUC), and accuracy are appropriate and relevant for evaluating ML models' performance in the context of early-stage CKD detection. These metrics are commonly used in medical and healthcare-related studies to understand each classifier's performance in different aspects, particularly for early-stage CKD detection, where sensitivity, specificity, and diagnostic accuracy are critical. Though cross-validation is a common and widely used technique for evaluating ML models, it may not be feasible for our specific datasets (unseen and hybrid) due to their unique characteristics. For instance, cross-validation on the clinical unseen dataset might not provide meaningful insights, as this experiment aims to simulate real-world clinical scenarios by testing the model on entirely unseen data. Similarly, for the hybrid dataset, it may introduce biases due to the combination of datasets with varying characteristics. The work was carried out entirely on Google's cloud platform using Colab notebooks.
As Table 6 shows, eleven ML models (i.e., ADAB, DT, ELM, GB, KNN, LR, MLP, PAC, RF, SGD, and SVM) with PCA performed with 100% test accuracy, and the ROC-AUC value was exactly 1 in the experiment for the CKD-15 dataset. Three ML algorithms (ADAB, DT, and RF) achieve a ROC-AUC of 0.978. In the clinical unseen dataset, the ADAB, LR, and PAC classifiers achieve the highest accuracy of 96.48%, and the best ROC-AUC of 0.984 is achieved by the LR model. Among the other ML models, DT and GB produce the weakest results (92.97% test accuracy and 0.93 ROC-AUC), as shown in Table 9. Though the NB model's accuracy is not the best, its ROC-AUC value of 0.981 was the closest to the LR's, establishing it as the second-best-fitted model, whereas, with 96.01% accuracy, RF and SGD are the second-best-performing models for unseen clinical data. The proposed model with the dimension reduction technique (PCA) produces the final prediction for a classifier in an average of 1.93 seconds for the CKD-15 dataset and 6.57 seconds for the CKD-21 dataset. The model without PCA takes 2.33 seconds for the CKD-15 dataset and 15.9 seconds for the CKD-21 dataset, as shown in Table 10. The hybrid dataset model takes 7.7 seconds, while the unseen dataset model takes 7.95 seconds. An ML model requires datasets of the same dimension to create a hybrid dataset, and it likewise requires matching dimensions for training and testing; hence, the average required time without PCA could not be calculated for the hybrid and unseen cases.
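The with/without-PCA timing comparison above can be reproduced in outline with a simple wall-clock measurement around fit-and-predict. This is a generic sketch on synthetic data, not the paper's benchmark; absolute times will differ by machine, and `time.perf_counter` is only one reasonable choice of timer.

```python
import time
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 25)), rng.integers(0, 2, 500)

def timed_fit_predict(model):
    """Return wall-clock seconds for one fit + predict cycle."""
    start = time.perf_counter()
    model.fit(X, y)
    model.predict(X)
    return time.perf_counter() - start

# Pipeline with PCA reducing to 10 components vs. the raw feature space.
with_pca = timed_fit_predict(make_pipeline(
    PCA(n_components=10), LogisticRegression(max_iter=1000)))
without_pca = timed_fit_predict(LogisticRegression(max_iter=1000))
print(f"with PCA: {with_pca:.4f}s, without PCA: {without_pca:.4f}s")
```

On larger, noisier clinical feature sets the reduced space typically speeds up the downstream classifier, which is consistent with the Table 10 trend reported above.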

Results and Discussion
Our innovative machine learning approach is fully automated and integrates feature selection through L1 regularization and feature space reduction using PCA during the preprocessing phase. These techniques were specifically designed to address the complexities of therapeutic data in CKD diagnosis, with a primary focus on enhancing early-stage prediction accuracy. Consequently, the intelligent model surpasses all other existing methodologies. A comprehensive analysis of the performance on the four types of datasets, namely CKD-15, CKD-21, hybrid, and unseen clinical cases, is presented in Tables 6-9. The datasets (CKD-21, hybrid, and unseen clinical cases) employed here are novel and unique for early-stage CKD detection, and no previous works used these datasets for CKD diagnosis. While a direct comparison with state-of-the-art methods was not feasible for these specific datasets (CKD-21, hybrid, and unseen clinical cases), this study thoroughly evaluated the proposed approach on the CKD-15 dataset in Table 11. The outcomes demonstrated the superiority of our approach over previous works by a wide margin for the CKD-15 dataset, establishing its effectiveness in CKD detection. Table 6 shows that, overall, the ADAB, DT, and RF classifiers achieve better performance than other models regardless of PCA usage, while the GB classifier performs better when PCA is utilized. The performance of the other models steadily decreased when PCA was not considered. The four ML models ELM, SVM, SGD, and PAC perform worst without PCA for the CKD-21 dataset, as depicted in Table 7. To the best of our knowledge, this is the first work on this dataset to distinguish the CKD class from the non-CKD class. A few works have been conducted on the CKD-21 dataset, but they are limited to identifying renal disease risk factors only. Furthermore, no works on CKD-hybrid and clinical unseen data are available to compare with our model outputs.
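The preprocessing chain described above (L1-based feature selection followed by PCA, then a classifier) maps naturally onto a scikit-learn `Pipeline`. The sketch below is a minimal illustration on synthetic data; the component counts, `max_features=10` cap, and final classifier are assumptions for the example, not the paper's tuned configuration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data standing in for the CKD-15 features.
X, y = make_classification(n_samples=503, n_features=24, n_informative=5,
                           random_state=1)

# FS: L1-penalized logistic regression ranks features by |coef|;
# threshold=-inf with max_features keeps exactly the top 10.
# FSR: PCA then compresses the selected features to 3 components.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("fs", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
        threshold=-float("inf"), max_features=10)),
    ("fsr", PCA(n_components=3)),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)
print("features kept by L1 FS:", pipe["fs"].get_support().sum())
print("PCA components:", pipe["fsr"].n_components_)
```

Keeping FS and FSR inside the pipeline ensures both are fitted on training folds only, avoiding leakage when the model is later applied to unseen clinical data.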
Figure 3 shows the ROC-AUC curves, and Figure 4 shows the result comparison for all 12 ML models on the four types of datasets (i.e., CKD-15, CKD-21, hybrid, and unseen case) using PCA. For the CKD-15 dataset, all models perform with 100% accuracy except for the NB model. The PAC model performs worst for the CKD-21 dataset. In aggregate, the ADAB, DT, ELM, GB, KNN, SVM, and RF models perform best for both datasets.
MLP performs best for the hybrid dataset (99.12% test accuracy), and NB performs worst (95.18% test accuracy). The average performance on the hybrid dataset was relatively good and steady, whereas for the unseen clinical data, ML model performance degraded from 96.48% (ADAB, LR, and PAC) to 92.97% (DT and GB) test accuracy.
To evaluate the overall performance of the ML models on the four types of datasets, this exploration presents the average performance analysis in Table 12 and plots the average train-test accuracy and ROC-AUC values in Figures 5 and 6, respectively.

Conclusion
Kidney failure causes diseases ranging from mild to severe, has significant health implications, and demands accurate diagnosis, especially in rural areas of developing countries where specialists are limited. To address these issues, this work suggests an intelligent diagnostic system for early CKD detection in a clinical environment with high accuracy and in a time-efficient manner. The suggested model was evaluated from four distinct perspectives to enhance its real-life clinical performance and credibility. To optimize the model's performance, necessary corrections were made to the datasets. It outperforms previous studies for the CKD-15 dataset and exhibits impressive accuracy for test data. This positions it as a valuable novel solution and establishes its validity. Kidney disease prediction has been improved effectively by employing the logistic regression method with the "L1" penalty for feature selection and PCA for feature space reduction, alongside an ensemble-characteristics-based classifier. The model also shows notable performance for the CKD-21 and hybrid datasets.
To validate the significance of this study and its clinical relevance, the proposed intelligent diagnostic system was finally evaluated on clinically unseen complex data. Incorporating PCA into the model improved the CKD detection performance and significantly decreased the analysis time, specifically by 0.4 seconds for the CKD-15 dataset and 9.33 seconds for the CKD-21 dataset. This demonstrates the real-life clinical applicability of the suggested model.
A future investigation might include performing statistical tests on more patient-centric data. The study acknowledges the necessity of further work on clinical benchmark data to facilitate thorough comparisons with state-of-the-art methods, especially for novel datasets like CKD-21, hybrid, and unseen clinical cases. Furthermore, validation by domain experts is a necessary step prior to clinical implementation.

Figure 4 :
Figure 4: Accuracy (%) comparison graph for the four types of data using PCA.

Table 1 :
Data representation from non-numerical to numerical label using encoding operation.
3.2.2. Decision Tree. The basic goal of the decision tree algorithm is to generate a prediction model from a set of training data to predict the classes or values of target variables. The DT algorithm is structured like a tree, with leaves, branches, and roots. Compared to other classification algorithms, the DT algorithm is simple to grasp.

3.2.3. Random Forest. This algorithm creates multiple decision trees during training and outputs the class predicted by the majority of the individual trees.
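The DT and RF classifiers described above can be contrasted in a few lines of scikit-learn. This is a toy sketch on synthetic data standing in for the CKD features; the sample and feature counts are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy binary-classification data standing in for the CKD features.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A single decision tree versus a random forest, which trains many trees
# on bootstrap samples and outputs the majority-vote class.
dt = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("DT accuracy:", dt.score(X_te, y_te))
print("RF accuracy:", rf.score(X_te, y_te))
```

The forest's averaging over many trees typically reduces the variance of a single deep tree, which is why RF often generalizes better on noisy clinical data.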

Table 6 :
Experimental result analysis of various classifiers with and without PCA on dataset-1 (CKD-15).

Table 8 :
Experimental result analysis of different classifiers for the hybrid dataset (mixture of CKD-15 and CKD-21).

Table 9 :
Experimental result analysis for clinical unseen dataset, i.e., training the model with CKD-15 and testing with CKD-21.

Table 10 :
Average model processing time comparison for the proposed model with and without PCA.

Table 11 :
Experimental result (accuracy (%)) comparison for CKD-15 with state-of-the-art methods. The best results are indicated in bold.

Table 12 :
Average performance analysis of ML models on all four types of datasets. The best results are indicated in bold.

The ELM, LR, MLP, and NB models show the best average ROC-AUC performance at 0.99, whereas DT shows the least at 0.91, and the other classifiers achieve a ROC-AUC of 0.98, as shown in Figure 6. Figure 5 shows that, on the CKD data, the RF classifier performs best on average (98.48% test accuracy and 0.98 average ROC-AUC), and ADAB takes the second-best position with 98.46% test accuracy. Though the train-test accuracy gaps are smaller for the SVM, KNN, and ELM classifiers, overall, for kidney disease prediction, the RF model could be the best choice for the CKD-15, CKD-21, and unseen clinical data considering the accuracy and ROC-AUC performances, and the MLP model could be the best model for hybrid renal data.
Figure 5: Average train-test accuracy (%) comparison of ML models using PCA on chronic kidney disease data.
Figure 6: Average ROC-AUC comparison of ML models using PCA on chronic kidney disease data.

On the clinically unseen data, the system achieved impressive performance, demonstrating its potential as a valuable patient-centric solution for early CKD diagnosis in clinical practice. The implementation of the proposed system in local healthcare systems would allow for a swift assessment of patients for early-stage CKD identification.