Stroke Subtype Clustering by Multifractal Bayesian Denoising with Fuzzy C Means and K-Means Algorithms

Multifractal denoising techniques have attracted interest in biomedicine, economics, and signal and image processing. Stroke data contain subtle details that physicians cannot easily detect by eye. For the diagnosis of stroke subtypes, these details matter because they carry hidden information related to medical history, laboratory results, and treatment. Recently, K-means and fuzzy C means (FCM) algorithms have been applied to many datasets in the literature. We present efficient clustering algorithms that eliminate irregularities in a given stroke dataset using 2D multifractal denoising techniques (Bayesian (mBd), Nonlinear (mNold), and Pumping (mPumpD)). In contrast to previous methods, our method has the following assets: (a) it does not reduce the attributes of the stroke datasets, which allows an efficient clustering comparison of stroke subtypes with the resulting attributes; (b) it detects attributes that eliminate “insignificant” irregularities while keeping “meaningful” singularities; (c) it yields successful clustering accuracy by enhancing the quality of the stroke data. Our study is therefore a comprehensive comparative study in which stroke datasets obtained from 2D multifractal denoising techniques are clustered with the K-means and FCM algorithms. This is done for the first time in the literature, and the results reveal that the 2D mBd technique is the most successful feature descriptor in each stroke subtype dataset with respect to the accuracy rates of these algorithms.


Introduction
Multifractal analysis is concerned with the study of the regularity structure of processes, from both a local and a global point of view. Multifractal Bayesian denoising is a regularity-based enhancement technique: it finds data that are close to the observations and that have a prescribed multifractal spectrum. The method depends on the tuning of a small set of parameters, which can provide different improvements of the observed noisy data. It has been used successfully in many applications in which irregularity carries important information. Many natural phenomena in fields such as physics, finance, construction, environment, medicine, and biology have been shown to display fractal behavior [1][2][3].
Stroke is the third most frequent cause of death after heart disease and cancer in developed countries, and it is among the most common causes of cognitive impairment and vascular dementia [4]. Stroke can be described as the rapid loss of brain function owing to a disturbance in the blood supply to the brain [5]. As one of the foremost causes of death worldwide, it is generally classified into two clinical types: ischemic stroke and hemorrhagic stroke [6].
In this study, we worked on the dataset of individuals who have been diagnosed with ischemic stroke of the following subtypes: no stroke/TIA, large vessel, small vessel, cardioembolic, cryptogenic, dissection, and other (moyamoya, FMD, hereditary, coagulopathy, vasculitis, other rare). No stroke/TIA refers to transient ischemic attack, a mini or minor stroke that happens when a blood clot blocks an artery for a brief period of time [7]. Large vessel stroke is an infarction caused by artery-to-artery or low-flow embolism in the ipsilateral arterial tree (intracranial or extracranial segments of the carotid or vertebrobasilar arteries, or the proximal middle cerebral artery) [8]. Small vessel diseases of the cerebral vasculature contribute to varied forms of brain dysfunction, cell death, and injury. Small vessel disease of the brain accounts for ≈25% to 30% of strokes, and it is a primary cause of age-related and hypertension-related cognitive decline and disability [9]. Cardioembolic stroke is largely preventable, which calls for efforts at primary prevention for major-risk cardioembolic sources. Once a stroke due to cardiac embolism has occurred, the chance of recurrence is comparatively high for most cardioembolic sources. Cryptogenic stroke is also commonly seen in clinical practice. It is defined as a brain infarction not attributable to a source of definite cardioembolism, small artery disease, or large artery atherosclerosis, even after a standard cardiac, vascular, and serologic evaluation [10]. Dissection stroke refers to cervical arterial dissection, a main cause of stroke in young adults. However, its management remains uncertain despite standard treatment with anticoagulants or antiplatelet drugs [11]. Demographic information, medical history, results of the laboratory tests, treatments, and medications are the major groups into which the most important patient data can be categorized.
In recent years, fractal and multifractal analysis of biomedical data has attracted growing interest. Wang et al. [12], Yang et al. [13], Karaca and Cattani [14], Tsaneva [15], Doubal et al. [16], Shanmugavadivu et al. [17], and Ahammer et al. [18] underlined the significance of fractal and multifractal techniques for data analysis in medicine. It has also been acknowledged that multifractal techniques provide successful feature descriptors in stroke applications [19][20][21]. Numerous studies report successful clustering results for stroke subtypes with applications of K-means and FCM [22,23]. However, the literature lacks studies that combine numeric data [24,25], 2D multifractal denoising techniques, and machine learning approaches.
Our approach aims to be broader and more complete than other studies of stroke datasets in the literature [22,23], since it considers a dataset of dimension 2204 (the number of patients with 7 different stroke subtypes) by 23 attributes. The 7 stroke subtypes are as follows: no stroke/TIA, large vessel, small vessel, cardioembolic, cryptogenic, dissection, and other (moyamoya, FMD, hereditary, coagulopathy, vasculitis, other rare). The attributes include demographic information, medical history, results of laboratory tests, treatments, and medications. Clustering the subtypes of stroke is a remarkable challenge in its own right. Although many different kinds of analysis have been carried out on stroke datasets, no work has yet been reported that relates the attributes (demographic information, medical history, results of laboratory tests, treatments, and medications) through 2D multifractal denoising techniques to the fuzzy C means and K-means algorithms for clustering purposes. For this reason, 2D multifractal denoising techniques (mBd, mNold, mPumpD) have been applied to identify the significant and efficient attributes of the patients (among 23 of them) for the clustering of the 7 subtypes of stroke. This approach is well adapted to the case in which the data to be recovered are very irregular and nowhere differentiable, a property relevant to fractal or self-similar structures. We obtained regularity-based enhanced datasets from the 2D multifractal denoising techniques (mBd, mNold, mPumpD). These datasets were clustered using the K-means and FCM algorithms. The 2D mBd stroke dataset yielded better clustering of the stroke subtypes than the original stroke dataset, the 2D mNold stroke dataset, and the 2D mPumpD stroke dataset. Compared with the studies mentioned above, our study is comprehensive and comparative, since stroke datasets obtained from 2D multifractal denoising techniques are applied to the K-means and FCM clustering algorithms for the first time in the literature.

The paper is organized as follows: Section 2 provides Materials and Methods. The methods of our approach cover basic facts on Hölder regularity and multifractal analysis, multifractal Bayesian denoising in $S(g, \psi)$, numerical experiments (stroke dataset experiments with the multifractal Bayesian denoising technique), the K-means algorithm, and the fuzzy C means algorithm. Results and discussion and conclusions are presented in Sections 3 and 4, respectively.

Patient Details.
2204 individuals from Massachusetts Medical School, University of Worcester, Massachusetts, USA, were kept under observation in this study. Data were collected in the period between March 9, 2007, and October 2, 2016. The individuals had an ischemic stroke diagnosis (see Figure 1). Ischemic strokes in the dominant hemisphere lead to more functional deficits than strokes in the nondominant hemisphere, as evaluated on the National Institutes of Health Stroke Scale (NIHSS).
A total of 2204 patients (414 males [labelled with (1)] and 1790 females [labelled with (0)]) were included in our experiments. The patients are aged between 0 and 104, and seven subtypes of ischemic stroke are examined in this study. The data on demographic information, medical history, results of laboratory tests, treatments, and medications pertaining to the 2204 stroke subtype patients can be seen in Table 1. Table 2 provides the main headings of the attributes used for the stroke subtypes.
Baseline characteristics of the patients involved as stratified by infarct side are outlined in Table 3 regarding the stroke dataset.
Medications given to the patients are classified into broad categories: statin, antiHTN, antidiabetic, antiplatelet, and anticoagulation. The attributes CT perfusion and neurointervention describe the treatment. The modified Rankin Scale (mRS) score was evaluated at 90 days, in person or by phone interview, by a physician trained in stroke or by a stroke nurse certified in mRS. We comply with the Strengthening the Reporting of Observational Studies in Epidemiology guideline (https://www.strobe-statement.org/). In 0.09% of the individuals, the disorder progressed to hemorrhagic stroke.

Methods.
In this study, we make two contributions. First, we introduce 2D mBd, 2D mNold, and 2D mPumpD, which are relatively novel multifractal techniques that compute regular data from stroke data. Second, we propose training the unsupervised K-means and FCM algorithms on the stroke dataset and on the regular, denoised stroke datasets (2D mBd, 2D mNold, 2D mPumpD), with the aim of improving the clustering performance for the stroke subtypes. Our method relies on the following step: (a) 2D multifractal denoising techniques (2D mBd, 2D mNold, 2D mPumpD) were applied to the stroke dataset (see Table 2) in order to identify its significant regularity, which is the fundamental concept of multifractal denoising. The aim is to eliminate "insignificant" irregularities while retaining "meaningful" singularities, which yields the denoised dataset.
We are concerned with the enhancement, or denoising, of complex data, here the stroke dataset, based on the analysis of the local Hölder regularity. We suppose that data enhancement is equivalent to increasing the Hölder regularity at every point [27]. Such methods are well adapted to the case in which the data to be recovered are highly irregular, for instance, nowhere differentiable with rapidly varying local regularity. In this paper, we focus on the pointwise Hölder exponent to simplify the notation, and we assume that our data are not differentiable [28,29]. Let $\alpha \in (0, 1)$. The data $X$ are Hölder regular of order $\alpha$ at $t_0$ if, for all $t$ in a neighborhood of $t_0$,
$$|X(t) - X(t_0)| \le C\,|t - t_0|^{\alpha}, \qquad (1)$$
where $C$ is a constant. The pointwise Hölder exponent of $X$ at $t_0$, denoted by $\alpha(t_0)$, is the supremum of the $\alpha$ for which (1) is valid.
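As an illustration of the pointwise Hölder exponent, the following minimal sketch estimates it numerically as the slope of log-oscillation against log-radius around a point. The function name `estimate_holder_exponent`, the grid size, and the choice of scales are assumptions made for this example; they are not taken from the study or from FracLab.

```python
import math

def estimate_holder_exponent(f, t0, scales):
    """Least-squares slope of log(oscillation) against log(radius) around t0."""
    xs, ys = [], []
    for r in scales:
        # Oscillation proxy: largest |f(t) - f(t0)| on a 101-point grid over [t0 - r, t0 + r].
        grid = [t0 + r * (i / 50.0 - 1.0) for i in range(101)]
        osc = max(abs(f(t) - f(t0)) for t in grid)
        xs.append(math.log(r))
        ys.append(math.log(osc))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Toy example: the cusp |t|^0.5 has pointwise Hölder exponent 0.5 at t0 = 0.
scales = [2.0 ** (-j) for j in range(3, 12)]
alpha_cusp = estimate_holder_exponent(lambda t: abs(t) ** 0.5, 0.0, scales)
```

For the cusp, the oscillation over radius r is exactly r^0.5, so the fitted slope recovers the exponent; on real, noisy data such a regression is only a rough estimate.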
In this paper, we concentrate on the statistical approach, which leads to the consideration of a quantity named the large deviation multifractal spectrum [29,30]. This spectrum is defined as follows. Consider a stochastic process $X(t)$, $t \in T \subset \mathbb{R}$, on a probability space $(\Omega, \mathcal{F}, P)$. For notational convenience, we assume without loss of generality that $T = [0, 1]$.
Set $N_n^{\varepsilon}(\alpha) = \#\{k : \alpha - \varepsilon \le \alpha_n^k \le \alpha + \varepsilon\}$, where $\alpha_n^k$ is the coarse-grained Hölder exponent that corresponds to the dyadic interval $I_n^k = [k2^{-n}, (k + 1)2^{-n}]$; that is, $\alpha_n^k = \log|Y_n^k| / (-\log n)$. Here $\#$ counts the nonempty boxes whose measure is characterized by exponents from the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$ [29], and $Y_n^k$ is some quantity measuring the variation of $X$ in the interval $I_n^k$. The choice $Y_n^k := X((k + 1)2^{-n}) - X(k2^{-n})$ leads to the simplest analytical computations. Another possibility, which is the one used in this paper, is to take $Y_n^k$ to be the wavelet coefficient $x_{j,k}$ of $X$ at scale $j = n$ and location $k$ [29]. This definition is convenient in various respects, since it makes use of the versatility of wavelet bases. However, it also has a drawback: the multifractal spectrum obtained in this way depends on the wavelet chosen. Hence, if one sets $Y_n^k := x_{j,k}$, it does not make sense to speak of the spectrum of $X$ without reference to the analyzing wavelet. Throughout the paper, the wavelet coefficient of data $X$ is denoted by $x_{j,k}$, with $j$ being the scale and $k$ being the location [29,30].
The coarse-grained multifractal large deviation spectrum is then given by $F_g(\alpha) = \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \log N_n^{\varepsilon}(\alpha) / \log n$. The definition of $F_g(\alpha)$ is connected with the large deviation theorem, which provides the probabilistic interpretation of the multifractal spectrum. It should be noted that, irrespective of the choice of $Y_n^k$, $F_g$ always ranges in $\mathbb{R}^+ \cup \{-\infty\}$. The value $-\infty$ corresponds to values of the coarse-grained exponent that are not observed at sufficiently small scales [29].
The intuitive meaning of $F_g$ is as follows. For $n$ large enough, one has approximately $P_n(\alpha_n^k \simeq \alpha) \simeq 2^{-n(1 - F_g(\alpha))}$, in which $P_n$ denotes the uniform distribution over $\{0, 1, \ldots, 2^n - 1\}$. Hence, for all $\alpha$ such that $F_g(\alpha) < 1$, $1 - F_g(\alpha)$ measures the exponential rate of decay of the probability of finding an interval $I_n^k$ with coarse-grained exponent equal to $\alpha$, when $n$ tends to infinity. As a whole, $F_g$ is a random function. In applications, it is convenient to consider a deterministic version of $F_g$; see [29].
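The counting step behind the large deviation spectrum can be sketched as follows. This is a hedged illustration: it uses increments over dyadic intervals as the variation measure, and it normalizes by the interval length 2^(-n) (base-2 logarithm divided by -n), which is one common convention and an assumption here. The function names are hypothetical.

```python
import math

def coarse_grained_exponents(x, n):
    """Exponents log2|Y| / (-n) from increments Y of x over the 2**n dyadic intervals.

    x must hold 2**n + 1 samples of the path on [0, 1].
    """
    exponents = []
    for k in range(2 ** n):
        y = x[k + 1] - x[k]
        # A zero increment would give an infinite exponent; such boxes are skipped.
        if y != 0.0:
            exponents.append(math.log2(abs(y)) / (-n))
    return exponents

def count_boxes(exponents, alpha, eps):
    """Number of coarse-grained exponents lying in [alpha - eps, alpha + eps]."""
    return sum(1 for a in exponents if alpha - eps <= a <= alpha + eps)

# Toy path X(t) = sqrt(t): only the interval touching t = 0 has exponent 0.5.
n = 10
path = [math.sqrt(k / 2 ** n) for k in range(2 ** n + 1)]
alphas = coarse_grained_exponents(path, n)
boxes_near_half = count_boxes(alphas, 0.5, 0.01)
```

Dividing the logarithm of such counts by the chosen scale normalization, for increasing n and shrinking tolerance, gives a numerical approximation of the spectrum.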
Here, we present a technique based on the multifractal spectrum rather than on the Hölder exponent alone. This approach generally yields more robust estimates, because it uses a higher-level description that subsumes information on the entire data. We also adopt a semiparametric approach. More specifically, we assume that the considered data belong to a given set of parameterized classes, which we now describe [31].
For $j$ sufficiently large, the assumption that the wavelet coefficients $(x_{j,k})_k$ at scale $j$ are identically distributed imposes a constraint on their law. Consequently, Definition 1 yields a plain interpretation in terms of multifractal analysis. We consider the set of random data $X$ for a given wavelet $\psi$. The normalized data have a deterministic multifractal spectrum $F_g(\alpha)$ that equals $1 + g(\alpha)$, along with the following further condition: $F_g$ is attained as a limit rather than a lim sup, and this limit is attained uniformly. This condition ensures that the rescaled statistics of the $x_{j,k}$ are sufficiently close to their limit for large enough $j$, which allows a meaningful inference. The set $S(g, \psi)$ includes a wide variety of data.

Multifractal Bayesian Denoising in S(𝑔, 𝜓)
We now recall the key steps of the classical Maximum A Posteriori (MAP) approach in a Bayesian framework, adjusted to our setting. The observed noisy data are $Y$, and it is assumed that $Y = X + B$, where $B$ is a noise that is independent of the original data $X$ and has a known law (it should be noted that we use orthonormal wavelets). Hence, we have $y_{j,k} = x_{j,k} + b_{j,k}$. The MAP estimate $\hat{x}_{j,k}$ of $x_{j,k}$ from the observation $y_{j,k}$ is defined to be an argument maximizing $P(x_{j,k} / y_{j,k})$. Since $P(y_{j,k})$ does not depend on $x_{j,k}$, Bayes' rule shows that maximizing $P(x_{j,k} / y_{j,k})$ corresponds to maximizing the product $P(y_{j,k} / x_{j,k}) P(x_{j,k})$ [29,30].
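The Bayes step can be written out as a short worked equation. The following is the generic MAP reduction only; the specific multifractal prior used in the study is not reproduced here, and the Gaussian likelihood line assumes centered Gaussian noise of variance sigma squared, as in the experiments described in the text.

```latex
\hat{x}_{j,k}
  = \arg\max_{x} P(x \mid y_{j,k})
  = \arg\max_{x} \frac{P(y_{j,k} \mid x)\, P(x)}{P(y_{j,k})}
  = \arg\max_{x} \bigl[ \log P(y_{j,k} \mid x) + \log P(x) \bigr],
\qquad
\log P(y_{j,k} \mid x) = -\frac{(y_{j,k} - x)^2}{2\sigma^2} + \text{const}.
```

The denominator drops out because it does not depend on the candidate value x, so the estimate trades off fidelity to the observation against the prior on the coefficient.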
The MAP estimate is given in [30]. This leads to the definition of an approximate Bayesian MAP estimate, Equation (6), in which $\mathrm{sgn}(y)$ is the sign of $y$ and $Q = (\sup_{j > 0} \sup_k (y_{j,k}))^{-1}$. The estimate for $Q$ can be justified heuristically as follows: writing $\log_2(|Q x_{j,k}|)/(-j) \simeq \alpha$ with $\alpha > 0$ implies that $|Q x_{j,k}| < 1$ for all couples $(j, k)$, and $Q$ is chosen as the smallest normalizing factor that entails this inequality. In our experiments, we address the case in which the noise is centered and Gaussian with variance $\sigma^2$. Equation (6) provides an explicit formula for denoising $Y$, but it is often of limited practical use. In fact, in most applications the multifractal spectrum of $X$ is not known, and without an estimate of it, (6) cannot be used to obtain $\hat{x}_{j,k}$. Moreover, one should bear in mind that in general $F_g$ depends on the analyzing wavelet. Hence, it is necessary to know the shape of the spectrum for a given wavelet. Furthermore, the main goal of our approach is to extract the multifractal characteristics of $X$ from the denoised data $\hat{X}$: a strong justification of the multifractal Bayesian approach used in our study is the ability to estimate the spectrum of $X$ in the following manner: (a) denoise $Y$, (b) evaluate the spectrum of $\hat{X}$ numerically, (c) take the spectrum of $\hat{X}$ as the estimate of the spectrum of $X$. From this point of view, it does not make sense to require prior knowledge of the spectrum of $X$ in the Bayesian approach. Hence, we present a "degenerated" version of (6), in which the input is a single real parameter rather than the whole spectrum. The heuristic reads as follows: from a regularity perspective, a significant piece of information in the data is contained in the support of the spectrum, that is, the set of all the occurring Hölder exponents. 
$\alpha_0$ denotes the smallest regularity actually observed in the data [32,33]. The shapes of the $F_g$ spectra obtained with different analyzing wavelets depend on the wavelet, but their support is always included in $[\alpha_0, \infty)$. The "flat spectrum" therefore carries intrinsic information, and it depends only on the positive real $\alpha_0$. Rewriting (6) with a flat spectrum gives an explicit, simple expression. $\alpha_0$ is genuinely a priori information, but it can be estimated from the noisy observations. In that respect, it is analogous to the threshold used in the classical soft or hard wavelet thresholding schemes. In applications, it is useful to regard $\alpha_0$ as a tuning parameter: increasing $\alpha_0$ yields a smoother estimate (as the original data are then assumed to have a larger minimal exponent). It is instructive to compare this with the hard-thresholding policy on the wavelet coefficients (see [29] for more details).
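The degenerate, single-parameter version can be illustrated with a small sketch. Under a flat prior supported on exponents no smaller than a minimal exponent `alpha0`, and with Gaussian noise, maximizing the posterior amounts, on our reading, to clipping each wavelet coefficient at scale j to magnitude 2^(-j * alpha0) / q. This is a derivation from the description above, not the FracLab implementation. The sketch uses a Haar transform instead of a Daubechies wavelet for brevity, ignores the scale-dependent normalization of orthonormal coefficients, and sets the normalizing factor `q` to 1 by default; all names are hypothetical.

```python
import math

def haar_forward(x):
    """Orthonormal Haar transform of a length-2**J list.

    Returns (approximation, details) with details ordered from coarse to fine.
    """
    approx = list(x)
    details = []
    while len(approx) > 1:
        half = len(approx) // 2
        a = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        d = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details.insert(0, d)  # finer levels are computed first, so prepend coarser ones
        approx = a
    return approx, details

def haar_inverse(approx, details):
    """Inverse of haar_forward."""
    a = list(approx)
    for d in details:
        nxt = []
        for ai, di in zip(a, d):
            nxt.append((ai + di) / math.sqrt(2))
            nxt.append((ai - di) / math.sqrt(2))
        a = nxt
    return a

def flat_spectrum_denoise(y, alpha0, q=1.0):
    """Clip detail coefficients at level j (1 = coarsest) to 2**(-j * alpha0) / q."""
    approx, details = haar_forward(y)
    clipped = []
    for j, d in enumerate(details, start=1):
        bound = 2.0 ** (-j * alpha0) / q
        clipped.append([max(-bound, min(bound, c)) for c in d])
    return haar_inverse(approx, clipped)

noisy = [0.0, 1.0, 0.5, -0.5, 2.0, 0.0, 0.25, 0.75]
denoised = flat_spectrum_denoise(noisy, alpha0=0.5)
```

Unlike hard thresholding, which zeroes small coefficients, this rule leaves small coefficients untouched and shrinks only those that are too large for the assumed minimal regularity; increasing `alpha0` tightens the bound and smooths the output.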

Numerical Experiments.
We present some results for the stroke dataset. In each case, the results of Bayesian multifractal denoising and of the classical hard-thresholding technique are shown. For all procedures and for the stroke dataset, the parameters (see Tables 2 and 3) are set to obtain the best fit to the known original data. On the whole, the following conclusions can be drawn from these experiments. For irregular data such as those handled here, which belong to $S(g, \psi)$, the Bayesian method yields more satisfactory results than classical wavelet thresholding. In particular, it preserves a roughly correct regularity along the path, whereas wavelet shrinkage yields data with regions that are too smooth and regions that are too irregular. In this study, the numerical experiments were obtained by applying the 2D mBd, 2D mNold, and 2D mPumpD techniques to the stroke dataset (see Table 2).
The steps of the multifractal Bayesian denoising technique applied to the stroke dataset in order to obtain the regular, denoised stroke dataset are given below.
Steps 1-2. We consider each attribute in the stroke dataset ($X = (x_i)_{i = 1, \ldots, 2204 \times 23}$) and calculate the MAP values ($\hat{x}_{j,k}$ of $x_{j,k}$). The model of the coefficients involves two positive constants and, at each position $(j, k)$, a random variable supported in [0, 1]. These random variables are independent and have the same law levelwise; that is, the variables at positions $(j, k)$ and $(j, k')$ are identically distributed with probability distribution $P_j$ for all $j$, $k$, $k'$. In addition, we assume that $P_j(0) < 1$ for infinitely many $j$.
In line with the law $P_j$ governing the local regularity behavior of the functions in $S(g, \psi)$, we consider the particular case of data with uniformly distributed wavelet coefficients.
Step 3. Our second type of stroke data is based on one of the simplest fractal stochastic processes, the fractional Brownian motion (fBm) (for more information see [29]). As is well known, fBm is the zero-mean Gaussian process $X(t)$ with covariance function
$$E[X(t)X(s)] = \frac{\sigma^2}{2}\left(|t|^{2H} + |s|^{2H} - |t - s|^{2H}\right),$$
where $H$ is a real number in (0, 1) and $\sigma$ is a real number as well.
Step 4. The result of our denoising procedure is, in principle, wavelet-dependent. The impact of the wavelet is controlled through the choice of the prior, that is, the multifractal spectrum, among all admissible ones. In practice, we have found that few variations are observed if one uses a Daubechies wavelet [29] of length 10 and a nonincreasing spectrum supported on $[H, \infty)$ with $F_g(H) = 1$.
The denoised stroke dataset obtained by applying Steps 1 to 4 to the stroke dataset is presented in Figure 2(a), the mesh plot display for the 2D mBd stroke dataset.
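The fBm covariance mentioned in Step 3 has a standard closed form, which the following minimal sketch evaluates. The function name and the default `sigma=1.0` are assumptions for this example; the Hurst value 0.6 matches the value reported later for the experiments.

```python
def fbm_covariance(s, t, hurst, sigma=1.0):
    """Covariance E[X(s) X(t)] of fractional Brownian motion with Hurst exponent in (0, 1)."""
    two_h = 2.0 * hurst
    return 0.5 * sigma ** 2 * (abs(s) ** two_h + abs(t) ** two_h - abs(s - t) ** two_h)

variance_at_2 = fbm_covariance(2.0, 2.0, hurst=0.6)   # equals 2 ** 1.2
brownian_cov = fbm_covariance(0.3, 0.7, hurst=0.5)    # ordinary Brownian motion: min(s, t)
```

For a Hurst exponent of 0.5 the formula reduces to the covariance of ordinary Brownian motion; larger values give smoother, positively correlated paths.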
In this study, the 2D mBd technique is applied to the main headings of the attributes (see Table 2) pertaining to the stroke subtypes. The 2D mBd multifractal technique relies on the fact that stroke data enhancement is equivalent to increasing the Hölder regularity at each point. The stroke dataset ($X$) was processed with the FracLab [26] program to obtain the regularity-based enhancement ($\hat{X}$) from the 2D multifractal denoising techniques. The regularity-based enhanced 2D mBd stroke dataset $\hat{X}$ was then clustered with the K-means and FCM algorithms. This yielded the most accurate clustering of the stroke subtypes.

𝐾-Means Algorithm.
Let $X = (x_1, x_2, \ldots, x_n)$ be $d$-dimensional data to be grouped into a set of $K$ clusters, $C = \{c_k, k = 1, \ldots, K\}$. The K-means algorithm finds a partition in which the squared error between the empirical mean of a cluster and the points in the cluster is minimized. With $\mu_k$ being the mean of cluster $c_k$, the squared error between $\mu_k$ and the points in cluster $c_k$ is defined as [34][35][36]
$$J(c_k) = \sum_{x_i \in c_k} \|x_i - \mu_k\|^2.$$
The aim of K-means is to minimize the sum of the squared error over all $K$ clusters,
$$J(C) = \sum_{k=1}^{K} \sum_{x_i \in c_k} \|x_i - \mu_k\|^2.$$
K-means starts with an initial partition with $K$ clusters and assigns patterns to clusters so that the squared error is reduced. The squared error always decreases with an increase in the number of clusters $K$ (with $J(C) = 0$ when $K = n$), so it can be minimized only for a fixed number of clusters [37].
The clustering of the training set with the K-means clustering algorithm can be seen in Algorithm 1.
The key steps of the K-means algorithm are as follows:
Step 1. Choose an initial partition with $K$ clusters; repeat Steps 2 and 3 until the cluster membership stabilizes.
Step 2. Form a new partition by assigning each pattern to its closest cluster center.
Step 3. Calculate the new cluster centers.
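The key steps above can be sketched in a few lines of plain Python. This is a minimal illustration on toy data, not the code used in the study: the function names, the explicit initial centers, and the handling of empty clusters are assumptions made for the example.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, centers, max_iter=1000):
    """Alternate assignment and center updates from the given initial centers.

    Returns (centers, labels, sse), where sse is the sum of squared errors.
    """
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its closest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:  # cluster membership has stabilized
            break
        labels = new_labels
        # Recompute each center as the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse

toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_centers, toy_labels, toy_sse = kmeans(toy_points, centers=[(0, 0), (10, 10)])
```

For the stroke data, `points` would hold one 23-dimensional row per patient and `centers` would contain 7 initial centers; the result depends on that initialization, which is why several restarts are commonly compared by their total squared error.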

The clustering of the stroke dataset with the 𝐾-means clustering algorithm can be seen as follows:
The stroke dataset $X = (x_1, x_2, \ldots, x_{2204 \times 23})$ was applied to the K-means algorithm. The main steps of the K-means algorithm are as follows:
Step 1. An initial partition with $K = 7$ clusters is selected for $X = (x_1, x_2, \ldots, x_{2204 \times 23})$; Steps 2 and 3 are repeated until the cluster membership of the stroke subtypes stabilizes.
Step 2. For 1000 iterations, the data of each patient in the stroke dataset are assigned to the closest cluster centroid.
Step 3. The new cluster centroids of the stroke subtypes are calculated.
The results of the clustering procedure obtained by applying the K-means algorithm (see Algorithm 1) to the stroke dataset are presented in Figure 4(a).
The clustering of the 2D mBd stroke dataset with the K-means clustering algorithm can be described as follows: the stroke datasets (mBd, mNold, mPumpD) obtained from the 2D multifractal denoising techniques, $\hat{X} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_{2204 \times 23})$, were applied to the K-means algorithm. The main steps of the K-means algorithm are as follows:
Step 1. An initial partition with $K = 7$ clusters is selected for $\hat{X} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_{2204 \times 23})$; Steps 2 and 3 are repeated until the cluster membership of the stroke subtypes stabilizes.
Step 2. For 1000 iterations, the data of each patient in the denoised stroke dataset are assigned to the closest cluster centroid.
Step 3. The new cluster centroids of the stroke subtypes are calculated accordingly.
The results of the clustering procedure obtained by applying the K-means algorithm (see Algorithm 1) to the 2D mBd stroke dataset are presented in Figure 4(b).

Fuzzy 𝐶 Means Algorithm.
The FCM algorithm assigns data to each category through the use of fuzzy memberships [38][39][40][41]. Let $X = (x_1, x_2, \ldots, x_N)$ denote data with $N$ items to be split into $c$ clusters, in which $x_j$ denotes the feature data. The algorithm is an iterative optimization that minimizes the cost function
$$J = \sum_{j=1}^{N} \sum_{i=1}^{c} u_{ij}^{m} \, \|x_j - v_i\|^2,$$
where $u_{ij}$ denotes the membership of data $x_j$ in the $i$th cluster, $v_i$ is the $i$th cluster center, $\|\cdot\|$ is a norm metric, and $m$ is a constant greater than 1.
The cost function is minimized when high membership values are assigned to the data that are close to the centroid of their clusters and low membership values are assigned to the data that are distant from the centroid. The membership function represents the probability that data belong to a specific cluster.
In the FCM algorithm, this probability depends solely on the distance between the data of each stroke patient and each cluster center in the feature domain. The membership functions and cluster centers are updated by
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \|x_j - v_i\| / \|x_j - v_k\| \right)^{2/(m-1)}}, \qquad v_i = \frac{\sum_{j=1}^{N} u_{ij}^{m} x_j}{\sum_{j=1}^{N} u_{ij}^{m}}.$$
The clustering of the training set with the FCM clustering algorithm is stated in Algorithm 2.
Beginning with a preliminary guess for each cluster center, the FCM converges to a solution for $v_i$ that represents a local minimum of the cost function. Convergence can be detected by comparing the changes in the membership function or the cluster center at two consecutive iteration steps.
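The FCM iteration described above can be sketched in plain Python using the standard membership and center updates. This is a hedged illustration on toy data rather than the implementation used in the study; the function name, the initial centers, and the stopping threshold `min_improvement` are assumptions for the example.

```python
import math

def fcm(points, centers, m=2.0, max_iter=1000, min_improvement=1e-3):
    """Fuzzy C means with fuzzifier m > 1; returns (centers, memberships, cost)."""
    centers = [list(c) for c in centers]
    dims = len(points[0])
    prev_cost = None
    for _ in range(max_iter):
        # Membership update: closer centers receive higher membership.
        u = []
        for p in points:
            d = [math.sqrt(sum((a - b) ** 2 for a, b in zip(p, c))) for c in centers]
            if any(di == 0.0 for di in d):
                # The point coincides with a center: give it full membership there.
                zeros = [1.0 if di == 0.0 else 0.0 for di in d]
                row = [z / sum(zeros) for z in zeros]
            else:
                power = 2.0 / (m - 1.0)
                row = [1.0 / sum((di / dk) ** power for dk in d) for di in d]
            u.append(row)
        # Center update: mean of the points weighted by membership ** m.
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]
            total = sum(w)
            centers[i] = [sum(wj * p[k] for wj, p in zip(w, points)) / total
                          for k in range(dims)]
        # Cost evaluated with the current memberships and the updated centers.
        cost = sum(
            u[j][i] ** m * sum((a - b) ** 2 for a, b in zip(points[j], centers[i]))
            for j in range(len(points)) for i in range(len(centers))
        )
        if prev_cost is not None and abs(prev_cost - cost) < min_improvement:
            break
        prev_cost = cost
    return centers, u, cost

fcm_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
fcm_centers, fcm_u, fcm_cost = fcm(fcm_points, centers=[(1, 1), (9, 9)])
```

Each row of the membership matrix sums to 1, so a hard clustering can be read off by taking the cluster with the largest membership for each point.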
The clustering of the stroke dataset with the FCM clustering algorithm can be described as follows: the stroke dataset $X = (x_1, x_2, \ldots, x_{2204 \times 23})$ was applied to the FCM algorithm. The main steps of the FCM algorithm for the stroke dataset are as follows:
Steps 1-3. See Equation (14). Here, $u_{ij}$ represents the membership of data $x_j$ in the $i$th cluster, $v_i$ is the $i$th cluster center, $\|\cdot\|$ is a norm metric, and $m$ is chosen as 2.
Step 4. The cost function is minimized, over 1000 iterations, when high membership values are assigned to the data in the stroke dataset that are close to the cluster centroid. The results of the clustering procedure obtained by applying the FCM algorithm (see Algorithm 2) to the stroke dataset are presented in Figure 5(a).
Algorithm 1: K-means algorithm in the stroke dataset and 2D mBd stroke datasets.
The clustering of the 2D mBd stroke dataset with the FCM clustering algorithm is as follows: $\hat{X} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_{2204 \times 23})$, obtained from the 2D multifractal denoising techniques, was applied to the FCM algorithm. The main steps of the FCM algorithm are stated below:
Steps (1)-(3). See Equation (14), where $u_{ij}$ is the membership of data $\hat{x}_j$ in the $i$th cluster, $v_i$ is the $i$th cluster centroid, $\|\cdot\|$ is a norm metric, and $m$ is chosen as 2.
Step (4). The cost function is minimized, over 1000 iterations, when high membership values are assigned to the data in the stroke dataset that are close to the cluster centroids. The results of the clustering procedure obtained by applying the FCM algorithm (see Algorithm 2) to the 2D mBd stroke dataset are presented in Figure 5(b).

Results and Discussion
To give a detailed view of the relationship between the variables of the stroke dataset and of the 2D mBd stroke dataset, we first display both datasets as plots based on the mesh function. We then present the application of the stroke dataset and of the regular, denoised 2D stroke dataset to the K-means algorithm, followed by the calculations and results for the application of the same datasets to the FCM algorithm.

Mesh Plot Display for Stroke Dataset and 2D mBd Stroke Dataset.
The stroke dataset in our study is a matrix with a dimension of 2204 × 23. Figure 2(a) displays the main attribute headings of the stroke dataset (demographic information, medical history, results of laboratory tests, treatments, and medications) as well as the relationship between the stroke patients, as a plot based on the mesh function. Figure 2(a) presents the stroke dataset $(X, Y, Z)$ by drawing a wireframe mesh and a contour plot under it, with color determined by $Z$. Thus, the color is proportional to the surface height. $X$ and $Y$ are vectors with length ($X$: main headings of attributes) = 23 and length ($Y$: number of patients) = 2204, where [23, 2204] = size ($Z$: main headings of attributes, number of patients). Here, $(X(j), Y(i), Z(i, j))$ are the intersections of the wireframe grid lines; $X$ and $Y$ correspond to the columns and rows of $Z$.
Figure 2(a) displays the main headings of the attributes of the stroke dataset (demographic information, medical history, results of laboratory tests, treatments, and medications), and the 2D mBd technique was applied to the data of the stroke patients. Figure 2(b) shows the Bayesian regularity that matches each attribute in the stroke dataset (2204 × 23) as a plot based on the mesh function. Figure 2(b) presents the 2D mBd stroke dataset $(\hat{X}, \hat{Y}, \hat{Z})$ by drawing a wireframe mesh and a contour plot under it, with color determined by $\hat{Z}$. Thus, the color is proportional to the surface height. $\hat{X}$ and $\hat{Y}$ are vectors with length ($\hat{X}$: 2D mBd attributes) = 23 and length ($\hat{Y}$: 2D mBd number of patients) = 2204, where [23, 2204] = size ($\hat{Z}$: 2D mBd attributes, 2D mBd number of patients). Here, $(\hat{X}(j), \hat{Y}(i), \hat{Z}(i, j))$ are the intersections of the wireframe grid lines; $\hat{X}$ and $\hat{Y}$ correspond to the columns and rows of $\hat{Z}$. For the stroke dataset of our study, Daubechies wavelets with lengths ranging between 2 and 20 were applied. fBm with $H = 0.6$ and a denoised version with Gaussian noise were applied to the dataset. Figure 2(b) shows the result obtained by taking $F_g$ to be the theoretical spectrum obtained with increments, or by taking $F_g(\alpha) = 1$ for $\alpha \ge H$.
In this study, for the clustering of stroke subtypes, the K-means and FCM algorithms were first applied to the stroke dataset (Figure 3(a)). The clustering was obtained for the stroke subtypes (the details can be seen in Figure 3). In the second stage, multifractal denoising techniques (see Figure 4(b)) were applied to the same stroke dataset (Figure 4(a)). The K-means and FCM algorithms were then applied to the resulting regular, denoised stroke dataset (Figure 4(d)). As a result, the clustering was obtained for the stroke subtypes (the details can be seen in Figure 4).

Application of 𝐾-Means Clustering Algorithm.
Different numbers of iterations (200, 300, 400, 500, and 1000) were used for the clustering of the stroke dataset and the 2D mBd stroke datasets with the K-means algorithm. The most accurate results of the K-means algorithm were obtained for 1000 iterations. As shown in Figure 5, each epoch corresponds to 200 iterations, which gives a clearer display of the clustering.
The parameters pertaining to the 1000 iterations are presented in Table 4. Figure 5(a) shows the best total sum of distances for the centroid values. This value did not change afterwards. The clustering calculation was stopped at the 1000th iteration, since the centroid values showed no change compared to the previous iteration.
In Figure 5(b), the result of best total sum of distance (()) of the stroke 2D mBd dataset with -means algorithm was obtained as 19.2979.The value (()) did not change after this value.After the 900th iteration the iteration was stopped since no change in the iteration calculation happened in the clustering calculation compared to that of the previous iteration.Both of the datasets () have a 7-by-23 matrix that contains the final centroid locations.-means is used to calculate the distance from each centroid to points on a grid.To be able to do this, the centroids () and points on a grid to -means are passed, and 1000 iterations of the algorithm are implemented (Table 5).
The best total sum of distances calculated (see Figure 5) reveals that the 2D mBd stroke dataset yields better clustering accuracy than the stroke dataset.
These parameters are reported because they helped in performing the best cluster analysis in this study.
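The K-means procedure described above (assign each point to its nearest centroid, recompute the centroids, stop when nothing changes or the iteration cap is reached) can be illustrated with a minimal sketch. It is not the Matlab routine used in the study: the function names, the random initialization, the use of squared Euclidean distance for the total sum of distances, and the toy 2D data are all assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=1000, seed=0):
    """Lloyd-style K-means (illustrative sketch).

    Returns (centroids, labels, total_sum_of_distances, iterations_used).
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Stop when the assignment no longer changes between iterations.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    total = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, total, iteration


# Toy data standing in for the 2204 x 23 attribute matrix.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centroids, toy_labels, toy_total, toy_iterations = kmeans(toy_points, k=2)
print(toy_centroids, toy_labels, toy_total, toy_iterations)
```

For the stroke data, `points` would hold 2204 rows of 23 attributes and `k` would be 7, so the returned centroid list would have the 7-by-23 shape mentioned in the text.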
For the 1000 iterations of the K-means clustering algorithm, split into epochs of 200 iterations, the calculation result for the stroke dataset is presented in Figure 5(a) (see Figure 2).

For the 1000 iterations of the FCM clustering algorithm, split into epochs of 200 iterations, the calculation result for the stroke dataset is presented in Figure 6(a) (see Figure 2). The parameters for the 1000 iterations are displayed in Table 6. These parameters are reported because they helped in conducting the best cluster analysis in this study. For the stroke dataset (2204 × 23) and the 2D mBd stroke dataset (2204 × 23), the distance between each stroke patient's data and the cluster centers in the feature domain is computed, and the computed results are stored in a matrix. The optimization stopped at the 1000th iteration for the stroke dataset (Figure 6(a)) and the 2D mBd stroke dataset (Figure 6(b)), as the objective function improved by less than 1e−3 between the final two iterations. The clustering process ends when the maximum number of iterations is reached or when the improvement of the objective function between two consecutive iterations is less than the specified minimum. The matrix calculations for 1000 iterations (see Figure 6) reveal that the clustering accuracy of the 2D mBd stroke dataset is better than that of the stroke dataset (Table 7).
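The FCM stopping rule described above (stop at the maximum number of iterations, or when the objective function improves by less than a specified minimum between two consecutive iterations) can be sketched as follows. This is an illustrative sketch rather than the Matlab implementation used in the study: the function name, the fuzzifier value m = 2, the deterministic choice of initial centers, and the toy data are assumptions.

```python
def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=1000, min_improvement=1e-3):
    """Fuzzy C-means (illustrative sketch).

    Returns (centers, memberships, objective, iterations_used), where
    memberships[i][c] is the degree to which point i belongs to cluster c.
    """
    n, dim = len(points), len(points[0])
    # Assumption: evenly spaced data points serve as the initial centers.
    centers = [list(points[(c * n) // n_clusters]) for c in range(n_clusters)]
    exponent = 1.0 / (m - 1.0)
    previous = None
    for iteration in range(1, max_iter + 1):
        # Squared distances from every point to every center,
        # floored at a tiny value to avoid division by zero.
        d2 = [[max(sum((a - b) ** 2 for a, b in zip(p, ctr)), 1e-12)
               for ctr in centers] for p in points]
        # Membership update: closer centers receive larger memberships,
        # and each row is normalized to sum to 1.
        memberships = []
        for row in d2:
            inverse = [(1.0 / d) ** exponent for d in row]
            total = sum(inverse)
            memberships.append([v / total for v in inverse])
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for c in range(n_clusters):
            weights = [memberships[i][c] ** m for i in range(n)]
            weight_sum = sum(weights)
            centers.append([
                sum(w * p[k] for w, p in zip(weights, points)) / weight_sum
                for k in range(dim)
            ])
        # Objective: membership-weighted sum of squared distances.
        objective = sum(
            memberships[i][c] ** m
            * sum((a - b) ** 2 for a, b in zip(points[i], centers[c]))
            for i in range(n) for c in range(n_clusters)
        )
        # Stop when the objective changes by less than the specified minimum.
        if previous is not None and abs(previous - objective) < min_improvement:
            break
        previous = objective
    return centers, memberships, objective, iteration


# Toy data standing in for the 2204 x 23 attribute matrix.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
fcm_centers, fcm_memberships, fcm_objective, fcm_iterations = fuzzy_c_means(toy_points, 2)
print(fcm_centers, fcm_objective, fcm_iterations)
```

Unlike the hard labels of K-means, every row of `fcm_memberships` sums to 1 across the clusters; taking the cluster with the largest membership gives a hard assignment when one is needed.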
The calculations given in Table 7 were performed in the Matlab, Mosek, and FracLab [26] environments. The clustering accuracy rates obtained from the K-means and FCM algorithms applied to the stroke dataset and to the 2D mBd, 2D mNold, and 2D mPumpD stroke datasets are provided in Table 8.
Based on the clustering results for the stroke subtypes obtained in this study (Table 8), accurate clustering has been achieved with the K-means and FCM algorithms by applying 2D multifractal denoising techniques to the stroke dataset. With both K-means and FCM, the clustering accuracy of the stroke subtypes on the 2D mBd stroke dataset proved to be better than on the 2D mNold and 2D mPumpD stroke datasets. On the 2D mBd stroke dataset, the K-means algorithm is more accurate for the no stroke/TIA and cardioembolic subtypes, whereas the FCM algorithm is more accurate for the large vessel, small vessel, cryptogenic, dissection, and other stroke subtypes.
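The text does not state how the per-subtype accuracy rates in Table 8 were computed. One common convention, shown here purely as an assumption, maps each cluster to the subtype that occurs most often among its members and then reports, for each subtype, the fraction of its patients that land in a cluster mapped to that subtype. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def per_subtype_accuracy(true_labels, cluster_ids):
    """Per-class clustering accuracy under a majority-vote cluster mapping."""
    # Collect the true labels of the members of each cluster.
    members = {}
    for label, cluster in zip(true_labels, cluster_ids):
        members.setdefault(cluster, []).append(label)
    # Map each cluster to its most frequent true label.
    cluster_to_label = {
        cluster: Counter(labels).most_common(1)[0][0]
        for cluster, labels in members.items()
    }
    totals = Counter(true_labels)
    hits = Counter()
    for label, cluster in zip(true_labels, cluster_ids):
        if cluster_to_label[cluster] == label:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical toy example with two of the seven subtypes.
toy_true = ["cardioembolic"] * 3 + ["small vessel"] * 3
toy_clusters = [0, 0, 1, 1, 1, 1]
print(per_subtype_accuracy(toy_true, toy_clusters))
```

For fuzzy memberships, the cluster identifiers would first be obtained by taking the cluster with the largest membership for each patient.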
In the literature, a very limited number of papers exist on the mathematical modeling and clustering of stroke subtypes. In this study, seven subtypes (no stroke/TIA, large vessel, small vessel, cardioembolic, cryptogenic, dissection, and other) have been analyzed. Four datasets (stroke

Conclusions
The main contribution of this paper is a novel approach to the main attribute headings of stroke data based on 2D multifractal denoising techniques. The clustering performances of the 2D multifractal denoising techniques (mBd, mNold, mPumpD) for the stroke subtypes in a dataset of 2204 stroke patients have been compared. When our study is compared with other works [22,23], it is seen that, first, there is no attribute constraint for the clustering of the 7 stroke subtypes. Second, it is possible to select the efficient and significant attributes through the denoising techniques. Finally, the popular FCM and K-means algorithms are applied to the datasets comprised of efficient and significant attributes.

In supervised learning, output datasets are provided and used to train the machine to produce the desired outputs, whereas in unsupervised learning no such datasets are provided; instead, the data are clustered into different classes. K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple way to classify a given dataset through a certain number of clusters (assuming $k$ clusters) fixed a priori. K-means is a simple algorithm that has been adapted to many problem domains, and it is a good candidate for extension to work with fuzzy feature vectors [42]. The fuzzy C-means algorithm [43,44] yields the best result for overlapped datasets and is comparatively better than the K-means algorithm. Unlike K-means, where each data point must belong exclusively to one cluster center, in FCM each data point is assigned a membership to each cluster center, so a data point may belong to more than one cluster center.
For the first time in the literature, 2D multifractal denoising techniques and the K-means and FCM algorithms have been applied to the numeric data obtained from the attributes of patients with seven different stroke subtypes.
The results reveal that the 2D Bayesian denoising technique (2D mBd) used in our study yields better clustering accuracy than the other denoising techniques considered (2D mNold and 2D mPumpD).

Figure 2: Display of stroke dataset and 2D mBd stroke dataset with mesh plot.

Figure 4: Clustering of the stroke dataset with the application of multifractal denoising techniques through K-means and FCM algorithms.

Figure 5: The clustering analyses of K-means algorithm based on the epochs.

3.3. The Application of FCM Clustering Algorithm. Different iteration counts (200, 300, 400, 500, and 1000) were used for clustering the stroke dataset and the 2D mBd stroke dataset with the fuzzy C-means algorithm. The most accurate fuzzy C-means results were obtained with 1000 iterations. As shown in Figure 6, each epoch corresponds to 200 iterations, for a clearer display of the clustering.

Figure 6: The clustering analyses of FCM algorithm based on the epochs.

Table 1: Breakdown of stroke patients by age.
is stated as x, = arg max  [( , /)()].The term ( , /) can be calculated from the law of  easily if  is assumed to be (0), since  , share the same law as .It can also be recalled that orthonormal wavelets are used by us.The preceding ( , ) is inferred from our assumption that  belongs to S(, ) as follows: for  > 0, set   () = log 2 ()/ − ,

Table 5: Result of K-means algorithm clustering for 1000 iterations (as obtained from Figure 5).

Table 7: Result of FCM algorithm clustering for 1000 iterations (as obtained from Figure 6).

Table 8: Clustering results.
2D mBd stroke dataset, 2D mNold stroke dataset, 2D mPumpD stroke dataset) were all processed with our new approach using the K-means and FCM algorithms.