Multifractal denoising techniques have attracted interest in biomedicine, economics, and signal and image processing. Stroke data contain subtle details that physicians cannot easily detect by eye. For the diagnosis of stroke subtypes, these details are important because they include hidden information about medical history, laboratory results, and treatment. Recently,
Multifractal analysis is concerned with the study of the regularity structure of processes, from both a local and a global point of view. Multifractal Bayesian denoising is a regularity-based enhancement technique that acts by finding data that is close to the observations and matches the prescribed multifractal data. This method depends on the tuning of a small set of parameters, which can provide different improvements to the observed noisy data. It has been used successfully in many applications in which irregularity carries important information. Many natural phenomena in fields such as physics, finance, construction, environment, medicine, and biology have been shown to display fractal behavior [
Being the third most frequent cause of death following heart disease and cancer in developed countries, stroke is among the most common causes of cognitive impairment and vascular dementia [
In this study, we worked on a dataset of individuals who have been diagnosed with one of the following ischemic stroke subtypes: no stroke/TIA, large vessel, small vessel, cardioembolic, cryptogenic, dissection, other (moyamoya, FMD, hereditary, coagulopathy, vasculitis, other rare). TIA stands for transient ischemic attack, a mini or minor stroke that happens when a blood clot blocks an artery for a brief period of time [
In recent years, the fractal and multifractal analysis in biomedical data has seen a growing interest. Regarding the relevant topics, in the following studies, Wang et al. [
Our approach aims to be broader and more complete, since this study is large and comprehensive compared with other studies conducted on stroke datasets [
The paper is organized as follows: Section
A total of 2204 individuals from the University of Massachusetts Medical School, Worcester, Massachusetts, USA, were kept under observation in this study. Data were collected between March 9, 2007, and October 2, 2016. The individuals had an ischemic stroke diagnosis (see Figure
Ischemic stroke image.
A total of 2204 patients (414 males [labelled with (1)] and 1790 females [labelled with (0)]) were included in our experiments. Stroke patients are aged between 0 and 104, with seven subtypes of ischemic stroke being examined in this study.
In this study, demographic information, medical history, results of laboratory tests, treatments, and medications data, as can be seen in Table
Breakdown of stroke patients by age.
Stroke subtypes | Age 0–29 | Age 30–36 | Age 37–49 | Age 50–70 | Age 71–90 | Age 91–104 |
---|---|---|---|---|---|---|
No stroke/TIA | 7 | 2 | 13 | 62 | 72 | 11 |
Large vessel | 18 | 2 | 17 | 216 | 208 | 20 |
Small vessel | 14 | 1 | 7 | 113 | 85 | 8 |
Cardioembolic | 18 | 5 | 21 | 195 | 391 | 59 |
Cryptogenic | 22 | 12 | 53 | 228 | 196 | 17 |
Dissection | 2 | 2 | 14 | 30 | 8 | 3 |
Other | 5 | 5 | 7 | 17 | 17 | 1 |
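As a quick consistency check, the rows of the age breakdown can be tallied per subtype and overall. The sketch below only re-adds the counts printed in the table; the variable and function names are illustrative and not part of the original study.

```python
# Counts copied from the age breakdown table (one list per subtype,
# one entry per age bin, in the order given by AGE_BINS).
AGE_BINS = ["0-29", "30-36", "37-49", "50-70", "71-90", "91-104"]
COUNTS = {
    "No stroke/TIA": [7, 2, 13, 62, 72, 11],
    "Large vessel": [18, 2, 17, 216, 208, 20],
    "Small vessel": [14, 1, 7, 113, 85, 8],
    "Cardioembolic": [18, 5, 21, 195, 391, 59],
    "Cryptogenic": [22, 12, 53, 228, 196, 17],
    "Dissection": [2, 2, 14, 30, 8, 3],
    "Other": [5, 5, 7, 17, 17, 1],
}


def subtype_totals(counts):
    """Number of patients per stroke subtype (row sums)."""
    return {name: sum(row) for name, row in counts.items()}


def age_bin_totals(counts):
    """Number of patients per age bin (column sums)."""
    return [sum(column) for column in zip(*counts.values())]


totals = subtype_totals(COUNTS)
grand_total = sum(totals.values())

for name, total in totals.items():
    print(f"{name}: {total}")
print("All subtypes:", grand_total)  # 2204
```

The grand total of 2204 agrees with the number of patients reported for the study, and the no stroke/TIA row sums to 167.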
Stroke dataset.
Number of stroke subtypes/TOAST | Main heading of attributes | Data size |
---|---|---|
No stroke/TIA (167) | Demographic information | |
TOAST: type/etiology of stroke; TIA: transient ischemic attack; HTN: hypertension; DM: diabetes mellitus; CAD: coronary artery disease; AtrialFib: atrial fibrillation; CHF: congestive heart failure; PAD/carotid disease: peripheral artery disease; NIHSS 90 days: National Institutes of Health Stroke Scale 90-day mortality; CT perfusion: computed tomography perfusion; ETOH: alcohol; antiHTN: antihypertensive drugs after acute ischemic stroke; NIHSS discharge: National Institutes of Health Stroke Scale at discharge; H/O stroke/TIA: history of stroke or transient ischemic attack.
Baseline characteristics of the patients involved as stratified by infarct side are outlined in Table
Stroke dataset description.
Attributes | Status | Number of patients/values (%) | Descriptions |
---|---|---|---|
HTN | Yes | 1593 (72%) | Hypertension |
Hyperlip | Yes | 1197 (54%) | High levels of lipid (fat) in blood |
| |||
DM | Yes | 602 (27%) | Diabetes |
H/O stroke/TIA | Yes | 546 (25%) | History of stroke/TIA |
AtrialFib | Yes | 541 (25%) | Abnormal heart rhythm |
CAD | Yes | 513 (23%) | Coronary artery disease |
CHF | Yes | 229 (10%) | Congestive heart failure |
PAD/carotid Disease | Yes | 318 (14%) | Peripheral artery disease |
| |||
Tobacco | Yes | 520 (23%) | Cigarette addict |
ETOH | Yes | 308 (14%) | Alcohol addict |
| |||
Statin | Yes | 1000 (45%) | Medications given to the patient are grouped into five broad categories |
AntiHTN | Yes | 1332 (60%) | |
Antidiabetic | Yes | 454 (20%) | |
Antiplatelet | Yes | 1031 (47%) | |
Anticoagulation | Yes | 242 (10%) | |
| |||
CT perfusion | Yes | 137 (6%) | Procedures used for treatment |
Neurointervention | Yes | 271 (12%) | |
mRS 90 days | Low | 2007 | Dichotomized into low (0–2) and high (3–6) |
mRS 90 days | High | 197 | |
| |||
Hemorrhagic con. | Yes | 204 (9%) | Whether the ischemic stroke turned to hemorrhagic |
| |||
NIHSS admission | | 9.3 ± 8.3 | Measures the severity of stroke |
| |||
TPA | Yes | 413 (19%) | TPA (tissue plasminogen activator) is used to break down blood clots |
Medications given to the patients are classified into five broad categories: statin, antiHTN, antidiabetic, antiplatelet, and anticoagulation. The CT perfusion and neurointervention attributes describe procedures used for treatment. The modified Rankin Scale (mRS) score was evaluated at 90 days, in person or by phone interview, by a physician trained in stroke or by a stroke nurse certified in mRS. We comply with the Strengthening the Reporting of Observational Studies in Epidemiology guideline (
In this study, we have provided two potential contributions. We introduced the 2D mBd, 2D mNold, and 2D mPumpD, which are relatively novel multifractal techniques calculating the regular data from stroke data. We proposed the use of stroke dataset and regular and denoised stroke datasets (2D mBd, 2D mNold, 2D mPumpD) to be trained with unsupervised
2D multifractal denoising techniques (2D mBd, 2D mNold, 2D mPumpD) were applied to the stroke dataset (which can be seen in Table
The stroke datasets obtained from stroke dataset and 2D multifractal denoising techniques (mBd, mNold, and mPumpD) were clustered by having been applied to the
The comparisons of datasets (stroke dataset, mBd stroke dataset, mNold stroke dataset, and mPumpD stroke dataset) were performed with the
We are concerned with the enhancement, or denoising, of complex data, which is the stroke dataset, relying on the analysis of the local Hölder regularity. We rather suppose that data enhancement is comparable to increasing the Hölder regularity at every point [
In this paper, our focus is on the pointwise Hölder exponent for simplifying the notations, and we assume that our data are not differentiable [
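For reference, the standard textbook definition of the pointwise Hölder exponent for a non-differentiable function can be written as below. The symbols $f$, $x_0$, $C$, $\delta$, and $\alpha_f(x_0)$ are chosen here for illustration and need not match the notation used elsewhere in this paper.

```latex
\[
  f \in C^{\alpha}(x_0),\quad 0 < \alpha < 1
  \iff
  \exists\, C > 0,\ \delta > 0 :\quad
  |f(x) - f(x_0)| \le C\,|x - x_0|^{\alpha}
  \quad \text{for all } |x - x_0| < \delta ,
\]
\[
  \alpha_f(x_0) = \sup\bigl\{ \alpha : f \in C^{\alpha}(x_0) \bigr\}.
\]
```

A larger exponent means a smoother signal at $x_0$, which is why enhancement can be viewed as increasing the exponent at every point.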
Let
In this paper, we shall concentrate on the statistical approach that brings about consideration of a quantity named the
Consider a stochastic process
Set
Then coarse-grained multifractal large deviation spectrum is provided by the equation
The intuitive meaning of
Here, we present a technique based on the multifractal data rather than on the Hölder exponent alone. This approach generally yields more robust estimates because a higher-level description is used, subsuming information on the entire data. We also adopt a semiparametric approach. More specifically, we assume that the considered data belong to a given set of parameterized classes, which are described next [
Let us state that
For
Consequently, Definition
At this point, the key steps in the classical Maximum A Posteriori (MAP) approach in a Bayesian frame can be recalled, as adjusted to our setting. It is observed that the noisy data is
The MAP estimate [
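The generic MAP formulation recalled here can be summarized as follows, assuming additive noise. The symbols $X$ (clean data), $B$ (noise), and $Y$ (observed noisy data) are illustrative choices, and the specific prior used in the multifractal setting is not reproduced.

```latex
\[
  Y = X + B, \qquad
  \hat{X}_{\mathrm{MAP}}
  = \arg\max_{X} P(X \mid Y)
  = \arg\max_{X} P(Y \mid X)\, P(X).
\]
```

The second equality follows from Bayes' rule, since $P(Y)$ does not depend on $X$.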
We present some results regarding the stroke dataset. In each case, the result of the Bayesian multifractal denoising and the classical hard-thresholding technique is shown. For all procedures and stroke dataset, the parameters (see Tables
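For comparison, classical hard thresholding can be sketched in a few lines. This is a minimal one-level, one-dimensional illustration that assumes a Haar wavelet and an even-length signal; the wavelet, the number of levels, and the threshold actually used in the study are not stated here, so all of these are assumptions, as are the function names.

```python
import numpy as np


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length 1D signal."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def hard_threshold_denoise(x, threshold):
    """Zero every detail coefficient whose magnitude is at most `threshold`."""
    approx, detail = haar_forward(x)
    detail = np.where(np.abs(detail) > threshold, detail, 0.0)
    return haar_inverse(approx, detail)


noisy = np.array([1.0, 1.2, 5.0, 5.0])
print(hard_threshold_denoise(noisy, threshold=0.5))
```

Hard thresholding removes small coefficients regardless of the local regularity of the signal, whereas the Bayesian multifractal approach described above constrains the regularity structure of the result.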
The steps of multifractal Bayesian denoising technique applied on the stroke dataset to be able to get the regular and denoised stroke dataset are provided below.
In line with the law
The result of the denoised stroke dataset obtained by having applied the steps between Steps 1 and 4 on the stroke dataset is presented in Figure
Display of stroke dataset and 2D mBd stroke dataset with mesh plot.
Stroke dataset
2D mBd stroke dataset
In this study, the 2D mBd technique is applied to the main headings of attributes (as can be seen in Table
The aim of
The clustering of the training set with the
The key steps of
The clustering of the stroke dataset with the
The clustering of the
The stroke datasets (mBd, mNold, mPumpD) obtained from the 2D multifractal denoising techniques
The FCM algorithm assigns data to each category through the use of fuzzy memberships [
The cost function is minimized when high membership values are assigned to data close to the centroid of their cluster and low membership values are assigned to data far from the centroid. The membership function gives the probability that a data point belongs to a specific cluster. In the FCM algorithm, this probability depends solely on the distance between each stroke patient's data and each cluster center in the feature domain. The cluster centers and membership functions can be updated by using
The clustering of the training set with the FCM clustering algorithm is stated as in Algorithm
Beginning with a preliminary guess for each cluster center, the FCM converges to a solution for
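The alternating update of cluster centers and fuzzy memberships can be sketched as follows. This is a minimal NumPy illustration of the standard FCM iteration with squared Euclidean distance, not the implementation used in the study. The default `m=2.0` and `max_iter=200` mirror the parameter table for the FCM algorithm, while `min_improvement=1e-5`, the random initialization, and all function and variable names are assumptions made for the example.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=200,
                  min_improvement=1e-5, seed=0):
    """Minimal FCM sketch; X has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    # Random initial membership matrix whose rows sum to 1 (assumption).
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    history = []
    for _ in range(max_iter):
        W = U ** m
        # Centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # Objective for the current memberships and the new centers.
        history.append(float((W * d2).sum()))
        # Membership update: u_ij proportional to d_ij^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Stop when the objective improves by less than the tolerance.
        if len(history) > 1 and abs(history[-2] - history[-1]) < min_improvement:
            break
    return centers, U, history


# Usage on synthetic data (two well-separated blobs, not the stroke dataset).
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([demo_rng.normal(0.0, 0.1, (20, 2)),
                    demo_rng.normal(5.0, 0.1, (20, 2))])
centers, U, history = fuzzy_c_means(X_demo, n_clusters=2)
hard_labels = U.argmax(axis=1)  # assign each sample to its strongest cluster
```

Taking the arg-max of each membership row turns the fuzzy partition into a hard clustering, which is one common way to compare FCM output with known subtype labels.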
The clustering of the stroke dataset with the FCM clustering algorithm can be depicted as follows:
The clustering of the 2D mBd stroke dataset with the FCM clustering algorithm is as follows:
In order to have a detailed vision of the relationship between the variables concerning the stroke dataset and 2D mBd stroke dataset in
The stroke dataset in our study is a matrix with a dimension of
Figure
Figure
In this study, for the clustering procedure of stroke subtypes,
The application of the
Clustering of the stroke dataset with the application of multifractal denoising techniques through
The clustering analyses of
Stroke dataset
2D mBd stroke dataset
Different iterations (200, 300, 400, 500, and 1000) have been used for the clustering procedure of stroke dataset, 2D mBd stroke datasets through the
The parameters pertaining to the 1000 iterations are presented in Table
Parameters | Parameters value |
---|---|
| 7 |
Maximum number of iterations | 1000 |
In Figure
In Figure
Result of
Steps concerning epochs | | |
---|---|---|
Epoch 1 with 200 iterations | 317.372 | 192.979 |
Epoch 2 with 200 iterations | 319.855 | 19.2988 |
Epoch 3 with 200 iterations | 320.911 | 192.984 |
Epoch 4 with 200 iterations | 318.939 | 19.2983 |
Epoch 5 with 200 iterations | 317.292 | 19.2979 |
The result of the best total sum of distance calculated (see Figure
These parameters are reported because they yielded the best cluster analysis in this study.
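The idea of running several epochs and keeping the one with the best total sum of distances can be sketched as below. The sketch assumes that the clustering step is a standard k-means (Lloyd) procedure, that each epoch is an independent restart capped at 200 iterations, and that the total is the sum of squared Euclidean distances to the assigned centers. None of these details are confirmed by the text, and all names are illustrative.

```python
import numpy as np


def kmeans_once(X, k, max_iter, rng):
    """One k-means run; returns labels, centers, and total sum of distances."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute assignments for the final centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    total = float(d2[np.arange(len(X)), labels].sum())
    return labels, centers, total


def best_of_epochs(X, k=7, n_epochs=5, max_iter=200, seed=0):
    """Run n_epochs restarts and keep the run with the smallest total."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, k, max_iter, rng) for _ in range(n_epochs)]
    totals = [run[2] for run in runs]
    best_run = min(runs, key=lambda run: run[2])
    return best_run, totals


# Usage on synthetic data (three blobs, not the stroke dataset).
demo_rng = np.random.default_rng(2)
X_demo = np.vstack([demo_rng.normal(c, 0.1, (15, 2)) for c in (0.0, 4.0, 8.0)])
(best_labels, best_centers, best_total), totals = best_of_epochs(X_demo, k=3)
print(totals, best_total)
```

With the settings from the parameter table one would call `best_of_epochs(X, k=7, n_epochs=5, max_iter=200)` on the data matrix.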
For the 1000 iterations in
Different iterations (200, 300, 400, 500, and 1000) have been used for the clustering procedure of stroke dataset, 2D mBd stroke datasets through the fuzzy
The clustering analyses of FCM algorithm based on the epochs.
Stroke dataset
2D mBd stroke dataset
For the 1000 iterations in FCM clustering algorithm, by splitting to 200 iterations with corresponding epoch, the calculation result pertaining to the stroke dataset is presented in Figure
The parameters for the 1000 iterations are displayed in Table
FCM clustering algorithm parameters.
Parameters | Parameters value |
---|---|
Exponent for the partition matrix | 2.0 |
Maximum number of iterations | 200 |
Minimum amount of improvement | |
These parameters are reported because they yielded the best cluster analysis in this study.
The data in stroke dataset (
Result of FCM algorithm clustering for 1000 iterations (as obtained from Figure
Steps concerning epochs | Objective function vector for stroke dataset | Objective function vector for 2D mBd stroke dataset |
---|---|---|
Epoch 1 with 200 iterations | | |
Epoch 2 with 200 iterations | | |
Epoch 3 with 200 iterations | | |
Epoch 4 with 200 iterations | | |
Epoch 5 with 200 iterations | | |
Calculations given in Table
Clustering results.
Stroke Subtypes | | | | | FCM stroke data (%) | FCM 2D mBd stroke data (%) | FCM 2D mNold stroke data (%) | FCM 2D mPumpD stroke data (%) |
---|---|---|---|---|---|---|---|---|
No stroke/TIA | 34.7 | 83 | 52.1 | 46.4 | 30 | 64 | 39 | 49 |
Large vessel | 58.2 | 70.7 | 60.2 | 68.6 | 8.3 | 83.1 | 62 | 83 |
Small vessel | 63.3 | 76 | 71.3 | 71.2 | 17 | 96.5 | 87 | 76 |
Cardioembolic | 17.4 | 69.6 | 34.5 | 43.5 | 38 | 63 | 43 | 60 |
Cryptogenic | 2.4 | 62.5 | 49.2 | 56.8 | 15 | 98 | 76.6 | 41 |
Dissection | 19.6 | 47.1 | 17.3 | 17.3 | 10 | 67.7 | 26.8 | 14 |
Others | 14.8 | 20.1 | 16.4 | 12.4 | 12 | 62.4 | 26 | 15.6 |
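To compare the four FCM columns of the clustering results at a glance, the percentages can be averaged and the best dataset identified per subtype. The sketch below only uses the labelled FCM columns of the table. The unweighted mean across subtypes is an assumption made for illustration (it ignores the different subtype sizes), and the names are hypothetical.

```python
SUBTYPES = ["No stroke/TIA", "Large vessel", "Small vessel", "Cardioembolic",
            "Cryptogenic", "Dissection", "Others"]

# Percentages copied from the FCM columns of the clustering results table.
FCM_ACCURACY = {
    "FCM stroke data": [30, 8.3, 17, 38, 15, 10, 12],
    "FCM 2D mBd stroke data": [64, 83.1, 96.5, 63, 98, 67.7, 62.4],
    "FCM 2D mNold stroke data": [39, 62, 87, 43, 76.6, 26.8, 26],
    "FCM 2D mPumpD stroke data": [49, 83, 76, 60, 41, 14, 15.6],
}


def mean_accuracy(table):
    """Unweighted mean percentage per dataset across the seven subtypes."""
    return {name: sum(values) / len(values) for name, values in table.items()}


def best_per_subtype(table, subtypes):
    """Dataset with the highest percentage for each subtype."""
    return {
        subtype: max(table, key=lambda name: table[name][i])
        for i, subtype in enumerate(subtypes)
    }


means = mean_accuracy(FCM_ACCURACY)
winners = best_per_subtype(FCM_ACCURACY, SUBTYPES)
best_overall = max(means, key=means.get)

for name, value in means.items():
    print(f"{name}: {value:.1f}%")
print("Best overall:", best_overall)
```

On these figures the 2D mBd stroke dataset has the highest FCM percentage for every subtype, with an unweighted mean of about 76.4% against about 18.6% for the original stroke dataset.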
Accurate clustering results have been obtained with
In the literature, a very limited number of papers exist on the mathematical modeling and clustering of stroke subtypes. In this study, seven subtypes (no stroke/TIA, large vessel, small vessel, cardioembolic, cryptogenic, dissection, and other) have been analyzed. In total, four datasets (stroke dataset, 2D mBd stroke dataset, 2D mNold stroke dataset, 2D mPumpD stroke dataset) were processed in our new approach with
The main contribution of this paper is a novel approach to the main headings of stroke attributes using 2D multifractal denoising techniques. The clustering performances of the 2D multifractal denoising techniques (mBd, mNold, mPumpD) for the stroke subtypes, on a dataset of 2204 stroke patients, have been provided in a comparative manner. When our study is compared with the other works [
The authors declare that they have no conflicts of interest.
Yeliz Karaca is grateful to Tuscia University, Engineering School (DEIM), for hosting her during this research.