Breast Cancer Screening Using a Modified Inertial Projective Algorithm for Split Feasibility Problems

To detect breast cancer in mammography screening practice, we modify the inertial relaxed CQ algorithm with Mann's iteration for solving split feasibility problems in real Hilbert spaces, applying it as an optimizer in an extreme learning machine. Weak convergence of the proposed algorithm is proved under certain mild conditions. Moreover, we demonstrate the advantage of our algorithm by comparing it with existing machine learning methods. The highest performance values of 85.03% accuracy, 82.56% precision, 87.65% recall, and 85.03% F1-score show that our algorithm performs better than the other machine learning models.


Introduction and Preliminaries
Breast cancer is common in women, with approximately 2 million women diagnosed worldwide each year [1]. Its mortality rate has increased over recent decades due to changes in risk factors, an aging society, and better cancer registration and detection [2]. Many trials conclude that mammography screening beginning in a woman's 40s reduces mortality, saves many years of life, and improves treatment, including evaluation of the extent of the disease [3]. However, this modality carries some risks: overdiagnosis, false positives, anxiety, and radiation injury [4]. Recently, imaging options combined with artificial intelligence are expected to enhance new screening protocols directed toward more personalized and precision medicine [1].
Many optimization algorithms have been used to solve medical classification problems in machine learning; see [5, 6]. In this paper, we focus on the split feasibility problem (SFP) applied to mammography classification. Let C and Q be two nonempty closed convex subsets of real Hilbert spaces such that C ⊆ H_1 and Q ⊆ H_2, and let A : H_1 → H_2 be a bounded linear operator. The SFP is to find a point ω* ∈ C such that Aω* ∈ Q, (1) if such an ω* exists. The solution set of the problem SFP (1) is denoted by Ω ≔ {ω* ∈ C : Aω* ∈ Q}.
The first algorithm for solving the problem SFP (1) was presented by Censor and Elfving [7]. Since then, many mathematicians (see [8-10]) have applied the problem SFP (1) to real-world problems such as machine learning, signal processing, and image restoration. Xu [11] proved that a point ω* solves the problem SFP (1) if and only if ω* is a fixed point of the mapping Proj_C(I − λA*(I − Proj_Q)A). Later, Byrne [12] was the first to propose the popular CQ algorithm for solving the problem SFP (1), generated by

ω_{n+1} = Proj_C(ω_n − λA*(I − Proj_Q)Aω_n),   (3)

where the parameter λ belongs to the interval (0, 2/‖A‖²), which makes the mapping I − λA*(I − Proj_Q)A nonexpansive. Here I is the identity, A* denotes the adjoint operator of A, and Proj_C and Proj_Q are the orthogonal projections onto C and Q, respectively. The overall computational cost is low when the metric projections onto C and Q are simple to calculate. However, computing the metric projection exactly is difficult, or requires too much effort, in circumstances where C and Q are complicated sets.
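The CQ iteration (3) is easy to run when the projections are cheap. The following is a minimal sketch on a toy problem where C = [-1, 1]² and Q = [0, 2]² are boxes and A is diagonal; all concrete numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Toy instance of the SFP: find w in C with A w in Q.
A = np.diag([2.0, 1.0])
proj_C = lambda w: np.clip(w, -1.0, 1.0)   # Proj onto C = [-1, 1]^2
proj_Q = lambda v: np.clip(v, 0.0, 2.0)    # Proj onto Q = [0, 2]^2

lam = 1.0 / np.linalg.norm(A, 2) ** 2      # lambda in (0, 2/||A||^2)
w = np.array([5.0, -5.0])                  # arbitrary starting point
for _ in range(200):
    grad = A.T @ (A @ w - proj_Q(A @ w))   # A*(I - Proj_Q)A applied to w
    w = proj_C(w - lam * grad)             # CQ step: project onto C

print(np.allclose(A @ w, proj_Q(A @ w), atol=1e-4))  # w solves the toy SFP
```

Since the last operation of each step projects onto C, the limit is automatically in C, and the printed check confirms Aw ∈ Q.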
Later on, Yang [13] introduced a relaxed CQ algorithm by replacing the sets C and Q in the CQ algorithm (3) with the level sets

C_n = {ω ∈ H_1 : c(ω_n) + ⟨ξ_n, ω − ω_n⟩ ≤ 0},  Q_n = {u ∈ H_2 : q(Aω_n) + ⟨η_n, u − Aω_n⟩ ≤ 0},

where c : H_1 → ℝ ∪ {+∞} and q : H_2 → ℝ ∪ {+∞} are proper convex functions, ξ_n ∈ ∂c(ω_n), and η_n ∈ ∂q(Aω_n). Since the projections Proj_{C_n} and Proj_{Q_n} onto these half-spaces are easy to compute, many mathematicians have used them to modify numerous algorithms for solving the problem SFP (1); see [14-16].
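The appeal of the half-spaces C_n above is that projection onto them has a closed form: for x with g = c(ω_n) + ⟨ξ_n, x − ω_n⟩ > 0, the projection is x − (g/‖ξ_n‖²)ξ_n. A minimal sketch with illustrative numbers (not from the paper):

```python
import numpy as np

# Closed-form projection onto the half-space
#   C_n = { w : c(w_n) + <xi_n, w - w_n> <= 0 }.
def proj_halfspace(x, w_n, c_wn, xi_n):
    g = c_wn + xi_n @ (x - w_n)                 # affine minorant of c at x
    if g <= 0:                                  # x is already in C_n
        return x
    return x - (g / (xi_n @ xi_n)) * xi_n       # shift along the normal xi_n

# Example: c(w) = ||w||_1 - 1, w_n = (0.5, 0.5), subgradient xi_n = (1, 1).
w_n = np.array([0.5, 0.5])
xi = np.array([1.0, 1.0])
c_wn = np.abs(w_n).sum() - 1.0                  # c(w_n) = 0
p = proj_halfspace(np.array([2.0, 2.0]), w_n, c_wn, xi)
print(p)  # -> [0.5 0.5]
```

Points already satisfying the inequality are returned unchanged, so the map is indeed a projection.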
One technique for speeding up the convergence of such algorithms is the inertial technique, first introduced by Polyak [17] in 1964. Polyak's algorithm, known as the heavy ball method, was improved by Nesterov [18]. It has since been widely used to solve a wide variety of problems in optimization; see [9, 19-22].
In 2017, Dang et al. [9] combined the inertial technique introduced by Alvarez and Attouch [23] with Byrne's CQ algorithm (3) for the problem SFP (1) in a real Hilbert space. In this algorithm (6), the parameter λ lies in an interval involving the norm of the operator A, C_n and Q_n are the level sets introduced by Yang [13], and the extrapolation factor satisfies σ_n ∈ [0, σ̄_n] with σ̄ ∈ [0, 1) and condition (8). The weak convergence of the sequence {ω_n} generated by (6) was proved under condition (8) on the extrapolation factor and on the stepsize parameter λ.
Very recently, Wang and Yu [24] generalized the inertial relaxed CQ algorithm of Yang [13] by modifying C_n and Q_n as in (9). The sets C_n and Q_n of Wang and Yu [24] reduce to those of Yang [13] when α and β are set to 0. The inertial generalized relaxed CQ algorithm (IGRCQ) of Wang and Yu [24] starts from ω_0, ω_1 ∈ H_1 and iterates (10), where {σ_n} ⊂ [0, σ) ⊂ [0, 1), {ε_n} ⊂ (0, 4), and, for each n ∈ ℕ, the functions f_n are defined as in (11). It was shown that, under the stated conditions, the sequence generated by (10) weakly converges to a solution of the problem SFP (1).
In this paper, we combine the inertial technique with the relaxed CQ algorithm and Mann's algorithm to solve split feasibility problems in Hilbert spaces. We establish weak convergence theorems under suitable conditions. We apply our main result to a data classification problem using an extreme learning machine on the mammographic mass dataset from UCI, and then compare the performance of our algorithm with other existing machine learning algorithms.

Main Results
In this section, we introduce a new modified inertial projective algorithm by combining Mann's algorithm with the inertial technique and the relaxed CQ algorithm, where A : H_1 → H_2 is a bounded linear operator and c : H_1 → ℝ and q : H_2 → ℝ are lower semicontinuous convex functions. We assume that ∂c and ∂q are bounded operators. For a sequence {y_n} in H_1, we modify the half-spaces C_n and Q_n following the idea of Wang and Yu [24], with ϱ_n ∈ ∂c(y_n) and α ≥ 0 as in (13).
We now introduce a modified inertial projective algorithm (Algorithm 1) and assume that the following conditions hold.

Theorem 1. Let H_1 and H_2 be two real Hilbert spaces, and let C and Q be nonempty closed convex subsets such that C ⊆ H_1 and Q ⊆ H_2. Let A : H_1 → H_2 be a bounded linear operator. Assume that the solution set Ω of the problem SFP (1) is nonempty and that conditions (C1)-(C3) hold. Then the sequence {ω_n} generated by Algorithm 1 converges weakly to a point ω* ∈ Ω.

Proof. Set t_n = y_n − λ_n∇f_n(y_n). From (18) we obtain (19); substituting (20) into (19) yields (21), which implies (22). By Lemma 1 in [25] and (C1), {‖ω_n − ω*‖} is a convergent sequence for any ω* ∈ Ω; therefore {ω_n} is bounded. From the definition of {ρ_n}, {ρ_n} is also bounded. It then follows from (20) and (C1)-(C3) that (24) holds. Again, since I − λ_n∇f_n and Proj_{C_n} are nonexpansive, we obtain (25), which implies (26). Since lim_{n→∞} ‖ω_n − ω*‖ exists, (C1) and (C3) yield (27). It follows from (C2) and (24) that (28) holds, and it is clear from (C1) that (29) holds. From (28) and (29), we obtain (30).

Finally, let ω* be a weak sequential cluster point of {ω_n}. There exists a subsequence {ω_{n_k}} of {ω_n} which converges weakly to ω* ∈ H_1. From (30), {y_{n_k}} also converges weakly to ω*, and hence Ay_{n_k} ⇀ Aω* as k → ∞. By the definition of z_n, we have (31), where ξ_{n_k} ∈ ∂q(Ay_{n_k}). Since ∂q is bounded, (24) and (31) give q(Aω*) ≤ 0, which shows that Aω* ∈ Q. Again, by the definition of {z_n}, we have (32), where ϱ_{n_k} ∈ ∂c(y_{n_k}). Since ∂c is bounded, (27) and (32) give c(ω*) ≤ 0, which shows that ω* ∈ C. By Opial's lemma [26], we conclude that {ω_n} converges weakly to a solution in Ω. This completes the proof.
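The three steps of Algorithm 1 (inertial extrapolation, a relaxed CQ projection step, Mann averaging) can be sketched numerically. In this toy sketch the exact projections onto C and Q stand in for the half-space projections Proj_{C_n}, Proj_{Q_n}, and all constants (the sets, A, σ_n, α_n) are illustrative assumptions rather than the paper's choices.

```python
import numpy as np

# Toy SFP instance: boxes C = [-1, 1]^2 and Q = [0, 2]^2, diagonal A.
A = np.diag([2.0, 1.0])
proj_C = lambda w: np.clip(w, -1.0, 1.0)
proj_Q = lambda v: np.clip(v, 0.0, 2.0)

lam = 1.0 / np.linalg.norm(A, 2) ** 2     # lambda_n in (0, 2/||A||^2)
alpha = 0.5                               # Mann parameter alpha_n in (0, 1)
w_prev = np.array([3.0, -3.0])
w = np.array([2.0, -2.0])
for n in range(1, 500):
    sigma = 1.0 / (n + 1) ** 2                        # summable extrapolation
    y = w + sigma * (w - w_prev)                      # Step 1: inertial step
    grad = A.T @ (A @ y - proj_Q(A @ y))              # gradient of f_n at y
    z = proj_C(y - lam * grad)                        # Step 2: projection step
    w_prev, w = w, (1 - alpha) * y + alpha * z        # Step 3: Mann averaging

print(np.allclose(A @ w, proj_Q(A @ w), atol=1e-3))  # feasibility check
```

The summable choice of sigma mirrors the role of condition (8): the inertial perturbation must vanish fast enough that the Mann iteration still converges.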

Application to Data Classification Problem
Nowadays, many cancer patients are reported around the world each year, according to global cancer population surveys.
In 2021, Parvez et al. [29] presented several machine learning predictive models to classify breast cancer using this mammographic mass dataset. Table 2 compares our Algorithm 1, considered with the two closed convex constraint sets L_1 and L_2, against the machine learning predictive models of Parvez et al. [29] after feature engineering.
Table 2 shows that our Algorithm 1 with the closed convex constraint set L_2 gives the highest accuracy, 85.03%, after removing rows with missing values and outliers. We next explain how Algorithm 1 optimizes the weight parameters on training data. We focus on the extreme learning machine (ELM) with 5-fold cross-validation [30]. The ELM method is defined as follows: given training data with N distinct samples, where μ_s is an input and r_s is a target, the objective of ELM is to find the optimal output weights. The output function of a single-hidden-layer feedforward neural network (SLFN) with M hidden nodes is

o_s = Σ_{i=1}^{M} w_i V(⟨c_i, μ_s⟩ + e_i),

where V is an activation function and c_i and e_i are the weight parameters and the bias, respectively. The optimal output weight w_i at the i-th hidden node is found by forming the hidden-layer output matrix H with entries H_{si} = V(⟨c_i, μ_s⟩ + e_i) and letting R denote the training target data. To solve the linear system Hw = R, we use a least squares formulation, since the Moore-Penrose generalized inverse of H is not always easy to find. To avoid overfitting in the machine learning model, we consider a constrained least squares problem, taking c(ω) to be the constraint function defining L_1 or L_2 (with bound γ) and q(ω) = (1/2)‖ω − R‖², and apply our Algorithm 1 to solve the problem (3.1).
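The ELM construction above can be sketched as follows. The data here are synthetic and the output weights are obtained by ordinary least squares; the paper instead solves the constrained least-squares SFP with Algorithm 1, so the solver below is a stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_hidden_matrix(X, c, e):
    # H_{si} = V(<c_i, x_s> + e_i) with sigmoid activation V
    return 1.0 / (1.0 + np.exp(-(X @ c.T + e)))

N, d, M = 200, 5, 160                   # samples, features, hidden nodes
X = rng.normal(size=(N, d))             # synthetic inputs (not the UCI data)
R = (X.sum(axis=1) > 0).astype(float)   # synthetic binary targets

c = rng.normal(size=(M, d))             # random input weights c_i
e = rng.normal(size=M)                  # random biases e_i
H = elm_hidden_matrix(X, c, e)          # N x M hidden-layer output matrix

# Output weights w solving H w ~= R in the least-squares sense
w, *_ = np.linalg.lstsq(H, R, rcond=None)
pred = (H @ w > 0.5).astype(float)
acc = (pred == R).mean()
print("training accuracy:", acc)
```

Only the output weights w are trained; the hidden-layer parameters c_i, e_i stay at their random initialization, which is what makes the ELM training problem a linear system.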
We use four evaluation metrics, accuracy, precision, recall, and F1-score [31], to compare the performance of the classification algorithms:

Accuracy = (TP + TN)/(TP + TN + FP + FN),  Precision = TP/(TP + FP),  Recall = TP/(TP + FN),  F1-score = 2 · Precision · Recall/(Precision + Recall),

where TN is the number of true negatives, FP the false positives, FN the false negatives, and TP the true positives.
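These four metrics follow directly from the confusion-matrix counts; the counts below are made-up illustrative numbers.

```python
# Standard confusion-matrix metrics for binary classification.
def metrics(TP, TN, FP, FN):
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    precision = TP / (TP + FP)
    recall = TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

print([round(x, 4) for x in metrics(TP=8, TN=5, FP=2, FN=1)])
# -> [0.8125, 0.8, 0.8889, 0.8421]
```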
To check for model overfitting, we consider accuracy and loss plots. In this research, we use the binary cross-entropy loss function

Loss = −(1/m) Σ_{i=1}^{m} [O_i log Ô_i + (1 − O_i) log(1 − Ô_i)],

where Ô_i is the i-th scalar value in the model output, O_i is the corresponding target value, and m is the number of scalar values in the model output.
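A direct implementation of this loss (the clipping constant is a common numerical-stability convention, not from the paper):

```python
import numpy as np

# Binary cross-entropy: O are targets, O_hat are model outputs in (0, 1).
def bce(O, O_hat, eps=1e-12):
    O_hat = np.clip(O_hat, eps, 1 - eps)   # avoid log(0)
    return -np.mean(O * np.log(O_hat) + (1 - O) * np.log(1 - O_hat))

print(round(bce(np.array([1.0, 0.0]), np.array([0.9, 0.1])), 4))  # -> 0.1054
```

Both samples here are predicted with confidence 0.9 for the correct class, so the loss is exactly −ln 0.9.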
For comparison with other existing methods from the literature, the necessary parameters of each algorithm are listed in Table 3. The extrapolation parameter of Algorithm 1 (L_1, L_2) and of algorithm (10) is chosen as a function of N, the number of iterations at which we stop; for algorithm (6) we take σ_n = σ̄_n. The sigmoid function is used as the activation function with M = 160 hidden nodes, and the four evaluation metrics of each algorithm are shown in Table 4.
Table 4 shows that our Algorithm 1 with the closed convex constraint set L_2 achieves the highest F1-score, precision, recall, and accuracy. Additionally, Algorithm 1 with the constraint set L_1 requires the lowest number of iterations. The fit of Algorithm 1 is assessed by examining the training and validation loss together with the accuracy.
From Figures 1 and 2, we observe that both variants of Algorithm 1 (L_1 and L_2) produce well-fitted models. This means that the algorithm learns the training dataset suitably and generalizes well when predicting the severity of a mammographic mass from the BI-RADS assessment, the patient's age, and the shape, margin, and density of the mass.
Remark 2. Since the matrix H in ELM is generated by a finite dataset of real numbers, we can see from Table 4 that our Algorithm 1, which requires an estimate of the norm of the bounded linear operator H, is more efficient than algorithm (10) of Wang and Yu [24].

Conclusion and Discussion
Nowadays, many studies are interested in the accuracy of artificial intelligence (AI) for the detection of breast cancer in mammography screening programmes. Some believe that AI has helped improve radiologists' performance and provides results equivalent or superior to those of radiologists alone, for example by reducing the screen-reading volume without substantially affecting cancer detection [32]. However, some issues merit further exploration, including possible effects on recall and interval cancers [33], and Freeman et al. [34], in a systematic review of test accuracy, concluded that there is inadequate evidence to judge the accuracy of AI in detecting breast cancer on screening mammography. Research is still limited and may not represent the real effect of AI in clinical practice, or where on the clinical pathway AI might be of most benefit. This paper presented a modified inertial relaxed CQ algorithm with Mann's iteration for split feasibility problems, applied to an extreme learning machine that uses the BI-RADS assessment, the patient's age, and three BI-RADS attributes to predict the severity of a mammographic mass lesion and assist the physician in deciding whether to proceed to biopsy. The comparison with other machine learning models and existing algorithms for split feasibility problems shows that our algorithm provides the highest performance: 85.03% accuracy, 82.56% precision, 87.65% recall, and 85.03% F1-score. Moreover, the training and validation loss and accuracy plots show that our algorithm produces a well-fitted model.

International Journal of Breast Cancer

Figure 1: Training and validation loss and accuracy plots of Algorithm 1 with the closed convex constraint set L_1.
Figure 2: Training and validation loss and accuracy plots of Algorithm 1 with the closed convex constraint set L_2.

Algorithm 1 (modified inertial projective algorithm).
Initialization: Take ω_0, ω_1 ∈ C and set n = 1.
Step 1. Compute the inertial step y_n = ω_n + σ_n(ω_n − ω_{n−1}).
Step 2. Compute z_n = Proj_{C_n}(y_n − λ_n∇f_n(y_n)), where λ_n ∈ (0, 2/‖A‖²).
Step 3. Compute ω_{n+1} = (1 − α_n)y_n + α_n z_n, where α_n ∈ (0, 1). Replace n with n + 1 and repeat Step 1.
The mammographic mass lesions in this dataset were identified on full-field digital mammograms collected at the Institute of Radiology of the University Erlangen-Nuremberg between 2003 and 2006. Such datasets can indicate how well a computer-aided diagnosis (CAD) system performs compared to radiologists [28]. A total of 167 missing attribute values (7 BI-RADS, 5 ages, 31 shapes, 48 margins, and 76 densities) were removed before the training process. Table 1 shows an overview of all attributes.

Table 1: Overview of the mammographic mass dataset from UCI.

Table 2: Highest accuracy of ML algorithms after feature engineering.

Table 3: All necessary parameters of each algorithm.

Table 4: Performance of each algorithm for comparison.