Degree-Based Molecular Descriptors and QSPR Analysis of Breast Cancer Drugs

Cancer is a disease in which abnormal cells grow uncontrollably and spread into surrounding tissues, damaging other parts of the body. Of its many types, breast cancer is the most common. Women are affected by breast cancer through hormonal changes or genetic changes in DNA. Because breast cancer is life-threatening, further studies are needed to fight it. In this work, a detailed study is made of its occurrence, its symptoms, and the drugs involved in its treatment. For this purpose, a quantitative structure-property relationship (QSPR) analysis is carried out on 21 drugs used in the treatment of breast cancer. The drugs are modelled as molecular graphs based on their molecular structures, and 11 topological indices are computed. Conclusions are drawn from the QSPR analysis of these drugs.


Introduction
The human body is made up of trillions of cells. Cell division is a natural phenomenon in all living beings. If the division of cells happens uncontrollably and spreads to surrounding tissues to form lumps, it results in cancer. Cancer can affect any part of the human body. It is the most life-threatening disease, even though a lot of research is happening to cure this disease. Nowadays, the recovery rates in patients have considerably improved. Cancer occurs because of hormonal changes or genetic changes in DNA.
Cancer occurs at any age, from infancy to old age, and most commonly in adults. If an extra growth, lump, or tumour appears in the body, a biopsy is necessary to confirm the diagnosis. Tumours may be malignant or benign. Benign tumours are non-cancerous and do not spread to surrounding tissues. However, some benign tumours may be life-threatening if they grow in the brain. The risk of contracting cancer may be reduced by maintaining a healthy lifestyle, avoiding food that causes cancer, and taking vaccines that can prevent cancer from developing.
Factors reported to cause cancer include tobacco use, exposure to carcinogens, and cooking with Teflon-coated vessels [1-3].
Cancer occurs in human beings irrespective of gender. Among women, breast cancer and cervical cancer are especially common. According to statistics for the year 2020, 2.3 million women were affected by breast cancer globally, of whom 685,000 died of the disease. Breast cancer usually develops in the lining of the milk ducts and in the lobules that supply milk to these ducts. There are more than 18 types of breast cancer. Early detection of breast cancer is done by mammograms. The treatment includes clinical trials, immune therapy, hormone therapy, targeted therapy, and surgery with chemotherapy and radiation therapy [4,5].
Breast cancer is classified based on grading systems influenced by prognosis.
Several factors describe the type of cancer and its response: histopathology, grade, stage, receptor status, and DNA assays. Histopathology deals with the confirmation and analysis of the report produced by the pathologist. Grade is a category based on the appearance of the breast tissue and primarily confirms the presence of malignant cells in the ducts or lobules. Stages run from 0 to 4. Stage 0 is called the precancerous stage, stages 1-3 refer to cancer within the breasts or lymph nodes, and stage 4 is called metastatic cancer since it has spread beyond the breast to other parts of the body.

Topological Index Significance and Applications.
A numerical descriptor is a mathematical tool derived from the structure of a chemical compound. It is used to analyse and investigate the physicochemical properties of a molecule, thereby avoiding expensive and time-consuming laboratory experiments. It is a real number that encodes valuable information about a chemical compound. There are different types of topological indices (TI's), such as degree-based, neighbourhood degree-based, distance-based, and eigenvalue-based indices. Property-based and activity-based models use these indices to correlate biological activities and other properties with the corresponding chemical structures [6, 9-17].
To manufacture any drug, pharmacists collect the properties of the molecular structure identified from quantitative structure-property relationship/quantitative structure-activity relationship (QSPR/QSAR) modelling and topological indices [18]. The results obtained help in determining whether the new product is safe for consumption by living beings. Various numerical descriptors are applied to predict the properties of anticancer drugs, as there is an interrelation between anticancer drugs and the characteristics of alkanes [3, 19-21].
In designing any new drug, properties of the molecular structure are required. Such properties are obtained using QSPR models with topological indices. To assist chemists, a detailed study of 21 drugs is carried out and various topological indices are computed.

Motivation for the Indices Used.
Many topological indices have been introduced since 1947. The topological indices chosen in this work correlate highly with the properties of various drugs used in breast cancer treatment.
The applications of the considered topological indices are discussed below. The first and second Zagreb indices help in determining the total π-electron energy of molecules [22]. Randic introduced a topological index for computing the extent of branching of carbon atoms of saturated hydrocarbons, which is named the Randic index [23]. The reciprocal Randic index helps in studying the chemical and physical properties of compounds with alkanes [24]. The harmonic index is another variant of the Randic index, first introduced by Fajtlowicz [8]. The heat of formation of heptanes and octanes is studied using the ABC index [7]. The heat of formation of alkanes is predicted using the augmented Zagreb index [25]. Furtula and Gutman [26] proposed the forgotten TI, used to test various properties of drugs. Zhao et al. [27] proposed the SS index and studied the physicochemical properties of 67 alkane isomers. The SS index was found to correlate well with five properties, namely boiling point (BP), melting point (MP), molar refractivity (MR), heat of vaporization (HV), and critical pressure (CP), of which MR had the highest correlation, 0.99. The SS index also correlates well for four dendrimer structures, and the correlation for the porphyrin dendrimer was perfectly positive (r = 1). The Sombor index was recently introduced by Gutman [28], and its chemical applicability was checked by Redzepovic [29], who found a reasonable correlation between the Sombor index and entropy. The Sombor index is used in predicting the entropy of octanes. The total surface area of octane isomers is predicted using the inverse sum indeg index [30].
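In chemical graph theory (CGT), the molecular structure of a drug is represented as a molecular graph G in which each atom is a vertex and each bond connecting two atoms is an edge. For standard graph notations and terminologies, see [31-34]. In the definitions below, E(G) denotes the edge set of G and $d_u$ the degree of vertex u.

Definition 1. Gutman et al. [22] introduced the first and second Zagreb indices, defined as
$$M_1(G) = \sum_{uv \in E(G)} (d_u + d_v), \qquad M_2(G) = \sum_{uv \in E(G)} d_u d_v.$$

Definition 2. Estrada et al. [7] introduced the atom-bond connectivity (ABC) index, defined as
$$ABC(G) = \sum_{uv \in E(G)} \sqrt{\frac{d_u + d_v - 2}{d_u d_v}}.$$

Definition 3. Vukicevic et al. [30] introduced the inverse sum indeg index, defined as
$$ISI(G) = \sum_{uv \in E(G)} \frac{d_u d_v}{d_u + d_v}.$$

Definition 4. Recently, Zhao et al. [27] formulated the SS index, which is defined as
$$SS(G) = \sum_{uv \in E(G)} \sqrt{\frac{d_u d_v}{d_u + d_v}}.$$

Definition 5. Recently, Gutman [28] formulated the Sombor index, which is defined as
$$SO(G) = \sum_{uv \in E(G)} \sqrt{d_u^2 + d_v^2}.$$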
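As an illustration of how degree-based indices are evaluated on a molecular graph, the following minimal Python sketch computes six of the indices named in Definitions 1-5 from an edge list. It assumes the standard textbook forms of these indices (sums over edges of a function of the end-vertex degrees); the function names, the dictionary keys, and the toy molecule (the hydrogen-suppressed graph of 2-methylbutane) are illustrative choices and are not taken from the study.

```python
from collections import Counter
from math import sqrt


def degrees(edges):
    """Return a mapping vertex -> degree for an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_indices(edges):
    """Compute several degree-based indices as sums over the edges.

    Assumed standard forms (not copied from the paper's equations):
    M1 = sum(du + dv), M2 = sum(du * dv),
    ABC = sum(sqrt((du + dv - 2) / (du * dv))),
    ISI = sum(du * dv / (du + dv)),
    SS = sum(sqrt(du * dv / (du + dv))),
    SO = sum(sqrt(du^2 + dv^2)).
    """
    deg = degrees(edges)
    pairs = [(deg[u], deg[v]) for u, v in edges]
    return {
        "M1": sum(a + b for a, b in pairs),
        "M2": sum(a * b for a, b in pairs),
        "ABC": sum(sqrt((a + b - 2) / (a * b)) for a, b in pairs),
        "ISI": sum(a * b / (a + b) for a, b in pairs),
        "SS": sum(sqrt(a * b / (a + b)) for a, b in pairs),
        "SO": sum(sqrt(a * a + b * b) for a, b in pairs),
    }


# Hypothetical toy input: carbon skeleton of 2-methylbutane
# (chain 1-2-3-4 with a methyl carbon 5 attached to carbon 2).
ISOPENTANE = [(1, 2), (2, 3), (3, 4), (2, 5)]
indices = degree_indices(ISOPENTANE)

for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

The same function can be applied to any drug once its hydrogen-suppressed molecular graph is written as an edge list; only the edge list changes.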

Results and Discussion
In this work, topological indices are computed for the chemical structures of drugs used in the treatment of breast cancer. The QSPR analysis of the indices considered in the study is discussed, and it is shown that the indices are highly correlated with the physical properties of the drugs. The drugs considered in this work are alpelisib, azacitidine, cytarabine, daunorubicin, dexamethasone, docetaxel, doxorubicin, glasdegib, gilteritinib, ivosidenib, midostaurin, olaparib, paclitaxel, palbociclib, pamidronic acid, prednisone, ribociclib, tioguanine, toremifene, tucatinib, and venetoclax. The molecular structures of these drugs are represented in Figure 1. The analysis includes computing 11 indices; the data are given in Tables 1 and 2. Tables 3-13 display the statistical parameters for all the considered TI's and physical properties: number of drugs considered (N), constant (A), regression coefficient (b), correlation coefficient (r), Fisher's statistic (F), significance value (p), and standard error (SE). In each table, p ≤ 0.001, which is well below the 0.05 significance threshold and indicates that the results are significant. The correlation coefficients of the physicochemical properties against the TI's are depicted in Figure 2.

Proof. From Figure 3, there are 39 vertices and 43 edges of 8 different types. They are as follows.
Considering the number of edges of each type in the definitions of the indices, equations (1)-(11), the following results are obtained.
Similarly, the indices are computed for the other drugs considered in the study. The results obtained are given in Table 1.
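The edge-partition technique used in the proof can be sketched in Python as follows: edges are grouped by the degree pair of their end vertices, and an index is then the sum over the groups of (number of edges) × (edge weight). This is a minimal sketch under stated assumptions. The helper names and the toy graph (the carbon skeleton of 2,3-dimethylbutane) are hypothetical and are not the drug graph of Figure 3, and the edge weights are the standard forms of the first Zagreb and Sombor indices.

```python
from collections import Counter
from math import sqrt


def edge_partition(edges):
    """Count edges by the unordered degree pair (d_u, d_v) of their end vertices."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)


def index_from_partition(partition, weight):
    """Evaluate an index as sum over edge types of count * weight(d_u, d_v)."""
    return sum(count * weight(a, b) for (a, b), count in partition.items())


# Hypothetical toy input: carbon skeleton of 2,3-dimethylbutane
# (chain 1-2-3-4 with methyl carbons 5 and 6 on carbons 2 and 3).
DIMETHYLBUTANE = [(1, 2), (2, 3), (3, 4), (2, 5), (3, 6)]

partition = edge_partition(DIMETHYLBUTANE)
first_zagreb = index_from_partition(partition, lambda a, b: a + b)
sombor = index_from_partition(partition, lambda a, b: sqrt(a * a + b * b))

print(dict(partition))        # edge types and their counts
print(first_zagreb, sombor)
```

A useful sanity check is that the counts in the partition add up to the total number of edges of the graph, as they do for the 43 edges reported in the proof.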
From Table 1, descriptive statistics show that the obtained values are normally distributed, with the kurtosis value lying within ±1.96. Normality is also checked with the Shapiro-Wilk test (n < 50), for which the significance value is greater than 0.05. Hence, we conclude that the values are normally distributed. Therefore, regression analysis is a suitable method for analysing the data.
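The descriptive part of this normality screening can be sketched in Python as follows. The sketch assumes that the ±1.96 criterion is applied to z-scores, that is, to the sample skewness and excess kurtosis divided by their standard errors, which is a common convention in statistical packages; the text does not state this explicitly. The function names and the input values are hypothetical placeholders, not data from Table 1. The Shapiro-Wilk test itself is not implemented here and would normally come from a statistics library.

```python
from math import sqrt


def skew_kurtosis_z(values):
    """Return (z_skewness, z_kurtosis) using bias-corrected sample statistics."""
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 observations are required")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g1 = m3 / m2 ** 1.5          # population skewness
    g2 = m4 / m2 ** 2 - 3.0      # population excess kurtosis
    # Bias-corrected sample skewness and excess kurtosis.
    skew = sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    # Standard errors of the two statistics under normality.
    se_skew = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt


def within_normal_range(values, limit=1.96):
    """True when both z-scores lie inside (-limit, +limit)."""
    z_skew, z_kurt = skew_kurtosis_z(values)
    return abs(z_skew) < limit and abs(z_kurt) < limit


# Hypothetical placeholder values, not taken from Table 1.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
z_skew, z_kurt = skew_kurtosis_z(sample)
print(z_skew, z_kurt, within_normal_range(sample))
```

Passing this screen does not prove normality; it only fails to flag a departure from it, which is why a formal test such as Shapiro-Wilk is reported alongside it.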

Regression Models
The linear regression model is given by
$$P = A + b\,(TI), \qquad (14)$$
where P, A, b, and TI denote the physical property of the drug, the constant, the regression coefficient, and the topological index, respectively. Using equation (14), the linear models for the respective topological indices considered in the study are obtained as follows.
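A least-squares fit of the linear model P = A + b(TI), together with the statistics reported in the tables (N, A, b, r, F, SE), can be sketched in plain Python as follows. This is a minimal sketch, not the software used in the study. It assumes that SE denotes the standard error of the estimate, sqrt(SSE/(N-2)), and that F is the usual one-predictor regression F statistic. The function name and the input numbers are hypothetical placeholders, not values from Tables 1 and 2. The p value is omitted because it requires the F distribution, which the standard library does not provide.

```python
from math import sqrt


def fit_linear(ti, prop):
    """Fit prop = A + b * ti by ordinary least squares and return summary statistics."""
    n = len(ti)
    if n != len(prop) or n < 3:
        raise ValueError("need two equally long sequences with at least 3 points")
    mean_x = sum(ti) / n
    mean_y = sum(prop) / n
    sxx = sum((x - mean_x) ** 2 for x in ti)
    syy = sum((y - mean_y) ** 2 for y in prop)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ti, prop))
    b = sxy / sxx                        # regression coefficient
    a = mean_y - b * mean_x              # constant
    r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(ti, prop))
    mse = sse / (n - 2)                  # residual mean square
    se = sqrt(mse)                       # standard error of the estimate
    f = (syy - sse) / mse if mse > 0 else float("inf")
    return {"N": n, "A": a, "b": b, "r": r, "F": f, "SE": se}


# Hypothetical placeholder data: index values and a physical property.
ti_values = [1.0, 2.0, 3.0, 4.0, 5.0]
property_values = [2.1, 3.9, 6.2, 7.8, 10.1]

model = fit_linear(ti_values, property_values)
print(model)
```

Running the same function once per (index, property) pair yields one row of statistics per pair, which is the layout the tables of this kind of analysis follow.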

Conclusion
In the present work, drugs used in the treatment of breast cancer are studied, and various numerical descriptors are computed for them. To develop any novel drug, the properties of its structure are required, and these properties can be obtained from QSPR modelling using TI's. The aim of this work is to obtain data on the topology of the structure using topological indices at lower cost and in less time. The correlation coefficients between the topological indices and the six physicochemical properties of the drugs are given in Table 14. By inspection, BP has the highest correlation with SO(G), with r = 0.898. EV has the highest correlation with F(G), with r = 0.896, and FP has a good correlation with SO(G), with r = 0.902, while MR with H(G) has r = 0.993, LogP with H(G) has r = 0.827, and MV with R(G) has r = 0.973. The results show good correlation coefficients between the physical properties and their respective topological indices. The study indicates that MR correlates well with all the topological indices considered. The correlation coefficient exceeds 0.7 in every case except one value (0.656) for the ABC(G) index, and in all models p ≤ 0.001, which is well below the 0.05 significance threshold and indicates that the results are significant.

Study Implications.
The capacity of a molecular entity to reach a target determines its biological activity, measured in terms of the potency or concentration needed for the entity to produce the effect. The physicochemical properties include solubility, hydrogen bonding, ionization, isosterism, etc. The QSPR analysis carried out in this work helps readers identify the drug properties required for treatment or for the inclusion of a compound in the discovery of a new drug.
The work provides direction to chemists and pharmacists in developing new drugs required for the treatment of different ailments. The prediction of physicochemical properties is extensively done using TI's. The indices are used in prediction models for soil absorption, boiling point, viscosity, densities of organic solvents, and chromatographic retention data.
Biological studies performed using TI's help in providing good predictions. Some examples are enzyme inhibition, carcinogenicity, and hallucinogenic activity. The biological predictions include studies related to environmental pollution and toxicity. The TI's obtained here may be considered as a reference in composing new compounds for further research. The study shows that the physicochemical properties of the drugs have a high positive correlation with the indices, indicating that these drugs may be utilized in the discovery of novel drugs for various ailments.
Chemometrics is the chemical discipline used to analyse the chemical information of a compound obtained by optimal procedures and experiments. This discipline uses statistical methods to extract the maximum chemical information about the compound.

Future Scope.
A similar study may be carried out for different chemical compounds useful to chemists in their further research. Various drugs used in the treatment of COVID-19 may also be considered for a similar study, which would help researchers.

Data Availability
The data used to support the findings of this study are included within the article.