Molecular Descriptor Analysis of Certain Isomeric Natural Polymers

Polymers, drugs, and almost all chemical or biochemical compounds are frequently modeled as diverse $\omega$-cyclic, acyclic, bipartite, and polygonal shapes and regular graphs. Molecular descriptors (topological indices) are numerical quantities computed from the molecular graph $\Gamma$ (2D lattice). These descriptors are highly significant in quantitative structure-property and structure-activity relationship (QSPR and QSAR) modeling, which provides a theoretical and optimal basis for otherwise expensive experimental drug design. In this paper, we study three isomeric natural polymers of glucose (polysaccharides), namely, cellulose, glycogen, and amylopectin (starch), which have promising pharmaceutical applications, exceptional properties, and fascinating molecular structures. We compute closed-form formulas for the ABC, GA, sum-connectivity $\chi_{-1/2}$, $ABC_4$, $GA_5$, and Sanskruti indices of the aforementioned macromolecules. We also present closed-form formulas for the first, second, modified, and augmented Zagreb indices, the inverse and general Randić indices, and the symmetric division deg, harmonic, and inverse sum indices. Furthermore, we provide a comparative analysis using 3D graphs for these families of macromolecules to clarify their nature.


Introduction
Cheminformatics is a comparatively new area of information technology that combines chemistry, mathematics, and other information sciences and concentrates on the gathering, storage, treatment, and examination of chemical data. A molecular descriptor (MD) characterizes the topology of a molecular graph and is invariant under isomorphism. Some of these descriptors take part in QSAR/QSPR analysis [1,2], which infers the bioactivities and physicochemical properties of biochemical materials. Various types of distance-based, degree-based, spectral, and polynomial-related descriptors of graphs are well established and extensively studied in the literature. Out of these classes, vertex degree-based descriptors turn out to be the most important and play a prominent role in chemical graph theory (CGT). These descriptors are used, in combination, to infer physicochemical, biological, and pharmacological properties such as the stability, chirality, melting point, boiling point, similarity, connectivity, entropy, enthalpy of formation, surface tension, density, critical temperature, and toxicity of chemical compounds in CGT; see [3,4]. Throughout this work, $\Gamma$ denotes a simple, finite, and connected graph, whereas $E(\Gamma)$ and $V(\Gamma)$ represent the edge set and vertex set of $\Gamma$, respectively. For a vertex $v \in V(\Gamma)$, the degree of $v$ is denoted by $d_v$, and the sum of the degrees of the vertices at unit distance from $v$ is represented by $S_v$, i.e.,
$$S_v = \sum_{u \in N_\Gamma(v)} d_u,$$
where $N_\Gamma(v) = \{u \in V(\Gamma) : uv \in E(\Gamma)\}$ is the set of neighbors of $v$. Here, we specify a few distinct, significant, and well-studied bond-additive invariants of our concern.
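The two vertex quantities just introduced, the degree $d_v$ and the neighbor-degree sum $S_v$, can be illustrated with a minimal sketch. The toy graph (a hexagon with one pendant vertex) and the function names are assumptions chosen for illustration; they are not taken from the networks studied in this paper.

```python
def degrees(adj):
    """Return d_v for every vertex of a graph given as {vertex: set(neighbours)}."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def neighbour_degree_sums(adj):
    """Return S_v, the sum of the degrees of the neighbours of v."""
    d = degrees(adj)
    return {v: sum(d[u] for u in nbrs) for v, nbrs in adj.items()}


# Hypothetical toy graph: a 6-cycle on vertices 0..5 plus a pendant vertex 6 on vertex 0.
toy = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
toy[0].add(6)
toy[6] = {0}

d = degrees(toy)
S = neighbour_degree_sums(toy)
print(d[0], S[0])  # degree and neighbour-degree sum of the branching vertex
```

A quick sanity check is that the sum of all $S_v$ equals the sum of $d_v^2$, since each vertex $u$ contributes $d_u$ to the sum of every one of its $d_u$ neighbors.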
In [5], Gutman and Trinajstić proposed two degree-based invariants known as the first Zagreb index (FZI) $M_1$ and the second Zagreb index (SZI) $M_2$. These indices initially appeared in the expression of the total π-electron energy of the molecular graph and were later applied to study molecular complexity and ZE isomerism. The formulas of $M_1$, $M_2$, and the modified Zagreb index (MSZI) ${}^{m}M_2$ are given as
$$M_1(\Gamma) = \sum_{uv \in E(\Gamma)} (d_u + d_v), \qquad M_2(\Gamma) = \sum_{uv \in E(\Gamma)} d_u d_v, \qquad {}^{m}M_2(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{1}{d_u d_v}. \tag{1}$$
In [6], Randić offered a very influential invariant, which is considered to be a prototype of degree-based invariants and is called the Randić index. It is among the oldest and most extensively studied invariants and measures the amount of branching in the carbon-atom skeleton of saturated hydrocarbons [7,8]. For a molecular graph $\Gamma$, the Randić index is defined as
$$R_{-1/2}(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{1}{\sqrt{d_u d_v}}. \tag{2}$$
In [9], Bollobás and Erdős initiated the concept of the general Randić index (GRI), which is given as
$$R_{\alpha}(\Gamma) = \sum_{uv \in E(\Gamma)} (d_u d_v)^{\alpha}. \tag{3}$$
Moreover, the inverse Randić index (IRI) is defined by the following formula:
$$RR_{\alpha}(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{1}{(d_u d_v)^{\alpha}}. \tag{4}$$
In [10], Zhou and Trinajstić introduced the concept of the generalized sum-connectivity index (GSCI), which is defined as follows:
$$\chi_{\alpha}(\Gamma) = \sum_{uv \in E(\Gamma)} (d_u + d_v)^{\alpha}. \tag{5}$$
For $\alpha = -1/2$ and $\alpha = 2$, we obtain the sum-connectivity index $\chi_{-1/2}$ (SCI) and the hyper-Zagreb index (HZI) [11], respectively.
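Because each invariant above is a sum of an edge weight that depends only on the two end-vertex degrees, all of them can be evaluated by one generic routine. The following is a minimal sketch under that observation; the function names and the star graph used as input are illustrative assumptions.

```python
from math import isclose, sqrt


def edge_degree_pairs(edges):
    """Return one (d_u, d_v) pair per edge of an undirected simple graph."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return [(deg[u], deg[v]) for u, v in edges]


def bond_additive(edges, weight):
    """Sum weight(d_u, d_v) over all edges uv."""
    return sum(weight(a, b) for a, b in edge_degree_pairs(edges))


def first_zagreb(edges):
    return bond_additive(edges, lambda a, b: a + b)


def second_zagreb(edges):
    return bond_additive(edges, lambda a, b: a * b)


def modified_zagreb(edges):
    return bond_additive(edges, lambda a, b: 1 / (a * b))


def general_randic(edges, alpha):
    return bond_additive(edges, lambda a, b: (a * b) ** alpha)


def general_sum_connectivity(edges, alpha):
    return bond_additive(edges, lambda a, b: (a + b) ** alpha)


# Hypothetical example: the star K_{1,3} with centre 0 and leaves 1..3.
star = [(0, 1), (0, 2), (0, 3)]
print(first_zagreb(star), second_zagreb(star))  # prints: 12 9
```

Setting `alpha=-0.5` in `general_randic` gives the Randić index, and `alpha=-0.5` or `alpha=2` in `general_sum_connectivity` gives the SCI or the HZI, respectively.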
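In [12], Li and Zheng introduced the first general Zagreb index (FGZI), and it is defined by the following formula:
$$M_{\alpha}(\Gamma) = \sum_{v \in V(\Gamma)} d_v^{\alpha} = \sum_{uv \in E(\Gamma)} \left(d_u^{\alpha - 1} + d_v^{\alpha - 1}\right). \tag{6}$$
For $\alpha = 3$, we obtain the forgotten index (FI). In [13], Azari and Iranmanesh initiated the idea of the generalized Zagreb index (GZI), which is given as
$$M_{r,s}(\Gamma) = \sum_{uv \in E(\Gamma)} \left(d_u^{r} d_v^{s} + d_u^{s} d_v^{r}\right), \tag{7}$$
where $r, s \in \mathbb{Z}^{+} \cup \{0, -1\}$. In [14], Estrada et al. introduced a significant invariant called the atom-bond connectivity index $ABC(\Gamma)$, which proved to be a good predictor of the stability of alkanes and the strain energy of cycloalkanes [15,16]. In [17], Vukičević and Furtula suggested another prominent invariant known as the geometric-arithmetic index $GA(\Gamma)$. These indices are defined as follows:
$$ABC(\Gamma) = \sum_{uv \in E(\Gamma)} \sqrt{\frac{d_u + d_v - 2}{d_u d_v}}, \qquad GA(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{2\sqrt{d_u d_v}}{d_u + d_v}. \tag{8}$$
An interested reader may refer to the surveys [18,19] regarding the Randić and GA indices of graphs, respectively. In [20,21], Ghorbani and Hosseinzadeh and Graovac et al. proposed the fourth version of ABC and the fifth version of GA, denoted by $ABC_4$ and $GA_5$, respectively. Likewise, Hosamani introduced the Sanskruti index, denoted by SI [22]. These invariants are based on the sum of the neighbors' degrees of the end vertices and are defined as
$$ABC_4(\Gamma) = \sum_{uv \in E(\Gamma)} \sqrt{\frac{S_u + S_v - 2}{S_u S_v}}, \qquad GA_5(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{2\sqrt{S_u S_v}}{S_u + S_v}, \qquad SI(\Gamma) = \sum_{uv \in E(\Gamma)} \left(\frac{S_u S_v}{S_u + S_v - 2}\right)^{3}. \tag{9}$$
For a molecular graph $\Gamma$, some other invariants of key importance and related to our concern are SDD (symmetric division deg), HI (harmonic index), ISI (inverse sum index), and AZI (augmented Zagreb index). These indices are defined as follows:
$$SDD(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{d_u^{2} + d_v^{2}}{d_u d_v}, \qquad HI(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{2}{d_u + d_v}, \qquad ISI(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{d_u d_v}{d_u + d_v}, \qquad AZI(\Gamma) = \sum_{uv \in E(\Gamma)} \left(\frac{d_u d_v}{d_u + d_v - 2}\right)^{3}.$$
In [23], Ranjini et al. proposed the idea of the first, second, and third redefined Zagreb invariants, which are given by the following formulas:
$$ReZM_1(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{d_u + d_v}{d_u d_v}, \qquad ReZM_2(\Gamma) = \sum_{uv \in E(\Gamma)} \frac{d_u d_v}{d_u + d_v}, \qquad ReZM_3(\Gamma) = \sum_{uv \in E(\Gamma)} d_u d_v (d_u + d_v).$$
It is evident that $ReZM_1(\Gamma) = n$, where $n = |V(\Gamma)|$, so $ReZM_1$ violates the criteria for a topological index. Furthermore, $ReZM_2$ is the same as the already defined topological index called the inverse sum index. Consequently, the only novel invariant is $ReZM_3(\Gamma)$, known as the redefined Zagreb index (ReZI) and denoted by $ReZM(\Gamma)$.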
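The degree-based terms (ABC, GA, AZI) and their neighbor-sum counterparts ($ABC_4$, $GA_5$, SI) share the same algebraic form and differ only in whether $d$ or $S$ is fed into the edge term. The sketch below makes this explicit and also checks numerically that $ReZM_1$ equals the number of vertices. The toy graph and all names are hypothetical.

```python
from math import isclose, sqrt


def degree_data(edges):
    """Return (d, S): vertex degrees and neighbour-degree sums."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    d = {v: len(nbrs) for v, nbrs in adj.items()}
    S = {v: sum(d[u] for u in nbrs) for v, nbrs in adj.items()}
    return d, S


def edge_sum(edges, w, term):
    """Sum term(w_u, w_v) over all edges uv, where w is either d or S."""
    return sum(term(w[u], w[v]) for u, v in edges)


def abc_term(a, b):
    return sqrt((a + b - 2) / (a * b))


def ga_term(a, b):
    return 2 * sqrt(a * b) / (a + b)


def cube_term(a, b):
    return (a * b / (a + b - 2)) ** 3


# Hypothetical toy graph: a hexagon 0..5 with a pendant vertex 6 attached to vertex 0.
toy = [(i, (i + 1) % 6) for i in range(6)] + [(0, 6)]
d, S = degree_data(toy)

indices = {
    "ABC": edge_sum(toy, d, abc_term),
    "GA": edge_sum(toy, d, ga_term),
    "AZI": edge_sum(toy, d, cube_term),
    "ABC4": edge_sum(toy, S, abc_term),
    "GA5": edge_sum(toy, S, ga_term),
    "SI": edge_sum(toy, S, cube_term),
    "ReZM1": edge_sum(toy, d, lambda a, b: (a + b) / (a * b)),
}
```

The identity $ReZM_1 = n$ holds because $(d_u + d_v)/(d_u d_v) = 1/d_u + 1/d_v$, and every vertex $v$ receives the contribution $1/d_v$ exactly $d_v$ times.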
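In [24], Deutsch and Klavžar initiated the idea of the M-polynomial for a graph $\Gamma = (V(\Gamma), E(\Gamma))$, and it is mathematically given as
$$M(\Gamma; x, y) = \sum_{\delta \le i \le j \le \Delta} m_{ij}(\Gamma)\, x^{i} y^{j},$$
where $m_{ij}(\Gamma)$ denotes the number of edges $uv \in E(\Gamma)$ such that $\{d_u, d_v\} = \{i, j\}$, and $\delta$ and $\Delta$ denote the minimum and maximum vertex degrees of $\Gamma$, respectively. Table 1 depicts the relationship between some essential topological indices and the M-polynomial. Note: all formulas depicted in Table 1 are evaluated at $x = y = 1$.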
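As a rough illustration, the M-polynomial can be stored as a map from exponent pairs $(i, j)$ to coefficients $m_{ij}$. Any degree-based index with edge weight $f(d_u, d_v)$ is then the weighted sum of $f(i, j)\, m_{ij}$, which is what evaluation at $x = y = 1$ amounts to. The representation, the function names, and the toy graph below are assumptions made for this sketch.

```python
from collections import Counter


def m_polynomial(edges):
    """Return {(i, j): m_ij} with i <= j, the coefficients of M(G; x, y)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)


def evaluate(coeffs, x, y):
    """Evaluate the M-polynomial at the point (x, y)."""
    return sum(m * x**i * y**j for (i, j), m in coeffs.items())


def index_from_m(coeffs, weight):
    """Value of the degree-based index whose edge weight is weight(i, j)."""
    return sum(m * weight(i, j) for (i, j), m in coeffs.items())


# Hypothetical toy graph: a hexagon 0..5 with a pendant vertex 6 attached to vertex 0.
toy = [(i, (i + 1) % 6) for i in range(6)] + [(0, 6)]
coeffs = m_polynomial(toy)  # corresponds to x*y^3 + 4*x^2*y^2 + 2*x^2*y^3

first_zagreb = index_from_m(coeffs, lambda i, j: i + j)
second_zagreb = index_from_m(coeffs, lambda i, j: i * j)
```

Evaluating the polynomial itself at $(1, 1)$ returns the number of edges, which is a convenient check on the coefficients.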
We summarize the relationship of GZI with certain important invariants in Table 2.
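The reductions of the GZI to other invariants follow directly from its definition by fixing $(r, s)$. The sketch below checks a few such reductions numerically on a toy graph; the specific pairs shown are algebraic consequences of the definition and are not copied from Table 2, and the graph and function name are hypothetical.

```python
from math import isclose


def generalized_zagreb(edges, r, s):
    """M_{r,s}: sum of d_u^r d_v^s + d_u^s d_v^r over all edges uv."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sum(
        deg[u] ** r * deg[v] ** s + deg[u] ** s * deg[v] ** r for u, v in edges
    )


# Hypothetical toy graph: a hexagon 0..5 with a pendant vertex 6 attached to vertex 0.
toy = [(i, (i + 1) % 6) for i in range(6)] + [(0, 6)]

m1 = generalized_zagreb(toy, 1, 0)             # first Zagreb index
m2 = generalized_zagreb(toy, 1, 1) / 2         # second Zagreb index
forgotten = generalized_zagreb(toy, 2, 0)      # forgotten index
rezm = generalized_zagreb(toy, 2, 1)           # redefined Zagreb index ReZM_3
sdd = generalized_zagreb(toy, 1, -1)           # symmetric division deg index
mod_m2 = generalized_zagreb(toy, -1, -1) / 2   # modified Zagreb index
```

The factor 1/2 appears whenever $r = s$, because both summands of the GZI edge term then coincide.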
Harry Wiener, an American theoretical chemist, observed that invariants computed from the molecular graph of a chemical compound carry information about the properties of that compound. Camarda and Maranas [25] employed connectivity indices to design polymers with certain optimal characteristics. Dendrimers are acknowledged to be the "polymers of the 21st century" due to their increasing popularity, which is evident from the number of research articles and registered patents. In [26], Wang et al. provided the closed-form formula for the k-connectivity invariant in the class of nanostars and dendrimers. In [27], Ali et al. derived general formulas of certain invariants for some specific polymers such as polyphenylenes, nanostars, and dendrimers. In [28], Shao et al. determined the maximum value of the ABC index and provided its characterization in the class of chemically oriented graphs. In [29], Gao et al. computed the enthalpy and entropy of copper(I) oxide and copper(II) oxide. Kang et al. [30], Liu et al. [31], and Gao et al. [32] studied various topological aspects of 2D silicon-carbons, nanotubes, and dendrimers, respectively.
Liu et al. [33] identified proteins having nucleotide-binding activity using star graph TIs. Ali et al. [34,35] and Du et al. [36] studied and applied some degree-based TIs such as the first Zagreb connection index, ordinary generalized geometric-arithmetic index, general Platt index, and general sum-connectivity index to establish extremal results for alkanes. Hayat et al. [37] performed comparative testing of certain chemical structures (carbon nanotubes, carbon nanocones, and tetrahedral diamond) using various degree-based TIs. Arockiaraj et al. [38] computed variants of Wiener indices for the molecular graphs of coronoid systems, carbon nanocones, and SiO$_2$ nanostructures. In [39], Ahmad et al. computed and compared several invariants of synthetic polymers such as bakelite, vulcanized rubber, and acrylic (polymethyl methacrylate) to ascertain a relationship between their physicochemical properties. In this work, we develop, from their monomers, the polymeric graphs of three closely related (isomeric) natural polymers, broadly known as cellulose, glycogen, and amylopectin, and compute certain invariants to anticipate their physicochemical properties. Numerous theoretical, mathematical, and chemical properties of diverse chemical structures based on various invariants obtained from their molecular graphs have been investigated in [40-43].
Polymers pervade every aspect of our daily life, and it is hard to imagine a society without natural as well as synthetic polymers. They are classified into four major types based on their molecular chains; see Figure 1.
Typically, almost all food items comprise macromolecules, which are polymers of some sort. Most food items primarily include naturally occurring polymers (polysaccharides) such as starch and cellulose. The main biological functions of these polysaccharides are energy storage for metabolism (starch and glycogen) and structural support as a building material (cellulose). Like graphite and diamond, glycogen, starch, and cellulose are composed of the same substance but have different structures. Glycogen, starch, and cellulose are all natural polymers of glucose (a carbohydrate) having the same chemical formula $(\mathrm{C_6H_{10}O_5})_n$. This class of natural polymers is made up of smaller segments called monomers (monosaccharides), and glucose is the basic building block of cellulose, glycogen, and starch. They differ from each other in the type of glucose present and in the nature of the bond that links the glucose monomers together. Glucose is a type of sugar comprising carbon, hydrogen, and oxygen. These elements bind together to create a hexagonal structure having six carbon atoms (numbered $C_1$ to $C_6$) with one of the carbons sticking off the end. Distinct glucose rings can be attached at different carbons to produce different types of structures. Some segments of the ring are flipped, giving two different forms of glucose, known as α-glucose (the OH group attached to $C_1$ points down) and β-glucose (the OH group attached to $C_1$ points up); see Figure 2. There are two types of bonding, namely, $\alpha(C_1 - C_4)$ and $\alpha(C_1 - C_6)$ glycosidic bonding, in amylopectin and glycogen; see Figure 2. Natural polymers, particularly those of carbohydrate origin, have found very promising pharmaceutical applications in different forms [44-46].

Table 1: Formulas to derive some promising invariants from the M-polynomial (columns: topological indices; formulas derived from the M-polynomial).

Discussion and Construction of the Planar Graph of Cellulose Network $CL_m^n$
Cellulose is among the most abundant, renewable, and biodegradable organic compounds found in nature. Anselme Payen (1838), a French chemist, recognized the existence of cellulose in green plants [47]. It is the main component of the tough cell walls that surround plant cells, thus making plant stems, leaves, and branches strong as well as rigid. The rigid structure of cellulose allows plants to stand upright and makes cellulose difficult to digest and hard to break down. Recently, governments and industry have become highly interested in products from sustainable and renewable resources that pose low risks to human health and the environment [48]. Cellulose-based materials (cellulosics) are used as key excipients in pharmaceutical compounding and have attracted immense attention due to various intriguing features such as low cost, biocompatibility, reproducibility, and recyclability (green technology). First, we explain the chemical structure of cellulose in general and then convert it into a mathematical object called a molecular graph to investigate its properties using tools from graph theory. Cellulose comprises over 3,000 D-glucose units that are linked by $\beta(C_1 - C_4)$ glycosidic bonding (see Figure 3) and has the general formula $(\mathrm{C_6H_{10}O_5})_n$. Cellulose is a linear unbranched polymer: unlike glycogen and starch, no coiling occurs. Multiple hydroxyl groups on the glucose rings of one chain form hydrogen bonds with oxygen atoms on the same or a neighboring linear chain (highly cross-linked polymer), which results in the formation of microfibrils having high tensile strength. Now, we provide the construction, from scratch, of the molecular graph of $CL_m^n$. The basic building unit of the cellulose network is $(\mathrm{C_6H_{10}O_5})_2$, depicted in Figure 3, consisting of three hexagons and one octagon with three pendant edges. Out of these pendants, one is a fixed carbon, and the remaining two pendants are OH (hydroxyl) groups, one at the upper side and one at the lower side, available for further bonding. Here, $n$ represents the number of hexagons; the basic unit has three hexagons, and when we add one monomer to the basic unit, we get 7 hexagons. Similarly, every single addition of a monomer results in an increment of four hexagons.
Assume $n$ to be the number of hexagons in one hexagonal chain with $l$ isomeric units and $m$ to be the number of hexagonal chains in the cellulose network $CL_m^n$. Clearly, the number of hexagons in each chain is odd, and the relation between the hexagons in one chain and the isomeric units $l$ is given as $n = 4l + 3$, $l = 0, 1, 2, 3, \ldots$. Figure 4 shows a three-dimensional network of cellulose $CL_4^7$ along with its planar network. We recognize three types of polygons in the molecular graph of cellulose, namely, hexagons, octagons, and decagons.

The number of vertices in $CL_m^n$, $n = 4l + 3$, is $22ml + 20m - 2$, and the number of edges is $30ml + 25m - 2l - 4$. To compute our results, we require an edge partition of the edge set $E(CL_m^n)$, where $E_{ij}$ denotes the set of edges whose end vertices have degrees $i$ and $j$. There is exactly one edge whose end vertices have degrees 1 and 3, i.e., $|E_{13}(CL_m^n)| = 1$. The number of edges with end vertices each of degree 2 is $|E_{22}(CL_m^n)| = 4m + 4l + 1$. The number of edges whose end vertices have degrees 2 and 3 is $|E_{23}(CL_m^n)| = 12ml + 12m - 2$. Finally, the number of edges with end vertices each of degree 3 is $|E_{33}(CL_m^n)| = 18ml + 9m - 6l - 4$. An edge partition of the cellulose network in terms of different parameters is presented in [49]. For computational ease, we summarize the edge partition of the cellulose network in Table 3.
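The counts above can be cross-checked mechanically. The sketch below verifies, for a range of $(m, l)$, that the four edge classes add up to the stated number of edges, and that the vertex classes implied by the edge partition (every degree-$k$ vertex is an end of exactly $k$ edges) add up to the stated number of vertices. The function names are hypothetical, and the implied vertex classes are a derived consistency check rather than values quoted from the paper.

```python
def cellulose_edge_partition(m, l):
    """Edge classes |E_ij| of the cellulose network, keyed by end-vertex degrees."""
    return {
        (1, 3): 1,
        (2, 2): 4 * m + 4 * l + 1,
        (2, 3): 12 * m * l + 12 * m - 2,
        (3, 3): 18 * m * l + 9 * m - 6 * l - 4,
    }


def cellulose_vertex_count(m, l):
    return 22 * m * l + 20 * m - 2


def cellulose_edge_count(m, l):
    return 30 * m * l + 25 * m - 2 * l - 4


def vertex_classes_from_edges(partition):
    """Recover |V_k| from the edge classes by counting edge ends per degree."""
    ends = {}
    for (i, j), count in partition.items():
        ends[i] = ends.get(i, 0) + count
        ends[j] = ends.get(j, 0) + count
    if any(total % k for k, total in ends.items()):
        raise ValueError("edge partition is not realisable by any graph")
    return {k: total // k for k, total in ends.items()}


# Smallest case: one chain consisting of the basic unit (m = 1, l = 0, n = 3).
part = cellulose_edge_partition(1, 0)
classes = vertex_classes_from_edges(part)
print(sum(part.values()), sum(classes.values()))  # edges and vertices of the basic unit
```

For the basic unit this gives 21 edges and 18 vertices, hence $21 - 18 + 1 = 4$ independent cycles, which agrees with the three hexagons and one octagon described for $(\mathrm{C_6H_{10}O_5})_2$.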

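Proof. Employing equation (7) and using Table 3, we obtain the desired result as follows:
$$M_{r,s}(CL_m^n) = \left(3^{r} + 3^{s}\right) + 2^{r+s+1}(4m + 4l + 1) + \left(2^{r} 3^{s} + 2^{s} 3^{r}\right)(12ml + 12m - 2) + 2 \cdot 3^{r+s}(18ml + 9m - 6l - 4). \tag{17}$$
□

Corollary 1. Using the formulas outlined in Table 2 in equation (17), we derive the corresponding results for the different TIs.

Theorem 2. Let $CL_m^n$ be a cellulose network having $m$ hexagonal chains and $n = 4l + 3$ hexagons in each chain. Then, the ABC, GA, $ABC_4$, $GA_5$, Sanskruti, and sum-connectivity indices of $CL_m^n$ are given by closed-form expressions in $m$ and $l$.

Proof. We determine the required results with the help of Table 3 along with equations (8), (9), and (5), respectively.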
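To make the computation concrete, the following sketch evaluates several degree-based indices of the cellulose network directly from the edge partition $|E_{13}|, |E_{22}|, |E_{23}|, |E_{33}|$ stated in the text. The function names are hypothetical, and the closed forms for $M_1$ and $M_2$ in the last two functions were obtained here by expanding the weighted sums; they are offered as a check, not as quotations of the paper's corollary. The neighbor-sum indices ($ABC_4$, $GA_5$, SI) are omitted because they need a partition by $S_u$ and $S_v$, which is not reproduced in this section.

```python
from math import isclose, sqrt


def cellulose_edge_partition(m, l):
    """Edge classes |E_ij| of the cellulose network, keyed by end-vertex degrees."""
    return {
        (1, 3): 1,
        (2, 2): 4 * m + 4 * l + 1,
        (2, 3): 12 * m * l + 12 * m - 2,
        (3, 3): 18 * m * l + 9 * m - 6 * l - 4,
    }


def weighted_sum(partition, weight):
    """Sum weight(i, j) * |E_ij| over all edge classes."""
    return sum(count * weight(i, j) for (i, j), count in partition.items())


def cellulose_gzi(m, l, r, s):
    """Generalized Zagreb index M_{r,s} of the cellulose network."""
    p = cellulose_edge_partition(m, l)
    return weighted_sum(p, lambda i, j: i**r * j**s + i**s * j**r)


def cellulose_indices(m, l):
    p = cellulose_edge_partition(m, l)
    return {
        "M1": weighted_sum(p, lambda i, j: i + j),
        "M2": weighted_sum(p, lambda i, j: i * j),
        "ABC": weighted_sum(p, lambda i, j: sqrt((i + j - 2) / (i * j))),
        "GA": weighted_sum(p, lambda i, j: 2 * sqrt(i * j) / (i + j)),
        "SCI": weighted_sum(p, lambda i, j: (i + j) ** -0.5),
    }


def m1_closed_form(m, l):
    return 168 * m * l + 130 * m - 20 * l - 26


def m2_closed_form(m, l):
    return 234 * m * l + 169 * m - 38 * l - 41
```

For example, `cellulose_indices(4, 1)` corresponds to the network with four chains of seven hexagons each.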

Discussion and Construction of the Planar Graph of Glycogen $GL_m^l$ and Amylopectin $AM_m^l$ Networks
In the 19th century, Claude Bernard, a prominent French physiologist, discovered glycogen, which mainly resides in the liver and muscles. Natural polymers such as cellulose, chitin, proteins, carbohydrates, and glycogen are a great source of energy, as they are key components for sustaining life. Glycogen $(\mathrm{C_6H_{10}O_5})_n$ is a giant, complex, and highly branched polymer consisting of about 30,000 glucose monomers. It comprises chains of glucose molecules linearly linked via $\alpha(C_1 - C_4)$ glycosidic linkages, and after every ten to twelve residues, a chain of glucose branches off via $\alpha(C_1 - C_6)$ glycosidic linkages. This latter kind of bonding creates branching and winding patterns in glycogen. In contrast, cellulose (a close ally) has $\beta(C_1 - C_4)$ glycosidic linkages that produce a more rigid linear chain, and hence it cannot be broken down in our stomach. Glycogen has only one reducing end, whereas it has plenty of nonreducing ends; see Figure 5. Glycogen is the principal storage form of glucose in human and animal cells, and when required, glycogen is readily processed to release glucose. This self-regulating process maintains the amount of glucose in the blood at a constant level even though the supply is uneven. The interaction between glycogen and glucose is at the heart of what is commonly known as the Cori cycle (muscle glycogen ⟶ blood lactic acid ⟶ liver glycogen ⟶ blood glucose ⟶ muscle glycogen). Although the regulation of glycogen metabolism by hormones such as insulin, glucagon, and adrenaline has been investigated considerably [50-52], it is still the subject of extensive investigation. In glycogen, $\alpha(C_1 - C_6)$ glycosidic bonding occurs approximately after every 10 glucose residues, which creates a branch. The network of our interest, $GL_m^l$, is constructed from the glycogen molecule in such a way that it has $m - 1$ branches of length $1 < l < 10$; see Figure 6.
Amylopectin is an analog of glycogen that has fewer branches and is less compact than glycogen. The helical branching gives these molecules an open structure; as a result, they are easily accessible to enzymes and can be broken down or assembled quickly. In amylopectin, $\alpha(C_1 - C_6)$ glycosidic bonding occurs approximately after every 20 glucose residues, which creates a branch. The network of our interest, $AM_m^l$, is constructed from the amylopectin molecule in such a way that it has $m - 1$ branches of length $1 < l < 20$.

Proof. Suppose $V_k(GL_m^l)$ and $E_{xy}(GL_m^l)$ denote the classes of the vertex set and edge set partitions, defined as $V_k = \{v \in V(GL_m^l) : d_v = k\}$ and $E_{xy} = \{uv \in E(GL_m^l) : (d_u, d_v) = (x, y)\}$, respectively. In the molecular graph of the glycogen network $GL_m^l$, we recognize three kinds of vertices with degrees 1, 2, and 3, i.e., $\delta(GL_m^l) = 1$ and $\Delta(GL_m^l) = 3$. By applying the basic counting principle, we acquire the partition of the vertex set; in particular, $|V_1| = m(9 + l) - l$. Subsequently, the total number of vertices $|V(GL_m^l)|$ of the glycogen network is $8m(l + 10) - 8l - 4$. Similarly, we identify four types of edges in $GL_m^l$ with respect to the degrees of the end vertices of each edge. Again employing the basic counting principle, we get the partition of the edge set. Therefore, the total number of edges $|E(GL_m^l)|$ is $9m(l + 10) - 9l - 5$. □

Lemma 2 reveals some basic properties of the amylopectin network that are essential for the subsequent results. Note that we skip the proofs of the results for the amylopectin network, as they can be obtained along the same lines as for the glycogen network.

Proof. Employing equation (7) and using the corresponding edge partition, we obtain the desired result. □

Corollary 2. From equation (30) of the GZI, we derive the results for the different TIs presented in Table 2.

[Vertex and edge counts table: $9m + ml - l - 1$; total vertices $8m(l + 10) - 8l - 4$; total edges $9m(l + 10) - 9l - 5$.]

Table 7: Partitioning of the edge set with respect to the degrees of end vertices for $AM_m^l$.

Corollary 3. From equation (31) of the generalized Zagreb index, we derive the results for the different TIs.

Proof. Using Table 6 and equations (8), (9), and (5), respectively, we compute the desired result.
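The glycogen counts quoted above already determine the sizes of the degree-2 and degree-3 vertex classes through the handshake lemma, since the only degrees present are 1, 2, and 3. The sketch below solves the two linear equations $|V_1| + |V_2| + |V_3| = |V|$ and $|V_1| + 2|V_2| + 3|V_3| = 2|E|$. The function names are hypothetical, and the resulting values of $|V_2|$ and $|V_3|$ are derived here as a consistency check rather than quoted from the paper.

```python
def glycogen_counts(m, l):
    """Return (|V|, |E|, |V_1|) of the glycogen network as stated in the text."""
    vertices = 8 * m * (l + 10) - 8 * l - 4
    edges = 9 * m * (l + 10) - 9 * l - 5
    pendant = m * (9 + l) - l
    return vertices, edges, pendant


def solve_degree_classes(vertices, edges, pendant):
    """Solve for (|V_2|, |V_3|) when all degrees lie in {1, 2, 3}."""
    rest = vertices - pendant            # |V_2| + |V_3|
    degree_sum = 2 * edges - pendant     # 2|V_2| + 3|V_3|
    v3 = degree_sum - 2 * rest
    v2 = rest - v3
    return v2, v3


def cyclomatic_number(vertices, edges):
    """Number of independent cycles of a connected graph."""
    return edges - vertices + 1


vertices, edges, pendant = glycogen_counts(3, 5)
v2, v3 = solve_degree_classes(vertices, edges, pendant)
rings = cyclomatic_number(vertices, edges)
```

Under these formulas the cyclomatic number simplifies to $m(l + 10) - l$, which gives the number of rings in the planar graph for any admissible $m$ and $l$.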

In the next theorem, we compute the M-polynomial of the glycogen network $GL_m^l$, which will eventually be used to formulate closed-form formulas of certain TIs of our interest.
Although both natural and synthetic polymers are appropriate for drug delivery and, in general, for the pharmaceutical industry, natural polymers are more suitable, as they are nontoxic, biocompatible, free of side effects, and economical. As pointed out earlier, $M_1$, $M_2$, $R_{-1/2}$, and the Randić-type indices ($\chi_{-1/2}$, ABC, and AZI) assess the intensity of branching and connectivity in the molecular graph. Figures 8-11 indicate the following order among the indices for the same number of monomers (i.e., the same values of $m$ and $l$ in each polymer): $TI(GL_m^l) \le TI(AM_m^l) \le TI(CL_m^l)$, where $TI \in \{M_1, M_2, R_{-1/2}, \chi_{-1/2}, ABC, AZI\}$. This ordering is convincing because there is extensive cross-linking in cellulose, while glycogen (frequent branching) and amylopectin (less branching) are branched polymers. In [53], properties of the SDD index and ISI are investigated, and it turned out that the ISI and SDD index are reasonable predictors of total surface area for octane isomers and polychlorobiphenyls, respectively. Relying on the comparison presented in Figure 12, we conjecture that the surface areas (SAs) of cellulose, glycogen, and amylopectin are ordered as $SA(AM_m^l) \le SA(GL_m^l) \le SA(CL_m^l)$ for the same number of monomers. We observe, from Figures 8-11 and Figure 13, that all the graphs of TIs for cellulose behave like outliers compared with those for glycogen and amylopectin. We anticipate that this atypical behavior of cellulose is due to its role as a building material (it forms the plant cell wall) as well as its physical properties, such as the presence of the β-glucose monomer, insolubility, indigestibility, and considerable tensile strength. Moreover, all the graphs of TIs for glycogen and amylopectin run close together, which might be due to the presence of the α-glucose monomer, their solubility and digestibility, and their role as energy storage (bonds break easily) in animal organs and plants, respectively. Also, the results obtained in this section could further be applied in QSPR/QSAR analysis to predict the biological properties of the natural polymers under discussion.
For future research, the study of polysaccharides can be extended to the molecular graphs of other natural polymers. One may develop molecular graphs for other natural polymers such as proteins and polynucleotides (RNA and DNA) and give a mathematical formulation of the degree-based indices studied in this article. Finally, their physicochemical properties can be compared, theoretically and mathematically, using these indices.

Data Availability
All the data used to support the findings of this study are included within this article. However, the reader may contact the corresponding author for more details of the data.

Conflicts of Interest
The authors declare no conflicts of interest.