A Relook into the Flavonoid Chemical Space of Moringa oleifera Lam. Leaves through a Combination of LC-MS and Molecular Networking

Moringa oleifera Lam. is a functional tree that is known to produce a variety of metabolites with purported pharmacological activities. It is frequently called the “miracle tree” due to its use in numerous nutraceutical and pharmacological contexts. This study explored the chemical space of M. oleifera leaf extracts through molecular networking (MN), a tool that supports metabolite identification by grouping metabolites according to the similarity of their MS-based fragmentation patterns and signals. Special emphasis was placed on the flavonoid composition. The MN revealed different molecular families within the plant, such as flavonoids, carboxylic acids and derivatives, lignin glycosides, fatty acyls, and macrolactams. In silico annotation tools such as network annotation propagation (NAP) and DEREPLICATOR, an unsupervised substructure identification tool (MS2LDA), and MolNetEnhancer were also explored to complement the classic molecular networking output within the Global Natural Product Social (GNPS) site. Common flavonoids found within M. oleifera were further annotated using MS2LDA. These computational tools allowed for the discovery of a wide range of structurally diverse flavonoid molecules within M. oleifera leaf extracts. The expansion of the flavonoid chemical repertoire in this plant arises from intricate glycosylation modifications, which create structural isomers that manifest as isobaric ions during mass spectrometry (MS) analyses.


Introduction
Moringa oleifera Lam. has been reported to have a broad range of pharmacological activities such as antimicrobial, anti-inflammatory, hypotensive, antidepressant, antioxidant, antidiabetic, hypoglycemic, and immunomodulatory properties [1][2][3]. The chemical constituents of the stems, leaves, flowers, pods, and seeds of M. oleifera have been analyzed to determine the presence of bioactive compounds, and they were found to contain various secondary metabolites such as phenolic acids, sterols, terpenoids, flavonoids, alkaloids, and sugars, as well as anticancer agents such as glucosinolates, isothiocyanates, glycoside compounds, and glycerol-1-9-octadecanoate, which have nutritional, pharmaceutical, and antimicrobial properties [4][5][6][7][8]. However, studies on this plant have shown that the presence of the bioactive compounds depends on factors such as the geographical origin, the harvesting season, and cultivation conditions [9].
Metabolomics is a field of study that gives a systematic view of the unique chemical fingerprints of metabolites and their small changes in a specific cellular process [10]. A metabolomics study includes sample preparation, analytical measurement, data analyses, and interpretation [11][12][13]. Mass spectrometry (MS) and nuclear magnetic resonance (NMR) techniques are reported to be the analytical workhorses of metabolomics [14][15][16]. A molecular family (MF) is constructed by grouping structurally related molecules that generate similar fragmentation patterns. To do this on a larger scale, computational tools such as molecular networking (MN) have been developed [12][17][18][19]. MN is a popular tool in the analysis of tandem MS (MS/MS)-based metabolomics data. It is fundamentally based on the observation that two structurally related molecules share fragment ion patterns when subjected to MS/MS, and it aids the elucidation of the structure or identity of many compounds in untargeted MS data [20][21][22]. MN has led to the development of the Global Natural Product Social (GNPS), which is a web-based molecular networking and data-sharing platform [23,24].
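The idea that structurally related molecules share fragment ions can be made concrete with a spectral similarity score. The following is a minimal sketch of a plain cosine score between two binned MS/MS spectra. It is an illustration only, not the GNPS implementation (which uses a modified cosine and additional filters); the function names, the 1 Da bin width, and the peak intensities are assumptions introduced for this example.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into integer-indexed m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_score(peaks_a, peaks_b, bin_width=1.0):
    """Plain cosine similarity between two binned spectra (0 = no shared bins)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical spectra: the m/z values are typical flavonol fragments,
# the intensities are invented for illustration.
spec_a = [(300.0, 100.0), (271.0, 30.0), (255.0, 20.0)]
spec_b = [(300.0, 90.0), (271.0, 25.0), (179.0, 10.0)]
score = cosine_score(spec_a, spec_b)
print(f"cosine score: {score:.3f}")
```

Two spectra dominated by the same fragment ions score close to 1, whereas spectra with no fragments in common score 0.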
GNPS is widely used by scientists in the fields of chemistry, microbiology, forensics, and many more to classify samples with the objective of identifying their contents. GNPS facilitates data storage, enables knowledge sharing, and promotes reproducible data analysis [21]. GNPS can be used for molecular networking and is reported to be the only public infrastructure that currently enables molecular networking [23]. The related molecules depicted in the MN can be viewed online at GNPS or in Cytoscape for analysis [13,25]. Other tools in GNPS include network annotation propagation (NAP), DEREPLICATOR, and an unsupervised substructure identification tool called MS2LDA, all of which are meant to strengthen metabolite identification through MN. These tools complement the classic MN output, and their results are integrated using MolNetEnhancer within GNPS [26].
In this study, the chemical space of M. oleifera was studied through computational tools within GNPS. Molecular networking was used to reveal the molecular families of this plant, and the unsupervised substructure annotation tool (MS2LDA) was used to annotate the Mass2Motifs of some of the flavonoids that are found within M. oleifera by depicting similar fragmentations and neutral losses.

Chemicals and Reagents.
Methanol (99% CP) was purchased from Associated Chemical Enterprises (Johannesburg, South Africa). Ultrapure water from a Direct-Q 5UV distiller (Massachusetts, the United States of America) was used for the preparation of the 80% methanol solution. The extraction was performed on a DIAB MX-RL-Pro dragon shaker. Chromatographic separation of the metabolites in the extracts was done using a reverse phase Shim-pack Velox C18, 2.1 × 100 mm, 2.7 μm (Columbia, USA). The UPLC was connected to a Shimadzu 9030 LC, qTOF-MS detector (Shimadzu, Kyoto). The solvents used for the chromatographic runs were methanol and formic acid, which were purchased from ROMIL Pure Chemistry (Cambridge, UK).

Plant Collection and Sampling.
Leaves were collected from cultivated M. oleifera plants in multiple households across various villages within the Vhembe District of the Limpopo Province of South Africa. After being harvested, these leaves were kept in darkness while being transported to the University of Venda. Subsequently, the leaves were air-dried in the absence of light at room temperature and then finely ground into a powder using a blender. This powdered form was stored in a dark environment until the metabolite extraction process.

Preparation of the Extract.
A modified version of a previously described extraction method [27] was utilized. In summary, 1 gram of ground leaf powder from each cultivar was mixed with 10 mL of 80% aqueous methanol (MeOH) and shaken overnight using a dragon shaker. The resulting mixture was then centrifuged at 5000 × g for 20 minutes at 25 °C. The supernatant was transferred into an Eppendorf tube, filtered through 0.22 µm filters into a vial, and subjected to UPLC-qTOF-MS analysis. Any remaining supernatant was stored in a refrigerator.

Ultrahigh Performance Liquid Chromatography-Quadrupole Time-of-Flight Mass Spectrometry (UHPLC-qTOF-MS).
To analyze the extracts, the LCMS-9030 qTOF instrument from Shimadzu Corporation in Kyoto, Japan, was employed, following the method outlined by Ramabulana et al. in 2021 [26]. Liquid chromatography separation took place on a Shim-pack Velox C18 column (100 mm × 2.1 mm, particle size 2.7 µm) housed in a column oven maintained at 55 °C. A binary mobile phase gradient consisting of solvent A (0.1% formic acid in Milli-Q water) and solvent B (methanol with 0.1% formic acid) was used. An injection volume of 3 µL was applied to all samples. The gradient conditions were as follows: 10% B for 3 minutes, 10-60% B over 3-40 minutes, 60% B from 40 to 43 minutes, and 90% B from 43 to 45 minutes (maintained for 3 minutes), returning to initial conditions from 48 to 50 minutes, followed by a 3-minute column reequilibration time. The chromatographic effluents were analyzed using a qTOF high-definition mass spectrometer in negative electrospray ionization mode. The instrument was calibrated with sodium iodide (NaI), and both MS1 and MS2 data were simultaneously generated through a data-dependent acquisition (DDA) mode for all ions within an m/z range of 100-1000 and an intensity threshold of 5000. Fragmentation experiments were conducted using argon as a collision gas, with a collision energy of 30 eV and a spread of 5 eV. The MS settings were as follows: interface voltage of −4.0 kV, interface temperature of 300 °C, nebulization and dry gas flow rate of 3 L/min, heat block temperature of 400

Results and Discussion
MS/MS spectra of six methanolic extracts from the M. oleifera cultivars were compared to find similarities in the fragmentation patterns (i.e., same fragment ions or similar neutral losses) of the metabolites. Metabolites that are structurally related and have similar gas phase chemistries were grouped into molecular families based on a cosine score ≥0.7 [26]. Using molecular networking, the MS/MS spectra were organized into 565 nodes, with 338 clustered into 38 different molecular families (with a minimum of two nodes connected by an edge) based on GNPS spectral matching. A total of 227 nodes were not clustered into a molecular family and were represented as individual nodes at the bottom of the network (Figure 1). Previous studies have shown the presence of structurally diverse flavonoid molecules in the plant extracts. However, most of the work conducted in those studies relied on classical means of chemical identification, in which the obtained mass spectrometry (MS) signals were compared with what is already known in the literature. This approach is limited by the scarcity of information on uncharacterized metabolites. Molecular networking is a computational method aimed at metabolite identification by classifying metabolites based on their MS-based fragmentation pattern similarities and signals.
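The grouping step described above can be sketched as finding connected components in a graph whose edges are spectrum pairs with a cosine score of at least 0.7. The code below is a simplified illustration using a union-find structure; the function name and the toy scores are hypothetical, and GNPS applies further parameters (for example, a minimum number of matched peaks) that are not modelled here. Note that the reported counts are internally consistent: 338 clustered nodes plus 227 unclustered nodes give the 565 nodes of the network.

```python
def molecular_families(n_nodes, scored_pairs, min_cosine=0.7):
    """Group nodes into families (connected components with >= 2 nodes).

    scored_pairs is an iterable of (node_i, node_j, cosine_score) tuples.
    Returns (families, singletons).
    """
    parent = list(range(n_nodes))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, score in scored_pairs:
        if score >= min_cosine:
            parent[find(i)] = find(j)  # an edge merges the two components

    groups = {}
    for node in range(n_nodes):
        groups.setdefault(find(node), []).append(node)

    families = [members for members in groups.values() if len(members) >= 2]
    singletons = [members[0] for members in groups.values() if len(members) == 1]
    return families, singletons

# Toy example with invented scores: nodes 0-2 form one family,
# nodes 3-5 stay unclustered because 0.5 is below the threshold.
pairs = [(0, 1, 0.9), (1, 2, 0.75), (3, 4, 0.5)]
families, singletons = molecular_families(6, pairs)
print(families, singletons)
```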

Exploration of the Chemical Space of Moringa oleifera.
Moringa oleifera is well known for its nutraceutical and pharmacological metabolic profiles, which are characterized by the presence of flavonoids, glucosinolates, and chlorogenic acids. In this study, the metabolic profile of M. oleifera was studied with the help of molecular networking from the GNPS website. MolNetEnhancer (Figure 2) represents the metabolomes of this plant that were observed in this study. The node annotations of MolNetEnhancer were based on MS2LDA, network annotation propagation (NAP), and DEREPLICATOR outputs. It was observed that this plant contains 16 different classes of metabolites, including carboxylic acids and derivatives, fatty acyls, flavonoids, glycerophospholipids, lignin glycosides, macrolactams, macrolides, naphthacenes, organooxygen compounds, prenol lipids, purine nucleotides, and tetrapyrroles and derivatives. The authors of [29] reported the presence of hydroxyl fatty acid, phenolic acid, flavonoid, intact glucosinolate, sulfolipid, and phenolic acid derivative metabolite classes.
Flavonoids have been reported to be the predominant group of metabolites in M. oleifera leaf extracts, with kaempferol and quercetin derivatives being the most predominant [30]. Flavonoids are naturally occurring polyphenols that accumulate in the edible parts of plants, particularly in fruits and vegetables [31]. Flavonoids can further be subdivided into flavones, flavanols, flavanones, flavonols, flavanonols, isoflavones, and anthocyanins. In this study, much attention was given to flavones and flavonols. Flavones and flavonols have antioxidant effects and are essential for protecting plants from UV radiation [32]. Quercetin and kaempferol (Figure 1(a)), among others, are abundant dietary flavonols found in fruits and vegetables. Flavonols have various health benefits, which include cardiovascular and antioxidant properties. Luteolin and apigenin (Figure 1(b)) are the main flavones that are found in fruits and vegetables and have a wide range of biological effects such as anticancer, antioxidant, and anti-inflammatory properties [33][34][35].
In this study, a total of 52 flavonoids were detected. Kaempferol derivatives are known to have a major fragment ion at m/z 285, and quercetin derivatives are known to have a major fragment ion at m/z 301, both indicating the aglycone moiety. Another common flavonoid in M. oleifera leaves is isorhamnetin, and derivatives of this flavonoid have a major fragment ion at m/z 314, again indicating the aglycone moiety. The detailed mass information of selected flavonoids that were annotated in this study is shown in Table 1.
Various other tools available in GNPS complement molecular networking. These include in silico metabolite annotation tools such as network annotation propagation (NAP) and dereplication, which perform in silico fragmentation of known structures and then search against chemical databases. Within GNPS, there is another valuable resource known as mass spectrometry-mass spectrometry latent Dirichlet allocation (MS2LDA). MS2LDA is an unsupervised computational technique that reveals inherent substructures within compounds by automatically detecting patterns in complex mass spectrometry (MS) data. This capability allows for the identification of shared substructures or fragmentation patterns among compounds.
Journal of Analytical Methods in Chemistry
MS2LDA decomposes each molecule into one or more Mass2Motifs, which allow for more efficient molecular grouping, searching, and exploration [36]. Mass2Motifs consist of similar fragments and neutral losses [37,38]. The structural annotation of the Mass2Motifs is straightforward and less complex because Mass2Motifs represent smaller substructures [39]. Figure 3 represents the metabolite annotation, using MolNetEnhancer and MS2LDA, of flavonoids found in M. oleifera leaves. The colored parts are representative of the flavonoids that make up the Mass2Motifs. Quercetin, kaempferol, and isorhamnetin are the major flavonols that are represented in Figure 3. Some of these flavonols share the same Mass2Motif owing to their similar fragments and neutral losses. For example, the flavonoids with precursor ions [M-H]− at m/z 533.088 and at m/z 592.785 share the same Mass2Motif because they produce similar fragments from the same aglycone structure.
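To illustrate why two different precursors can share a Mass2Motif, the sketch below converts a spectrum into the kind of fragment and neutral-loss features on which a Mass2Motif is built and then intersects the feature sets of two spectra. This is only the feature representation; MS2LDA itself fits a latent Dirichlet allocation model over such features, which is not reproduced here. The function name and the integer rounding are assumptions, and each spectrum is reduced to the single m/z 285 kaempferol aglycone fragment for brevity.

```python
def spectrum_to_words(precursor_mz, fragment_mzs, decimals=0):
    """Turn a spectrum into a set of 'fragment_X' and 'loss_Y' features."""
    words = set()
    for mz in fragment_mzs:
        words.add(f"fragment_{round(mz, decimals):.{decimals}f}")
        loss = precursor_mz - mz
        if loss > 0:
            # Neutral loss = mass difference between precursor and fragment.
            words.add(f"loss_{round(loss, decimals):.{decimals}f}")
    return words

# Precursor m/z values taken from the text; fragment list simplified
# to the kaempferol aglycone ion only (an assumption for illustration).
words_533 = spectrum_to_words(533.088, [285.043])
words_593 = spectrum_to_words(592.785, [285.043])
shared = words_533 & words_593
print(shared)
```

The two precursors lose different conjugates, so their neutral-loss features differ, but both retain the aglycone fragment feature that a shared motif would capture.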

Quercetin Flavonoids.
Quercetin is a flavonoid that is abundantly found in fruits and vegetables and can be used as a nutritional supplement. This compound has been reported to prevent diseases such as tumors, lung and cardiovascular diseases, and some forms of cancer [40][41][42]. Figure 4 shows the fragmentation spectra of four quercetin-related flavonoids as annotated by the rhamn_motif_86.m2m and motif_447 Mass2Motifs in the MS2LDA approach. Rhamn_motif_86.m2m (a quercetin-related motif) indicated the presence of a quercetin aglycone with diagnostic fragments at m/z 301, 300, 255, and 179 and a neutral loss of 106 amu. Motif_447 also indicated the presence of a quercetin aglycone with fragments at m/z 301, 300, 271, and 255 and a neutral loss of 44 amu. Quercetin flavonoids are characterized by a deprotonated quercetin aglycone fragment at m/z 300/301, and other characteristic product ions at m/z 271, 255, 179, and 151 further confirm the identity of the quercetin aglycone [43]. Compound 1 gave a precursor ion [M-H]− at m/z 609.197, and a fragment ion at m/z 300.028, due to the loss of the rhamnose and glucose sugars, was seen as the base peak. Therefore, this compound was identified as quercetin rutinoside [44]. Compound 2 gave a precursor ion [M-H]− at m/z 505.098 and showed a fragment ion at m/z 445.078 due to the loss of the acetyl moiety (60 amu) and a further loss of the hexosyl moiety (162 amu) resulting in the fragment at m/z 300. This compound was thus identified as quercetin acetyl hexose [45]. Compound 3, which was identified as quercetin malonyl hexose, gave a precursor ion [M-H]− at m/z 549.089, showing a fragment at m/z 505 due to the loss of an acetyl (44 amu) and another fragment at m/z 463 due to the loss of the malonyl moiety (86 amu). A further loss of the hexosyl moiety (162 amu) led to the fragment ion at m/z 300 [46]. Compounds 2 and 3 share the same Mass2Motif due to the similar neutral losses that result from the loss of the hexose moiety. Compound 4 gave a precursor ion [M-H]− at m/z 463.087 and a fragment ion at m/z 300.028 due to the loss of hexose. This compound was identified as quercetin hexose [47].
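The neutral-loss reasoning used for these assignments can be checked with simple arithmetic. The sketch below subtracts a fragment m/z from its precursor m/z and compares the difference with a few nominal losses. The hexose (162) and malonyl (86) values are those quoted in the text; the rutinoside value of 308 (rhamnose 146 plus glucose 162) is added here as an assumption. The function name and the 1.5 Da tolerance are also assumptions, the latter chosen because the aglycone is observed at m/z 300 as well as 301.

```python
# Nominal neutral losses in amu.
NOMINAL_LOSSES = {
    "hexose": 162,
    "malonyl": 86,
    "rutinoside": 308,  # assumption: rhamnose (146) + glucose (162)
}

def annotate_loss(precursor_mz, fragment_mz, tolerance=1.5):
    """Return the observed mass difference and the nominal losses it matches."""
    observed = precursor_mz - fragment_mz
    matches = [
        name
        for name, mass in NOMINAL_LOSSES.items()
        if abs(observed - mass) <= tolerance
    ]
    return observed, matches

# m/z values as reported in the text for compounds 4, 1, and 3.
for label, precursor, fragment in [
    ("compound 4", 463.087, 300.028),
    ("compound 1", 609.197, 300.028),
    ("compound 3", 549.089, 463.0),
]:
    observed, matches = annotate_loss(precursor, fragment)
    print(f"{label}: loss of {observed:.3f} amu -> {matches}")
```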

Kaempferol Flavonoids.
Kaempferol is a flavonoid that is found in various plant parts such as seeds, leaves, fruits, flowers, and even vegetables. It has been referred to as a nutraceutical, owing to its medicinal and nutritional benefits [48]. For instance, kaempferol and its glycosides have been reported to have cardioprotective, neuroprotective, anti-inflammatory, antioxidant, and anticancer activities [49,50]. Figure 5 shows the fragmentation ...
... [61][62][63]. Chrysin is also a natural flavonoid that is found in many plants and bee products. This flavonoid has been reported to have a variety of biological properties such as anti-inflammatory, antioxidant, anticancer, antibacterial, antidiabetic, and neuroprotective effects [64][65][66]. Luteolin is a flavonoid that is found in medicinal plants, fruits, and vegetables. Plants that are rich in this flavonoid are often used for the treatment of various diseases such as inflammatory disorders, hypertension, and cancer [67,68].

Glycoisomerization of Flavonoids.
Moringa oleifera has been reported to use varied glycosylation patterns to diversify its flavonoids, attaching different types of sugars to its flavonoid aglycones [73]. For example, quercetin is observed with different types of sugars attached to its aglycone structure, as shown in Figure 3. Furthermore, the glycosylated flavonoids can undergo further chemical modification such as isomerization, acetylation, malonylation, and acylation. These modifications, however, bring about an analytical challenge because the resulting isomers are identified as structural artefacts. Some of the flavonoids undergo glycosylation through disaccharide sugar attachments [74]. Coelution of different flavonoids is often encountered in LC, which makes it difficult to characterize the flavonoid composition. However, MS offers high sensitivity, and the use of multiple reaction monitoring (MRM) helps to improve the selectivity for the flavonoids [75].
Compounds that have the same molecular formula but different chemical arrangements are considered to be isomeric. For example, the compounds kaempferol acetyl hexose (m/z 489), quercetin malonyl hexose (m/z 549), and isorhamnetin hydroxymethylglutaroyl hexose (m/z 621), with molecular formulae C23H22O11, C23H22O11, and C28H28O11, respectively, are considered to be isomeric (Table 2). These isomers have a similar molecular formula and molecular mass and are also observed to have similar fragmentation patterns. However, the chemical arrangement of these compounds differs, which could be due to a slight shift in the position of the glycosidic bond between the organic acid and the sugar that is conjugated to the aglycone structure, as suggested by the authors in [26]. It still remains a challenge to distinguish these molecules, and there is therefore a need to develop advanced analytical techniques that can distinguish between molecules of this nature. Isobaric molecules were also observed in this study. Isobaric molecules are molecules with the same mass but a different compound composition. In this study, isobaric flavonoids were observed at precursor ion masses of m/z 609 and 447, with molecular formulae C27H30O16 and C21H20O11, respectively. However, the compound composition differs, which makes these compounds isobaric. The flavonoids with molecular formula C27H30O16 and precursor ion mass m/z 609 were identified as quercetin rutinoside and kaempferol diglucoside, and those with molecular formula C21H20O11 and precursor ion mass m/z 447 were identified as kaempferol hexose and luteolin-8-C-hexose (orientin). These compounds were difficult to tell apart using only an LC-MS spectrum. However, with the untargeted LC-MS/MS approach for metabolite profiling, the differences in the fragmentation spectra were useful in the identification of these flavonoids, and it was thus easy to distinguish them, as can be seen in Table 3 [76].
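The reason such compounds cannot be separated by precursor mass alone follows directly from their shared molecular formulae. As a minimal sketch, the code below computes the monoisotopic m/z of the deprotonated ion [M-H]− from a formula containing only C, H, and O. The function name is hypothetical; the atomic masses and the proton mass are standard values, rounded to eight decimal places.

```python
import re

# Monoisotopic atomic masses (u) and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646

def deprotonated_mz(formula):
    """Monoisotopic m/z of the singly charged [M-H]- ion for a CHO formula."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass - PROTON

for formula in ("C27H30O16", "C21H20O11"):
    print(formula, round(deprotonated_mz(formula), 3))
```

Any two compounds with the formula C27H30O16, such as quercetin rutinoside and kaempferol diglucoside, therefore give the same calculated precursor m/z, and only their MS/MS fragments (for example, the aglycone ion at m/z 300/301 versus 284/285) tell them apart.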

Conclusions
The use of computational tools such as molecular networking highlighted the different molecular families that are found within M. oleifera and thus brought insight into the chemical space of the plant. Unsupervised substructure annotation (MS2LDA) was useful in the annotation of Mass2Motifs of some of the flavonoids found within M. oleifera. An enhanced molecular network unraveled the different chemical classes found in this plant and thus revealed the metabolome of M. oleifera. Seventeen flavonoids (flavonols and flavones) were successfully annotated by MS2LDA in this study, confirming what has been previously reported in the literature. MS2LDA was also useful in the annotation of chrysin-6,8-C-diglucoside, which is reported in M. oleifera leaves for the first time through this study.
In the existing literature, it has been documented that flavonoids in M. oleifera undergo glycosylation with various sugars as a mechanism to expand their chemical diversity. This glycosylation process has led to the detection of isomeric and isobaric flavonoids in the current study. The untargeted LC-MS/MS approach, in combination with computational metabolomics tools such as molecular networking, proved valuable in identifying isobaric molecules on the basis of their distinct fragmentation patterns. However, a challenge persists when it comes to identifying isomeric flavonoids, primarily because traditional MS techniques struggle to differentiate them effectively. Consequently, the future application of alternative MS analyzers, such as orbitraps and ion mobility, will become essential in addressing this challenge, especially when combined with other computational metabolomics tools such as feature-based molecular networking.

Figure 1: Molecular network of Moringa oleifera Lam. leaf extracts as analyzed by liquid chromatography-tandem mass spectrometry using electrospray ionization in negative mode (center), with two different kinds of flavonoids highlighted: (a) flavonols and (b) flavones.

Figure 2: An enhanced molecular network in which nodes are highlighted based on their chemical superclass based on MS2LDA, network annotation propagation (NAP), and DEREPLICATOR outputs.
... kaempferol with diagnostic fragments at m/z 285, 284, and 255 and a neutral loss of 68 amu. Motif_551 was characterized by diagnostic fragments at m/z 283 and 110 and neutral losses of 162, 167, 182, 193, and 194 amu. Kaempferol flavonoids are characterized by a deprotonated kaempferol aglycone fragment at m/z 284/285, and other characteristic product ions at m/z 255 and 227 further confirm the identification of the kaempferol aglycone [43]. Compound 5 gave an [M-H]− ion at m/z 533.093, while its MS/MS fragmentation gave a base peak at m/z 285.043 due to the loss of the malonyl hexose moiety, and was thus identified as kaempferol malonyl hexose [51]. Compound 6 gave an [M-H]− ion at m/z 592.785, while its MS/MS fragmentation gave a base peak at m/z 285.043 due to the loss of the rutinoside sugar, and was identified as kaempferol rutinoside [52]. Compound 7 was identified as kaempferol diglucoside, with a precursor ion [M-H]− at m/z 609.146 and a fragmentation peak at m/z 285.043. This compound also has fragments at m/z 446.089 and 447.098 due to the loss of the two hexose moieties (162 + 162 amu) [53]. Compound 8, which was identified as kaempferol hexose, has a precursor ion [M-H]− at m/z 447.093 with a fragment ion at m/z 284.033, which is due to the loss of the hexose sugar (162 amu) [54]. Compound 9 gave a precursor ion [M-H]− at m/z 489.114 with a fragment ion at m/z 284.033 due to the loss of an acetyl hexose moiety. This compound was thus identified as kaempferol acetyl hexose [30].
Compounds 5 and 6 were annotated by motif_622, compound 7 was annotated by motif_551, and compounds 8 and 9 were annotated by rhamn_motif_130.m2m, as shown in Figure 3.