Gender Discrimination of Flower Buds of Mature Populus tomentosa by HPLC Fingerprint Combined with Chemometrics

A high-performance liquid chromatography-diode array detector (HPLC-DAD) method was used to establish the HPLC fingerprint. Chemometric methods were used to discriminate the gender of flower buds of Populus tomentosa based on the areas of the common peaks calibrated in the HPLC fingerprint. The score plot of principal component analysis (PCA) showed a clear grouping trend (R2X, 0.753; Q2, 0.564) between female and male samples. The two groups were also well discriminated by orthogonal partial least squares-discriminant analysis (OPLS-DA) (R2X, 0.741; R2Y, 0.980; Q2, 0.970). The hierarchical clustering analysis (HCA) heatmap likewise separated all samples into two groups. Four compounds were screened out by S-plot and variable importance in projection (VIP > 1.0), two of which were identified as siebolside B and tremulacin. This study demonstrated that HPLC fingerprints combined with chemometrics can be applied to discriminate the gender of dioecious plants and to screen for differential compounds, providing a reference for identifying the gender of dioecious plants.


Introduction
Populus tomentosa Carrière (Fam. Salicaceae) is a deciduous tree that is widely planted in North China, for example in Beijing. It is a common afforestation species that helps prevent wind and sand damage in the north. The leaves, bark, and male inflorescences are used in traditional Chinese medicine (TCM). The dried male inflorescences of P. tomentosa, the plant source of the TCM Flos populi, are recorded in the People's Republic of China Pharmacopoeia and are used to treat acute colitis and bacillary dysentery through the effects of reducing dampness and stopping dysentery [1]. The chemical constituents of P. tomentosa are mainly flavonoids, sterols, organic acids, phenols, and glycosides [2][3][4], which have anti-inflammatory, analgesic, antidiarrheal, antibacterial, and antioxidant effects [5][6][7][8]. The bark of mature P. tomentosa has the efficacy of clearing away heat and expelling dampness and is mainly used for the treatment of dysentery, leucorrhea, acute hepatitis, bronchitis, pneumonia, roundworms, and habitual constipation [9]. Flos populi is also used as a veterinary medicine for the treatment of dysentery and diarrhea in cattle, sheep, and pigs [10][11][12]. In addition, fresh male inflorescences are used as food in China [13].
P. tomentosa is a dioecious plant, that is, a seed plant with unisexual flowers in which the female and male flowers grow on different individuals [14,15]. In the last several years, many studies of dioecious plants have focused mainly on physiological and biochemical indicators [16,17] and molecular biotechnology [18,19]. However, few studies have compared female and male dioecious plants from the perspective of chemical constituents and their content [20].
There are some reports about the chemical constituents of P. tomentosa. Several compounds, such as siebolside B, sakuranetin, isograndidentatin A, and sakuranin, have been isolated from the male bark of P. tomentosa and identified [3]. An HPLC fingerprint combined with chemometrics was used to study the differences in the chemical constituents and their content between the male and female bark of mature P. tomentosa, which showed that the content of four compounds differed: siebolside B, sakuranin, isograndidentatin A, and micranthoside. The content of micranthoside in the male samples was lower than that in the female samples, whereas the content of sakuranin, siebolside B, and isograndidentatin A was higher in the male bark than in the female bark [21]. The differences in the volatile components of mature P. tomentosa flower buds were studied by HS-SPME-GC-MS, which showed that the content of four compounds differed: benzyl benzoate, 2-cyclohexen-1-ketone, methyl benzoate, and benzoate. The content of benzyl benzoate, 2-cyclohexen-1-ketone, and methyl benzoate in the male samples was significantly lower than that in the female samples. The concentration of methyl benzoate in the male flower buds was remarkably lower than that in the female flower buds [22]. Because the male inflorescences of P. tomentosa are used as TCM and food, it is of great significance to discriminate the gender of P. tomentosa. The flower buds are the reproductive organs of P. tomentosa, so, unlike in bark, differences in their chemical constituents should better reflect the correlation with gender. Male and female trees of P. tomentosa cannot be distinguished by their appearance except by their fruits and mature flowers.
The relative expression of a full-length cDNA denominated PtLFY was found by RT-PCR to be remarkably lower in females than in males [23]. However, no report has so far described differences in the nonvolatile components of the flower buds of P. tomentosa.
In this study, we collected 11 female and 11 male flower buds of mature P. tomentosa and established the HPLC fingerprint. Differential compounds between female and male flower buds were screened and identified with chemometrics and UPLC-Q-TOF/MS, respectively.

Sample Preparation and Standard Solutions.
All flower buds were dried in the shade, crushed into powder with a grinder, and passed through a 24-mesh sieve. A 2.0 g portion of the powder was weighed precisely into a 50 mL conical flask, and 25 mL of 70% methanol was added. The flask was weighed, refluxed at 80°C for 30 min, and weighed again after cooling. The weight lost during refluxing was made up with 70% methanol. Three standards were weighed appropriately and dissolved in 70% methanol. The sample and standard solutions were filtered through a 0.45 μm membrane filter.

Mass Spectrometry Conditions.
An Acquity UPLC BEH C18 column (2.1 mm × 100 mm, 1.7 μm) was used for ultra-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC-Q-TOF-MS). The mobile phase consisted of methanol (A) and 0.1% formic acid in water (B). The UPLC elution program was as follows: 0-5.6 min: 10%-35% A; 5.6-13.7 min: 35%-58% A; 13.7-18 min: 58%-80% A; 18-21 min: 80%-100% A; 21-25 min: 10% A. The column temperature was kept at 35°C. The flow rate of the mobile phase was 0.3 mL/min. The detection wavelength was set at 287 nm. The injection volume was 2 μL. UPLC-Q-TOF-MS was operated in the positive and negative modes with a scanning range of m/z 50-1200. The capillary voltage and cone voltage were 3.0 kV and 30.0 kV, respectively. The ion source temperature was set at 120°C. The desolvation temperature was set at 450°C. The scan time of the data collection was 0.2 s. The flow rates of the cone gas and desolvation gas (N2) were set at 50 L/h and 1000 L/h, respectively.
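The elution program above can be written down as a small lookup, which makes it easy to check the mobile-phase composition at any time point. The following is only a sketch: the segment boundaries are taken from the text, but the linear ramp within each segment and the function name `percent_methanol` are assumptions made for illustration.

```python
# Gradient segments copied from the text: (start_min, end_min, %A_start, %A_end).
# Linear ramps within each segment are an assumption of this sketch.
GRADIENT = [
    (0.0, 5.6, 10.0, 35.0),
    (5.6, 13.7, 35.0, 58.0),
    (13.7, 18.0, 58.0, 80.0),
    (18.0, 21.0, 80.0, 100.0),
    (21.0, 25.0, 10.0, 10.0),  # return to 10% A for the final 4 min
]


def percent_methanol(t_min):
    """Return % methanol (A) at time t_min; 0.1% formic acid in water (B) is 100 - A."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][1]:
        raise ValueError("time outside the 0-25 min program")
    for start, end, a_start, a_end in GRADIENT:
        # At a shared boundary the earlier segment wins (e.g. 21.0 min -> 100% A).
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return a_start + fraction * (a_end - a_start)
```

Under these assumptions, `percent_methanol(2.8)` lies halfway along the first ramp, and `100 - percent_methanol(t)` gives the aqueous fraction at the same time.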
2.6. Validation of the HPLC Method. All sample solutions were prepared according to Section 2.3 and analyzed in the chromatography system. The precision of the chromatographic method was determined by injecting the same sample solution six times in a single day. The stability was determined by injecting the same sample solution after 0, 3, 6, 9, 16, and 24 hours. The repeatability was determined by analyzing six replicates prepared from the same sample. The relative standard deviation (RSD, %) values of the relative retention time (RRT) and the relative peak area (RPA) of each common peak were calculated to estimate precision, stability, and repeatability.
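The RSD calculation described above can be sketched as follows. The sketch assumes that RRT and RPA are obtained by dividing each peak's retention time or area by that of a reference peak in the same injection, and that the RSD uses the sample standard deviation; the function names and the layout of `runs` are hypothetical.

```python
from statistics import mean, stdev


def relative_to_reference(values, ref_index):
    """Divide every peak value in one injection by the reference peak's value."""
    ref = values[ref_index]
    return [v / ref for v in values]


def rsd_percent(values):
    """Relative standard deviation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def rsd_per_peak(runs, ref_index):
    """runs: list of injections, each a list of per-peak retention times or areas.

    Returns one RSD (%) per peak, computed on the relative values.
    """
    relative = [relative_to_reference(run, ref_index) for run in runs]
    return [rsd_percent(list(column)) for column in zip(*relative)]


def passes_validation(rsds, limit=3.0):
    """True if every per-peak RSD is below the limit (3.0% is the threshold quoted in the text)."""
    return all(r < limit for r in rsds)
```

The same function serves for precision, stability, and repeatability; only the set of injections passed in as `runs` changes.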

Statistical Analysis.
The raw chromatographic data files of the 22 flower buds were converted into AIA format files. These files were then imported into the software "Similarity Evaluation System for Chromatographic Fingerprint of TCM" (Version 2004A, Committee for the Pharmacopoeia of PR China). One of the samples was randomly chosen as the reference chromatogram. The control fingerprint, which represents the characteristics of the samples, was generated by the average method with multipoint correction and automatic matching. The similarity of each sample's HPLC fingerprint was then calculated against this control fingerprint.
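As a rough illustration of the similarity evaluation, the sketch below builds a control fingerprint as the mean of the peak-area vectors and scores each sample against it. It assumes that similarity is the cosine of the angle between peak-area vectors, which is one common choice for fingerprint comparison; the algorithm actually implemented in the software cited above may differ, and the function names are hypothetical.

```python
from math import sqrt


def mean_fingerprint(fingerprints):
    """Average the peak areas peak by peak to form a control fingerprint."""
    return [sum(column) / len(column) for column in zip(*fingerprints)]


def cosine_similarity(a, b):
    """Cosine of the angle between two peak-area vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarities_to_control(fingerprints):
    """Similarity of every sample fingerprint to the mean (control) fingerprint."""
    control = mean_fingerprint(fingerprints)
    return [cosine_similarity(fp, control) for fp in fingerprints]
```

Because the cosine is dominated by the largest peaks, two groups that share the same major peaks can score above 0.9 against each other even when a few minor peaks differ systematically.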
The peak areas of all common peaks were used for the chemometric analysis.
Chemometrics employs both unsupervised and supervised models. Principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were performed using SIMCA-P version 14.1 (Umetrics) to discriminate between male flower buds (MFB) and female flower buds (FFB) of P. tomentosa. The screened compounds were analyzed with a hierarchical clustering analysis (HCA) heatmap generated by MetaboAnalyst (https://www.metaboanalyst.ca/MetaboAnalyst/home.xhtml).
First, unsupervised PCA was used to analyze the clustering of the samples and to yield a score plot. Subsequently, supervised OPLS-DA was used to assess the sample grouping and to obtain the differential compounds between male and female samples. A permutation test was then performed to check the OPLS-DA model for overfitting. The differential variables were selected by S-plot and variable importance for the projection (VIP) values (VIP > 1) [24][25][26]. Student's t-test was carried out with SPSS 25.0 to verify that the differential variables differed significantly between the groups. Additionally, an HCA heatmap was used to observe the clustering of the samples and to illustrate the content of the differential compounds in the different groups.
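The final screening step, VIP > 1 combined with a significant Student's t-test, can be sketched as below. This is an illustration only: the VIP values are taken as given (they would come from the OPLS-DA model), the t-test is the equal-variance two-sample form, and significance is judged against the two-tailed 5% critical value for 20 degrees of freedom (2.086), which matches two groups of 11 samples. All function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t value at alpha = 0.05 for 20 degrees of freedom (11 + 11 - 2).
T_CRIT_DF20 = 2.086


def student_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, df


def screen_peaks(vip, female, male, t_crit=T_CRIT_DF20):
    """Keep peaks with VIP > 1 whose female/male areas differ significantly.

    vip: {peak: VIP value}; female, male: {peak: list of peak areas}.
    t_crit must correspond to the actual degrees of freedom of the data.
    """
    selected = []
    for peak, value in vip.items():
        t, _ = student_t(female[peak], male[peak])
        if value > 1.0 and abs(t) > t_crit:
            selected.append(peak)
    return selected
```

Requiring both criteria guards against variables that carry a large model weight without a reliable difference between the groups, and against small but significant differences that contribute little to the separation.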
To identify the differential compounds, sample F4 was analyzed by UPLC-Q-TOF/MS (Waters, Milford, MA, USA). A total of 292 compounds reported in the literature on P. tomentosa and other species of the genus Populus were downloaded as structure files (.mol and .sdf) from ChemSpider (https://www.chemspider.com) and PubChem (https://pubchem.ncbi.nlm.nih.gov/). All files were integrated into a database with Progenesis SDF Studio. The MS data were imported into Progenesis QI and matched against the database. Each mass spectrum was then manually analyzed with Waters MassLynx V4.1 software to verify whether the compounds predicted by the software were correct.

Optimization of Sample Preparation.
In this study, the extraction solvents (methanol and a series of methanol/water concentrations) and the sample-to-solvent ratios (1:10, 1:12.5, 1:20, and 1:30 g/mL) were investigated. The results suggested that 70% methanol and a ratio of 1:12.5 g/mL performed better than the other conditions in terms of the number of chromatographic peaks and the peak areas. Therefore, they were selected for further experiments. Ultrasonic and reflux extractions were compared to obtain the best extraction efficiency, and reflux extraction proved better than ultrasonic extraction. The extraction time (30 min, 60 min, 90 min, and 120 min) was also tested and evaluated. The number of chromatographic peaks, the peak areas, and the extraction efficiency were considered together for the evaluation (Figure S1). Finally, the sample solutions were prepared by refluxing 2.0 g of the sample with 25 mL of 70% methanol for 30 min.

Evaluation of Validation of the HPLC Method.
The precision, stability, and repeatability were determined by HPLC-DAD and assessed by the RSDs of the RRT and RPA of the common peaks. All of the RSD results were below 3.0% (Table 1), indicating that the HPLC method was stable and feasible for fingerprint analyses.

[...] were calibrated. Peak 4 was chosen as the reference fingerprint peak for calculating RRT and RPA. Seventeen peaks were determined as the common peaks in the fingerprints of male and female flower buds of P. tomentosa. Peaks 4, 5, and 12 were identified as siebolside B, isograndidentatin A, and sakuranetin, respectively, based on the standards. The similarity of the 11 female and 11 male samples was 0.978-0.999 and 0.943-0.985 (Tables S1 and S2), respectively, which indicated that the HPLC fingerprints within the MFB and FFB groups of P. tomentosa were highly similar (similarity > 0.9) [27]. The similarities of all 22 samples are displayed in Table S3, which indicates that the male and female samples were difficult to classify by fingerprint similarity alone.

PCA and OPLS-DA.
To analyze the difference between FFB and MFB of P. tomentosa, the areas of the 17 common peaks of the 22 flower buds were imported into SIMCA-P version 14.1 to perform unsupervised PCA. The score plot and loading scatter plot of the PCA are shown in Figure 3. The first and second principal components (PCs) reflected 59.0% and 16.3% of the sample information, respectively. Together, the first two PCs accounted for 75.3% (R2X) of the total variance and could therefore be used in place of the 17 original variables to classify the samples. The Q2 was 0.564, which indicated that the PCA model had good predictive ability (Q2 ≥ 0.50) [28]. As shown in Figure 3(a), FFB (on the left) and MFB (on the right) each clustered into their own group. The distribution of the 17 peaks was relatively scattered (Figure 3(b)). Peaks 4, 9, 14, and 15 provided the main difference between female and male samples. These results demonstrated that there were differences in chemical constituents between MFB and FFB of P. tomentosa.
To screen out the differential variables between male and female samples, the areas of the 17 common peaks were imported into SIMCA-P version 14.1 to perform supervised OPLS-DA. The score plot of the OPLS-DA is shown in Figure 4(a). The R2X was 0.741, which demonstrated that 74.1% of the variance in the X matrix could be modeled by the chosen compounds. The R2Y and Q2 were 0.980 and 0.970, respectively. Both were close to 1, indicating that the OPLS-DA model was well fitted and had high predictive ability [29][30][31]. A permutation test was performed on the OPLS-DA model to further test whether it was reliable. The results of 200 permutations showed that the intercepts of R2 and Q2 on the Y axis were 0.134 and −0.582, respectively. As shown in Figure 4(b), the R2 and Q2 values on the left, which were generated by random permutations, were lower than the original values on the right, and the intercept of the Q2 regression line on the Y axis was below zero. Therefore, the model was not overfitted and was reliable [32,33]. The S-plot is shown in Figure 4 [...] The results of Student's t-test showed that there was a significant difference (p < 0.05) in these four peaks between FFB and MFB of P. tomentosa.
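The intercept criterion used in the permutation test can be illustrated with a short sketch. It assumes the usual layout of such a permutation plot: each permuted model contributes one point whose x value is the correlation between the permuted and the original class labels and whose y value is the resulting R2 or Q2, the original model sits at x = 1, and a least-squares line through all points is extended to x = 0. The input values would come from the modeling software; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_intercept_below_zero(correlations, q2_values):
    """True if the Q2 regression line crosses the Y axis below zero.

    correlations: correlation of each (permuted or original) label vector
    with the original labels; q2_values: the Q2 obtained for each of them.
    """
    _, intercept = fit_line(correlations, q2_values)
    return intercept < 0.0
```

A negative Q2 intercept means that models built on scrambled labels predict no better than chance, which is the basis for the conclusion that the original model is not overfitted.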
3.6. HCA Heatmap. The four screened compounds were plotted on the HCA heatmap (Figure 5) to give an intuitive overview of the differential compounds between the male and female samples. The samples were classified into two groups: the content of siebolside B and peak 14 (m/z 496.1318) was higher in the male samples, and the content of tremulacin and peak 15 (m/z 318.3023) was higher in the female samples.

According to the standards and the literature, combined with the polarity, 14 compounds were identified in Table 2 and Table S4 (Figures S3-S18) [21][34][35][36][37][38][39][40]. By comparing peak areas, polarity, and spectra, the peaks at retention times of 8.91, 13.76, 17.37, and 18.14 min were matched to peaks 4, 9, 14, and 15 of the HPLC fingerprint. As shown in Table 2, two of the four screened compounds were identified, namely siebolside B and tremulacin. These two components might be gender markers of P. tomentosa.
This provides a reference for further research on and utilization of P. tomentosa. Siebolside B has allergy-preventive activity [41]. Tremulacin has anti-inflammatory, antiviral, and antioxidant activities and can inhibit xanthine oxidase activity [41][42][43][44]. As shown in Figure 5, the content of siebolside B in MFB was higher than that in FFB. The content of siebolside B in the male bark of P. tomentosa was also higher than that in the female bark. The male inflorescences of P. tomentosa are the plant source of Flos populi [21]. Therefore, siebolside B may be an important bioactive compound in male P. tomentosa. Only a few studies on siebolside B have been published so far, so this finding offers guidance for subsequent research.
This can provide a reference for expanding the use of P. tomentosa flowers. The method of HPLC fingerprint combined with chemometrics is simple, fast, and accurate. It accurately distinguished the gender of flower buds of P. tomentosa in this study and enriched research on dioecious plants. In late spring and early summer, large numbers of poplar catkins of P. tomentosa are scattered in the air, causing symptoms such as cough and allergies and considerable inconvenience in people's lives [48,49]. This method can be applied to identify the gender before the flower buds of P. tomentosa mature. Poplar catkins can then be controlled by physical measures, such as manually trimming female flower buds. This will provide a reference for reducing poplar catkin pollution.

International Journal of Analytical Chemistry

Conclusions
In this study, gender discrimination of flower buds of mature P. tomentosa was achieved by using an HPLC fingerprint combined with chemometrics. The female and male flower buds of P. tomentosa were clearly discriminated by PCA, OPLS-DA, and the HCA heatmap. Furthermore, 15 compounds were identified, and 4 compounds whose content differed significantly between the female and male samples were successfully screened out by S-plot and VIP values. Therefore, this study provides a reference for the gender identification of dioecious plants and enriches research on dioecious plants.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Supplementary Materials
Table S1: similarity evaluation of 11 male flower buds (MFB) of P. tomentosa. Table S2: similarity evaluation of 11 female flower buds (FFB) of P. tomentosa. Table S3: similarity evaluation of 22 flower buds of P. tomentosa. Table S4: other compounds of the sample F4 identified by UPLC-Q-TOF/MS. Figure S1: optimization of the sample preparation: (A) optimization of the extraction solvents; (B) optimization of the sample-to-solvent ratios; (C) optimization of the extraction methods, where the upper picture depicts reflux extraction and the lower picture depicts ultrasonic extraction; and (D) optimization of the extraction time. Figure S2: optimization of the HPLC conditions: (A) optimization of the mobile phases, where the picture on the left depicts methanol-water as the mobile phase and the picture on the right depicts methanol-0.1% formic acid in water as the mobile phase; (B) optimization of the analytical columns: B1, Diamonsil C18 (1); B2, Diamonsil C18 (2); B3, Diamonsil Plus C18; B4, Agilent 5 TC-C18 (2); and B5, Agilent ZORBAX SB-C18. Figure