Phenotypic Diversity Assessment of Okra (Abelmoschus esculentus (L.) Moench) Genotypes in Ethiopia Using Multivariate Analysis

Okra is a minor crop that has received little research attention in Ethiopia. Characterization of such underutilized crops has important implications for their utilization. Thus, this study was conducted to assess the genetic diversity of okra genotypes in Ethiopia using agromorphological and biochemical markers. Thirty-six okra genotypes were evaluated for 29 agromorphological and biochemical traits. The analysis of variance showed significant differences among genotypes for the traits studied, except for the number of flower epicalyxes and fruit diameter. Principal component analysis indicated that the first eight principal components each accounted for 3.83 to 30.54% of the total variability and together explained 82.44% of it. Genetic distances estimated as Euclidean distances from 27 traits ranged from 3.55 to 14.49. The 36 genotypes were grouped into four distinct clusters from the Euclidean distance matrix using the unweighted pair group method with arithmetic mean (UPGMA). The first cluster contained 24 (66.66%) of the genotypes, and the second cluster contained 10 (27.77%) of the genotypes. This study showed the presence of considerable genetic variation among the genotypes for most of the traits, including fruit yield, seed yield, and nutrient content of seeds, indicating the possibility of using these genotypes to develop okra varieties with high fruit yield and good nutritional content.


Introduction
Today, the world's food supply is based on a small number of crop species, mostly major cereals (wheat, rice, and maize), leaving an abundance of genetic resources and potentially beneficial traits neglected [1]. In the face of climate change, utilizing the vast pool of minor and underutilized crop species would provide more varied agricultural systems and food sources and would help to ensure food and nutrition security. Underutilized crops play a significant role in ensuring food security, nutrition, and income generation for resource-poor farmers and consumers, especially in the developing world [2].
Climate change may increase the relevance of plant species that were previously underutilized or thought to be of minor importance [1]. One approach to maintaining a good fit between crops and the environmental challenges arising from climate change is to use underutilized (minor, orphan, or neglected) crops and their wild relatives. Okra is among the most underutilized crops cultivated in the southwestern and western parts of Ethiopia [3][4][5]. The characterization of okra genotypes existing in the country would contribute to developing varieties that could thrive in extreme climatic conditions and would allow further utilization of the crop for enhancing food security. The value of germplasm collections depends on their diversity, and crop improvement relies prominently on existing genetic variation [6,7]. Shujaat et al. [8] suggested that genetic variation is an important feature for achieving the diversified goals of plant breeding, including higher yield and quality, resistance to diseases, and wider adaptation. The pattern and level of genetic diversity in a given gene pool can be measured in terms of genetic distance, which is a measurement of average genetic divergence between genotypes or populations [9]. Regardless of the dataset (morphological, biochemical, or molecular marker data), multivariate analytical procedures that simultaneously consider several measurements on each individual under examination are frequently utilized in genetic diversity studies [10]. This study aimed to assess the phenotypic and biochemical diversity of Ethiopian landrace okra genotypes, along with exotic commercial varieties, using multivariate analysis, to support further utilization of the crop and to contribute to ensuring food security and alleviating malnutrition.

Description of the Study Site.
The field experiment was conducted at the Melkassa Agricultural Research Center, Ethiopia, during the 2018 main cropping season (rainy season). Melkassa is located at 8°24′59.20″ N latitude and 39°19′15.19″ E longitude, at an altitude of 1,548 m above sea level [11]. The biochemical contents were determined at the Ethiopian Biodiversity Institute Nutrition Laboratory (total ash and crude fat), the Debre Zeit Agricultural Research Center (crude fiber), and the Melkassa Agricultural Research Center (total protein).

Experimental Materials and Design.
Thirty-six okra genotypes were used in this study: 24 landrace accessions (collected by the Ethiopian Biodiversity Institute from different okra-growing regions of Ethiopia), three genotypes from the Humera Agricultural Research Center, and nine exotic commercial varieties (eight from India and one from the USA). The 36 genotypes were planted in a 6 × 6 simple lattice design. Three seeds per hill were sown and thinned to one plant per hill when the plants reached the 3-4 leaf stage.

Data Collection.
Data were collected for phenology traits (days to 50% emergence, days to first flowering, days to 50% flowering, and days to 90% maturity), growth and yield-related traits (plant height, stem diameter, number of primary branches per stem, number of internodes, internode length, leaf length, leaf width, number of flower epicalyxes, peduncle length, fruit length, fruit diameter, average fruit weight, number of tender fruits per plant, number of mature pods per plant, number of ridges on fruit, fruit yield per plant, fruit yield per hectare, number of seeds per pod, hundred seed weight, seed yield per plant, and seed yield per hectare), and biochemical content of the seed (total ash, total fat, crude fiber, and total protein). Data on phenology and growth-related traits were recorded according to the IPGRI [12] descriptor list developed for okra.

Total Ash.
Total ash was determined gravimetrically following the method of AOAC [13]. A crucible was cleaned, dried, ignited at 550°C for 1 hour, and weighed (m1). The flour sample (3 g) was weighed (m2) and dried at 120°C for 1 hour. Then, the dried sample was carbonized over a blue flame and ignited in a muffle furnace at 550°C until ashing was complete (over 12 hrs). After ignition, the sample was cooled to ambient temperature and weighed (m3). Finally, the total ash content was calculated as follows:
\[ \text{Total ash (\%)} = \frac{m_3 - m_1}{m_2 - m_1} \times 100, \]
where m1 is the mass of the crucible (g), m2 is the mass of the sample with crucible (g), and m3 is the final mass of the sample with crucible (g).

Crude Fat.
The crude fat content of okra seed was determined by the Soxhlet extraction method according to the AOAC [13]. The flour sample (3 g) was weighed and added into a thimble. The thimble with the sample was placed in a 50 ml beaker and dried in an oven for 2 hours at 110°C. A 150-250 ml dried beaker was weighed and rinsed several times with petroleum ether. The sample contained in the thimble was extracted with petroleum ether in a Soxhlet extraction apparatus for 6-8 hours. After extraction was completed, the extracted fat was transferred into a preweighed beaker (Mi). The beaker with the extracted fat was placed in a fume hood to evaporate the solvent on a steam bath until no odor of the solvent was detectable. Then, the beaker with its contents was removed, cooled in a desiccator, and weighed (Mf). The amount of fat in the flour was calculated by using the following formula:
\[ \text{Crude fat (\%)} = \frac{M_f - M_i}{M} \times 100, \]
where Mf is the dried mass of the fat with beaker (g), Mi is the mass of the beaker (g), and M is the sample mass (g).

Crude Fiber.
The crude fiber was determined according to the AOAC [13]. A ground sample (3 g) was weighed (m1), placed in a 500 ml beaker, digested with 1.25% sulfuric acid, washed with water, further digested with 1.25% sodium hydroxide, and filtered through a coarse porous (75 μm) crucible in an apparatus at a vacuum of about 25 mm. The residue left after refluxing was washed again with 1.25% sulfuric acid at near boiling point. Then, the residue was dried at 110°C overnight, cooled in a desiccator, and weighed (m2). After being dried, the sample was ashed at 550°C until ashing was complete, cooled in a desiccator, and weighed again (m3). The total crude fiber was expressed as a percentage as follows:
\[ \text{Crude fiber (\%)} = \frac{m_2 - m_3}{m_1} \times 100, \]
where m1 is the mass of the sample (g), m2 is the mass of the sample with crucible before ashing (g), and m3 is the mass of the sample with crucible after ashing (g).
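The three gravimetric determinations above (total ash, crude fat, and crude fiber) reduce to simple mass ratios. The following is a minimal Python sketch of those ratios, assuming the standard gravimetric forms implied by the symbol definitions in the text; the function names and the example masses are hypothetical and are not taken from the study.

```python
def _percent(part_g: float, whole_g: float) -> float:
    """Express part_g as a percentage of whole_g (both in grams)."""
    if whole_g <= 0:
        raise ValueError("reference mass must be positive")
    return 100.0 * part_g / whole_g


def total_ash_pct(m1: float, m2: float, m3: float) -> float:
    """m1: crucible; m2: crucible + sample; m3: crucible + ash (all g)."""
    # Assumed form: ash mass over sample mass, both net of the crucible.
    return _percent(m3 - m1, m2 - m1)


def crude_fat_pct(m_i: float, m_f: float, m_sample: float) -> float:
    """m_i: empty beaker; m_f: beaker + dried fat; m_sample: sample (all g)."""
    return _percent(m_f - m_i, m_sample)


def crude_fiber_pct(m1: float, m2: float, m3: float) -> float:
    """m1: sample; m2: crucible + residue before ashing; m3: after ashing (all g)."""
    # Assumed form: the mass lost on ashing is taken as the fiber.
    return _percent(m2 - m3, m1)


if __name__ == "__main__":
    # Hypothetical masses, for illustration only.
    print(round(total_ash_pct(25.00, 28.00, 25.15), 2))
    print(round(crude_fat_pct(80.00, 80.45, 3.00), 2))
    print(round(crude_fiber_pct(3.00, 30.90, 30.15), 2))
```

Keeping the crucible and beaker masses as explicit arguments mirrors the weighing steps in the protocol and makes it harder to mix up gross and net masses.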

Total Protein.
A dried and ground sample (0.5 g) was added into a Kjeldahl digestion flask. One gram of catalyst (Na2SO4 mixed with anhydrous CuSO4 in a ratio of 10 : 1) and 5 ml of concentrated H2SO4 were added into the digestion flask. Then, using a digester, the mixed sample was digested at 350°C for about two hours until digestion was complete. Then, the flask was removed from the digester and allowed to cool, and the digested sample was diluted by adding 30 ml of distilled water. Then, 25 ml of concentrated (40%) NaOH was added into the digestion flask to neutralize the acid and make the solution slightly alkaline. The contents were immediately distilled by inserting the digestion tube line into the receiver flask, which contained 25 ml of 4% boric acid solution, and about 150 ml of distillate was collected. Then, the distillate was titrated with a standard acid (0.1 N HCl). The percentage of crude protein was calculated by multiplying the nitrogen percentage by the conversion factor (6.25) [13]:
\[ \text{Nitrogen (\%)} = \frac{(V_A - V_B) \times N \times 0.014 \times 100}{W}, \qquad \text{Crude protein (\%)} = \text{Nitrogen (\%)} \times 6.25, \]
where V_A and V_B are the volumes (ml) of standard acid used for titration of the sample (A) and the blank (B), N is the normality of the standard acid used for titration (0.1 N HCl), 0.014 is the milliequivalent weight of nitrogen (g), and W is the weight (g) of the sample taken for digestion, on a dry basis.
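The Kjeldahl calculation can be sketched in a few lines of Python. This assumes the usual titration form built from the quantities defined above (sample and blank titration volumes, acid normality, the 0.014 factor, dry sample weight, and the 6.25 conversion factor); the function names and the titration volumes in the example are hypothetical.

```python
N_MEQ_WEIGHT_G = 0.014      # grams of nitrogen per milliequivalent
PROTEIN_FACTOR = 6.25       # nitrogen-to-protein conversion factor from the text


def nitrogen_pct(v_sample_ml: float, v_blank_ml: float,
                 normality: float, sample_g: float) -> float:
    """Percent nitrogen from the acid consumed by sample minus blank."""
    if sample_g <= 0:
        raise ValueError("sample weight must be positive")
    net_ml = v_sample_ml - v_blank_ml
    return net_ml * normality * N_MEQ_WEIGHT_G * 100.0 / sample_g


def crude_protein_pct(n_pct: float, factor: float = PROTEIN_FACTOR) -> float:
    """Convert percent nitrogen to percent crude protein."""
    return n_pct * factor


if __name__ == "__main__":
    # Hypothetical titration: 12.5 ml for the sample, 0.5 ml for the blank,
    # 0.1 N HCl, 0.5 g of dry sample.
    n = nitrogen_pct(12.5, 0.5, 0.1, 0.5)
    print(round(n, 2), round(crude_protein_pct(n), 2))
```

Subtracting the blank before any scaling keeps reagent-derived nitrogen out of the result, which is why the blank volume enters the formula directly.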

Analysis of Variance.
The quantitative field data were subjected to analysis of variance (ANOVA) using the agricolae package [14] of the R statistical software. The biochemical traits were analyzed following the CRD (completely randomized design) procedure. The traits that exhibited significant mean squares in ANOVA were further subjected to multivariate analysis.

Principal Component Analysis.
Principal component analysis (PCA) was computed to identify the traits that accounted most for the total variation. The data were standardized to a mean of zero and a variance of one before computing the principal components to avoid the effects of differences in measurement scales. The principal components, based on the correlation matrix, were calculated using the FactoMineR package [15] of the R statistical software.
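To illustrate why standardization leads to a correlation-matrix PCA, the sketch below standardizes each trait and forms the cross-product matrix of the standardized columns, which is the Pearson correlation matrix. It is written in plain Python purely for illustration (the study itself used R); the function names and the toy data are hypothetical, and the use of the sample standard deviation (n − 1 divisor) is an assumption that does not change the resulting correlations.

```python
from statistics import mean, stdev


def standardize(column):
    """Scale one trait to mean zero and unit (sample) variance."""
    mu, sd = mean(column), stdev(column)
    if sd == 0:
        raise ValueError("a constant trait cannot be standardized")
    return [(x - mu) / sd for x in column]


def correlation_matrix(columns):
    """Pearson correlations between traits; each column is one trait."""
    z = [standardize(col) for col in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]


if __name__ == "__main__":
    # Toy data: three traits measured on four genotypes (hypothetical values).
    traits = [
        [120.0, 135.0, 150.0, 165.0],   # e.g. a height-like trait in cm
        [1.2, 1.5, 1.8, 2.1],           # e.g. a diameter-like trait in cm
        [70.0, 64.0, 58.0, 52.0],       # e.g. a days-like trait
    ]
    for row in correlation_matrix(traits):
        print([round(r, 3) for r in row])
```

Because every standardized trait has unit variance, the diagonal of this matrix is one and its eigenvalues sum to the number of traits, which is what makes an eigenvalue of one a natural reference point when deciding how many components to keep.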

Euclidean Distance and Clustering of Genotypes.
Euclidean distance (ED) was computed from the quantitative traits after standardization (subtracting the mean value and dividing by the standard deviation), as established by Sneath and Sokal [16]. The factoextra package [17] of the R statistical software was used for analyzing the distance matrix and constructing the dendrogram. The dendrogram was constructed based on the unweighted pair group method with arithmetic mean (UPGMA) from the distance matrix of phenotypic traits.
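The distance and clustering steps described above can be sketched as follows. This is a minimal pure-Python illustration of Euclidean distance and UPGMA (average linkage), not the R workflow used in the study; the function names and the four toy points are hypothetical, and the input is assumed to be already standardized.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two genotypes (sequences of trait values)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(points):
    """Agglomerate points by average linkage.

    Returns the merge history as (members_a, members_b, height) tuples,
    where height is the mean pairwise distance between the two groups.
    """
    clusters = {i: (i,) for i in range(len(points))}   # cluster id -> members

    def average_distance(ca, cb):
        total = sum(euclidean(points[i], points[j]) for i in ca for j in cb)
        return total / (len(ca) * len(cb))

    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2),
                   key=lambda p: average_distance(clusters[p[0]], clusters[p[1]]))
        height = average_distance(clusters[a], clusters[b])
        merges.append((clusters[a], clusters[b], height))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


if __name__ == "__main__":
    # Four hypothetical genotypes described by two standardized traits.
    toy = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
    for left, right, height in upgma(toy):
        print(left, right, round(height, 3))
```

Recomputing average distances from the original points at each step is slower than updating a distance matrix, but it keeps the definition of UPGMA visible: the height of every merge is simply the mean of all between-group pairwise distances.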

Analysis of Variance.
The results of the analysis of variance showed significant (P < 0.05) differences among genotypes for phenology, growth, yield-related, and biochemical traits. However, the genotypes did not differ significantly for the number of flower epicalyxes and fruit diameter (Tables 1 and 2).

Principal Component Analysis.
The result of the principal component analysis for 27 quantitative traits is presented in Table 3. The analysis resulted in eight principal components (PC1 to PC8), with eigenvalues ranging from 1.033 to 8.247. Each of the eight principal components accounted for between 3.83 and 30.54% of the total variance, and together they explained 82.44% of it. PCs with an eigenvalue of <1 were ignored, following Guttman's lower bound principle. The first principal component (PC1) contributed most of the variation (30.54%), followed by PC2, PC3, and PC4, which contributed 14.11%, 10.87%, and 6.98% of the variation, respectively; the first four PCs together accounted for 62.51% of the total variation.
A similar result on okra was reported by Muluken et al. [18], in which the first three principal components (PC1, PC2, and PC3) accounted for 32.4%, 16.7%, and 8.2% of the variation, respectively, for a total of 57.3%. Amoatey et al. [19] reported that the first, second, and third principal components accounted for 32.44%, 19.78%, and 9.68% of the total genetic variation, respectively. Ahiakpa [20] also reported that the first principal component (PC1, 32.44%) was the major contributor to the variance among okra genotypes.
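The link between the reported eigenvalues and the reported percentages can be checked with a short calculation. In a PCA on the correlation matrix of 27 standardized traits, the eigenvalues sum to 27, so each component explains its eigenvalue divided by 27. The sketch below applies this to the two eigenvalues given in the text (8.247 and 1.033); the function names are hypothetical, and the value 0.9 in the retention example is an invented placeholder, not a result from the study.

```python
def percent_variance(eigenvalue: float, n_traits: int) -> float:
    """Share of total variance explained by one component (correlation PCA)."""
    return 100.0 * eigenvalue / n_traits


def retained(eigenvalues, threshold: float = 1.0):
    """Keep components whose eigenvalue is not below the threshold."""
    return [ev for ev in eigenvalues if ev >= threshold]


if __name__ == "__main__":
    n_traits = 27
    print(round(percent_variance(8.247, n_traits), 2))   # 30.54
    print(round(percent_variance(1.033, n_traits), 2))   # 3.83
    # 0.9 is a hypothetical ninth eigenvalue used only to show the cutoff.
    print(retained([8.247, 1.033, 0.9]))
```

Both results agree with the 30.54% and 3.83% reported for PC1 and PC8, which is consistent with the analysis having been run on standardized data.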
Within PC1, traits with larger absolute values (closer to one) influence the clustering more than traits with lower absolute values (closer to zero) [21]. Hence, the differentiation of the genotypes into different clusters was due to the cumulative effect of several traits rather than the large contribution of a few traits. In this regard, fruit yield per hectare (8.03%), fruit yield per plant (7.87%), leaf width (7.56%), stem diameter (7.41%), leaf length (7.11%), peduncle length (6.75%), seed yield per plant (6.75%), and seed yield per hectare (6.75%) had relatively higher contributions to PC1. This indicates that these traits were responsible for the differentiation of the clusters and had a greater contribution to the total diversity. In PC2, days to 50% flowering (18.58%), days to first flowering (16.87%), date of maturity (14.76%), and the number of mature pods (8.12%) contributed more, whereas fruit length, fresh fruit weight, and ash content contributed relatively more to PC3 (Table 3).
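One way to read the contribution percentages is against the value expected if every trait contributed equally. The sketch below assumes the percentages are trait contributions in the usual PCA sense, that is, 100 times the squared element of the unit-length eigenvector; this definition is an assumption, since the text does not state how the values were computed, and the function names and the example vector are hypothetical.

```python
def contributions_pct(eigenvector):
    """Percent contribution of each trait to one principal component."""
    norm_sq = sum(v * v for v in eigenvector)
    if norm_sq == 0:
        raise ValueError("eigenvector must be non-zero")
    # Dividing by the squared norm makes the contributions sum to 100.
    return [100.0 * v * v / norm_sq for v in eigenvector]


def uniform_contribution_pct(n_traits: int) -> float:
    """Contribution each trait would have if all traits contributed equally."""
    return 100.0 / n_traits


if __name__ == "__main__":
    print([round(c, 1) for c in contributions_pct([0.6, 0.8])])
    print(round(uniform_contribution_pct(27), 2))
```

With 27 traits the equal-contribution benchmark is about 3.70%, so the PC1 traits listed above (6.75 to 8.03%) each contribute roughly twice what an average trait would.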
A biplot was constructed based on the first two PCs (Figure 1). The genotypes and quantitative traits were shown on the biplot to visualize their associations. The biplot of the first two PCs explained 44.66% of the total variability among the genotypes and showed that stem diameter, leaf length, leaf width, fruit yield per plant, fruit yield per hectare, seed yield per plant, and seed yield per hectare were the most discriminating traits. The genotypes positioned in the top right quadrant were characterized by late maturity, high fresh fruit weight, many fruit ridges, and large stem diameter. The genotypes in the bottom right quadrant had the highest seed yield, number of fruits per plant, and number of mature pods, and the longest and widest leaves. The genotypes distributed around the origin had similar genetic characteristics, while the genotypes located far from the origin were considered unrelated (Figure 1). Therefore, these divergent genotypes could be used as potential parents for successful hybridization to develop heterotic groups in the okra breeding program.
The PC3 and PC4 biplot is presented in Figure 2. These two PCs accounted for 17.85% of the total variability among genotypes, and ash content, total protein content, crude fiber content, fruit length, and fresh fruit weight were the traits that contributed most. The PC3 and PC4 biplot provided information regarding the similarities and the pattern of differences among the okra genotypes and the associations between traits. Genotypes were distributed in all four quadrants, indicating the presence of wide genetic variability for the traits studied. Overlapping accessions and accessions close to each other on the axes had similar genetic makeup, whereas genotypes that were far apart could be considered genetically distinct.
Genotypes positioned in the top right quadrant were characterized by high seed protein and fiber content. The top left quadrant consisted of okra genotypes that are closely related and have a high number of internodes and a high number of mature pods. Genotypes found in the bottom right quadrant exhibited the highest seed ash content.

Cluster Analysis.
The optimum number of clusters was determined from the total within-cluster sum of squares (WSS) (elbow method) using R statistical software version 3.6.3 (Figure 3). A dendrogram was constructed based on the unweighted pair group method with arithmetic mean (average linkage) from the distance matrix of phenotypic traits. The distances of all possible pairs of the 36 okra genotypes were estimated as Euclidean distances from 27 quantitative traits. The distances between okra genotypes ranged from 3.55 to 14.49, with a mean, standard deviation, and coefficient of variation of 7.12, 1.80, and 25.25%, respectively. The highest genetic distance (Euclidean distance) was computed between 29407 and Humera 2 (14.49), whereas the lowest [...] (Figure 4). Based on PC axes 1 and 2, a scatter plot was constructed for the four clusters (Figure 4). The plot showed that genotypes with similar genetic makeup were grouped in a cluster (near to overlapping), and genotypes with different genetic makeup were positioned in opposite corners of the scatter plot.
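The quantity behind the elbow method can be sketched as follows. For a given assignment of genotypes to clusters, the total within-cluster sum of squares (WSS) is the sum of squared distances from each genotype to its cluster centroid; the elbow is the number of clusters beyond which WSS stops dropping sharply. This is a plain-Python illustration with hypothetical function names and toy points, not the R routine used for Figure 3.

```python
def total_wss(points, labels):
    """Total within-cluster sum of squares for one cluster assignment."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    total = 0.0
    for members in groups.values():
        # Centroid: mean of each trait over the cluster members.
        centroid = [sum(values) / len(members) for values in zip(*members)]
        for point in members:
            total += sum((x - c) ** 2 for x, c in zip(point, centroid))
    return total


if __name__ == "__main__":
    # Four hypothetical genotypes on two standardized traits.
    toy = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    print(total_wss(toy, [0, 0, 0, 0]))   # one cluster
    print(total_wss(toy, [0, 0, 1, 1]))   # two clusters
    print(total_wss(toy, [0, 1, 2, 3]))   # every genotype alone
```

In this toy case most of the reduction in WSS comes from going from one cluster to two, which is the kind of bend the elbow method looks for.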
Generally, the Euclidean distances measured among the introduced varieties were lower than the genetic distances among genotypes collected from Ethiopia. This showed that there is a higher chance of improving fruit yield and seed-related traits through the selection and/or hybridization of okra genotypes collected from different okra-growing regions of Ethiopia.
By characterizing 24 Ethiopian okra genotypes, Fozia [22] reported Euclidean distances that ranged from 1.96 to 11.36, with a mean, standard deviation, and coefficient of variation of 5.85, 1.97, and 33.75%, respectively. The same study also reported that introduced (Indian) varieties had lower genetic distances (1.96 to 10.01) than Ethiopia's okra collection (2.07 to 11.36). Muluken et al. [18] reported that Ethiopian okra collections exhibited a wider genetic distance than exotic varieties. Anteneh [23] estimated the genetic distances of all possible pairs of 25 okra genotypes and reported that the highest genetic distances were observed between okra collections from Ethiopia and introduced commercial varieties from other countries, while the lowest genetic distance was estimated between introduced commercial varieties. The extent of diversity present between genotypes determines the extent of improvement gained through selection and hybridization. The more distant two genotypes are, the greater the probability of improvement through selection and hybridization. Mihretu et al. [3] also reported the presence of considerable genetic distance among okra collections from Gambela regional state, which is one of the okra-growing regions in Ethiopia.
Clustering of genotypes based on Euclidean distances revealed four major clusters. The number and names of genotypes in each cluster, along with their collection origin, are presented in Table 4. Cluster I contained the majority of the genotypes, 24 (66.66%). Cluster II contained 10 (27.77%) genotypes, while clusters III and IV each contained a single genotype (Table 4, Figure 5).
Genotypes in cluster I and cluster II were early maturing, while the two genotypes in cluster III and cluster IV were late maturing (Table 5). Therefore, genotypes found in cluster I and cluster II could be used for okra production in areas characterized by a low amount of rainfall. These genotypes could also be used by breeders for developing varieties suitable for drought-prone areas. In contrast, the two genotypes found in clusters III and IV could be used in areas that have a long rainfall season. The highest mean fruit yield was measured from genotype 29407, which is found in cluster IV. This genotype also had the highest seed protein content. This genotype could

Conclusions
The results showed the presence of considerable genetic diversity for the studied morphological and biochemical traits. This variation could be exploited to develop varieties with different desirable agronomic traits, such as early maturity, high yield, and good nutrient content, through selection and/or hybridization using the okra genotypes collected and conserved in Ethiopia. In addition, the study revealed the potential of the landrace okra genotypes as sources of nutrients. This indicates the importance of neglected crops that could be utilized for ensuring food security and alleviating malnutrition in developing countries like Ethiopia, where malnutrition is a widespread problem. It is also recommended to extend the research on okra to include micronutrient content analysis, molecular diversity studies, and sequencing of okra genotypes to identify important agronomic and biochemical traits and to characterize the genes responsible for these traits.

Data Availability
The raw data and additional information could be made available from the corresponding author upon request.

Conflicts of Interest
The authors have not declared any conflicts of interest.