Primary Selection and Secondary Diversification: Two Key Processes in the History of Olive Domestication

Knowledge on the crop domestication process is important from a cultural and agricultural standpoint since it can shed light on the origin and history of human civilizations as well as the management of genetic resources, while offering guidance for modern breeding. The olive tree (Olea europaea ssp. europaea) is the most iconic of the old crop species of the Mediterranean Basin (MB). Primary domestication from wild olive probably occurred around 6000 BP in the Middle East. However, the question remains as to whether cultivated olive derived from a single domestication event in the Levant, followed by secondary diversification, or whether it was the result of independent domestication events. Here, we analyzed a comprehensive sample collected from 35 wild populations (722 individuals) and 410 cultivars from across the MB using nuclear and plastid DNA markers. Our genetic investigations argue in favor of a single primary domestication event in the eastern MB, followed by diffusion of the first domesticated olive and diversification in the central and western MB as key processes in the olive tree history.


Introduction
Understanding crop domestication and diversification processes is important to infer the origin of the crop and highlight the history of human civilizations. These investigations can be useful for genetic resource management while offering guidance for modern breeding. Olive (Olea europaea ssp. europaea) is considered to be the most iconic tree in Mediterranean areas. Oliviculture is one of the oldest cropping practices developed in these areas, and olive trees have therefore accompanied the emergence of early Mediterranean civilizations [1]. According to paleobotanical, archaeological, and genetic investigations, the olive tree may have persisted around the Mediterranean Basin (MB) as part of the natural plant community since the Late Tertiary [2]. However, despite the economic, cultural, ecological, and historical importance of the species, its origin and history have yet to be clearly documented. Clarifying the olive domestication and diversification process has therefore long been a focus of active scientific research [3].
According to Carrión et al. [4], during the Middle and Late Pleniglacial (59,000-11,500 yrs BP), Olea europaea had persisted in three thermophilous refugia located in the southern areas of the north MB, the southern Levant, and North Africa. Due to outcrossing of olive species, wild forms (oleasters, O. europaea ssp. europaea var. sylvestris) probably contributed to genetic diversity at the local scale, thus facilitating secondary diversification. It is usually considered that the center of primary olive domestication, from wild progenitors, began roughly around 6000 BP in the Middle East, near the border between Turkey and Syria [1,5]. This assumption was supported by investigations based on chloroplast DNA polymorphism [5,6], showing that more than 90% of olive cultivars across the Mediterranean Basin share the same eastern-like haplotype, therefore indicating an east-west human-mediated diffusion of cultivars in the MB. However, the scenario of a single primary domestication center has yet to be demonstrated. Based on paleobotanical and archaeological investigations, early exploitation and use of wild olive trees from the Near East to Spain have been documented since the Neolithic period [7,8]. The findings of several studies support multiple origins of cultivars across the Mediterranean region [9][10][11][12][13], but it remains unclear whether this reflects secondary diversification or multiple independent primary domestication events [13][14][15]. Indeed, genetic patterns observed by Diez et al. [13] suggest the occurrence of a second separate olive domestication event in the central MB. The hypothesis of a second independent domestication in the central MB remains to be explored because cultivated olive diversification may also have occurred in this area and in the western MB as the result of local independent domestication [14].
Following the response letter of Diez and Gaut [15], the hypothesis of an independent domestication event in the central MB may ultimately be confirmed or disproven. Therefore, the question as to whether there was a single olive domestication center or multiple ones has yet to be answered.
In the present study, we investigated the history of olive trees through a comprehensive sampling of genuinely wild populations and domesticated forms from across the Mediterranean using nuclear and plastid marker analysis. Our results are discussed based on the question as to whether the current cultivated olive derived from a single domestication event in the Levant followed by secondary diversification or whether it is the result of independent domestication events.

Plant Material.
A total of 1,132 distinct genotypes were analyzed in this study, including 410 cultivars from 15 countries and 722 wild olive samples from 35 populations throughout the Mediterranean area (Figure 1; Tables S1 and S2). All cultivars are maintained in the ex situ Worldwide Olive Germplasm Bank at the experimental station of Tassaout, INRA, Marrakech, Morocco [6,16]. Wild populations were sampled in natural areas far from olive agroecosystems in order to minimize admixture with cultivated olives, while taking morphological traits that differ from those of cultivated olives into consideration, such as smaller fruits with less fleshy mesocarp [17]. Thirty of the 35 wild populations were previously described and analyzed [5,18].

Molecular Analysis.
Total DNA was extracted from 100 mg of fresh leaf tissue, as described by Khadari et al. [19]. DNA quality was checked on 1% agarose gel, and the concentration was estimated using spectrofluorometry (GENios Plus, TECAN, Grödig, Austria).

Data Analysis.
We computed the following genetic diversity parameters for wild and cultivated olives separately and for each genetic group: the number of alleles (Na), expected heterozygosity (He), and observed heterozygosity (Ho) using the Excel Microsatellite Toolkit v3.1 [25]. The inbreeding coefficient (Fis) was calculated using the FSTAT program v2.9.3.2b [26], whereas the allelic richness (Ar; [27]) was estimated using the ADZE program [28]. The Mann-Whitney comparison test was used to evaluate the significance of the allelic richness differences.
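The diversity parameters listed above can be illustrated for a single locus with a short Python sketch. This is not the Excel Microsatellite Toolkit or FSTAT code: the function name `locus_diversity`, the toy genotypes, and the use of Nei's unbiased estimator for He are assumptions made for illustration, and Fis is shown in its simplest form (1 - Ho/He), which may differ slightly from the estimator FSTAT reports.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Per-locus diversity for diploid genotypes given as (allele, allele) tuples.

    Hypothetical helper for illustration only; missing data are not handled.
    """
    n = len(genotypes)
    total = 2 * n  # number of gene copies sampled
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = [count / total for count in allele_counts.values()]

    na = len(allele_counts)                           # number of alleles (Na)
    ho = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity (Ho)
    # Expected heterozygosity with a small-sample correction (assumed estimator)
    he = (total / (total - 1)) * (1 - sum(p * p for p in freqs))
    fis = 1 - ho / he if he > 0 else float("nan")     # simple inbreeding coefficient
    return {"Na": na, "Ho": ho, "He": he, "Fis": fis}


# Toy SSR genotypes (allele sizes in bp), invented for the example
toy = [(180, 184), (180, 180), (184, 188), (180, 184), (188, 188), (180, 188)]
stats = locus_diversity(toy)
print(stats)
```

Averaging these per-locus values over the 16 SSR loci would give group-level summaries of the kind reported in the results; a positive Fis indicates fewer heterozygotes than expected under Hardy-Weinberg equilibrium.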
To investigate the genetic structure pattern within olive samples (wild and cultivated), discriminant analysis of principal components (DAPC; [29]) with the ADEGENET 1.3.1 package [30] in the R environment was applied with a priori grouping assumptions based on previous studies [6,13,18]. Unlike the STRUCTURE program, the absence of any assumption about the underlying population genetics model, in particular concerning Hardy-Weinberg equilibrium or linkage equilibrium, is one of the main assets of DAPC [29]. Based on the model-based Bayesian clustering approach implemented in the STRUCTURE program [31] as described in previous studies on olive species [6,13,18], wild olive was found to be structured in two groups (named western-central and eastern Mediterranean wild), whereas cultivated olive was in three groups (called western, central, and eastern Mediterranean cultivated olive), with one group shared between wild and cultivated olives (eastern Mediterranean). Hence, we set an a priori group number of four in the DAPC method for the whole dataset.
Moreover, once the wild and cultivated olive genotypes were assigned to their a posteriori genetic groups, relationships among genotypes and genetic groups were analyzed by principal coordinate analysis (PCoA) based on the simple matching coefficient [32], as implemented in the DARWIN v. 6.0.11 program [33]. Pairwise genetic differentiation (FST; [34]) between genetic groups, as revealed by membership assignation using the DAPC method, and its significance were estimated using 100,000 permutations with the GENEPOP program [35], and the unrooted FST tree was plotted using the POPTREE2 program [36], with 999 bootstrap replicates and the Neighbor-joining method.
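As an illustration of the distance underlying the PCoA, the sketch below computes a simple matching dissimilarity between two diploid multilocus genotypes. It assumes one common formulation for allelic data (one minus the mean proportion of shared alleles per locus); the function name and toy genotypes are hypothetical, and the handling of missing data in DARWIN is not reproduced here.

```python
from collections import Counter


def simple_matching_dissimilarity(g1, g2, ploidy=2):
    """Dissimilarity between two multilocus genotypes.

    g1, g2: lists of allele tuples, one tuple per locus, in the same locus order.
    Assumed formulation: 1 - mean over loci of (shared alleles / ploidy).
    """
    if len(g1) != len(g2):
        raise ValueError("genotypes must cover the same loci")
    shared_fraction = 0.0
    for alleles_a, alleles_b in zip(g1, g2):
        # Multiset intersection counts alleles shared between the two individuals
        shared = sum((Counter(alleles_a) & Counter(alleles_b)).values())
        shared_fraction += shared / ploidy
    return 1 - shared_fraction / len(g1)


# Toy genotypes at three SSR loci (allele sizes in bp), invented for the example
ind_a = [(180, 184), (200, 200), (150, 154)]
ind_b = [(180, 180), (200, 200), (152, 156)]
d_ab = simple_matching_dissimilarity(ind_a, ind_b)
print(d_ab)
```

A matrix of such pairwise values over all genotypes is the kind of input a principal coordinate analysis takes.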

Genetic Diversity in Wild and Cultivated Olive.
Based on the analysis of 1,132 genotypes using 16 SSR markers, we identified a total of 427 alleles with an average of 26.69 alleles per locus. The number of alleles observed in wild olive (420) was higher than that in cultivated olive (276). Similarly, the expected heterozygosity (He, diversity index) was greater in wild than in cultivated olive (Table 1).

Genetic Clustering.
The genetic structure of Mediterranean olive was investigated using discriminant analysis of principal components (DAPC). The "find.clusters" function was used to determine the number of clusters maximizing the variation between clusters [30]. To avoid the loss of information, the function was performed with 200 principal components, accounting for more than 98% of the variance (Figure S1(a)). The Bayesian information criterion (BIC) was used to identify the optimal number of clusters, i.e., 11 clusters (Figure S1(b)). Based on these 11 clusters as a first analysis, DAPC clustering was represented according to the origin of olives classified in four a priori groups: western-central Mediterranean wild olive, eastern Mediterranean wild and cultivated olives, western cultivated olive, and central cultivated olive (Figure S1(c)). Western cultivated olive showed narrow genetic diversity (cluster 7), whereas those from the central Mediterranean Basin displayed high diversity included in 3 clusters (clusters 1, 3, and 6). Similarly, eastern and western-central wild olive displayed high diversity, i.e., 3 and 4 clusters, respectively (Figures S1(c) and S1(d)). Pairwise FST values among the 11 predefined clusters resulting from DAPC ranged from 0.017 (cluster 5-cluster 8) to 0.129 (cluster 4-cluster 7) (Table S3).
Based on the assignation membership probability resulting from DAPC at p ≥ 0.8, five groups could be identified: (i) eastern Mediterranean wild (referred to as Wild east), (ii) eastern cultivated olive (Cultivated east), (iii) western and central Mediterranean wild olive (Wild west-center), (iv) western Mediterranean cultivated olive (Cultivated west), and (v) central Mediterranean cultivated olive (Cultivated center). Although they belong to the same pool (Figures 2 and 3), Wild east and Cultivated east were considered as two distinct groups to describe the relationships between wild and cultivated olives in the eastern MB. At the p < 0.8 assignation level, wild and cultivated olives were considered admixed: Wild admixed and Cultivated admixed, respectively (Figure 2; Tables 2 and S4). For admixed wild and cultivated forms, we noted the occurrence of the three plastid lineages with a high proportion of E1 for cultivated olive (86.0%), whereas for admixed wild olive, close proportions of E1 and E2 were observed (53.1% and 44.9%, respectively, Table 2). When considering the a priori groups, more wild admixed genotypes were noted in the western-central part of the MB. Similarly, more admixed cultivars were observed in the western-central MB area compared to the east (Table S5).
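The thresholding rule described above can be sketched in a few lines of Python. The function name, the group labels, and the membership probabilities below are hypothetical and serve only to show the rule: a genotype is assigned to the group with the highest DAPC membership probability when that probability is at least 0.8, and is treated as admixed otherwise.

```python
def assign_group(memberships, threshold=0.8):
    """Return the best-supported group, or 'admixed' below the threshold.

    memberships: dict mapping group label -> membership probability
    (for example, one row of a DAPC posterior table).
    """
    best_group, best_p = max(memberships.items(), key=lambda item: item[1])
    return best_group if best_p >= threshold else "admixed"


# Invented membership probabilities for two genotypes
clear_case = {"east": 0.93, "wild_west_center": 0.02,
              "cultivated_west": 0.01, "cultivated_center": 0.04}
mixed_case = {"east": 0.55, "wild_west_center": 0.05,
              "cultivated_west": 0.10, "cultivated_center": 0.30}

print(assign_group(clear_case))
print(assign_group(mixed_case))
```

The wild or cultivated status of each sample would then be applied on top of this assignment, for example to split the shared eastern pool into Wild east and Cultivated east.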
To investigate genetic relationships between the five groups defined above, principal coordinate analysis (PCoA) was performed (Figure S2). Most of the variation (13.06%) was explained by the first two axes. For both wild and cultivated olives, the first axis corresponded to the east-west spatial distribution at the MB scale, where both wild olive groups (Wild west-center and Wild east) were genetically distinct. The second axis separated Cultivated west and Cultivated center olives from Wild east and Cultivated east olives. The latter were clustered as one pool. Admixed genotypes for both Wild admixed and Cultivated admixed were plotted midway between the five genetic groups (Figure S2).

Genetic Variation and Relationships among Genetic Groups.
The mean observed heterozygosity (Ho = 0.701) noted for wild olive was less than expected (He = 0.826) under Hardy-Weinberg equilibrium (Fis = 0.151; p < 0.001; Table 3), indicating a deficit of heterozygotes, as noted for wild genetic groups (Wild west-center and Wild east). These results may be explained by the subdivision of local populations into isolated and differentiated units (Wahlund effect), as revealed by the 11 predefined DAPC clusters (Figures S1(c) and S1(d)).
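As a quick consistency check on the heterozygote deficit reported above, the simplest form of the inbreeding coefficient, Fis = 1 - Ho/He, can be evaluated from the two mean heterozygosity values. This is only an arithmetic sketch; the estimator actually implemented in FSTAT may weight loci and samples differently, so exact agreement is not guaranteed in general.

```python
# Mean heterozygosity values for wild olive, as reported in the text (Table 3)
ho = 0.701  # observed heterozygosity
he = 0.826  # expected heterozygosity

# Simple inbreeding coefficient: proportional deficit of heterozygotes
fis = 1 - ho / he
print(round(fis, 3))
```

With these values the result rounds to the reported Fis of 0.151, i.e., roughly 15% fewer heterozygotes than expected under Hardy-Weinberg equilibrium.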
Allelic richness (Ar) was estimated and revealed a highly significant difference between wild olive and cultivars (24.58 vs 17.23; Mann-Whitney test, p < 0.001; Table 3). In addition, a highly significant difference was observed between Wild east and Cultivated east (12.95 vs 8.01), but not between Wild west-center and Wild east (10.67 and 12.95, respectively) or between Cultivated center and Cultivated east. However, within cultivated olive, Ar for Cultivated west was significantly lower than for Cultivated center and Cultivated east (4.19, 7.52, and 8.01, respectively). When focusing on the maternal lineage, more maternal lineages belonging to E2 and E3 were revealed in  Figure 3). Moreover, for cultivars identified as admixed on the basis of DAPC (membership assignation <0.8), E2 and E3 haplotypes were found in higher proportion than for other cultivated groups (Table S5). The genetic differentiation (FST) values were significant between all pairs of the five groups (Wild east and Cultivated east treated separately, p < 0.001; Table 4). The mean FST was 0.105 (p < 0.0001) for the five groups, indicating that about 10% of the total genetic variation resulted from genetic differentiation between groups. The pairwise FST ranged from 0.036 to 0.183 (Table 4). The highest values were observed between Wild west-center and all the other groups, whereas the lowest values were revealed between Wild east and Cultivated east groups. Relationships between the five groups showed a clear distinction between three main groups: (i) Wild west-center, (ii) Wild east and Cultivated east, and (iii) Cultivated west and Cultivated center, as supported by the high bootstrap values (Figure 3). The group Wild west-center was highly separated from the others based on both nuclear and plastid polymorphism, with the highest proportion of maternal lineages belonging to E2 and E3 haplotypes (86.1%; Table 2; Figure 3).
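The allelic richness values above can also be read as relative reductions from wild to cultivated olive, which is how the Discussion quotes them (up to 30% across the MB and up to 38.1% in the eastern MB). The short sketch below reproduces that arithmetic from the reported Ar values; the helper name is hypothetical.

```python
def relative_reduction(wild_ar, cultivated_ar):
    """Percentage loss of allelic richness from wild to cultivated olive."""
    return 100 * (wild_ar - cultivated_ar) / wild_ar


# Allelic richness values reported in the text (Table 3)
overall = relative_reduction(24.58, 17.23)  # all wild vs all cultivars
eastern = relative_reduction(12.95, 8.01)   # Wild east vs Cultivated east

print(round(overall, 1), round(eastern, 1))
```

The two results, about 29.9% and 38.1%, match the reductions cited in the Discussion.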

Discussion
Over the last two decades, substantial paleobotanical, archaeological, historical, and molecular data have been accumulated on olive species and the history of its domestication [  founded on the basis of several centers of primary selection across the MB [10][11][12]. However, it remains unclear whether cultivated olives derived from a single primary domestication center followed by secondary diversification events or whether they are the result of independent primary selection events. A scenario of at least two independent primary selection centers in the eastern and central Mediterranean was proposed by Diez et al. [13].
Investigations on wild olive in the eastern Mediterranean were, however, limited to a few sampled populations that were likely feral, as assumed by Diez et al. [13]. The lack of genuinely wild eastern populations has drastically limited the possibility of testing alternative complex domestication scenarios, as pointed out by Besnard and Rubio de Casas [14] and Diez and Gaut [15]. Hence, investigating a comprehensive sample of wild olives throughout the MB could bring insight to help solve questions related to primary domestication and secondary diversification centers. Moreover, contrary to the findings of Diez et al. [13], our eastern wild populations were clearly distinct from the western-central wild olive populations, thus indicating their genuine status.
Olive tree history is complex, as previously highlighted by several studies (see review in Besnard et al. [3]). Instead of multiple primary domestication centers, we argue in favor of a single primary domestication in the Levant, followed by human-mediated diffusion of the first domesticated forms and admixtures with wild olives in the central and western Mediterranean Basin. However, we cannot exclude the occurrence of minor domestication centers in western and central parts of the MB, as some varieties have been found to harbor maternal E2 or E3 lineages specific to local genetic resources, indicating their ancient local selection heritage (Figure 1, Table S4). Moreover, morphometric olive stone and charcoal analyses have revealed the use of wild olive before the Neolithic period, suggesting local domestication could have occurred in the western MB area [7]. Here, by investigating current varieties using both nuclear and plastid markers, we obtained evidence of primary selection and secondary diversification as two key processes in the history of olive domestication based on the following arguments. First, we used DAPC and identified a single group including both eastern wild olive (Wild east) and eastern cultivars (Cultivated east), thus indicating direct selection from wild olive populations, as suggested by Gurbuz-Veral et al. [37]. Second, most cultivated olives have an eastern-like maternal haplotype as a signature of the diffusion of the first domesticated olives from the eastern to western Mediterranean Basin [5]. Note that the above two arguments are supported by the genetic differentiation index (FST) between different groups, including wild and cultivated olive trees (Figure 3 and Table 4). Third, the allelic richness revealed highly significant differences between Wild east and Cultivated east olives.
Contrary to other perennial fruit species such as apple [38], a substantial reduction in allelic diversity was observed between domesticated and wild olives across the MB (up to 30%), especially from the eastern MB (up to 38.1%; Table 3). This finding is in line with the selection pattern during the domestication process, as reviewed by Gaut et al. [39] and Besnard et al. [3] and references therein. Fourth, the genetic pattern of the Cultivated west and Cultivated center groups indicated a diversification process based on selection from crosses between the first domesticated olive forms and local olives, thus supporting the assumption of human-mediated diffusion of cultivars. Indeed, among varieties from the central MB, 35% (81 varieties) were admixed with limited gene flow from western-central wild populations (Figure 2; Table S4) and displayed the three maternal lineages (Table S5). Among varieties from the central MB, we found 11% (24) harboring maternal lineages E2 and E3. Moreover, as reported by Belaj et al. [40] and Klepo et al. [41], some central Mediterranean varieties retain wild-like phenotypic characteristics, such as low endocarp weight and a smooth endocarp surface. These findings suggest a second center of domestication, as reported by Diez et al. [13], but evidence to back this assumption has yet to be documented [3].
We argue here in favor of a diversification process occurring in the western and central MB. Fifth, Diez et al. [13] found that most first-degree relationships were from the same genetic group (i.e., western cultivated olive; 96.3%) in which two cultivars from Spain (i.e., Gordal Sevillana and Lechin de Granada) had more than 60 first-degree relationships. Varieties harboring E2 and E3 from the western MB were found to be closely related within the western cultivated group such as Lechin de Sevilla with E2.3 maternal lineage (Table S2) and five first-degree relationships [13]. Regardless of the maternal lineages, the presence of highly related varieties indicated diversification based on crosses between cultivated olives as a key olive domestication process in the central and western MB.

Conclusion
Beyond a single primary olive tree domestication event, our investigation underlines the importance of admixtures within cultivated olive groups from the central and western Mediterranean Basin. Clarifying the evolutionary processes responsible for these groups will help to accurately identify the genes under selection. This will also help to design methods for sampling of Mediterranean olive germplasm, including wild olives suitable for genome-wide association studies and genomic selection under the impact of climate change and within the sustainable oliviculture setting. Moreover, identifying olive diversity hotspots in the MB could also help to develop cost-effective diversity-prioritized approaches for in situ olive genetic resource conservation and management.

Data Availability
The complete dataset is available upon request to the corresponding author: khadari@supagro.fr.

Conflicts of Interest
The authors declare that they have no conflicts of interest.
This research was supported by the project OliveMed/Agropolis Fondation no. 1202-066 through the Investissements d'avenir/Labex Agro ANR-10-Labex-0001-01 managed by the French National Research Agency (ANR) and by the BeFOre project "Bioresources for Oliviculture" 2015-2019, H2020-MSCA-RISE-Marie Skłodowska-Curie Research and Innovation Staff Exchange, Grant Agreement no. 645595.
Table S1: list of the 35 wild olive populations. The sampling locations, the number of populations, individuals per location (size), and the GPS coordinates are given. Table S2: list of the 410 Mediterranean olive cultivars analyzed in the present study, along with their origins and maternal lineages. Table S3: pairwise genetic differentiation (FST) among the 11 subclusters, as identified by DAPC using the "find.clusters" function. Table S4: number of wild and cultivated olives per region of origin and the number and proportion of individuals assigned to each group based on the DAPC findings with a membership probability of 0.8. Table S5: proportion of maternal lineages according to the a priori grouping clusters for both wild and cultivated olives identified as admixed genotypes by DAPC with a membership assignation of p < 0.8. Figure S1: discriminant analysis of principal components (DAPC) results. Cumulative variance explained by the principal component analysis (PCA) relative to the number of principal components (PCs) retained in the analysis (a). Selection of the optimal number of clusters in the DAPC using the lowest Bayesian information criterion (BIC; (b)). Comparison of clustering performed by DAPC (K = 11) and the a priori wild and cultivated olive groups (c). Squares represent the number of individuals in each pairwise comparison. Scatterplot from a DAPC of olive genotypes showing the relationships between the 11 identified clusters (d). Figure