Modularity of the Oral Jaws Is Linked to Repeated Changes in the Craniofacial Shape of African Cichlids

The African cichlids of the East African rift lakes provide one of the most dramatic examples of adaptive radiation known. It has long been thought that functional decoupling of the oral and pharyngeal jaws in cichlids has facilitated their explosive evolution. Recent research has also shown that craniofacial evolution in the radiations of lakes Victoria, Malawi, and Tanganyika has occurred along a shared primary axis of shape divergence, whereby the preorbital region of the skull changes in a manner that is relatively independent of other head regions. We predicted that the preorbital region would comprise a variational module and used an extensive dataset from each lake to test this prediction with a model selection approach. Our findings supported the presence of a preorbital module across all lakes, within each lake, and, for Malawi, within the sand- and rock-dwelling clades. However, while a preorbital module was consistently present, notable differences were also observed among groups. Of particular interest, a negative association between patterns of variational modularity was observed between the sand- and rock-dwelling clades, a pattern consistent with character displacement. These findings provide the basis for further experimental research to determine the developmental and genetic bases of these patterns of modularity.


Introduction
Adaptive divergence is likely influenced by the coordination and integration of multiple traits. If genetic variation affecting patterns of trait covariation has fitness consequences, then a particular pattern of integration that allows for a closer match to a new local multivariate phenotypic optimum should be favoured [1][2][3]. Alternatively, ancestrally conserved patterns of integration may act to constrain the rate and direction of evolution by preventing certain functions from evolving [4,5]. Either way, modularity may influence the pace of evolution and determine evolvability [6,7]. It is therefore not surprising that the study of trait integration has been of interest to biologists for more than half a century [8][9][10] and has recently seen renewed attention [3][11][12][13][14].
The study of integration has more recently been extended to the closely related concept of modularity, that is, the relative degree of connectivity in systems. A module is a tightly integrated unit that is relatively independent from other such modules. For morphological data, modularity has been studied in a variety of contexts, including those that are developmental, genetic, functional, and evolutionary in their focus [15][16][17][18][19][20][21][22]. An emerging consensus is that patterns of modularity in complex phenotypes likely represent a balance between functional and developmental integration and that modularity is better viewed as a matter of degree rather than an all-or-nothing phenomenon [12].
It has been suggested that modularity can facilitate divergence by allowing organisms to alter aspects of their phenotype without facing the developmental or fitness tradeoffs that would be present in a wholly integrated unit [12,13]. In this way, the evolution of modularity could be tied to the idea of key innovations (see [23] for an example). The origin or evolutionary "success" of taxa is often attributed to key innovations, that is, aspects of organismal phenotype that promote diversification [24]. Key innovations may enhance competitive ability, relax adaptive tradeoffs, or permit exploitation of a new productive resource base.
International Journal of Evolutionary Biology
The African cichlids from lakes Victoria, Tanganyika, and Malawi in East Africa's Great Rift Valley represent the largest extant example of vertebrate adaptive radiation known [25]. Certain anatomical features of this group have been proposed as key innovations that have facilitated the rapid evolution of these fishes [25,26]. The best known of these represents an important example of functional modularity, wherein the highly derived cichlid pharyngeal jaw mechanism allows the processing of prey within the throat to be decoupled from prey capture by the oral jaws [12,26]. This is thought to have allowed African cichlids to exploit a wide array of niches that would be unavailable if only one set of jaws were present [27].
The cichlid radiation of Lake Malawi is particularly interesting, because although it is intermediate to Tanganyika and Victoria in terms of age and morphological diversity [28,34], it has produced the greatest number of endemic species (well over 700) [35,36]. The evolutionary history of Malawi cichlids suggests that current diversity arose via three stages of diversification: (1) early divergence of the sand-dweller and rock-dweller clades, each of which has adapted to a major macrohabitat, (2) competition for trophic resources within each of these clades that caused further differentiation of trophic morphology, and (3) divergent sexual selection resulting in differentiation of male nuptial coloration [37,38].
We recently completed an extensive analysis that explored patterns of craniofacial shape variation in African cichlids from each of the three rift lakes [28]. Our data, which represented approximately 80% of the genera across lakes, revealed that all three cichlid radiations share a common trajectory of divergence with respect to each lineage's major axis of divergence (PC1). Our geometric morphometric analysis also showed that these changes were primarily related to changes in the relative length and size of the "preorbital region" of the skull, which encompasses the oral jaws and supporting structures, with shape posterior to the orbital region remaining relatively stable. These trends suggest that a large portion of the head diversity seen in African cichlids has been achieved by relatively simple and repeated shifts in jaw shape and that these may have happened relatively early in their evolutionary history.
Functional differences in jaw size reflect divergent foraging modes. African cichlids with longer oral jaws are either "suction feeders" that forage on zooplankton or piscivores that feed on other fishes [28]. In contrast, species with shortened jaws are typically "biters" that possess a higher mechanical advantage to scrape algae or forage on large macrobenthic prey. In Lake Malawi, this fundamental division is reflected in the cladogenic split between rock- and sand-dwelling species. On average, rock-dwelling species have a shorter jaw in common morphospace, whereas sand-dwelling species have relatively longer jaws [28, Cooper unpublished data]. Notably, these morphological patterns seem to be a common theme in the adaptive radiation of other fish assemblages (e.g., [39]) and even in population-level divergence among ecomorphs of charr, whitefish, and sunfish [40][41][42]. Thus, a propensity for changes in the size of the oral jaws seems to exist in teleosts at multiple levels of biological organization and perhaps represents a key innovation for this group as a whole. While the evolutionary origins of a preorbital module may not lie within African cichlids, examining potential patterns of craniofacial modularity in cichlids may identify important targets for future developmental genetic research aimed at understanding the proximate mechanisms that have facilitated these important radiations and divergence in other groups of fishes. Cichlids may be especially useful for this research, because species with widely variable jaw morphologies can be hybridized, facilitating the creation of large populations for genetic mapping to identify the loci and genetic pathways that underlie changes in jaw shape [43,44].
As mentioned, Liem's [26] seminal work on the pharyngeal jaw apparatus in cichlids suggested that the functional decoupling of prey capture and processing should free the oral jaws to more readily adopt an array of niche-specific shapes for food capture, largely independent of other traits. Implicitly, this insight confers a level of modularity on the cichlid oral jaw apparatus. Recent work in our lab, as well as from others, supports this assertion by demonstrating that morphological divergence among rift lake cichlids is characterized by prodigious shifts in oral jaw shape [28,34] and has led to the specific hypothesis that the preorbital region of the skull represents an evolutionary module that is conserved among cichlids from each of the three East African rift lakes [28]. Here, we objectively test this hypothesis by comparing multiple combinations of models of cichlid head variational modularity. Specifically, we use a model selection approach recently introduced by Márquez [45] to statistically assess patterns of variational modularity across a large sample of rift valley lake cichlids. To determine whether similar patterns of modularity are operating at different levels of biological organization, we also examine craniofacial modularity in each lake separately, as well as within the rock- and sand-dwelling clades of Lake Malawi.

Data Collection.
The data used for this study have been previously published in Cooper et al. [28], where further details, including a full list of specimens sampled, can be found. Briefly, our sampling included 78.8% of the genera endemic to the three East African rift lakes, with the following percentages from each lake: Tanganyika (74.5%), Malawi (88.5%), and Victoria (57.1%). Within Lake Malawi, 19 rock-dwelling species, representing 11 genera, and 36 sand-dwelling species, representing 31 genera, were sampled. Dissections were performed on cichlid heads in order to expose anatomical landmarks important for oral jaw function (Figure 1). A total of sixteen anatomical landmarks were plotted on the images of each specimen using the software program tpsDig2 [46].

A Priori Hypotheses.
Our goals were to determine first whether modularity was present in the cichlid head and second what the best-supported pattern of modularity was in our data. This required comparative testing of alternative a priori models, each of which specified a particular modular structure in the cichlid head. In this approach, each model is comprised of a series of partitions defined as anatomical
Table 1: A priori hypotheses of modularity in the cichlid head. Brackets denote putative modules. Note that two similar models are presented for jaw function.
Based on knowledge of the development and biomechanical function of cichlid heads, we constructed a number of hypotheses of modularity that were intended to extensively cover potential patterns of covariance. We selected a total of five a priori models representing the spatial distribution of developmental units and functional components of the cichlid head (see Table 1). An additional "null" model representing a lack of any integration or modularity was included in our analyses. Because it is not biologically realistic to expect that patterns of modularity predicted by these developmental and functional models are mutually exclusive, all possible nonnested combinations of the modules defined by the original five hypotheses of modularity were also included in model comparisons. In total, 137 competing models were tested. It is important to note that while this list of hypotheses is far from exhaustive, it represents an extensive collection of models that likely covers a substantial proportion of the developmental and functional processes capable of affecting covariation in the cichlid head.

Modularity Analysis.
The methodology for testing a priori hypotheses of modularity was adapted from an approach proposed by Márquez [45] consisting of four basic steps implemented in the Mint software package (available at: http://www-personal.umich.edu/~emarquez/morph/).
(1) Computation of an expected covariance matrix from each model of modularity, by assuming that each module resides in its own subspace within the phenotypic space occupied by the entire structure, as described in Márquez [45].
(2) Computation of a goodness of fit statistic, γ, to measure the dissimilarity between observed and expected covariances for each model, as
$$\gamma = \operatorname{trace}\{(S - S_0)(S - S_0)^{T}\},$$
where S and S_0 are the observed and modeled covariance matrices and T is the transpose symbol [47]. To ensure the comparability of this statistic across models, γ is standardized twice: first, all γ values are divided by γ_max, corresponding to the null model describing complete absence of integration, so that γ is scaled to vary within the interval [0, 1]; second, scaled γ is standardized via linear regression to remove the effect of the number of estimated parameters in models, which takes advantage of the linear relation observed between γ and the number of zeros in models.
The standardized statistic is defined as the residual γ* = γ − f(z), where f(z) represents the linear function relating the values of γ computed from all possible models of modularity to their corresponding counts of zero elements, z. Even though it would be computationally unfeasible for most studies to include all possible models, the fact that scaled γ values are restricted to the interval [0, 1], where 0 corresponds to the observed covariance matrix and 1 to the null model of no integration, implies that f(z) must also vary within these limits, which are sufficient to define the linear function for any given set of variables. Given a large random sample of models, with γ values symmetrically distributed about their mean, E(γ) = f(z), and thus E(γ*) = 0. Consequently, models in which γ* < 0 correspond to comparisons where the covariances hypothesized to be zero are relatively low on average; conversely, γ* > 0 occurs when relatively high covariances are, on average, hypothesized to be zero. The best-fitting model is therefore the one with the lowest γ* value. Note that this approach differs slightly from the one used in Márquez [45], where f(z) was estimated via regression using only the models included in a study, as opposed to all possible models. The 95% confidence intervals were computed as the 2.5 and 97.5 percentiles of a distribution of 1,000 jackknife subsamples [48] formed by removing random subsets of 10% of the specimens from each sample.
(3) The statistical significance of γ* was assessed using a parametric Monte Carlo approach. In these tests, a null distribution for the statistic is generated by comparing the original observed covariance matrix S to each of 1,000 random matrices generated from a Wishart distribution with mean vector 0 (i.e., the same mean as Procrustes residual data) and covariance matrix S [45,49].
(4) Finally, to allow choice among the multiple models that are significantly better than chance according to the Monte Carlo approach described above, models are ranked by their goodness of fit (i.e., γ* values, in ascending order). The relative support for each model is determined by computing the stability of its rank using a jackknife approach in which γ* values and model ranks are recomputed after removing a random portion of the samples. In this study, we removed 10% of the data in each of 1,000 jackknife replicates.
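Steps (1), (2), and (4) above can be illustrated with a short Python sketch. It is not the Mint implementation and rests on stated assumptions: the expected covariance matrix of a model is approximated by setting every between-module covariance to zero (Márquez [45] derives it from module subspaces instead), f(z) is taken to be the straight line through (0, 0) and (z_max, 1), and the Monte Carlo test of step (3) is omitted. All function names, model names, and the simulated four-variable dataset are hypothetical.

```python
import random


def covariance(rows):
    """Sample covariance matrix of a list of equal-length observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def same_module(i, j, modules):
    """True if variables i and j belong to the same hypothesized module."""
    return any(i in m and j in m for m in modules)


def expected_covariance(S, modules):
    """ASSUMPTION: keep within-module covariances, zero all between-module ones."""
    p = len(S)
    return [[S[i][j] if i == j or same_module(i, j, modules) else 0.0
             for j in range(p)] for i in range(p)]


def gamma(S, S0):
    """trace{(S - S0)(S - S0)^T}, i.e. the sum of squared element differences."""
    return sum((a - b) ** 2 for row, row0 in zip(S, S0) for a, b in zip(row, row0))


def gamma_star(S, modules):
    """Scaled gamma minus f(z); ASSUMPTION: f(z) = z / z_max."""
    p = len(S)
    null = [[v] for v in range(p)]  # no integration: every variable on its own
    scaled = gamma(S, expected_covariance(S, modules)) / gamma(S, expected_covariance(S, null))
    zeros = sum(1 for i in range(p) for j in range(p)
                if i != j and not same_module(i, j, modules))
    return scaled - zeros / (p * (p - 1))


def rank_models(rows, models):
    """Model names sorted by gamma* in ascending order (best fit first)."""
    S = covariance(rows)
    return sorted(models, key=lambda name: gamma_star(S, models[name]))


def jackknife_support(rows, models, reps=1000, drop=0.10, seed=1):
    """Share of jackknife replicates in which the top-ranked model stays on top."""
    rng = random.Random(seed)
    keep = len(rows) - round(drop * len(rows))
    top = rank_models(rows, models)[0]
    hits = sum(rank_models(rng.sample(rows, keep), models)[0] == top
               for _ in range(reps))
    return top, hits / reps


# Hypothetical data: variables 0-1 covary strongly, as do variables 2-3.
data_rng = random.Random(42)
rows = []
for _ in range(60):
    a, b = data_rng.gauss(0, 1), data_rng.gauss(0, 1)
    rows.append([a, a + data_rng.gauss(0, 0.2), b, b + data_rng.gauss(0, 0.2)])

models = {
    "two_modules": [[0, 1], [2, 3]],
    "wrong_split": [[0, 2], [1, 3]],
    "null": [[0], [1], [2], [3]],
}
best, support = jackknife_support(rows, models, reps=200)
print(best, support)
```

Under these assumptions the null model always receives γ* = 0, so a negative γ* marks a model that fits better than expected for its number of zeros, which is the sense in which models are ranked in the text.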

Comparisons of Covariance among Lakes, Sand Dwellers, and Rock Dwellers.
If a single model fits two of our groups (Lake Tanganyika, LT; Lake Malawi, LM; Lake Victoria, LV; Rock dwellers, RD; Sand dwellers, SD) equally well, it would not necessarily mean that they were close to each other in our model space. This is because two objects that are equally distant from a third (the best supported model) are not required to occupy the same position, especially in a high-dimensional space. In our case, the γ* values calculated for each group represent reference points useful for determining their relative position. This vector of γ* values can have two interpretations, the first as a set of distances between the observed covariation matrix and known patterns of modularity and the second as coordinates for the data in "model space" centered on a group covariance pattern. Because each group may be centered at a different position, only the direction of these vectors can be compared, which was achieved through the use of correlations between the γ* vectors of each group. This involved two separate analyses: first, we determined levels of correlation for γ* across the three lakes; second, we determined levels of correlation for γ* among the LM, RD, and SD groups. However, we did not use all 137 possible γ* values in these correlations; rather, we used the ten top-ranked models for each group. This increased the likelihood that we were testing associations between the most biologically relevant models.
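The comparison described above can be sketched as follows. This is an illustration only: it assumes that the top-ranked models of all groups are pooled into one set and that a Pearson correlation is computed between groups over that pooled set; the group labels, model names, and γ* values are invented, and the example keeps the top three models per group rather than ten.

```python
from math import sqrt


def top_models(gamma_star, k=10):
    """Names of the k best-fitting models (lowest gamma* first)."""
    return sorted(gamma_star, key=gamma_star.get)[:k]


def pooled_top_models(groups, k=10):
    """Union of the top-k models of every group; shared models appear once."""
    pooled = set()
    for values in groups.values():
        pooled.update(top_models(values, k))
    return sorted(pooled)


def pearson(x, y):
    """Pearson correlation coefficient (assumed here to be the r-value used)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_table(groups, k=10):
    """Pairwise correlations between groups over the pooled top-k models."""
    pooled = pooled_top_models(groups, k)
    names = sorted(groups)
    return {(a, b): pearson([groups[a][m] for m in pooled],
                            [groups[b][m] for m in pooled])
            for i, a in enumerate(names) for b in names[i + 1:]}


# Hypothetical gamma* values for six models in three groups.
groups = {
    "A": {"m1": -0.50, "m2": -0.40, "m3": -0.30, "m4": 0.00, "m5": 0.10, "m6": 0.20},
    "B": {"m1": -0.45, "m2": -0.35, "m3": -0.20, "m4": 0.05, "m5": 0.10, "m6": 0.30},
    "C": {"m1": 0.20, "m2": 0.10, "m3": 0.00, "m4": -0.30, "m5": -0.40, "m6": -0.50},
}
pooled = pooled_top_models(groups, k=3)
table = correlation_table(groups, k=3)
print(len(pooled), table)
```

In this toy example groups A and B rank the models almost identically and correlate positively, whereas group C favours the models that fit A worst and correlates negatively with it. Pooling also shows how the number of shared models arises: with three groups and ten models each, any shortfall of the pooled set below 30 reflects models that appear in more than one top-ten list.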

Results
Monte Carlo tests were unable to distinguish among models, suggesting that the hypotheses were too similar to be distinguished from one another given the available sample sizes. We therefore base our interpretations on the relative rankings of γ* values and their jackknife support.
Overall, there was strong support for the hypothesis that modularity is present in the heads of African cichlids. Across the three lakes, the null model of no integration was ranked 57th, 100th, and 102nd out of the 137 models in LV, LM, and LT, respectively. In the RD and SD groups, the null model of no integration ranked 70th and 108th, respectively. Jackknife tests provided high support for these rankings in all groups.

Top-Ranked Models.
At all levels, the best supported hypothesis included one preorbital and one postorbital module. In our pooled data set across all lakes, as well as in the separate data sets for LV and RD, a preorbital module that defined the upper jaws and encompassed the exact same set of landmarks was identified (Figure 2). Support for these patterns of modularity was high, with the top model in the pooled sample of cichlids being ranked number 1 in 96.6% of jackknife reps. The LV and RD groupings had top models that were similarly well supported, in 84% and 85% of jackknife reps, respectively.
For LM as a whole and the SD sample, the top-ranked models displayed a preorbital module that encompassed both the upper and lower jaws (Figure 2). Statistical support for the LM model was high, with 86% of jackknife reps maintaining its top ranking. In the case of the SD sample, there were two statistically indistinguishable top models. The highest-ranked model included three modules: one encompassing the oral jaws, one defining the orbital size, and another covering much of the posterior region of the head. The second-ranked model was identical to the first with the exception that it did not possess an eye/orbital module. Support for the best SD model (i.e., three modules) was low, with only 47% of jackknife reps supporting its ranking. The second best SD model (i.e., two modules) was also ranked as the best model in 44% of jackknife reps. However, a subsequent set of analyses found that when one of these models was removed, support for the other model improved markedly, to the point where its top ranking was supported in over 97% of jackknife reps. This analysis suggests that both models are equally valid.
The LT dataset also showed strong support for a preorbital module in its top-ranked hypothesis (supported in 98.6% of jackknife reps). However, it differed from the other groups by having a preorbital module comprised primarily of the lower jaw (Figure 2).
Figure 2: The best supported hypotheses of modularity for African cichlids as a whole (a) and their respective adaptive radiations within lakes Malawi (b), Tanganyika (c), and Victoria (d). Also shown are the best-supported patterns for rock (e) and sand dwellers (f, g) within Malawi. Two models are shown for sand dwellers, because our statistical analysis was unable to discern whether one hypothesis was significantly better at describing patterns of covariance. Note that despite differences among these hypotheses, each contains a preorbital module based in the oral jaws.

Relationships among Patterns of Covariance Across Lakes, and between the Lake Malawi Sand and Rock Dwellers.
Across the three lakes, we observed strikingly similar patterns of covariation. We used γ* values from a total of 23 hypotheses of modularity, reflecting the top ten ranked models for each of the three lakes, meaning that 7 of these hypotheses were shared among lakes. The r-values for our tests were all extremely high and positive, indicating that in spite of differences between top-ranked models, very similar patterns of covariance underlie each of these adaptive radiations (Table 2). We also found that patterns of covariance may be diverging within LM. We used a total of 18 models to describe the top ten models across LM and within the RD and SD datasets. Thus, a total of 12 out of a possible 30 models were shared among these groups. The correlation between γ* values for LM as a whole and the SD dataset was particularly strong, indicating that sand dwellers may be influencing the overall pattern of modularity exhibited by Malawi cichlids. This result could be due, in part, to their larger relative sample size compared to RD cichlids. In contrast, LM as a whole showed almost no relationship with RD species, and there was a strong negative relationship between SD and RD species (Table 3). These data suggest that patterns of trait covariance are being driven apart between SD and RD.

Discussion
Our results demonstrate that a preorbital module is present in the oral jaws of East African rift valley cichlids and that this pattern of covariation is conserved across all lakes. This trend strongly supports the hypothesis that this pattern of modularity has influenced the rate and direction of adaptive phenotypic divergence among African cichlid radiations, an idea rooted in the proposal that the cichlid pharyngeal jaw apparatus is a key innovation that freed the oral jaws from a functional constraint [26], formalized in light of quantitative patterns of trophic divergence among cichlid lineages [28], and empirically tested here. While our results are compelling, we suggest that comparisons of rates of evolution with other groups that lack a pharyngeal jaw apparatus (e.g., salmonids and characids), and possibly a preorbital module, may be needed to confirm whether the patterns of modularity identified in cichlids represent a key innovation.

Conserved and Divergent Patterns of Craniofacial Modularity among Cichlids.
While the results of our correlation analyses indicate that general patterns of covariance are conserved across lakes, there were several notable differences in the top-ranked hypotheses of modularity among groups, suggesting that while conserved patterns exist, modularity itself is capable of evolving. The LM dataset had a pattern of modularity in which the preorbital module encompassed both the upper and lower jaws, while in the LT dataset, the preorbital module was exclusive to the mandible. For the LV dataset, integration was most prevalent for the upper jaws. The LV radiation is the youngest of the three rift lakes [50], and correspondingly, our prior analysis found relatively low levels of shape variation (disparity) in this lake compared to LM and LT cichlids [28]. Also, more than 60% of the morphological variation among species in LV can be explained by a single major axis (principal component), considerably more than was explained by this shared axis for LM and LT (i.e., Victoria cichlid head anatomies were relatively more integrated). Taken together, these results suggest that the younger divergence in LV is determined by a more limited set of strong interactions among traits. Since the upper jaw contains the anatomical linkages most responsible for highly kinetic jaw movements, such as jaw protrusion, this would imply that both the functional and morphological evolution of this lineage has been constrained. As the youngest of the three rift lake lineages, patterns observed within LV may offer insight into the proximate mechanisms that have shaped cichlid radiations in general. It is possible that the pattern of modularity we have identified in LV has played a dominant role in the early patterns of divergence of cichlids in LM and LT. Consistent with this idea, the preorbital module identified in the upper jaw for LV was very similar to one identified in the top-ranked model for our pooled data set across all cichlids (Figure 2).
The top-ranked models for the SD and RD clades within LM also exhibited notable differences. Whereas the SD group exhibited a preorbital module that included both the upper and lower jaws, RD species expressed a pattern of modularity similar to that of LV, where only the upper jaws were integrated. Moreover, the SD/RD division within Malawi was characterized by a strong negative relationship in covariance patterns, suggesting that ecological competition between these clades during the early history of the lake may have caused patterns of trait covariance to diverge. This pattern is consistent with character displacement, but at a different biological scale (groups of species or clades) than where it is usually recognized [51][52][53]. Character displacement is often thought to occur between two closely related species; however, research suggests that character displacement can also occur between distantly related species, as well as whole communities [54,55]; see also [56,57] for evidence of character displacement in African cichlids. Therefore, it is appropriate to speculate that this process is contributing to divergence between SD and RD clades in LM.
Integration between the upper and lower jaws, as displayed by the SD dataset, may be especially advantageous for ram/suction-feeding predators, a predominant SD trophic niche [58], because both jaws need to work together in a highly coordinated fashion to produce kinematic force [59]. In contrast, in RD species that most often employ a biting tactic whereby the upper jaw is relatively more stationary during foraging [26], the upper jaw is integrated, and the lower is not. This implies that the lower jaw in RD species is free to evolve a wide array of geometries, which may be advantageous for substrate-feeding species, where demands on the lower jaw should be more variable relative to the upper jaw apparatus. However, this is not to say that there is a complete lack of integration between the upper and lower jaw, as modularity is a matter of degree rather than an all-or-nothing phenomenon [12]. Also, it is important to note that patterns of divergence among SD and RD are still acting within the overall context of a preorbital module (i.e., both upper and lower jaw for SD, upper jaw for RD), suggesting that the rate and direction of phenotypic evolution are being dictated by historical constraints that are manifested in patterns of covariance and modularity. In other words, putative character displacement between SD and RD species in Malawi cichlids may be proceeding along genetic lines of least resistance [5,53].

Origins for Adult Patterns of Modularity: Developmental Mechanisms.
Although there are a number of possible functional explanations for patterns of craniofacial modularity, it is important to remember that selection must work within the context of developmental systems to improve functional performance. That is not to say development inherently constrains evolution, but rather that it can direct its outcome in concert with selection. In fact, simulations have shown that some degree of order may actually be required for evolution to proceed with ease [60]. It is, therefore, probable that the patterns of craniofacial modularity identified here, while probably causing an increased propensity for adaptations involving the oral jaws, are also dictated by underlying developmental processes. Clues to these potential processes may lie in early embryological events during the formation of craniofacial anatomy in fishes (see [61] for a similar view in mammals).
Structural progenitors of the ossified structures in the preorbital region of the skull include the trabeculae and ethmoid cartilages (i.e., anterior neurocranium), palatoquadrate (i.e., upper jaw precursor), and Meckel's cartilage (i.e., lower jaw precursor). All of these structures are derived from the same population of anterior cranial neural crest (CNC) cells that migrate away from neural tissue beginning at approximately 12 hours postfertilization (hpf) in zebrafish [62]. Thus, the preorbital region of the skull is defined early in development, and these events may underlie the persistence of a preorbital module among African rift-lake cichlids. For instance, LM cichlids show integration between the upper and lower jaws, suggesting that this developmental hypothesis may have particular merit for this adaptive radiation.
The modular divisions between the upper and lower jaws found between LV and LT may be influenced by slightly later developmental events. Fate mapping experiments in zebrafish show that at approximately 24 hpf the stomodeum forms as an invagination of the oral ectoderm, and both the pterygoid process and anterior neurocranium reside within a compact condensation of cells closely associated with the dorsal edge of this structure, whereas Meckel's cartilage forms from cells ventral to this structure [62]. Thus, while early ontogenetic events (i.e., CNC migration) regionalize the skull along the anterior-posterior axis, slightly later events (i.e., formation of the mouth) are necessary to specify the dorsal-ventral identity of the jaws within the preorbital region of the skull.
Later still in development, the sequence of ossification in bones of the craniofacial region may play a role in determining patterns of modularity. Evidence from zebrafish and Nile tilapia (Oreochromis niloticus) shows that the oral jaws (premaxillae, maxillae, and dentary) are among the first structures to become mineralized in the teleost head [63,64]. In fact, the only other structures that ossify as early as the oral jaws are the basioccipital and opercle. Functional reasons have been attributed to this chronological pattern in teleost development [63,[65][66][67]. Specifically, bones involved in early basic functions such as respiration and feeding have been observed to ossify first. This suggests that the bones of the oral jaws and opercular regions of the skull are predisposed to reflect the patterns of variational modularity we have identified. Ossification sequence, and heterochronic shifts in this process, could, therefore, act as another early mechanism that sets the stage for craniofacial modularity throughout life history.

Origins for Adult Patterns of Modularity: Integrating Developmental and Functional Processes.
Beyond initial ossification, bone remodeling over ontogeny could represent another means of achieving modularity of the oral jaws and a way of simultaneously integrating developmental and functional mechanisms in a straightforward way. Bone is a dynamic, metabolically active tissue that is constantly being renewed and changed. Bone cells are strain sensitive and can transduce signals from mechanical loading into cues that result in either reduced bone loss or gain [68][69][70][71]. Disuse usually causes an acceleration of bone turnover, with resorption being the dominant process. Conversely, excessive strain can damage bone, which may in turn be repaired or further reinforced through remodeling. Importantly, both bone resorption and deposition involve highly conserved genetic and developmental pathways [72][73][74].
Mechanical stimuli may be particularly important for inducing adaptive patterns of modularity through the process of bone remodeling. Bone turnover tends to be most effective in areas of high stress, thus reducing the risk of injury [69]. In teleosts, the oral jaws are used for both respiration and food acquisition, but it is likely that the oral jaws are under the highest stress during food acquisition and processing, which should in turn provide the greatest stimulus for bone remodeling [72]. Indeed, several lab-based studies on cichlids have documented that different diet treatments can induce changes in bone and head shape [75,76], demonstrating the ability of elements in the upper and lower jaws to respond to mechanical stimuli through changes in shape. Within the RD lineage of LM, it is certainly possible that a high degree of remodeling and plasticity of the lower jaw has led to a pattern of modularity wherein the mandible lacks a measurable degree of integration across species. The lower jaw may be more amenable to remodeling due to the greater degree of movement that it is afforded in the RD lineage. Alternatively, patterns of integration within the lower jaw may differ between species, resulting in a perceived lack of integration in the combined dataset. In either case, our results point to the lower jaw as a highly evolvable trait within the RD lineage.
Perhaps the most compelling evidence for a fundamental link between developmental and functional processes comes from work on the BMP family of signaling proteins (reviewed by [70]). Critical roles for BMP signaling during bone and cartilage development are well established (reviewed by [77]), and variation in BMP expression over ontogeny has been associated with the origin and adaptation of key vertebrate innovations including the turtle shell [78], bat wing [79], cichlid mandible [44,80], and bird bills [81][82][83]. All of these examples involve differential BMP expression that is presumed to be due to mutational effects (either cis or trans), but several studies have also documented environmentally induced changes in BMP expression in skeletal tissue. Specifically, tensile stress has been shown to alter BMP expression during bone growth [84,85], remodeling [86], and repair [87,88]. Thus, a scenario wherein patterns of craniofacial modularity are established via early developmental mechanisms and then either reinforced or altered by functional processes might represent the true nature of variational modules within the cichlid skull. Examining how patterns of integration potentially shift over ontogeny and under different feeding regimes in different cichlid lineages would represent a fruitful line of future research.

Modularity and Evolvability of the Craniofacial Skeleton in Cichlids.
Recent reviews suggest that an extended evolutionary synthesis (EES) is necessary to account for the origins of variation that is acted upon by natural selection [6,7]. The empirical center for the EES will lie in discovering the features of organisms that determine evolvability [7]. While specific definitions of evolvability are numerous and vary according to context, modularity figures prominently in these discussions insofar as it imposes a constraint on the direction or speed of evolutionary change [12,13]. In this context, we suggest that modularity can act as a "key innovation". While key innovations are typically defined by the appearance of an anatomical structure that precedes an adaptive radiation, as is the case for the pharyngeal jaws [26], we contend that patterns of modularity, whereby the cichlid oral jaws represent a module that allows them to change with a high degree of autonomy, have had a strong influence on the rate and direction of adaptive divergence in this group. This pattern of modularity is likely what has allowed for the rapid lengthening or shortening of the oral jaws relative to the rest of the head in cichlids, the shape changes that comprise the major axes of variation in each of the three African rift lakes, and it likely represents the template upon which additional changes in trophic morphology occur [28]. In other words, the evolution of this pattern of modularity may facilitate evolution, providing an example of the "evolution of evolvability" (see [7]). The degree to which these patterns are specific to cichlids, or represent a more generalized perciform innovation, will be an important area of future study. Several avenues may have led to preorbital modularity; therefore, finding groups that lack this pattern of modularity and comparing rates of diversification will be important for identifying its potential role as a key innovation.
As discussed above, several avenues may have led to the consistent patterns of preorbital modularity we have discovered. In the order of their ontogenetic appearance, these include (1) migration and specification of progenitor cells, (2) dorsal-ventral division of the oral cavity, (3) sequence of ossification with early calcification of the jaws and operculum region, and (4) remodeling of bone in response to mechanical stimuli. These represent separate hypotheses and processes that can be tested to understand the developmental and genetic basis of a preorbital module. We predict that each of these processes may play an important role in determining modularity in the cichlid head, depending on the lineage being queried. Fortunately, we have the means to assess patterns of modularity over ontogeny in cichlids and can statistically track when the patterns we have identified in adult cichlids begin to emerge. We also have the means to identify QTL associated with these anatomical modules and to track changes in gene expression during the emergence of these patterns [43,44]. In all, cichlids represent an attractive model to reveal both the genetic basis of modularity and the evolvability of the craniofacial skeleton.