Reliability of Growth Indicators and Efficiency of Functional Treatment for Skeletal Class II Malocclusion: Current Evidence and Controversies

Current evidence on the reliability of growth indicators in the identification of the pubertal growth spurt and on the efficiency of functional treatment for skeletal Class II malocclusion, the timing of which relies on such indicators, is highly controversial. Regarding growth indicators, the hand and wrist maturation method (including the middle phalanx of the third finger alone) and the recording of standing height appear to be the most reliable. Other methods are subject to controversy or have been shown to be unreliable. The main sources of controversy include the use of single stages instead of ossification events and diagnostic reliability conjecturally based on correlation analyses. Regarding evidence on the efficiency of functional treatment, a more favorable response is seen in skeletal Class II patients when they are treated during the pubertal growth spurt, even though large individual variation in responsiveness remains. The main sources of controversy include the design of clinical trials, the definition of Class II malocclusion, and the lack of inclusion of skeletal maturity among the prognostic factors. While no growth indicator may be considered to have full diagnostic reliability in the identification of the pubertal growth spurt, their use may still be recommended for increasing the efficiency of functional treatment for skeletal Class II malocclusion.


Background
It was reported decades ago that the growth rate of the mandible is not constant throughout development [1][2][3], showing a peak during puberty [1,2,4,5]. However, the intensity, onset, and duration of the pubertal growth peak (including the mandibular growth peak) are subject to noteworthy individual variation [1,3,4,5]. Deficient mandibular growth on the sagittal plane is the most frequent diagnostic finding in skeletal (and dental) Class II malocclusion, which occurs in up to one-third of the population [6,7]. Thus, a therapy able to enhance mandibular growth is indicated in skeletal Class II patients [8]. In this regard, animal studies have shown that forward mandibular displacement enhances condylar growth, resulting in significant mandibular elongation [9,10]. Consequently, a wide range of functional appliances (either removable or fixed) have been developed to stimulate mandibular growth by forward posturing of the mandible.

Common Issues related to the Investigation and Use of the Skeletal Maturity Indicators
Current evidence on the reliability of the different growth indicators and consequent definition of treatment timing is highly controversial. Contrasting results have been reported on the capability of the growth indicators (mainly the CVM method) in the identification of the mandibular growth peak [46][47][48][49][50][51][52][53] and on the efficiency of functional treatment for Class II malocclusion [13,18]. The investigation on growth indicators has common sources of controversies for all the indicators and specific issues related to each indicator. Herein, common controversial issues to all indicators are listed, while specific issues and controversies on the functional treatment are reported below.

Stages versus Ossification Events.
In using radiographical indicators of growth phase that are based on sequential discrete stages, an important distinction has to be made between stages and ossification events [54,55]. The stages are specific periods in the development of a bone that have been described in that particular rating method, while an ossification event occurs when a given stage matures into the following one [54,55]. Of particular clinical relevance, as an ossification event is defined as the midpoint between two consecutive stages, proper identification of an event requires serial radiographs. The main limitation raised by the use of single stages resides in the fact that these stages have variable duration [35,47,55,56], as has been seen for the HWM [5,55], MPM [37], and CVM [47,56] methods, making the prediction of the imminent growth spurt less reliable. Therefore, the exact determination of the imminent growth spurt would require closer monitoring of the ossification event, that is, longitudinal recordings, rather than being based on a single stage. This aspect is of further relevance considering that fine transitional changes in the hand and wrist or cervical vertebral morphology may be responsible for determining a pubertal or nonpubertal stage. According to these concepts, longitudinal studies on the capabilities of the different indicators in the identification of the mandibular growth peak (or pubertal growth spurt) are to be preferred over cross-sectional ones. From a clinical standpoint, whenever possible, serial monitoring should be preferred over growth prediction based on single staging.
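As an illustration of the distinction between stages and ossification events, the following minimal sketch estimates the age of an event from serial recordings of one subject. It assumes that the "midpoint between two consecutive stages" can be approximated by the midpoint in age between the last visit at a given stage and the next visit at the following stage; the function name, the record layout, and the example values are hypothetical and are not taken from the cited studies.

```python
from typing import List, Optional, Tuple


def ossification_event_age(
    records: List[Tuple[float, int]], from_stage: int
) -> Optional[float]:
    """Estimate the age at which `from_stage` matured into the next stage.

    records: (age_in_years, stage) pairs from serial radiographs of one subject.
    Returns the midpoint in age between two consecutive visits where the
    first shows `from_stage` and the second shows `from_stage + 1`.
    Returns None when that transition was not observed (for example, with a
    single radiograph, or when a stage was skipped between visits).
    """
    ordered = sorted(records)  # sort visits by age
    for (age_a, stage_a), (age_b, stage_b) in zip(ordered, ordered[1:]):
        if stage_a == from_stage and stage_b == from_stage + 1:
            return (age_a + age_b) / 2.0
    return None


# Hypothetical subject monitored once a year (illustrative values only).
visits = [(10.0, 2), (11.0, 2), (12.0, 3), (13.0, 4)]
print(ossification_event_age(visits, 2))        # transition from stage 2 to 3
print(ossification_event_age([(12.0, 3)], 3))   # single stage: no event
```

With a single recording the function returns None, which mirrors the point made above: a stage alone does not locate the event, and the precision of the estimate depends on the interval between visits.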

Correlation Analysis versus Diagnostic Reliability.
In spite of the huge number of studies on growth indicators and pubertal growth spurt, the diagnostic reliability of any of the growth indicators in the identification of the peak in standing height or mandibular growth on an individual basis is yet undetermined. Of note, correlations between parameters do not necessarily imply diagnostic accuracy [57,58]. One of the reasons underlying this noteworthy lack of data may reside in the difficulty of obtaining diagnostic parameters, such as sensitivity, specificity, and accuracy, from longitudinal data in a subset of selected subjects all with a predetermined condition (mandibular growth peak) or a diagnostic outcome (a given HWM/CVM stage). However, the identification of a mandibular growth peak requires longitudinal data, and it is defined as the greatest growth interval [21,37].
To overcome such limitations, a recent study [21] using already published data on the CVM method [49] has introduced a simple procedure to derive data on diagnostic reliability in the case of longitudinal recordings of growth indicators and mandibular growth. In particular, individual CVM stages and increments in mandibular growth recorded longitudinally were analysed in a group of subjects according to the different predetermined annual (chronological) age intervals. Therefore, a full diagnostic reliability analysis, including sensitivity, specificity, positive and negative predictive values (PPVs and NPVs), and accuracy, of a given CVM stage in the identification of the mandibular growth peak could be carried out within each age interval group. To date only limited longitudinal studies reported on the diagnostic reliability of the CVM [21] and MPM [37] methods in the identification of the mandibular growth peak. Therefore, longitudinal studies reporting diagnostic reliability should be preferred over investigations using bivariate correlations [59,60] or even multiple regression analyses [61,62].
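The diagnostic parameters named above follow from a two-by-two cross-tabulation of the test result (a given stage is present or absent) against the reference condition (the growth peak is present or absent in that interval). The sketch below is a generic illustration of those standard definitions, not a reproduction of the procedure in [21]; the function names and the counts in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class DiagnosticSummary:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    accuracy: float


def tabulate(stage_present: Iterable[bool], peak_present: Iterable[bool]) -> Tuple[int, int, int, int]:
    """Count (TP, FP, FN, TN) from paired per-subject observations."""
    tp = fp = fn = tn = 0
    for test_pos, truth_pos in zip(stage_present, peak_present):
        if test_pos and truth_pos:
            tp += 1
        elif test_pos:
            fp += 1
        elif truth_pos:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> DiagnosticSummary:
    """Compute the standard diagnostic parameters from a 2x2 table."""
    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")  # undefined when no cases

    return DiagnosticSummary(
        sensitivity=ratio(tp, tp + fn),  # peaks correctly flagged by the stage
        specificity=ratio(tn, tn + fp),  # non-peaks correctly not flagged
        ppv=ratio(tp, tp + fp),          # flagged subjects truly at the peak
        npv=ratio(tn, tn + fn),          # unflagged subjects truly not at the peak
        accuracy=ratio(tp + tn, tp + fp + fn + tn),
    )


# Hypothetical counts for one age-interval group (not data from any study).
print(diagnostic_summary(tp=8, fp=3, fn=2, tn=37))
```

Because the peak is usually present in a minority of subjects within any one age interval, accuracy can remain high while the positive predictive value is noticeably lower, which is why the full set of parameters is more informative than a correlation coefficient.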

Definition of Total Mandibular Length.
In several studies on the reliability of growth indicators [34,35,63-66] or on the efficiency of functional treatment for Class II malocclusion [17,26,67,68] (see below), the landmark Articulare (Ar) was used instead of the landmark Condylion (Co) to assess the posterior end-point of the mandible. The Ar is defined as the point of intersection of the images of the posterior border of the ramal process of the mandible and the inferior border of the basilar part of the occipital bone [69]. The problem with Ar is that it is not an anatomical landmark that pertains to the mandible exclusively. On the other hand, the landmark Ar has the advantage of being more easily identified than the Co. Even though a previous study [70] reported close correlation between the Ar-Pogonion (Pog) and Co-Pog distances on a sample of 60 cases, other evidence [71,72] suggested the use of the point Co over Ar as being more reliable in terms of mandibular growth recording. In particular, the posture of the mandible might also affect the position of Ar [71]. Yet repeatability analysis on a cross-sectional sample [70] does not provide evidence that, in a longitudinal analysis, increments in mandibular length (as Ar-Gn and Co-Gn or Ar-Pog and Co-Pog) would yield overlapping patterns of mandibular growth peaks (which are mostly used to validate growth indicators). Therefore, future data are warranted to fully elucidate whether the different landmarks may be used interchangeably.

Hand and Wrist Maturation Method
The use of the hand and wrist bones for the assessment of skeletal maturity was initially reported by Todd [73], followed by others [30,74,75]. In particular, all of these methods were based on the assessment of a skeletal age (in years) according to specific ossification events of the hand and wrist. Subsequently, such individual skeletal age had to be compared with reported norms. For reasons listed below, stage-based procedures for hand and wrist maturation have been added. Among the different stage-based HWM methods [32,34,35,65], the most used nowadays both in research and clinical practice is likely to be that proposed by Fishman [35], also known as skeletal maturation assessment (SMA). Details of the 11-stage HWM method according to Fishman [20] are summarized in Table 1 and shown in Figure 1, while main longitudinal investigations in relation to mandibular growth in untreated subjects without major malocclusion are summarized in Table 2.
Figure 1: Diagram of the stages of the hand and wrist maturation (HWM) method according to Fishman [35]. The method is also referred to as skeletal maturity assessment (SMA). Blue, prepubertal stages; red, pubertal stages; black, postpubertal stages. See Table 1 for details. Modified from Fishman [35] with permission.
SMI, skeletal maturity indicator.

Current Evidence.
All the published longitudinal studies on the HWM methods and mandibular growth peak included Caucasian [35,53,65] and Australian aborigine [34,76] subjects, and none reported a specific diagnostic reliability analysis. Tofani [65] reported that the onset of fusion of the distal phalanges is a good predictor of mandibular growth peak; however, this study included only females. The study by Grave [34] also reported moderate significant correlations of hand and wrist maturation with mandibular growth peak for both females and males. A further study by Grave and Brown [76] on the same sample reported previously [34], investigating the HWM method with standing height, reported that peak height velocity would occur up to 3 and 6 months, in males and females, respectively, after the attainment of the third finger middle phalanx (MP3) stage G (corresponding to SMI6, see Table 1). In the HWM method according to Fishman [35], the peak in mandibular growth (as Ar-Gn) would occur in stages 6 and 7 for females and males, respectively [35]. Further studies correlated this HWM method with standing height [32,54,76]. Similarly, the study by Mellion et al. [53] reported, for the HWM method according to Fishman [35], moderately strong and weaker relationships in males and females, respectively. In particular, the HWM method assessments had consistently lower errors than either mean chronologic age or the CVM method in the identification of both the peaks in standing height and mandibular length [53]. Of note, a previous longitudinal study [77] compared the skeletal age of the whole HWM method (according to Todd [73] and Greulich and Pyle [30]) with specific ossification events of the first, second, and third finger, referred to as the three-finger maturation assessment. As a result, the three-finger maturation assessments were shown to mature slightly in advance of the whole HWM assessments. However, this study [77] was based on correlation analyses and differences in skeletal age between methods, lacking a true diagnostic analysis [78] of concordance or measurement of agreement [79].
Figure 2: Diagram of the improved third finger middle phalanx maturation (MPM) method according to Perinetti et al. [37]. Blue, prepubertal stages; red, pubertal stages; black, postpubertal stages. See Table 3 for details. Modified from Perinetti et al. [37] with permission.
The method of Greulich and Pyle [30] and other similar methods [73][74][75] have been criticized in that it may be difficult to set a reference standard, because of the differential rate of maturation in different bones across individuals of the same population or across different populations [54,80]. For this reason, several standards, that is, norms, have been published for the hand and wrist maturation assessment according to the population of interest. For more detail, see Greulich and Pyle [30] and Todd [73] for white American subjects, Sutow and Ohwada [74] for Japanese subjects, and Tanner and Whitehouse [75] for British subjects.
However, such norms are not always available for each population, while another important issue relates to the secular trends, with successive generations becoming taller and reaching puberty at earlier stages [81,82]. Therefore, the staging of skeletal maturity by describing specific ossification events on the hand-wrist radiograph [32,34,35,53,65,66,83,84] may be a valid tool as being more independent of differences among populations and secular trends and availability of published standards [80]. The methods based on ossification events [32,34,35] might thus be considered to have a wider clinical applicability.

Clinical Implication.
Even though the number of studies correlating the HWM methods with mandibular growth peak is limited (Table 2), all of these investigations concluded that these methods may be useful in clinical practice. Therefore, the use of the HWM method may be recommended for planning treatment timing. In spite of this favorable evidence, the main disadvantage of the HWM method is the need for an additional film, with consequent increased radiation exposure of the whole hand and wrist. This aspect would prevent serial recordings to monitor the ossification events closely, so that the diagnosis has to rely on single stages.

Third Finger Middle Phalanx Maturation Method
Previous studies reported above on the HWM methods [34,54,76,85] provided an indication of the possibility for the third finger middle phalanx maturation to be used alone as an indicator of skeletal maturity. Close concurrence of the attainment of MP3 stage G with the peak height velocity has been reported for both males and females [54,85]. Similar results were seen when correlating the third finger middle phalanx maturation with mandibular growth peak [35,76]. Therefore, the use of the third finger middle phalanx alone as a maturational method has been proposed [36,38,86-88]. This third finger middle phalanx maturation (MPM) method [37,78] would thus have the advantage of an easy interpretation of the stages, without double contours or superimposition by other structures. Details of a 5-stage MPM method according to Perinetti et al. [37] are summarized in Table 3 and shown in Figure 2, while the only longitudinal investigation [37] in relation to mandibular growth in untreated subjects without major malocclusion is summarized in Table 4.

Current Evidence.
All previous investigations [36-38, 78, 86-88] suggested the use of the MPM method in clinical practice. The main advantage of the MPM method resides in the minimal radiation exposure, which would allow close monitoring of the ossification events by longitudinal recordings. Therefore, ideal timing of treatment in individual patients may be identified more precisely than when information comes from a single recording, as is the case for the HWM and CVM methods. Finally, the MPM method is easy to execute and interpret and may be performed in any clinical setting with minimal instrumentation. In spite of the potential clinical advantages offered by the MPM method, current evidence is still limited. The present investigations [36,38,78,86-88] are limited by their cross-sectional designs, in which the MPM method was analyzed in correlation [36,38,86-88] or in diagnostic agreement [78] with the CVM method. Indeed, such analyses do not prove the diagnostic reliability of the method in the identification of the pubertal/mandibular growth peak.
Table 3: Description of the stages of the third finger middle phalanx maturation (MPM) method according to Perinetti et al. [37].
MPS1. Stage description: epiphysis is narrower than the metaphysis, or epiphysis is as wide as metaphysis but with both tapered and rounded lateral borders. Epiphysis and metaphysis are not fused. Reported as MP3-F [32]. Attainment: more than 1 year before the onset of the pubertal growth spurt [32] or mandibular growth peak [37].
MPS2. Stage description: epiphysis is at least as wide as the metaphysis with sides increasing thickness and showing a clear line of demarcation at right angle, either with or without lateral steps on the upper contour. In case of asymmetry between the two sides, the more mature side is used to assign the stage. Reported as SMI2 [35] or as MP3-FG [32]. Attainment: 1 year before the pubertal growth spurt [32] or mandibular growth peak [37].
MPS3. Stage description: epiphysis is either as wide as or wider than the metaphysis with lateral sides showing an initial capping towards the metaphysis. In case of asymmetry between the two sides, the more mature side is used to assign the stage. Epiphysis and metaphysis are not fused. Reported as SMI6 [35] or as MP3-G [32]. Attainment: at coincidence of the pubertal growth spurt [32] or mandibular growth peak [37].
MPS4. Stage description: epiphysis begins to fuse with the metaphysis although contour of the former is still clearly recognizable. The capping may still be detectable. Reported as MP3-H [32]. Attainment: after the pubertal growth spurt [32] or mandibular growth peak [37].
MPS5. Stage description: epiphysis is totally fused with the metaphysis. Reported as SMI10 [35] or as MP3-I [32]. Attainment: at the end of the pubertal growth spurt [32].
The results of the recent longitudinal study [37] on diagnostic reliability (Table 4) showed that the MPM stage 2 (MPS2) precedes the mandibular growth spurt, which is generally concomitant with MPS3. However, even though the overall diagnostic accuracy of 0.91 was satisfactory, the overall positive predictive value was 0.73, meaning that false positives may be encountered. This finding was mainly due to the duration of MPS2, which in some cases lasted for 2 years, and it was more evident in the older age groups. Again, following the ossification events should be preferred over basing growth prediction on single stages [54].

Clinical Implications.
Although further investigations are needed, MPS2 and MPS3 may be considered to be associated with the onset and the maximum of the mandibular growth peak, respectively, in most subjects, and may therefore be used for planning the timing of functional treatments, especially for skeletal Class II malocclusion [4]. Given the minimal radiation exposure, longitudinal monitoring is recommended to follow the ossification events closely. Finally, a combined use of the MPM method with a further noninvasive indicator of the pubertal growth spurt, that is, standing height, especially in older adolescents, might increase diagnostic reliability [37].

Cervical Vertebral Maturation Method
The CVM method was initially proposed by Lamparski [39] and then modified by others [20,33,46,49]. In this procedure, the shape of the first cervical vertebrae is analyzed to derive information on the growth phase of the subject. In particular, the original method by Lamparski [39] uses vertebrae that can be obscured by the thyroid collar and relies on interstage comparisons, while the subsequent variants of the CVM method [20,33,46,49] are less or not dependent on interstage comparisons. The most common CVM methods are the variants proposed by Hassel and Farman [33] and Baccetti et al. [20], where mandibular growth peak has been reported to occur between stages 3 and 4 [20,21,46,49]. Among the main advantages of the CVM method is the fact that, in contrast to the HWM method, it does not require supplementary radiographic exposure, since a lateral head film is usually available as a pretreatment record. Details of the 6-stage CVM method according to Baccetti et al. [20] are summarized in Table 5 and shown in Figure 3, while main studies in relation to mandibular growth in untreated subjects without major malocclusion are summarized in Table 6.
Figure 3: Diagram of the stages of the cervical vertebral maturation (CVM) method according to Baccetti et al. [20]. Blue, prepubertal stages; red, pubertal stages; black, postpubertal stages. See Table 5 for details.

Current Evidence.
According to previous evidence [20,46,49], maturation of the cervical vertebrae occurs in females earlier than in males. Ideally, CVM stages from 2 to 4 should have precise durations so that interventions may be easily planned on the basis of a single lateral head film. In this regard, each CVM stage from 2 to 5 has been reported to last 1 year according to Franchi et al. [49] (  [89] and Class II subjects [90], respectively, as compared to that of Class I subjects. However, these studies [89,90] were limited by their cross-sectional design, which did not allow the detection of any individual variation in the duration of single CVM stages. Longitudinal studies correlating facial growth patterns with the duration of CVM stages are still missing. Many previous studies were limited to correlation analyses between the different CVM and HWM methods [33, 36, 48, 59, 86-88, 91, 92] with no information on the mandibular growth (or standing height) peak; other studies were limited to the longitudinal investigation of the cervical vertebral maturational changes [56] or investigated the potential of the CVM method to detect postpubertal mandibular growth [50]. A further investigation [62] was focused on the capability of the CVM method to predict the total amount of mandibular growth from prepubertal to postpubertal phases, irrespective of the timing of the pubertal growth peak [93], and it included exclusively Class II female subjects, where mandibular growth peak has been shown to be minimal or absent [23]. However, as for the HWM method, the most relevant information may be derived from longitudinal studies investigating the capabilities of these methods in detecting the mandibular growth peak, possibly in individual subjects. Previous studies on the CVM method and mandibular growth peak have reported contrasting results of negligible [47,48,53,61,62] and noteworthy [49,52,64,66] correlations.
Interestingly, only a few studies [21,47,49,52,53,61,64,94] (Table 6) correlated the CVM method (as a stage system) with mandibular growth under longitudinal monitoring. According to this evidence, a total of five studies [21,49,52,64,94] reported mandibular growth peak to occur during stages 3 and 4, and four [21,49,64,94] of them recommended the use of the CVM method in treatment planning. One study [21], however, used the same sample as Franchi et al. [49], from which the CVM method was derived. The remaining three studies [47,53,61] failed to detect a significant correlation between the CVM method and mandibular growth peak and did not recommend the method for treatment planning.

Current Controversies.
When reporting on the CVM method, the different variants of the method [20,33,46,48,84] have to be taken into account and results should be limited to the investigated methods or parameters [95]. Significant differences in study designs, cephalometric recordings, and data analysis have to be taken into account when dealing with the clinical usefulness of the CVM method. For instance, apart from the study [21] using the same sample reported by Franchi et al. [49] (Table 6), the only investigation [66] that has reported on the diagnostic capability of the CVM method in the identification of the mandibular growth peak used receiver operating characteristic curves. However, this study [66] was based on a cross-sectional sample and it was limited to the analysis of the area under the curve, which is not enough to describe in full the diagnostic reliability of the method. Therefore, conclusions on the diagnostic reliability of the CVM method in the identification of the mandibular growth peak have conjecturally been based on differences among groups/stages [47,49,52,64,94], regression analyses [61], or other analyses missing diagnostic capabilities [53].
Another relevant issue when dealing with the CVM method resides in its repeatability. The method has been reported to have poor repeatability [96,97]. Although this limitation may be avoided by proper training [98], poor repeatability has been seen even in studies correlating the CVM method with mandibular growth [62], while longitudinal investigations herein considered (Table 6) reported no information [52,53,64,94] or good to high repeatability [47,49,61] in the CVM stage assignment. Finally, when assigning the CVM stage, it has been suggested that exceptional cases, that is, cases outside the reported norms, may exist [98] and this may be responsible for doubtful interpretation and poor reproducibility.

Clinical Implications.
As for the HWM method, the CVM methods require films that are usually available as a pretreatment record, while optimal treatment timing is to be delayed for an undetermined period after the diagnosis. Therefore, further reevaluation of the growth phase needs a new lateral head film, which would not be indicated. Moreover, the cervical vertebrae might be partially covered by the protection collar, which would be necessary to reduce radiation exposure [99]. Apart from this consideration, the use of the CVM method requires proper training in stage assignment and knowledge of exceptional cases [98]. Moreover, variability in the duration of the CVM stages 2 to 4 [47,56] has to be taken into account, and functional treatment requiring the inclusion of the mandibular growth spurt in the active treatment period should last until attainment of CS5 [21]. Future longitudinal studies on the diagnostic reliability of the CVM method in the identification of the mandibular growth peak are still necessary to fully elucidate the clinical usefulness of the method.

Dental Maturation Method
Dental maturity can be assessed by the exfoliation of deciduous teeth, such as the second molars [100], phases of dentition [101], dental emergence [5,32], or calcification stages through the evaluation of tooth formation [40]. The assessment of calcification stages of the teeth can be carried out on panoramic radiographs that are routinely used for different purposes, with mandibular teeth preferred over maxillary ones as they are less subject to superimpositions from other skeletal structures. Even intraoral radiographs may be used, with minimal irradiation to the patient. Therefore, dental maturation has been proposed as a further useful method for assessing the growth phase in individual subjects [31]. The most common method used for scoring dental maturation is the one described by Demirjian et al. [40]. This method has the advantage of using values of root formation relative to the crown height, rather than absolute lengths. As a result, foreshortened or elongated projections of developing teeth will not affect the reliability of this assessment [40]. Details of the dental maturation method according to Demirjian et al. [40] are summarized in Table 7 and shown in Figure 4, while main cross-sectional studies of diagnostic reliability using the HWM or CVM methods in untreated subjects without major malocclusion are summarized in Table 8.

Current Evidence.
The period corresponding to the exfoliation of the deciduous second molars has been advocated as favorable for the beginning of a one-phase orthodontic treatment in growing subjects [102]. However, as previously reported [100], the exfoliation of the deciduous second molars has no significant relationship with the onset of the pubertal growth spurt (Table 8). Similarly, the assessment of the phase of dentition (as deciduous, early mixed, mixed, and permanent) is a simple procedure and has been used to assess the effects of different treatment timing in Class II patients [103]. However, the only study [101] on diagnostic reliability (Table 6) reported that neither the early mixed nor the mixed dentition phases are valid indicators of the pubertal growth spurt. Therefore, the use of the exfoliation of the deciduous second molar or phases of dentition is not recommended for treatment planning. Similarly, dental emergence has also been reported to be poorly correlated with the pubertal growth spurt [5,32]. Regarding dental calcification stages, high correlations with skeletal maturity have been reported by most of the investigations performed to date using the CVM [31,60,86,104-109], HWM [106,110,111], or MPM [86,112] methods. As a consequence, most of the studies have proposed the staging of dental maturation as a reliable indicator of the individual skeletal maturity, which has major diagnostic implications [31, 60, 106-109, 111, 113-117]. On the contrary, other studies [104,105,110,112] (Table 8), including a meta-analysis [118], reported a very limited clinical usefulness of dental maturation in the identification of the pubertal growth spurt.
Table 7: Description of the stages of the most common dental maturation method according to Demirjian et al. [40].
Stage D. Stage description: when (1) the crown formation is complete down to the cementoenamel junction; (2) the superior border of the pulp chamber in the single-root teeth has a definite curved form, with it being concave towards the cervical region; the projection of the pulp horns, if present, gives an outline shaped like the top of an umbrella; and (3) the beginning of root formation is seen in the form of a spicule. Attainment: canine, premolars, and second molar before the pubertal growth spurt [28,57,110,112].
Stage E. Stage description: when (1) the walls of the pulp chamber form straight lines, the continuity of which is broken by the presence of the pulp horn, which is larger than in the previous stage and (2) the root length is less than the crown height. Attainment: mostly, canine and first premolar before the pubertal growth spurt [28,57,110,112].
Stage F. Stage description: when (1) the walls of the pulp chamber form a more or less isosceles triangle, with the apex ending in a funnel shape and (2) the root length is equal to or greater than the crown height. Attainment: sometimes, canine before the pubertal growth spurt [28,57,112].
Stage G. Stage description: when the walls of the root canal are parallel and its apical end is still partially open. Attainment: canine, premolars, and second molar before, during, and after the pubertal growth spurt [28,57,110,112].
Stage H. Stage description: when (1) the apical end of the root canal is completely closed and (2) the periodontal membrane has a uniform width around the root and the apex. Attainment: second molar after the pubertal growth spurt [28,57,112].
Only stages D to H are summarised due to their relevance with the circumpubertal growth phase. In molars, the distal root is considered in assessing the stage [40]. Only results from studies reporting diagnostic reliability analysis are shown regarding the moment of attainment of the different stages for mandibular teeth.
Figure 4: Diagram of the stages of the most common dental maturation method according to Demirjian et al. [40]. Only the stages D to H are represented due to their relevance with the circumpubertal growth phase. In molars, the distal root should be considered in assessing the G and H stages. Blue, prepubertal stages; grey, any stage; black, postpubertal stages. See Table 7 for details.

Current Controversies.
The apparent inconsistency among all the current investigations on dental and skeletal maturation (all cross-sectional) resides in the use of proper diagnostic reliability analysis. The present evidence on diagnostic reliability [104,105,110,112] has revealed that the conclusions reported in previous investigations based on correlational analyses [31, 60, 106-109, 111, 113-117] were not actually supported by the results obtained in those studies. The few exceptions seen for early dental developmental stages, which were reliable in the identification of the prepubertal growth phase [105,110,112,118], would have poor clinical meaning since early mixed and intermediate mixed dentition may be used instead for the same purpose [5,32,101]. Longitudinal studies on the diagnostic reliability of dental maturation, mainly as calcification stages, in the identification of the mandibular growth peak are still missing.

Clinical Implications.
Irrespective of the mandibular tooth considered, none of the dental maturation stages may be reliably used to identify the pubertal growth spurt in individual subjects (Table 8). Other indicators remain preferable for the determination of the growth phase in individual growing patients [118].

Standing Height.
Standing height has been used as an indicator of the pubertal growth spurt for several decades [3,5,119,120]. This procedure requires several measurements of standing height repeated at regular intervals to construct an individual curve of growth velocity and has the advantage of being noninvasive. The peak in standing height has been reported to precede [3,119] or to coincide with [120,121] the peaks in growth of the facial bones. Other evidence reported that standing height had little predictive value in determining the growth profile of any of the mandibular parameters except for Ar-Pog in females [63]. The mandibular growth peak has been seen to occur in concurrence with or slightly after the peak in standing height for males and females, respectively [35]. In a more recent investigation [53], the peak in stature had a shorter duration and tended to occur a few months before that of the face and mandible. Although all of these investigations [3,5,32,34,49,53,119-121] reported a satisfactory degree of correlation between standing height and mandibular growth, data on the diagnostic reliability of the standing height peak in the identification of the mandibular growth peak have been reported in only one study [21]. In particular, a variable diagnostic accuracy (between 0.61 and 0.95) was seen for the standing height peak in the identification of the mandibular growth peak (as greatest annual increments in Co-Gn or in the mean value between Co-Gn and Co-Go) [21]. From a clinical perspective, therefore, the recording of standing height may be useful, especially in conjunction with other radiographical indicators.

Chronological Age.
Previous studies [32,35,53,122,123] reported that the average ages at the onset and peak of pubertal growth in stature are about 12 and 14 years in boys and 10 and 12 years in girls. However, a noteworthy variability was also seen when the pubertal growth spurt was defined as standing height peak [21,35,54,63,65,76,84,92] or mandibular growth peak [37,49,52,64].
To date, only one cross-sectional study [124] has reported on the diagnostic performance of chronologic age in the identification of the pubertal growth phase (according to the CVM method [20]). In males, an age of up to 9 years can reliably identify a prepubertal stage of skeletal development, and in females an age of at least 14 years can reliably identify a postpubertal stage. In both males and females, chronologic age could not reliably identify the onset of the pubertal growth phase [20]. Therefore, in spite of the simplicity of the method, its clinical applicability as an indicator of the onset of the pubertal growth spurt in the individual patient is limited [20,21,32,37]. On the contrary, the study by Mellion et al. [53] reported that chronological age would have only a slightly greater error than the HWM according to Fishman [35] in the identification of the mandibular growth peak, and it was therefore recommended for treatment planning. However, this sole evidence [53], derived from an old sample (Tables 2 and 6), has to be confirmed by further investigation, especially considering that the onset of puberty can be influenced by several factors, including genetics, ethnicity, nutrition, and socioeconomic status [82], which are responsible for a secular trend [81].
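As an illustration of the standing height procedure described above, in which serial measurements are turned into an individual growth velocity curve, the following minimal sketch computes interval velocities and locates the interval with the greatest velocity. The measurements are hypothetical and do not come from any cited sample, and the function names `height_velocity` and `peak_velocity` are assumptions made for this example only.

```python
def height_velocity(ages_years, heights_cm):
    """Return (midpoint_age, velocity_cm_per_year) for each consecutive pair."""
    if len(ages_years) != len(heights_cm) or len(ages_years) < 2:
        raise ValueError("need at least two paired measurements")
    curve = []
    pairs = list(zip(ages_years, heights_cm))
    for (a0, h0), (a1, h1) in zip(pairs, pairs[1:]):
        if a1 <= a0:
            raise ValueError("ages must be strictly increasing")
        # Velocity is attributed to the midpoint of the recording interval.
        curve.append(((a0 + a1) / 2, (h1 - h0) / (a1 - a0)))
    return curve


def peak_velocity(curve):
    """Return the (midpoint_age, velocity) entry with the largest velocity."""
    return max(curve, key=lambda point: point[1])


# Hypothetical six-monthly records for one subject.
ages = [11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0]
heights = [142.0, 144.5, 147.5, 151.5, 156.5, 160.0, 162.5]
curve = height_velocity(ages, heights)
peak_age, peak_rate = peak_velocity(curve)
print(f"peak velocity {peak_rate:.1f} cm/year around age {peak_age:.2f}")
```

A sketch of this kind also makes one practical limitation visible: the peak interval can only be recognised once at least one later interval shows a lower velocity, so the identification is retrospective by construction, and the individual age at peak may differ from the average ages reported above.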

Menarche and Voice Change.
Menarche usually occurs immediately after [123,125] or 1 year after the pubertal growth spurt [5,126]. According to other evidence [65], menarche would occur after the mandibular growth peak in early- and average-maturing girls, while in late-maturing girls it may generally occur before the mandibular growth peak. However, late-maturing girls would represent a minority of the population, rendering this indicator useless [65]. Similarly, in boys, the voice change occurs during or after the pubertal growth spurt [54,125]. Therefore, these two indicators are not usable in planning treatment timing in orthodontics.

Biomarkers.
The use of biomarkers has been proposed very recently as a new aid in assessing individual skeletal maturity, with the advantage of being related to the physiology of the patient and of avoiding the use of radiation. The very scarce data reported to date include molecular constituents from the serum, such as insulin-like growth factor I (IGF-I) [42,43,127], or from the gingival crevicular fluid (GCF), such as alkaline phosphatase (ALP) [41,44] or total protein content [45]. These studies reported increased levels of the investigated biomarkers during the pubertal growth spurt as compared to the prepubertal and postpubertal growth phases [41-44,127], with the exception of the GCF total protein content [45]. However, these studies followed cross-sectional designs and used the CVM method to assess the pubertal growth phase [41-45], with one exception in which a sample of 25 subjects was followed longitudinally in their mandibular growth [127]. Of particular interest are the biomarkers from the GCF, since its sampling involves a very simple, rapid, and noninvasive procedure that can be performed in a clinical setting. However, even though dental permutation has been reported not to influence the GCF ALP activity significantly [128], variability among subjects and method errors [129] have to be taken into account. Moreover, optimal gingival conditions without plaque accumulation or clinically evident inflammation are necessary, as the GCF ALP activity reflects local tissue inflammation [130]. Future studies on the diagnostic reliability of these biomarkers in the identification of the pubertal growth spurt or mandibular growth peak are warranted.

Current Evidence.
Herein, to report and critically evaluate current evidence on functional treatment for skeletal Class II malocclusion, data from the most recent meta-analyses have been reviewed. Several meta-analyses on the efficiency of functional treatment for Class II malocclusion (skeletal or not) [11-19,131-133] have been published, reporting contrasting results. Some evidence has shown that functional treatment for skeletal Class II malocclusion may be effective in terms of mandibular elongation [17,18,132,133] or dentoalveolar compensation [15,16]. On the contrary, other evidence reported minimal effects for such treatment [11,13,131]. The reason for this apparent inconsistency might reside in the different interventions performed [19,134], in the large variation in individual responsiveness to functional treatment [17,18] in conjunction with the absence of an analysis of potential prognostic factors [135], in the type of appliance [14,17,18,131,132], and in the patient's compliance with removable appliances. The most recent meta-analyses [14-18, 131, 133] including untreated matched Class II control subjects, which reported contrasting outcomes, are summarized herein (Table 9). In particular, these meta-analyses have been analysed according to the main sources of controversies, such as design of clinical trials, definition of Class II malocclusion, and skeletal maturity.

[27,29] or mostly [136] prepubertal patients. For this reason, the consideration of controlled clinical trials (CCTs) with reasonable methodological quality has been advocated [137], especially considering that whenever RCTs are not available for meta-analysis, CCTs or observational studies may be used with essentially similar outcomes [138]. In spite of a previous meta-analysis including exclusively RCTs [13], the most recent ones summarized herein included both RCTs and CCTs, although in several cases an attempt was made to include prospective trials in preference to retrospective investigations (Table 9).

Definition of Class II Malocclusion.
A clear distinction should be made between skeletal and dentoalveolar Class II malocclusion. Interestingly, clinical trials [27,29] on the efficiency of functional treatment for Class II malocclusion used overjet (equal to or above 7 mm) as the only diagnostic criterion for Class II malocclusion. However, such an overjet as a sole diagnostic parameter has been shown to be not fully reliable in the identification of a skeletal Class II malocclusion [139]. On the contrary, other trials [140,141] used specific cephalometric parameters to ensure the inclusion of skeletal Class II patients. In the meta-analyses reported herein, trials were included according to dental parameters alone [15,16,131], to an ANB angle equal to or above 4° in combination with at least a half-cusp Class II molar relationship [17,18], or to nonspecified criteria [14,133]. Therefore, conclusions on the supplementary mandibular elongation consequent to functional treatment should be limited to those trials including true skeletal Class II patients with a retrognathic mandible [17,18].

Skeletal Maturity.
In spite of the previous evidence suggesting skeletal maturity as a potential prognostic factor in terms of skeletal effects produced by functional treatment in skeletal Class II patients [4,25,134,142], to date few clinical trials have focused on the timing of intervention. The assessment of skeletal maturity, with clear distinction among prepubertal, pubertal, and postpubertal groups, was an inclusion criterion only for 2 meta-analyses [17,18], while it was not considered for all the others [14-16, 131, 133]. However, information on skeletal maturity, when available, was extracted in most of the meta-analyses (Table 9). Subgroup analysis for the different growth phases (mainly prepubertal versus pubertal patients) was performed in 4 meta-analyses [15-18], even though it was inconclusive in 1 case [15] because of limited data available, while, in another case, prepubertal and pubertal patients were pooled [16]. Of note, meta-analyses in which skeletal maturation was not considered or not analyzable [14,15,131] reported minimal effects of dentoalveolar nature, while meta-analyses evaluating specifically [17,18] or mostly [133] pubertal patients reported clinically relevant effects in terms of mandibular elongation and reduction of the skeletal Class II malocclusion (Table 9).

Other Limitations of the Current Studies.
The current investigation on the effects of functional treatment of Class II malocclusion is inherently hampered by other factors [14-18, 131, 133]. For instance, in spite of the use of annualized changes, observational terms may include not only the effective functional treatment but also variable periods of retention or of further management of the dentition. Therefore, skeletal changes might not occur uniformly during the entire observational term, skewing the analysis of treatment outcomes [12]. It is hard to avoid heterogeneity among the selected studies because of small sample sizes, inclusion of retrospective trials with historical control groups, and similar skeletal outcomes defined by different cephalometric parameters. Finally, an analysis of the potential responsiveness to treatment according to specific prognostic factors is still not feasible, and current evidence is mostly limited to the short-term effects.
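The effect of the observational term on annualized changes can be shown with a short arithmetic sketch. The figures are hypothetical and chosen only for illustration, and the helper `annualized_change` is an assumed name, not a procedure taken from the cited meta-analyses. The example assumes that all of the supplementary elongation is gained during active treatment and none during retention.

```python
def annualized_change(total_change_mm: float, months: float) -> float:
    """Scale a cephalometric change to a 12-month rate."""
    if months <= 0:
        raise ValueError("observation period must be positive")
    return total_change_mm * 12.0 / months


# Hypothetical case: 3.0 mm of supplementary elongation gained during
# 12 months of active treatment, followed by 12 months of retention
# with no further gain.
active_only = annualized_change(3.0, 12)   # rate over active treatment alone
whole_term = annualized_change(3.0, 24)    # rate over the whole observational term
print(active_only, whole_term)
```

Under these assumptions, the same total change yields an annualized value that is halved when the retention period is counted in the observational term, which is one way in which pooled estimates from trials with different observational terms may differ.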

Concluding Remarks
Current evidence on both the reliability of growth indicators and efficiency of functional treatment for skeletal Class II malocclusion is still controversial and highly heterogeneous. Although no skeletal maturity indicator may be considered to have a full diagnostic reliability in the identification of the pubertal growth spurt or mandibular growth peak, treatment timing according to available indicators (mainly HWM and CVM methods) has yielded more favorable outcomes in terms of mandibular elongation and reduction of the Class II malocclusion. The use of the HWM or CVM methods (or others) may still be recommended for treatment planning, even though large individual responsiveness and dentoalveolar compensations have been reported even in pubertal patients. Future investigation will have to further elucidate the controversies reported herein and follow more robust designs.