The Difficulty of Sexing Skeletons from Unknown Populations

Determination of sex from skeletal remains is performed using a number of methods developed by biological anthropology. They must be evaluated for consistency and for their performance in a forensic setting. Twenty skeletons of varied provenance had their sex determined by 15 existing methods of forensic anthropology (7 metric and 8 morphological). The methods were evaluated for their consistency in determination of sex. No single individual was identified as belonging to one sex exclusively. Ambiguous results were obtained by metric methods for fourteen individuals (70%) and by morphological methods for only five individuals (25%) (Chi-squared = 4.3, df = 1, P < 0.05). Methods which use the size of bones as an indicator of sex perform poorly on skeletal remains of individuals of unknown provenance. Methods which combine morphologic and metric techniques, that is, geometric morphometric analysis, may result in greater levels of consistency.


Introduction
Determination of skeletal sex can be achieved using morphological (descriptive) or metric (quantitative) methods of forensic anthropology. Morphological methods rely on features which arise from an interaction between genetically controlled sex-linked patterns of growth and development and environmental influences that may differ according to gender [1]. However, these features can only be recorded on categorical scales, which rely on the judgments of observers [2-4]. Metric methods assess the same results of growth and development of males and females as morphological methods but remove the potential for observer subjectivity. However, they are dependent on size and shape differences between members of various populations. This makes discriminant functions for sex assignment, based on metric measurements, population specific, thus biasing their application to individuals from different populations [5, 6]. Skeletal size differs considerably across populations, and the use of inappropriate discriminant functions can result in misinterpretation of sex [1, 7]. The consistency with which existing methods are able to correctly identify skeletal sex needs to be investigated. It would be desirable to develop a method that combines the good aspects of both descriptive and metric methods and is not population specific. The first step towards this goal is an evaluation of how consistently various methods perform in sex estimates of individuals of unknown population of origin.
Numerous methods exist for determining skeletal sex; however, for the purposes of this study seven metric and eight morphological methods were chosen. The methods were included because they are widely used or cited, such as Acsadi and Nemeskeri [7], Krogman and İşcan [8], Phenice [9], and Steyn and İşcan [10], or because they address a novel or revised technique for sex estimation [1, 11-19]. The full list can be seen in Table 1. The material was chosen to imitate cases where sex and other individualising characteristics are unknown. The material comes from a variety of populations. The purpose of this study was not to evaluate the competence of specific methods against skeletons of known sex, but to assess the consistency with which morphological and metric methods can determine skeletal sex.

Material and Methods
This study used 20 skeletons held by the Ray Last Laboratory at the University of Adelaide. There was no selection; all available skeletons in the laboratory were used. The skeletons were of unknown sex, with one exception. This skeleton was recently acquired by dissection of a male cadaver. Although the sex of this skeleton was known, it was treated, in all analyses and interpretations, in the same manner as all other skeletons in this sample. For this study, knowledge of actual sex was not essential, as the study aims to evaluate the consistency of results of various methods rather than the competence of each single method. These skeletons are derived from a variety of populations. They come from two sources: (1) donated skeletons of Australians of European descent and (2) teaching skeletons bought by the university from India early in the 20th century. Some Australians of European descent may have a small admixture from Aboriginal Australians; this, however, is unlikely to influence their morphology. Since particular methods of sex estimation were based on skeletons from specific populations, it is possible that they will not perform well on skeletons of different origins. The methods chosen for this study are those commonly used in forensic anthropology. These methods are based on a variety of United States, European, South African, and Egyptian skeletal series. To the authors' knowledge, there are not many methods based on skeletal series from Asia. Therefore, application of commonly used methods to Indian skeletons imitates a possible forensic situation where the population of origin of a skeleton is unknown and difficult to establish. The population of India is not easily defined as belonging to classic divisions of biological affinity, such as European/African/East Asian. Fifteen methods were applied to the skeletal sample (Table 1). Methods 1-7 employ metric techniques, while methods 8-15 use morphological techniques.
All assessments and measurements were conducted by a single observer (IS) to minimise interobserver errors. The first author is an osteologist, trained at a Master's level. She is familiar with Martin's technique of anthropometric measurements, which was designed to minimise measurement errors. Since we are testing actual forensic applications of methods of sex estimation, we did not include error estimates.
The ability of the methods to consistently determine the sex of an individual was evaluated in four different ways: (1) counting in how many cases the majority of methods gave the same result, that is, at least eight of the 15 methods identified the skeleton as belonging to the same sex; (2) counting in how many cases methods placed the sex of the same skeleton into all three of the possible categories (male, female, or ambiguous); (3) comparing the consistency between metric and morphological methods (this was achieved by counting how many times an individual was categorised as belonging to the same single sex by all seven metric methods or all eight morphological methods); (4) counting in how many cases a single method resulted in ambiguity of sex estimation. This last count was achieved by evaluating the inconsistencies in estimations within an individual method, namely, within those methods which present multiple opportunities for sex estimation.
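Counting rules (1) and (2) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, each method is assumed to contribute exactly one result per skeleton, and the example skeleton is invented for demonstration rather than taken from Table 2.

```python
from collections import Counter

MAJORITY_THRESHOLD = 8  # at least eight of the 15 methods, as defined in the text


def majority_sex(results):
    """Return 'M' or 'F' if at least eight methods agree on that sex, else None.

    `results` is a list of 15 codes: 'M' (male), 'F' (female), 'A' (ambiguous).
    """
    counts = Counter(results)
    for sex in ("M", "F"):
        if counts[sex] >= MAJORITY_THRESHOLD:
            return sex
    return None


def in_all_three_categories(results):
    """True if the skeleton was called male, female, and ambiguous at least once each."""
    return {"M", "F", "A"} <= set(results)


# Hypothetical skeleton (not a row of Table 2): 9 male, 4 female, 2 ambiguous calls.
example = ["M"] * 9 + ["F"] * 4 + ["A"] * 2
print(majority_sex(example))             # M
print(in_all_three_categories(example))  # True
```

A skeleton can satisfy both rules at once, as the example does: a majority sex does not exclude dissenting or ambiguous estimates from individual methods.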
The methods chosen utilise different skeletal elements to determine sex using different approaches to describing individual variation: categorical classification of morphological features or metric dimensions of skeletal elements. There is some overlap between specific methods in either the skeletal elements used or the methods of measurement. Thus the methods are only partly independent. We used a nonparametric approach to measure interdependencies between specific methods: contingency table analyses producing Chi-squared values that allow the evaluation of statistical significance of differences and, when converted to Cramer's V values, describe the degree of correlation. Since we are interested in similarities, rather than differences between methods, we interpreted Chi-squared values exceeding the level of significance (P < 0.05) as showing no similarities between methods. In this case, Cramer's V values subtracted from unity indicate the level of similarity between results of two methods. Contingency tables were 2 × 3 with columns specifying methods, and rows specifying sexing results of each method as male, female, or ambiguous.
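The similarity measure described above can be sketched as follows. This is an illustration only, assuming the standard definition of Cramer's V, that is, the square root of Chi-squared divided by n times (the smaller table dimension minus one); the function names and the counts are hypothetical and are not data from this study. The orientation of the table (methods as rows or as columns) does not change the Chi-squared value.

```python
import math


def chi_squared(table):
    """Pearson Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells belonging to an empty row or column
                chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(table):
    """Cramer's V: sqrt(chi2 / (n * (min(rows, columns) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_squared(table) / (n * k))


# Hypothetical counts of male / female / ambiguous results for two methods
# applied to the same 20 skeletons (invented numbers, not from Table 2).
method_a = [12, 6, 2]
method_b = [5, 12, 3]
similarity = 1.0 - cramers_v([method_a, method_b])
print(f"similarity = {similarity:.2f}")
```

Two methods with identical distributions of results give V = 0 and therefore a similarity of 1, while methods whose results never fall in the same category give V = 1 and a similarity of 0.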

Results
Tables 2(a) and 2(b) show that no one individual was identified as being consistently of one sex by all 15 methods. Table 3 shows, by using Cramer's V values subtracted from unity, that some methods produced similar results, but many did not, to the extent that their results were significantly different (P < 0.05). Two methods based on descriptive evaluation of pelvic morphology (9, 11) produced sex estimates in perfect agreement. Method 12 (descriptive, mandible) shows the highest level of similarity (0.97) with methods 2 (quantitative, other bones), 8 (descriptive, skull), and 14 (descriptive, skull, pelvis, and long bones). Methods 1 (quantitative, long bones) and 3 (quantitative, long bones) show similarities only to each other and not to any other method. All other methods show a significant level of similarity with at least two different methods. Even those methods that showed formally significant similarity often had low degrees of agreement, as indicated by Cramer's V values below 0.70. We chose the value of 0.70 by analogy with correlation coefficient values indicating that at least 50% of variance in one method is explained by the variance of results in the other method compared. These values are shown in boldface in Table 3. It can be stated that, in general terms, the methods are to a large extent independent in the results they produced.
In evaluation approach (2), 14 individuals (specimen numbers 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, and 18), shown in Tables 2(a) and 2(b), were identified as belonging to all three of the possible categories (male, female, or ambiguous) by at least one method in each category. Thus, of 17 individuals with a majority of methods indicating one particular sex, 13 were also classified as another sex by at least one method.
In evaluation approach (3), Figure 1(a) shows that only three individuals (specimen numbers 1, 2, and 3) were consistently identified as belonging to only one sex by all seven of the metric methods. In contrast, Figure 1(b) shows that nine individuals (specimen numbers 4, 5, 11, 13, 14, 15, 16, 17, and 18) were consistently identified as belonging to only one sex by all eight of the morphological methods.

Discussion
The combination of the 15 methods used in this study did not produce a single, fully consistent identification of sex out of 20 skeletal cases. A limitation of this study is that sex was actually known in only one of the 20 cases; however, this study's aims were not to evaluate the ability of the methods individually; rather, they were to assess the consistency with which morphological and metric methods can determine skeletal sex. Therefore, knowledge of actual sex of any given skeleton was not essential.

Figure 1: (a) Consistency of metric methods in determination of sex in 20 skeletal cases. (b) Consistency of morphological methods in determination of sex in 20 skeletal cases. "A" is ambiguous, "F" is female, and "M" is male.
The methods themselves are not completely independent of each other because some use the same body parts and the same or similar traits. Apart from methods 9 (descriptive, pelvis) and 11 (descriptive, pelvis), similarities between results of methods are independent of the body part assessed and of the approach (quantitative or descriptive). For example, results of descriptive assessment of mandibular morphology are most similar to results of methods based on measurements of the skull, long bones, and other bones (Table 3). The realisation that sex is known, even if the investigator was blinded to that information or was trying to act as objectively as possible, may have some influence on subjective categorisation of morphological characteristics. This did not, however, happen in this study, since even the only skeleton of known sex (specimen 2, a male) was not consistently estimated as male.
Table 4 shows the number of sex estimates, sorted into skeletal area/bone, which are in disagreement with the majority of estimates. Out of the total number of individual assessments (300), metric methods produced a somewhat higher percentage of estimates in disagreement with the "majority" sex; however, it was not significantly different from that of morphological methods. Therefore, both types of methods have the same propensity for erroneous estimations of sex. It is interesting to note that the bones which were responsible for the largest level of disagreement in sex estimates were from the upper limb. Familiarity with the distribution of sexual dimorphism in a particular single population may influence the judgements of an investigator. Investigators usually gain experience in assessing the range of sexual dimorphism in the particular populations they frequently work with. When faced with a need for a sex estimate from a skeleton of unknown population of origin, even with the best intentions, their experience may result in a bias. In the case of this study, this bias was avoided because IS, although fully trained in methods of sex and age estimation, had no experience working for a substantial time with any single, specific skeletal sample from a particular population.
A larger sample size may have enabled a consistent determination of sex to occur, although the fact that this was not seen in 20 skeletal cases is cause for concern. In forensic proceedings the standard of evidence required is "beyond reasonable doubt," and in this study significant doubt is evident. As mentioned earlier, this study assesses the consistency of 15 out of many available methods. It may be argued that this subset cannot provide an accurate representation of all available methods; however, since none of the 20 skeletons (0%) received a fully consistent result from the methods used, one can assume that if more methods were used, lack of consistency would have persisted. This follows the assumption that the ancestry of the skeleton or skeletons under investigation is unknown and, therefore, the methods chosen may be based on little more than an educated guess. Anthropologists with experience of working in a particular geographic area may develop a good understanding of local patterns of variation that will allow them to choose the most appropriate sexing methods to suit their conditions. This, however, may not always be the case, especially in highly mobile societies or when an anthropologist is studying a series of skeletal remains from an unfamiliar area. Increasingly, courts of law require that experts provide objective proof of their statements rather than simply expressing an opinion based on their experience. A method of sex estimation in a sample from a particular population that is independent of the population-specific biases inherent in existing methods was proposed as early as the 1950s [20-22]. In essence, it asks for seriation of skeletal traits, observed in that sample, from the most female-like to the most male-like one. The range is then divided in the middle to provide female/male estimates based on the appearance of each trait within the sample. This method is based on an assumption that the sample contains both female and male skeletons, even if in unknown proportions. Although this method is ideal to remove any population biases of other sex estimation methods, it cannot be applied to individuals of unknown population of origin. Therefore, as yet, no method exists to provide an unbiased sex estimation of a single skeleton of unknown provenance.
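The seriation approach described above can be illustrated with a small sketch. It reflects one possible reading of "divided in the middle" (the midpoint between the lowest and highest score in the sample) and is not the procedure of the cited authors; the function name, the numeric scoring of a trait (higher meaning more male-like), and the scores themselves are all hypothetical.

```python
def seriate_and_split(scores):
    """Order specimens by one trait and classify them around the sample midpoint.

    `scores` maps specimen label -> score for a single trait, where a higher
    score is assumed to mean a more male-like expression of that trait.
    """
    low, high = min(scores.values()), max(scores.values())
    midpoint = (low + high) / 2  # middle of the range observed in this sample
    ordered = sorted(scores, key=scores.get)  # most female-like first
    estimates = {}
    for label in ordered:
        if scores[label] < midpoint:
            estimates[label] = "F"
        elif scores[label] > midpoint:
            estimates[label] = "M"
        else:
            estimates[label] = "A"  # exactly at the midpoint: ambiguous
    return ordered, estimates


# Hypothetical trait scores for five specimens from one sample.
sample = {"s1": 1.0, "s2": 4.5, "s3": 2.0, "s4": 5.0, "s5": 3.0}
order, sexes = seriate_and_split(sample)
print(order)  # ['s1', 's3', 's5', 's2', 's4']
print(sexes)  # {'s1': 'F', 's3': 'F', 's5': 'A', 's2': 'M', 's4': 'M'}
```

The sketch also shows why the approach needs a sample: for a single skeleton the range collapses to one point, so the specimen always falls on the midpoint and no estimate can be made.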
Currently, it is thought that the morphological indicators provide the greatest discrimination between the sexes, that is, the pelvis and the skull, with 95% and 90% accuracy, respectively [1, 7, 9, 23]. This, however, is not completely free from bias because many morphological criteria for sexing skeletons were established on samples from specific populations, and therefore they may not have captured the whole range of male/female differences that may occur within some other population. Metric discriminant function analyses are used to determine skeletal sex from a number of sites [10, 15, 24-26]; however, the appropriateness of the particular method chosen must be assessed, as discriminant functions are created using a skeletal sample of a specific population. An individual skeleton may not conform to the characteristics of the population used in the study, and as a result misidentification of sex may occur [5]. Since both groups of methods, morphologic and metric, have strengths and weaknesses, their combination, by quantifying shape, may provide a solution. Methods which use geometric morphometric analyses, such as those by Steyn et al. [27] and Urbanová et al. [25], are a step in the right direction to meet these requirements.
This study employed methods that are currently being used in forensic investigations, and they failed to provide consistent identification of sex in all 20 cases, including the one skeleton of known sex. Seventeen individuals (85%) were found to be by "majority" one sex. This study considers it a "majority" when at least eight of the 15 total methods produce the same result. Of the 17 skeletal cases which were found to be by majority one sex, four cases had a majority of 14, one had a majority of 13, three had a majority of 12, three had a majority of 11, two had a majority of 10, three had a majority of nine, and one had a majority of eight. The determination of skeletal sex, in a particular case, may depend on the method used by forensic investigators. This is especially highlighted by three cases (specimens 8, 12, and 20), where individuals did not have a majority of methods indicating clearly one sex. Forensic investigators are guided in the choice of the method by a possible population to which the investigated individual may belong. This increases reliability of a sex estimate but does not always guarantee it.
When comparing Figures 1(a) and 1(b), it can be seen that morphological methods provide greater consistency in determining skeletal sex than metric methods. Only 15% of cases were consistently identified as belonging to only one sex by all seven of the metric methods, compared to 45% of cases identified as belonging to only one sex by all eight of the morphological methods. These results suggest that the sole use of metric methods in a forensic investigation increases the chance of an inconsistent sex estimation by 30 percentage points in comparison to morphological methods.
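The difference between the two groups of methods can be checked with a simple Chi-squared calculation on the counts given in the text (3 of 20 skeletons consistent under metric methods, 9 of 20 under morphological methods). The sketch below assumes an uncorrected Pearson Chi-squared test on a 2 × 2 table; whether the authors computed their statistic this way is an assumption, and the function name is hypothetical.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: metric, morphological. Columns: consistent, not consistent (out of 20 each).
chi2 = chi_squared_2x2(3, 17, 9, 11)
print(round(chi2, 2))  # 4.29
print(chi2 > 3.841)    # True: exceeds the critical value for df = 1 at P = 0.05
```

Under these assumptions the statistic is close to the Chi-squared value of 4.3 (df = 1) quoted in the abstract.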
Figure 2 shows that metric methods produce many more "female" identifications than "male." This is possibly a result of importing some skeletons from India many years ago, when the average size of males in the local population was small. Results such as these are disturbing, especially when time constraints placed upon forensic investigations may result in a minimal number of methods being used. If only metric methods are employed, great variation will be seen in sex estimation, as shown in Figure 1(a), especially when the population of origin of the skeleton in question is not known, as may happen in countries with high numbers of immigrants.

Conclusion
Skeletal sex is often the first biological characteristic sought by forensic investigators when skeletal remains are discovered. Among the characteristics used in methods of sex estimation, those that may result from specific gender roles in particular populations, as well as those related to the overall size of individuals, seem to be least useful when the population of origin is not known. The search for characteristics reflecting, as directly as possible, the influence of sex hormones on development of skeletal morphology, irrespective of an individual's size, may produce the most generally applicable methods. Also, technologies enabling better observation of morphological characteristics, for example, geometric morphometrics, may be used to quantify the shape of rigid structures that are often curved and are not easily measured using standard metric methods [27]. As is the nature of scientific enquiry, the practice of skeletal sex estimation requires further research, perhaps by expanding the range of reference populations to reflect the ever-increasing trend of mixed populations.
A: A skeleton of an adult from the Abbie Museum of Anatomy, University of Adelaide.
B: A skeleton of known sex, age, and race from a donated cadaver: White Australian male who died at age XX years.
1-5: Skeletons of individuals of unknown sex, age, and race used as teaching aids in Ray Last Dissection Laboratory.
AM#: Skeletons of individuals of unknown sex, age, and race from the Abbie Museum of Anatomy, University of Adelaide.
HS-0##/SC-0##: Boxed half-skeletons of unknown sex, age, and race from University of Adelaide, teaching collection.
*: Some traits were excluded due to their absence on the skeleton.

Figure 2: Comparison of the consistency of all metric and all morphological methods. "M" is male, "F" is female, and "A" is ambiguous.

Table 2: (a) Results of sex estimation of a series of skeletons by a number of metric methods recommended by forensic anthropology texts. Numbers (1), (2), and so forth in columns indicate results of different discriminant function equations by the same authors. (b) Results of sex estimation of a series of skeletons by a number of descriptive methods recommended by forensic anthropology texts. Numbers (1) and (2) in column 14 indicate results of two different discriminant function equations by the same author.

Table 3: Similarity between results of various methods as measured by Cramer's V values subtracted from unity. Only data for significant similarities between methods are shown. Methods are clustered by greatest similarity.

Table 4: A summary of methods in disagreement with a majority of sex estimates found in this study.

Table 5: Comparison of the consistency between morphological (descriptive) and metric methods, where M = male; F = female; A = ambiguous.