Metric Identification of the Same People from Images: How Reliable Is It?

Ratios have been applied to humans to identify individuals from images. These attempts have proved unsuccessful, as camera angle, height, and distortions of the image affected the results. The anharmonic ratio is a ratio of ratios; it has proved successful in the identification of objects from images, as it is not affected by such distortions. In this study, the anharmonic ratio was applied to the human body and face to identify individuals from their images. Faces and bodies of twenty South Australian males aged 16–65 years were measured using standard anthropometric techniques. Participants were photographed in high quality images and recorded by a standard surveillance camera (low quality images). Ten ratios were calculated from manual measurements and from all images. A Euclidean distance showed that ratios incorrectly identified individuals 64.3% of the time between images of different quality. Variation of ratios between individuals is low, so that standard deviations of ratios are of a magnitude similar to the technical errors of measurement. Therefore participants cannot be isolated based on ratios. Ratios are an unreliable method for identification.


Introduction
Criminal activities are increasingly being recorded by closed circuit television (CCTV) systems. Scientists with expertise in biological anthropology or computer image analysis are often called upon to identify the individuals from the images. Image identification involves comparing anatomical features, as seen in the images, of the person who committed the crime (person of interest) with those of the person who is suspected of committing the crime (suspect). Often, the images that are available for the analysis are of different quality (i.e., high or low, particularly in relation to clarity), as they are commonly taken at different times and by different cameras. It is not uncommon to have both high and low quality images in one case.
In order to be accepted in a court of law, a testimony of identification should adhere to the Daubert criteria [1][2][3]. The Daubert criteria aim to make testimonies of expert witnesses as reliable as possible by ensuring that they are peer reviewed, repeatedly tested, have a known error rate, and are based on sufficient facts or data.
As a result of the Daubert criteria, the interval scale of measurement is considered to be superior to the categorical scales [4], which are currently used by expert biological anthropologists. Interval scales of measurement can be expressed in any units that are directly measured; for example, nose width is 68 mm if a millimetre is used as the interval. Categorical scales of measurement use adjectives to describe anatomical variations; for example, the nose is wide.
In the past, the evidence/opinion presented by an expert witness for the identification of an individual from images by using categorical classifications has been criticised as unreliable [4,5], and the assessment of images in categorical scales is open to interpretation [6].
Therefore, research has been focused on the application of interval scale measurements to images for the identification of individuals [7][8][9][10]. This approach has the advantages of reproducibility and the ability to calculate error rates [10]. Disadvantages of the methods that use interval scale measurements are that they require complex analytical formulae [7][8][9][10] and that the processes are tedious and thus time consuming. Taking reliable interval scale measurements from 2D images is difficult because of perspective (angular distortion) and possible rectilinear and curvilinear distortions that could result from the quality of the video equipment used [11]. Another source of error is the pixelation of digital images, which results particularly from the enlargement of small images (e.g., enlargement of the image of a face to obtain the details). The lowest error rates that are scientifically accepted with these methods lie at 5% [10]. However, 5% of stature is approximately 80 mm, which is unacceptable in a court of law as it broadens the suspect pool significantly. Therefore, an error rate of 5% does not allow accurate comparisons between images.
To avoid the complexities of correcting for angular and rectilinear distortions, various authors have used facial proportions measured on photographs taken in a standard position in an attempt to identify known individuals from images [12][13][14]. Kleinberg and Siebert [12] investigated the use of ratios in facial identifications and found that their techniques were not able to sufficiently isolate an individual from the sample. Kleinberg et al. [13] reported similar results, showing that individuals could only be correctly identified approximately 22%-25% of the time at best, even under optimal conditions. In each of these cases the images were of a high quality, which does not represent real life forensic conditions. Moreton and Morley [14] studied the effects of image quality as well as angle of the camera, distance, and lighting. They found that proportionality indices changed significantly with camera angle, distance, resolution, and lighting conditions. The conclusions of these studies were similar [12][13][14] and indicated that proportions of facial measurements are not adequate for the identification of an individual from a sample. However, the proportions used were single proportions, for example, taken as a percentage ratio of one trait to another; therefore these proportions are subject to effects caused by image conditions, as reported. The anharmonic ratio is an improvement on the traditional proportional measurement used in previous investigations [12][13][14]. The anharmonic ratio is a ratio of ratios; thus it is not affected by the effects that camera angle and some other photographic distortions may have on the identification process.
The current paper aims to isolate individuals using the anharmonic (cross) ratios of the body and the face from images of different quality. The body and the face are both used, as both are extremely variable and can be used for the identification of an individual [15]. For the anharmonic ratio to be successful as an identification technique, it must isolate an individual from the sample; that is, it should not produce any duplicates [15] based on the ratios. A secondary aim of the paper is to provide error rates for the measurements taken between anthropometric points placed on images of different quality.

Principles of the Method.
The anharmonic (cross) ratio is a ratio of ratios. It has been previously applied to photographs for identification purposes [16]. The ratio remains constant for a particular object under projection; that is, it remains the same irrespective of the angle and the height of the camera or the size of a photograph. Therefore, the anharmonic ratio is ideal for comparisons of proportions between images. Although it is an ideal method for the identification of objects from photographs and adheres well to the Daubert criteria, the anharmonic ratio has not been applied to the human body. The method relies on identifying four collinear points (A, B, C, and D), and the anharmonic ratio is calculated from the distances between those four points. An example is shown in Figure 1. Points of nasion, subnasale, stomion, and gnathion are collinear, as are vertex, nasion, subnasale, and stomion. The example illustrates how the same points can be used in different ratios as long as they remain collinear.
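As a minimal sketch of the invariance described above, the following Python snippet computes a cross ratio for four collinear points and checks that it is unchanged by a one-dimensional projective map, whereas a simple proportion of two distances is not. The convention (AC/BC)/(AD/BD) is one common form of the cross ratio and is assumed here; the function names, the landmark positions, and the map coefficients are hypothetical and are not taken from the study.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio of four collinear points given as positions along the line.

    Assumed convention: (AC / BC) / (AD / BD).
    """
    ac, bc = c - a, c - b
    ad, bd = d - a, d - b
    return (ac / bc) / (ad / bd)


def simple_ratio(a, b, c):
    """Single proportion AB / AC, as used in traditional proportional indices."""
    return (b - a) / (c - a)


def projective_map(x, p=2.0, q=1.0, r=0.05, s=1.0):
    """One-dimensional projective transformation x -> (p*x + q) / (r*x + s)."""
    return (p * x + q) / (r * x + s)


# Hypothetical positions (mm) of four landmarks along one line.
points = [0.0, 60.0, 85.0, 130.0]
projected_points = [projective_map(x) for x in points]

original = cross_ratio(*points)
projected = cross_ratio(*projected_points)
simple_original = simple_ratio(*points[:3])
simple_projected = simple_ratio(*projected_points[:3])

print("cross ratio:", round(original, 6), round(projected, 6))
print("simple ratio:", round(simple_original, 6), round(simple_projected, 6))
```

The cross ratio agrees before and after the map, while the single proportion changes, which is the property that motivates using a ratio of ratios.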
If points are not already collinear, they can be realigned to become collinear with other points. A point may be moved horizontally or vertically in the same plane to align with other points, as long as the distance between the two points does not change due to the realignment. Figure 2 shows the upper body of an individual with four points marked along the horizontal line marked (a); the two points marked (zy) can be realigned with the two points marked along the line (a) by moving them straight down.
When applying these principles to the human body, only points whose anatomical position is fixed can be used. Distances should not be measured between points whose anatomical relationship can be altered by altering the position of the body.
The number of ratios that can be applied to the human body can exceed the number of fixed anatomical points. The reason for this is that a particular point may be used in more than one ratio. In two different ratios, a fixed anthropometric point such as symphysion is used as either B or C (Table 1). If a point is labelled as A for one ratio, it does not necessarily have to be point A in another (e.g., nasion in ratios 2 and 3 (Table 1)).

Table 1 (continued)

Metopion-nasion length m-n
The distance between the metopion (m), the intersection between the median sagittal plane and the horizontal line between the left and right frontal eminences, and the nasion (n), the point where the midsagittal plane crosses the junction between the frontal and nasal bones, the deepest root of the nose

Metopion-subnasale length m-sn
The distance between the metopion (m), the intersection between the median sagittal plane and the horizontal line between the left and right frontal eminences, and the subnasale (sn), the point where the nasal septum meets the philtrum

Metopion-stomion length m-sto
The distance between the metopion (m), the intersection between the median sagittal plane and the horizontal line between the left and right frontal eminences, and the stomion (sto), the midpoint of the occlusal line between the lips

Metopion-gnathion length m-gn
The distance between the metopion (m), the intersection between the median sagittal plane and the horizontal line between the left and right frontal eminences, and the gnathion (gn), the most inferior point on the body of the mandible in the midsagittal plane

Application of the Method.
Twenty males between the ages of 16 and 65 years were recruited from South Australia. As in other studies [12][13][14], males were chosen as they are the sex most likely to commit crimes that are caught on video surveillance systems [17].
The three authors are all anatomists experienced in locating anthropometric points on human bodies. A set of standardised anthropometric measurements as defined by Martin and Saller [18] was manually taken from each of the males using a GPM anthropometer and sliding and spreading calipers (Table 1). Males were measured wearing shorts only. All measurements/landmarks were chosen as they are visible anteriorly (Figure 3).
Participants were then photographed using a Panasonic Lumix DMC camera in the anatomical position wearing shorts only. The distance between the camera and the participant was 6 m. The camera has a resolution of 12 megapixels with a 2x crop factor compared to a 35 mm camera. The lens was a 14-45 mm f/3.5 zoom lens, which is equivalent to a 28-90 mm lens on a standard 35 mm camera. All images were taken at 35 mm, equivalent to 70 mm on a standard 35 mm camera. These photographs of participants are considered to be of a high quality as all details of the face and body are easily visible, while distortions are minimised. Therefore all photographs of this quality are referred to as "high quality" in this paper.
Participants were then videoed using an Axis 216MFD (CCTV) camera in the foyer corridor of the University of Adelaide Medical School, which is 8 m in length, 2 m in width, and 3 m in height. The camera records video with a resolution of 1280 × 1024 pixels at a rate of 5 frames per second. The focal length is between 2.8 mm and 4.0 mm. The mounting height of the camera is between 2.7 m and 3.0 m off the ground. This is consistent with standard CCTV surveillance systems. Participants were pictured wearing a pair of trousers and a shirt. Participants wore everyday clothing to represent real life forensic scenarios. Clothing has been shown to have little effect on the assessment of body shape [19]. Participants were asked to walk the length of the corridor and pause on a marker line placed on the floor about 5 m away, facing towards the camera. This position is comparable with the position the participants were in for the high quality photographs. These images of participants are considered to be of a low quality as details of the face and body are not clearly visible. Therefore, all images of this quality are referred to as "low quality" in this paper. For a comparison of the high and low quality images used in the analysis, refer to Figure 4.
All previously defined landmarks were marked on both the high and low quality images of the participants. The images were printed on photographic paper and landmarks were marked manually together with distances between landmarks. A total of 10 anharmonic ratios were calculated for the body and face of each participant using the landmarks (Table 2).
In order to apply the same ratios used in photographic analyses to the manual measurements, the distances between landmarks were calculated by subtracting one measurement from another. For example, the distance AC for ratio 1 (Table 2) is the distance between acromiale and tibiale. This was not measured directly; instead, the distance from tibiale to base was subtracted from the distance from acromiale to base in order to obtain the distance from acromiale to tibiale. This allowed an accurate comparison to be made between each of the three settings: manual measurements and the measurements from the high and the low quality photographs. In all three situations, measurements were repeated separately by another anthropometrist. This allowed calculations of errors to be made. The technical error of measurement (TEM) was used, as it is the most commonly used measure of error in anthropometry [20]. The interobserver errors were calculated between the two measurers. Calculation of the interobserver errors automatically included the intraobserver errors, as the two people who took the measurements did not do so at the same time [21]. The formula for TEM is as follows:

$$\mathrm{TEM} = \sqrt{\frac{\sum D^{2}}{2N}},$$

where $D$ is the difference between two measurements of the same dimension of the same individual taken on two occasions and $N$ is the number of individuals so measured [22][23][24].
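The TEM calculation can be sketched in a few lines of Python, assuming the usual two-measurement form, TEM = sqrt(sum of squared differences / (2 × number of individuals)). The function name and the stature values below are hypothetical and serve only to illustrate the arithmetic; they are not data from this study.

```python
import math


def technical_error_of_measurement(first, second):
    """TEM = sqrt(sum(D^2) / (2N)) for paired repeat measurements.

    first, second: measurements of the same dimension on the same
    individuals, taken on two occasions (or by two observers).
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equally sized, non-empty series")
    squared_diffs = [(x - y) ** 2 for x, y in zip(first, second)]
    return math.sqrt(sum(squared_diffs) / (2 * len(first)))


# Hypothetical stature readings (mm) for four individuals by two observers.
observer_1 = [1752.0, 1801.0, 1689.0, 1775.0]
observer_2 = [1758.0, 1797.0, 1695.0, 1770.0]

tem = technical_error_of_measurement(observer_1, observer_2)
print(f"TEM = {tem:.2f} mm")
```

With these illustrative values the squared differences sum to 113, so the TEM is sqrt(113 / 8), a little under 4 mm.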
Comparisons of participants with themselves and others were done using a modified Euclidean distance. In all comparisons, differences between the two measurements of the same person, for all traits (either ratios or single dimensions), were divided by 2 × TEM. Only the integer parts of the results of these divisions were reported; thus any differences less than 2 × TEM became zeros. This allowed the reporting of differences exceeding the 95% confidence range of errors.
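The truncation step described above can be sketched as follows. How the truncated per-trait values are combined into a single distance is not spelled out in the text, so the root-sum-of-squares combination used here is an assumption, as are the function names and the example ratios and TEMs.

```python
import math


def tem_units(value_a, value_b, tem):
    """Difference between two values in whole multiples of 2 * TEM.

    Differences smaller than 2 * TEM truncate to zero.
    """
    return int(abs(value_a - value_b) / (2.0 * tem))


def modified_distance(traits_a, traits_b, tems):
    """Euclidean-style distance over the truncated values (assumed combination)."""
    units = [tem_units(a, b, t) for a, b, t in zip(traits_a, traits_b, tems)]
    return math.sqrt(sum(u * u for u in units))


def traits_exceeding(traits_a, traits_b, tems):
    """Number of traits whose difference truncates to a non-zero value."""
    return sum(1 for a, b, t in zip(traits_a, traits_b, tems) if tem_units(a, b, t) > 0)


# Hypothetical ratios for one person from two image types, with hypothetical TEMs.
ratios_high = [1.30, 1.72, 0.95]
ratios_low = [1.34, 1.20, 0.96]
ratio_tems = [0.05, 0.10, 0.02]

distance = modified_distance(ratios_high, ratios_low, ratio_tems)
count = traits_exceeding(ratios_high, ratios_low, ratio_tems)
print(distance, count)
```

In this example only the second ratio differs by more than 2 × TEM, so one trait is flagged and the other two contribute nothing to the distance.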
SPSS statistics and Microsoft Excel were used for all statistical analysis.

Results
Table 3 shows the means, standard deviations (SD), TEMs, coefficients of variation of the TEM (CV-TEM), and coefficients of variation (CV) for the individual dimensions, which were then used to calculate the ratios. There was a good range of variation within the sample, with the largest standard deviation being for stature, which was 62.48 mm. The smallest relative TEM was for stature; it was 0.33%, which was 5.85 mm. The greatest accuracy of measurements was therefore 99.67%. The largest TEM was for the distance between the trochanterion and tibiale landmarks; it was 48.48 mm or 11.08%. The largest CV-TEM was 12%, which was for the distance between the subnasale and stomion landmarks. Even with the greatest TEM, the accuracy rate was still >88% under ideal conditions. This is comparable with other anthropometric studies [21].
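The percentages quoted above follow from simple arithmetic if, as is common in anthropometry, the CV-TEM is taken to be the TEM expressed as a percentage of the mean of the dimension and the accuracy is taken as its complement. Both definitions are assumptions here, since they are not stated explicitly:

```latex
\begin{gather*}
\text{CV-TEM} = 100 \times \frac{\mathrm{TEM}}{\bar{x}}, \qquad
\text{accuracy} = 100\% - \text{CV-TEM} \\
\text{stature: } 100\% - 0.33\% = 99.67\% \\
\text{trochanterion to tibiale: } 100\% - 11.08\% = 88.92\% \\
\text{subnasale to stomion: } 100\% - 12\% = 88\%
\end{gather*}
```

Under the same assumed definition, a TEM of 5.85 mm at a CV-TEM of 0.33% would correspond to a mean stature of roughly 1.77 m.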
Table 4 shows descriptive statistics of all ten ratios for all participants (n = 20). The ratios have been calculated from the manual measurements, the high quality photos, and the low quality photos. The averages for each of the ratios were very similar between conditions; the largest difference was 0.53 for ratio 3, which was a difference of 30.7%, between the manual measurements and low quality photographs. Overall, the means did not differ significantly between ratios and conditions.
In the ideal condition, that is, ratios calculated from manual measurements, the TEMs were equal to or only slightly below the standard deviation (SD). The greatest difference between a TEM and a standard deviation for manual measurements was 0.03. Therefore, it can be seen that, even in the most ideal conditions, measurement errors were responsible for the majority of the variation of a ratio. In both high and low quality images, the TEM was equal to or larger than the standard deviation.
The CV-TEMs show the percentage of error for each ratio. There was a gradual increase in error across the three conditions; that is, manual measurements had the lowest error and low quality images had the highest error. This is to be expected. The largest consistent errors in all three conditions were seen in ratios 2 and 3 (Table 4).
Considering that the manual measurements have TEMs, SDs, means, and CVs comparable with other studies [21] (Table 3), manual measurements and their derived ratios will not be discussed further.
A Euclidean distance was used to establish whether an individual could be correctly identified from the ratios calculated using low quality and high quality images, when the errors were taken into account. In an ideal situation, only the image of a participant that is compared with another image of the same participant should have no difference, that is, 0, between measurements, after the TEMs were considered. Anything above a zero difference is concluded not to be a match for that person. Table 5 shows that individuals are being falsely identified as others; for example, participant 1 (P1) and participant 2 (P2) have a value of zero, which indicates that, when errors are considered, there is no difference between them. At the same time, comparisons of an individual with himself, that is, P1 and P1, are considered different, because there is a substantial difference between the measurements from high quality and low quality images, which exceeds measurement errors. Participants were incorrectly identified 64.25% of the time. Unlike a traditional Euclidean distance matrix, which is symmetrical, this one compares two different variables (i.e., high quality and low quality images); therefore a high quality image compared with a low quality image is above the diagonal, while a low quality image compared with a high quality image is below the diagonal. These variables were compared to establish whether the same person could be correctly identified from ratios, when errors are considered, in images of different quality, as seen in real life forensic cases. This example shows that ratios do not have enough discriminatory power to differentiate between the correct and incorrect matches. This is further illustrated by the results seen in Table 4, which shows little difference between the average ratios for all face and body measurements.
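A comparison matrix of this kind, and one possible way of scoring it, can be sketched as below. The scoring rule is an assumption: a cell is counted as wrong when two different people score zero (false match) or when the same person scores above zero (false non-match). This reading is at least arithmetically compatible with the reported figure, since 64.25% of the 20 × 20 = 400 comparisons is exactly 257 cells, but the exact rule behind that figure is not stated. All names and values in the snippet are hypothetical.

```python
def mismatch_score(ratios_a, ratios_b, tems):
    """Sum of per-ratio differences in whole multiples of 2 * TEM (0 = match)."""
    return sum(int(abs(a - b) / (2.0 * t)) for a, b, t in zip(ratios_a, ratios_b, tems))


def comparison_matrix(high, low, tems):
    """Rows: high quality images; columns: low quality images (not symmetric)."""
    return [[mismatch_score(h, l, tems) for l in low] for h in high]


def misidentification_rate(matrix):
    """Share of cells giving the wrong answer (assumed scoring rule)."""
    n = len(matrix)
    wrong = 0
    for i in range(n):
        for j in range(n):
            is_match = matrix[i][j] == 0
            if is_match != (i == j):
                wrong += 1
    return wrong / (n * n)


# Hypothetical data: three participants, two ratios each, with hypothetical TEMs.
high = [[1.30, 1.70], [1.32, 1.68], [1.50, 1.20]]
low = [[1.45, 1.71], [1.31, 1.69], [1.49, 1.65]]
tems = [0.05, 0.10]

matrix = comparison_matrix(high, low, tems)
rate = misidentification_rate(matrix)
print(matrix, round(rate, 3))
```

In this toy example, participant 1 is matched to participant 2 and fails to match himself, and participant 3 also fails to match himself, so three of the nine cells are wrong.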
Individual dimensions were analysed, as the ratios did not have the power to discriminate between individuals (Tables 4 and 5). The differences between measurements of the same individual taken by two people were analysed to examine whether they exceed the TEM. Also, the measurements of each individual were compared with those of other individuals, taking TEMs into account. High quality images were chosen for this exercise, as they represent the optimal conditions for taking measurements from images. Table 6 shows the number of traits (out of 40) that differed by more than two times the TEM for each participant compared with themselves and all others, when the measurements were made from high quality photographs by two researchers. In reality, any participant compared with themselves should not have differences, but the findings contradicted this (Table 6). The lowest difference between any participant and themselves was 2; that is, 2 out of 40 traits differed by more than 2 times the TEM for that particular trait. This is not enough to make an identification of an individual. In some cases, the differences between a participant and themselves exceeded the differences between that particular participant and someone else; for example, the comparison of P3 with P6 indicated that participant 3 and participant 6 are more likely to be identified as the same person than participant 3 is with himself. Results show that 34 out of the 40 traits did not exceed the acceptable error range. This was an accuracy of 85% once the errors were considered.
Even under optimal conditions, results show that measurements taken from images are not reliable; the question is, what traits can be reliably measured, if there are any?
Table 7 shows the number of times that a particular trait measured from high and low quality photographs differed by more than two times the TEM in the same individual.
[...] and 3). In both ratios, there was a difference of approximately 10% between manual measurements and low quality images. Details of the face are small and cannot be clearly seen in low quality images [11]. Thus, the errors are higher, as accurate placement of anthropometric points is difficult. On images, the head is one of the smallest parts of the body; therefore any inaccuracy in the placement of one point is reflected in the placement of all others, as the relative distance between facial points is small. The smaller the distance, the greater the chance of misplacing an anthropometric landmark. An example of this can be seen in Table 3, where the highest TEM is for the distance between the trochanterion and tibiale landmarks, which was 48.8 mm, but the largest percentage of error was found when measuring the distance between the subnasale and stomion, which was only 2.5 mm. Moreton and Morley [14] state that proportions should only be used "to test for elimination." The example presented in Table 5 compares individuals with each other and themselves, and the individuals could not be excluded based on ratios. This is an important finding, as it illustrates the ineffectiveness of the proportions for any step in the identification of an individual. This is consistent with the findings reported by Moreton and Morley [14]. This was also tested on single dimensions and the results were similar (Table 6).
Porter and Doran [25] have claimed that only horizontal proportions should be used, as the proportions in the vertical plane undergo image distortion. Although this was tested on facial proportions, the same should hold true for the body proportions. Moreton and Morley [14] showed that all proportions, vertical or horizontal, were affected by image distortions. However, the use of the anharmonic ratio in the current study should remove any effects of angular and rectilinear image distortions. Even though the images were free from distortions, there was an error that renders their use unreliable for forensic identification. This was the case regardless of the location of the ratios, that is, face, body, vertical, or horizontal.
The anharmonic ratio uses a combination of four measurements. Each of those measurements is taken with an error. Even in cases where the error is small, that is, within the accepted error range, as seen in the manual measurements (Table 3), the error of the ratio is approximately the same as the standard deviation. When measurements with an error are combined, their errors are also combined. Thus, any variation in the ratio is due to errors and not actual human biological variation. This could result in someone being identified as someone else (Table 5). In real forensic cases, this outcome would be disastrous. This makes ratios an unacceptable tool for the identification of individuals from images.
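The way errors combine in a ratio of four distances can be illustrated with a first-order propagation rule: for a product or quotient of measured quantities, the squared relative errors add. The sketch below assumes that the errors of the four distances are independent, which is a simplification because distances in a cross ratio share landmarks. The function name, the distances, and the TEMs are all hypothetical.

```python
import math


def ratio_relative_error(lengths, tems):
    """First-order relative error of a product/quotient of measured lengths.

    Assumes the errors of the individual lengths are independent.
    """
    return math.sqrt(sum((tem / length) ** 2 for length, tem in zip(lengths, tems)))


# Hypothetical distances AC, BC, AD, BD (mm) and a hypothetical TEM of 10 mm each.
lengths = [850.0, 250.0, 1300.0, 700.0]
tems = [10.0, 10.0, 10.0, 10.0]

relative_error = ratio_relative_error(lengths, tems)
# Fraction of the combined error variance contributed by the shortest distance (BC).
shortest_share = (tems[1] / lengths[1]) ** 2 / relative_error ** 2

print(f"relative error of the ratio: {relative_error:.3f}")
print(f"share of error variance from the shortest distance: {shortest_share:.2f}")
```

With these illustrative numbers the ratio carries an error of about 4.5%, most of it from the shortest distance, which is in line with the observation that small distances such as those on the face contribute the largest percentage errors.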

Conclusion
The error rates of dimensions measured from images increase as the quality of the images decreases. Due to the high error rates, ratios are not considered a reliable method for the identification of individuals. This paper shows the error rates for ratios and single dimension measurements taken from high and low quality images; in both cases the errors exceed those of manual measurements. Therefore, taking measurements from images generates high error rates, which makes such measurements unreliable for the identification of an individual, particularly in a court of law. Furthermore, they do not meet the Daubert criteria.

Figure 1: Example of four collinear points of the face and head. The example illustrates how the same points can be used for different ratios. The collinear points of the face and head are vertex (v), nasion (n), subnasale (sn), stomion (sto), and gnathion (gn).

Figure 2: Realignment of preexisting points along a vertical axis to become collinear with points along a horizontal line.

Table 1: Anthropometric measurements/landmarks and their associated definitions, taken manually from each of the male participants.

The distance between the floor (b) and the highest point on the top of the skull when the head is held in Frankfurt horizontal (v)

Tragion height b-t
The distance between the floor (b) and the superior point on the tragus of the ear (t)

Acromiale height b-a
The distance between the floor (b) and the acromiale (a), the point at the superior and lateral borders of the acromion process

Suprasternale height b-sst
The distance between the floor (b) and the suprasternale (sst), the lowest point in the suprasternal notch in the midsagittal plane

Iliocristale height b-ic
The distance between the floor (b) and the iliocristale (ic), the highest and the most lateral palpable point of the iliac crest of the pelvis

Trochanterion height b-tro
The distance between the floor (b) and the trochanterion (tro), the superior point of the greater trochanter of the femur

Symphyseal height b-sy
The distance between the floor (b) and the symphysion (sy), the point on the superior margin of the pubic symphysis in the midsagittal plane

Tibiale height b-ti
The distance between the floor (b) and the tibiale laterale (ti), the superior point on the lateral condyle of the tibia

Biacromial width a-a
The distance between the left and right acromiale landmarks

Bi-iliocristal width ic-ic
The distance between the right and left iliocristale landmarks

Bizygomatic width zy-zy
The maximum horizontal breadth between the zygomatic arches

Metopion-nasion length m-n

Figure 4: A comparison of high and low quality images. Note: the participant's face has been blocked out for confidentiality reasons.

Table 2: The ratios and corresponding landmarks applied to each of the participants (n = 20).

Table 3: Means, SD, TEM, CV-TEM, and CV for individual dimensions for manual measurements.

Table 4: Means, SD, CV, R values, TEM, and CV-TEM for all ratios under all three conditions.

Table 5: Euclidean distances between all males (n = 20) for all ten ratios taken from high and low quality images by the same person. The measure is whether the ratios of each individual differ by more than two times the TEM for that particular ratio.

Table 6: The number of traits (out of 40) that differ by more than 2 times the TEM for each participant compared with themselves and all others, as measured from high quality photographs. This shows a comparison between participants as measured by TL (rows) and JK (columns).

Table 7: The number of times a dimension differed by more than two times the TEM in an image of the same individual measured twice on high and low quality images.