Methods of Developing User-Friendly Keys to Identify Green Sea Turtles (Chelonia mydas L.) from Photographs

Identifying individual animals is important in understanding their ecology and behaviour, as well as providing estimates of population sizes for conservation efforts. We produce identification keys from photographs of green sea turtles to identify them while foraging in Akumal Bay, Mexico. We create three keys, which (a) minimise the length of the key, (b) present the most obvious differential characteristics first, and (c) remove the strict dichotomy from key b. Keys were capable of identifying >99% of turtles in >2500 photographs during the six-month study period. The keys differed significantly in success rate for students to identify individual turtles, with key (c) being the best with >70% success and correctly being followed further than other keys before making a mistake. User-friendly keys are, therefore, a suitable method for the photographic identification of turtles and could be used for other large marine vertebrates in conservation or behavioural studies.


Introduction
Photographic identification of individual organisms allows many aspects of their biology to be studied. For example, population sizes can be estimated using capture-mark-recapture techniques [1,2], distributions and foraging ranges can be calculated, and even the social structure of group-living individuals can be elucidated [3].
Photographic techniques can be beneficial for many reasons. They can reduce the stress and handling involved in capturing an individual (e.g., [4], based on behaviours of turtle hatchlings) and eliminate problems such as tag loss [5,6]. Furthermore, given the rise in popularity of digital cameras (including those on smartphones), along with the use of Web 2.0 technologies (such as social networks), it is possible to obtain large quantities of useful scientific information from photographs taken by members of the public [7-9].
The main problem with photographic identification is the amount of time taken by highly trained personnel to correctly identify individuals [10]. Much research has shown that the use of poorly trained personnel results in misidentification of individuals [1] and, for taxonomic surveys, misidentification of species [11,12]. Naturally, an increase in the number of photographs increases the total time needed to process images. The number of images in need of identification to either species or individual level is rapidly increasing, driven by the growth in camera trap studies worldwide (e.g., a SCOPUS search for scientific papers containing the phrase "camera trap" in the title, key words, or abstract produces 12 articles in 2002, steadily increasing to 91 in 2010) and by the large number of images being uploaded to research pages of social networks, such as projects on Flickr and the iSpot identification website (reviewed in [9]).
While patterns of markings such as spots or stripes have been shown to be unique for individuals of many species (e.g., cheetahs, [13]; zebras, [14]; whale sharks, [15]; manta rays, [16]), automatic image recognition systems are still in their infancy. Computer-aided recognition systems have been developed (e.g., numerous cetacean species, [17]; whale sharks, [15]; and even leatherback turtles, [18]), but these can still involve significant amounts of time to process an image, as well as significant time, skills, and expertise to use, modify, or develop the programs. For example, a trained user would take around 10 minutes to identify an individual whale shark using the computer-aided technique developed by Arzoumanian et al. [15]. While more rapid techniques exist to identify smaller animals, such as amphibians, these techniques tend to require careful photography of the animal in artificial conditions, such as a lightbox, or while being restrained by an assistant, after the animal has been dried of excess water [19]. This problem of recognition in natural environments exists not just for wildlife, but also for human recognition systems, which have far greater monetary resources at their disposal for development [20]. Where facial or iris recognition systems have been developed successfully, they involve an individual staying still and/or facing the camera at an angle of no more than 20° [20]. Although many facial recognition systems have been deployed (for example, in airports, to detect possible terrorists), the effectiveness of these techniques has been questioned in the literature, mainly due to concerns over poor lighting, the angle at which the photograph is taken, and the resolution of the face in the image [21]. All of these issues also relate to the capture of photographs of wildlife, especially from automated traps or in difficult conditions such as the photographer being in a wildlife hide or underwater. Because of this, the automatic identification of any individual (human or animal), and automatic identification of many species, from photographic records is currently best described as an area of ongoing research [19,20].
An alternative approach for rapidly identifying individuals is the development of simple keys, akin to taxonomic keys used to identify species. For "charismatic" marine organisms, it is possible to cheaply and effectively utilise volunteers, and if keys were user-friendly and involved only limited training, then identification could be conducted from photographs accessed remotely over the internet (e.g., as done for species identification of bees, [9]).
In this paper, we examine effective mechanisms to develop simple keys to identify individual green sea turtles (Chelonia mydas L.) from a population in Akumal Bay, Mexico. Photographs were taken of foraging turtles, rather than of nesting turtles as in most studies (but see [22-24] for examples of photo identification whilst swimming or foraging). We demonstrate that user accuracy increases when obvious discriminatory features are included early in the key. Furthermore, incorporating statistical techniques and removing the strict dichotomy of the keys can help minimise the number of steps needed to make an identification. Rapidly trained volunteers identify photographs accurately when both of these processes are combined.

Methods
Daily snorkelling surveys between February 2nd and July 2nd, 2009 were used to capture >2500 photographs of 54 turtles in Akumal Bay, Mexico (20° 394.896N, 87° 313.542W). From these photographs, key characteristics of the head and neck, which differed between individuals, were determined (Table 1; Figure 1) and were used as the basis for developing keys.

Minimising the Number of Steps in a Key.
To develop a key with the minimum number of dichotomous steps, a matrix was developed where each turtle was listed against the classification criteria in Table 2. For each criterion either a 1 or 0 was placed for each turtle. To develop the key, the variance was calculated for each criterion (over all turtles). The highest variance corresponded to the criterion that would separate the turtles into the most equal-sized groups (i.e., had the most equal numbers of turtles with or without the characteristic). This criterion's presence or absence then became the first step in the key. By repeating the process independently for the two new subgroups of turtles divided by the first step, the next steps in the key were generated, and this process was continued until all turtles separable by the list of criteria were accounted for. Since each step splits the remaining turtles into the two most equal-sized groups, this process resulted in a minimal number of steps.
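The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the software used in the study: the turtle labels (`T1` to `T6`), the criterion names (`crit_a`, `crit_b`, `crit_c`), and the function `build_key` are all hypothetical, and ties between equally good criteria are broken simply by list order.

```python
from statistics import pvariance

def build_key(matrix, criteria):
    """Recursively build a dichotomous key from a 0/1 matrix.

    matrix maps a turtle label to {criterion: 0 or 1}.
    Returns a list of turtle labels (a leaf) or a dict describing one step.
    """
    names = sorted(matrix)
    if len(names) <= 1:
        return names
    # Variance of each criterion over the turtles still to be separated.
    variances = {c: pvariance([matrix[n][c] for n in names]) for c in criteria}
    best = max(criteria, key=lambda c: variances[c])
    if variances[best] == 0:
        # No criterion separates these turtles any further.
        return names
    present = {n: matrix[n] for n in names if matrix[n][best] == 1}
    absent = {n: matrix[n] for n in names if matrix[n][best] == 0}
    return {
        "criterion": best,
        "present": build_key(present, criteria),
        "absent": build_key(absent, criteria),
    }

# Hypothetical presence/absence matrix (not data from the study).
criteria = ["crit_a", "crit_b", "crit_c"]
matrix = {
    "T1": {"crit_a": 1, "crit_b": 1, "crit_c": 0},
    "T2": {"crit_a": 1, "crit_b": 0, "crit_c": 0},
    "T3": {"crit_a": 0, "crit_b": 1, "crit_c": 1},
    "T4": {"crit_a": 0, "crit_b": 0, "crit_c": 0},
    "T5": {"crit_a": 0, "crit_b": 0, "crit_c": 0},
    "T6": {"crit_a": 1, "crit_b": 0, "crit_c": 1},
}
key = build_key(matrix, criteria)
print(key["criterion"])
```

In this toy matrix, `crit_a` splits the six turtles three against three and so has the highest variance, while `T4` and `T5` share identical scores and end up in the same leaf, mirroring the individuals that the keys could not separate.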

Allowing the Most Obvious Factors to Be Accounted for First.
To allow the most obvious distinguishing characteristics to be accounted for first in a key, but still try to minimise the number of steps taken, each characteristic was given a "priority" based on its ability to be accurately identified from a photograph (Table 2). The matrix developed above was then multiplied by this priority value for the appropriate characteristic. For example, differentiating the number of parietal scales was considered the most obvious step and was given the highest priority of 13. As such, instead of values of 1 or 0, this characteristic had values of either 13 or 0 for presence or absence. Using the refined matrix, the same process as above was used to develop the key. Multiplying by the "priority" value would therefore increase the variability of the characteristic across all turtles, meaning it was more likely to be selected early in the above process. Hereafter this is referred to as the "priority key", with the key in the section above referred to as the "nonpriority key."
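The effect of the weighting can be illustrated with a small Python sketch. It assumes a hypothetical six-turtle matrix with two made-up criteria, `parietal_feature` (priority 13) and `balanced_feature` (priority 1); the function name `pick_criterion` is likewise an assumption for illustration. Because multiplying a 0/1 column by a priority w scales its variance by w squared, a high-priority characteristic can be selected first even when it splits the turtles unevenly.

```python
from statistics import pvariance

def pick_criterion(matrix, priorities=None):
    """Return the criterion with the highest (optionally weighted) variance."""
    criteria = list(next(iter(matrix.values())))

    def weighted_variance(criterion):
        weight = 1 if priorities is None else priorities[criterion]
        return pvariance([weight * row[criterion] for row in matrix.values()])

    return max(criteria, key=weighted_variance)

# Hypothetical matrix: one feature splits 1 vs 5, the other 3 vs 3.
matrix = {
    "T1": {"parietal_feature": 1, "balanced_feature": 1},
    "T2": {"parietal_feature": 0, "balanced_feature": 1},
    "T3": {"parietal_feature": 0, "balanced_feature": 1},
    "T4": {"parietal_feature": 0, "balanced_feature": 0},
    "T5": {"parietal_feature": 0, "balanced_feature": 0},
    "T6": {"parietal_feature": 0, "balanced_feature": 0},
}
priorities = {"parietal_feature": 13, "balanced_feature": 1}  # assumed values

unweighted_choice = pick_criterion(matrix)
weighted_choice = pick_criterion(matrix, priorities)
print(unweighted_choice, weighted_choice)
```

Without weights the evenly splitting feature wins; with the priorities applied, the more obvious feature is asked first, at the cost of a less balanced split.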

Combining Steps: Nondichotomous Keys.
The priority key initially involved many choices related to the number of parietal scales. Here, we combined these strictly dichotomous steps into a single multiple-choice question, for example, "how many parietal scales are present?", with possible answers ranging from 1 to 5. To avoid skewing the balance of variability (and hence order) developed in the priority key, the matrix developed for the priority key was unchanged, but the question "how many parietal scales are present?" was asked at the point where the first question relating to the number of parietal scales occurred. The same approach was applied to the number of spots at the parietal base, the number of temporal scales, and the positions of frontoparietal scale ticks. Other than this, the process for developing this key (hereafter referred to as the "combined key" or "combination key") was identical to that of the priority key.
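A multiple-choice step of this kind can be sketched as a simple grouping operation. The Python below is a hedged illustration only: the scale counts are invented, and `split_by_count` is a hypothetical helper, not part of the published method.

```python
from collections import defaultdict

def split_by_count(counts):
    """Group turtles by their answer to one multiple-choice question."""
    branches = defaultdict(list)
    for turtle, answer in sorted(counts.items()):
        branches[answer].append(turtle)
    # Order the branches by answer so the key reads 1, 2, 3, ...
    return dict(sorted(branches.items()))

# Hypothetical numbers of parietal scales per turtle.
parietal_scales = {"T1": 2, "T2": 3, "T3": 2, "T4": 1, "T5": 5, "T6": 3}

branches = split_by_count(parietal_scales)
for answer, turtles in branches.items():
    print(f"{answer} parietal scale(s): {turtles}")
```

Here a single question yields four branches, whereas a chain of yes/no questions about the same count could need up to three steps to reach the last group.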

Testing the Efficiency and Success Rate of the Keys.
The usability and time taken to identify individuals were determined for all three keys using a group of undergraduate biosciences students. The students were given a short presentation (~30 min) explaining the anatomy of a sea turtle's head scales. Unique characteristics of the scales, the terminology of the scale names, and other scientific terms used in the keys were explained. Each test user was provided with a photographic identification guide, a table of definitions of key terms, and a reference photograph of many characteristics (as per Figure 1). Each student was required to identify turtles from photographs, using one of the three keys. In total, 27 students took part in the survey, with nine students using each key. Each key was tested with nine different groups of photographs, each containing five images of different turtles to identify; there were therefore 45 images, and the 27 students made 135 identifications in total. Students using the same key had different groups of photographs, and the same nine groups were used for each key, making the group of photographs a repeated measure in the analysis. Photographs were selected randomly, but clarity of the image and ability to see the key distinguishing features of the head and neck were ensured, since the aim of the study was to identify which keys were most user-friendly, not to show that they could distinguish individuals from photographs of varying quality. While each group of photographs did not contain more than one image of the same turtle, different photosets did contain different images of the same turtle. Students were asked to record their steps through the key, to determine if and when they went wrong, and also to record the time taken to identify each individual. Repeated measures ANOVAs were calculated for each of the dependent variables: (1) number of turtles correctly identified, (2) proportion of the way through the key before a mistake was made (arcsine transformed), and (3) time taken to identify the individuals in the five images.
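For readers unfamiliar with the transformation applied to variable (2), the sketch below shows the arcsine square-root form that is commonly used for proportions. The text states only that the proportions were "arcsine transformed", so the square-root variant and the function name `arcsine_sqrt` are assumptions made here for illustration.

```python
import math

def arcsine_sqrt(proportion):
    """Arcsine square-root transform of a proportion in [0, 1] (radians)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))

# Hypothetical proportions of the key navigated before the first mistake.
proportions = [0.0, 0.25, 0.5, 1.0]
transformed = [arcsine_sqrt(p) for p in proportions]
print(transformed)
```

The transform maps 0 to 0 and 1 to pi/2, stretching values near the ends of the range, which is the usual motivation for applying it before an ANOVA on proportions.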

Results
All three keys are available for visualisation online at https://public.me.com/richardstafford1. All three keys could identify most individuals uniquely. Of the individuals the keys failed to distinguish, some were consistently inseparable by all keys. The computer-generated keys could not make individual distinctions for 15 individuals, and the combination key had 17 individuals not totally separable. However, despite not being able to identify every individual, the combination key could identify all but 20 of the 2,593 photographic records of turtles obtained in the study period (a >99% success rate for photographs obtained). Hence, only individuals rarely seen (i.e., photographed no more than twice) were not identified by the key. Furthermore, the turtles not identified by differences in head markings could be further separated by use of morphological differences on their carapace, although, for simplicity, these features are not included in the current key (see Discussion).
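The percentages implied by these counts can be checked with simple arithmetic. The short Python sketch below uses only figures reported in this paper (2,593 photographs, 20 unidentified, 17 of 54 turtles not fully separable by the combination key); the variable names are illustrative.

```python
total_photos = 2593
unidentified_photos = 20
total_turtles = 54
inseparable_turtles = 17  # combination key

# Share of photographic records the combination key could identify.
photo_success = (total_photos - unidentified_photos) / total_photos
# Share of individuals the combination key could not fully separate.
inseparable_share = inseparable_turtles / total_turtles

print(f"photographs identified: {photo_success:.2%}")
print(f"individuals not separable: {inseparable_share:.1%}")
```

The contrast between the two figures reflects the fact that the inseparable individuals were photographed only rarely.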
The combination key had a shorter total number of steps (n = 214) than the priority (n = 364) or the nonpriority key (n = 353). While the combination key was shorter than the other keys, it was not fully dichotomous; the number of steps was reduced by combining several steps together. The ability of students to correctly identify individuals and the proportion of the way through the key that students navigated before making mistakes both differed significantly between the keys (Table 3); however, the time taken to identify individuals did not differ significantly between keys (Table 3). In terms of correct identification and proportion of the key navigated successfully, the combination key outperformed the other two keys (Figure 2). Post-hoc Tukey tests revealed that significant differences occurred between the combination key and the other two keys for both the number of correct identifications and the proportion of the way through the key (P < 0.05). No significant differences were found between the priority and nonpriority keys (P > 0.05).
Table 3: Summary of repeated measures ANOVAs to identify differences between the three keys in terms of correct identifications, time taken, and the proportion of the key navigated before a mistake was made.

Discussion
Firstly, this study demonstrates that a large percentage of foraging turtles can be identified from photographs. While this has been previously reported (e.g., [24,25]), these results indicate the applicability of the technique to the resident Akumal Bay population. The use of photographs creates much less disturbance to foraging turtles and eliminates problems associated with tagging [4-6,26], such as tag loss, increased risk of infections, impaired swimming performance, or the need to capture a turtle while nesting in order to tag it [25-28]. Therefore, the use of this technique may allow a greater understanding of the natural foraging behaviour of this species.
The results also demonstrate that effective keys can be made to identify turtles from photographs. Keys show an improvement in their performance (as judged by the number of successful identifications made by end users) if (1) obvious distinguishing factors are included early in the key and (2) the length of the key is reduced by combining steps. There was a significant difference between the performance of the combination key and the nonpriority key for both correct identifications and the proportion of the key navigated, indicating the importance of these two improvements. For example, while the nonpriority key was itself designed to minimise the number of dichotomous steps, the mean correct identification rate by students using this key was <50%, compared to >80% for the combination key, which has fewer steps on average and lists important characteristics first.
While the current keys cannot identify all individuals uniquely, and have a lower successful identification rate by volunteers than some previous studies (e.g., [24]), these shortcomings in fact highlight the importance of the processes undertaken in this study. The current study has focussed solely on the head and neck of turtles, rather than on characteristics that are in some cases more obvious, such as shell markings. Largely, this has been done to ensure simplicity in the design of the keys by the computer-based or semi-computer-based methods. By including features such as shell characteristics, not only could all turtles be uniquely identified, but the success rate might also be improved if some of the obvious shell characteristics were included early in the design of the key. The choice of characteristics to include in a key, and the priorities assigned to these characteristics, is clearly subjective and will ultimately influence the structure of the keys. Given the relatively high proportion of the population which could not be separated uniquely using the keys (31%), but the low proportion of total photographs (<1%) which could not be identified by any of the keys, it is highly probable that the choice of characteristics used to differentiate between individuals was determined by studying the most commonly seen individuals. Such an approach is suitable for small and relatively closed populations, but may not be so robust if population sizes are larger (i.e., >100 individuals) or populations are open.
It is important that the proportion of correct identifications is established, as this has implications for any subsequent work, for example, capture-mark-recapture analysis. Based on genetic marker studies, where exact identification of individuals can be problematic, a successful identification rate of >94% is needed to ensure that population estimates will be reliable [29]. While this is higher than the identification rate obtained here, it is likely that such values could be obtained with keys using features from the entire turtle, as they have been in other turtle photo-identification studies [24]. Furthermore, as with the use of genetic markers, statistical processes applied to the identification may be able to improve identification rates, especially for individuals that show high territoriality [30].
The methods presented here for development of a key are easy to transfer to other species, as long as a suitable list of distinguishing characteristics can be drawn up and ranked in order of the most obvious features.As such, the process may be more transferable than computer identification systems, which can require, for example, positions of spots to be inputted [13,15] and are not directly transferable to stripes [14] or to numbers of scales [18, this study].
While it is easily possible to manually insert a few new individuals (<10% of the total population size) into appropriate positions in a key, (semi)automating the construction of the key through extraction of maximum variance means that if many individuals in a population change (e.g., more than 10%, through either immigration or emigration), it would be easy and not especially time consuming to reconstruct the key from scratch (i.e., by repeating the entire key creation process with all current individuals). Reconstructing the key in this way when large changes to a population occur would ensure that the key remains simple to use and continues to use the most obvious features first in its design. Moving from a paper-based key to a computer-based system, as presented here, would also allow such changes to be managed simply and cost effectively.
Since many studied populations of marine vertebrates consist of relatively small numbers of individuals, keys such as these can be very suitable techniques for behavioural studies or for estimating numbers of individuals (through capture-mark-recapture methods, provided the assumptions on accurate identification mentioned above can be met). However, if the population of interest is very large, then keys would become very large and would need to resort to minor discriminating features to identify individuals. As such, the procedures detailed here are best for examining smaller populations (<100 individuals).
While fully automated computer identification methods for species or individuals should be developed, for example, similar to the fully automated astronomy recognition system available on Flickr (http://www.flickr.com/groups/astromery), in the meantime, user-friendly keys could be an important step in allowing effective processing of "citizen science" collected photographic data. While individuals of some species (e.g., species with characteristic spots such as leopards, cheetahs, or even manta rays) can be identified solely by these characteristics [13,16], other species, such as sea turtles, require a number of characteristics to be used for individual identification. As such, development of identification keys, where key characteristics are identified first, would be a useful step in creating automatic identification systems for these animals, akin to the feature-matching applications of face recognition technology [20].

Figure 1: Key diagnostic features used to distinguish individual turtles from photographs of their heads. Refer to Table 1 for further details of abbreviations.

Figure 2: Mean (±S.E., n = 27) (a) success rate of correctly identifying five individuals, (b) time taken to complete five identifications, (c) proportion of key successfully navigated during identification.

Table 1: Definitions of scale characteristics, anatomical terminology, and shorthand used.

Table 2: List of factors used to distinguish turtles in the developed keys. Priority indicates a ranking system from 1 to 13, where 13 indicates the most obvious characteristic to separate individuals; this ranking was used as a weighting factor in the development of the priority key. * For the combined key, all these factors were weighted as 8.
* Number of spots at parietal scale base 1