Mapping antibiotic-resistant Neisseria gonorrhoeae isolates in Metropolitan Toronto: Issues of scale, positional accuracy and confidentiality

Department of Geography and Environmental Studies, Wilfrid Laurier University, Waterloo, Ontario; and Laboratory Centre for Disease Control, Ottawa, Ontario

Correspondence: Dr Jody Decker, Department of Geography and Environmental Studies, Wilfrid Laurier University, Waterloo, Ontario N2L 3C5. Telephone 519-884-1970 ext 2215, fax 519-725-1342, e-mail jdecker@mach1.wlu.ca

Received for publication July 30, 1996. Accepted January 14, 1997

JF Decker, B Sharpe, J-AR Dillon. Mapping antibiotic-resistant Neisseria gonorrhoeae isolates in Metropolitan Toronto: Issues of scale, positional accuracy and confidentiality. Can J Infect Dis 1997;8(5):273-278.

The primary objective of this paper was to investigate the methodological implications of mapping Neisseria gonorrhoeae using the partial three-digit postal code instead of the complete six-digit postal code. The reporting locations of N gonorrhoeae isolates submitted from hospitals, doctors' offices, private and provincial laboratories, and sexually transmitted disease (STD) clinics were used as a model. Specifically, the paper focused on variations in geographical distributions of STD data when mapped at different aggregations of postal code data and at different map scales. Such variations are of importance to those who analyze the spatial epidemiology of STDs, and the accessibility and use of health care services. This analysis showed that three-digit postal codes are useful in summarizing overall geographic distributions, but greatly reduce positional accuracy, which can lead to inaccurate delineation of service areas and populations served. The six-digit postal code is more appropriate for detailed analysis addressing behavioural issues. The analysis demonstrated that six-digit codes can be used without breaching individual confidentiality.

In Canada, the incidence of Neisseria gonorrhoeae infection has steadily declined for over a decade; from 1981 to 1993, it decreased from 56,336 to 6820 cases per year (1). However, plasmid-mediated resistance, notably in penicillinase-producing isolates of N gonorrhoeae (PPNG), has remained a problem (2-4). Isolates of N gonorrhoeae with plasmid-mediated resistance to tetracycline have increased (5). As gonorrhoea persists as a public health problem (6), appropriate diagnosis and treatment are essential disease control strategies, and the surveillance of antimicrobial susceptibility is essential to monitor ongoing and emerging resistance. Knowledge of the geographic distribution of N gonorrhoeae can be a useful tool for assessing patterns and their underlying processes.
A cursory examination of the published data shows important geographic variations and a pronounced urban bias in the distribution of N gonorrhoeae isolates with plasmid-mediated resistance. The provinces of Ontario and Quebec contained 92% of total Canadian PPNG cases in 1989, and the large urban centres in these provinces, Toronto and Montreal, accounted for 68.2% and 55.5% of their respective provincial totals (7). Furthermore, Brown et al (8) reported a high prevalence of PPNG in Ontario, particularly in the Metropolitan Toronto area, along with isolates carrying plasmid-mediated resistance to tetracycline (tetracycline-resistant N gonorrhoeae [TRNG]), alone or in combination with PPNG (PPTRNG).
Further geographic analysis with more disaggregated data may reveal patterns of spatial concentration within the city. The locational data commonly used for this purpose in Canada are the first three digits of the postal code. Our objective was to investigate some of the methodological implications of using the partial three-digit postal code versus the more positionally accurate complete six-digit postal code. In this instance, the postal code of the client's place of residence was not available because of confidentiality concerns, and it is rarely available to researchers analyzing sexually transmitted diseases (STDs), so we mapped the submitting locations of isolates in Metropolitan Toronto. Using submitting location data, we demonstrated that, methodologically, the use of partial three-digit postal codes had significant limitations. The partial codes changed the boundaries of service areas and were thus unreliable descriptions of core population groups within those service areas. Because they shifted geographic locations, sometimes by up to a kilometre, they also raised questions about the allocation of submitting locations.

DATA AND METHODS
The data, consisting of 1329 isolates submitted by 133 submitting locations in Metropolitan Toronto between 1988 and 1992, were provided by the National Laboratory for Sexually Transmitted Diseases (NLSTD), Laboratory Centre for Disease Control, Ottawa, Ontario. Data included laboratory data on antibiotic resistance, as well as specific isolate data (eg, submitting location) which the Ontario Laboratory Services Branch, Etobicoke, had originally given the NLSTD. Only PPNG, TRNG and PPTRNG isolates of N gonorrhoeae were recorded. These antibiotic-resistant isolates were used because over 90% of such reported isolates were forwarded to the NLSTD as part of a national surveillance program, thus providing a detailed snapshot of such isolates.
Each record in the data set contained a postal code for the submitting location from which isolates were sent for analysis. These locations included practitioners' offices; provincial STD clinics; primary medical laboratories, Central Public Health Laboratory, Laboratory Services Branch, Ontario Ministry of Health, Etobicoke, Ontario; or hospitals. Although the data set initially contained 133 different submitting locations, not all had a unique location or postal code. Many of the submitting locations were smaller laboratories in larger facilities and thus shared postal codes. As a result, the number of submitting location records with unique six-digit postal codes was 75 rather than 133. For the purpose of comparison, the records were further aggregated using only the first three digits of the postal codes. This reduced the number of submitting locations to 44.
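The effect of this aggregation can be sketched in a few lines of Python. The records below are invented for illustration and are not the study data, and the function names are hypothetical; the sketch only shows how truncating a code to its first three characters collapses distinct six-character locations into one.

```python
def normalize(postal_code: str) -> str:
    """Upper-case a postal code and drop spaces ('m5g 1x8' -> 'M5G1X8')."""
    return postal_code.replace(" ", "").upper()


def count_unique_locations(postal_codes, digits=6):
    """Count distinct locations after truncating each code to `digits` characters."""
    return len({normalize(code)[:digits] for code in postal_codes})


# Invented submitting-location records; two laboratories share one code.
records = ["M5G 1X8", "M5G 1X8", "M5G 2C4", "M5B 1W8", "M4Y 1H1", "M4Y 2G8"]

six = count_unique_locations(records, digits=6)    # complete postal code
three = count_unique_locations(records, digits=3)  # first three characters only
```

With these six invented records, five distinct locations remain at the six-character level and three at the three-character level, a reduction of the same kind as the one from 133 to 75 to 44 described above.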
To map their spatial distribution, geographic coordinates for each submitting location were assigned using the postal code address. This was accomplished using the postal code conversion file of Statistics Canada. For each postal code in Canada, this file specifies its precise location, using either latitudinal and longitudinal coordinates or the Universal Transverse Mercator (UTM) system, established international systems of specifying point locations on the globe. The postal codes consist of six alpha-numeric characters intended to describe the destination of each item of mail addressed in Canada (for example, M2M 2R9). The first character of the postal code represents a province or territory or major sector within a province (for example, 'M' represents Metropolitan Toronto). The first three characters represent forward sortation areas (FSAs), defined by Canada Post to sort mail into coarse geographic areas to speed up mail delivery. As of January 1988, there were 1296 FSAs across Canada, 99 of which were in Metropolitan Toronto (9). The geographic coordinates for a three-digit FSA's position are at the geometric centroid of an FSA, essentially an arbitrary point without any reference to the phenomenon being mapped.
The complete six-digit postal code, which includes the FSA and another three characters, denotes local delivery units, typically city blocks. These postal codes are associated with points on the map known as block face centroids. A block face is one side of a city street, and its centre is equidistant between consecutive intersections with other streets. There are 48,136 block face centroids in Metropolitan Toronto. The centroid is a good estimate (within a city block) of the submitting location, yet is still just an estimate. This is important to understand when questions of confidentiality are raised.
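The two levels of geocoding described in this section can be illustrated with a minimal Python sketch. The lookup tables are hypothetical stand-ins for the postal code conversion file, the coordinates are invented, and the function names are assumptions made for this example; the actual layout of the conversion file is not described here.

```python
import re

# Letter-digit-letter, optional space, digit-letter-digit. This checks the
# structure only, not whether the code is actually in use.
POSTAL_CODE = re.compile(r"^([A-Z]\d[A-Z]) ?(\d[A-Z]\d)$")


def split_postal_code(code: str):
    """Return (FSA, local delivery unit) for a six-character postal code."""
    match = POSTAL_CODE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a six-character postal code: {code!r}")
    return match.group(1), match.group(2)


# Hypothetical lookup tables (UTM easting, northing in metres; values invented).
BLOCK_FACE_CENTROIDS = {"M2M2R9": (628_350.0, 4_849_900.0)}
FSA_CENTROIDS = {"M2M": (628_900.0, 4_850_400.0)}


def geocode(code: str, level: str = "six"):
    """Look up a block face centroid ('six') or an FSA centroid ('three')."""
    fsa, ldu = split_postal_code(code)
    if level == "six":
        return BLOCK_FACE_CENTROIDS[fsa + ldu]
    return FSA_CENTROIDS[fsa]
```

Calling `geocode("M2M 2R9")` returns the invented block face centroid, whereas `geocode("M2M 2R9", level="three")` returns the invented FSA centroid, so the same record is placed at two different points depending on the level of aggregation.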

RESULTS
The number of isolates sent by a submitting location was partly a function of the nature of the facility. For example, 43 of 133 submitting locations reported sending only one isolate. Medical diagnostic laboratories and special STD clinics submitted higher numbers of isolates. One location submitted 148 isolates, while the average submitted across all 133 locations was 10 isolates (Figure 1). Because of their exclusive mandate, STD clinics were differentiated from all other submitting locations.
Figure 2 shows the geographical distribution of antibiotic-resistant N gonorrhoeae isolates across submitting locations in Metropolitan Toronto. Figure 3 shows the distribution of isolates at the more detailed scale of downtown Toronto. A comparison of the two maps in Figure 2 demonstrates the effect of using FSA postal codes compared with complete six-digit postal codes. Mapped using the partial codes, 44 submitting locations were located at any one of the 99 FSA centroids in Metropolitan Toronto (Figure 2 [top]). In Figure 2 (bottom), 75 submitting locations were located at any one of the possible 48,136 block face centroids. The complete postal code data revealed more pronounced clustering of submitting locations, especially evident in downtown Toronto and throughout the suburbs (Figure 2 [bottom]).
The necessity for increased positional accuracy also became apparent when delineating arbitrary service (catchment) areas for submitting locations. Consider, for example, the service area for submitting locations at designated point 1 on Figure 2 (top). At the three-digit postal code level, four submitting locations were confined within the one FSA, giving the impression that only one FSA was serviced. This is not the case on Figure 2 (bottom), where several of the previously aggregated locations were more accurately positioned on the boundary between two FSAs. Point 2 on the maps provides a similar example. Point 3 on the maps illustrates a striking example of the total displacement of an STD clinic. In Figure 2 (top), point 3 is located in the centre of an FSA, whereas in Figure 2 (bottom) this same clinic is associated with three different FSAs.
When analyzing the information on Figure 3 (top and bottom) on a larger scale, issues of positional accuracy became even more apparent. The submitting location reporting the largest number of isolates in the upper FSA moved at least 1 km to the southeast of that FSA on Figure 3 (bottom) when six-digit postal codes were used. The location of the STD clinics shifted dramatically from a random pattern in Figure 3 (top) to a clustered pattern around Yonge Street in Figure 3 (bottom). Figures 2 and 3 demonstrate that the spatial patterns of submitted antibiotic-resistant N gonorrhoeae vary considerably when working at the two levels of aggregation possible with the Canadian postal code data.
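A displacement of this kind can be quantified as the straight-line distance between the FSA centroid and the block face centroid assigned to the same submitting location. The Python sketch below assumes both points are expressed in UTM metres within the same zone; the coordinates are invented and the function name is hypothetical.

```python
import math


def displacement_m(fsa_centroid, block_face_centroid):
    """Straight-line distance in metres between two UTM points in the same zone."""
    dx = block_face_centroid[0] - fsa_centroid[0]
    dy = block_face_centroid[1] - fsa_centroid[1]
    return math.hypot(dx, dy)


# Invented coordinates: the block face centroid lies 800 m east and
# 600 m south of the FSA centroid, that is, to the southeast.
fsa_point = (630_000.0, 4_835_000.0)
block_point = (630_800.0, 4_834_400.0)

shift = displacement_m(fsa_point, block_point)  # 1000.0 m for these values
```

For the invented points the shift is 1000 m, the same order of magnitude as the movement of at least 1 km reported for Figure 3.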

DISCUSSION
Epidemiological studies of different infectious diseases have noted the importance of working at multiple scales (10,11). Three-digit postal codes are generally effective in summarizing overall geographical distributions when mapped at the metropolitan level. However, when mapped at a neighbourhood scale, the use of three-digit postal codes reduces the number of submitting locations and visibly reduces their positional accuracy. The movement of submitting locations away from their true locations, sometimes by as much as a kilometre, can have considerable implications. Inaccuracies in the position of submitting locations may lead to misidentification of core groups in the transmission of N gonorrhoeae. Descriptions of the socio-economic characteristics of a population living near a medical clinic involve defining a service area [...]. If the research goal is to gain an overall sense of the geographic pattern of N gonorrhoeae reporting throughout a large metropolitan region, to support the rationalization of services and allocation of scarce resources, or to answer questions about where to place a new facility, then the use of three-digit postal codes is appropriate. Differences between suburbs and the central city, or between public health unit jurisdictions within the metropolitan region, can be readily discerned. Further analysis of the generalized pattern might relate the distribution of STD submitting locations to the more generalized socio-demographic patterns and transportation infrastructure of the city.
At the neighbourhood scale, on the other hand, if the research goal is to analyze the behavioural patterns of clients or to associate submitting locations with population characteristics, the full postal code is more appropriate. Use of client postal code data has produced several important results. A study by Potterat et al (12) in Colorado in 1985 demonstrated that geographic clustering of STDs was evident at the census tract level within downtown areas. Rothenberg (13) analyzed all submitted cases of N gonorrhoeae (resistant and nonresistant strains) in upstate New York, using data combined from 1975 to 1980. He was able to identify high prevalence census tracts that he suggested may be responsible for continuing endemicity of the disease in that state. Studies of N gonorrhoeae in other places have also shown that further geographical analysis can identify core transmission groups and locate core transmission areas (14,15).
Ecological studies on social and sexual networks and support systems have shown that STDs tend to concentrate in neighbourhoods (16,17). In Canada, only one ecological study of STDs has been done, yet it relied on three-digit postal codes to detect and compare high rate areas with census tract characteristics within those areas. This report found that N gonorrhoeae outbreaks occur in "low-income minority populations involving drug use and high risk sexual behaviour" (18).
The use of six-digit postal codes in the above Canadian study would have been more accurate. Confidentiality would not have been compromised because six-digit codes are only locational estimates centred on one side of a street of a city block. Furthermore, when mapped, 'confidential' data are often aggregated by several years in order to get an adequate number of cases and then grouped into ranges. This challenges the reticence and rigidity of some institutions, agencies and government departments that restrict data accessibility on the basis of confidentiality. Moore (19) referred to such restrictions as the "one-legged Tarzan syndrome". He asserted that it is indeed highly improbable that an individual could ever be recognized, let alone any inferences made, from linkages between aggregated cases and socio-economic data. Moore's thoughts are echoed by another prominent geographer, Peter Gould (20), who models the geographic spread of diseases. Gould stated that in all studies of mathematical modelling, geographers have never identified any one person, have no need to do so and would not know what to do with the information if they had it. He added that the loss of confidentiality "is a genuine fear, but it has been taken to extreme and absurd lengths" (20).
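The pooling of several years and the grouping of counts into ranges mentioned above can be sketched as follows. The class breaks, labels and yearly counts are hypothetical, chosen only for illustration; no particular set of ranges is implied for the maps discussed here.

```python
from bisect import bisect_right

# Hypothetical class breaks and labels (not taken from the study).
BREAKS = [1, 5, 20, 50]
LABELS = ["0", "1-4", "5-19", "20-49", "50+"]


def count_to_range(count: int) -> str:
    """Map a raw case count to the label of the range that contains it."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return LABELS[bisect_right(BREAKS, count)]


def pool_years(counts_by_year):
    """Sum one location's counts over several years before classifying."""
    return sum(counts_by_year.values())


yearly = {1988: 2, 1989: 0, 1990: 3, 1991: 1, 1992: 1}  # invented counts
pooled = pool_years(yearly)     # 7 cases over the five years
label = count_to_range(pooled)  # falls in the "5-19" range
```

Only the range label would appear on the map, so a reader sees that a location reported between 5 and 19 cases over five years rather than any single case or year.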

CONCLUSION
Depending on the research question, careful consideration must be given to the use of partial versus complete postal codes in analyzing data. We used submitting location data to demonstrate this methodological point. An analysis that related submitting location data to place of residence data would be very valuable, and it is clear from the discussion that confidentiality would not be compromised. Questions could then be addressed such as: What type of facility do clients favour given they have a choice in their immediate vicinity? How far are they willing to travel to receive treatment? What is the percentage of repeat cases (not isolates) of individuals out of the total number of submissions? To move on to these more complicated issues of the geography of STDs, it is important to understand the ramifications of using three- versus six-digit postal codes, regardless of the type of data. However, in issues of such sensitivity, as with STDs, positional accuracy becomes all the more important.
Figures 2 (top) and 3 (top) represent submitting locations of N gonorrhoeae isolates at the three-digit FSA level. Figures 2 (bottom) and 3 (bottom) used the complete six-digit postal code for submitting locations of N gonorrhoeae isolates. Regardless of the type of submitting location, the overall pattern revealed by Figure 2 (top) is hierarchical. Downtown Toronto had the highest concentration of submitting locations, several with high isolate counts. Surrounding the downtown area was a semicircular ring, in which most submitting locations reported low or moderate isolate counts. Submitting locations with high isolate counts were located in the northern, eastern and western suburbs of Metropolitan Toronto. STD clinics handled less than one-quarter of the total number of antibiotic-resistant isolates submitted. Of the 133 submitting locations, 19 were STD clinics, accounting for 237 of the 1321 isolates. Most of the STD clinics were concentrated downtown. Of the few STD clinics in the suburbs, only the Scarborough clinic on the east side submitted a moderately high isolate count. In contrast, three STD clinics in the centre of the map, east and west of Yonge Street, submitted only a few isolates.

Figure 1) Number of submitting locations in Metropolitan Toronto, 1988 to 1992, by frequency of isolates reported

Figure 3) Top Spatial distribution of penicillinase-producing isolates of Neisseria gonorrhoeae (PPNG) and tetracycline-resistant N gonorrhoeae (TRNG) by submitting locations in Toronto, 1988 to 1992, using three-digit postal codes. Bottom Spatial distribution of PPNG and TRNG by submitting locations in Toronto, 1988 to 1992, using six-digit postal codes. Symbols (circles and boxes) are proportionally graduated according to the number of isolates reported at each submitting location. Forward sortation area boundaries are included as a frame of reference. STD Sexually transmitted disease