Intra- and interlaboratory reproducibility of an ELISA serological test for Lyme disease

Graduate Department of Community Health, and Department of Preventative Medicine and Biostatistics, University of Toronto, Toronto, and Laboratory Centre for Disease Control, Ottawa, Ontario
Correspondence and reprints: Dr M Tammemagi, Graduate Department of Community Health, Fourth Floor, McMurrich Building, University of Toronto, Toronto, Ontario M5S 1A1. Telephone (416) 978-2058, Fax (416) 978-1883, e-mail mtammem@pmb.med.utoronto.ca
Received for publication June 16, 1994. Accepted September 21, 1994
MC TAMMEMAGI, JW FRANK, M LEBLANC, H ARTSOB. Intra- and interlaboratory reproducibility of an ELISA serological test for Lyme disease. Can J Infect Dis 1995;6(2):90-95.

Lyme disease is a multisystemic tick-borne bacterial infection of humans and animals caused by the spirochete Borrelia burgdorferi. Lyme disease has spread rapidly across the United States and has increased dramatically in incidence since its first recognition in the mid-1970s (1). Clinical diagnosis and public health surveillance are aided by the detection of antibodies against B burgdorferi. The accuracy and reproducibility of available serological tests have not been demonstrated convincingly and have been questioned by many authorities (2-10).
A consensus conference on Lyme disease has recommended that all laboratories in Canada use a commercial ELISA for initial testing of patients' sera for antibodies to B burgdorferi (11). Since 1990, an annual national quality control assessment (QCA) of provincial diagnostic laboratories that test for Lyme disease has been carried out by the Laboratory Centre for Disease Control (LCDC) in Ottawa. Some Canadian laboratories employ the Whittaker Lyme Stat (WLS) ELISA Kit (Whittaker Bioproducts, Maryland) to measure immunoglobulin (Ig) G and IgM antibodies in human serum. This paper reports the results of a study of the inter- and intralaboratory reproducibility of the WLS ELISA used by three laboratories in the LCDC 1991 QCA.

MATERIALS AND METHODS
In May 1991, the LCDC conducted a QCA of eight provincial laboratories that perform tests to detect antibodies to B burgdorferi. Three provincial laboratories used the WLS ELISA, while the other laboratories used alternative tests. The results of only two provincial laboratories are reported here, because one set of sera was spoiled. The sera of 27 individuals were included in this study. The sera were diluted 1 in 10 parts of phosphate buffered saline, and each laboratory was provided 0.24 mL aliquots. Sera were shipped frozen on dry ice and the LCDC maintained frozen aliquots for comparative testing. All laboratories except the LCDC tested the samples blind to the putative 'gold standard' status of the sera, and all laboratories tested the sera blind to the outcomes from the other laboratories. Laboratory 6 and the LCDC each used kits from a single lot for their repeat testing, but the kit lots differed between laboratory 6 and the LCDC. The lot number for laboratory 2 was not available. Due to a scarcity of material, Western blots were not performed by these laboratories.
Sera in the sample came from both infected and uninfected individuals, and represented as close to bona fide 'gold standard' cases and noncases as could practically be obtained. The 'gold standard' status was based on a detailed clinical and diagnostic work-up (including Western blots) by experts in the field. Sixteen sera were considered positive. These patients had been exposed in endemic areas, had clinical manifestations characteristic of Lyme disease, and experts (LCDC and/or Centers for Disease Control and Prevention, Fort Collins, Colorado) felt reasonably certain about their status. In many positive cases the pathognomonic sign, erythema chronicum migrans, was observed or the etiological agent was detected bacteriologically. Patient background and source of sera are presented in Table 1. All 11 negative sera came from individuals living in nonendemic areas and were considered by expert opinion to represent individuals free from Lyme disease. Five negative and 12 positive sera were obtained from the Centers for Disease Control and Prevention at Fort Collins. The serum from one of these patients was consistently nonreactive on serological testing despite the fact that B burgdorferi had been isolated from the patient. Thus, the maximum number of serological reactors expected was 15. Five of the 11 negative sera came from healthy patients and six came from patients who were thought to have a higher likelihood of being false reactors (9,12-14): three from syphilis patients, two from patients who were antinuclear antibody-positive and one from a patient who was rheumatoid-positive.
The WLS test has its colorimetric outcome measured optically. The diagnostic significance of the optical density or absorbance is not interpreted directly; rather, the reading is converted into a predictive index value (PIV), which is a standardized score that is then interpreted as a positive, equivocal or negative result. The absorbance readings for a set of results derived from one test kit are transformed into PIVs (diagnostic scores) by use of a calibration (regression) line obtained by measuring the absorbance of three standards provided in each test kit and fixing those absorbance readings to predetermined index values called labelled index values. Although the labelled index values may vary slightly among kits, they are set so that the PIVs among kits in different laboratories and at different times are always interpreted in the same way. A PIV below 0.80 is considered seronegative; a PIV equal to or greater than 1.00 is considered positive; and a PIV between these values is considered an equivocal result.
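The conversion from absorbance to PIV and then to a diagnostic category can be sketched as follows. This is a minimal illustration only: the ordinary least-squares fit is an assumption about how the calibration line is obtained, the function names are hypothetical, and the absorbance and labelled index values for the three standards are invented for the example and are not taken from any kit. Only the 0.80 and 1.00 cut-points come from the text.

```python
def fit_calibration(absorbances, labelled_index_values):
    """Fit a least-squares line mapping absorbance to index value.

    Assumption: an ordinary least-squares line through the kit standards;
    the kit's own procedure may differ.
    """
    n = len(absorbances)
    mean_x = sum(absorbances) / n
    mean_y = sum(labelled_index_values) / n
    sxx = sum((x - mean_x) ** 2 for x in absorbances)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(absorbances, labelled_index_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def absorbance_to_piv(absorbance, slope, intercept):
    """Convert one absorbance reading to a predictive index value (PIV)."""
    return slope * absorbance + intercept


def classify_piv(piv):
    """Apply the cut-points stated in the text: <0.80, 0.80 to <1.00, >=1.00."""
    if piv < 0.80:
        return "negative"
    if piv >= 1.00:
        return "positive"
    return "equivocal"


# Hypothetical standards (invented values, for illustration only).
standard_absorbances = [0.2, 0.5, 1.0]
standard_index_values = [0.4, 1.0, 2.0]
slope, intercept = fit_calibration(standard_absorbances, standard_index_values)

# Classify one hypothetical patient reading.
piv = absorbance_to_piv(0.45, slope, intercept)
print(round(piv, 2), classify_piv(piv))
```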
The 27 sera were tested twice by the LCDC, once by laboratory 2, and twice by laboratory 6. This allowed the assessment of interlaboratory reproducibility to be made among three laboratories, and intralaboratory reproducibility in two different laboratories. Intra- and interlaboratory reproducibility were assessed by computation of the weighted kappa statistic (κw) (15-17) for the data in the categorical scale, that is, through use of the diagnostic classifications of the PIVs. An unweighted kappa statistic for multiple raters and multiple categories (16) was computed for interlaboratory comparisons. Only one set of results was included from each laboratory in computation of the latter statistic. From the two sets of results provided by the LCDC and laboratory 6, the least reproducible set was used, so that the estimate of the overall kappa would tend to be conservative. The sensitivity and specificity were calculated via the method described by Poynard et al (18) and Simel et al (19) for tests with trichotomously categorized results. Essentially, this method excludes intermediates from the calculation. Sensitivity is the ratio of true positives to (true positives + false negatives). Specificity is the ratio of true negatives to (true negatives + false positives).
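A weighted kappa for two sets of three-category results can be computed along the following lines. This is a sketch under stated assumptions: the text does not say which disagreement weights were used, so linear weights (a negative/positive disagreement counts twice as much as a disagreement involving an equivocal) are assumed here, and the function name and example ratings are hypothetical.

```python
CATEGORIES = ["negative", "equivocal", "positive"]


def weighted_kappa(ratings_a, ratings_b, categories=CATEGORIES):
    """Weighted kappa for two paired sets of ordinal classifications.

    Assumption: linear disagreement weights |i - j| / (k - 1).
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both sets must rate the same sera")
    k = len(categories)
    n = len(ratings_a)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions p[i][j].
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected disagreement.
    d_obs = 0.0
    d_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            d_obs += w * observed[i][j]
            d_exp += w * row[i] * col[j]

    if d_exp == 0.0:
        raise ValueError("kappa is undefined when all results share one category")
    return 1.0 - d_obs / d_exp


# Hypothetical results for six sera tested twice.
first_run = ["negative", "negative", "equivocal", "positive", "positive", "positive"]
second_run = ["negative", "negative", "positive", "positive", "positive", "negative"]
print(weighted_kappa(first_run, second_run))
```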

RESULTS
Test results and the diagnostic interpretation for each patient are presented in Table 2. The sensitivity for the different sets of results ranged from 0.88 to 0.94 (mean ± SD 0.91±0.04) and the specificity ranged from 0.82 to 1.0 (0.96±0.08). In two sets of results there were no equivocal results. The remaining three sets of results had only one equivocal result each, and the patient was different in each set of results.
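Sensitivity and specificity for a test with three result categories, computed by leaving equivocal results out of the calculation as described in the methods, can be sketched as below. The function name and the example data are hypothetical and do not reproduce any set of results from this study.

```python
def sensitivity_specificity(true_status, test_results):
    """Sensitivity and specificity with equivocal results excluded.

    true_status: "positive" or "negative" ('gold standard') per serum.
    test_results: "positive", "equivocal" or "negative" per serum.
    """
    tp = fn = tn = fp = 0
    for truth, result in zip(true_status, test_results):
        if result == "equivocal":
            continue  # intermediates are excluded from both ratios
        if truth == "positive":
            if result == "positive":
                tp += 1
            else:
                fn += 1
        else:
            if result == "negative":
                tn += 1
            else:
                fp += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical panel of seven sera.
truth = ["positive"] * 4 + ["negative"] * 3
results = ["positive", "positive", "negative", "equivocal",
           "negative", "negative", "equivocal"]
print(sensitivity_specificity(truth, results))
```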
Ten different pair-wise comparisons could be made from the five sets of results: eight interlaboratory and two intralaboratory. The weighted kappa statistics for the different comparisons are reported in Table 3. The weighted kappa statistics for the 10 comparisons range from 0.74 to 0.96. The intralaboratory weighted kappa statistics were 0.96 and 0.81 (mean 0.89). The weighted kappa statistics for the eight interlaboratory comparisons range from 0.74 to 0.93 (mean ± SD 0.87±0.07). The overall unweighted kappa for multiple raters and multiple categories for the three laboratories was 0.80 (95% CL 0.70, 0.91).
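An unweighted kappa for several laboratories and several categories can be sketched as follows. The formulation shown is the Fleiss-type kappa for multiple raters; that this is the exact statistic taken from reference 16 is an assumption, and the function name and example ratings are hypothetical.

```python
def fleiss_kappa(ratings_per_serum, categories):
    """Unweighted kappa for multiple raters (assumed Fleiss formulation).

    ratings_per_serum: one list per serum, holding one classification
    per laboratory; every serum must be rated by the same number of labs.
    """
    n_sera = len(ratings_per_serum)
    n_raters = len(ratings_per_serum[0])

    # counts[i][j]: number of laboratories placing serum i in category j.
    counts = [[row.count(c) for c in categories] for row in ratings_per_serum]

    # Overall proportion of classifications falling in each category.
    p_cat = [sum(counts[i][j] for i in range(n_sera)) / (n_sera * n_raters)
             for j in range(len(categories))]

    # Per-serum agreement among all pairs of laboratories.
    per_serum = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
                 for row in counts]

    p_observed = sum(per_serum) / n_sera
    p_chance = sum(p * p for p in p_cat)
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical classifications of four sera by three laboratories.
categories = ["negative", "equivocal", "positive"]
ratings = [
    ["positive", "positive", "positive"],
    ["negative", "negative", "negative"],
    ["positive", "equivocal", "positive"],
    ["negative", "negative", "positive"],
]
print(fleiss_kappa(ratings, categories))
```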

DISCUSSION
Past studies of Lyme serological tests have concluded that these tests lacked reproducibility (8-10), but questionable statistical methods and study designs were used. For example, some studies used measures of trend or association, not concordance, to appraise agreement, such as Spearman's rank correlation coefficient (9) or Wilcoxon's signed rank test (10). Wilcoxon's signed rank test tests for a difference in the means between two sets of results, but it is possible for two sets of results to have very discrepant results and yet have similar means. Biased samples (9) or restricted samples (8) were used in some reproducibility studies. Also, some studies failed to deal with intermediate or equivocal results (8,9), grouping these results together with either positive results (8) or negative results (9). Some studies were not conducted blind (8).
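The point that two sets of results can have similar means and still disagree on every specimen can be illustrated with a small constructed example. The PIVs below are invented for the illustration; the helper names are hypothetical, and the mean paired difference is used only as a stand-in for the kind of systematic shift a paired location test looks for.

```python
def classify_piv(piv):
    """Cut-points stated in the methods: <0.80 negative, >=1.00 positive."""
    if piv < 0.80:
        return "negative"
    if piv >= 1.00:
        return "positive"
    return "equivocal"


def mean_difference(xs, ys):
    """Mean of the paired differences between two sets of readings."""
    return sum(x - y for x, y in zip(xs, ys)) / len(xs)


def classification_agreement(xs, ys):
    """Fraction of specimens given the same diagnostic category."""
    same = sum(classify_piv(x) == classify_piv(y) for x, y in zip(xs, ys))
    return same / len(xs)


# Invented PIVs: each specimen is negative in one set and positive in the other.
set_a = [0.5, 1.5, 0.5, 1.5]
set_b = [1.5, 0.5, 1.5, 0.5]

print(mean_difference(set_a, set_b))           # no systematic shift
print(classification_agreement(set_a, set_b))  # yet no specimen agrees
```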
For clinical purposes the PIVs are interpreted in the ordinal scale as negative, equivocal or positive. The weighted kappa statistic is considered by many authorities to be the measure of reproducibility of choice for such ordinal data (16,17). It not only measures concordance but also takes into consideration chance agreement, and it weights the degree of disagreement. A kappa statistic of 1 reflects perfect agreement, of 0 indicates random agreement, and of less than 0 suggests disagreement. The range of values encountered in this study is generally believed to reflect good to excellent agreement (16,20).
As the test results, the PIVs, naturally occur in the continuous form, why was the analysis of reproducibility not carried out with the data in the continuous scale, via methods such as the intraclass correlation coefficient (21-23) or Bland and Altman's limits of agreement (24-26)? Measures of reproducibility with the data in the continuous form may overemphasize lack of reproducibility at extremes where the exact result does not matter diagnostically and may underemphasize reproducibility around important diagnostic interpretive cut-points. Because medical decision-making is based on the interpretation of the PIV and not the particular value itself, ordinal assessment of reproducibility is appropriate here.
The overall unweighted kappa for multiple raters and multiple categories for the three laboratories demonstrates excellent reproducibility (κ=0.80). Its 95% CLs (0.70, 0.91) are narrower than those of the pair-wise weighted kappas, in large part because of the increase in sample size. The interlaboratory reproducibility (mean κw=0.87) did not appear to be appreciably worse than the intralaboratory reproducibility (mean κw=0.89) as might have been expected, but it is unlikely that this study had the power to demonstrate a difference if it existed. Also, lack of intralaboratory blindness to the first set of results does not appear to be a biasing factor because results are remarkably uniform across and within laboratories. Interlot reproducibility as reflected in comparisons of results between the LCDC and laboratory 6 appears high (mean κw=0.87).
It is somewhat reassuring that the number of equivocal results was small (range 0 to 3.7%), because these results often present more of a diagnostic conundrum than clear-cut positive or negative cases. No consistent pattern could be detected indicating whether certain individuals were more likely to have equivocal results or whether equivocal outcomes occurred more in true positives than true negatives, or vice versa. Given the sample size, this study lacks the power to detect such patterns if they exist.

CONCLUSIONS
As measured by the weighted kappa statistic, the WLS ELISA demonstrated good to excellent inter- and intralaboratory reproducibility in the LCDC 1991 QCA. Ideally, a test should be tested in a sample that is similar to the population in which it will be used. In the case of rare diseases, such as Lyme disease, this is impractical because an extraordinarily large sample size is required to obtain an adequate number of true positive cases. The LCDC 1991 QCA is a much more heterogeneous sample, with many more positive cases, than is encountered in the homogeneous population tested in public diagnostic laboratories in which most sera test negative. The variance of the study sample results is at least four times greater than that observed in the Ontario Provincial Laboratory (27). Increased heterogeneity inflates the kappa statistic and leads to an overestimate of reproducibility. When marginal proportions are unbalanced, as would occur for results obtained in most North American laboratories, the maximum achievable magnitude of kappa drops to well below 1 (28,29).
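The dependence of kappa on the case mix can be illustrated with a simplified model. The sketch below is not the analysis used in this study: it assumes a two-category (positive or negative) test, two laboratories with identical sensitivity and specificity, and errors that are independent between laboratories given the true status. The sensitivity and specificity are the mean values reported above, the panel prevalence is 16 positives out of 27 sera, and the 1% prevalence is a hypothetical figure chosen to represent a routine low-prevalence setting.

```python
def expected_kappa(prevalence, sensitivity, specificity):
    """Expected unweighted kappa between two laboratories.

    Assumptions: binary results, identical test characteristics in both
    laboratories, and independent errors given the true status.
    """
    p, se, sp = prevalence, sensitivity, specificity

    # Probability that both laboratories report the same category.
    both_positive = p * se * se + (1 - p) * (1 - sp) * (1 - sp)
    both_negative = p * (1 - se) * (1 - se) + (1 - p) * sp * sp
    observed = both_positive + both_negative

    # Chance agreement from the shared marginal proportion of positives.
    marginal_positive = p * se + (1 - p) * (1 - sp)
    chance = marginal_positive ** 2 + (1 - marginal_positive) ** 2

    return (observed - chance) / (1 - chance)


panel_prevalence = 16 / 27   # positives in the study panel
routine_prevalence = 0.01    # hypothetical low-prevalence setting

print(expected_kappa(panel_prevalence, 0.91, 0.96))
print(expected_kappa(routine_prevalence, 0.91, 0.96))
```

Under these assumptions the same test yields a markedly lower kappa in the low-prevalence setting than in the panel, which is the direction of the effect described in the text.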
Users of the WLS test may therefore feel fairly confident that it is relatively reproducible by the standards applied to most laboratory tests. Laboratory factors that can improve and maintain the reproducibility of Lyme serological tests are discussed by Duffey and Salugsugan (30). However, as others have pointed out (31,32), this high reliability does not translate into high validity. In low prevalence situations the positive predictive value of the test may still be quite low, and this has led to the practice of confirming positive ELISAs with Western Blot tests, which are believed to be more specific. Deficiencies of Western Blot tests as confirmatory tests for Lyme disease have recently been pointed out (33). It is therefore important to emphasize that the diagnosis of Lyme disease is primarily based on appropriate clinical presentation, with laboratory tests providing backup (33).
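The effect of low prevalence on the positive predictive value follows from Bayes' theorem and can be sketched as below. The sensitivity and specificity are the mean values reported in the results; the prevalence figures are hypothetical and chosen only to show the trend, and the function name is an illustrative choice.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test (Bayes' theorem)."""
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1 - prevalence) * (1 - specificity)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical prevalences among the sera submitted for testing.
for prevalence in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(prevalence, 0.91, 0.96)
    print(prevalence, round(ppv, 2))
```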

TABLE 1
The panel of sera used in the LCDC 1991 quality control assessment
Serum number    Status    History supplied with serum
CDC Centers for Disease Control and Prevention, Fort Collins, Colorado; LCDC Laboratory Centre for Disease Control, Ottawa; LSPQ Laboratoire de Santé Publique du Québec, Ste Anne de Bellevue; SPHL Saskatchewan Public Health Laboratory, Regina. All Lyme disease sera from CDC were from clinically characterized Lyme disease patients; Normal is serum from healthy donors in nonendemic areas; *Bacteriologically confirmed Lyme disease but negative in serological tests; Patient 17 clinically fit Lyme disease but serological tests were more characteristic of tertiary or latent syphilis; †Patients had histories of travel to recognized Lyme-endemic areas
TAMMEMAGI et al. Can J Infect Dis Vol 6 No 2 March/April 1995