Bayesian Estimation of the True Prevalence and of the Diagnostic Test Sensitivity and Specificity of Enteropathogenic Yersinia in Finnish Pig Serum Samples

Bayesian analysis was used to estimate the pig-level and herd-level true prevalence of enteropathogenic Yersinia in serum samples collected from Finnish pig farms. The sensitivity and specificity of the commercially available ELISA used to detect antibodies against enteropathogenic Yersinia were also estimated. The Bayesian analysis was performed in two steps. The first step estimated the prior true prevalence of enteropathogenic Yersinia from data obtained in a systematic review of the literature. In the second step, data on the apparent prevalence (cross-sectional study data), the prior true prevalence (first step), and the estimated sensitivity and specificity of the diagnostic method were used to build the Bayesian model. The true prevalence of Yersinia in slaughter-age pigs was 67.5% (95% PI 63.2–70.9). The true prevalence of Yersinia in sows was 74.0% (95% PI 57.3–82.4). The estimated sensitivity and specificity of the ELISA were 79.5% and 96.9%, respectively.


Introduction
Yersiniosis is a foodborne disease in humans caused by Yersinia enterocolitica and, to a lesser extent, by Yersinia pseudotuberculosis, and it is the third most reported zoonotic disease in the EU [1]. Y. enterocolitica infections have been associated with the consumption of pork products [2–4]. Healthy pigs are often asymptomatic carriers of Y. enterocolitica, and they are a major reservoir for human pathogenic strains [3,5,6].
Diagnostic tests are used for prevalence surveys. Ideally, true prevalence should be estimated from apparent prevalence by adjusting for the sensitivity and specificity of the diagnostic test [7]. Sensitivity and specificity estimates commonly differ among validation studies, which can be explained by differences among reference populations and sampling strategies [8]. When differences in sensitivity and specificity between diagnostic methods are not taken into account, they can cause considerable variation in prevalence estimates. For this reason, reliable estimates of the sensitivity and specificity of diagnostic tests are necessary.
Various methods have been described for the detection of antibodies against enteropathogenic Yersinia in serum samples of pigs at farms and in juice extracted from tonsils and meat at farms and slaughterhouses [9–14]. However, these diagnostic tests have different sensitivities and specificities, making direct comparison of the results difficult.
The true prevalence can be estimated from an apparent prevalence by using frequentist or Bayesian methods. Frequentist methods assume that the true prevalence is a fixed unknown quantity, namely the probability that a randomly chosen individual from the population is infected [7]. One estimator of true prevalence is the Rogan-Gladen estimator [15]. Bayesian inference has been advocated as more flexible and useful for solving complex problems [16], and it allows prior information to be incorporated in addition to the data. The Bayesian approach has been used in the validation of diagnostic methods, providing a reliable estimate of sensitivity and specificity when there is more than one diagnostic test but no gold standard. An example of this is the evaluation of the diagnostic test for detection of classical swine fever [17]. A Bayesian hidden variable model has also been developed to study the occurrence of foodborne pathogens in the pork production chain [18].
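As a brief illustration of the frequentist adjustment mentioned above, the Rogan-Gladen estimator can be sketched in a few lines. The function and argument names are our own, and the truncation to the interval [0, 1] is a common convention rather than something prescribed in this paper.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Rogan-Gladen point estimate of true prevalence, truncated to [0, 1]."""
    # Se + Sp - 1 must be positive, otherwise the test carries no information.
    youden = sensitivity + specificity - 1.0
    if youden <= 0.0:
        raise ValueError("Se + Sp must exceed 1 for the estimator to be defined")
    estimate = (apparent_prevalence + specificity - 1.0) / youden
    # Sampling noise can push the raw estimate outside [0, 1]; truncate it.
    return min(1.0, max(0.0, estimate))


# Illustrative inputs only: an apparent prevalence of 0.5467 combined with the
# sensitivity and specificity point estimates quoted in the abstract.
example = rogan_gladen(0.5467, 0.795, 0.969)
```

With these inputs the estimate is approximately 0.675. Unlike the Bayesian model, this point estimate treats sensitivity and specificity as known constants.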
The true prevalence of Y. enterocolitica in pigs sampled in farms and slaughterhouses is not directly observable. It should be estimated using the apparent prevalence together with the sensitivity and specificity of the diagnostic test [7]. Neither the sensitivity nor the specificity of the commonly used tests is known with certainty, which introduces additional uncertainty when adjusting apparent prevalence. In the present study, the true prevalence of enteropathogenic Yersinia in the serum of Finnish pigs was estimated using a Bayesian analysis. The sensitivity and specificity of the diagnostic test were also estimated.

Definitions.
Definitions of prevalence, sensitivity, and specificity were considered as defined by Greiner and Gardner [8] and Thrusfield [19]. Apparent prevalence (Ap) is the proportion of the pig population that tests positive using a diagnostic method, and true prevalence (Tp) is the proportion of truly infected pigs in that population. The sensitivity (Se) of a diagnostic test is the proportion of infected animals that the test detects as positive. Specificity (Sp) of a diagnostic test is the proportion of noninfected animals that the test detects as negative.
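These four quantities are linked by a standard identity, stated here as background rather than quoted from the cited sources: test-positive animals are either infected animals that the test detects or noninfected animals that it wrongly flags.

```latex
\[
\mathrm{Ap} = \mathrm{Tp}\cdot\mathrm{Se} + (1-\mathrm{Tp})\,(1-\mathrm{Sp})
\]
% Illustration with the point estimates quoted in the abstract
% (Tp = 0.675, Se = 0.795, Sp = 0.969):
\[
\mathrm{Ap} = 0.675 \times 0.795 + 0.325 \times 0.031 \approx 0.547
\]
```

The numerical line is only an illustration of the identity and is not an apparent prevalence reported by the study.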

2.2. Modelling Approach. The model was built in two steps using the Bayesian analysis to calculate the posterior probabilities, depending on data and prior distribution. The model estimated the true prevalence of Yersinia in serum samples. The prior distribution of the true prevalence was estimated based on a systematic review in the first step of the model, and later on introduced in the second step.

First Step. The first step is a model to estimate the prior distribution of the true prevalence and the prior distributions for the sensitivity and specificity of the ELISA test.
Systematic Review. The objective of the systematic review was to assess the apparent prevalence of Yersinia in serum samples in slaughter-age pigs and sows from farms in Finland. For this review, the questions, type of intervention, population, and outcome were used to create the inclusion criteria [20]: any study or survey that evaluates the presence of and risk factors for antibodies against enteropathogenic Yersinia in serum samples from slaughter-age pigs and sows in farms using a commercially available enzyme-linked immunosorbent assay (ELISA) kit (Pigtype Yopscreen, Labor Diagnostik, Leipzig, Germany).
Papers written in any language were searched, and when data was published in different articles by the same authors or in reviews, we considered them only once to avoid duplication. Data from unpublished studies was not available. The keywords used for the search were Yersinia, pigs or pig farms, and prevalence or seroprevalence as words in the titles or the abstracts when searching in the National Center for Biotechnology Information (NCBI) PubMed database or as the topic when searching in Web of Science. We also looked over the reference lists of the relevant papers and in auxiliary data sources, such as the Google search engine.
All studies identified were assessed against the defined inclusion criteria. Selection of studies was carried out in two stages: the first stage by screening the title and abstract of the manuscripts and the second stage by screening the full text. Four publications were selected from the systematic review, while 8 manuscripts were excluded because they failed at least one of the inclusion criteria (the list of manuscripts is shown in Table 1). For example, the diagnostic tests described in the excluded manuscripts differed from the commercial ELISA kit, and thus so did their sensitivities and specificities, or the samples were taken at slaughterhouses.
The data collected from each of the manuscripts was as follows: the number of positive pigs and the number of positive farms (or herds), the number of sampled pigs and the number of sampled farms (or herds), age of sampled pigs, methodology used for analysing the samples, when and where (country level) the study was carried out, the authorship, and the journal of publication. When the text and the tables of a manuscript were inconsistent, the data from the tables were used. Data was collected from the selected studies and recorded in Excel (Microsoft Corp., Redmond, WA).
Construction of the Model with Literature Data. The number of positive pigs (or herds) and the number of sampled pigs (or herds) obtained from the systematic review were used as observed data. Noninformative (uniform) Beta(1, 1) distributions were assigned as the prior distributions of pig and herd level true prevalence in the literature data, since this distribution is commonly used as a prior for binomial proportions [29] when the prevalence is a random variable. As a result, the posterior distributions of the prevalence, based on the literature data, were used as informative prior distributions in the second step below. In this way, the information from previous literature is utilized, under the assumption that the selected collection of literature represents roughly similar prevalence in pig populations in Finnish studies.
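The conjugate update behind this step can be sketched as follows. The counts are hypothetical placeholders and are not taken from the reviewed studies; the function names are ours, and the summary is obtained by simple Monte Carlo sampling rather than by the software used in the paper.

```python
import random


def prevalence_posterior(positives, sampled, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for a binomial proportion under a Beta prior."""
    return prior_a + positives, prior_b + (sampled - positives)


def summarise_beta(a, b, draws=20000, seed=1):
    """Monte Carlo median and 95% probability interval of Beta(a, b)."""
    rng = random.Random(seed)
    sample = sorted(rng.betavariate(a, b) for _ in range(draws))

    def pick(q):
        # Empirical quantile taken from the sorted draws.
        return sample[int(q * (draws - 1))]

    return pick(0.5), pick(0.025), pick(0.975)


# Hypothetical pooled counts (assumption): 60 positive pigs out of 100 sampled.
a, b = prevalence_posterior(positives=60, sampled=100)
median, lower, upper = summarise_beta(a, b)
```

With a uniform Beta(1, 1) prior the posterior is Beta(61, 41) for these placeholder counts, and its median and 95% PI are the kind of summaries carried forward to the second step.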
Information provided by the validation report published by the manufacturer of the ELISA test (Pigtype Yopscreen, Labor Diagnostik, Leipzig, Germany) was used to estimate the prior distributions for the sensitivity and specificity of the serological analyses. The sensitivity of the ELISA was modelled using the validation report of the manufacturer, where x out of n infected animals tested positive; then beta(x + 1, n − x + 1) gives the posterior distribution of sensitivity, assuming a binomial model and a uniform prior distribution for sensitivity [30].
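A minimal sketch of this calculation is given below, writing x for the number of infected animals that tested positive and n for the number of infected animals tested (our notation). The counts 62 and 90 are illustrative only: they were chosen because they reproduce the beta(63, 29) sensitivity prior quoted in the Results, and the actual counts in the manufacturer's report are not given in this text.

```python
def beta_from_validation(true_positives, infected_tested):
    """Beta(x + 1, n - x + 1) posterior for sensitivity under a uniform prior."""
    if not 0 <= true_positives <= infected_tested:
        raise ValueError("need 0 <= true_positives <= infected_tested")
    return true_positives + 1, infected_tested - true_positives + 1


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Illustrative counts (assumption, not taken from the validation report).
a, b = beta_from_validation(true_positives=62, infected_tested=90)
prior_mean = beta_mean(a, b)
```

For these counts the prior mean is 63/92, or about 0.68.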
Estimates of the pig and herd prevalence were calculated in this first step by using a model mathematically similar to the one used in the second step. The obtained posterior medians and 95% PI (probability interval) of the prevalence were used as inputs in the software Betabuster (downloaded from http://www.epi.ucdavis.edu/diagnostictests/betabuster.html) to obtain the shape parameters for the prior beta distributions to be introduced in the second step of the modelling. When the estimated value was between 0 and 0.5, the 95th percentile was chosen, and when the estimated value was between 0.5 and 1, the 5th percentile was chosen, according to the instructions provided by the copyright holders of Betabuster. The beta prior distribution of the specificity was also obtained using this procedure.
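The kind of calculation such a tool performs can be approximated as follows. This is a rough sketch under our own assumptions and is not Betabuster's actual algorithm: the most likely value is treated as the mode of the beta distribution, only the case of an estimate above 0.5 paired with a lower percentile is handled, and the beta distribution function is evaluated by numerical integration.

```python
import math


def beta_cdf(x, a, b, steps=2000):
    """P(X <= x) for X ~ Beta(a, b) by composite Simpson's rule (assumes a, b > 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(t):
        if t <= 0.0:
            return 0.0  # the density vanishes at 0 when a > 1
        return math.exp(log_norm + (a - 1.0) * math.log(t)
                        + (b - 1.0) * math.log(1.0 - t))

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * pdf(i * h)
    return total * h / 3.0


def beta_from_mode_and_percentile(mode, value, prob=0.05):
    """Find Beta(a, b) with the given mode such that P(X <= value) = prob.

    Assumes 0 < value < mode < 1 and prob < value, so that a solution exists.
    """
    def params(k):  # k = a + b controls how concentrated the prior is
        return mode * (k - 2.0) + 1.0, (1.0 - mode) * (k - 2.0) + 1.0

    low, high = 2.0 + 1e-9, 10000.0
    for _ in range(60):  # bisection on the concentration k
        mid = 0.5 * (low + high)
        if beta_cdf(value, *params(mid)) > prob:
            low = mid   # too much mass below `value`: concentrate further
        else:
            high = mid
    return params(0.5 * (low + high))
```

A call such as `beta_from_mode_and_percentile(0.8, 0.6)` returns the shape parameters of a beta distribution whose mode is 0.8 and whose 5th percentile is 0.6. The search works because, with the mode held fixed, increasing a + b pulls probability mass away from values below the mode.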

Second Step. The second step is a model to estimate the pig and herd true prevalence of Yersinia in serum samples in Finland.

Collection and Analyses of Samples.
The study was carried out in the Varsinais-Suomi region, which accounts for 28% of the total pigs in Finland (pig census from Matilda, Agricultural Statistics of Ministry of Agriculture and Forestry, 2010). The number of pigs to be sampled was calculated as previously described by Vilar et al. (2013) [21]. Individual serum samples from 120 slaughter-age pigs (50 kg or more) and 107 sows were collected in 16 farms and analysed for the occurrence of antibodies against Yersinia. The total number of sampled pigs and the number of pigs positive using the diagnostic test in each farm were recorded to calculate the prevalence at pig and herd level. Serum samples were tested for the presence of Yersinia antibodies by using a commercially available ELISA kit (Pigtype Yopscreen, Labor Diagnostik, Leipzig, Germany), with a cut-off optical density (OD) value of 0.2 according to the manufacturer's instructions.

Construction of the Model with Observed Data.
A binomial sampling model was assumed because the population size in each farm was large compared with the sample size. The size of the farms was on average 630 slaughter-age pigs and 306 sows. The average number of slaughter-age pigs sampled in each farm was 13, and the average number of sows sampled in each farm was 9. The Bayesian model to estimate true prevalence was mathematically constructed from the conditional distributions (shown in Figure 1). There is variation between farms in true prevalence, so it is not realistic to assume a common prevalence for all farms. These differences are accounted for by modelling prevalence as a farm-specific parameter. Finally, tau represents herd true prevalence, the proportion of truly positive herds. The prior of tau was also calculated based on the literature review. The independent beta prior distributions obtained in the first step of this paper were used to take into account the uncertainty in the prevalence as well as in the diagnostic test sensitivity and specificity [31]. Thus, the priors for prevalence were based on the literature data and expressed as beta distributions, beta(α_Tp, β_Tp), conditional on the population being infected.
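One plausible reading of this structure can be written as a small generative simulation. Because the conditional distributions of Figure 1 are not reproduced in the text, the sketch below is an assumption on our part and not the authors' WinBUGS model; every parameter value in it is hypothetical.

```python
import random


def simulate_farm(n_sampled, tau, tp_a, tp_b, se, sp, rng):
    """Simulate the herd status, true prevalence and test-positive count of one farm."""
    infected_herd = rng.random() < tau  # herd is truly positive with probability tau
    # Farm-specific true prevalence, drawn only if the herd is infected.
    tp = rng.betavariate(tp_a, tp_b) if infected_herd else 0.0
    # Probability that a sampled pig tests positive (apparent prevalence).
    ap = tp * se + (1.0 - tp) * (1.0 - sp)
    # Binomial draw written as a sum of independent Bernoulli trials.
    positives = sum(rng.random() < ap for _ in range(n_sampled))
    return infected_herd, tp, positives


# Hypothetical settings (assumptions): 16 farms with 13 pigs sampled in each.
rng = random.Random(42)
farms = [simulate_farm(13, tau=0.9, tp_a=8.0, tp_b=4.0, se=0.8, sp=0.97, rng=rng)
         for _ in range(16)]
```

Fitting the model reverses this process: given the observed positives per farm, it infers the farm-specific prevalences, tau, sensitivity, and specificity.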
Bayesian analysis was also used for upscaling the estimates to a larger finite population, assuming that it is similar to the study population. Pig census data was obtained from Matilda (Agricultural Statistics of Ministry of Agriculture and Forestry, 2010) and used to calculate the apparent and true prevalence of Yersinia in the whole of Finland. The upscaling was based on evaluating the average of the actual true prevalence in the study farms, avtp = mean(Tp0[1, ..., 12]), which represents the actual true prevalence in the study population of pig herds. Assuming that these farms are representative of all herds in the census, the expected number of positive pigs is avtp times the census size. The posterior distribution of this quantity was computed.
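The upscaling step amounts to a simple transformation of posterior draws, sketched below. The function name and the layout of the draws (one list per farm, aligned by iteration) are our assumptions, and the values in the example are made up.

```python
def upscale(tp_draws_by_farm, census_size):
    """Expected number of positive pigs in the census, one value per posterior draw."""
    totals = []
    # zip(*...) regroups the draws so that each tuple holds one value per farm.
    for draw in zip(*tp_draws_by_farm):
        avtp = sum(draw) / len(draw)  # average true prevalence over the study farms
        totals.append(avtp * census_size)
    return totals


# Made-up example: two farms, two posterior draws each, census of 1,000 pigs.
example_totals = upscale([[0.50, 0.75], [0.70, 0.25]], census_size=1000)
```

As a consistency check on the reported figures, the posterior median of 67.5% applied to the census of 487,776 slaughter-age pigs gives about 329,000 animals, which agrees with the predicted total in the Results.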
Models were constructed in WinBUGS 1.4.3, and the graphical representation is shown in Figure 1. Inferences were based on 50,000 iterations after a burn-in of 1,000 iterations to allow convergence. The posterior probability distributions are summarized by the median and the probability intervals (PI).
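To show what the iteration and burn-in settings mean in practice, the following is a deliberately simplified sampler. It is a sketch under several assumptions: it uses a random-walk Metropolis algorithm rather than the sampling scheme of WinBUGS, it models a single population instead of the farm-level hierarchy, it places a flat prior on true prevalence, and the data (65 positives out of 120) are hypothetical. Only the beta(63, 29) and beta(6.0, 1.1) priors are taken from the Results.

```python
import math
import random


def log_beta_kernel(x, a, b):
    """Log density of Beta(a, b) up to an additive constant."""
    return (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def log_posterior(tp, se, sp, positives, n):
    """Unnormalised log posterior for one population tested with an imperfect test."""
    ap = tp * se + (1.0 - tp) * (1.0 - sp)  # apparent prevalence
    log_lik = positives * math.log(ap) + (n - positives) * math.log(1.0 - ap)
    log_prior = (log_beta_kernel(se, 63.0, 29.0)    # sensitivity prior from the Results
                 + log_beta_kernel(sp, 6.0, 1.1))   # specificity prior from the Results
    return log_lik + log_prior  # the flat prior on tp adds only a constant


def metropolis(positives, n, iterations=50000, burn_in=1000, step=0.05, seed=1):
    """Random-walk Metropolis draws of true prevalence after discarding the burn-in."""
    rng = random.Random(seed)
    state = [0.5, 0.7, 0.9]  # starting values for Tp, Se, Sp
    current = log_posterior(*state, positives, n)
    kept = []
    for i in range(burn_in + iterations):
        proposal = [v + rng.gauss(0.0, step) for v in state]
        if all(0.0 < v < 1.0 for v in proposal):  # reject moves outside (0, 1)
            candidate = log_posterior(*proposal, positives, n)
            if rng.random() < math.exp(min(0.0, candidate - current)):
                state, current = proposal, candidate
        if i >= burn_in:
            kept.append(state[0])  # keep Tp only once the burn-in is over
    return kept


def summarise(draws):
    """Median and 95% probability interval of a list of posterior draws."""
    ordered = sorted(draws)

    def pick(q):
        return ordered[int(q * (len(ordered) - 1))]

    return pick(0.5), pick(0.025), pick(0.975)
```

Calling `summarise(metropolis(positives=65, n=120))` returns a median and 95% PI in the same form as the summaries reported in this paper, although the numbers themselves have no bearing on the study results.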

Sensitivity Analysis.
Alternative informative priors and noninformative priors were used to perform the sensitivity analysis. Different prior distributions for the prevalence were used in prior set 1. Noninformative prior distributions for prevalence and sensitivity were introduced in prior set 2. The posterior median values obtained were then compared for significant differences by a general linear model for repeated measures.

Results and Discussion
In this study a Bayesian analysis was used to provide reliable information on the prevalence of enteropathogenic Yersinia in pigs sampled at farms in Finland, and also to provide useful and relevant information on the diagnostic test commonly used for their detection. Table 2 shows the estimates of the posterior distributions of the pig and herd true prevalence from the model built in the first step. The sensitivity and the specificity of the diagnostic test are also shown. These values were used when building the model of the second step. The prior distribution of the sensitivity of the commercial ELISA used to test the serum samples for the presence of Yersinia antibodies was Se ∼ beta(63, 29), and the specificity prior distribution was Sp ∼ beta(6.0, 1.1).
The posterior probabilities obtained in the second step are shown in Table 3. The posterior distribution of the true prevalence of enteropathogenic Yersinia in slaughter-age pigs had a median value of 67.5%. The predicted total number of Yersinia-positive slaughter-age pigs was 329,000 (308,400–345,800) out of 487,776 slaughter-age pigs in the whole of Finland. The posterior distribution of the true prevalence of enteropathogenic Yersinia in sows had a median value of 74.0%. The true prevalence of enteropathogenic Yersinia in serum samples from slaughter-age pigs estimated in the present study was lower than the apparent prevalence reported previously [14,21,32–38]. However, those studies were based on a frequentist approach. On the other hand, when there is no prior information, frequentist analysis produces good estimates of prevalence [39], and this would correspond to Bayesian analysis with noninformative priors. However, some background information about sensitivity and specificity is needed in both cases. Table 3 also presents the sensitivity analysis conducted by comparing the model with the original set of priors with the other sets of priors. Sensitivity analysis serves to illustrate how prior knowledge could affect the posterior estimates [40].
Although the values were not identical, no significant differences were found between posterior medians and their PI. The model used was not very sensitive to the choice of priors, as the posterior probabilities for the three sets of priors were similar across the pig populations.
It has been reported that seroprevalence is associated with the prevalence of Yersinia in tonsils [10] and in faeces [11,21]. However, the seroprevalence values of Yersinia are usually higher than the prevalence values of Y. enterocolitica in faeces, both collected in farms [21]. This difference can be explained because antibodies are usually present long after an infection starts [10,41] and because the commercial ELISA test used in the present study detects antibodies against the outer membrane proteins and thus detects infections with all pathogenic Yersinia. However, in Finland the prevalence of Yersinia pseudotuberculosis has been reported to be less than 8% [42,43]. The sensitivity and specificity of the ELISA diagnostic test were 79.5% and 96.9%, respectively. These estimates indicate that the commercial ELISA test, although good, had lower sensitivity and specificity than those previously reported by the manufacturer. Some studies [11–13] have used the commercial ELISA test, but the accuracy characteristics of 100% sensitivity and 100% specificity reported by the manufacturer have not been discussed. Furthermore, no test can be considered to have both 100% sensitivity and 100% specificity, because estimates vary among validation studies owing to factors such as sampling strategies, technical variation between laboratories, choice of gold standard, and state of infection [8].

Conclusions
By using the estimates obtained by the Bayesian analysis it was possible to estimate the true prevalence of Yersinia in the population under study without sampling all animals. Consequently, the model constructed in the present study can be extended to a country's population, which would overcome the logistical difficulties of sampling large numbers of animals. The Bayesian approach provided a reliable estimate of the sensitivity and the specificity of the commonly used commercial ELISA for detection of enteropathogenic Yersinia.