An Innovative Approach to Functionality Testing of Analysers in the Clinical Laboratory

The established protocols for evaluating new analytical systems produce indispensable information with regard to quality characteristics, but in general they fail to analyse the system performance under routine-like conditions. We describe a model which allows the testing of a new analytical system under conditions close to the routine in a controlled and systematic manner by using an appropriate software tool. Performing routine simulation experiments, either reflecting imprecision or method comparison characteristics, gives the user essential information on the overall system performance under real intended-use conditions.


INTRODUCTION
Conventionally, the evaluation of new analytical systems is conducted on the basis of established protocols related to analytical performance like those published by CLSI (NCCLS) [1], ECCLS [2], or other national organizations. Sometimes additional exploratory testing is performed in the hope of gaining some insight into the routine behaviour of the system. While the standard protocols produce indispensable information with regard to the quality characteristics, they fail to analyse the system performance under routine conditions. Similarly, random testing only gives a chance opportunity to detect system malfunctions.
Obviously, there is no easy way to experimentally test the course of events that lead to an erroneous assay result and to verify its incorrectness under nonstandardized, that is, routine conditions. This situation is caused by the increasingly complex interactions of hardware, software, and chemistry which are found on modern analysis systems. The manual generation of experiments that describe a sample sequence with variable specimens and request patterns is feasible, but it is cumbersome to produce and provides no information on the correctness of the measurements. A better approach can be obtained by developing a software tool which generates appropriate experimental request lists, allowing the testing of a new system under conditions close to the routine in a controlled and systematic manner, and which provides sufficient data reduction for the analysis of the results.

METHODS
We have integrated this functionality into our evaluation software tool, Windows-based computer-aided evaluation (WinCAEv) [3,4], so that the generation of simulation experiments, the transfer of requests to the instrument, the on-line data capture, and the result evaluation can be easily achieved with the available programme functions [5]. The routine simulation (hereafter referred to as RS) module allows for the definition and generation of typical test request patterns.
A request list that reflects a routine laboratory workload can be simulated by WinCAEv using appropriately defined parameters. The required input data comprise typical test distributions, sample materials, and sample request profiles. As an alternative to this programme-supported simulation, laboratory-specific request lists captured electronically from the laboratory information system or directly from the routine analysers can be converted automatically by WinCAEv into a corresponding worklist for the system under evaluation.
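How a request list might be simulated from such inputs can be illustrated with a minimal sketch. It is not the WinCAEv implementation: the function name, the profile names, the test codes, and the frequencies are all hypothetical, and the only idea taken from the text is that samples are drawn at random according to predefined request profiles and sample materials.

```python
import random


def simulate_request_list(n_samples, profiles, profile_weights, materials, seed=0):
    """Build a randomized worklist (hypothetical sketch, not WinCAEv code).

    profiles:        dict mapping a profile name to its list of test codes
    profile_weights: dict mapping a profile name to its relative frequency
    materials:       dict mapping a sample material to its relative frequency
    """
    rng = random.Random(seed)  # fixed seed so the same list can be regenerated
    profile_names = list(profiles)
    weights = [profile_weights[name] for name in profile_names]
    material_names = list(materials)
    material_weights = [materials[name] for name in material_names]

    worklist = []
    for i in range(1, n_samples + 1):
        profile = rng.choices(profile_names, weights=weights, k=1)[0]
        material = rng.choices(material_names, weights=material_weights, k=1)[0]
        worklist.append({
            "sample_id": f"S{i:04d}",
            "material": material,
            "tests": list(profiles[profile]),
        })
    return worklist


# Illustrative inputs only; real profiles would come from the laboratory.
EXAMPLE_PROFILES = {
    "lipids": ["CHOL", "TRIG"],
    "renal": ["CREA", "UREA"],
    "single": ["GLUC"],
}
EXAMPLE_WEIGHTS = {"lipids": 3, "renal": 2, "single": 5}
EXAMPLE_MATERIALS = {"serum": 8, "urine": 2}

example_worklist = simulate_request_list(
    20, EXAMPLE_PROFILES, EXAMPLE_WEIGHTS, EXAMPLE_MATERIALS, seed=1
)
```

Seeding the generator keeps the sample sequence reproducible, which matters when the same experiment is repeated with and without provocations.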
Three main types of RS experiments were designed to allow systematic testing of an analytical system. In this way, different types of routine situations can be modelled and the respective performance situation evaluated.
In repetitions of the same experiment, routine provocations are introduced during the randomized processing to further challenge the system's performance under various conditions. The type and number of provocations depend on the system under evaluation, but generally include events regularly encountered during operation in a routine laboratory, such as calibration and quality control measurements, reagent switchover or exchange, sample short, STAT analysis, provocation of various data flags, and sample reruns. Errors related to instrument malfunctions or chemistry problems can be deduced from the experimental data by comparing the batch and random results. The mean, median, CV, relative 68%-median distance (md68%, a robust measure of variation) [6], minimum, and maximum of the random part are compared with those of the batch part for every analyte measured. Random and/or systematic errors will result in significant deviations such as elevated CVs and differences between the means. One can expect the imprecision in a simulated routine run to yield somewhat higher CVs than a standard batch run, because the analytical system undergoes more interactions. Based on experience from various system evaluations, we use the fol-
Usually the routine simulation experiment is performed with many different methods, and the large number of results produced has to be screened for relevant deviant results. This can easily be done by comparing the CV and the relative 68%-median distance.
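The batch-versus-random comparison can be sketched as follows. The exact definition of md68% is given in [6] and is not reproduced in this text, so the sketch assumes one plausible reading: the 68th percentile (nearest rank) of the absolute deviations from the median, expressed as a percentage of the median. All function names are hypothetical.

```python
import math
import statistics


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def md68_percent(values):
    """Relative 68%-median distance in percent.

    ASSUMPTION: nearest-rank 68th percentile of |x - median|, divided by the
    median. The authoritative definition is in reference [6].
    """
    med = statistics.median(values)
    deviations = sorted(abs(v - med) for v in values)
    rank = max(1, math.ceil(0.68 * len(deviations)))
    return 100.0 * deviations[rank - 1] / med


def summarize(values):
    """Statistics compared between the batch and the random part."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "cv": cv_percent(values),
        "md68": md68_percent(values),
        "min": min(values),
        "max": max(values),
    }


def compare_parts(batch, random_part):
    """Return both summaries for one analyte, side by side."""
    return {"batch": summarize(batch), "random": summarize(random_part)}


# Illustrative numbers only: the random part contains one deviant result.
batch_results = [100, 101, 99, 100, 100, 101, 99, 100]
random_results = [100, 101, 99, 100, 100, 101, 99, 120]
comparison = compare_parts(batch_results, random_results)
```

In this made-up example the single deviant result inflates the CV of the random part while md68% stays the same as in the batch part, which is the kind of gap between the two measures that flags a series for closer inspection.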
The system handling of the routine provocations is assessed for correctness, and the analytical results produced during and after provocations are checked for marked deviations which may represent systematic and/or random errors.
Recently we extended this experiment so that the routine simulation precision experiment can be run via a host download procedure (see below); the real routine request pattern is then reflected and a simulation by the WinCAEv software is not necessary.
RS-Series 1/2 is used for the comparison of randomized test processing in two runs. Fresh human specimens are used as sample materials, with request patterns reflecting the evaluation site's typical routine workloads. The sequence of sample processing is identical in both runs, and the same samples are reused for the second run rather than replaced with fresh ones.
Random errors can be deduced from the experimental data by comparing the deviation of the second-run results from the first-run results. The results are grouped in 7 five-percent categories between ±15% deviation. Each sample pair is categorized; a summary shows the number of samples per analyte in each category as well as a total statistic per category for the complete experiment (see Table 2 and Figure 1). Random errors will result in marked deviations between the two runs for one or more samples.
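The categorization of sample pairs can be sketched as below. The text does not spell out the category boundaries, so the sketch assumes seven categories centred on -15%, -10%, -5%, 0%, +5%, +10%, and +15%, each 5% wide, with anything beyond the outer categories counted in the nearest outer one. The function names and the data layout are hypothetical.

```python
import math
from collections import Counter

# ASSUMPTION: seven 5%-wide categories identified by their centre value.
CATEGORY_CENTRES = [-15, -10, -5, 0, 5, 10, 15]


def deviation_percent(first, second):
    """Deviation of the second-run result from the first-run result, in percent."""
    return 100.0 * (second - first) / first


def categorize(deviation):
    """Map a percent deviation to the nearest category centre, clamped to +/-15."""
    nearest = 5 * math.floor(deviation / 5.0 + 0.5)
    return max(-15, min(15, nearest))


def summarize_runs(run1, run2):
    """Count sample pairs per analyte and category.

    run1, run2: dicts mapping (sample_id, analyte) to a numeric result.
    Returns (per_analyte, total), where per_analyte maps an analyte to a
    Counter of categories and total is a Counter over the whole experiment.
    """
    per_analyte = {}
    total = Counter()
    for key, first in run1.items():
        if key not in run2:
            continue  # sample or test missing in the second run
        category = categorize(deviation_percent(first, run2[key]))
        per_analyte.setdefault(key[1], Counter())[category] += 1
        total[category] += 1
    return per_analyte, total


# Illustrative numbers only.
run1_example = {("S1", "CHOL"): 100.0, ("S2", "CHOL"): 200.0, ("S1", "GLUC"): 50.0}
run2_example = {("S1", "CHOL"): 101.0, ("S2", "CHOL"): 214.0, ("S1", "GLUC"): 50.0}
per_analyte_example, total_example = summarize_runs(run1_example, run2_example)
```

With these values, two pairs fall into the central category and one cholesterol pair (a deviation of 7%) falls into the +5% category.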
RS-Method Comparison Download allows direct comparison of the routine analyser methods (reference data) with those of the instrument under evaluation, processed in a randomized routine-like fashion. The test results and sampling patterns from the routine laboratory analyser(s) are electronically captured by WinCAEv via file import (host-download) or simply with a batch upload. Using the host-download option, sample identification numbers, requests, and results are exported from the laboratory host in a text format file (e.g., comma separated values (CSV)) and then imported into WinCAEv. No patient demographics are transmitted to WinCAEv. Method comparison statistics and graphs are generated per analyte and comparison instrument.
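The host-download route can be illustrated with a small sketch that reads a CSV export and pairs reference results with results from the instrument under evaluation. The column names (`sample_id`, `test`, `result`) are assumptions, since the actual export layout is not described here, and the median relative difference is used only as a simple stand-in for the method comparison statistics, which the text does not specify.

```python
import csv
import io
import statistics


def read_results(csv_text):
    """Parse a CSV export into {(sample_id, test): result}.

    ASSUMPTION: the file has a header row with the columns
    sample_id, test, result and carries no patient demographics.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row["sample_id"], row["test"]): float(row["result"]) for row in reader}


def relative_differences(reference, evaluation):
    """Percent differences (evaluation vs reference), grouped per analyte."""
    diffs = {}
    for key, ref_value in reference.items():
        if key in evaluation and ref_value != 0:
            diff = 100.0 * (evaluation[key] - ref_value) / ref_value
            diffs.setdefault(key[1], []).append(diff)
    return diffs


def median_relative_difference(reference, evaluation):
    """One summary number per analyte; a stand-in for full comparison statistics."""
    return {
        analyte: statistics.median(values)
        for analyte, values in relative_differences(reference, evaluation).items()
    }


# Illustrative data only.
REFERENCE_CSV = "sample_id,test,result\nS1,CHOL,5.0\nS2,CHOL,4.0\nS1,CREA,80\n"
EVALUATION_CSV = "sample_id,test,result\nS1,CHOL,5.1\nS2,CHOL,4.0\nS1,CREA,80\n"

reference_results = read_results(REFERENCE_CSV)
evaluation_results = read_results(EVALUATION_CSV)
bias_by_analyte = median_relative_difference(reference_results, evaluation_results)
```

Keying the results by sample identification number and test keeps the pairing independent of the order in which the samples were processed on either instrument.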

APPLICATIONS
Over the last decade, routine simulation experiments have become an integral part of in-house and multicentre system evaluations at Roche Diagnostics. Here, we outline some typical areas of use based on practical experience, together with examples of errors that are difficult to provoke using conventional procedures yet were observed using these experiments.
RS-Precision is an extremely effective means of testing the interaction of software with all other system components under stressed conditions. During a multicentre study of Roche/Hitachi 917 in the early nineties, for example, these experiments yielded CVs of up to 4% for test applications using a low sample volume (2 μL). An example is shown in Figure 2 for cholesterol. Of the 48 series performed during this experiment, 83% (40 series) had a CV higher than the expected 2%; 24 series had a relative 68%-median distance of more than 2%. The difference between CV and md68% indicated that several series contained clearly deviant results. Further in-house investigations revealed that a software malfunction in the sample pipetting process under certain conditions was the root cause of these conspicuous results. After correction of the software and repetition of this experiment, the CV of the cholesterol assay was below 2% in all cases.
On Roche/Hitachi 912, we found that introduction of STAT samples during operation led to intermittent incorrect data flags on STAT sample results when a sample material other than serum was selected. Investigation showed that if a STAT sample was requested on the analyser by sample disk position only, and the sample type downloaded from the laboratory host was one other than serum or plasma, the sample was correctly measured for the specified biological material but the data flagging was done as if the sample were serum. In this case, the generated data flag incorrectly indicated a rerun of the sample.
With the new generation of Roche systems, MODULAR ANALYTICS, introduced in the late nineties and combining multiple analyser modules, this experiment became indispensable and gained many new areas of use. An Intelligent Process Manager distributes the sample carriers to the various analyser modules in a way that ensures the most efficient operation, and background maintenance features on these systems allow the operator to perform maintenance on one or more modules while continuing routine operation on the other modules. The RS-Precision experiment allowed us to check these, among other complex functions, in a systematic manner under numerous simulated routine-like conditions. A typical provocation on such systems is the deactivation and reactivation of a module during routine operation. The goal is to check that the samples with requests for tests on the deactivated module(s) are handled correctly, and that the reactivated module performs as expected after return to operation. During the MODULAR ANALYTICS SWA (serum work area) evaluation, these provocations revealed sporadic errors after module reactivation, such as a wrong reagent inventory and a bottle change-over to standby reagents although the current reagents were not empty.
RS-Precision is also an effective tool for testing the interaction of reagents on a selective access analyser. During the recent multicentre evaluation of cobas 6000, a new reagent carryover caused by the reagent probe, not observed on other Roche analysers, was found for the creatinine enzymatic assay when the creatine kinase (CK) reagent was pipetted just before the creatinine assay. Creatine phosphate in the CK reagent (vial 2) may be partly hydrolysed to creatine, which can affect the creatine concentration in the enzymatic creatinine assay when carried over by the reagent probe. As shown in Table 3, the CV changes from 0.9% in the batch part to 2.1% in the random part. Looking to the single
Wolfgang Stockmann et al.