Delivery of High-Quality Biomarker Assays

Biomarker measurements now support key decisions throughout the drug development process, from lead optimization to regulatory approvals. They are essential for documenting exposure-response relationships, specificity and potency toward the molecular target, untoward effects, and therapeutic applications. In a broader sense, biomarkers constitute the basis of clinical pathology and laboratory medicine. The utility of biomarkers is limited by their specificity and sensitivity toward the drug or disease process and by their overall variability. Understanding and controlling sources of variability is not only imperative for delivering high-quality assay results, but ultimately for controlling the size and expense of research studies. Variability in biomarker measurements is affected by biological and environmental factors (e.g., gender, age, posture, diet and biorhythms), sample collection factors (e.g., preservatives, transport and storage conditions, and collection technique), and analytical factors (e.g., purity of reference material, pipetting precision, and antibody specificity). The quality standards for biomarker assays used in support of nonclinical safety studies fall under GLP (FDA) regulations, whereas those for assays used to support human diagnostics and healthcare are established by CLIA (CMS) regulations and accrediting organizations such as the College of American Pathologists. While most research applications of biomarkers are not regulated, biomarker laboratories in all settings are adopting similar laboratory practices in order to deliver high-quality data. Because of the escalation in demand for biomarker measurements, the highly-parallel (multiplexed) assay platforms that have fueled the rise of genomics will likely evolve into the analytical engines that drive the biomarker laboratories of tomorrow.


Introduction
Heightened focus on the use of bioimaging and ex vivo laboratory measurements to guide the development of new therapies is transforming processes and expectations within the pharmaceutical research and regulatory communities [2]. Drug developers are finding new ways of supporting clear decision-making, limiting downstream risk of project failure, and documenting therapeutic claims. Earlier assessment of the therapeutic benefits and liabilities of new chemical entities promises substantial improvements in the overall cost and time for drug development. Regulators are enthused by the prospect of a richer array of information supporting the safety and efficacy of new drug entities. And there is renewed enthusiasm and expectation for the use of surrogate clinical endpoints, which can assess the effects of therapeutic intervention more quickly than conventional observations of morbidity and mortality.
The impetus for these sweeping changes is twofold.
First, although biological markers (biomarkers) and associated bioanalytical tools have always been the bedrock of pharmacology and toxicology, older paradigms for drug development cannot be sustained due to ballooning costs, especially during the clinical phases. The biomarkers of tomorrow must become more timely, reliable and definitive to propel decision-making. Drug candidates that ultimately fail must do so far earlier in the process. Secondly, revolutionary changes in both our understanding of human genomics and bioanalytical technologies have enabled the identification and quantitation of a wide array of biomarkers. This knowledge and technology will allow comprehensive profiling of individual research subjects and inevitably drive medical practice toward personalized therapy that is less empirical and increasingly driven by laboratory data (Fig. 1).

[Fig. 1. Personalized therapy in practice: rational explanation of variability in drug response and variability in PK/ADME.]
The challenges that drug researchers and bioanalytical chemists face today are practical ones having to do with the selection of appropriate biomarkers for critical decisions, the delivery of high-quality biomarker methods, and the evaluation of the data. All of these processes are contingent upon clearly understanding and controlling the factors that influence the reliability and variability of the data [12]. Indeed, the size of study groups (e.g., in toxicology studies and clinical trials) and statistical plans are a direct function of variability in the measurements. Analytical sources of variation can be reduced by the use of rugged, validated assay methodologies. Biological sources of variation can be controlled in part by experimental design (e.g., by specifying inclusion and exclusion criteria, restricting food consumption, and defining the time of day for specimen collections). This paper summarizes the various bioanalytical and biological factors that influence the quality and variability of ex vivo biomarker measurements. It begins with a consideration of the key guidelines and regulations that have molded approaches to assay validation and then presents practical recommendations that will assure consistent delivery of high-quality biomarker data.

Regulatory considerations
Bioanalytical laboratories that support pharmaceutical research and development are often categorized according to their level of compliance with the federal regulations which govern animal toxicology studies (Good Laboratory Practices; GLP) and human diagnostic testing (Clinical Laboratory Improvement Amendments; CLIA). GLP regulations are enforced by the US Food and Drug Administration (FDA) and were developed to assure the quality of data generated for toxicology and safety pharmacology studies in animals [3]. They do not apply to most of the exploratory work done in animal pharmacology laboratories. CLIA regulations are administered by the Centers for Medicare & Medicaid Services (CMS), formerly known as the Health Care Financing Administration [4]. The purpose of CLIA regulations is to ensure the quality of the assay work performed in "any facility which performs laboratory testing on specimens derived from humans for the purpose of providing information for the diagnosis, prevention, treatment of disease, or impairment of, or assessment of health". Although CMS administers CLIA regulations, the FDA has an important role in regulating the testing in clinical laboratories by reviewing and approving the commercial reagents and kit methodologies used for diagnostic purposes. The CLIA rules do not apply to "research laboratories that test human specimens but do not report patient specific results for the diagnosis, prevention or treatment of any disease or impairment of, or the assessment of the health of individual patients". FDA regulations for Good Clinical Practices (GCP), which govern the testing of new drugs and medical devices in humans, do not specifically apply to clinical laboratories, but the data generated by clinical laboratories on specimens derived from clinical drug trials must be presented in final study reports and adverse event documents submitted to the FDA.
The recent escalation in biomarker measurements throughout drug development has brought about a convergence of laboratories from GLP-regulated, CLIA-regulated, and basic research settings. In this new environment, bioanalysts are prone to confuse the applicability of GLP and CLIA regulations (and their regulatory counterparts outside the USA). Table 1 summarizes some key similarities and differences. Both GLP and CLIA regulations establish quality standards for facilities, personnel and procedures. However, GLP regulations do not apply to studies involving human subjects, so, by definition, human specimens cannot be processed "under GLP". At the same time, it is recognized that "GLP-like" practices will add an element of quality to work performed on clinical specimens. Similarly, CLIA regulations do not apply to animal safety studies and may not even apply to clinical biomarkers when the tests are purely of a research nature (i.e., either of unproven value in guiding patient care decisions or not reported to participating human subjects and their attending physicians). Even so, running these research assays using "CLIA-like" practices will add to the quality and credibility of the data. This raises two questions. First, to what extent should biomarkers that are not being used for nonclinical safety studies or patient care be measured within strict GLP-like and/or CLIA-like environments? The short answer is that the quality standards embodied by GLP-like and/or CLIA-like practices are important for all biomarkers used to support critical decisions. Implementation of best laboratory practices ensures that biomarker laboratories will deliver an array of high-quality assay results, irrespective of the regulatory environments under which the specimens are collected. This basically means operating under standardized written guidelines for assay validation, sample processing, quality control, and data management.
Secondly, how onerous is it for biomarker laboratories to establish and maintain GLP-like and/or CLIA-like conditions? While research staff will often eschew more rigid processes due to the perceived investment of additional time and expense, the reality is that quality systems are relatively straightforward to maintain and will reimburse the initial investment in Standard Operating Procedures (SOPs) and other documentation by avoiding errors in sample collection and processing.

Quality standards for analytical validation and laboratory practices
The original GLP regulations were not specific as to requirements for assay validation and quality control. Consequently, bioanalytical laboratories supporting preclinical studies developed a broad range of approaches to assay validation. In 1990, representatives of various bioanalytical laboratories associated with the pharmaceutical industry met to develop a set of uniform guidelines for drug assay validation [41]. These guidelines were widely adopted for GLP studies [8,18,19]. Because of their focus on chromatography methods and the subsequent rise of biotechnology products dependent on immunoassay technology, additional guidelines have been more recently developed to encompass immunoassays [11,42]. Regulatory guidelines for validation of drug assays have now been issued by the FDA and the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) [16,23,24]. While none of these guidelines specifically apply to biomarker assays, their principles of quality have clear application. One obvious shortcoming is that they do not consider the broad range of biological factors that impact biomarker assay validation.
Quality standards for clinical laboratories in the USA have their roots in the Clinical Laboratory Improvement Amendments (CLIA) and in accrediting organizations such as the College of American Pathologists (CAP) and the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) [17]. Accrediting organizations adopt guidelines and policies that go beyond the generalities of regulatory requirements, promote specific best laboratory practices across all processes, and require well-defined quality assurance programs. Thus, accreditation by CAP or JCAHO assures full compliance with CLIA regulations plus achievement of an overall high standard of laboratory practice. These regulations and guidelines require all assay methods to be periodically validated for accuracy and reproducibility, to be checked daily for quality control performance, to have biological reference ranges defined, and, for regulated analytes, to be monitored via external proficiency testing. The National Committee for Clinical Laboratory Standards (NCCLS) promotes best practices by developing standard operating procedures in collaboration with teams of experts and issuing SOPs for a wide range of clinical laboratory procedures. Among the NCCLS publications are guidelines for assessing assay performance and analytical validity. Similar bioanalytical performance standards have been developed in Europe and internationally [7,14,22,26,30,40,44]. In general, these various guidelines are applicable across all assay methodologies; however, some technologies, such as flow cytometry and quantitative NMR, pose special challenges for validation, and these have been addressed in additional publications [35,36]. Probably no single individual has done more to advance laboratory quality control systems and techniques than James O. Westgard [49]. A wealth of practical information on assay validation experiments and approaches to quality control can be found at his website Westgard.com.

Sources of variability in biomarker measurements
By their nature, biomarkers must be dynamic and responsive to the disease process or pharmacological intervention of interest. Imposed upon these desirable signals is the background noise of biological and methodological sources of variation [12,37]. These background fluctuations affect both the sensitivity and the specificity of biomarker measurements and ultimately limit their utility. The size of treatment groups needed for drug studies is directly related to the variability in the measurements used for endpoints and decisions. Practical implementation of biomarkers in drug development therefore requires both an understanding and control of the various sources of variability in assay performance [27].
Preanalytical sources of variability in biomarker measurements are noted in Table 2 [1,6,10,13,25,29,31-33,46,47]. Narayanan has recently reviewed many of these variables [34]. The impact of these factors is often not well understood at the time new biomarkers are implemented in drug studies. This is especially true for those biomarkers specifically linked to novel molecular targets. To deliver high-quality biomarker results, drug researchers must either use study inclusion and exclusion criteria that control these factors or else undertake pilot studies that seek information on these factors. For example, as part of the biomarker assay validation process, reference ranges can be established using specimens from healthy individuals as well as from subjects with the targeted disease. Also, small numbers of subjects can be evaluated as inpatients for a limited period of time to define variability due to diurnal rhythms, posture, exercise, and meals.
Timed urine collections are particularly problematic in that subjects can fail to void completely either prior to, or at the end of, the collection interval of interest or simply discard urine that is voided in the middle of the collection interval. To minimize the impact of these collection errors, normalization of quantitative urinary biomarkers using urinary creatinine is highly recommended. The rate of creatinine production is relatively constant for an individual, although it can vary somewhat immediately following ingestion of well-cooked meat [45].
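Creatinine normalization as described above reduces to a unit conversion and a ratio. A minimal sketch (the function name and unit choices are illustrative, not from the original):

```python
def normalize_to_creatinine(analyte_ng_per_ml: float,
                            creatinine_mg_per_dl: float) -> float:
    """Express a urinary biomarker per mg of creatinine.

    analyte_ng_per_ml    : biomarker concentration in the urine sample (ng/mL)
    creatinine_mg_per_dl : urinary creatinine in the same sample (mg/dL)

    Creatinine is converted from mg/dL to mg/mL (1 dL = 100 mL) before
    dividing, so the result is in ng of analyte per mg of creatinine.
    """
    creatinine_mg_per_ml = creatinine_mg_per_dl / 100.0
    return analyte_ng_per_ml / creatinine_mg_per_ml

# Example: 45 ng/mL analyte in a spot urine with 90 mg/dL creatinine
ratio = normalize_to_creatinine(45.0, 90.0)  # 45 / 0.9 = 50 ng/mg creatinine
```

Because the ratio cancels out the urine volume, an incomplete or mistimed collection affects numerator and denominator alike, which is precisely why this normalization blunts collection errors.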
Analytical sources of variability in biomarker measurements are summarized in Table 3. Standardization of biomarker assays in a research environment is particularly challenging, given that standard reference materials are often not readily available. Purification of endogenous biochemicals and enzymes may not be feasible without loss of essential biological or chemical properties. Also, biomarker assays can be a function of a complex series of in vivo and ex vivo procedures and measurements. For example, cell incubation steps (such as lipopolysaccharide stimulation) and amplification steps (such as PCR) can be integral parts of the biomarker assay method. To correct for variability introduced by pre-assay manipulations of cells or biological matrix, it may be necessary to include internal chemical controls or normal control samples that are used to normalize the final biomarker results.
Commercial reference materials for large peptides and proteins can be particularly problematic because the assigned potencies can be based on biological test systems (rather than chemical properties), whereas the corresponding biomarker assays can report results in terms of absolute concentration (e.g., ng/mL of plasma). This is frequently true for cytokines. The bioanalyst might assume that a weighed amount of reference material has the same biological potency (and purity) across lots, but this cannot be assured. While absolute accuracy is not necessarily essential in an unregulated research setting, every effort should be made to employ reference standards of documented and reproducible purity so that assay results from different experiments and from different analytical laboratories can be reasonably compared. This highlights the great importance of utilizing long-term quality control samples that are independent of the calibrators and that verify consistent assay performance across transitions in standards and reagents. When using commercial quality control material, each laboratory should establish a mean value based on actual assay performance in the local laboratory, rather than accepting the manufacturer's stated value. Indeed, most suppliers of quality control material no longer state an absolute value, but rather, a target range.
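Establishing a local target mean for QC material from the laboratory's own runs, rather than adopting the manufacturer's stated value, can be sketched as below; the data and the ±2 SD acceptance range are illustrative assumptions:

```python
import statistics

def local_qc_limits(run_results, k=2.0):
    """Derive a local target mean and acceptance range for a QC material
    from the laboratory's own repeated assay runs.

    Returns (mean, (lower_limit, upper_limit)), where the limits span
    mean +/- k standard deviations (k = 2 here, an illustrative choice).
    """
    mean = statistics.mean(run_results)
    sd = statistics.stdev(run_results)
    return mean, (mean - k * sd, mean + k * sd)

# One QC level assayed over 20 routine runs (hypothetical values, pg/mL):
runs = [248, 252, 250, 247, 255, 251, 249, 253, 246, 250,
        252, 248, 254, 249, 251, 247, 253, 250, 252, 249]
target, (low, high) = local_qc_limits(runs)
```

Once enough runs accumulate, the locally derived mean and range replace the supplier's target range for day-to-day run acceptance.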
One of the most intuitively obvious, yet frequently overlooked, sources of analytical error is the interference of the test drug with the biomarker reagents and detection systems. It should be presumed that all new drug entities can potentially interfere with the performance of biomarker assays, for example, by altering absorbance and fluorescence endpoints, co-eluting in chromatographic systems, or cross-reacting with reagent antibodies [50]. Therefore, all bioanalytical laboratories supporting a drug research project should verify that the drug does not interfere with the applicable biomarker and diagnostic tests. This is most directly done by assessing the performance of the biomarker assay before and after spiking the drug (and, if available, known drug metabolites) into the biological matrix of interest. Various endogenous substances can also potentially interfere with biomarker assays [38].
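The before-and-after spiking comparison can be expressed as a simple percent shift; the 10% flagging limit below is an illustrative assumption, not a regulatory threshold:

```python
def percent_interference(baseline_result: float, spiked_result: float) -> float:
    """Percent change in the biomarker result after spiking the test drug
    into the same matrix sample; values near 0 indicate no interference."""
    return 100.0 * (spiked_result - baseline_result) / baseline_result

def flag_interference(baseline: float, spiked: float,
                      limit_pct: float = 10.0) -> bool:
    """True when the drug-spiked result deviates from baseline by more
    than limit_pct (an example acceptance limit)."""
    return abs(percent_interference(baseline, spiked)) > limit_pct

# Same plasma pool assayed before and after spiking in the drug:
shift = percent_interference(100.0, 104.0)  # +4.0 % change
ok = not flag_interference(100.0, 104.0)    # within the 10 % example limit
```

The same check is simply repeated for each known drug metabolite spiked into the matrix of interest.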

Minimum requirements for biomarker assay validation
Given that most biomarkers in a research setting are not subject to GLP and CLIA regulations, what extent of validation is needed for measurements that drive drug development decisions? While no consensus paper has been issued, it is generally recognized that sound assay validation should include documentation of the parameters listed in Table 4. These concepts can be illustrated by validation experiments that were undertaken for a kit enzyme immunoassay for IL-2 soluble receptor alpha (Quantikine ELISA, R&D Systems, Minneapolis, MN). The validation data provided in commercial kit documentation should not be taken at face value because assay performance can be dependent upon personnel, equipment, working environment, reagent water and other local factors (Table 3). Also, commercial kits that are marketed for research purposes only are not subjected to the rigorous FDA review that is mandatory for diagnostic reagents. Good Manufacturing Practices are not necessarily utilized by commercial vendors, resulting in marked variability in lot-to-lot kit performance.
Fundamental to all quantitative assays is a documented concentration-response relationship, preferably using a well-characterized reference material. This is usually shown as a fitted curve based on responses observed for a biological matrix or buffer spiked with a pure reference standard (Fig. 2). During the assay process, responses observed for biological samples are referred to the standard curve to derive the concentrations. The goodness of fit for the curve-fitting function, and hence the ability of the calibration curve to report accurate results throughout its concentration range, is checked by "reading back" the responses of standards against the curve to arrive at theoretical measured concentrations for the standards. In general, the read-back values should be within a few percentage points of the true value throughout the functional range of the assay (Table 5).
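The read-back check can be illustrated with a four-parameter logistic (4PL) model, a fit commonly used for immunoassay calibration curves. The parameter values below are hypothetical; with measured, noisy standard responses the read-back would deviate slightly from nominal, which is exactly what the check quantifies:

```python
def four_pl(x, a, b, c, d):
    """Four-parameter logistic response: a = response at zero concentration,
    d = response at infinite concentration, c = inflection point, b = slope."""
    return d + (a - d) / (1.0 + (x / c) ** b)

def inverse_four_pl(y, a, b, c, d):
    """Back-calculate a concentration from a response on the fitted curve."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)

# Hypothetical fitted parameters for an ELISA calibration curve:
params = dict(a=0.05, b=1.2, c=500.0, d=2.0)

# Read-back: push each calibrator's response through the inverse curve and
# compare with its nominal concentration (ideally close to 100 %).
for nominal in (31.25, 62.5, 125.0, 250.0, 500.0, 1000.0):
    response = four_pl(nominal, **params)
    read_back = inverse_four_pl(response, **params)
    pct_of_nominal = 100.0 * read_back / nominal
```

In routine use, the responses fed to `inverse_four_pl` come from the measured standards, and the percent-of-nominal values are tabulated as in Table 5.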
If a calibration curve is used in place of a standard curve in the biological matrix, then the ability to accurately recover the analyte from the matrix must be established. This is done by spiking known amounts of pure reference material into the matrix (covering a wide concentration range), assaying the spiked samples, and then calculating the recovery, taking into account the original amount of endogenous analyte present in the matrix (Fig. 3). Since immunoassays in particular have a discrete, limited functional range, one should verify that samples with concentrations higher than the highest functional standard can be diluted into the functional range and assayed with good accuracy (after correction for the dilution factor). The analysts should know the upper limit of the functional range; all samples with initial values above this limit must be diluted and re-assayed.

Standards that are based on the biomatrix and are taken through all assay steps automatically correct for analyte losses (e.g., during extraction or evaporation steps). In contrast, if calibrators in buffers are used, then a factor accounting for percent loss may be needed to correct the results for biological samples that pass through all preparatory steps. Failure to correct the results for these losses is not uncommon in laboratories measuring biomarkers; consequently, marked differences in absolute concentrations can occur between laboratories. Variation in the purity or specific activity of the reference standard can also account for interlaboratory differences. To assure that assay results can be reasonably compared between laboratories and/or between analytical runs, each laboratory should indicate whether or not assay results are corrected for recovery and should declare the purity or specific activity of the reference standard in use. As an overall check on accuracy, it is recommended that samples be split and assayed simultaneously at another laboratory already performing the assay.

Table 4. Minimum requirements for biomarker assay validation:
- Reference standard of known purity or specific activity
- Defined concentration-response relationship (standard curve or calibration curve)
- Accurate curve-fitting function (accurate "read-back" of standards)
- Accurate recovery from biological matrix
- Accurate recovery for samples requiring pre-assay dilution
- Highest concentration not requiring dilution
- Functional limits of the calibration or standard curve
- Within-assay precision (CV) for at least 3 samples, 6 measurements each
- Between-assay precision (CV) for at least 3 samples, 6 runs each
- Stability during the maximal time needed for each step (sample collection, transport, storage and intermediate steps)
- Proof of specificity
- Lack of chemical/immunochemical interference by test drugs
- Comparison with previous method (when possible)
- Interlaboratory comparison with blinded samples (when possible)
- Reference range for pertinent population
- Quality controls independent of calibrators
- Explicit rules for acceptance/rejection of assay runs

Table 5. Read-back values in assay buffer for IL-2 soluble receptor alpha calibrators versus nominal (true) concentrations using an ELISA kit from R&D Systems. After a typical calibration curve was generated using a four-parameter logistic fit, read-back concentrations for the calibrators were derived from the fitted curve; the mean results for 6 different assay runs are shown.
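The spike-recovery calculation described above, which subtracts the endogenous contribution before comparing against the known spike, can be sketched as (the numerical values are illustrative):

```python
def percent_recovery(measured_total: float, endogenous: float,
                     spiked: float) -> float:
    """Spike-recovery: percentage of the added analyte actually measured.

    measured_total : concentration measured in the spiked sample
    endogenous     : concentration measured in the unspiked matrix
    spiked         : known amount of reference standard added
    """
    return 100.0 * (measured_total - endogenous) / spiked

# Matrix contained 120 pg/mL endogenous analyte; 400 pg/mL was spiked in,
# and the spiked sample assayed at 496 pg/mL:
rec = percent_recovery(496.0, 120.0, 400.0)  # (496 - 120) / 400 = 94.0 %
```

The same arithmetic, with the measured value first multiplied by the dilution factor, serves to verify accurate recovery for samples requiring pre-assay dilution.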
Overall precision should be checked both within and across multiple runs (Table 6). To assure that an assay continues to perform consistently across multiple runs, quality control samples based on the biological matrix should be prepared, preferably at three different concentrations, and assayed with each run. Clear rules for accepting or rejecting analytical runs based on QC outcomes must be stated (e.g., 2 of 3 within 15% of nominal value, or 2 of 3 within 2 SD of assigned mean with SD based on interassay CV).
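A run-acceptance rule of the "2 of 3 within 2 SD" kind mentioned above can be implemented as follows; the QC values, assigned means and SDs are illustrative examples, not universal requirements:

```python
def qc_within_2sd(results, means, sds):
    """Count QC samples whose result falls within 2 SD of its assigned mean."""
    return sum(abs(r - m) <= 2.0 * s for r, m, s in zip(results, means, sds))

def accept_run(results, means, sds, min_passing=2):
    """Accept the analytical run when at least min_passing of the QC samples
    are within 2 SD of their assigned means (the '2 of 3' rule)."""
    return qc_within_2sd(results, means, sds) >= min_passing

# Three QC levels; the mid-level QC drifted just outside its 2 SD window:
results = [102.0, 265.0, 910.0]
means   = [100.0, 250.0, 900.0]
sds     = [5.0, 7.0, 40.0]
ok = accept_run(results, means, sds)  # 2 of 3 pass, so the run is accepted
```

The alternative "within 15% of nominal" criterion from the text substitutes a percent-of-nominal test for the SD window but otherwise follows the same counting logic.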
Potential interference from endogenous compounds and the test drugs should be considered and assessed as needed. The specificity of reagent antibodies should always be established relative to compounds of similar structure. For example, R&D Systems established that the antibodies used in its kit for assay of IL-2 soluble receptor alpha did not cross-react with 71 other cytokines and chemokines.
Stability of the analyte should be verified for every step of the assay process where timing is not absolutely controlled. This can apply to specimen collection, transport, storage, and intermediate assay steps. Ideally, stability experiments should cover the longest time envisioned for these steps, but this is not always practical (Fig. 4).

Table 6. Assay precision for IL-2 soluble receptor alpha in plasma using an ELISA kit from R&D Systems. Human plasma was supplemented with recombinant IL-2 soluble receptor alpha and was assayed repetitively within the same assay run (N = 6 measurements) and between multiple runs (N = 18 runs). Columns: sample; within-assay mean (pg/mL); within-assay CV; between-assay mean (pg/mL); between-assay CV.
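Within- and between-assay CVs of the kind reported in Table 6 are simply the sample standard deviation expressed as a percentage of the mean; a minimal sketch with hypothetical replicate data:

```python
import statistics

def cv_percent(values):
    """Coefficient of variation: sample SD as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Within-assay: six replicate measurements of one plasma sample in one run.
within_run = [480.0, 492.0, 505.0, 498.0, 488.0, 497.0]
within_cv = cv_percent(within_run)

# Between-assay: the result for the same QC sample from each of several runs.
run_results = [493.0, 510.0, 482.0, 505.0, 478.0, 499.0]
between_cv = cv_percent(run_results)
```

Between-assay CVs are normally somewhat larger than within-assay CVs, since they fold in day-to-day sources of variation such as reagent lots and calibration.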

Conclusion
Completion of the human genome sequence has initiated a rapid proliferation in the number of molecular targets for novel drug therapies and has fueled the search for downstream markers that reflect disease processes and therapeutic interventions [9,15]. Strategic use of biomarkers can enable the evaluation of greater numbers of drug candidates, while potentially decreasing the overall time and expense by eliminating inferior compounds earlier in the selection process. As the relationships between the genotypes, various markers and therapeutic outcomes are discerned, an array of clinically-validated diagnostic, prognostic and theranostic assays will emerge [20,21]. Thus, it is likely that comprehensive profiling of both the drugs in development and the patients undergoing treatment will be essential for successful outcomes [28,39].
Juxtaposed in the midst of these revolutionary changes in drug development and laboratory medicine are bioanalytical laboratories that must deliver high-quality data supportive of accurate and timely decision-making. In order to launch rugged and reproducible biomarker methods, the assay validation processes should account for all significant sources of variation. While analytical factors related to assay methodology and performance can be readily understood and controlled in the laboratory, the biological and pathophysiological factors may require exploratory studies involving collection of specimens from donors with the target disease under controlled conditions (e.g., control of diet, posture, time of collection). For biomarkers used as therapeutic endpoints, this kind of information is especially important for planning clinical drug trials (e.g., definition of the inclusion and exclusion criteria, specimen collection schedules, and numbers of subjects required for statistical assessment of efficacy).
Collectively, these impending changes in research and patient care paradigms will compel research, toxicology and clinical laboratories to deliver high-quality bioanalytical data in amounts far beyond what is currently possible. The autoanalyzers of today are still variants on serial processing, albeit progressing in speed and diversity of test menus. Batch, single-assay processing (e.g., EIA) can be faster but lacks the random-access capabilities needed for rapid customization of test menus and cannot readily support the kind of multi-analyte profiles that will be required. To cope with these demands, bioanalysts must now turn to the newer technologies that offer highly-parallel ("multiplexing") capability, several of which have fueled genomics research [5,43,48]. Transformation of microchip platforms into analytical systems of superior accuracy and precision will be challenging but likely. Finally, special bioinformatics tools will be needed to translate biomarker profiles into practical indices that can be readily interpreted and used to accelerate decision-making.
of the Bristol-Myers Squibb Company. Several of the concepts underlying biomarker assay validation were developed in collaboration with Dr. Vesterqvist.