Assessing Candidate Gene nsSNPs for Phenotypic Differences in Double-Strand Break Repair Using Radiation-Induced γH2A.X Foci

Nonsynonymous SNPs (nsSNPs) in DNA repair genes may be important determinants of DNA damage and cancer risk. We applied a set of screening criteria to a large number of nsSNPs and selected a subset of SNPs that were likely candidates for phenotypic effects on DNA double-strand break repair (DSBR). To induce and follow DSBR, we exposed panels of cell lines to gamma irradiation and followed the formation and disappearance of γH2A.X foci over time. All panels of cell lines showed significant increases in number, intensity, and area of foci at both the 1-hour and 3-hour time points. Twenty-four hours following exposure, the number of foci had returned to preexposure levels in all cell lines, whereas the size and intensity of foci remained significantly elevated. We saw no significant difference in γH2A.X foci between controls and any of the panels of cell lines representing the different nsSNPs.


INTRODUCTION
Defects in double-strand break repair (DSBR) can lead to genome instability and eventually cancer [1]. Several disease syndromes have rare gene mutations that disrupt DSBR and result in phenotypes with increased cancer risk including Ligase IV (LIG4) syndrome, severe combined immunodeficiency with sensitivity to ionizing radiation (RS-SCID), Ataxia-telangiectasia (A-T), Nijmegen breakage syndrome (NBS), and Fanconi anemia complementation group D1 (FANCD1) [2]. Additionally, a growing literature shows that common polymorphisms in DSBR genes can be associated with increased, and in some cases decreased, risk of cancer [3][4][5][6][7][8].
Recent resequencing efforts by the Environmental Genome Project [9] and others have greatly expanded the list of known single nucleotide polymorphisms (SNPs) within DNA repair genes. However, only a few of these SNPs have been examined in association studies, and even fewer have been functionally characterized in vitro. A number of in silico tools including Sorting Intolerant From Tolerant (SIFT) (http://sift.jcvi.org/), Polymorphism Phenotyping (PolyPhen) (http://genetics.bwh.harvard.edu/pph/data/index.html), and SNPs3D (http://www.snps3d.org/) can be used to predict a SNP's effect on protein function, structure, or gene regulation, although in vitro functional studies are ultimately required to confirm those predictions.
DSBs can be evaluated in vitro by the localization of a phosphorylated form of the histone variant H2A.X. In response to a DSB, histone H2A.X molecules become rapidly phosphorylated on serine 139 by a member of the phosphoinositide (PI) 3-kinase family which includes ATM, DNA-PK, and ATR [10][11][12][13][14]. These phosphorylated H2A.X molecules, termed γH2A.X, can span up to 2 Mb of chromatin surrounding a DSB and can be visualized microscopically as distinct foci after fluorescent antibody labeling [11,12]. Not only can DSBs, and in turn H2A.X phosphorylation, be induced by exogenous agents such as ionizing radiation but they can also occur endogenously during DNA replication, recombination in mitosis and meiosis, apoptosis, senescence, telomere shortening, and V(D)J recombination [11][12][13][15]. Formation of γH2A.X foci is cell cycle-dependent and is greater in S- and G2/M-phase than in G1, reflecting both DSBs created during replication and the increased quantity of DNA available in which DSBs may occur [16].
Previous reports have demonstrated a direct association between the number of DSBs and the number of γH2A.X foci [11,17]. Several studies have also shown that the kinetics of DSBR correlates with γH2A.X induction and clearance by phosphatases at low-dose radiation [17][18][19][20]. Using ionizing radiation doses similar to those used in our study, several groups have used γH2A.X foci formation and disappearance over time to successfully detect DSBR defects in cells with known mutations in DSBR genes, such as LIG4, XRS-6, DNA-PK, and ATM [17][18][19][21].
We applied a set of screening criteria and in silico prediction tools to identify a set of 5 nsSNPs that appeared likely to affect DSBR. Using gamma irradiation to induce DSBs, we evaluated panels of HapMap and Environmental Genome Project (EGP) normal human lymphoblastoid cell lines representing each of the SNPs and followed the kinetics of induction of γH2A.X foci and their persistence over time.

Selection of nsSNPs
Using dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP) (build 126), we compiled a list of 1455 nsSNPs within 149 genes known to be involved in DNA repair [22]. Of these, 227 nsSNPs had minor alleles that were found as homozygotes or heterozygotes in 5 or more HapMap or EGP cell lines and thus met our minimum frequency requirement. Using PolyPhen [23], 49 of 227 nsSNPs were predicted to be "possibly damaging" or "probably damaging" to protein function based on phylogenetic, sequence, and structural changes induced by the SNP. Of these 49 nsSNPs, 22 were in genes known to be involved in DSBR and, after further evaluation using SIFT classification [24], we selected 5 nsSNPs for detailed functional analysis (Tables 1 and 2).
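The successive screening steps described above can be sketched as a simple filter chain. This is a minimal illustration, not the procedure actually used: the record type `NsSnp`, its field names, and the function `screen` are hypothetical, and the final SIFT-based selection of 5 nsSNPs is left out because its decision rule is not specified in the text.

```python
from dataclasses import dataclass

# PolyPhen categories treated as potentially functional in the text.
DAMAGING = {"possibly damaging", "probably damaging"}


@dataclass
class NsSnp:
    snp_id: str          # hypothetical identifier
    gene: str
    carrier_lines: int   # HapMap/EGP cell lines carrying the minor allele
    polyphen: str        # PolyPhen prediction category
    in_dsbr_gene: bool   # True if the gene is involved in DSBR


def screen(snps, min_carriers=5):
    """Apply the three automated filters in order and return each stage."""
    # Step 1: minor allele present in at least `min_carriers` cell lines.
    frequent = [s for s in snps if s.carrier_lines >= min_carriers]
    # Step 2: predicted "possibly" or "probably damaging" by PolyPhen.
    damaging = [s for s in frequent if s.polyphen in DAMAGING]
    # Step 3: located in a gene known to be involved in DSBR.
    dsbr = [s for s in damaging if s.in_dsbr_gene]
    return frequent, damaging, dsbr
```

Applied to the full list of 1455 nsSNPs, the three stages would correspond to the 227, 49, and 22 nsSNPs reported above.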

Selection of cell lines
For each nsSNP, at least 5 Epstein-Barr-Virus-transformed lymphoblastoid cell lines were purchased from the HapMap or Polymorphism Discovery Resource collections (for EGP) at Coriell Institute for Medical Research (Camden, NJ, USA). Depending on the allele frequency of the particular SNP, the cell lines were either homozygous for the allele of interest (WRN, LIG4) or heterozygous if the allele was rare (PNKP, BRCA1, ATM). In a few cases, cell lines that were homozygous for either the WRN or LIG4 variant allele were also heterozygous for one of the other nsSNPs (PNKP, BRCA1, or ATM). In order to establish a panel of control cell lines we identified all HapMap or EGP cell lines that were homozygous for the common alleles at all 5 of the candidate polymorphic sites. From these, we selected 6 control cell lines (2 from each of the 3 ethnic groups in HapMap) after applying an additional criterion to minimize sequence variation at other polymorphic sites in the 5 genes of interest.

Cell culture and irradiation conditions
Lymphoblastoid cell lines were grown at 37 °C in 5% CO₂ and maintained by tri-weekly subculture in RPMI-1640 medium with L-glutamine (GIBCO, Carlsbad, Calif, USA), which was supplemented with 15% fetal bovine serum (Gemini Bio-Products, West Sacramento, Calif, USA), and 1% antibiotic/antimycotic (GIBCO, Carlsbad, Calif, USA). The day before irradiation, cells were seeded at 3.5 × 10⁵ cells/mL in T25 flasks (BD Falcon, Franklin Lakes, NJ, USA). Cells were exposed to 1.5 Gy on ice in 1 mL complete media while under constant rotation using a J. L. Shepherd Model 431 ¹³⁷Cs irradiator at a dose rate of 0.77 Gy/min.
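As a worked check of the irradiation conditions, the exposure time implied by the stated dose and dose rate can be computed directly. The helper name `exposure_seconds` is hypothetical, and the calculation assumes a constant dose rate over the exposure.

```python
def exposure_seconds(dose_gy, dose_rate_gy_per_min):
    """Time in seconds needed to deliver `dose_gy` at a constant dose rate."""
    return dose_gy / dose_rate_gy_per_min * 60.0


# 1.5 Gy at 0.77 Gy/min corresponds to roughly 117 s (just under 2 minutes).
seconds = exposure_seconds(1.5, 0.77)
```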

γH2A.X immunofluorescence
Unexposed control and irradiated cells were resuspended in warm, complete media, briefly placed in a 37 °C water bath, and transferred to a 37 °C incubator to complete the remainder of a 1-hour, 3-hour, or 24-hour incubation. Preliminary dose and time course experiments using a control cell line (GM12154) were carried out by allowing cells to adhere to poly-D-lysine (BD, Franklin Lakes, NJ, USA) coated cover slips. All subsequent experiments, however, utilized poly-D-lysine-coated chamber slides instead. Cells were allowed to adhere for 10 minutes at 4 °C and then fixed in freshly prepared 4% paraformaldehyde (Electron Microscopy Sciences, Hatfield, Pa, USA) for 15 minutes at room temperature. Cells were washed once in PBS, placed

Image acquisition and processing
Images of cells were acquired at room temperature using a Zeiss Axioplan 2 fluorescent microscope equipped with a JAI M1 HiRes charge-coupled device (CCD) camera, Metafer v3.2 software (MetaSystems, Altlussheim, Germany), and a 40X objective lens. The DAPI channel was used to identify DAPI stained cells, followed by detection of Alexa 488 stained γH2A.X foci using the FITC channel. Nine focus planes were captured at 0.75 μm intervals for each cell using the FITC channel. Automated image processing produced a composite image from both channels (Figure 1) by using settings recommended by the manufacturer. Imaging operations were applied uniformly across all slides and time points. For each captured nuclear image, we collected a total DAPI intensity value in order to quantify the amount of DNA present in a cell. To compensate for DAPI intensity fading over the course of image capture, we adjusted for fading following the method of Böcker et al. [25].

Quantification and measurements of γH2A.X foci
Approximately 200 cells were scored for each time point/treatment combination (condition) within an experiment; each condition was replicated within an experiment, and each experiment was replicated on a separate day. Foci were identified from captured images after applying a 20% intensity threshold to minimize background. Automated measurements of the number, intensity, and area (μm²) of γH2A.X foci were made using the Metafer v3.2 system (MetaSystems, Altlussheim, Germany). Using lower intensity thresholds or adding area restrictions for size of foci did not substantively alter the results (data not shown).
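The thresholding step described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the 20% threshold is taken relative to the brightest pixel in a nucleus and that a focus is a 4-connected group of above-threshold pixels; the algorithm implemented in the Metafer software is not described here and may differ. The function name `find_foci` and the `pixel_area_um2` parameter are hypothetical.

```python
from collections import deque


def find_foci(image, threshold_fraction=0.20, pixel_area_um2=1.0):
    """Return one dict (area, mean intensity) per focus in a 2D intensity grid.

    `image` is a list of equal-length rows of non-negative intensities for a
    single nucleus. Pixels at or below the cutoff are treated as background.
    """
    peak = max(max(row) for row in image)
    cutoff = threshold_fraction * peak  # assumed definition of the threshold
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    foci = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= cutoff:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append(image[y][x])
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > cutoff):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            foci.append({
                "area": len(pixels) * pixel_area_um2,
                "mean_intensity": sum(pixels) / len(pixels),
            })
    return foci
```

The number of foci for a cell is then `len(find_foci(image))`, and the per-focus area and intensity can be averaged to give the other two features.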

Cell cycle analysis
Cells were fixed with cold 70% ethanol and placed at −20 °C until flow analysis. Cells were washed once with PBS, incubated at room temperature for 30 minutes with 0.5% Triton X-100, resuspended in propidium iodide solution (5 μg/mL propidium iodide (Invitrogen, Carlsbad, Calif, USA) + 10 u/mL RNase (Promega, Madison, Wis, USA)), and incubated for 30 minutes in the dark before being processed using a Becton-Dickinson (BD) FACSort Flow Cytometer and analyzed using both CellQuest (BD, Franklin Lakes, NJ, USA) and Modfit software (Verity Software, Topsham, Me, USA).

Statistical analysis
The preliminary dose-response and time-response data on the number of γH2A.X foci were analyzed as follows: the mean of the two replicate slides was calculated for each of the two experiments, and the mean of these two means and its standard error were then determined. To assess the dose-response trend, we used linear regression methods.
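The summary statistics for the preliminary data can be sketched as follows. This is an illustrative reading of the description above, assuming that the standard error is the sample standard deviation of the per-experiment means divided by the square root of the number of experiments, and that the trend is fit by ordinary least squares; the function names `summarize` and `linear_fit` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(experiments):
    """Mean of per-experiment means and its standard error.

    `experiments` is a list of experiments, each a list of replicate-slide
    values (mean foci per cell on that slide).
    """
    experiment_means = [mean(replicates) for replicates in experiments]
    grand_mean = mean(experiment_means)
    standard_error = stdev(experiment_means) / sqrt(len(experiment_means))
    return grand_mean, standard_error


def linear_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

Here `x` would hold the doses in Gy and `y` the corresponding mean numbers of foci per cell.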
For our primary analyses of the effects of nsSNPs on DSBR, we used mixed-model regression techniques [26]. We measured three features of γH2A.X foci (namely, number, intensity, and size) at each of four time points (namely, preexposure, 1-, 3-, and 24-hour postexposure (denoted T0, T1, T3, and T24, resp.)). For each feature, we used these measurements to derive three response variables that compared time points. Each response variable was a ratio representing fold changes in response between time points: T1/T0 indicative of induction of damage, T3/T1 indicative of short-term repair/persistence, and T24/T1 indicative of longer-term repair/persistence. These response variables were assessed for each cell line in at least two replicated experiments on different days. Although capacity restricted us to run at most nine cell lines in any single experiment, generally each experiment included controls represented by two or more cell lines and four or five nsSNPs represented by a single cell line each. Our modeling approach involved fitting a separate multivariable regression for each response variable. To better meet the normal-distribution assumptions implicit in our statistical analysis, we used the base-2 logarithms of these ratio variables when fitting models. Mixed models involve both a model for the mean response and a model for the variation in response. The regression model for the mean response included an intercept and five predictors, namely, the number of copies of the variant (minor) allele for each of the five nsSNPs (control cell lines had zero copies of all variants). The regression coefficients in this model measure the change in mean response associated with an additional copy of the variant allele at the given locus. This regression approach allowed us to accommodate cell lines that carried variant alleles at more than one nsSNP as well as cell lines that were either hetero- or homozygous for the variant allele.
It also allowed us to estimate the geometric mean response that a cell line homozygous for any of the variants under investigation would have. Our model for the variation in response accounted for several distinct sources of variation and the correlations that they induce: among different cell lines with the same genotype, among experiments, and among replicates of a given cell line within an experiment.
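The construction of the response variables and the back-transformation to a geometric mean fold change can be sketched as follows. This is a minimal illustration of the fixed-effects (mean) part of the model only: the random effects for cell line, experiment, and replicate are not represented, the function names are hypothetical, and any coefficient values passed in are placeholders rather than estimates from this study. The ratios follow the definitions given in this section.

```python
from math import log2


def response_variables(feature):
    """Base-2 log fold changes for one feature (number, intensity, or size).

    `feature` maps the time-point labels "T0", "T1", "T3", "T24" to the
    measured value of that feature for one cell line in one replicate.
    """
    return {
        "induction": log2(feature["T1"] / feature["T0"]),    # T1/T0
        "short_term": log2(feature["T3"] / feature["T1"]),   # T3/T1
        "long_term": log2(feature["T24"] / feature["T1"]),   # T24/T1
    }


def predicted_fold_change(intercept, coefficients, allele_counts):
    """Geometric mean fold change implied by the mean model.

    `coefficients` maps each nsSNP to its per-allele effect on the log2
    scale; `allele_counts` maps each nsSNP to 0, 1, or 2 variant copies
    (loci that are absent are treated as 0, as for control cell lines).
    """
    log2_mean = intercept + sum(
        beta * allele_counts.get(snp, 0) for snp, beta in coefficients.items()
    )
    return 2.0 ** log2_mean
```

For a control cell line all allele counts are zero, so the predicted fold change is simply 2 raised to the intercept; for a homozygous variant line the corresponding coefficient enters twice.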

Dose response and time course for the formation of foci following gamma irradiation
In preliminary experiments using a control cell line (GM12154) with cells affixed to coated cover slips, we found a linear dose-response relationship (R² = 0.99) in the mean number of γH2A.X foci per cell 0.5 hour after exposures of up to 1.5 Gy of gamma irradiation, with the highest dose producing a tripling in the number of foci per cell compared to unexposed cells (Figure 2(a)). Subsequent time course experiments showed that the number of gamma-induced foci reached a maximum at 0.5 hour following exposure to 1.5 Gy, decreased by ∼50% from this maximum by 3 hours, and returned to preexposure levels by 24 hours (Figure 2(b)). Allowing cells to affix to coated chamber slides instead of coated cover slips resulted in modest but similar reductions in the mean number of foci in both exposed and unexposed cells and in reduced experimental variability, so we utilized chambered slides for all subsequent experiments. Based on preliminary flow cytometry data, we estimated that about 60% of cells were in G0/G1 and thus we used adjusted DAPI intensity as a measure of DNA content to exclude the 40% of cells in G2/M or S-phase for all analyses (data not shown).
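One plausible reading of the DAPI-based exclusion described above is to rank cells by adjusted DAPI intensity and retain the lowest 60% as presumptive G0/G1 cells. The sketch below illustrates that reading only; the exact gating rule is not specified in the text, and the function name `gate_g0_g1` and the `"dapi"` field are hypothetical.

```python
def gate_g0_g1(cells, keep_fraction=0.60):
    """Keep the cells with the lowest adjusted DAPI intensity.

    `cells` is a list of dicts, each with a "dapi" key holding the
    fading-adjusted total DAPI intensity for one nucleus. Cells with higher
    DNA content (presumed S or G2/M) fall in the excluded upper fraction.
    """
    ranked = sorted(cells, key=lambda cell: cell["dapi"])
    n_keep = round(len(ranked) * keep_fraction)
    return ranked[:n_keep]
```

Foci measurements would then be summarized over the retained cells only.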
We compared groups of cell lines representing each of the DNA repair gene polymorphisms to the group of cell lines used as a control (Figure 3). All groups of cell lines exhibited similar time-course response patterns for the mean number of γH2A.X foci per cell before and following exposure to 1.5 Gy of gamma irradiation (Figure 3(a)). The mean number of foci increased about fourfold from before exposure to 1 hour postexposure in all of the groups of cell lines. Average number of foci remained more than twofold elevated in all groups of cell lines 3 hours after exposure, but all groups returned to near-baseline levels by 24 hours.
Although all cell lines appear by inspection to have similar induction and disappearance of γH2A.X foci, we statistically tested whether any of the 5 nsSNPs differed from controls using a mixed-model regression analysis with the number of copies of the variant allele at each nsSNP as predictors. This approach can accommodate cell lines that may be heterozygous or homozygous at a locus or that may have variants at more than one locus. We considered three response variables constructed as ratios of the number of foci at different time points: induction of foci at 1 hour compared to preexposure (T1/T0), the repair of this damage over a 3-hour time course (T3/T1), and the persistence of damage at 24 hours compared to preexposure levels (T24/T0). All groups of cell lines had similar estimates for these ratios, and we saw no evidence that any group of cell lines differed from controls (Table 3). In addition to measuring the number of foci within cells, we also measured the size and intensity of foci, again concentrating on ratio responses analogous to those mentioned earlier. By one hour after exposure, the mean size of individual foci increased by more than 75% compared to preexposure levels in all groups of cell lines and remained elevated and virtually unchanged at 3 hours following exposure (Figure 3(b)). Even 24 hours after exposure, when the average number of foci per cell had returned to near preexposure levels, the average size of the foci remained significantly larger than that of cells before exposure (P < .0001 for all groups, P = .0003 for PNKP). One hour following exposure, mean intensity of foci increased more than twofold over preexposure levels, and it decreased only slightly from the 1-hour levels at 3 hours (Figure 3(c)). Mean intensity remained significantly elevated after 24 hours in all groups of cell lines except those with ATM and PNKP polymorphisms, although even these two had elevated intensities at 24 hours.

DISCUSSION
Recent resequencing efforts have greatly expanded the catalog of SNPs available for study. This catalog is increasingly being used in focused epidemiologic studies of cancer susceptibility genes and in broader genome-wide association studies. DNA repair genes are appealing candidates to study both because rare mutations in a number of these genes have been linked to cancer risk and because genomic instability and mutation are important features of the cancer process [1,2]. The most prominent a priori candidate SNPs for disease causation are those that lead to nonsynonymous amino acid changes, in particular, the small subset that are predicted to alter functional protein domain structure. Relatively few of these SNPs have been evaluated in epidemiologic studies, in part because minor allele frequencies are often less than 5% and thus require large sample sizes for adequate statistical power. Because of the difficulty in carrying out functional studies, even fewer have been evaluated using in vitro assays.
Using several in silico prediction tools to evaluate nsSNPs in DNA repair genes, we selected 5 SNPs with "possibly" or "probably damaging" amino acid substitutions in genes that are involved in DSBR. ATM, BRCA1, LIG4, and WRN have well-known associations with cancer or genetic diseases that predispose individuals to cancer [2][27][28][29].
Specific missense mutations in LIG4 have been associated with LIG4 syndrome which results in increased radiosensitivity [27]. LIG4 plays an essential role in the NHEJ pathway by rejoining ends of DNA at DSB sites. The "possibly damaging" variant, T9I, has been associated with a reduced risk for multiple myeloma [30] but has not otherwise been evaluated for functional effects.
Defective WRN results in Werner syndrome, which is characterized by an increased risk of cancer and other age-related disorders [28,29]. WRN is a member of the RecQ family of DNA helicases that has both 3′ to 5′ helicase and exonuclease activities and may limit nucleotide removal during NHEJ [31]. Although the WRN C1367R polymorphism is predicted to be "probably damaging," one functional study of enzymatic activity found little effect [32].
PNKP phosphorylates 5′-hydroxyl termini and dephosphorylates 3′-phosphate termini [33]. Although generally characterized as a base excision repair gene, it is also involved in phosphate replacement at damaged termini during NHEJ [34]. PNKP interacts with XRCC4 and loss of this interaction results in a slower rate of DNA repair and increased radiosensitivity [35]. The PNKP Y196N variant lies within the PNK39 protein domain, which in turn is thought to play a role in the repair of single strand breaks caused by exogenous agents, although the functional consequences of this variant have not been previously characterized.
BRCA1 mutations increase an individual's risk for breast and ovarian cancer [27]. In response to a DSB, BRCA1 promotes HR and suppresses NHEJ [36]. The BRCA1 polymorphism, Q356R, is located within the site of interaction with the Mre11/Rad50/Nbs1 (MRN) complex [36]. This complex is important in sensing and repairing DSBs [36], although two epidemiologic studies of the BRCA1 Q356R polymorphism failed to find an association with ovarian cancer risk [7,37].
Inactivation of ATM results in A-T which is associated with increased radiosensitivity and risk for cancer [27]. ATM is activated in response to ionizing radiation and phosphorylates a number of proteins involved in DSBR and checkpoint control, including p53, BRCA1, NBS1, CHK2, RAD9, MDM2, and H2AX [21,38]. In addition, ATM is necessary for nucleosome disruption and histone loss at the site of a DSB which may be necessary for proper recruitment of repair proteins [39]. To our knowledge, the only study to date of the ATM variant P1054R found no evidence for association with radiosensitivity in breast cancer patients [40].
The induction and elimination of γH2A.X foci following 1.5 Gy irradiation was remarkably consistent among controls and the 5 cell line panels representing different DNA repair gene polymorphisms. Compared to preexposure levels, all cell line panels showed statistically significant increases in number, intensity, and area of foci at both 1-hour and 3-hour time points. The number of γH2A.X foci proved to be the most sensitive index of exposure at both 1 and 3 hours, showing larger fold changes over preexposure levels than either intensity or area of foci. In addition, the number of γH2A.X foci showed a large and significant decrease between 1 and 3 hours whereas mean intensity of foci showed only small changes between these two time points and mean area of foci remained virtually unchanged. Whereas the number of foci had returned to baseline at 24 hours following exposure, both intensity and area of foci remained significantly elevated compared to preexposure levels for controls and most SNP panels.
Our study is limited both by the number of cell lines that constitute each panel and by the fact that in many cases homozygotes were so rare that we could only study heterozygous individuals. We cannot rule out the possibility that the nsSNPs that we evaluated might have subtle effects on repair of DSBs beyond the sensitivity of our assay and sample size. In addition, it is possible that alternative DNA repair pathways could compensate for decreased function and thus mask subtle functional effects, or that the nsSNPs we evaluated could have other functional consequences, for example, in repair fidelity, that we have not assessed.
Genome-wide association studies are becoming increasingly popular, as is the use of endophenotypes to better understand the etiology of complex diseases. However, difficulties in measuring these intermediate phenotypes on a large scale can often limit selection. The measurement of γH2A.X foci formation is a useful tool for evaluating the induction and repair of DSBs. Development of this assay for large scale use could be made possible by using chamber slides with multiple wells, microscopes with automated slide feeders and imaging capacities, and by reducing the manual labor associated with cell culture. The ability to focus on endophenotypes, such as DSBR, in genetic association studies may help us to better understand the factors that predispose individuals to cancer.