Base Sequence Context Effects on Nucleotide Excision Repair

Nucleotide excision repair (NER) plays a critical role in maintaining the integrity of the genome when it is damaged by bulky DNA lesions, since inefficient repair can cause mutations and human diseases, notably cancer. The structural properties of DNA lesions that determine their relative susceptibilities to NER are therefore of great interest. As a model system, we have investigated the major mutagenic lesion derived from the environmental carcinogen benzo[a]pyrene (B[a]P), 10S (+)-trans-anti-B[a]P-N2-dG, in six sequence contexts that differ in how the lesion is positioned in relation to nearby guanine amino groups. For these six sequence contexts we have obtained molecular structural data from NMR and MD simulations, bending properties from gel electrophoresis studies, and NER data from human HeLa cell extracts. This model system suggests that disturbed Watson-Crick base pairing is a better recognition signal than a flexible bend, and that the two can act in concert to provide an enhanced signal. Steric hindrance between the minor groove-aligned lesion and nearby guanine amino groups determines the exact nature of the disturbances. Both nearest neighbor and more distant neighbor sequence contexts have an impact. Regardless of the exact distortions, we hypothesize that they provide a local thermodynamic destabilization signal for repair.


Introduction
Nucleotide excision repair (NER) plays a central role in preserving the genome of prokaryotes and eukaryotes. This versatile repair system removes structurally and chemically diverse bulky DNA lesions, including those induced by exposure to UV light and environmental chemical carcinogens [1,2]. The vital importance of this mechanism is demonstrated by several human NER-deficiency syndromes, including xeroderma pigmentosum (XP), Cockayne syndrome (CS), and trichothiodystrophy (TTD) [3]. XP, for example, is characterized by high photosensitivity, hyperpigmentation, premature skin ageing, and a predisposition to skin cancer [4]. Furthermore, the capacity of the NER pathway is important in cancer chemotherapy [5]: NER diminishes the efficacy of chemotherapeutic agents such as cisplatin, which act via the formation of bulky DNA adducts. A better understanding of how the NER system recognizes DNA lesions may lead to the design of improved chemotherapeutic drugs that can modulate the repair response. Recent findings reveal that polymorphisms in human NER genes have an impact on the repair of DNA lesions and cancer susceptibility [6,7], as well as on chemotherapeutic efficacy [8].
The eukaryotic NER pathway is a biologically complicated process consisting of two sub-pathways with different substrate specificities: global genome NER (GG-NER) [9,10] and transcription-coupled repair (TCR) [11][12][13][14]. Both sub-pathways are ordered multistep processes that differ in the early steps, when the DNA lesions are recognized, but converge in the later steps. In GG-NER, the focus of our present interest, the whole genome is scanned for bulky lesions to initiate the repair process. Two independent complexes, one involving the XPC/HR23B/Centrin 2 proteins [15][16][17] and the other involving the DDB1/DDB2 heterodimer [18][19][20][21], have been implicated in the early steps of base-damage recognition during NER [9]. By contrast, the TCR sub-pathway is activated by a stalled RNA polymerase during transcription [12]. Once the lesion is detected, the two sub-pathways proceed in an essentially identical manner to excise it: the multisubunit transcription factor TFIIH, containing the helicases XPB and XPD, is recruited to the lesion site, followed by XPA, the single-strand DNA binding protein RPA, and the two nucleases XPG and XPF-ERCC1. Once this complex is assembled, an oligonucleotide of 24-32 residues containing the lesion is excised from the damaged strand. This excised 24-32 nucleotide fragment is the hallmark of a successful NER event. Finally, gap resynthesis by DNA polymerases δ, ε, and κ [22] and ligation by DNA ligase I complete the NER process [23].
One remarkable characteristic of the NER pathway is its ability to excise an astounding variety of chemically and structurally diverse lesions [2], with rates of repair that can vary over several orders of magnitude. However, the differences in the structural and thermodynamic properties of the lesions that control these diverse NER efficiencies have remained elusive. It has been suggested that the NER factors do not recognize the lesion itself, but rather the local distortions and destabilizations in the DNA that are associated with it [24][25][26][27][28][29][30]. A number of different properties of damaged DNA that elicit the NER response have been proposed. These include disruption of Watson-Crick hydrogen bonding [24,31], kinks in the damaged DNA [32], thermodynamic destabilization [24,29,33], diminished base stacking [34,35], local conformational flexibility [36], and flipped-out bases in the unmodified complementary strand [37][38][39][40]. A crystal structure of yeast Rad4/Rad23, the homolog of the human NER recognition factor XPC/HR23B, bound to DNA containing a cyclobutane pyrimidine dimer, shows that Rad4/Rad23 inserts a β-hairpin through the DNA duplex and expels two mismatched thymines in the undamaged strand out of the duplex to bind with the enzyme (PDB ID: 2QSG) [41]. This structure suggests that lesions which thermodynamically destabilize the DNA duplex, and thereby facilitate the flipping of base pairs and the intrusion of the β-hairpin, are good substrates for the NER machinery: the more a lesion locally destabilizes the duplex, the better it is repaired.
The modulation of NER susceptibility for the same lesion by neighboring base sequence context is, however, a relatively unexplored area. If a lesion is repaired better in one sequence context than in another, a lesion-induced mutational hotspot could result. In order to elucidate the relationship between NER efficiency and the base sequence-governed DNA distortion and destabilization induced by a bulky DNA adduct, we have employed as a model system the major lesion derived from the cancer-causing compound benzo[a]pyrene (B[a]P) [42]. B[a]P is the most well-studied member of a family of ubiquitous environmental pollutants known as polycyclic aromatic hydrocarbons. The tumorigenic metabolite of B[a]P [43] is the diol epoxide r7,t8-dihydroxy-t9,10-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene. The major lesion it forms in DNA [44][45][46] is the 10S (+)-trans-anti-B[a]P-N2-dG adduct ([G*]) (Figure 1(a)), the focus of our work. This adduct, unless removed by DNA repair mechanisms [47], is highly mutagenic [48,49].
We have investigated the identical 10S (+)-trans-anti-B[a]P-N2-dG adduct in the six sequence contexts shown in Figure 1(b), utilizing an array of approaches: NER in human HeLa cell extracts, ligation and polyacrylamide gel electrophoresis techniques to assess bending properties of the modified duplexes, and structural studies utilizing high resolution NMR methods as well as unrestrained molecular dynamics (MD) simulations. The position of the B[a]P ring system in the B-DNA minor groove, directed 5′ along the modified strand, was first determined by NMR in the 5′-...C[G*]C-I... sequence in 1992 [50], but sequence-governed structural details as well as dynamic properties remained to be elucidated. One important motivation for our work was to explore the role of nearby guanine amino groups in the structural properties and NER susceptibilities of these duplexes. The key difference among these duplexes is the presence and positioning of guanines flanking the [G*], either immediately adjacent to the lesion or beyond: the B[a]P rings compete for space with the bulky amino group of guanine on the minor groove side of B-DNA, which we anticipated would differentially impact the structures of the damaged duplexes in a sequence context-dependent manner. A further motivation was to explore the role of differing sequence contexts beyond the lesion that vary in intrinsic flexibility. We hypothesized that subtle but critical structural effects governed by sequence context would manifest themselves by impacting NER efficiencies. Our results showed that sequence context can cause an up to four-fold difference in relative NER susceptibility, with even distant neighbors influencing NER. Locally disturbed Watson-Crick hydrogen bonding and flexible bending are two key sequence-governed structural distortions caused by this lesion that the NER machinery appears to recognize with different efficiencies.
More generally, different lesions in varied sequence contexts will cause different kinds of distortions; thus, the extent of the local thermodynamic destabilization will also vary; we hypothesize that it is the extent and type of destabilization that determines the relative NER efficiency.

Nearest Neighbor Base Sequence Context Impacts NER of the 10S (+)-trans-anti-B[a]P-N2-dG Adduct
The [...] (Figure 3), which is partner to the C on the 5′ side of [G*], is sterically crowded by the B[a]P ring system since both are on the minor groove side, and hence this C5:G20 base pair is episodically denatured (Figure 3(a)). For the 5′-...G[G*]C... case, the B[a]P rings crowd the G6 amino group, and here the crowding is relieved by severe untwisting accompanied by increased Roll, which produces the flexible bend observed by gel electrophoresis. Investigations with the 5′-...I[G*]C... sequence context substantiated the critical role of the guanine amino group, since inosine ("I", Figure 1(b)) lacks this group: the gel electrophoretic manifestation of a flexible bend was abolished. The NMR data showed conformational heterogeneity in minor groove conformations [51], and the MD simulations showed episodic denaturation of one of the two hydrogen bonds at the I:C base pair, explaining the heterogeneity.

Since different sequence steps are known to be differentially flexible [57,65], we hypothesized that the same minor groove lesion [50,64] with different distant neighbors would [...] duplex originates from the guanine amino group at the C3:G20 pair (Figure 5). This amino group acts as a wedge to open the minor groove; facilitated by the highly deformable local C3-A4 base step, the amino group allows the B[a]P ring system to better bury its hydrophobic surface within the groove walls. This produces a yet more enlarged minor groove, which is coupled with more local untwisting and a larger and more flexible Roll [67], causing the greater bend in 5′-...C[G*]C-II... [66] (Figure 5). The NER efficiencies are 1.6 ± 0.2 times greater in the 5′-...C[G*]C-II... than in the 5′-...C[G*]C-I... sequence context [66], showing that distant neighbors of [G*] modulate the NER susceptibility. The greater NER susceptibility of the 5′-...C[G*]C-II... duplex is explained by its greater bending with enhanced flexibility: the intrinsic minor groove enlargement caused by the guanine amino groups [55,68] and the great flexibility of pyrimidine-purine steps, including the C-A step [57][69][70][71][72], allow the B[a]P moiety (Figure 5) to position itself more favorably, but at the expense of the greater bend that makes the duplex more repair-susceptible.
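As a purely illustrative sketch, a fold difference such as the 1.6 ± 0.2 quoted above can be obtained as a ratio of two mean excision yields with standard error propagation. The function name and the input numbers below are hypothetical; they are not data from the cited study, and were chosen only so that the arithmetic lands on a ratio of the same size.

```python
import math


def relative_efficiency(mean_a, sd_a, mean_b, sd_b):
    """Return (ratio, uncertainty) for mean_a / mean_b.

    Assumes independent errors, so the relative uncertainties
    of numerator and denominator add in quadrature.
    """
    ratio = mean_a / mean_b
    rel_err = math.sqrt((sd_a / mean_a) ** 2 + (sd_b / mean_b) ** 2)
    return ratio, ratio * rel_err


# Hypothetical excision yields (percent of substrate excised), NOT measured values:
# context A stands in for C[G*]C-II, context B for C[G*]C-I.
ratio, err = relative_efficiency(8.0, 0.6, 5.0, 0.5)
print(f"{ratio:.1f} ± {err:.1f}")
```

Whether the published uncertainty was derived in this way is not stated in the text; the sketch only shows one common convention for reporting a ratio of two measured yields.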

Understanding Repairability Differences: the Degree of Local Thermodynamic Destabilization Is a Unifying Hypothesis
We have carried out a series of studies with the same 10S (+)-trans-anti-B[a]P-N2-dG lesion in a number of sequence contexts that differ in how the lesion is positioned in relation to nearby guanine amino groups. Additionally, we have considered differences in the intrinsic flexibility of sequences flanking the lesion. These are model systems for gaining an understanding of NER lesion recognition factors. We have obtained molecular structural data from NMR and MD simulations, bending properties from gel electrophoresis studies, and NER data from human HeLa cell extracts for all of our investigated sequence contexts (Figure 1(b)). Figure 4 summarizes our key findings and enables us to infer a hierarchy of NER recognition signals for the series of sequences and the single lesion we explored. We point out that a variety of correlated structural disturbances are found in each case. Examples include impaired Watson-Crick pairing accompanied by diminished base stacking, and DNA bending towards the major groove, which is induced by a minor groove lesion and accompanied by minor groove enlargement. Our present model system suggests that disturbed Watson-Crick base pairing is a better recognition signal than a flexible bend, and that the two can act in concert to provide an enhanced signal: for example, [...] (Figure 4). For our system, steric hindrance between the minor groove-aligned lesion and nearby guanine amino groups, if present, determines the exact nature of the disturbances, depending on exactly where the guanine amino groups are situated. The intrinsic flexibility of the specific base steps also plays an important role in causing the differential disturbances. Both the nearest neighbor and the more distant neighbor sequence contexts have an impact.
More globally, different lesions may cause different types of distortions depending on the specific nature of the lesion and its sequence context. However, regardless of exactly what these distortions are, we hypothesize that they must provide a local thermodynamic destabilization signal for repair to ensue, and the greater the extent of destabilization, the better the repair. The destabilization would facilitate the strand separation, base-flipping, and β-hairpin insertion by the XPC/HR23B recognition factor [41,73] needed to initiate NER. In this way, the NER machinery would excise a large variety of lesions with different efficiencies, by recognizing the thermodynamic impact of the lesions rather than the lesions themselves [24,29,41,73]. Lesions that resist NER present a great hazard, as they survive to the replication step and produce a mutagenic outcome; such NER-resistant lesions provide an important opportunity for gaining further understanding of the mechanism utilized by the NER apparatus to recognize different lesions [74].
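One hedged way to state this hypothesis symbolically is given below. The symbols are our own illustrative notation and do not appear in the cited work: \Delta G^{\circ}_{\mathrm{duplex}} denotes the free energy of duplex formation, and E denotes the relative NER efficiency of a lesion in a given sequence context.

```latex
% Lesion-induced destabilization relative to the unmodified duplex
% (positive values mean the lesion destabilizes the duplex):
\Delta\Delta G^{\circ}
  = \Delta G^{\circ}_{\mathrm{duplex}}(\text{lesion-containing})
  - \Delta G^{\circ}_{\mathrm{duplex}}(\text{unmodified})

% Hypothesis: for two lesion/sequence-context combinations 1 and 2,
% greater local destabilization implies repair that is at least as efficient:
\Delta\Delta G^{\circ}_{1} > \Delta\Delta G^{\circ}_{2}
  \;\Longrightarrow\; E_{1} \geq E_{2}
```

This is a qualitative ordering only. The hypothesis as stated concerns local destabilization at the lesion site, which a global duplex \Delta\Delta G^{\circ} may capture only approximately.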