Theoretical Study of Sequence Selectivity and Preferred Binding Mode of Psoralen with DNA

Psoralen interaction with two models of DNA was investigated using molecular mechanics and molecular dynamics methods. Calculated energies of minor groove binding and intercalation were compared in order to define a preferred binding mode for the ligand. We found that both binding modes are possible, explaining the low efficiency for monoadduct formation from intercalated ligands. A comparison between the interaction energy for intercalation between different base pairs suggests that the observed sequence selectivity is due to favorable intercalation in 5′-TpA in (AT)n sequences.


INTRODUCTION
The study of the interaction of low molecular weight agents with DNA is essential for a deeper understanding of the biochemistry of the cell and also in the rational design of compounds with desired pharmacological properties. Among the vast amount of molecules that are known or proposed to interact with DNA we have in the current study focused our attention to furocoumarins.
Furocoumarins are tricyclic heterocycles found in a large number of plants [1]. Psoralens (Figure 1) are the best known of these chemical agents and have been employed in the treatment of several skin diseases such as psoriasis, vitiligo, and mycosis fungoides, as well for disorders related to the immune system such as cutaneous T-cell lymphoma [2]. Even though extracts of Ammi majus (Bishop's weed) were used by the Egyptians (1550 BC) in the treatment of vitiligo, the first concise report of the combined use of psoralen derivatives and UV light dates back to 1974 [3], in which Parrish et al. coined the term photochemotherapy to describe the interaction of UVA light and drugs.
The photochemical reactions of psoralens with DNA have been described in detail [1,4,5], and is proposed to proceed via a stepwise mechanism. First, the molecule inter-calates between the base pairs in double stranded DNA interacting with the π-stack of the nucleobases. After irradiation with UVA light, the photoexcited furocoumarins react with DNA and form covalent adducts through a [2 + 2] pericyclic reaction. The photoaddition takes place mostly between the psoralen 4 , 5 double bond (furan ring) and the C5, C6 double bond of a pyrimidine (usually thymine). The furan-side adduct can absorb an additional UVA photon, forming an interstrand cross-linked derivative through photoaddition at the 3, 4 double bond in the pyrone ring [6][7][8].
Furocoumarin photoaddition takes place predominantly with thymine. The sequence specificity has been investigated using DNA sequencing methodology. The results revealed that thymines in a GC environment have weak reactivity, while adjacent thymines are better targets. In addition, there is a strong preference for 5 -TpA sites compared with 5 -ApT, and (AT) n sequences are the most reactive towards photoaddition [9,10].
There is however no information regarding the factors controlling the sequence selectivity. One possibility is that the preference is due to favorable intercalation into this site. Another explanation is that the molecules are docked between all different base pairs, but photoaddition occurs preferentially in a special environment. A third possibility could be 2 Research Letters in Physical Chemistry that psoralens prefer both intercalation and photoreaction with a 5 -TpA site in an AT environment.
In this context, computational techniques are an exceptional tool to explore these different factors. Molecular docking is a computational method for predicting ligand-receptor binding when both the binding site and binding mode are unknown [11]. Even tough DNA has an exceptionally wellcharacterized structure, there is a very small number of DNA-ligand docking studies [12][13][14][15][16], and to the best of our knowledge there is only scarce information on docking involving this kind of ligand systems [17].
The aim of this study is to obtain complexes of DNAmodels and psoralen in which the ligand in principle could act either as intercalator or minor groove binder. We used two different DNA fragments, d(CCTTGCTACCTT) 2 and d(TATATATATATA) 2 , as targets for intercalation and minor groove complexes, in order to evaluate the preferential interaction between different base pairs and explore the proposed mechanism. Obtaining detailed information on the possible interaction sites will assist in the development of new and more efficiently intercalating photoactive compounds.

METHODS
Duplex B-DNA was constructed with the Molecular Operating Environment (MOE) software [18] and was minimized to 0.00001 kcal mol −1Å −1 using the CHARMM27 force field [19]. After minimization, the model was subjected to molecular dynamics (MD) simulations [20] employing an NVT ensemble [21] and the CHARMM27 potential. After the MD simulation, a final energy minimization was performed. The resulting DNA-model was used in the docking studies.
Because of the ring-fused planar structure of furocoumarin, no ligand conformational search is needed. A single energy minimization calculation was performed until the root-mean-square (RMS) gradient was 0.00001 kcal mol −1Å −1 using the same force field.
Manual docking of psoralen to the DNA was performed, in which different orientations for the ligand were considered. The DNA model with the ligand was energy minimized, subjected to MD simulation using the same settings as above except for a longer (250 picoseconds) heating interval, and finally energy minimized, using in all the stages the CHARMM27 force field. All trajectories were rapidly equilibrated during the first 200 picoseconds of the production runs. Hence, a total 1 nanosecond production simulation was considered sufficient to allow for structural relaxation. The potential energy of the final complex was calculated (E complex ). The ligand was then moved away from the DNA stack, the noninteracting system energy minimized, and the potential energy of the separated moieties calculated (E DNA + E ligand ). Subtracting this from E complex gives the interaction energy (IE) reported herein.
Throughout the calculations, the surrounding was modeled through the distance-dependent dielectric model of bulk water [18,22].

Psoralen binding modes
El-Gogary and El-Gendy [17] employed spectrophotometric DNA titration to calculate the intercalation affinity. The spectrophotometric data for 8-methoxypsoralen bound to calf-thymus DNA showed that even though the molecule could intercalate, a large fraction (approximately 30%) was also bound to the surface of DNA. No theoretical studies regarding different binding modes of psoralens are available. To this end, the ability of psoralen to intercalate between base pairs in d(CCTTGCTACCTT) 2 and d(TATATATATATA) 2 sequences, as well to bind to the minor groove, was therefore examined in the current study. The results are shown in Tables 1 and 2.
Psoralen does not present different geometric conformations due to its rigid nature. However, the molecule is asymmetric and may therefore interact with different energy in different poses. Four different orientations for docking into the DNA stack were considered in all systems, numbered as indicated in Figure 1.
First, we examine the ability to intercalate between different base pairs, GC, CT, and TA following the procedure described in Section 2. Each pose was inserted between the considered base pair. In some cases the ligand moved away from the intercalation site to interact with the minor groove, the major groove, or interacting only with one strand of DNA ("single-strand insertion," Table 1). In addition, we calculated the interaction energy when psoralen was placed initially in the minor groove. In this case, all poses remained bound to this region after the MD simulation (Table 2).
In all cases, the interaction between the ligand and the macromolecule lowers the total energy considerably, leading to stable complexes. Our results show that intercalation and minor groove binding are competing processes, in full agreement with the experimental information. When psoralen interacted with the d(CCTTGCTACCTT) 2 fragment, the most favorable docking energies for intercalation were obtained for pose 2 intercalated between CT (−36.8 kcal mol −1 ) and pose 3 between TA (−36.7 kcal mol −1 ). Minor groove binding into that DNA model was found to be slightly more favorable for poses 1 and 3, yielding interaction energies of −39.1 and −38.5 kcal mol −1 , respectively. When considering the poly-TA fragment, intercalation of pose 2 in the 5 -TpA site led to the most stable complex with an interaction energy of −43.7 kcal mol −1 . Minor groove binding energies were a few kcal mol −1 less favorable compared with the nonpoly-TA fragment (best docking energy −34.4 kcal mol −1 for pose 1). Those results show that even though minor groove binding is possible in both DNA strands, it is comparatively less favored in poly-TA fragments.
The current docking studies also explain other experimental facts. Tessman et al. [6] studied the photochemical reaction of psoralen with calf-thymus DNA. They observed a relatively low efficiency for the conversion of noncovalent complexes to monoadducts (approximately 11%). They suggested that the probability of having the appropriate excited state with the correct geometry alignment for photoaddition   is quite low, which could be due to the fact that intercalated psoralen is in dynamic equilibrium with nonbound ligand outside the helix. From our results, we instead propose an alternative explanation, namely, that the low efficiency is a consequence of the two competing binding modes, that is, intercalation versus minor groove binding.

Psoralen sequence selectivity
The use of two different sequences of bases (d(CCTTGCTA-CCTT) 2 and d(TATATATATATA) 2 ) in our docking studies enables us to evaluate the sequence selectivity. The two strands in principle have the same potential to react with psoralen and form photoadducts. However, the experimental studies reveal a different behavior between the two [9,10]. Psoralen preferentially photoreact in (AT) n sequences and especially in 5 -TpA sites. However, it is difficult from those results to conclude whether the preference is a consequence of preferential intercalation, photoreaction, or both.
Thus, we decided to study the ability of psoralen to intercalate between different base pairs in two DNA models (Table 1). In all cases the same four different orientations were considered (Figure 1). Each pose was inserted between the base pair, minimized and subjected to MD simulation. In some cases, the ligand did not remain intercalated, but moved to either the minor or major groove, or became inserted between two bases interacting with one strand of DNA only.
When the ligand interacted with the d(CCTTGCTA-CCTT) 2 fragment, all three environments (GC, CT, TA) behaved differently. We were unable to get psoralen intercalated between GC, while intercalation between CT turned out to be almost as favorable as between TA. When the ligand is inserted (GC base pair) it interacts with only one strand of DNA, acting as hydrogen bond acceptor via its carbonyl oxygen atom with the cytosine amino group (Figure 2). Such interaction stabilizes the insertion of the ligand to the GC tract. Figure 2 also shows the van der Waals contact surface for pose 3 inserted between GC.
The results of the docking studies with the poly-TA model were completely different. Psoralen intercalates to give stable complexes in 5 -TpA sites that furthermore yields the most favorable interaction energy (−43.7 kcal mol −1 ). However, when considering 5 -ApT sites, the ligand became bound to the minor or major groove or single-strand inserted, instead of remaining intercalated after MD simulation.
The most stable complex has the ligand perfectly intercalated between T and A, interacting only with those base pairs. During the MD simulation the psoralen was however rotated 90 degrees from the initial pose to the final one, as is shown in Figure 3. In addition, the van der Waals surface is closed around the ligand (Figure 3). The topology of the two sites (5 -TpA and 5 -ApT) is very different in our model. The base stacking is not the same in the two sequences; in the former the base pairs are per-fectly parallel while in the latter they are not. These results are in agreement with experimental information. McClellan et al. [23] have found a nonstandard B-helix for alternating (AT) n sequences, as a consequence of poor stacking between the base pairs. The same authors mentioned that (AT) n sequences are sites of great instability, present in promoters where DNA replication is initiated, a process in which unstacking is required. The unstacking and instability compared with nonpoly-TA sequences could lead to a better accessibility of the C5-C6 double bond of thymine due to easier intercalation. Our results suggest that intercalation is indeed favored in (AT) n sequences and especially in 5 -TpA sites.

CONCLUSIONS
A computational docking study for the photochemical agent psoralen was performed for the first time, employing two different DNA models (d(CCTTGCTACCTT) 2 , and d(TATATATATATA) 2 ). We have analyzed different binding modes and suggest that the low efficiency observed for the conversion of the noncovalent complexes to monoadducts are a consequence of two competing binding modes, intercalation and minor groove binding.
It is well known that strong sequence selectivity for furocoumarin photoaddition exists. From our results we can conclude that the preference is due to favorable intercalation in 5 -TpA sites in AT enviroments (poly-TA DNA). Albeit this will control the distribution in different environments, it does however not exclude that photoaddition may take place preferentially in those sites also. We are currently combining additional docking studies with quantum chemical calculations to take these factors into account.

ACKNOWLEDGMENTS
The Swedish Science Research Council (VR) and Sparbanksstiftelsen Nya are gratefully acknowledged for financial support.