Use of a Designed Peptide Library to Screen for Binders to a Particular DNA G-Quadruplex Sequence

We demonstrated a method to screen for binders to a particular G-quadruplex sequence using easily designed short peptides consisting of naturally occurring amino acids and mining of binding data using statistical methods such as hierarchical clustering analysis (HCA). Despite the small size of the library used in this study, candidates of specific binders were identified. In addition, a selected peptide stabilized the G-quadruplex structure of a DNA oligonucleotide derived from the promoter region of the protooncogene c-MYC. This study illustrates how a peptide library can be designed and presents a screening guideline for construction of G-quadruplex binders. Such G-quadruplex peptide binders could be functionally modified to enable switching, cellular penetration, and organelle-targeting for cell and tissue engineering.


Introduction
Research over the last few decades has revealed that some DNA and RNA secondary structures modulate a variety of cellular events. One secondary structure in particular, the G-quadruplex [1], regulates cellular events such as transcription, translation, pre-RNA splicing, and telomerase elongation, all of which play roles in various serious diseases and cellular aging [2][3][4][5][6]. Systems capable of controlling DNA and RNA G-quadruplex structures would therefore be useful for modulating cellular events to produce biological effects. Because of the biological importance of G-quadruplexes, many G-quadruplex-targeting ligands [7,8] have been described, including phthalocyanine derivatives [9], porphyrin derivatives [10], and others [11][12][13][14]. However, the next generation of binders should have greater G-quadruplex sequence specificity, a greater ability to induce or collapse the structure, and a greater degree of functionality, including on-off switching of binding, cellular penetration, and the ability to target organelles.
De novo designed peptides (peptides not derived from domains of binding proteins) are promising next-generation G-quadruplex-binding candidates because they offer the following advantages: (i) peptides are easier to design and synthesize than antibodies or recombinant proteins; (ii) they can mimic protein-G-quadruplex interactions; (iii) analyses based on peptide libraries can be used to elucidate binding properties of DNAs; (iv) in addition to naturally occurring amino acids, various functional moieties (e.g., artificial amino acids) can be employed as building blocks in designed peptides; (v) because certain peptide sequences exhibit transmembrane or hormonal properties, combining such functional sequences with G-quadruplex-binding peptides can produce multifunctional molecular systems useful in cell and tissue engineering.
To increase the utility of the peptide library technology, we designed peptide microarrays composed of various secondary structures. The designed peptide arrays were initially applied to protein analysis [15][16][17][18][19][20][21]. Upon addition of various proteins to these peptide arrays, library peptides containing fluorescent probes showed different binding responses depending on the peptide sequence. These response patterns served as "protein fingerprints" (PFPs), which can be used to establish the identity of a target protein and correlate the recognition properties of a target protein to a particular peptide [15,16,21]. In addition, by applying statistical analyses such as hierarchical clustering analysis (HCA) and principal component analysis (PCA) to PFPs, researchers can draw high-confidence correlations between target proteins and biological function, based primarily upon peptide charge and hydrophobicity data [19,21]. We successfully applied our system to screen for peptide ligands that tightly bind to a target protein and simultaneously control the function of a protein related to the target [20]. This approach has several advantages, such as ease of peptide library design and robust selection of ligands with novel structures for the control of signaling pathways and/or cascades.
Here, we demonstrate a model screening of binders to a particular G-quadruplex sequence using easily designed short peptides consisting of naturally occurring amino acids. We also examined the stability of the DNA G-quadruplex structure upon addition of a G-quadruplex-binding peptide and checked whether the peptides could induce or collapse the G-quadruplex structure. This study illustrates how a peptide library can be designed and presents a screening guideline for the construction of next-generation ligands with increased specificity to particular G-quadruplexes and increased functionality, including on-off switching and the ability to penetrate cells and target organelles.

General Remarks.
All chemicals and solvents were of reagent or HPLC grade and were used without further purification. Oligodeoxynucleotide samples purified by HPLC were purchased from Hokkaido System Science (Sapporo, Japan). HPLC was performed on a GL-7400 HPLC system (GL Sciences, Tokyo, Japan) using an Inertsil ODS-3 (10 × 250 mm; GL Sciences) column for preparative purification with a linear acetonitrile/0.1% trifluoroacetic acid (TFA) gradient at a flow rate of 3.0 mL/min. Peptides were analyzed using MALDI-TOF MS on an Autoflex III (Bruker Daltonics, Billerica, MA, USA) mass spectrometer with 3,5-dimethoxy-4-hydroxycinnamic acid as the matrix.

Synthesis of a Designed Peptide Minilibrary.
The designed peptide library was synthesized on Fmoc-NH-SAL-PEG Resin (Watanabe Chemical Industries, Hiroshima, Japan) using an automatic synthesizer (Advanced ChemTech Model 357 FBS) with Fmoc chemistry [22] using Fmoc-AA-OH (4 eq., Watanabe Chemical Industries) according to the O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HATU, Watanabe Chemical Industries) method. Side chain protections were as follows: t-butyl (tBu) for Glu and Tyr, trityl (Trt) for His, and t-butyloxycarbonyl (Boc) for Lys. The peptides were cleaved from the resin and side chain protections were removed by incubating the peptides for 2 h in TFA (Watanabe Chemical Industries)/H2O/triisopropylsilane (Wako Pure Chemical Industries, Tokyo, Japan) (20:1:1, v/v). The peptides were precipitated by addition of cold diethyl ether and then collected by centrifugation. The recovered peptides were dissolved in water to about 1 mM and stored at 4 °C. To determine the concentration of the library peptide stock solutions, the absorbance at 280 nm (for Trp (ε = 5500) and Tyr (ε = 1490) [23]) of a diluted solution of each peptide was measured on a U-1900 UV spectrometer (Hitachi High-Technologies, Tokyo, Japan). After the screening assay in 3.2, selected peptides were purified by RP-HPLC and characterized by MALDI-TOF MS for further experiments in 3.3-3.5. Purified peptides were dissolved in water to about 1 mM, their concentrations were measured by their absorbance at 280 nm, and they were then stored at 4 °C.
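The concentration determination described above follows the Beer-Lambert law, and a minimal Python sketch of the arithmetic is given here. The extinction coefficients are those quoted in the text, taken to be in M−1 cm−1; the function names, the 1 cm path length, the dilution factor, and the example sequence are illustrative assumptions rather than details reported in this study.

```python
# Extinction coefficients at 280 nm quoted in the text, assumed to be in M^-1 cm^-1.
EPSILON_280 = {"W": 5500.0, "Y": 1490.0}


def extinction_coefficient(sequence: str) -> float:
    """Sum the 280 nm contributions of Trp (W) and Tyr (Y) in a one-letter sequence."""
    return sum(EPSILON_280.get(residue, 0.0) for residue in sequence.upper())


def stock_concentration_mM(a280: float, sequence: str,
                           dilution_factor: float = 1.0,
                           path_cm: float = 1.0) -> float:
    """Beer-Lambert estimate of the stock concentration in mM.

    a280            -- absorbance of the diluted sample at 280 nm
    dilution_factor -- how many fold the stock was diluted (assumed parameter)
    path_cm         -- optical path length in cm (assumed to be 1 cm)
    """
    epsilon = extinction_coefficient(sequence)
    if epsilon == 0.0:
        raise ValueError("sequence contains no Trp or Tyr; A280 cannot be used")
    molar = a280 / (epsilon * path_cm) * dilution_factor
    return molar * 1000.0  # convert M to mM


# Hypothetical example: one Tyr and one Trp give 1490 + 5500 = 6990 M^-1 cm^-1.
example_mM = stock_concentration_mM(0.3495, "EEYGKWK", dilution_factor=20)
```

Every library peptide contains at least the Trp of the KWK motif, so the zero-coefficient guard is only a safety check.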

Screening of Library Peptides Using Gel Electrophoresis.
Prior to analysis, each sample in 10 mM Tris-HCl (pH 7.0) buffer containing 0.1 mM EDTA was heated to 85 °C for 5 min, then gently cooled to room temperature at a rate of 1.0 °C min−1. Native gel electrophoresis was performed using nondenaturing gels containing 13% polyacrylamide. Loading buffer (2 μL) was mixed with 2 μL of 5 μM DNA sample with/without 50 μM library peptide in 10 mM Tris-HCl (pH 7.0) buffer containing 0.1 mM EDTA. A 4 μL aliquot of each sample was loaded onto the gel and electrophoresed at 10.0 V cm−1 for 2 h at room temperature. Gels were stained with SYBR Gold Nucleic Acid Gel Stain (Invitrogen, Carlsbad, CA, USA) and imaged using a FLA-7000 imager (Fuji Film, Tokyo, Japan). Band intensities were quantified using Multi Gauge software (V.3.2) for Windows. The bound DNA percentage was calculated according to the following equation: bound DNA (%) = {(intensity of DNA band without peptide) − (intensity of DNA band with peptide)} / (intensity of DNA band without peptide) × 100.
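The bound DNA percentage defined above can be written as a short function. This is only a sketch of the stated equation: the function name and the example intensities are hypothetical, and intensities are assumed to be background-corrected values in arbitrary units.

```python
def bound_dna_percent(intensity_without_peptide: float,
                      intensity_with_peptide: float) -> float:
    """Percentage of DNA bound, from the free-DNA band intensities.

    Implements {(I_without - I_with) / I_without} x 100, as given in the text.
    """
    if intensity_without_peptide <= 0:
        raise ValueError("reference band intensity must be positive")
    lost = intensity_without_peptide - intensity_with_peptide
    return lost / intensity_without_peptide * 100.0


# Hypothetical intensities: the free-DNA band falls from 2000 to 500 units.
example_percent = bound_dna_percent(2000.0, 500.0)  # 75.0
```

A value near 0% means the free-DNA band is unchanged by the peptide, whereas a value near 100% means the band has almost disappeared.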

Hierarchical Clustering Analysis (HCA).
The Euclidean distance [24,25], a common measure of the distance between two vectors, was used to determine the similarity between two binding-color images obtained from different target DNAs. After the similarities between the binding-color images were determined, HCA was conducted. Ward's clustering algorithm was used and the dendrogram was obtained from the Euclidean distances using the Excel Macro program [26]. The horizontal axis represents the distance between vectors (left for high similarity and right for low similarity).
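As a rough illustration of this analysis, the following pure-Python sketch computes Euclidean distances and performs agglomerative clustering with Ward's criterion. It is not the Excel macro [26] used in this study. The function names, the merge-height convention (chosen so that two single profiles merge at their Euclidean distance), and the four example profiles are assumptions made for illustration only and are not measured data.

```python
from itertools import combinations
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length binding profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def ward_linkage(profiles):
    """Agglomerative clustering with Ward's minimum-variance criterion.

    profiles: dict mapping a label to a list of binding percentages.
    Returns a list of merges (members_a, members_b, height) in merge order.
    """
    # Each cluster is stored as (member labels, centroid vector).
    clusters = {i: ([label], list(vec))
                for i, (label, vec) in enumerate(profiles.items())}
    next_id = len(clusters)
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            (members_a, centroid_a), (members_b, centroid_b) = clusters[a], clusters[b]
            na, nb = len(members_a), len(members_b)
            # Increase in within-cluster sum of squares if a and b were merged.
            cost = na * nb / (na + nb) * euclidean(centroid_a, centroid_b) ** 2
            if best is None or cost < best[0]:
                best = (cost, a, b)
        cost, a, b = best
        (members_a, centroid_a), (members_b, centroid_b) = clusters.pop(a), clusters.pop(b)
        na, nb = len(members_a), len(members_b)
        centroid = [(na * x + nb * y) / (na + nb)
                    for x, y in zip(centroid_a, centroid_b)]
        clusters[next_id] = (members_a + members_b, centroid)
        next_id += 1
        # Assumed height convention: sqrt(2 * cost), which equals the plain
        # Euclidean distance when two single profiles are merged.
        merges.append((members_a, members_b, sqrt(2 * cost)))
    return merges


# Hypothetical binding percentages of three peptides for four DNAs.
example_profiles = {
    "dna1": [80, 10, 60],
    "dna2": [20, 70, 55],
    "dna3": [22, 68, 54],
    "dna4": [30, 60, 50],
}
example_merges = ward_linkage(example_profiles)
```

In this made-up example, dna2 and dna3 merge first and dna1 joins last, which is how an outlying profile would appear at the far right of a dendrogram. The same routine can cluster peptides instead of DNAs by transposing the table.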

Circular Dichroism (CD) Spectroscopy.
Circular dichroism (CD) spectroscopy was performed using DNA (1 μM) and peptide no. 010 (0 or 100 μM) in 20 mM Tris-HCl (pH 7.0) containing 0.1 mM KCl and 0.1 mM EDTA. A J-820 spectropolarimeter (JASCO, Hachioji, Japan) with a thermoregulator using a quartz cell with a 1 cm path length at 25 °C was used for CD measurements. Prior to analysis, each sample was heated to 85 °C for 5 min, then gently cooled to room temperature at a rate of 1.0 °C min−1.

UV Melting of G-Quadruplex Structures.
The UV absorbance was measured using a Shimadzu 1800 spectrophotometer equipped with a temperature controller (Shimadzu, Kyoto, Japan). Melting curves for the G-quadruplex structures were obtained by measuring the UV absorbance at 295 nm in 10 mM Tris-HCl (pH 7.0) containing 0.1 mM KCl and 0.1 mM EDTA at a heating rate of 1.0 °C min−1. The melting temperature (Tm) values for 5 μM DNAs with/without peptide no. 010 (100 μM) were obtained from UV melting curves as described previously [27,28].

Design and Synthesis of the Peptide Minilibrary.
We constructed a minilibrary consisting of 32 peptides of varying charge and/or hydrophobicity using the strategy shown in Figure 1(a). The KWK motif is known for its ability to bind to DNA [29]. We designed peptides in which four residues were added to the N-terminus of the KWK motif. Placing a G or P at the fourth residue varied the flexibility of the peptide main chain. Placing an H, W, F, or Y at residue X1 varied the aromatic character of the peptide, while placing a K or E at residues Z1 and Z2 varied the peptide charge. Figures 1(b) and 1(c) show all the library peptide species. The column and row headings in Figure 1(c) denote the aromatic residues (H, W, F, or Y at X1) and the numbers of amines (E or K at Z1 and Z2; number of amines = 3, 4, or 5), respectively, and each cell displays the number of the synthesized peptide. Hence, the library consists of two series (a G series and a P series) of designed peptides, each of which contains 16 systematically designed peptides.
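The combinatorial design can be enumerated in a few lines of Python, which confirms that 2 × 4 × 2 × 2 choices give 32 peptides. This is a sketch under two assumptions that are not stated in the text: the order of residues X1, Z1, and Z2 within the first three positions (the actual order is defined in Figure 1(a)), and the counting of amines as the free N-terminal amine plus the Lys side chains, which reproduces the values 3, 4, and 5. The function name is hypothetical.

```python
from itertools import product

DNA_BINDING_MOTIF = "KWK"  # C-terminal motif common to all library peptides


def build_minilibrary():
    """Enumerate the 32 designed peptides as dictionaries.

    ASSUMPTION: residues are ordered X1-Z1-Z2-(G/P)-KWK. The text only states
    that G or P occupies the fourth residue; the order of X1, Z1, and Z2 here
    is a placeholder for the layout shown in Figure 1(a).
    """
    library = []
    for hinge, aromatic, z1, z2 in product("GP", "HWFY", "EK", "EK"):
        sequence = aromatic + z1 + z2 + hinge + DNA_BINDING_MOTIF
        # ASSUMPTION: amines = free N-terminal amine + Lys side-chain amines.
        amines = 1 + sequence.count("K")
        library.append({"series": hinge, "sequence": sequence, "amines": amines})
    return library


minilibrary = build_minilibrary()
g_series = [p for p in minilibrary if p["series"] == "G"]
p_series = [p for p in minilibrary if p["series"] == "P"]
```

Under these assumptions each series contains 16 peptides, of which 4 carry three amines, 8 carry four, and 4 carry five.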

Screening of the Designed Peptide Minilibrary and Data Mining.
Library peptides were assayed for their ability to bind to four different parallel G-quadruplex sequences from the promoter regions or the 5′ untranslated regions of human protooncogenes (MYC from c-MYC, NRAS from NRAS, WNT from WNT3, and FGF from FGF3 [30][31][32]; Table 1). The binding percentages ranged from high to low. To visualize the binding properties more clearly, these results were converted into black-red-yellow images as shown in Figure 2(b). The color images correspond to the two series of peptides (Figure 1(c)), and the color of each cell indicates the binding response of each peptide against each DNA. The cells in the bottom portion of each series were colored either yellow or red, indicating that the DNAs preferentially bind cationic peptides. In addition, while the MYC G-quadruplex tended to bind to G-series peptides and the NRAS G-quadruplex tended to bind to P-series peptides, the FGF and WNT G-quadruplexes tended to bind to both peptide series. This result implies that varying the flexibility of the peptides by adding a G or P to the middle of the sequence provides DNA selectivity.
Similarities between the peptides in their ability to bind DNAs were analyzed quantitatively using HCA with Euclidean distances. As shown in Figure 3(a), the library peptides could be sorted into five groups (Groups A-E). Group A peptides exhibited relatively high binding percentages for the FGF and WNT G-quadruplexes, but lower binding percentages for the MYC and NRAS G-quadruplexes. Peptides classified into Group B showed relatively low binding percentages for all DNAs except the WNT G-quadruplex, to which they bound tightly. Peptides classified into Group C had high binding percentages for the MYC G-quadruplex, but lower binding percentages for the NRAS, FGF, and WNT G-quadruplexes. Peptides in Group D demonstrated intermediate binding percentages for all quadruplexes except MYC, to which they did not bind.
Finally, peptides classified into Group E, which are characterized by 5 amines, bound to all the DNAs tested with high binding percentages. Because peptides with 3 or 4 amines demonstrated diverse binding percentages, we concluded that a greater number of specific binders could be obtained by designing and synthesizing a wider array of peptides with 3 or 4 amines. Despite the small size of the library used in this study, we were able to obtain peptides that demonstrated sequence-specific binding. For example, our results show that peptides 009 and 010 are MYC-specific binders, and peptides 003, 007, and 015 are WNT-specific binders.
Similarities between G-quadruplex binding properties were analyzed quantitatively using the HCA method. A clustering dendrogram (Figure 3(b)) was generated by analysis of Euclidean distances. The horizontal axis indicates the distance between the binding percentages of the G-quadruplexes for library peptides (left indicates high similarity and right indicates low similarity). The clustering dendrogram discriminated MYC from the other G-quadruplexes, which tended to cluster. We suspect that some of the library peptides recognized a particular sequence (GGGCGGG) present in all the G-quadruplexes tested except for MYC. Although much additional data regarding the binding of library peptides to various DNA types are needed, these results imply that the statistical approaches used in this study could be used to characterize the binding properties of a variety of other G-quadruplexes.
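The conversion of binding percentages into black-red-yellow cells mentioned earlier in this section can be sketched as a simple piecewise-linear color ramp. The actual color scale used for Figure 2(b) is not specified in the text, so the breakpoints below (black at 0%, red at 50%, yellow at 100%) and the function name are purely illustrative assumptions.

```python
def binding_color(percent: float) -> tuple:
    """Map a binding percentage to an (R, G, B) color on a black-red-yellow ramp.

    ASSUMED scale: 0% -> black, 50% -> red, 100% -> yellow, with linear
    interpolation in between. Values outside 0-100 are clamped.
    """
    p = min(max(float(percent), 0.0), 100.0)
    if p <= 50.0:
        red = round(255 * p / 50.0)   # black to red: only the red channel rises
        green = 0
    else:
        red = 255                      # red to yellow: the green channel rises
        green = round(255 * (p - 50.0) / 50.0)
    return (red, green, 0)


# Hypothetical row of binding percentages for one peptide against four DNAs.
example_row = [binding_color(v) for v in (0, 50, 100, 120)]
```

With such a ramp, weak binders appear dark and strong binders appear yellow, which matches the qualitative reading of the images described above.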

Confirmation of DNA Binding.
We selected peptide no. 010 as a MYC-specific binder and, after the peptide was purified, conducted an electrophoresis assay to confirm its binding to MYC. (Although we also selected and purified peptide no. 009, the purified no. 009 peptide did not strongly bind to MYC (data not shown).) Figure 4 shows the bound DNA percentages for purified peptide no. 010 with various DNAs. Peptide no. 010 bound strongly to MYC, bound weakly to NRAS and WNT, and hardly bound to csMYC (complementary sequence of MYC, 5′-TCCCCACCCTCCCCACCCT-3′) or TELO (representative antiparallel G-quadruplex sequence from human telomeric DNA, 5′-AGGGTTAGGGTTAGGGTTAGGG-3′ [33]). Although the FGF G-quadruplex, like MYC, bound strongly to peptide no. 010, the data clearly suggest that peptide no. 010 bound only to parallel G-quadruplex sequences, not to the antiparallel G-quadruplex sequence or to the other sequences, including the MYC complementary sequence. Although additional assays and/or detailed confirmation experiments are needed, these results indicate that the electrophoresis assay in 3.2 is a promising tool for screening DNA-binding peptides using peptide libraries.
We performed CD experiments to investigate induction of structural and conformational changes in the MYC G-quadruplex upon interaction with peptide no. 010. The MYC G-quadruplex yielded spectra that were characteristic of parallel quadruplexes, with a maximum at 260 nm and a minimum at 240 nm (Figure 5(a)), a result that is consistent with a prior study [10]. Interestingly, addition of peptide no. 010 led to an increase in the parallel G-quadruplex signature. Furthermore, we performed CD experiments with the other G-quadruplex DNAs upon addition of peptide no. 010. The FGF G-quadruplex showed results similar to those of MYC (Figure 5(b)), whereas the NRAS and WNT G-quadruplexes did not show significant changes upon addition of peptide no. 010 (Figures 5(c) and 5(d)). These results correlated with the DNA-binding results in 3.3.

Thermodynamic Stability of the DNA G-Quadruplexes Bound to Peptide no. 010.
We investigated the effect of peptide binding on the thermodynamic stability of the DNA G-quadruplexes. Table 2 shows the Tm of the MYC and FGF G-quadruplexes in the presence and absence of 100 μM peptide no. 010. (Melting curves of MYC and FGF with/without no. 010 are shown in Figure 6.) Surprisingly, the MYC G-quadruplex Tm changed appreciably in the presence of the peptide, increasing by about 8 °C, whereas the FGF G-quadruplex Tm increased by only about 4 °C. These results imply that peptide no. 010 binds to the MYC G-quadruplex through a specific association. Although the limited size of our library did not allow us to obtain peptides with high binding selectivity, we found that peptide no. 010 acts as a promising stabilizer of the G-quadruplex structure as well as a binder to MYC.

Conclusions
In this study, we demonstrated a novel designed peptide library method to screen for binders to a particular G-quadruplex and also demonstrated the mining of data generated from binding results using statistical methods such as HCA.
Our results suggest that the use of a designed peptide library enables the discrimination of G-quadruplex sequences and could therefore provide useful information for the design of peptides for targeting specific G-quadruplexes. Despite the small size of the library used in this study, some candidates of specific binders were identified. By improving the design of the library peptides and the screening methods, our system could be used to screen for peptides that bind to a particular G-quadruplex and alter its thermodynamic properties. It would then be possible to find binders with strong specificity to which specific functional attributes can be added, such as the ability to penetrate cells in order to control DNA and/or RNA events for the purposes of cell and tissue engineering.