Peptides Trapping Dioxins: A Docking-Based Inverse Screening Approach

A rapid and cost-effective computational methodology for designing and rationalizing the selection of small peptides as receptors for dioxin-like compounds is proposed. The backbone of the dioxin Ah receptor binding site was used to design a series of penta- and hexapeptide libraries, with 1400 elements in total. Peptide flexibility was considered, and 10 conformers were found to be a good option to represent peptide conformational space with a fair speed-accuracy ratio. Each peptide conformer was treated as a possible receptor: a dedicated box was generated, and a docking process was then run using as ligands a family of 76 dibenzo-p-dioxins and 113 dibenzofurans, mono- and polychlorinated. The significance of the predictions was confirmed by comparing the primary structure of top and bottom ranked peptides binding dioxins, which showed that scrambled positions of the same amino acids gave completely different predicted binding. The hexapeptide EWFQPW, with the best binding score, was chosen as selective sorbent material in solid-phase extraction. The retention performances were tested using 2,3,7,8-tetrachlorodibenzo-p-dioxin and two polychlorinated biphenyls in order to verify the hexapeptide specificity. The solid-phase extraction experimental procedure was optimized, and the analytical parameters of the hexapeptide sorbent material were compared with those of the resin without hexapeptide and of a commercial reversed phase cartridge.


Introduction
The group of polychlorodibenzofurans (PCDFs) and polychlorodibenzo-p-dioxins (PCDDs), referred to as dioxins and dioxin-like compounds, belongs to the chlorinated hydrocarbons family. The dioxins family, considering only molecules having chlorine as ring substituent, has over 200 compounds. They are highly toxic substances and well-known environmental pollutants and carcinogens [1]. Previous works have reported associations between dioxin-like compound levels and allergic symptoms, alterations in immune functions, and increased risk of infections in infancy [2,3]. Other papers relate cancer to high levels of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) [4][5][6].
Dioxin analysis is usually carried out by high resolution gas chromatography combined with high resolution mass spectrometry [7][8][9][10][11]. Sample purification is a very important step, as reported by many authors, and preanalytical cleanup and preconcentration procedures are usually required before analysis to separate the different families of pollutants, avoiding interferences during instrumental determination [7][8][9][10][11][12][13]. Solid-phase extraction (SPE) is commonly used for the cleanup of complex samples [14,15]. Traditional SPE sorbents range from reversed phases to ion exchange and polymeric materials [16][17][18]. These materials are not selective and can result in the coelution of interfering compounds with similar polarity. The matrix effect can be a serious problem, affecting the reliability of the analytical method. In recent years, computationally and rationally selected peptides have been successfully used as selective traps for different ligands, producing new preanalytical or analytical systems with very cost-effective results [14][19][20][21][22][23][24][25][26]. In fact, preparing sorbent materials with peptides is easy and cheap, especially when the peptides are directly attached to a solid support [23][24][25].
In this work, the design and optimization of artificial peptides generated by mimicking the biological Ah dioxin receptor (AhR) are proposed by means of a docking-based inverse screening approach. The highly selective properties of the Ah receptor make it an ideal target for mimicking dioxin recognition. However, the structure of the AhR-dioxin complex is not yet known, and this limits the understanding of the mechanism of the bioassay based on this biorecognition element [27].
The goal of this methodology was to obtain a small set of peptides that could be suitable candidates for an experimental phase. The virtual screening approach was based on the idea of making a rough selection of small peptides and filtering them afterwards, thus allowing the analysis of large numbers of structures by reducing computational time. To avoid secondary structure motifs that would increase system complexity, the maximum length of the peptides was limited to six amino acids. As reported in previous work [19], this restriction in sequence length was necessary in order to reduce calculation time, thus allowing the analysis of a larger number of possible structures. The significance of the predictions was confirmed by comparing the primary structure of top ranked peptides binding dioxins with that of the bottom ranked ones, which showed that scrambled positions of the same amino acids gave completely different predicted binding.
To show that the computational method can in fact be applied to calculate relative binding affinities for the featured peptides and dioxin derivatives, the top ranked hexapeptide binding both PCDFs and PCDDs was selected for a preliminary experimental test. The hexapeptide was used as sorbent material in SPE, and the retention performances were tested using TCDD and two polychlorinated biphenyls (PCBs) in order to verify the peptide specificity. The aim was to verify the possible use of small peptides in selective extraction and purification procedures. The resulting data were compared with those obtained from the resin without hexapeptide and from a commercial reversed phase cartridge.

Peptides were generated with Hyperchem 8.0.5 software in zwitterionic form, using only the 20 natural amino acids. Different tools from the OpenEye Scientific Software package were used under academic license. The ligands library was designed by converting standard IUPAC names into structures with the LEXICHEM package. Geometry optimization was carried out using SZYBKI 1.5.1 with default parameters [28,29]. Conformers for each peptide were generated with OMEGA 2.4.3 [30,31]. The boxes and rigid body docking processes were performed using FRED 2.2.5 [32,33]. VIDA 4.2.1 was used for structure visualization and for the analysis of molecular surfaces and interatomic distances [34]. AutoIT V3, a BASIC-like scripting language, was used to automate program runs and other tasks.

Materials and Methods
The ligands library was made of 113 PCDFs and 76 PCDDs, totaling 189 structures. All structures were visualized and checked to guarantee their accuracy in terms of valence, bond order, bond angles, and geometrical arrangement of atoms.
Boxes defining the active site were generated for each peptide conformer. The box size ranged from 4500 to 8000 Å³ in 95% of all cases. The peptide was placed inside the box, so that the whole receptor structure was considered as a possible binding site for ligands. The time required for each peptide conformer, from the initial design to final docking, was about 5 minutes.
The screening process was subdivided into 3 steps; in each one a library was generated and receptor-ligand binding scores were calculated. An independent docking process was carried out for each of the three libraries. A total of 1400 structures were tested: 1000 pentapeptides and 400 hexapeptides. The three screening processes are described as follows.
(1) A first library of 400 pentapeptides was generated by mutating, in a combinatorial way, the first and last positions of the pentapeptide sequence (NFQGR) proposed for the dioxin Ah receptor [27], using each of the 20 natural amino acids.
(2) A second library of 600 pentapeptides was created by mutating the glycine in the 4th position with the remaining 19 natural amino acids in each of the top 30 ranked pentapeptides from the first screening step. The objective here was to substitute glycine with amino acids bearing side chains in order to increase the probability of good interactions between receptors and ligands.
(3) A final library was formed by 400 hexapeptides generated by inserting, alternately, each of the 20 natural amino acids at the beginning and the end of the top 10 ranked pentapeptides from the second step.

Experimental
Ultrapure water was produced by a Milli-Q Plus apparatus from Millipore (USF ELGA LabWater, UK). The peptide sequence was synthesized on the W-Nova Syn TGA resin by EspiKem (Italy). The SPE sorbent materials were as follows:
(i) resin A (peptide): EWFQPW-Nova Syn TGA resin with a purity of 90% and with a peptide substitution level of 0.17 mmol/g,
(ii) resin B (blank): W-Nova Syn TGA with only the first amino acid of the sequence.
The cartridges (volume 1 mL) were packed with 30 mg of modified peptide or blank resin, with a Teflon frit on the bottom. A second frit was used to cover the resin in the cartridge. After loading, the cartridges were conditioned and equilibrated by washing with ethanol. During this procedure, the cartridges were continuously shaken in order to obtain a homogeneous packing.
Before their application, the cartridges were swelled and dried with 10 conditioning and washing cycles to activate the sorbent material. After the optimization of loading and detection conditions, the extraction procedure was performed in five steps as follows: (i) conditioning of the stationary phase with 1 mL of methanol, (ii) equilibrating with 1 mL of H [...] (v) elution with 1 mL of methanol, repeated three times.
Loading, washing, and elution fractions were collected and then analyzed by HPLC/UV for TCDD or by isotope dilution gas chromatography-high-resolution mass spectrometry (GC-HRMS-ID) for PCBs, adapting procedures reported in other works [35,36]. TCDD analysis was carried out on a Perkin Elmer Series 200 liquid chromatograph (Perkin Elmer, Italy) equipped with autosampler, pump, degasser, and UV-Vis detector set at 235 nm. The column used was a Supelcosil LC-18, 250 × 4.6 mm, 5 μm. The column flow rate was 1 mL/min, and the injection volume was 10 μL. The mobile phase was water 30% and methanol 70%.
PCB analysis was performed using a Thermo Trace GC Ultra coupled with a Thermo DFS analyzer. The analytes were separated on a capillary column (Restek PCB HT8, 60 m × 0.25 mm id) and quantified using selected-ion monitoring high-resolution (10,000 resolving power) mass spectrometry (HRGC-ID/HRMS). Quantification was by isotope dilution mass spectrometry using calibration standards containing 13C-labeled and unlabeled analytes. The following column temperature program was used: 90 °C isothermal for 1 min, 22.5 °C/min to 180 °C, 2.8 °C/min to 285 °C, and 11.7 °C/min to 320 °C. The injection volume was 1 μL (for blank, standard solution, sample, and sample + standard); using the same injection syringe, 0.2 μL of an alkane homologue series (C10-C22) was injected by autosampler for retention index calculation. The quantities of PCBs were determined by comparing their peak areas with those of the corresponding standards and then correcting for the areas of the respective internal standards (IS). The chromatographic peaks were acquired and analyzed with a data system program (Quandesk™ 2.1). The mass spectrometer was operated in the selected ion monitoring (SIM) mode. The following masses were measured for each chlorination level of the analyzed PCBs: the molecular ion in positive mode (M) and M + 2 for PCB 52, and M + 2 and M + 4 for PCB 101. All peaks were identified by using the retention time and relative ion intensities.
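The text does not state which retention index formula was applied to the C10-C22 alkane series. As an illustration only, the sketch below assumes the linear (temperature-programmed) retention index, in which the analyte retention time is interpolated between the two bracketing n-alkanes. The function name and the retention times in the usage note are hypothetical and are not taken from this work.

```python
from bisect import bisect_right


def linear_retention_index(t_x, alkane_times):
    """Linear retention index of a peak eluting at t_x (min).

    alkane_times maps carbon number -> retention time (min) of the
    n-alkane standards; times are assumed to increase with carbon number.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if not times[0] <= t_x <= times[-1]:
        raise ValueError("peak is outside the range bracketed by the alkanes")
    # Index of the last alkane eluting at or before the peak,
    # clamped so that a following alkane always exists.
    i = min(bisect_right(times, t_x) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    # Interpolate between the bracketing alkanes (n_next - n is 1 for a
    # complete homologous series).
    return 100.0 * (n + (n_next - n) * (t_x - t_n) / (t_next - t_n))
```

For example, with hypothetical alkane retention times `{10: 5.0, 11: 7.0, 12: 10.0}`, a peak at 6.0 min lies halfway between C10 and C11 and is assigned an index of 1050.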

Results and Discussion
Conformational Analysis
A conformational analysis was initially carried out in the first pentapeptides library, on five receptor-ligand complexes ranked 9th, 30th, 154th, 203rd, and 381st in docking dioxins.
The pentapeptides were chosen arbitrarily, but taking into account their chemical composition and ranking. For each of them, 1, 3, 5, 10, 50, and 100 conformers were generated with OMEGA in independent program runs. This conformational study was carried out to determine the minimum number of conformers required to obtain a statistically reliable computed binding score.
Figure 1 reports the binding score, expressed as a percentage, of the 5 pentapeptides versus all the 189 ligands, using different numbers of conformers. The standard deviations ranged from 10 to 20% depending on the number of conformers.
The results showed that 10 conformers were a good option to represent the peptide conformational space with a fair speed-accuracy ratio. The overall tendency remained unchanged when using more conformers. The divergence between the average binding score obtained with 10 conformers and the scores obtained with the other conformer counts was less than 20% in the worst case. Because a significant divergence was observed between the scores obtained with and without conformers, even though the trend remained unchanged, 10 conformers were generated for each peptide in every library and used in all the docking processes in order to reduce false positives and false negatives.
A similar conformational trend was also observed with a different family of ligands in a previous work [19], supporting the possibility of adequately representing the conformational space with 10 conformers under these modeling conditions.
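The comparison described above can be sketched as follows. This is a minimal illustration, assuming that the average binding score of a peptide is the plain mean of the docking scores over its conformers and over all ligands, and that divergence is expressed as a percentage of the 10-conformer reference. The function names and the nested-list data layout are hypothetical, and the snippet does not call OMEGA or FRED.

```python
from statistics import mean


def average_score(scores_by_conformer, n_conformers):
    """Mean docking score over the first n conformers and all ligands.

    scores_by_conformer[i][j] is the score of conformer i against ligand j.
    """
    selected = scores_by_conformer[:n_conformers]
    return mean(score for conformer in selected for score in conformer)


def relative_divergence(scores_by_conformer, n_ref=10, counts=(1, 3, 5, 10)):
    """Percent divergence of each conformer count from the reference count."""
    ref = average_score(scores_by_conformer, n_ref)
    return {
        n: 100.0 * abs(average_score(scores_by_conformer, n) - ref) / abs(ref)
        for n in counts
    }
```

With real data, `scores_by_conformer` would hold one row of 189 ligand scores per conformer, and a divergence below 20% for every count would correspond to the worst case reported here.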

Docking-Based Inverse Screening Results
The virtual screening process aimed to reduce the number of structures to be tested; in each step a cut-off value was established for candidate selection. In practical terms, only peptides with binding scores within the lower 30% of the average binding score variability for the first screening, and within the lower 15% for the second and third screenings, were considered for further analysis.
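The cut-off rule can be illustrated with a short sketch. It assumes that "binding score variability" means the range between the lowest and the highest average score in a library, so that a peptide is kept when its score lies within the lowest fraction of that range; the text does not define the term more precisely. The function name and the scores in the usage note are hypothetical.

```python
def select_by_cutoff(avg_scores, fraction):
    """Return peptides whose average score lies in the lowest `fraction`
    of the score range (lower score = more stable complex), best first.

    avg_scores maps peptide sequence -> average binding score.
    """
    lowest = min(avg_scores.values())
    highest = max(avg_scores.values())
    # Assumed reading of the cut-off: a threshold placed at a fixed
    # fraction of the observed score range, measured from the best score.
    threshold = lowest + fraction * (highest - lowest)
    kept = [p for p, s in avg_scores.items() if s <= threshold]
    return sorted(kept, key=avg_scores.get)
```

For instance, with hypothetical scores `{"PEP1": -10.0, "PEP2": -8.0, "PEP3": -5.0, "PEP4": 0.0}` and `fraction=0.30`, the threshold is −7.0, and only `PEP1` and `PEP2` are retained.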
Figure 2 summarizes the results obtained in the three virtual screening processes. The overall trend was toward lower binding scores in each successive screening stage, meaning that the receptor-ligand complexes obtained were more stable. The upper curve, corresponding to the first screening step, showed a small number of peptides with negative scores. Just 30 structures were found within the lower 30% of binding score variability, corresponding to 7.5% of the first pentapeptides library. In the second screening, 10 structures were found with negative binding scores within the lower 15% of total binding score variability, representing 1.7% of the 600 pentapeptides analyzed. In the last screening step, more than 90% of the peptides had negative binding scores, but only 10 hexapeptides had scores within the lower 15% of score variability, representing 2.5% of the 400 hexapeptides library. Table 1 reports the complete set of peptides selected after each screening step and used as lead compounds for the subsequent modeling process. The table shows that successive modifications to the peptide sequences tended to improve binding scores by about 6 units in each independent step. In general terms, peptides docked PCDFs better than PCDDs, even though the binding pockets and the amino acids participating in the interaction were the same.
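The library sizes behind these percentages follow from the combinatorial design described in Materials and Methods. The sketch below enumerates the three libraries under two stated assumptions: the second library keeps the parent glycine-containing sequence alongside its 19 substitutions (which gives 20 sequences per parent and reproduces the reported total of 600), and the third library adds one residue either at the N-terminus or at the C-terminus. The function names are hypothetical; this is not the authors' AutoIT workflow.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes, 20 natural amino acids


def first_library(core="FQG"):
    """Vary the first and last residues around the fixed core of NFQGR."""
    return [a + core + b for a, b in product(AMINO_ACIDS, repeat=2)]


def second_library(parents, position=3):
    """Place each amino acid at the 4th position (0-based index 3).

    Assumption: the parent residue is included, giving 20 per parent.
    """
    return [p[:position] + a + p[position + 1:]
            for p in parents for a in AMINO_ACIDS]


def third_library(parents):
    """Add each amino acid at the beginning or at the end of each parent."""
    n_terminal = [a + p for p in parents for a in AMINO_ACIDS]
    c_terminal = [p + a for p in parents for a in AMINO_ACIDS]
    return n_terminal + c_terminal
```

Applied to 30 parents, `second_library` yields 600 sequences, and `third_library` applied to 10 parents yields 400, so the selected fractions are 30/400 = 7.5%, 10/600 ≈ 1.7%, and 10/400 = 2.5%.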

Ligands and Receptors Structural Analysis
The effect of chlorine substituents on peptide complex affinity was analyzed. Figure 3 reports a scatter plot of average binding score versus number of substituents, for both PCDDs and PCDFs in each screening step. In all cases there was a linear correlation between these two factors, with R² > 0.88 at the 95% confidence level. Lower correlations were observed for PCDFs, and linearity decreased with peptide size. This behavior indicated that as the number of substituents increased, the complex formed with a given peptide tended to be less favorable. This tendency was slightly stronger for pentapeptides versus PCDFs, as indicated by the slopes of the curves.
This behavior was in line with the steric hindrance of the substituents, which destabilized the receptor-ligand complex and influenced the scores. However, for both PCDFs and PCDDs, the scores were about two times lower in each successive screening.
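The kind of correlation reported above can be reproduced with an ordinary least-squares fit of average binding score against number of chlorine substituents. The sketch below is a generic implementation of that fit and of the coefficient of determination; the function name is hypothetical, and the fitting software actually used is not stated in the text.

```python
def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, with R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - residual sum of squares / total sum of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

Here `x` would hold the number of chlorine substituents and `y` the average binding score for each substitution level. A positive slope means that scores become less negative, that is, less favorable, as chlorination increases.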
For top scored complexes, all 189 ligands were docked in almost the same peptide region. Two main tendencies, planar and saddle shaped, were identified in relation to the shape of the binding site (Figure 4). The orientation of ligands in the first case showed greater variability than in the latter. In all cases analyzed, however, the oxygen atoms present in both PCDDs and PCDFs were oriented toward the phenylalanine and glutamine residues, as suggested by Kobayashi et al. [27]. The simulated three-dimensional shape of the peptide-dioxin complex suggested that the peptide could be attached to the resin via its carboxyl or amino terminus without losing its binding properties. This information, supported by the calculated binding scores, could be advantageous for improving the retention performances in the experimental SPE test.

The last screening step was the starting point for selecting possible candidates to be used in experimental trials; therefore, a structural analysis was carried out, reporting the percentage occurrence of each amino acid in each peptide position for the top and bottom 50 ranked hexapeptides. The top and bottom groups comprised the structures with binding scores lower than −8.31 and higher than −1.57, respectively, each group corresponding to 12.5% of the 400 hexapeptides analyzed in the last screening step.

Table 1: Top ranked peptides selected after each screening step. The peptide-ligand complex average binding score was calculated using 10 conformers for each peptide versus all the 189 ligands. Peptides are sorted by average binding score over all the 189 ligands. The peptide rank and binding score versus the 113 PCDFs or the 76 PCDDs are also reported (rank and score columns). In parentheses, the number of structures selected after each screening stage.

Peptide
According to the structural analysis results, all amino acids were well represented in the first position. Relevant differences between top and bottom ranked peptides were positive for glutamic acid and proline but negative for asparagine, leucine, and lysine. The second position showed the greatest differences in amino acid percentages: glutamine, asparagine, and aspartic acid were well represented in top ranked peptides, and phenylalanine, serine, and threonine in bottom ranked peptides. Tryptophan was strongly represented in the second position only in top ranked peptides and in the fifth position only in bottom ranked peptides. As expected, phenylalanine, glutamine, and tryptophan had a strong presence in the third, fourth, and sixth positions, respectively, for both top and bottom peptides. In contrast, glutamine in the third, tyrosine in the fourth, and phenylalanine in the fifth position were well represented only in bottom ranked peptides. Remarkably, proline, known for its contribution to backbone shape, was found to have a very important presence in the fifth position of the top ranked peptides. Considering the total percentage occurrences, only the proline residue showed a significant difference between top and bottom ranked peptides.
That the calculations gave meaningful predictions was shown by comparing the primary structure of top and bottom ranked peptides binding dioxins, which confirmed that scrambled positions of the same amino acids gave completely different predicted binding. The virtual results cannot be correlated with a single property such as hydrophobicity, aromaticity, or polarity, but rather with the contribution of every amino acid property coupled with its spatial arrangement. In fact, the simulated binding results from the synergic cooperation of the side chain of each amino acid, which has a certain freedom to move around the carbon backbone (much larger than in a protein).
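The per-position occurrence analysis summarized in Table 2 amounts to counting, for each sequence position, how often each amino acid appears within a group of peptides. A minimal sketch, assuming equal-length sequences in one-letter code, is given below; the function name is hypothetical, and apart from EWFQPW the sequences in the usage note are made up for illustration.

```python
from collections import Counter


def positional_occurrence(peptides):
    """Percent occurrence of each amino acid at each sequence position.

    Returns one dict per position, mapping amino acid -> percentage.
    All peptides are assumed to have the same length.
    """
    length = len(peptides[0])
    total = len(peptides)
    table = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        table.append({aa: 100.0 * c / total for aa, c in counts.items()})
    return table
```

For example, for the group `["EWFQPW", "EQFQPW", "PWFQPW", "ENFQGW"]` the first position is 75% E and 25% P, while the third position is 100% F. Running the function separately on the top 50 and bottom 50 hexapeptides and subtracting the two tables would give the kind of top versus bottom differences discussed here.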

SPE Results
According to the simulated hexapeptide-dioxin binding scores, the top ranked hexapeptide EWFQPW, with a 10-conformer average score of −12.80 in binding all the 189 dioxins, was selected for a preliminary experimental test. In the virtual screening process, this hexapeptide formed the strongest complexes with both PCDFs and PCDDs, and in particular with TCDD. Moreover, the amino acids of its primary sequence were the most represented in the top 50 ranked hexapeptides but less represented in the bottom 50, as shown in Table 2. The three-dimensional shape of the peptide-dioxin complex (Figure 4) suggested that the peptide could be attached to the resin via its carboxyl or amino terminus without losing its performance, as also reported in other experimental work [15].
Before the SPE procedure setup, the detection conditions were optimized with standard solutions. In this step, a 70/30 H2O/methanol mixture was found to be a good compromise for solubility. The linear regression of the TCDD HPLC/UV calibration gave a correlation coefficient of 0.99 over a linear range from 5 to 800 nM, with a detection limit of 1 nM. The loading conditions optimized for TCDD were also used for the preparation of PCB samples. Therefore, the preliminary results were obtained based on the effective amount of solubilized analytes.
The SPE setup included conditioning, sample loading, washing, and, finally, analyte elution steps. In the loading step, 16 nM TCDD or 10 nM PCBs in a final volume of 2 mL of H2O/methanol solution (70/30) was loaded on the cartridges. In the washing step, all unbound molecules were removed from the resin by using 1 mL of H2O/methanol solution (70/30). The analyte bound to the sorbent material was recovered by using 1 mL of methanol. Two further elution steps were carried out for complete dioxin recovery. Fractions were collected and directly transferred into amber glass vials for analysis.
The binding properties of the hexapeptide (resin A) versus a blank column (resin B) and a commercial cartridge were verified with the same extraction protocol. As reported in Table 3, the hexapeptide sorbent material showed a TCDD recovery of 65%, significantly different from that of the unmodified cartridge (27%), which was packed with the corresponding nonfunctionalized resin. The commercial cartridge recovery was about 75%, slightly better. This cartridge, with a reversed phase functionalized polymeric sorbent, retains neutral, acidic, or basic compounds without any selectivity. Significant differences in specific adsorption were found between TCDD and both PCB 52 and PCB 101, confirming the dioxin selectivity of the hexapeptide.
These preliminary cross-reactivity experiments were carried out to test the affinity of the hexapeptide for PCBs, which are commonly found along with dioxins in environmental samples. Using the resin with hexapeptide, a specific recovery of TCDD with respect to both PCBs was observed. The commercial Strata-X cartridge showed higher recovery than resin A, but resin A showed much better specificity for TCDD. The specificity test demonstrated the value of this work as a binder screening methodology.
The SPE extraction was performed in triplicate, with a reproducibility of CV < 15%. Elution tests showed that 90% of the bound TCDD was recovered in the first elution step, yielding a significant analyte concentration. A volume of 2 mL was sufficient for complete analyte recovery; in fact, no analyte was found in the final elution.
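Recovery and reproducibility figures of this kind follow from two simple formulas: recovery is the eluted amount as a percentage of the loaded amount, and the coefficient of variation (CV) is the sample standard deviation as a percentage of the mean. The sketch below assumes these standard definitions; the function names and the triplicate values in the usage note are hypothetical and are not measured data from this work.

```python
from statistics import mean, stdev


def loaded_amount_pmol(conc_nM, volume_mL):
    """Amount loaded on the cartridge: nmol/L x mL = pmol."""
    return conc_nM * volume_mL


def recovery_percent(eluted_pmol, loaded_pmol):
    """Eluted analyte as a percentage of the loaded amount."""
    return 100.0 * eluted_pmol / loaded_pmol


def cv_percent(replicates):
    """Coefficient of variation using the sample standard deviation."""
    return 100.0 * stdev(replicates) / mean(replicates)
```

Loading 16 nM TCDD in 2 mL corresponds to 32 pmol, so a 65% recovery means about 20.8 pmol eluted. For hypothetical triplicate recoveries of 60, 65, and 70%, the CV is about 7.7%, which would satisfy the CV < 15% criterion.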
The preliminary SPE test allowed a good and selective recovery of dioxin, statistically comparable with that of the commercial cartridge. Further work is needed to optimize the SPE procedure in order to maximize analyte retention and to verify cross-reactivity and unspecific adsorption in real samples.

Conclusions
In this work, the application of a molecular modeling method to the study of peptide-dioxin interaction was evaluated. The virtual screening methodology offered considerable assistance in understanding the molecular interaction. According to the preliminary experimental results, the best ranked peptide designed for dioxins showed very good behavior as sorbent material for TCDD solid-phase extraction. The modest computing framework was found to be a good tool for selecting short peptides as possible receptors for the dioxins family. Despite these results, the higher ranked peptides are not necessarily the best in binding dioxins, but the proposed methodology can be used in support of experimental tests, rationalizing the choice of molecular traps and reducing it by orders of magnitude.
Further work is needed to optimize the SPE procedure in order to maximize analyte retention and to verify cross-reactivity and unspecific adsorption in real samples.

Figure 3: Effect of the number of substituents of PCDFs and PCDDs on the peptide average binding score. All three simulation screening steps are reported. The solid lines illustrate the PCDF trend, the dashed lines the PCDD trend.

Figure 4: Two examples of pentapeptides versus all 189 ligands. Top: LFQGW folded around ligands with a saddle shaped binding pocket (top right); bottom: WFQPW with an almost planar interaction surface (bottom right).
M and 31 M) in isooctane were purchased from Dr. Ehrenstorfer. For the analysis of PCBs, the fractions were spiked with a mixture of 13C12-labeled PCB 52 and PCB 101 as internal standards (EC 4979) from Cambridge Isotope Laboratories.

Table 2: Structural analysis of the top (T, bold) and bottom (B, regular) 50 ranked hexapeptides. P: position in sequence; %: percentage of amino acid occurrences in top or bottom hexapeptides.