Biomolecular cooperativity is of great scientific interest because of its role in biological processes. Two transcription factors (TFs), Oct-4 and Sox-2, are crucial in the transcriptional regulation of embryonic stem cells. In this paper, we analyze how Oct-1 (a similar POU factor) and Sox-2 interact cooperatively at their enhancer binding sites in collective motions. Normal mode analysis (NMA) is used to study the collective motions of two complexes, each involving these TFs and an enhancer. The special structure of Oct proteins is analyzed comprehensively, after which each Oct/Sox group is reassembled into two protein pairs. We then propose a segmentation idea to extract the most correlated segments in each pair, using correlations of motion magnitude curves. A median analysis of these correlation values shows the close association between subunit POUS (Oct-1) and Sox-2. Using the larger-than-median correlation values, we conduct statistical studies and propose several protein-protein cooperative modes (
Embryonic stem cells (ES cells) possess the pluripotency of differentiating into all the three germ layers (endoderm, mesoderm, and ectoderm), which correspond to hundreds of cell types. These pluripotent stem cells are transcriptionally regulated by a number of transcription factors (TFs) [
At the early stage of transcription, TFs bind to specific regulatory DNA regions to cooperatively affect the transcription sites. Enhancers, which act as activators or stimulators for transcription [
On the other hand, molecular dynamics are involved in many biological processes [
In our work, we examine the dynamics of the POU/HMG group at its enhancer binding sites, referred to as POU/HMG/DNA complexes. Two POU/HMG/DNA complexes, each consisting of the DNA-binding portions of the POU factor Oct-1 and the HMG factor Sox-2 bound to an enhancer, are studied from a structural and molecular dynamics point of view. Normal mode analysis (NMA) is used to study the collective or cooperative motions of these POU/HMG/DNA ternary complexes, after which the interaction of the POU and HMG factors at their DNA binding sites in these collective motions is explored. We propose a segmentation idea for the proteins to construct an equal-length-chain comparison and measure the correlation of each protein segment pair using the linear correlation. A statistical analysis of the significantly correlated pairs provides useful information on how these TFs exert synergistic control over enhancer DNAs in transcriptional regulation.
NMA is an efficient method to detect the most cooperative or collective motions (essential modes) of large harmonic oscillating systems. With the constraint that the studied conformations are in the vicinity of the systematic equilibrium, which exists in most harmonic oscillating systems [
Specifically, if we describe an
One broadly used construction method for the Hessian matrices is the elastic network models (ENMs) [
Each eigenvalue of a Hessian matrix constructed as above denotes the associated systematic energy for the observed system, and its corresponding eigenvector represents the direction of a specific normal mode motion. Among the obtained
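As an illustration of the eigen-decomposition described here, the following is a minimal sketch of an anisotropic elastic network Hessian built from node coordinates. It is not the NOMAD-Ref implementation: the function names `anm_hessian` and `normal_modes`, the distance cutoff, and the uniform spring constant `gamma` are assumptions made for this example, and the coordinates are random toy data rather than 1GT0 or 1O4X.

```python
import numpy as np

def anm_hessian(coords, cutoff=10.0, gamma=1.0):
    """Build a 3N x 3N elastic-network Hessian from N x 3 coordinates.

    Assumption: every pair of nodes closer than `cutoff` is joined by a
    spring with the same constant `gamma`.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue
            # Off-diagonal 3x3 block for the spring between nodes i and j.
            block = -gamma * np.outer(d, d) / dist2
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            # Diagonal blocks are minus the sum of the off-diagonal blocks.
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return hessian

def normal_modes(hessian):
    """Return eigenvalues (ascending) and eigenvectors (as columns)."""
    return np.linalg.eigh(hessian)

# Toy data: 20 random nodes; the large cutoff connects every pair.
rng = np.random.default_rng(0)
coords = rng.normal(scale=5.0, size=(20, 3))
eigvals, eigvecs = normal_modes(anm_hessian(coords, cutoff=50.0))
n_zero = int(np.sum(np.abs(eigvals) < 1e-8))
print("numerically zero modes:", n_zero)
print("first non-zero eigenvalue:", eigvals[n_zero])
```

For a connected, rigid network the six lowest eigenvalues are numerically zero because they correspond to rigid-body translations and rotations, which is consistent with the first essential mode reported in the tables of this paper being mode 7.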
Several online tools are available for normal mode calculations. An online server called NOMAD-Ref at
Two POU/HMG/DNA ternary complexes, 1GT0 and 1O4X, are downloaded from the Protein Data Bank (PDB) [
(a) The 3D structure of the POU/HMG/DNA ternary complex 1GT0. The gray protein represents the HMG factor Sox-2, and the red one is the POU factor Oct-1, which is composed of two subunits, POUS and POUHD. (b) The two reassembled protein pairs, derived from (a), for our subsequent studies.
Furthermore, each Oct protein contains two subunits (POUS and POUHD) that are connected by a flexible linker and control DNAs in a bipartite manner [
After generating the motions of the two POU/HMG/DNA ternary complexes using NMA, we observe how the two protein pairs behave at the enhancer binding sites in these most collective or cooperative motions.
For each protein pair in each ternary complex, we analyze the first 10 obtained essential modes. In each mode, we first refine an observed pair at the residue level in terms of motion magnitude. This is achieved by calculating the motion magnitudes of all the atoms in each protein and then computing the motion magnitude of each residue in this protein by averaging the motion magnitudes of all its component atoms (see (
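The averaging step described above can be sketched as follows. This is a minimal illustration only: the function name `residue_motion_magnitudes`, the flat `mode_vector` layout (x, y, z per atom), and the `residue_ids` array are assumptions for this example, not names taken from the paper or from any NMA server.

```python
import numpy as np

def residue_motion_magnitudes(mode_vector, residue_ids):
    """Average per-atom displacement magnitudes within each residue.

    mode_vector: flat array of length 3 * n_atoms (x, y, z per atom).
    residue_ids: residue identifier of each atom, length n_atoms.
    Returns one magnitude per residue, in order of first appearance.
    """
    displacements = np.asarray(mode_vector, dtype=float).reshape(-1, 3)
    atom_magnitudes = np.linalg.norm(displacements, axis=1)
    residue_ids = np.asarray(residue_ids)
    # dict.fromkeys keeps the first-appearance order of residue ids.
    ordered_ids = list(dict.fromkeys(residue_ids.tolist()))
    return np.array([atom_magnitudes[residue_ids == rid].mean()
                     for rid in ordered_ids])

# Toy example: four atoms belonging to two residues.
mode_vector = [3, 4, 0,   0, 0, 0,   0, 0, 2,   1, 0, 0]
residue_ids = [1, 1, 2, 2]
curve = residue_motion_magnitudes(mode_vector, residue_ids)
print(curve)  # atom magnitudes 5, 0, 2, 1 average to 2.5 and 1.5
```

Evaluating this function for every residue of a protein in one mode gives one motion magnitude curve for that protein and mode.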
(a) The protein pair containing Sox-2 (gray) and POUS (red) in 1GT0. (b) The refined structure of the protein pair, where nodes represent residues. In each mode, a refined protein structure in the pair corresponds to a motion magnitude curve. (c) The searching process for the most cooperative segments of length
Next, in each protein pair we observe the potential protein-protein cooperativity in these motions based on the correlations of motion magnitude functions. An effective method to measure the dependence between two quantities is the Pearson product-moment correlation coefficient [
We adopt this correlation coefficient in our studies. However, since the proteins have different lengths, we investigate the most cooperative/correlated segments within each protein pair in each mode. We introduce a segment length parameter
In each mode with a list of
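One way to picture the segment search is an exhaustive sliding-window comparison, sketched below. Several details are assumptions made for this example and are not confirmed by the text: the segment length is taken as the length parameter times the length of the shorter curve, all window positions in both curves are tried, and the pair with the largest absolute Pearson correlation is kept. The names `pearson` and `most_correlated_segments` are hypothetical.

```python
import numpy as np

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc @ xc) * (yc @ yc))
    return float(xc @ yc / denom) if denom > 0 else 0.0

def most_correlated_segments(curve_a, curve_b, ratio):
    """Find equal-length windows of the two curves with the largest |r|.

    Assumption: segment length = ratio * length of the shorter curve.
    Returns (r, start_a, start_b, segment_length).
    """
    shorter = min(len(curve_a), len(curve_b))
    seg_len = max(2, int(round(ratio * shorter)))
    best_r, best_i, best_j = 0.0, 0, 0
    for i in range(len(curve_a) - seg_len + 1):
        for j in range(len(curve_b) - seg_len + 1):
            r = pearson(curve_a[i:i + seg_len], curve_b[j:j + seg_len])
            if abs(r) > abs(best_r):
                best_r, best_i, best_j = r, i, j
    return best_r, best_i, best_j, seg_len

# Toy curves: curve_b is a negatively scaled copy of part of curve_a.
rng = np.random.default_rng(1)
curve_a = rng.normal(size=30)
curve_b = -2.0 * curve_a[5:25] + 1.0
for ratio in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5):
    print(ratio, most_correlated_segments(curve_a, curve_b, ratio))
```

Keeping the sign of the best correlation, rather than only its absolute value, allows both positively and negatively correlated segment pairs to be reported.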
Now, we use medians in (
We subsequently examine the relationship between the two protein pairs in each complex based on these logic matrices. The idea is to determine whether, in a single essential mode, only one significantly correlated segment pair (in either protein pair 1 or pair 2) is involved or both pairs are involved. To balance the segment lengths (
This diagram shows all the possibilities of the length pair (
To fulfill the aforementioned operation, we conduct several iterations for all the
To compare the scenarios in which different filters are applied, we apply the first tertile, the first quartile, and the mean value in turn as filters and investigate the corresponding results. The mean filter can be described as (
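The four filters can be compared with a small thresholding sketch. The assumptions here are that each filter is computed over all absolute correlation values passed in and that a value counts as significant when it is strictly larger than the threshold; the function names `filter_threshold` and `logic_matrix` are hypothetical. The sample values are the first four modes of the first two rows of the correlation table for pair 1 of 1GT0.

```python
import numpy as np

def filter_threshold(abs_corr, kind="median"):
    """Threshold over all absolute correlation values for a given filter."""
    values = np.asarray(abs_corr, dtype=float).ravel()
    if kind == "median":
        return float(np.median(values))
    if kind == "tertile":   # first tertile
        return float(np.quantile(values, 1.0 / 3.0))
    if kind == "quartile":  # first quartile
        return float(np.quantile(values, 0.25))
    if kind == "mean":
        return float(values.mean())
    raise ValueError(f"unknown filter: {kind}")

def logic_matrix(corr, kind="median"):
    """Boolean matrix: True where |correlation| exceeds the filter value."""
    abs_corr = np.abs(np.asarray(corr, dtype=float))
    return abs_corr > filter_threshold(abs_corr, kind)

# Rows: length parameters 1 and 0.9; columns: modes 7 to 10.
corr = [[-0.696, 0.606, -0.477, 0.324],
        [-0.738, 0.711, 0.697, 0.340]]
for kind in ("median", "tertile", "quartile", "mean"):
    print(kind, filter_threshold(np.abs(corr), kind))
    print(logic_matrix(corr, kind))
```

Lower thresholds such as the first quartile mark more segment pairs as significant than the median does, which is worth keeping in mind when comparing occurrence counts across filters.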
Finally, to gain deeper insight into the motions of these two complexes, we have also observed the rotation angles of the corresponding protein chains. In the above discussion, we regard residues as the basic units of protein sequences; here we consider the links between each two consecutive residues (Figure
For each protein in an observed pair of a ternary complex, we calculate the motion magnitude functions (
(a) The motion magnitude curves in mode 7 for proteins POUHD (red) and Sox-2 (purple) in pair 1 of 1GT0. (b) The motion magnitude curves in mode 7 for proteins POUS (blue) and Sox-2 (purple) in pair 2 of 1GT0.
After defining a list of
Motion correlations between POUHD and Sox-2 in protein pair 1 of 1GT0.
| Segment length parameter | Mode 7 | Mode 8 | Mode 9 | Mode 10 | Mode 11 | Mode 12 | Mode 13 | Mode 14 | Mode 15 | Mode 16 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | −0.696 | 0.606 | −0.477 | 0.324 | −0.265 | 0.383 | 0.202 | −0.326 | −0.382 | −0.520 |
| 0.9 | −0.738 | 0.711 | 0.697 | 0.340 | −0.419 | 0.772 | 0.454 | −0.429 | −0.420 | 0.739 |
| 0.8 | −0.853 | 0.797 | 0.819 | 0.342 | 0.651 | 0.838 | −0.607 | −0.463 | 0.569 | 0.730 |
| 0.7 | −0.859 | 0.836 | 0.820 | −0.445 | 0.639 | 0.820 | −0.621 | −0.562 | −0.716 | 0.769 |
| 0.6 | −0.856 | 0.851 | 0.850 | 0.537 | 0.814 | 0.849 | −0.699 | −0.684 | 0.762 | 0.806 |
| 0.5 | −0.865 | −0.862 | 0.856 | −0.757 | 0.858 | −0.901 | −0.733 | −0.761 | 0.806 | 0.819 |
The larger the absolute value of the correlation, the more strongly the two compared segments are correlated, either positively or negatively. Now we examine how the absolute correlation values
(a) and (b) show the distributions of absolute correlation values
Next, we use the above-mentioned medians as a filter and investigate how those
Parts of the results of cooperative modes
Mode
Statistics on the occurrences of the cooperative modes
| Complex | | | | | | |
|---|---|---|---|---|---|---|
| 1GT0 | 82 | | 82 | | 98 | |
| | 36 | 46 | 49 | 33 | 35 | 63 |
| 1O4X | 85 | | 85 | | 95 | |
| | 62 | 23 | 75 | 10 | 68 | 27 |
Furthermore, we divide the modes
We have also applied the first tertile, the first quartile and the mean value as filters and similarly conducted the statistical analysis as illustrated above. Tables
Statistics on the occurrences of the cooperative modes
| Complex | s1 | | s2 | | | |
|---|---|---|---|---|---|---|
| 1GT0 | 71 | | 71 | | 145 | |
| | 28 | 43 | 47 | 24 | 70 | 75 |
| 1O4X | 77 | | 71 | | 139 | |
| | 57 | 20 | 62 | 9 | 98 | 41 |
Statistics on the occurrences of the cooperative modes
| Complex | | | | | | |
|---|---|---|---|---|---|---|
| 1GT0 | 62 | | 62 | | 190 | |
| | 26 | 36 | 33 | 29 | 85 | 105 |
| 1O4X | 70 | | 64 | | 182 | |
| | 47 | 23 | 52 | 12 | 119 | 63 |
Statistics on the occurrences of the cooperative modes
| Complex | | | | | | |
|---|---|---|---|---|---|---|
| 1GT0 | 66 | | 78 | | 108 | |
| | 33 | 33 | 43 | 35 | 44 | 64 |
| 1O4X | 73 | | 85 | | 107 | |
| | 52 | 21 | 72 | 13 | 77 | 30 |
Subsequently, we calculate the rotation angle functions for each protein in each complex in the first 10 essential normal modes (described in Section
(a) displays the rotation angle curves for proteins POUHD (red) and Sox-2 (purple) in pair 1 of 1GT0 in mode 7; (b) shows the rotation angle curves for proteins POUS (blue) and Sox-2 (purple) in pair 2 of 1GT0 in mode 7.
Since the rotation angle functions are noisy, we apply principal component analysis (PCA) to the 10 rotation angle curves of each protein in the two complexes to obtain the first principal component (PC), leading the rotation angle curves (
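A minimal sketch of extracting a first principal component curve from a stack of noisy curves is given below. The convention used is an assumption: the 10 curves are treated as observations (rows), the data are centered across modes, and the first right singular vector is returned as a curve over chain positions. The function name `first_pc_curve` and the synthetic data are hypothetical, and the sign of a principal component is arbitrary.

```python
import numpy as np

def first_pc_curve(curves):
    """First principal component of a (n_modes, n_points) stack of curves.

    Returns the unit-norm PC curve and the fraction of variance it explains.
    """
    data = np.asarray(curves, dtype=float)
    centered = data - data.mean(axis=0, keepdims=True)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values[0] ** 2 / np.sum(singular_values ** 2)
    return vt[0], float(explained)

# Synthetic data: 10 noisy, differently scaled copies of one base curve.
rng = np.random.default_rng(2)
base = np.sin(np.linspace(0.0, 3.0 * np.pi, 40))
weights = np.arange(1, 11, dtype=float)
curves = np.outer(weights, base) + rng.normal(scale=0.05, size=(10, 40))
pc_curve, explained = first_pc_curve(curves)
alignment = abs(pc_curve @ base) / np.linalg.norm(base)
print("variance explained:", explained)
print("alignment with base curve:", alignment)
```

Because the component is defined only up to sign, correlations computed between PC curves of two proteins are meaningful in magnitude, while their sign depends on the sign convention chosen for each component.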
Correlations between PC curves of rotation angle functions for the two protein pairs in 1GT0 and 1O4X.
| Segment length parameter | 1GT0, Pair 1 | 1GT0, Pair 2 | 1O4X, Pair 1 | 1O4X, Pair 2 |
|---|---|---|---|---|
| 1.0 | 0.682 | −0.278 | −0.875 | −0.310 |
| 0.9 | 0.836 | −0.339 | −0.884 | −0.764 |
| 0.8 | 0.844 | −0.354 | −0.884 | −0.821 |
| 0.7 | 0.854 | −0.328 | −0.884 | −0.829 |
| 0.6 | 0.863 | −0.498 | −0.890 | −0.856 |
| 0.5 | 0.876 | 0.703 | −0.915 | −0.859 |
Now we apply the Fourier transform to analyze these noisy rotation angle values. We simply take the magnitudes of the transformed signals as our new data. The segmentation and correlation calculations are then performed, after which the statistical analysis is carried out. As an example, we use the first quartile as a filter for the correlations of rotation angle functions. The results are listed in Table
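The transformation step can be sketched with a discrete Fourier transform, as below. Using the real-input transform `numpy.fft.rfft`, which keeps only the non-redundant half of the spectrum, is an assumption of this example; the text does not state which variant or normalization was used. The function name `fourier_magnitudes` and the test signal are hypothetical.

```python
import numpy as np

def fourier_magnitudes(curve):
    """Magnitudes of the discrete Fourier transform of a real-valued curve."""
    return np.abs(np.fft.rfft(np.asarray(curve, dtype=float)))

# Toy signal: a pure cosine with 3 cycles over 32 samples.
n = np.arange(32)
signal = np.cos(2.0 * np.pi * 3.0 * n / 32.0)
magnitudes = fourier_magnitudes(signal)
print(len(magnitudes), int(np.argmax(magnitudes)))
```

The resulting magnitude curves can then be passed through the same segmentation, correlation, and filtering steps as the motion magnitude curves.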
Statistics on the occurrences of the cooperative modes
| Complex | | | | | | |
|---|---|---|---|---|---|---|
| 1GT0 | 108 | | 108 | | 144 | |
| | 108 | 0 | 108 | 0 | 144 | 0 |
| 1O4X | 48 | | 48 | | 204 | |
| | 48 | 0 | 48 | 0 | 204 | 0 |
In this paper, we performed NMA to study the collective motions of two TFs, Oct-1 and Sox-2, at their enhancer binding sites, aiming to gain insight into how these two TFs cooperate through the dynamics of their enhancer-bound complexes. Based on the special structure of Oct proteins, we treated an Oct/Sox group as two protein pairs and comparatively investigated how these two pairs behave in the collective motions. A segmentation idea was introduced to explore the most correlated segments in each protein pair, according to the correlations of motion magnitude curves (or their segments). A median analysis of these correlations was conducted, which shows the leading role of subunit POUS (pair 2). Furthermore, based on statistics of the correlated segment pairs having a correlation value above the corresponding median, we proposed several motion cooperative modes (
Cooperativity, in protein-DNA [
The authors declare that there is no conflict of interest regarding the publication of this paper.
This work is supported by the City University of Hong Kong (Project 7002843).