Application of Symmetry Adapted Function Method for Three-Dimensional Reconstruction of Octahedral Biological Macromolecules

A method for three-dimensional (3D) reconstruction of macromolecular assemblies, the octahedral symmetry-adapted function (OSAF) method, is introduced in this paper, and a series of formulations for reconstruction by the OSAF method are derived. To verify the feasibility and advantages of the method, two macromolecules with octahedral symmetry, the heat shock protein Degp24 and the Red-cell L Ferritin, were used as examples for reconstruction by the OSAF method. The simulation was designed as follows: 2000 randomly oriented projections of single particles with predefined Euler angles and centers of origin were generated, and noise was then added at different levels, that is, signal-to-noise ratios (S/N) of 0.1, 0.5, and 0.8. The structures reconstructed by the OSAF method were in good agreement with the standard models, and the relative errors of the reconstructed structures with respect to the standard structures were very small even at the highest noise level. These results show that the OSAF method is a feasible and efficient approach for reconstructing the structures of macromolecules and is able to suppress the influence of noise.


Introduction
The determination of the three-dimensional (3D) structures of macromolecular assemblies plays a key role in understanding their functions and properties. Over the last several decades of single-particle structure reconstruction, cryo-electron microscopy (referred to as "CryoEM") has been successfully used to solve 3D structures at subnanometer resolution [1][2][3][4][5][6] and even at near-atomic resolution, such as the 3.8 Å structures of the CPV [7] and rotavirus [8], the 4.2 Å GroEL structure [9], and the 4.5 Å Epsilon 15 bacteriophage structure [10]. Such assemblies are often either too large or too heterogeneous to be studied by conventional X-ray crystallography and nuclear magnetic resonance (NMR) [11,12]. CryoEM is therefore considered an indispensable approach for determining the 3D structures of macromolecular complexes.
Many software packages for 3D reconstruction have been developed in laboratories worldwide, such as EMAN [13], FREALIGN [14] using the direct Fourier inversion method, MRC [15] using the Fourier-Bessel synthesis method, and programs using the spherical symmetry-adapted function (SAF) method [16,17]. The SAF method was first recognized as a more efficient method by Crowther in a pioneering paper [18] three decades ago. Provencher and Vogel implemented 3D reconstruction by the SAF method using both simulated structures and biological objects as samples [19,20]. Zheng et al. used an icosahedral SAF (ISAF) method to determine the structures of viruses from solution X-ray scattering data [21]. Navaza [16] systematically developed formulations for the 3D reconstruction of icosahedral viruses by the ISAF method, including ab initio determination of the origins and orientations of particles and interpolation of data in reciprocal space. Recently, our group [17] reconstructed biological objects with icosahedral symmetry (HBV, etc.) by the icosahedral SAF approach, which showed that the SAF method is an efficient way to reduce the influence of noise and achieve high resolution because it makes full use of the symmetry operations of the object being studied.
To date, the SAF method has been used only for the reconstruction of macromolecules with icosahedral symmetry. However, macromolecules in nature exhibit a variety of symmetries other than icosahedral, such as octahedral (small heat shock protein hsp16.5 [22,23], heat shock protein Degp24 [24], and Hepatitis B small surface antigen particles (HBsAg) [25]), tetrahedral (small heat shock protein ACR1 [26]), dihedral (auxilin-bound clathrin coat [27], catalase, glutamine synthetase, and ribulose bisphosphate carboxylase/oxygenase), and so forth. We would therefore like to extend the SAF method to the reconstruction of objects with other symmetries, for example, octahedral, tetrahedral, and dihedral symmetries.
In this paper, we concentrate on the octahedral symmetry-adapted function (OSAF) method, and a series of formulations for the 3D reconstruction of macromolecules with octahedral symmetry are derived. To verify the feasibility and advantages of this approach, simulated data for two proteins with octahedral symmetry, the heat shock protein Degp24 (3cs0.pdb) [24] and the Red-cell L Ferritin (1rcc.pdb) [28], both downloaded from the Protein Data Bank (PDB), have been reconstructed by the OSAF method at high resolution. The results demonstrate that the OSAF method can retrieve the 3D structures of objects with octahedral symmetry at high resolution and effectively suppress the influence of noise.

Principle of the OSAF Method
To make the OSAF method for 3D reconstruction easier to follow, its principle is described briefly below.

Octahedral Symmetry-Adapted Functions (OSAFs).
Since SAFs are linear combinations of the spherical harmonics $Y_{l,m}(\theta, \phi)$ [29], the major problem is to find the coefficients $B^{\mu}_{l,m}$ of the OSAF (see expression (1)).
Following the conventional definition, we choose the X, Y, and Z axes of the Cartesian coordinate system along the three 4-fold axes of the octahedral symmetry group, and the relationship between the Cartesian and spherical polar coordinates follows the usual convention. Consequently, the OSAF can be written as
$$O_{l,\mu}(\theta, \phi) = \sum_{m} B^{\mu}_{l,m}\, Y_{l,m}(\theta, \phi), \tag{1}$$
where $O_{l,\mu}(\theta, \phi)$ represents the OSAFs, $Y_{l,m}(\theta, \phi)$ denotes the normalized spherical harmonics, $B^{\mu}_{l,m}$ is the combination coefficient with $m \equiv 0 \pmod 4$, as required by the Z axis lying along a 4-fold axis, and $\mu(l)$ is the multiplicity of a given order $l$ of an OSAF. All $\mu(l)$ for $l \le 11$ are listed in Table 1; for $l \ge 12$, the values of $\mu(l)$ can be obtained by
$$\mu(l) = \mathrm{int}(l/12) + \mu(\mathrm{mod}(l/12)), \tag{2}$$
where $\mathrm{int}(l/12)$ is the integer part of $l/12$ and $\mathrm{mod}(l/12)$ is the remainder of $l$ divided by 12.
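The multiplicities can be cross-checked independently of Table 1 by counting invariants with group characters. The sketch below assumes that the OSAFs are the invariants of the octahedral rotation group of order 24 (consistent with the 24 symmetry operations quoted later in this paper); the function names `character` and `multiplicity` are illustrative and not part of the authors' program.

```python
import math

# Conjugacy classes of the octahedral rotation group (order 24),
# given as (number of elements, rotation angle in radians).
O_CLASSES = [
    (1, 0.0),               # identity
    (8, 2 * math.pi / 3),   # rotations about the 3-fold axes
    (3, math.pi),           # half-turns about the 4-fold axes
    (6, math.pi / 2),       # quarter-turns about the 4-fold axes
    (6, math.pi),           # half-turns about the 2-fold (edge) axes
]

def character(l, angle):
    """Character of a rotation by `angle` in the order-l representation."""
    if abs(angle) < 1e-12:
        return 2 * l + 1
    return math.sin((l + 0.5) * angle) / math.sin(angle / 2)

def multiplicity(l):
    """Number of independent octahedrally invariant harmonics of order l."""
    total = sum(n * character(l, angle) for n, angle in O_CLASSES)
    return round(total / 24)

if __name__ == "__main__":
    for l in range(0, 25):
        print(l, multiplicity(l))
```

Under this assumption the counts for l ≥ 12 follow the rule stated above, that is, the integer part of l/12 plus the multiplicity of the remainder of l divided by 12.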
To calculate the OSAFs, one should first calculate the normalized spherical harmonics by formula (3), in which $P^{m}_{l}(\cos\theta)$ is the associated Legendre polynomial. It is then essential to solve for the combination coefficients in (1) to obtain the OSAFs. Several methods are available for obtaining the high-order OSAFs, such as the general method [30], the algebraic method [31], and the recursive approach [32]. We prefer the recursive method (4), since it is less sensitive to computational errors and more stable for reaching higher-order OSAFs, as pointed out by Schmidt [32]; in (4), $\gamma(l_1, l_2, l_1 + l_2)$ is a normalization constant. According to (4), any high-order OSAF can be generated from only three lower-order seed OSAFs with $l = 4, 6, 9$. The coefficients $B^{\mu}_{l,m}$ of the seed OSAFs are listed in Table 2.
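As an illustration of a seed function, the sketch below evaluates the l = 4 octahedral harmonic and checks numerically that it is unchanged by rotations about the 4-fold and 3-fold axes. The coefficients sqrt(7/12) for m = 0 and sqrt(5/24) for m = ±4 are the standard values for the normalized l = 4 cubic harmonic and are assumed here to correspond to the l = 4 entry of Table 2, which is not reproduced in this text; the function names are illustrative.

```python
import math

def osaf_l4(theta, phi):
    """l = 4 octahedral harmonic built from Y_{4,0} and Y_{4,+4} + Y_{4,-4}."""
    c, s = math.cos(theta), math.sin(theta)
    y40 = (3.0 / 16.0) * math.sqrt(1.0 / math.pi) * (35 * c**4 - 30 * c**2 + 3)
    # Y_{4,4} + Y_{4,-4} is real and equals 2 * Re(Y_{4,4}).
    y44_sum = (3.0 / 8.0) * math.sqrt(35.0 / (2.0 * math.pi)) * s**4 * math.cos(4 * phi)
    return math.sqrt(7.0 / 12.0) * y40 + math.sqrt(5.0 / 24.0) * y44_sum

def to_angles(x, y, z):
    """Polar angle theta and azimuth phi of a Cartesian direction."""
    r = math.sqrt(x * x + y * y + z * z)
    return math.acos(z / r), math.atan2(y, x)

def osaf_l4_xyz(x, y, z):
    """Same function evaluated from a Cartesian direction vector."""
    return osaf_l4(*to_angles(x, y, z))

if __name__ == "__main__":
    x, y, z = 0.3, -0.5, 0.8
    print(osaf_l4_xyz(x, y, z))    # original direction
    print(osaf_l4_xyz(-y, x, z))   # quarter-turn about Z
    print(osaf_l4_xyz(x, -z, y))   # quarter-turn about X
    print(osaf_l4_xyz(y, z, x))    # rotation about the (1,1,1) 3-fold axis
```

The four printed values agree to rounding error, which is the invariance that every OSAF must satisfy with the X, Y, and Z axes placed along the 4-fold axes.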
Owing to the properties of the normalized spherical harmonic functions, $Y_{l,m}(\theta, \phi)$ and $B^{\mu}_{l,m}$ satisfy a set of relationships from which separate expressions of the OSAF for even $l$ and for odd $l$ follow (see (8)). In these expressions the asterisk * denotes complex conjugation, and $O^{c}_{l,\mu}(\theta, \phi)$ and $O^{s}_{l,\mu}(\theta, \phi)$ denote the real and the imaginary parts of the OSAF, respectively.
According to (8), one may use $O^{c}_{l,\mu}(\theta, \phi)$ to fit the real part and $O^{s}_{l,\mu}(\theta, \phi)$ to fit the imaginary part of the structure factor of biological objects with octahedral symmetry in reciprocal space. All $O^{c}_{l,\mu}(\theta, \phi)$ and $O^{s}_{l,\mu}(\theta, \phi)$ of the OSAFs with $l \le 12$ are listed in Table 3. The first column gives the order $l$ of the OSAF, the second gives the multiplicity $\mu(l)$, the letters c and s in the third column denote $O^{c}_{l,\mu}(\theta, \phi)$ and $O^{s}_{l,\mu}(\theta, \phi)$, respectively, and the final column presents the combination coefficients $B^{\mu}_{l,m}$ of the OSAF; the labels (0), (4), (8), and (12) mean $m$ = 0, 4, 8, and 12, respectively. For example, the OSAFs of $l$ = 6 and $l$ = 9 can be obtained from (8). Figures 1(a) and 1(b) show the density contour lines of the OSAFs with $l = 40$ and $\mu$ = 1 and 2, respectively, viewed along a fourfold axis. All higher-order OSAFs up to $l = 1000$ can be obtained via (4).
It should be pointed out that the OSAFs $O_{l,\mu}(\Theta, \Phi)$ form a complete orthogonal basis when the multiplicity index $\mu$ takes all of its values; any function $F(R, \Theta, \Phi)$ with octahedral symmetry can therefore be represented as a linear combination of the $O_{l,\mu}(\Theta, \Phi)$.

Reconstruction Principle.
It is well known that the structure of a macromolecular complex can be described by its potential function, which is determined by the inverse Fourier transform of the structure factors $F(\mathbf{R})$; its expression in spherical coordinates is given by (10), where $\mathbf{r}$ and $\mathbf{R}$ denote the vectors in real and Fourier space, respectively.
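In contrast to Crowther's Fourier-Bessel method [33], we use the OSAFs to express $F(\mathbf{R})$:
$$F(R, \Theta, \Phi) = \sum_{l} \sum_{\mu=1}^{n_l} f_{l,\mu}(R)\, O_{l,\mu}(\Theta, \Phi), \tag{11}$$
where $f_{l,\mu}(R)$ is the fitting coefficient of the OSAF, whose value depends on the Fourier radius $R$ of a spherical shell and on $l$ and $\mu$, and $n_l$ is the maximum multiplicity of a given order $l$.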
According to (8), (11) can be further separated into its real and imaginary parts, as given by (12), where $F_r(R, \Theta, \Phi)$ and $F_i(R, \Theta, \Phi)$ denote the real and imaginary parts of $F(R, \Theta, \Phi)$, respectively. Substituting (12) into (10), one finally obtains (13), where $j_l(2\pi R r)$ denotes the spherical Bessel functions, whose recurrence relationship can be found in [34]. The reconstruction by the OSAF method can be carried out by the following procedure.
(1) Calculate the OSAFs by (1) up to the required order.
(2) Construct two systems of linear equations from the experimentally determined structure factors $F_r(R, \Theta, \Phi)$ and $F_i(R, \Theta, \Phi)$ in reciprocal space according to (12).
(3) Find the fitting coefficients $f_{l_{\mathrm{even}},\mu}(R)$ and $f_{l_{\mathrm{odd}},\mu}(R)$ by solving these two systems of linear equations by the least-squares method.
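Step (3) can be illustrated on a single Fourier shell with a toy basis. The sketch below is not the authors' implementation: it assumes only two basis functions (l = 0 and the standard l = 4 octahedral harmonic), synthetic real-valued samples at random directions, and a plain normal-equation solver; all names (`fit_shell`, `synthetic_shell`, `solve`) are hypothetical.

```python
import math
import random

def osaf_l0(theta, phi):
    return 1.0 / math.sqrt(4.0 * math.pi)

def osaf_l4(theta, phi):
    # Standard normalized l = 4 octahedral harmonic (assumed seed coefficients).
    c, s = math.cos(theta), math.sin(theta)
    y40 = (3.0 / 16.0) * math.sqrt(1.0 / math.pi) * (35 * c**4 - 30 * c**2 + 3)
    y44_sum = (3.0 / 8.0) * math.sqrt(35.0 / (2.0 * math.pi)) * s**4 * math.cos(4 * phi)
    return math.sqrt(7.0 / 12.0) * y40 + math.sqrt(5.0 / 24.0) * y44_sum

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_shell(samples, basis):
    """Least-squares coefficients f_k for F = sum_k f_k * basis_k on one shell."""
    n = len(basis)
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for theta, phi, value in samples:
        row = [b(theta, phi) for b in basis]
        for i in range(n):
            atb[i] += row[i] * value
            for j in range(n):
                ata[i][j] += row[i] * row[j]
    return solve(ata, atb)

def synthetic_shell(coeffs, basis, count, noise_sigma, seed=0):
    """Samples (theta, phi, F) at random directions with Gaussian noise added."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        theta = math.acos(rng.uniform(-1.0, 1.0))  # uniform on the sphere
        phi = rng.uniform(0.0, 2.0 * math.pi)
        value = sum(f * b(theta, phi) for f, b in zip(coeffs, basis))
        samples.append((theta, phi, value + rng.gauss(0.0, noise_sigma)))
    return samples

if __name__ == "__main__":
    basis = [osaf_l0, osaf_l4]
    noisy = synthetic_shell([2.0, -1.5], basis, 2000, 0.05)
    print(fit_shell(noisy, basis))
```

In the method itself the same fit would be repeated for every shell radius R, separately for the real and imaginary parts, with as many OSAFs as the target resolution requires.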

Implementation of Reconstruction by the OSAF Method
To verify the feasibility and advantages of the OSAF method for the reconstruction of macromolecules with octahedral symmetry, two biological objects with octahedral symmetry, the heat shock protein Degp24 and the Red-cell L Ferritin, were taken as examples. The atomic structures were downloaded from the PDB (3cs0.pdb and 1rcc.pdb). Both 3D structures were generated at 4.0 Å resolution as standard structure models (SSMs) with EMAN's pdb2mrc procedure. Two thousand random projections of each of the two proteins, with predefined orientations and centers, were then created by real-space projection. Random noise was added to each projection at three different signal-to-noise ratios (S/N), that is, 0.1, 0.5, and 0.8, for 3D reconstruction; the S/N is defined in terms of the average value of the signal and the average value of the noise. The structures reconstructed by the OSAF method are in good agreement with the SSMs even with heavy noise (S/N = 0.1). Figures 6 and 7 show the comparison of the 3D structures of the two proteins reconstructed at high resolution by the OSAF method. Although there is no obvious discrepancy among the 3D density maps reconstructed by the OSAF method for S/N = 0.1, 0.5, and 0.8, a few differences can still be identified. For quantitative comparison, the Fourier shell correlations (FSCs) [35] and the relative errors (REs) were evaluated (Figure 8). The RE between the SSM and the reconstructed structure is calculated, following [36], from the potential ρ o of the SSM and the potential ρ r of the normalized structure reconstructed by the OSAF method. Figure 8 shows that the RE increases gradually with increasing Fourier frequency.
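For readers unfamiliar with the Fourier shell correlation mentioned above, the following sketch computes it for one shell using the commonly used definition (normalized cross-correlation of the Fourier coefficients of two maps within the shell). It is an illustration under that assumption, not the exact formula of [35], and the function name is hypothetical.

```python
import math

def fourier_shell_correlation(shell_a, shell_b):
    """FSC of one shell: Re(sum a * conj(b)) / sqrt(sum |a|^2 * sum |b|^2).

    shell_a and shell_b are equally long sequences of complex Fourier
    coefficients of two maps, sampled at the same points of the shell.
    """
    num = sum((a * b.conjugate()).real for a, b in zip(shell_a, shell_b))
    den = math.sqrt(sum(abs(a) ** 2 for a in shell_a)
                    * sum(abs(b) ** 2 for b in shell_b))
    return num / den if den > 0 else 0.0

if __name__ == "__main__":
    reference = [1 + 2j, -0.5 + 0.3j, 2 - 1j]
    perturbed = [c + 0.1 for c in reference]
    print(fourier_shell_correlation(reference, perturbed))
```

Identical shells give a correlation of 1, and the value falls toward 0 as the two maps become uncorrelated at that spatial frequency.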
Furthermore, at low Fourier frequency (nominal resolution of 14.4 Å), the relative errors listed in Table 4 remain almost constant for the different S/Ns, which means that the reconstructed structures are hardly influenced by the added noise even for S/N = 0.1. The very low RE means that the structure reconstructed by the OSAF method is very close to the real structure. As the added noise increases, the RE increases. These results imply that the OSAF method is a feasible and efficient approach for reconstructing the structures of macromolecules and can suppress the influence of noise, since it makes full use of the 24 symmetry operations of the octahedral symmetry. To take advantage of the symmetry operations, one should use a symmetry-adapted function (SAF), for example, an icosahedral, octahedral, tetrahedral, or dihedral SAF; all of these functions can suppress the influence of noise to an extent that depends on the number of symmetry operations. The icosahedral group has 60 symmetry operations and therefore the strongest ability, as verified in our previous paper [17], but the other SAFs also have a certain ability to suppress the influence of noise. Since the S/N of raw CryoEM data is very low, a large number of particles may be needed to reconstruct a 3D structure at high resolution. Processing such a large number of particles is therefore very time-consuming, and it is essential to reduce the computation time of high-resolution 3D reconstruction. To achieve fast computation, we carried out the reconstruction in an asymmetric unit of the octahedral symmetry, which speeds up the calculation by a factor of 24, so that the reconstruction can be performed on a PC; this will be described in detail in another paper. Table 5 shows the computation time of the OSAF method.
All the above tests were carried out on an ordinary PC with a Pentium D 3.2 GHz CPU and 2 GB of RAM. Table 5 shows that the OSAF method is very fast even for high-resolution reconstruction.
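The factor of 24 quoted above is the order of the octahedral rotation group. As a minimal sketch, assuming the X, Y, and Z axes lie along the 4-fold axes as chosen in this paper, the 24 operations can be enumerated as the signed permutation matrices with determinant +1; the helper names are illustrative and not taken from the authors' program.

```python
from itertools import permutations, product

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def octahedral_rotations():
    """All signed 3x3 permutation matrices that are proper rotations."""
    ops = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = [[signs[r] if perm[r] == c else 0 for c in range(3)]
                 for r in range(3)]
            if det3(m) == 1:
                ops.append(m)
    return ops

def equivalent_directions(v):
    """Images of a direction v under every octahedral rotation."""
    return [tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))
            for m in octahedral_rotations()]

if __name__ == "__main__":
    print(len(octahedral_rotations()))
    for d in equivalent_directions((0.3, -0.5, 0.8)):
        print(d)
```

A generic direction has 24 distinct symmetry-equivalent images, so data restricted to one asymmetric unit cover 1/24 of the sphere.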

Conclusions
(1) A set of formulations for 3D reconstruction of macromolecular assemblies with octahedral symmetry by the OSAF method has been established.
(2) The OSAF method is feasible and efficiently suppresses the influence of noise because it makes full use of the symmetry of the objects.
(3) The calculation can be greatly sped up by carrying out the reconstruction in an asymmetric unit of the octahedral symmetry group.
It should be pointed out that, in the simulation, projections with predetermined centers and orientations were used to reconstruct the structures; in practice, however, the reconstruction must be based on experimentally measured data with unknown centers and orientations. In this case, the center and orientation of each projection must be determined first. At this stage, we have not yet written a program to determine the center and orientation by the OSAF method itself. So far, another program such as EMAN [13] or FREALIGN [14] must be used to determine the orientation and center parameters of a particle, after which the OSAF method is adopted for reconstruction. The orientation definition in our OSAF method is the Z-X-Z convention with "clockwise rotation", which is identical to that of EMAN and differs from the Z-Y-Z convention used by programs such as FREALIGN. The relationship between the orientation definition of the OSAF method and the Z-Y-Z convention is the same as in EMAN, where φ z1 , θ x , ψ z1 are the Euler angles of the Z-X-Z convention adopted in our OSAF method and in EMAN, and φ z , θ y , ψ z are those of the Z-Y-Z convention adopted in SPIDER, IMAGIC, MRC, and FREALIGN. The OSAF method for reconstruction is still at an early stage, and there is plenty of room for optimizing the program.
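The relation between the two Euler conventions mentioned above can be checked numerically. The sketch below assumes active, right-handed rotation matrices composed as Rz(φ)·Rx(θ)·Rz(ψ) and Rz(φ)·Ry(θ)·Rz(ψ); under that assumption the first and third angles differ by 90 degrees between the conventions. The sign of the offsets depends on the rotation sense ("clockwise" or not) used by a given package, so the function `zxz_to_zyz` is an illustration rather than the exact relation used by the OSAF program or EMAN.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def zxz_matrix(phi, theta, psi):
    """Rotation matrix for Z-X-Z Euler angles (assumed composition order)."""
    return matmul(rot_z(phi), matmul(rot_x(theta), rot_z(psi)))

def zyz_matrix(phi, theta, psi):
    """Rotation matrix for Z-Y-Z Euler angles (assumed composition order)."""
    return matmul(rot_z(phi), matmul(rot_y(theta), rot_z(psi)))

def zxz_to_zyz(phi, theta, psi):
    """Z-Y-Z angles giving the same rotation as the Z-X-Z angles."""
    return phi - math.pi / 2, theta, psi + math.pi / 2

if __name__ == "__main__":
    angles = (0.4, 1.1, -0.7)
    a = zxz_matrix(*angles)
    b = zyz_matrix(*zxz_to_zyz(*angles))
    print(max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3)))
```

The printed maximum difference is at rounding level, which confirms the 90-degree offsets for this particular matrix convention; orientation parameters exported from another package should be checked against that package's own documented definition before use.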

International Journal of Biomedical Imaging
We believe that this method has a promising future. A reconstruction with experimental data based on the principle described above is in progress and will be reported later.