Recently, there has been extensive research on optical neural networks, which can carry out comparatively complex computation using optical properties with far less energy dissipation than electrical networks. Existing neural networks based on optical circuits are built on an optical grating platform with different diffractive phases at different diffractive points (Chen and Zhu, 2019; Mo et al., 2018). In this study, we propose a multiwavelength deep diffractive network with approximately 10^6 synapses that lends itself to hardware implementation of neuromorphic networks. The optical architecture uses diffraction and different wavelengths to perform different tasks. The inputs for different wavelengths and different tasks are independent of each other, so we can exploit this independence to run inference on several tasks simultaneously. Experiments show that the network achieves performance comparable to that of a single-wavelength, single-task network. Compared with using multiple networks, a single network saves the cost of lithographic fabrication. We train the network on MNIST and MNIST-Fashion, two different datasets, to classify 32 × 32 inputs into 10 classes. Our method achieves competitive results on both. In particular, on the more complex MNIST-Fashion task, our framework improves accuracy by 3.2%, while on MNIST the improvement is 1.15%.
Storing and retrieving data in units based on the von Neumann architecture is far more time-consuming and power-hungry than in an optical device [
Neuromorphic computing seeks brain-like processing, which overcomes the limitations of conventional computers. IBM [
The architecture of a fully optical network with Mach–Zehnder interferometers [
In general, optical networks have more trainable parameters because complex-valued modulation provides both the phase and the amplitude of each neuron, rather than only the amplitude as in electrical networks. Unfortunately, using optical devices to form a neural network has some problems. First, an all-optical neural network is designed for a single task, but multitasks [
In this paper, to address the above two issues, we make the most of the optical characteristics to express different tasks with different wavelengths. One wavelength is used as the baseband and the other as the carrier frequency. Therefore, the baseband wave can be set to a large learning rate, while the carrier wave can be set to a small one. Extensive experiments based on MNIST and MNIST-Fashion [
Spatial-domain analysis reasons about the in-plane propagation of each wave at a particular phase and frequency, and can analyze and integrate waves travelling in different directions. The analysis itself is carried out in the frequency domain. The wave distribution on the observation and aperture planes can be viewed as a linear combination of a great many monochromatic plane waves propagating in different directions. The amplitude and phase of each plane wave are given by the angular spectrum. The angular spectrum can be obtained by FFT analysis [
As shown in Figure
The architecture of multiwavelength diffractive network.
We input images from MNIST and MNIST-Fashion simultaneously; the input optical wavelength of the MNIST and MNIST-Fashion tasks is
The top left of Figure
The framework of multiwavelength diffractive network.
First, we convert the image information into the phase and amplitude of an optical field, which serves as the input to the system. The optical grating is then manufactured with varying heights by a 3D-printing device. In the following sections, we discuss how MWDN tackles the tasks, predominantly using the angular spectrum. The 3D-printed MWDN modulates the amplitude and phase of the wave within the ranges 0∼1 and 0∼2
Following the Fresnel diffraction equation, we can transform the optical signal from the spatial domain to the frequency domain. The angular spectrum method of plane waves explains how a wave propagates and is the primary method for analyzing diffraction in the frequency domain. Based on the angular spectrum, the free-space transfer function governs free propagation. The wave plane can be converted to its angular spectrum by the FFT, where the diffractive processing becomes more evident, as follows:
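A minimal sketch of the textbook angular spectrum method, which may differ in detail from the formulation used in this work: the field is transformed by a 2D FFT, multiplied by the free-space transfer function H(fx, fy) = exp(j 2π z sqrt(1/λ² − fx² − fy²)), and transformed back. The function name, the pixel size, and the propagation distance are assumptions chosen for illustration; the wavelength of about 0.75 mm corresponds to 0.4 THz.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pixel_size, distance):
    """Propagate a complex field through free space (angular spectrum method)."""
    ny, nx = field.shape
    # Spatial frequencies sampled by the FFT grid (cycles per metre).
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    FX, FY = np.meshgrid(fx, fy)
    # Squared longitudinal frequency; non-positive values are evanescent waves.
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    propagating = arg > 0
    kz = 2 * np.pi * np.sqrt(np.where(propagating, arg, 0.0))
    # Free-space transfer function; evanescent components are discarded.
    H = np.where(propagating, np.exp(1j * kz * distance), 0.0)
    spectrum = np.fft.fft2(field)          # angular spectrum of the input plane
    return np.fft.ifft2(spectrum * H)      # field on the output plane

# Hypothetical example values (not taken from the paper).
wavelength = 0.75e-3   # metres, roughly 0.4 THz
pixel_size = 0.4e-3    # metres, assumed neuron pitch
distance = 0.03        # metres, assumed layer spacing
plane_wave = np.ones((64, 64), dtype=complex)
propagated = angular_spectrum_propagate(plane_wave, wavelength, pixel_size, distance)
```

For a uniform plane wave the sketch only adds a global phase, which is a quick sanity check that the transfer function is applied correctly.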
The output wave then propagates through the 3D-printed material, where the field distribution is modified by the refractive index:
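As an illustration, the effect of a printed layer on the field can be sketched with the standard thin-element approximation, assuming a complex refractive index n + jκ and a phase delay measured relative to air. This is a generic model and not necessarily the exact expression used here; the function name and the material values are hypothetical.

```python
import numpy as np

def layer_transmission(height, wavelength, n_real, kappa):
    """Complex transmission of a thin layer with the given height map.

    Thin-element assumption: phase delay relative to air is 2*pi*(n-1)*h/lambda,
    and absorption attenuates the amplitude by exp(-2*pi*kappa*h/lambda).
    """
    k0 = 2 * np.pi / wavelength
    phase = k0 * (n_real - 1.0) * height
    amplitude = np.exp(-k0 * kappa * height)
    return amplitude * np.exp(1j * phase)

# Hypothetical material and geometry (illustrative only).
wavelength = 0.75e-3                       # metres, roughly 0.4 THz
n_real, kappa = 1.7, 0.03                  # assumed complex refractive index
heights = np.random.default_rng(0).uniform(0.0, 1.0e-3, size=(200, 200))
mask = layer_transmission(heights, wavelength, n_real, kappa)
incoming = np.ones((200, 200), dtype=complex)
outgoing = incoming * mask                 # field just after the layer
```

In this model the amplitude factor always lies in (0, 1] and the phase wraps every λ/(n − 1) of height, which matches the amplitude and phase ranges that a height map can address.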
Depending on the size of the input data, an effective and flexible linear interpolation algorithm is used to fit the input to the diffractive input layer. The interconnection rate between adjacent layers depends on the layer distance and the diffraction angle, and approaches the critical value (1.0).
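The interpolation step can be sketched as a plain bilinear resize that maps an input image onto the grid of the diffractive input layer. The function name and the target size of 200 × 200 are assumptions for illustration, and the interpolation actually used may differ.

```python
import numpy as np

def resize_bilinear(image, out_shape):
    """Resize a 2D array to out_shape with bilinear interpolation."""
    in_h, in_w = image.shape
    out_h, out_w = out_shape
    # Sample positions in input coordinates, aligned at the corners.
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]                # vertical interpolation weights
    wx = (xs - x0)[None, :]                # horizontal interpolation weights
    top = image[y0][:, x0] * (1 - wx) + image[y0][:, x1] * wx
    bottom = image[y1][:, x0] * (1 - wx) + image[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# Hypothetical usage: fit a 32 x 32 input to a 200 x 200 input layer.
image = np.random.default_rng(0).random((32, 32))
layer_input = resize_bilinear(image, (200, 200))
```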
Furthermore, the number of network layers and the axial distance between them are also tunable. The output layer is divided into ten regions corresponding to the ten classes, and the summed light intensity is detected in each region of the wave plane. The mean square error (MSE) against the target is used to train the MWDN parameters. We aim to minimize a loss function that increases the wave intensity in the target region and decreases it in the other regions. The training batch size is set to 10 for the classifier.
To train MWDN, we use the backpropagation algorithm with the Adam optimization method. We focus on the intensity of the wave and define the loss function as the MSE between the output and the target:
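A minimal sketch of such a loss, under stated assumptions: the ten detector regions are laid out as a 2 × 5 grid of square boxes, the summed intensities are normalized to unit total, and the target is a one-hot vector. The layout, the normalization, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def make_regions(size, rows=2, cols=5, box=20):
    """Hypothetical detector layout: rows x cols square boxes on the output plane."""
    regions = []
    for i in range(rows):
        for j in range(cols):
            r = (i + 1) * size // (rows + 1) - box // 2
            c = (j + 1) * size // (cols + 1) - box // 2
            regions.append((r, r + box, c, c + box))
    return regions

def region_intensities(field, regions):
    """Sum the light intensity |field|^2 inside each detector region."""
    intensity = np.abs(field) ** 2
    return np.array([intensity[r0:r1, c0:c1].sum() for r0, r1, c0, c1 in regions])

def mse_loss(field, regions, label):
    """MSE between normalized region intensities and a one-hot target."""
    scores = region_intensities(field, regions)
    scores = scores / (scores.sum() + 1e-12)   # assumed normalization
    target = np.zeros(len(regions))
    target[label] = 1.0
    return np.mean((scores - target) ** 2)

# Example: all the light lands in the detector region of class 3.
regions = make_regions(200)
field = np.zeros((200, 200), dtype=complex)
r0, r1, c0, c1 = regions[3]
field[r0:r1, c0:c1] = 1.0
loss_correct = mse_loss(field, regions, label=3)
loss_wrong = mse_loss(field, regions, label=0)
```

Minimizing this quantity pushes intensity into the target region and away from the other nine, which is the behaviour the loss is meant to encourage.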
Because we use multiple wavelengths, we first adjust a base value through one wavelength and then fine-tune through the other wavelength without affecting the other task, as follows:
Our refractive index takes discrete values. Based on the principle of low-bit networks, the refractive index
For the forward step, we normalize
The optical diffractive network and the deep neural network are markedly different. The function of the optical diffractive network is determined by the wavelength and by the parameters of the optical grating (height and complex refractive index). A multiwavelength diffractive network therefore has a broad range of requirements that differ from those of a conventional network.
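To make the wavelength dependence concrete, the sketch below evaluates the phase delay of one shared height map at two wavelengths. It assumes a thin-element phase of 2π(n − 1)h/λ, the same real refractive index at both wavelengths (real materials are dispersive), and a long-to-short wavelength ratio of 30 chosen for illustration. Under these assumptions, a height written as an integer number of short-wavelength periods plus a fractional remainder sets the long-wavelength phase mainly through the integer part and the short-wavelength phase only through the remainder. All names and values are hypothetical.

```python
import numpy as np

N_REAL = 1.7                     # assumed refractive index, identical at both wavelengths
LAM_SHORT = 1.0                  # short wavelength in arbitrary units
LAM_LONG = 30.0 * LAM_SHORT      # assumed 30:1 wavelength ratio
PERIOD_SHORT = LAM_SHORT / (N_REAL - 1.0)   # height giving a 2*pi delay at LAM_SHORT

def phase_delay(height, wavelength):
    """Thin-element phase delay relative to air, wrapped to [0, 2*pi)."""
    return (2 * np.pi * (N_REAL - 1.0) * height / wavelength) % (2 * np.pi)

def compose_height(coarse_steps, fine_fraction):
    """Height = integer number of short-wavelength periods + fractional remainder."""
    return (coarse_steps + fine_fraction) * PERIOD_SHORT

# Changing the coarse steps moves the long-wavelength phase in increments of
# 2*pi/30 while leaving the short-wavelength phase unchanged (modulo 2*pi).
h_a = compose_height(0, 0.25)
h_b = compose_height(7, 0.25)
short_a, short_b = phase_delay(h_a, LAM_SHORT), phase_delay(h_b, LAM_SHORT)
long_a, long_b = phase_delay(h_a, LAM_LONG), phase_delay(h_b, LAM_LONG)
```

The remainder can shift the long-wavelength phase by at most 2π/30, which is one way to read the statement that one wavelength has little influence on the other.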
Different wavelengths differ in effectiveness, so we assign different wavelengths to different tasks. At the same time, the network must ensure that the wavelengths do not affect each other. By setting one wavelength as the baseband and the other as the carrier, the diffractive network adjusts each optical plane wave independently. The algorithm can be regarded as an efficient carrier algorithm. The ratio of the baseband wavelength to the carrier wavelength is 1 : 30. The short wavelength has little influence on the long wavelength and vice versa. If the phase difference of long wavelength is
The equation can therefore be written as follows:
The second term of
The multiwavelength diffractive network can be effective and more powerful than a deep neural network. Phase difference
In this work, we apply the proposed MWDN to two different datasets, MNIST and MNIST-Fashion.
Compared with state-of-the-art methods in terms of accuracy and speed, our method achieves better performance on MNIST and MNIST-Fashion. The network size is set to 200 × 200, 500 × 500, and 1500 × 1500, each with a trainable height map. The optical network comes in two types, one with phase modulation and the other with complex modulation. MNIST and MNIST-Fashion operate at different optical wavelengths, and the input is altered by the optical grating mask.
Using backpropagation, the model is trained on the datasets of the two tasks alternately, and its effectiveness is validated. We train the network with different learning rates for different tasks, which helps to overcome the problem of local optima. In addition, all the parameters of the network are adjusted by the gradient descent algorithm to minimize the error.
We evaluate the approach on two datasets, with the input information fed to the neurons of the network in the form of phase. The two datasets have different data distributions, which makes them difficult to classify in the same network. Conventional networks require the input data to be independent and identically distributed. Here, the task is to handle data from two different distributions in the same network.
For better performance, we set a different learning rate and a different signal frequency for each of the two datasets. The maximum half-cone diffraction angle is formulated as follows:
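As a sketch, the expression commonly used for diffractive layers is φ_max = arcsin(λ / (2 d_f)), where d_f is the neuron (feature) size; it is assumed here that this is the intended form, and the formulation in this work may differ. The function names and the feature size are hypothetical, while the two frequencies are the ones quoted for the two tasks.

```python
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0   # metres per second

def wavelength_from_frequency(freq_hz):
    """Free-space wavelength for a given frequency."""
    return SPEED_OF_LIGHT / freq_hz

def max_half_cone_angle(wavelength, feature_size):
    """Maximum half-cone diffraction angle in radians: arcsin(lambda / (2 * d_f)).

    The ratio is clipped at 1 because features smaller than half a wavelength
    already diffract into the full half-space.
    """
    ratio = wavelength / (2.0 * feature_size)
    return np.arcsin(np.minimum(ratio, 1.0))

# Frequencies quoted for the two tasks; the feature size is an assumed value.
lam_mnist = wavelength_from_frequency(0.4e12)     # about 0.75 mm
lam_fashion = wavelength_from_frequency(14.4e12)  # about 20.8 micrometres
feature_size = 0.4e-3                             # metres, hypothetical
angle_mnist_deg = np.degrees(max_half_cone_angle(lam_mnist, feature_size))
angle_fashion_deg = np.degrees(max_half_cone_angle(lam_fashion, feature_size))
```

With a fixed feature size, the lower-frequency wave diffracts over a much wider cone than the higher-frequency one, so the two waves reach very different numbers of neurons in the next layer.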
The light frequency is 0.4 THz for MNIST and 14.4 THz for MNIST-Fashion. The neuron size is set to be 200
The performance of MWDN and DN methods.
Method | MNIST (%) | Frequency (THz) | MNIST-Fashion (%) | Frequency (THz) |
---|---|---|---|---|
MWDN(PCM) | 92.85 | 0.4 | 84.33 | 14.4 |
DN-MNIST | 91.75 | 0.4 | 81.13 | 0.4 |
The performance of MWDN and DN methods with multitask.
Method | MNIST (%) | MNIST-Fashion (%) | ||
---|---|---|---|---|
Setting 1 | 23.45 | 12.12 | 0.4 | 0.4 |
Setting 2 | 90.45 | 76.67 | 14.4 | 0.144 |
DN-Fashion | — | 78.70 | 14.4 | — |
DN-MNIST | 91.75 | — | — | 14.4 |
Figures
The accuracy in the training process of different methods.
The loss in the training process of different methods.
Finally, we report the performance on the validation data of MNIST and MNIST-Fashion, a challenging task involving different datasets. Using only a single network, classification of the two datasets can be accomplished simultaneously. We investigate the effects of various combinations of different datasets for MWDN. The results are shown in Figures
In this paper, we propose a novel multitask optical network named the multiwavelength diffractive network (MWDN). Based on plane wave propagation, our method achieves accuracy comparable to that of a single-task network. We successfully apply MWDN to multiple tasks with different data distributions and provide a multiwavelength method with different model sizes. In the future, we aim to develop a more effective network to handle complex tasks and reach better performance.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that they have no conflicts of interest.
The authors acknowledge the financial support of the National Key R&D Program of China (Grant no. 2017YFE0112000) and the Shanghai Municipal Science and Technology Major Project (Grant no. 2017SHZDZX01). The authors wish to thank Doctor Y. C. Sun. Furthermore, the authors would like to express sincere thanks for the light source and controller provided by 3 Lights and Light-ca Technology Corporation.