High-Speed Transmission and Mass Data Storage Solutions for Large-Area and Arbitrarily Structured Fabrication through Maskless Lithography

This paper presents the design and implementation of high-speed data transmission in laser direct-writing lithography. Mass data storage management, transmission, and real-time synchronization of each part were implemented with a single field programmable gate array (FPGA) chip. To store a massive amount of data and transmit data with high bandwidth, a serial advanced technology attachment (SATA) intellectual property (IP) core was developed on a Xilinx Virtex-6 FPGA. In addition, control of the laser beam power, collection of status read back data of the lithography laser through an analog-to-digital converter, and synchronization of the positioning signal were implemented on the same FPGA. A data structure for each unit with a unique exposure dose and other necessary information was established. Results showed that the maximum read bandwidth (240 MB/s) and maximum write bandwidth (200 MB/s) of a single solid-state drive satisfy the data transmission requirement. The total amount of data meets the requirement of a large-area diffractive element of approximately 10 cm. The throughput has been greatly improved, reaching the order of meters per second or square centimeters per second. Test results showed that data transmission meets the requirement of the experiment.


Introduction
Laser direct-writing (LDW) maskless lithography was proposed with the rise of binary optics in the 1980s because of its advantages, such as high precision, low cost, high speed, and high flexibility [1]. LDW is a well-established and widely used technique for the fabrication of binary optical masks or elements, such as photonic crystal structures, chirped gratings, and optical microelectromechanical systems [2,3]. Direct-write lithography has been applied to low-volume micro/nanodevice production.
Maskless laser direct-writing lithography with spot-by-spot exposure mode has numerous advantages, but it has long been considered too slow for mass production. Breakthroughs in direct-write technologies that achieve high throughput and large area would be significant progress. Laser direct-write lithography equipment with high throughput, large area, and far-field operation in an air environment (i.e., without the need for a vacuum environment) is the goal that researchers and industrial workers are striving for.
With an acousto-optic modulator or spatial light modulator, only kHz-order modulation frequency can be achieved for the direct-writing laser beam, which is a major limiting factor. Most direct-writing systems are limited to a low-speed data rate solution and produce large-area micro/nanostructures subject to periodicity or symmetry. A line-grating scanning method in vectorized LDW was introduced in [4]. This approach improved the writing speed; however, the flexibility of the fabricated optical device was limited. In a proton beam writing system [5], the proton beam deflected in a magnetic field was used for scanning. The total scan area was only 1 mm × 1 mm. PML2 [6] and MAPPER Lithography FLX-1200 [7] are electron beam direct-write (EBDW) systems. Throughput and resolution are the main problems on which these systems focus. In the PML2 test system, 2500 programmable beams of 2.5 μm × 2.5 μm size are generated within a square field of 1.5 mm × 1.5 mm. With 200x reduction, these beams are reduced to 12.5 nm at the wafer substrate within an exposure field of 7.5 μm × 7.5 μm. With resolution potential beyond 10 nm, PML2 is designed to meet the requirements of several upcoming integrated circuit (IC) generations. In [8,9], an advantageous optical interconnect technology for transmitting the exposure data used in PML2 was introduced. A continuous and reliable parallel data transmission over free-space optics at 1 Gbps per channel was achieved based on the introduced system. However, the system was bulky and complicated. In [7], the complete data preparation flow is presented. This system uses the ASELTA Nanographics Inscale5 software. A complete cycle time of 37 hours was achieved on 144 CPUs. The system is very complicated and expensive, and most established EBDW equipment is aimed at IC manufacture. LDW equipment capable of large area and high throughput is necessary as more and more micro/nanodevices are applied.
Based on the improved output optical power of semiconductor lasers and the ease with which they can be modulated at high frequency [10], we designed the laser direct-writing scheme. A high-speed digital-to-analog (DA) converter was used to modulate the output intensity of the laser. To make the modulation frequency as high as possible, a real-time hardware platform is needed instead of a computer for the LDW laser modulator. In addition, a large-area substrate requires a large volume of writing data, so a large-capacity storage device is needed. The direct-writing process requires storage for an enormous amount of data and a high-speed data buffer that holds a subset of the read data and generates a continuous real-time data stream for the laser beam controller.
In the present research, specific hardware that satisfies the synchronous and repeated data read speed required by a laser beam controller is established. The status read back path is implemented for LDW status validation. The LDW data transmission and processing system reported in this work can achieve a user-defined structure over an extended large area of about 10² cm². The modulation frequency for the direct-writing laser beam can reach the order of 10 MHz. The throughput has been greatly improved, reaching the order of meters per second or square centimeters per second.
The rest of the paper is organized as follows. Section 2 introduces the basic workflow and relevant terminologies. Section 3 introduces the implementation of the hardware and synchronous work. Section 4 presents the data structure, and Section 5 provides a summary and discussion of the experimental results.

Basic Workflow
The intensity of the laser for each unit on the substrate is represented by a pixel value. The pixel data utilized to structure the substrate were formatted offline in advance in a data preparation kit and stored on a solid-state drive (SSD). Each unit with a unique exposure dose (i.e., pixel value) for a large-area diffractive optical element leads to massive direct-writing data. Data transmission in the GHz range into a laser beam controller requires an appropriately designed data path to provide the required bandwidths and minimize latencies. The transmission and control paths of the system architecture are shown in Figure 1.
As shown in Figure 1, an FPGA chip is at the center of the write and read back data transceiver, and the SSD stores the necessary writing data. The DA converter outputs a pixel value to control the laser beam power, and the AD converter reads back the power of the writing laser. The stage system center controls the linear and rotation stages. A high-precision rotary encoder and a linear encoder are integrated in the rotation and linear stages; these encoders produce pulses for positioning in real time.
Because the rotary encoder has 26 bits [11], the frequency of its output pulses is too high to be used directly; the pulses used by the FPGA for positioning are therefore sampled from the encoder pulses received by the stage controller. To synchronize each part, the number of rotation pulses, calculated offline in advance, is also stored in the SSD. The details of the data structure are presented in Section 4. An RS232 interface was utilized for communication between the FPGA and the PC. The stage controller communicates with the center PC through gigabit Ethernet. The linear stage scans along a line, and the rotation stage revolves continuously in the perpendicular direction. The strength of the beam is then adjusted along the circle according to the pattern to be written. For convenience, we introduce several relevant terminologies. The corresponding schematic diagram is shown in Figure 2.

Cylinder.
The substrate is divided into concentric circles; each circle has a unique number. The numbering starts from the cylinder with the maximum radius.

Information Unit.
The minimum unit is the information unit, the size of which is related to the precision of the optical system and the mechanical stage.

Reference Baseline.
This baseline is a radial line at which the rotary encoder outputs a start pulse.
A unique exposure dose $E_V$ for each information unit is achieved as follows. All information units on the same cylinder share the same exposure time ($t$), and the laser beam power $P_V$ of each information unit is controlled by the DA value (i.e., pixel value):

$$E_V = P_V \, t,$$

where $E_V$ is the exposure dose, $P_V$ is the luminous power, and $t$ is the exposure time.
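The dose relation described above (dose as the product of luminous power and exposure time) can be illustrated with a minimal sketch. The linear mapping from DA code to laser power, the full-scale power `P_MAX_W`, and both function names are assumptions made for illustration only; the text does not specify the transfer curve of the laser or its driver. The 8-bit code range follows from the 1-byte DA value described in Section 4.

```python
P_MAX_W = 0.1  # assumed full-scale laser power in watts (hypothetical value)


def exposure_dose(power_w: float, exposure_time_s: float) -> float:
    """Exposure dose as the product of luminous power and exposure time."""
    return power_w * exposure_time_s


def da_value_for_dose(dose_j: float, exposure_time_s: float,
                      p_max_w: float = P_MAX_W) -> int:
    """Return the 8-bit DA code that yields the requested dose.

    Assumes laser power scales linearly with the DA code (0..255),
    which is an illustrative simplification.
    """
    power_w = dose_j / exposure_time_s          # power needed for this dose
    code = round(255 * power_w / p_max_w)       # linear power-to-code mapping
    return max(0, min(255, code))               # clamp to the 1-byte range
```

Because the exposure time is shared by all units on a cylinder, only the DA code changes from unit to unit in this sketch.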

Implementation of Hardware and Synchronous Work
FPGA provides the right mixture of flexibility, hard IP cores, transceiver capabilities, and development tool support to meet the demands of applications with evolving standards and stringent performance requirements in the pursuit of high bandwidth [12]. Thus, an FPGA-based embedded system is the most suitable hardware platform for our system. The FPGA device utilized in our design is the Virtex6-xc6vlx240t with 1156 pins, offering 600 available user I/Os [...] to report the status to the link and transport layers in real time. Speed negotiation between the host and device is implemented in the physical layer. An NPI port supported by a DDR3 SDRAM controller is developed to exchange data between the physical layer and the controller [13]. Another kernel of the design complies with the link and transport layers of the SATA protocol. The data to be transferred need to be packaged according to the protocol specification, and the received data need to be unpackaged for user-specific applications. This part is a bidirectional asynchronous pipeline divided into two main paths: the transmission and reception paths. A MicroBlaze soft processor controls the entire read and write procedure by sending various commands and checking the status, which is done through the application layer of SATA. For more implementation details, one may refer to [14]. The electronics system is shown in Figure 4.
To achieve deep pipelining and comply with the rigid timing constraints of the data path, a double buffer architecture was designed (shown in Figure 5). To synchronize modules with different data widths or clock regions, a first-in-first-out (FIFO) structure was used.
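The idea behind a double buffer can be sketched in software as follows. This is only an illustrative model, not the FPGA implementation: the class name `DoubleBuffer` and the `read_block` callback are hypothetical, and the refill that hardware would perform concurrently with draining is done sequentially here.

```python
class DoubleBuffer:
    """Ping-pong buffer: one block is drained while the other is prefetched."""

    def __init__(self, read_block):
        # read_block is a hypothetical callable that returns the next block
        # of data (a list) from storage, or an empty list when exhausted.
        self._read_block = read_block
        self._active = read_block()    # block currently being consumed
        self._standby = read_block()   # block already prefetched
        self._pos = 0

    def next_item(self):
        """Return the next data item, or None when all blocks are consumed."""
        if self._pos >= len(self._active):
            # Swap roles: the prefetched block becomes active and a new
            # block is fetched into the standby slot.
            self._active = self._standby
            self._standby = self._read_block()
            self._pos = 0
            if not self._active:
                return None
        item = self._active[self._pos]
        self._pos += 1
        return item
```

The consumer never waits on storage as long as a standby block is ready, which is the property the data path needs to deliver a continuous stream.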
When the start pulse is detected, the DA data value is sent to the DA converter and the pulse number is sent to the PULSE COUNT down counter module. The pulse number is the number of rotary encoder output pulses, which is unique for each information unit. The value of the PULSE COUNT down counter decreases by 1 each time the rotary encoder generates a pulse. When the value reaches 0, the rotation stage has rotated to the next information unit, and the DA data value and pulse number are updated. The PULSE TOTAL COUNT module is a 23-bit up counter. Its initial value is 0 when the rotation stage is at the reference baseline, and its value increases by 1 each time the rotary encoder generates a pulse. When the rotation stage completes a full circle, the most significant bit is 1 and the other 22 bits are all 0. An error is indicated when the start pulse of the angle encoder is output but the register is not in this state. The error status is recorded and accumulated, and it is used either as a reference for continued work or to stop the process immediately.
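The counter behavior described above can be modeled with a short behavioral sketch. It is a software illustration under stated assumptions, not the authors' Verilog: the class and method names are hypothetical, the full-circle value is taken as 2^22 (most significant bit of the 23-bit counter set), and the choice to load the next unit on the start pulse only when the down counter is idle is an assumption of this model.

```python
class PulseSynchronizer:
    """Behavioral model of the PULSE COUNT and PULSE TOTAL COUNT modules."""

    def __init__(self, units, pulses_per_rev=1 << 22):
        # units: iterable of (da_value, pulse_number) pairs, in write order.
        self._units = iter(units)
        self.pulses_per_rev = pulses_per_rev
        self.da_value = None     # value currently driven to the DA converter
        self.pulse_count = 0     # PULSE COUNT down counter
        self.pulse_total = 0     # PULSE TOTAL COUNT up counter
        self.error_count = 0     # accumulated synchronization errors

    def _load_next_unit(self):
        unit = next(self._units, None)
        if unit is not None:
            self.da_value, self.pulse_count = unit

    def on_start_pulse(self):
        """Start pulse from the rotary encoder at the reference baseline."""
        if self.pulse_total not in (0, self.pulses_per_rev):
            self.error_count += 1        # register not in the expected state
        self.pulse_total = 0
        if self.pulse_count == 0:
            self._load_next_unit()       # first unit of the next cylinder

    def on_encoder_pulse(self):
        """One sampled pulse from the rotary encoder."""
        self.pulse_total += 1
        if self.pulse_count > 0:
            self.pulse_count -= 1
            if self.pulse_count == 0 and self.pulse_total < self.pulses_per_rev:
                self._load_next_unit()   # advance to the next information unit
```

For example, with `pulses_per_rev=8` and units `[(10, 3), (20, 5)]`, the DA value changes from 10 to 20 after three encoder pulses, and the total counter reads 8 when the next start pulse is expected.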

Structure of LDW and Control Data
A corresponding data structure was designed for the maskless laser direct-writing lithography system. The data structure consists of the DA and AD parts. The data structure can be utilized to fabricate arbitrary patterns and large-area diffractive optical elements (DOEs) with the direct-writing scheme.

Write Data Structure or DA Data Structure.
The DA data structure consists of DA state information, a DA information table, and DA information. The details of these three parts are presented below.
The DA state information part contains data related to the status and parameters of the DA data path, such as the error location on an element and the related system status, an error counter (i.e., the number of runtime errors), and the current status. Thus, the direct-writing lithography process can continue even after a power-down event or an accident.
The DA information table consists of records, one for each cylinder in an element. Each record in the DA information table is 32 bytes. The entries are as follows. The radial position occupies 4 bytes and indicates the radial position of the current cylinder. The exposure time occupies 4 bytes and represents the laser power-on time. These values are the same within a given cylinder; however, they can differ between cylinders. To determine these values, different lithography materials need to be tested. The number of cylinders occupies 4 bytes and indicates the total number of cylinders on the DOE. The number of units occupies 4 bytes and indicates the total number of information units in each cylinder. The next two entries are related to the SSD. The start address occupies 4 bytes and indicates the starting Logical Block Address (LBA) on the SSD where the first DA data of a cylinder are saved. The number of sectors occupies 4 bytes and gives the total number of sectors occupied by the information units of the same cylinder. Another 8 bytes are reserved for address alignment.
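The 32-byte record layout described above can be sketched with Python's `struct` module. The field order follows the text, but the little-endian byte order, the unsigned integer encoding, and the names `DaTableRecord`, `pack_record`, and `unpack_record` are assumptions for illustration; the text does not specify them.

```python
import struct
from collections import namedtuple

# Six 4-byte fields followed by 8 reserved (zero) bytes: 6 * 4 + 8 = 32 bytes.
# Little-endian unsigned integers are an assumption of this sketch.
RECORD_FORMAT = "<6I8x"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)

DaTableRecord = namedtuple(
    "DaTableRecord",
    ["radial_position", "exposure_time", "num_cylinders",
     "num_units", "start_lba", "num_sectors"],
)


def pack_record(record: DaTableRecord) -> bytes:
    """Serialize one DA information table record to 32 bytes."""
    return struct.pack(RECORD_FORMAT, *record)


def unpack_record(raw: bytes) -> DaTableRecord:
    """Parse 32 bytes back into a record; the reserved bytes are skipped."""
    return DaTableRecord(*struct.unpack(RECORD_FORMAT, raw))
```

A fixed 32-byte record divides evenly into a 512-byte sector (16 records per sector), which is one plausible reason for the 8 reserved bytes.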
DA information includes the DA value and pulse number. The DA value occupies 1 byte and controls the laser beam power. The pulse number occupies 1 byte and represents the number of pulses output by the rotary encoder for each information unit.

Testing and Conclusion
Integrated Xilinx ChipScope logic analyzer cores were included in each key module of the design [15,16]. Additional debugging modules were also created to help in the testing process. These modules can easily be turned off using compile-time parameters. Circuitry was also added to the design to implement a transceiving event logger. Although ChipScope is useful, it can only capture a limited number of consecutive samples. Events that occur far from one another in time generally cannot be captured. To overcome this limitation, we created a monitor module that records significant events and sends them to a PC over the RS232 serial port. A MicroBlaze soft core processor was employed to perform parameter management and respond to states. The interface between the SATA core and the MicroBlaze processor was established to allow the processor to set various parameters, such as the number of sectors and the start address, and to initiate data transfer. The control module also provides status information to the processor.
Bandwidth was tested by issuing write or read commands with varying block sizes. After each command, the number of clock cycles taken to complete the transfer was stored and used to calculate the average throughput in MB/s. The read and write speed of SATA was tested using the 600 GB Intel SSD 320 Series [17]. The bandwidth consumed and the speed of read and write transmission with different data block sizes are shown in Figure 6.
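The throughput calculation from a cycle count can be sketched as follows. The 512-byte sector size, the convention of 1 MB = 10^6 bytes, and the function names are assumptions for illustration; the clock frequency of the cycle counter is not stated in the text and must be supplied by the caller.

```python
def throughput_mb_per_s(num_sectors: int, cycles: int, clock_hz: float,
                        sector_bytes: int = 512) -> float:
    """Throughput of one transfer, from the cycle count it took to complete.

    sector_bytes=512 and MB = 10**6 bytes are assumptions of this sketch.
    """
    seconds = cycles / clock_hz
    return num_sectors * sector_bytes / seconds / 1e6


def average_throughput(measurements, clock_hz: float) -> float:
    """Average throughput over (num_sectors, cycles) pairs of one block size."""
    rates = [throughput_mb_per_s(n, c, clock_hz) for n, c in measurements]
    return sum(rates) / len(rates)
```

For instance, a 2048-sector transfer that takes 1,048,576 cycles of a hypothetical 100 MHz counter corresponds to 100 MB/s under these assumptions.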
A rotation stage with a 26-bit rotary encoder was used. The positioning pulses used by the FPGA were sampled from the encoder pulses, and the sampled pulses had a total of 22 bits, as provided by the stage control equipment. The maximum frequency of the sampled pulse was set to 20 MHz. The outside and inside radii of the lithography substrate were 60 and 10 mm, respectively. The linear speed at the outer diameter of the optical device substrate was 1.79 m/s. A signal generator output pulse (20 MHz) was utilized for the PULSE COUNT module and PULSE TOTAL COUNT module to position the substrate. A binary zone plate structure was designed, and its data were stored on the SSD according to the data structure described in Section 4. The total number of information units was 275 × 10⁹. The sum of the DA state information, DA information table, and DA information data was 560 GB for the area of the binary zone plate structure. Each information unit was allocated a specific DA value and a pulse number, which can be read correctly and sent to the cache of the DAC according to the signal generator output pulse. When the signal generator output pulse is 20 MHz, the required data rate of the DAC is 8.98 MB/s. In our test, the clock of the DAC was 20 MHz, and the SSD read bandwidth was more than 40 MB/s. Read back AD data were stored in another SSD. These read back data were used to verify the writing process. The SATA IP can easily be duplicated in the FPGA to implement parallel transmission; this will be used in future work if necessary.
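The relation between the pulse rate and the linear speed quoted above can be checked with a short calculation. It assumes that one revolution corresponds to 2^22 sampled pulses (from the 22-bit sampled pulse total) and that each information unit occupies 2 bytes of DA information (1-byte DA value plus 1-byte pulse number); the variable names are illustrative.

```python
import math

PULSES_PER_REV = 2 ** 22      # assumed from the 22-bit sampled pulse total
PULSE_RATE_HZ = 20e6          # maximum sampled pulse frequency
OUTER_RADIUS_M = 0.060        # outside radius of the substrate

# Rotation rate when pulses arrive at the maximum frequency.
rev_per_s = PULSE_RATE_HZ / PULSES_PER_REV

# Linear speed at the outer radius of the substrate.
linear_speed_m_per_s = 2 * math.pi * OUTER_RADIUS_M * rev_per_s

# Size of the DA information alone, at 2 bytes per information unit.
TOTAL_UNITS = 275e9
BYTES_PER_UNIT = 2
da_info_bytes = TOTAL_UNITS * BYTES_PER_UNIT

print(f"{rev_per_s:.2f} rev/s, {linear_speed_m_per_s:.2f} m/s, "
      f"{da_info_bytes / 1e9:.0f} GB of DA information")
```

Under these assumptions the speed comes out at roughly 1.8 m/s, close to the 1.79 m/s reported, and the DA information accounts for about 550 GB of the 560 GB total; the text does not break down the remainder.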
The required data rates for maskless LDW cannot be achieved with existing off-the-shelf components. Therefore, specific high-speed transmission hardware has been established, and an LDW data structure for fabricating DOEs with areas on the order of 10² cm² was proposed.
The modulation frequency for the direct-writing laser beam can reach the order of 10 MHz, which is much higher than that of traditional acousto-optic and spatial light modulators.
The throughput has been greatly improved, reaching the order of meters per second or square centimeters per second. Table 2 lists the main features of some of the related equipment [18]. Some electron beam direct-write systems are being developed to meet the requirements of upcoming IC generations [7,19]. Several main parameters and comparisons with our system are presented.
The LDW technology in its current state is already appealing for wide industrial applications.We argue that the utilization of LDW lithography opens new routes for freeform optics with large area and high throughput.

Figure 2: Cylinders and information units of the substrate.
The Virtex6-xc6vlx240t device comprises 37,680 Slices. Each Slice is composed of four look-up tables and eight flip-flops. A Configurable Logic Block (CLB) consists of two Slices and the necessary connection resources. RocketIO transceivers are embedded with parallel-to-serial and serial-to-parallel transceiver cores, thereby enabling bit rates of up to 6.6 Gbps per channel. The high-speed transceiver resource is used as the physical layer of various protocols. The xc6vlx240t device offers 24 transceivers. A serial ATA core is implemented based on RocketIO transceivers, CLBs, and high-speed I/Os. Other necessary modules are established by synthesizing Verilog code based on CLBs and block memory resources. The design is integrated with a MicroBlaze soft processor to provide a flexible control interface. The embedded processor manages most data transceiver tasks and reduces PC intervention. SSD is a new type of storage medium; it is faster and more reliable than a typical hard drive because it has no moving parts. The storage size of an SSD almost equals that of a traditional hard disk drive.

Figure 3: Overview of the electronic hardware architecture.

Figure 6: Speed of read and write SSD test results.

Table 1: Address space of each part.

4.2. Structure of Data Read Back by AD.
The AD part of the data is stored in another SSD. The AD data structure consists of AD state information, an AD information table, and AD information. The AD state information is similar to the DA state information. The AD information table consists of the radial position occupying 4 bytes, the number of cylinders occupying 4 bytes, and the number of units occupying 4 bytes. The address range of the AD information table is similar to that of the DA information table. To improve the efficiency of SSD write/read and align the addresses used in data storage, zeros are used to fill the unused space. AD information contains only AD values.
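The zero filling used for alignment can be sketched as follows. The 32-byte AD record size (taken to match the DA record), the 512-byte sector size, the little-endian encoding, and the function names are all assumptions for illustration; the text states only that zeros fill the unused space.

```python
import struct

AD_RECORD_BYTES = 32   # assumed equal to the DA information table record size
SECTOR_BYTES = 512     # assumed SSD sector size


def pack_ad_record(radial_position: int, num_cylinders: int,
                   num_units: int) -> bytes:
    """Pack the three 4-byte AD table fields and zero-fill to the record size."""
    body = struct.pack("<3I", radial_position, num_cylinders, num_units)
    return body.ljust(AD_RECORD_BYTES, b"\x00")


def pad_to_sector(data: bytes, sector_bytes: int = SECTOR_BYTES) -> bytes:
    """Append zeros so the data length is a whole number of sectors."""
    remainder = len(data) % sector_bytes
    if remainder == 0:
        return data
    return data + b"\x00" * (sector_bytes - remainder)
```

Aligning every block to a sector boundary lets each cylinder's data be addressed by a start LBA and a sector count, as the DA information table does.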