Quantum-dot cellular automata (QCA) is an attractive nanotechnology and a potential alternative to CMOS technology. QCA provides an interesting paradigm for faster speed, smaller size, and lower power consumption than transistor-based technology, in both communication and computation. This paper describes the design of a 4-bit multifunction nanosensor data processor (NSDP). The functions of NSDP are (i) sending the preprocessed raw data to a high-level processor, (ii) counting the number of active majority gates, and (iii) generating an approximate sigmoid function. The whole system is designed and simulated with several different sets of input data.
1. Introduction
Nanotechnology is a multidisciplinary field that brings together many science and engineering disciplines including physics, chemistry, biosciences, material science, computer science, and electrical and mechanical engineering. Nanotechnology will radically affect all these disciplines and their application areas. The economic impact is foreseen to be comparable to that of the information technology and telecommunication industries [1]. Nanotechnology has direct applications in sensing, sensor miniaturization, and new materials development as well as in electronics and electromechanical devices. Substantial advances in nanotechnology development have been achieved in the fields of engineering and bioscience [2–4]. For example, integrating a large number (496) of programmable FET nodes in a small area of about 960 μm^{2} and programming them into a full adder, subtractor, multiplexer, and demultiplexer has been reported. Work reported in [3] demonstrated that a lateral integration of 700 rows of ZnO nanowires produces a peak voltage of 1.26 V at a low strain of 0.19%, which is potentially sufficient to recharge an AA battery. The nanosensor development described in [5], based on nanowires, is emerging as a powerful and general class of ultrasensitive electrical sensors for the direct detection of biological and chemical species, from proteins and DNA to drug molecules and viruses, down to the ultimate level of a single molecule. It also shows that a nanosensor array that contains 100 addressable elements provides unique opportunities for label-free multiplexed detection of biological and chemical species. Reference [6] provides an in-depth view of nanosensor technology and electromagnetic communication among nanosensors. With the integration of the technologies described in [3, 5, 6], that is, the integration of nanosensors, self-powered nanodevices, and wireless nanosensors, many nanosensor applications can be considered.
These applications can be classified into four groups: biomedical, environmental, industrial, and military applications. In biomedical applications, health monitoring systems and drug delivery systems are some of the examples. In environmental applications, plant monitoring systems and plague defeating systems can be given as examples. In industrial and consumer goods applications, the examples are ultrahigh sensitivity touch surfaces and haptic interfaces. For military and defense applications, examples are nuclear, biological, and chemical defenses and damage detection systems.
In the above-mentioned applications, although the total number of nanosensors used in each case depends on the scale of the deployed devices, generally this number ranges from 10^{3} to 10^{7} or higher. These application devices have the following common characteristics.
Simple Data. Each nanosensor delivers only simple data. The processing of the data provided by each individual nanosensor does not need complicated operation.
Ambiguity. Because some nanosensors may be in the charging phase (self-powered) or running at low power, they may provide ambiguous signals. The receiver (processing) side cannot obtain accurate data at all times. Also, as in any other communication system, the signal may fade because of environmental conditions and noise.
Massive Data. Nanosensing devices contain a large number of nanosensors (10^{3}–10^{7} or higher) which generate data continuously. The data from each nanosensor may be primitive and ambiguous. However, a massive amount of this type of data, if processed correctly, delivers useful and meaningful information from the device as a whole.
To analyze data from the nanosensing devices mentioned above, classical computing and processing models, such as loosely coupled distributed systems or tightly coupled parallel systems, can be employed. However, in the classical computing and processing models, the number of processors is usually in the range of 10^{1}–10^{3}. This number can be increased, but only with a tremendous increase in cost and system dimensions. Each processing unit usually has a local memory and can conduct complex operations independently. These units also consume large amounts of energy (power), and heat dissipation becomes a critical issue. The above-mentioned computing and processing models are therefore not suitable for processing data provided by nanosensors. The computing and processing models for nanosensing should meet the following three principles.
Simplicity. The basic processing element (PE), that is, the cell, is simple. PE does not need complicated operations and does not need many instructions, as those in the existing general-purpose CPUs. It only needs a small number of operations because it will function as a bridge between nanosensors and high-level processors.
Parallelism. There will be a vast number of cells operating in parallel. Hence, the processing must be distributed.
Locality. All interactions take place on a purely local basis. A cell can interact with a few other cells.
The ITRS report [7] summarizes several possible technology solutions for nanosensor data processing. Quantum-dot cellular automata (QCA) is an interesting possibility. Since QCAs were introduced in 1993 [8], several experimental devices have been developed [9–13]. Although they are certainly “not ready for prime time,” recent papers show that QCAs may eventually achieve high density [14], fast switching speed [15], and room temperature operation [10, 16].
This paper describes the design of a PE for nanosensor data processing, named the nanosensor data processor (NSDP), based on QCA. NSDP works as a bridge between the nanosensors and a high-level processor. Its functions are limited to three: (i) sending the preprocessed raw data to the high-level processor, (ii) counting the number of active majority gates (active means that the output of a majority gate is 1), and (iii) generating an approximate sigmoid function for postprocessing based on an artificial neural network. This paper is organized as follows. Section 2 presents the background of QCA, focusing on its unique clocking scheme. Section 3 presents the detailed design of NSDP. Simulation results are shown in Section 4. Conclusions and future work are given in Section 5.
2. QCA Background
2.1. QCA Cells and Wires
A QCA cell is a square nanostructure with a quantum dot in each of the four corners [17], as shown schematically in Figure 1. The cell is populated with two electrons that can tunnel between two pairs of quantum dots connected via a tunnel junction. The two electrons occupy antipodal sites within the cell due to Coulombic repulsion. Tunneling action only occurs within the cell and no tunneling happens between cells. The combination of quantum confinement, Coulombic repulsion, and the discrete electronic charge produces bistable behavior. The two charge configurations can be used to represent binary “0” and “1” with polarization of −1 and +1, respectively. In contrast to a physical wire, a QCA “wire” is a chain of cells where the cells are adjacent to each other, as shown in Figure 1(b). Since no electrons tunnel between cells, QCA provides a mechanism for transferring information without current flow.
Basic QCA cell and wire.
2.2. QCA Logic Gates
In QCA, three-input majority gates and inverters serve as the fundamental gates. A majority gate, as shown in Figure 2(a), consists of five QCA cells that realize the function of M(a,b,c)=ab+bc+ac. An inverter, as shown in Figure 2(b), is made by positioning cells diagonally from each other to achieve the inversion functionality. Figures 2(c) and 2(d) show the variation layouts of an inverter. Majority gates and inverters form a universal set; that is, any logic function can be implemented by using this set. For example, a two-input AND gate is realized by fixing one of the majority gate inputs to “0,” that is, AND(a,b)=M(a,b,0)=ab. Similarly, an OR gate is realized by fixing one input to “1,” that is, OR(a,b)=M(a,b,1)=ab+b·1+a·1=a+b.
(a) Majority gate and its symbol; (b) inverter and its symbol; (c) and (d) variation layout of an inverter.
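The gate identities above can be checked with a short truth-table sketch. It is a purely logical model, not a physical QCA simulation, and the function names (`majority`, `inverter`, `and_gate`, `or_gate`) are illustrative choices rather than names from the paper.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: M(a, b, c) = ab + bc + ac."""
    return (a & b) | (b & c) | (a & c)


def inverter(a: int) -> int:
    """Logical inversion of a single bit."""
    return a ^ 1


def and_gate(a: int, b: int) -> int:
    """AND realized by fixing one majority input to 0."""
    return majority(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """OR realized by fixing one majority input to 1."""
    return majority(a, b, 1)


if __name__ == "__main__":
    print("a b AND OR")
    for a, b in product((0, 1), repeat=2):
        print(a, b, and_gate(a, b), or_gate(a, b))
```

Because AND, OR, and NOT are all available, any Boolean function can be composed from these primitives, which is the universality property stated in the text.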
2.3. QCA Clocking Scheme
Adiabatic switching is used for QCA clocking, which significantly reduces metastability issues and enables deep pipelines [18]. During each clock cycle, half of the wire is active for signal propagation, while the other half is unpolarized. During the next clock cycle, half of the previous active clock zone is deactivated and the remaining active zone cells trigger the newly activated cells to be polarized. Thus, signals propagate from one clock zone to the next. The circuit area is divided into four sections and they are driven by four phase clock signals, as shown in Figure 3. In each zone, the clock signal has four states: high-to-low, low, low-to-high, and high. The cell begins computing during the high-to-low state and holds the value during the low state. The cell is released when the clock is in the low-to-high state and inactive during the high state.
QCA clocking scheme.
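A small sketch may help to visualize the four-phase clock. It assumes that each clock zone lags the previous one by exactly one quarter cycle, and the labels "compute", "hold", "release", and "inactive" simply paraphrase the cell behaviour described above. The function names are hypothetical.

```python
from fractions import Fraction

# Clock states in the order a zone passes through them, paired with the
# cell behaviour the text associates with each state.
CLOCK_STATES = (
    ("high-to-low", "compute"),
    ("low", "hold"),
    ("low-to-high", "release"),
    ("high", "inactive"),
)


def zone_state(zone: int, quarter: int) -> tuple[str, str]:
    """Clock state and cell behaviour of `zone` during quarter-cycle `quarter`.

    Assumption: zone k lags zone k-1 by exactly one quarter cycle.
    """
    return CLOCK_STATES[(quarter - zone) % 4]


def propagation_delay(zones_crossed: int) -> Fraction:
    """Delay, in clock cycles, for a signal to cross `zones_crossed` zones."""
    return Fraction(zones_crossed, 4)


if __name__ == "__main__":
    for quarter in range(4):
        row = [zone_state(zone, quarter)[1] for zone in range(4)]
        print(f"quarter {quarter}: {row}")
    print("delay across 3 zones:", propagation_delay(3), "clock")
```

In this model a zone computes while its predecessor holds, so a signal advances by one zone per quarter cycle.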
2.4. QCA Design Rules
A nominal cell size of 20 nm by 20 nm is assumed. Each cell has a width and height of 18 nm and contains quantum dots 5 nm in diameter. The cells are placed on a grid with a cell center-to-center distance of 20 nm. QCA design rules are well studied in [19–21] and are summarized in the following.
(A) Layout Design Rules
Maximum Number of Cells in a Single Clocking Zone. It can be as large as 47 cells; the maximum length of QCA wire is 25 cells. Any long QCA wire exceeding the length of 25 needs to be partitioned into different clocking zones.
Minimum Number of Cells in a Single Clocking Zone. It can be one cell. However, the waveform of a one-cell clocking zone can become distorted and cascading of this kind of clocking zone could lead to incorrect results [22]. To observe correct outputs from a circuit, it is recommended that clocking zones should consist of at least two cells.
Minimum Wire Spacing for Signal Separation. A space of one QCA cell size is sufficient separation between two wires carrying different signals.
Wire Crossover. A unique property of QCA layout is the possibility of implementing crossovers by using only one layer, known as coplanar crossing. Coplanar crossing uses both 45° and 90° cells. However, they can easily fail due to low robustness [23] and fabrication issues [24]. Another alternative is multilayer crossing, which uses more than one layer of cells similar to the routing of metal wires in CMOS technology. The extra layers of QCA are believed to be useful as active components of the circuits and consume less area compared to coplanar circuits [23].
QCA Equivalent λ-Rule. “λ-rule” for QCA circuit design could be defined according to the size of a QCA cell, or perhaps the cell size itself could be used as the equivalent λ.
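As an illustration of the wire-length rules above, the following sketch splits a long wire into clocking zones of at most 25 cells and at least 2 cells. The balanced splitting strategy, the cyclic zone numbering 0 to 3, and the name `partition_wire` are assumptions made for this example, not a procedure taken from the cited design rules.

```python
import math

MAX_WIRE_CELLS = 25  # longest wire allowed in one clocking zone (rule above)
MIN_ZONE_CELLS = 2   # recommended minimum number of cells per zone


def partition_wire(length: int) -> list[tuple[int, int]]:
    """Split a wire of `length` cells into (clock_zone, cell_count) segments.

    Segments are balanced so that none exceeds MAX_WIRE_CELLS, and
    consecutive segments are assigned clock zones 0, 1, 2, 3, 0, ...
    """
    if length < MIN_ZONE_CELLS:
        raise ValueError("wire is shorter than the recommended two-cell zone")
    n_segments = math.ceil(length / MAX_WIRE_CELLS)
    base, extra = divmod(length, n_segments)
    sizes = [base + 1] * extra + [base] * (n_segments - extra)
    return [(index % 4, size) for index, size in enumerate(sizes)]


if __name__ == "__main__":
    for cells in (10, 25, 26, 60):
        print(cells, partition_wire(cells))
```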
(B) Timing Design Rules
Logic Component Timing Rule. The timing constraint on a QCA majority gate is that all three inputs are expected to reach the device cell (central cell) at the same time in order to have fair voting.
Clocking Zone Assignment Rule. In QCA circuits, cells in each clocking zone should be synchronized.
(C) Special Rules for QCA
Majority Logic Reduction. The logic primitive used in QCA is the majority gate. The majority logic-based reduction method [25] can significantly reduce the complexity of QCA circuits.
Systolic Design. The features of systolic architecture in terms of synchrony, deep pipelines, and local interconnection are particularly suitable for accommodating the special timing requirement in QCA circuits. When applying systolic architecture to QCA, significant benefits can be achieved, even more than when applied to CMOS-based technology [26].
In the design of NSDP, the second layout design rule (minimum number of cells in a single clocking zone) is applied in the following way. A one-cell clocking zone is used for a fixed-value input, such as fixing one input of a majority gate to “−1.00” (binary 0) to make it an AND gate. A one-cell clocking zone is also used where space is limited, but such zones are never cascaded. For the fourth rule (wire crossover), the multilayer crossing technique is employed. The following section presents the detailed design of NSDP.
3. NSDP Architecture
NSDP is a processor that works as a bridge between nanosensors and the high-level processor. Its functions are (i) sending the preprocessed raw data to high-level processor, (ii) counting the number of the active majority gates, and (iii) generating the approximate sigmoid function. Among these functions, the last one is the focus of NSDP because it provides the sigmoid function output for the high-level processor to conduct the processing based on the artificial neural network (ANN). The block diagram of NSDP is shown in Figure 4. The detail of each block is explained below.
Block diagram of NSDP.
3.1. Preprocessing
Considering that a large number of nanosensors will generate a great amount of “0” or “1” data, and some nanosensors may generate ambiguous data, the first processing step in NSDP is the majority operation, which determines which value (“0” or “1”) is in the majority. In general, this unit can have any number of inputs, and the larger the number of inputs, the better the majority operation. However, considering the capability of the available QCA design tool, QCADesigner, the number of inputs in NSDP is set at 12. This processing unit consists of four majority gates, and it has 12 inputs and 4 outputs. Figure 5(a) shows the schematic, Figure 5(b) shows the QCA layout, and Figure 5(c) shows the simulation result from QCADesigner. The output of this unit is given by
$$m_0 = M(a_0, b_0, c_0), \quad m_1 = M(a_1, b_1, c_1), \quad m_2 = M(a_2, b_2, c_2), \quad m_3 = M(a_3, b_3, c_3). \tag{1}$$
As shown in Figure 5(c), for the input a0b0c0={000,001,010,001,100,101,110,111}, a1b1c1={000,100,110,101,100,011,010,011}, a2b2c2={000,111,110,101,100,011,010,001}, and a3b3c3={000,001,010,011,100,101,110,111}, the output m0={0,0,0,0,0,1,1,1}, m1={0,0,1,1,0,1,0,1}, m2={0,1,1,1,0,1,0,0}, m3={0,0,0,1,0,1,1,1}, respectively, and the delay is 3/4 clock.
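The input and output sequences quoted above can be reproduced with a logical model of the four majority gates. This is only a functional sketch: it ignores the 3/4 clock delay, and the names `majority_stream` and `INPUTS` are illustrative.

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: M(a, b, c) = ab + bc + ac."""
    return (a & b) | (b & c) | (a & c)


def majority_stream(vectors: list[str]) -> list[int]:
    """Apply one majority gate to a sequence of 3-bit strings such as '101'."""
    return [majority(*(int(ch) for ch in vector)) for vector in vectors]


# Input sequences a_i b_i c_i quoted in the text for Figure 5(c).
INPUTS = {
    "m0": ["000", "001", "010", "001", "100", "101", "110", "111"],
    "m1": ["000", "100", "110", "101", "100", "011", "010", "011"],
    "m2": ["000", "111", "110", "101", "100", "011", "010", "001"],
    "m3": ["000", "001", "010", "011", "100", "101", "110", "111"],
}

if __name__ == "__main__":
    for name, vectors in INPUTS.items():
        print(name, majority_stream(vectors))
```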
3.2. Counter
The output of the preprocessing unit is split into two paths: one goes to the M-Latch and the other goes to the input of the counter. The counter counts the number of “1”s in the output of the preprocessing unit. To count the “1”s in the 4 parallel output bits of the preprocessing unit, it employs three full adders, as shown in Figure 6(a). QCA full adders have been well studied; some representative QCA adders are described in [19, 27–31]. The QCA carry flow adder reported in [19] is a layout-optimized multilayer full adder. In QCA, the path from carry-in to carry-out uses one majority gate, which requires one clocking zone per bit in a ripple carry adder. This adder (referred to as the Cho adder in this paper) consumes only one clocking zone delay per bit, which significantly reduces the delay for large adders. Three Cho adders are employed to implement the counter. The QCA layout is shown in Figure 6(b), and the simulation result is shown in Figure 6(c). The output is given by
$$s_2 s_1 s_0 = m_3 + m_2 + m_1 + m_0. \tag{2}$$
As shown in Figure 6(c), for the input m0={0,0,0,1,0,1,1,1}, m1={0,1,1,1,0,1,0,0}, m2={0,1,1,1,0,1,0,0}, m3={0,0,0,1,0,1,1,1}, the output s2s1s0={0,2,2,4,0,4,2,2}. The delay is 2 clocks. Note that the “+” operator here means arithmetic addition. In the following, “+” means the logic OR operation unless otherwise noted.
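One way to count four bits with three full adders is sketched below. The wiring of the three adders is an assumption, since the arrangement in Figure 6(a) is not reproduced here, and the full adder uses the common three-majority-gate form rather than the specific Cho adder layout. Timing is ignored.

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: M(a, b, c) = ab + bc + ac."""
    return (a & b) | (b & c) | (a & c)


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Majority-gate full adder; returns (sum, carry_out)."""
    cout = majority(a, b, cin)
    total = majority(1 - cout, cin, majority(a, b, 1 - cin))
    return total, cout


def count_ones(m3: int, m2: int, m1: int, m0: int) -> tuple[int, int, int]:
    """Return (s2, s1, s0), the 3-bit count of 1s among the four inputs.

    Assumed wiring: the first adder sums three inputs, the second adds the
    fourth input, and the third combines the two carries.
    """
    t, c1 = full_adder(m0, m1, m2)   # t + 2*c1 == m0 + m1 + m2
    s0, c2 = full_adder(t, m3, 0)    # s0 + 2*c2 == t + m3
    s1, s2 = full_adder(c1, c2, 0)   # s1 + 2*s2 == c1 + c2
    return s2, s1, s0


if __name__ == "__main__":
    m0 = [0, 0, 0, 1, 0, 1, 1, 1]
    m1 = [0, 1, 1, 1, 0, 1, 0, 0]
    m2 = [0, 1, 1, 1, 0, 1, 0, 0]
    m3 = [0, 0, 0, 1, 0, 1, 1, 1]
    for bits in zip(m3, m2, m1, m0):
        s2, s1, s0 = count_ones(*bits)
        print(bits, "->", 4 * s2 + 2 * s1 + s0)
```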
3.3. Decoder
The decoder unit and the sigmoid pattern generator unit are used to approximate the sigmoid function y = 1/(1+e^{-x}). As shown in Table 1, NSDP employs 4 points to approximate the sigmoid function. The decoder is a 3-to-4 decoder; the inputs are s2, s1, and s0, and the outputs are f4, f3, f2, and f1. The output f1 is active when s2s1s0 is equal to 0 or 1, that is, when m3m2m1m0 equals 1000, 0100, 0010, 0001, or 0000. Note that the position of “1” does not matter. For the four inputs of the counter, m3, m2, m1, and m0, m3 is not the most significant bit and m0 is not the least significant bit, because the twelve input sensors can only be considered “fired” or “not fired” and the majority gates can be numbered in any order. The same holds hereafter. Likewise, the output f2 is active when s2s1s0 is equal to 2, that is, when m3m2m1m0 equals 1100, 0110, 0011, 1010, 0101, or 1001. The output f3 is active when s2s1s0 is equal to 3, that is, when m3m2m1m0 equals 1110, 1101, 1011, or 0111. The output f4 is active when s2s1s0 is equal to 4, that is, when m3m2m1m0 equals 1111. The outputs f4, f3, f2, and f1 are given by
$$f_1 = \bar{s}_2 \bar{s}_1, \quad f_2 = \bar{s}_2 s_1 \bar{s}_0, \quad f_3 = \bar{s}_2 s_1 s_0, \quad f_4 = s_2 \bar{s}_1 \bar{s}_0. \tag{3}$$
Table 1: Decoding of the output of the counter.

| s2s1s0 | Output: f4f3f2f1 | Sigmoid function pattern |
|---|---|---|
| The number of “1” is less than or equal to 1 (the position does not matter). | 0001 | 0000 |
| The number of “1” is equal to 2 (the position does not matter). | 0010 | 0011 |
| The number of “1” is equal to 3 (the position does not matter). | 0100 | 1101 |
| The number of “1” is equal to 4. | 1000 | 1111 |
Figure 7(a) shows the schematic of this 3-to-4 decoder, Figure 7(b) shows the QCA layout, and Figure 7(c) shows the simulation result. As shown in Figure 7(c), when the input s2s1s0 (sum) is equal to 0, 1, 2, 4, 0, 3, 1, and 2, the output f4f3f2f1 is equal to 0001, 0001, 0010, 1000, 0001, 0100, 0001, and 0010, respectively, and the delay is one clock.
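The decoder logic of (3) can be checked against the simulated sequence quoted above with a small logical model. The function names are illustrative, and the one-clock delay is not modeled.

```python
def decode(s2: int, s1: int, s0: int) -> tuple[int, int, int, int]:
    """3-to-4 decoder of equation (3); returns (f4, f3, f2, f1)."""
    n2, n1, n0 = 1 - s2, 1 - s1, 1 - s0
    f1 = n2 & n1            # count is 0 or 1
    f2 = n2 & s1 & n0       # count is 2
    f3 = n2 & s1 & s0       # count is 3
    f4 = s2 & n1 & n0       # count is 4
    return f4, f3, f2, f1


def decode_count(count: int) -> str:
    """Decode an integer count (0 to 4) into the bit string f4f3f2f1."""
    bits = decode((count >> 2) & 1, (count >> 1) & 1, count & 1)
    return "".join(str(bit) for bit in bits)


if __name__ == "__main__":
    for count in (0, 1, 2, 4, 0, 3, 1, 2):
        print(count, "->", decode_count(count))
```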
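3.4. Pattern Generator
The output of the decoder is used to generate four patterns, 1111, 1101, 0011, and 0000, as listed in the third column of Table 1, which are used to approximate the sigmoid function. These four patterns are generated in the following way:
$$\begin{aligned}
f_{4a} &= f_4 \cdot 1, & f_{4b} &= f_4 \cdot 1, & f_{4c} &= f_4 \cdot 1, & f_{4d} &= f_4 \cdot 1,\\
f_{3a} &= f_3 \cdot 1, & f_{3b} &= f_3 \cdot 1, & f_{3c} &= f_3 \cdot 0, & f_{3d} &= f_3 \cdot 1,\\
f_{2a} &= f_2 \cdot 0, & f_{2b} &= f_2 \cdot 0, & f_{2c} &= f_2 \cdot 1, & f_{2d} &= f_2 \cdot 1,\\
f_{1a} &= f_1 \cdot 0, & f_{1b} &= f_1 \cdot 0, & f_{1c} &= f_1 \cdot 0, & f_{1d} &= f_1 \cdot 0.
\end{aligned} \tag{4}$$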
The schematics for f4af4bf4cf4d, f3af3bf3cf3d, f2af2bf2cf2d, and f1af1bf1cf1d are shown in Figure 8(a). The QCA layout for f4af4bf4cf4d is shown in Figure 8(b). The layouts for f3af3bf3cf3d, f2af2bf2cf2d, and f1af1bf1cf1d are similar to the one in Figure 8(b), except that the pattern to be generated is 1101, 0011, and 0000, respectively. The simulation result is shown in Figure 8(c). When the input s2s1s0 (sum) is equal to 0, 1, 2, 4, 0, and 3, the generated pattern is 0000 (0) and 0000 (0) for f1af1bf1cf1d, 0011 (3) for f2af2bf2cf2d, 1111 (15) for f4af4bf4cf4d, 0000 (0) for f1af1bf1cf1d, and 1101 (13) for f3af3bf3cf3d, respectively. The delay is 3/4 clock. Note that the delay for f2af2bf2cf2d and f1af1bf1cf1d is one clock longer than that of f4af4bf4cf4d and f3af3bf3cf3d. This is because the input for f2af2bf2cf2d and f1af1bf1cf1d is delayed by one clock.
Pattern generator. (a) Schematic for 1111, 1101, 0011, and 0000 generators; (b) QCA layout for 1111 generator (this QCA layout can generate other patterns by changing the standard pattern to 1101, 0011, or 0000); (c) simulation result for 1111 (15), 1101 (13), 0011 (3), and 0000 (0) generators.
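Each pattern generator simply ANDs one decoder output with a fixed 4-bit constant. A minimal sketch, assuming the constants 1111, 1101, 0011, and 0000 from Table 1 and using the illustrative names `PATTERNS` and `generate_pattern`:

```python
# Fixed constants gated by each decoder output, in the order (a, b, c, d).
PATTERNS = {
    "f4": (1, 1, 1, 1),
    "f3": (1, 1, 0, 1),
    "f2": (0, 0, 1, 1),
    "f1": (0, 0, 0, 0),
}


def generate_pattern(select: int, pattern: tuple[int, ...]) -> tuple[int, ...]:
    """AND the decoder output `select` with every bit of a fixed pattern."""
    return tuple(select & bit for bit in pattern)


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(name, "active:", generate_pattern(1, pattern),
              "inactive:", generate_pattern(0, pattern))
```

An inactive decoder output forces its pattern to 0000, so only the selected pattern reaches the next stage.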
3.5. Sigmoid Function Output
The output of the sigmoid function is one of the four patterns, f4af4bf4cf4d, f3af3bf3cf3d, f2af2bf2cf2d, and f1af1bf1cf1d, depending on the number of 1’s of the counter. Therefore, the output g3g2g1g0 of the sigmoid function can be written as
$$g_3 = f_{4a} + f_{3a} + f_{2a} + f_{1a}, \quad g_2 = f_{4b} + f_{3b} + f_{2b} + f_{1b}, \quad g_1 = f_{4c} + f_{3c} + f_{2c} + f_{1c}, \quad g_0 = f_{4d} + f_{3d} + f_{2d} + f_{1d}. \tag{5}$$
The schematics for gi (i = 0, 1, 2, and 3) are shown in Figure 9(a). The QCA layout for g0 is shown in Figure 9(b). The same layout is used for the other outputs gi (i = 1, 2, and 3), except that the inputs and output are changed to (f4c, f3c, f2c, f1c; g1), (f4b, f3b, f2b, f1b; g2), and (f4a, f3a, f2a, f1a; g3), correspondingly. The simulation result is shown in Figure 9(c). When the input s2s1s0 (sum) is equal to 0, 1, 2, 4, 0, and 3, the output g3g2g1g0 is 0000 (0), 0000 (0), 0011 (3), 1111 (15), 0000 (0), and 1101 (13), correspondingly. The delay from the output of the counter to the output of the sigmoid function is 2¾ clocks. Figure 10 shows the graph of the approximated sigmoid function, generated by the decoder and pattern generator units described above.
Output of sigmoid function. (a) Schematic for the output of the sigmoid function; (b) QCA layout for g0 of the sigmoid function (this QCA layout is applicable to g1, g2, and g3 by changing to the corresponding input); (c) simulation result of the sigmoid function output.
Approximate sigmoid function generated by NSDP.
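Combining the decoder (3), the pattern generators (4), and the OR stage (5) gives the mapping from the counter value to the 4-bit sigmoid code. The sketch below is a logical model only, with timing ignored and with illustrative function names; dividing the code by 15 to obtain a value between 0 and 1 is an assumption made here for display, not a step described in the paper.

```python
# Fixed patterns gated by f4, f3, f2, f1, each in the order (a, b, c, d).
PATTERNS = ((1, 1, 1, 1), (1, 1, 0, 1), (0, 0, 1, 1), (0, 0, 0, 0))


def decode(s2: int, s1: int, s0: int) -> tuple[int, int, int, int]:
    """3-to-4 decoder of equation (3); returns (f4, f3, f2, f1)."""
    n2, n1, n0 = 1 - s2, 1 - s1, 1 - s0
    return s2 & n1 & n0, n2 & s1 & s0, n2 & s1 & n0, n2 & n1


def sigmoid_output(count: int) -> tuple[int, int, int, int]:
    """Return (g3, g2, g1, g0) for a counter value between 0 and 4."""
    selects = decode((count >> 2) & 1, (count >> 1) & 1, count & 1)
    g = [0, 0, 0, 0]
    for select, pattern in zip(selects, PATTERNS):
        for position, bit in enumerate(pattern):
            g[position] |= select & bit   # OR of the gated patterns, eq. (5)
    return tuple(g)


def as_int(bits: tuple[int, ...]) -> int:
    """Interpret a tuple of bits, most significant first, as an integer."""
    return int("".join(str(bit) for bit in bits), 2)


if __name__ == "__main__":
    for count in range(5):
        code = as_int(sigmoid_output(count))
        print(count, sigmoid_output(count), code, round(code / 15, 3))
```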
3.6. Function Control
NSDP is a multifunction nanosensor data processor. The function selection of NSDP is determined by the function control unit, which is a 2-to-3 decoder. The truth table of the function control unit is shown in Table 2. When control bus c1c0 is equal to 00, the output of NSDP is the raw majority gate output, that is, m3m2m1m0 from 4 independent majority gates. The bit position does not matter; that is, m3 is not the MSB and m0 is not the LSB. When control bus c1c0 is equal to 01, the output of NSDP is the number of 1s in the majority gate output, given by 0s2s1s0. Note that here s0 is the LSB and the MSB is always 0. When control bus c1c0 is equal to 10, the output of NSDP is the sigmoid function output. The output of the function control unit is given by
$$s_{00} = \bar{c}_1 \bar{c}_0, \quad s_{01} = \bar{c}_1 c_0, \quad s_{10} = c_1 \bar{c}_0. \tag{6}$$
The schematic of the function control unit is shown in Figure 11(a), the QCA layout is shown in Figure 11(b), and the simulation result is shown in Figure 11(c). When the input c1c0 is equal to 00, 01, and 10, the output s10s01s00 is equal to 001, 010, and 100, respectively. The delay from the input to the output is 1 clock.
Table 2: Truth table for function control unit.

| c1c0 | Output of NSDP |
|---|---|
| 00 | Majority of raw sensor data, m3m2m1m0 |
| 01 | Number of active outputs (i.e., 1s), 0s2s1s0 |
| 10 | Sigmoid output, g3g2g1g0 |
| 11 | Reserved |
Function control unit. (a) Schematic; (b) QCA layout; (c) Simulation result for c1c0=00, c1c0=01, and c1c0=10.
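The 2-to-3 decoder of (6) is small enough to enumerate completely. The following sketch, with the illustrative name `function_control`, prints the select lines for every control word. The all-zero result for c1c0 = 11 follows directly from (6), while the table marks that code as reserved.

```python
def function_control(c1: int, c0: int) -> tuple[int, int, int]:
    """2-to-3 decoder of equation (6); returns (s10, s01, s00)."""
    s00 = (1 - c1) & (1 - c0)   # raw majority output selected
    s01 = (1 - c1) & c0         # counter output selected
    s10 = c1 & (1 - c0)         # sigmoid output selected
    return s10, s01, s00


if __name__ == "__main__":
    for c1 in (0, 1):
        for c0 in (0, 1):
            print(f"c1c0={c1}{c0} -> s10s01s00={function_control(c1, c0)}")
```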
3.7. NSDP Output
The function control unit of NSDP determines which output of the three function units (raw majority data, the number of the active majority gates, and sigmoid function) will be the output o3o2o1o0 of NSDP. o3o2o1o0 is given by
$$\begin{aligned}
o_3 &= m_3 s_{00} + 0 \cdot s_{01} + g_3 s_{10},\\
o_2 &= m_2 s_{00} + s_2 \cdot s_{01} + g_2 s_{10},\\
o_1 &= m_1 s_{00} + s_1 \cdot s_{01} + g_1 s_{10},\\
o_0 &= m_0 s_{00} + s_0 \cdot s_{01} + g_0 s_{10}.
\end{aligned} \tag{7}$$
The schematic of the NSDP output unit is shown in Figure 12(a), and the partial QCA layout of o0 is shown in Figure 12(b). Note that m0a, s0a, and g0a are the outputs of the M-latch, S-latch, and G-latch for the inputs m0, s0, and g0, respectively, which are not shown in Figure 12(b) (refer to Figure 13 for details).
NSDP output. (a) Schematic; (b) Partial QCA layout for o0.
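The output stage of (7) behaves like a 3-way multiplexer driven by the select lines of (6). The sketch below models it logically, without the latches or clock delays, using the illustrative names `function_control` and `nsdp_output`.

```python
def function_control(c1: int, c0: int) -> tuple[int, int, int]:
    """2-to-3 decoder of equation (6); returns (s10, s01, s00)."""
    return c1 & (1 - c0), (1 - c1) & c0, (1 - c1) & (1 - c0)


def nsdp_output(m, s, g, c1: int, c0: int) -> tuple[int, ...]:
    """Equation (7): select among the M, S, and G buses.

    m = (m3, m2, m1, m0), s = (s2, s1, s0), g = (g3, g2, g1, g0).
    The 3-bit count is padded with a leading 0 to form 4 bits.
    """
    s10, s01, s00 = function_control(c1, c0)
    s_ext = (0,) + tuple(s)
    return tuple(
        (mi & s00) | (si & s01) | (gi & s10)
        for mi, si, gi in zip(m, s_ext, g)
    )


if __name__ == "__main__":
    m, s, g = (1, 0, 0, 1), (0, 1, 0), (0, 0, 1, 1)
    for c1, c0 in ((0, 0), (0, 1), (1, 0)):
        print(f"c1c0={c1}{c0} ->", nsdp_output(m, s, g, c1, c0))
```

With the reserved control word 11, all three select lines are 0 under (6), so this model returns 0000.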
Complete QCA layout of NSDP.
The complete QCA layout of NSDP is shown in Figure 13. NSDP consists of the following units: preprocessing, counter, 3-to-4 decoder, pattern generator, sigmoid function output, function control, M-latch, S-latch, G-latch, and NSDP output. Each unit is surrounded by a red dotted line in Figure 13. The complete experiment results are given in the next section.
4. Results
NSDP is designed and simulated using QCADesigner Ver. 2.0.3. NSDP contains 4436 cells. Its dimensions are 3 μm × 2.3 μm (6.9 μm^{2}). The simulation employs the coherence vector engine. Figure 14 shows the parameter settings for the coherence vector simulation engine.
In the raw majority output mode (c1c0 = 00), the data flow path is Preprocessing → M-Latch → NSDP Output (refer to Figure 13 for details).
As shown in Figure 15, 12 inputs are grouped into 4 groups: Input 3 (a3b3c3), Input 2 (a2b2c2), Input 1 (a1b1c1), and Input 0 (a0b0c0). The input data are as follows:
The majority output on the M bus (m3m2m1m0) is equal to 1111 (15), 0000 (0), 1110 (14), 1000 (8), 1001 (9),…. When c1c0 is equal to 00, s00 becomes 1, and s01 and s10 both become 0. NSDP picks the data on the M bus as the final output. This is shown in the 6th row of Figure 15; that is, the Output (o3o2o1o0) is equal to 1111 (15), 0000 (0), 1110 (14), 1000 (8), 1001 (9),…, accompanied by the clocked s00.
The counter output on the sum bus (s2s1s0) is equal to 000 (0), 001 (1), 010 (2), 100 (4), 000 (0), 011 (3), 001 (1), 010 (2),…. When c1c0 is equal to 01, s01 becomes 1, and s00 and s10 both become 0. NSDP picks the data on the S bus and extends it to 4 bits by padding the leftmost bit with 0, as the final output. This is shown in the 6th row of Figure 16; that is, the Output (o3o2o1o0) is equal to 0000 (0), 0001 (1), 0010 (2), 0100 (4), 0000 (0), 0011 (3), 0001 (1), 0010 (2),…, accompanied by the clocked s01.
In the sigmoid function output mode (c1c0 = 10), the data flow path is Preprocessing → Counter → 3-to-4 Decoder → Pattern Generator → Sigmoid Function Output → G-Latch → NSDP Output (refer to Figure 13 for details).
As shown in Figure 17, the input data are as follows:
The counter output on the sum bus (s2s1s0) is equal to 000 (0), 001 (1), 010 (2), 100 (4), 000 (0), 011 (3), 001 (1), 010 (2),…, as shown in the 5th row of Figure 17. The sigmoid function converts these inputs to 0000 (0), 0000 (0), 0011 (3), 1111 (15), 0000 (0), 1101 (13), 0000 (0), 0000 (0),…, and outputs them on the Sigmoid bus, as shown in the 6th row. When c1c0 is equal to 10, s10 becomes 1, and s00 and s01 both become 0. NSDP picks the data on the Sigmoid bus as the final output. This is shown in the 7th row of Figure 17; that is, the Output (o3o2o1o0) is equal to 0000 (0), 0000 (0), 0011 (3), 1111 (15), 0000 (0), 1101 (13), 0000 (0), 0000 (0),…, accompanied by the clocked s10.
In the above experiments, the sigmoid function processing is the most complicated, and its delay from input to output is the longest. For the raw majority data output and the counter output, delay is inserted to synchronize them with the sigmoid function output. Therefore, the delay of NSDP from Input 3 (a3b3c3), Input 2 (a2b2c2), Input 1 (a1b1c1), and Input 0 (a0b0c0) to the Output (o3o2o1o0) is 8¾ clocks.
5. Conclusions and Future Works
This paper presents the design of a 4-bit nanosensor data processor (NSDP) which has 3 functions: (i) sending the result of 4 majority gates to high-level processor; (ii) counting the number of active majority gates (i.e., the output of the majority gate is 1); (iii) generating the approximate sigmoid function for the postprocessing based on the artificial neural network. QCA circuits have significant wire delays. For a fast design in QCA, it is generally necessary to minimize the complexity. The design uses systolic array structures to produce an output on every clock cycle with low latency to the first output. The layouts and functionality checks were done using QCADesigner. NSDP is a relatively complicated system. The simulation results show that its three functions work correctly. It is hoped that this paper will inspire further ideas on developing application systems based on QCA circuits.
As mentioned in Section 1, a nanosensor system generates a large amount of nanosensor data. This type of system needs 10^{3}–10^{7} PEs to process the data in real time. The NSDP described in this paper works as a PE. A large number of PEs (10^{3}–10^{7}) need to be arranged in a parallel and distributed manner to handle the huge amount of nanosensor data. NSDP is a 4-bit processor. Extending it to an 8-bit processor would make it more convenient to handle large amounts of nanosensor data. These topics are left for future work.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is partially supported by a Grant from AFRL under Minority Leaders Program, Contract no. TENN13-S567-019-02-C2. The authors would like to thank the anonymous reviewers for their careful review and valuable comments.
References
[1] V. Ermolov, M. Heino, A. Kärkkäinen, R. Lehtiniemi, N. Nefedov, P. Pasanen, Z. Radivojevic, M. Rouvala, T. Ryhänen, E. Seppälä, and M. A. Uusitalo, “Significance of nanotechnology for future wireless devices and communications,” in Proceedings of the 18th Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC '07), Athens, Greece, September 2007, doi:10.1109/PIMRC.2007.4394126.
[2] J. N. Anker, W. P. Hall, O. Lyandres, N. C. Shah, J. Zhao, and R. P. Van Duyne, “Biosensing with plasmonic nanosensors.”
[3] S. Xu, Y. Qin, C. Xu, Y. Wei, R. Yang, and Z. L. Wang, “Self-powered nanowire devices.”
[4] H. Yan, H. S. Choe, S. Nam, Y. Hu, S. Das, J. F. Klemic, J. C. Ellenbogen, and C. M. Lieber, “Programmable nanowire circuits for nanoprocessors.”
[5] F. Patolsky and C. M. Lieber, “Nanowire nanosensors.”
[6] I. F. Akyildiz and J. M. Jornet, “Electromagnetic wireless nanosensor networks.”
[7] International Technology Roadmap for Semiconductors (ITRS), 2007, http://www.itrs.net/.
[8] C. S. Lent, P. D. Tougaw, W. Porod, and G. H. Bernstein, “Quantum cellular automata.”
[9] I. Amlani, A. O. Orlov, R. K. Kummamuru, G. H. Bernstein, C. S. Lent, and G. L. Snider, “Experimental demonstration of a leadless quantum-dot cellular automata cell.”
[10] R. P. Cowburn and M. E. Welland, “Room temperature magnetic quantum cellular automata.”
[11] R. K. Kummamuru, A. O. Orlov, R. Ramasubramaniam, C. S. Lent, G. H. Bernstein, and G. L. Snider, “Operation of a quantum-dot cellular automata (QCA) shift register and analysis of errors.”
[12] A. O. Orlov, I. Amlani, G. Toth, C. S. Lent, G. H. Bernstein, and G. L. Snider, “Experimental demonstration of a binary wire for quantum-dot cellular automata.”
[13] H. Qi, S. Sharma, Z. Li, G. L. Snider, A. O. Orlov, C. S. Lent, and T. P. Fehlner, “Molecular quantum cellular automata cells. Electric field driven switching of a silicon surface bound array of vertically oriented two-dot molecular quantum cellular automata.”
[14] A. DeHon and M. J. Wilson, “Nanowire-based sublithographic programmable logic arrays,” in Proceedings of the ACM/SIGDA 12th ACM International Symposium on Field-Programmable Gate Arrays (FPGA '04), pp. 123–132, February 2004.
[15] J. M. Seminario, P. A. Derosa, L. E. Cordova, and B. H. Bozard, “A molecular device operating at terahertz frequencies: theoretical simulations.”
[16] Y. Wang and M. Lieberman, “Thermodynamic behavior of molecular-scale quantum-dot cellular automata (QCA) wires and logic devices.”
[17] A. O. Orlov, I. Amlani, G. H. Bernstein, C. S. Lent, and G. L. Snider, “Realization of a functional cell for quantum-dot cellular automata.”
[18] C. S. Lent and P. D. Tougaw, “A device architecture for computing with quantum dots.”
[19] H. Cho and E. E. Swartzlander Jr., “Adder and multiplier design in quantum-dot cellular automata.”
[20] W. Liu, L. Lu, M. O'Neill, and E. E. Swartzlander, “Design rules for quantum-dot cellular automata,” in Proceedings of the IEEE International Symposium of Circuits and Systems (ISCAS '11), pp. 2361–2364, May 2011, doi:10.1109/ISCAS.2011.5938077.
[21] W. Liu, L. Lu, M. O'Neill, E. E. Swartzlander Jr., and R. Woods, “Design of quantum-dot cellular automata circuits using cut-set retiming.”
[22] K. Kim, K. Wu, and R. Karri, “Towards designing robust QCA architectures in the presence of sneak noise paths,” in Proceedings of the Design, Automation and Test in Europe (DATE '05), vol. 2, pp. 1214–1219, March 2005, doi:10.1109/DATE.2005.316.
[23] K. Walus and G. A. Jullien, “Design tools for an emerging SoC technology: quantum-dot cellular automata.”
[24] M. Crocker, M. Niemier, X. S. Hu, and M. Lieberman, “Molecular QCA design with chemically reasonable constraints.”
[25] R. Zhang, K. Walus, W. Wang, and G. A. Jullien, “A method of majority logic reduction for quantum cellular automata.”
[26] L. Lu, W. Liu, M. O'Neill, and E. E. Swartzlander Jr., “QCA systolic matrix multiplier,” in Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI '10), pp. 149–154, July 2010, doi:10.1109/ISVLSI.2010.53.
[27] I. Hanninen and J. Takala, “Robust adders based on quantum-dot cellular automata,” in Proceedings of the IEEE International Conference on Application-Specific Systems, Architecture Processors, pp. 391–396, 2007.
[28] V. Pudi and K. Sridharan, “Low complexity design of ripple carry and brent-kung adders in QCA.”
[29] P. D. Tougaw and C. S. Lent, “Logical devices implemented using quantum cellular automata.”
[30] W. Wang, K. Walus, and G. Jullien, “Quantum-dot cellular automata adders,” in Proceedings of the 3rd IEEE International Conference on Nanotechnology, pp. 461–464, 2003.
[31] R. Zhang, K. Walus, W. Wang, and G. A. Jullien, “Performance comparison of quantum-dot cellular automata adders,” in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '05), pp. 2522–2526, May 2005, doi:10.1109/ISCAS.2005.1465139.