Information Entropy- and Average-Based High-Resolution Digital Storage Oscilloscope

Vertical resolution is an essential indicator of a digital storage oscilloscope (DSO), and the key to improving resolution is to increase the number of digitizing bits and lower the noise. Averaging is a typical method for improving the signal-to-noise ratio (SNR) and the effective number of bits (ENOB). The existing averaging algorithms are restricted by the repetitiveness of the signal and are sensitive to gross errors in quantization, so their effect on suppressing noise and improving resolution is limited. This paper proposes an information entropy-based data fusion and average-based decimation filtering algorithm, which builds on the averaging algorithm and draws on information entropy theory, to improve the resolution of the oscilloscope. For a signal acquired once, resolution is improved under oversampling by first eliminating gross errors in quantization using the maximum entropy of the sample data, then fusing the remaining effective sample data, and finally filtering noise further via average-based decimation. No subjective assumptions or constraints are imposed on the signal under test in the whole process, and the analog bandwidth of the oscilloscope at the actual sampling rate is not affected.


Introduction
Bandwidth, sampling rate, and storage depth are the three core indicators used to evaluate the performance of a digital storage oscilloscope (DSO). In addition, there is another indicator of great significance that is often ignored, namely, vertical resolution [1] (hereinafter referred to as resolution). Higher resolution means a more refined waveform display and more precise signal measurement. The resolution of a DSO depends on the digitizing bits of the analog-to-digital converter (ADC) and on the noise and distortion level of the oscilloscope itself. Common oscilloscopes generally adopt an 8-bit or 12-bit ADC, and the key to achieving higher resolution is to lower the noise for the given digitizing bits of the ADC.
One method to reduce the effect of ADC-related system noise is to combine multiple ADCs in a parallel array. In such a system, the same analog signal is applied to all ADCs in the array and the outputs are digitally summed. The maximum signal-to-noise ratio (SNR) increases because the signal is correlated from channel to channel while the noise is not. In a parallel array, the SNR therefore increases by a factor equal to the number of ADCs, assuming the noise is uncorrelated from channel to channel [2,3]. Furthermore, decorrelation techniques are proposed in [4] to reduce the effect of correlated sampling noise introduced by clock jitter in all parallel ADCs.
In [5,6], another method, named stacked ADC, is proposed to enhance the resolution of an ADC system. It uses multiple ADCs, each connected to the same radar IF through amplifier chains with different gain factors. After digital amplitude and phase equalization, the obtained SNR is much greater than that of an individual ADC. Both of the aforementioned methods consume more ADCs and therefore result in a high cost.
Another method to enhance SNR is averaging, which can increase the resolution of a measurement without resorting to the cost and complexity of multiple ADCs [7][8][9][10]. There are two common averaging modes for DSOs, that is, successive capture averaging and successive sample averaging. The former averages, point by point, the corresponding sampling points in multiple waveforms acquired repeatedly, and the latter averages multiple adjacent sampling points in a single waveform acquired once. Both of them have certain limitations on suppressing noise and improving resolution. Since successive capture averaging is based on repetitive acquisition of signals, it is limited by the repetitiveness of the signals under test. Even though successive sample averaging is based on a single acquisition of the signal under test, it sacrifices analog bandwidth while filtering noise. In addition, gross errors in quantization, such as irrelevant noise and quantization errors caused by data mismatch, are always included in the sampling data of an oscilloscope due to the influence of environmental disturbance, clock jitter, transmission delay, and so forth. If these gross errors are not handled first, the result of directly averaging the acquired sample data will deviate drastically from the actual signal.
Data fusion (also called information fusion) is a sample data processing method that is currently widely applied. In previous studies, a data fusion algorithm based on the batch estimation theory of statistics is proposed in [11] and is further improved in [12]. However, both algorithms make the somewhat subjective assumption that the sample data are normally distributed. The concept of entropy originates from physics, where it describes the disordered state of a thermodynamic system. Entropy reflects the statistical property of a system and has been introduced successively into numerous research fields. In 1948, the American mathematician Shannon introduced the entropy of thermodynamics into information theory and proposed information entropy [13] to measure the degree of uncertainty of information. Information entropy provides a new approach for data fusion [14].
An information entropy-based data fusion and average-based decimation filtering algorithm, which builds on the averaging algorithm and draws on entropy theory, is proposed in this paper to improve the resolution of the oscilloscope effectively. Additional horizontal sampling information is used to achieve higher vertical resolution under the premise of oversampling. Firstly, compared with the traditional averaging algorithm, this algorithm works on a signal sample acquired once and is therefore not restricted by the repetitiveness of the signal. Secondly, the resolution of the oscilloscope is improved by eliminating gross errors in quantization (due to noise and quantization error) using the maximum entropy of the sample data, fusing the remaining effective sample data, and then filtering noise further via average-based decimation. No subjective assumptions or constraints are imposed on the signal under test in the whole process, and the analog bandwidth of the oscilloscope at the actual sampling rate is not affected.

Common Averaging Theory
2.1. Successive Capture Averaging. Successive capture averaging is a basic denoising technique in the acquisition systems of most DSOs; it depends on repetitive triggering and acquisition of repetitive signals. It averages, point by point, the corresponding sampling points of the repeatedly acquired waveforms to form a single averaged capture result, that is, a single output waveform.
Figure 1 shows the schematic diagram of averaging $N$ successive acquisitions.
The direct calculation method of successive capture averaging is to sum the corresponding sampling points in all acquisitions and then divide the sum by the number of acquisitions. The expression is given by

$$A_N(n+k) = \frac{1}{N}\sum_{i=1}^{N} x_i(n+k), \quad (1)$$

where $k = 0, \pm 1, \pm 2, \ldots$.

The exponential averaging algorithm instead updates the averaging result after each acquisition:

$$A_i(n+k) = A_{i-1}(n+k) + \frac{x_i(n+k) - A_{i-1}(n+k)}{p}, \quad (2)$$

where $k = 0, \pm 1, \pm 2, \ldots$. In (2), $i$ represents the current number of acquisitions, $A_i(n+k)$ is the new averaging result, $A_{i-1}(n+k)$ is the last averaging result, $x_i(n+k)$ is the new sampling point, and $p$ is the weighting coefficient. Assuming that $N$ is the total number of acquisitions to average, if $i < N$, then $p = i$; otherwise, $p = N$. Obviously, the exponential averaging algorithm is more efficient in calculating and storing the acquired and the averaged waveforms. It not only updates the averaged result immediately after each acquisition and yields the same final waveform as the direct averaging algorithm but also lowers the requirements on memory capacity significantly.
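The relation between the two algorithms can be illustrated with a minimal Python sketch. The function names and the sample waveforms are hypothetical, and the recursion is the reading of (2) assumed here (new result = last result plus the weighted difference to the new sampling point); it is not the oscilloscope's fixed point implementation.

```python
def direct_average(acquisitions):
    """Direct averaging: needs all N waveforms in memory before any output."""
    n = len(acquisitions)
    return [sum(points) / n for points in zip(*acquisitions)]


def exponential_average(acquisitions, n_avg):
    """Recursive update: only the running result is stored.

    p = i while i < n_avg, otherwise p = n_avg (weighting coefficient).
    """
    result = [0.0] * len(acquisitions[0])
    for i, waveform in enumerate(acquisitions, start=1):
        p = i if i < n_avg else n_avg
        # For i = 1, p = 1, so the result becomes the first waveform itself.
        result = [a + (x - a) / p for a, x in zip(result, waveform)]
    return result


# Four hypothetical acquisitions of the same three sampling points.
waveforms = [
    [120, 124, 129],
    [122, 125, 127],
    [121, 123, 128],
    [119, 126, 130],
]
direct = direct_average(waveforms)
recursive = exponential_average(waveforms, n_avg=4)
print(direct, recursive)
```

While the number of acquisitions does not exceed `n_avg`, the recursive result is the exact running mean, which is why both functions agree after the fourth acquisition.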
No matter which algorithm is adopted, successive capture averaging can improve the vertical resolution of the signal. This improvement is measured in bits and is a function of $N$ (the number of acquisitions to average) [1]:

$$\Delta R = 0.5\log_2 N. \quad (3)$$

In (3), $\Delta R$ is the improved resolution. In numerous oscilloscopes the averaging algorithm is implemented with fixed point mathematics, and the maximum number of acquisitions to average generally does not exceed 8192 once real-time performance and memory capacity are taken into account; for an 8-bit ADC, the total resolution is therefore limited to 14.5 bits. In fact, fixed point mathematics, noise, and dithering error lower the maximum resolution to a certain extent. Successive capture averaging can improve SNR, eliminate the noise unrelated to triggering, and improve vertical resolution. Meanwhile, it does not limit waveform bandwidth under ideal circumstances, which is an obvious advantage compared with other signal processing techniques. However, because it relies on numerous triggers and repetitive quantization of the signal, successive capture averaging is limited by the repetitiveness of the signal under test and is consequently only applicable to observing repetitive signals.
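As a quick check of the figures quoted above, the following sketch evaluates the 0.5 log2 relation for the averaging counts mentioned in this paper. The function name is hypothetical, and the 8-bit starting resolution is the ADC resolution assumed in the text.

```python
import math


def resolution_gain_bits(n_avg):
    """Ideal resolution improvement, in bits, from averaging n_avg values."""
    return 0.5 * math.log2(n_avg)


adc_bits = 8
for n_avg in (4, 10, 8192):
    gain = resolution_gain_bits(n_avg)
    print(f"N = {n_avg}: +{gain:.2f} bits, total {adc_bits + gain:.2f} bits")
```

Each fourfold increase of the averaging count adds one ideal bit, so 8192 acquisitions add 6.5 bits to the 8 bits of the ADC.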

Successive Sample Averaging.
Successive sample averaging, also known as boxcar filtering or moving average filtering, is another averaging algorithm widely applied in the acquisition systems of DSOs. Since it is based on a single acquisition of the signal under test, successive sample averaging is not influenced by the repetitiveness of the signal itself. In this averaging process, each output sampling point represents the average value of $N$ successive input sampling points [15], which is shown as follows:

$$y(n) = \frac{1}{N}\sum_{k=0}^{N-1} x(n+k), \quad (4)$$

where $n = 0, \pm 1, \pm 2, \ldots$ and $N$ is the number of sampling points to average. Figure 2 shows the averaging principle for 3 successive sampling points.
For successive sample averaging, the sampling rate before and after averaging is equal. It eliminates noise and improves the vertical resolution of the signal by reducing the bandwidth of the DSO. This improvement is measured in bits and is a function of $N$ (the number of sampling points to average) [1]:

$$\Delta R = 0.5\log_2 N, \quad (5)$$

where $\Delta R$ is the improved resolution. In essence, successive sample averaging is a low-pass filter function, and its 3 dB bandwidth is deduced in [9]:

$$B_w \approx \frac{0.44\, f_s}{N}, \quad (6)$$

where $B_w$ is the bandwidth and $f_s$ is the sampling rate. This type of filter has an extremely sharp cut-off frequency and is matched to signals whose period is an integral multiple of $N/f_s$. Noise is reduced almost in proportion to the square root of the number of sampling points to average. For instance, an average of 25 sampling points will reduce the magnitude of high-frequency noise to 1/5 of its original value.
For DSOs, successive sample averaging is often used to implement a variable bandwidth function.
It can easily be seen from (6) that, even though successive sample averaging is based on a single acquisition of the signal under test, it lowers the analog bandwidth at the actual sampling rate while filtering noise, which limits its practical usefulness.
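The bandwidth penalty of boxcar filtering can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the moving average uses a forward window of N points (the window alignment is an assumption), and the −3 dB point is found by bisection on the standard magnitude response of an N-point moving average.

```python
import math


def moving_average(samples, n_points):
    """Boxcar filter: one output per input position, same sampling rate."""
    last_start = len(samples) - n_points
    return [sum(samples[i:i + n_points]) / n_points for i in range(last_start + 1)]


def boxcar_gain(freq, fs, n_points):
    """Magnitude response of an N-point moving average at frequency freq."""
    theta = math.pi * freq / fs
    return abs(math.sin(n_points * theta) / (n_points * math.sin(theta)))


def bandwidth_3db(fs, n_points):
    """Bisection for the -3 dB frequency inside the main lobe (0, fs/N)."""
    lo, hi = fs * 1e-9, fs / n_points
    target = 1 / math.sqrt(2)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if boxcar_gain(mid, fs, n_points) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


fs = 100e6  # 100 MSa/s, the actual sampling rate used later in the paper
bw = bandwidth_3db(fs, 10)
print(moving_average([1, 2, 3, 4, 5], 3))
print(f"-3 dB bandwidth for N = 10: {bw / 1e6:.2f} MHz")
```

For N = 10 at 100 MSa/s the bandwidth drops to roughly 4.4 MHz, in line with the inverse dependence on the number of averaged points.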

Resolution Improving Algorithm Based on Information Entropy- and Average-Based Decimation

Data Fusion Based on Information Entropy.
The information entropy-based data fusion researched in this paper aims to eliminate gross error in quantization and then obtain precise measuring results under the condition of oversampling.Firstly, the acquisition system of oscilloscope utilizes the maximum entropy method (MEM) to estimate the probability distribution of discrete sample data acquired under oversampling and then calculates the measuring uncertainty of sample according to its probability distribution to determine a confidence interval.Then the acquisition system discriminates gross error based on confidence interval and finally determines weight coefficient of fusion according to information entropy to achieve data fusion and obtain a precise measured value of the signal under test without any subjective assumptions and restrictions being added.

Distribution Estimation of Maximum Entropy.
Entropy is an essential concept in thermodynamics. For isolated systems, entropy never decreases, and the maximum entropy determines the steady state of the system. Similar conclusions can be drawn in information theory. In 1957, Jaynes proposed the maximum entropy principle, which starts from the maximum information entropy: when we infer an unknown distribution from only partial information, we should select the probability distribution that has the maximum entropy and conforms to the known constraints, because any other selection implies adding other constraints or changing the original assumptions [16]. In other words, when only the sample under test is available and there are insufficient reasons to select a particular analytical distribution function, we can determine the least biased form of the measurand distribution through the maximum entropy [14].
Assume that the acquisition system of the DSO oversamples at a high sampling rate that is $N$ times the actual sampling rate and obtains $N$ discrete sample data at the high sampling rate, that is, $x_1, x_2, \ldots, x_N$. The sample sequence after eliminating repetitive sample data is $x_1, x_2, \ldots, x_m$ ($m \le N$), with corresponding probabilities of occurrence $p(x_1), p(x_2), \ldots, p(x_m)$, and the probability distribution $p(x_i)$ of the sample can be estimated through the maximum discrete entropy. Based on the information entropy defined by Shannon [13], the information entropy of the discrete random variable $X$ is as follows [17]:

$$H(X) = -K\sum_{i=1}^{m} p(x_i)\ln p(x_i), \quad (7)$$

where $p(x_i)$ denotes the probability distribution to be estimated of sample $x_i$ and meets the restriction conditions below:

$$\sum_{i=1}^{m} p(x_i) = 1, \qquad \sum_{i=1}^{m} p(x_i)\,g_j(x_i) = E(g_j), \quad j = 1, 2, \ldots, M. \quad (8)$$

In (8), $g_j(x_i)$ ($j = 1, 2, \ldots, M$) is the statistical moment function of order $j$ and $E(g_j)$ is the expected value of $g_j(x_i)$. The Lagrange multiplier method can be used to solve this problem. Since $K$ is a positive constant, take $K = 1$ for convenience to constitute the Lagrangian function $L(p, \lambda)$, as shown in

$$L(p,\lambda) = -\sum_{i=1}^{m} p(x_i)\ln p(x_i) + (\lambda_0 + 1)\left[\sum_{i=1}^{m} p(x_i) - 1\right] + \sum_{j=1}^{M}\lambda_j\left[\sum_{i=1}^{m} p(x_i)\,g_j(x_i) - E(g_j)\right], \quad (9)$$

where $\lambda_j$ ($j = 0, 1, \ldots, M$) is the Lagrangian coefficient. Taking the partial derivatives with respect to $p(x_i)$ and $\lambda_j$, respectively, gives the equation set

$$\frac{\partial L}{\partial p(x_i)} = 0, \qquad \frac{\partial L}{\partial \lambda_j} = 0. \quad (10)$$

Solving (10), the probability distribution function is

$$p(x_i) = \exp\left[\lambda_0 + \sum_{j=1}^{M}\lambda_j g_j(x_i)\right], \quad (11)$$

and the corresponding maximum entropy can be given by

$$H(X)_{\max} = -\lambda_0 - \sum_{j=1}^{M}\lambda_j E(g_j). \quad (12)$$

For given sample data, the expectation and variance of the sample sequence can be chosen as the moment constraints. The distribution estimation based on maximum entropy therefore estimates the probability distribution according to the entropy of the discrete random variable: the probability distribution model $p(x_i)$ is adjusted until the entropy reaches its maximum $H(X)_{\max}$ while the statistical property of the sample is preserved [16].
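A small numerical sketch may help make the estimation step concrete. It is not the authors' implementation: the function name and the sample data are hypothetical, the moment functions are assumed to be the sample mean and the variance about that mean, and the Lagrangian coefficients are found by a damped Newton iteration on the dual problem (any other solver could be substituted).

```python
import math


def maxent_distribution(samples, max_iter=100, tol=1e-10):
    """Maximum entropy probabilities on the distinct sample values.

    Assumes at least three distinct values; the constraints are the mean
    and variance of the full sample sequence.
    """
    n = len(samples)
    mu = sum(samples) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in samples) / n)
    support = sorted(set(samples))  # repetitive sample data removed
    # Standardized moment functions g1 = z and g2 = z^2, targets 0 and 1.
    g = [((x - mu) / sigma, ((x - mu) / sigma) ** 2) for x in support]

    def probabilities(l1, l2):
        expo = [l1 * a + l2 * b for a, b in g]
        top = max(expo)  # shift exponents to avoid overflow
        w = [math.exp(e - top) for e in expo]
        z = sum(w)
        return [v / z for v in w], top + math.log(z)

    def dual(l1, l2):
        return probabilities(l1, l2)[1] - l2  # log Z minus lambda . targets

    l1 = l2 = 0.0
    for _ in range(max_iter):
        p = probabilities(l1, l2)[0]
        m1 = sum(pi * a for pi, (a, b) in zip(p, g))
        m2 = sum(pi * b for pi, (a, b) in zip(p, g))
        r1, r2 = m1, m2 - 1.0  # constraint residuals
        if abs(r1) + abs(r2) < tol:
            break
        h11 = sum(pi * a * a for pi, (a, b) in zip(p, g)) - m1 * m1
        h12 = sum(pi * a * b for pi, (a, b) in zip(p, g)) - m1 * m2
        h22 = sum(pi * b * b for pi, (a, b) in zip(p, g)) - m2 * m2
        det = h11 * h22 - h12 * h12
        d1 = -(h22 * r1 - h12 * r2) / det  # Newton direction
        d2 = -(h11 * r2 - h12 * r1) / det
        step, f0 = 1.0, dual(l1, l2)
        while step > 1e-8 and dual(l1 + step * d1, l2 + step * d2) > f0 + 1e-12:
            step *= 0.5  # backtrack if the dual does not decrease
        l1, l2 = l1 + step * d1, l2 + step * d2

    p = probabilities(l1, l2)[0]
    entropy = -sum(pi * math.log(pi) for pi in p)
    return support, p, entropy


# Hypothetical 10-point oversampled group with one repeated value (121).
samples = [124, 125, 121, 123, 126, 122, 150, 98, 121, 127]
support, probs, entropy = maxent_distribution(samples)
for x, p in zip(support, probs):
    print(x, round(p, 4))
```

The returned probabilities have the exponential form of (11) on the distinct sample values, and they reproduce the sample expectation and variance that were imposed as constraints.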

Gross Error Discrimination.
Traditional criteria for gross error discrimination (e.g., the 3σ criterion, the Grubbs criterion, and the Dixon criterion) are based on mathematical statistics. The probability distribution of the sample data needs to be known if these algorithms are applied to the sample data. However, the probability distribution is rarely known in advance in actual measurement. Moreover, the statistical property cannot be satisfied if only a few sets of sample data are obtained during measurement, so the precision of gross error handling suffers [18]. A new algorithm for gross error discrimination is proposed in this paper, which calculates the measurement uncertainty of the sample sequence from the probability distribution of the maximum discrete entropy and then determines a confidence interval based on this uncertainty to discriminate gross errors.
In [14], for a continuous random variable $x$, if $\mu$ is the expectation and the probability density function estimated by MEM is $f(x)$, then the measurement uncertainty is expressed by

$$u = \sqrt{\int_{-\infty}^{+\infty} (x-\mu)^2 f(x)\,dx}. \quad (13)$$

The measurement uncertainty of a discrete random variable $x$ can be deduced from this. If $\mu$ is the expectation and the probability distribution estimated by MEM is $p(x_i)$, then the uncertainty of the sample should be calculated after eliminating repetitive sample data and is given by

$$u = \sqrt{\sum_{i=1}^{m} (x_i-\mu)^2\, p(x_i)}. \quad (14)$$

The confidence interval is $[\mu - u, \mu + u]$, and whether the sample contains gross errors is judged against this interval. Data outside the confidence interval are regarded as gross errors and are eliminated from the sample sequence, and the remaining data constitute a new sample sequence for data fusion.
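The discrimination rule can be sketched as follows. The function names and data are hypothetical, the uncertainty is taken as the square root of the probability-weighted squared deviation from the expectation, and a uniform distribution merely stands in for the MEM estimate so that the block runs on its own.

```python
import math


def confidence_interval(values, probs):
    """Return (mu - u, mu + u) for distinct values with given probabilities."""
    mu = sum(p * x for x, p in zip(values, probs))
    u = math.sqrt(sum(p * (x - mu) ** 2 for x, p in zip(values, probs)))
    return mu - u, mu + u


def remove_gross_errors(values, probs):
    """Keep only the values that fall inside the confidence interval."""
    low, high = confidence_interval(values, probs)
    return [x for x in values if low <= x <= high]


# Hypothetical distinct samples; 98 and 150 play the role of gross errors.
values = [98, 121, 122, 123, 124, 125, 126, 127, 150]
probs = [1 / len(values)] * len(values)  # placeholder for the MEM estimate
low, high = confidence_interval(values, probs)
kept = remove_gross_errors(values, probs)
print(round(low, 2), round(high, 2), kept)
```

With these numbers the interval is roughly [111.6, 136.4], so the two outlying codes are rejected and seven effective samples remain.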

Effective Data Fusion.
For a DSO, sampling aims at obtaining the information related to the signal under test. As a measure of information quantity, information entropy determines the level of uncertainty, and it can therefore be used to fuse the acquired sample data. To reduce the uncertainty of the fusion result, a small weight coefficient should be assigned to a sample with large uncertainty, while a large weight coefficient should be assigned to a sample with small uncertainty.
As mentioned above, the acquisition system of the DSO oversamples at a high sampling rate that is $N$ times the actual sampling rate. After repetitive sample data are eliminated, $m$ discrete sample data remain, that is, $x_1, x_2, \ldots, x_m$, and after gross errors are eliminated, $m_e$ samples $x_1, x_2, \ldots, x_{m_e}$ remain. The information quantity provided by each sample $x_i$ is denoted by its self-information quantity $I(x_i)$. Information entropy is the average uncertainty of the samples, and therefore the ratio of self-information quantity to information entropy can be used to measure the uncertainty of each sample among all the samples. The weight coefficient of fusion is inversely proportional to the self-information quantity. The detailed algorithm is as follows.
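(1) Utilize MEM to estimate the maximum entropy distribution and the maximum entropy of the sample data, and then compute the self-information quantity of each sample, defined by

$$I(x_i) = -\ln p(x_i), \quad (15)$$

where

$$p(x_i) = \exp\left[\lambda_0 + \sum_{j=1}^{M}\lambda_j g_j(x_i)\right]. \quad (16)$$

(2) Define the weight coefficient of fusion with normalization processing:

$$w_i = \frac{1/I(x_i)}{\sum_{l=1}^{m_e} 1/I(x_l)}. \quad (17)$$

(3) Fuse the data:

$$\hat{x} = \sum_{i=1}^{m_e} w_i x_i, \quad (18)$$

where the number of data used in data fusion, that is, $m_e$, equals the number of samples remaining after repetitive data and gross errors are eliminated.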
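The three steps above can be sketched in a few lines. The function name, the sample values, and the probabilities are all hypothetical, and the weight is assumed to be the normalized reciprocal of the self-information quantity, which is one direct reading of "inversely proportional with normalization".

```python
import math


def fuse(values, probs):
    """Entropy-weighted fusion of effective samples.

    probs are the MEM probabilities of the retained samples (each < 1);
    they need not sum to 1 once gross errors have been removed.
    """
    info = [-math.log(p) for p in probs]      # self-information quantity
    inverse = [1.0 / i for i in info]         # small uncertainty, large weight
    total = sum(inverse)
    weights = [v / total for v in inverse]    # normalization
    fused = sum(w * x for w, x in zip(weights, values))
    return fused, weights


# Hypothetical effective samples and their (assumed) MEM probabilities.
values = [121, 122, 123, 124, 125, 126, 127]
probs = [0.10, 0.11, 0.12, 0.13, 0.12, 0.11, 0.10]
fused, weights = fuse(values, probs)
print(round(fused, 4), [round(w, 4) for w in weights])
```

Samples with higher estimated probability carry less self-information and therefore receive larger weights; in this symmetric example the fused value falls on the central code.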

Average-Based Decimation Filtering.
The maximum sampling rate of the ADC in a DSO is generally much higher than the sampling rate actually required by the spectrum of the measured signal. Oversampling therefore makes it possible to filter the digital signal so as to improve the effective resolution of the displayed waveform and reduce undesired noise. Under the premise of oversampling, the vertical resolution at the actual sampling rate can be increased by adopting the average-based decimation filtering algorithm. To be specific, the DSO oversamples at a high sampling rate that is $N$ times the actual sampling rate corresponding to the time base selected by the user. It then applies the information entropy-based data fusion algorithm described in the previous section to each group of $N$ sampling points at the high sampling rate to exclude gross errors and fuse the effective samples, averages the resulting new sample sequence, and finally decimates the sampling points to the actual sampling rate. Average-based decimation under $N$ times oversampling is given by

$$y(n) = \frac{1}{N}\sum_{k=0}^{N-1} x(nN+k), \quad (19)$$

where $n = 0, \pm 1, \pm 2, \ldots$. Figure 3 shows the average-based decimation principle when $N = 3$.
The resolution improved by average-based decimation filtering is measured in bits and is a function of $N$ (the number of samples to average, or the oversampling factor):

$$\Delta R = 0.5\log_2 N = 0.5\log_2\frac{f_h}{f_a}. \quad (20)$$

In (20), $\Delta R$ is the improved resolution, $f_h$ is the high sampling rate, and $f_a$ is the actual sampling rate. The −3 dB bandwidth after averaging is

$$B_w \approx 0.44\, f_a, \quad (21)$$

where $B_w$ denotes the bandwidth and $f_a$ represents the actual sampling rate. It can be seen that the improved vertical resolution and the analog bandwidth vary with the maximum sampling rate and the actual sampling rate of the oscilloscope. Table 1 lists the ideal values of improved resolution and analog bandwidth for an oscilloscope with a maximum sampling rate of 1 GSa/s and an 8-bit ADC that adopts the average-based decimation algorithm under oversampling.
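Average-based decimation itself is simple to sketch. In the block below the function name is hypothetical, the block alignment (non-overlapping groups of N consecutive points starting at the first sample) is an assumption, and the unit-variance Gaussian noise is synthetic test data rather than oscilloscope output.

```python
import random
import statistics


def average_decimate(samples, factor):
    """Average each block of `factor` points; output rate = input rate / factor."""
    usable = len(samples) - len(samples) % factor  # drop an incomplete tail
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]


print(average_decimate([1, 2, 3, 4, 5, 6], 3))  # two output points

# Synthetic noise: averaging 25 points should cut its magnitude to about 1/5.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(100_000)]
reduced = average_decimate(noise, 25)
print(round(statistics.pstdev(noise), 3), round(statistics.pstdev(reduced), 3))
```

Unlike the moving average, the output here is produced at the actual (decimated) sampling rate, so the number of averaged points does not further reduce the bandwidth available at that rate.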
The values in Columns 3 and 4 of Table 1 are ideal. The improvement of resolution grows with the logarithm of $N$; that is to say, each time $N$ increases by a factor of 4, the resolution improves by 1 bit. In reality, the maximum $N$ is limited to the order of 10,000 by real-time performance and memory capacity. Moreover, fixed point mathematics and noise also lower the highest achievable resolution to some extent. Therefore, an improvement of more than 4 to 6 bits should not be expected. It should also be noted that the improvement of resolution depends on the dynamics of the signal as well. For signals whose conversion results always toggle between different codes, resolution can always be improved. For steady-state signals, the improvement is obvious only when the noise amplitude exceeds 1 or 2 ADC LSBs. Fortunately, signals in the real world almost always meet this condition.
Generally speaking, when the measured signals are single-shot or repeat at low rates, conventional successive capture averaging cannot be adopted, and average-based decimation under oversampling can be used as an alternative. To be specific, average-based decimation under oversampling is especially applicable in the following two situations.
Firstly, if the noise in the signal is obviously high (and the noise itself does not need to be measured), average-based decimation under oversampling can be adopted to "clear" the noise.
Secondly, average-based decimation can be adopted to improve the measurement resolution when high-precision measurement of waveforms is required, even if the noise in the signal is not high.
Comparing (6) with (21), it can easily be seen that, in the conventional successive sample averaging algorithm, the bandwidth is directly proportional to the actual sampling rate and inversely proportional to the number of sample points to average. When the actual sampling rate is given, the bandwidth decreases dramatically as the number of sample points to average increases. However, when average-based decimation under oversampling is adopted, the bandwidth is directly proportional to the actual sampling rate only and is independent of the number of sample points to average (the oversampling factor). When the actual sampling rate is given, the bandwidth is determined accordingly without additional loss. Besides, the Nyquist frequency increases by $N$ times when $N$ times oversampling is adopted. Therefore, another advantage of average-based decimation is reduced aliasing.

Processing Example.
A group of sample data acquired by the DSO is used as an example to illustrate the processing procedure of the algorithm proposed in this paper. The oscilloscope works at a time base of 500 ns/div, and the corresponding actual sampling rate is 100 MSa/s. The oscilloscope carries out 10 times oversampling at 1 GSa/s to obtain 10 original discrete sets of sample data at the high sampling rate, that is, $x_1, x_2, \ldots, x_{10}$, which are shown in Table 2. The samples contain obvious glitches, or gross errors, caused by ADC quantization errors.
The expectation and variance of the 10 sample data are calculated first. After one repeated datum (121) is excluded, the constraint conditions met by the sample probability distribution are those of (8), with the sample expectation and variance as the moment constraints.
According to MEM and the Lagrangian function, the calculated Lagrangian coefficients are $\lambda_0 = -4.3009$, $\lambda_1 = 0.01648$, and $\lambda_2 = 0.000452$, respectively. The estimated maximum entropy probability distribution, the maximum entropy, and the self-information quantity are then evaluated; the corresponding probability and self-information quantity of each sample are shown in Table 3. The measurement uncertainty is calculated from the maximum entropy distribution, and the confidence interval follows from it. It can be judged that, in Table 2, $x_7$ and $x_8$ are gross errors in quantization, so they should be excluded from the sample sequence. At the same time, the repeated sample $x_3$ in Table 2 should also be excluded, so the remaining 7 sets of effective sample data form a new sample sequence (i.e., $x_1, x_2, \ldots, x_7$) for data fusion, and the calculation results are shown in Table 4.
The result of data fusion is obtained according to (15)-(18). The sample data $x_7$ and $x_8$ in Table 2 are replaced with the data fusion result, which gives the new sample data in Table 5. According to the maximum entropy theory, the result 124.3786 obtained by the algorithm proposed in this paper is the precise measurement of the unknown signal obtained from the sample data without any subjective hypotheses or constraints.
Similarly, based on information entropy theory, each group of original sample data ($x_1, x_2, \ldots, x_{10}$) obtained through oversampling (1 GSa/s) can be processed by excluding gross errors, fusing the data, and applying average-based decimation to obtain precise measurement data of a complete waveform at the actual sampling rate (100 MSa/s).

Experiment and Result Analysis
In order to verify the effectiveness and superiority of the vertical resolution improvement achieved by the algorithm proposed in this paper, we use an 8-bit ADC model provided by Analog Devices to establish the acquisition system of the oscilloscope, conduct simulation experiments with the conventional averaging algorithms, the direct decimation algorithm, and the proposed algorithm, and finally estimate and compare the performance of all algorithms.
The oscilloscope works at a time base of 500 ns/div, and the corresponding actual sampling rate is 100 MSa/s. The frequency of the input sine wave is 1 MHz. In order to simulate quantization errors caused by noise interference and clock jitter, data-mismatched samples are randomly added to the ideal ADC sampling model, so the acquired sample sequence includes gross errors in quantization.
Experiment 1. Sampling rate $f_s$ = 100 MSa/s. The time-domain waveform and signal spectrum of the sampling results without any processing are shown in Figures 4 and 5, respectively. After calculation in Figure 5, SNR = 28.0776 dB and ENOB = 4.3717 bits.
Experiment 2. Sampling rate $f_s$ = 100 MSa/s. The signal spectrum obtained by applying successive capture averaging with $N = 10$ to the sampling results is shown in Figure 6. After calculation in Figure 6, SNR = 43.0485 dB and ENOB = 6.8585 bits.
Experiment 3. Sampling rate $f_s$ = 100 MSa/s. The signal spectrum obtained by applying successive sample averaging with $N = 10$ to the sampling results is shown in Figure 7. After calculation in Figure 7, SNR = 44.2538 dB and ENOB = 7.0588 bits.
Experiment 4. Oversampling is adopted with the sampling rate $f_s$ = 1 GSa/s. The signal spectrum obtained by directly applying 10 times decimation to the sampling results is shown in Figure 8. After calculation in Figure 8, SNR = 28.8353 dB and ENOB = 4.4976 bits.
Experiment 5. Oversampling is adopted with the sampling rate $f_s$ = 1 GSa/s. The signal spectrum obtained by conducting 10 times average-based decimation on the sampling results is shown in Figure 9. After calculation in Figure 9, SNR = 46.6242 dB and ENOB = 7.4525 bits.
Experiment 6. Oversampling is adopted with the sampling rate $f_s$ = 1 GSa/s. The time-domain waveform and signal spectrum obtained by applying the information entropy-based algorithm proposed in this paper to the sampling results (excluding gross errors, fusing data, creating a new sample sequence, and then conducting 10 times average-based decimation) are shown in Figures 10 and 11, respectively. After calculation in Figure 11, SNR = 59.6071 dB and ENOB = 9.6092 bits.
Table 6 compares the experiment results of the abovementioned 6 methods.
According to Table 6, at the sampling rate of 100 MSa/s, the conventional successive capture averaging algorithm and the successive sample averaging algorithm increase the ENOB by about 2.49 and 2.69 bits, respectively, when processing sinusoidal samples that include quantization errors. However, the successive sample averaging algorithm also reduces the bandwidth severely. At the sampling rate of 1 GSa/s, the ENOB provided by the average-based decimation algorithm is about 2.95 bits higher than that provided by the direct decimation algorithm. On this basis, the information entropy-based data fusion and average-based decimation algorithm proposed in this paper further increases the ENOB by about 2.16 bits to achieve a total ENOB of 9.61 bits. Compared with the theoretical digitizing bits of the 8-bit ADC, the actual ENOB (resolution) has increased by about 1.61 bits in total, which is very close to the theoretical improvement $\Delta R = 0.5\log_2 10 \approx 1.66$ bits given by (20), and at the same time no loss of analog bandwidth at the actual sampling rate is caused.
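The SNR and ENOB pairs reported for the six experiments can be cross-checked with a short script. It assumes the standard relation ENOB = (SNR − 1.76)/6.02, which the paper does not state explicitly but with which the reported figures agree; the function and dictionary names are hypothetical.

```python
def enob_from_snr(snr_db):
    """Effective number of bits from SNR in dB (standard sine-wave relation)."""
    return (snr_db - 1.76) / 6.02


# (SNR in dB, ENOB in bits) as reported for Experiments 1 to 6.
reported = {
    "no processing, 100 MSa/s": (28.0776, 4.3717),
    "successive capture averaging": (43.0485, 6.8585),
    "successive sample averaging": (44.2538, 7.0588),
    "direct decimation": (28.8353, 4.4976),
    "average-based decimation": (46.6242, 7.4525),
    "entropy fusion + decimation": (59.6071, 9.6092),
}

for name, (snr_db, enob) in reported.items():
    print(f"{name}: computed {enob_from_snr(snr_db):.4f}, reported {enob}")

gain_over_decimation = reported["entropy fusion + decimation"][1] - reported["average-based decimation"][1]
print(f"additional ENOB from data fusion: {gain_over_decimation:.2f} bits")
```

The computed values match the reported ENOB figures to within rounding, and the differences reproduce the gains quoted in the text.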

Conclusion
This paper proposes a decimation filtering algorithm based on information entropy and averaging to raise the vertical resolution of the DSO. Under oversampling and for a signal acquired once, the algorithm uses the maximum entropy of the sample data to eliminate gross errors in quantization, fuses the remaining effective sample data, and conducts average-based decimation to further filter the noise, thereby improving the DSO resolution. In order to verify the effectiveness and superiority of the algorithm, comparison experiments are conducted using different algorithms. The results show that the resolution improvement achieved by the proposed algorithm is nearly identical to the theoretical deduction. What is more, no subjective hypotheses or constraints are imposed on the measured signals during the whole process, and the analog bandwidth of the DSO at the actual sampling rate is not affected.

Figure 5: Spectrum of originally acquired signals at 100 MSa/s.

Figure 11: Signal spectrum obtained with data fusion and 10 times average-based decimation.
In (1), $A_N(n+k)$ is the averaging result, $N$ is the number of acquisitions to average, and $x_i(n+k)$ represents the corresponding sampling point at the moment $n+k$ in the $i$th acquisition. Obviously, the average value cannot be obtained with this algorithm until all $N$ acquisitions are completed. If $N$ is too large, the throughput rate of the system will be affected remarkably. For users, the delay caused by averaging is unacceptable. For the oscilloscope, the huge volume of sample data will use up memory capacity rapidly.

Table 1: Ideal values of improved vertical resolution and bandwidth based on average-based decimation under oversampling.

Table 2: 10 sets of original sample data obtained by oversampling.

Table 3: Probability and self-information quantity of each set of sample data.

Table 4: Fusion weight coefficient of 7 sets of effective sample data.

Table 5: 10 new sample data containing the data fusion result.

Table 6: Comparisons of experiment results.