Differential Privacy via Haar Wavelet Transform and Gaussian Mechanism for Range Query

Range queries are a central topic in privacy-preserving data publishing. To preserve privacy, a larger query range means that more accumulated noise is injected into the answer. This study investigates differential privacy for range queries via the Haar wavelet transform and the Gaussian mechanism. First, the noise injected into the input data via the Laplace mechanism is analyzed, and we conclude that it is difficult to judge the level of privacy protection provided by the Haar wavelet transform combined with the Laplace mechanism for range queries, because the sum of independent Laplace random variables does not follow a Laplace distribution. Second, a method of injecting noise into the Haar wavelet coefficients via the Gaussian mechanism is proposed. Finally, the maximum variance of any range query under the framework of the Haar wavelet transform and the Gaussian mechanism is given. The analysis shows that, using the Haar wavelet transform and the Gaussian mechanism, we can preserve differential privacy for each input datum and for any range query, and that the variance of the noise is far less than that obtained using the Gaussian mechanism alone. In an experimental study on the dataset age, extracted from IPUM's census data of the United States, we confirm that the proposed mechanism has a much smaller maximum noise variance than the Gaussian mechanism for range-count queries.


Introduction
Over the past ten years, differential privacy has become one of the most important methods for privacy preservation in statistical databases. Differential privacy is a promising scheme for publishing statistical query results over sensitive data, and it provides a strong privacy guarantee against adversaries with arbitrary background knowledge [1][2][3][4][5][6]. The strong privacy guarantee of differential privacy ensures that no individual in the data set significantly affects the results of an analysis of the data set. At present, three basic mechanisms are widely used to ensure differential privacy: the Laplace mechanism, the Gaussian mechanism, and the exponential mechanism. The Laplace and Gaussian mechanisms are applicable to numerical queries, and the exponential mechanism is applicable to non-numeric queries [7][8][9]. Recently, differential privacy has been adopted in many research fields, such as social network publishing [10][11][12], crowdsourced data publication [13,14], and genomic privacy [15][16][17].
With a long query range, the noise accumulated in the privacy-preserving answer to a range query can reduce the usability of the released data [18,19]. To reduce this accumulation, hierarchical decomposition is usually employed [20]. Zhang et al. proposed a differentially private algorithm for hierarchical decompositions named PrivTree. This histogram construction algorithm eliminates the dependency on a predefined limit parameter. Privacy-preserving range queries have been adopted in the field of the Internet of Things (IoT) in recent years [21][22][23]. Cai et al. studied the transaction approximate range counting problem for large IoT data. They proposed a sampling-based method to generate approximate counting results. For privacy reasons, these results are further perturbed and then published. It is theoretically proved that the result is unbiased, has bounded variance, and strengthens the privacy guarantee under differential privacy. Mahdikhani et al. proposed a communication-efficient privacy-preserving range query scheme for the fog-enhanced Internet of Things. The feature of this scheme is that it adopts the Paillier homomorphic cryptosystem and a Bloom filter data structure to achieve better privacy and higher count aggregation efficiency in the privacy-preserving range query scenario. The histogram is a representative and popular tool for data publishing and visualization tasks. Protecting private data and preventing the leakage of sensitive information have become one of the main challenges faced by histogram publication [24][25][26]. A histogram is the result of a group of counting queries. It is the core statistical tool for reporting data distribution, and it is regarded as the basis of many other statistical analyses, such as range queries [27]. The advantage of the histogram representation is that it limits the sensitivity of the queries and therefore the noise required.
For example, when histograms are used to support range or count queries, adding or deleting a single record affects at most one bin. Therefore, the sensitivity of a range or count query on the histogram is equal to 1, and the amount of noise added per bin is relatively small [28]. For the differential privacy of long-range queries on the histogram, the accumulation of noise is the key issue that needs attention.
The discrete wavelet transform (DWT) is an important technology in signal and image processing [29][30][31]. The lifting scheme, also called the second-generation wavelet, has many advantages compared with the first-generation wavelet, such as in-place computation, integer-to-integer transforms, and speed [32][33][34]. Wavelet-based privacy preservation has been studied in recent years [35][36][37]. Xiao et al. proposed differential privacy via the Haar wavelet transform. They introduced a data publishing technique named Privelet. Privelet not only ensures ε-differential privacy but also provides accurate results for range queries by injecting less noise into the wavelet coefficients. The mechanism used to guarantee differential privacy in Privelet is the Laplace mechanism, which may not be a good choice for building a privacy-preserving system based on the discrete wavelet transform. The reason is that Laplace noise does not have the property of additivity. That is, the sum of two independent Laplace random variables does not follow a Laplace distribution. That means we cannot obtain an analyzable noise distribution by wavelet reconstruction when Laplace noise is injected into the wavelet coefficients. The Gaussian mechanism for differential privacy was proposed by Dwork [38,39]. Gaussian noise can be used in hierarchical decompositions, such as wavelet transforms. The additivity of Gaussian noise is very important for the reconstruction of noisy data. On the one hand, additivity ensures that the reconstructed noise is still Gaussian noise; on the other hand, some noise can be eliminated during reconstruction.
In view of the above analysis, this study investigates differential privacy for range queries via the Gaussian mechanism and the lifting scheme of the Haar wavelet transform. In summary, the main contributions of this work are as follows:
(1) Differential privacy using the lifting Haar wavelet transform and the Laplace mechanism is analyzed. The distribution of the noise injected into the input data via wavelet reconstruction is discussed, and we conclude that this noise does not follow a Laplace distribution.
(2) Differential privacy based on the lifting Haar wavelet transform and the Gaussian mechanism is constructed. For range queries, our analysis shows that, under the proposed mechanism, the noise actually added to a given range of the original data is much less than the sum of the noise at each datum.
(3) Differential privacy for range queries via the lifting Haar wavelet and the Gaussian mechanism is discussed. We give an algorithm to compute the maximum variance of any range query for any given parameter $l$ (the length of the input data is assumed to be $2^l$). Moreover, we give a coarse estimate of the maximum variance of a range query as a function expression. Finally, we give an experimental study of the proposed mechanism, and the results show that it has a much smaller maximum noise variance than the Gaussian mechanism for range queries.
The remainder of the study is organized as follows: Section 2 introduces the fundamental definitions and theorems of differential privacy and its implementation mechanisms. Section 3 gives the theorems on how to inject Gaussian noise into the Haar wavelet coefficients. Section 4 analyzes the noise of range queries under the framework of the Gaussian mechanism and the Haar wavelet. First, the method for computing the variance of a range query is given. Second, the algorithm for computing the maximum variance over all range queries is introduced, and the interval at which the maximum variance is attained is described in detail. Finally, a coarse estimate of the maximum variance of a range query is given as a function expression. Section 5 presents the experimental verification of the maximum variance computation for range queries based on the Gaussian mechanism and the lifting Haar wavelet. Conclusions are given in Section 6.

Preliminaries
In this section, the fundamental definitions and theorems of differential privacy and its implementation mechanisms are introduced first. Furthermore, the method of injecting Laplace noise into the Haar wavelet coefficients is given. They are the basis of the other sections.
where the symbol $\|x\|_1$ denotes the $\ell_1$-norm of a database $x$, $\|x\|_1 = \sum_i |x_i|$, and $\|x - y\|_1$ denotes the $\ell_1$-distance between two databases $x$ and $y$.
Definition 3. (Laplace distribution, Lap(λ)). The Laplace distribution with mean zero and scale $\lambda$ is the distribution with probability density function: $$\mathrm{Lap}(x \mid \lambda) = \frac{1}{2\lambda} \exp\left(-\frac{|x|}{\lambda}\right).$$ In Definition 3, the variance of this distribution is $\sigma^2 = 2\lambda^2$. We write Lap(λ) to denote the Laplace distribution with mean zero and scale $\lambda$ in this study.
Theorem 1 (Laplace mechanism [39]). Let $f$ be a function with $\ell_1$-sensitivity $\Delta_1 f$. The Laplace mechanism, which adds independently drawn random noise distributed as $\mathrm{Lap}(\Delta_1 f/\varepsilon)$ to each of the $d$ components of the output, preserves $(\varepsilon, 0)$-differential privacy.
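The statement of Theorem 1 can be illustrated with a minimal sketch. The function names `laplace_noise` and `laplace_mechanism`, the example counts, and the chosen ε are illustrative assumptions, not part of the original study; the sampler relies on the standard fact that the difference of two independent exponential variables with the same rate follows a Laplace distribution.

```python
import random

def laplace_noise(scale, rng):
    """Draw one Lap(scale) sample (hypothetical helper).

    If E1 and E2 are independent Exp(rate=1/scale) variables,
    then E1 - E2 follows Lap(scale).
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def laplace_mechanism(answers, l1_sensitivity, epsilon, rng):
    """Add independent Lap(l1_sensitivity / epsilon) noise to each component."""
    scale = l1_sensitivity / epsilon
    return [a + laplace_noise(scale, rng) for a in answers]

rng = random.Random(0)
# Illustrative histogram counts; a count query has l1-sensitivity 1.
noisy_counts = laplace_mechanism([120, 75, 31], l1_sensitivity=1.0,
                                 epsilon=0.5, rng=rng)
print(noisy_counts)
```

With scale λ = Δ₁f/ε, each noisy component has variance 2λ², in line with Definition 3.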

Remark 1.
Throughout the study, we use the term "noise" to refer to a random variable with a zero mean.

Gaussian Mechanism
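Definition 4 ($\ell_2$-sensitivity [39]). Let $f: \mathbb{N}^{|\chi|} \to \mathbb{R}^d$ be an arbitrary $d$-dimensional function; then the $\ell_2$-sensitivity of the function $f$ is defined as follows: $$\Delta_2 f = \max_{x, y \in \mathbb{N}^{|\chi|},\ \|x - y\|_1 = 1} \|f(x) - f(y)\|_2,$$ where the symbol $\|x\|_2$ denotes the $\ell_2$-norm of a database $x$, $\|x\|_2 = \sqrt{\sum_i x_i^2}$, and $\|f(x) - f(y)\|_2$ denotes the $\ell_2$-distance between $f(x)$ and $f(y)$.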
Definition 5 (Gaussian distribution, Gauss($\sigma^2$)). The Gaussian distribution with mean zero and variance $\sigma^2$ is the distribution with probability density function: $$\mathrm{Gauss}(x \mid \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right).$$ In Definition 5, the variance of this distribution is $\sigma^2$. We write Gauss($\sigma^2$) to denote the Gaussian distribution with mean zero and variance $\sigma^2$.
In Theorem 2, to ensure $(\varepsilon, \delta)$-differential privacy, we can inject Gaussian noise with $\sigma^2 = 2 \ln(1.25/\delta) \cdot (\Delta_2 f/\varepsilon)^2$ into the input data directly.
In Figure 1, $x(z)$ is the input data, and $x_o(z)$ and $x_e(z)$ denote the odd-indexed samples and even-indexed samples, respectively. $a(z)$ and $d(z)$ are the approximate coefficients and detail coefficients, respectively. For the lifting scheme of the Haar wavelet, we have $p(z) = -1$ and $u(z) = 1/2$.
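The Gaussian calibration stated with Theorem 2 above, $\sigma^2 = 2\ln(1.25/\delta)\cdot(\Delta_2 f/\varepsilon)^2$, can be sketched as a small helper. The function name `gaussian_sigma` is a hypothetical choice; the ε and δ values are the ones used later in the experimental section, and the sensitivity of 1 assumes a histogram count query.

```python
import math

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Standard deviation from sigma^2 = 2 ln(1.25/delta) (Delta_2 f / epsilon)^2."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon

# Tabulate sigma^2 for the (epsilon, delta) grid used in the experiments,
# assuming l2-sensitivity 1.
for epsilon in (0.5, 0.75, 1.0, 1.25):
    for delta in (0.1, 0.01, 0.001):
        sigma = gaussian_sigma(1.0, epsilon, delta)
        print(f"epsilon={epsilon}, delta={delta}, sigma^2={sigma**2:.4f}")
```

Smaller ε or smaller δ both increase the required variance, which is why the later comparison is reported separately for each pair.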

Injecting Noise into the Input Data via Lifting
In Figure 1, the approximate coefficients $a(z)$ and the detail coefficients $d(z)$ are computed from $x_e(z)$ and $x_o(z)$ through $p(z)$ and $u(z)$, and the lifting structure has the reconstruction property stated in equation (9). Figure 1 shows one-level decomposition and reconstruction via the lifting Haar wavelet transform. The wavelet transform usually consists of many decomposition levels. We can apply the same procedure to the approximate coefficients $a(z)$ to get the multilevel Haar wavelet decomposition, as shown in Figure 2.
In Figure 2, the top decomposition level is 3 ($l = 3$), $c_{3,0}$ is the approximate coefficient, $c_{k,i}$ ($i \neq 0$) denotes the $i$th wavelet coefficient in the $k$th decomposition level, and $x_m$ ($m \in [0, 7]$) denotes the input data. In Figure 2, we observe that the number of wavelet coefficients in the $k$th decomposition level is $2^{l-k}$.
In Figure 2, given the Haar wavelet coefficients, any entry $x_m$ can be easily reconstructed as follows: $$x_m = c_{l,0} + \sum_{k=1}^{l} \sum_{i=1}^{2^{l-k}} g_{k,i}\, c_{k,i}, \quad (10)$$ where $c_{l,0}$ is the approximate coefficient, $c_{k,i}$ ($i \neq 0$) denotes the $i$th wavelet coefficient in the $k$th decomposition level, and $g_{k,i}$ equals $1$ ($-1$) if $x_m$ is in the left (right) subtree of $c_{k,i}$ and equals $0$ if $x_m$ is not in any subtree of $c_{k,i}$. For example, $x_1 = c_{3,0} + c_{3,1} + c_{2,1} - c_{1,1}$. In Figure 2, if we inject noise into the approximate coefficient and the detail coefficients, then we obtain the reconstructed data with noise.
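The tree reconstruction described above can be sketched in a few lines. This is a minimal illustration under a stated assumption: each coefficient is taken as half the difference between the averages of its left and right subtrees, and the approximate coefficient as the overall mean, which is one convention under which the signed sum over the tree reproduces every entry. The function names `haar_decompose`, `g`, and `reconstruct` and the sample data are hypothetical.

```python
def haar_decompose(x):
    """Return (approximate coefficient, {(k, i): c_ki}) for len(x) == 2**l.

    Assumed convention: c_ki = (mean of left subtree - mean of right subtree) / 2,
    with levels k = 1..l and indices i = 1..2**(l-k).
    """
    n = len(x)
    l = n.bit_length() - 1
    assert n == 1 << l, "length must be a power of two"
    coeffs = {}
    averages = [float(v) for v in x]          # level-0 averages are the data
    for k in range(1, l + 1):
        next_averages = []
        for i in range(1, len(averages) // 2 + 1):
            left, right = averages[2 * i - 2], averages[2 * i - 1]
            coeffs[(k, i)] = (left - right) / 2.0
            next_averages.append((left + right) / 2.0)
        averages = next_averages
    return averages[0], coeffs

def g(k, i, m):
    """Sign g_ki for leaf m: +1 in the left subtree, -1 in the right, else 0."""
    low = (i - 1) << k
    middle = low + (1 << (k - 1))
    high = i << k
    if low <= m < middle:
        return 1
    if middle <= m < high:
        return -1
    return 0

def reconstruct(base, coeffs, m):
    """x_m = c_l0 + sum over all (k, i) of g_ki * c_ki."""
    return base + sum(g(k, i, m) * c for (k, i), c in coeffs.items())

data = [3, 1, 4, 1, 5, 9, 2, 6]
base, coeffs = haar_decompose(data)
print([reconstruct(base, coeffs, m) for m in range(len(data))])
```

Only the $l$ ancestors of a leaf have a nonzero sign, so each entry depends on $l + 1$ coefficients in total.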

Injecting Noise into Haar Wavelet Coefficients.
For the noise injected into the Haar wavelet coefficients, the tree structure of the noise can be obtained by changing the symbols "c" and "x" to "n" in Figure 2, because the noise follows the same multilevel lifting Haar wavelet decomposition.
Referring to equation (10), the noise $n_m$ injected into the datum $x_m$ can be given as follows: $$n_m = n_{l,0} + \sum_{k=1}^{l} \sum_{i=1}^{2^{l-k}} g_{k,i}\, n_{k,i}, \quad (12)$$ where $n_{l,0}$ is the noise injected into the approximate coefficient, $n_{k,i}$ ($i \neq 0$) denotes the noise injected into the $i$th wavelet coefficient in the $k$th decomposition level, and $g_{k,i}$ equals $1$ ($-1$) if $n_m$ is in the left (right) subtree of $n_{k,i}$ and equals $0$ if $n_m$ is not in any subtree of $n_{k,i}$. The range sum of this noise has a special property: some subnoise items are eliminated when computing the sum over a range. For example, referring to Figure 2 and equation (12), we have $$\sum_{m=0}^{7} n_m = 8\, n_{3,0}.$$ In the above equation, all subnoise items except $n_{3,0}$ have been eliminated. This inspires us to apply this property to range queries for differential privacy.

Getting Input Data with Noise.
Based on the above two sections, we reconstruct the input data with noise by using the multilevel lifting Haar wavelet transform. Considering Figure 2, the input data with noise is shown in Figure 3.
In Figure 3, $x_m$ ($m \in [0, 7]$) denotes the reconstructed input data, $n_m$ is the noise injected into the datum $x_m$, and $x_m + n_m$ denotes the input data with noise. Referring to equations (10) and (12), we have $$x_m + n_m = (c_{l,0} + n_{l,0}) + \sum_{k=1}^{l} \sum_{i=1}^{2^{l-k}} g_{k,i}\, (c_{k,i} + n_{k,i}),$$ where the meanings of the symbols $c_{l,0}$, $n_{l,0}$, $c_{k,i}$, $n_{k,i}$, and $g_{k,i}$ are as stated before.
Based on the above analysis, we conclude that the input data with noise can be obtained by injecting noise, such as Laplace noise or Gaussian noise, into the approximate and detail coefficients. Moreover, the noise injected into each input datum is the signed sum of the noise injected into the approximate coefficient and the detail coefficients above it.
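The cancellation of subnoise items in a range sum can be checked numerically. The sketch below is an illustration under assumptions: the helper name `entry_noise` is hypothetical, and the coefficient noise is drawn from a unit Gaussian only to have concrete numbers, since the cancellation is purely algebraic and does not depend on the distribution.

```python
import random

def entry_noise(l, base_noise, coeff_noise, m):
    """n_m = n_l0 + sum_k g_ki * n_ki, visiting only the ancestors of leaf m."""
    total = base_noise
    for k in range(1, l + 1):
        i = (m >> k) + 1                                 # ancestor index at level k
        sign = 1 if ((m >> (k - 1)) & 1) == 0 else -1    # left subtree: +1, right: -1
        total += sign * coeff_noise[(k, i)]
    return total

l = 3
rng = random.Random(7)
base_noise = rng.gauss(0.0, 1.0)
coeff_noise = {(k, i): rng.gauss(0.0, 1.0)
               for k in range(1, l + 1)
               for i in range(1, 2 ** (l - k) + 1)}
noise = [entry_noise(l, base_noise, coeff_noise, m) for m in range(2 ** l)]

# Over the full range every wavelet term cancels and only 8 * n_{3,0} remains.
print(sum(noise), 8 * base_noise)
```

For a partial range such as $n_1 + n_2 + n_3$, only some terms cancel, and the remaining terms are the ones whose left and right subtrees are covered unequally.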

Injecting Noise via Haar Wavelet and Laplace Mechanism.
In equation (12), we set $n_{k,i}$ as noise with the Laplace distribution given in Definition 3; this choice is stated in equation (15), where $\lambda$ is the scale parameter of the Laplace distribution and $k$ denotes the $k$th decomposition level of the lifting Haar wavelet transform. Combining equations (12) and (15) gives equation (16), where the symbols $n_m$, $n_{k,i}$, $n_{l,0}$, and $g_{k,i}$ are the same as those in equation (12). Using equations (12) and (16), we can describe the Laplace noise injected into the Haar wavelet coefficients, as listed in Table 1.
According to Table 1, letting the range of the range query be $n_0$ to $n_7$, we have $$\sum_{m=0}^{7} n_m = 8\, n_{3,0}.$$ That means the sum of all noise injected into the input data is noise with a Laplace distribution with mean zero and scale $\lambda$.
According to Table 1, letting the range of the range query be $n_1$ to $n_3$, we have $$n_1 + n_2 + n_3 = 3 n_{3,0} + 3 n_{3,1} - n_{2,1} - n_{1,1}.$$ As we know, the sum of independent Laplace random variables does not follow a Laplace distribution, so the composite noise $n_1 + n_2 + n_3$ of the range query, which is injected into the input data $x_1 + x_2 + x_3$, is not noise with a Laplace distribution. Therefore, we conclude that it is difficult to judge the level of differential privacy protection based on the Haar wavelet transform and the Laplace mechanism.
To solve this problem, we consider adopting the Gaussian mechanism for differential privacy via the Haar wavelet transform in the next section.
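The claim that a sum of independent Laplace variables is not Laplace can be made explicit with characteristic functions. The following short derivation is a standard argument added for illustration; the scales $a$ and $b$ are generic symbols, not values from the original study.

```latex
% Characteristic function of X ~ Lap(\lambda):
\varphi_{X}(t) = \mathbb{E}\left[e^{\mathrm{i} t X}\right] = \frac{1}{1 + \lambda^{2} t^{2}}.

% For independent X ~ Lap(a) and Y ~ Lap(b) with a, b > 0:
\varphi_{X+Y}(t) = \varphi_{X}(t)\,\varphi_{Y}(t)
                 = \frac{1}{1 + (a^{2} + b^{2})\, t^{2} + a^{2} b^{2}\, t^{4}}.

% A Laplace law would require the form 1 / (1 + c^{2} t^{2}) for some c > 0,
% which is impossible because the coefficient a^{2} b^{2} of t^{4} is nonzero.
```

By contrast, the product of two zero-mean Gaussian characteristic functions $e^{-\sigma_1^2 t^2/2}\, e^{-\sigma_2^2 t^2/2}$ is again of Gaussian form, which is the additivity used in the next section.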

Injecting Noise into Haar Wavelet Coefficients via Gaussian Mechanism
To inject Gaussian noise into the Haar wavelet coefficients in Figure 3, we can set $n_{k,i}$ as noise with the Gaussian distribution given in Definition 5. Let $$n_{k,i} \sim \mathrm{Gauss}\left(\frac{3\sigma^2}{4^k}\right), \quad (20)$$ where $3\sigma^2/4^k$ is the variance of the Gaussian distribution (the approximate coefficient noise $n_{l,0}$ takes $k = l$).
According to equations (12) and (20), we obtain equation (21), where the symbols $n_m$, $n_{k,i}$, $n_{l,0}$, and $g_{k,i}$ are the same as those in equation (12).
Using equations (20) and (21), we can describe the Gaussian noise injected into the Haar wavelet coefficients, as listed in Table 2.
Theorem 3. Suppose that $X_1$ and $X_2$ are independent random variables, and $X_i$ has a Gaussian distribution with mean zero and variance $\sigma_i^2$ for $i \in \{1, 2\}$. Then, $X_1 \pm X_2$ has a Gaussian distribution with mean zero and variance $\sigma_1^2 + \sigma_2^2$; $kX_1$ has a Gaussian distribution with mean zero and variance $(k\sigma_1)^2$. The proof of Theorem 3 is omitted because it is a basic property of the Gaussian distribution.
According to Table 2 and Theorem 3, there is $$\sum_{m=0}^{7} n_m = 8\, n_{3,0} \sim \mathrm{Gauss}(3\sigma^2).$$ That means the sum of all noise injected into the input data is noise with a Gaussian distribution. We analyze the distribution of the noise injected into each input datum as follows.
Theorem 4. Injecting Gaussian noise with variance $\sigma_k^2 = 3\sigma^2/4^k$ into the Haar wavelet coefficients in the $k$th decomposition level (the maximum decomposition level is $l$, as shown in Figure 3), the noise injected into each input datum via Haar wavelet reconstruction is Gaussian noise with variance $(1 + 2/4^l)\sigma^2$.
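The variance in Theorem 4 follows from equation (12), equation (20), and Theorem 3. The short derivation below is a sketch added for illustration; it uses only the fact that each leaf has exactly one ancestor per level, entering with sign $\pm 1$.

```latex
\operatorname{Var}(n_m)
  = \operatorname{Var}(n_{l,0}) + \sum_{k=1}^{l} \frac{3\sigma^{2}}{4^{k}}
  = \frac{3\sigma^{2}}{4^{l}} + 3\sigma^{2} \cdot \frac{1}{3}\left(1 - \frac{1}{4^{l}}\right)
  = \left(1 + \frac{2}{4^{l}}\right)\sigma^{2}.
```

For $l = 3$ this gives $(1 + 2/64)\sigma^2 = (33/32)\sigma^2$, and the excess over $\sigma^2$ vanishes quickly as $l$ grows.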
We simulate the process of injecting Gaussian noise with $\sigma = 12$ into the wavelet coefficients with 15-level decomposition using Theorem 4, and of injecting Gaussian noise into the input data directly, and draw the noise-count figures as follows. Figure 4(a) shows the count of the noise injected into the input data by injecting Gaussian noise with $\sigma = 12$ into the Haar wavelet coefficients with 15-level decomposition using equation (20); Figure 4(b) shows the noise count obtained by injecting Gaussian noise with mean zero and variance $(1 + 2/4^l)\sigma^2$ into the input data directly. In Figure 4(c), we draw the two curves together and find that they almost overlap. Figure 4(c) shows that the method of injecting Gaussian noise into the Haar wavelet coefficients has the same level of differential privacy protection as the method of injecting Gaussian noise into the input data directly.
Figure 4: Comparison between injecting Gaussian noise into wavelet coefficients using Theorem 4 and injecting Gaussian noise into input data directly.
Computational Intelligence and Neuroscience
In this study, we focus on the application to range queries. According to Theorems 3 to 5, the distribution of the noise of a range query using the Gaussian mechanism and the Haar wavelet is a Gaussian distribution. Therefore, we can calculate the variance of the noise easily. For example, as listed in Table 2, we have $$n_1 + n_2 + n_3 = (n_{3,0} + n_{3,1} + n_{2,1} - n_{1,1}) + (n_{3,0} + n_{3,1} - n_{2,1} + n_{1,2}) + (n_{3,0} + n_{3,1} - n_{2,1} - n_{1,2}) = 3 n_{3,0} + 3 n_{3,1} - n_{2,1} - n_{1,1}, \quad (24)$$ whose variance is $\left(9 \cdot \frac{3}{4^3} + 9 \cdot \frac{3}{4^3} + \frac{3}{4^2} + \frac{3}{4}\right)\sigma^2 = (57/32)\sigma^2$. From this example, we find that some noises (such as $n_{2,1}$ and $-n_{2,1}$, $n_{1,2}$ and $-n_{1,2}$) are eliminated by the operation of addition. According to Theorem 4, the variance of the noise injected into each input datum is $(1 + 2/4^3)\sigma^2$ for $l = 3$, so the sum of the three per-entry variances is $3 \cdot (1 + 2/4^3)\sigma^2 = (99/32)\sigma^2$. Compared with equation (24), we conclude that, for the range query, the variance of the noise actually added to a given range of the original data, $(57/32)\sigma^2$ here, is less than the sum of the noise variances at each datum, $(99/32)\sigma^2$.
Therefore, this is a very important property of the Gaussian mechanism for range queries.

Noise of Range Query under the Framework of Haar Wavelet and Gaussian Mechanism
In this section, we discuss how to compute the noise of a range query under the framework of the Haar wavelet transform and the Gaussian mechanism. First, we give the method for computing the variance of a range query in detail. Second, the interval of the range query at which the maximum variance is attained is introduced. Third, to speed up the computation of the maximum variance, we observe the range-count intervals at which the maximum variance is attained and give a fast computing method. Finally, we give a coarse estimate of the maximum variance of a range query as a function expression.

Computing Method for the Variance of Range Query.
In Figure 3, we choose Gaussian noise and inject it into the approximate coefficient and each wavelet coefficient. The variance of the Gaussian noise injected into the approximate coefficient is $3\sigma^2/4^3$. The variance of the Gaussian noise injected into each wavelet coefficient is $3\sigma^2/4^k$ for level $k$ ($k \in [1, 3]$). The relationship between the decomposition level and the variance of the noise is listed in Table 3.
The noise sum of a range query via the Haar wavelet transform for an interval $S$ can be given by the following equation (Figure 3): $$n_{\mathrm{sum}} = |S| \cdot n_{l,0} + \sum_{k=1}^{l} \sum_{i=1}^{2^{l-k}} \left(\alpha(n_{k,i}) - \beta(n_{k,i})\right) n_{k,i}, \quad (25)$$ where $S$ is the interval of any range query, $|S|$ is the number of entries in $S$, $n_{k,i}$ denotes the $i$th noise injected into a wavelet coefficient in the $k$th decomposition level, $\alpha(n_{k,i})$ denotes the number of leaves in the left subtree of $n_{k,i}$ that are contained in $S$, and $\beta(n_{k,i})$ denotes the number of leaves in the right subtree of $n_{k,i}$ that are contained in $S$ (Figure 3). Now we analyze the noise variance of the range query. According to equation (20), the noise injected into the approximate coefficient is $n_{l,0}$ and its variance is $3\sigma^2/4^l$; the noise injected into a wavelet coefficient is $n_{k,i}$ and its variance is $3\sigma^2/4^k$. Therefore, according to Theorem 3, we can compute the noise variance of the range query by replacing $n_{l,0}$ and $n_{k,i}$ with $3\sigma^2/4^l$ and $3\sigma^2/4^k$ in equation (25), respectively, and squaring their multipliers: $$\sigma^2_{\mathrm{sum}} = |S|^2 \cdot \frac{3\sigma^2}{4^l} + \sum_{k=1}^{l} \sum_{i=1}^{2^{l-k}} \left(\alpha(n_{k,i}) - \beta(n_{k,i})\right)^2 \frac{3\sigma^2}{4^k}. \quad (26)$$
To compute the value of $\sigma^2_{\mathrm{sum}}$, we first need to calculate the values of $\alpha(n_{k,i})$ and $\beta(n_{k,i})$. In Figure 3, the interval of leaves in the subtree of $n_{k,i}$ has length $2^k$. The left endpoint of this interval has the subscript $(i - 1) \cdot 2^k$ and the right endpoint has the subscript $i \cdot 2^k - 1$.
Therefore, the subtree of $n_{k,i}$ has the subscript interval of leaves $$S_{n_{k,i}} = \left[(i - 1) \cdot 2^k,\ i \cdot 2^k - 1\right]. \quad (27)$$
For example, the wavelet coefficient $n_{2,2}$ in Figure 3 has the subscript interval of leaves $[4, 7]$.
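According to equation (27), we can obtain the left-half interval $[\alpha_L, \alpha_R]$ and the right-half interval $[\beta_L, \beta_R]$ of $S_{n_{k,i}}$: $$[\alpha_L, \alpha_R] = \left[(i - 1) \cdot 2^k,\ (i - 1) \cdot 2^k + 2^{k-1} - 1\right], \qquad [\beta_L, \beta_R] = \left[(i - 1) \cdot 2^k + 2^{k-1},\ i \cdot 2^k - 1\right].$$ Therefore, $\alpha(n_{k,i})$ and $\beta(n_{k,i})$ can be given by counting the entries in the intersection of $S$ with $[\alpha_L, \alpha_R]$ and of $S$ with $[\beta_L, \beta_R]$, respectively.
Let $S = [S_L, S_R]$, where $S_L$ and $S_R$ denote the left and right endpoints of the given range query interval, respectively. Then $\alpha(n_{k,i})$ and $\beta(n_{k,i})$ are given by equations (30) and (31), where $k \in [1, l]$ and $i \in [1, 2^{l-k}]$ (Figure 3).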
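The counts $\alpha(n_{k,i})$ and $\beta(n_{k,i})$ described above can be sketched as interval intersections. This is a minimal illustration: the names `half_intervals`, `overlap`, and `alpha_beta` are hypothetical, and the closed-form `max`/`min` expression is an assumed equivalent of the case analysis, not a transcription of equations (30) and (31).

```python
def half_intervals(k, i):
    """Left-half and right-half leaf intervals under n_{k,i} (inclusive bounds)."""
    low = (i - 1) * 2 ** k
    middle = low + 2 ** (k - 1)
    high = i * 2 ** k - 1
    return (low, middle - 1), (middle, high)

def overlap(left, right, s_left, s_right):
    """Number of integers in [left, right] that also lie in [s_left, s_right]."""
    return max(0, min(right, s_right) - max(left, s_left) + 1)

def alpha_beta(k, i, s_left, s_right):
    """Return (alpha, beta) for n_{k,i} and the query interval [s_left, s_right]."""
    (a_l, a_r), (b_l, b_r) = half_intervals(k, i)
    return (overlap(a_l, a_r, s_left, s_right),
            overlap(b_l, b_r, s_left, s_right))

# Query S = [1, 3] on 8 leaves (l = 3).
for k, i in [(3, 1), (2, 1), (2, 2), (1, 1), (1, 2)]:
    print((k, i), alpha_beta(k, i, 1, 3))
```

A coefficient contributes to the range noise only when its two counts differ, which is why fully covered and fully excluded subtrees drop out of equation (26).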

Maximum Variance of Range Query.
The aim of this study is to obtain the maximum noise variance over all range queries for any fixed maximum decomposition level $l$ (the number of input data is $2^l$). According to equation (26), we have $$\sigma^2_{\mathrm{sumMax}} = \max_{0 \le S_L \le S_R \le 2^l - 1} \sigma^2_{\mathrm{sum}}. \quad (32)$$ In equation (32), the given parameter is $l$. To compute the value of $\sigma^2_{\mathrm{sumMax}}$, we need to enumerate every range query interval $S = [S_L, S_R]$ over the data and obtain the counts $\alpha(n_{k,i})$ and $\beta(n_{k,i})$ using equations (30) and (31). We give the pseudocode for computing $\sigma^2_{\mathrm{sumMax}}$ as follows: Algorithm 1 illustrates the details of the algorithm for computing $\sigma^2_{\mathrm{sumMax}}$ for a fixed $l$ ($l \ge 2$).
Step 1 is the initialization of $\sigma^2_{\mathrm{sumMax}}$. Steps 2 to 3 are the loops over the range endpoints $S_L$ and $S_R$.
Step 5 is the loop over the decomposition level $k$.
Step 6 is the loop over the index of the wavelet coefficient in the $k$th decomposition level.
Step 7 is the computation of $\alpha_L$, $\alpha_R$, $\beta_L$, and $\beta_R$ of $n_{k,i}$. Steps 8 to 13 denote the computation of $\alpha(n_{k,i})$. Steps 14 to 19 denote the computation of $\beta(n_{k,i})$.
Step 20 is the computation of the right part of $\sigma^2_{\mathrm{sum}}$ using equation (26).
Step 29 denotes the output of Algorithm 1.
According to Algorithm 1, we calculate the values of $\sigma^2_{\mathrm{sumMax}}$, $S_L$, and $S_R$, as listed in Table 4. In Table 4, $S_R - S_L + 1$ denotes the length of the interval at which $\sigma^2_{\mathrm{sumMax}}$ is attained. Computing $\sigma^2_{\mathrm{sumMax}}$ with Algorithm 1 takes a very long time when $l > 14$, so we need a method to speed up Algorithm 1.
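The exhaustive search described for Algorithm 1 can be sketched as follows. This is an illustrative implementation under assumptions, not the authors' program: the names `range_variance` and `max_range_variance` are hypothetical, variances are expressed in units of $\sigma^2$, and `fractions.Fraction` is used so that the results are exact.

```python
from fractions import Fraction

def range_variance(l, s_left, s_right):
    """Exact noise variance of the query [s_left, s_right], in units of sigma^2.

    Implements |S|^2 * 3/4^l + sum_k sum_i (alpha - beta)^2 * 3/4^k.
    """
    size = s_right - s_left + 1
    total = Fraction(3 * size * size, 4 ** l)      # approximate coefficient term
    for k in range(1, l + 1):
        half = 2 ** (k - 1)
        for i in range(1, 2 ** (l - k) + 1):
            low = (i - 1) * 2 ** k
            alpha = max(0, min(low + half - 1, s_right) - max(low, s_left) + 1)
            beta = max(0, min(low + 2 * half - 1, s_right)
                       - max(low + half, s_left) + 1)
            total += Fraction(3 * (alpha - beta) ** 2, 4 ** k)
    return total

def max_range_variance(l):
    """Brute-force search over all intervals; returns (variance, S_L, S_R)."""
    best = (Fraction(0), 0, 0)
    n = 2 ** l
    for s_left in range(n):
        for s_right in range(s_left, n):
            variance = range_variance(l, s_left, s_right)
            if variance > best[0]:
                best = (variance, s_left, s_right)
    return best

print(range_variance(3, 1, 3))     # the n_1 + n_2 + n_3 example
print(max_range_variance(3))
```

The search visits on the order of $4^l$ intervals and $2^l$ coefficients per interval, which is consistent with the observation that the exhaustive approach becomes impractical for large $l$.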

Speeding Algorithm for Computing the Maximum Variance of Range Query.
Algorithm 1.
Input: the maximum decomposition level $l$
Output: $\sigma^2_{\mathrm{sumMax}}$, $S_L$ and $S_R$ (for $\sigma^2_{\mathrm{sumMax}}$)
...
(5) For $k = 1$ to $l$
(6) For $i = 1$ to $2^{l-k}$
(7) Compute $\alpha_L$, $\alpha_R$, $\beta_L$, $\beta_R$ of $n_{k,i}$ using equations (38) and (39)
...
(19) End If
(20) Compute $sum = sum + (1/4^k)\,(\alpha(n_{k,i}) - \beta(n_{k,i}))^2$
(21) End For
(22) End For
(23) Compute $\sigma^2_{\mathrm{sum}}$ using $sum$ and (26)
...
Observing the values of $S_L$ and $S_R$ in Table 4, we give some statistical rules to compute them directly in Table 5. According to Table 5, we can compute $\sigma^2_{\mathrm{sumMax}}$ using the following algorithm: Algorithm 2 illustrates the details of the algorithm for computing $\sigma^2_{\mathrm{sumMax},l}$ for any $l$ ($l \ge 2$).
Step 1 is the initialization of $S_L$ and $S_R$. Step 2 is the loop over the maximum decomposition level $l$. Steps 3 to 7 denote the computation of $S_L$ and $S_R$ according to the maximum decomposition level $l$.
Step 10 is the loop over the decomposition level $k$.
Step 11 is the loop over the index of the wavelet coefficient in the $k$th decomposition level.
Step 12 is the computation of $\alpha_L$, $\alpha_R$, $\beta_L$, and $\beta_R$ of $n_{k,i}$. Steps 13 to 18 denote the computation of $\alpha(n_{k,i})$. Steps 19 to 24 denote the computation of $\beta(n_{k,i})$.
Step 25 is the computation of the right part of $\sigma^2_{\mathrm{sum}}$ using equation (26).
Step 28 is the computation of $\sigma^2_{\mathrm{sum}}$ using equation (26).
Step 30 denotes the output of Algorithm 2 for any $l$.
$S_L$ and $S_R$ can also be given directly by simplifying the results in Table 5, with separate expressions for the case where $l$ is an even number and the case where $l$ is an odd number. Therefore, we give the values of $\sigma^2_{\mathrm{sumMax}}$, $S_L$, $S_R$, and $S_R - S_L + 1$ for $l$ from 2 to 30 in Table 6.
In Table 6, $S_L$ and $S_R$ denote the left and right endpoints of the range query interval at which $\sigma^2_{\mathrm{sumMax}}$ is attained. $S_R - S_L + 1$ denotes the length of that interval. In Table 6, we observe that $\sigma^2_{\mathrm{sumMax}}$ increases by about $(2/3)\sigma^2$ each time the parameter $l$ increases by 1.

Coarse Estimation of the Maximum Variance of Range Query.
In previous sections, the maximum variance of range queries via the Gaussian mechanism and the Haar wavelet transform is given for any $l$. However, it is obtained using a computer program, not from a function expression. In this section, a coarse estimate of the maximum variance is given in Theorem 6 as a function expression with parameters $l$ and $\sigma^2$.
Theorem 6 (Coarse estimation of the maximum variance). Let $N$ be a set of independent Gaussian noise $n_{k,i} \in N$ with variance $3\sigma^2/4^k$, which is injected into the one-dimensional Haar wavelet coefficients and the approximate coefficient (Figure 3). Suppose $l = \log_2 |N|$; that is, the number of Gaussian noise items injected into the Haar wavelet coefficients and the approximate coefficient is $2^l$ (the number of input data is also $2^l$). Let $M$ be the noisy data reconstructed from $C + N$ ($C$ is the set of one-dimensional Haar wavelet coefficients of the input data; refer to Figure 2). Then, for any range query answered using $M$, the variance of the noise in the answer is at most $((6l + 9)/4)\sigma^2$.
Proof. Referring to Figure 3 and equation (26), we observe that for any noise $n_{k,i}$, if none of the leaves under $n_{k,i}$ is contained in $S$, then $\alpha(n_{k,i}) = \beta(n_{k,i}) = 0$. On the other hand, if all leaves under $n_{k,i}$ are covered by $S$, then $\alpha(n_{k,i}) = \beta(n_{k,i}) = 2^{k-1}$. Therefore, $\alpha(n_{k,i}) - \beta(n_{k,i}) \neq 0$ only if the left or right subtree of $n_{k,i}$ partially intersects $S$. At any level of the decomposition tree except the $l$th level, there exist at most two such noises. At level $l$, at most one such noise satisfies the condition $\alpha(n_{k,i}) - \beta(n_{k,i}) \neq 0$.
Consider a noise $n_{k,i}$ at level $k$ ($k \in [1, l]$) such that $\alpha(n_{k,i}) - \beta(n_{k,i}) \neq 0$. Since the left (right) subtree of $n_{k,i}$ contains at most $2^{k-1}$ leaves, we have $\alpha(n_{k,i}), \beta(n_{k,i}) \in [0, 2^{k-1}]$. So, there is $|\alpha(n_{k,i}) - \beta(n_{k,i})| \le 2^{k-1}$. Therefore, the contribution of the noise $n_{k,i}$ ($k \in [1, l]$) to the variance of the range query is at most $$\left(2^{k-1}\right)^2 \cdot \frac{3\sigma^2}{4^k} = \frac{3\sigma^2}{4}.$$ On the other hand, since $|S| \le 2^l$, the noise in the approximate coefficient ($n_{l,0}$) contributes a variance of at most $$\left(2^l\right)^2 \cdot \frac{3\sigma^2}{4^l} = 3\sigma^2.$$ Therefore, the total variance contributed by the wavelet coefficients of levels 1 to $l - 1$ is at most $2 \cdot (l - 1) \cdot 3\sigma^2/4$, and the variance contributed by the wavelet coefficient of level $l$ is at most $1 \cdot 3\sigma^2/4$. According to equation (26), the variance of the noise is at most $$2(l - 1) \cdot \frac{3\sigma^2}{4} + \frac{3\sigma^2}{4} + 3\sigma^2 = \frac{6l + 9}{4}\,\sigma^2,$$ which completes the proof. The conclusion in Theorem 6 can also be obtained by observing Table 2. Now, we give an intuitive explanation of Theorem 6. According to Table 2, we can give a coarse estimate of the maximum variance of a range query. In Table 2, we insert a row at the bottom to calculate the maximum variance sum for each column. The new table is shown as follows.

Experimental Verification
This section presents the experimental verification of the proposed framework, that is, the computation of the maximum variance for range queries based on the Gaussian mechanism and the lifting Haar wavelet. We use the dataset age, which contains census records of individuals from the United States. The dataset age has 107,974,787 records, each of which corresponds to the age of an individual, extracted from the IPUM's census data of the United States [40]. The ages range from 0 to 135 and take only 128 distinct values (ages 121, 123, 127, 128, 131, 132, 133, and 134 are empty). We count the number of records for each age and use the resulting histogram of age as the input of our experiments. Given a query length $L$, we test all possible range queries of length $L$ and report the maximum variance of the range query over the input data. The noise injected into the input data via the Gaussian mechanism is Gaussian noise, and its variance is estimated by the sample variance. Therefore, the sample variance is adopted in this study and is computed by the following equation: $$s^2 = \frac{1}{m - 1} \sum_{i=1}^{m} \left(n_i - \frac{1}{m} \sum_{j=1}^{m} n_j\right)^2,$$ where $n_i$ (or $n_j$) is the noise injected into the input data and $m$ is the number of input data. We study the maximum variance of the range query noise for $\varepsilon$ chosen from the set {0.5, 0.75, 1.0, 1.25} and $\delta$ chosen from the set {0.1, 0.01, 0.001}. For each value of $\varepsilon$, we draw the maximum variance of the range-count noise using the Gaussian mechanism and the Gaussian mechanism with the Haar wavelet transform when $\delta$ is equal to 0.1, 0.01, and 0.001.
To compute the variance, we inject the Gaussian noise into the input data or the Haar wavelet coefficients 10000 times. To compute the maximum variance of the range query with fixed $\varepsilon$ and $\delta$, the variance of every range query of range size $k$ needs to be computed first. Then, the maximum variance for range size $k$ is obtained by comparing the variances of all the range queries of size $k$.
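The repeated-injection procedure described above can be sketched as a small Monte Carlo experiment. This is an illustration under assumptions rather than the experimental code of the study: it uses $l = 3$, $\sigma = 1$, the query $[1, 3]$, a fixed seed, and hypothetical function names, and it estimates the variance with `statistics.variance`, which uses the $m - 1$ denominator.

```python
import math
import random
import statistics

def wavelet_range_noise(l, sigma, s_left, s_right, rng):
    """One draw of the range-query noise when noise is injected into coefficients."""
    base = rng.gauss(0.0, sigma * math.sqrt(3.0 / 4 ** l))
    coeff = {(k, i): rng.gauss(0.0, sigma * math.sqrt(3.0 / 4 ** k))
             for k in range(1, l + 1)
             for i in range(1, 2 ** (l - k) + 1)}
    total = 0.0
    for m in range(s_left, s_right + 1):
        n_m = base
        for k in range(1, l + 1):
            i = (m >> k) + 1
            sign = 1 if ((m >> (k - 1)) & 1) == 0 else -1
            n_m += sign * coeff[(k, i)]
        total += n_m                      # inverse transform, then range sum
    return total

def direct_range_noise(l, sigma, s_left, s_right, rng):
    """One draw when independent noise of the same per-entry variance is added."""
    std = sigma * math.sqrt(1.0 + 2.0 / 4 ** l)
    return sum(rng.gauss(0.0, std) for _ in range(s_left, s_right + 1))

rng = random.Random(2024)
trials = 20000
wavelet_var = statistics.variance(
    wavelet_range_noise(3, 1.0, 1, 3, rng) for _ in range(trials))
direct_var = statistics.variance(
    direct_range_noise(3, 1.0, 1, 3, rng) for _ in range(trials))
print(wavelet_var, direct_var)
```

Under these assumptions the two estimates should lie close to the theoretical values $(57/32)\sigma^2$ and $(99/32)\sigma^2$ from the worked example for $l = 3$.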
For input data of length 128 (such as a histogram with 128 bins), we can draw the maximum variance diagram of the range query using the Gaussian mechanism and the Gaussian mechanism with the Haar wavelet transform for any range size, as shown in Figure 5.
In Figure 5, "Gauss" means injecting noise into each histogram bin via the Gaussian mechanism directly and then obtaining the noise of the range query by addition. "GaussWave" denotes the following procedure. First, noise is injected into the lifting Haar wavelet coefficients using Theorem 5. Second, the noise injected into each histogram bin is obtained by the inverse wavelet transform. Finally, the noise of the range query for any range size is obtained by adding the noise of the bins together.
In Figure 5, we observe that the maximum variance increases linearly with the range size for "Gauss." In Figure 5, for any $\varepsilon$ and $\delta$, the maximum variance of the noise using the "GaussWave" method is far less than that using the "Gauss" method.
To observe the variation tendency of "GaussWave" in Figure 5, we draw the maximum variance diagram of the range query using only the Gaussian mechanism with the Haar wavelet transform, as shown in Figure 6.
In Table 9, the column "max-value" presents the experimental result for the largest maximum variance of the range query. The column "$\sigma^2_{\mathrm{sumMax}}$" presents the result of the theoretical analysis, computed using the formula $\sigma^2_{\mathrm{sumMax}} = 6.248291 \cdot \sigma^2$. The last column "difference" is the difference between the columns "max-value" and "$\sigma^2_{\mathrm{sumMax}}$." In Table 9, we find that the results of the experiment and the theoretical analysis are in substantial agreement.
This section gives the experimental verification of the framework for computing the maximum variance of range queries. This privacy-preserving framework is built using the Gaussian mechanism and the Haar wavelet. In Figure 6, for any $\varepsilon$ and $\delta$, the maximum variance of the noise increases with the range size until it reaches its maximum at a range size of 106, and it decreases with the range size thereafter. In Table 9, the experimental value and the theoretical value of the maximum variance for range count are compared, and the results show that they are in substantial agreement.

Conclusions
In this study, we proposed a new differential privacy framework for range queries via the Haar wavelet transform and the Gaussian mechanism. The theorems on how to inject Gaussian noise into the Haar wavelet coefficients are given. The noise of range queries under the theoretical framework of the Haar wavelet and the Gaussian mechanism is analyzed. The algorithm to compute the maximum variance of any range query for any given parameter $l$ is introduced. A coarse estimate of the maximum variance of a range query is given as a function expression. The experimental results show that the maximum variance of the noise using the Gaussian mechanism with the Haar wavelet is far less than that using the Gaussian mechanism alone. The computation of the maximum variance for range queries based on the lifting Haar wavelet and the Gaussian mechanism is verified experimentally, and the results show that the experimental value and the theoretical value of the maximum variance for range count are in substantial agreement. For future work, we plan to apply our method to the privacy protection of histogram publication. Furthermore, we want to investigate how to combine our method with machine learning algorithms, such as decision trees and random forests.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.