Power Consumption Models for Decimation FIR Filters in Multistandard Receivers



Introduction
Wireless technologies are now widespread because of their flexibility of use. However, many different standards coexist, and a new challenge for the designer of embedded communication systems is to implement devices that support multiple communication standards, providing easy access to information everywhere with low hardware complexity. In general, such devices include communication chains with decimation and selection filters. Typically, a multistandard communication chain is composed of an RF front end, an oversampling analogue-to-digital converter, and a cascade of decimation filters (Figure 1) [1]. Power consumption is an important constraint in embedded system design, and the design of the decimation filters has a substantial impact on the power consumption of multistandard receivers. This work focuses on estimating the power consumption of FIR filters in direct form (Figure 2). The polyphase form of FIR filters is widely recommended for reducing power consumption in comparison with other implementation forms. To the best of our knowledge, however, there has been no clear study based on experimental results showing how much power FIR filters consume. The work by Dumonteix et al. [2] is widely cited, but it deals only with the power consumption, area, and critical path of a particular implementation of a comb filter. That work showed that an appropriate filter decomposition combined with polyphase decomposition could lead to a significant reduction in the power consumption of the CIC filter.
The main objective of this work is to provide models that evaluate the power consumption of FIR decimation filters in direct form. These models consider the main filter parameters: the filter order, the input wordlength, and the coefficient wordlength. The models are given for the STM90 nm technology and the STM65 nm low-power technology. The operating conditions are specified in Section 3.
The second objective of this paper is to study the impact of the polyphase form on the power consumption of FIR filters. Some guidelines on the best decomposition to perform in order to save power are drawn from the synthesis results in this work.
The ultimate aim of this study is to help filter designers decide the best way to decompose the filter processing into stages to guarantee optimal power consumption.
This paper is organized as follows. Section 2 presents the specifications of the symmetric FIR filter that is used to estimate the power consumption of a decimation FIR filter. In Section 3, we present the power consumption models that are obtained for the direct implementation form. In Section 4, the models are validated by using synthesis results. In Section 5, we present the implementation results for the polyphase form usage. A use case based on established models is presented in Section 6. Finally, conclusions are given in Section 7.

Characterization of Decimator FIR Filter Parameters
When designing an FIR channel-selection and decimation filter, the filter designer faces a trade-off between channel selection efficiency and filter complexity. Indeed, the design must guarantee channel selection with minimum complexity in terms of occupied area and power consumption. Depending on the communication standard, filter designers choose the filter order and coefficient wordlength that guarantee the required signal-to-noise ratio. The input data wordlength is typically deduced from the dynamic range of the input signal of the analogue-to-digital converter. Hence, for hardware performance evaluation, the designer should consider the following three parameters in FIR filter power consumption: input wordlength, filter order, and coefficient wordlength.
To evaluate the impact of these parameters on the power consumption of an FIR filter, we performed a large number of FIR filter architecture syntheses. We considered filter orders from 7 to 64 (with a step of three). We chose to implement both the direct and polyphase forms. To reduce the hardware implementation complexity of the filter, we used a Wallace adder tree [3,4] to add all multiplication results (Figure 2).
For the polyphase decomposition, each subfilter was implemented as a FIR filter in the direct form. For simplicity, we chose to run the syntheses with the input wordlength equal to 4 bits, 8 bits, 16 bits, and 32 bits. In the next section, we will show that, for intermediate values, it is possible to interpolate the power consumption of the filter.
In the same way, we chose three possible values for the filter coefficient wordlength: 4 bits, 8 bits, and 16 bits. In fact, both the filter order and the coefficient size depend on the filter's mask. We estimate that 16 bits offer enough accuracy for the quantization process.
The performance estimation of the FIR filters was done on the STM90 nm process technology. In this 90 nm process library, the static power is of almost the same magnitude as the dynamic power (see Table 1). For this reason, we propose two separate models for the dynamic and static power consumption. The performance estimation was also done on an ASIC 65 nm process technology using an STMicroelectronics low-power library. With this library, the power consumption of the FIR filters reduces to the dynamic contribution because the static power is very low (see Table 2). The STM65 nm low-power technology therefore makes it possible to derive a dynamic power consumption model that can subsequently be verified with the STM90 nm process library.
Design Vision of Synopsys was used to extract the performances on ASIC technology. From the experimental results, we were able to build a power consumption model of FIR filters depending on the three main filter parameters: input wordlength, coefficient wordlength, and filter order. Because the dynamic power consumption depends on frequency, the model for dynamic power consumption obtained is also frequency dependent.

Power Consumption Estimation Models
In this section, we introduce the power consumption estimation models for the STM65 nm and STM90 nm process technologies. The STM65 nm library is a low-power library that operates at 0.9 V and is used in the nominal case with a junction temperature of 25 °C. The STM90 nm library operates at 1.26 V and is used in the best case with a junction temperature of 40 °C.
In the 65 nm technology, the dynamic power consumption is assumed to be the total power because the static power is very low (see Table 2). The dynamic power consumption is assumed to be proportional to frequency. To verify this assumption, we performed several experiments to evaluate the impact of the frequency constraint given to the logic synthesis tool on the occupied area and the power consumption of the generated design for a fixed filter order. Figure 3 confirms that the resources used to build the architecture for a given filter are the same regardless of the specified frequency constraint. Hence, the logic synthesis tool does not introduce any area or power optimization that depends on the working frequency. As a consequence, the dynamic power consumption is considered linear in the constrained frequency (Figure 3). Following this observation, we concentrated all synthesis efforts at a fixed frequency of 80 MHz, which is sufficient for the requirements of both the GSM and UMTS standards [5,6].
Figure 4 gives the evolution of the power consumption versus the filter order for a direct form FIR filter for different coefficient and input wordlengths. This figure shows that the power consumption is nearly linear in the filter order for fixed input wordlength (4, 8, and 16 bits) and coefficient wordlength (4, 8, and 16 bits). Figure 5 illustrates the relationship between the power consumption and the input wordlength for four chosen orders and a fixed coefficient wordlength.
For all other orders and for the different coefficient wordlengths, the evolution of the power consumption has the same trends. According to these curves, the power consumption evolution versus the input wordlength is not linear.

Dynamic Power Consumption
However, Figure 6 shows that the natural logarithm of the power consumption is almost linear in the natural logarithm of the input wordlength. Hence, (1) expresses the natural logarithm of the dynamic power consumption as a function of the natural logarithm of the input wordlength. This leads to relation (2), which gives the general expression of the power consumption.
where I is the input wordlength, β is the slope of the curves in Figure 6, and LN(α) is the intercept of the same curves. Evaluating α and β as functions of the filter parameters, on the basis of the different experiments and curves, leads to the dynamic power consumption model. The following paragraphs show how each parameter of expression (2) is obtained.
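Read together with the definitions above, relations (1) and (2) correspond to the log-linear form LN(P) = β·LN(I) + LN(α), that is, P = α·I^β. As a minimal sketch of how α and β could be extracted from a set of synthesis points, the following Python fits a least-squares line to the points (LN(I), LN(P)). The function name `fit_power_law` and the sample values are illustrative assumptions; the points are generated from a known power law and are not taken from the measurements of this paper.

```python
import math


def fit_power_law(input_bits, power):
    """Least-squares line through (ln I, ln P); returns (alpha, beta)."""
    xs = [math.log(i) for i in input_bits]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                          # slope of the log-log line
    alpha = math.exp(mean_y - beta * mean_x)  # exp of the intercept
    return alpha, beta


# Synthetic check: points generated from P = 3 * I**0.8 (not measured data).
wordlengths = [4, 8, 16, 32]
synthetic_power = [3.0 * i ** 0.8 for i in wordlengths]
alpha, beta = fit_power_law(wordlengths, synthetic_power)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

Fitting in the log domain keeps the problem linear, which is why the intercept is exponentiated to recover α.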
We first evaluate whether parameter β depends on the filter order. According to Figure 6, all curves of LN(P) versus LN(I) are almost parallel for a fixed coefficient wordlength. This property was verified for all other orders and for all coefficient wordlengths considered in this work. Consequently, the slope (β) of all these curves is the same regardless of the filter order: β does not depend on the filter order and depends only on the coefficient wordlength. To evaluate the expression of β as a function of coef (the filter coefficient wordlength), we plot β versus coef for fixed filter orders in Figure 7. These curves are obtained from the data illustrated in Figure 6. Figure 7 confirms that β is independent of the filter order (N). Moreover, the curves show that β does not evolve linearly with the coefficient wordlength (coef). The expression of β as a function of coef is given in (3).
where a, b, and c are technology-dependent terms. The exponential term is explained by the curves in Figure 7: they start around (a + c) when the coefficient wordlength is close to 0, decrease very fast as the coefficient wordlength increases, and converge to the value c. The line of equation y = c is a horizontal asymptote to the curves in Figure 7, which explains the additive term c in (3). To set the numerical values of a, b, and c for this technology, we used the Matlab Curve Fitting Toolbox [7] and found that a, b, and c are 0.6, 0.3, and −0.7, respectively. These values are valid only for the given technology; for another technology, the method presented above must be repeated, as we do later for the STM90 nm library.
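The description of Figure 7 (a value near a + c at coef = 0, a fast decay, and a horizontal asymptote y = c) is consistent with a decaying exponential plus a constant. The block below writes out that assumed form together with its two limits; it is a reconstruction from the prose, offered as a reading aid, and not a quotation of (3).

```latex
\beta(\mathrm{coef}) = a\,e^{-b\,\mathrm{coef}} + c,
\qquad
\beta(0) = a + c,
\qquad
\lim_{\mathrm{coef}\to\infty} \beta(\mathrm{coef}) = c
\quad (b > 0)
```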
To evaluate parameter α of (2), we used the same approach as for β. First, the dependency of α on the filter order (N) was evaluated; then, its dependency on the coefficient wordlength (coef) was found. The curves in Figure 8 illustrate the dependency of α on the filter order for three fixed coefficient wordlengths. To obtain these curves, we extracted the intercepts from Figure 6. These intercepts represent the natural logarithm of α for each fixed coefficient size; we then plotted the exponential of these values. The curves in Figure 8 show that α is almost linear in the filter order for any coefficient size. Hence, we next evaluated how the slope of α depends on the coefficient wordlength, as shown in Figure 9. According to this figure, the slope of α is also linear in coef. As a consequence, relation (4) is deduced to model the evolution of α versus N and coef.
where γ and ω are constants that depend on the technology considered. Experiments show that, in the 65 nm technology, the values of γ and ω are 0.2 and 1, respectively.
Hence, using (3) and (4), equation (5) gives the dynamic power consumption as a function of the filter parameters and the normalized frequency for a direct form FIR filter (Figure 2).
where f0 is the frequency used during the synthesis process, equal to 80 MHz. When the filter order is zero, the power consumption should be zero as well. This condition is guaranteed because the power is directly proportional to the filter order (N). On the other hand, when the coefficient wordlength (coef) is zero, the FIR filter is composed of N registers and N/2 adders. In this case, the power consumption becomes a constant multiplying the filter order (N).
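Equations (3) to (5) are not reproduced in this text, so the following Python sketch rests on forms read off from the prose, all of which are assumptions: β = a·exp(−b·coef) + c, α = (γ·coef + ω)·N, and P = α·I^β·f/f0. The names `DynamicModelConstants` and `dynamic_power` are hypothetical. The constants in the example are those quoted in the text for the 90 nm process, and the result is in whatever power unit the original fits used, which is not stated here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicModelConstants:
    """Technology-dependent fit constants (symbol names follow the text)."""
    gamma: float
    omega: float
    a: float
    b: float
    c: float


# Constants quoted in the text for the 90 nm process.
STM90 = DynamicModelConstants(gamma=0.5, omega=4.5, a=0.5, b=0.2, c=0.6)

F0_HZ = 80e6  # synthesis frequency stated in the text


def dynamic_power(order, input_bits, coef_bits, freq_hz, k=STM90, f0_hz=F0_HZ):
    """Assumed form of (5): alpha * I**beta * f / f0 (unit of the fit)."""
    beta = k.a * math.exp(-k.b * coef_bits) + k.c    # assumed form of (3)
    alpha = (k.gamma * coef_bits + k.omega) * order  # assumed form of (4)
    return alpha * input_bits ** beta * freq_hz / f0_hz


# Example: order-45 filter, 16-bit input, 12-bit coefficients at 80 MHz.
estimate = dynamic_power(order=45, input_bits=16, coef_bits=12, freq_hz=80e6)
print(f"estimated dynamic power (model units): {estimate:.1f}")
```

Under these assumptions the two boundary conditions stated above hold by construction: the estimate is zero for N = 0, and for coef = 0 it reduces to ω·N·I^(a+c) at f = f0, which is proportional to N for a given input wordlength.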

Verification of the Dynamic Power Consumption for STM90-nm.
To verify whether the model of (5) is compliant with the dynamic power consumption in the STM90 nm technology, we repeated all of the syntheses using the new library. The same parameter values were used for the experiments, and the same conclusions regarding power consumption versus input wordlength were reached. As shown in Figure 10, we also verified that the natural logarithm of the power is linear in the natural logarithm of the input wordlength regardless of the filter order and coefficient size. Hence, (1) and (2) still hold for the 90 nm technology.
According to Figure 11, the independence of parameter β in (2) from the filter order N is still verified. Figure 12 shows the relationship between β and the coefficient wordlength. The same trends observed in Figure 7 are observed in Figure 12. Hence, the expression of β given in (3) is verified in the 90 nm technology. Figures 13 and 14 give the evolution of parameter α (introduced in (2)) versus the filter order and the slope of α versus the coefficient wordlength, respectively. According to these two figures, (4), which gives the expression of α as a function of the filter order and coefficient wordlength, is verified.
As a consequence, the model of (5) is applicable for the evaluation of the dynamic power consumption in the 90 nm technology. In the same manner as before, we used the Matlab Curve Fitting Toolbox [7] to set the numerical values of the parameters and found that, for the 90 nm process, γ, ω, a, b, and c are equal to 0.5, 4.5, 0.5, 0.2, and 0.6, respectively.

Static Power Consumption Model.
Table 1 illustrates the dynamic and static contributions to the power consumption for the 90 nm process technology. It is clear from the table that static power cannot be neglected in this technology.
To establish a static power consumption model, we evaluated the evolution of the static power with respect to the filter parameters. First, we noticed a nonlinear relationship between the static power and the input wordlength (see Figure 15). The second observation concerns the linearity of the natural logarithm of the static power versus the natural logarithm of the input wordlength (Figure 16). Hence, as for the dynamic power consumption, (1) and (2) can be used as general equation forms for the static power consumption of symmetric FIR filters. To establish the expressions of the α and β terms in (2), the same procedure as for the dynamic power consumption modeling was followed. Starting from the fact that β is independent of the filter order (since the curves in Figure 16 are parallel), we plot the evolution of β versus the coefficient wordlength in Figure 17.
Analysis of the curves in Figure 17 shows that (6) best fits their evolution. The exponential term is explained by the rapid decrease observed for small coefficient wordlengths. The linear term is added because of the very slow increase observed when the coefficient wordlength increases.
where d, e, and g are technology-dependent parameters. Using Matlab, we showed that, for the STMicroelectronics library used, these parameters are equal to 0.11, 0.03, and 0.04, respectively.
Parameter α was then evaluated, first as a function of the filter order and then as a function of the coefficient wordlength. Figure 18 shows the relationship between α and N: α is almost linear in the filter order. In the second step, the slope of α was analyzed as a function of the coefficient wordlength (see Figure 19). Fitting with Matlab verified that (7) fits the evolution of the slope of α illustrated in Figure 19.
where a, b, c, and γ are technology-dependent parameters. According to Matlab, for the 90 nm STMicroelectronics library, these parameters are equal to 0.29, 0.4, 0.077, and 1, respectively. Finally, α is expressed in (8).
As a consequence, the general expression of the static power consumption for a direct form FIR filter (Figure 2) in the 90 nm technology can be written as shown in (9).
where the technology-dependent parameters a, b, c, d, e, and g are equal to 0.29, 0.4, 0.077, 0.11, 0.03, and 0.04, respectively. Parameters γ and ω also depend on the technology and are equal to 1.2 and −1.2, respectively, for the current model.

Validation of the Power Consumption Models
This part presents the validation of the models obtained. We present several figures comparing the power consumption values obtained from syntheses with the results deduced from the established models. The aim of the comparison is first to demonstrate that the models respect the power consumption trends. Indeed, the absolute power estimate for a given design matters less than showing that the models do not alter decisions concerning the choice of the parameters that reduce the power consumption. For this purpose, a parameter called the "deviation" is calculated. With two filter parameters fixed (the filter order and either the coefficient wordlength or the input wordlength), the deviation is the difference between the power consumption values obtained for two values of the third parameter (the coefficient wordlength or the input wordlength). Similar deviations for the synthesis values and the model values mean that the models do not change decisions concerning the best parameters to use for power consumption reduction.

Validation of Power Consumption Models in the 65 nm Process Technology.
Figures 20(a) and 20(b) give the results of the comparison of power consumption for fixed coefficient wordlengths (8 bits and 16 bits, resp.) and variable input wordlength and filter order.
According to Figure 20, the power consumption trends are respected by the model. Moreover, for a fixed filter order, the deviations of the power consumption when varying the input size, measured using the model and using the power estimation tool, are very close. Figure 21 shows the deviations observed in Figure 20(b). In this figure, d1 corresponds to the difference in power consumption between an input size of 8 bits and an input size of 4 bits according to the experimental results. The model d1 parameter is the same quantity according to the given model. In the same way, d2 is the power difference between input sizes of 16 bits and 4 bits according to the experimental results, and d3 is the power difference between input sizes of 16 bits and 8 bits. The model d2 and model d3 parameters are the same differences computed using the model.
In the same way, it can be confirmed that the deviations given by the model for a fixed filter order and a fixed coefficient size when varying the input wordlength are also close to the experimental deviations measured by the power estimation tool. Hence, we conclude that the model can be used for filter orders, input wordlengths, or coefficient sizes not tested in the experiments. In conclusion, the model obtained can help to estimate efficiently the power consumption of a direct form symmetric FIR filter.
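The deviation check described in this subsection can be sketched in a few lines of Python. The function names `deviations` and `deviation_gap` are illustrative, and the power values in the example are placeholders in arbitrary units, not values read from Figures 20 or 21.

```python
def deviations(power_by_input_bits):
    """Return d1, d2, d3 from powers keyed by input wordlength (4, 8, 16)."""
    p4 = power_by_input_bits[4]
    p8 = power_by_input_bits[8]
    p16 = power_by_input_bits[16]
    return {"d1": p8 - p4, "d2": p16 - p4, "d3": p16 - p8}


def deviation_gap(synthesis, model):
    """Absolute gap between synthesis and model deviations, per deviation."""
    d_syn = deviations(synthesis)
    d_mod = deviations(model)
    return {name: abs(d_syn[name] - d_mod[name]) for name in d_syn}


# Placeholder values for one filter order and one coefficient wordlength.
synthesis_power = {4: 1.0, 8: 3.0, 16: 7.0}
model_power = {4: 1.5, 8: 3.5, 16: 7.5}
print(deviation_gap(synthesis_power, model_power))
```

In this placeholder case the model is offset from the synthesis values by a constant, so all three gaps are zero: the absolute estimates differ, but a design decision based on the differences would be unchanged, which is the property the validation looks for.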

Validation of the Dynamic Power Consumption Model for the 90 nm Process Technology.
In this part, we validate the dynamic power model given in (5). For this purpose, we ran several syntheses using different filter parameters. As in the case of the 65 nm technology, we used Synopsys Design Compiler for the syntheses and the PrimePower tool for power estimation. The frequency was fixed to 80 MHz for the syntheses. Figure 22 compares the results extracted from the model with the experimental values of the dynamic power consumption for fixed values of the coefficient wordlength.
The same conclusions as for the 65 nm technology can be drawn. Indeed, each curve in Figure 22 confirms that using the model does not modify the trends of the dynamic power consumption of an FIR filter as a function of its parameters. Moreover, for a fixed filter order, the dynamic power deviation due to an increase in the input wordlength is almost equal to the deviation measured with the experimental dynamic power estimation (Figure 23).

Validation of the Static Power Consumption Model for the 90 nm Process Technology.
As for the dynamic contribution, we validated the static power model given in (9). Figure 24 compares the results extracted from the model with the experimental values of the static power consumption for a fixed input wordlength.
In the same way, we confirm that the proposed model for static power consumption gives static power values very close to the values given by the estimation tool. The model also does not modify trends of power and does not modify decisions depending on the filter parameters because the deviations are almost equal (see Figure 25).

Results of Polyphase Implementation Form in the 65 nm Technology.
Polyphase implementations of FIR filters were performed with different input wordlengths, coefficient sizes, and filter orders, all of which led to the same observations as those illustrated in Figure 26. This figure compares the direct form implementation of FIR filters of different orders with their polyphase decomposition. The comparison is made for an input size equal to 4 bits and includes decimation factors of 2, 4, 8, and 16. According to Figure 26, it is clear that polyphase decomposition reduces the power consumption for any decimation factor.
In fact, the following observations are clear.
(i) Decomposition into two stages allows power reduction rates that increase with filter order. The reduction rates are between 20% and 40% in comparison with the direct form implementation.
(ii) Decomposition into four stages allows reduction rates between 45% and 60% compared with the direct form implementation.
As a consequence, polyphase decomposition is always beneficial in terms of power consumption reduction. However, the average reduction depends on the decimation factor that is used for a given filter order. Equation (10) gives the relation between the decimation factor and the filter order, for orders up to 128, that offers the best power consumption reduction. It is estimated that order 128 is sufficiently large for channel selection filters.
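For readers less familiar with the technique, the following Python sketch illustrates the polyphase decomposition itself: an FIR decimator by a factor m is split into m subfilters, each in direct form and each operating at the output rate. The function names and the small symmetric tap set are illustrative assumptions; the sketch shows functional equivalence with the direct form only and says nothing about power.

```python
def decimate_direct(x, h, m):
    """Filter x with h in direct form, then keep every m-th output sample."""
    out = []
    for n in range(0, len(x), m):
        acc = 0
        for k, coeff in enumerate(h):
            if n - k >= 0:  # samples before the start are taken as zero
                acc += coeff * x[n - k]
        out.append(acc)
    return out


def decimate_polyphase(x, h, m):
    """Same result using m subfilters e_p[r] = h[r*m + p] at the output rate."""
    n_out = len(range(0, len(x), m))
    out = [0] * n_out
    for p in range(m):
        subfilter = h[p::m]  # taps of polyphase branch p
        for j in range(n_out):
            for r, coeff in enumerate(subfilter):
                idx = (j - r) * m - p  # input sample feeding branch p
                if idx >= 0:
                    out[j] += coeff * x[idx]
    return out


samples = list(range(1, 33))
taps = [1, 2, 3, 4, 4, 3, 2, 1]  # small symmetric FIR, illustrative only
for factor in (2, 4, 8):
    direct = decimate_direct(samples, taps, factor)
    polyphase = decimate_polyphase(samples, taps, factor)
    assert direct == polyphase
```

Each branch processes one input sample out of every m, so every multiplier runs at the lower output rate, while the total number of taps across all branches stays the same.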

Results of Polyphase Implementation Form in the 90 nm Technology.
To verify the dynamic power reduction rates measured in the 65 nm technology when performing polyphase decomposition of a symmetric FIR filter, some syntheses were run in the 90 nm technology. Figure 27 illustrates the impact of each decimation factor on the dynamic power consumption for a fixed input size (4 bits) and a fixed coefficient size (4 bits). It was verified that the trends for other parameter values are the same as those represented in Figure 27. A comparison with the direct form confirms the benefit of the polyphase decomposition. The same conclusions as in the 65 nm technology can be drawn for the dynamic power consumption in the 90 nm technology.
However, the static power cannot be neglected in this technology. Hence, the comparison should also cover the static power contribution (see Figure 28). According to Figure 28, the static and dynamic power consumptions have opposite trends: the static power consumption increases with the decimation factor. Indeed, when polyphase decomposition is applied, the occupied area increases, which increases the static power consumption. In Figure 29, the total power consumption results for the FIR filter considered are plotted and compared with those of the direct form. According to this figure, and considering the total power consumption criterion, the polyphase implementation can in some cases lead to a power consumption increase in comparison with the direct form.
However, decimation by a factor of 2 or 4 always reduces the total power consumption in comparison with the direct form, with the more advantageous reduction obtained for a decimation factor of 4.
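The interplay between the two contributions can be illustrated with a small calculation. The Python sketch below compares total power (dynamic plus static) for a direct form and a polyphase form. All numeric values are hypothetical placeholders in arbitrary units, chosen only to show how a large dynamic saving can be offset by a static increase; they are not taken from Figures 27 to 29.

```python
def relative_change(direct, polyphase):
    """Signed change of the polyphase value relative to the direct form."""
    return (polyphase - direct) / direct


# Hypothetical power figures in arbitrary units (NOT measured values).
direct = {"dynamic": 100.0, "static": 100.0}
polyphase = {"dynamic": 40.0, "static": 180.0}

dynamic_change = relative_change(direct["dynamic"], polyphase["dynamic"])
static_change = relative_change(direct["static"], polyphase["static"])
total_change = relative_change(
    direct["dynamic"] + direct["static"],
    polyphase["dynamic"] + polyphase["static"],
)

print(f"dynamic: {dynamic_change:+.0%}")
print(f"static:  {static_change:+.0%}")
print(f"total:   {total_change:+.0%}")
```

With these placeholder figures the dynamic power falls while the total rises, which is the situation the text describes for large decimation factors when static and dynamic power are of comparable magnitude.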

Comparison of Power Consumption of Filtering Architectures for GSM and UMTS Standards Using Established Models.
In this section, we examine the power consumption estimation of different decimation filter solutions for a multistandard receiver supporting the UMTS and GSM standards. The aim of this section is to demonstrate how the models can help the filter designer make the correct choice of filter architecture to reduce the power consumption.
For this purpose, three different architectures suitable for multistandard receivers are analyzed (Figure 30).
The parameters of the different filters building each architecture are obtained using the specification methodology described in [8]. The multistandard receiver architecture and parameters considered in this work are calculated in [9]. Because filtering performances are out of the scope of this paper, only the description of the filter parameters is given.

Filtering Solution Description for UMTS and GSM Standards.
This paragraph gives the details of the different architectures for both standards. Tables 3 and 4 summarize the parameters of the different stages in the case of UMTS and GSM standards, respectively.

2-Stage Architecture (Arch0).
For both the UMTS and GSM standards, the first filter is composed of a cascade of 5-stage CIC filters. It performs decimation by a factor of 12 or 4 for the GSM and UMTS standards, respectively. The recursive architecture of the CIC filter allows the filter to be programmed according to the selected standard. The input wordlength of the filter is equal to 6 bits; at its output, the wordlength is 16 bits for the UMTS standard and 26 bits for the GSM standard. The second filter of the 2-stage architecture is a symmetric FIR filter of order 45 for UMTS, with 12 bits necessary for the quantization of the filter coefficients. In the case of GSM, the required filter order is 83, and 12 bits are sufficient for the quantization of the coefficients.

3-Stage Architecture (Arch1).
In this architecture, the last filter of the 2-stage architecture is split into a half-band filter and an FIR filter, each performing a decimation by a factor of 2. The half-band filter has an order equal to 14 for UMTS and 20 for GSM. The filter coefficient wordlength is fixed to 10 bits and 11 bits for the UMTS and GSM standards, respectively. The last filter has an order of 20 for UMTS and 45 for GSM. For both standards, coefficients are quantized with 12 bits.
Figure 30: Decimation filter architectures for GSM and UMTS standards.

4-Stage Architecture (Arch2).
In the case of the GSM standard, the 4-stage architecture is based on a CIC filter of order 6 in its recursive form, followed by a cascade of 2 half-band filters, each performing a decimation by a factor of 2. The last stage is composed of a symmetric FIR filter, completing the decimation by 2 and the channel selection. The two half-band filters have orders equal to 6 and 10, and their coefficients are quantized on 10 and 11 bits, respectively. The last FIR filter has an order equal to 34, and we found that 10 bits are required for the quantization of its coefficients.
For the UMTS standard, the CIC filter used in the 3-stage architecture is split into a cascade of 2 CIC filters. The number of required stages is 5 for the first CIC filter and 4 for the second. Each filter performs decimation by a factor of 2. The output size of the first CIC filter is equal to 11 bits. The third filter is a half-band filter of order 10 with a coefficient size of 9 bits. Finally, a symmetric FIR filter of order 20 with coefficients quantized on 12 bits completes the selection. As explained before, the input wordlength is considered equal to 15 bits, except for the first and second stages, for which the input wordlength is equal to 6 bits and 11 bits, respectively.

Comparison to Implementation Results.
On the basis of the filter parameters and the established models given in (5) and (10), we estimated the power consumption of each sub-block in the filtering architectures for both the 65 nm and 90 nm technologies. Table 5 gives power consumption estimation results for GSM and UMTS standards for the 65 nm technology. The results for both standards for the 90 nm technology are given in Table 6.
For the particular case of CIC filters, the power consumption comparison is performed following the work in [2]. Indeed, the authors in [2] studied the power consumption of fixed order CIC filters depending on their implementation architecture. In the case of the GSM standard and because the power consumption of CIC filters of different orders is not included in the work in [2], the power consumption results concern the FIR implementation form of the CIC filters.
According to Tables 5 and 6, it is clear that, for both technologies (65 nm and 90 nm), the architecture with the lowest power consumption in the case of the GSM standard is the 3-stage architecture.
For the UMTS standard, the power consumption values obtained from the models are very similar for the 3-stage and 4-stage architectures. If the power estimation reported in [2] is used for the CIC filters in the comparison, the power values increase, but the trends are not modified. It is, however, important to note that the power due to the clock tree and the connections between stages is not considered in the evaluation. Because the 4-stage architecture is composed of more stages, its power consumption should therefore be expected to increase relative to that of the 3-stage architecture, in particular for the 90 nm technology. Hence, it can be concluded that the 3-stage architecture is also more advantageous in terms of power consumption for the UMTS standard.
To validate these results experimentally, VHDL descriptions of all the architectures considered were written. The power results estimated after logic synthesis are given in Tables 7 and 8 for the 65 nm and 90 nm technologies, respectively. These tables confirm the conclusions obtained from the established models.

Conclusion
This paper presented power consumption evaluation models of direct form FIR filters (Figure 2) used for decimation. Models for both the dynamic and static power contributions are proposed for the STM65 nm low-power and STM90 nm ASIC technologies. The proposed models are high-level models that estimate the dynamic and static power consumption of the FIR filter as a function of three filter parameters: the input wordlength, the coefficient wordlength, and the filter order. The aim of these models is to help the system designer compare, at the system level, different filter architectures in terms of power consumption without having to implement the different filters and perform syntheses. However, the method should be verified for any new library different from those used in this work. In the second step, the effect of polyphase decomposition for FIR decimation filters was evaluated in two different technologies. We found that polyphase decomposition allows a good dynamic power reduction with respect to the direct form implementation; the dynamic power reduction can reach 75% in some cases. To help the system designer choose the best decimation factor in terms of power consumption, relation (10) was given between the decimation factor and the filter order. However, the static power consumption increases when polyphase decomposition is performed. When the static contribution is significant (as with the 90 nm technology library considered), the total power consumption can in some cases increase in comparison with that of the direct form.
Finally, a case study concerning the UMTS and GSM standards was presented. We performed a comparison between three filter architectures. The power estimation based on proposed models helped choose the suitable architecture for power consumption optimization, and the result was confirmed by filter implementations.