
Decimation filters are widely used in communication-embedded systems, where they implement channel filtering or selection with low computational complexity. Many multistandard receiver designs required in ubiquitous embedded systems are based on a cascade of decimation filters. The number of filters and their implementation architectures have a significant impact on system performance, such as computation complexity, area, throughput, and power consumption. In this work, we present power consumption estimation models for FIR filters. The models were obtained from a large number of syntheses of FIR filters in the direct form, and several curves that estimate power consumption were extracted from these synthesis results. We then evaluated the impact of polyphase decomposition on the power consumption of FIR filters and compared it with the direct form results. Some tips regarding power consumption were deduced for the polyphase implementation form. The aim of this work is to help a system designer select a power-efficient FIR implementation without having to implement and synthesize the different possible solutions. The proposed method is applied to the STMicroelectronics 90 nm and 65 nm low-power libraries and then validated on a multistandard receiver design use case.

Currently, wireless technologies are widespread because of their flexibility of use. However, many different standards are in use, and a new challenge for the communication-embedded system designer is to implement devices that support multiple communication standards, in order to provide easy access to information everywhere with low hardware complexity. In general, such devices include communication chains with decimation and selection filters. Typically, multistandard communication chains are composed of an RF front end, an oversampling analogue-to-digital converter, and a cascade of decimation filters (Figure

General communication chain in a multistandard receiver.

Implementation architecture of the direct form and polyphase form of FIR filter.

Direct form

Polyphase form

The main objective of this work is to provide models that evaluate the power consumption of FIR decimation filters in the direct form. These models consider the main filter parameters, which are the filter order, the input wordlength, and the coefficient wordlength. The models are given for the STM90 nm and STM65 nm low-power technologies. The operating conditions are specified in paragraph 2.

The second objective of this paper is to study the impact of the polyphase form on the power consumption of FIR filters. Some tips regarding the best decomposition to perform to save power are extracted from the synthesis results in this work.

The ultimate aim of this study is to help filter designers decide the best way to decompose the filter processing into stages to guarantee optimal power consumption.

This paper is organized as follows. Section

When designing an FIR channel selection and decimation filter, the filter designer faces a trade-off between channel selection efficiency and filter complexity. The design must guarantee channel selection with the minimum complexity in terms of occupied area and power consumption. Depending on the communication standard, filter designers choose the filter order and coefficient wordlength that guarantee the required signal-to-noise ratio. The input data wordlength is typically deduced from the dynamic range of the analogue-to-digital converter input signal. Hence, for the hardware performance evaluation, the designer should consider the following three parameters in FIR filter power consumption: input wordlength, filter order, and coefficient wordlength.

To evaluate the impact of these parameters on the power consumption of an FIR filter, we performed a large number of FIR filter architecture syntheses. We considered filter orders from 7 to 64 (with a step of three). We chose to implement both direct and polyphase forms. To reduce the hardware implementation complexity of the filter, we used a Wallace adder tree [

For the polyphase decomposition, each subfilter was implemented as a FIR filter in the direct form. For simplicity, we chose to run the syntheses with the input wordlength equal to 4 bits, 8 bits, 16 bits, and 32 bits. In the next section, we will show that, for intermediate values, it is possible to interpolate the power consumption of the filter.
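The equivalence between the two forms can be illustrated with a small behavioural sketch. It is not the synthesized VHDL architecture: the function names, the test signal, and the taps are hypothetical, and the code only checks that splitting the taps into `m` subfilters (each a direct-form FIR running at the output rate) gives the same samples as filtering at the input rate and then keeping every `m`-th output.

```python
def direct_decimate(x, h, m):
    """Filter x with FIR taps h at the input rate, then keep every m-th sample."""
    full = [
        sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
        for n in range(len(x))
    ]
    return full[::m]


def polyphase_decimate(x, h, m):
    """Same result, computed with m subfilters that run at the output rate."""
    n_out = len(range(0, len(x), m))
    y = [0] * n_out
    for p in range(m):
        sub_taps = h[p::m]  # subfilter p uses taps h[p], h[p + m], h[p + 2m], ...
        for n in range(n_out):
            for j, c in enumerate(sub_taps):
                idx = (n - j) * m - p  # input sample feeding tap j of subfilter p
                if 0 <= idx < len(x):
                    y[n] += c * x[idx]
    return y


# Hypothetical integer data, so that the comparison is exact.
signal = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, -5, 8]
taps = [1, 2, 3, 4, 3, 2, 1]  # symmetric taps, as for a linear-phase FIR
for factor in (2, 4):
    assert direct_decimate(signal, taps, factor) == polyphase_decimate(signal, taps, factor)
```

In the polyphase version every multiplication contributes to a retained output sample, which is the usual argument for why this form can save power.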

In the same way, we chose three possible values for the coefficient wordlength: 4 bits, 8 bits, and 16 bits. In fact, both the filter order and the coefficient size depend on the filter’s mask. We estimate that 16 bits offer enough accuracy for the quantization process.
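The parameter space described above can be enumerated as follows. This is a sketch that assumes a full factorial sweep over the three parameters, which the text does not state explicitly; the constant and function names are hypothetical.

```python
from itertools import product

FILTER_ORDERS = list(range(7, 65, 3))  # 7, 10, ..., 64 (step of three)
INPUT_WORDLENGTHS = [4, 8, 16, 32]     # bits
COEF_WORDLENGTHS = [4, 8, 16]          # bits


def synthesis_grid():
    """Every (order, input wordlength, coefficient wordlength) combination."""
    return list(product(FILTER_ORDERS, INPUT_WORDLENGTHS, COEF_WORDLENGTHS))


grid = synthesis_grid()
print(len(FILTER_ORDERS), "filter orders,", len(grid), "configurations per implementation form")
```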

The performance estimation of FIR filters was done on STM90 nm process technology. In this 90 nm process library, the static power has almost the same proportions as the dynamic power consumption (see Table

Dynamic and static power contribution using a 90 nm library of STMicroelectronics.

| Filter order | Dynamic power (µW) | Dynamic power (%) | Static power (µW) | Static power (%) |
|---|---|---|---|---|
| 13 | 300.9546 | 48.84 | 315.2013 | 51.16 |
| 22 | 481.3064 | 48.26 | 515.8453 | 51.74 |
| 31 | 632 | 46.71 | 721 | 53.29 |
| 40 | 722 | 43.54 | 936 | 56.46 |
| 49 | 840.64 | 42.63 | 1131.1 | 57.37 |
| 58 | 973.3182 | 41.87 | 1351.3 | 58.13 |
| 64 | 1054.6 | 41.62 | 1478.9 | 58.38 |

Dynamic and static power contribution using a 65-nm low power library.

| Filter order | Dynamic power (µW) | Dynamic power (%) | Static power (µW) | Static power (%) |
|---|---|---|---|---|
| 13 | 169.6857 | 98.9 | 1.9 | 1.1 |
| 22 | 245.1208 | 98.8 | 3.1 | 1.2 |
| 31 | 356.7201 | 98.9 | 4.2 | 1.1 |
| 40 | 424.5542 | 98.7 | 5.7 | 1.3 |
| 49 | 497.6389 | 98.6 | 7.2 | 1.4 |
| 58 | 578.6831 | 98.7 | 8 | 1.3 |
| 64 | 628.4149 | 98.6 | 9.1 | 1.4 |
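The percentage columns of the two tables follow directly from the dynamic and static values. The sketch below recomputes the shares for the order-13 and order-64 rows of each library; the dictionary layout and function name are hypothetical, and the values are copied from the tables (in µW).

```python
# (dynamic, static) power in microwatts, copied from the two tables above.
POWER_UW = {
    "90 nm": {13: (300.9546, 315.2013), 64: (1054.6, 1478.9)},
    "65 nm low power": {13: (169.6857, 1.9), 64: (628.4149, 9.1)},
}


def power_shares(dynamic_uw, static_uw):
    """Return the dynamic and static shares of the total power, in percent."""
    total = dynamic_uw + static_uw
    return 100.0 * dynamic_uw / total, 100.0 * static_uw / total


for library, rows in POWER_UW.items():
    for order, (dynamic, static) in rows.items():
        dyn_pct, stat_pct = power_shares(dynamic, static)
        print(f"{library}, order {order}: dynamic {dyn_pct:.1f}%, static {stat_pct:.1f}%")
```

The contrast between the two libraries is what motivates treating static power separately for 90 nm while treating the dynamic power as the total for 65 nm.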

Design Vision of Synopsys was used to extract the performances on ASIC technology. From the experimental results, we were able to build a power consumption model of FIR filters depending on the three main filter parameters: input wordlength, coefficient wordlength, and filter order. Because the dynamic power consumption depends on frequency, the model for dynamic power consumption obtained is also frequency dependent.
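As a rough illustration of the frequency dependence, the sketch below rescales a dynamic power value obtained at one clock frequency to another. It assumes the standard CMOS relation in which dynamic power is proportional to the clock frequency at fixed supply voltage and switching activity; this proportionality and the function name are assumptions of the sketch, not the model derived in this work.

```python
def scale_dynamic_power(p_ref_uw, ref_freq_mhz, target_freq_mhz):
    """Rescale a dynamic power value, assuming it is proportional to frequency.

    p_ref_uw is the dynamic power (in microwatts) known at ref_freq_mhz.
    Static power is not frequency dependent and must not be scaled this way.
    """
    if ref_freq_mhz <= 0 or target_freq_mhz < 0:
        raise ValueError("frequencies must be positive")
    return p_ref_uw * target_freq_mhz / ref_freq_mhz


# Hypothetical example: 100 uW at 80 MHz, estimated at 40 MHz.
print(scale_dynamic_power(100.0, 80.0, 40.0))
```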

In this section, we introduce the power consumption estimation models for the STM65 nm and STM90 nm process technologies. The STM65 nm library is a low-power library that operates at 0.9 V and is used in the nominal case with a junction temperature of 25°C. The STM90 nm library operates at 1.26 V and is used in the best case with a junction temperature of 40°C.

For the dynamic power consumption model we used STM 65 nm low-power technology. In this technology, the dynamic power consumption is assumed to be the total power because the static power is very low (see Table

Area occupation and power consumption of FIR filters in the direct form depending on frequency.

Area evolution

Power consumption

Following this observation, we concentrated all syntheses efforts at a fixed frequency equal to 80 MHz, which is sufficient for the requirements of both GSM and UMTS standards [

Figure

Power consumption of FIR filters in the direct form depending on input wordlength and coefficient wordlength.

Figure

Power consumption of FIR filters in the direct form depending on input wordlength for the 65 nm technology for a fixed coef size of 4 bits.

However, Figure

Natural logarithm of the power consumption of FIR filters in the direct form depending on the natural logarithm of the input wordlength for the 65 nm technology for a fixed coef size of 4 bits.
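When a log-log plot of this kind is close to a straight line, a power law of the form P = k · n^α is a natural candidate, because ln P = ln k + α ln n. The sketch below fits such a law by least squares on the logarithms and uses it to estimate the power at an intermediate wordlength. The power-law form, the function names, and the data points are assumptions made for illustration; they are not the synthesis results or the model coefficients of this work.

```python
import math


def fit_power_law(wordlengths, powers):
    """Least-squares fit of ln(P) = ln(k) + alpha * ln(n); returns (k, alpha)."""
    xs = [math.log(w) for w in wordlengths]
    ys = [math.log(p) for p in powers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


def predict_power(k, alpha, wordlength):
    """Evaluate the fitted power law at any wordlength, e.g. an intermediate one."""
    return k * wordlength ** alpha


# Synthetic data generated from P = 2 * n**1.5 (hypothetical values).
tested_wordlengths = [4, 8, 16, 32]
synthetic_powers = [2.0 * w ** 1.5 for w in tested_wordlengths]
k, alpha = fit_power_law(tested_wordlengths, synthetic_powers)
print(k, alpha, predict_power(k, alpha, 12))
```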

We evaluate first whether parameter

To evaluate the expression of

Evolution of

Figure

To evaluate parameter

Curves in Figure

Evolution of

Evolution of

Hence, using (

On the other hand, when the coefficient length (coef) is zero, the FIR filter is composed of

To verify whether the model of (

Relationship between the dynamic power consumption and input wordlength for the 90 nm technology for a fixed coefficient wordlength.

According to Figure

Relationship between the logarithm of dynamic power consumption and the logarithm of the input wordlength for the 90 nm technology for a fixed coefficient wordlength.

Evolution of

Evolution of

Figures

Evolution of the slope of

As a consequence, the model of (

The Table

To establish a static power consumption model, we evaluated the evolution of the static power regarding filter parameters. In particular, we noticed a nonlinear relationship between the static power and input wordlength (see Figure

Relationship between the static power consumption and input wordlength for the 90 nm technology for a fixed coefficient wordlength.

Relationship between the logarithm of the static power consumption and the logarithm of the input wordlength for the 90-nm technology for a fixed coefficient wordlength.

To establish expressions of the and

Evolution of

After analyzing the curves in Figure

Parameter

Evolution of

Evolution of the

This part presents the validation of the models obtained. We present several figures comparing the power consumption values obtained from synthesis with the results deduced from the established models. The aim of the comparison is to demonstrate that the models respect the power consumption trends. The absolute power estimate for a given design matters less than showing that the models do not alter decisions about which parameter choices reduce the power consumption. For this purpose, a parameter called the “deviation” is calculated. With two filter parameters fixed (the filter order and either the coefficient wordlength or the input wordlength), the deviation is the difference between the power consumption values obtained when the third parameter (the input wordlength or the coefficient wordlength, respectively) is varied. A similar “deviation” for the synthesis and model values means that the models cannot change decisions about the best parameters to use for power consumption reduction.
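One possible reading of the “deviation” is sketched below: for a fixed pair of parameters, it is the change in power between consecutive values of the varied parameter, computed once from the synthesis values and once from the model values. The exact formula is not given in the text, so this definition, the function names, and the numbers are all assumptions.

```python
def deviations(powers):
    """Power change between consecutive values of the varied parameter."""
    return [later - earlier for earlier, later in zip(powers, powers[1:])]


def deviation_gap(synthesis_powers, model_powers):
    """Largest mismatch between the synthesis and model deviations."""
    pairs = zip(deviations(synthesis_powers), deviations(model_powers))
    return max(abs(ds - dm) for ds, dm in pairs)


# Hypothetical powers (uW) for three values of the varied parameter.
synthesis = [10.0, 14.0, 20.0]
model = [11.0, 15.0, 22.0]
print(deviations(synthesis), deviations(model), deviation_gap(synthesis, model))
```

A small gap means that a designer comparing two parameter values would reach the same conclusion from the model as from synthesis, even if the absolute estimates differ.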

Figures

Comparison of model power results with experimental values for the 65 nm technology for a fixed coefficient wordlength.

According to Figure

“Deviation” parameter for model and estimation tool power results for

In the same way, it can be confirmed that the deviations measured for a fixed filter order and a fixed coefficient size when varying the input wordlength are also close to the experimental deviations measured by the power estimation tool. Hence, we can conclude that the model can be used for filter orders, input wordlengths, or coefficient sizes that were not tested in the experiments. In conclusion, the model obtained can help estimate efficiently the power consumption of a direct form symmetric FIR filter.

In this part, we validate the dynamic power model given in (

Figure

Comparison of model dynamic power results to experimental values for the 90 nm technology for a fixed coefficient wordlength.

The same conclusions for the 65 nm technology could be formulated. In fact, if we analyze each curve in Figure

“Deviation” parameter for model and estimation tool power results for

As for the dynamic contribution, we validated the static power model given in (

Comparison of the model static power results with experimental values for the 90 nm technology for a fixed input wordlength.

In the same way, we confirm that the proposed model for static power consumption gives static power values very close to the values given by the estimation tool. The model also does not modify trends of power and does not modify decisions depending on the filter parameters because the deviations are almost equal (see Figure

“Deviation” parameter for model and estimation tool power results for an input size of 8 bits.

Polyphase implementations of FIR filters were performed with different input wordlengths, coefficient sizes, and filter orders, which lead to the same observations illustrated in Figure

Comparison between direct form and polyphase form power consumption for different decimation factor values.

The comparison is made for an input size equal to 4 bits and includes decimation values of 2, 4, 8, and 16. According to Figure

In fact, the following observations are clear.

Decomposition into two stages allows power reduction rates that increase with filter order. The reduction rates are between 20% and 40% in comparison with the direct form implementation.

Decomposition into four stages allows reduction rates between 45% and 60% compared with the direct form implementation.

Decomposition into 8, 16, and 32 stages allows reduction rates between 60% and 70% for the 8-stage architecture, between 60% and 75% for the 16-stage architecture, and between 55% and 75% for the 32-stage architecture.
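The reduction rates quoted in these observations can be expressed as a relative saving with respect to the direct form. The sketch below encodes the reported ranges and checks whether a measured pair of power values falls inside the range for a given number of stages. The formula for the rate is the usual relative difference, assumed here since the text does not spell it out; the names and the example powers are hypothetical.

```python
# Reduction ranges (percent) reported above, indexed by number of polyphase stages.
REPORTED_RANGES = {2: (20, 40), 4: (45, 60), 8: (60, 70), 16: (60, 75), 32: (55, 75)}


def reduction_rate(p_direct, p_polyphase):
    """Power saving of the polyphase form relative to the direct form, in percent."""
    return 100.0 * (p_direct - p_polyphase) / p_direct


def within_reported_range(stages, rate_percent):
    """True if the rate lies inside the range reported for this number of stages."""
    low, high = REPORTED_RANGES[stages]
    return low <= rate_percent <= high


# Hypothetical example: 200 uW in direct form, 120 uW after a 2-stage decomposition.
rate = reduction_rate(200.0, 120.0)
print(rate, within_reported_range(2, rate))
```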

As a consequence, polyphase decomposition is always beneficial in terms of power consumption reduction. However, the average reduction depends on the decimation factor that is used for a given filter order. Equation (

To verify the dynamic power reduction rates measured in the 65 nm technology when performing polyphase decomposition of a symmetric FIR filter, some syntheses were run in the 90 nm technology. Figure

Dynamic power evolution of polyphase form implementations of a symmetric FIR filter for the 90 nm technology.

The conclusions drawn for the 65 nm technology also hold for the dynamic power consumption in the 90 nm technology. However, the static power in this technology cannot be neglected. Hence, the comparison should also cover the static power contribution (see Figure

Static power evolution of polyphase form implementations of a symmetric FIR filter for the 90 nm technology.

In Figure

Total power evolution of polyphase form implementations of a symmetric FIR filter for the 90 nm technology.

However, decimation by a factor of 2 and 4 always reduces the total power consumption in comparison with the direct form, with advantageous reduction for a decimation factor of 4.

In this section, we examine the power consumption estimation of different decimation filter solutions for a multistandard receiver supporting the UMTS and GSM standards. The aim of this section is to demonstrate how the models can help a filter designer make the correct choice of filter architecture to reduce the power consumption.

For this purpose, three different architectures suitable for multistandard receivers are analyzed (Figure

Decimation filter architectures for GSM and UMTS standards.

The parameters of the different filters building each architecture are obtained using the specification methodology described in [

This paragraph gives the details of the different architectures for both standards. Tables

Specification of the filters in proposed architectures for the GSM standard.

| Architecture | Specification | CIC filter | HB1 filter | HB2 filter | FIR filter |
|---|---|---|---|---|---|
| 2 stages | Order | 5 | | | 83 |
| 2 stages | Input wordlength | 6 | | | 17 |
| 2 stages | Coefficient wordlength | | | | 12 |
| 2 stages | Decimation factor | 12 | | | 4 |
| 3 stages | Order | 5 | 20 | | 45 |
| 3 stages | Input wordlength | 6 | 17 | | 17 |
| 3 stages | Coefficient wordlength | | 11 | | 11 |
| 3 stages | Decimation factor | 12 | 2 | | 2 |
| 4 stages | Order | 6 | 8 | 10 | 34 |
| 4 stages | Input wordlength | 6 | 17 | 17 | 17 |
| 4 stages | Coefficient wordlength | | 10 | 11 | 10 |
| 4 stages | Decimation factor | 6 | 2 | 2 | 2 |

Specification of the filters in proposed architectures for the UMTS standard.

| Architecture | Specification | CIC filter 1 | CIC filter 2 | HB filter | FIR filter |
|---|---|---|---|---|---|
| 2 stages | Order | 5 | | | 57 |
| 2 stages | Input wordlength | 6 | | | 15 |
| 2 stages | Coefficient wordlength | | | | 12 |
| 2 stages | Decimation factor | 4 | | | 4 |
| 3 stages | Order | 5 | | 14 | 20 |
| 3 stages | Input wordlength | 6 | | 15 | 15 |
| 3 stages | Coefficient wordlength | | | 10 | 12 |
| 3 stages | Decimation factor | 4 | | 2 | 2 |
| 4 stages | Order | 5 | 4 | 10 | 20 |
| 4 stages | Input wordlength | 6 | 11 | 15 | 15 |
| 4 stages | Coefficient wordlength | | | 9 | 12 |
| 4 stages | Decimation factor | 2 | 2 | 2 | 2 |
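A quick consistency check on the two specification tables is that every architecture of a given standard must reach the same overall decimation factor, since the stage factors multiply. The sketch below uses the decimation factors listed in the tables; the dictionary layout and function name are hypothetical.

```python
from math import prod

# Per-stage decimation factors taken from the GSM and UMTS specification tables.
DECIMATION_FACTORS = {
    "GSM": {"2 stages": [12, 4], "3 stages": [12, 2, 2], "4 stages": [6, 2, 2, 2]},
    "UMTS": {"2 stages": [4, 4], "3 stages": [4, 2, 2], "4 stages": [2, 2, 2, 2]},
}


def overall_decimation(standard):
    """Overall decimation factor of each architecture for one standard."""
    return {arch: prod(factors) for arch, factors in DECIMATION_FACTORS[standard].items()}


for standard in DECIMATION_FACTORS:
    print(standard, overall_decimation(standard))
```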

For both UMTS and GSM standards, the first filter is composed of a cascade of 5-stage CIC filters. It performs decimation by a factor of 12 or 4 for the GSM and UMTS standards, respectively. The recursive architecture of the CIC filter allows the filter to be programmed according to the selected standard. The input wordlength of the filter is equal to 6 bits; at its output, the wordlength is 16 bits for the UMTS standard and 26 bits for the GSM standard. The second filter of the 2-stage architecture is a symmetric FIR filter of order 45 for UMTS, where 12 bits are necessary for the quantization of the filter coefficients. In the case of GSM, the required filter order is 83, and 12 bits are sufficient for the quantization of coefficients.

In the 3-stage architecture, the last filter of the 2-stage architecture is split into a half-band filter and an FIR filter, each performing a decimation by a factor of 2. The half-band filter has an order equal to 14 for UMTS and 20 for GSM. Its coefficient wordlength is fixed to 10 bits and 11 bits for the UMTS and GSM standards, respectively. The last filter has an order of 20 for UMTS and an order equal to 45 for GSM. For both standards, coefficients are quantized with 12 bits.

In the case of the GSM standard, the 4-stage architecture is based on a CIC filter of order 6 in its recursive form, followed by a cascade of 2 half-band filters, each one performing a decimation by a factor of 2. The last stage is composed of a symmetric FIR filter, completing the decimation by 2 and the channel selection. The two half-band filters have orders equal to 6 and 10, and their coefficients are quantized on 10 and 11 bits, respectively. The last FIR filter has an order equal to 34, and we found that 10 bits are required for the quantization of coefficients.

For the UMTS standard, the CIC filter used in the 3-stage architecture is split into a cascade of 2 CIC filters. The number of required stages is 5 for the first CIC filter and 4 for the second. Each filter performs decimation by a factor of 2. The output size of the first CIC filter is equal to 11 bits. The third filter is a half-band filter of order 10 and has a coefficient size equal to 9 bits. Finally, a symmetric FIR filter of order 20 with coefficients quantized on 12 bits completes the selection. As explained before, the input wordlength is considered equal to 15 bits, except for the first and second stages, for which the input wordlength is equal to 6 bits and 11 bits, respectively.

On the basis of the filter parameters and the established models given in (

Power consumption evaluation according to models in the 65 nm low-power technology.

| Standard | Filter | 2-stage arch. power | 3-stage arch. power | 4-stage arch. power |
|---|---|---|---|---|
| GSM | Filter 1 (CIC filter) | 226 | 226 | 264.3 |
| GSM | Filter 2 (half-band) | | 42.5 | 32.6 |
| GSM | Filter 3 (half-band) | | | 20.4 |
| GSM | Filter 4 (FIR) | 208.6 | 47.8 | 34.7 |
| GSM | Total GSM | 434.6 | 336.3 | 352 |
| UMTS | Filter 1 (CIC filter) | | | |
| UMTS | Filter 2 (CIC filter) | | | |
| UMTS | Filter 3 (half-band) | | 54 | 52 |
| UMTS | Filter 4 (FIR) | 303 | 56 | 56 |
| UMTS | Total UMTS | 303 | 110 | 108 |

Power consumption evaluation according to models for the 90 nm technology.

| Standard | Filter | 2-stage arch. power | 3-stage arch. power | 4-stage arch. power |
|---|---|---|---|---|
| GSM | Filter 1 (CIC filter) | 967.8 | 967.8 | 2796.8 |
| GSM | Filter 2 (half-band) | | 119.5 + 1818.8 | 93.7 + 720 |
| GSM | Filter 3 (half-band) | | | 59.7 + 909.4 |
| GSM | Filter 4 (FIR) | 575.3 + 8660 | 134.4 + 4092.4 | 99.5 + 3060.75 |
| GSM | Total GSM | 10203.1 | 7132 | 7739.8 |
| UMTS | Filter 1 (CIC filter) | | | |
| UMTS | Filter 2 (CIC filter) | | | |
| UMTS | Filter 3 (half-band) | | 145 + 827 | 142 + 818 |
| UMTS | Filter 4 (FIR) | 823 + 4000 | 147 + 1600 | 147 + 1600 |
| UMTS | Total UMTS | 4823 | 2719 | 2707 |
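The totals of the two model tables can be compared mechanically to pick the architecture with the lowest estimated power and to see how large the margin over the runner-up is. The sketch below uses the "Total" rows as listed (the UMTS totals contain no CIC contribution, since those cells are empty in the tables); the data layout and function names are hypothetical.

```python
# "Total" rows of the 65 nm and 90 nm model tables, per architecture.
MODEL_TOTALS = {
    ("65 nm", "GSM"): {"2-stage": 434.6, "3-stage": 336.3, "4-stage": 352.0},
    ("65 nm", "UMTS"): {"2-stage": 303.0, "3-stage": 110.0, "4-stage": 108.0},
    ("90 nm", "GSM"): {"2-stage": 10203.1, "3-stage": 7132.0, "4-stage": 7739.8},
    ("90 nm", "UMTS"): {"2-stage": 4823.0, "3-stage": 2719.0, "4-stage": 2707.0},
}


def best_architecture(totals):
    """Architecture with the lowest estimated total power."""
    return min(totals, key=totals.get)


def margin_percent(totals):
    """Gap between the best and second-best totals, relative to the best."""
    best, runner_up = sorted(totals.values())[:2]
    return 100.0 * (runner_up - best) / best


for case, totals in MODEL_TOTALS.items():
    print(case, best_architecture(totals), f"margin {margin_percent(totals):.2f}%")
```

A margin of only a few percent, as for UMTS, is small enough that contributions left out of the model can change the ranking, so such cases deserve a closer look.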

For the particular case of CIC filters, the power consumption comparison is performed following the work in [

According to Tables

For the UMTS standard, the power consumption values obtained from models are very similar in the case of 3-stage and 4-stage architectures. If the estimation of the power is considered, which is done in [

It is, however, important to note that the power due to the clock tree and the stage connections is not considered in the evaluation. Because the 4-stage architecture is composed of more stages, its power consumption should increase relative to that of the 3-stage architecture once these contributions are taken into account, in particular for the 90 nm technology. Hence, it can be concluded that the 3-stage architecture is also more advantageous in terms of power consumption for the UMTS standard.

To validate these results experimentally, VHDL code was written for all the architectures considered. The power values estimated after logic synthesis are given in Tables

Power consumption evaluation according to the “primepower” tool for the 65-nm technology for UMTS and GSM standards.

| Standard | 2 stages | 3 stages | 4 stages |
|---|---|---|---|
| GSM | 280.6 | 231 | 260 |
| UMTS | 171 | 157 | 149 |

Power consumption evaluation according to the “primepower” tool for the 90 nm technology for UMTS and GSM standards.

| Technology | Standard | 2 stages | 3 stages | 4 stages |
|---|---|---|---|---|
| 90 nm | GSM | 751 | 640 | 731 |
| 90 nm | UMTS | 5353 | 4231 | 4441 |
| 65 nm | GSM | 280.6 | 231 | 260 |
| 65 nm | UMTS | 171 | 157 | 149 |
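Since the purpose of the models is to preserve design decisions rather than absolute values, a useful check is whether the models and the synthesis tool rank the architectures in the same order. The sketch below does this for the 65 nm technology, using the model totals and the tool values listed in the tables; the names and data layout are hypothetical.

```python
ARCHITECTURES = ["2 stages", "3 stages", "4 stages"]

# 65 nm values in the order of ARCHITECTURES, copied from the tables.
MODEL_65NM = {"GSM": [434.6, 336.3, 352.0], "UMTS": [303.0, 110.0, 108.0]}
TOOL_65NM = {"GSM": [280.6, 231.0, 260.0], "UMTS": [171.0, 157.0, 149.0]}


def ranking(powers):
    """Architectures ordered from lowest to highest power."""
    return [arch for _, arch in sorted(zip(powers, ARCHITECTURES))]


for standard in MODEL_65NM:
    model_rank = ranking(MODEL_65NM[standard])
    tool_rank = ranking(TOOL_65NM[standard])
    print(standard, model_rank, "same order as tool:", model_rank == tool_rank)
```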

This paper presented power consumption evaluation models of direct form FIR filters (Figure

Finally, a case study concerning the UMTS and GSM standards was presented. We performed a comparison between three filter architectures. The power estimation based on proposed models helped choose the suitable architecture for power consumption optimization, and the result was confirmed by filter implementations.