Low Power Lookup-Table Design Using Power Gating and Body Biasing Techniques
Yang Fu, Kejia Li
ECE 6332 – Fall 2010
University of Virginia
ABSTRACT
In this paper, we describe a low-power lookup-table (LUT) design that uses power gating and body biasing techniques. The LUT can operate in high speed, low power, or sleep mode, adapting to system requirements.
1. INTRODUCTION
The lookup table (LUT) is an important component of digital ICs. A LUT can encode Boolean functions efficiently, and it is one of the key components of an FPGA. As VLSI technology scales down, reducing the power consumption of LUTs in FPGA applications while maintaining speed becomes increasingly important [1]. Various techniques, including dual-Vt, dual-VDD, body biasing, and sleep transistors, can be used to reduce the power consumption of a LUT. Anderson and Najm studied different footer / header designs for the multiplexer routing circuitry [2]. Arifur and Vijay compared various leakage reduction techniques for FPGA circuits [1]. A dual-Vdd/dual-Vt technique was proposed in [3]. A reduced-swing technique can also be used for this purpose [4].
In our design, we use high-Vt devices together with forward body biasing (FBB) to reduce the power consumption of the mux in the LUT while maintaining its speed. We also use power gating and a reduced-swing technique to reduce power consumption by operating the device in low power or sleep mode.
2. SYSTEM CONFIGURATION
The system configuration of our design is illustrated in Figure 1.
The well driver provides the FBB voltage for the NMOS pass transistors in the mux. The power gating circuitry controls the virtual VDD (VVDD) of the input driver, mux, and output driver. A bus with a large capacitive load (100 fF) sits between the output driver and the level converter. In low power mode, the voltage swing on the bus is reduced to conserve power. The level converter restores the full-swing signal and thus prevents the excessive leakage that a reduced-swing signal would cause in the following stages. A power gating selection circuit controls the operation of the power gating headers in each component.
3. BODY BIASING
3.1 Determine the optimal FBB voltage
First, we need to determine the appropriate FBB voltage. The major metrics are static power consumption, propagation delay, and energy per switch. We also need to choose the threshold voltage Vth of the NMOS pass transistors in the mux. A lower-Vth pass transistor speeds up the circuit but results in larger static power consumption. We start with a 4x1 mux. The comparison is shown below in Figure 2.
The static power of the NMOS_VTL devices grows large as the FBB voltage increases, whereas the static power of both the NMOS_VTG and NMOS_VTH devices increases only slightly with larger FBB voltage.
The comparison of propagation delay is shown in Figure 3. The NMOS_VTH devices show excessive delay, while the delays of the NMOS_VTG and NMOS_VTL devices are acceptable. The energy per switch is about the same for all three device types.
We therefore choose NMOS_VTG devices for the pass transistors in the mux. As the plots show, an FBB voltage of 0.6 to 0.7 V gives the best tradeoff between static power consumption and propagation delay, so this is the voltage that our FBB generation circuit provides.
3.2 High power mode operation
Next, we break down the static power, delay, and energy per switch by component in the circuit.
In a 4x1 mux, the static power consumption is dominated by the output driver. The power consumption of the mux (including the input driver) increases rapidly with increasing FBB voltage. The delay of each component is shown in Figure 5. Tables 1 and 2 summarize the FBB operation of the 4x1 and 16x1 muxes in high power mode. FBB reduces the propagation delay by 9.8% in a 16x1 mux while increasing the static power consumption by 32%.
Table 1. FBB operation of the 4x1 mux in high power mode.
Metric / FBB off (0V) / FBB on (0.64V) / Change in percentage
Static power (µW) / 0.37 / 0.45 / +21.6%
Mux delay (ps) / 69 / 37 / -46.4%
Total delay (ps) / 312 / 297 / -4.8%
Energy per switch (fJ) / 148 / 148 / 0%
Table 2. FBB operation of the 16x1 mux in high power mode.
Metric / FBB off (0V) / FBB on (0.64V) / Change in percentage
Static power (µW) / 0.59 / 0.78 / +32.0%
Mux power (µW) / 0.23 / 0.44 / +47.7%
Mux delay (ps) / 137 / 86 / -37.2%
Total delay (ps) / 381 / 347 / -9.8%
Energy per switch (fJ) / 151 / 151 / 0%
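As a cross-check on the "Change in percentage" column, the relative change can be recomputed from the tabulated values as (on - off) / off. The short Python sketch below does this for the 16x1 entries. The function name `percent_change` and the dictionary layout are illustrative assumptions, not part of the original analysis, and because the table entries are rounded, the recomputed figures need not match the printed ones exactly.

```python
def percent_change(off, on):
    """Relative change from the FBB-off value to the FBB-on value, in percent."""
    return (on - off) / off * 100.0


# (FBB off, FBB on) pairs copied from the 16x1 table above.
table_16x1 = {
    "Static power (uW)": (0.59, 0.78),
    "Mux power (uW)": (0.23, 0.44),
    "Mux delay (ps)": (137, 86),
    "Total delay (ps)": (381, 347),
    "Energy per switch (fJ)": (151, 151),
}

changes = {name: percent_change(off, on) for name, (off, on) in table_16x1.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Computed this way, the static power, mux delay, and energy rows agree with the printed percentages. The total delay row comes out near -8.9% and the mux power row near +91%, so the printed -9.8% and +47.7% may reflect unrounded source data or a different reference value.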
3.3 The well driver
The structure of the well driver is shown below. It consists of an inverter and four NMOS transistors. The upper two NMOS transistors are diode-connected. When the input is low, the output of the inverter is high, so M2 is off and M3 is on; the output is low and no FBB voltage is supplied. When the input is high, M3 is off and M2 is on, and the output is VDD – 2Vtn, that is, two threshold drops below VDD.
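The two output states of the well driver can be captured in a small behavioral model. This is a minimal sketch, assuming ideal diode-connected devices that each drop exactly one Vtn and ignoring body effect and leakage; the function name `well_driver_output` and the numeric values are placeholders for illustration, since the paper does not state VDD or Vtn.

```python
def well_driver_output(enable, vdd, vtn):
    """Idealized well-driver output in volts.

    enable=False: the pull-down path is on and no FBB voltage is supplied.
    enable=True:  the output sits two diode (threshold) drops below VDD.
    """
    if not enable:
        return 0.0
    # Clamp at 0 V in case 2*Vtn exceeds VDD in this idealized model.
    return max(vdd - 2.0 * vtn, 0.0)


# Placeholder values chosen only to exercise the function (not from the paper).
VDD_EXAMPLE = 1.0
VTN_EXAMPLE = 0.18

fbb_off = well_driver_output(False, VDD_EXAMPLE, VTN_EXAMPLE)
fbb_on = well_driver_output(True, VDD_EXAMPLE, VTN_EXAMPLE)
print(f"FBB off: {fbb_off:.2f} V, FBB on: {fbb_on:.2f} V")
```

The model makes the dependence explicit: the available FBB level is set by the supply and the threshold voltage of the diode-connected devices rather than by a separate reference.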
3.4 Effect of pass transistor sizing
A parameter sweep of the NMOS pass transistor width is used to determine the optimal size of the NMOS devices in the mux. The result is shown below. (The energy per switch stays about the same; data not shown here.)
The static power increases with increasing pass transistor width, and the delay curve has a minimum. The mux is designed at this minimum-delay point. If the pass transistor is too narrow, its limited current drive causes a large delay; if it is too wide, the capacitive loading becomes too large.
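A first-order RC estimate illustrates why such a minimum exists. This is only a sketch under assumed symbols that do not appear in the original analysis: $R_{drv}$ is the effective resistance of the stage driving the pass transistor, $r_n / W$ is the on-resistance of a pass transistor of width $W$, $c_d W$ is its parasitic capacitance, and $C_L$ is the fixed load it drives.

```latex
\[
t_d(W) \approx R_{drv}\, c_d W + \frac{r_n}{W}\left(C_L + c_d W\right)
       = R_{drv}\, c_d W + \frac{r_n C_L}{W} + r_n c_d
\]
\[
\frac{d t_d}{dW} = R_{drv}\, c_d - \frac{r_n C_L}{W^{2}} = 0
\quad\Longrightarrow\quad
W^{*} = \sqrt{\frac{r_n C_L}{R_{drv}\, c_d}}
\]
```

In this model the $r_n C_L / W$ term dominates for narrow devices (limited drive) and the $R_{drv} c_d W$ term dominates for wide devices (capacitive loading), which matches the qualitative trend of the sweep.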
4. POWER GATING
4.1 Header Circuit
The header circuit used for power gating is shown in the figure above. It consists of a PMOS and an NMOS transistor. In high speed mode, both the PMOS and the NMOS are on, and VVDD = VDD. In low power mode, the PMOS is turned off and only the NMOS supplies power, so VVDD = VDD – Vtn. The voltage swing in the circuit is reduced, especially on the bus, which reduces dynamic power consumption. The level converter then restores the signal to full swing. Finally, in sleep mode, both the PMOS and the NMOS are off and the circuit is cut off from the power supply.
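The three operating modes of the header can be summarized as a small behavioral model. This is a minimal sketch, assuming ideal switches and a single fixed threshold drop across the NMOS; the names `Mode`, `header_state`, and `virtual_vdd`, as well as the numeric values, are illustrative assumptions rather than part of the design.

```python
from enum import Enum


class Mode(Enum):
    HIGH_SPEED = "high speed"
    LOW_POWER = "low power"
    SLEEP = "sleep"


def header_state(mode):
    """Return (pmos_on, nmos_on) for the header in the given mode."""
    if mode is Mode.HIGH_SPEED:
        return (True, True)
    if mode is Mode.LOW_POWER:
        return (False, True)
    return (False, False)


def virtual_vdd(mode, vdd, vtn):
    """Idealized VVDD in volts, or None when the rail is not driven."""
    pmos_on, nmos_on = header_state(mode)
    if pmos_on:
        return vdd        # PMOS passes the full supply
    if nmos_on:
        return vdd - vtn  # NMOS alone loses one threshold drop
    return None           # sleep: circuit is cut off from the supply


# Placeholder values for illustration only (not taken from the paper).
VDD, VTN = 1.0, 0.3
for mode in Mode:
    print(mode.value, header_state(mode), virtual_vdd(mode, VDD, VTN))
```

Returning `None` for sleep mode reflects that the virtual rail is undriven in this idealized view; in a real circuit it would settle to a level determined by leakage.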
4.2 Low power mode operation
The graph below shows the waveforms on the bus and at the final output during low power mode (FBB is off). The voltage swing on the bus is reduced to ~0.7 V.
Power gating can be combined with FBB. The table below summarizes the results. The effect of FBB is more pronounced in low power mode.
Metric / FBB off / FBB on / Change in percentage
Static power (µW) / 0.44 / 0.52 / +18.2%
Delay (ps) / 829 / 654 / -21.2%
Energy per switch (fJ) / 63.6 / 76.5 / +20.1%
4.3 Effect of Header Sizing
In low power mode, power is supplied to the circuit through the NMOS in the header, so the sizing of this NMOS affects the performance of the circuit. A parameter sweep of the header NMOS width is used to determine the effect. The result is shown below.
The delay decreases with increasing NMOS width, while the energy per switch follows the opposite trend. The static power remains about the same across NMOS widths (data not shown here). A wider NMOS can supply more current, which brings VVDD closer to VDD. As a result, transitions become faster and the energy per switch increases.
4.4 Sleep Mode
Both NMOS and PMOS in the headers are turned off in sleep mode. The FBB is also off. The static power consumption in sleep mode is ~46 nW.
5. CONCLUSION
We have developed a low-power LUT circuit using power gating and body biasing techniques. The circuit can operate in high speed, low power, or sleep mode for different tasks. This study provides useful information for low-power LUT design in FPGA applications.
6. REFERENCES
[1] Arifur R. and Vijay P. Evaluation of Low-Leakage Design Techniques for Field Programmable Gate Arrays. FPGA’04, February 22-24, 2004, Monterey, California, USA.
[2] Anderson, J.H. and Najm, F.N. Low-Power Programmable FPGA Routing Circuitry. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 17, no. 8, pp. 1048-1060, Aug. 2009.
[3] Fei L., Yan L., Lei H. and Jason C. Low Power FPGA Using Predefined Dual-Vdd/Dual-Vt Fabrics. FPGA’04, February 22-24, 2004, Monterey, California, USA.
[4] Garcia, J.C., Montiel-Nelson, J.A. and Nooshabadi, S. Adaptive Low/High Voltage Swing CMOS Driver for On-Chip Interconnects. IEEE International Symposium on Circuits and Systems (ISCAS 2007), pp. 881-884, 27-30 May 2007.