# EEC 216 Lecture #4: Low Power Circuits

## Rajeevan Amirtharajah, University of California, Davis

# Outline

- Announcements
- Review: Power Estimation, Interconnect Power
- Lecture 3: Low Power Architecture
- Static CMOS Logic
- Ratioed CMOS Logic
- Pass Gate Logic

# Announcements

- Guest lecture this Friday, Jan. 18
  - Nate Guilar on Microwatt Power Electronics

# **Review: High Level Power Estimation**

- Basic Idea: Estimate components of dynamic power before completing design
  - 1. Use results from past experience, instruction profiling
  - 2. Benchmark designs from literature (e.g., power factor)
  - 3. Complexity metrics to estimate capacitance (e.g., area, entropy)
  - 4. Profiling, statistical modeling to estimate activity factor (e.g., dual bit type method)
- Advantages: feedback on power early in design process, can use relative information without requiring absolute accuracy
- Disadvantages: usually ignore timing-related activity (glitching), absolute accuracy often needed (e.g. package selection, system power budget)
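The components being estimated are usually combined through the standard switching-power model, dynamic power = activity factor x switched capacitance x supply voltage squared x clock frequency. The sketch below is a minimal illustration of that model; the function name and all numeric values are assumptions chosen for illustration, not figures from the lecture.

```python
def dynamic_power(alpha, c_switched, vdd, f_clk):
    """Switching power in watts: alpha * C * Vdd^2 * f (standard CMOS model)."""
    return alpha * c_switched * vdd ** 2 * f_clk

# Hypothetical block: 100 pF switched capacitance, 1.0 V supply, 100 MHz clock.
baseline = dynamic_power(alpha=0.2, c_switched=100e-12, vdd=1.0, f_clk=100e6)
# Relative comparison: halving the estimated activity factor halves the estimate.
improved = dynamic_power(alpha=0.1, c_switched=100e-12, vdd=1.0, f_clk=100e6)
print(f"baseline = {baseline * 1e3:.2f} mW, improved = {improved * 1e3:.2f} mW")
```

The ratio between the two estimates is meaningful even when the absolute capacitance is only roughly known, which is the "relative information" advantage listed above.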

#### **Review: Interconnect Power**

- Resistance: depends on material resistivity, wire cross-section, length to width ratio
- Capacitance: two terms, parallel-plate and fringing-field capacitance
  - As technology scales, fringing fields more important
  - Empirical formula for accounting for both components
  - Miller effect: data dependent charging of mutual capacitances between adjacent wires
- Wire length must be estimated to approximate interconnect power early in design process
  - Several approaches based on empirical studies of wire length for different circuit blocks
- Rent's Rule: empirical formula relating the number of gates in a block to its number of I/Os

R. Amirtharajah, EEC216 Winter 2008

#### **Review: Low Power Architecture**

- Clock Gating
  - Simple to implement, common industry practice
- Power Down Modes
  - Extend clock gating to gating supply voltages
- Parallelization
- Pipelining
- Bit Serial vs. Bit Parallel Datapaths
  - Low transistor count of serial arithmetic trades lower leakage power for higher dynamic power, with a net power reduction

# Outline

- Announcements
- Review: Low Power Architecture
- Static CMOS Logic
- Ratioed CMOS Logic
- Pass Gate Logic

## Summary of CMOS Logic Styles

- Nonclocked Logic
  - Does not require clock for proper logic operation (although clocks may be required for state operation)
  - Static CMOS, ratioed logic, DCVSL, Pass-Gate logic
- Clocked Logic
  - Periodic signal required for correct logic operation as well as for state (latches and flip-flops)
  - Dynamic logic, DCSL
- Clocked styles faster in general, but also consume more power (can be observed in Power-Delay Product)

#### **Static CMOS Logic**



Pull-up network consists of PMOS devices connected as the complement (dual) of the NMOS pull-down network

#### Static CMOS Two-Input NAND Gate



**NMOS Logic Rules** 



**PMOS Logic Rules** 



## Static CMOS Logic Design

- Think of transistor as ideal switch controlled by gate
  - NMOS ON when gate high, OFF when gate low
  - PMOS OFF when gate high, ON when gate low
- Design NMOS or PMOS network first to implement desired logic function
- Design complementary network by recursively using duality
- Complementary CMOS is naturally inverting
  - Requires extra inverter stage to realize noninverting gate
- N-input logic gate requires 2N transistors
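The design recipe above (build one network, derive the other by duality) can be sketched with an ideal-switch model. This is a minimal illustration, not a circuit simulator: the nested-tuple network representation and all function names are assumptions made for the example.

```python
from itertools import product

# A network is a nested tuple:
#   ("in", name) | ("series", n1, n2, ...) | ("parallel", n1, n2, ...)

def dual(net):
    """Swap series and parallel connections recursively to get the dual network."""
    kind = net[0]
    if kind == "in":
        return net
    swapped = "parallel" if kind == "series" else "series"
    return (swapped,) + tuple(dual(child) for child in net[1:])

def conducts(net, inputs, nmos):
    """Ideal switches: NMOS is ON for a high gate, PMOS is ON for a low gate."""
    kind = net[0]
    if kind == "in":
        return inputs[net[1]] if nmos else not inputs[net[1]]
    states = [conducts(child, inputs, nmos) for child in net[1:]]
    return all(states) if kind == "series" else any(states)

def transistor_count(net):
    """One device per input leaf in the network."""
    return 1 if net[0] == "in" else sum(transistor_count(c) for c in net[1:])

def evaluate(pdn, inputs):
    """Output of a complementary gate built from an NMOS pull-down network."""
    pun = dual(pdn)
    down = conducts(pdn, inputs, nmos=True)
    up = conducts(pun, inputs, nmos=False)
    assert up != down  # exactly one network conducts for every input
    return up

nand2_pdn = ("series", ("in", "A"), ("in", "B"))
for a, b in product([False, True], repeat=2):
    print(int(a), int(b), "->", int(evaluate(nand2_pdn, {"A": a, "B": b})))
```

Because the pull-up network is the dual of the pull-down network, the two are never on together in this model, and the device count is twice the number of inputs.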

#### **Complex Gate Example (on board)**



## **Static CMOS for Low Power**

- Dynamic power and short-circuit current apply
  - Mismatched delays can lead to glitching, increased dynamic power
  - Dynamic logic eliminates glitches, short circuit
- Static power is due only to leakage
- Fully complementary design has high noise margin
  - $V_{OH} = V_{DD}$, $V_{OL} = \text{GND}$
  - Design style more scalable to lower supply voltages
  - Implies lower threshold voltages can also be used
- PMOS devices may degrade performance
  - High input capacitance, slow series P-stacks


## **Ratioed Logic Styles**



Pull-up network replaced by simple (often resistive) load

#### **NMOS Two-Input NAND Gate**



Depletion NMOS always on, sourcing static current

#### **Pseudo-NMOS Two-Input NAND Gate**



PMOS always on, sourcing static current

## **Ratioed Logic for Low Power**

- Dynamic power and static current apply
  - Mismatched delays can lead to glitching, increased dynamic power
  - Conducts current as long as output is low
- Reduced noise margin because of resistance ratios

  - $V_{OH} = V_{DD}$, $V_{OL} = V_{DD} \, R_{PDN} / (R_L + R_{PDN})$

- Could increase leakage in load gates whose NMOS gates are at  $V_{\text{OL}}$  instead of ground
- Reduced transistor count decreases input capacitance
- Low-to-High transition speed determined by load (could be faster or slower than series PMOS)
- Most useful for high fan-in gates
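The output-low level and the static power can be estimated by treating the load and the conducting pull-down network as a resistive divider across the supply. The sketch below assumes purely linear resistances; the function name and the component values are illustrative assumptions.

```python
def ratioed_output_low(vdd, r_load, r_pdn):
    """Return (V_OL, static power) for a resistive-load gate with its output low."""
    v_ol = vdd * r_pdn / (r_load + r_pdn)     # resistive divider across VDD
    p_static = vdd ** 2 / (r_load + r_pdn)    # current flows from VDD to GND
    return v_ol, p_static

# Hypothetical values: 1.2 V supply, 40 kOhm load, 10 kOhm pull-down network.
v_ol, p_static = ratioed_output_low(vdd=1.2, r_load=40e3, r_pdn=10e3)
print(f"V_OL = {v_ol:.2f} V, static power = {p_static * 1e6:.1f} uW")
```

A larger load resistance lowers both the output-low level and the static power, but it slows the low-to-high transition, which is the trade-off the ratio has to balance.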

#### **Data Dependent Static Power**

- Static power dissipated whenever output is low
- Dynamic power dissipated only on output low-to-high transition
  - For static CMOS 2-input NAND:

$$\alpha_{0\to 1} = \frac{3}{16} = 0.1875$$

  - For pseudo-NMOS 2-input NAND (assume uniform, independent inputs):

$$\alpha_0 = p_A p_B = 0.25$$

- 33% higher activity factor for pseudo-NMOS
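The two figures can be reproduced by enumerating the NAND truth table under the stated assumption of uniform, independent inputs. This is a small sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

def nand2(a, b):
    return not (a and b)

# Uniform, independent inputs: each of the four input pairs has probability 1/4.
outcomes = [nand2(a, b) for a, b in product([False, True], repeat=2)]
p_high = Fraction(sum(outcomes), len(outcomes))  # P(out = 1)
p_low = 1 - p_high                               # P(out = 0)

alpha_static_cmos = p_low * p_high   # probability of a 0 -> 1 output transition
alpha_pseudo_nmos = p_low            # fraction of time static current flows

print(alpha_static_cmos, alpha_pseudo_nmos, alpha_pseudo_nmos / alpha_static_cmos)
```

The ratio of the two values is 4/3, which is the 33% increase quoted above.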


# A Better Ratioed Logic Style

- Can eliminate static current and provide rail-to-rail output swings
  - Must change load device into a load circuit
  - Need two concepts: differential logic plus positive feedback
- Each gate generates true and complement outputs
  - Two mutually exclusive NMOS pulldown networks implemented in parallel
  - Only one conducts at a time
- Single PMOS pullup replaced by cross-coupled PMOS devices
  - Positive feedback pulls output to VDD, eliminates static current

#### **Differential Cascode Voltage Switch Logic**



When PDN1 is ON (which implies PDN2 is OFF), it pulls Out low, turning on the opposite PMOS device, which pulls the complement output high
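The steady-state behavior can be checked with an ideal-switch model. The sketch below assumes a two-input gate whose PDN1 is a series NMOS pair driven by A and B and whose PDN2 is a parallel pair driven by the complemented inputs; that choice of function, and the names used, are assumptions for illustration. Transient contention between the pull-down network and the PMOS is not modeled.

```python
from itertools import product

def dcvsl_gate(a, b):
    """Steady-state outputs of a DCVSL gate with PDN1 = A.B and PDN2 = A' + B'."""
    pdn1 = a and b                  # series NMOS pair on the Out node
    pdn2 = (not a) or (not b)       # parallel NMOS pair on the complement node
    assert pdn1 != pdn2             # the networks are mutually exclusive
    out = not pdn1       # PDN1 ON pulls Out low; else the cross-coupled PMOS holds it high
    out_bar = not pdn2   # same reasoning on the complement side
    return out, out_bar

for a, b in product([False, True], repeat=2):
    out, out_bar = dcvsl_gate(a, b)
    print(int(a), int(b), "-> Out =", int(out), " Out_bar =", int(out_bar))
```

With this choice of networks, Out is the NAND of the inputs and the complement output is the AND, so both polarities are available from one gate.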

# **DCVSL Summary for Low Power**

- Differential logic style
  - Generating both polarities of output can improve speed (eliminates inverters)
  - Extra noise immunity to common-mode noise
  - Convenient for self-timed (asynchronous) logic design
- Still a ratioed logic style, even though outputs transition rail-to-rail
  - PMOS must be sized carefully to ensure functionality
  - Pulldown networks must overcome PMOS on other side
- Short circuit current flows while outputs are switching (pulldown fighting opposite side PMOS)
- Twice the number of NMOS inputs compared to single-ended ratioed logic styles, higher input capacitance

## Pass-Transistor Logic Design

- Logic families studied so far connect primary inputs to transistor gates only
- New circuit concept: connect primary inputs to sources and drains as well as gates
  - Intuition is that it will reduce number of devices required to implement any given logic function
- Consider 2-input AND gate example:



#### **Pass-Transistor Design Issues**

- B inverse path needed for static output
  - Otherwise output would be high impedance:



- Needs only two devices as opposed to six for static CMOS 2-input AND
- Requires generating complement of B with additional inverter
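The need for the B inverse path can be illustrated with an ideal-switch model. The sketch assumes the usual two-device arrangement: one NMOS gated by B passes input A, and a second NMOS gated by the complement of B passes a logic 0. The `HIGH_Z` marker and the function name are hypothetical.

```python
from itertools import product

HIGH_Z = None  # marker for a floating (undriven) output node

def pass_and(a, b, with_b_inverse_path=True):
    """Ideal-switch model of a 2-input pass-transistor AND gate."""
    if b:
        return a          # NMOS gated by B passes input A to the output
    if with_b_inverse_path:
        return False      # NMOS gated by B' passes a logic 0 to the output
    return HIGH_Z         # nothing drives the output when B = 0

for a, b in product([False, True], repeat=2):
    print(int(a), int(b), pass_and(a, b), pass_and(a, b, with_b_inverse_path=False))
```

Without the second device the output floats whenever B is low, so its value depends on previously stored charge rather than on the inputs.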

#### **Cascading Pass-Transistor Gates**



The bottom approach is the correct way to cascade pass-gate logic; it maximizes the output high swing
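The effect of the cascading style on the output high level can be estimated with a first-order model in which an NMOS pass device delivers at most its gate voltage minus a threshold. The sketch ignores body effect and leakage, and the function names and voltages are illustrative assumptions.

```python
def nmos_pass_high(v_gate, v_in, v_tn):
    """Highest voltage an NMOS pass device delivers: min(input, gate - V_T)."""
    return min(v_in, v_gate - v_tn)

def gate_driven_cascade(vdd, v_tn, stages):
    """Each stage's degraded output drives the gate of the next pass device."""
    v = vdd
    for _ in range(stages):
        v = nmos_pass_high(v_gate=v, v_in=vdd, v_tn=v_tn)
    return v

def channel_cascade(vdd, v_tn, stages):
    """Each stage's output feeds the source/drain of the next; gates see full VDD."""
    v = vdd
    for _ in range(stages):
        v = nmos_pass_high(v_gate=vdd, v_in=v, v_tn=v_tn)
    return v

# Hypothetical values: 1.2 V supply, 0.3 V threshold, three cascaded stages.
print(gate_driven_cascade(1.2, 0.3, 3), channel_cascade(1.2, 0.3, 3))
```

In this model the gate-driven chain loses one threshold per stage, while the channel-connected chain loses only one threshold in total, which is consistent with preferring the arrangement that maximizes the output high swing.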

#### **Complementary Pass Transistor Logic**



- Since complementary signals are needed anyway, can create a fully differential version of pass gate logic

#### **CPL Basic Gates: AND / NAND**





#### **CPL Basic Gates: XOR / XNOR**



## **Advantages of CPL**

- Fully differential signals
  - Requires more devices, but simplifies complex gates like XOR, full adder
  - Both polarities eliminate extra inverters
- Static logic style
  - Output nodes always have a low impedance path to  $V_{\text{DD}}$  and GND
  - Improves resilience to noise events
- Very modular design style
  - All gates share same fundamental topology
  - Only inputs are permuted
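The modularity claim can be illustrated by modeling every CPL gate as the same pair of pass-transistor multiplexers and changing only which signals are wired to the pass inputs. The wiring below is one common assignment and is an assumption, since the gate schematics are not reproduced here; in a real circuit the complemented inputs arrive as separate wires rather than being computed.

```python
from itertools import product

def mux(select, when_high, when_low):
    """Two NMOS pass devices: one gated by select, one by its complement."""
    return when_high if select else when_low

def cpl_cell(b, pass_when_b, pass_when_not_b):
    """Shared CPL topology: two muxes produce the true and complement outputs."""
    f = mux(b, pass_when_b, pass_when_not_b)
    f_bar = mux(b, not pass_when_b, not pass_when_not_b)
    return f, f_bar

def cpl_and(a, b):    # returns (AND, NAND)
    return cpl_cell(b, a, b)

def cpl_or(a, b):     # returns (OR, NOR)
    return cpl_cell(b, b, a)

def cpl_xor(a, b):    # returns (XOR, XNOR)
    return cpl_cell(b, not a, a)

for a, b in product([False, True], repeat=2):
    print(int(a), int(b), cpl_and(a, b), cpl_or(a, b), cpl_xor(a, b))
```

All three gates call the same `cpl_cell`; only the arguments differ, mirroring the statement that only the inputs are permuted.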

## **Disadvantages of CPL**

- Fully differential signals require extra routing overhead
- Static power dissipation and reduced noise margins due to threshold drop in $V_{\text{OH}}$; possible remedies:
  - Level restoration through PMOS feedback
  - Multiple-threshold transistors
  - Full transmission gate logic



- PMOS feedback device pulls inverter input to $V_{\text{DD}}$
- Must size carefully to guarantee correct operation
  - Pass transistor network must pull X below inverter threshold for output to switch

#### **Multiple-Threshold CPL**



Zero- or low-threshold NMOS results in almost full-rail intermediate nodes

## **Full Transmission Gates**



- PMOS devices in parallel with NMOS transistors pass full  $V_{\text{DD}}$
- Requires more devices, but can be sized smaller than static CMOS

#### **Static CMOS Full Adder**



Transistor count (conventional CMOS): 40

- From Chandrakasan92, "Low-Power CMOS Digital Design"

#### **Complementary Pass Gate Logic Full Adder**



Transistor count (CPL): 28

- From Chandrakasan92, "Low-Power CMOS Digital Design"

#### **PDP vs. Delay for Various Circuit Forms**





From Chandrakasan92, "Low-Power CMOS Digital Design"

#### **Complementary Pass-Gate Logic for Low Power**

- Number of devices can be dramatically lower than static CMOS
  - No static power if circuits designed to maximize swings
- Extra routing overhead implies extra capacitance
- Performance worse than other styles, especially when gates cascaded
  - Acceptable when aggressive voltage scaling reaches limits, then only way to reduce power is to reduce switched capacitance
  - Power-Delay Product (switching energy) lower than for other styles
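For reference, the power-delay product is conventionally defined as average power multiplied by propagation delay, which is why it can be read as an energy per switching event. The symbols $P_{av}$ and $t_p$ are the conventional ones and are not taken from the slides:

$$\text{PDP} = P_{av} \cdot t_p$$

A style can therefore be slower and still have a lower PDP, provided its average power falls by more than its delay rises.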

## **Closing Thoughts**

- Static CMOS is almost always the right choice
  - Simple to design, very robust with respect to noise and power supply variation
  - Use other styles only sparingly and with care (bag of tricks for special cases of low area, low leakage, etc.)
- Next topic: sizing, dynamic logic, clocking