# **Special Correspondence**

# **Design-Performance Trade-Offs in CMOS-Domino Logic**

VOJIN G. OKLOBDZIJA AND ROBERT K. MONTOYE, MEMBER, IEEE

Abstract --- This paper is a study of the charge-sharing problem and its effect on the performance of CMOS-Domino logic. Several solutions to the charge-sharing problem are examined, and the results are verified by simulation. Thus the charge-sharing problem in CMOS-Domino logic was identified and alternate approaches were evaluated.

#### I. INTRODUCTION

With increased interest in CMOS, Domino-type logic has been gaining favor due to its n-MOS-like performance (i.e., n-channel dominant delay) and CMOS-like power consumption [1], [2], and favorable testability relative to CMOS [3]. However, Domino logic presents a charge-redistribution problem that continues to impair its usability.

This logic family was developed during the implementation of the BELMAC-32 microprocessor, and the first paper on Domino logic was published by authors from Bell Laboratories [2]. However, the logic as originally published had several drawbacks. For example, inversion was not possible, making the implementation of the EXCLUSIVE OR (XOR) function difficult. The original circuit implementation was also very sensitive to the charge-redistribution problem, which caused spurious results.

The authors from IBM developed a logic family, Cascode Voltage Switch (CVS), which further advanced the status of Domino logic [1]. This is a complete logic family because both polarities of each function output are available at every stage. In fact, this is a two-rail logic that offers a self-checking feature at no extra cost. Their logic family comes in two versions: static and dynamic. The dynamic version can be treated as part of the CMOS-Domino logic family. The static version of the CVS logic implements the p-MOS latches at the output nodes, which triggers a regenerative action to bring the nodes to their full logic one and logic zero values. An extension of this technique is Differential Split Level (DSL) logic, which claims a performance improvement of ten times over regular static CMOS, but consumes more power as reported [4].

In this paper, we consider CMOS-Domino logic and the dynamic version of CVS for the purpose of analyzing and identifying the charge-redistribution problem.

#### II. OPERATION

Fig. 1. Domino circuit. A p-channel transistor is indicated by a circle at the gate. W/L ratios are indicated next to the transistor. Waveforms on the internal nodes N1, N3, and N4 are shown in Fig. 2.

Principles of operation of Domino-type logic are outlined by the circuit example shown in Fig. 1. This logic family evolved from the dynamic n-MOS (p-MOS) circuits and therefore retained two phases of operation: "precharge" and "evaluate" (designated PCHG and EVAL, respectively, in this paper). The basic logic function implemented with this type of logic consists of clock circuitry (transistors $Q_2, Q_{p1}$), an n-MOS transistor network SWF implementing a given Boolean function f, and an inverter. During the PCHG phase of the clock, the p-MOS transistor $Q_{p1}$ is ON while the n-MOS transistor $Q_2$ is OFF. Node N4 is charged to $V_{dd}$ and the output of the inverter is at a voltage level close to 0 V. This situation occurs at that time at every logic block, including those whose outputs are connected to the inputs $X_i$ of this particular block. Registers are designed in the same way, so that all of their outputs are at the logic zero value during the PCHG phase. As a consequence, all of the inputs $X_i$ of the particular block, and of all the other blocks, are close to 0 V during the PCHG phase. Therefore, during the PCHG phase there is no electrical path from the "top" node ($N_4$) to the "bottom" node ($N_1$), and only the "top" node ($N_4$) is storing charge. When the clock turns to the EVAL phase, transistor $Q_2$ is ON, creating a path from node N4 through the switching network SWF to ground. If the condition for the existence of an electrical path in the network SWF (between the nodes N4 and N2) is established by the signal values of the inputs to the SWF, node N4 is discharged to ground, which in turn makes the output of the inverter F the logic ONE. This value is the input to the subsequent logic block(s) and can cause the output of the block(s) to switch to ONE. This signal change is propagated in the "domino" fashion.
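The two-phase behavior described above can be sketched as a purely logical model. This is a minimal illustration, not the circuit of Fig. 1: the function names and the series-chain switching function `series_and` are hypothetical, and the model deliberately ignores all analog effects, including charge redistribution.

```python
from typing import Callable, Sequence


def domino_stage(swf_conducts: Callable[[Sequence[int]], bool],
                 inputs: Sequence[int]) -> dict:
    """Idealized PCHG/EVAL model of one Domino block (no charge sharing)."""
    # PCHG: Qp1 ON, Q2 OFF -> top node N4 charged to ONE, inverter output ZERO.
    top_node = 1
    pchg_output = 1 - top_node

    # EVAL: Q2 ON -> N4 discharges only if the SWF network forms a path to ground.
    if swf_conducts(inputs):
        top_node = 0
    eval_output = 1 - top_node

    return {"pchg_output": pchg_output, "eval_output": eval_output}


def series_and(inputs: Sequence[int]) -> bool:
    """Hypothetical SWF: a series chain that conducts when all inputs are ONE."""
    return all(inputs)


def two_stage_chain(first_inputs: Sequence[int]) -> tuple:
    """Feed the first block's EVAL output into a second block ("domino" effect)."""
    f1 = domino_stage(series_and, first_inputs)["eval_output"]
    # Assumption: the second block's other inputs are already at ONE.
    f2 = domino_stage(series_and, [f1, 1, 1])["eval_output"]
    return f1, f2


if __name__ == "__main__":
    print(two_stage_chain([1, 1, 1]))
    print(two_stage_chain([0, 1, 1]))
```

In this idealized model an output can only move from ZERO to ONE during EVAL, which is why every block input must sit at ZERO during PCHG.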

#### A. Charge-Redistribution Problem

From the operation of the Domino logic, it is clear that the charge is stored only at the "top" node (N4 in Fig. 1) and that the nodes $N_1, N_2, N_3$ are not charged during the PCHG phase. They might have been discharged during the previous cycle and thus hold no charge. Therefore, during the evaluation phase, there may be an electrical path to several discharged nodes (causing charge redistribution) without an electrical path to ground. The redistribution is sufficient to cause an error when the ratio of the capacitance at the top node of the tree, $C_t$, to the uncharged capacitance internal to the tree, $C_i$, is such that the voltage is reduced below the inverter threshold $I_{th}$:

$$V_{dd} \times \frac{C_t}{C_t + C_i} \leq I_{th}.$$

Manuscript received September 9, 1985; revised December 20, 1985.

The authors are with the IBM T. J. Watson Research Center, Yorktown Heights, NY 10598.

IEEE Log Number 8607667.

Fig. 2. Waveforms from the Domino circuit example in Fig. 1. The effect of charge redistribution is seen on the node N4.

This charge redistribution will cause the inverter at the output of the tree to switch falsely, placing an incorrect value on the line and causing other groups to discharge falsely. One such example (shown in Fig. 2) is generated by simulation of a single Domino-logic stage using the Toggle circuit simulation package [5]. In this case we observe the behavior of the logic block over two full cycles: PCHG-EVAL, PCHG-EVAL. During the first cycle, inputs X0, X1, and X2 are set to logic one, causing the discharge of the entire network during the EVAL phase. Following the PCHG phase in the second cycle, the node N4 is precharged to 5.000 V. In the second cycle, the inputs are set to X0 = 0, X1 = 1, and X2 = 1, creating an electrical path between the nodes N4, N3, and N1 but stopping short of the node N2, which is connected to ground during the EVAL phase. This situation causes redistribution of the charge between the nodes N4, N3, and N1. From the waveforms in Fig. 2, we can observe that the voltage on node N4 falls to 1.0016 V, while the voltage on node N1 rises to 1.0005 V and the voltage on node N3 rises to 1.0009 V. The input combination ($X_0 = 0$, $X_1 = 1$, and $X_2 = 1$) is supposed to produce the logic ZERO value at the output $F_1$. However, because of the redistribution of charge between the node $N_4$ and the nodes $N_1$ and $N_3$, the voltage at node $N_4$ is only 1.0016 V. This produces an erroneous value of logic ONE at the output $F_1$.
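The simulated voltages can be checked against the ideal charge-sharing relation with a short calculation. This is a sketch under stated assumptions: redistribution is treated as lossless onto fully discharged internal nodes, the function names are illustrative, and the 2.5 V inverter threshold is a hypothetical value, not one reported for the simulated circuit.

```python
def shared_voltage(v_dd: float, c_top: float, c_internal: float) -> float:
    """Top-node voltage after ideal charge sharing onto discharged nodes."""
    return v_dd * c_top / (c_top + c_internal)


def implied_capacitance_ratio(v_dd: float, v_final: float) -> float:
    """C_i / C_t implied by an observed drop from v_dd to v_final."""
    return v_dd / v_final - 1.0


def falsely_switches(v_dd: float, c_top: float, c_internal: float,
                     inverter_threshold: float) -> bool:
    """True when the shared voltage is at or below the inverter threshold."""
    return shared_voltage(v_dd, c_top, c_internal) <= inverter_threshold


# Values from the simulated example: N4 precharged to 5.000 V, settling at 1.0016 V.
ratio = implied_capacitance_ratio(5.000, 1.0016)

if __name__ == "__main__":
    print(f"implied C_i/C_t = {ratio:.2f}")
    # Hypothetical threshold of 2.5 V (an assumption, not from the simulation).
    print(falsely_switches(5.0, 1.0, ratio, inverter_threshold=2.5))
```

Under these assumptions the observed drop corresponds to an internal capacitance roughly four times that of the top node, far beyond what any mid-rail threshold could tolerate.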

#### III. PROBLEM ELIMINATION METHODS

In this section we examine the techniques used to alleviate the problem caused by charge redistribution and evaluate the trade-offs in reliability and circuit performance.

Two methods of reducing the charge-sharing problem are addressed. The first of these, incorporating feedback into the tree, reduces the charge-sharing problem by injecting charge into the tree during evaluation. The second method selectively increases the storage capacity of the precharge node in proportion to the number of nodes to which the charge can be redistributed.

#### A. Feedback Transistor (Dynamic CVS Logic ONE)

One method used by dynamic CVS logic to alleviate the charge redistribution problem is to place an additional p-MOS transistor Qf in parallel with the precharge transistor. The gate of this transistor is connected to the output of the inverter so that feedback from the output is obtained (Fig. 3). In this way the inverter-transistor combination acts as a "latch" that locks on the state where the output of the inverter F is at the ZERO logic value. Because the output F = 0 is "latched", it takes more current from the node N4 to pull the node to ground since the charge is being continuously replenished by the device  $Q_f$ . When charge redistribution occurs, transistor  $Q_f$  serves the purpose of replenishing the charge lost in the process of redistribution to the other nodes. We can distinguish two cases.

1) During the charge redistribution, the total sum of the currents to node $N_4$ is such that node $N_4$ will recover the charge lost by redistribution. The current from node $N_4$ that is due to redistribution of charge decreases exponentially in time. The current into node $N_4$ through the transistor $Q_f$ also decreases exponentially in time but has a larger time constant.



Fig. 3. An example of CVS logic which was used to simulate charge redistribution. p-channel transistors are distinguished by the circle associated with the gate symbol. Voltages on the nodes N3 and N4 are shown in Fig. 4.

As a result, the current calculated with reference to the node $N_4$ is negative at the beginning and is equal to the charge taken from the node and distributed to other nodes. Later, this current becomes positive, bringing the charge lost in redistribution back to the node $N_4$, and then decreases exponentially to zero. This produces a voltage "spike" or "glitch" at the node N4 whose amplitude never exceeds the threshold for logic ONE at the inverter input.

2) In the second case, the amount of charge lost during the initial period is such that it produces a voltage spike of sufficient amplitude to change the output value to logic ONE. This in turn cuts off the transistor $Q_f$, preventing it from replenishing the lost charge to the node N4. In this case, the fault is permanent and the output will stay at the erroneous value of logic ONE instead of ZERO.

Let us consider the first case and the consequences of the voltage "spike." The voltage "spike" at the node N4 is propagated through the inverter producing a positive voltage "spike" at the output F. The effect of this positive voltage "spike" can be twofold.

1) It can cause complete discharge in the next logic block, causing an erroneous value to appear at its output. This value is propagated further in a "domino" fashion.

2) The voltage "spike" can cause a similar "spike" at the output of the next logic block. This spike is further propagated, in which case it can be:

- a) of smaller amplitude and therefore dissipated in the logic;
- b) amplified through the subsequent stages and therefore eventually resulting in a permanent error being propagated (much like case 1);
- c) fanned out in different directions, some of which will dissipate the spike and some of which will amplify it and distribute the erroneous reading.

These cases are illustrated in Fig. 4.

Fig. 4. Signals at the nodes F1, F2, N3, and N4 for the circuit in Fig. 3, as a function of time for various sizes $(W/L)_f$ of the feedback transistors $Q_{f1}$ and $Q_{f2}$.

Fig. 5. Waveforms on the nodes F1, F2, N3, and N4 for the circuit in Fig. 3 without the feedback transistors $Q_{f1}$ and $Q_{f2}$. The size of the output inverters is varied: A, small inverter; B, output inverter size increased 7 times over case A; C, output inverter of the size in case B with the output load on the nodes F1 and F2 doubled.

However, feedback transistor $Q_f$ serves as a load transistor when the node N4 is forced to ground. Therefore the operation is now that of a ratioed circuit, and as such the transistor $Q_f$ serves as a load and its L/W ratio has to be adjusted to the cumulative $(L/W)_{eff}$ ratio of the maximum length electrical path to ground of the switching block SWF. The L/W ratio represents the resistance of the feedback transistor in the circuit. Let us define $\beta$ to be the ratio $(L/W)_f/(L/W)_{eff}$. This ratio represents the resistance of the feedback device compared to the resistance of the switching network SWF. We can distinguish three cases:

- 1)  $\beta$  too large, in which case the feedback device is not very effective except for very small "glitches;"
- 2) $\beta$ in the range of 0.9, which was determined to be the optimal value with respect to adequate "glitch" protection; and
- 3)  $\beta$  too small, in which case the output F acts as being stuck at zero, because the SWF block is too weak to pull the node N4 to ground.
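The ratio $\beta$ and the three cases above can be sketched in a few lines. This is an illustration under first-order assumptions: series devices are taken to add their L/W ratios (resistances in series), the lower limit of 0.6 is the failure point reported for the simulated circuit, and the upper limit `weak_above` is a hypothetical placeholder, since no numeric bound for "too large" is given. All function names are illustrative.

```python
from typing import Sequence


def effective_l_over_w(paths: Sequence[Sequence[float]]) -> float:
    """(L/W)_eff of the longest path: series devices add their L/W ratios."""
    return max(sum(path) for path in paths)


def beta(l_over_w_feedback: float, paths: Sequence[Sequence[float]]) -> float:
    """Resistance of the feedback device relative to the worst-case SWF path."""
    return l_over_w_feedback / effective_l_over_w(paths)


def classify_beta(b: float, fail_below: float = 0.6,
                  weak_above: float = 1.5) -> str:
    """Map beta onto the three cases; weak_above is an assumed bound."""
    if b < fail_below:
        return "feedback too strong: output may act as stuck at zero"
    if b > weak_above:
        return "feedback too weak: only small glitches suppressed"
    return "usable range"


if __name__ == "__main__":
    # Hypothetical SWF with two paths to ground, given as per-device L/W ratios.
    swf_paths = [[0.5, 0.25, 0.25], [0.5, 0.25]]
    b = beta(0.9, swf_paths)
    print(f"beta = {b:.2f}: {classify_beta(b)}")
```

Because the worst-case path sets $(L/W)_{eff}$, adding series devices to the SWF forces a weaker feedback device for the same $\beta$.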

However, the "safe" or "glitch-free" operation is dependent on the proper choice for  $\beta$ . The range of  $\beta$  values in which the circuit is effective in "glitch" suppression imposes the restriction on the maximal length of the possible electrical paths in the SWF network. Additionally, a restriction is placed on the number of nodes in the tree that can be at 0 V and cause redistribution.

Given the example shown, in which a SWF with three devices in series and a total of six devices can provide glitch immunity only with careful feedback device tuning, it is clear that the range of effectiveness of the feedback device is very limited.

Another impact of the feedback device is on performance. This is visible in Fig. 4: the delay increases markedly as a larger feedback device (smaller $\beta$) is used, until, as previously mentioned, the circuit fails to operate for $\beta$ below 0.6. Thus the reduction in glitch sensitivity is paid for by a degradation in performance. This difficulty becomes apparent as larger SWF's are used, since the path to ground involves more active devices, and there are more total devices in the circuit.

#### B. Charge Storage on Output Inverters

One method of reducing the glitch sensitivity is to increase the capacitance of the precharge node. This method forces the charge to be drained proportionally to the number of nodes in the SWF. This increase in capacitance allows the stored charge to be distributed over the available nonprecharged drains. The capacitance can be increased by making the transistors in the output inverter larger. This method has the major advantage of increasing the drive capability of the circuit, thus reducing its sensitivity to output loading. Waveform A in Fig. 5 shows the result of using a small output inverter, i.e., charge redistribution. Waveform B results from increasing the output inverter by a factor of 7, which removes the glitch in the output at the expense of a 25-percent increase in delay. This is because the additional capacitance must be discharged if the circuit is to switch. However, the increase allows much greater insensitivity to the effects of increasing the output load, since the output driver is much larger. Notice that there is both charge redistribution for the first set of inputs and a long delay for the true switching signal. The final waveform C shows the result of the circuit with the larger output inverter and double the expected loading, i.e., 0.6 pF. Note that the delay for this circuit is only slightly larger than for the circuit having the output inverter driving the smaller load.
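A rough sizing bound follows from the charge-sharing condition of Section II-A. This is a sketch under idealized assumptions (lossless redistribution onto fully discharged internal nodes, with $C_t$ taken to include the gate capacitance of the output inverter); it is not a result taken from the simulations. To avoid a false switch, the redistributed voltage must stay above the inverter threshold:

$$V_{dd} \times \frac{C_t}{C_t + C_i} > I_{th} \;\Longrightarrow\; C_t\,(V_{dd} - I_{th}) > C_i\,I_{th} \;\Longrightarrow\; C_t > C_i \times \frac{I_{th}}{V_{dd} - I_{th}}.$$

For instance, if the threshold is assumed to sit at $I_{th} = V_{dd}/2$, the bound reduces to $C_t > C_i$: the precharge node must then hold more capacitance than all the discharged internal nodes it can share charge with.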

The technique of increasing the output inverter size to reduce the effects of charge sharing is limited in range to acceptable output inverter sizes and delays. It has the potential to reduce charge sharing in cases where the problem is not too severe. However, the internal switching delay may grow significantly with larger trees. The additional side effect of increasing the switching speed of the output load may offset some of this weakness.

#### IV. CONCLUSION

The charge-sharing problem may be combatted using several methods. Two solutions to this problem were described and analyzed. It was concluded that feedback devices are helpful in a limited range and that increasing the output inverter size is additionally helpful in reducing the problems of charge redistribution. In practice, these two methods can be used to produce a large family of glitch-free trees. Further study is required to make Domino logic viable for a wider range of switching functions.

#### REFERENCES

- [1] L. G. Heller, W. R. Griffin, J. W. Davis, and N. G. Thoma, "Cascode voltage switch logic: A differential CMOS logic family," in Proc. ISSCC (San Francisco, CA), Feb. 22-24, 1984.
- [2] R. H. Krambeck et al., "High-speed compact circuits with CMOS," IEEE J. Solid-State Circuits, vol. SC-17, no. 3, June 1982.
- [3] V. G. Oklobdzija and P. G. Kovijanic, "On testability of CMOS-Domino logic," in Proc. 14th Int. Conf. Fault-Tolerant Computing (Orlando, FL), June 20–22, 1984.
- [4] L. C. M. G. Pfennings et al., "Differential split-level logic for sub-nanosecond speeds," in Proc. ISSCC 1985 (New York, NY), Feb. 14, 1985.
- [5] F. Beetern et al., "A large-scale MOSFET circuit analyzer based on waveform relaxation," in Proc. ICCD'84 (Port Chester, NY), Oct. 8-11, 1984.

## **CMOS Circuit Testability**

### P. S. MORITZ AND L. M. THORSEN

Abstract-CMOS circuits present unique testing problems. Although open faults in CMOS circuits can be statically tested, a sequence of patterns is required to guarantee a test. In addition, connections in the circuit layout affect testability. An automatic test generator has been developed to generate test sequences which will detect open CMOS faults.

Manuscript received August 15, 1985; revised December 11, 1985.

The authors are with the IBM General Technology Division, Essex Junction, VT 05452.

IEEE Log Number 8607454.