# A CMOS High Speed Multi-Modulus Divider With Retiming for Jitter Suppression

Qun Jane Gu and Zhuo Gao

*Abstract*—A new asynchronous high speed multi-modulus divider (MMD) architecture is presented in this letter. The new architecture significantly reduces the delay of the critical path, which not only enables ultra-high speed operation but also allows retiming techniques to suppress jitter accumulation along the divider chain. A prototype in a 65 nm CMOS technology demonstrates a speed more than three times that of a conventional MMD and a phase noise reduction of about 8.4 dB due to the retiming scheme. To the authors' best knowledge, this is the highest operating frequency demonstrated to date for a static MMD with a retiming function in CMOS. Owing to its static implementation, the MMD can operate from 19 GHz down to close to dc with programmable division ratios from 16 to 31. It consumes 39.8 mW and occupies 0.011 mm<sup>2</sup> of chip area.

*Index Terms*—CMOS, frequency synthesizer, multi-modulus divider (MMD), resynchronization.

## I. INTRODUCTION

High speed frequency dividers play a critical role in phase-locked loop (PLL) design. For an integer-N PLL, the reference frequency is often reduced to achieve fine channel-selection steps. This, however, leads to a high division ratio and results in a long settling time and high in-band phase noise. To overcome these issues, the fractional-N architecture is widely adopted, which offers great flexibility in channel selection [1]. A $\Sigma\Delta$ modulator is commonly employed to shape the quantization noise to high frequencies, where it is filtered by the subsequent low-pass filter. However, the filtering effect is normally limited by the low-order loop filter of the PLL and constrained by the highest operating frequency of the programmable divider [2]-[5]. Therefore, a high frequency multi-modulus divider (MMD) is desired to achieve better performance [6]-[8]. Moreover, a higher frequency MMD also enables a higher reference frequency, which reduces the noise contribution from PLL components due to a smaller division ratio [4]. Noise performance is another critical specification for MMDs [9], [10]; it degrades along the divider chain due to jitter accumulation. Retiming, which uses the input clock to re-synchronize the output, is an effective solution to suppress the output jitter [11]. However, retiming techniques require the overall delay of the critical path to be less than one period of the input clock across process corners and variations for reliable functionality, which constrains the use of retiming in high speed operation [7], [8]. This letter presents a new MMD architecture that significantly reduces the critical path delay, allowing retiming and a higher operating speed simultaneously.

Manuscript received October 16, 2012; revised December 30, 2012; accepted February 04, 2013. Date of publication April 11, 2013; date of current version October 03, 2013.

Q. J. Gu is with the Department of Electrical and Computer Engineering, University of California, Davis, CA 95616 USA (e-mail: jgu@ucdavis.edu).

Z. Gao is with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611 USA (e-mail: zhuog@ufl.edu).

Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/LMWC.2013.2248080

## II. MMD STRUCTURE AND CIRCUIT TOPOLOGY

A typical MMD consists of N cascaded stages of divider-by-2/3 circuits and associated control gates that program the division of each divider according to the desired division ratio. The division ratio, $D_{DIV}$, is programmable from $2^{N}$ to $2^{N+1} - 1$ and can be expressed as $D_{DIV} = 2^{N} + \sum_{i=0}^{N-1} 2^{i} \times D\langle i \rangle$, where $D\langle i \rangle$ is the control bit of the $i$-th divider-by-2/3 circuit. Fig. 1(a) illustrates a block diagram of a conventional 4-bit asynchronous MMD [6]. Fig. 1(b) shows the schematic of a divider-by-2/3 circuit unit. An asynchronous architecture is chosen because, compared with a synchronous one, it significantly reduces the loading on the input block, which facilitates high speed operation and reduces spurious generation. Jitter performance is another concern for MMDs. If the output is drawn from the MMD's divided result, shown as OUT' in Fig. 1(a), the jitter contributed by each divider and the associated control gates accumulates and degrades the phase noise of the output signal. To mitigate jitter accumulation, a retiming block, consisting of a D flip-flop (DFF) and a buffer, is applied, labeled "Retimer" in Fig. 1(a). The "Retimer" block re-samples the output with the input clock to reduce jitter accumulation along the divider chain, so the OUT signal from the "Retimer" block delivers better phase noise. The "Retimer" block also equalizes the duty cycle to bring it close to 50% [6]. However, for the retiming function to work correctly, the critical path delay must be smaller than one period of the input clock across process corners and variations. The critical path of a conventional MMD is annotated as the orange line in Fig. 1; its delay is the sum of the delays of the dividers and the cascaded control gates. This long critical path dramatically reduces the highest operating speed of the MMD.
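The division-ratio expression above can be checked with a short sketch. The function names `division_ratio` and `control_bits_for` are illustrative and do not come from the letter; the sketch only evaluates the formula and does not model the divider-by-2/3 cells themselves.

```python
def division_ratio(control_bits):
    """Evaluate D_DIV = 2**N + sum(2**i * D<i>) for N cascaded stages.

    control_bits[i] is the control bit D<i> (0 or 1); index 0 is the
    least significant bit. N is taken as the number of bits supplied.
    """
    if any(bit not in (0, 1) for bit in control_bits):
        raise ValueError("each control bit must be 0 or 1")
    n_stages = len(control_bits)
    return 2 ** n_stages + sum(bit << i for i, bit in enumerate(control_bits))


def control_bits_for(ratio, n_stages=4):
    """Invert the formula: return [D<0>, ..., D<N-1>] for a target ratio."""
    lowest = 2 ** n_stages
    highest = 2 ** (n_stages + 1) - 1
    if not lowest <= ratio <= highest:
        raise ValueError(f"ratio must lie in [{lowest}, {highest}]")
    offset = ratio - lowest
    return [(offset >> i) & 1 for i in range(n_stages)]


if __name__ == "__main__":
    # A 4-stage MMD covers the ratios 16 through 31.
    for ratio in (16, 24, 31):
        print(ratio, control_bits_for(ratio))
```

For a 4-stage MMD (N = 4), all control bits at 0 give a ratio of 16 and all bits at 1 give 31, which matches the 16 to 31 range reported for the prototype.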

To mitigate this speed constraint, this letter presents a new asynchronous MMD architecture, shown in Fig. 2, which significantly reduces the critical path delay. Instead of propagating the control signals from the latter stages back to the front stages as in Fig. 1, the control signals of the later stages feed back directly to the corresponding circuits without passing through multiple gates. The division control signals are therefore processed in parallel for the first two high frequency dividers. This approach significantly reduces the delay of the critical path, illustrated as the orange line in Fig. 2, where the dashed line represents a short connection that does not introduce any delay, and consequently relaxes the MMD speed limitation. The critical path in the new structure is formed by the delay of the divider chain and the multiple-input AND gate of the first stage. Therefore, the speed of these circuits needs to be maximized. The divider-by-2/3 block is implemented synchronously because its delay from the input clock is then reduced to one gate delay, which facilitates high speed operation.

Fig. 1. (a) Conventional MMD circuit diagram, and (b) a divider-by-2/3 circuit.

Fig. 2. Proposed new MMD to provide high speed operation.

TABLE I Speed Comparison Between a Conventional MMD and the New MMD Versus Different Process Corners and Temperatures

|           | Slow corner, 100 °C | Typical corner, 27 °C | Fast corner, 0 °C |
|-----------|---------------------|-----------------------|-------------------|
| Conv. MMD | 5 GHz               | 6.5 GHz               | 7.5 GHz           |
| New MMD   | 18 GHz              | 21 GHz                | 26 GHz            |

The drawback of this new MMD architecture is the increased loading of the latter divider stages. The different symbol lines in Fig. 2 represent the loading of the different stages: the cycle-symbol line for the first stage, the round-symbol line for the second stage, the square-symbol line for the third stage, and the cross-symbol line for the fourth stage. The last two stages (third and fourth) each have two additional gates to drive compared with the architecture of Fig. 1. However, this extra loading does not affect circuit speed because the latter stages operate at lower speeds and can drive the extra loading. Table I compares the simulated highest operating speeds of a conventional MMD and the new MMD over three corners. The speed increases by more than three times.
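As a rough check of the "more than three times" claim, the sketch below recomputes the speed ratio at each corner from the Table I values. It also converts each maximum frequency into an input clock period, which is the upper bound on critical path delay under the retiming constraint. Treating the simulated maximum frequency as the point where the critical path delay equals one input period is a simplifying assumption made here, not a statement from the letter. The names `TABLE_I_GHZ`, `period_ps`, `speedup`, and `summarize` are illustrative.

```python
# Simulated maximum operating frequencies from Table I, in GHz:
# corner -> (conventional MMD, new MMD)
TABLE_I_GHZ = {
    "slow corner, 100 C": (5.0, 18.0),
    "typical corner, 27 C": (6.5, 21.0),
    "fast corner, 0 C": (7.5, 26.0),
}


def period_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def speedup(conv_ghz, new_ghz):
    """Ratio of the new MMD's maximum speed to the conventional MMD's."""
    return new_ghz / conv_ghz


def summarize(table=TABLE_I_GHZ):
    """Return (corner, speedup, conventional period, new period) tuples."""
    return [
        (corner, speedup(conv, new), period_ps(conv), period_ps(new))
        for corner, (conv, new) in table.items()
    ]


if __name__ == "__main__":
    for corner, ratio, conv_ps, new_ps in summarize():
        print(f"{corner}: {ratio:.2f}x faster, "
              f"input period {conv_ps:.1f} ps -> {new_ps:.1f} ps")
```

Under this reading, the ratio exceeds three at every corner, and the time available to the critical path shrinks from 200 ps to roughly 56 ps at the slow corner.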

The MMD can be realized in single-ended logic, such as true single phase clock (TSPC) logic, or in fully differential logic, such as current mode logic (CML). Single-ended logic often leads to a compact, low power realization due to small capacitive loading. In addition, it does not consume quiescent dc power. However, these features are achieved at the cost of large spurious generation at the supply and ground through the bond wires due to large switching currents. Since this MMD is intended for integration into a low-spur fractional-N frequency synthesizer, a fully differential realization is preferred. Fig. 3 shows the implementation of two differential unit cells of the divider: nand2 and latch. Other cells adopt a similar differential configuration. The differential configuration not only alleviates spur generation, but also improves the spur immunity of the MMD by converting coupled spurs into common-mode signals. The disadvantage is its relatively higher power consumption.

Fig. 3. Two differential implementation examples: nand2 and latch. The parameter values correspond to the lowest operating frequency unit. The device sizes and current values are scaled accordingly for different operating frequencies.

Fig. 4. (a) Simulated spur generation and transfer to the output, assuming a 1 nH bond wire inductance on the ground and supply, and (b) phase noise comparison before and after the retiming circuitry.

Fig. 5. (a) Supply voltage versus maximum input frequency with different division ratios (16, 24 and 31), and (b) measured output spectrum from the MMD. The input frequency is 19 GHz with a division ratio of 31.

Fig. 4(a) shows the simulation results of the differential nand2 cell, assuming a 1 nH bond wire inductance on the ground and supply. The top plot shows the spurs generated on the ground, with a peak-to-peak amplitude of about 59 mV. The middle and bottom plots show the common-mode and differential-mode outputs, respectively. Spurs are present only in the common mode, at about 55 mV; no obvious spurs are observed in the differential mode. This indicates that the spurs generated within the MMD are converted into common mode, and it also demonstrates the immunity of differential configurations to external spurs. Fig. 4(b) presents the simulated phase noise before and after the retiming circuitry, which shows an improvement of about 8.4 dB.
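The common-mode conversion described above can be illustrated with an idealized numerical sketch. It assumes a perfectly matched differential pair in which the supply and ground bounce adds equally to both outputs; real circuits have finite mismatch, so some spur leaks into the differential mode. The bias level, signal swing, and the two tone frequencies are hypothetical values chosen only for illustration. The 55 mV spur amplitude is taken from the simulated common-mode result quoted in the text, read here as a peak-to-peak value.

```python
import math


def split_modes(v_plus, v_minus):
    """Return (common_mode, differential_mode) of a differential pair."""
    return (v_plus + v_minus) / 2.0, v_plus - v_minus


def simulate(n_samples=200, v_bias=0.6, swing=0.4, spur_pp=0.055):
    """Add the same spur to both outputs and decompose the result.

    v_bias, swing and the tone frequencies are hypothetical; spur_pp
    is the 55 mV common-mode spur from the text, assumed peak to peak.
    Each row is (signal, spur, common-mode deviation, differential).
    """
    rows = []
    for k in range(n_samples):
        t = k / n_samples
        signal = swing * math.sin(2 * math.pi * 5 * t)          # wanted signal
        spur = 0.5 * spur_pp * math.sin(2 * math.pi * 37 * t)   # shared bounce
        v_plus = v_bias + signal / 2 + spur
        v_minus = v_bias - signal / 2 + spur
        common, differential = split_modes(v_plus, v_minus)
        rows.append((signal, spur, common - v_bias, differential))
    return rows


if __name__ == "__main__":
    rows = simulate()
    worst_leak = max(abs(diff - sig) for sig, _, _, diff in rows)
    print(f"largest spur leakage into differential mode: {worst_leak:.2e} V")
```

In this matched model the spur appears in full on the common-mode output, while the differential output reproduces the wanted signal to within floating-point rounding.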



Fig. 6. Phase noise performance: (a) measured phase noise of both the input and output of the new MMD with an input frequency of 18 GHz and a division ratio of 24, and (b) phase noise improvement (in dB) at a 10 kHz frequency offset for different input frequencies (19 GHz, 16 GHz, and 3 GHz) with division ratios of 16, 24, and 31.



Fig. 7. (a) Measured eye diagram of the new MMD output, and (b) die photo of the high speed MMD on 65 nm CMOS.

TABLE II Performance Summary and Comparison With the State of the Art

| Ref.      | Architecture    | Division Ratio | Power Consumption [mW] | Max. Frequency [GHz] | Retiming Function | Process      |
|-----------|-----------------|----------------|------------------------|----------------------|-------------------|--------------|
| [7]       | 5-stage MMD     | 128~159        | 24.2                   | 13                   | No                | 0.13 μm SiGe |
| [8]       | Swallow counter | 56.5~64        | 66.6                   | 20                   | No                | 0.13 μm SiGe |
| This work | 4-stage MMD     | 16~31          | 39.8                   | 19                   | Yes               | 65 nm CMOS   |

## III. MEASUREMENT RESULTS

A high speed MMD with 16~31 division options has been fabricated in a 65 nm CMOS process. The core circuit occupies a chip area of 0.011 mm<sup>2</sup>. An on-chip balun was integrated with the MMD solely to facilitate testing through a single-ended configuration. Measurements were carried out through on-chip probing, with the dc supplies provided through dc probes. The input signal is provided by a synthesized sweeper, and the output is captured by a spectrum analyzer. At an input power level of -3 dBm, the MMD operates properly from 19 GHz down to 2 GHz. Theoretically, the MMD can operate down to an extremely low frequency due to its static implementation. The measured low frequency bound of 2 GHz is mainly constrained by the limited bandwidth of the input balun, which significantly attenuates low frequency inputs. Simulation confirms that there is more than 19 dB of loss at 2 GHz in the existing design. Fig. 5(a) shows the measured maximum input frequency of the MMD under different division ratios for different supply voltages with a fixed input power of -3 dBm. For instance, with a 0.94 V power supply, the MMD can operate at a maximum input frequency of 16 GHz with a division ratio of 16. Fig. 5(b) shows the measured output spectrum from the MMD with a 19 GHz input at a division ratio of 31. Fig. 6(a) shows the measured input and output phase noise (PN) with an input at 18 GHz and a division ratio of 24. The difference between the output PN of -118.8 dBc/Hz at 10 kHz and the input PN of -91.38 dBc/Hz at 10 kHz is about 27.42 dB, which is close to the theoretical value of 27.6 dB. Measurements at other frequencies and division ratios give similar results, which indicate a negligible noise contribution from the MMD, as shown in Fig. 6(b). Fig. 7(a) shows one measured eye diagram at an input frequency of 9 GHz with a division ratio of 16. The highest measured speed here is limited by the speed of the oscilloscope trigger signal. Fig. 7(b) shows the die photo, where the balun is in the front. Table II summarizes and compares the performance of the proposed design with the state of the art. Although the speed and power consumption are similar, only our MMD has retiming circuits, which can significantly suppress the accumulated jitter from the divider chain.
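The theoretical value of 27.6 dB quoted above is consistent with the standard result that an ideal, noiseless divide-by-N lowers phase noise by 20 log10(N) dB. The sketch below evaluates that expression for the three division ratios used in the measurements and compares the N = 24 case with the measured figures. The function name is illustrative, and the output phase noise is entered as -118.8 dBc/Hz, the sign implied by the stated 27.42 dB difference.

```python
import math


def ideal_pn_improvement_db(ratio):
    """Phase noise reduction of an ideal noiseless divide-by-N, in dB."""
    return 20.0 * math.log10(ratio)


# Measured values at a 10 kHz offset, 18 GHz input, division ratio 24.
INPUT_PN_DBC_HZ = -91.38
OUTPUT_PN_DBC_HZ = -118.8  # negative sign implied by the 27.42 dB difference

measured_improvement_db = INPUT_PN_DBC_HZ - OUTPUT_PN_DBC_HZ

if __name__ == "__main__":
    for ratio in (16, 24, 31):
        print(f"N = {ratio}: ideal improvement "
              f"{ideal_pn_improvement_db(ratio):.2f} dB")
    shortfall = ideal_pn_improvement_db(24) - measured_improvement_db
    print(f"measured {measured_improvement_db:.2f} dB, "
          f"{shortfall:.2f} dB short of ideal")
```

The gap between the ideal and measured improvement at N = 24 is under 0.2 dB, which supports the statement that the MMD adds negligible noise.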

## IV. CONCLUSION

A new high speed asynchronous MMD architecture is presented and validated; it not only boosts the operating frequency but also allows retiming techniques for jitter suppression. The prototype in 65 nm bulk CMOS demonstrates a speed more than three times that of conventional MMDs with retiming functions and a phase noise improvement of about 8.4 dB.

## ACKNOWLEDGMENT

The authors would like to thank the staff of TSMC for foundry support.

## REFERENCES

- [1] F. M. Gardner, *Phaselock Techniques*, 3rd ed. New York: Wiley, 2005, ch. 15, pp. 357–379.
- [2] T. A. Riley, M. A. Copeland, and T. A. Kwasniewski, "Delta-sigma modulation in fractional-N frequency synthesis," *IEEE J. Solid-State Circuits*, vol. 28, pp. 553–559, May 1993.
- [3] R. Schreier and G. C. Temes, *Understanding Delta-Sigma Data Converters*. New York: Wiley, 2004, ch. 2, pp. 21–59.
- [4] B. Razavi, *RF Microelectronics*. Upper Saddle River, NJ: Prentice-Hall, 1998, ch. 8, pp. 247–297.
- [5] P.-E. Su and S. Pamarti, "Fractional-N phase-locked-loop-based frequency synthesis: A tutorial," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 56, no. 12, pp. 881–885, Dec. 2009.
- [6] Z. Xu, Q. J. Gu, Y.-C. Wu, H.-Y. Jian, and M.-C. F. Chang, "A 70–78-GHz integrated CMOS frequency synthesizer for W-band satellite communications," *IEEE Trans. Microw. Theory Tech.*, vol. 59, no. 12, pp. 3206–3218, 2011.
- [7] M. Ray, W. Souder, M. Ratcliff, F. Dai, and J. Irwin, "A 13 GHz low power multi-modulus divider implemented in 0.13 μm SiGe technology," in *Proc. IEEE Topical Meeting Silicon Monolith. Integr. Circuits RF Syst.*, Jan. 2009, pp. 1–4.
- [8] B. A. Floyd, "A 16–18.8-GHz sub-integer-N frequency synthesizer for 60-GHz transceivers," *IEEE J. Solid-State Circuits*, vol. 43, no. 5, pp. 1076–1086, May 2008.
- [9] C. S. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli, and Z. Wang, "A family of low-power truly modular programmable dividers in standard 0.35-µm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 35, no. 7, pp. 1039–1045, Jul. 2000.
- [10] V. F. Kroupa, "Jitter and phase noise in frequency dividers," *IEEE Trans. Instrum. Meas.*, vol. 50, no. 5, pp. 1241–1243, May 2001.
- [11] H.-Y. Jian, Z. Xu, Y.-C. Wu, and M.-C. F. Chang, "A fractional-N PLL for multiband (0.8–6 GHz) communications using binary-weighted D/A differentiator and offset-frequency Δ-Σ modulator," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 768–780, Apr. 2010.