# 20 GHz Low Power QVCO and De-skew Techniques in 0.13µm Digital CMOS

Masum Hossain and Anthony Chan Carusone Dept. of Electrical and Computer Engineering, University of Toronto

Abstract- A novel VCO topology is proposed that combines the low power of  $-g_m$  oscillators with the inherent buffering of Colpitts oscillators. Using this topology, a quadrature VCO (QVCO) was implemented in 0.13 µm digital CMOS consuming 32 mW at 20 GHz with just over 10% tuning range. The measured phase noise of the QVCO at 20.17 GHz is -102.41 dBc/Hz at 1 MHz offset. Because the load is isolated from the tank, the QVCO can directly drive 50-Ohm impedances or large capacitive loads with no additional buffering. A technique to use the QVCO to deskew clocks is also presented whereby the QVCO accepts a small forwarded clock amplitude of 20 mV, and provides a 200 mV peak-to-peak differential clock output with linear control of the phase over the complete range, 0-360°.

## I. INTRODUCTION

Clock generation and distribution consume significant power and area in high-speed I/Os. To reduce power consumption per link, a shared clock source may be used. This shared clock may be generated either in the receiver [1] or at the transmitter and then forwarded to the receiver [2]. A phase interpolator must then be included in each link's receiver to compensate for skew (fig. 1) [1,3,4].

A common approach to the problem of clock generation and distribution is to employ a low-jitter VCO within a phase-locked loop (PLL), then buffer the output with several CML and CMOS stages to distribute the clock [1,5]. In this work we propose a VCO with an inherent buffer that re-uses the VCO bias current and provides large driving capacity without additional power consumption.



Fig. 1. Shared clocking for high density I/O [1-4].

An efficient technique for deskewing is to use an injection-locked oscillator (ILO) whose free-running frequency is detuned away from the input frequency [6]. A problem with this approach has been that, for large phase shifts, considerable variation is observed in the jitter tracking bandwidth and output clock amplitude [7]. In this work, selectively injecting into one side or the other of a quadrature VCO (QVCO) cuts the required phase adjustment range in half.

# II. LOW POWER VCO ARCHITECTURE

Before introducing the proposed topology, two popular VCO topologies are reviewed: Colpitts and cross-coupled. The proposed topology, introduced at the end of this section, combines the advantages of both.



Fig. 2. Colpitts VCO (a) Conventional (b) CMOS version of modified Colpitts

### A. Colpitts

Fig. 2 shows the equivalent half circuits of two variants of the Colpitts VCO: fig. 2(a) is the well-known conventional Colpitts and fig. 2(b) is a CMOS implementation of the bipolar microwave oscillator discussed in [8]. Although both VCO topologies have the same start-up and oscillation conditions, the implementation in fig. 2(b) provides inherent buffering [8]: the tank is coupled to the load only through $C_{GD}$, whereas in fig. 2(a) the load capacitance ($C_L$) is directly across the tank. This is the main advantage of the modified Colpitts VCO.

If $g_m$ is the small-signal transconductance of $M_2$ and $R_p$ models tank losses, the condition to ensure oscillation of the Colpitts VCO is:

$$g_m \ge \frac{4}{R_p} \tag{1}$$

Providing sufficient bias current to meet this requirement can lead to very high power consumption in CMOS Colpitts VCOs. For example, in this design the tank inductance ($L$) and capacitance ($C$) are chosen to be 350 pH and 150 fF respectively to achieve the targeted 20 GHz oscillation. In the given digital process, this inductance was realized with a single loop. EM simulation of the optimized structure (including the metal fill inside the loop required by the design rules) shows a Q of 5, which is typical for digital CMOS processes [5,7]. Capacitors $C_1$ and $C_{var}$ are chosen to be 400 fF and 250 fF respectively. For the given value of Q, the required transconductance to meet the oscillation condition was found to be approximately $G_m = \omega^2 C_1 C_{var} R_s = 25 \text{ mS}$ [9], where $R_s$ is the series loss of the inductor. For the given 0.13 µm CMOS process, a bias current density of 0.3 mA/µm provides a transconductance of 0.75 mS/µm. Thus a single-ended VCO consumes 10 mA of current, which leads to 20 mA of total current consumption for the differential Colpitts VCO.



Fig. 3 (a) Cross-coupled VCO (b) Proposed topology

### B. Cross-coupled

The equivalent half circuit for the cross-coupled VCO is shown in fig. 3(a). Here, the ideal gain of '-1' represents the positive feedback resulting from the cross-coupling in a fully differential circuit. Compared to the Colpitts VCO, the cross-coupled VCO has a relaxed oscillation condition:

$$g_m \ge \frac{1}{R_p} \approx \sqrt{\frac{C_{eq}}{L}} \left( \frac{Q_L Q_C}{Q_L + Q_C} \right)^{-1} = \sqrt{\frac{C_L + C_{var}}{L}} \left( \frac{Q_L Q_C}{Q_L + Q_C} \right)^{-1} \tag{2}$$

where $g_m$ is the transconductance of $M_1$ and $R_p$ models the tank loss. $Q_L$ and $Q_C$ are the quality factors of the inductor and varactor respectively. The capacitance $C_{var}$ is the varactor and $C_L$ accounts for the additional load and parasitic capacitors connected to this node. Considering the same tank circuit as in the previous section, the $g_m$ required to meet the oscillation condition is found to be 6 mS. Assuming the same current density (0.3 mA/µm), a differential cross-coupled VCO consumes only 5 mA to provide the required $g_m$, one quarter the current of the Colpitts VCO. However, if it is used as a clock generator for a wide parallel bus, the output node is heavily loaded by $C_L$. Hence, to achieve the target oscillation frequency, the varactor $C_{var}$ must be made small, resulting in poor tuning range [10]. To avoid this loading effect, CML buffers are used in [5] and [9], but these consume additional power, negating the benefit of cross-coupled oscillators.

In summary, the Colpitts topology provides sufficient tuning range and output power but consumes considerable power. Cross-coupled VCOs, on the other hand, consume less power but require an additional buffer and are more susceptible to load parasitics.
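The current estimates quoted for the two topologies follow from simple scaling with the stated densities (0.75 mS/µm and 0.3 mA/µm). The following is a minimal sketch of that arithmetic, not a design tool; the function names are hypothetical, and it assumes that transconductance and current both scale linearly with gate width.

```python
# Sketch: bias current implied by a required transconductance.
# The two densities are the figures quoted in the text for the 0.13 um process.
GM_PER_UM_MS = 0.75   # transconductance density, mS per um of gate width
I_PER_UM_MA = 0.3     # bias current density, mA per um of gate width


def half_circuit_current_ma(gm_required_ms: float) -> float:
    """Bias current (mA) of one half circuit sized to deliver gm_required_ms.

    Assumes gm and current scale linearly with width (hypothetical helper).
    """
    width_um = gm_required_ms / GM_PER_UM_MS
    return width_um * I_PER_UM_MA


def differential_current_ma(gm_required_ms: float) -> float:
    """A differential VCO contains two half circuits."""
    return 2.0 * half_circuit_current_ma(gm_required_ms)


if __name__ == "__main__":
    # Required gm values are the ones stated in the text: 25 mS and 6 mS.
    for name, gm_ms in (("Colpitts", 25.0), ("Cross-coupled", 6.0)):
        current = differential_current_ma(gm_ms)
        print(f"{name}: {current:.1f} mA differential")
```

With the quoted values this gives 20 mA for the Colpitts VCO and 4.8 mA for the cross-coupled VCO, which the text rounds to 5 mA.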

### C. Proposed topology

To address these issues, we propose the architecture in fig. 3(b), which combines the useful properties of both cross-coupled and Colpitts architectures: the inherent buffering of the modified Colpitts topology (fig. 2(b)) and the low-power oscillation condition of the cross-coupled VCO (fig. 3(a)). In this architecture the transistor $M_2$ is introduced in the tank of the cross-coupled VCO to isolate the output node from the tank, as in the modified Colpitts VCO. Effectively, $M_2$ serves as a buffer that can directly drive a 50-ohm or large capacitive load. Since it uses the same VCO bias current, there is no additional DC power consumption. There are two sources of negative resistance in this topology: (i) due to the cross-coupling, $M_1$ provides a negative resistance of $-1/g_{m1}$; (ii) $M_2$ also provides a negative resistance of approximately $-g_{m2}/(C_{GS}C_{var}\omega^2)$. There are therefore two potential modes of operation: (i) a Colpitts VCO, where the negative resistance provided by $M_2$ dominates and is large enough to compensate the tank loss; (ii) a cross-coupled VCO, where the oscillation occurs due to the negative resistance provided by $M_1$. Of these two modes, the cross-coupled mode of oscillation requires less transconductance and hence lower power consumption.

Table I. Summary of VCO topology comparison

| Topology      | Frequency of oscillation                                                       | Minimum required $g_m$                                          |
|---------------|--------------------------------------------------------------------------------|-----------------------------------------------------------------|
| Cross-coupled | $f_{osc} = \left(2\pi\sqrt{L(C_{var} + C_L)}\right)^{-1}$                      | $\geq \frac{1}{R_p}$                                            |
| Colpitts      | $f_{osc} = \left(2\pi\sqrt{L\frac{C_1C_{var}}{C_1 + C_{var}}}\right)^{-1}$     | $\geq \frac{4}{R_p}$                                            |
| Proposed VCO  | $f_{osc} = \left(2\pi\sqrt{L\frac{C_{GS}C_{var}}{C_{GS}+C_{var}}}\right)^{-1}$ | $\geq \left(\frac{1}{R_p} - \frac{C_{var}}{C_{GS}}g_{m2}\right)$ |

The derived oscillation condition and oscillation frequency for the proposed oscillator are given in Table I. These results are in good agreement with the qualitative description given above: $f_{osc}$ is independent of $C_L$ and the required minimum transconductance is slightly less than $1/R_p$. Thus $M_1$ is sized to meet the oscillation condition and $M_2$ is used as a buffer only. The total current consumption for a differential VCO is 8 mA, which is a 60% reduction compared to the 20 mA of the Colpitts implementation.
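The load dependence summarized in Table I can be illustrated numerically. The sketch below evaluates the three ideal frequency expressions using the component values quoted in the text ($L$ = 350 pH, $C_1$ = 400 fF, $C_{var}$ = 250 fF). The value used for $C_{GS}$ and the swept load capacitances are assumptions chosen only for illustration, and the ideal expressions ignore device and layout parasitics, so the absolute numbers are not expected to match the measured 20 GHz.

```python
import math

L_TANK = 350e-12   # tank inductance from the text (H)
C_VAR = 250e-15    # varactor capacitance from the text (F)
C_1 = 400e-15      # Colpitts capacitor C1 from the text (F)
C_GS = 400e-15     # ASSUMED gate-source capacitance of M2 (F), illustrative only


def f_osc(l_h: float, c_f: float) -> float:
    """Resonant frequency of an ideal LC tank, 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))


def series(c_a: float, c_b: float) -> float:
    """Series combination of two capacitors."""
    return c_a * c_b / (c_a + c_b)


def f_cross_coupled(c_load: float) -> float:
    """Table I, cross-coupled row: the load adds directly to the tank."""
    return f_osc(L_TANK, C_VAR + c_load)


def f_colpitts() -> float:
    """Table I, Colpitts row."""
    return f_osc(L_TANK, series(C_1, C_VAR))


def f_proposed(c_load: float) -> float:
    """Table I, proposed row: c_load does not appear in the expression."""
    return f_osc(L_TANK, series(C_GS, C_VAR))


if __name__ == "__main__":
    for c_load in (0.0, 100e-15, 200e-15):  # hypothetical load values
        print(f"C_L = {c_load * 1e15:5.0f} fF: "
              f"cross-coupled {f_cross_coupled(c_load) / 1e9:5.2f} GHz, "
              f"proposed {f_proposed(c_load) / 1e9:5.2f} GHz")
```

In this idealized model the cross-coupled frequency falls as the load grows, while the proposed topology's frequency stays fixed, which is the behavior the text attributes to the isolation provided by $M_2$.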

### D. Quadrature VCO (QVCO)



Fig. 4. Implementation of QVCO (a) Architecture and test setup (b) Detailed schematic of QVCO (c) Die photo of QVCO in 0.13 µm CMOS

A quadrature version of the proposed VCO topology was implemented by coupling two differential VCOs operating at the same frequency [11,12]. A schematic is shown in fig. 4(a) and (b). The coupling is provided by active devices, $M_C$.

Quadrature (4-phase) VCOs in general have several disadvantages compared to their differential (2-phase) counterparts: a) Because of the additional DC power consumed in the coupling devices, the power consumption of a quadrature VCO is usually more than twice that of its differential version. b) In the quadrature implementation both tanks operate slightly off resonance, which results in higher phase noise and reduced tank impedance compared to the differential version.

In the cross-coupled topology, the coupling devices load the coupling node with additional parasitic capacitance, which further reduces the tuning range of cross-coupled QVCOs. To ensure 90° phase locking, the quadrature coupling transistors $M_C$ are one-half the size of the cross-coupled transistors $M_1$, which results in an additional 8 mA of current consumption. Thus the total current consumption was 24 mA from a 1.2 V supply. A die photo of the implemented quadrature VCO is shown in fig. 4(c). For testing, the QVCO directly drives a 0.3 mm on-die transmission line and a 50-ohm off-chip termination without an additional buffer.

### E. Experimental Results

Measured results of the QVCO are summarized in fig. 5. The VCO has a tuning range of 2 GHz. Measured single ended output power driving a 50-ohm load varies from -12 dBm to -14 dBm over the tuning range. A captured phase noise plot at 20.17 GHz is shown in fig. 6.



Fig. 5. QVCO performance summary (a) Tuning curve (b) Phase noise and output power as a function of frequency

[Phase noise plot: carrier power -12.38 dBm; marker 1 at 1.038 MHz offset reads -102.47 dBc/Hz; reference -50 dBc/Hz, 10 dB/div; frequency offset axis from 1 kHz to 10 MHz.]

Fig. 6. Measured phase noise of the QVCO at 20.17 GHz



Fig. 7 Time domain QVCO output at 20 GHz showing 0° and 90° phase

To verify the quadrature operation, the $0^{\circ}$ and $90^{\circ}$ outputs are captured on an oscilloscope, with any mismatch in the length of the measurement cables calibrated out (fig. 7). For comparison, key performance metrics for different VCO topologies are summarized in Table II. According to the ITRS 2003 [13], the figure-of-merit for VCOs is:

$$FoM = 10\log_{10} \left( \left( \frac{f_{osc}}{\Delta f} \right)^2 \frac{1}{L(\Delta f)P_{diss}(\text{mW})} \right) \tag{3}$$
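As a check on equation (3), the figure-of-merit can be evaluated directly in decibel form, where the phase noise $L(\Delta f)$ is entered in dBc/Hz and therefore appears as a subtraction. The sketch below is a straightforward transcription of the formula; the function name is hypothetical, and the inputs are the measured values reported for this work (20.17 GHz, -102.41 dBc/Hz at 1 MHz offset, 32 mW).

```python
import math


def vco_fom_db(f_osc_hz: float, offset_hz: float,
               phase_noise_dbc_hz: float, p_diss_mw: float) -> float:
    """Equation (3) in dB form:
    FoM = 20*log10(f_osc / df) - L(df)[dBc/Hz] - 10*log10(P_diss[mW]).
    """
    return (20.0 * math.log10(f_osc_hz / offset_hz)
            - phase_noise_dbc_hz
            - 10.0 * math.log10(p_diss_mw))


if __name__ == "__main__":
    # Values reported for the QVCO in this work.
    fom = vco_fom_db(20.17e9, 1e6, -102.41, 32.0)
    print(f"FoM = {fom:.2f} dB")  # compare with the 173.45 dB entry in Table II
```

The same function applied to the 10 GHz quadrature VCO of [11] (-95 dBc/Hz at 1 MHz, 14.4 mW) gives about 163.4 dB, matching its entry in Table II.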

Our earlier conclusions regarding Colpitts and cross-coupled VCOs are in good agreement with the measured results from [9]: a cross-coupled VCO can achieve a significant advantage over a Colpitts VCO for low-power applications. However, this advantage is significantly compromised when the buffer is included in the performance metric. In addition, as pointed out in the previous section, there is significant performance degradation in cross-coupled QVCOs compared to their differential counterparts [11,12]. Although the tank Q in this VCO is much lower than in the other VCOs listed in the table, this VCO topology still has a FoM better than the other QVCOs in CMOS. The differential 10 GHz Colpitts VCO designed in [9] consumes more power than the 20 GHz QVCO designed in this work, which demonstrates the low-power advantage of the proposed topology.

Table II. Comparison of state-of-the-art CMOS VCOs

|                               | [9] CSICS'06 | [9] CSICS'06  | [10] JSSC'07 | [11] JSSC'04  | [12] VLSI'05  | This work     |
|-------------------------------|--------------|---------------|--------------|---------------|---------------|---------------|
| Technology                    | 90-nm CMOS   | 90-nm CMOS    | 0.13-µm CMOS | 0.13-µm CMOS  | 90-nm SOI     | 0.13-µm CMOS  |
| Frequency                     | 10 GHz       | 10 GHz        | 26 GHz       | 10 GHz        | 40 GHz        | 20 GHz        |
| Topology                      | Colpitts     | Cross-coupled | $G_m$ tuned  | Cross-coupled | Cross-coupled | Cross-coupled |
| Diff./Quadrature              | Diff.        | Diff.         | Diff.        | Quad.         | Quad.         | Quad.         |
| Tuning range                  | 12.2 %       | 15.8 %        | 23.6 %       | 15 %          | 12.5 %        | 10.2 %        |
| Inductor Q / Transformer Q    | 10           | 10            | 18           | -             | 18            | 5             |
| Phase noise (dBc/Hz @ 1 MHz)  | -117.5       | -109.2        | -92.6        | -95           | -87 @ 3 MHz   | -102.41       |
| Power, VCO                    | 36 mW        | 7.5 mW        | 43.6 mW      | 14.4 mW       | -             | -             |
| Power, VCO + buffer           | -            | 17.5 mW       | 50 mW        | -             | 81 mW         | 32 mW         |
| FoM, VCO (dB)                 | 181.9        | 180.4         | 163.9        | 163.4         | -             | -             |
| FoM, VCO + buffer (dB)        | -            | 176.5         | 163.3        | -             | 150.4         | 173.45        |

# III. DE-SKEW TECHNIQUES WITH JITTER FILTERING

Injection locking was introduced in [14] as an effective method to filter out jitter and duty cycle distortion from a high-frequency reference clock. More recently, in [6], an ILO is used as a local clock generator, which provides several advantages: (i) Owing to its high sensitivity, an ILO can operate with a very small input amplitude, so the reference clock can be distributed with low power, which translates into large power savings. The ratio of the input clock amplitude to the VCO output amplitude is known as the injection strength. (ii) Since an ILO behaves as a first-order PLL, it rejects high-frequency jitter and is less susceptible to power supply noise. (iii) The clock can be deskewed by detuning the free-running frequency of the ILO. For small injection strengths, the deskew range is smaller than 360° [7]. With large injection strength, it is possible to extend the deskew range, but this requires a wide tuning range in the ILO. Furthermore, providing skews near ±180° results in considerable variation in the jitter tracking bandwidth and output clock amplitude [7]. To address these issues, we propose a deskew technique utilizing the QVCO, as shown in fig. 8(a). The proposed scheme allows us to selectively inject into either of the differential VCOs in the QVCO. The measured skew versus control voltage is shown in fig. 8(b). Two deskew curves (AB and CD) are shown, resulting from injection into the *I-VCO* and *Q-VCO* respectively. Since the *I-VCO* and *Q-VCO* oscillate in quadrature, they maintain a 90° phase difference between each other.

In [6,7] a single differential VCO is used as an ILO, which has a deskew curve very similar to each of those in fig. 8(b). To obtain the 0-360° phase selection capability, the full length of the curve is utilized. Notice the nonlinear compression observed close to the edges of the lock range. Fig. 9(a) shows the captured deskewed clocks for this portion of the curve. Variation in output amplitude is observed, and the clock phases are nonlinearly spaced.



Fig. 8 (a) Proposed deskew technique (b) Corresponding measured deskew curves at 20 GHz for I-VCO and Q-VCO injection



Fig. 9 (a) Measured deskewed clock using I injection only (b) Measured deskew using proposed technique

In the proposed technique, only the linear portions of the deskew curves are used. Hence, the ILO can provide linear control of the phase shift, relatively constant output amplitude and relatively little variation of the jitter transfer bandwidth. The forwarded clock is injected into the in-phase VCO to achieve 0-180° phase shift only. For $180^{\circ}$-$360^{\circ}$, we shift the injection to the *Q-VCO* and use the linear portion of its deskew curve. Thus the proposed technique allows us to accomplish 0-$360^{\circ}$ phase selection with linear phase steps and negligible amplitude variation, as shown in fig. 9(b). In this experiment the forwarded clock amplitude was 20 mV and the deskewed differential peak-to-peak clock output was 200 mV, for an injection strength of 0.1. Due to the additional VCO in quadrature, this technique consumes more power than [6] and [7]. However, this 20 GHz deskew scheme can be implemented using a total of only 35 mW, which still compares favorably with using a complete DLL for deskewing, as in for example [14], which requires many buffers to delay the clock and perform phase selection.
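The selection rule described above can be summarized in a few lines. The following is only an illustrative sketch: the function names are hypothetical, the handling of the boundary at exactly 180° is an assumption, and the mapping from the requested phase to an actual control voltage (which depends on the measured deskew curves in fig. 8(b)) is deliberately left out.

```python
def injection_strength(v_inj_mv: float, v_out_mv: float) -> float:
    """Ratio of injected clock amplitude to VCO output amplitude."""
    return v_inj_mv / v_out_mv


def select_injection(target_phase_deg: float) -> str:
    """Pick which half of the QVCO receives the forwarded clock.

    Per the text: 0-180 degrees uses the I-VCO, 180-360 degrees uses the
    Q-VCO. Assigning exactly 180 degrees to the Q-VCO is an assumption.
    """
    phase = target_phase_deg % 360.0
    return "I-VCO" if phase < 180.0 else "Q-VCO"


if __name__ == "__main__":
    # Amplitudes reported in the experiment: 20 mV in, 200 mV out.
    print("injection strength:", injection_strength(20.0, 200.0))
    for phase in (45.0, 135.0, 225.0, 315.0):
        print(f"{phase:5.1f} deg -> inject {select_injection(phase)}")
```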

# IV. CONCLUSION

In this work we have introduced a novel VCO topology capable of driving large capacitive loads without a buffer and with lower power than Colpitts VCOs. Using this topology, a QVCO is designed with a FoM comparable to state-of-art solutions in spite of a much lower tank Q. Its inherent buffering makes it useful for clock generation and distribution to the large capacitive loads in high-speed I/Os. It can also be used as an ILO to deskew a forwarded clock. It provides more linear skew control with less variation in output amplitude than previous solutions.

#### ACKNOWLEDGMENTS

The authors would like to acknowledge F. O'Mahony, M. Mansuri and B. Casper of CRL (Intel Circuit Research Lab at Hillsboro, OR) for their contributions to the clock deskew technique presented in this work. This work is supported by Intel, and fabrication facilities were provided by Gennum Corporation.

### References

- [1] H. Takauchi et al., "A CMOS Multichannel 10-Gb/s Transceiver," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2094–2100, Dec. 2003.
- [2] B. Casper et al., "A 20 Gb/s forwarded clock transceiver in 90-nm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2006, pp. 90–91.
- [3] R. Kreienkamp et al., "A 10-Gb/s CMOS Clock and Data Recovery Circuit With an Analog Phase Interpolator," *IEEE J. Solid-State Circuits*, vol. 40, no. 3, pp. 736–743, Mar. 2005.
- [4] C. Kromer et al., "A 25-Gb/s CDR in 90-nm CMOS for High-Density Interconnects," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2921–2929, Dec. 2006.
- [5] F. O'Mahony et al., "A low-jitter PLL and repeaterless network for a 20Gb/s link," in *IEEE Symp. on VLSI Circuits Dig. of Tech. Papers*, 2006.
- [6] L. Zhang, B. Ciftcioglu, M. Huang, H. Wu, "Injection-locked clocking: a new GHz clock distribution scheme," *Custom Integrated Circuits Conference*, San Jose, California, Sept. 2006.
- [7] F. O'Mahony et al., "A 27Gb/s Forwarded Clock I/O Receiver using an Injection-Locked LC-DCO in 45nm CMOS," *IEEE International Solid-State Circuits Conference*, Feb. 2008.
- [8] N. Nguyen, R.G. Meyer, "Start up and frequency stability in high-frequency oscillators," *IEEE J. Solid-State Circuits*, vol. 27, no. 5, pp. 810–820, May 1992.
- [9] K.W. Tang et al., "Frequency Scaling and Topology Comparison of mmwave CMOS VCOs," *IEEE CSICS*, pp. 55–58, Nov. 2006.
- [10] K. Kwok, J. Long, "A 23-to-29 GHz Transconductor-Tuned VCO MMIC in 0.13μm CMOS," *IEEE J. Solid-State Circuits*, vol. 42, no. 12, pp. 2878–3997, Dec. 2007.
- [11] S. Li, I. Kipnis, M. Ismail, "A 10-GHz CMOS Quadrature LC-VCO for Multicore Optical Applications," *IEEE J. Solid-State Circuits*, vol. 38, no. 10, pp. 1626–1634, Oct. 2003.
- [12] F. Ellinger and H. Jäckel, "38-43 GHz Quadrature VCO on 90nm VLSI CMOS with Feedback Frequency Tuning," *VLSI Circuits Symposium*, Kyoto, Japan, June 2005.
- [13] International technology roadmap of semiconductor. www.itrs.org, 2003.
- [14] H. Ng et al., "A Second-Order Semidigital Recovery Circuit Based on Injection Locking," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2101–2110, Dec. 2003.