## Generation of Accurate On-Chip Transconductances Using a Monolithic CMOS PLL With Hybrid Analog and Digital Control


by

## **Angus McLaren**



A thesis submitted in conformity with the requirements for the degree of Master of Applied Science

Department of Electrical and Computer Engineering

University of Toronto

© Copyright Angus McLaren, 2000



#### National Library of Canada

Acquisitions and Bibliographic Services

395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a nonexclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

0-612-50388-7


## Generation of Accurate On-Chip Transconductances Using a Monolithic CMOS PLL With Hybrid Analog and Digital Control

Angus McLaren

Department of Electrical and Computer Engineering, University of Toronto

Degree of Master of Applied Science, 2000

## Abstract

A method for generating accurately-known on-chip transistor transconductances over process, power-supply, and temperature variations is presented. The technique uses an adjustable constant- $g_m$  bias circuit, which is tuned with a fully-integrated CMOS PLL locked to an external frequency reference normally present to produce a system clock. The PLL uses a charge-pump structure with three control-loops (two digital and one analog) having overlapping ranges with hysteresis to minimize tuning glitches in steady-state. The PLL has a lock-range of 135 MHz to 300 MHz, and displays an RMS jitter of 15.6 ps. The transconductances generated from the circuit display a 2.2% variation for a 60°C change in temperature, and a 1.3% variation for a 10% variation in power-supply voltage. The design has been fabricated in a 0.35  $\mu$ m CMOS process, using an active area of 1200 x 1200  $\mu$ m<sup>2</sup>, and drawing 5.8 mA from a 3.3 V supply.

## **Acknowledgments**

I would like to thank my supervisor, Ken Martin, whose ideas this work was based on. I am deeply indebted to him for the opportunities he gave me, and the invaluable knowledge he has passed on to me.

Thanks to the EA104 posse for providing a lively (sometimes exceedingly) work environment, indispensable technical and emotional support, and, of course, friendship. Afshin, Amir, Anas, Anthony, Augustine, Bahram, Cameron, Dave, Dickson (all versions), Fabio the Incredible, James, John, Kamran, Professor Phang, Mark (both Irish and not), Raj, Saman, Sebastian, Shahriar, Steve, Takis, Vasilis: may all your DRC's run clean!

I am very grateful to NSERC and the University of Toronto for their financial support, and to Micronet for their fabrication services.

Thanks also to Professor Salama for giving me permission to use his test equipment, and Jaro Pristupa for maintaining the CAD tools, as well as tracking down test equipment for me.

I would also like to express my gratitude to my father, who has kept up interest in my work over the past two years, despite its lying a great distance outside his field of expertise.

Finally, and most importantly, I would like to thank my fiance Maia, without whose support and understanding I could not have completed this work.


# Table of Contents

- **1. Introduction**
  - 1.1 Generation of Accurate Transistor Transconductance
    - 1.1.1 Motivation and Design Implications
    - 1.1.2 Constant-Transconductance Bias Circuit
    - 1.1.3 Constant-g<sub>m</sub> Bias Circuit With PLL-Tuned Resistor
  - 1.2 Intended Application: Wireless Data Communications Receiver
  - 1.3 Outline of Thesis
  - 1.4 References
- **2. Phase-Locked Loops: A Background**
  - 2.1 Basic Loop Structure
    - 2.1.1 Block Diagram
    - 2.1.2 Classes of PLL Systems
    - 2.1.3 Linearized Small-Signal Model
    - 2.1.4 Lock Metrics
      - 2.1.4.1 Lock-in Range
      - 2.1.4.2 Hold Range
      - 2.1.4.3 Pull-in Range
      - 2.1.4.4 Pull-out Range
  - 2.2 Block Realizations
    - 2.2.1 Phase-Frequency Detector With Charge-Pump
      - 2.2.1.1 Phase-Frequency Detector
      - 2.2.1.2 Charge Pump for PFD
    - 2.2.2 Loop Filter
    - 2.2.3 Voltage-Controlled Oscillator
      - 2.2.3.1 LC Oscillators
      - 2.2.3.2 RC Multivibrators
      - 2.2.3.3 Ring Oscillators
    - 2.2.4 Frequency Dividers
    - 2.2.5 Loop Equations for Charge-Pump PLL With Lead-Lag Loop Filter
  - 2.3 Design Issues
    - 2.3.1 Loop Bandwidth
    - 2.3.2 Quality Factor and Natural Frequency
    - 2.3.3 PFD
      - 2.3.3.1 Lock Behaviour
      - 2.3.3.2 Logic Family
    - 2.3.4 Charge-Pump
      - 2.3.4.1 Up/Down Symmetry
      - 2.3.4.2 Variation of Up/Down Currents
    - 2.3.5 VCO
      - 2.3.5.1 Jitter
      - 2.3.5.2 Frequency-Control Voltage Characteristic
    - 2.3.6 Integration of Loop Filter
    - 2.3.7 Loop Gain
  - 2.4 References
- **3. Block-Level Modeling and Design**
  - 3.1 PLL Simulation Problems
  - 3.2 System Block-Level Description
    - 3.2.1 System Description
    - 3.2.2 System Advantages
  - 3.3 Blocks
    - 3.3.1 VCO
    - 3.3.2 Phase Detector
    - 3.3.3 Up/Down Control
  - 3.4 SIMULINK Model
  - 3.5 SIMULINK Simulation Results
    - 3.5.1 System With No Delay
    - 3.5.2 System With Delay
  - 3.6 References
- **4. Circuit Design**
  - 4.1 Overall System
  - 4.2 Adjustable Constant-Transconductance Bias Circuit
    - 4.2.1 Bias Circuit Without Variable Resistor
    - 4.2.2 Voltage-Controlled Resistor
  - 4.3 Transconductance-Controlled Oscillator
    - 4.3.1 Oscillator Design
    - 4.3.2 Oscillator Buffer Design
  - 4.4 Phase-Frequency Detector
  - 4.5 Charge-Pump
    - 4.5.1 Charge-Pump Design
    - 4.5.2 Loop-Filter Design and Self-Biasing
  - 4.6 Comparators
    - 4.6.1 Comparator 1 Design
    - 4.6.2 Comparator 2 Design
  - 4.7 Up/Down Control Logic
  - 4.8 Layout Considerations
  - 4.9 References
- **5. Measurements for Fabricated VCO and PLL**
  - 5.1 VCO
    - 5.1.1 Test Setup
    - 5.1.2 Results
  - 5.2 Entire PLL
    - 5.2.1 Test Setup
    - 5.2.2 Results
- **6. Conclusions**
  - 6.1 Discussion
  - 6.2 Suggestions for Future Work
  - 6.3 References
- **Appendix A: VHDL Code for Up/Down Logic**
- **Appendix B: Derivation of Delay Through Differential Inverter**

# List of Tables

## **Chapter 2: Phase-Locked Loops: A Background**

- Table 2.1: State Assignment for PFD
- Table 2.2: Formulae for PFD and Passive Lead-Lag Filter
- Table 2.3: Parameter Values Used for Root Locus Plot

## **Chapter 4: Circuit Design for Hybrid Analog/Digital PLL**

- Table 4.1: Opamp Characteristics
- Table 4.2: Characteristics of Bias Loop

## **Chapter 5: Measurements for Fabricated VCO and PLL**

- Table 5.1: Summary of PLL and Bias System Performance

# List of Figures

## **Chapter 1: Introduction**

- Figure 1.1: Simplified Schematic of Constant-Transconductance Bias Circuit
- Figure 1.2: Schematic for Proposed System
- Figure 1.3: Simplified Schematic for PLL
- Figure 1.4: System Schematic for Low-Cost Wireless Receiver

## **Chapter 2: Phase-Locked Loops: A Background**

- Figure 2.1: Block Diagram for General PLL
- Figure 2.2: Simplified Charge-Pump Circuit
- Figure 2.3: Block Diagram for Linearized PLL
- Figure 2.4: Schematic for PFD
- Figure 2.5: Signal-Flow Graph for PFD
- Figure 2.6: PFD Characteristic: a) Ideal (Upper) b) Non-Zero Delay in Logic (Lower)
- Figure 2.7: Lead-Lag Loop Filter
- Figure 2.8: Schematic for Lead-Lag Loop Filter for Charge-Pump PLL (Two Realizations)
- Figure 2.9: Schematic for Current-Starved Delay Element
- Figure 2.10: Circuit Schematic for Differential Inverter
- Figure 2.11: Root Locus Plot for Third-Order System
- Figure 2.12: Behaviour of Root Loci for Varying Values of C<sub>1</sub>/C<sub>2</sub>
- Figure 2.13: Step Response for PLL
- Figure 2.14: Step Response of PLL at Input to VCO
- Figure 2.15: Pole-Zero Plot for Filter 1
- Figure 2.16: Bode Plot for PLL With Filter 1
- Figure 2.17: Bode Plot of PLL at Input to VCO

## Chapter 3: System-Level Design for Hybrid Analog/Digital PLL

- Figure 3.1: Simplified Schematic of System
- Figure 3.2: Frequency Plan for VCO
- Figure 3.3: SIMULINK Schematic for PFD
- Figure 3.4: Ideal PFD Signals for VCO in Phase With Reference Signal
- Figure 3.5: PFD Signals with Delay in NAND Gate, VCO In Phase With Reference
- Figure 3.6: Ideal Response of PFD for $f_{VCO} > f_{Ref}$
- Figure 3.7: PFD Signals with 0.5 ns NAND Delay, $f_{VCO} > f_{Ref}$
- Figure 3.8: Loop Filter Used for SIMULINK Simulations
- Figure 3.9: Step Responses for Various Values of C<sub>2</sub>
- Figure 3.10: Bode Plots for PLL System Design
- Figure 3.11: Block Diagram of Up/Down Control Block
- Figure 3.12: Block Diagram of Thermblock (Same as Binblock)
- Figure 3.13: Signals for Up/Down Control Block
- Figure 3.14: Complete SIMULINK System
- Figure 3.15: System Response to 180 Degree Phase Error
- Figure 3.16: System Response for 0.5 MHz Frequency Discrepancy
- Figure 3.17: System Response for Frequency Discrepancy of 5 MHz
- Figure 3.18: System Response for 75 MHz Frequency Discrepancy
- Figure 3.19: System Response for 75 MHz Frequency Discrepancy and 75 ns Loop Delay
- Figure 3.20: Response of Altered System to 75 MHz Frequency Discrepancy and 75 ns Delay

## Chapter 4: Circuit Design for Hybrid Analog/Digital PLL

- Figure 4.1: Simplified System Schematic
- Figure 4.2: Full System Block-Diagram
- Figure 4.3: Circuit Schematic for Adjustable Constant-Transconductance Bias Circuit
- Figure 4.4: Measurement of Bias Circuit Loop Gain
- Figure 4.5: Frequency Response of Bias Loop
- Figure 4.6: Voltage-Controlled Resistor
- Figure 4.7: Response of Final System to 275 MHz Input
- Figure 4.8: Response of Final System to 325 MHz Input
- Figure 4.9: Variable-Resistor Conductance and g<sub>m4</sub> Versus V<sub>a</sub>
- Figure 4.10: G<sub>m</sub>CO Schematic
- Figure 4.11: Ring Oscillator Signals at Outputs of Various Stages
- Figure 4.12: Circuit Schematic for VCO Inverter
- Figure 4.13: Frequency Response of VCO Inverter
- Figure 4.14: Simulated Transconductance of mhigh Versus VCO Frequency
- Figure 4.15: VCO Frequency Versus Binary Control Signal (Analog and Thermometer Control Signals Fixed at Middle Values)
- Figure 4.16: Circuit Schematic for VCO Buffer
- Figure 4.17: VCO Buffer Signals
- Figure 4.18: Block Diagram for PFD
- Figure 4.19: Mean Value of (Up-Down) for Various Values of Gate Delay $\Delta R$
- Figure 4.20: Circuit Schematic for PFD Flip-Flop
- Figure 4.21: PFD Response for $f_{VCO} > f_{Ref}$
- Figure 4.22: Response of PFD for Maximum Phase Error at 333.3 MHz
- Figure 4.23: Mean Value of Up-Down Versus Phase Error Between Inputs at 333 MHz
- Figure 4.24: Initial Circuit Schematic for Charge-Pump
- Figure 4.25: Circuit Schematic for Revised Charge-Pump Design
- Figure 4.26: Charge/Discharge Currents and Output Voltage of Charge-Pump for Equal Input Frequencies
- Figure 4.27: Simulated I<sub>ch</sub>, R, and K<sub>a</sub> Versus VCO Frequency
- Figure 4.28: Natural Frequency and Q Versus Frequency
- Figure 4.29: Circuit Schematic for Comparator 1
- Figure 4.30: Output of Comparator 1 for 1 MHz Sinusoidal Input, 5 MHz Clock Rate
- Figure 4.31: Circuit Schematic for Comparator 2
- Figure 4.32: Output of Comparator 2 for 1 MHz Sinusoidal Input, 5 MHz Clock Rate
- Figure 4.33: Schematic for Guard Rings
- Figure 4.34: Open-Loop Response of PLL Without VCO
- Figure 4.35: Response of VCO Control Voltage in Analog PLL to 290.7 MHz Input
- Figure 4.36: Final Layout for PLL

## **Chapter 5: Measurements for Fabricated VCO and PLL**

- Figure 5.1: Block Diagram of VCO Circuit
- Figure 5.2: Circuit Schematic for Ring Oscillator Inverter
- Figure 5.3: Circuit Schematic for VCO Output Buffer
- Figure 5.4: PCB Layout for VCO Chip
- Figure 5.5: VCO Frequency as Analog and Digital States are Swept Highest to Lowest
- Figure 5.6: VCO Frequency Versus Digital State ($V_a = 0.8$ V)
- Figure 5.7: VCO Frequency Versus Analog Control Voltage for Four Different Digital States
- Figure 5.8: Analog VCO Constant Versus VCO Frequency
- Figure 5.9: Spectrum of VCO Output at 200 MHz
- Figure 5.10: Spectrum of VCO Output at 200 MHz (Zoomed In)
- Figure 5.11: Block Diagram for Complete PLL Chip
- Figure 5.12: Circuit Schematic for Output Driver
- Figure 5.13: Circuit Schematic for DAC 1
- Figure 5.14: Circuit Schematic for DAC 2
- Figure 5.15: Circuit Schematic for DAC Switches
- Figure 5.16: PCB Layout for PLL Chip
- Figure 5.17: Output of Test Comparator in Response to Sinusoidal Input
- Figure 5.18: Track Output for an Input Frequency of 650 MHz
- Figure 5.19: Input (Upper Waveform) and Output (Lower Waveform) for 135 MHz Input
- Figure 5.20: Input (Upper Waveform) and Output (Lower Waveform) for 250 MHz Input
- Figure 5.21: Input (Upper Waveform) and Output (Lower Waveform) for 300 MHz Input
- Figure 5.22: Spectrum of PLL Output for 202.2 MHz Input
- Figure 5.23: Spectrum of PLL Output for Input Frequency of 202.2 MHz (Zoomed Out)
- Figure 5.24: Spectrum of PLL for Input Frequency of 115 MHz (Zoomed Out)
- Figure 5.25: Jitter Histogram of PLL Output at 290 MHz
- Figure 5.26: Setup For Measuring Transistor Transconductance
- Figure 5.27: Variation of g<sub>m</sub> of mhigh With Input Frequency
- Figure 5.28: Variation of g<sub>m</sub> of mhigh Over Process for 5 Chips at 250 MHz
- Figure 5.29: Variation of g<sub>m</sub> of mhigh With Power Supply Voltage at 250 MHz
- Figure 5.30: Transistor g<sub>m</sub> Versus Temperature at a VCO Frequency of 250 MHz
- Figure 5.31: Output of DAC 2 Versus Time for Input Signal of 200 MHz
- Figure 5.32: VCO Frequency Characteristic
- Figure 5.33: VCO Characteristic for Fixed Value of V<sub>a</sub>
- Figure 5.34: VCO Frequency Versus V<sub>a</sub> for Various Digital States

## **Appendix B: Derivation of Inverter Delay**

- Figure B.1: Circuit Schematic For Analyzed Inverter Structure
- Figure B.2: Simplified Circuit Schematic for Falling Output Signal

#### **Chapter 1: Introduction**



# Introduction

This chapter presents the motivation behind this design, as well as a description of the target application.

## **1.1 Generation of Accurate Transistor Transconductance**

This section explains why accurate on-chip transistor transconductances are required in general circuit design, and what problems this requirement imposes on the integrated-circuit designer. Next, a possible solution called the constant-transconductance bias circuit [Johns, 1997] is described, along with its problems. Finally, the proposed method of generating accurate transconductances is described.

### **1.1.1 Motivation and Design Implications**

Time-constants are vital quantities in almost every area of circuit design. They determine rise and fall times in digital circuits, and bandwidth in analog circuits. In general, however, these quantities cannot be tightly controlled by the designer due to process variations in the technology used to manufacture the integrated circuits. As a result, important aspects of system performance can vary with the process variations.

Most important time-constants are proportional to $C/g_m$, where $C$ is some on-chip capacitance (intentional or parasitic), and $g_m$ is the transconductance of a transistor. In most technologies, on-chip capacitors can be implemented with relatively high accuracy ($3\sigma$ variation of about 10%). Thus, the most important step towards generating well-known accurate time-constants is to ensure that transistor transconductances remain accurately known.
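As a rough illustration of this argument, the sketch below evaluates $\tau = C/g_m$ at the worst-case corners of the capacitance and transconductance spreads. The 10% capacitor spread is taken from the text; the 30% and 2% transconductance spreads and the nominal component values are assumptions chosen only for illustration.

```python
def time_constant(c_farads, gm_siemens):
    """Time-constant of a gm-C stage: tau = C / gm."""
    return c_farads / gm_siemens


def tau_spread(c_tol, gm_tol, c_nom=1e-12, gm_nom=1e-3):
    """Worst-case (min, max) of tau, normalized to its nominal value.

    c_nom and gm_nom are assumed nominal values (1 pF, 1 mA/V); the
    normalized result does not depend on them.
    """
    tau_nom = time_constant(c_nom, gm_nom)
    tau_min = time_constant(c_nom * (1 - c_tol), gm_nom * (1 + gm_tol))
    tau_max = time_constant(c_nom * (1 + c_tol), gm_nom * (1 - gm_tol))
    return tau_min / tau_nom, tau_max / tau_nom


# 10% capacitor spread (from the text); 30% and 2% gm spreads are assumptions.
uncontrolled = tau_spread(c_tol=0.10, gm_tol=0.30)
controlled = tau_spread(c_tol=0.10, gm_tol=0.02)
print("gm spread 30%%: tau in [%.2f, %.2f] x nominal" % uncontrolled)
print("gm spread  2%%: tau in [%.2f, %.2f] x nominal" % controlled)
```

Under these assumed numbers, the spread of the time-constant is dominated by the transconductance term until that term is brought down to the level of the capacitor spread.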

### 1.1.2 Constant-Transconductance Bias Circuit

The current design attempts to produce well-known transistor transconductances by building on a circuit called the constant-transconductance bias circuit [Johns, 1997], which in turn was based on a concept found in [Steininger, 1990]. The main idea behind the constant-transconductance bias circuit is outlined in Figure 1.1. In this figure, the subcircuits in the on-chip system are all biased by one or more voltages from the constant- $g_m$  bias circuit (which is also on-chip).



Figure 1.1: Simplified Schematic of Constant-Transconductance Bias Circuit

The constant-$g_m$ bias circuit is designed so that the transconductances of the transistors it biases are all proportional to the conductance of $R_{bias}$. If $R_{bias}$ is a high-quality surface-mount resistor, then the transconductance of all transistors in the on-chip system will be accurately controlled over process and temperature variations. A problem with this scheme is that the presence of an off-chip resistor $R_{bias}$ prevents the design in question from being fully integrated. While the resistor could be placed on-chip, this would defeat the purpose of the circuit, as the process variations on $R_{bias}$ would cause 30% variations in the transconductances realized. Also, it has been found that, in practice, a part of $R_{bias}$ must be placed on-chip to prevent oscillations in the bias circuit [Cheng, 1998]. This compromises the accuracy of $R_{bias}$, and hence the accuracy of the transconductances, unless the off-chip resistor is tuned during testing.
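The dependence of the transconductance on $R_{bias}$ can be sketched numerically. The expression used below is the textbook square-law result for the constant-transconductance bias circuit, $g_m = 2(1 - 1/\sqrt{K})/R_{bias}$, where $K$ is the W/L ratio of the two core transistors; it is assumed here for illustration and is not necessarily the exact expression for the circuit designed in this thesis. The nominal resistor value is also an assumption.

```python
import math


def constant_gm(r_bias_ohms, size_ratio=4.0):
    """Textbook square-law result: gm = 2 * (1 - 1/sqrt(K)) / R_bias.

    K (size_ratio) is the W/L ratio between the two core transistors;
    K = 4 gives gm = 1 / R_bias.
    """
    if size_ratio <= 1.0:
        raise ValueError("size_ratio must exceed 1 for a non-zero gm")
    return 2.0 * (1.0 - 1.0 / math.sqrt(size_ratio)) / r_bias_ohms


R_NOMINAL = 1000.0  # assumed nominal bias resistor, in ohms

# Sweep R_bias over the +/-30% on-chip spread quoted in the text.
for spread in (-0.30, 0.0, +0.30):
    r = R_NOMINAL * (1.0 + spread)
    print("R_bias = %6.0f ohm -> gm = %.3f mA/V" % (r, 1e3 * constant_gm(r)))
```

Since the transconductance depends only on the resistor and a geometric ratio in this model, any spread in the resistor value translates directly into a spread in the transconductance.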

### 1.1.3 Constant-gm Bias Circuit With PLL-Tuned Resistor


The main disadvantage of the constant-$g_m$ bias circuit is that the bias resistor $R_{bias}$ must be placed off-chip in order for its value to be sufficiently accurate. To get around this problem, we propose to *tune* the bias resistor $R_{bias}$ so that its value is always well-determined. This can be accomplished with the system shown in Figure 1.2.



Figure 1.2: Simplified Schematic for Proposed System

In order to tune the bias resistor, an on-chip PLL is added, which locks to an off-chip crystal oscillator that would be present anyway in most systems to produce a system clock. Hence, the circuit requires no off-chip components beyond those already required in most systems. To ensure that the bias resistor $R_{bias}$ in Figure 1.2 is always tuned to the same value over process and temperature, the PLL uses the structure shown in Figure 1.3. As can be seen, the VCO for the PLL consists of the constant-$g_m$ bias circuit with a tunable resistor, which controls the frequency of oscillation of a $g_m$-controlled oscillator. In order for the PLL to lock to the input signal from the crystal oscillator, it must adjust the frequency of the VCO to be the same as that of the input. To do this, however, the transconductance provided by the bias circuit must be equal to a specific well-determined value. Since the transconductance provided by the bias circuit is inversely proportional to the bias resistor value, the final value of $R_{bias}$ will be well-determined over process and temperature as well.

Figure 1.3: Simplified Schematic for PLL
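The tuning argument above can be illustrated with a small numerical sketch. The oscillator law $f = g_m/(k C)$ with $g_m = 1/R_{bias}$, the constant $k$, the load capacitance, and the update rule standing in for the PLL feedback are all hypothetical; the 250 MHz figure is simply assumed to be the oscillator frequency at lock. The point is only that resistors starting from different process-skewed values end at the same tuned value once the frequencies match.

```python
def oscillator_frequency(r_bias, c_load, k_osc=4.0):
    """Hypothetical gm-controlled oscillator: f = gm / (k_osc * C), gm = 1 / R_bias."""
    gm = 1.0 / r_bias
    return gm / (k_osc * c_load)


def tune_resistor(r_start, f_ref, c_load, n_iter=60):
    """Adjust R_bias until the oscillator frequency matches f_ref.

    The damped multiplicative update stands in for the PLL feedback; it is
    not a model of the actual loop dynamics.
    """
    r = r_start
    for _ in range(n_iter):
        f_osc = oscillator_frequency(r, c_load)
        r *= (f_osc / f_ref) ** 0.5  # oscillator too slow -> lower R -> raise gm
    return r


F_REF = 250e6    # assumed oscillator frequency at lock, in Hz
C_LOAD = 1e-12   # assumed load capacitance, in farads

# Start from -30%, nominal, and +30% resistor values (assumed process corners).
tuned = [tune_resistor(r0, F_REF, C_LOAD) for r0 in (700.0, 1000.0, 1300.0)]
print(["%.1f ohm" % r for r in tuned])
```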





To reduce noise on the bias voltages produced by the bias circuit in Figure 1.2, the PLL uses both digital and analog signals to tune the bias circuit of Figure 1.3. The digital signals are used to change R in medium and large-sized steps, while the analog voltage is used only for fine-tuning. Thus, the resistance R is tuned to approximately the correct value by the digital signals alone, so that in steady-state only the analog control-voltage is active (unless the temperature changes substantially, which usually occurs only slowly). The noise due to ripple on the analog control voltage can be removed by using a second adjustable bias circuit controlled only by the digital signals from the PLL. Assuming the temperature varies slowly, changes to the digital tuning signals will seldom be required (since the crystal frequency remains virtually constant over time). Thus, the system is biased by constant voltages with very little noise on them, as desired.
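A minimal behavioural sketch of this coarse/fine arrangement is given below. The tuning law, the thresholds on the analog control, the step sizes, and the loop gain are all hypothetical values chosen for illustration, and a single digital code stands in for the digital controls; none of them are taken from the actual design. The analog range is made larger than one digital step, so that after a digital update the analog control settles well away from the threshold that triggered the update.

```python
STEP = 1.0                 # conductance change per digital code step (arbitrary units)
ANALOG_SPAN = 2.0          # analog tuning range; larger than STEP so the ranges overlap
V_LOW, V_HIGH = 0.1, 0.9   # assumed thresholds on the normalized analog control
CODE_MAX = 15              # assumed size of the digital tuning range


def conductance(code, v_a):
    """Hypothetical tuning law: digital coarse steps plus analog fine-tuning."""
    return code * STEP + v_a * ANALOG_SPAN


def run_loop(target, code, v_a, n_iter=200, gain=0.5):
    """Iterate the hybrid loop; return (code, v_a, number of digital updates)."""
    updates = 0
    for _ in range(n_iter):
        error = target - conductance(code, v_a)
        # Analog fine-tuning, clamped to its normalized range [0, 1].
        v_a = min(1.0, max(0.0, v_a + gain * error / ANALOG_SPAN))
        if v_a > V_HIGH and code < CODE_MAX:    # analog control near top of range
            code, updates = code + 1, updates + 1
        elif v_a < V_LOW and code > 0:          # analog control near bottom of range
            code, updates = code - 1, updates + 1
    return code, v_a, updates


# Acquisition from a cold start, followed by a small drift of the target.
code, v_a, n_acquire = run_loop(target=5.3, code=0, v_a=0.5)
code2, v_a2, n_drift = run_loop(target=5.6, code=code, v_a=v_a)
print(code, round(v_a, 3), n_acquire)
print(code2, round(v_a2, 3), n_drift)
```

With these assumed numbers, the digital code changes only during acquisition; the subsequent small drift of the target is absorbed by the analog control alone, which is the steady-state behaviour described above.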

## **1.2 Intended Application: Wireless Data Communications Receiver**

The intended application for this circuit is to produce the in-phase and quadrature-phase 250-MHz IF carriers for a low-cost wireless receiver for the ISM band, as shown in Figure 1.4. The signals of interest are highlighted in Figure 1.4 by the dotted rectangle. In this application, the biasing scheme chosen has the added benefit that the PLL used to tune the bias resistor can also be used to synthesize the IF carriers. Also, the absence of any off-chip components (aside from the crystal oscillator) helps to make the system less expensive. It should also be noted that quadrature carriers need to be generated by the PLL.



Figure 1.4: System Schematic for Low-Cost Wireless Receiver

## **1.3 Outline of Thesis**

Chapter 2 contains a background on PLL circuitry and architectures, and introduces the basic design formulae used in designing PLL's. Chapter 3 describes the system-level modelling that was done for this design using SIMULINK, including descriptions of the models, as well as simulation results. Chapter 4 describes the transistor-level circuit design of the various blocks in the system. Chapter 5 describes the testing of the manufactured chips, including test setups and results. Finally, Chapter 6 presents the conclusions for the project.

## **1.4 References**

D. Johns, K. Martin, *Analog Integrated Circuit Design*. Toronto: John Wiley & Sons, 1997.

J. M. Steininger, "Understanding Wide-Band MOS Transistors," *IEEE Circuits and Devices*, Vol. 6, No. 3, pp. 26-31, May 1990.

J. S. Cheng, "Adaptive Equalization System for Data Transmission over Coaxial Cables," University of Toronto Master's Thesis, 1998.



# Phase-Locked Loops: A Background

This chapter is devoted to reviewing the basic theory behind phase-locked loops. Most of this theory is well understood, and is covered extensively in several textbooks [Gardner, 1979][Wolaver, 1991][Best, 1997][Johns, 1997][Encinas, 1993]. The discussion begins with a general description of charge-pump phase-locked loops (PLL's), followed by an overview of the various circuits often used to implement the blocks in a charge-pump PLL. Finally, the important design trade-offs that exist in a charge-pump PLL system are described.

## **2.1 Basic Loop Structure**

The architecture for a charge-pump PLL will now be described, and the equations governing its operation will be derived.

#### 2.1.1 Block Diagram

The block diagram for a general PLL is given in Figure 2.1. The phase-detector (PD) senses the phase-difference between the reference and the voltage-controlled oscillator (VCO) signal, and outputs a voltage whose average is proportional to this phase difference. The low-pass loop-filter extracts this average from the PD output. This average is used to control the VCO, whose output frequency is proportional to its input voltage.

When the loop is in lock, the PD outputs a signal with a nearly constant average, so that the VCO frequency remains constant. If the reference frequency rises in value, the PD will sense this, causing an increase in the loop-filter output-voltage. This in turn increases the VCO frequency, so that the VCO catches up to the input.

The M and N blocks divide the frequency of their input by M and N, respectively. The N block allows the circuit to tune the VCO output to  $f_{in}$ ,  $2f_{in}$ ,  $3f_{in}$ , etc., where  $f_{in}$  is the frequency of the reference input (after the M block). This is useful in applications where the PLL must lock to several different carriers separated by a fixed channel width. The M block allows a higher frequency crystal oscillator to be used, which can be useful if the required reference frequency is very low (say, below 1 MHz, making a crystal oscillator impractical due to size limitations).
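The divider relationship can be illustrated with a minimal sketch. It assumes only that, in lock, the two phase-detector inputs have equal frequency; the function name and the crystal and divider values are hypothetical and are not taken from this design.

```python
from fractions import Fraction


def vco_frequency_hz(f_crystal_hz, m, n):
    """Locked VCO frequency for the loop of Figure 2.1.

    In lock the two phase-detector inputs have equal frequency:
    f_crystal / M == f_vco / N, so f_vco = N * f_crystal / M.
    """
    if m < 1 or n < 1:
        raise ValueError("divider ratios must be positive integers")
    return Fraction(f_crystal_hz) * n / m


# Hypothetical numbers: a 10 MHz crystal with M = 10 gives a 1 MHz
# reference, so stepping N moves the VCO in 1 MHz channel steps.
channels_hz = [vco_frequency_hz(10_000_000, 10, n) for n in (248, 249, 250)]
```

Exact rational arithmetic is used here only so that the channel frequencies compare cleanly with integers; floating point would serve equally well for design estimates.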



Figure 2.1: Block Diagram for General PLL

#### 2.1.2 Classes of PLL Systems

There are three main types of PLL systems: analog PLL's, hybrid analog/digital PLL's, and all-digital PLL's. As their name implies, analog PLL's contain only analog circuitry. Hybrid analog/digital PLL's contain some digital circuitry (usually just the phase detector), while the rest of the PLL blocks remain analog. Finally, all-digital PLL's are entirely composed of digital circuitry. The following discussion will focus on a specific type of analog/digital PLL's called *charge-pump* PLL's, as that is the architecture used in this work.

### 2.1.3 Linearized Small-Signal Model

Assume that the input to the PLL and output from the VCO are both square waves, and that M and N in Figure 2.1 are both equal to one. In a charge-pump PLL, the phase-detector consists of a charge-pump (a simplified example of which is shown in Figure 2.2) driven by a digital circuit called a phase-frequency detector (PFD). The details of these blocks will be discussed in Section 2.2. The charge-pump adds or removes charge from the loop-filter capacitor under control of the Up and Down signals generated by the PFD. The Up and Down signals are generated such that when the input frequency $(f_{in})$ is higher than that of the VCO $(f_{VCO})$, the Up signal toggles on and off (while the Down signal remains low), and when $f_{in}$ is lower than $f_{VCO}$, the Down signal toggles on and off (while the Up signal remains low). When the VCO and input signal are equal in phase and frequency, both Up and Down ideally remain inactive. When the input and VCO have equal frequencies but unequal phase, the duty-cycle of the toggling signal (Up when the input leads the VCO, Down when the input lags the VCO) is proportional to the phase-difference between the input and VCO signals.



Figure 2.2: Simplified Charge-Pump Circuit

To quantify this, the *average* current flowing into the loop-filter capacitor when the PLL is locked is given by Equation 2.1, where $I_{ch}$ is the charge-pump current in Figure 2.2, and $\theta_e(t)$ is the phase-error between the input and VCO signals. The constant $K_d = I_{ch}/2\pi$ is called the phase-detector constant, given in Amperes/radian.

$$I_{avg} = \frac{I_{ch}}{2\pi} \theta_e(t) = K_d \theta_e(t) \tag{EQ 2.1}$$

The phase-detector output is averaged by the loop filter, which controls the VCO. The Laplace-domain loop-filter output-voltage is given in Equation 2.2, where F(s) is the impedance of the loop filter, and $\Theta_e(s)$ is the frequency-domain phase error between the input and the VCO.

$$V_F(s) = K_d \Theta_e(s) F(s)$$
(EQ 2.2)


The difference between the VCO frequency and its free-running frequency is given in the time domain by Equation 2.3, where $v_f(t)$ is the output voltage of the loop filter, and the constant $K_o$ is the VCO gain in rad/s/Volt. Note that when the loop-filter output is zero, the VCO runs at its free-running frequency.

$$\omega_{VCO}(t) = K_o v_f(t) \tag{EQ 2.3}$$

The phase of the VCO output is the integral of this, so that the phase can be expressed in the Laplace domain as in Equation 2.4.

$$\Theta_{VCO}(s) = \frac{K_o V_F(s)}{s} \tag{EQ 2.4}$$

The PLL model can therefore be drawn as shown in Figure 2.3.



Figure 2.3: Block Diagram for Linearized PLL

The only major difference between most PLL designs is the implementation of the various blocks in this diagram. Therefore the *general* transfer function of a PLL from the reference phase ($\Theta_{in}(s)$) to the VCO output-phase ($\Theta_{VCO}(s)$) is given by Equation 2.5, and the transfer function from $\Theta_{in}(s)$ to the phase error between the VCO and the reference ($\Theta_e(s)$) is given by Equation 2.6, where F(s) is the transfer function of the low-pass filter.

$$H(s) = \frac{\Theta_{VCO}(s)}{\Theta_{in}(s)} = \frac{K_o K_d F(s)}{s + K_o K_d F(s)} \tag{EQ 2.5}$$

$$H_e(s) = \frac{\Theta_e(s)}{\Theta_{in}(s)} = \frac{s}{s + K_o K_d F(s)} \tag{EQ 2.6}$$

The loop gain of the PLL is defined to be $K = K_o K_d F(\infty)$, so that at high frequencies H(s) can be approximated by K/(s+K). Thus, the loop gain is also approximately the bandwidth of the PLL.

If F(s) is a first-order low-pass or lead-lag filter, then the denominator of H(s) will be second order, and can be expressed in the form of Equation 2.7, where Q is the quality factor for the PLL, and $\omega_n$ is the natural frequency.

$$D(s) = 1 + \frac{s}{\omega_n Q} + \frac{s^2}{\omega_n^2} \tag{EQ 2.7}$$

These values are very important in the design of the loop, in that they determine the settling behaviour of the loop, as will be discussed in Sections 2.3.1 and 2.3.2.

Finally, let us use the final-value theorem to determine the steady-state phase-error for a phase step and a frequency step on the PLL input. For a unit step, the phase error as time approaches infinity is given in Equation 2.8, where n=1 for a phase step, and n=2 for a frequency step.

$$\lim_{t \to \infty} \theta_e(t) = \lim_{s \to 0} \frac{1}{s^{n-1}} H_e(s) = \lim_{s \to 0} \frac{1}{s^{n-1}} \cdot \frac{s}{s + K_o K_d F(s)} \tag{EQ 2.8}$$

Hence, the steady-state phase error in response to a phase step is zero for a PLL, provided that the loop-filter transfer function does not have a zero at DC (a condition that is satisfied for all popular designs). However, in order to achieve zero steady-state phase-error in response to a frequency step, the loop-filter must have at least one pole at DC (i.e. it must contain an ideal integrator). It will be seen later that this DC pole can be produced using very simple passive circuitry if a charge-pump architecture is used.
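As a worked illustration, the limit can be evaluated for one specific filter. This is a sketch under two assumptions that are not part of the general result above: the loop filter is the charge-pump lead-lag filter without $C_2$, $F(s) = (1 + sRC_1)/(sC_1)$, and the input is a frequency step of magnitude $\Delta\omega$, so that $\Theta_{in}(s) = \Delta\omega/s^2$. The final-value theorem then gives

$$\lim_{t \to \infty} \theta_e(t) = \lim_{s \to 0} s \cdot \frac{\Delta\omega}{s^2} \cdot \frac{s}{s + K_o K_d F(s)} = \lim_{s \to 0} \frac{\Delta\omega \, s C_1}{s^2 C_1 + K_o K_d (1 + sRC_1)} = 0$$

For comparison, a loop filter with a finite DC value $F(0)$ (no integrator) leaves a residual error for the same frequency step:

$$\lim_{t \to \infty} \theta_e(t) = \lim_{s \to 0} \frac{\Delta\omega}{s + K_o K_d F(s)} = \frac{\Delta\omega}{K_o K_d F(0)}$$

The second expression shrinks as $F(0)$ grows, which is why an ideal integrator (infinite DC gain) drives the error to zero.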

#### 2.1.4 Lock Metrics

The acquisition of lock is the process by which the PLL aligns itself to its input signal, starting from an unlocked state. This section provides definitions for the various metrics used to describe the locking performance of a PLL.

#### 2.1.4.1 Lock-in Range

The lock-in range is defined as the maximum frequency-difference between the PLL input and the VCO frequency for which lock is attainable within one single beat-note, assuming the PLL is initially unlocked. The lock-in time is the time required for this to happen. For frequency offsets smaller than this, lock-in will occur within one beat-note, while for offsets larger than this, lock-in may occur, but after a longer time.

#### 2.1.4.2 Hold Range

The *hold range* describes the maximum frequency-difference for which the PLL will remain locked, assuming it is initially locked, and that the input frequency changes very slowly (i.e. with a rate of change less than the loop bandwidth of the PLL).

#### 2.1.4.3 Pull-in Range

The *pull-in range* is the maximum frequency-difference for which the PLL can eventually attain lock, assuming it is initially unlocked. This is different from the lock-in range, in that it does not specify how long the process must take. If a frequency falls within the lock-in range, it must fall within the pull-in range; however, the converse is not true.

#### 2.1.4.4 Pull-out Range

The *pull-out range* is the largest frequency-step that can be applied to the input of the PLL without losing lock, assuming the PLL is initially locked. To remain in lock, all frequency steps must remain smaller than this value, and additionally, there is usually a maximum allowable rate of change for this frequency-step for the PLL to maintain lock.

### **2.2 Block Realizations**

The most common circuits used to realize the blocks of a charge-pump PLL (phase detector, VCO, and loop filter) will now be explored. The design equations for charge-pump PLL's will then be summarized.

### 2.2.1 Phase-Frequency Detector With Charge-Pump

PLL's using this type of PD are called *charge-pump* PLL's. This type of digital PD consists of two main components: the phase-frequency detector (a digital circuit), followed by a charge pump (an analog circuit).

#### 2.2.1.1 Phase-Frequency Detector

The PFD seems to be the most popular PD in recent literature [von Kaenel, 1996][Sung, 1999][Toifl, 1999][Djahanshahi, 1999][Rhee, 1999][Chang, 1999][Sumi, 1999][Wang, 1998][Wu, 1999][Yang, 1997][Craninckx, 1998][Parker, 1998][Rau, 1997]. It has the desirable characteristic that it gives the PLL an infinite hold range and pull-in range even if a passive first-order loop filter is used. Also, the PFD is insensitive to the duty cycle of the input waves, as it is edge-triggered.

Another advantage of the PFD circuit is that it maintains phase-lock over a large phase-error range (from  $-2\pi$  to  $2\pi$ ) without "slipping", which is twice that for a JK flip-flop, and four times that of the multiplier/XOR gate [Best, 1997].



The logical schematic for the circuit is shown in Figure 2.4. To help envision the operation of this circuit, refer to the signal-flow graph of Figure 2.5.

Figure 2.4: Schematic for PFD



Figure 2.5: Signal-Flow Graph for PFD

The circuit changes states only on the positive edges of the input signals. Unless the PLL is in lock, the PFD will alternate between two of the three states. If  $f_{in}$  is larger than  $f_{VCO}$ , the circuit alternates between states 0 and +1, while if  $f_{in}$  is smaller than  $f_{VCO}$ , the circuit alternates between -1 and 0. The state assignments are summarized in Table 2.1.

| State | Up | Down |
|-------|----|------|
| 0     | 0  | 0    |
| +1    | 1  | 0    |
| -1    | 0  | 1    |

TABLE 2.1. State Assignment for PFD

The Up and Down outputs control switches that increase or decrease the VCO control-voltage by adding or removing charge from the loop filter's capacitor, respectively, using the charge-pump. Thus, if $f_{in}$ is larger than $f_{VCO}$, the Up signal will oscillate between 1 and 0, while the Down signal will remain at 0. This gradually adds charge to the filter capacitor, which increases the VCO frequency, bringing the PLL closer to lock.
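The state behaviour described above can be sketched as a small behavioural model. This is an illustration only, assuming ideal (zero-delay) logic; the function names and the edge timings are hypothetical and do not describe the fabricated circuit.

```python
def edge_times(period, count, offset=0.0, source="in"):
    """Rising-edge events (time, source) of an ideal periodic signal."""
    return [(offset + k * period, source) for k in range(count)]


def pfd_trace(edges):
    """Run the three-state PFD of Table 2.1 over a list of rising edges.

    An input edge moves the state toward +1 (Up), a VCO edge toward -1
    (Down); the state saturates at +1 and -1. Returns [(time, state), ...].
    """
    state = 0
    trace = []
    for time, source in sorted(edges):
        if source == "in":
            state = min(state + 1, 1)
        elif source == "vco":
            state = max(state - 1, -1)
        else:
            raise ValueError(f"unknown edge source: {source!r}")
        trace.append((time, state))
    return trace


def up_down(state):
    """Map a PFD state to its (Up, Down) outputs, as in Table 2.1."""
    return {0: (0, 0), 1: (1, 0), -1: (0, 1)}[state]


# Hypothetical timing: the input runs twice as fast as the VCO.
fast_input = pfd_trace(edge_times(1.0, 6, source="in")
                       + edge_times(2.0, 3, offset=0.5, source="vco"))
# The mirrored case: the VCO runs twice as fast as the input.
fast_vco = pfd_trace(edge_times(1.0, 6, source="vco")
                     + edge_times(2.0, 3, offset=0.5, source="in"))
```

With the faster input, the trace visits only states 0 and +1, so only Up toggles; with the faster VCO it visits only 0 and -1, so only Down toggles.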

The ideal characteristic for the PFD is shown in Figure 2.6a). In reality, however, the non-zero delay through the flip-flops and NAND gate can alter the characteristic, as seen in Figure 2.6b). To see how this arises, consider the case when the PLL input and VCO output are perfectly in phase. Initially, both PFD outputs are zero. When the first pulse edges arrive at the PFD input, the PFD outputs are forced high. Ideally, this should only occur for an infinitesimal amount of time; in reality, however, there is a delay through the NAND gate before its output falls to zero, and then another delay through the flip-flops before the low reset signal takes effect at the output. This delay decreases the maximum phase shift that the PFD can handle without slipping. This maximum phase shift is given by $\theta_{em} = 2\pi - T_{min}\omega_i$, where $\omega_i$ is the input frequency in rad/s and $T_{min}$ is the delay through the NAND and flip-flop [Wolaver, 1991]. Note that as the input frequency increases, the maximum phase decreases.
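The effect of the reset delay on the usable phase range can be estimated numerically. The sketch below evaluates the expression quoted from [Wolaver, 1991]; the 500 ps delay and the two input frequencies are hypothetical values chosen only for illustration.

```python
import math


def max_phase_error_rad(f_in_hz, t_min_s):
    """theta_em = 2*pi - T_min * w_i, with w_i = 2*pi*f_in."""
    return 2 * math.pi - t_min_s * 2 * math.pi * f_in_hz


# Hypothetical reset-path delay of 500 ps at two input frequencies.
usable_fraction_100mhz = max_phase_error_rad(100e6, 500e-12) / (2 * math.pi)
usable_fraction_250mhz = max_phase_error_rad(250e6, 500e-12) / (2 * math.pi)
```

Under these assumed numbers the usable range falls from 95% of $2\pi$ at 100 MHz to 87.5% at 250 MHz, consistent with the remark that the maximum phase decreases as the input frequency increases.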



Figure 2.6: PFD Characteristic: a) Ideal (Upper) b) Non-Zero Delay in Logic (Lower)

#### 2.2.1.2 Charge Pump for PFD

The most basic form of the charge pump was shown in Figure 2.2. The Up and Down signals are provided by the PFD. If the Up signal is raised,  $I_{ch}$  flows into the output node, charging the filter capacitor. If the Down signal is raised,  $I_{ch}$  flows out of the output node, discharging the filter capacitor. In this manner,  $V_{out}$ , which controls the VCO, is changed.

### 2.2.2 Loop Filter

The most common filter used in the literature is the first or second-order passive RC filter, due to its simplicity and sufficient performance. The most basic form for this filter is shown in Figure 2.7. This form of the filter is used in analog PLL's and those hybrid analog/digital PLL's in which the PD output signal is a voltage. The transfer function for this circuit is given in Equation 2.9. Note that this loop-filter does not contain an ideal integrator, so that the hold-in and pull-in ranges will be limited. To obtain infinite hold-in and pull-in ranges for analog PLL's (and analog/digital PLL's that don't use a charge-pump structure), one must use a more area and power-hungry active loop-filter.

$$F(s) = \frac{1 + sR_2C_1}{1 + s(R_1 + R_2)C_1} \tag{EQ 2.9}$$

Figure 2.7: Lead-Lag Loop Filter

For charge-pump PLL's, a very similar structure is used, however it is excited by current (from the charge-pump) instead of voltage, which changes its transfer function to contain an ideal integrator. Two common realizations for this loop filter are shown in Figure 2.8. In both cases, the capacitor  $C_1$  is usually much larger than  $C_2$ , which is present to suppress glitches across  $R_1$  when current is first switched into the filter.



Figure 2.8: Schematic for Lead-Lag Loop Filter for Charge-Pump PLL (Two Realizations)

The transfer function for Filter 1 is given in Equation 2.10, where the approximation holds if $C_1 \gg C_2$, which is usually the case ($C_2$ is usually chosen to be at least 10 times smaller than $C_1$ [Gardner, 1981]). Notice that the filter contains an ideal integrator, which gives the PLL pull-in and hold-in ranges that are limited only by the VCO tuning range.

$$F(s) = \frac{1 + sRC_1}{s(C_1 + C_2)\left(1 + s\frac{C_1C_2}{C_1 + C_2}R\right)} \approx \frac{1 + sRC_1}{sC_1(1 + sC_2R)} \tag{EQ 2.10}$$

The transfer function for Filter 2 is given by Equation 2.11. Notice that both filters are approximately equivalent if $C_2$ is much smaller than $C_1$.

$$F(s) = \frac{1 + sR(C_1 + C_2)}{sC_1(1 + sC_2R)} \approx \frac{1 + sRC_1}{sC_1(1 + sC_2R)} \tag{EQ 2.11}$$

Also, note that $C_2$ has the effect of introducing a pole at a relatively high frequency, which degrades Q slightly (makes it a little larger). To compensate for this, the filter is often designed ignoring $C_2$, but for a lower Q (by about 20%) [Johns, 1997]. $C_2$ makes the loop filter second order, and hence makes the overall PLL third order. Because third-order systems are less well understood, the PLL is easier to analyze ignoring $C_2$, assuming it only affects the circuit at high frequencies [Gardner, 1980].
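To get a feel for how good the $C_1 \gg C_2$ approximation is, the exact expressions of Equations 2.10 and 2.11 can be compared against the shared approximate form at a single frequency. The sketch below is illustrative: the component values are those listed later in Table 2.3, while the choice of evaluation frequency (291 kHz) and the function names are assumptions.

```python
import math


def filter1_exact(s, r, c1, c2):
    """Exact impedance of Filter 1 (left-hand side of EQ 2.10)."""
    c_series = c1 * c2 / (c1 + c2)
    return (1 + s * r * c1) / (s * (c1 + c2) * (1 + s * r * c_series))


def filter2_exact(s, r, c1, c2):
    """Exact impedance of Filter 2 (left-hand side of EQ 2.11)."""
    return (1 + s * r * (c1 + c2)) / (s * c1 * (1 + s * r * c2))


def filter_approx(s, r, c1, c2):
    """Shared approximation, valid when C1 is much larger than C2."""
    return (1 + s * r * c1) / (s * c1 * (1 + s * r * c2))


def relative_error(exact, s, r, c1, c2):
    """Magnitude of (exact / approx - 1) at the complex frequency s."""
    return abs(exact(s, r, c1, c2) / filter_approx(s, r, c1, c2) - 1)


R, C1, C2 = 18.26e3, 75e-12, 3e-12   # values listed in Table 2.3
s = 1j * 2 * math.pi * 291e3         # assumed evaluation frequency

err_filter1 = relative_error(filter1_exact, s, R, C1, C2)
err_filter2 = relative_error(filter2_exact, s, R, C1, C2)
err_small_c2 = relative_error(filter1_exact, s, R, C1, C2 / 10)
```

For these values both errors stay within a few percent, and shrinking $C_2$ by a further factor of ten reduces the error, as expected from the approximation condition.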

### 2.2.3 Voltage-Controlled Oscillator

The three major types of voltage-controlled oscillator (VCO) integrated circuits will now be discussed: LC oscillators, RC multivibrators, and ring oscillators.

#### 2.2.3.1 LC Oscillators

While integration is possible for these circuits, it is very difficult to achieve very high performance on-chip, since high-Q inductors are difficult to create. Also, LC oscillators tend to have fairly narrow tuning ranges.

#### 2.2.3.2 RC Multivibrators

RC multivibrators contain no resonating components, so that their phase-noise performance is inferior to that of crystal and LC oscillators. However, because they contain no inductors or crystals, these circuits are easily integrated, making them an attractive choice for fully-integrated PLL's. Typically, multivibrator circuits do not have quite as good jitter performance as well-designed ring oscillator VCO's [Mcneill, 1997].

#### 2.2.3.3 Ring Oscillators

Ring oscillators appear to be the most popular VCO's for fully-integrated CMOS PLL's [Young, 1992][Sung, 1999][Chang, 1999][Rau, 1997][Kim, 1990][Djahanshahi, 1999][Chen, 1999]. Their design is fairly simple and easy to understand, in that they contain n delay elements connected in a series loop. Thus, the period of oscillation is 2nT, where T is the delay of one of the delay elements. For single-ended logic gates, n must be odd in order to make the loop unstable, however if fully-differential logic is used, n can be even, since a sign change can be created by swapping positive and negative outputs to create instability.
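The period relation and the parity rule can be captured in a few lines. This is a minimal sketch; the function name, the stage count, and the 500 ps stage delay are hypothetical and are not taken from the design described in later chapters.

```python
def ring_oscillator_frequency_hz(n_stages, stage_delay_s, differential=False):
    """Oscillation frequency 1 / (2 * n * T) of an n-stage ring.

    Single-ended rings need an odd number of inverting stages; a
    fully-differential ring may use an even number, because the extra
    inversion comes from swapping one stage's outputs.
    """
    if n_stages < 1 or stage_delay_s <= 0:
        raise ValueError("need at least one stage and a positive delay")
    if not differential and n_stages % 2 == 0:
        raise ValueError("single-ended rings need an odd number of stages")
    return 1.0 / (2 * n_stages * stage_delay_s)


# Hypothetical example: four differential stages of 500 ps each.
f_example_hz = ring_oscillator_frequency_hz(4, 500e-12, differential=True)
```

An even stage count is convenient when quadrature outputs are needed, since taps half a ring apart in a four-stage differential ring are 90 degrees apart.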

Two types of delay elements are commonly used: current-starved elements, and differential inverter elements.

#### **Current-Starved Delay Elements**

A schematic for a typical current-starved delay element is shown in Figure 2.9 [Yang, 1997]. To see how this delay element works, first assume  $V_{in}$  is high. This means that M2 will be shut off and M1 will conduct  $I_b$ .  $V_{out}$  is then governed by the on-resistance of M1. Next assume  $V_{in}$  is forced low. In this case M1 will shut off, causing  $V_{out}$  to rise to a value determined by the gate voltage of M2. By altering the value of  $I_b$ , the delay through the element can be controlled. In practice, it is better to use a fully-differential version of this circuit [Yang, 1997] in order to reduce the effects of power-supply noise.



Figure 2.9: Schematic for Current-Starved Delay Element


#### Differential Inverter Delay Elements

Regular CMOS inverters do not perform well in VCO's because their threshold voltage depends on the power-supply voltage, so that noise on the power-supply creates a large amount of jitter in the VCO's output. To minimize the effects of power-supply noise, fully-differential circuits are used. A typical differential inverter delay-element is shown in Figure 2.10.



Figure 2.10: Circuit Schematic for Differential Inverter

The input transistors steer the bias current produced by the cascode current-source M1 to either M4 or M5, which are triode-biased PMOS transistors. It can be shown that the delay through the element is approximately $r_{ds4}C_L \ln 2$ (see Appendix B). By altering the drain-to-source resistance in M4 and M5, the delay through the element can be controlled, along with the VCO oscillation-frequency.
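Combining the delay approximation with the ring-oscillator period $2nT$ gives a quick estimate of the load resistance needed for a target frequency. The sketch below is only indicative: the stage count (4) and load capacitance (50 fF) are hypothetical, and the two frequencies are the lock-range limits quoted in the abstract.

```python
import math


def stage_delay_s(r_ds_ohm, c_load_f):
    """Approximate delay of one differential stage: r_ds * C_L * ln 2."""
    return r_ds_ohm * c_load_f * math.log(2)


def required_r_ds_ohm(f_osc_hz, n_stages, c_load_f):
    """Load resistance giving f_osc = 1 / (2 * n * t_d) for an n-stage ring."""
    t_d = 1.0 / (2 * n_stages * f_osc_hz)
    return t_d / (c_load_f * math.log(2))


N_STAGES, C_LOAD = 4, 50e-15  # hypothetical stage count and load capacitance
r_at_135mhz = required_r_ds_ohm(135e6, N_STAGES, C_LOAD)
r_at_300mhz = required_r_ds_ohm(300e6, N_STAGES, C_LOAD)
```

Because frequency is inversely proportional to $r_{ds}$, covering the 135 MHz to 300 MHz range requires the load resistance to vary by the same ratio of about 2.2.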

Like multivibrator circuits, ring oscillators contain no resonant components, which compromises their phase-noise performance. However, this also means that ring oscillators can be easily integrated.

### 2.2.4 Frequency Dividers

As mentioned in Section 2.1.1, frequency dividers are used in some applications. In frequency synthesizers, they are added to generate multiples of the reference frequency, whereas in CPU PLL's, a divide-by-two circuit is usually added after the VCO to help obtain a signal with a 50% duty cycle. The addition of a frequency divider (say, by some integer N) after the VCO lowers the loop bandwidth by a factor N, which degrades the performance (lowers the lock-in range and degrades attenuation of VCO phase noise), but also eases the design of the phase detector and charge pump, since they only have to operate at 1/N times the frequency of the VCO.

### 2.2.5 Loop Equations for Charge-Pump PLL With Lead-Lag Loop Filter

The loop equations for a second-order charge-pump PLL with a lead-lag filter are summarized in Table 2.2 (see [Johns, 1997] and [Best, 1997] for derivation). The loop filter structure is assumed to be that of Figure 2.8, in the absence of  $C_2$ . In this table,  $I_{ch}$  is the charge-pump current, R and  $C_1$  are the component values in the loop filter, and  $K_0$  is the VCO gain.

| Quantity                       | Formula                                                                               |
|--------------------------------|---------------------------------------------------------------------------------------|
| Phase Detector Gain            | $K_d = \frac{I_{ch}}{2\pi}$                                                           |
| Filter Transfer Function F(s)  | $F(s) = \frac{RC_1 s + 1}{sC_1}$                                                      |
| Natural Frequency ($\omega_n$) | $\omega_n = \sqrt{\frac{K_o K_d}{C_1}} \cong \frac{1}{\tau_{PLL}}$                    |
| Quality Factor                 | $Q = \frac{1}{\omega_n R C_1}$                                                        |
| Transfer Function H(s)         | $H(s) = \frac{1}{K_o} \cdot \frac{s(1 + sRC_1)}{\frac{C_1 s^2}{K_d K_o} + sRC_1 + 1}$ |
| Steady-State Phase Error       | $\theta_{v} = \frac{2\pi\Delta\omega}{K_{o}I_{p}Z_{F}(0)} = 0$                        |
| Lock-in Range                  | $\frac{\pi\omega_n}{2Q}$                                                              |
| Hold Range                     | Infinite (limited by tuning range of VCO)                                             |
| Pull-in Range                  | Infinite (limited by tuning range of VCO)                                             |
| Pull-out Range                 | $11.55\omega_n \left( 0.5 + \frac{1}{2Q} \right)$                                     |

TABLE 2.2. Formulae for PFD and Passive Lead-Lag Filter

The natural frequency approximates the inverse of the loop time-constant, assuming a first-order response. Thus, the natural frequency is very important in determining the noise tracking ability of the PLL, as will be discussed in Section 2.3.1.
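The formulae of Table 2.2 are easy to evaluate numerically. The sketch below applies them to the component values listed later in Table 2.3; it assumes that the 100 µA current $I_b$ in that table is the charge-pump current $I_{ch}$, and the function name is hypothetical.

```python
import math


def loop_parameters(i_ch, k_o, r, c1):
    """Evaluate the Table 2.2 formulae for a second-order charge-pump PLL."""
    k_d = i_ch / (2 * math.pi)                 # phase-detector gain, A/rad
    w_n = math.sqrt(k_o * k_d / c1)            # natural frequency, rad/s
    q = 1.0 / (w_n * r * c1)                   # quality factor
    return {
        "k_d": k_d,
        "w_n": w_n,
        "f_n_hz": w_n / (2 * math.pi),
        "q": q,
        "lock_in_rad_s": math.pi * w_n / (2 * q),
        "pull_out_rad_s": 11.55 * w_n * (0.5 + 1 / (2 * q)),
    }


# Table 2.3 values; I_ch = I_b = 100 uA is an assumption.
params = loop_parameters(i_ch=100e-6, k_o=15.7e6, r=18.26e3, c1=75e-12)
```

With these inputs the computed natural frequency and quality factor come out close to the 291 kHz and 0.4 listed in Table 2.3, which is a useful consistency check on the component values.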

The hold range and pull-in range of the charge-pump PLL approach infinity, meaning that the PLL can always lock to the input frequency (assuming the VCO does not saturate), and assuming the input frequency doesn't change too quickly, the PLL will remain in lock. This is a direct result of the PFD used in the charge-pump PLL, and is in sharp contrast with other PLL types (e.g. hybrid analog/digital PLL with a flip-flop phase detector, analog PLL), whose hold range and pull-in range are often limited by the phase-detector, and not the VCO tuning range, unless an active loop-filter is used.

### **2.3 Design Issues**

It was decided to use a charge-pump structure for the PLL, using a ring-oscillator VCO. The charge-pump/PFD was chosen as the phase-detector, since it allows a large pull-in and hold-in range, even if a simple passive filter is used. The ring-oscillator was chosen because it allowed the creation of a transconductance-controlled oscillator, as will be discussed in Chapter 4. Some of the more important design issues involved in such a system will now be discussed.

#### 2.3.1 Loop Bandwidth

The loop bandwidth controls several areas of performance. The first is the immunity to input noise. The second is the ability to correct for noise generated in the VCO.

The noise appearing at the input of the PLL sees a low-pass response with a cutoff frequency approximately given by the natural frequency. Thus, decreasing the natural frequency (loop bandwidth) will improve the immunity of the PLL to input noise. If the input signal comes from a high-quality oscillator, then input noise will be of little or no concern.

Phase error in the VCO sees a high-pass transfer function (since the internal phase error sees $H_e(s) = 1 - H(s)$, where H(s) is the PLL response from the input phase to the VCO output phase), with a cutoff approximately given by the natural frequency of the PLL [Parker, 1998]. Thus, phase noise generated by the VCO gets attenuated if it is offset from the free-running frequency by less than the natural frequency. Outside of this range, the PLL can no longer correct for the phase error, and the phase noise appears unattenuated at the PLL output. Thus, to "track out" the maximum amount of phase noise from the VCO, one should choose a high bandwidth.
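The complementary low-pass and high-pass behaviour can be checked by evaluating Equations 2.5 and 2.6 well below and well above the natural frequency. This sketch assumes the second-order loop (filter without $C_2$) and uses the parameter values listed later in Table 2.3; the two evaluation frequencies are arbitrary choices.

```python
import math


def loop_responses(omega, k_o, k_d, r, c1):
    """Return (H, H_e) at s = j*omega for F(s) = (1 + s R C1) / (s C1)."""
    s = 1j * omega
    loop = k_o * k_d * (1 + s * r * c1) / (s * c1)  # K_o K_d F(s)
    h = loop / (s + loop)      # EQ 2.5: input phase to VCO phase
    h_e = s / (s + loop)       # EQ 2.6: equals 1 - H(s)
    return h, h_e


K_O, K_D, R, C1 = 15.7e6, 15.92e-6, 18.26e3, 75e-12  # Table 2.3 values
W_N = math.sqrt(K_O * K_D / C1)

h_low, he_low = loop_responses(0.01 * W_N, K_O, K_D, R, C1)
h_high, he_high = loop_responses(100 * W_N, K_O, K_D, R, C1)
```

Far below $\omega_n$ the input phase passes almost unchanged while VCO phase error is strongly attenuated; far above $\omega_n$ the roles reverse, which is the trade-off described in this section.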

### 2.3.2 Quality Factor and Natural Frequency

The quality (Q) factor should be chosen based on the desired response of the loop. A Q factor of 0.5 gives real poles for H(s). A Q factor of  $\frac{1}{\sqrt{3}}$  gives maximally flat group delay, while a Q factor of  $\frac{1}{\sqrt{2}}$  gives maximally flat amplitude response [Johns, 1997].
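The link between Q and the pole locations follows directly from the roots of D(s) in Equation 2.7. The sketch below computes the poles normalized to $\omega_n$; the function name is hypothetical, and Q = 0.4 is included because it is the value listed in Table 2.3.

```python
import cmath


def normalized_poles(q):
    """Roots of 1 + x/Q + x^2 = 0, where x = s / w_n (EQ 2.7)."""
    half = 1.0 / (2.0 * q)
    root = cmath.sqrt(half * half - 1.0)
    return (-half + root, -half - root)


poles_q_half = normalized_poles(0.5)         # repeated real pole
poles_q_04 = normalized_poles(0.4)           # two distinct real poles
poles_q_flat = normalized_poles(2 ** -0.5)   # complex-conjugate pair
```

For Q at or below 0.5 both poles are real (the loop is critically damped or overdamped), while any larger Q yields a complex pair and therefore some overshoot in the step response.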

#### 2.3.3 PFD

There are several important issues to be considered in the design of a PFD. These include behaviour of the PFD around lock, and the logic family to be used.

#### 2.3.3.1 Lock Behaviour

In a perfect PFD, when the VCO is locked to the PLL input, the Up and Down outputs toggle on and off simultaneously, so that, assuming an ideal charge-pump with equal charging and discharging currents, the net charge stored on the loop-filter capacitor remains fixed. This translates to the VCO control-voltage remaining constant in steady-state. However, it is difficult to exactly match the Up and Down currents in the charge-pump under all conditions, so that the charge on the loop-filter capacitor is altered if the Up and Down outputs of the PFD are pulsed simultaneously. This means that the PLL must create a suitable phase-offset between the VCO and PLL input in order to hold the VCO control-voltage constant. In applications where steady-state phase-alignment of the VCO output and PLL input is important, this problem must be carefully addressed.

#### 2.3.3.2 Logic Family

Similar issues arise in the design of the PFD as those in the frequency divider (see Section 2.2.4). The logic in the PFD must be fast enough to settle within a fraction of a period of the VCO, which demands the use of high-performance logic families (ECL, True Single-Phase). The design should also reject power-supply noise, which again suggests using differential logic.

#### 2.3.4 Charge-Pump

A great number of charge-pump circuits have been proposed in the literature. This section sums up the main issues to be considered in the design of a charge pump.

#### 2.3.4.1 Up/Down Symmetry

The transient responses for the up and down currents in the charge-pump must be as well matched as possible, so that equal-duration up and down voltage pulses result in the same change of charge on the loop-filter capacitor. This is especially important once the loop is locked in, since if the up and down transients significantly differ, the short equal-length up/down pulses at the output of the PFD will result in a net removal or addition of charge over a long period of time, resulting in a steady-state phase-offset between the VCO and input signal (which may or may not be important, depending on the application).

#### 2.3.4.2 Variation of Up/Down Currents

If single-transistor current sources are used for the simple charge pump in Figure 2.2, the current supplied by them will vary considerably as the charge pump output-voltage changes, due to the small output-resistance of these current sources. To decrease this variation, the output resistance of these sources can be boosted using cascode current-sources.

### 2.3.5 VCO

The VCO is a block that has received a great deal of attention in the literature, since a stable, high-quality, low phase-noise VCO is very difficult to achieve on-chip. Besides the obvious speed and tunability requirements, the main issues in the design are jitter and the linearity of the frequency versus control-voltage characteristic.

#### 2.3.5.1 Jitter

The jitter of the VCO can be an extremely important factor in the determination of the output jitter of the PLL. This is especially true if the input signal has very little jitter. Note also that VCO jitter considerations are only worthwhile if power-supply noise has been dealt with properly (e.g. by using a fully-differential architecture, large decoupling capacitors on the VCO power supplies, and separate power supplies for digital and analog circuitry). If this is not the case, power-supply noise will likely dominate the performance [Martin, 1999].

For a good discussion of timing jitter in CMOS ring oscillators, see [Weigandt, 1998]. However, substitute the expression given for the delay per stage with that used in Mcneill's paper, which is $RC_L \ln 2$. This yields a normalized (to the delay per stage) timing-jitter per stage as given in Equation 2.12 [Martin, 1999]. For the overall oscillator, $T_{osc}=2nt_d$, and from the central limit theorem, $\Delta T_{osc}^2 = 2n\Delta t_d^2$, which gives the expression for the normalized VCO jitter given in Equation 2.13.

$$\frac{\Delta t_d^2}{t_d^2} = \frac{2.8kT}{V_{pp}^2 C_L} \left(1 + \frac{2}{3}A_v\right) = \frac{2.8kT}{V_{pp} t_d I_{DD}} \left(1 + \frac{2}{3}A_v\right) \tag{EQ 2.12}$$

$$\frac{\Delta T_{osc}^2}{T_{osc}^{2}} = \frac{1.4kT}{nV_{pp}t_{d}I_{DD}} \left(1 + \frac{2}{3}A_v\right) = \frac{1.4kTF_{osc}}{V_{pp}I_{DD}} \left(1 + \frac{2}{3}A_v\right) \tag{EQ 2.13}$$

Equation 2.13 helps to establish some guidelines in the design of low-jitter ring oscillators. The first is that the gain of the inverters should not be excessive. A minimum value of 1 is required to start oscillations in the circuit; however, increasing the gain increases the jitter, so the gain should not be made any larger than necessary. The second guideline is that the power dissipated in the load devices should be maintained as high as possible. Lastly, note that as the frequency of oscillation increases, the jitter increases. This is because a given amount of jitter will have a larger effect on the frequency of oscillation as the period of oscillation gets smaller.
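The first two guidelines can be illustrated by evaluating the first form of Equation 2.13 for a few operating points. This is a rough sketch: all numerical values (stage count, swing, delay, current, gain, and a temperature of 300 K) are hypothetical, and the function names are not from the thesis.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def normalized_jitter_sq(n, v_pp, t_d, i_dd, a_v, temp_k=300.0):
    """First form of EQ 2.13: (dT_osc / T_osc)^2 for an n-stage ring."""
    kt = BOLTZMANN_J_PER_K * temp_k
    return 1.4 * kt / (n * v_pp * t_d * i_dd) * (1 + 2 * a_v / 3)


def rms_period_jitter_s(n, v_pp, t_d, i_dd, a_v, temp_k=300.0):
    """RMS period jitter in seconds, using T_osc = 2 * n * t_d."""
    t_osc = 2 * n * t_d
    return t_osc * math.sqrt(
        normalized_jitter_sq(n, v_pp, t_d, i_dd, a_v, temp_k))


# Hypothetical operating point: 4 stages, 1 V swing, 500 ps per stage.
base = dict(n=4, v_pp=1.0, t_d=500e-12, i_dd=1e-3)
low_gain = rms_period_jitter_s(a_v=1.5, **base)
high_gain = rms_period_jitter_s(a_v=3.0, **base)
more_current = rms_period_jitter_s(a_v=1.5, **{**base, "i_dd": 4e-3})
```

Doubling the stage gain raises the jitter, while quadrupling the current at a fixed swing and delay halves it, since the RMS jitter scales as the inverse square root of $I_{DD}$.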

#### 2.3.5.2 Frequency-Control Voltage Characteristic

In order to maintain constant loop dynamics (i.e. Q and  $\omega_0$ ), it is usually desired to keep the VCO gain constant, which implies a linear frequency-control voltage characteristic.

### 2.3.6 Integration of Loop Filter

When designing the PLL, quantities must be chosen such that the loop filter components are realizable on-chip. This means that the filter capacitor must be chosen to be a couple hundred pF at the most (the largest size found in the literature for a 0.35  $\mu$ m CMOS technology is 250 pF in [Djahanshahi, 1999]).

#### 2.3.7 Loop Gain

The loop gain is defined as the gain around the loop at high frequencies, which can be found to be $K_oK_dR$. To examine the stability requirements for the PLL, one can look at the root locus plot. To do this, one can define the normalized loop gain, which is the regular loop gain multiplied by $RC_1$ [Gardner, 1980].

| Parameter      | Value                               |
|----------------|-------------------------------------|
| R              | 18.26 kΩ                            |
| C<sub>1</sub>  | 75 pF                               |
| C<sub>2</sub>  | 3 pF                                |
| K<sub>o</sub>  | 15.7 × 10<sup>6</sup> rad/s/V       |
| K<sub>d</sub>  | 15.92 × 10<sup>-6</sup> A/rad       |
| I<sub>b</sub>  | 100 µA                              |
| Q              | 0.4                                 |
| f<sub>o</sub>  | 291 kHz                             |

**TABLE 2.3. Parameter Values Used for Root Locus Plot**

Using the parameter values given in Table 2.3, and using the H(s) expression given in [Gardner, 1980], the expression in Equation 2.14 was found for the denominator of H(s), where K' is the normalized loop gain of the PLL.



Figure 2.11: Root Locus Plot for Third-Order System

The root locus plot for such a system is given in Figure 2.11. Note that for all K'>0, the system is stable, since all poles lie in the left-half plane. For small values of K', the system is underdamped with 2 complex poles and one (large) real pole. If K' is increased from this value, the system becomes overdamped with all real poles. Once K' is increased beyond a certain value, however, two of the poles become complex, and the system again becomes underdamped. Thus there is a range of K' for which the system will be overdamped.

As the ratio of  $C_1/C_2$  is increased by decreasing  $C_2$ , the breakaway point moves further into the left-half plane. In fact, if  $C_1/C_2$  is less than 8, then the system is underdamped for *all* values of K', since the root locus never returns to the real axis after breaking from the origin [Gardner, 1980]. This is verified in Figure 2.12, which shows the root loci of the PLL for varying values of  $C_1/C_2$ . Notice that as this ratio decreases from the initial value of 50, the breakaway point for the poles moves closer to the origin. Finally, for  $C_1/C_2=7.5$ , the poles are complex for all values of loop gain, as expected.
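The $C_1/C_2 = 8$ boundary can be reproduced with a short sketch. It assumes a loop filter consisting of R in series with $C_1$, shunted by $C_2$, and normalizes the closed-loop characteristic polynomial with $x = sRC_1$, $K' = K_oK_dR \cdot RC_1$ and $b = 1 + C_1/C_2$, which gives $x^3 + bx^2 + K'(b-1)x + K'(b-1)$. This form was derived for the sketch and may be normalized differently from Equation 2.14; the function names are illustrative.

```python
import math


def cubic_discriminant(a, c, d):
    """Discriminant of x^3 + a*x^2 + c*x + d; it is non-negative
    exactly when all three roots are real."""
    return 18*a*c*d - 4*a**3*d + a**2*c**2 - 4*c**3 - 27*d**2


def all_poles_real(c1_over_c2, k_norm):
    """True if the assumed third-order loop has only real poles."""
    b = 1.0 + c1_over_c2
    m = k_norm * (b - 1.0)
    return cubic_discriminant(b, m, m) >= 0.0


def overdamped_gain_interval(c1_over_c2):
    """Range of normalized loop gain K' giving all-real poles,
    or None if no such range exists for this capacitor ratio."""
    b = 1.0 + c1_over_c2
    p = b*b + 18.0*b - 27.0
    disc = p*p - 64.0*b**3          # equals (b - 9)^3 * (b - 1)
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return ((p - root) / 8.0 / (b - 1.0), (p + root) / 8.0 / (b - 1.0))
```

Under these assumptions, `overdamped_gain_interval(7.5)` returns `None`, while a ratio above 8 returns a finite interval of K', consistent with the root-locus behaviour described above.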



Figure 2.12: Behaviour of Root Loci for Varying Values of C1/C2

To confirm the system response, the step response was found for the overall PLL, and is plotted in Figure 2.13. The step response for the PLL taken at the input to the VCO is shown in Figure 2.14, along with the step response of the system without capacitor  $C_2$ . As expected, with  $C_2$  included, the voltage rises and then falls to zero as the system comes back into phase with the reference, although there is a slight overshoot in returning to zero. The presence of  $C_2$  removes the step that occurs at the start of the response, so that the VCO input displays a more gradual rise in voltage.



Figure 2.14: Step Response of PLL at Input to VCO

Finally, the pole-zero plot for the PLL with Filter 1 is shown in Figure 2.15. Notice that the low-frequency pole is actually cancelled by the zero introduced by the filter. Thus, the system approximately looks like a second order system with two real poles and no zeros. This is illustrated in Figure 2.16, which is a Bode plot of the PLL with Filter 1. The response is almost flat up to 10Grad/s (1.6 GHz), and then starts to fall off at -40dB per decade.



Figure 2.16: Bode Plot for PLL With Filter 1

It is also interesting to see the system response at the input to the VCO. This is shown in Figure 2.17. As can be seen, the system is bandpass with a very narrow passband centered around 8.5Mrad/s (1.35 MHz).





Figure 2.17: Bode Plot of PLL at Input to VCO

## **2.4 References**

R.E. Best, Phase-Locked Loops, 3rd Ed. Toronto: McGraw-Hill, 1997.

Y. Chang, E.W. Greeneich, "A Current-Controlled Oscillator Coarse-Steering Acquisition-Aid for High Frequency SOI CMOS PLL Circuits," *ISCAS*, Vol. II, pp. 561-564, 1999.

H. Chen, E. Lee, R. Geiger, "A 2-GHz VCO With Process and Temperature Compensation," *ISCAS*, Vol. II, pp. 569-572, 1999.

J. Craninckx, M. Steyaert, "A Fully Integrated CMOS DCS-1800 Frequency Synthesizer," *IEEE ISSCC*, Session 23.5, 1998.

H. Djahanshahi, C.A.T. Salama, "Differential 0.35um CMOS Circuits for 622MHz/933MHz Monolithic Clock and Data Recovery Applications," *ISCAS*, Vol. II, pp. 93-96, 1999.

J.B. Encinas, Phase Locked Loops, New York: Chapman & Hall, 1993.

F.M. Gardner, Phaselock Techniques, 2nd Ed. Toronto: John Wiley & Sons, 1979.

F.M. Gardner, "Charge-Pump Phase-Lock Loops," *IEEE Transactions on Communications*, Vol. COM-28, pp. 1849-1857, 1980.

D. Johns, K. Martin, Analog Integrated Circuit Design. Toronto: John Wiley & Sons, 1997.

B. Kim, D. N. Helman, P.R. Gray, "A 30MHz Hybrid Analog/Digital Clock Recovery Circuit in 2um CMOS," IEEE Journal of Solid-State Circuits, vol. 25, no. 6, pp. 1385-1394, Dec. 1990.

J.A. McNeill, "Jitter in Ring Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 6, pp. 870-878, June 1997.

K. Martin, PLL Notes, 1999.

J. F. Parker, D. Ray, "A 1.6-GHz CMOS PLL with On-Chip Loop Filter," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 3, pp. 337-343, Mar. 1998.

M. Rau, T. Oberst, R. Lares, A. Rothermel, R. Schweer, and N. Menoux, "Clock/Data Recovery PLL Using Half-Frequency Clock," IEEE Journal of Solid-State Circuits, vol. 32, no. 7, pp. 1156-1159, Jul. 1997.

W. Rhee, "Design of Low-Jitter 1-GHz Phase-Locked Loops for Digital Clock Generation," *ISCAS*, Vol. II, 1999.

Y. Sumi, S. Obote, N. Kitai, R. Furuhashi, Y. Matsuda, Y. Fukui, "PLL Frequency Synthesizer with an Auxiliary Programmable Divider," *ISCAS*, Vol. II, pp. 532-536, 1999.

H. Sung, K.S. Yoon, "A 3.3-V High-Speed CMOS PLL With 3-250 MHz Input Locking Range," ISCAS, Vol. II, pp. 553-556, 1999.

T. Toifl, P. Moreira, "A Radiation-Hard 80-MHz Phase-Locked Loop for Clock and Data Recovery," *ISCAS*, Vol. II, pp. 524-527, 1999.

V. von Kaenel, D. Aebischer, C. Piguet, and E. Dijkstra, "A 320-MHz, 1.5 mW @ 1.35 V CMOS PLL for Microprocessor Clock Generation," *IEEE J. Solid-State Circuits*, Vol. 31, No. 11, pp. 1715-1722, Nov. 1996.

C. Wang, Y. Chien, Y. Chen, "A Practical Load-Optimized VCO Design for Low-Jitter 5V 500-MHz Digital Phase-Locked Loop," *ISCAS*, Vol. II, pp. 528-531, 1998.

T.C. Weigandt, B. Kim, P.R. Gray, "Analysis of Timing Jitter in CMOS Ring Oscillators,"

D.H. Wolaver, Phase-Locked Loop Circuit Design. Toronto: Prentice-Hall, 1991.

L. Wu, H. Chen, S. Nagavarapu, R. Geiger, E. Lee, W. Black, "A Monolithic 1.25 GBits/s CMOS Clock/Data Recovery Circuit for Fibre Channel Transceiver," ISCAS, Vol. II, pp. 565-568, 1999.

H.C. Yang, L.K. Lee, R.S. Co, "A Low Jitter 0.3-165 MHz CMOS PLL Frequency Synthesizer for 3 V/5 V Operation," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 4, pp. 582-586, Apr. 1997.

I. A. Young, J. K. Greason, K. L. Wong, "PLL Clock Generator With 5 to 110 MHz of Lock Range," *IEEE J. Solid-State Circuits*, vol. 27, no. 11, pp. 1601-1604, Nov. 1992.



# Block-Level Modeling and Design

With an understanding of the general structure of PLL's, it is now possible to examine the new architecture used in this design. In this chapter, this structure is described, as well as the high-level block selection and SIMULINK simulation of the system. Section 3.1 justifies the usage of SIMULINK models. Section 3.2 gives the block-level description of the PLL design. Section 3.3 describes the high-level selection and design of the system blocks. Section 3.4 describes the SIMULINK system used to model the behaviour of the PLL. Finally, Section 3.5 describes the results obtained from the SIMULINK system.

# **3.1 PLL Simulation Problems**

It is well-known that, in general, PLL's are very difficult to simulate at the transistor-level using a circuit simulator such as HSPICE, since they contain a very wide range of time-constants. Very large time-constants (on the same order as the PLL $1/\omega_0$) are induced by the loop filter, while much smaller time-constants are present in the VCO (often less than one ten-thousandth of the PLL $1/\omega_0$). Hence, the PLL must be simulated using a very small time-step to get accurate results, but also for a very long time-period in order to observe the behaviour of the overall PLL system. At the current time, this places unreasonable requirements on computer memory and time, unless extremely simple circuitry is used [Razavi, 1997].

While long simulations may be acceptable if one is reasonably sure that the design will work, they make design iterations extremely long. As an alternative to long, memory-hungry transistor-level simulations, the designer may initially verify the system design using a simplified model. If the designer can then design circuits to imitate the behaviour of the simplified system blocks, then the interconnection of these blocks is likely to work properly, and can be verified with a few long transistor-level simulations.

The system-level simulations for this design were carried out using SIMULINK, with the help of C MEX Files [Math Works, 1996]. A good source for understanding how to apply C MEX files to PLL systems is in [Johns, 1997].

## **3.2 System Block-Level Description**

The system will now be described at the block level, followed by a discussion of the advantages of this system design.

## **3.2.1 System Description**

The block diagram for the system is given in Figure 3.1. Relating back to the block diagram of Figure 2.1, the blocks in Figure 3.1 can be lumped together as follows: the phase-frequency detector (PFD) and charge pump blocks form the phase detector block in Figure 2.1, the loop filter block is the same as the loop filter block in Figure 2.1, and the rest of the circuitry (comparators, Up/Down Control, VCO) forms the VCO block of Figure 2.1.





The VCO structure is the most significant aspect of the system, as the other blocks are fairly conventional, being present in almost all charge-pump PLL's [Von Kaenel, 1996][Young, 1992][Parker, 1998][Rau, 1997][Nati, 1997]. The VCO differs from conventional VCO's in that it takes three control inputs, instead of just one. As seen in Figure 3.1, the three VCO inputs correspond to three levels of control resolution. An analog control voltage ($V_a$) is used for fine-tuning of the VCO frequency, while two separate digital voltages are used to tune the VCO in medium-sized and large-sized steps ($V_{therm}$ and $V_{bin}$ respectively).

To see how this works, it is easiest to consider an example. Assume that the PLL is locked, so that the reference input is in phase with the VCO output, and the control voltages have assumed their steady-state values. Now, if the reference signal abruptly increases in frequency by a small amount, the PFD senses that the reference signal is now leading the VCO signal, and causes the charge-pump to add charge to the loop-filter capacitor. This increases $V_a$, causing the VCO to oscillate faster, so that it catches up to the reference frequency, and lock is maintained. Note that this operation is exactly the same as in any charge-pump PLL.

The difference arises when the frequency step applied to the reference gets larger. The comparators in Figure 3.1 continually compare $V_a$ to a high threshold (High) and a low threshold (Low). If the frequency step on the reference signal becomes large enough, $V_a$ increases beyond High. This event is detected by Comparator 1, whose output goes high. This is in turn detected by the Up/Down Control block, which increments the "medium" signal ($V_{therm}$) by one. This increases the VCO frequency by a larger amount than is possible using $V_a$ alone. The VCO constants are chosen such that when this change occurs, $V_a$ moves to halfway between Low and High (we will refer to this point as Mid). The analog loop then continues to try to match the VCO and reference frequencies. If at some point $V_{therm}$ reaches its maximum value, it is reset to a value halfway between its minimum and maximum, and the "coarse" signal ($V_{bin}$) is incremented by one, which increases the frequency of the VCO by an even larger step than is possible with $V_{therm}$. Note that every time one of the control signals changes, all control signals "below" it (i.e. all signals that provide a higher resolution) are reset to their "middle" values. This ensures that changes in the digital states will happen very infrequently after the loop has locked.

# 3.2.2 System Advantages

Although not all the advantages of this structure can be made apparent quite yet (these will be discussed in the next chapter, when the circuit-level system is considered), there are some that can be seen immediately. One is that in steady-state, inevitable fluctuations on  $V_a$  will have a reduced effect on the output frequency of the VCO, leading to less jitter. This is because  $V_a$  is meant only for fine-tuning of the VCO frequency, so that even large changes in  $V_a$  will only lead to small changes in the VCO frequency.

Also, this PLL structure could be used to create a very low-jitter oscillation by using only the digital control signals to control a second VCO identical to the one described above, but without $V_a$. The second VCO will only provide an accurate version of the reference if $V_{therm}$ provides a high degree of resolution. If this is possible, the second VCO will accurately follow the reference signal, but will have no jitter introduced by an analog control voltage.

To ensure that, in steady state, the digital states do not toggle back and forth between two neighboring states, the control signals use controlled hysteresis. Whenever a change in $V_{therm}$ occurs, $V_a$ is forced to Mid through negative feedback. Also, neighboring digital states are made to be slightly overlapping, so that there is more than one combination of coarse, medium, and fine control voltages that produces a given frequency of oscillation in the VCO. In this way, the borders between digital states are made to be "soft". These measures ensure that there is a large time between digital state changes in steady-state, especially since, in practice, the only changes in the system over time will be due to temperature, which generally varies slowly.
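The effect of the overlapping states can be illustrated with a small sketch that lists every digital state from which the analog loop could reach a given target frequency. It assumes an idealized linear VCO, $f = f_{min} + K_{bin}V_{bin} + K_{therm}V_{therm} + K_a(V_a - \mathrm{Mid})$, with the step sizes and thresholds chosen in Section 3.3.1, sixteen coarse states starting at 175 MHz, and the -8 to 8 medium range of the behavioral model. These modelling choices and all names are assumptions made for illustration.

```python
K_BIN_MHZ = 10.0                 # coarse step per Vbin state
K_THERM_MHZ = 1.25               # medium step per Vtherm state
V_LOW, V_HIGH = 1.0, 2.3         # thresholds on Va, in volts
K_A_MHZ_PER_V = 2 * K_THERM_MHZ / (V_HIGH - V_LOW)
F_MIN_MHZ = 175.0                # assumed frequency at Vbin = 0, Vtherm = 0, Va = Mid


def states_reaching(target_mhz):
    """Return all (Vbin, Vtherm) pairs whose analog tuning span,
    with Va kept between Low and High, contains target_mhz."""
    half_span = K_A_MHZ_PER_V * (V_HIGH - V_LOW) / 2
    hits = []
    for v_bin in range(16):
        for v_therm in range(-8, 9):
            centre = F_MIN_MHZ + K_BIN_MHZ * v_bin + K_THERM_MHZ * v_therm
            if abs(target_mhz - centre) <= half_span:
                hits.append((v_bin, v_therm))
    return hits
```

For a target in the middle of the range, the function returns more than one state, which is the property that keeps the loop from toggling between neighbours when the required frequency sits near a state boundary.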

## **3.3 Blocks**

The behavioral models for the system blocks will now be discussed. It is through these models that reasonable specifications can be derived for the circuit implementation. C MEX files were used to model the VCO and the charge-pump/loop filter combination, while regular SIMULINK models were constructed for the other blocks.


## 3.3.1 VCO

As previously described, the VCO in the PLL system takes three control inputs (fine, medium, and coarse frequency adjust). This VCO was modelled using a C MEX file, whose algorithm was a modified version of the code given in [Johns, 1997].

There are four constants that must be defined: the free-running frequency, and the VCO constants for the coarse ( $K_{bin}$ ), medium ( $K_{therm}$ ), and fine ( $K_a$ ) loops. To work towards this goal, the lock range was first specified to be 250 MHz plus or minus 30%, to allow for process variations. It was then specified that the digital control signals should be able to get the PLL within 0.5% of the free-running frequency from any given reference frequency in the lock range. This 0.5% gives a frequency change of 1.25 MHz for each change in V<sub>therm</sub>.

Some up/down control circuits were synthesized using VHDL, and it was decided to implement V<sub>therm</sub> as a thermometer-code signal. A thermometer-code was chosen because this signal is more likely to change in steady-state, and thermometer-code incurs less switching noise than binary switching, where several bits change during one switching event, as opposed to just one in thermometer code. V<sub>bin</sub> was implemented using binary logic, since it is unlikely to change very frequently in steady-state, and binary-encoded logic produces more space-efficient circuitry than thermometer-code. It was found that to keep the synthesized area to a reasonably small value (no larger than 300 x 300  $\mu$ m<sup>2</sup>), no more than four binary bits could be used for V<sub>bin</sub>. This gives a VCO frequency change of about (325-175)/15=10 MHz per change in V<sub>bin</sub> (see Figure 3.2). Because the coarse frequency regions were chosen to overlap by 50% (see Figure 3.2), the total available frequency range within one fixed value of  $V_{bin}$  turns out to be 2(10)=20 MHz. Now, in order for V<sub>therm</sub> to achieve at least 1.25 MHz resolution, there had to be at least 20/1.25=16 medium signal states available. To accommodate this, a 15-bit thermometer-code signal was chosen for V<sub>therm</sub>. Thus, K<sub>bin</sub> is 10 MHz/state, and K<sub>therm</sub> is 1.25 MHz/state. Again, a 50% overlap was chosen for the thermometer-code states (see Figure 3.2), so that the analog VCO constant  $(K_a)$  was a little larger than necessary. Because the upper threshold for  $V_a$  was chosen to be 2.3 V (1 V below the supply to minimize distortion in the charge-pump current), and the lower threshold to be 1 V,  $K_a$  was chosen to be (2)(1.25/(2.3-1))=1.92 MHz/V.
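The arithmetic behind the frequency plan can be collected into a short sketch. The function and argument names are illustrative; the default numbers are those quoted in the text, and the 50% overlap is modelled simply as doubling the span that each level must cover.

```python
import math


def frequency_plan(f_min_mhz=175.0, f_max_mhz=325.0, bin_bits=4,
                   therm_step_mhz=1.25, v_low=1.0, v_high=2.3):
    """Recompute the coarse, medium, and fine VCO constants."""
    k_bin = (f_max_mhz - f_min_mhz) / (2**bin_bits - 1)   # MHz per Vbin state
    span_per_bin = 2 * k_bin                              # 50% overlap doubles the span
    therm_states = math.ceil(span_per_bin / therm_step_mhz)
    therm_bits = therm_states - 1                         # thermometer code width
    k_a = 2 * therm_step_mhz / (v_high - v_low)           # MHz/V, 50% overlap again
    return {
        "k_bin_mhz": k_bin,
        "span_per_bin_mhz": span_per_bin,
        "therm_states": therm_states,
        "therm_bits": therm_bits,
        "k_a_mhz_per_v": k_a,
    }
```

With the default arguments this gives a 10 MHz coarse step, a 20 MHz span per coarse state, 16 medium states (a 15-bit thermometer code), and an analog constant of about 1.92 MHz/V.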



## 3.3.2 Phase Detector

The phase detector in the behavioral model consists of a PFD followed by a charge-pump and loop-filter combination. The PFD model given in Figure 2.4 can be used directly to create a SIMULINK model for the PFD, as shown in Figure 3.3.



Figure 3.3: SIMULINK Schematic for PFD


The ideal performance under conditions of zero phase or frequency error between the VCO and reference signal is shown in Figure 3.4.



Figure 3.4: Ideal PFD Signals for VCO in Phase With Reference Signal

Because the signals are perfectly in phase, the PFD does not instruct the charge-pump to add or remove any charge from the loop filter capacitor. As can be seen, both the Up and Down signals remain low, except for extremely short spikes (the length of one SIMULINK time step) that occur simultaneously on both outputs, and which have no effect on the charge stored on the loop-filter capacitor.

If a delay is added to the NAND gate that forms the reset signal for the flip-flops, the PFD reacts as shown in Figure 3.5. As can be seen, the pulses on the Up and Down signals are now wider (0.8 ns, the same as the delay added to the NAND gate). This behaviour leads to a degradation in performance, as will be described in Section 4.4.

Next, consider the performance of an ideal PFD when the VCO and reference signals are at different frequencies. This is illustrated in Figure 3.6 for $f_{VCO} > f_{Ref}$, where it can be seen that the Up signal remains inactive (aside from the short pulses previously discussed), while the Down signal displays a periodic pulse train with a duty cycle that increases gradually over time. Thus, the PFD causes the charge-pump to continually remove charge from the loop filter capacitor, which would slow down the VCO, and bring it closer to lock with the reference signal.

Figure 3.6: Ideal Response of PFD for f<sub>VCO</sub> > f<sub>Ref</sub>
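The ideal behaviour can be imitated with an event-driven sketch of a tri-state PFD: a reference edge sets Up, a VCO edge sets Down, and the two flip-flops are reset with zero delay as soon as both are high. This is an assumed idealization written for illustration, not the SIMULINK model itself, and the function name is hypothetical.

```python
def ideal_pfd_active_time(ref_edges, vco_edges):
    """Return (up_time, down_time): the total time each PFD output is high
    for the given lists of active-edge times, assuming a zero-delay reset."""
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "vco") for t in vco_edges])
    up = down = False
    up_start = down_start = 0.0
    up_time = down_time = 0.0
    for t, source in events:
        if source == "ref" and not up:
            up, up_start = True, t          # reference edge sets Up
        elif source == "vco" and not down:
            down, down_start = True, t      # VCO edge sets Down
        if up and down:                     # NAND reset clears both flip-flops
            up_time += t - up_start
            down_time += t - down_start
            up = down = False
    return up_time, down_time
```

For example, with reference edges every 4.0 ns and VCO edges every 3.8 ns, only the Down output accumulates on-time, whereas identical edge lists give zero on-time for both outputs.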

If a delay is added through the NAND gate (or equal delays are added to each of the flip-flops), the response changes to the one shown in Figure 3.7. Again, note that the "inactive" Up output contains 0.5 ns pulses. Also, note that the Down signal's duty cycle still increases continually, but once its duty cycle reaches a certain maximum value (at around 60 ns), it can no longer increase, and the Down signal begins outputting the same "default" 0.5 ns pulses as the Up signal was previously. As it begins doing so, the duty cycle of the Up signal abruptly increases, and then slowly decreases until its pulse width returns to that of the "default" pulse. Once this occurs, the duty cycle of the Down signal again begins to increase, and the cycle repeats itself. This behaviour is undesirable, because it slows down the convergence of the charge-pump output on the correct value to give a locked condition, by driving the charge-pump output in the wrong direction for part of the cycle.



Figure 3.7: PFD Signals with 0.5 ns NAND Delay, f<sub>VCO</sub> > f<sub>Ref</sub>

The charge-pump and loop filter combination was implemented using a C MEX file containing the same difference equations derived in [Johns, 1997] for the charge-pump and loop filter combination, altered to model the saturation of the output voltage at the power supplies.



Figure 3.8: Loop Filter Used for SIMULINK Simulations
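A minimal sketch of a charge-pump and loop-filter model with supply saturation is given below. It assumes that the filter is R in series with $C_1$, shunted by $C_2$, and uses a simple forward-Euler update; the difference equations of [Johns, 1997] are not reproduced here, so this stands in for them only as an illustration. The function name and default component values are assumptions.

```python
def simulate_charge_pump(i_cp_samples, dt, r=18.26e3, c1=75e-12, c2=3.5e-12,
                         v_init=0.0, v_dd=3.3):
    """Return the filter output voltage after each time step of length dt,
    given the charge-pump current (in amperes) applied during each step."""
    v_c1 = v_init    # voltage across C1
    v_out = v_init   # voltage across C2, i.e. the VCO control voltage
    trace = []
    for i_cp in i_cp_samples:
        i_r = (v_out - v_c1) / r                 # current through R into C1
        v_c1 += dt * i_r / c1
        v_out += dt * (i_cp - i_r) / c2
        # Model saturation of the node voltages at the supply rails.
        v_out = min(max(v_out, 0.0), v_dd)
        v_c1 = min(max(v_c1, 0.0), v_dd)
        trace.append(v_out)
    return trace
```

The time step must be kept well below $R\,C_1C_2/(C_1+C_2)$ for the update to remain stable; with the default values this time constant is roughly 60 ns, so a step of about 1 ns is adequate.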

As seen in Chapter 2, the loop filter components $C_1$, $C_2$, and R of Figure 3.8 must be chosen to give acceptable loop performance. To work towards this, a value of 0.45 was chosen for the Q of the PLL loop, which should ensure real transfer function poles [Johns, 1997]. This was arrived at by choosing a nominal Q of 0.5 (which gives real poles), ignoring the effect of capacitor $C_2$. To compensate for the phase-shift introduced by $C_2$, the nominal Q was lowered by 10%, to give 0.45. From Chapter 2, the expression for the Q of a second-order charge-pump PLL is given by Equation 3.1.

$$Q = \frac{1}{\omega_n R C_1}$$
(EQ 3.1)

Next, a reasonable value was found for the charge-pump current. It is known from Chapter 2 that the phase-detector gain is given as in Equation 3.2, and that the natural frequency of the PLL is given as in Equation 3.3.

$$K_d = \frac{I_{ch}}{2\pi}$$
(EQ 3.2)

$$\omega_n = \sqrt{\frac{K_o K_d}{C_1}}$$
(EQ 3.3)

In the current design, a large loop-bandwidth is desired, which allows the loop to correct for noise in the VCO output up to a high bandwidth (see Section 2.4.1). The loop bandwidth is approximately given by the natural frequency of the loop. Thus, to increase the loop bandwidth, one must choose large values for the phase detector and VCO gains. Also, a small value for $C_1$ should be chosen, which also helps to shrink the area of the physical layout. A value of 100 $\mu$A was chosen for $I_{ch}$ as a good trade-off between loop bandwidth and power dissipation.

As described previously, the choice of oscillator constant was governed by the frequency plan for the VCO. $K_a$ was found to be 1.92 MHz/V, and $K_d$ can be found to be 15.9 $\mu$A/rad. Next, a value for $C_1$ must be chosen. If the equations for Q and the natural frequency are combined, one gets the relation for R given in Equation 3.4. Note that all quantities except for R and $C_1$ have already been determined, so that the choice of $C_1$ dictates the value of R.

$$R = \frac{1}{Q\sqrt{C_1 K_o K_d}}$$
(EQ 3.4)

Both quantities must be physically and reliably implementable, which limits how small $C_1$ can be chosen. A good trade-off was found to be $C_1=75$ pF and R=18.26 kΩ. This leads to a loop-bandwidth of about 560 kHz.
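Equations 3.1 to 3.4 are easy to evaluate directly, as in the sketch below. The function names are illustrative, and $K_o$ is assumed to be expressed in rad/s/V. As a consistency check, the parameter set listed in Table 2.3 ($K_o = 15.7 \times 10^6$ rad/s/V, $I_{ch} = 100$ µA, $C_1 = 75$ pF, Q = 0.4) is used here rather than a new design point.

```python
import math


def phase_detector_gain(i_ch):
    """Equation 3.2: K_d in A/rad for a charge-pump current i_ch in A."""
    return i_ch / (2.0 * math.pi)


def natural_frequency(k_o, k_d, c1):
    """Equation 3.3: natural frequency in rad/s (k_o in rad/s/V)."""
    return math.sqrt(k_o * k_d / c1)


def loop_q(k_o, k_d, r, c1):
    """Equation 3.1: Q of the second-order charge-pump PLL."""
    return 1.0 / (natural_frequency(k_o, k_d, c1) * r * c1)


def resistor_for_q(q, k_o, k_d, c1):
    """Equation 3.4: filter resistor needed for a target Q."""
    return 1.0 / (q * math.sqrt(c1 * k_o * k_d))
```

With the Table 2.3 values, `resistor_for_q` returns a resistance within a few ohms of 18.26 kΩ, and `natural_frequency` corresponds to roughly 290 kHz. Other choices, such as Q = 0.45 with $K_o = 2\pi K_a$, can be substituted in the same way.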

With R and  $C_1$  assigned values, the next task is to select a value for  $C_2$ . Based on the conclusions in [Gardner, 1980], the ratio  $C_1/C_2$  was chosen to be larger than 8, to obtain real poles (as discussed in Section 2.4.7). The value of  $C_2$  was determined by examining the step response of the system for various values of  $C_2$ , as shown in Figure 3.9.

As can be seen in Figure 3.9, if  $C_2$  is too large, the system displays large overshoot, which is undesirable, since this could cause  $V_a$  to swing outside the allowable region (between the Low and High thresholds), causing a change in the digital state, even when a change was not really necessary. Once  $C_2$  is small enough to maintain real poles, it has little effect on the performance of the system. As a result, the choice of  $C_2$  was not a very sensitive one. A value of 3.5 pF was chosen, which results in an overshoot of 13.5% (as opposed to the 28% for  $C_2$ =10 pF), and a 1% settling time of 4.15 µs.



Figure 3.9: Step Responses for Various Values of C2

Finally, the ideal frequency response for the third-order, closed-loop PLL was plotted, as shown in Figure 3.10, where it can be seen that the actual 3-dB bandwidth of 955 kHz is larger than the expected value of 560 kHz. This large bandwidth is desirable, however, since it allows the PLL to track out more phase noise from the VCO.



Figure 3.10: Bode Plots for PLL System Design

# 3.3.3 Up/Down Control

A SIMULINK block was needed to carry out the function of the Up/Down Control logic. The role of this block is to detect when  $V_a$  leaves its allowed region, and then to increment or decrement the digital states as appropriate.

The SIMULINK model for this block is shown in Figure 3.11. In this diagram, Thermblock checks whether or not $V_a$ (which is the input) has left its allowed region. It does this on every rising clock edge. If $V_a$ has left this region, Thermblock increments or decrements $V_{therm}$ (Thermout) as necessary. Next, Binblock checks whether $V_{therm}$ has left its allowed region (-8 to 8). If it has, Binblock increments or decrements $V_{bin}$ (Binout) as necessary. Meanwhile, the switches below Thermblock detect that $V_{therm}$ has left its allowed region, and use this information to reset $V_{therm}$ to its middle value. No mechanism was included in the model for setting $V_a$ to Mid, because it added too much complexity to the model.



Figure 3.11: Block Diagram of Up/Down Control Block

The block diagram for Thermblock is shown in Figure 3.12. The switches on the input pass a zero to their output if $V_a$ lies inside its allowed region. If it rises above High (2.3 V), "Switch2" goes high, whereas if it falls below Low (1 V), "Switch1" goes high. These two conditions can never happen simultaneously, so that if the output of the subtraction block is non-zero, then $V_a$ has gone outside its allowed region. If $V_a$ has risen above this range, the output will be +1, and if it has fallen below this range, the output will be -1. Otherwise, the output will be zero. This subtraction output feeds a digital integrator, which stores the current value of $V_{therm}$. The saturation block is in place to limit the output to values between -8 and 8. If these values are reached, $V_{bin}$ will need to be changed.

Figure 3.12: Block Diagram of Thermblock (Same as Binblock)

The Binblock in Figure 3.11 is the same as Thermblock, except that the switch thresholds for Switch1 and Switch2 are -8 and 8, respectively.

The output waveforms for an input sinusoid on  $V_a$  are shown in Figure 3.13. Initially, the system is in its reset state, so both digital signals are at zero. At 0.01s, the input rises above the high threshold of 2.3 V (shown by a dotted line), causing  $V_{therm}$  to increment steadily. Eventually,  $V_{therm}$  clips at 8 (shown by a dotted line), and is reset to 0, while  $V_{bin}$  is incremented. This is continued until  $V_a$  falls below High again. A similar process occurs when  $V_a$  falls below the low threshold of 1 V.
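The behaviour described for the Up/Down Control block can be summarized in a compact sketch. It is a simplified stand-in for the SIMULINK blocks rather than a transcription of them: the carry into $V_{bin}$ and the reset of $V_{therm}$ are assumed to occur on the same clock edge on which $V_{therm}$ reaches its limit, and the class and function names are hypothetical.

```python
HIGH, LOW = 2.3, 1.0   # thresholds on Va, in volts
LIMIT = 8              # saturation limit of the digital states


def threshold_step(value, low, high):
    """Subtraction-block output: +1 above high, -1 below low, else 0."""
    return (1 if value > high else 0) - (1 if value < low else 0)


class UpDownControl:
    """Clocked model of the medium (Vtherm) and coarse (Vbin) counters."""

    def __init__(self):
        self.v_therm = 0
        self.v_bin = 0

    def clock(self, v_a):
        """Process one rising clock edge with the sampled value of Va."""
        self.v_therm += threshold_step(v_a, LOW, HIGH)
        if abs(self.v_therm) >= LIMIT:
            carry = 1 if self.v_therm > 0 else -1
            # Vbin saturates at its limits; Vtherm returns to its middle value.
            self.v_bin = max(-LIMIT, min(LIMIT, self.v_bin + carry))
            self.v_therm = 0
        return self.v_bin, self.v_therm
```

Holding the input above High for eight clock edges steps $V_{therm}$ up to its limit, increments $V_{bin}$ once, and returns $V_{therm}$ to zero, as in the waveform description above.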



Figure 3.13: Signals for Up/Down Control Block

# **3.4 SIMULINK Model**

The full SIMULINK model is shown in Figure 3.14. The QuadOsc block produces an in-phase and quadrature-phase square wave at a specified frequency and phase. Only the in-phase component is used for the PLL system. A saturation block is used on the in-phase signal to produce signal levels compatible with the logic blocks in the PFD. Rate limiters are used on the saturated signals to limit the pulse rise and fall times to typical values seen in HSPICE simulations (about 0.5 ns to rise from 0 to 3.3 V). Both inputs to the PFD are preceded by an inverter, to make the block negative-edge triggered, as was the case in the circuit design (see Chapter 4). The up and down signals from the PFD are then combined using a multiplexer, and fed to the Charge\_Pmp block, which is a mask for the C MEX file described in Section 3.3.2. This signal is then passed through a look-up table that approximates the effect of channel-length modulation in the transistors used to form the charge-pump current sources (see Chapter 4). This is followed by a 75 ns delay block that models the latency between changes in $V_a$ and changes in the VCO frequency. Similar delay blocks exist for $V_{therm}$ and $V_{bin}$. This delay stems from the mechanism used to adjust the frequency in the VCO, and will be discussed in Chapter 4. The VCO block is modelled by the QuadDCO1 block, which is a mask for the C MEX file described in Section 3.3.1. The Up/Down Control block, described in Section 3.3.3, uses a clock signal derived from the reference signal by a divide-by-64 clock divider. The input signal is the distorted, non-delayed charge-pump output signal. This block produces the medium and coarse control-signals as outputs, which are used to drive the QuadDCO1 block. Finally, the VCO output is a square wave with an amplitude of 1, which is then saturated to give signal-levels compatible with those in the PFD, and then rate-limited.

Figure 3.14: Complete SIMULINK System

# **3.5 SIMULINK Simulation Results**

The results of the system-level SIMULINK simulations will now be discussed. The discussion will start with the ideal system, followed by a discussion of the effects of the non-ideal delay through the VCO.

### 3.5.1 System With No Delay

Initially, the delay through the three delay blocks in Figure 3.14 was set to zero, and the performance of the resulting system was analyzed.

First, a 250 MHz waveform with a phase of 180 degrees was input into the system. The resulting waveform is shown in Figure 3.15. Initially, due to a start-up transient, $V_a$ drops below Low, which causes $V_{therm}$ to decrement by one. This slows down the VCO, and causes the phase error to increase more, which causes $V_a$ to rise in the opposite direction. However, because $V_{therm}$ is lower than required for the given VCO frequency, $V_a$ must rise above High, which triggers an increase in $V_{therm}$. This allows $V_a$ to decrease to its reset value of 1.6 V. Note that because there is no frequency error between the VCO free-running frequency and the input frequency, $V_a$ returns to its "middle" value at about 4.5 $\mu$s.



Figure 3.15: System Response to 180 Degree Phase Error

To examine the response of  $V_a$ , a small frequency error was input into the system. The response is shown in Figure 3.16, along with  $V_{therm}$  and  $V_{bin}$ . The response involves only  $V_a$ , which rises smoothly to the required voltage to lock the reference signal to the VCO output, in about 1.3  $\mu$ s.



Figure 3.16: System Response for 0.5 MHz Frequency Discrepancy

To examine the performance of  $V_{therm}$ , a larger frequency error was input into the system. The response for a 245 MHz input signal is shown in Figure 3.17. Due to the relatively large frequency error,  $V_a$  begins to move downward, and eventually drops below Low. This causes  $V_{therm}$  to be decremented. However,  $V_a$  must still move below Low to achieve lock, so  $V_{therm}$  is again decremented. This continues until finally  $V_a$  is forced above Low, where it settles to a steady-state value between the low and high thresholds at about 3  $\mu$ s.



Figure 3.17: System Response for Frequency Discrepancy of 5 MHz

To observe how all three control voltages interact, an even larger frequency error was input into the system. The response of the system to an input frequency of 325 MHz is shown in Figure 3.18. In the figure one can see that, as expected,  $V_a$  rapidly rises above High, causing  $V_{therm}$  to increment steadily. However, the frequency discrepancy is so large that  $V_a$  continues to rise, and clips at the power supply.  $V_{therm}$  reaches its upper limit of 8 several times, causing  $V_{bin}$  to increment until it clips at its maximum value of eight, as well. Eventually, the digital state gets high enough to allow  $V_a$  to drop below High. However,  $V_a$  takes some time to drop below High from the power supply, so  $V_{therm}$  and  $V_{bin}$  continue to increment, until they are higher than required. Thus, when  $V_a$  finally makes it below High, it falls below Low, causing  $V_{therm}$  to decrease, until finally  $V_a$  rises and settles between the low and high thresholds. The entire process takes 28  $\mu$ s.





Figure 3.18: System Response for 75 MHz Frequency Discrepancy

## 3.5.2 System With Delay

Once the performance of the system without delay was deemed satisfactory, a 75 ns delay was added to the VCO control signals, as shown in Figure 3.14. The simulation for the 325 MHz input was repeated, and it was found that the digital state displayed "ringing", in that the digital state bounced back and forth around the steady-state digital state several times before coming to rest, as shown in Figure 3.19. This overshoot is due to the inability of the up/down logic to respond to current changes in the charge-pump output to alter the VCO frequency. Thus,  $V_a$  ends up moving far past its thresholds before the logic can take action. When the logic finally does change its state, it overshoots the correct state, since  $V_a$  must change by a large amount to move back between the thresholds.

Figure 3.19: System Response for 75 MHz Frequency Discrepancy and 75 ns Loop Delay

This response is undesirable, since it greatly increases lock acquisition time. To circumvent this problem, there are three basic choices: decrease the delay, move the thresholds closer to the supply rails, or change the VCO constants. Because the delay was imposed on the system by the choice of VCO (which was critical to the design), the first method is not possible. Changing the thresholds would be a simple solution if they were generated off-chip; however, this is not the case. For the reference-generation method chosen, it was difficult to make the separation between thresholds large without a large sacrifice of power and device area. Also, it was desired to avoid distortion due to channel-length modulation in the charge-pump current sources, which limited the separation between high and low thresholds. The last option seemed to offer the fewest disadvantages (although it was by no means ideal), so it was decided to increase $K_a$. Specifically, $K_a$ was increased to 3 MHz/V from 1.92 MHz/V. This has the effect of spreading out the medium control regions in Figure 3.2, so that more tuning is possible within a fixed digital state. This makes the system less sensitive to the exact digital state, so that if the digital state is overshot or undershot by a state or so, $V_a$ can compensate. The drawback of this alteration is that the average accuracy of tuning possible with just the digital control signals is decreased.

Note that it was later found that in the actual system, the charge-pump saturated somewhat below the power supply, and that channel-length modulation caused the "down" current source to have larger current than its nominal value, which helped it pull down  $V_a$  faster (similar effects were observed on the "up" current). This further helped to avoid the behaviour observed in Figure 3.19.

The response of the altered system with 75 ns delay is shown in Figure 3.20. Note that the response has returned to one that is similar to the original un-delayed response in Figure 3.18.



Figure 3.20: Response of Altered System to 75 MHz Frequency Discrepancy and 75 ns Delay

# **3.6 References**

H. Djahanshahi, C.A.T. Salama, "Differential 0.35 µm CMOS Circuits for 622 MHz/933 MHz Monolithic Clock and Data Recovery Applications," *ISCAS 1999*, Session II-93.

D. Johns, K. Martin, Analog Integrated Circuit Design. Toronto: John Wiley & Sons, 1997.

Math Works Inc, MATLAB Application Program Interface Guide, 1996.

S. Nati, I. Kyles, "A Monolithic Gallium Arsenide Interval Timer IC with Integrated PLL Clock Synthesis Having 500-ps Single Shot Resolution," *IEEE Journal of Solid-State Circuits*, Vol. 32, no. 9, pp. 1350-1357, Sept. 1997.

J. Parker, D. Ray, "A 1.6-GHz CMOS PLL with On-Chip Loop Filter," *IEEE Journal of Solid-State Circuits*, Vol. 33, no. 3, pp. 337-343, Mar. 1998.

M. Rau, T. Oberst, R. Lares, A. Rothermel, R. Schweer, N. Menoux, "Clock/Data Recovery PLL Using Half-Frequency Clock," *IEEE Journal of Solid-State Circuits*, Vol. 32, no. 7, pp. 1156-1161, July 1997.

B. Razavi, "A 2-GHz 1.6-mW Phase-Locked Loop," *IEEE Journal of Solid-State Circuits*, Vol. 32, no. 5, pp. 730-735, May 1997.

V. von Kaenel, D. Aebischer, C. Piguet, E. Dijkstra, "A 320-MHz, 1.5-mW @ 1.35V CMOS PLL for Microprocessor Clock Generation," *IEEE Journal of Solid-State Circuits*, Vol. 31, no. 11, pp. 1715-1722, Nov. 1996.

I. Young, J. Greason, K. Wong, "A PLL Clock Generator with 5 to 110 MHz of Lock Range for Microprocessors," *IEEE Journal of Solid-State Circuits*, Vol. 27, no. 11, pp. 1599-1606, Nov. 1992.



This chapter discusses the transistor-level design of the system described in Chapter 3. Section 4.1 describes the block-level design of the circuit, while Sections 4.2-4.7 describe the transistor-level design of each of the blocks.

**NOTE:** The unit transistor size in all figures is 1.6 µm/0.4 µm for NMOS and 4.8 µm/0.4 µm for PMOS.

# 4.1 Overall System

The system block diagram is quite similar to the SIMULINK system of Figure 3.1, and is shown in simplified form in Figure 4.1. As in the SIMULINK system, the diagram in Figure 4.1 contains a phase-frequency detector (PFD), charge-pump, loop-filter, and a VCO. To simplify the discussion, the digital control-signals have been temporarily neglected.



Figure 4.1: Simplified System Schematic

#### **Chapter 4: Circuit Design**

The PLL uses a conventional charge-pump architecture, except that the VCO is controlled by changing the value of a resistor in the bias circuit ( $R_{bias}$ ). The charge-pump output drives a voltage-controlled resistor used to control the transconductance provided by a constant-transconductance bias circuit (CG<sub>m</sub>BC). The CG<sub>m</sub>BC is used to bias a transconductance-controlled oscillator (G<sub>m</sub>CO). By changing the resistor in the bias circuit, the frequency of oscillation of the G<sub>m</sub>CO is changed, so that the combination of the G<sub>m</sub>CO and CG<sub>m</sub>BC acts as a VCO.

The block diagram for the full system is shown in Figure 4.2. As in Figure 4.1, the diagram in Figure 4.2 contains a PFD, charge-pump, loop-filter, and VCO. However, since the diagram includes the digital control signals, the system now also contains two comparators and an up/down control block, and the adjustable  $CG_mBC$  is controlled by three signals (analog, digital-binary, digital-thermometer).



Figure 4.2: Full System Block-Diagram

# 4.2 Adjustable Constant-Transconductance Bias Circuit

The adjustable  $CG_mBC$  will be discussed in two parts. First, the bias circuit will be examined, ignoring the internal workings of the voltage-controlled resistor. This will be followed by a discussion of the operation of the voltage-controlled resistor alone, and in the bias circuit.

## 4.2.1 Bias Circuit Without Variable Resistor

The adjustable  $CG_mBC$  must provide a well-defined transconductance to the  $G_mCO$ , the value of which is controlled by the input signals Analog ( $V_a$ ), Digital-Therm ( $V_{therm}$ ), and Digital-Binary ( $V_{bin}$ ). Fine-tuning of the transconductance is provided by  $V_a$ , while  $V_{therm}$  and  $V_{bin}$  are used to alter the transconductance in "medium" and "large" discrete steps, respectively. The circuit schematic for the adjustable  $CG_mBC$  is shown in Figure 4.3.

Referring to the box labelled "Adjustable  $CG_mBC$" in Figure 4.3, it can be seen that the circuit is, to a degree, similar to the constant-transconductance bias circuit in [Johns, 1997]. Both circuits contain a CMOS version of a Widlar current source (including transistors m1-m4 and Bias Resistor in Figure 4.3), as suggested in [Steininger, 1990]. However, in [Johns, 1997], both the PMOS and NMOS transistors are cascoded, while only the NMOS transistors are cascoded in Figure 4.3. This was done to lower jitter in the $G_mCO$, as will be discussed in the next section. It can be shown [Johns, 1997] that the transconductance of m4 is equal to the conductance of the adjustable Bias Resistor. All circuits matched to the structure of the bias circuit will have transconductances proportional to  $g_{m4}$, so that the transconductances of transistors throughout the system can be altered by altering the value of the Bias Resistor.

Another important difference between the bias circuit in [Johns, 1997] and Figure 4.3 is that the resistor in Figure 4.3 is placed at the source of a PMOS transistor, instead of an NMOS. This was done in response to an analysis in [Hartman, 1997], in which it was found that the largest error contribution to the matching of  $g_{m4}$  to the bias resistor was from the body effect in the NMOS transistor whose source was connected to the resistor. By moving the resistor to the source of a PMOS transistor, the body-effect error can be removed by placing the PMOS transistor in its own well, with its source tied to its bulk, as shown in Figure 4.3.




Figure 4.3: Circuit Schematic for Adjustable Constant-Transconductance Bias Circuit


The circuit in Figure 4.3 includes a simple differential amplifier that takes its input from the drains of m3 and m4. This amplifier creates a negative-feedback loop that adjusts the bias voltage r1 to keep the drains of m3 and m4 at equal voltages. This means that  $V_{DS4}$  is equal to  $V_{DS3}$  plus the voltage drop across the adjustable Bias Resistor, which guarantees that there is a large voltage-drop across m4 (about 0.89 V, nominally). As will be seen in the next section, this helps to reduce the jitter in the $G_mCO$ signal.

Bias voltages must also be generated to bias wide-swing PMOS and NMOS current mirrors, so that high output-impedance current sources can be used without sacrificing too much voltage swing [Johns, 1997]. This is accomplished by transistors m7-m10 in Figure 4.3. The bias circuit also incorporates a start-up circuit, very similar to the one used in [Johns, 1997].

The opamp design was straightforward, since only a low-speed, low-gain circuit was required. As shown in Figure 4.3, the opamp is a single-stage design with no cascode devices, which would otherwise be used to obtain higher gain. The characteristics of the opamp are summarized in Table 4.1, assuming a bias-circuit resistor of  $3.5 \text{ k}\Omega$. These characteristics include the effect of the 1 pF C<sub>n1</sub> that loads the output of the opamp, and lowers the bandwidth (the unloaded 3-dB bandwidth is 669 MHz).

| Characteristic       | Value      |
|----------------------|------------|
| Unity-Gain Frequency | 12.87 MHz  |
| Open-Loop DC Gain    | 28.3 dB    |
| 3-dB Bandwidth       | 1.43 MHz   |
| Phase-Margin         | 93 Degrees |
| Power Consumption    | 56.6 μW    |

**Table 4.1. Opamp Characteristics** 

Because the bias circuit contains several feedback loops (some of them positive), the stability of the circuit must be examined. To compensate the circuit, capacitor  $C_c$  was added across transistors m1-m1cas (see Figure 4.3). Capacitors  $C_{n1}$ - $C_{n2}$  and  $C_{p1}$ - $C_{p2}$  were added to decouple noise from the bias voltages to ground. The loop gain for the circuit was measured using a similar method to the one used in [Seevinck, 1998], as illustrated in Figure 4.4.



Figure 4.4: Measurement of Bias Circuit Loop Gain

The results of this loop gain measurement are plotted in Figure 4.5. The characteristics of the bias loop are summarized in Table 4.2. The loop response is essentially first-order low-pass up to about 1 MHz. The DC loop-gain is fairly high at 55.93 dB, but the loop has a low cutoff frequency of 37.3 kHz. Because the circuit is used for biasing, the low bandwidth is not a problem. However, it was necessary to check whether the bias-circuit response had an effect on the PLL loop response. The capacitor C<sub>c</sub> has been chosen to give an adequate phase-margin of just under 70 degrees.


To check whether the AC response of the bias circuit affected the loop AC response, an AC source was used to excite  $V_a$  in the adjustable resistor (see Section 4.2.2), and the response of the bias voltage r1 was monitored. To prevent the bias-circuit frequency-response from affecting the overall loop response, it was desired to have the first pole introduced by the bias circuit be at least ten times greater than the nominal loop-filter pole (which is roughly at 117 kHz). This leads to a required minimum pole frequency of 1.2 MHz. Through simulation, it was found that the first pole of the bias circuit lies at a 3-dB frequency of 9.7 MHz, which is well above the required minimum pole frequency.
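The pole-separation rule above can be checked with a few lines of arithmetic. The following is a minimal sketch using only the numbers quoted in the text; the function and constant names are illustrative and are not taken from the original design files.

```python
# Pole-separation check for the bias circuit (values quoted in the text).
LOOP_FILTER_POLE_HZ = 117e3      # nominal loop-filter pole
SEPARATION_FACTOR = 10.0         # "at least ten times greater"
SIMULATED_BIAS_POLE_HZ = 9.7e6   # simulated first pole of the bias circuit


def required_min_pole(loop_pole_hz: float, factor: float) -> float:
    """Smallest bias-circuit pole frequency that satisfies the separation rule."""
    return factor * loop_pole_hz


required_hz = required_min_pole(LOOP_FILTER_POLE_HZ, SEPARATION_FACTOR)
margin = SIMULATED_BIAS_POLE_HZ / required_hz

print(f"required minimum pole: {required_hz / 1e6:.2f} MHz")
print(f"simulated pole exceeds it by a factor of {margin:.1f}")
```

The required value evaluates to 1.17 MHz, which the text rounds to 1.2 MHz, and the simulated pole sits roughly eight times above it.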



Figure 4.5: Frequency Response of Bias Loop

| Characteristic       | Value      |
|----------------------|------------|
| 3-dB Bandwidth       | 37.3 kHz   |
| Unity-Gain Frequency | 18.8 MHz   |
| DC Gain              | 55.93 dB   |
| Phase Margin         | 69 degrees |

Table 4.2. Characteristics of Bias Loop

Figure 4.3 contains one more box, labelled "Reference Signal Generation". This circuitry generates the reference voltages High and Low, used by the comparators to check if  $V_a$  has left its allowed region of operation. It also generates a bias voltage called r3p, which is used to bias the digital-controlled resistors when they are active (see Section 4.2.2).

## 4.2.2 Voltage-Controlled Resistor

The voltage-controlled resistor is composed of an array of triode-region transistors controlled by  $V_{bin}$ ,  $V_{therm}$ , and  $V_a$ . From Chapter 3,  $V_{bin}$  is 4 bits, and  $V_{therm}$  is 15 bits.

A diagram for the voltage-controlled resistor is given in Figure 4.6. It consists of a 5  $k\Omega$  resistor in parallel with a triode-region transistor controlled by  $V_a$ , along with an array of digital switches controlled by  $V_{bin}$  and  $V_{therm}$ . Each digital switch controls a triode-region transistor, by connecting its gate to VDD (in which case it is inactive) or the bias voltage r3p (in which case it is active). Recall that r3p is generated from the bias circuit, as shown in Figure 4.3. Each transistor controlled by a thermometer-code switch is the same size, while the transistors controlled by the binary switches are binary-weighted. The 5k $\Omega$  resistor is present so that when all the digital switches are turned off, the conductance of the voltage-controlled resistor is large enough to allow all devices in the bias circuit to be active.



#### Figure 4.6: Voltage-Controlled Resistor

The triode-region transistors were sized to give the desired VCO gains, as found from the SIMULINK simulations, and also to ensure that  $V_a$  returned to roughly Mid (or (High+Low)/2) whenever  $V_{therm}$  changed. To accomplish this, the following sizing was used for the transistors in Figure 4.3: assuming the width-to-length ratio for the thermometer-code triode-region transistors is X, then  $(W/L)_{ma}=2X$,  $(W/L)_{mr3p}=X/4$,  $(W/L)_{mlow}=X/9$, and  $(W/L)_{mhigh}=X$. To see why this works, note that  $V_{eff,mlow}=3V_{eff}$,  $V_{eff,mr3p}=2V_{eff}$, and  $V_{eff,ml(therm)}=2V_{eff}$, where  $V_{eff}$  is the overdrive voltage for mhigh, and  $V_{eff,ml(therm)}$  is the overdrive voltage for the triode transistors in the thermometer-code digital resistors when on. Thus, when  $V_a$  moves from Mid to High (a voltage change of  $V_{eff}$), it creates an admittance change in the bias resistor of  $2X\mu_pC_{ox}V_{eff}$. Once it hits the threshold value, the digital state is changed, causing another admittance increase of  $2X\mu_pC_{ox}V_{eff}$. The negative feedback loop will then force  $V_a$  back to roughly Mid, since this pushes the resistor admittance back down by  $2X\mu_pC_{ox}V_{eff}$. Note that the thresholds High and Low change in value with the bias resistor (High is given by VDD- $V_{eff}$ - $V_{tp}$, and Low is given by VDD- $3V_{eff}$ - $V_{tp}$) [Martin, 1999]. Finally, note that the preceding analysis assumes that the voltage-drop across  $R_{bias}$  ( $V_{Rbias}$ ) is small relative to  $V_{eff}$. Otherwise, the change in bias-resistor conductance when a thermometer-code resistor is switched in or out is roughly  $X\mu_pC_{ox}(2V_{eff} - V_{Rbias})$, neglecting the effects on  $R_{bias}$  of the change in  $V_{Rbias}$  after switching.

The final VCO gains were used in the SIMULINK model, and the simulations were redone, resulting in the plots shown in Figures 4.7 and 4.8. The system can be seen to respond quickly with little ringing, in a similar fashion to the initial design.




Figure 4.7: Response of Final System to 275 MHz Input






Figure 4.8: Response of Final System to 325 MHz Input

With the variable-resistor in place, the bias circuit consumed 1.75 mW in simulation, including the reference-generators for Low, High, and r3p. A comparison of  $g_{m4}$  with the conductance of the variable bias-resistor is given in Figure 4.9, for a bias resistance ranging from 3830 $\Omega$  to 3930 $\Omega$ , which shows a matching error ranging from 0.42% to 0.59%, indicating that transistor transconductances are well-matched to the conductance of the variable resistor.



Figure 4.9: Variable-Resistor Conductance and gm4 Versus Va

# **4.3 Transconductance-Controlled Oscillator**

The design of the transconductance-controlled oscillator ( $G_mCO$ ) will now be discussed. This will be followed by a description of the  $G_mCO$  output buffer.

## 4.3.1 Oscillator Design

The block diagram for the  $G_mCO$  is shown in Figure 4.10. The  $G_mCO$  is a ring oscillator composed of four differential delay-stages. The last stage has its outputs cross-coupled to the next stage to ensure that there are an odd number of inversions through the ring at DC.



Figure 4.10: G<sub>m</sub>CO Schematic

The selection of a four-stage topology allows the generation of quadrature-phase signals for free [Buchwald, 1991], which is beneficial for such modulation schemes as QAM and SSB. According to the Barkhausen criterion [Sedra, 1991], for a positive feedback loop to be unstable, the loop gain must be equal to unity and the phase shift through the loop must be  $2\pi k$ (where k is some integer) at the frequency of oscillation. In this circuit, the first three inverters supply a 180 degree phase-shift due to the connection of the output terminals, plus an additional phase-shift due to delay through the inverter. The last inverter simply supplies a phase-shift due to the delay through the circuit. Thus, there is a total of 540 degrees (or 180 degrees, equivalently) through the loop simply due to wire connections. Another 180 degrees must be made up for through inverter delay. Since all the inverter delays are kept equal, each inverter must supply 180/4=45 degrees of delay-induced phase-shift. Thus, quadrature oscillator signals can be generated by tapping the outputs of any pair of inverters separated by one inverter, since the phase-shift between such inverters is 450 degrees, or, equivalently, 90 degrees. These observations can be made directly from Figure 4.11, which shows the VCO signals at various points in the loop. The top plot shows the input and output of the first inverter stage. Note the 225 degree phase shift. The middle plot shows the input and output of the fourth (non-inverting) inverter stage, which, as expected, generates a 45 degree phase-shift. Finally, the outputs of the first and third delay stages are shown, which can be seen to be in quadrature with one another.
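The phase bookkeeping in the preceding paragraph can be reproduced in a short script. This is a minimal sketch; the list layout and variable names are illustrative only.

```python
# Phase bookkeeping for the four-stage differential ring oscillator.
N_STAGES = 4
# Wiring-induced phase shift per stage: three inverting stages, one non-inverting.
wiring_deg = [180, 180, 180, 0]

# The loop phase must total a multiple of 360 degrees; the remainder after
# wiring is made up by the (equal) delay of each stage.
delay_total_deg = (360 - sum(wiring_deg)) % 360
delay_per_stage_deg = delay_total_deg / N_STAGES

stage_phase_deg = [w + delay_per_stage_deg for w in wiring_deg]
loop_phase_deg = sum(stage_phase_deg)

# Phase between outputs separated by one stage (two stages traversed),
# for every starting point around the ring.
tap_phase_deg = [
    (stage_phase_deg[(i + 1) % N_STAGES] + stage_phase_deg[(i + 2) % N_STAGES]) % 360
    for i in range(N_STAGES)
]

print("delay-induced phase per stage:", delay_per_stage_deg)
print("per-stage phase:", stage_phase_deg)
print("loop phase mod 360:", loop_phase_deg % 360)
print("phase between alternate taps:", tap_phase_deg)
```

Every pair of alternate taps is separated by 90 or 270 degrees, so any such pair is in quadrature.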



Figure 4.11: Ring Oscillator Signals at Outputs of Various Stages

The transistor-level schematic for the differential inverters is shown in Figure 4.12. The inverters are composed of a differential pair formed by m2-m3, with a cascode tail current-source, that drives triode-region loads m4-m5. The current-source supplies 150  $\mu$ A at 250 MHz, for a total ring-oscillator DC power dissipation of 1.98 mW. The frequency response for the inverter is shown in Figure 4.13, assuming the bias conditions for 250 MHz oscillation. The gain at 250 MHz is 8.2 dB (or about 2.6 V/V), which is larger than necessary to produce oscillation, but provides some insurance against temperature and process variations.
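The quoted DC power and linear gain follow from simple arithmetic. The sketch below assumes the 3.3 V supply stated in the abstract and four identical stages; the product of the four stage gains is shown only to illustrate the margin over the unity loop gain required for oscillation.

```python
# Arithmetic behind the quoted ring-oscillator figures.
VDD = 3.3                # supply voltage in volts (from the abstract)
TAIL_CURRENT = 150e-6    # tail current per delay stage at 250 MHz, in amperes
N_STAGES = 4
GAIN_DB = 8.2            # simulated stage gain at 250 MHz

dc_power_w = N_STAGES * TAIL_CURRENT * VDD
stage_gain = 10 ** (GAIN_DB / 20)        # dB to V/V
loop_gain = stage_gain ** N_STAGES       # magnitude around the ring

print(f"DC power: {dc_power_w * 1e3:.2f} mW")
print(f"stage gain: {stage_gain:.2f} V/V, loop gain: {loop_gain:.1f}")
```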





The delay through this inverter can be shown to be roughly equal to  $r_{ds4}C_L \ln 2$  (see Appendix B). If the drain-to-source voltage of the loads is assumed to be small so that m4-m5 are always in the triode region, then  $r_{ds4}$  is proportional to  $1/g_{m2}$, which is in turn proportional to the resistance of the adjustable resistor in the bias circuit of Figure 4.3, so that the voltages controlling the resistor directly control the frequency of oscillation in the G<sub>m</sub>CO. To see this, recall the equation for the triode-region resistance of a MOSFET, given in Equation 4.1, where the approximation holds if  $V_{eff}$  is much larger than the source-to-drain voltage of the PMOS.

$$r_{ds,\text{triode}} = \frac{1}{\mu_p C_{ox} \left(\frac{W}{L}\right) (V_{eff} - V_{sd})} \cong \frac{1}{\mu_p C_{ox} \left(\frac{W}{L}\right) V_{eff}}$$
(EQ 4.1)

However, if the voltage-swing at the output of the VCO gets very large (comparable to  $V_{eff4}$), then this approximation starts to break down. If the output drops lower than VDD -  $V_{eff4}$, then the load device enters the active region, and the approximation that  $r_{ds4}$  is proportional to  $1/g_{m2}$  no longer holds. Hence, the maximum differential voltage swing to ensure that the loads are always in the triode region is  $V_{eff4}$  peak, or roughly 0.35 V.
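To give a feel for how quickly the approximation in Equation 4.1 degrades, the sketch below compares the two forms of the expression as the source-to-drain voltage grows. The value of $\mu_pC_{ox}(W/L)$ is a hypothetical placeholder (it cancels in the error ratio), and $V_{eff}$ is set to the roughly 0.35 V quoted for the loads.

```python
def rds_triode(k_wl: float, v_eff: float, v_sd: float) -> float:
    """Triode resistance as written on the left of Equation 4.1."""
    if v_sd >= v_eff:
        raise ValueError("V_sd >= V_eff: device has left the triode region")
    return 1.0 / (k_wl * (v_eff - v_sd))


def rds_triode_approx(k_wl: float, v_eff: float) -> float:
    """Small-V_sd approximation on the right of Equation 4.1."""
    return 1.0 / (k_wl * v_eff)


def relative_error(v_eff: float, v_sd: float, k_wl: float = 1e-3) -> float:
    """Fractional error of the approximation; k_wl is a placeholder that cancels."""
    return rds_triode(k_wl, v_eff, v_sd) / rds_triode_approx(k_wl, v_eff) - 1.0


V_EFF = 0.35  # volts, approximate load overdrive quoted in the text
for v_sd in (0.035, 0.10, 0.20, 0.30):
    print(f"V_sd = {v_sd:.3f} V -> error = {100 * relative_error(V_EFF, v_sd):.0f}%")
```

Even a source-to-drain voltage of one tenth of $V_{eff}$ already gives an error of about 11%, which illustrates why large output swings compromise the proportionality between $r_{ds4}$ and $1/g_{m2}$.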

For low-jitter design, this is not a good result, since all analysis of ring-oscillator jitter has shown that to decrease jitter, VCO output power must be increased [Razavi, 1996][McNeill, 1997], as seen in Section 2.3.5.1. Indeed, most ring-oscillator designs attempt to make the VCO signals as large as possible, such as in the fully-switching delay-cell design outlined in [Park, 1999]. Clearly, a trade-off must be made between jitter performance and matching between the bias circuit and ring oscillator.

Note that the structure of the inverter mimics that of transistors m2 and m4 in the bias circuit of Figure 4.3. To see this, notice that when the differential pair is completely switching all current through m2, m3 is off while m2 can be modelled as being close to a short-circuit. Hence, the circuit looks like m4 being biased by the cascode current source m1, which is analogous to the bias circuit structure, except that m4 in the bias circuit is diode-connected. Thus, if m4 in the bias circuit is biased with a large drain-to-source voltage (as it is) then so should m4 in the inverter. To further boost the output swing of the VCO, the lengths of the load devices were increased to 0.5  $\mu$m, which increased the output resistance, and hence the output swing. The differential output swing at 250 MHz with this modification was 1.6 V peak-to-peak. The only problem with this design is that when the outputs go low, there is a large voltage (0.8 V) across the load devices, which causes them to enter the saturation region, so that the matching of the VCO frequency to the adjustable bias resistor is compromised. However, it was found in simulation that the VCO frequency still had a strong proportionality to transistor  $g_m$. This is demonstrated in Figure 4.14, which shows the simulated variation of the transconductance of mhigh in the bias circuit, as the VCO frequency is varied. The relation matches the LMS error fit line with a standard deviation of  $1.7 \,\mu$A/V, which is only 0.6% of the transconductance at 250 MHz.



Figure 4.14: Simulated Transconductance of mhigh Versus VCO Frequency

Load capacitors are added to each stage to slow down the inverter stages to give a 250 MHz free-running frequency, and also to diminish the effects of the non-linear parasitic capacitance seen looking into the inverter stages. The load capacitors on the first stage are made smaller than those on the other stages to account for the capacitive loading of the VCO buffer (which is driven by the first inverter stage).

Figure 4.15 shows the variation of the VCO frequency as the binary control-signal is changed (assuming the analog and thermometer control signals are fixed at their middle values). It can be seen that the VCO is tunable from about 150 MHz to 475 MHz.

The total oscillator power dissipation (dynamic and static) for the ring oscillator was simulated to be 2.21 mW at 250 MHz.



Figure 4.15: VCO Frequency Versus Binary Control Signal (Analog and Thermometer Control Signals Fixed at Middle Values)

## **4.3.2 Oscillator Buffer Design**

The schematic for the buffer is shown in Figure 4.16. It consists of a differential-to-single-ended converter that drives two series CMOS inverters. Figure 4.17 shows the buffer signals for a ring-oscillator frequency of 250 MHz. The bottom plot is the signal at the input to the buffer, the second from the bottom is the signal before the inverters, the third from the bottom is the signal between the two inverters, while the top plot gives the final output signal. The differential-to-single-ended converter draws a large amount of current (369  $\mu$A at the free-running frequency), in order to drive the input capacitance of the first inverter with a peak-to-peak voltage of 1.64 V (see Figure 4.17). The first inverter is scaled to have a high threshold of 2.35 V, in order to maintain a roughly 50% duty cycle; however, this inverter has slow fall-times, due to the small NMOS transistor used. To equalize rise and fall times to about 0.3 ns (10% to 90%), a symmetric CMOS inverter with a threshold of 1.63 V is added before the output.

The VCO buffer was found through simulation to have a total power dissipation of 2.64 mW at 250 MHz.





Figure 4.17: VCO Buffer Signals

# **4.4 Phase-Frequency Detector**

The fact that the PLL contains no frequency-dividers means that the phase-frequency detector (whose schematic is shown in Figure 4.18) must operate at the full speed of the VCO (i.e. as high as 325 MHz). While there is no strong reason not to use a frequency divider, it was decided to examine the performance of the loop with the phase-detector running at the full speed of the VCO. At these speeds, delays through the NAND gate and flip-flops begin to have a noticeable effect on performance. These effects were observed qualitatively in the system modeling of Section 3.3.2. More quantitatively, it is shown in [Soyuer, 1990] that the mean value of Up-Down (where Up and Down assume values of 0 and 1) ideally varies with  $\beta = \frac{f_{ref} - f_{vco}}{f_{vco}}$  as given in Equation 4.2, for  $f_{ref} > f_{vco}$.



Figure 4.18: Block Diagram for PFD

If, however, the path through the NAND gate, into the flip-flop reset, and back out to the flip-flop output, has a delay of  $\Delta R$, then the average value of Up-Down is given by Equation 4.3, for  $f_{ref} > f_{vco}$, where  $T_{vco}$  is the period of the VCO signal entering the PFD.

$$\overline{(U-D)} = \frac{\beta+0.5}{\beta+1} - \frac{\Delta R}{T_{vco}}$$
(EQ 4.3)

Similar expressions are found for the case when  $f_{ref} < f_{vco}$, and are plotted in Figure 4.19 for several cases of  $\beta$, and a fixed reference frequency of 300 MHz.
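Equation 4.3 is easy to evaluate directly. The sketch below fixes the reference at 300 MHz, derives $T_{vco}$ from the definition of $\beta$, and treats $\Delta R = 0$ as the ideal (delay-free) case. The restriction to $\beta \ge 0$ and the specific delay values are assumptions made for illustration.

```python
def mean_up_minus_down(beta: float, delta_r: float = 0.0, f_ref: float = 300e6) -> float:
    """Evaluate Equation 4.3 for beta = (f_ref - f_vco) / f_vco >= 0.

    delta_r is the reset-path delay in seconds; delta_r = 0 is the delay-free case.
    """
    if beta < 0:
        raise ValueError("this sketch only covers beta >= 0 (f_ref >= f_vco)")
    f_vco = f_ref / (1.0 + beta)      # rearranged definition of beta
    t_vco = 1.0 / f_vco
    return (beta + 0.5) / (beta + 1.0) - delta_r / t_vco


# Illustrative delays: none, 150 ps, and 540 ps.
for delta_r in (0.0, 150e-12, 540e-12):
    row = [mean_up_minus_down(b, delta_r) for b in (0.0, 0.25, 1.0, 4.0)]
    print(f"delta_R = {delta_r * 1e12:5.0f} ps:", [f"{v:.3f}" for v in row])
```

For a fixed $\beta$, each increase in $\Delta R$ lowers the mean output, so the charge pump delivers less net current for the same frequency error.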





Figure 4.19: Mean Value of (Up-Down) for Various Values of Gate Delay  $\Delta R$ 

Note that as the gate delay ( $\Delta R$ ) increases, the maximum value attainable by  $\overline{U-D}$  decreases, as does the value at a given value of  $\beta$  or  $\gamma$. Thus, the PLL cannot respond to frequency errors as quickly as it can when no delay is present.

From the above discussion, it can be appreciated that it is important for fast frequency acquisition to minimize the reset time through the NAND and flip-flops. While a conventional CMOS NAND gate can easily be designed to give fast propagation times (on the order of 30 ps in  $0.35\mu$m technology, unloaded), conventional D flip-flop designs contain a large number of transistors, and correspondingly high propagation times. For example, the flip-flop design in [von Kaenel, 1996] has 19 transistors, and a reset-to-output propagation delay of 216 ps. Also, since the reset signal sees ten transistor gates, the capacitive load on the NOR gate that generates the reset signal is very large. As a result, the propagation time through the NOR gate is increased to 325 ps. This value cannot be substantially improved by increasing the sizes of the NOR gate transistors, since this increases the load seen by the flip-flops. As a result, the "default" pulses appearing at the flip-flop outputs have a width of 0.54 ns. While this is adequate for lower-speed designs, it degrades performance too much for signals with periods as low as 3 ns. While the ideal maximum value of  $\overline{U-D}$  is unity, the value in this case is reduced to 0.83.

To minimize the reset time through the flip-flops, and to decrease the loading on the NAND gate, a very simple flip-flop was designed (similar to the one used in [Kim, 1997]), based on the true single-phase clock (TSPC) logic family [Yuan, 1989]. The flip-flop is composed of two cascaded level-sensitive inverting latches, as shown in Figure 4.20a). The first (formed by transistors m1, m2, and m6) is active on a high clock, while the second (formed by transistors m3-m5) is active on a low clock, so that the cascaded combination is negative-edge triggered. Note, however, that this circuit, in general, will not work as a true D flip-flop, since, when the clock is low, the output q is not isolated from changes in the input D.

In TSPC D flip-flops, there is usually an isolation stage between the positive and negative latches to isolate the output of the flip-flop from changes in the input when the second latch is active [Yuan, 1989]. However, for this work, this design will act as the desired flip-flop, since the input is always high. In this case, when clk goes high, the drain of m2 is pulled down to VSS, and when clk drops low, the drain of m2 is left floating, holding the VSS value. This value causes m5 to pull the output high, as desired. The removal of the isolation stage from the circuit lowers the transistor count by three, and decreases the delay through the flip-flop from the reset input to the output.

One further required modification is the addition of some reset mechanism. This is accomplished through the addition of m7, which up to now, has been ignored. An active-low reset has been chosen to allow the use of a NAND gate, instead of an AND gate. This transistor performs an asynchronous reset operation whenever reset goes low. To decrease the reset time, m7 and m3 are made fairly large.

Finally, since the D input is always high, the flip-flop may be simplified, as shown in Figure 4.20. Because m2 is always off, it can be removed from the circuit, and because m1 is always on, it can be replaced with a short circuit. With these modifications, the circuit appears as shown in Figure 4.20b).

Simulations show that the delay from the reset input through to the output is 76.7 ps, which is almost three times faster than the flip-flop used in [von Kaenel, 1996]. The circuit uses only five transistors, instead of 19 in the case of [von Kaenel, 1996].



Figure 4.20: Circuit Schematic for PFD Flip-Flop

The inverters shown in Figure 4.18 use unit-sized transistors, since the capacitive load seen at each of the outputs is a single transistor gate. These inverters are in place to generate complements of the up and down outputs, which are required to drive the charge-pump circuit. The inverters have a very small propagation delay of 47 ps, including loading effects. The inverters provide output waveforms with fast 10%-90% rise and fall times ranging from 98 ps to 120 ps. The NAND gate is a conventional CMOS design. With the loading effects of the flip-flops included, the NAND propagation delay was 78 ps, which shows great improvement over the von Kaenel design.

When all the above blocks are put together as shown in Figure 4.18, simulations show that the default pulse width on the up/down outputs is very narrow at 150 ps, which is 3.6 times smaller than in [von Kaenel, 1996]. Thus, the theoretical maximum value of  $\overline{U-D}$  is increased to 0.95.

The circuit response to a VCO frequency of 333.3 MHz and a reference frequency of 312.5 MHz is shown in Figure 4.21. The Up signal is virtually inactive, with extremely narrow periodic spikes, while the Down signal gradually increases in duty cycle, as expected. Note that even at these high input frequencies, the circuit response looks close to ideal.

Figure 4.21: PFD Response for f<sub>VCO</sub> > f<sub>Ref</sub>

The circuit response for the maximum phase error for which the circuit responds correctly at 333.3 MHz is shown in Figure 4.22. The delay between the VCO and reference signal, which are both running at 333.3 MHz, is 2.62 ns, for a phase difference of about 315 degrees, so that the PFD operates correctly for phase shifts between plus and minus 315 degrees at 333.3 MHz.



Figure 4.22: Response of PFD for Maximum Phase Error at 333.3 MHz

The mean value of Up-Down is plotted versus phase error between the VCO signal and the reference signal in Figure 4.23, assuming 333.3 MHz input signals. The ideal plot is shown as a dotted line in this figure. The main difference between the two is that the simulated plot does not rise straight to 3.3 V at 360 degrees (and -3.3 V at -360 degrees), as the ideal plot does. This is due to the delay incurred in resetting the flip-flops. As can be seen, the PFD behaves ideally only between plus and minus 310 degrees of phase error at 333.3 MHz. This range increases as the frequency of the input signals is decreased, since the size of the reset delay decreases relative to the signal period. At 250 MHz, the range would be  $\pm 323^{\circ}$ , and at 175 MHz, the range would be  $\pm 333^{\circ}$ .
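The scaling of the usable phase range with frequency can be checked with a short calculation. The sketch below is illustrative only: it assumes the reset delay is a fixed dead time, inferred here from the ±310 degree range at 333.3 MHz, and the function names are hypothetical. Under that assumption it gives roughly ±322.5 degrees at 250 MHz and ±333.7 degrees at 175 MHz, close to the values quoted above, and it also converts the 2.62 ns delay of Figure 4.22 into a phase of about 314 degrees.

```python
# Sketch: usable PFD phase range versus input frequency, assuming the
# reset path contributes a fixed dead time (an assumption, not a
# simulated result).

def delay_to_phase_deg(delay_s, f_in_hz):
    """Convert a time delay between two equal-frequency signals to degrees."""
    return 360.0 * delay_s * f_in_hz


def pfd_linear_range_deg(f_in_hz, t_dead_s):
    """Phase-error magnitude (degrees) up to which the PFD stays linear."""
    return 360.0 * (1.0 - t_dead_s * f_in_hz)


# Dead time inferred from the +/-310 degree range quoted at 333.3 MHz.
F_REF_HZ = 333.3e6
T_DEAD_S = (360.0 - 310.0) / 360.0 / F_REF_HZ

for f_hz in (333.3e6, 250e6, 175e6):
    rng = pfd_linear_range_deg(f_hz, T_DEAD_S)
    print(f"{f_hz / 1e6:6.1f} MHz: +/-{rng:.1f} degrees")

print(f"2.62 ns at 333.3 MHz: {delay_to_phase_deg(2.62e-9, F_REF_HZ):.1f} degrees")
```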



Figure 4.23: Mean Value of Up-Down Versus Phase Error Between Inputs at 333 MHz

The PFD circuit was found to dissipate 135  $\mu W$  of power, assuming 333 MHz input signals.

# 4.5 Charge-Pump

The design of the charge-pump will first be discussed, followed by a discussion of the loop-filter design, including self-biasing considerations.

## 4.5.1 Charge-Pump Design

The design of the charge-pump began with a straightforward structure, as shown in Figure 4.24. The circuit contains two cascode current-sources (formed by m4 and m5), which are matched to transistors m7 and m9 in the bias circuit, so that the up and down currents are closely matched. The current provided by these sources is directed into or out of the loop filter impedance using conventional current mirrors. These current mirrors are turned on and off by the switches m8 and m7. The transistors m1 and m2 are given large lengths (0.6  $\mu$m) to reduce the effects of channel-length modulation on the charging/discharging current. Channel-length modulation alters the phase-detector constant by changing the current supplied by m1 and m2 as the output voltage changes. The capacitor C<sub>load</sub> has been added to equalize the propagation delay times seen by the Up and Down paths. The Up signals experience an extra inverter gate delay from the PFD, and also see a larger input capacitance looking into the charge-pump circuit.

This design is simple and gives reasonable performance; however, it has the drawback that large transients occur in the circuit whenever m8 or m7 is turned off. Consider the case of m8 turning off. Initially, m8 is turned on, which pulls the gate of m6 close to the power supply, turning off m6 and m2. This results in an increase in the current flowing through m5 (by about 20% in simulations). When m8 is turned off, the current source m5 must quickly reduce its current to 100 $\mu$A, while m6 and m2 turn on. The important point is that the gate of m6 goes through large transients (on the order of 1 V) every time m8 is turned on or off, which slows down the switching of current in the circuit. Similar transients occur at the gate of m3.

To overcome this problem, the circuit was modified as shown in Figure 4.25. The design retains the cascode current-sources m4 and m5, but dummy structures formed by m6p/m8p and m3p/m7p have now been added. Instead of loading the current-sources with a triode-region transistor (with a small voltage drop across it) when no change is to be made to the loop filter output voltage, the current from these sources is directed into dummy structures [Lacy, 1999]. This minimizes transients at the drains of m8 and m7, and speeds up the response of the circuit, as well as decreasing the power dissipation, since the nominal current flows through m4 and m5 when no changes are to be made to the output voltage (instead of the roughly 20% larger current in the previous design).




Figure 4.24: Initial Circuit Schematic for Charge-Pump



Figure 4.25: Circuit Schematic for Revised Charge-Pump Design

A plot for the charge/discharge currents, as well as for the output voltage of the charge-pump and PFD combination is given in Figure 4.26, assuming both input frequencies are 333.3 MHz. Ideally, the current through m1 and m2 should be equal, so that no net charge is transferred to the loop filter, and the output voltage remains constant. In practice, however, this is not possible (for the given design), since the signal paths for creating the charge and discharge currents are different. Because the charging current is created through a group of p-channel devices, it has slower rise and fall times than the discharging current. Despite efforts to make these current pulses equal in total charge transfer, there is a 6.8% mismatch between the RMS values of the charge and discharge currents, assuming an output voltage of about 1.6 V. As the output voltage changes, however, this mismatch changes, due to channel-length modulation (13.5% at 2 V output, 0.9% at 1.3 V output). As a result of this mismatch, the output voltage drifts when the VCO is in phase with the input, as seen in Figure 4.26. Nevertheless, the charge-pump does give fast rise and fall times for the up and down current pulses, as desired. The rise and fall times for the up current are 509 ps and 1000 ps, respectively.

The charge-pump circuit was found through simulation to dissipate about 710  $\mu$ W, assuming 333 MHz input signals.






# 4.5.2 Loop-Filter Design and Self-Biasing

The loop filter is a simple implementation of the passive second-order filter used in the MATLAB simulations, except that a triode-region transistor mr1 is used to implement the filter resistor. This transistor is included to incorporate the concept of self-biasing into the PLL [Maneatis, 1996], which allows the loop Q and natural frequency-to-VCO frequency ratio to remain constant with VCO frequency, over the entire allowable frequency range ( $\pm 30$  % around 250 MHz).

To see how this can be achieved, recall that the natural frequency is given as in Equation 4.4 and loop Q as in Equation 4.5, where R is the small-signal resistance of transistor mr1 in Figure 4.25,  $I_{ch}$  is the nominal current injected into the loop filter by the charge-pump, and  $K_a$  is the analog VCO constant, in rad/s/V.

$$\omega_o = \frac{1}{RC_1 Q} \tag{EQ 4.4}$$

$$Q = \frac{1}{R} \sqrt{\frac{2\pi}{C_1 I_{ch} K_a}} \tag{EQ 4.5}$$

It was found through simulation that  $K_a$  and  $I_{ch}$  are proportional to frequency. In order to ensure Q does not change with frequency, it is therefore necessary to make R inversely proportional to frequency. With Q held constant, the natural frequency is then proportional to frequency, as desired. To make R inversely proportional to frequency, it is implemented by a triode-region transistor mr1, whose gate voltage is determined by r2. Note that the small-signal resistance of mr1 (see Equation 4.6) is inversely proportional to the transconductance of an active transistor, which is in turn proportional to the conductance of the bias circuit resistor. Because (from Section 4.3.1) the VCO frequency is also proportional to transistor transconductance, the resistance of the triode-biased mr1 should be inversely proportional to frequency (this is summarized in Equation 4.7). Thus, the loop dynamics should track the frequency of the VCO, so that the loop step-response does not vary too much as the input frequency changes over the allowable range.

$$R_{Triode, mr1} = \frac{1}{\mu_p C_{ox} \left(\frac{W}{L}\right) (V_{r2} - V_{tn})} \tag{EQ 4.6}$$

$$R_{Triode, mr1} \propto \frac{1}{g_m} \text{ and } f \propto g_m \implies R_{Triode, mr1} \propto \frac{1}{f} \tag{EQ 4.7}$$
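The self-biasing argument can be checked numerically by evaluating Equations 4.4 and 4.5 under ideal scaling. The sketch below is a minimal illustration, not the thesis simulation: it assumes pure proportionalities with no offsets, the values of `C1`, `I_NOM`, and `KA_NOM` are hypothetical placeholders, and the 18.26 kΩ resistance is assumed to be the value at 250 MHz. With R tracking 1/f, Q is unchanged across the range and the natural frequency scales with f. With R held fixed, Q instead scales as 1/f.

```python
import math

# Hypothetical component values, chosen only to exercise the equations.
C1 = 10e-12                    # loop-filter capacitance (F), assumed
F_NOM = 250e6                  # nominal VCO frequency (Hz)
R_NOM = 18.26e3                # filter resistance, assumed to apply at F_NOM (ohms)
I_NOM = 100e-6                 # charge-pump current at F_NOM (A), assumed
KA_NOM = 2 * math.pi * 100e6   # analog VCO constant at F_NOM (rad/s/V), assumed


def loop_q(r, i_ch, k_a, c1=C1):
    """Loop Q from Equation 4.5."""
    return (1.0 / r) * math.sqrt(2 * math.pi / (c1 * i_ch * k_a))


def natural_freq(r, i_ch, k_a, c1=C1):
    """Natural frequency (rad/s) from Equation 4.4."""
    return 1.0 / (r * c1 * loop_q(r, i_ch, k_a, c1))


def operating_point(f_hz, track_r=True):
    """Return (Q, omega_o) assuming I_ch and K_a scale linearly with f."""
    scale = f_hz / F_NOM
    r = R_NOM / scale if track_r else R_NOM
    i_ch = I_NOM * scale
    k_a = KA_NOM * scale
    return loop_q(r, i_ch, k_a), natural_freq(r, i_ch, k_a)


q_nom, w_nom = operating_point(F_NOM)
for f_hz in (175e6, 250e6, 325e6):
    q_t, w_t = operating_point(f_hz, track_r=True)
    q_f, _ = operating_point(f_hz, track_r=False)
    print(f"{f_hz / 1e6:5.0f} MHz: Q/Qnom tracked={q_t / q_nom:.3f} "
          f"fixed R={q_f / q_nom:.3f}  w/wnom={w_t / w_nom:.3f}")
```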

To check this,  $I_{ch}$ ,  $K_a$ , and R were simulated over a wide range of VCO frequencies, and plotted as shown in Figure 4.27. From this figure, it can be seen that  $I_{ch}$  and  $K_a$  are indeed proportional to frequency, and that R is inversely proportional to frequency.



Figure 4.27: Simulated Ich, R, and Ka Versus VCO Frequency

Because these are not pure proportionalities (i.e. they all have offsets), one expects some deviation from the expected behaviour of Q and  $\omega_o$ . These quantities were calculated using Equations 4.4 and 4.5 (and assuming the simulated values for I<sub>ch</sub>, R, and K<sub>a</sub>), and plotted versus VCO frequency, as shown in Figure 4.28. While  $\omega_o$  seems to be proportional to frequency as desired, the Q factor displays some variation. However, Figure 4.28 also shows the variation of Q over frequency assuming a fixed R of 18.26 k$\Omega$, which shows that introduction of the tunable filter resistor reduces Q variation substantially. Specifically, Q varies by 29% from its nominal value (its value at 250 MHz) at 175 MHz instead of 63%, and varies by -17% instead of -31% at 325 MHz. Hence, Q is not held constant, but there has been a substantial reduction in Q variation for little or no extra design effort.



Figure 4.28: Natural Frequency and Q Versus Frequency

# 4.6 Comparators

There are two comparators required in the system, as shown in Figure 4.2. One comparator is required to operate at a high common-mode voltage (to check if the charge-pump output has risen above the High threshold), while the other comparator must operate at a low common-mode voltage (to check if the charge-pump output has fallen below the Low threshold). To ease the common-mode range requirements of the comparators, it was decided to design two different comparators. While this incurs some mismatch in latching times, this is not important, provided the two comparators can settle in one half-period of the Up/Down control-logic's clock. The clock for the comparators and digital logic is simply the input crystal-reference divided down in frequency by a factor of 64. Thus, the shortest settling time available to the comparators is  $64 \times 1.5\,\text{ns} = 96$  ns, where 1.5 ns is half the period of a 333.3 MHz reference.
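The settling budget follows directly from the divide ratio and the reference frequency. The sketch below is a minimal illustration with a hypothetical function name. It assumes the comparator must settle within one half-period of the divided clock, as stated above, and that the fastest reference of interest is 333.3 MHz.

```python
# Sketch: time available for the comparators to settle, taken as half
# the period of the reference clock after division by 64.

DIVIDE_RATIO = 64


def settling_window_s(f_ref_hz, divide_ratio=DIVIDE_RATIO):
    """Half-period of the divided clock, in seconds."""
    divided_period_s = divide_ratio / f_ref_hz
    return divided_period_s / 2.0


for f_ref_hz in (175e6, 250e6, 333.3e6):
    window_ns = settling_window_s(f_ref_hz) * 1e9
    print(f"{f_ref_hz / 1e6:6.1f} MHz reference: {window_ns:.1f} ns available")
```

The window grows as the reference frequency drops, so the highest reference frequency sets the worst case.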

# 4.6.1 Comparator 1 Design

To obtain a fast settling time, it was decided to go with a latched architecture, based on that found in [Song, 1990], which is also described in [Johns, 1997]. The schematic for the high-common-mode comparator (Comparator 1 in Figure 4.2) is shown in Figure 4.29, along with a diagram of the connection of Comparator 1 and 2 in the system. The use of n-channel input transistors places the High threshold in the input common-mode range of the circuit (which is roughly 1.45 V to 3.05 V). The only difference between this comparator and the one found in [Song, 1990] is that the current source m11 is cascoded. The DC gain of the first stage (formed by m1-4 and m11) was simulated to be about unity. The gain is small because it is simply the ratio of  $g_{m1}$  to  $g_{m3}$ . The second and third stages combine during track mode (i.e. when m13 is turned on) to form a gain stage with a gain of  $g_{m5}R_{Track}$ , where  $R_{Track}$  is given as in Equation 4.8. This resolution-enhancing gain was simulated to be about 17 (or 25 dB) at DC.

$$R_{Track} = \frac{1}{g_{m9} - g_{m7}} \tag{EQ 4.8}$$

When m13 is turned off, the comparator enters latch mode, and transistors m9 and m10 are disabled. The positive feedback provided by m8-m9 forces the comparator to latch to its final value, which depends on the output voltage built up by the previous track period. To save on power, no differential-to-single-ended converter is used at the output of the comparator. Instead, two CMOS inverters are used to regenerate the signal up to full CMOS levels, for use by the up/down logic.

The clocking scheme used forces the comparators to latch when the clock is low, and to track when the clock is high. The up/down counter is positive-edge triggered, so that it samples the outputs of the comparators at the end of the latch phase. As long as the comparator outputs settle within half a clock period, the correct value will be sampled by the up/down counter.




Figure 4.29: Circuit Schematic for Comparator 1

The output "outp" is shown for the case of sinusoidal input in Figure 4.30. The lower plot shows the input sinusoid, along with the reference  $V_{high}$  and the clock signal. The upper plot also shows the output signal after the two inverters (dotted line). At "outp", the high level is 3.3 V; however, the low level is 1.1 V. To fix this problem, a CMOS inverter with a threshold voltage of 2.01 V is used, followed by a CMOS inverter with a threshold voltage of 1.65 V, to help equalize rise and fall times. The risetime at the output of the comparator (before the inverters) is 3.63 ns, while the propagation delay through the two inverters is 880 ps in total. The propagation delay from the clock signal (measured at its 50% point) to the Up signal reaching 99% of its final value is 2.37 ns, which is well within the required value of 96 ns. This delay value was measured at the bias conditions expected at 250 MHz, and includes the loading effects of the Up/Down Logic.

The total power dissipation, assuming a clocking speed of 5 MHz, was simulated to be 0.41 mW.



Figure 4.30: Output of Comparator 1 for 1 MHz Sinusoidal Input, 5 MHz Clock Rate

## 4.6.2 Comparator 2 Design

The design for Comparator 2 is similar to that for Comparator 1, except that the design is "flipped over" to give p-channel input transistors, as shown in Figure 4.31. This moves the input common-mode voltage range to a lower level (roughly 0.25 V to 1.65 V), suitable for Comparator 2.

Three CMOS inverters were used at the "outp" output, instead of using two inverters at the "outn" output, since this yields an output that is always low, unless the comparator is latching out a high value (which indicates that the up/down counter should be decremented). This situation is the same as at the output of Comparator 1.

To improve the gain in the first stage, the sizes of m3 and m4 were reduced to decrease their transconductance, thus increasing the gain, which is the ratio of  $g_{m1}$  to  $g_{m3}$ . This gain was found to be 1.08 at DC. By leaving m5 the same size, an additional current gain of 2 was produced in the second stage, giving a DC gain of 14 (23 dB) for the second stage. The overall DC gain from the input to the output during track mode was similar to the gain for Comparator 1, at 15.10 (23.6 dB). Because the settling time was initially poorer than Comparator 1 during latching, the transistors m7-m10 and m12-m13 were all doubled in size to improve the settling time for Comparator 2. The 90% to 10% fall time at the output of Comparator 2 (before the inverters) was simulated to be 1.23 ns (an improvement of 66% over Comparator 1), while the propagation time from the clk to the output of the second inverter was only 1.36 ns, assuming bias conditions for 250 MHz input, and including the loading effects of the Up/Down logic.
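The quoted gain and fall-time figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative and uses hypothetical helper names. It assumes the track-mode gain is simply the product of the two stage gains, and it takes the 3.63 ns figure for Comparator 1 from Section 4.6.1 as the baseline for the fall-time comparison.

```python
import math

# Sketch: cascade the simulated stage gains and convert to decibels.


def to_db(voltage_gain):
    """Voltage gain expressed in dB."""
    return 20.0 * math.log10(voltage_gain)


def relative_improvement(old_value, new_value):
    """Fractional reduction of new_value relative to old_value."""
    return (old_value - new_value) / old_value


STAGE1_GAIN = 1.08   # first-stage DC gain quoted in the text
STAGE2_GAIN = 14.0   # second-stage DC gain quoted in the text

track_gain = STAGE1_GAIN * STAGE2_GAIN
print(f"second stage: {to_db(STAGE2_GAIN):.1f} dB")
print(f"overall:      {track_gain:.2f} ({to_db(track_gain):.1f} dB)")
print(f"edge-time improvement: {relative_improvement(3.63e-9, 1.23e-9):.0%}")
```

The product comes out at about 15.1, or 23.6 dB, and the edge-time reduction at about 66%, in line with the figures above.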



Figure 4.31: Circuit Schematic for Comparator 2

The output of the comparator for a 1 MHz sinusoidal input and 5 MHz clock rate is shown in Figure 4.32. The top plot shows the output of the comparator before (shown as dotted line) and after (shown as solid line) the two inverters. As can be seen, the comparator latches when the clock goes low. To regenerate the comparator output to full CMOS levels, an inverter with a threshold of 1.89 V is followed by an inverter with 1.65 V, which helps equalize rise and fall times.

The total power dissipation for Comparator 2 was found to be 1.2 mW, assuming a clock-rate of 5 MHz.



Figure 4.32: Output of Comparator 2 for 1 MHz Sinusoidal Input, 5 MHz Clock Rate

# 4.7 Up/Down Control Logic

Up/Down control logic is required in the system to control the state of the digital resistors in the bias circuit. The logic implements the block diagram described in Section 3.3.3. As the functionality of the logic was relatively complex, it was decided to implement the circuit using VHDL, and then to automatically generate the layout for the circuit using 0.35  $\mu$ m CMOS digital cells, using Cadence. The VHDL code for the up/down control logic is given in Appendix A.

The automatically generated layout is very compact, with a size of  $230 \,\mu\text{m}$  by  $260 \,\mu\text{m}$ , with a maximum average power dissipation (assuming the digital state must be altered on every clock cycle and the circuit is clocked at 11 MHz) of 4.4 mW, which drops to about 0.5 mW when the PLL reaches steady state at 300 MHz (when the Up and Down signals are essentially inactive). The circuit operates correctly for clock frequencies exceeding 400 MHz (assuming a state change on every clock cycle), including the effects of parasitics.

# **4.8 Layout Considerations**

The layout for the circuit was done with the help of BALLISTIC, a code-based analog layout tool developed at the University of Toronto. This program was used to quickly generate layouts that would otherwise have taken a great deal of time, such as differential pairs with interleaved unit transistors to improve matching and large capacitors composed of an array of unit-sized capacitors with dummy devices added for matching.

Guard rings similar to those discussed in [Johns, 1997] were placed around the logic circuitry to reduce the amount of digital noise coupled into the analog circuitry through the substrate, as shown in Figure 4.33.



Figure 4.33: Schematic for Guard Rings

The fixed resistor in the bias circuit was implemented as a p+ diffusion resistor, as opposed to a polysilicon resistor. This choice was made to save on area, since the resistivity for p+ diffusion is roughly 10-15 times higher than polysilicon for the process used. The p+ diffusion resistor was chosen over an n+ resistor because the p+ resistor sits in an n-well that, if biased with a clean power supply, can provide better isolation from substrate noise.

To further reduce the amount of digital noise coupled into the analog circuitry, separate power supplies (VDD and VSS) were used for analog and digital circuitry. These supply lines were only connected off chip. Additionally, a third set of power supplies was added to bias the shields for the resistors, capacitors, and digital circuitry.

While it was possible to extract and simulate the interconnection of several blocks, simulating the entire PLL simply took too long. To verify the PLL operation as much as possible, the VCO was removed from the loop, and two signal-sources were used to excite the PFD. The simulation results for this setup are shown in Figure 4.34. Because the VCO frequency is faster than the input frequency (these are both supplied by signal generators), the charge-pump pushes  $V_a$  up in value (out in Figure 4.34). Eventually,  $V_a$  rises above High, which is detected by Comparator 1 (whose output is downcntl in Figure 4.34), which latches high once the clock goes low. At the end of this latch period (when the clock goes high), the up/down logic detects the high Down signal, and decreases  $V_{therm}$  (therm in Figure 4.34). Notice that the Up signal (upcntrl in Figure 4.34) is inactive at all times, since the charge-pump output never falls below Low, and  $V_{bin}$  (bin in Figure 4.34) remains constant, since  $V_{therm}$  does not reach its minimum value.



Figure 4.34: Open-Loop Response of PLL Without VCO

Finally, the PLL was simplified by removing all the digital VCO control signals, leaving only the ring-oscillator, adjustable bias-circuit (with the digital control signals held constant), PFD, charge-pump, and loop filter. This remaining circuitry formed a conventional charge-pump PLL (albeit with a slightly unconventional VCO structure). The response of the circuit to a 290.7 MHz input is shown in Figure 4.35, from which it can be seen that the circuit converges to a stable operating point in about 800 ns, with a 5.8% overshoot.





The final layout for the complete PLL system is shown in Figure 4.36. The system occupied an area of  $1600\mu m \times 1600\mu m$ , including pads and circuits added for testing (see next chapter).



Figure 4.36: Final Layout for PLL

# **4.9 References**

A. Buchwald, K. Martin, "High-Speed Voltage-Controlled Oscillator With Quadrature Outputs," *Electronics Letters*, Vol. 27, No. 4, pp. 309-310, 14 Feb. 1991.

Greg Hartman, Continuous-Time Adaptive-Analog Coaxial Cable Equalizer in 0.5µm CMOS, Master's Thesis, 1997.

D. Johns, K. Martin, Analog Integrated Circuit Design. Toronto: John Wiley & Sons, 1997.

S. Kim, K. Lee, Y. Moon, D. Jeong, Y. Choi, H.K. Lim, "A 960-Mb/s/pin Interface for Skew-Tolerant Bus Using Low Jitter PLL," *IEEE J. Solid-State Circuits*, Vol. 32, No. 5, pp. 691-700, May 1997.

C. Lacy, Private Conversation, March 1999.

G. Maneatis, "Low-Jitter Process-Independent DLL and PLL Based on Self-Biased Techniques," *IEEE J. Solid-State Circuits*, Vol. 31, No. 11, pp. 1723-1733, Nov. 1996.

K. Martin, "Obtaining Accurate On-Chip Time-Constants," U.S. Patent 5973524, Oct. 26, 1999.

I. A. McNeill, "Jitter in Ring Oscillators," *IEEE J. Solid-State Circuits*, Vol. 32, No. 6, pp. 870-879, June 1997.

B. Razavi, "A Study of Phase Noise in CMOS Oscillators," *IEEE J. Solid-State Circuits*, Vol. 31, No. 3, pp. 331-343, Mar. 1996.

A.S. Sedra, K.C. Smith, *Microelectronic Circuits, 4th Ed.*, New York: Oxford University Press, 1997.

E. Seevinck, M. du Plessis, T. Joubert, A.E. Theron, "Active-Bootstrapped Gain-Enhancement Technique for Low-Voltage Circuits," *IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing*, Vol. 45, No. 9, pp. 1250-1254, Sept. 1998.

B. Song, H. S. Lee, M. Tompsett, "A 10-b 15-MHz CMOS Recycling Two-Step A/D Converter," *IEEE J. Solid-State Circuits*, Vol. 25, No. 6, pp. 1328-1338, Dec. 1990.

M. Soyuer, R.G. Meyer, "Frequency Limitations of a Conventional Phase-Frequency Detector," *IEEE J. Solid-State Circuits*, Vol. 25, No. 4, pp. 1019-1022, Aug. 1990.

J.M. Steininger, "Understanding Wide-Band MOS Transistors," *IEEE Circuits and Devices*, Vol. 6, No. 3, pp. 26-31, May 1990.

V. von Kaenel, D. Aebischer, C. Piguet, E. Dijkstra, "A 320 MHz, 1.5 mW @ 1.35 V CMOS PLL for Microprocessor Clock Generation," *IEEE J. Solid-State Circuits*, Vol. 31, No. 11, pp. 1715-1722, Nov. 1996.

J. Yuan, C. Svensson, "High-Speed CMOS Circuit Technique," *IEEE J. Solid-State Circuits*, Vol. 24, No. 1, pp. 62-70, Feb. 1989.

#### **Chapter 5: Test Results**



# **Test Results**

The PLL was implemented over two design runs. The first run (sent out in December of 1998) contained only a simplified version of the VCO, while the second run contained the entire design.

This section starts by discussing the testing of the VCO run, followed by the testing of the entire PLL chip. In each section, the test setup for the given run is discussed, followed by the test results.

# 5.1 VCO

The VCO chip was a simplified version of the VCO used in the final PLL design. The block diagram for this circuit is shown in Figure 5.1. The up/down counter contained only eight states (using a seven-bit thermometer-code), instead of the 256 states in the final design. The ring-oscillator contained an earlier version of the inverter, as shown in Figure 5.2. This design was later scaled down to reduce the power consumption, and the lengths of the load transistors were increased to increase the output swing. A large-width open-collector differential pair was used to drive the pins of the package, as shown in Figure 5.3.




Figure 5.1: Block Diagram of VCO Circuit

The control signals for the up/down logic (up, down, reset, clock) and the analog control signal were all fed in from off the chip.



Figure 5.2: Circuit Schematic for Ring Oscillator Inverter




Figure 5.3: Circuit Schematic for VCO Output Buffer

## 5.1.1 Test Setup

The VCO chip was packaged by CMC in a 24-pin CFP (Ceramic Flat Package), suitable for frequencies up to 2 GHz [CMC Website, 1999].

The PCB layout was designed using CircuitCAM software. This layout was then etched onto a copper circuit board using an LPKF ProtoMat [LPKF, 1996]. A schematic for the final board layout is shown in Figure 5.4.

Pins 6 and 7 tap the differential output of the first stage of the ring oscillator (outp1 and outn1) through the VCO output buffer of Figure 5.3. Analog VDD and Analog VSS provide power to the analog circuits (bias circuits and ring oscillator). Shield VDD and VSS bias the shields used on-chip to isolate analog regions from noisy digital regions. Digital VDD and Digital VSS provide power to the logic used to control the digital portion of the voltage-controlled resistor in the bias circuit. The "clk" input provides a clock signal to the logic, while the "reset" input provides a reset signal to the logic (resets the digital state to 0001111). The "up" and "down" inputs are used to increase and decrease the value of the digital portion of the adjustable resistor, respectively. The "Va" input is used to control the analog portion of the adjustable resistor. The r1-r4 pins monitor the bias voltages produced by the adjustable bias circuit.



Figure 5.4: PCB Layout for VCO Chip

# 5.1.2 Results

The VCO was measured to have a minimum frequency of 125 MHz and a maximum frequency of 320.5 MHz. When the VCO control signals (analog and digital) were swept from their highest to lowest states, the VCO responded as shown in Figure 5.5.





The variation of the VCO frequency for a fixed analog control voltage of 0.8 V is shown in Figure 5.6, where the digital states are assigned in thermometer-code (i.e. 0=0000000, 1=0000001, 2=0000011, etc.).
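For readers unfamiliar with the encoding, the mapping between digital state and seven-bit thermometer code can be sketched as follows. The helper names are hypothetical and this is not the on-chip logic. It only reproduces the state assignment listed above, in which state n has its n least-significant bits set.

```python
# Sketch: seven-bit thermometer code for the eight digital states 0..7.


def to_thermometer(state, width=7):
    """Return the thermometer code for `state` as a bit string."""
    if not 0 <= state <= width:
        raise ValueError(f"state must be between 0 and {width}")
    return "0" * (width - state) + "1" * state


def from_thermometer(code):
    """Return the state encoded by a valid thermometer-code bit string."""
    ones = code.count("1")
    if code != "0" * (len(code) - ones) + "1" * ones:
        raise ValueError(f"{code!r} is not a valid thermometer code")
    return ones


for state in range(8):
    print(state, to_thermometer(state))
```

Under this mapping, the reset state 0001111 mentioned in Section 5.1.1 corresponds to state 4.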



Figure 5.6: VCO Frequency Versus Digital State ( $V_a = 0.8$  V)

The variation of the VCO frequency with the analog control voltage for various fixed digital states is shown in Figure 5.7. The standard deviations of the measured curves from the LMS error-fit lines are given as a percentage of the tuning range for the given digital state. The VCO characteristics for a fixed digital state are reasonably linear (less than 2.6% error), considering the large control-voltage range used.

Finally, the variation of the analog VCO constant with VCO frequency is shown in Figure 5.8, where it can be seen that the VCO constant varies linearly with VCO frequency.



Figure 5.8: Analog VCO Constant Versus VCO Frequency

The spectrum of the VCO output at 200 MHz is shown in Figure 5.9. The main lobe of the VCO spectrum is much wider than that in the PLL spectrum (approximately 18 MHz, as opposed to 20 kHz, as will be seen in Section 5.2.2). This becomes more apparent when the span of the plot is decreased, as in Figure 5.10. The phase noise can be seen from Figure 5.9 to be -102.1 dBc/Hz, 10 MHz from the carrier, increasing to roughly -54.5 dBc/Hz at 100 kHz from the carrier. The fact that the main lobe is wider than in the PLL is not surprising, since the PLL attenuates phase noise that is offset from the carrier by less than the loop bandwidth (see Section 2.3.1), which leads to the creation of a narrow main lobe around the carrier.



Figure 5.9: Spectrum of VCO Output at 200 MHz





Figure 5.10: Spectrum of VCO Output at 200 MHz (Zoomed In)

The total current drawn from the power supply at a VCO frequency of 250 MHz was measured to be 4.36 mA, including the extra bias circuit and pad drivers.

## **5.2 Entire PLL**

In order to make the PLL design testable, several additions had to be made to the circuit, as shown in Figure 5.11. A simple output driver was added to bring the VCO output off chip. The schematic for this driver is shown in Figure 5.12. The first block in the circuit is a strong digital buffer from the standard digital library for the technology used. This is followed by a parallel connection of strong CMOS inverters, to further improve the drive strength of the signal. This signal then drives the gate of a very large NMOS transistor, which drives the pin capacitance.

To monitor the digital states of the up/down logic, two digital-to-analog converters were added to the design, DAC 1 and DAC 2. The DAC 2 block is a 4-bit thermometer-based DAC that tracks the state of the thermometer-code signal, while the DAC 1 block is a 4-bit binary-DAC that tracks the state of the binary signal. The design for DAC 1 is shown in Figure 5.13 [Johns, 1997], while the design for DAC 2 is shown in Figure 5.14 [Johns, 1997]. DAC 1 uses an R-2R structure to decrease the ratio of resistance values required in the circuit. Relatively large resistor values were used to minimize the power dissipation. The two-way switches for the DAC's are implemented as shown in Figure 5.15. Note that simple NMOS transistors may be used for switches, since for both DAC's, the sources of the transistors in the switches are always at either true ground or the virtual ground of an opamp.

The opamps used to produce a voltage output for the converters are off-chip, to save on area. Because the digital states do not change very quickly (5 MHz at the very most), these opamps do not have to have terribly fast risetimes (100-150 ns is more than sufficient).

Because some extra area existed in the layout at the end of the design, an extra comparator was added to the system to allow testing of the comparator circuit by itself. This test comparator takes the High threshold as its negative input, and takes its positive input from off chip (Test\_in). It also has its own output and latch signals, which leave the chip through pins Test\_out and Test\_latch, respectively. The output signals from all comparators (Comparator 1, Comparator 2, Test Comparator) leave the chip through digital buffers from the standard CMOS digital cell library.

Because the settling behaviour of r1-r4 is important for the PLL circuit, the bias voltages could not be brought off-chip without extensive buffering. Instead, on-chip probe-pads were added to the circuit to observe these voltages. The Low and High threshold voltages were brought off chip, since their settling behaviour isn't as critical. Also, this allows the Low and High thresholds to be forced to different values if needed during testing.

The clock signals for the logic and comparators are generated by dividing down the reference input frequency by 64, using a cascade of six D flip-flops. The D flip-flops are taken from the CMOS standard digital cell library. The reference input signal is generated off-chip.
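A behavioural model makes the divide-by-64 action of the flip-flop cascade concrete. The sketch below is an assumption-laden illustration, not the standard-cell implementation: it assumes each flip-flop is wired as a divide-by-two stage and that a stage toggles when the previous stage's output falls. The function names are hypothetical.

```python
# Sketch: six cascaded divide-by-two stages modelled as a ripple counter.


def ripple_divider_output(n_input_edges, stages=6):
    """Final-stage output sampled after each input clock edge."""
    state = [0] * stages
    output = []
    for _ in range(n_input_edges):
        for k in range(stages):
            state[k] ^= 1          # this stage toggles
            if state[k] == 1:      # it rose, so the next stage is not clocked
                break
        output.append(state[-1])
    return output


def rising_edges(samples):
    """Indices at which the sampled waveform goes from 0 to 1."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] == 0 and samples[i] == 1]


def divided_frequency_hz(f_in_hz, stages=6):
    """Output frequency of the cascade for a given input frequency."""
    return f_in_hz / (2 ** stages)


edges = rising_edges(ripple_divider_output(192))
print("output rising edges at input edge indices:", edges)
print("250 MHz in ->", divided_frequency_hz(250e6) / 1e6, "MHz out")
```

Successive rising edges of the final stage are 64 input edges apart, which is the division ratio required for the comparator and logic clock.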






Figure 5.11: Block Diagram for Complete PLL Chip



Figure 5.12: Circuit Schematic for Output Driver








Figure 5.14: Circuit Schematic for DAC 2



Figure 5.15: Circuit Schematic for DAC Switches


## 5.2.1 Test Setup

The package chosen for the chip was a 44-pin Kyocera CQFP (Ceramic Quad Flat Package), bonded through CMC. This package is claimed to be suitable for clock speeds up to 1.5 GHz [CMC Website, 1999].

The test setup for the final PLL chip is shown in Figure 5.16. The PLL input signal enters the chip through the SMA connector labelled "Ref". No termination resistor was placed here to match the signal generator to the load, as maximum voltage-swing was desired, not maximum power transfer. The Low and High threshold voltages are monitored through the SMA connectors labelled "Low" and "High", respectively. The output buffer for the VCO shown in Figure 5.12 is loaded with a 100 $\Omega$  resistor to VDD, and monitored through the SMA connector labelled "Buffer Out". The test-comparator output is monitored through the SMA connector labelled "Out\_test", while the input and clock signals enter the chip through the SMA connectors labelled "In\_Test" and "Track\_Test", respectively. When it is desired to disable the test comparator (i.e. when it is not being tested), the input and clock signals are tied to ground using switches. The Therm signal that monitors the state of the thermometer-coded logic is amplified using Opamp\_1. Rtherm was chosen to give a minimum and maximum output voltage of 0.6 V and 2.6 V, respectively. The reset signal for the logic is input using the manual Switch\_2. The Up and Down signals that control the up/down counter are monitored through the SMA connectors labelled "Up" and "Down", respectively. Opamp\_2 is used to amplify the Bin signal that monitors the state of the binary-encoded portion of the up/down counter. Rbin was chosen to give minimum and maximum voltages of -0.7 V and 0 V, respectively. Finally, the Track signal that is used to clock the up/down counter, as well as the comparators, is monitored through the SMA connector labelled "Track". To decouple power-supply noise to ground, a 10 µF electrolytic capacitor was placed across the power-supply terminals, and several 0.1 µF surface-mount capacitors Cc1-4 were placed between VDD and ground at various locations on the PCB.



Figure 5.16: PCB Layout for PLL Chip

#### 5.2.2 Results

To start, the operation of the test comparator was examined. The output of the test comparator to a 150 kHz sinusoid with a 42 mV peak, clocked at 2 MHz, is shown in Figure 5.17. The input signal is shown below the output signal. The sinusoid is roughly centered around the trigger voltage of the comparator (which is the High threshold), so that the comparator output goes high when the sinusoid is above its mean value, and goes low when the sinusoid is below its mean value.

The output levels of the test comparator are 0 V and 3.3 V. The 10%-90% rise time is 11.6 ns, while the 90%-10% fall time is 5.5 ns. Taking the minimum clock period as the sum of the rise and fall times, this corresponds to a maximum clocking frequency of about 58.5 MHz, and hence to a maximum input Reference frequency (which gets divided down by 64 to generate the comparator clocks) of about 3.7 GHz.
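The arithmetic behind these two limits can be sketched as follows. This is a minimal check, assuming that one clock period must accommodate one rise plus one fall (an assumption that reproduces the quoted 58.5 MHz); the function and constant names are illustrative only.

```python
def max_clock_hz(rise_s, fall_s):
    """Highest clock rate if one period must fit one rise plus one fall (assumption)."""
    return 1.0 / (rise_s + fall_s)

DIVIDE_RATIO = 64  # the Reference input is divided by 64 to clock the comparators

f_clk = max_clock_hz(11.6e-9, 5.5e-9)   # measured rise and fall times
f_ref = f_clk * DIVIDE_RATIO            # corresponding Reference frequency

print(f"max comparator clock ~ {f_clk / 1e6:.1f} MHz")
print(f"max Reference input  ~ {f_ref / 1e9:.2f} GHz")
```

The second figure is a limit imposed by the comparator alone; it says nothing about whether the rest of the loop could operate at that frequency.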

#### **Chapter 5: Test Results**



Figure 5.17: Output of Test Comparator in Response to Sinusoidal Input

To test the analog portion of the PLL, the Low and High thresholds were held at 0 V and 3.3 V, respectively, and the up/down counter was reset to its middle value. With the Low and High thresholds held at these levels, the comparators driving the up/down counter never latch high, so that the up/down counter is disabled, and only the analog portion of the PLL is functional. Under these conditions, it was found that the PLL could lock to frequencies ranging from 182 MHz to 190 MHz.

To check that the clock signal for the logic and comparators was generated correctly, a 250 MHz signal was input to the PLL, and the Track output was observed. As expected, the Track output toggled between 0 and 3.3 V at a frequency of 3.90 MHz (or 250 MHz divided by 64). To check that the clock-divider circuit was as fast as required, a 325 MHz signal was input to the PLL. Again, as expected, the Track output was divided down in frequency by a factor of 64, for a frequency of 5.08 MHz. The clock-divider was found to work correctly for input frequencies of up to 650 MHz. The clock-divider output for this input frequency is shown in Figure 5.18.
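As a quick cross-check of the expected Track frequencies, the divide-by-64 relationship can be evaluated directly. This is only an illustrative sketch; the helper name `track_frequency_hz` is not from the original design.

```python
DIVIDE_RATIO = 64  # clock-divider ratio stated in the text

def track_frequency_hz(f_in_hz, ratio=DIVIDE_RATIO):
    """Expected Track frequency for a given PLL input frequency."""
    return f_in_hz / ratio

for f_in_mhz in (250, 325, 650):
    f_track = track_frequency_hz(f_in_mhz * 1e6)
    print(f"{f_in_mhz} MHz in -> {f_track / 1e6:.3f} MHz Track")
```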



Figure 5.18: Track Output for an Input Frequency of 650 MHz

To check that the Up and Down signals were being generated correctly, the Low and High thresholds were left at the values generated on-chip, while the reset signal for the logic was held high, thus keeping the digital logic in its middle state, while allowing the comparators to latch high in response to the charge-pump output. When a high frequency of 300 MHz was input into the PLL, the Up signal toggled on and off at a frequency of 4.7 MHz, while the Down signal remained low, as expected. When a low frequency of 150 MHz was input into the PLL, the Down signal toggled on and off at a frequency of 2.3 MHz, while the Up signal remained low, also as expected.

Finally, the operation of the entire system was verified by inputting a 250 MHz signal into the PLL, and observing the PLL output. It was found that the PLL quickly locked to the input signal, and remained locked as the frequency was slowly swept. The PLL lock range did not go quite as high as expected, only locking to frequencies of up to 300 MHz; however, the PLL locked to lower frequencies than expected, going as low as 135 MHz. This is likely due to parasitic wiring capacitances in the ring-oscillator. The input and output of the PLL are shown for input frequencies of 135 MHz, 250 MHz, and 300 MHz, in Figures 5.19-5.21.




Figure 5.19: Input (Upper Waveform) and Output (Lower Waveform) for 135 MHz Input



Figure 5.20: Input (Upper Waveform) and Output (Lower Waveform) for 250 MHz Input



Figure 5.21: Input (Upper Waveform) and Output (Lower Waveform) for 300 MHz Input

The power dissipation at 250 MHz was measured to be about 83.2 mW for a 3.3 V supply, excluding power used by the external opamps. About 63.9 mW of this power is due to pad drivers and test circuitry (from simulations), leaving a dissipation of about 19.3 mW for the core PLL.
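The core-PLL estimate is a simple subtraction, which can also be expressed as an equivalent supply current. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
SUPPLY_V = 3.3        # supply voltage, V
total_mw = 83.2       # measured dissipation at 250 MHz, mW
overhead_mw = 63.9    # pad drivers and test circuitry (from simulation), mW

core_mw = total_mw - overhead_mw   # estimated core PLL dissipation, mW
core_ma = core_mw / SUPPLY_V       # equivalent core supply current, mA

print(f"core PLL: {core_mw:.1f} mW, i.e. about {core_ma:.2f} mA at {SUPPLY_V} V")
```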

Next, the frequency spectrum of the VCO output was examined under locked conditions. The spectrum of the VCO output for an input frequency of 202.2 MHz is shown in Figure 5.22. The phase noise cannot be accurately measured from this plot, however, since the spectrum cannot be approximated as white noise. To get the phase noise measurement, the span of the spectrum analyzer was increased as shown in Figure 5.23, so that the spectrum could be approximated as white noise (i.e. white noise with a narrow spike at 202.2 MHz). This allows the phase noise to be measured to be -92.5 dBc/Hz at a 100 kHz offset from the carrier. The phase noise decreases at lower frequencies, due to the larger signal amplitude at the output buffer. This is illustrated in Figure 5.24, which shows the spectrum of the PLL output for an input frequency of 115 MHz (one of the chips measured had a lower lock range limit of about 115 MHz). This figure shows the phase noise to be -105.6 dBc/Hz at a 110 kHz offset from the carrier.



Figure 5.23: Spectrum of PLL Output for Input Frequency of 202.2 MHz (Zoomed Out)



Figure 5.24: Spectrum of PLL for Input Frequency of 115 MHz (Zoomed Out)

The jitter of the PLL output signal was studied next. An oscilloscope was used to compare the phase of the PLL output to a high-quality frequency reference (which the PLL was locked to). The resulting jitter histogram is shown in Figure 5.25 for a 290 MHz input signal, from which it can be seen that the PLL displays a peak-to-peak jitter of 111.6 ps, and an RMS jitter of 15.6 ps.



Figure 5.25: Jitter Histogram of PLL Output at 290 MHz

With the PLL characterized, the transistor transconductances generated by the constant-transconductance bias circuit were examined. In order to measure transistor transconductance, the setup in Figure 5.26 was used, where mhigh is the transistor used to generate the High threshold in Figure 4.3. Vtest is AC-coupled to the gate of mhigh through a 3.01 k $\Omega$  resistor, which forms a resistive divider with the resistor r (which was added to provide some ESD protection for the gate of mhigh) and the impedance looking into the diode-connected mhigh (which is roughly  $1/g_{m,mhigh}$ ). Thus,  $g_{m,mhigh}$  (which is proportional to all other transistor transconductances) can be found using Equation 5.1, where  $A_v=V2/V1$  (V1-2 are small-signal voltages).

$$g_{m, mhigh} = \frac{1}{\frac{R \cdot A_{v}}{1 - A_{v}} - r}$$
(EQ 5.1)
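Equation 5.1 follows from the resistive divider $A_v = (r + 1/g_m)/(R + r + 1/g_m)$. A minimal sketch of the calculation is given below, with a round-trip check through the divider expression. The value of the ESD resistor `r_esd` and the example transconductance are hypothetical, chosen only for illustration; only R = 3.01 k$\Omega$ is taken from the text.

```python
R_SERIES = 3.01e3   # series resistor R from the text, ohms

def gm_from_gain(a_v, r_series=R_SERIES, r_esd=0.0):
    """Equation 5.1: g_m = 1 / (R*A_v/(1 - A_v) - r), with A_v = V2/V1."""
    if not 0.0 < a_v < 1.0:
        raise ValueError("A_v must lie strictly between 0 and 1")
    inv_gm = r_series * a_v / (1.0 - a_v) - r_esd
    if inv_gm <= 0.0:
        raise ValueError("A_v is too small for the given r")
    return 1.0 / inv_gm

def gain_from_gm(gm, r_series=R_SERIES, r_esd=0.0):
    """Divider ratio seen at V2 for a diode-connected device of transconductance gm."""
    z_low = r_esd + 1.0 / gm
    return z_low / (r_series + z_low)

# Round trip with hypothetical values: r = 200 ohm, g_m = 350 uA/V
a_v = gain_from_gm(350e-6, r_esd=200.0)
gm_back = gm_from_gain(a_v, r_esd=200.0)
print(f"A_v = {a_v:.4f}, recovered g_m = {gm_back * 1e6:.1f} uA/V")
```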




Figure 5.26: Setup For Measuring Transistor Transconductance

In theory, the transistor transconductances should be proportional to the input frequency (assuming the PLL is locked). This was verified by slowly sweeping the input frequency from 160 MHz to 270 MHz, and measuring the transconductance of mhigh, resulting in the plot shown in Figure 5.27. The standard deviation between the observed curve and the least-squares fitted straight line was found to be 4.6  $\mu$A/V, which is 1.3% of the value of g<sub>m</sub> at 250 MHz. Hence, the system seems to keep the transistor transconductances well matched to the input frequency of the PLL.
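The linearity figure quoted above is the standard deviation of the residuals about a least-squares straight line. A minimal sketch of that calculation follows. The data points are synthetic (a straight line with an alternating $\pm 3$ $\mu$A/V perturbation) and are not the measured values; the slope and offsets are arbitrary assumptions.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_std(xs, ys):
    """RMS deviation of the points from their least-squares line."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Synthetic sweep, NOT measured data: 160..270 MHz in 10 MHz steps
freqs_mhz = list(range(160, 280, 10))
gm_ua = [1.4 * f + (3.0 if i % 2 == 0 else -3.0) for i, f in enumerate(freqs_mhz)]

sigma = residual_std(freqs_mhz, gm_ua)
print(f"residual standard deviation = {sigma:.2f} uA/V")
```

Dividing the result by the fitted value of g<sub>m</sub> at a chosen frequency gives the percentage form used in the text.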

To examine the variation of the supplied transistor transconductances over process, the transconductance of mhigh was measured at a fixed input-frequency of 250 MHz, over five different chip samples, each placed in an identical test setup containing an IC socket, to remove errors due to test-setup differences. The result of this test is shown in Figure 5.28, where the value of  $g_m$  is plotted for each chip sample, along with the mean of the measured values. The standard deviation of these measurements was found to be 9  $\mu$ A/V, which is 3.06% of the mean value, indicating that the system seems to provide stable transconductances over process. However, because all samples were from the same process run, these results are not sufficient to prove that the transconductances obtained are highly immune to process variations.



Figure 5.28: Variation of gm of mhigh Over Process for 5 Chips at 250 MHz

To examine the dependence of transistor transconductance on the power-supply voltage, the power-supply voltage was varied from 2.45 V to 3.5 V, and $g_m$ was measured at each point. The result of this measurement is shown in Figure 5.29, where it can be seen that $g_m$ changes by $\pm 1.3\%$ when VDD changes by $\pm 10\%$. Thus, the transconductances produced by the system are very stable over changes in power-supply voltage.



Figure 5.29: Variation of gm of mhigh With Power Supply Voltage at 250 MHz

To determine the variation of  $g_m$  due to changes in temperature, the test-setup was heated in an oven, resulting in the plot shown in Figure 5.30. This plot shows that there is very little variation in  $g_m$  for a temperature range of 60° C (2.2% difference between  $g_m$  at 80° C and 20° C).



Figure 5.30: Transistor gm Versus Temperature at a VCO Frequency of 250 MHz

To determine how often the digital logic switched in steady-state, the chip was left running with a 200 MHz input signal for ten hours, measuring the analog version of the thermometer control-signal (i.e. the output of DAC 2) every ten minutes. The result is shown in Figure 5.31, which shows that over ten hours, the logic switched only once. Thus, the time between switching events in steady-state is at least eight hours. Ideally, the digital state should not have to change at all, assuming a fixed input signal; however, as time progresses, the chip heats up, so that the analog control voltage must decrease to keep the VCO matched to the input frequency. Eventually, the analog control voltage moves below the Low threshold, which causes the thermometer-code state to change. The analog control-voltage is then forced to its middle value through controlled hysteresis, which helps to increase the amount of time before the next state-change is required.

Finally, the open-loop characteristics of the VCO were examined, as shown in Figure 5.32. The upper plot shows the VCO frequency, the middle plot shows the value of the thermometer-encoded digital signal (at the output of its D/A converter), while the bottom plot shows the value of the binary-encoded digital signal.



Figure 5.31: Output of DAC 2 Versus Time for Input Signal of 200 MHz



Figure 5.32: VCO Frequency Characteristic


This plot is simplified in Figure 5.33 by showing only one frequency point per thermometer state (i.e. the value of  $V_a$  is fixed for all points). Figure 5.34 shows the variation of VCO frequency versus  $V_a$  for four digital states. For a fixed digital state, the VCO characteristic is very linear, with an almost constant slope. The plots include the standard deviation between the observed curves and their least-squares fitted straight lines. The standard deviations are also given as percentages of the tuning range at the given digital state. All standard deviations can be seen to be less than or equal to 4% of the tuning range.



Figure 5.33: VCO Characteristic for Fixed Value of Va



Figure 5.34: VCO Frequency Versus V<sub>a</sub> for Various Digital States

The performance of the PLL and bias system is summarized in Table 5.1.

Table 5.1: Summary of PLL and Bias System Performance

| Parameter                                                                        | Value                                                 |
|----------------------------------------------------------------------------------|-------------------------------------------------------|
| Power Dissipation (Including Test Circuitry and Pad Drivers)                     | 23.1 mA from 3.3 V Supply                             |
| Estimated Power Dissipation of Core PLL                                          | 5.8 mA from 3.3 V Supply                              |
| Closed-Loop Jitter                                                               | 111.6 ps peak-to-peak, 15.6 ps RMS @ 290 MHz          |
| Closed-Loop Phase Noise                                                          | -92.5 dBc/Hz @ 100 kHz offset from 202.2 MHz carrier  |
| Lock Range                                                                       | 135 MHz to 300 MHz                                    |
| Center Frequency                                                                 | 217.5 MHz                                             |
| Variation of g<sub>m</sub> from 10% Variation in Power-Supply Voltage            | 1.3%                                                  |
| Variation of g<sub>m</sub> Over Process                                          | 3.06%                                                 |
| Variation of g<sub>m</sub> For 60° C Change In Temperature                       | 2.2%                                                  |
| Time Between Digital State-Changes in Steady-State at a VCO Frequency of 200 MHz | ≥ 8 hours                                             |
| Integrated Area (Including Test Circuitry and Pads)                              | 1600 µm × 1600 µm (1200 µm × 1200 µm without pads)    |
| Technology                                                                       | 0.35 µm Triple-Metal CMOS                             |

## **5.3 References**

CMC Website, www.cmc.ca/Fabrication/packaging.html, 1999.

D. Johns, K. Martin, Analog Integrated Circuit Design, Toronto: John Wiley & Sons, 1997.

LPKF CAD/CAM Systeme GmbH, LPKF BoardMaster Version 2.0 Release Notes 2.8x, Garbsen, Germany: LPKF, 1996.



# Discussion and Recommendations

## **6.1 Discussion**

A system for obtaining accurately-known transistor transconductances has been designed, fabricated, and tested in a 0.35  $\mu$m CMOS process. This work is based on the constant-transconductance bias circuit found in [Johns, 1997], in which on-chip transistor transconductances are matched to the conductance of an (accurately known) off-chip resistor. The main contribution of this work is the notion of tuning this resistor *on-chip*, using a PLL with an accurate frequency-reference that is present in most systems to produce a system clock. Using this technique, it was found that on-chip transconductances varied by 1.3% for a 10% variation in power-supply voltage, 3.06% over process, and 2.2% over a 60° C change in temperature. In addition, it was demonstrated how multiple tuning mechanisms (in this case both analog and digital) with overlapping ranges having hysteresis could be used to minimize tuning glitches.

The PLL used to tune the bias resistor is of a charge-pump type with no frequency divider. The VCO uses a ring-oscillator topology, biased so that its frequency of oscillation is proportional to the conductance of the bias resistor, and also to allow reduced VCO jitter. The PFD uses a True Single-Phase Clock (TSPC) topology in order to improve the speed of the circuit. The loop is designed so that the loop-filter components can be placed on-chip, keeping the PLL fully-integrated. The PLL was measured to have a broad lock-range (135 MHz to 300 MHz), reasonably low phase-noise (-92.5 dBc/Hz at 100 kHz away from a 202.2 MHz carrier), low jitter (15.6 ps RMS and 111.6 ps peak-to-peak), and low power (5.8 mA from a 3.3 V supply, not including test circuitry and pad drivers). The final design occupied an integrated area of 1600  $\mu$m x 1600  $\mu$m, including all pads and test circuitry.

Thus, this technique would be a good means of obtaining accurately-known transistor transconductances in *any on-chip system* that includes an off-chip clock-generator, with only a small area and power dissipation penalty (as little as 0.8 × 0.8 mm<sup>2</sup> and less than 20 mW). In some systems (such as the current one) in which a PLL is required *anyway*, the only extra circuitry is the adjustable R<sub>bias</sub>, up/down logic, and comparators, which consume an area of roughly 0.5 × 0.5 mm<sup>2</sup> and a power of 2.2 mW.

Note that the PLL actually tunes  $g_m/C$  in the  $g_m$ -controlled oscillator. This is because, in steady-state, the  $G_mCO$  frequency must equal the input frequency. In order to obtain a certain frequency from the  $G_mCO$ , the  $C/g_m$  delay through the inverters must be tuned. Thus, the accuracy of  $g_m$  in the system is determined by the variations of on-chip capacitors. Because on-chip capacitors display very little variation with temperature or power-supply changes, the  $g_m$  accuracy will primarily be determined by the process-variations of on-chip capacitors (3 $\sigma$  of 10%). Often, however, it is the time-constants  $C/g_m$  we wish to obtain accurately, in which case this system could provide even greater accuracy.
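The dependence described above can be made explicit with a first-order sensitivity sketch. It assumes, as an idealization not stated in this form in the text, that the locked oscillator frequency is $f = g_m/(kC)$, where $k$ is a fixed, topology-dependent constant introduced here for illustration only.

$$g_m = k\,C\,f_{ref} \quad\Rightarrow\quad \frac{\Delta g_m}{g_m} \approx \frac{\Delta C}{C} + \frac{\Delta f_{ref}}{f_{ref}}$$

With an accurate reference ($\Delta f_{ref}/f_{ref} \approx 0$), the fractional error in $g_m$ tracks the fractional error in $C$, while the time constant $C/g_m = 1/(k f_{ref})$ does not depend on $C$ under this idealization.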

## **6.2 Suggestions for Future Work**

While it was interesting to investigate a dividerless topology, it would likely be necessary in any practical implementation to incorporate a frequency divider in the loop (the N block in Figure 2.1) in order to allow the use of a crystal-oscillator reference (crystal oscillators are only widely available for frequencies up to roughly 50 MHz [Electrosonic, 1999]).

To further decrease phase noise, it would be beneficial to make the PFD fully-differential. This could also eliminate the need for a buffer between the VCO and PFD. The current charge-pump design already has a differential input; however, in order to make the charge-pump output differential, the bias-circuit topology would have to be drastically altered. The charge-pump circuit could also be improved by using wide-swing cascode current-mirrors to mirror current to the loop filter. This would decrease the variation of the phase-detector constant with changes in the charge-pump output-voltage.

It would be interesting to see how accurately-determined time-constants are over process and temperature using this system. Unfortunately, we designed no means for such measurements into the final chip.

Finally, it would be interesting to measure the reduction in phase-noise that could result from including a second VCO biased only with the digital PLL signals.

## **6.3 References**

Electrosonic, Catalogue 991, 1999.


# **Appendix A: VHDL Code for Up/Down Logic**

The VHDL code for the up/down counter is given below. All lines that start with a double-hyphen (--) are comments.

    library IEEE;
    USE IEEE.std_logic_1164.all;
    USE IEEE.std_logic_unsigned.all;

    ENTITY hybrid3 IS
      PORT(
        --inputs
        clk,reset,up,down: IN std_logic;
        --outputs
        countn: OUT std_logic_vector (18 downto 0));
    END hybrid3;

    ARCHITECTURE maia of hybrid3 IS
      signal countfilttherm: std_logic_vector (14 downto 0);
      signal thermcount: std_logic_vector (3 downto 0);
      signal bincount: std_logic_vector (3 downto 0);
      signal count: std_logic_vector(18 downto 0);
    BEGIN
      --count(14 downto 0) stores thermometer state
      count(14 downto 0)<=countfilttherm;
      --count(18 downto 15) stores binary state
      count(18 downto 15)<=bincount;
      --output should be active-low
      countn<=not(count);

      --this process maintains the thermometer state
      PROCESS(clk,reset)
        variable tmpcount: std_logic_vector (3 downto 0);
      BEGIN
        --If reset is high, reset therm to its middle value
        IF (reset='1') THEN
          tmpcount:="1000";
          thermcount<=tmpcount;
        ELSIF (clk'EVENT and clk='1') THEN
          IF (up='1') THEN
            --If up is high and therm hasn't saturated, increment therm
            IF (NOT(thermcount ="1111")) THEN
              tmpcount:=tmpcount+1;
              thermcount<=tmpcount;
            ELSE
              --If up is high and therm has saturated, reset therm
              IF (not(bincount="1111")) THEN
                tmpcount:="1000";
                thermcount<=tmpcount;
              END IF;
            END IF;
          ELSIF (down='1') THEN
            --If down is high and therm hasn't saturated, decrement therm
            IF (thermcount /="0000") THEN
              tmpcount:=tmpcount-1;
              thermcount<=tmpcount;
            ELSE
              --If down is high and therm has saturated, reset therm
              IF (not(bincount="0000")) THEN
                tmpcount:="1000";
                thermcount<=tmpcount;
              END IF;
            END IF;
          END IF;
        END IF;
      END PROCESS;

      -- This process transforms 4-bit therm state into thermometer code
      PROCESS(thermcount)
        variable tmpcountfilttherm: std_logic_vector (14 downto 0);
      BEGIN
        CASE thermcount(3 downto 0) IS
          WHEN "0000"=> tmpcountfilttherm:="000000000000000";
          WHEN "0001"=> tmpcountfilttherm:="000000000000001";
          WHEN "0010"=> tmpcountfilttherm:="000000000000011";
          WHEN "0011"=> tmpcountfilttherm:="000000000000111";
          WHEN "0100"=> tmpcountfilttherm:="000000000001111";
          WHEN "0101"=> tmpcountfilttherm:="000000000011111";
          WHEN "0110"=> tmpcountfilttherm:="000000000111111";
          WHEN "0111"=> tmpcountfilttherm:="000000001111111";
          WHEN "1000"=> tmpcountfilttherm:="000000011111111";
          WHEN "1001"=> tmpcountfilttherm:="000000111111111";
          WHEN "1010"=> tmpcountfilttherm:="000001111111111";
          WHEN "1011"=> tmpcountfilttherm:="000011111111111";
          WHEN "1100"=> tmpcountfilttherm:="000111111111111";
          WHEN "1101"=> tmpcountfilttherm:="001111111111111";
          WHEN "1110"=> tmpcountfilttherm:="011111111111111";
          WHEN OTHERS=> tmpcountfilttherm:="111111111111111";
        END CASE;
        countfilttherm<=tmpcountfilttherm;
      end process;

      -- This process maintains the binary state
      process(clk,reset)
        variable tmpcount: std_logic_vector (3 downto 0);
      begin
        --If reset is high, reset binary signal to middle value
        IF (reset='1') THEN
          tmpcount:="1000";
          bincount<="1000";
        ELSIF (clk'EVENT and clk='1') THEN
          --If up is high and therm has saturated, and binary hasn't saturated,
          --increment binary
          IF ((thermcount="1111") AND (up='1')) THEN
            IF (NOT (bincount="1111")) THEN
              tmpcount:=tmpcount+1;
              bincount<=tmpcount;
            END IF;
          --If down is high and therm has saturated, and binary hasn't saturated,
          --decrement binary
          ELSIF ((down='1') AND (thermcount="0000")) THEN
            IF (NOT (bincount ="0000")) THEN
              tmpcount:=tmpcount-1;
              bincount<=tmpcount;
            END IF;
          END IF;
        END IF;
      END process;

    END maia;
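The behaviour of the two clocked processes can be explored with a small software model. The sketch below is an illustrative Python re-expression of the VHDL, not part of the original design flow; the function names `step` and `countn` are chosen here for convenience. Both state updates are computed from the pre-edge values, as signal assignments are in VHDL.

```python
THERM_MID = 8   # "1000", reset value of thermcount
BIN_MID = 8     # "1000", reset value of bincount
MAX4 = 15       # "1111", saturation value of a 4-bit count

def step(therm, binary, up, down):
    """Apply one rising clock edge and return the new (therm, binary) state."""
    # Thermometer process: count, or re-centre when saturated and binary can move
    new_therm = therm
    if up:
        if therm != MAX4:
            new_therm = therm + 1
        elif binary != MAX4:
            new_therm = THERM_MID
    elif down:
        if therm != 0:
            new_therm = therm - 1
        elif binary != 0:
            new_therm = THERM_MID

    # Binary process: moves only when the thermometer count is saturated
    new_binary = binary
    if up and therm == MAX4:
        if binary != MAX4:
            new_binary = binary + 1
    elif down and therm == 0:
        if binary != 0:
            new_binary = binary - 1
    return new_therm, new_binary

def countn(therm, binary):
    """19-bit active-low output: bits 18..15 binary, bits 14..0 thermometer code."""
    count = (binary << 15) | ((1 << therm) - 1)
    return ~count & 0x7FFFF

state = (THERM_MID, BIN_MID)
for _ in range(8):   # eight consecutive "up" requests starting from reset
    state = step(*state, up=True, down=False)
print(state, hex(countn(*state)))
```

In this model, re-centring the thermometer count at its middle value whenever the binary count changes is what gives the overlapping ranges with hysteresis: several further requests in the same direction are needed before the binary count can change again.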

Appendices

## **Appendix B: Derivation of Inverter Delay**

For the derivation of the delay through a differential CMOS inverter, the circuit structure shown in Figure B.1 will be assumed.

Figure B.1: Circuit Schematic For Analyzed Inverter Structure

Now, assume that there is a sharp transient on the input signals inp and inn such that inp suddenly goes to VDD and inn suddenly goes to zero. In such a case, m3 turns off, and all of the current from m1 flows through m2, while outn falls from VDD to some final voltage (VDD -  $I_1r_{ds4}$ if m4 is in the triode region), and outp rises to VDD. Let us determine the transient response at outn, provided that m4 remains in the triode region (i.e. outn remains above VDD-V<sub>eff4</sub>).

An approximate equivalent circuit for this case is shown in Figure B.2, where the current-source m1 has been replaced by current source  $I_{EE}$ , and transistors m3 and m5 have been omitted from the circuit. Also, the load capacitance  $C_L$  has been added on the node outn. If m4 remains in the triode region for the entire transient, it can be modelled as a resistor. Although the value of this resistor varies as the output voltage changes, let us assume a constant value of  $r_{ds4}$ .



Figure B.2: Simplified Circuit Schematic for Falling Output Signal

With this simplification, one must solve the simple differential equation given in Equation B.1 to find the output voltage transient.

$$VDD - I_{EE}r_{ds4} = v_o + r_{ds4}C_L \frac{dv_o}{dt} \tag{EQ B.1}$$

The solution is given in Equation B.2, which yields the expression given in Equation B.3 as the propagation delay, where the delay has been measured between the rising input signal and the time when $v_o$ reaches its bias point of $VDD - r_{ds4}I_{EE}/2$, at which point the differential output signal is zero.

$$v_o(t) = VDD - I_{EE}r_{ds4}\left(1 - e^{-t/(r_{ds4}C_L)}\right) \tag{EQ B.2}$$

$$T_d = r_{ds4} C_L \ln(2) \tag{EQ B.3}$$
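The step from the first-order differential equation to the $\ln(2)$ delay can be checked numerically. The sketch below evaluates the exponential solution at $T_d$ and, independently, integrates Equation B.1 with a crude forward-Euler loop. The values of `R_DS4` and `I_EE` are hypothetical, chosen only for illustration; `C_L` is the load capacitance quoted later in this appendix.

```python
import math

VDD = 3.3         # supply voltage, V
R_DS4 = 2.0e3     # hypothetical triode resistance of m4, ohms
C_L = 95.7e-15    # load capacitance, F
I_EE = 100e-6     # hypothetical tail current, A

TAU = R_DS4 * C_L
V_MID = VDD - I_EE * R_DS4 / 2.0   # midpoint of the output swing

def v_out(t):
    """Exponential solution of the first-order equation, starting from VDD."""
    return VDD - I_EE * R_DS4 * (1.0 - math.exp(-t / TAU))

def euler_crossing_time(target, dt):
    """Integrate dv/dt = (VDD - I_EE*r - v)/(r*C) until v falls to target."""
    v, t = VDD, 0.0
    while v > target:
        v += (VDD - I_EE * R_DS4 - v) / TAU * dt
        t += dt
    return t

t_d = TAU * math.log(2.0)                          # Equation B.3
t_num = euler_crossing_time(V_MID, TAU / 10000.0)  # numerical cross-check

print(f"T_d = {t_d * 1e12:.1f} ps, numerical = {t_num * 1e12:.1f} ps")
```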

The expression for  $r_{ds4}$  is given in Equation B.4, where the approximation is valid if the drain-to-source voltage of m4 is much smaller than its effective gate-to-source voltage  $V_{eff4}$ .

$$r_{ds4} = \frac{1}{\mu_p C_{ox} \left(\frac{W}{L}\right)_4 (V_{eff4} + v_o - VDD)} \approx \frac{1}{\mu_p C_{ox} \left(\frac{W}{L}\right)_4 V_{eff4}} \propto \frac{1}{g_{m, sat}}$$
(EQ B.4)


It was found through simulation that the switching point for the inverters occurred when the inverter output voltage was just above VDD -  $V_{eff4}$. Thus, the triode-region transistor can no longer be modelled by a fixed resistor, so that we get a more complex result. The equation for current flowing into the capacitor is given in Equation B.5, where  $I_C$  and  $I_4$  are given as in Equation B.6 and B.7, respectively.

$$I_C = I_4 - I_{EE} \tag{EQ B.5}$$

$$I_C = C_L \frac{dv_o}{dt} \tag{EQ B.6}$$

$$I_{4} = \frac{1}{2}\mu_{p}C_{ox}\left(\frac{W}{L}\right)_4\left(2V_{eff4}(VDD - v_{o}) - (VDD - v_{o})^{2}\right) \tag{EQ B.7}$$

This yields the differential equation for the output voltage given in Equation B.8, where  $K_{p4}=0.5\mu_pC_{ox}(W/L)_4$.

$$\frac{-C_L}{K_{p4}} \frac{dv_o}{dt} = (v_o + V_{eff4} - VDD)^2 - V_{eff4}^2 + \frac{I_{EE}}{K_{p4}} \tag{EQ B.8}$$

If this equation is solved to find the time at which the output voltage equals  $V_{sw}$  (at which point the differential inverter output voltage is zero), the result is given as in Equation B.9 (assuming that m4 is still in the triode region at the switching point).

$$T_{p} = \frac{C_{L}}{K_{p4} \cdot V_{eff4}\sqrt{Z-1}} \left( \operatorname{atan}\left(\frac{1}{\sqrt{Z-1}}\right) - \operatorname{atan}\left(\frac{1-\frac{(VDD-V_{sw})}{V_{eff4}}}{\sqrt{Z-1}}\right) \right) \quad \text{where} \quad Z = \frac{I_{EE}}{K_{p4} \cdot V_{eff4}^{2}} \tag{EQ B.9}$$

Using values obtained from simulation, it was found that for a VCO frequency of 250 MHz, Z = 1.3947,  $K_{p4}$  = 567.3 $\mu$A/V<sup>2</sup>,  $V_{eff4}$  = 432.2 mV,  $C_L$ = 95.7 fF (including the parasitic gate capacitance of the next stage), and  $V_{sw} = 2.90$  V. This gives a theoretical value for  $T_p$  of 0.554 ns, for a VCO frequency of  $\frac{1}{8} \cdot \left(\frac{1}{0.554\,\text{ns}}\right) = 226$  MHz. If Equation B.3 is used to calculate the VCO frequency, a value of 0.31 ns is found for the delay, for a VCO frequency of 400 MHz.
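The quoted delay can be reproduced by evaluating Equation B.9 directly. The sketch below uses the simulation values listed in the text, with two assumptions made explicit: VDD is taken as the 3.3 V supply, and $K_{p4}$ is entered as 567.3 $\mu$A/V<sup>2</sup>, which is the magnitude for which the equation returns the quoted 0.554 ns. The factor of 8 delays per oscillation period is taken from the expression in the text.

```python
import math

VDD = 3.3               # supply voltage, V (assumed equal to the 3.3 V supply)
Z = 1.3947              # I_EE / (K_p4 * V_eff4^2), from simulation
K_P4 = 567.3e-6         # A/V^2, i.e. 567.3 uA/V^2
V_EFF4 = 0.4322         # effective gate-source voltage of m4, V
C_L = 95.7e-15          # load capacitance incl. next-stage gate, F
V_SW = 2.90             # switching voltage, V
DELAYS_PER_PERIOD = 8   # factor used in the text to convert T_p to frequency

def propagation_delay():
    """Evaluate Equation B.9 for the constants defined above."""
    root = math.sqrt(Z - 1.0)
    prefactor = C_L / (K_P4 * V_EFF4 * root)
    upper = math.atan(1.0 / root)
    lower = math.atan((1.0 - (VDD - V_SW) / V_EFF4) / root)
    return prefactor * (upper - lower)

t_p = propagation_delay()
f_vco = 1.0 / (DELAYS_PER_PERIOD * t_p)

print(f"T_p = {t_p * 1e9:.3f} ns, f_VCO = {f_vco / 1e6:.0f} MHz")
```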