


# A 12.5 Gb/s 1.38 mW all-inverter-based optical receiver with multi-stage feedback TIA and continuous-time linear equalizer

Peng Yan (pengyan@tamu.edu), Chaerin Hong, Po-Hsuan Chang, Hyungryul Kang, Dedeepya Annabattuni, Ankur Kumar, Yang-Hang Fan, Ruida Liu, Ramy Rady, Samuel Palermo (all Texas A&M University)

# **Research Article**

**Keywords:** active inductor, CMOS, CTLE, inverter, low noise, low power, optical receiver, transimpedance amplifier, transimpedance limit

Posted Date: February 10th, 2023

DOI: https://doi.org/10.21203/rs.3.rs-2557778/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

**Version of Record:** A version of this preprint was published at Analog Integrated Circuits and Signal Processing on February 3rd, 2024. See the published version at https://doi.org/10.1007/s10470-024-02248-1.

Peng Yan<sup>1</sup>, Chaerin Hong<sup>1</sup>, Po-Hsuan Chang<sup>1</sup>, Hyungryul Kang<sup>1</sup>, Dedeepya Annabattuni<sup>1</sup>, Ankur Kumar<sup>1</sup>, Yang-Hang Fan<sup>1</sup>, Ruida Liu<sup>1</sup>, Ramy Rady<sup>1</sup> and Samuel Palermo<sup>1</sup>

<sup>1</sup>Analog and Mixed-Signal Center, Texas A&M University, 188 Bizzell St, College Station, 77843, Texas, USA.

Contributing authors: pengyan@tamu.edu; chaerin@tamu.edu; pchang0628@tamu.edu; hrkang@tamu.edu; dedeepya@tamu.edu; ankur.kumar@tamu.edu; yhfanmail@tamu.edu; rdliu918@tamu.edu; ramyrady@tamu.edu; spalermo@tamu.edu;

### Abstract

An optical receiver employs an all-inverter-based front-end design that provides maximum transconductance for a given power supply and allows for ultra-low power consumption. The feedback transimpedance amplifier (TIA) input stage utilizes a multi-stage amplifier to achieve a dramatic increase in feedback resistance and lower input-referred noise. Cascading an inverter-based active inductor continuous-time linear equalizer (CTLE) provides frequency peaking to compensate the input stage TIA that is intentionally designed with a reduced bandwidth to achieve adequate sensitivity at low power. Fabricated in 28 nm CMOS, the 12.5 Gb/s optical receiver achieves -10.7 dBm OMA sensitivity at 0.11 pJ/bit energy efficiency and occupies only 720 µm<sup>2</sup> area.

# 1 Introduction

Advances in machine learning, artificial intelligence, and 5G workloads are driving performance scaling in high-performance computing (HPC) systems and datacenters. Low-power low-noise CMOS optical receiver (RX) circuits are essential to improve overall datalink energy efficiency in these systems. It is important that these receivers have adequate sensitivity to detect weak optical signals, resulting in relaxed source laser power, while also achieving low front-end circuit power and compact area.

Inverter-based feedback transimpedance amplifier (TIA) front-ends have been developed that allow for operation at low supply voltages [1]. However, additional techniques are necessary to cope with the inherent transimpedance limit in these feedback TIA designs [2][3]. Receivers with low-bandwidth input TIAs, such as front-ends with integrate-and-dump stages [4], cascaded decision-feedback equalizer (DFE) [5], duobinary sampling [6], or continuous-time linear equalizer (CTLE) [7] circuits, have been proposed to address this. However, additional power is required in these designs due to the clocked integration stages [4], meeting the critical DFE timing [5], additional slicers and logic gates to resolve the symbols [6], and current-mode logic CTLE stages [7]. TIAs with multi-stage amplifiers are another way to extend the transimpedance limit by achieving a higher gain-bandwidth product [3]. Unfortunately, typical multi-stage broadband amplifiers with several roughly equal non-dominant poles require significantly faster technologies to ensure that the overall phase shift in the TIA feedback loop is small enough to achieve good stability [8], which is difficult in high-speed designs. It is possible to add on-chip peaking inductors to allow for bandwidth extension in these multi-stage amplifier TIAs [9]. However, this comes at a significant area penalty and is not suitable for the high-bandwidth-density optical interconnect transceivers necessary in future systems.

This paper presents a 12.5 Gb/s all-inverter-based optical receiver that addresses these challenges [10]. A multi-stage amplifier TIA input stage allows for increased feedback resistance and lower input-referred noise. Intentionally designing this TIA stage with a reduced bandwidth allows for adequate sensitivity at ultra-low power and area consumption. A subsequent inverter-based active inductor CTLE provides additional gain and frequency peaking to compensate for intersymbol interference (ISI) from the low-bandwidth TIA. Optical receiver noise analysis and design limitations are provided in Section 2. Section 3 discusses key circuit design details of the all-inverter-based design. Experimental results of the optical receiver prototype, which was fabricated in a 28 nm CMOS technology, are presented in Section 4. Finally, Section 5 concludes the paper.

# 2 Optical receiver noise analysis

Fig. 1 Optical receiver front-end noise model.

An input-referred noise model that indicates the main noise sources is widely used in optimizing a receiver's signal-to-noise ratio (SNR). As shown in Fig. 1, optical receiver noise performance is generally limited by the TIA front-end, because the noise from subsequent circuits is suppressed by the transimpedance gain. The feedback resistor $R_F$ and the amplifier are the two major noise sources [7], resulting in the following input-referred noise power spectral density $\overline{I_{n,in}^2(\omega)}$.

$$\overline{I_{n,in}^{2}(\omega)} \approx \overline{I_{n,R_{F}}^{2}(\omega)} + \overline{I_{n,amp}^{2}(\omega)}$$

$$= \frac{4kT}{R_{F}} + \frac{\overline{v_{n,amp}^{2}(\omega)}}{R_{F}^{2}} + \omega^{2}(C_{in} + C_{in,amp})^{2}\overline{v_{n,amp}^{2}(\omega)} \qquad (1)$$

$$\approx \frac{4kT}{R_{F}} + \omega^{2}(C_{in} + C_{in,amp})^{2}\overline{v_{n,amp}^{2}(\omega)}$$

where $C_{in}$ is the combined PD and packaging capacitance at the input node, $C_{in,amp}$ is the amplifier input capacitance, $\overline{v_{n,amp}^2(\omega)} = 4kT\gamma/g_m$ is the amplifier's input-referred noise power spectral density, and $\gamma$ is the channel-noise factor. The $\overline{v_{n,amp}^2(\omega)}/R_F^2$ term is omitted as $4kT/R_F \gg 4kT\gamma/(g_m R_F^2)$ always holds. For a given amplifier structure, both $C_{in,amp}$ and $\overline{v_{n,amp}^2(\omega)}$ are determined by the size and power, especially of its input stage. $C_{in,amp}$ is proportional to size/power, while $\overline{v_{n,amp}^2(\omega)}$ is inversely proportional to them. The following optimum value is achieved when $C_{in} = C_{in,amp}$ [2][11].

$$min. \ \overline{I_{n,in}^2(\omega)} = \frac{4kT}{R_F} + \omega^2 \frac{16kT\gamma C_{in}}{A\omega_A} \tag{2}$$

where $A\omega_A$ is the gain-bandwidth product, which is roughly equal to the CMOS technology transition frequency $\omega_T$ in a single-stage amplifier but may reach a higher value in a multi-stage amplifier [3][8]. For an inverter-based amplifier, the PMOS/NMOS size ratio should be set to achieve the maximum $\omega_T$ permitted by the supply voltage. While optimum sensitivity is achieved with equal input and amplifier capacitance, a smaller input stage can be used to trade degraded sensitivity for lower power.
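The capacitive-matching optimum can be checked numerically. The sketch below evaluates the approximate form of Eq. (1) while sweeping the amplifier input capacitance, under the assumption that $g_m = A\omega_A C_{in,amp}$. Only $C_{in}$ = 150 fF is taken from this paper; `R_F`, `GBW`, `FREQ`, and $\gamma = 1$ are hypothetical values chosen for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def input_noise_psd(freq_hz, r_f, c_in, c_in_amp, gbw_rad, temp_k=300.0, gamma=1.0):
    """Approximate input-referred noise PSD of Eq. (1), in A^2/Hz.

    Assumes v_n,amp^2 = 4kT*gamma/g_m and g_m = gbw_rad * c_in_amp,
    i.e. the gain-bandwidth product A*omega_A is taken as g_m / C_in,amp.
    """
    omega = 2 * math.pi * freq_hz
    g_m = gbw_rad * c_in_amp
    v_n2 = 4 * K_B * temp_k * gamma / g_m
    return 4 * K_B * temp_k / r_f + omega**2 * (c_in + c_in_amp) ** 2 * v_n2


# Hypothetical operating point (only C_IN = 150 fF comes from the paper).
C_IN = 150e-15
R_F = 5e3
GBW = 2 * math.pi * 100e9
FREQ = 5e9

# Sweep C_in,amp from 0.10x to 3.00x of C_in and keep the quietest point.
ratios = [i / 100 for i in range(10, 301)]
best_ratio = min(ratios, key=lambda r: input_noise_psd(FREQ, R_F, C_IN, r * C_IN, GBW))

# Closed form of Eq. (2) at the matched point, for comparison.
omega = 2 * math.pi * FREQ
eq2 = 4 * K_B * 300.0 / R_F + omega**2 * 16 * K_B * 300.0 * 1.0 * C_IN / GBW
matched = input_noise_psd(FREQ, R_F, C_IN, C_IN, GBW)
print(best_ratio, matched, eq2)
```

Under these assumptions the sweep lands on $C_{in,amp} = C_{in}$, and the matched value agrees with the closed form in Eq. (2).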

Fig. 2 Simulated TIA frequency response with different amplifier bandwidths.

The TIA's dominant pole is usually placed at the input node for higher $R_F$ and better noise performance. Considering a first-order amplifier response $A(s) = A/(1 + s/\omega_A)$ for high data-rate applications, this results in the following second-order TIA frequency response.

$$Z_T(s) = \frac{-R_F}{1 + \frac{(1+s/\omega_A)(1+sR_FC_T)}{A}}, \tag{3}$$

where  $C_T = C_{in} + C_{in,amp}$ . Setting  $\sqrt{2}\omega_{TIA} = 2A/(R_F C_T)$  provides a second-order Butterworth TIA response for the transimpedance limit [2][3][7].

$$max. \ R_F = \frac{A\omega_A}{C_T \omega_{TIA}^2} = \frac{\sqrt{2}A}{C_T \omega_{TIA}} \tag{4}$$

The TIA performance degrades if the amplifier pole position deviates from this value. Fig. 2 shows the simulated TIA frequency response with the same amplifier gain $A$ and $R_F$, but different $\omega_A$. A lower $\omega_A$ results in ac overshoot and requires a reduction in $\omega_{TIA}$ to improve the phase margin, while a higher $\omega_A$ has lower bandwidth with excessive phase margin and wastes the potential for a higher $R_F$ value with improved noise. Trading excessive $\omega_A$ for a higher amplifier gain is critical to achieve the optimal noise performance, as indicated by (4).
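The overshoot caused by a slow amplifier pole can be illustrated with a short sketch of the second-order response in Eq. (3), using a first-order amplifier with DC gain $A$. The component values (`R_F`, `C_T`, `A0`) are hypothetical and not taken from the design; the amplifier pole is set to $2A/(R_F C_T)$ for the Butterworth case and to half of that for comparison.

```python
import math


def tia_gain(omega, r_f, c_t, a0, omega_a):
    """|Z_T(j*omega)| for Eq. (3) with amplifier A(s) = a0 / (1 + s/omega_a)."""
    s = 1j * omega
    return abs(-r_f / (1 + (1 + s / omega_a) * (1 + s * r_f * c_t) / a0))


def peaking(r_f, c_t, a0, omega_a, points=2000):
    """Largest |Z_T| over a log-spaced sweep, relative to DC (<= 1.0 means no overshoot)."""
    dc = tia_gain(0.0, r_f, c_t, a0, omega_a)
    w_ref = a0 / (r_f * c_t)
    # Four decades centered on w_ref, which brackets the closed-loop bandwidth.
    sweep = (w_ref * 10 ** (-2 + 4 * i / points) for i in range(points + 1))
    return max(tia_gain(w, r_f, c_t, a0, omega_a) for w in sweep) / dc


# Hypothetical values for illustration only.
R_F, C_T, A0 = 5e3, 300e-15, 20.0
W_BUTTER = 2 * A0 / (R_F * C_T)  # amplifier pole for the Butterworth case

peak_butter = peaking(R_F, C_T, A0, W_BUTTER)
peak_slow = peaking(R_F, C_T, A0, W_BUTTER / 2)
print(f"Butterworth pole: {peak_butter:.3f}, half-speed pole: {peak_slow:.3f}")
```

With these numbers the Butterworth placement shows no peaking, while halving $\omega_A$ produces roughly 1 dB of overshoot, in line with the trend described for Fig. 2.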

Substituting (4) into (2) and assuming $C_{in} = C_{in,amp}$, the optimal input-referred noise power spectral density is

$$min. \ \overline{I_{n,in}^2(\omega)} = \frac{8kTC_{in}}{A\omega_A} (\omega_{TIA}^2 + 2\gamma\omega^2). \tag{5}$$

Integrating it with the second-order Butterworth frequency response, the input-referred noise power is proportional to $C_{in}\omega_{TIA}^3/(A\omega_A)$ [2][5]. This relationship can be understood intuitively for an optical receiver properly designed for a given $C_{in}$. If $C_{in}$ increases by n times, the whole front-end can be sized up proportionally to maintain the same phase margin and bandwidth, which means increasing the amplifier's size and reducing $R_F$ by n times. Input-referred noise power increases by n times as a result, due to the 1/n times transimpedance gain and 1/n times output-referred noise power.

Fig. 3 Inverter-based optical receiver.

The closed-loop  $\omega_{TIA}$  has much more impact on the receiver's noise performance. Achieving  $n\omega_{TIA}$  requires A/n amplifier gain for a constant  $A\omega_A$ , resulting in  $(1/n^2)R_F$  that generates  $n^2$  times higher noise density, which means  $n^3$  times noise power integrating over  $n\omega_{TIA}$ . The same input-referred noise density from the amplifier also generates  $n^3$  noise power over  $n\omega_{TIA}$ . Thus, the whole input-referred noise power is  $\propto n^3$ , which motivates several noise reduction techniques based on low-bandwidth front-ends.

Conversely, increasing $A$ by n times with the same $\omega_A$ implies that $g_m$ is proportionally higher for the same $C_{in,amp}$, resulting in 1/n times input-referred noise power from the amplifier. The $R_F$ value can also be increased proportionally to maintain the same bandwidth and phase margin, generating 1/n times input-referred noise power from the feedback resistor.
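The scaling arguments above can be collected into one small helper. This is a sketch of the proportionality $C_{in}\omega_{TIA}^3/(A\omega_A)$ only, not a noise simulation. The conversion to an optical sensitivity shift assumes that the required optical power tracks the rms noise current, so a noise power ratio maps to $5\log_{10}$ in optical dB. The function names are illustrative.

```python
import math


def noise_power_scale(c_in_scale=1.0, bw_scale=1.0, gbw_scale=1.0):
    """Relative input-referred noise power, proportional to C_in * w_TIA^3 / (A * w_A)."""
    return c_in_scale * bw_scale**3 / gbw_scale


def sensitivity_shift_db(power_scale):
    """Optical sensitivity shift in dB, assuming optical power tracks rms noise current."""
    return 5 * math.log10(power_scale)


# Doubling the TIA bandwidth at constant gain-bandwidth product: 8x noise power.
double_bw = noise_power_scale(bw_scale=2.0)

# One decade of TIA bandwidth corresponds to a 15 dB sensitivity change.
per_decade_db = sensitivity_shift_db(noise_power_scale(bw_scale=10.0))

# Doubling the amplifier gain at the same w_A halves the noise power.
double_gain = noise_power_scale(gbw_scale=2.0)
print(double_bw, per_decade_db, double_gain)
```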

# 3 Optical receiver design

Fig. 3 shows the single-ended all-inverter-based optical receiver architecture. A low-bandwidth TIA converts the single-ended PD input current to a single-ended output voltage that is equalized by the CTLE. An RC low-pass filter (LPF) generates a pseudo-differential signal for the four slicers driven by 4-phase quarter-rate clocks. A DC cancellation loop eliminates the low-frequency (LF) component of the input current and stabilizes the output common-mode voltage.

### 3.1 Low-bandwidth TIA with multi-stage amplifier

As shown in Figs. 4 and 5, it is possible to utilize either a conventional single-stage amplifier TIA cascaded with a transconductance-transimpedance voltage amplifier [1] or the proposed TIA with a multi-stage amplifier to achieve the same effective transimpedance gain and bandwidth. Both topologies employ inverter-based gain stages with current reuse, which efficiently provides both NMOS and PMOS transconductance and thus a higher gain-bandwidth product. A single-stage inverter is used as the amplifier in the conventional TIA, with the amplifier pole $\omega_A$ at its output node. Though it is easier to achieve good stability, designers have no direct control over $\omega_A$ when the amplifier is sized to capacitively match $C_{in}$. In modern CMOS technology, low intrinsic device gain limits $R_F$ to even lower than (4), generating higher input-referred noise [8]. The potential of the high $\omega_T$ is completely wasted in this scenario.

Fig. 4 (a) Conventional TIA with post-amplifier and (b) proposed TIA with multi-stage feedback amplifier.

Fig. 5 Small signal equivalent models of (a) conventional TIA with post-amplifier and (b) proposed TIA.

Fig. 6 Transimpedance and input-referred noise power spectral density of conventional TIA with post-amplifier and proposed TIA.

The proposed design deals with this problem using an inductor-less TIA based on a multi-stage amplifier, which is equivalent to placing the post-amplifier inside the TIA feedback loop. The post-amplifier is comprised of a transconductance input stage with a TIA load stage and has the following second-order frequency response.

$$A_{2}(s) = \frac{g_{m2}R_{F2}}{1+1/L(s)}$$

$$L(s) = \frac{(g_{m3}R_{F2}-1)\frac{r_{ds2}}{R_{F2}+r_{ds2}}\frac{r_{ds3}}{R_{F2}+r_{ds3}}}{[1+s(r_{ds2}||R_{F2})C_{L2}][1+s(r_{ds3}||R_{F2})C_{L3}]} \tag{6}$$

A conventional multi-stage TIA without the local feedback resistor $R_{F2}$ has a significantly lower bandwidth, limited by the three roughly equal non-dominant poles at the output nodes of each stage for a required phase margin [3][8]. Adding $R_{F2}$ in parallel with the third stage $S_3$ reduces its intrinsic gain and pushes both poles in the post-amplifier to higher frequency, generating a higher post-amplifier bandwidth that is $> \omega_{TIA}$. A higher $R_{F2}$ generates lower post-amplifier bandwidth with more phase shift in the global TIA feedback loop, and vice versa. Through proper choice of the $R_{F2}$ value, this phase shift compensates for the excessive phase margin caused by the fast $\omega_A$ at the input stage's output node, achieving the same $63.4^\circ$ phase margin as in a Butterworth response. In this case, the post-amplifier can be modeled as an ideal gain stage while its phase shift effect $1/[1 + 1/L(s)]$ is included in the input stage model.
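One way to see where the $63.4^\circ$ figure comes from is a two-pole loop-gain estimate. The sketch below is an approximation under stated assumptions, not the authors' derivation: the dominant input pole is treated as an integrator near crossover (about 90 degrees of lag), the loop's unity-gain frequency is taken as $\omega_u \approx A/(R_F C_T)$, and the non-dominant pole sits at the Butterworth value $\omega_A = 2A/(R_F C_T)$ from Section 2, so $\omega_u/\omega_A = 0.5$.

```python
import math


def phase_margin_deg(crossover_to_pole_ratio):
    """Phase margin of an integrator-like loop with one extra pole.

    crossover_to_pole_ratio = omega_u / omega_A, where omega_u is the loop's
    unity-gain frequency. The dominant pole contributes about 90 degrees of lag.
    """
    return 90.0 - math.degrees(math.atan(crossover_to_pole_ratio))


# Assumed Butterworth placement: omega_u ~ A/(R_F*C_T), omega_A = 2A/(R_F*C_T).
pm_butterworth = phase_margin_deg(0.5)
print(round(pm_butterworth, 1))  # 63.4
```

In this picture, a faster $\omega_A$ lowers the ratio and raises the phase margin above $63.4^\circ$, which is the surplus that the post-amplifier phase shift is chosen to absorb.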

The overall transfer functions of the conventional and proposed front-ends are derived here, each with its dominant pole $1/(R_F C_T)$ at the input node and its non-dominant pole $\omega_A$ at the output node of the input stage $S_1$.



Fig. 7 Inverter-based CTLE schematic, small signal model, and noise reduction via TIA input stage bandwidth reduction.

$$Z_{T,conv.}(s) = \frac{-g_{m2}R_{F2}R_F}{1 + \frac{1}{A_{conv.}(s)} + \frac{sR_FC_T}{A_{conv.}(s)}}$$

$$A_{conv.}(s) = \frac{(g_{m1}R_F - 1)r_{ds1}}{[1 + s(r_{ds1} || R_F)C_{L1}](R_F + r_{ds1})}$$

$$Z_{T,proposed}(s) = \frac{-g_{m2}R_{F2}R_F}{1 + \frac{1}{g_{m2}R_{F2}A_{proposed}(s)} + \frac{sR_FC_T}{A_{proposed}(s)}}$$

$$A_{proposed}(s) = \frac{g_{m1}r_{ds1}}{1 + sr_{ds1}C_{L1}} \approx A_{conv.}(s), \text{ when } R_F \gg r_{ds1} \tag{7}$$

As shown in Fig. 6, roughly the same $Z_{T,conv.}(s)$ and $Z_{T,proposed}(s)$ are possible. Power consumption and the input-referred noise from the TIA feedback amplifier are also roughly equal, since both designs are dominated by the size/power of the input stage $S_1$. Conversely, the input-referred noise power from the feedback resistor is significantly reduced by a factor of $g_{m2}R_{F2}$ in the proposed design, releasing the full potential of the high-speed low-gain technology.
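The feedback-resistor noise reduction can be put in numbers with the $4kT/R_F$ term of Eq. (1). The sketch below reads the factor-of-$g_{m2}R_{F2}$ reduction in noise power as a feedback resistance that is $g_{m2}R_{F2}$ times larger in the proposed TIA. Both `R_F_CONV` and `POST_GAIN` are hypothetical values, not taken from the design.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def rf_noise_density(r_f, temp_k=300.0):
    """Thermal noise current density of the feedback resistor, sqrt(4kT/R_F), in A/sqrt(Hz)."""
    return math.sqrt(4 * K_B * temp_k / r_f)


R_F_CONV = 1e3    # hypothetical conventional feedback resistor, ohms
POST_GAIN = 4.0   # hypothetical g_m2 * R_F2

conv = rf_noise_density(R_F_CONV)
proposed = rf_noise_density(POST_GAIN * R_F_CONV)

# Noise power drops by POST_GAIN, so the current density drops by sqrt(POST_GAIN).
print(f"{conv * 1e12:.2f} pA/sqrt(Hz) -> {proposed * 1e12:.2f} pA/sqrt(Hz)")
```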

### 3.2 Inverter-based CTLE

As discussed in Section 2, the TIA's integrated noise power is proportional to $\omega_{TIA}^3$. A low-bandwidth TIA therefore improves the optical receiver's OMA sensitivity by 15 dB/dec with decreasing $\omega_{TIA}$. On the other hand, NRZ sampling requires an overall bandwidth of 50% of the data rate or higher to maintain eye height, necessitating equalization to avoid excessive ISI [6]. A high-order DFE [5] adds minimal noise during bandwidth recovery and so suppresses noise from both $R_F$ and the amplifier. A CTLE only suppresses noise from $R_F$, but it does not need a power-hungry fast decision-feedback loop and is suitable for low-power applications.

Fig. 8 Frequency response of P(s) with different R values.

A subsequent inverter-based CTLE [12][13] is utilized to further increase the TIA feedback resistance by a factor of n = 2.5 in this design, as shown in Fig. 7. A low-power additive CTLE is more suitable here, thanks to the relaxed linearity requirement in NRZ signaling. At LF, the coupling capacitor $C_C$ blocks the bottom path, leaving the top-path $g_{m1}$ alone to set the input transconductance, while at high frequency (HF) both $g_{m1}$ and $g_{m2}$ drive the combined loading in both paths. The CTLE utilizes 2-bit control through the EN transistors to achieve the desired frequency response, with a relatively stable peaking gain $n = (g_{m1} + g_{m2})/2g_{m1}$ that is determined by the input stage transistors' size ratio. A $g_{mL}$ loading is also added in the bottom path to limit its local voltage swing for reasonable linearity. The first-order high-pass response produced with $g_{m2} > g_{m1}$ cannot perfectly cancel the second-order slope from the preceding low-bandwidth TIA. Adding series resistors to the gate path of the output load stage solves this problem by realizing an active inductor [14], generating a flatter frequency response.

$$H(s) = -\frac{g_{m1}}{g_{mL}} \frac{1 + s/\omega_z}{1 + s/\omega_p} P(s),$$

$$\omega_z = \frac{g_{m1}g_{mL}}{(g_{m1} + g_{m2})C_C}, \quad \omega_p = \frac{g_{mL}}{2C_C} = \frac{g_{m1} + g_{m2}}{2g_{m1}} \omega_z$$

$$P(s) = \frac{1 + s/\omega_{z,ind}}{1 + 2\zeta(s/\omega_n) + (s/\omega_n)^2}$$

$$\omega_{z,ind} = \frac{1}{RC}, \quad \omega_n = \sqrt{\frac{2g_{mL}}{RCC_{Load}}}, \quad \zeta = \frac{\sqrt{2}}{4} \left(\sqrt{\frac{RCg_{mL}}{C_{Load}}} + \sqrt{\frac{C_{Load}}{RCg_{mL}}}\right) \tag{8}$$

where C consists of the combined NMOS/PMOS gate capacitance and parasitic capacitance. It works with the added R to attenuate the LF component, creating an active inductor $L = RC/g_{mL}$ and a series resistor $1/g_{mL}$. The active inductor adds a pole-zero pair at the output node, represented by P(s). As shown in Fig. 8, P(s) becomes a second-order Butterworth LPF with a $\sqrt{2}g_{mL}/C_{Load}$ bandwidth and a zero at $g_{mL}/C_{Load}$ when $R = C_{Load}/(Cg_{mL})$. A lower R provides less peaking gain at a higher frequency with roughly the same bandwidth. A proper choice of $R = 0.7C_{Load}/(Cg_{mL})$ generates peaking gain in the mid-band that compensates for the discrepancy between the second-order TIA and the first-order CTLE. Compared to a conventional design, which is equivalent to R = 0, the active inductor extends the bandwidth by 1.5 times. A proportional reduction in $g_{m1}$, $g_{m2}$ and $g_{mL}$ that maintains the same LF/HF gain and bandwidth reduces the whole CTLE power consumption by 33% as a result. Power spent on the TIA's output stage can also be reduced thanks to the CTLE's reduced input capacitance.
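The active-inductor terms of Eq. (8) can be evaluated in normalized form. The sketch below writes $R = k\,C_{Load}/(Cg_{mL})$ and expresses all frequencies in units of $g_{mL}/C_{Load}$, so no absolute component values are needed. The scaling factor `k` and the function names are introduced here for illustration only.

```python
import math


def active_inductor_params(k):
    """Eq. (8) terms for R = k * C_Load / (C * g_mL), normalized to g_mL / C_Load.

    With this choice, R*C*g_mL/C_Load = k.
    """
    w_zero = 1.0 / k                 # omega_z,ind = 1/(RC)
    w_n = math.sqrt(2.0 / k)         # omega_n = sqrt(2*g_mL / (R*C*C_Load))
    zeta = math.sqrt(2) / 4 * (math.sqrt(k) + 1 / math.sqrt(k))
    return w_zero, w_n, zeta


def p_mag(w, k):
    """|P(j*w)| at normalized frequency w."""
    w_zero, w_n, zeta = active_inductor_params(k)
    s = 1j * w
    return abs((1 + s / w_zero) / (1 + 2 * zeta * s / w_n + (s / w_n) ** 2))


for k in (1.0, 0.7):
    w_zero, w_n, zeta = active_inductor_params(k)
    print(f"k={k}: zero={w_zero:.3f}, w_n={w_n:.3f}, zeta={zeta:.3f}, |P(j1)|={p_mag(1.0, k):.3f}")
```

For k = 1 this reproduces the Butterworth denominator ($\zeta = \sqrt{2}/2$, $\omega_n = \sqrt{2}g_{mL}/C_{Load}$) with the zero at $g_{mL}/C_{Load}$; for k = 0.7 the response rises above its LF value around $g_{mL}/C_{Load}$.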

An adjustable CTLE power supply is utilized to set the absolute transconductance values to achieve the desired peaking frequency. As the TIA and CTLE are both inverter-based and biased near half of the supply for optimal gain, no extra buffer is needed between them. The inverter-based CTLE also allows for less parasitic capacitance and a compact layout. The combined TIA and CTLE layout occupies $65 \ \mu m^2$ of silicon area.

### 3.3 DC cancellation

The optical receiver front-end needs a feedback loop to suppress the following DC input current from the PD and to provide a proper common-mode voltage for the subsequent slicers.

$$I_{DC} = \frac{(ER+1)R_{PD}P_{in,OMA}}{2(ER-1)} \tag{9}$$

where $R_{PD}$ is the PD's responsivity, and ER and $P_{in,OMA}$ are the input laser's extinction ratio and OMA power, respectively. As shown in Fig. 9, a high-gain OTA, with a pole created by the 0.68 pF Miller compensation capacitor, is utilized to match the DC current flowing through transistor M0 with $I_{DC}$ and generate the 2 MHz cut-off frequency. A preceding 100 MHz RC LPF is added to relax the OTA's input dynamic range, since the 12.5 Gb/s large-swing signal from the CTLE is attenuated by 36 dB. This additional pole does not impact stability because it is far beyond the cut-off frequency. The reference voltage is locally generated by a diode-connected PMOS/NMOS pair, which helps to compensate for PVT variation.

Fig. 9 DC cancellation schematic and simulated front-end frequency response over an extended low-frequency range. $Z_T$ is the front-end's HF frequency response.
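The 36 dB figure and the stability remark for the 100 MHz RC LPF can be sanity-checked with a first-order filter model. Two assumptions are made here that the text does not state explicitly: the attenuation is evaluated at the 6.25 GHz Nyquist frequency of the 12.5 Gb/s data, and the 2 MHz cut-off is treated as the frequency where the extra phase lag matters for loop stability.

```python
import math


def rc_lpf_attenuation_db(freq_hz, corner_hz):
    """Attenuation of a first-order RC low-pass filter at freq_hz, in dB (positive = loss)."""
    return 10 * math.log10(1 + (freq_hz / corner_hz) ** 2)


def rc_lpf_phase_lag_deg(freq_hz, corner_hz):
    """Phase lag of a first-order RC low-pass filter at freq_hz, in degrees."""
    return math.degrees(math.atan(freq_hz / corner_hz))


CORNER_HZ = 100e6
NYQUIST_HZ = 12.5e9 / 2   # assumption: attenuation quoted at Nyquist
LOOP_CUTOFF_HZ = 2e6

att_db = rc_lpf_attenuation_db(NYQUIST_HZ, CORNER_HZ)
lag_deg = rc_lpf_phase_lag_deg(LOOP_CUTOFF_HZ, CORNER_HZ)
print(f"attenuation at Nyquist: {att_db:.1f} dB, lag at 2 MHz: {lag_deg:.2f} deg")
```

Under these assumptions the attenuation comes out at about 36 dB, and the filter adds only about one degree of lag at 2 MHz.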

Noise from the RC LPF and OTA is filtered by the low bandwidth, leaving M0 as the main noise contributor here. The following output-referred noise current is directly added at the TIA's input node and should be minimized.

$$\overline{I_{n,out}^2(\omega)} \approx \overline{I_{n,M0}^2(\omega)} = 4kT\gamma g_{m0} = 4kT\gamma \sqrt{2\mu C_{ox}(W/L)I_{DC}} \tag{10}$$

It's clear that M0 should use the minimum transistor width indicated by (9) and (10) with the minimum possible transistor length, just enough to stay in saturation region with the max  $I_{DC}$ . It also helps to minimize any parasitic capacitance added to the TIA's input node, which degrades front-end's noise performance as discussed previously.
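Equations (9) and (10) can be combined in a short numerical sketch. The responsivity (0.6 A/W) and OMA (-10.7 dBm) are the values reported for this receiver; the extinction ratio `ER_LINEAR`, the process parameter `MU_COX`, $\gamma = 1$, and the W/L values are hypothetical and chosen only to show the trend.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def dbm_to_watts(p_dbm):
    """Convert a power in dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)


def pd_dc_current(oma_w, responsivity, er):
    """Eq. (9): DC photodiode current for a linear (not dB) extinction ratio er."""
    return (er + 1) * responsivity * oma_w / (2 * (er - 1))


def m0_noise_psd(i_dc, w_over_l, mu_cox, temp_k=300.0, gamma=1.0):
    """Eq. (10): 4kT*gamma*g_m0 with square-law g_m0 = sqrt(2*mu*Cox*(W/L)*I_DC), in A^2/Hz."""
    g_m0 = math.sqrt(2 * mu_cox * w_over_l * i_dc)
    return 4 * K_B * temp_k * gamma * g_m0


ER_LINEAR = 10.0   # hypothetical extinction ratio (10 dB)
MU_COX = 300e-6    # hypothetical mu*Cox, A/V^2

i_dc = pd_dc_current(dbm_to_watts(-10.7), 0.6, ER_LINEAR)
narrow = m0_noise_psd(i_dc, 1.0, MU_COX)
wide = m0_noise_psd(i_dc, 4.0, MU_COX)
print(f"I_DC = {i_dc * 1e6:.1f} uA, noise PSD ratio (W/L = 4 vs 1): {wide / narrow:.1f}")
```

Because $g_{m0}$ grows as $\sqrt{W/L}$ at a fixed $I_{DC}$, quadrupling W/L doubles the noise power spectral density that M0 injects at the TIA input.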

### 3.4 Low-voltage quarter-rate slicers

The front-end pseudo-differential output signal is sampled by four quarter-rate slicers that are activated by four 90°-spaced clock phases. This quarter-rate operation provides increased slicer regeneration time and allows for powering them with reduced supply voltages. Fig. 10 shows the two-stage slicer circuit [15] that employs minimal device stacking for low-voltage operation. This is followed by an SR latch that holds the sampled data during the slicer reset phase. Optical receiver sensitivity is improved with slicer offset cancellation that is performed with two 5b current DACs that provide programmable discharge currents at the first-stage output nodes during the sampling phase.



Fig. 10 Schematic of sampling slicer.

### 3.5 Simulation results

Fig. 11 shows the simulated frequency response, input-referred noise power spectral density with  $C_{in} = 150$  fF, and a 12.5 Gb/s eye-diagram with input OMA = -10.7 dBm. The proposed TIA is intentionally designed with a reduced 2.8 GHz bandwidth that is extended by the subsequent CTLE to 7.0 GHz to support the 12.5 Gb/s data-rate. This allows for an extremely high 82  $dB\Omega$  transimpedance gain without excessive ISI. The higher feedback resistance in the proposed TIA with CTLE yields a 2.0  $pA/\sqrt{Hz}$  reduction relative to a conventional broadband TIA with the same power consumption.

# 4 Experimental results

Fig. 12 (a) shows the chip micrograph of the optical receiver, which was fabricated in a 28 nm CMOS technology. The optical receiver is placed directly underneath the pad to reduce parasitic capacitance and occupies $720 \ \mu m^2$ total area.

The optical test setup is shown in Fig. 12 (b). A 40 Gb/s 0.6 A/W InGaAs PD is wire bonded to the optical receiver input. This results in 150 fF total combined input capacitance from the PD and bonding pads. A 1550 nm laser is connected to a Mach-Zehnder modulator (MZM) that is modulated with 12.5 Gb/s PRBS15 data to produce the optical input signal. A half-rate electrical clock is supplied to the chip and passes through an injection-locked oscillator to generate the four quarter-rate clock phases for the slicers. The quarter-rate data signals are then multiplexed and driven out of the chip with a CML buffer for BER testing.

Fig. 11 Simulated frequency response, input-referred noise power spectral density, and 12.5 Gb/s differential eye diagram at the slicers' inputs with $C_{in} = 150$ fF.

Fig. 13 shows measured timing bathtub and sensitivity curves at 10 Gb/s, 12.5 Gb/s and 14 Gb/s. The 12.5 Gb/s OMA sensitivity at BER =  $10^{-12}$  is -10.7 dBm with a 0.04 UI timing margin. The optical receiver front-end consumes 1.08 mW from a 1 V power supply and the slicers consume 0.30 mW from a 0.7 V power supply, resulting in a 0.11 pJ/bit power efficiency at the 12.5 Gb/s data rate.
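The energy-per-bit figure follows directly from the reported power and data rate, since milliwatts divided by Gb/s gives pJ/bit. The helper name below is illustrative.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ/bit (mW divided by Gb/s is pJ/bit)."""
    return power_mw / data_rate_gbps


FRONT_END_MW = 1.08   # front-end power from the 1 V supply
SLICERS_MW = 0.30     # slicer power from the 0.7 V supply

total_mw = FRONT_END_MW + SLICERS_MW
efficiency = energy_per_bit_pj(total_mw, 12.5)
print(f"{total_mw:.2f} mW at 12.5 Gb/s -> {efficiency:.2f} pJ/bit")
```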



Fig. 12 (a) Optical receiver layout and chip micrograph and (b) optical test setup.

Table 1 summarizes the receiver performance and compares it with other recent CMOS designs that operate between 10-20 Gb/s. Since input optical signal power is proportional to  $\sqrt{C_{in}\omega_{TIA}^3}/R_{PD}$  for a given SNR, the OMA sensitivity is normalized for a fair comparison between the different design techniques.

$$\text{Normalized OMA Sens.} = \text{OMA Sens.} - 5\log_{10}\left(\frac{C_{in}}{100 \ \text{fF}}\right) - 15\log_{10}\left(\frac{\text{Data rate}}{12 \ \text{Gb/s}}\right) + 10\log_{10}\left(\frac{R_{PD}}{1 \ \text{A/W}}\right) \tag{11}$$
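The normalization in (11) is simple to reproduce. The sketch below applies it to four columns of Table 1, using the sensitivity, input capacitance, data rate, and responsivity listed there; the function and dictionary names are illustrative.

```python
import math


def normalized_oma_sens_dbm(sens_dbm, c_in_ff, rate_gbps, r_pd_a_per_w):
    """Eq. (11): OMA sensitivity normalized to 100 fF, 12 Gb/s, and 1 A/W."""
    return (
        sens_dbm
        - 5 * math.log10(c_in_ff / 100.0)
        - 15 * math.log10(rate_gbps / 12.0)
        + 10 * math.log10(r_pd_a_per_w / 1.0)
    )


# (OMA sensitivity in dBm, input capacitance in fF, data rate in Gb/s, responsivity in A/W)
designs = {
    "[5]": (-16.8, 100.0, 12.0, 0.75),
    "[6]": (-14.1, 160.0, 12.0, 0.8),
    "[4]": (-8.6, 130.0, 20.0, 0.5),
    "This work": (-10.7, 150.0, 12.5, 0.6),
}

normalized = {name: round(normalized_oma_sens_dbm(*row), 1) for name, row in designs.items()}
print(normalized)
```

The rounded results match the normalized sensitivities listed in Table 1 for these four designs.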

The proposed design improves upon the normalized OMA sensitivity relative to the conventional inverter-based TIA with a single-stage amplifier [1]. While the integrate-and-dump [4], duobinary-signaling design [6] and pseudo-differential TIA with 4-tap DFE [5] achieve better normalized OMA sensitivity, these designs consume significantly more power on clocked integration stages, extra slicers and logic gates, and fast decision-feedback circuitry, respectively. The best sensitivity is achieved with the multi-stage amplifier TIA [9] due to bandwidth extension with large area on-chip peaking inductors, high power consumption to minimize amplifier noise, and the lack of on-die slicers, which can lead to an optimistic estimate of the receiver sensitivity that is set by the BER tester. Overall, the proposed all-inverter-based optical receiver with multi-stage feedback TIA and continuous-time linear equalizer achieves adequate sensitivity and provides both more than 3.6X improvement in power efficiency and 6.9X improvement in area.

Fig. 13 (a) Measured BER timing margin curves with OMA = -10.7 dBm and (b) sensitivity curves.

| Reference | [9] | [6] | [5] | [1] | [4] | This work |
|---|---|---|---|---|---|---|
| CMOS technology | 180 nm | 65 nm | 65 nm | 40 nm | 28 nm | 28 nm |
| Data rate (Gb/s) | 10 | 12 | 12 | 10 | 20 | 12.5 |
| Architecture | MSA-TIA<sup>1</sup> | TIA + Duobinary | Diff. TIA + DFE | TIA | ID<sup>2</sup> | MSA-TIA<sup>1</sup> + CTLE |
| Sampling rate | No sampling | 1/4th | Half | Half | 1/4th | 1/4th |
| PD + parasitic cap (fF) | >200 | 160 | 100 | 40-60 | 130 | 150 |
| PD responsivity (A/W) | 1.0 | 0.8 | 0.75 | 0.7 | 0.5 | 0.6 |
| Power supply (V) | 1.8 | - | - | 1.0 | 0.95 | 1.0/0.7 |
| Transimpedance (dBΩ) | 70.5 | 79 | 86 | 72 | - | 82 |
| Sens. OMA (dBm), $BER = 10^{-12}$ | -18.7<sup>3</sup> | -14.1 | -16.8 | -12<sup>4</sup> | -8.6 | -10.7 |
| Normalized Sens. OMA (dBm), $BER = 10^{-12}$ | -19.0<sup>5</sup> | -16.1 | -18.0 | -10.9<sup>6</sup> | -15.5 | -14.1 |
| Area (µm<sup>2</sup>) | 780,000 | 88,000 | 120,000 | 7,000 | 5,000 | 720 |
| Power (mW) | 81 | 9.5 | 23 | 3.95 | 10.6 | 1.38 |
| Power efficiency (pJ/bit) | 8.1 | 0.79 | 1.9 | 0.40 | 0.53 | 0.11 |

Table 1 Performance summary

<sup>1</sup>Multi-stage amplifier TIA

<sup>2</sup>Integrate-and-dump

<sup>3</sup>Calculated from input-referred noise current = 0.97 $\mu\mathrm{A}_\mathrm{rms}$

<sup>4</sup>Calculated from avg. sensitivity

<sup>5</sup>Assume input cap = 200 fF

<sup>6</sup>Assume input cap = 50 fF

# 5 Conclusion

This paper presented a 12.5 Gb/s all-inverter-based optical receiver with a multi-stage TIA feedback amplifier that is suitable for high-speed low-gain nanometer CMOS technologies. This multi-stage amplifier technique suppresses feedback resistor noise without extra power consumption and is compatible with other noise reduction techniques. Significant power efficiency improvement is achieved with a subsequent inverter-based active inductor CTLE that provides frequency peaking to compensate for ISI from the low-bandwidth TIA. Overall, the all-inverter-based optical receiver achieves ultra-low power and area consumption, making it suitable for the high bandwidth-density optical interconnects required in future systems.

# 6 Declarations

Ethical Approval. Not applicable.

**Competing interests.** The authors declare that they have no competing interests.

Authors' contributions. Peng Yan designed the circuit and wrote the first draft of the manuscript. Chaerin Hong did test preparation and data collection. Po-Hsuan Chang, Hyungryul Kang, Dedeepya Annabattuni, Ankur Kumar, Yang-Hang Fan, Ruida Liu and Ramy Rady contributed to the chip layout. Samuel Palermo gave advice during design procedure and commented on previous versions. All authors reviewed the final manuscript.

Funding. This research was funded by the DARPA PIPES program.

Availability of data and materials. The authors declare that the data supporting the findings obtained during this research work is available within the paper.

# References

- [1] Liu, F.Y., Patil, D., Lexau, J., Amberg, P., Dayringer, M., Gainsley, J., Moghadam, H.F., Zheng, X., Cunningham, J.E., Krishnamoorthy, A.V., Alon, E., Ho, R.: 10-Gbps, 5.3-mW Optical Transmitter and Receiver Circuits in 40-nm CMOS. IEEE Journal of Solid-State Circuits 47(9), 2049–2067 (2012). https://doi.org/10.1109/JSSC.2012.2197234
- [2] Säckinger, E.: Broadband Circuits for Optical Fiber Communication. Wiley, New York, NY, USA (2005)
- [3] Säckinger, E.: The Transimpedance Limit. IEEE Transactions on Circuits and Systems I: Regular Papers 57(8), 1848–1856 (2010). https://doi.org/10.1109/TCSI.2009.2037847
- [4] Sharif-Bakhtiar, A., Lee, M.G., Carusone, A.C.: Low-power CMOS receivers for short reach optical communication. In: 2017 IEEE Custom Integrated Circuits Conference (CICC), pp. 1–8 (2017). https://doi.org/10.1109/CICC.2017.7993601
- [5] Ahmed, M.G., Talegaonkar, M., Elkholy, A., Shu, G., Elmallah, A., Rylyakov, A., Hanumolu, P.K.: A 12-Gb/s -16.8-dBm OMA Sensitivity 23-mW Optical Receiver in 65-nm CMOS. IEEE Journal of Solid-State Circuits 53(2), 445–457 (2018). https://doi.org/10.1109/JSSC.2017.2757008
- [6] Ahmed, M.G., Kim, D., Nandwana, R.K., Elkholy, A., Lakshmikumar, K.R., Hanumolu, P.K.: A 16-Gb/s -11.6-dBm OMA Sensitivity 0.7-pJ/bit Optical Receiver in 65-nm CMOS Enabled by Duobinary Sampling. IEEE Journal of Solid-State Circuits 56(9), 2795–2803 (2021). https://doi.org/10.1109/JSSC.2021.3064248
- [7] Li, D., Minoia, G., Repossi, M., Baldi, D., Temporiti, E., Mazzanti, A., Svelto, F.: A Low-Noise Design Technique for High-Speed CMOS Optical Receivers. IEEE Journal of Solid-State Circuits 49(6), 1437–1447 (2014). https://doi.org/10.1109/JSSC.2014.2322868
- [8] Li, D., Geng, L., Maloberti, F., Svelto, F.: Overcoming the Transimpedance Limit: A Tutorial on Design of Low-Noise TIA. IEEE Transactions on Circuits and Systems II: Express Briefs 69(6), 2648–2653 (2022). https://doi.org/10.1109/TCSII.2022.3173155

- [9] Li, D., Liu, M., Gao, S., Shi, Y., Zhang, Y., Li, Z., Chiang, P.Y., Maloberti, F., Geng, L.: Low-Noise Broadband CMOS TIA Based on Multi-Stage Stagger-Tuned Amplifier for High-Speed High-Sensitivity Optical Communication. IEEE Transactions on Circuits and Systems I: Regular Papers 66(10), 3676–3689 (2019). https://doi.org/10.1109/TCSI.2019.2916150
- [10] Yan, P., Hong, C., Chang, P.-H., Kang, H., Annabattuni, D., Kumar, A., Fan, Y.-H., Liu, R., Rady, R., Palermo, S.: A 12.5 Gb/s 1.38 mW Inverter-Based Optical Receiver in 28 nm CMOS. In: 2022 IEEE 65th International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 1–4 (2022). https://doi.org/10.1109/MWSCAS54063.2022.9859536
- [11] Abidi, A.A.: Gigahertz transresistance amplifiers in fine line NMOS. IEEE Journal of Solid-State Circuits 19(6), 986–994 (1984). https://doi.org/10.1109/JSSC.1984.1052255
- [12] Zheng, K., Frans, Y., Chang, K., Murmann, B.: A 56 Gb/s 6 mW 300 um2 inverter-based CTLE for short-reach PAM2 applications in 16 nm CMOS. In: 2018 IEEE Custom Integrated Circuits Conference (CICC), pp. 1–4 (2018). https://doi.org/10.1109/CICC.2018.8357076
- [13] Zheng, K., Frans, Y., Ambatipudi, S.L., Asuncion, S., Reddy, H.T., Chang, K., Murmann, B.: An Inverter-Based Analog Front-End for a 56-Gb/s PAM-4 Wireline Transceiver in 16-nm CMOS. IEEE Solid-State Circuits Letters 1(12), 249–252 (2018). https://doi.org/10.1109/LSSC.2019.2894933
- [14] Musah, T., Jaussi, J.E., Balamurugan, G., Hyvonen, S., Hsueh, T.-C., Keskin, G., Shekhar, S., Kennedy, J., Sen, S., Inti, R., Mansuri, M., Leddige, M., Horine, B., Roberts, C., Mooney, R., Casper, B.: A 4–32 Gb/s Bidirectional Link With 3-Tap FFE/6-Tap DFE and Collaborative CDR in 22 nm CMOS. IEEE Journal of Solid-State Circuits 49(12), 3079–3090 (2014). https://doi.org/10.1109/JSSC.2014.2348556
- [15] Schinkel, D., Mensink, E., Klumperink, E., van Tuijl, E., Nauta, B.: A double-tail latch-type voltage sense amplifier with 18ps setup+hold time. In: 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, pp. 314–605 (2007). https://doi.org/10.1109/ISSCC.2007.373420