# Decoupled Logic Based Design for Implementation Low Power Memories by 8T SRAM

R.L.B.R.Prasad Reddy\*, G.Naresh #

Department of Electronics and Communication, Vaagdevi institute of Technology & Science, Proddutur, A.

\*rajendra.409@gmail.com #ganekantinaresh@gmail.com

Abstract - We present a novel half-select disturb-free SRAM cell. The cell is 6T based and utilizes decoupling logic. It employs gated inverter SRAM cells to decouple the column-select read disturb scenario in half-selected columns, which is one of the impediments to lowering cell voltage. Furthermore, the "false read" before write operation, common to conventional 6T designs due to bit-select and wordline timing mismatch, is eliminated by this design. Two design styles are studied to account for the emerging needs of technology scaling as designs migrate from 90 to 65 nm PD/SOI technology nodes: a 90 nm PD/SOI sense Amp based design and a 65 nm PD/SOI domino read based design. For the sense Amp based design, read disturbs to the fully-selected cell can be further minimized by relying on a read-assist array architecture, which enables discharging the bit-line (BL) capacitance to GND during a read operation. This, together with the elimination of half-select disturbs, enhances the overall array low-voltage operability and hence reduces power consumption by 20%-30%. The domino read based SRAM design also exploits the proposed cell to enhance cell stability while reducing the overall power consumption by more than 30%, relying on a dynamic dual supply technique that combines cell design and peripheral circuitry. Because half-selected columns/cells are inherently protected by the proposed scheme, the dynamic supply "High" voltage is applied only to read-selected columns/cells, while the dynamic supply "Low" is employed in all other situations, thereby reducing the overall design power. A short bitline loading of 16 cells/BL is adopted to achieve high-performance, low-power operation and a lower bitline capacitance to improve stability. A newly developed fast Monte Carlo based statistical method is used to analyze this cell, and 65 nm design simulations are carried out at 5 GHz.
The feasibility of the cell and its sensitivity to sense Amp timing have been demonstrated by fabricating a 32 kb array in a 90-nm PD/SOI technology. Hardware experiments and simulation results show improvements of cell Vdd min over traditional 6T cells by more than 150 mV for 90 nm PD/SOI technology. Experimental results based on fabricated 65 nm PD/SOI hardware (1.6 kb/site, 80 sites) also confirm half-select disturb elimination and hence the ability to enable significant power savings. The performance and speed are shown to be comparable with the conventional 6T design.

Keywords - Column-decoupled, differential/domino read, half-select, low power 8T, SRAM, stability.

#### I. INTRODUCTION

ISSN: 2231-4946

Device miniaturization and the rapidly growing demand for mobile or power-aware systems have resulted in an urgent need to reduce power supply voltage (Vdd). However, voltage reduction along with device scaling is associated with decreasing signal charge. Furthermore, increasing intra-die process parameter variations, particularly random dopant threshold voltage variations, can lead to a large number of fails in extremely small channel area memory designs. Due to their small size and large numbers on chip, SRAM cells are adversely affected. This trend is expected to grow significantly as designs are scaled with each technology generation [1]. In particular, it conflicts with the need to maintain a high signal-to-noise ratio, or high noise margins, in SRAMs and is one of the major impediments to producing a stable cell at low voltage. When combined with other effects such as narrow width effects, soft error rate (SER), temperature, process variations, and parasitic transistor resistance, the scaling of SRAMs becomes increasingly difficult due to reduced margins [2]. Fig. 1 illustrates the saturation in the scaling trend (dashed line) of SRAM cells across technology generations. The plot indicates that the SRAM area scaling drops below 50% for 32-nm technology and beyond.

Fig. 1. SRAM cell scaling (dashed line) is limited due to process variation effect on cell yield.

Furthermore, voltage scaling is virtually nullified. Higher fail probabilities occur due to voltage scaling, and low-voltage operation is becoming problematic as higher supply voltages are required to overcome these process variations. To address these challenges, recent industry trends have leaned towards exploring larger cells and more exotic SRAM circuit styles in scaled technologies. Examples are the use of write-assist design [3], read-modify-write [4], read-assist designs [5], and the 8T register file cell [6], [7]. A conventional 6T cell used in conjunction with these techniques does not lead to power savings due to exposure to the half-select condition [3], [4]. Column select/half-select is very commonly used in SRAMs to provide SER protection and to enable area-efficient utilization and wiring of the macro. Nevertheless, the use of column select introduces a read disturb condition for the unselected cells along a row (half-selected cells), potentially destabilizing them.

Fig. 2. SRAM half-select stability failure.

In this paper we present a new column-decoupled 6T-based SRAM cell where read disturb is eliminated for column-select half-selected cells [5], [8]. The decoupling logic uses two additional devices, and henceforth we will refer to the cell as the 8T-column-decoupled-cell (8T-CDC). We study the cell in the presence of two design styles: sense Amp based read peripheral circuitry, which was typical for the 90-nm node, and domino read peripheral circuitry [9] for 65 nm and beyond.

In a sense Amp based read design, the read disturb condition is further minimized for the selected cells by the use of a sense-amp architecture which actively discharges the selected cell(s) BL to GND, thereby eliminating the source of disturb. Through a combination of accurate simulations and hardware (HW) data acquired from a 32 kb SRAM macro, a path towards low-voltage SRAM operation of the cells is shown, and the design is shown to mitigate read stability and half-select stability problems, thereby improving low-voltage operability. However, process variations are increasingly affecting sense Amp designs in PD/SOI technologies, and it is natural to converge to domino-read designs [9].

In domino read-based designs, the column-decoupled cell still guards against half-select cell disturbs. However, in the absence of the read-assist feature in domino designs, we need to account for the read disturb on fully-selected cells. For this, we propose a dynamic dual supply header design that leverages the benefits of the column-decoupled cell design and helps save power. As is the case with traditional dual supply techniques, the proposed header design maintains a separate cell supply (Vcs) and logic supply (Vdd). However, unlike traditional techniques, the dynamic cell supply changes based on the column selection status. The new header design 1) sets the selected cell columns at a supply voltage higher than the logic supply for improved read stability and 2) maintains a low supply for half-selected cells, since half-select disturbs are not an issue for this design. Hence, we rely on the column-decoupled cell to enable a simplified low-power, high-performance column-decoupled domino read based design. We implement the design using simplified bit-select logic and dynamic supply headers with shorter bitlines. In what follows, we provide a thorough analysis of the design modifications compared to traditional 6T dynamic supply designs. We also highlight the advantages this methodology brings in terms of lower power and yield improvements.

This paper is organized as follows. Section II provides a review of column-select disturbs. The cell is introduced in Section III. In Section IV, we study the sense amplifier based design, and in Section V we study the dynamic domino based design. Conclusions are presented in Section VI.

# II. BACKGROUND: COLUMN SELECT (HALF-SELECT) AND MEMORY DESIGNS

Fig. 2 shows a typical array topology which employs a two-way column select condition. In this topology, the word-line (WL) activates both the selected and half-selected cells along the decoded row. However, only the read/write data from the selected cell is allowed to pass to/from the peripheral logic, while the half-selected cell is isolated. When the word-line is activated during a selected read or half-select condition, the pass-gate (PG) transistor and the pull-down (PD) device (transistors T2 and T4 in Fig. 2) form a resistive voltage divider between the BL and the storage node of the cell. This causes the "0" node of the cell (node R in Fig. 2) to bump up to some intermediate voltage, which subsequently increases sub-threshold leakage (on transistor T3 in Fig. 2) and causes discharge of the "1" node in the cell, thereby destabilizing it.
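The divider described above can be illustrated with a first-order sketch. This is a simplification under stated assumptions, not the authors' simulation model: the PG and PD devices are treated as fixed linear resistances, and the function name and all numeric values are hypothetical.

```python
def read_disturb_voltage(v_bl: float, r_pg: float, r_pd: float) -> float:
    """First-order estimate of the voltage bump on the "0" storage node.

    Assumption: the pass-gate (PG) and pull-down (PD) devices behave as
    linear resistors r_pg and r_pd in series between the bitline (at v_bl)
    and GND. Real devices are nonlinear, so this only shows the trend.
    """
    if r_pg <= 0 or r_pd <= 0:
        raise ValueError("resistances must be positive")
    return v_bl * r_pd / (r_pg + r_pd)


if __name__ == "__main__":
    # Hypothetical values: bitline clamped to 1.0 V, PG three times more
    # resistive than PD.
    print(read_disturb_voltage(v_bl=1.0, r_pg=3.0, r_pd=1.0))
    # A partially discharged bitline lowers the bump, which is why the
    # disturb on a selected cell diminishes as the BL discharges.
    print(read_disturb_voltage(v_bl=0.5, r_pg=3.0, r_pd=1.0))
```

In this simplified picture, a stronger pull-down (smaller `r_pd`) or a lower bitline voltage both reduce the bump on the "0" node.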



Fig. 3. Column-select decoupled 8T-CDC cell (in dashed rectangle) eliminates the half-select condition. Selected column LWLE0 is high. Half-selected column LWLE1 stays low due to "ANDing" GWLE with BDC1.

The read disturb to the selected cell keeps diminishing as the cell read current discharges the BL capacitance. The half-selected cells see the maximum disturb potential when the BLs are clamped to Vdd. This is why there has been an industry-wide trend towards shorter BL heights, thin cell designs, and unclamping (floating) the bitlines of half-selected cells [2], [9] for 65 nm and beyond; in prior technologies, such as 90 nm, the general trend was to use clamped bitlines. If the area under the curve is used as a measure of the read disturb witnessed by the cell, unclamping the BLs results in a mere 12% reduction over clamping for 128 cells/BL. While this benefit increases to 25% for 32 cells/BL, a significant area penalty is paid to achieve it. In the following section, we present an 8T-CDC that can yield a larger reduction in the area under the curve for the half-selected cell. This design, together with a special sensing technique or special dynamic headers, can lead to improved yield for both selected and half-selected cells and lower operating voltages for the overall design.

## III. 8T COLUMN DECOUPLED CELL

#### A. Proposed 8T-CDC

Fig. 3 illustrates a new 8T-CDC SRAM cell (inside dashed rectangle) with a gated wordline, which enables the decoupling of the column/half-select condition [5], hence eliminating half-select stability fails. A localized gated inverter consisting of two additional transistors, T1 and T2, effectively performs a logical "AND" operation between the column select signal (BDT0) and the decoded row, or global wordline, GWLE. The output of the inverter is the local wordline signal (LWLE0). The local wordline is ON only when both the column and row are selected (i.e., for fully selected cells only); hence, as illustrated in the waveforms of Fig. 3, LWLE0 of the selected column turns ON while LWLE1 of the half-selected column remains low. This ensures that the local wordline is activated for the selected cells only, thereby protecting the half-selected SRAM cells from the read disturb scenario that exists in the 6T cell due to wordline sharing. Alternatively, it is possible to swap the input and supply pairs of the gated inverter; however, this comes at the cost of an extra delay stage and power.

The advantages of the 8T-CDC cell are as follows: 1) conforming with traditional 6T requirements in terms of (a) allowing the designer to integrate it in a column select fashion and (b) offering/maintaining SER protection while 2) maximizing array efficiency, 3) eliminating the read disturb to the unselected cells, and 4) reducing power with simplification in peripheral logic.

Fig. 4(a) shows a layout view of the 8T-column-decoupled cell in a 90-nm PD/SOI technology. The two extra devices are integrated on top of an existing 6T cell to allow for easy cell mirroring and integration into an array topology. The addition of the two new transistors results in a cell area increase of 40%, all in a single direction. Through the use of higher level metallurgy to wire in the column decode (BDC) signal, growth in the other direction of the cell is avoided. The increase in that one dimension of the cell causes a proportionate increase in the BL metal capacitance while maintaining the original diffusion capacitance contributed by the 6T cell. Fig. 4(b) and (c) present the front end of the line (FEOL) and back end of the line (BEOL) layout views of a 2 × 2 8T-CDC thin cell. The views illustrate how the recessed oxide (ROX) and power buses are shared. With the 6T thin cell integration of Fig. 4(b), the area penalty can be reduced to 30% without degrading the bitline capacitance; further reduction can be achieved by use of non-DRC clean devices.
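The logical behavior of the gated wordline can be sketched as a small truth-table model. This is only a behavioral illustration of the "AND" function described above, not a circuit-level model of the gated inverter; the function names are hypothetical.

```python
from typing import List


def wordlines_shared_6t(gwle: bool, n_columns: int) -> List[bool]:
    """Conventional 6T row: every cell on the row shares the wordline."""
    return [gwle] * n_columns


def local_wordlines_8t_cdc(gwle: bool, column_selects: List[bool]) -> List[bool]:
    """8T-CDC row: each local wordline is GWLE AND that column's select."""
    return [gwle and sel for sel in column_selects]


if __name__ == "__main__":
    # Two-way column select: column 0 selected, column 1 half-selected.
    selects = [True, False]
    print("6T wordlines:    ", wordlines_shared_6t(True, len(selects)))
    print("8T-CDC wordlines:", local_wordlines_8t_cdc(True, selects))
```

In the 6T case the half-selected column still sees an active wordline, whereas in the 8T-CDC case its local wordline stays low, matching the LWLE0/LWLE1 waveforms described for Fig. 3.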

#### B. Timing Advantages: Elimination of "False Read" Before Write

During the write operation in a conventional 6T SRAM, when the wordline precedes the column select in timing, the cell starts reading the data [8]. When the bitline droops, a "false read" before write happens [see Fig. 5(a)]. This is a disadvantage of the conventional 6T SRAM. This drawback is overcome by the technique proposed here, as illustrated in Fig. 5(b): if the wordline arrives earlier than the column select, it is gated by the column select, and thus the "false read" before write does not ripple through the bitlines to the evaluation logic.
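The timing argument can be made concrete with a small sketch. It assumes idealized step edges and ignores gate delays; the function names and the picosecond values are hypothetical and are not taken from the paper's measurements.

```python
def false_read_window_6t(t_wordline_ps: int, t_column_select_ps: int) -> int:
    """Time during which a 6T cell drives the bitline before the column
    select arrives. Zero if the column select is not late."""
    return max(0, t_column_select_ps - t_wordline_ps)


def local_wordline_rise_8t_cdc(t_gwle_ps: int, t_column_select_ps: int) -> int:
    """The local wordline is GWLE AND column select, so it rises only
    once the later of the two signals has arrived (idealized, no delay)."""
    return max(t_gwle_ps, t_column_select_ps)


def false_read_window_8t_cdc(t_gwle_ps: int, t_column_select_ps: int) -> int:
    """The cell is accessed at the local wordline rise, which can never
    precede the column select, so the window is always zero."""
    t_access = local_wordline_rise_8t_cdc(t_gwle_ps, t_column_select_ps)
    return max(0, t_column_select_ps - t_access)


if __name__ == "__main__":
    # Hypothetical skew: wordline at 100 ps, column select at 130 ps.
    print("6T false-read window (ps):    ", false_read_window_6t(100, 130))
    print("8T-CDC false-read window (ps):", false_read_window_8t_cdc(100, 130))
```

The sketch shows why gating removes the mismatch window: an early GWLE has no effect on the cell until the column select also arrives.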

#### C. Logic and Circuit Requirements

The difference between the 8T-CDC and 6T array designs can be highlighted in terms of distributed versus global wordline drivers. For the 8T-CDC cell, the wordline driver (inverting function) is eliminated, as it is already accounted for inside the gated cells. This helps reduce the area overhead and will be discussed in detail later. Fig. 6(a) and (b) illustrate the wordline logic for the 6T design and the 8T-CDC design, respectively. The gated inverter transistor sizes are comparable to those of the SRAM cell. In the presence of large distributed loads, it may be desirable to optimize the NAND gate NFETs further. This is feasible with minimal area penalty because the gated inverter sizes leave room for NAND gate optimization. Furthermore, the gated inverters improve the local wordline slews, as opposed to the case of conventional 6T wordline drivers with large pass-gate loads, where the slew rates are wire limited.

Fig. 4. Layout view of the new 8T-CDC SRAM cell for (a) a typical cell, (b) a $2 \times 2$ thin cell, front end of the line layout view, and (c) back end of the line layout view showing ROX and GND sharing.

Fig. 5. (a) For conventional 6T SRAM, during write, when the wordline precedes the column select, the cell starts reading the data [8]. When the bitline droops, "false read" before write happens. (b) This drawback is overcome by the 8T-CDC cell; the early wordline (GWLE in dashes) is gated by the column select, and thus "false read" before write does not happen.

Fig. 6. Wordline logic (a) for the traditional 6T cell and (b) for the 8T-CDC cell. The wordline driver (inverting function) is eliminated, as it is already accounted for inside the gated cells.

### IV. SENSE AMP BASED DESIGN

The 8T-CDC cell, together with read-assist sense amp designs [5], can mitigate the read disturb problem for both selected and half-selected cells.

#### A. Read-Assist Sense Amp-Based Design

Fig. 7 illustrates the 8T-CDC cell design combined with a read-assist sense Amp. The sense amplifier is shared among multiple columns. In a typical sense Amp scenario, the bit switch (BDC) and the WL on the selected cell columns are turned off once enough margin is developed for the sense amplifier to accurately resolve the BL differential. This is done to save ac power (it prevents discharge of the BL to GND) and to speed up sense time (the sense amplifier has a smaller capacitance to discharge). For this scenario, only the PFET transistor exists (solid bit-switch PFET in Fig. 7), and it is turned off during "Sense" to save power and perform a faster sense. In a read-assist scenario, the bit-switch PFET is converted to a complementary NFET and PFET bit-switch pair (dashed line). The pair is kept conducting during the entire WL active phase. Consequently, the sense amp and the cell discharge the BL completely during a sense-read operation [5].



Fig. 7. Gated 8T-CDC cell design combined with read-assist sense Amp [5].

Hence the sense amplifier "sees" the full BL capacitance during a read operation; it discharges the capacitance to GND, and the cell data is "written back". This helps minimize the amount of read disturb charge induced onto the cell from the bitlines.

### V. DOMINO READ-BASED DESIGN

In the following sections, we discuss the advantages of the proposed 8T-CDC design in the presence of domino read based architectures, as well as the rationale behind these architectures.

As technology scales, sense Amp devices suffer from Vt mismatch, and scaling becomes difficult, particularly for PD/SOI technology designs, due to hysteretic Vt variation. Thus, it is preferred to use large-signal domino read circuitry [9]. During a domino read, the dual-rail signals from the cell are amplified by skewed inverters to full rails. This eliminates the dependency on the bitline differential, which can be highly sensitive to Vt mismatch; we refer the reader to [9] and the references within for a detailed overview of domino based read designs. However, the SRAM cell read disturb and half-select problems are still critical in a domino read design. In what follows, we study the advantages of combining a column-decoupled (half-select-free) cell design with dynamic supply techniques for a 65-nm PD/SOI domino read-based design. Our goal is to exploit the elimination of half-select disturbs together with dynamic supply techniques for optimal yield and power. For this purpose, we propose new header designs for the dynamic supply suitable for the 8T-CDC cell. Next, we revisit traditional circuit and peripheral logic for 6T domino designs and propose simplifications/modifications as well as novel dynamic header designs suitable for low-power 8T-CDC cell design.
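The supply selection policy stated in the abstract and introduction (the "High" cell supply only for read-selected columns, "Low" in all other situations) can be sketched behaviorally as follows. This is an illustration of the policy only, not of the header circuit itself; the names `Supply`, `column_supply`, and `array_supplies` are hypothetical.

```python
from enum import Enum
from typing import List


class Supply(Enum):
    HIGH = "High"  # raised cell supply for read stability
    LOW = "Low"    # lower cell supply used everywhere else


def column_supply(column_selected: bool, is_read: bool) -> Supply:
    """Cell supply for one column: High only when the column is selected
    for a read; half-selected and non-read columns stay at Low."""
    return Supply.HIGH if (column_selected and is_read) else Supply.LOW


def array_supplies(selected_column: int, n_columns: int, is_read: bool) -> List[Supply]:
    """Supply level of every column for one access."""
    return [column_supply(col == selected_column, is_read) for col in range(n_columns)]


if __name__ == "__main__":
    # Read of column 1 in a hypothetical four-column group.
    print([s.value for s in array_supplies(1, 4, is_read=True)])
    # A non-read access keeps every column at the Low supply.
    print([s.value for s in array_supplies(1, 4, is_read=False)])
```

Because the 8T-CDC cell already protects half-selected columns, only one column per access needs the higher supply in this model, which is where the power saving comes from.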

#### VI. CONCLUSION

We studied a novel column-decoupled 8T SRAM cell (8T-CDC). The half-select-free design enables enhanced voltage scaling and 30%–40% power reduction in comparison to standard 6T techniques. This study involved a 90-nm read-assist sense Amp based design and a 65-nm domino read-based design with dynamic supply capabilities. The 8T-CDC cell enables significant power savings through supply voltage reduction in the read-assist design and through half-select column power reduction in dynamic dual supply domino read designs with the aid of new header designs.

#### REFERENCES

- [1] L. Itoh, K. Osada, and T. Kawahara, "Reviews and future prospects of low voltage embedded RAMs," in Proc. IEEE Custom Integr. Circuits Conf., 2004, pp. 339–344.
- [2] Wann, R. Wong, D. J. Frank, R. Mann, S.-B. Ko, P. Croce, D. Lea, D. Hoyniak, Y.-M. Lee, J. Toomey, M. Weybright, and J. Sudiiono, "SRAM cell design for stability methodology," in Proc. IEEE VLSI-TSA Int. Symp. VLSI Technol., 2005, pp. 21–22.
- [3] L. Chang, D. M. Fried, J. Hergenrother, J. W. Sleight, R. H. Dennard, R. K. Montoye, L. Sekaric, S. J. McNab, A. W. Topol, C. D. Adams, K. W. Guarini, and W. Haensch, "Stable SRAM cell design for the 32 nm node and beyond," in Proc. IEEE Symp. VLSI Technol., 2005, pp. 128–129.
- [4] M. Kellah, Y. Yibin, S. K. Nam, D. Somasekhar, G. Pandya, A. Farhang, K. Zhang, C. Webb, and V. De, "Wordline & bitline pulsing schemes for improving SRAM cell stability in low-Vcc 65 nm CMOS designs," in Proc. VLSI Circuits Symp., 2006, pp. 9–10.
- [5] V. Ramadurai, R. Joshi, and R. Kanj, "A disturb decoupled column select 8T SRAM cell," in Proc. CICC, 2007, pp. 25–28.
- [6] W. Henkels, W. Hwang, R. Joshi, and A. Williams, "Provably correct storage arrays," U.S. Patent 6 279 144, Aug. 21, 2001.