CIRCUIT SIMULATION OF THREE-DIMENSIONAL DEVICES IN BEAM-RECRYSTALLIZED POLYSILICON FILMS

James F. Gibbons, Martin D. Giles, James C. Sturm and James T. Walker

Stanford Electronics Laboratory, Stanford University, Stanford, California 94305 U.S.A.

# ABSTRACT

A set of folding and rotation operations is used to transform planar MOSFET device configurations into three-dimensional structures in beam-recrystallized polysilicon films. Circuit simulation techniques are used to judge the packing density, speed, and yield of each configuration in an integrated circuit layout. The implications of this analysis are explored to estimate the potential of three-dimensional integration.

# INTRODUCTION

In 1979, Lee et al. (1) fabricated the first MOSFETs in laser-recrystallized polysilicon films and made a variety of measurements showing that these films were worthy of further study as an alternative to SOS technology. Shortly afterwards, Lam et al., in collaboration with the Stanford group (2), fabricated 11-13 stage ring oscillators in laser-recrystallized films with a minimum propagation delay of 44 ns per stage, compared to a propagation delay of 36 ns per stage for the same circuit made on a single crystal substrate. Since then a substantial number of papers have been published on this subject (3, and references cited therein), leading to a number of techniques for improving the recrystallization process itself. Devices capable of integration into large area liquid crystal display decoders have been demonstrated (4). In addition, the possibility of three-dimensional integration has been established by Gibbons and Lee (5), who made a stacked CMOS structure with a common gate interposed between the two surfaces on which enhancement mode MOSFET action was desired. More recently, Gibbons et al. (6) described stacked MOSFETs fabricated in a single recrystallized film, in which the devices utilized separate gates to obtain independent enhancement mode behavior on each surface of the film.

# THE FOLDING PRINCIPLE

From a topological point of view, both of these stacked devices can be visualized as arising from a series of folding and rotation operations, using the equivalent planar (single surface) device as a primitive.

Stacked CMOS. To illustrate the procedure, we show in Fig. 1(a) a CMOS device as it might be fabricated on the top surface of two beam-recrystallized islands, as suggested by Lam et al. (7). If the p channel device is now folded through -180°, it is brought to a position over the n channel device and we obtain the stacked CMOS structure shown in Fig. 1(b), similar to the one reported on by Gibbons and Lee (5). We will call this type of folding "simple folding". It is permitted by the fact that the gates in the CMOS inverter are always at the same potential. Similarly, the two device drains are at the same potential and are considered to be connected together through a metal layer that makes ohmic contact to both drains simultaneously. This structure has the possibly useful feature that +V<sub>D</sub> and GND are on the uppermost and lowest levels of the structure, while signals flow laterally in a middle layer that could be composed of metal only.

Cross-MOS. Recently Gibbons et al. (6) have demonstrated a cross-MOS structure in which the upper and lower surfaces of a single recrystallized film are used simultaneously for device fabrication. The CMOS implementation of this structure is described in Fig. 2, where the folding operation is again considered to be through -180°. However, the different potentials of the p and n channel source regions require that this folding operation be accompanied by a rotation. We take the rotation to be 90° for convenience, leading to the cross structure reported on earlier.

Fig. 1. (a) n and p channel CMOS devices in a planar array, (b) n and p channel devices in a stacked array.

Selector Logic Circuit. An interesting example of a folded configuration can be obtained by studying the selector logic circuit used by Mead and Conway (8), for which the stick diagram is shown in Fig. 3(a). This circuit provides a method for implementing single output, multi-input logic. The dotted regions represent regions with a depletion mode implant, where the signal path is shorted independent of the gate signal. The gates with no depletion mode implant act as pass transistors. Thus, for the first term S<sub>0</sub>A'B', A and B are shorted and A' and B' are implemented. While the different gate signals do not share the same source and drain regions, it can be seen that a signal and its complement (e.g., A and A') are never implemented simultaneously. They can therefore be folded together, provided a method can be found that allows the implementation of a signal but prohibits the implementation of its complement. Such a method is illustrated in Fig. 3(b). To implement A', a p type implant is used that is peaked at the top interface. The implant is similar to a threshold adjustment implant, but with a somewhat higher dose, so that the threshold voltage is sufficiently higher than the highest value of the gate voltage to be used in the logic. To implement A, a similar implant is performed at a higher energy, so that the implant is peaked at the bottom interface. Alternatively, masking steps that allow the growth of different gate oxide thicknesses can be used, although that may be a more complicated procedure.

The stick diagram for the folded structure is shown in Fig. 3(c), where the open and cross-hatched squares indicate implants of different conditions. A clear improvement in packing density is evident.
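The logical equivalence of the planar and folded selector rows can be checked with a minimal sketch. It assumes a purely Boolean model: a planar row is a series chain in which a depletion implant shorts a site, while a folded site has two surface channels in parallel and the threshold implant leaves only one of them able to conduct. The dictionaries and function names below are hypothetical and are not taken from Fig. 3; which physical surface carries which signal is not modeled.

```python
from itertools import product

def planar_row_conducts(row, signals):
    """Series chain: shorted (depletion) sites always conduct,
    pass-transistor sites conduct only when their gate signal is high."""
    return all(signals[col] for col, kind in row.items() if kind == "pass")

def folded_row_conducts(row, signals):
    """Each folded site keeps one enabled surface channel; the site
    conducts only when the gate signal of that channel is high."""
    return all(signals[literal] for literal in row.values())

# First term S0*A'*B' in the planar layout: four gate columns,
# A and B shorted by the depletion implant, A' and B' implemented.
planar_s0 = {"A": "short", "A'": "pass", "B": "short", "B'": "pass"}

# Same term in the folded layout: two columns, each naming the
# literal whose channel is left enabled (hypothetical encoding).
folded_s0 = {"A": "A'", "B": "B'"}

def gate_signals(a, b):
    """Gate signals for the four literals given inputs a and b."""
    return {"A": a, "A'": not a, "B": b, "B'": not b}

# Exhaustively compare the two models over all input combinations.
results = {}
for a, b in product([False, True], repeat=2):
    s = gate_signals(a, b)
    results[(a, b)] = (planar_row_conducts(planar_s0, s),
                       folded_row_conducts(folded_s0, s))
```

In this model the folded row needs half as many gate columns as the planar row for the same logic function, which is the source of the packing density improvement.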



Fig. 2. Schematic top view of cross-MOS structure.

# CIRCUIT SIMULATION

Using these ideas, Gibbons and Lee (9) have developed a number of structures that provide the potential of increased packing density for logic and memory circuits. The utility of these circuits will of course depend on whether they can be manufactured with reasonable yields and whether they have sufficient speed or other advantage over more conventional circuits to warrant serious consideration.

To provide some guidance on these questions, we have performed circuit simulations of various forms of a CMOS shift register cell that can be developed by the folding and rotation operations just described. The circuit diagram is shown in Fig. 4 along with three forms of stacked CMOS cross sections (related to the one film and two film inverters described above). Circuit simulations were performed using SPICE (10), assuming $\lambda = 2$ μm and oxide thicknesses of 500 Å and 0.5 μm for gate and field respectively, for several values of mobility in beam-recrystallized polysilicon and for both self-aligned and partially self-aligned (bottom gate only) structures. These simulations give a time $\overline{t}_d$, which is the average delay time across a shift register cell. Calculations for several different types of folded cells are shown in Tables 1 and 2. The rows labelled SOI are visualized as recrystallized islands in a planar array, similar to the unfolded primitive shown in Fig. 1(a).

To allow comparison of the results, a figure of merit, F, was defined in terms of the cell delay $\overline{t}_d$, the cell area A, an estimated fabrication mask count M and a yield per mask level Y:

$$F = \frac{Y^{M}}{2 A \overline{t}_d}$$









Fig. 3. Selector logic circuit:

- (a) Stick diagram of conventional structure.
- (b) A possible way of implementation in a folded structure.
- (c) Stick diagram of folded structure.

F has the dimensions of gate-Hz/cm². The simulation results are summarized in Tables 1 and 2. The estimated mask count is somewhat larger than the minimum necessary, to allow for a range of interconnection possibilities and to allow depletion transistors in peripheral circuitry.
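The figure of merit can be evaluated directly from the tabulated quantities. The following is a minimal sketch, assuming that the rows of Table 1 pair with the rows of Table 2 in the order Bulk, SOI, Stacked 1, Stacked 2, Stacked 3, that the area is converted with $\lambda$ = 2 μm, and that Y = 0.70; the function and variable names are illustrative only.

```python
LAMBDA_CM = 2e-4        # lambda = 2 um, expressed in cm
YIELD_PER_MASK = 0.70   # Y, as assumed for Table 2
NORMALIZATION = 8.4e10  # normalization quoted below Table 2

def figure_of_merit(area_lambda2, masks, delay_ns, y=YIELD_PER_MASK):
    """Return F = Y**M / (2 * A * t_d) in gate-Hz/cm^2."""
    area_cm2 = area_lambda2 * LAMBDA_CM ** 2
    delay_s = delay_ns * 1e-9
    return y ** masks / (2.0 * area_cm2 * delay_s)

# (area / lambda^2, mask count, delay in ns); the pairing of
# Table 1 rows with Table 2 rows is an assumption of this sketch.
cells = {
    "SOI": (860, 12, 1.1),
    "Stacked 1": (270, 15, 2.6),
    "Stacked 2": (200, 21, 2.0),
}

normalized = {name: figure_of_merit(*row) / NORMALIZATION
              for name, row in cells.items()}

for name, value in normalized.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the three rows evaluate to about 2.2, 1.0 and 0.2, in agreement with the corresponding self-aligned entries of Table 2. The rapid fall of $Y^M$ with mask count is what penalizes the Stacked 2 cell despite its small area.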

Several conclusions can be drawn from these results:

1. The simple use of island structures (designated SOI) without any stacking whatsoever gives the fastest circuits and much of the figure of merit increase that is obtained using more complex stacking arrangements.
2. The packing density for 3D circuits is significantly better than SOI or bulk.
3. The switching speed of 3D circuits is comparable to SOI.
4. The figure of merit for 3D circuits is adversely affected by the rather large number of masks that is required.

This last feature has led us to study a more simply fabricated basic CMOS 3D shift register in which we assume that only one type of device will be fabricated on each layer of the circuit and where inter-layer connection is made with forward biased diodes. This reduces the number of masks significantly, but only the type 1 3D configuration can be built with this restriction.

Fig. 4. Cross-sections of stacked CMOS structures for shift register cell.

As suggested in Table 3, the number of masks has been reduced to 8 or 9 for all of the configurations. The 3D type 1 configuration has a figure of merit that is approximately one-third larger than the SOI planar configuration. This conclusion turns out to be independent of the carrier mobilities assumed for the layers.
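The one-third advantage can be reproduced from the delay, area and mask entries of Table 3. The sketch below assumes the same figure of merit definition, $F = Y^{M}/(2A\overline{t}_d)$, and uses illustrative function names. Because the SOI planar and SOI 3D type 1 cells are both assigned 9 masks, the factor $Y^M$ cancels in the ratio, so in this model the comparison does not depend on the yield per mask level that is assumed.

```python
def merit(area_lambda2, masks, delay_ns, yield_per_mask):
    """Unnormalized F = Y**M / (2 * A * t_d), with A in lambda^2 and t_d in ns."""
    return yield_per_mask ** masks / (2.0 * area_lambda2 * delay_ns)

def ratio_3d_to_soi(yield_per_mask):
    """Ratio of the 3D type 1 merit to the SOI planar merit (Table 3 inputs)."""
    soi = merit(860, 9, 1.1, yield_per_mask)
    stacked = merit(270, 9, 2.6, yield_per_mask)
    return stacked / soi

# Evaluate the ratio for several hypothetical yields per mask level.
ratios = [ratio_3d_to_soi(y) for y in (0.5, 0.7, 0.9)]
print([round(r, 2) for r in ratios])
```

The ratio reduces to (860 × 1.1)/(270 × 2.6), or about 1.35, consistent with the tabulated merits of 86 and 64.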

# SOME PROPOSED MEMORY APPLICATIONS FOR THREE-DIMENSIONAL INTEGRATION

In the previous sections we have shown that three-dimensional integration can provide a definite advantage over SOI, but the advantage does not seem great enough for existing logic circuits to warrant large scale developmental effort. The areas in which three-dimensional integration seems to have the greatest likelihood of success are in circuits where:

1. Device density is very important,
2. Device properties not obtainable in bulk or SOI are important, and
3. Only a small number of masks (or additional masking steps) are required, implying only one device type per layer.

Table 1. Layout Comparison for Shift Register Cell

| Type      | Area/$\lambda^2$ | Estimated mask count |
|-----------|------------------|----------------------|
| Bulk      | 950              | 11                   |
| SOI       | 860              | 12                   |
| Stacked 1 | 270              | 15                   |
| Stacked 2 | 200              | 21                   |
| Stacked 3 | 210              | 16                   |

Table 2. Simulation Results for Shift Register Cell. A yield of 70% per mask level is assumed.

| Type      | Delay (ns) | Merit\* (gate-Hz/cm<sup>2</sup>) | Notes                                                                          |
|-----------|------------|----------------------------------|--------------------------------------------------------------------------------|
| Bulk      | 3.1        | 200                              | Partially self-aligned                                                         |
| SOI       | 1.1        | 2.2                              | Mobilities:                                                                    |
| Stacked 1 | 2.6        | 1.0                              | $\mu_0 = 700 \text{ cm}^2/\text{V-s}$                                          |
| Stacked 2 | 2.0        | 0.2                              | $\mu_e = 700 \text{ cm}^2/\text{V-s}$, $\mu_n = 250 \text{ cm}^2/\text{V-s}$   |
| Stacked 3 | 1.5        | 1.5                              | (3)                                                                            |
| SOI       | 1.5        |                                  | Partially self-aligned                                                         |
| Stacked 1 | 3.4        | 0.8                              | μ<sub>p</sub> = 500                                                            |
| Stacked 2 | 2.6        | 0.2                              | μ<sub>n</sub> = 200                                                            |
| Stacked 3 | 2.0        | 1.2                              | -11                                                                            |

\*Merit normalized by $8.4 \times 10^{10}$.

Table 3. SPICE Simulation Results for Simplified 3D Configuration

| Circuit type | Delay (ns) | Area/$\lambda^2$ | Masks | Merit (Y = 70%) |
|--------------|------------|------------------|-------|-----------------|
| Bulk planar  | 3.1        | 950              | 8     | 29              |
| SOI planar   | 1.1        | 860              | 9     | 64              |
| SOI 3D 1     | 2.6        | 270              | 9     | 86              |

These conclusions lead one naturally to a consideration of memory circuits as logical early applications for three-dimensional integration. To study this possibility we show in Fig. 5 the state-of-the-art SRAM, a four transistor circuit with high value load resistors made in polysilicon. The resistors are in the megohm range, leading to a noise immunity ($\alpha$ particle) problem: since the resistors provide a very high impedance to the 5V power supply bus, the capacitance from drain to ground can be charged by currents arising from the $\alpha$ bombardment.

In Fig. 6 we show the CMOS 6 transistor SRAM, which is known to operate with very low power and to have very good noise immunity, because now there is a low impedance both to the 5V rail and to ground. Unfortunately, the conventional CMOS circuit requires larger area than its NMOS counterpart shown in Fig. 5. However, it is possible to fold the p channel devices in Fig. 6 on top of the n channel devices, as suggested in the first application of folding earlier in this paper, to produce a memory cell which has the same size as the 4 transistor NMOS cell just described, but improved performance with regard to noise immunity. This is a potentially useful application of the folding principle.

Fig. 5. State-of-the-art NMOS 4 transistor SRAM.

Fig. 6. CMOS 6 transistor SRAM.

A second interesting application is obtained from a development that starts with the conventional one transistor DRAM shown in Fig. 7. The advantage of this cell is of course its extremely small size. One disadvantage is that the cell has a limited storage time, because the depletion region at the $n^+p$ junction generates leakage, so that a refresh cycle is required. There is also rather poor $\alpha$ particle immunity, because charge collected by the junction depletion region can discharge the storage capacitance. The size of the cell is limited by the required value of storage capacitance divided by bit line capacitance.
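The role of the storage to bit line capacitance ratio can be illustrated with the standard charge-sharing relation for reading a one transistor cell, $\Delta V = (V_{cell} - V_{pre})\,C_s/(C_s + C_b)$. The sketch below is a minimal illustration under that textbook model; the capacitance and voltage values are hypothetical and are not taken from this paper.

```python
def bitline_signal(c_storage_fF, c_bitline_fF, v_cell, v_precharge=0.0):
    """Voltage change on the bit line after the storage capacitor
    shares its charge with the bit line capacitance."""
    transfer_ratio = c_storage_fF / (c_storage_fF + c_bitline_fF)
    return (v_cell - v_precharge) * transfer_ratio

# Hypothetical baseline: 50 fF storage node, 500 fF bit line, 5 V stored.
baseline = bitline_signal(50.0, 500.0, 5.0)

# Doubling the storage capacitance raises the read signal.
larger_storage = bitline_signal(100.0, 500.0, 5.0)

# Halving the bit line capacitance has the same effect in this example.
smaller_bitline = bitline_signal(50.0, 250.0, 5.0)

print(f"{baseline:.3f} V, {larger_storage:.3f} V, {smaller_bitline:.3f} V")
```

In this model either a larger storage capacitance or a smaller bit line capacitance increases the read signal, so a cell that improves both can be made smaller for the same signal.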

Fig. 7. Conventional one transistor DRAM.

An improvement on the one transistor DRAM developed by Jolly and co-workers (11) is shown in Fig. 8, where an n-channel device is made in a recrystallized film over a thick oxide. The source contact for the transistor is fabricated in close proximity to both an n<sup>+</sup> substrate (single crystal silicon) and a metal plate fabricated on top of the oxidized source region. This configuration provides a significant increase in the storage capacitance and improved storage time, because there is no depletion region leakage to discharge the storage capacitance as in the conventional circuit. Furthermore, the bit line capacitance is reduced because the n<sup>+</sup> drain region is not diffused in the substrate. There is also improved $\alpha$ particle immunity, since the storage node is isolated from charge generated in the substrate. However, Jolly et al. found that leakage on the lower surface of the access transistor decreased the refresh time significantly unless this lower layer could be biased by the application of several volts to the n<sup>+</sup> substrate. This biasing requirement can represent a significant problem in the actual application of the cell.

A further improvement on the one transistor DRAM currently under study at Stanford is shown in Fig. 9 and can be obtained from the cell of Jolly et al. by an appropriate folding operation of the type described earlier (12). This cell retains all advantages of the previous cell with regard to noise immunity and large storage capacitance.



Fig. 8. Improvements on one transistor DRAM (H-P).



Fig. 9. Further improvements on one transistor DRAM (Stanford).

It is also now possible to use the plate, which serves only as an oxide field plate in the configuration developed by Jolly et al., to provide negative bias and thus avoid backside leakage in the access transistor. Furthermore, oxides 1 and 2 can be very thin because the thick oxide over the bias plate gives the thermal isolation that is necessary for recrystallization of the top layer.

In addition to these memory applications one can envisage three-dimensional integrated circuits where circuit operation on one level is to be monitored and/or modified by devices constructed on a second level of recrystallized material. A number of circuits with redundant networks in the "upstairs" layer have been suggested. Three-dimensional integration has also been proposed for digital system applications in which the algorithmic structure of the data or the processing is itself naturally three dimensional.

# CONCLUSIONS

The folding (and rotation) principle provides a number of circuit configurations with substantial improvements in packing density, and speeds that are comparable to nonstacked configurations. However, if devices of both types are to be fabricated on each layer of the circuit, the mask count for a given circuit is likely to be rather high, and hence the yield of good circuits per wafer will be low. To avoid this possibility, we have studied circuits in which only one type of device is fabricated on each layer. This reduces the number of masks considerably, with some decrease in the performance of the 3D circuits compared to their bulk counterparts.

These general results suggest that, in the near term, memory applications are more likely candidates for commercialization of three-dimensional integration than logic, except for special cases.

# ACKNOWLEDGMENTS

The authors would like to acknowledge their indebtedness to Dr. R. A. Reynolds and the Defense Advanced Research Projects Agency for supporting this work. Very helpful discussions were provided during early phases of the work by our colleague Dr. K. F. Lee.



# REFERENCES

- (1) K. F. Lee, J. F. Gibbons and K. C. Saraswat, Appl. Phys. Lett. 35, 173, July 1979.
- (2) H. W. Lam, A. F. Tasch, Jr., T. C. Holloway, K. F. Lee and J. F. Gibbons, IEEE Electron Device Letters EDL-1, 6, June 1980.
- (3) J. F. Gibbons, presented at Materials Research Symposium, Boston, Mass., Nov. 16-20, 1980. Published in Proceedings.
- (4) N. M. Johnson, D. K. Biegelson and M. D. Moyer, Laser and Electron-Beam Solid Interactions and Materials Processing, ed. by J. F. Gibbons, L. D. Hess and T. W. Sigmon, North Holland, 1981, p. 463-470.
- (5) J. F. Gibbons and K. F. Lee, IEEE Electron Device Letters EDL-1, 6, 117, June 1980.
- (6) J. F. Gibbons, K. F. Lee, F. C. Wu, and G. E. J. Eggermont, IEEE Electron Device Letters EDL-3, 8, Aug. 1982.
- (7) H. W. Lam, Laser and Electron-Beam Interactions with Solids, Appleton and Celler, eds., p. 471 (North-Holland, 1981).
- (8) Carver Mead and Lynn Conway, Introduction to VLSI Systems, (Addison-Wesley, 1980).
- (9) IEEE, IEDM Technical Digest, 1982.
- (10) A. Vladimirescu, A. R. Newton and D. O. Pederson, SPICE Version 2G User's Guide, University of California, Berkeley, September 1980; or HP SPICE (June 1980), Hewlett-Packard Design Aids, Palo Alto, California.
- (11) R. D. Jolly, T. Kamsin, and R. D. McCharles, IEEE Electron Device Letters EDL-4, 1, Jan. 1983.
- (12) U.S. Patents applied for.