# Fast Timing Simulation of Transient Faults in Digital Circuits\*

A. Dharchoudhury, S. M. Kang, H. Cha and J. H. Patel

Department of Electrical and Computer Engineering

University of Illinois at Urbana-Champaign

Urbana, IL 61801.

### Abstract

Transient fault simulation is an important verification activity for circuits used in critical applications, since such faults account for over 80% of all system failures. This paper presents a timing-level transient fault simulator that bridges the gap between electrical and gate-level transient fault simulators. A generic MOS circuit primitive and analytical solutions of node differential equations are used to perform transistor-level simulation with accurate MOSFET models. The transient fault is modeled by a piecewise quadratic injected current waveform; this retains the electrical nature of the transient fault and provides SPICE-like accuracy. Detailed comparisons with SPICE3 show the accuracy of this technique, and speedups of two orders of magnitude are observed for circuits containing up to 2000 transistors. Latched error distributions of the benchmark circuits are also provided.

## 1 Introduction

Transient faults are temporary faults that occur in a functional circuit for a very short duration and may lead to system failures by altering the circuit behavior. Such faults are caused by a number of physical phenomena such as $\alpha$-particle hits from cosmic rays, electromagnetic interference, crosstalk and power transients. It has been reported that over 80% of all system failures occur due to such transient faults [1]. Transient fault simulation is, therefore, important so that circuits (especially those used in critical space, biomedical or military applications) can be redesigned for better fault tolerance. In this paper, we develop a fault simulation technique by focusing on single-event upsets (SEU), which are transient faults caused by $\alpha$-particle hits. The techniques, however, can be extended to other transient faults that can be modeled as a charge injection.

Accurate verification of transient fault tolerance requires a large number of simulations, since the circuit has to be simulated for faults occurring at different nodes of the circuit, for different amounts of injected charge, for different times of injection, and for different input sequences. The task is clearly prohibitive for transient fault simulators based on SPICE-like simulators, as well as for mixed-mode fault simulators [2]. Gate-level transient fault simulators [3] abstract the electrical nature of a transient fault into a logic-level model, and provide significant savings in compute time. However, these simulators suffer from a number of problems: (i) any logic-level model of transient faults, which are electrical in nature, will be inherently less accurate, (ii) the latch modeling approach is valid for standard cell designs only, and has to be repeated if the cell design or technology implementation changes, (iii) nodes internal to the logic gates/cells are not accessible for injection or observation, and (iv) gate-level transient fault simulators ignore faults which do not upset the logic value of the affected node. An example in which this may cause incorrect results is shown in Fig. 1. Suppose that nodes A and B of the CMOS transmission gate are at logic values 0 and 1, respectively, and the gate is off when a transient fault is injected into node A. If the fault reduces the voltage of node A sufficiently, the NMOS transistor will turn on and charge-sharing may lead to an incorrect logic value of node B which may, in turn, lead to a latched fault. Such a fault will not be detected by a gate-level transient fault simulator.

Permission to copy without fee all or part of this material is granted, provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.



Fig. 1: SEU missed by gate-level fault simulator

In this paper, we present a fast and accurate timing-level simulator for transient faults in digital circuits, which bridges the gap between electrical- and gate-level transient fault simulation. We demonstrate the accuracy and efficiency of this technique by comparisons with SPICE3. This simulator can be effectively applied for transient fault simulation of custom VLSI circuits for which electrical-level fault simulation is too expensive. It can also be used in place of electrical simulators like SPICE3 in the modeling phase of gate-level transient fault simulation, thereby reducing the modeling cost considerably.

<sup>\*</sup>This research was supported in part by the Joint Services Electronics Program under contract N00014-93-J-1270 and by Samsung Electronics Co.

## 2 Transient Fault Model

In CMOS digital circuits, single-event upsets are modeled by injecting the following double exponential current pulse into the affected node [4]:

$$I(t) = I_0(e^{-t/\tau_1} - e^{-t/\tau_2}). \quad (1)$$

In the above equation, $I_0$ depends on the amount of injected charge and may be positive or negative, $\tau_1$ represents the collection time constant of the junction, and $\tau_2$ the ion-track establishment time constant. $\tau_1$ and $\tau_2$ are constants which depend on several process-dependent factors; in this work, we use the values given in [5]: $\tau_1 = 1.64 \times 10^{-10}$ sec and $\tau_2 = 5.0 \times 10^{-11}$ sec. The double exponential current pulse of (1) is approximated by a piecewise quadratic (PWQ) function in time. This is accomplished by dividing the simulation time interval into several segments, and finding a quadratic function that closely approximates the exponential waveform over each segment. Thus, for the *i*th segment $[t_i, t_{i+1}]$, the quadratic current waveform can be represented as $i_{seu}(t) = \gamma_0 + \gamma_1(t-t_i) + \gamma_2(t-t_i)^2$. The rationale for obtaining a quadratic approximation is explained in the next section. Note that $\tau_1$ and $\tau_2$ characterize the double exponential current pulse and, since they are known a priori, this approximation can be done as accurately as desired before the simulation. For example, Fig. 2 shows the accuracy of the piecewise quadratic current approximation (with five segments) for different levels of charge injection.



Fig. 2: PWQ approximation of injected current
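The segment-wise fit described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the paper: the least-squares fit, the breakpoint values in `BREAKS`, and the helper names `fit_pwq` and `eval_pwq` are assumptions made for the example. It also assumes that the injected charge $Q$ equals the integral of (1) over $[0, \infty)$, which gives $Q = I_0(\tau_1 - \tau_2)$.

```python
import numpy as np

TAU1 = 1.64e-10  # collection time constant [s], value the text takes from [5]
TAU2 = 5.0e-11   # ion-track establishment time constant [s], from [5]


def double_exp(t, i0):
    """Double exponential SEU current of equation (1)."""
    return i0 * (np.exp(-t / TAU1) - np.exp(-t / TAU2))


def fit_pwq(breakpoints, i0, samples=64):
    """Least-squares quadratic per segment; returns (t_i, g0, g1, g2) tuples."""
    segments = []
    for t_a, t_b in zip(breakpoints[:-1], breakpoints[1:]):
        h = t_b - t_a
        x = np.linspace(0.0, 1.0, samples)  # normalized local time in [0, 1]
        c2, c1, c0 = np.polyfit(x, double_exp(t_a + h * x, i0), 2)
        # Undo the normalization: i(t) = g0 + g1 (t - t_i) + g2 (t - t_i)^2
        segments.append((t_a, c0, c1 / h, c2 / h**2))
    return segments


def eval_pwq(segments, t):
    """Evaluate the PWQ waveform at a scalar time t inside the fitted range."""
    starts = [seg[0] for seg in segments]
    k = max(0, int(np.searchsorted(starts, t, side="right")) - 1)
    t_i, g0, g1, g2 = segments[k]
    dt = t - t_i
    return g0 + g1 * dt + g2 * dt * dt


# Assumption: Q is the integral of (1) over [0, inf), so Q = I0 * (tau1 - tau2).
Q = 2e-12                                   # 2 pC of injected charge
I0 = Q / (TAU1 - TAU2)
# Hypothetical five-segment breakpoints, denser where the pulse changes fastest.
BREAKS = np.array([0.0, 0.04, 0.1, 0.2, 0.4, 1.0]) * 1e-9
SEGMENTS = fit_pwq(BREAKS, I0)

grid = np.linspace(0.0, 1.0e-9, 2001)
exact = double_exp(grid, I0)
approx = np.array([eval_pwq(SEGMENTS, t) for t in grid])
max_rel_err = np.max(np.abs(approx - exact)) / np.max(exact)
print(f"max error relative to peak current: {max_rel_err:.3%}")
```

Fitting in a normalized local variable keeps the least-squares problem well conditioned even though the segments are only tens of picoseconds long. Because the fit depends only on $\tau_1$, $\tau_2$ and the breakpoints, the coefficients scale linearly with $I_0$ and can be computed once before simulation.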

## 3 Circuit Primitive and Solution

We use the generic MOS circuit primitive shown in Fig. 3 and proposed in [6]. Piecewise linear (PWL) input signals are applied at the terminals $D_i$ and $G_i$; $g_k$ and $C_k$ are linear parasitic conductances and capacitances, and $C_L$ is the capacitance at the output node of the primitive, whose voltage waveform $V(t)$ is to be determined. The primitive also includes the piecewise quadratic (PWQ) current source injected at the output node to represent charge injection during a single-event upset.

Fig. 3: Generic MOS circuit primitive

We define a phase to be a time interval $[t_0, T]$ in which the PWL signals have constant slew rates and the PWQ current waveform has constant parameters and can be written as $i_{seu}(t) = \gamma_0 + \gamma_1(t - t_0) + \gamma_2(t - t_0)^2$. If each MOS transistor in the primitive has a regionwise quadratic model [7] describing the drain current in terms of the gate-source and drain-source voltages, then the differential equation of the output node is of the form

$$\frac{dV}{d\tau} = KV^2 + (p_1\tau + p_0)V + (q_2\tau^2 + q_1\tau + q_0), \quad (2)$$

where $\tau = t - t_0$. The above differential equation belongs to the class of Riccati differential equations, for which exact analytical and power series solutions have been derived [6]. The use of the analytical solutions is the primary contributing factor to the speedup of fast timing simulation and is the motivation for using a PWQ approximation to the injected current. Moreover, the PWQ approximation allows the electrical nature of the transient fault to be maintained, and provides accuracy close to that of electrical-level simulators.
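To make the structure of (2) concrete, the sketch below builds a truncated power series for $V(\tau)$ by matching coefficients of $\tau^n$ on both sides of the equation. It is a generic textbook construction, not necessarily the form of the solutions derived in [6]; the function names and the number of terms are assumptions for illustration. Writing $V(\tau) = \sum_n a_n \tau^n$ with $a_0 = V(0)$, the recurrence is $(n+1)\,a_{n+1} = K \sum_{j=0}^{n} a_j a_{n-j} + p_0 a_n + p_1 a_{n-1} + q_n$, where $q_n = 0$ for $n > 2$ and $a_{-1} = 0$.

```python
def riccati_series(v0, K, p, q, n_terms=40):
    """Coefficients a_n of V(tau) = sum a_n tau^n solving equation (2).

    p = (p0, p1) and q = (q0, q1, q2) are the coefficients of (2).
    """
    p0, p1 = p
    a = [v0]
    for n in range(n_terms - 1):
        # Coefficient of tau^n in V^2 (Cauchy product of the series with itself).
        conv = sum(a[j] * a[n - j] for j in range(n + 1))
        rhs = K * conv + p0 * a[n]
        if n >= 1:
            rhs += p1 * a[n - 1]   # contribution of p1 * tau * V
        if n <= 2:
            rhs += q[n]            # forcing polynomial q0 + q1 tau + q2 tau^2
        a.append(rhs / (n + 1))
    return a


def eval_series(a, tau):
    """Horner evaluation of the truncated series at tau."""
    v = 0.0
    for coeff in reversed(a):
        v = v * tau + coeff
    return v


# Demo in normalized units: dV/dtau = -V^2 with V(0) = 1 has V = 1 / (1 + tau).
coeffs = riccati_series(1.0, -1.0, (0.0, 0.0), (0.0, 0.0, 0.0))
print(eval_series(coeffs, 0.5), 1.0 / 1.5)
```

A truncated series of this kind is only valid inside its radius of convergence (here $|\tau| < 1$ for the demo case), so a practical solver would restart the expansion or switch to a closed-form expression when the phase is long relative to that radius.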

Under normal operating conditions, the output node is the drain terminal for each of the NMOS and PMOS transistors in the primitive. During transient fault simulation, however, this condition may not always be true. For example, if a fault pumps current into the output node of a CMOS inverter when it is at logic 1, the output node may become the source terminal for the PMOS transistor. Since the coefficients of the differential equation depend on whether the output node is the source or drain terminal, changes in the location of drain and source terminals of MOS transistors are monitored during the simulation. Moreover, the MOS drain-bulk and source-bulk parasitic diodes, which are reverse biased under normal operating conditions, may start conducting during transient faults. In the previous example, this would happen if the output node rises to about $V_{dd} + 0.65$ V, assuming that the bulk node of the PMOS transistor is tied to $V_{dd}$. These parasitic diodes are handled as follows: if, for any MOS transistor in the primitive, the parasitic diode corresponding to the simulated node (drain-bulk or source-bulk diode) begins conducting, the voltage of the simulated node is clamped to the voltage of the bulk node plus or minus the turn-on voltage of the diode. The error incurred by ignoring the finite conductance of the diode and the presence of other conducting elements in the primitive is small; this claim is borne out by the near perfect match with SPICE3 simulation results (Section 4). It is worth mentioning that no other fast timing simulator considers parasitic diode effects.

Note also that the coefficients of the differential equation in (2) depend on the regions of operation of the MOS transistors in the primitive. Starting at a time point $t_s$ in the phase, we find the regions of operation of each MOS transistor and compute the coefficients of the differential equation. The analytical solution $V(t)$ is used to determine the regions of operation of the transistors at the end of the phase, $T$. If there is any change, the time of the earliest change is calculated as $t_x$, and the current solution is considered valid over $[t_s, t_x]$. The process is then repeated from $t_x$ by determining the new regions of operation of the transistors.
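The region-tracking loop just described can be sketched on a toy problem. Everything device-specific here is hypothetical: a single threshold `V_TH` stands in for the transistor region boundaries, each region has a simple exponential closed-form solution in place of the Riccati solution of (2), and the region-change time $t_x$ is located by bisection under the assumption that the waveform crosses the boundary at most once within the phase.

```python
import math

V_TH = 1.0                            # hypothetical region boundary [V]
RATE = {"sat": 1.0e9, "lin": 3.0e9}   # hypothetical decay rates [1/s]


def region_of(v):
    """Toy two-region device: 'sat' at or above V_TH, 'lin' below it."""
    return "sat" if v >= V_TH else "lin"


def solve_region(v_s, region):
    """Closed-form solution of dV/dt = -rate * V, valid while region holds."""
    rate = RATE[region]
    return lambda dt: v_s * math.exp(-rate * dt)


def simulate_phase(v0, t0, t_end, tol=1e-15):
    """Advance over [t0, t_end], restarting at every region change t_x."""
    t_s, v_s = t0, v0
    pieces = []                       # (t_s, t_x, region) for each accepted span
    while t_s < t_end:
        region = region_of(v_s)
        v_of = solve_region(v_s, region)
        t_x = t_end
        if region_of(v_of(t_end - t_s)) != region:
            # Region differs at the end of the phase: bisect for the change time.
            lo, hi = t_s, t_end
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if region_of(v_of(mid - t_s)) == region:
                    lo = mid
                else:
                    hi = mid
            t_x = hi                  # first sampled time in the new region
        pieces.append((t_s, t_x, region))
        v_s = v_of(t_x - t_s)
        t_s = t_x
    return pieces, v_s


pieces, v_end = simulate_phase(2.0, 0.0, 2.0e-9)
for t_a, t_b, region in pieces:
    print(f"{region}: {t_a:.3e} s to {t_b:.3e} s")
```

Taking the upper end of the bisection bracket as $t_x$ guarantees that the next pass starts strictly later in time and already inside the new region, so the loop always makes progress. With several transistors, the same check would be made per device and the earliest change time kept.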

## 4 Results and Observations

In this section, we demonstrate the accuracy and efficiency of the transient fault simulator by comparing its results with those of SPICE3. Since SPICE simulations of large circuits are too time-consuming, this comparison can only be done for a few small circuits. The first example uses the D flip-flop (DFF) shown in Fig. 4. Transient faults are injected at node F, and voltage waveforms at node F and the output node Q are monitored. The waveforms from our simulator and SPICE3 for the fault-free case are shown in Fig. 5. Next, two transient faults injecting 2 pC and 3 pC of charge at node F are simulated and the results are shown in Fig. 6 and Fig. 7, respectively. It can be seen that the first fault does not get latched by the DFF, while the second one does. In the next example, we consider the s208 sequential circuit from the ISCAS-89 benchmark suite [8] and observe the fault-free and faulty waveforms at one of its nodes. Fig. 8 compares the simulation results with SPICE3.

Fig. 5: Fault-free simulation of DFF

Fig. 6: DFF simulation with 2 pC charge injection

Average run-times for the seven sequential circuits from the ISCAS-89 benchmark suite are shown in Table I. In the table, N refers to the number of transistors in the circuit. Test vectors obtained from STG [9] were used as the input sequences for these circuits. For the circuits which were simulated with SPICE3, we provide values for the speedup and accuracy. The quantity $\overline{\epsilon}_{max}$ is used as a measure of accuracy with respect to the SPICE3 results. For a given circuit, it is calculated as follows: (i) for a particular simulation, the average error for each of the monitored nodes is calculated, (ii) the maximum of the average node errors is the error for that simulation, and (iii) $\overline{\epsilon}_{max}$ is the largest error among all the simulations for the given circuit. The graphical comparisons and the results of Table I demonstrate the accuracy and efficiency of our simulator. Speedups approaching two orders of magnitude are achieved over SPICE3 (from about 18 for s27 to about 99 for s641). Moreover, the speedup factor increases with the number of transistors in the circuit, indicating that the speed advantage will be even greater for larger circuits.
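The three-step definition of $\overline{\epsilon}_{max}$ can be written out as a short sketch. The text does not specify how the per-node average error is formed, so this example assumes it is the mean absolute voltage difference between the two simulators at common time samples; the function names and the sample waveforms are made up for illustration and are not data from the paper.

```python
import numpy as np


def node_average_error(v_fast, v_ref):
    """Step (i): mean absolute voltage difference over the time samples."""
    return float(np.mean(np.abs(np.asarray(v_fast) - np.asarray(v_ref))))


def simulation_error(fast, ref):
    """Step (ii): worst average error over the monitored nodes of one run."""
    return max(node_average_error(fast[node], ref[node]) for node in ref)


def eps_max(runs):
    """Step (iii): largest simulation error over all runs of one circuit."""
    return max(simulation_error(fast, ref) for fast, ref in runs)


# Made-up waveforms [V] for two runs, each monitoring nodes "F" and "Q".
RUNS = [
    ({"F": [0.0, 1.0, 2.0], "Q": [5.0, 5.0]},     # fast simulator, run 1
     {"F": [0.0, 1.3, 2.0], "Q": [5.0, 4.6]}),    # reference, run 1
    ({"F": [1.0, 1.0], "Q": [0.0, 0.0]},          # fast simulator, run 2
     {"F": [1.5, 1.5], "Q": [0.0, 0.1]}),         # reference, run 2
]
print(f"eps_max = {eps_max(RUNS):.3f} V")
```

Because the metric takes a maximum over nodes and over runs, a single poorly matched node in one simulation determines the reported value, which makes it a conservative accuracy figure.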

To demonstrate the application of our simulator in transient fault analysis, seven ISCAS-89 sequential benchmark circuits were selected and 2000 randomly chosen faults were injected in each circuit. The number of faults that were latched is shown in Table II. Table II also shows the number of faults that caused a particular number of latches to flip (as a percentage of latched faults). As can be seen, single-bit flips are by far the most common, but a number of faults cause two or more latches to flip.



Fig. 7: DFF simulation with 3 pC charge injection

| Ckt Name | N     | Run-time [s] | SPICE3 run-time [s] | Speedup | $\overline{\epsilon}_{\max}$ [V] |
|----------|-------|--------------|---------------------|---------|----------------------------------|
| s27      | 114   | 2.5          | 45.34               | 18.14   | 0.29                             |
| s208.1   | 624   | 2.6          | 115.80              | 45.10   | 0.16                             |
| s641     | 1740  | 14.04        | 1394.18             | 99.30   | 0.18                             |
| s713     | 1860  | 16.55        | -                   | -       | -                                |
| s820     | 1906  | 17.62        | -                   | -       | -                                |
| s1494    | 4046  | 14.24        | -                   | -       | -                                |
| s5378    | 13198 | 138.57       | -                   | -       | -                                |

Table I: Average run-times on a Sun SPARCstation 10

Table II: Latch error distributions for 2000 injected faults

| Ckt Name | Latched Faults | 1 Latch Flipped (%) | 2 Latches Flipped (%) | 3 Latches Flipped (%) |
|----------|----------------|---------------------|-----------------------|-----------------------|
| s27      | 51             | 96.10               | 3.90                  | 0.00                  |
| s208     | 63             | 93.60               | 3.20                  | 3.20                  |
| s641     | 33             | 93.90               | 3.05                  | 3.05                  |
| s713     | 21             | 85.71               | 14.29                 | 0.00                  |
| s820     | 6              | 83.33               | 16.67                 | 0.00                  |
| s1196    | 9              | 100.00              | 0.00                  | 0.00                  |



Fig. 8: Transient fault simulation of s208 circuit

## 5 Conclusions

In this paper, we have presented a new fast timing simulator of transient faults in CMOS digital circuits. Transient faults are modeled as piecewise quadratic injected current waveforms. The simulator uses a generic MOS primitive and analytical solutions of node differential equations to provide accurate and detailed waveform information. This simulator has been shown to be very accurate compared to SPICE3, while providing speedups of two orders of magnitude for circuits containing up to 2000 transistors. For larger circuits, the speedup is expected to be even greater.

### References

- [1] R. K. Iyer and D. Rossetti, "A measurement-based model for workload dependence of CPU errors," *IEEE Trans. Comput.*, vol. C-35, pp. 511-519, June 1986.
- [2] F. L. Yang and R. A. Saleh, "Simulation and analysis of transient faults in digital circuits," *IEEE J. Solid State Circuits*, vol. 27(3), pp. 258-264, March 1992.

- [3] H. Cha, E. M. Rudnick, G. S. Choi, J. H. Patel, and R. K. Iyer, "A fast and accurate gate-level transient fault simulation environment," *Digest 23rd Int. Symp. Fault-Tolerant Comput.*, pp. 310-319, June 1993.
- [4] G. C. Messenger, "Collection of charge on junction nodes from ion tracks," *IEEE Trans. Nucl. Sci.*, vol. NS-29(6), pp. 2024-2031, Dec. 1982.
- [5] V. Carreno, G. Choi, and R. K. Iyer, "Analog-digital simulation of transient-induced logic errors and upset susceptibility of an advanced control system," NASA Technical Memo 4241, Nov. 1990.
- [6] Y. H. Shih and S. M. Kang, "Analytic transient solution of general MOS circuit primitives," *IEEE Trans. Computer-Aided Design*, vol. 11(6), pp. 719-731, June 1992.
- [7] A. Dharchoudhury, S. Kang, K. Kim and S. Lee, "Fast and accurate timing simulation with regionwise quadratic models of MOS I-V characteristics," *Proc. ICCAD*, 1994.
- [8] F. Brglez, D. Bryan, and K. Kozminski, "Combinational profiles of sequential benchmark circuits," *Proc. IEEE Int. Symp. Circuits and Systems*, pp. 1929-1934, May 1989.
- [9] W. T. Cheng and S. Davidson, "Sequential circuit test generator (STG) benchmark results," *Proc. IEEE Int. Symp. on Circuits and Systems*, pp. 1938-1941, May 1989.