GLSVLSI 1999 ABSTRACTS
-
High Performance Options through Nanoelectronics
-
G. Pomerenke
-
MEMS
-
K. D. Wise
-
PASTA: Partial Scan to Enhance Test Compaction [p. 4]
-
Irith Pomeranz and Sudhakar M. Reddy
We propose a procedure to select flip-flops for partial scan targeting the reduction of test
length. We show that significant reductions in test length can be achieved by this procedure.
In addition, experimental results show that using heuristics that target the test length does
not have to increase the numbers of flip-flops that need to be scanned in order to achieve a
given level of fault coverage. Consequently, it may be possible to perform partial scan selection
targeting the two parameters, test length and fault coverage, without requiring more flip-flops
than required for one of the parameters.
-
On Applying Set Covering Models to Test Set Compaction [p. 8]
-
Paulo F. Flores, Horácio C. Neto and João P. Marques-Silva
Test set compaction is a fundamental problem in digital system testing. In recent years,
many competitive solutions have been proposed, most of which are based on heuristic approaches.
This paper studies the application of set covering models to the compaction of test sets,
which can be used with any heuristic test set compaction procedure. For this purpose, recent
and highly effective set covering algorithms are used. Experimental evidence suggests that the
size of computed test sets can often be reduced by using set covering models and algorithms.
Moreover, a noteworthy empirical conclusion is that it may be preferable not to use fault
simulation when the final objective is test set compaction.
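To illustrate the set covering view of compaction: each test covers the set of faults it detects, and a compact test set is a small collection of tests whose union covers every detectable fault. The sketch below uses a plain greedy heuristic, which is an assumption made for illustration and not the set covering algorithms evaluated in the paper; the test names and fault sets are hypothetical.

```python
def greedy_compact(detects):
    """Pick tests greedily until every detectable fault is covered.

    detects: dict mapping a test name to the set of faults it detects
    (hypothetical input format, chosen for this sketch).
    """
    remaining = set().union(*detects.values())
    chosen = []
    while remaining:
        # Take the test that detects the most still-uncovered faults.
        best = max(detects, key=lambda t: len(detects[t] & remaining))
        chosen.append(best)
        remaining -= detects[best]
    return chosen


tests = {
    "t1": {"f1", "f2"},
    "t2": {"f2", "f3"},
    "t3": {"f1", "f2", "f3", "f4"},
    "t4": {"f5"},
}
compact = greedy_compact(tests)
print(compact)
```

Greedy selection gives no optimality guarantee; exact set covering solvers can sometimes find smaller covers, which is the kind of gain the abstract reports.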
-
On Test Generation with a Limited Number of Tests [p. 12]
-
Hideyuki Ichihara, Seiji Kajihara, Kozo Kinoshita
This paper considers a new test generation scheme in which a limitation of the number of
tests exists. Since, in this scheme, correct fault coverage cannot be calculated by the
representative faults, we present a method for calculating the correct fault coverage by
using the weighted fault list. We then propose a selection-based test generation method
which derives a limited number of tests with higher fault coverage. The experimental results
for IDDQ testing show that our test generation method can generate tests with fault coverage
close to the maximum fault coverage.
-
Functional ATPG for Delay Faults [p. 16]
-
S. Tragoudas, M. Michael
This paper presents a functional level ATPG tool for delay faults which handles all existing
fault models. The tool generates patterns using either binary decision diagrams or boolean
satisfiability. Experimental results are presented on the ISCAS'85 benchmarks.
-
On Path Delay Fault Testing of Multiplexer-Based Shifters [p. 20]
-
H. T. Vergos, Y. Tsiatouhas, Th. Haniotakis, D. Nikolos, and M. Nicolaidis
In this paper we present a method for path delay fault testing of multiplexer-based shifters.
We show that many paths of the shifter are non-robustly testable and we give a path selection
method such that all the selected paths are robustly testable by 20 * log2 n + 2 test-vector pairs,
where n is the length of the shifter. The propagation delay along all other paths is a function
of the delays along the selected paths.
-
A Test Vector Ordering Technique for Switching Activity Reduction during
Test Operation [p. 24]
-
P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch
This paper considers the problem of testing VLSI integrated circuits without exceeding their
power ratings during test. The proposed approach is based on the reordering of test vectors
of a given test sequence to minimize the average and peak power dissipation during test operation.
For this purpose, the proposed technique reduces the internal switching activity by lowering the
transition density at circuit inputs. The technique considers combinational or full scan sequential
circuits and does not modify the initial fault coverage. Results of experiments show reductions
of the switching activity ranging from 11 % to 66 % during external test application.
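The reordering idea can be sketched as follows: since the same vectors are applied, only in a different order, fault coverage of a combinational or full scan circuit is unchanged, while the number of input bit flips between consecutive vectors can drop. The nearest-neighbour heuristic below is an assumption made for illustration, not necessarily the ordering procedure used by the authors, and the vectors are hypothetical.

```python
def hamming(a, b):
    """Number of input bits that toggle when vector b follows vector a."""
    return sum(x != y for x, y in zip(a, b))


def total_transitions(seq):
    """Total input toggles over the whole test sequence."""
    return sum(hamming(a, b) for a, b in zip(seq, seq[1:]))


def greedy_order(vectors):
    """Keep the first vector, then repeatedly append the closest remaining one."""
    if not vectors:
        return []
    order = [vectors[0]]
    pending = list(vectors[1:])
    while pending:
        nxt = min(pending, key=lambda v: hamming(order[-1], v))
        pending.remove(nxt)
        order.append(nxt)
    return order


vectors = ["0000", "1111", "0001", "1110", "0011"]
reordered = greedy_order(vectors)
print(total_transitions(vectors), total_transitions(reordered))
```

Input toggles are only a proxy for internal switching activity; an actual flow would confirm the power reduction by simulation.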
-
VLSI Implementation of Early Branch Prediction Circuits for High
Performance Computing [p. 30]
-
Aamir A. Farooqui, Vojin G. Oklobdzija
In this paper, the design and VLSI implementation of an Early Branch Prediction (EBP) circuit,
based on a variation of the Carry Look-ahead scheme, is presented. The key features of this design
are low area, high speed (2[log n/2] + 1), and high modularity. This design outperforms all
the EBP designs presented so far. For 64-bit word length the early branch prediction is obtained
in 679 ps as simulated for 0.2-μm technology under typical conditions. Simulation and layout
results for 0.2-μm CMOS technology show a 30% increase in speed with 25% decrease in area as
compared to recently published results.
-
The Design of a Register Renaming Unit [p. 34]
-
Benjamin Bishop, Thomas P. Kelliher, Mary Jane Irwin
Register renaming is often used to improve performance in many high-ILP processors. However,
there is a lack of publications regarding register renaming hardware design. This paper
presents a detailed look at one possible implementation of a register renaming unit, as
well as some possible optimizations.
-
Efficient and Safe Asynchronous Wave-Pipeline Architectures for Datapath
and Control Unit Applications [p. 38]
-
O. Hauck, M. Garg, and S.A. Huss
This paper presents a generalization of a previously proposed asynchronous wave-pipeline
architecture. Four-phase and two-phase communication units supporting more than one wave
in the logic are proposed. General feedback structures are then outlined. Simulations from a
16-bit add-and-shift ring demonstrate their feasibility. The same architecture is applicable
for both datapath and control enabling the realization of complete high-throughput asynchronous
systems.
-
Memory Organization of a Single-Chip Video Signal Processing System
with Embedded DRAM [p. 42]
-
Jorg Hilgenstock, Klaus Herrmann, Peter Pirsch
A programmable single-chip multiprocessor system for video coding applications has been
developed. It integrates four processing elements, on-chip DRAM, and application-specific
interfaces. The integrated DRAM is primarily used as frame buffer and makes external memory
for most applications obsolete. For fast access to local data segments, static RAM is also
integrated in each processing element.
-
Theoretical Analysis of Word-Level Switching Activity in the Presence of
Glitching and Correlation [p. 46]
-
Janardhan H. Satyanarayana and Keshab K. Parhi
This paper presents a novel analytical approach to
compute the switching activity in digital circuits at the
word-level in the presence of glitching and correlation.
The proposed approach makes use of signal statistics
such as mean, variance, and autocorrelation. A novel
expression is derived for the switching activity a_f at
the output node f of an arbitrary circuit in terms of
time-slot autocorrelation coefficient, the expected value,
and the signal probability. The switching activity analysis
of a signal at the word-level is computed by summing
the activities of all the individual bits constituting
the signal. A novel relationship between the correlation
coefficient of the higher order bits of a normally distributed
signal and the bit where the correlation begins
is also presented. The proposed approach can estimate
the switching activity in less than a second which is
orders of magnitude faster than simulation based approaches.
Simulation results show that the errors using
the proposed approach are about 6% on an average and
that the approach is well suited even for highly correlated
speech and music signals.
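The statement that word-level activity is the sum of the activities of the individual bits can be checked empirically on a sample sequence. The sketch below simply counts toggles between consecutive samples, which is a simulation-style measurement and not the analytical estimator proposed in the paper; it also ignores glitching. The function names and sample data are hypothetical.

```python
def bit_activities(samples, width):
    """Per-bit toggle rate: fraction of consecutive sample pairs where the bit flips."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    pairs = len(samples) - 1
    acts = []
    for bit in range(width):
        toggles = sum(((a ^ b) >> bit) & 1 for a, b in zip(samples, samples[1:]))
        acts.append(toggles / pairs)
    return acts


def word_activity(samples, width):
    """Word-level activity as the sum of the bit-level activities."""
    return sum(bit_activities(samples, width))


samples = [0b0000, 0b0001, 0b0011, 0b0010, 0b0110]
print(bit_activities(samples, 4), word_activity(samples, 4))
```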
-
Adaptive Hard Disk Power Management on Personal Computers [p. 50]
-
Yung-Hsiang Lu, Giovanni De Micheli
Dynamic power management can be effective for designing low-power systems. In many systems,
requests are clustered into sessions. This paper proposes an adaptive algorithm that can predict
session lengths and shut down components between sessions to save power. Compared to other
approaches, simulations show that this algorithm can reduce power consumption in hard disks
with less impact on performance or reliability.
-
Inductance Effects in RLC Trees [p. 56]
-
Yehea I. Ismail, Eby G. Friedman, and Jose L. Neves
A closed form solution for characterizing voltage-based signals in an RLC tree is presented.
This closed form solution is used to derive figures of merit to characterize the effects of
inductance at a specific node in an RLC tree. The effective damping factor of the signal at a
specific node in an RLC tree is shown to be a useful figure of merit. As the effective damping
factor of a signal increases, an RC model is sufficiently accurate to characterize that waveform.
The rise time of the input signal driving an RLC tree is another factor characterizing the
importance of inductance. As the rise time of the input signal becomes much larger than the
effective LC time constant at a specific node within an RLC tree, the signal at this node does
not exhibit the effects of inductance. Evidence is provided showing that using a single line
analysis to determine the importance of including inductance to characterize a tree structured
interconnect line is invalid in many cases and can lead to erroneous conclusions.
-
S2P: A Stable 2-Pole RC Delay and Coupling Noise Metric [p. 60]
-
Emrah Acar, Altan Odabasioglu, Mustafa Celik, and Lawrence T. Pileggi
The Elmore delay is the metric of choice for performance-driven
design applications due to its simple, explicit form
and ease with which sensitivity information can be
calculated. However, for deep submicron technologies, the
accuracy of the Elmore delay is insufficient. In this paper,
we formulate a delay model using a provably stable two-pole
waveform response that provides a unique mapping
between four moments and a specific delay value. Unlike
traditional moment matching, this two-pole model permits
us to precharacterize the delays, and store them in a table,
as a mapped function of three parameters. The model also
provides an explicit expression for the peak noise induced
on a coupled line as a function of the same three moments.
The results indicate runtimes comparable to an Elmore
delay calculation but with the accuracy of an AWE
approximation.
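For reference, the Elmore delay that the abstract uses as its baseline can be computed on an RC tree with two traversals: one to accumulate the capacitance downstream of each branch, and one to add up resistance times downstream capacitance along the path from the source. The sketch below shows only this baseline metric, not the S2P two-pole model; the data layout (parent, res, cap dictionaries) and the example values are assumptions made for illustration.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay at every node of an RC tree.

    parent[n]: parent node of n, or None if n is driven directly by the source
    res[n]:    resistance of the branch from parent[n] (or the source) to n
    cap[n]:    capacitance to ground at node n
    """
    children = {n: [] for n in parent}
    roots = []
    for n, p in parent.items():
        (roots if p is None else children[p]).append(n)

    downstream = {}

    def total_cap(n):
        # Capacitance at n plus everything below it.
        downstream[n] = cap[n] + sum(total_cap(c) for c in children[n])
        return downstream[n]

    delay = {}

    def walk(n, upstream):
        # Each branch contributes R * (capacitance it has to charge).
        delay[n] = upstream + res[n] * downstream[n]
        for c in children[n]:
            walk(c, delay[n])

    for r in roots:
        total_cap(r)
        walk(r, 0.0)
    return delay


# Source -> a, with a branching to b and c (arbitrary consistent units).
parent = {"a": None, "b": "a", "c": "a"}
res = {"a": 1.0, "b": 2.0, "c": 4.0}
cap = {"a": 1.0, "b": 3.0, "c": 2.0}
delays = elmore_delays(parent, res, cap)
print(delays)
```

The recursion is fine for small trees; very deep nets would call for an iterative traversal.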
-
ICE: Incremental 3-Dimensional Capacitance and Resistance Extraction for
an Iterative Design Environment [p. 64]
-
Yanhong Yuan, Prithviraj Banerjee
In this paper, we discuss the 3-Dimensional (3-D) capacitance
and resistance extraction within an iterative design
environment, where small changes are made to the 3-D
structures. We present a bounded incremental algorithm
for accurate and fast 3-D extraction in such a design environment,
based on the Boundary Element Method (BEM).
The incremental algorithm can re-utilize the computation
results of previous extractions and rapidly re-compute the
new parasitic parameters in response to the design changes
made to the layout. The incremental algorithm has been implemented
in the ICE tool. Experimental results on a set of
3-D interconnect structures show that the incremental algorithm
is efficient for the iterative design methodology. For
one large structure, the incremental extraction is over 20
times faster than the full extraction without using the incremental algorithm.
To the best of our knowledge, this is the first reported work on an
incremental algorithm for capacitance and resistance extraction.
-
An Exact Analytical Time-Domain Model of Distributed RC Interconnects for
High Speed Nonlinear Circuit Applications [p. 68]
-
Ninglong Lu and Ibrahim N. Hajj
Accurate simulation of interconnect effects is an increasingly critical step in high speed
deep submicron design. With ever increasing frequency of digital/analog signals, the traditional
lumped RC elements may not be accurate enough in modeling RC interconnects in VLSI applications
due to the distributed nature of realistic interconnects. In this paper a novel analytic time-domain
model for distributed RC interconnects is developed for application in nonlinear circuit simulators.
The exact analytical solution is derived under the assumption of piecewise-linear signal waveforms
at the two ports of the line. We have incorporated this model into a general-purpose circuit
simulator using the SWEC technique.
-
A Radix-16 SRT Division Unit with Speculation of the Quotient Digits [p. 74]
-
Gianluca Cornetta, Jordi Cortadella
The speed of a divider based on a digit-recurrence algorithm depends mainly on the latency of
the quotient digit generation function. In this paper we present an analytical approach that
extends the theory developed for standard SRT division and permits the implementation of division schemes
where a simpler function speculates the quotient digit. This leads to division units with shorter
cycle time and variable latency since a speculation error may be produced and a post-correction of
the quotient may be necessary. We have applied our algorithm to the design of a radix-16
speculative divider for double precision floating point numbers, which proved to be faster than analogous implementations.
-
Area-Efficient Area Pad Design for High Pin-Count Chips [p. 78]
-
Louis Luh, John Choma, Jr., and Jeffrey Draper
This paper presents an area pad layout method to efficiently reduce the space required for
interconnection pads and pad drivers. Unlike peripheral pads, area pads use only the top
metal layer and therefore allow active circuitry to be laid out underneath. With identical
functional elements grouped together, a group of pad drivers share the same well and can be
placed tightly together. The use of silicided diffusion reduces the well contact to diffusion
contact spacing requirement. By taking advantage of this spacing requirement and using
serpentine gate layout, a driver's size can be effectively reduced without reducing the
driving capacity. An embedded multicomputer router interface chip has been implemented
using these techniques and has achieved 554 pads in a 9 mm x 6 mm chip with a 0.8 µm single-poly
3-metal N-well CMOS process.
-
New 2 Gbit/s CMOS I/O Pads [p. 82]
-
Guido Masera, Gianluca Piccinini, Massimo Ruo Roch and Maurizio Zamboni
A couple of low complexity high performance input
and output pads are proposed: they have been designed
in 0.7 µm CMOS ES2 technology and support bit rates
ranging from DC up to 2 Gbit/s. The differential input
pad and the differential output pad interface true PECL
external logic levels to full swing 5V CMOS internal
levels.
-
A Methodology for Minimizing Power Dissipation of Embedded Systems
through Hardware/Software Partitioning [p. 86]
-
Jorg Henkel
We present a novel approach that minimizes the power dissipation
of embedded core-based systems through hardware/software partitioning.
Our approach is based on the
idea of mapping clusters of operations/instructions to a core
that yields a high utilization rate of the involved resources
(ALUs, multipliers, shifters etc.) and thus minimizing power
dissipation. Our approach is comprehensive since it takes
into consideration the power dissipation of a whole embedded
system comprising a microprocessor core, application
specific (ASIC) core(s), cache cores and a memory core. We
report high reductions of power dissipation between 35%
and 94% at the cost of a relatively small additional hardware
overhead of less than 16k cells while maintaining or
even slightly increasing the performance compared to the
initial design.
-
On Optimizing Test Strategies for Analog Cells [p. 92]
-
Anna M. Brosa and Joan Figueras
The purpose of this paper is to analyze an optimization
method to improve the testability of structural defects,
such as bridges and opens, in
low-power low-voltage analog circuits. The approach
consists of finding an optimum subset of tests which
maximizes the fault coverage with minimum cost. An
application example is given to illustrate the proposal
by studying the fault coverage obtained using different
test sets on a simple 2-stage Nested Transconductance
Capacitance Compensated (NGCC) amplifier.
-
Novel Design for Testability of a Mixed-Signal VLSIC [p. 97]
-
E. McShane, K. Shenai, L. Alkalai, E. Kolawa,
V. Boyadzhyan, B. Blaes, and W.C. Fang
A novel testability architecture has been developed for a mixed-signal VLSIC which has a
functional architecture consisting of a microprocessor core, RF transceiver, and two voltage
regulators. It permits a decoupling of analog/RF, digital, and power systems for individual
stimulation and analysis. Testing may be performed at the subsystem or block level, and
traditional scan techniques are augmented to allow mixed static and dynamic test. This
approach aids in identifying any detrimental interaction between individual subsystems
by providing isolation between the circuit-under-test and idle circuits.
-
The Development of Analog SPICE Behavioral Model Based on IBIS Model [p. 101]
-
Ying Wang, Han Ngee Tan
This paper presents an approach for building an analog
SPICE behavioral model based on the information
provided by IBIS model. Such analog SPICE behavioral
model can describe both static and dynamic
characteristics of I/O buffers. The method to extract
dynamic information from IBIS switching waveform VT
tables is discussed in detail. Two types of models can be
generated depending on the availability of the waveform
tables with different load conditions in IBIS data. The
influence of waveform table load condition on the
validity of the analog SPICE behavioral model is also
investigated.
-
Fault Coverage Estimation for Early Stage of VLSI Design [p. 105]
-
Von-Kyoung Kim, Tom Chen, Mick Tegethoff
This paper proposes a new fault coverage estimation model which can be used in the early
stage of VLSI design. The fault coverage model is an exponentially decaying function with
three parameters, which include the fault coverage upper bound, UB, the fault coverage
lower bound, LB, and the rate of fault coverage change, α. The fault coverages using
three different testing scenarios, which are no DFT, scan, and IDDQ testing, are predicted
using circuit design information, such as gate count, I/O count, and FF count. These
parameters are often readily available at the early stage of VLSI design. Finally, the
composite fault coverage is estimated by combining different fault coverages. Experimental
results showed a 1.9% model estimation error with given circuit information in the early
design.
-
Pseudo-Exhaustive Testing of Sequential Circuits [p. 109]
-
Bassam Shaer, Sami A. Al-Arian, David Landis
A new sequential circuit partitioning algorithm is introduced which enhances pseudo-exhaustive
testing. Our PIFAN algorithm is based on an analysis of Primary Input cones and FANout values.
Results are presented which show that PIFAN offers significant reductions in hardware
overhead and test time when compared to alternative partitioning algorithms.
-
Self-Assembly Based Approaches for Metal/Molecule/Semiconductor
Nanoelectronic Circuits [p. 114]
-
D.B. Janes, R.P. Andres, E.H. Chen, J. Dicke, V.R. Kolagunta,
J. Lauterbach, T. Lee, J. Liu, M.R. Melloch, E.L. Peckham,
T. Pletcher, R. Reifenberger, H.J. Ueng, B.L. Walsh,
J.M. Woodall, C.P. Kubiak, and B. Kasibhatla
This paper describes a technological approach which combines the nanoscale elements available
from molecular devices and self-assembled molecular/nanoparticle systems with semiconductor
devices which can provide the gain or bistability required for computational functionality.
The architectural motivation for these configurations and experimental demonstrations of several
key technologies for this hybrid approach are described.
-
Logic in Wire: Using Quantum Dots to Implement a Microprocessor [p. 118]
-
Michael T. Niemier, Peter M. Kogge
Despite the seemingly endless upwards spiral of modern VLSI technology, many experts are
predicting a hard wall for CMOS in about a decade. Given this, researchers continue to
look at alternative technologies, one of which is based on quantum dots, called quantum
cellular automata. While the first such devices have been fabricated, little is known
about how to design complete systems of them. This paper summarizes one of the first such
studies, namely an attempt to design a complete, albeit simple, CPU in the technology.
The projections are striking: a projected 10 to 1 increase in circuit density when compared
to a CMOS equivalent, but a design approach which is radically different from conventional
"logic" design, especially in timing considerations.
-
Why is Time-Varying Control Necessary for Signal Processing with Locally-
Connected Quantum-Dot Arrays? [p. 122]
-
Árpád I. Csurgay, Craig S. Lent, and Wolfgang Porod
(Extended Abstract)
-
Resonant Tunneling Technology for Mixed Signal and Digital Circuits in
the 10-100 GHz Domain [p. 123]
-
T.P.E. Broekaert, B. Brar, F. Morris, A.C. Seabaugh, and G. Frazier
The inherent bistability and picosecond time-scale switching of the resonant tunneling diode
(RTD) provides an ideal element for the design of digital circuits and analog signal quantizers
in the 10-100 GHz domain. New differential RTD-based circuits for quantizers and a first-order
Sigma-Delta modulator capable of operating at 10 GHz and beyond are introduced.
-
Efficient Algorithms for Finding Highly Acceptable Designs Based on
Module-Utility Selections [p. 128]
-
Chantana Chantrapornchai, Edwin H.-M. Sha, Xiaobo (Sharon) Hu
In this paper, we present an iterative framework to solve module selection problem under
resource, latency, and power constraints. The framework associates a utility measure with
each module. This measurement reflects the usefulness of the module for a given design
goal. Using modules with high utility values will result in superior designs. We propose
a heuristic which iteratively perturbs module utility values until they lead to good module
selections. Our experiments show that the module selections formed by combinations of
modules with high utility values are superior solutions. Further, by keeping modules with
high utility values, the module exploration space can drastically be reduced.
-
Reducing BDD Size by Exploiting Structural Connectivity [p. 132]
-
Ronnie L. Wright, Michael A. Shanblatt
Computer-aided design tools have been limited by the use of the Binary Decision Diagram (BDD).
The major drawback of the BDD is its abundant usage of CPU time and memory. Techniques such as
BDD variable ordering and sharing have been used in the past to address the size issue. However,
these techniques remain limited to modest-sized circuits. In this paper, we present a
significant variation to the conventional BDD, the Connective Binary Decision Diagram (CBDD).
The CBDD addresses the size issue concerning conventional BDD implementations by employing the
use of minimized-scalable binary decision diagrams (MSBDDs) combined with the structural
connectivity present in the circuit's netlist. The experimental results section will demonstrate
that the proposed method reduces the BDD size by more than two orders of magnitude for large
circuits.
-
An Integrated Approach for Synthesizing LUT Networks [p. 136]
-
Shigeru Yamashita, Hiroshi Sawada, Akira Nagoya
This paper presents a method for synthesizing look-up table (LUT) networks. The strategy
employed by our method is very different from the strategies of previous methods; many
decomposition methods that are not only algebraic but also functional are integrated very well.
Our method can be thought of as a general framework for LUT network synthesis integrating
various decomposition methods. The experimental results are very encouraging.
-
Hierarchical Scheduling in High Level Synthesis Using Resource Sharing
Across Nested Loops [p. 140]
-
Abhijit Ghosh, Sandeep K. Lodha, Ranga Vemuri
This paper presents a resource-constrained scheduling algorithm for hierarchical behavioral
specifications containing nested loops. The algorithm attempts to share resources across
levels, to schedule operations that belong to different levels of the nested loop structures
in the specifications as well as operations that belong to the same level. We compare the
results of scheduling using our algorithm with those obtained using traditional list scheduling
with no sharing of resources among different levels of the specification. These results show
an average improvement of 23.47% in terms of number of control steps.
-
Design Issues in the Synthesis of Reusable Cores [p. 144]
-
Rohit Sharma and C. P. Ravikumar
While core-based design is itself a challenging task, it is
equally challenging for a core vendor to provide
information about a core without compromising on the
protection of intellectual property. A number of issues are
to be taken into consideration when designing a core.
While conventional goals such as minimal area and
maximal performance continue to hold, additional
constraints such as core testability and power dissipation
will have to be considered. Since the vendor of a core
does not reveal details about the internals of the core, it is
often the responsibility of the vendor to provide the test
plan for the core. In this paper, we present our
experiences in designing a testable CORDIC core.
Keywords: Embedded Cores, Design Reuse, CORDIC
Arithmetic, and Core Testability.
-
Ultrahigh-Speed Circuits Using Resonant Tunneling Devices [p. 150]
-
M. Yamamoto, H. Matsuzaki, T. Itoh, T. Waho, T. Akeyoshi, and J. Osaka
Ultrahigh-speed circuit applications of resonant tunneling diodes (RTDs) have been developed.
One of the key concepts is the merged utilization of RTDs and high electron mobility transistors
(HEMTs). The integration technology for InP-based RTDs and HEMTs has been developed. Another
key technology developed is a circuit configuration using series-connected RTDs, driven by
a clocked bias, in combination with HEMTs. Given this circuit concept, various kinds of
edge-triggered flip-flop circuits and multiple-valued quantizers featuring high-speed
operation and compact configuration have been constructed. By extending this circuit
concept, an optoelectronic circuit using RTDs and a photodiode has also been developed.
High-speed operations have been demonstrated, including a delayed flip-flop circuit
operating at 35 Gbit/s, multiple-valued quantizers operating at 10 GHz, a 2-bit analog-to-digital
converter operating at 5 GHz and an optoelectronic circuit that demultiplexes an
80 Gbit/s optical signal into a 40 Gbit/s electrical signal. The presented results clearly
show the potentiality of RTD-based circuits for the construction of unprecedented ultra high-speed
communications and signal processing circuits.
-
A Novel High-Speed Flip-Flop Circuit Using RTDs and HEMTs [p. 154]
-
Hideaki Matsuzaki, Toshihiro Itoh, and Masafumi Yamamoto
An RTD (resonant tunneling diode)-based flip-flop circuit with a new configuration is
proposed. The circuit features an SCFL interface for both input and output, and achieves
high-speed operation with a simplified configuration. The circuit consists of only two
RTDs and three HEMTs, and works as a delayed flip-flop (D-FF) with return-to-zero (RZ)
mode output. 50 Gbit/s operation is confirmed by SPICE simulation for the SCFL-interfaced
D-FF with the proposed configuration. A static binary frequency divider (T-FF) is also
designed based on the same concept. It is fabricated by InP-based RTD/HEMT integration
technology, and its proper operation of up to 15 GHz is confirmed experimentally.
-
Design and Analysis of a Novel Quantum-MOS Sense Amplifier Circuit [p. 158]
-
Tetsuya Uemura, Pinaki Mazumder
A novel quantum-MOS sense amplifier circuit consisting of resonant tunneling diodes (RTDs)
as pull-up devices and NMOS transistors is discussed in this paper. Compared to the conventional
sense amplifier circuits using CMOS technology, the proposed QMOS sense amplifier exhibits about
20% higher sensing speed. The cross-coupled QMOS latch, which is at the heart of the sense amplifier
circuit, has metastable and unstable states which are closely related to the I-V characteristics
of the RTDs. The stability analysis has been made by using phase-plot diagram and how RTD
parameters relate to circuit speed and robustness of the sense amplifier has been discussed.
-
Integration of InAs/AlSb/GaSb Resonant Interband Tunneling Diodes with
Heterostructure Field-Effect Transistors for Ultra-High Speed Digital
Circuit Applications [p. 162]
-
P. Fay, G.H. Bernstein, D. Chow, J. Schulman, P. Mazumder, W. Williamson, and B. Gilbert
Resonant tunneling diode based logic circuits offer significant advantages for low power,
ultra-high-speed applications. In this work, a low-power resonant interband tunneling diode
(RITD)-based logic technology capable of operating at clock rates of at least 12 GHz is reported.
The circuits are fabricated using InAs/AlSb/GaSb RITDs. Fanout of at least two at a clock
rate of 10 GHz is also reported for two AND gates in a two-stage pipelined configuration.
Simulation results for an RITD/HFET circuit based on measured characteristics of InAs/AlSb/GaSb
RITDs and InAs-channel HFETs for a simple inverting Schmitt trigger are presented to demonstrate
the advantages of an integrated RITD/HFET technology. This circuit architecture demonstrates
proper operation with power supply voltages as low as 0.5 V. In addition, well defined logic
levels and abrupt logic transitions are achieved, despite the limited transconductance and
large output conductance typical of InAs-channel HFETs.
Keywords:
resonant tunneling diode (RTD), resonant interband tunneling diode (RITD),
heterostructure field-effect transistor (HFET), ultra-high-speed logic circuits
-
A Memory Design in QCAs Using the SQUARES Formalism [p. 166]
-
D. Berzon and T.J. Fountain
We present a formalism for implementing circuits with Quantum-dot Cellular Automata (QCA),
comprising a set of standard circuit elements with uniform layout rules. The formalism
simplifies circuit design from an engineering perspective and overcomes an observed sensitivity
of QCA systems to input delays. A design for an addressable shift register is implemented, and
promises considerable density gains over conventional CMOS.
-
Transistor Level Synthesis for Static CMOS Combinational Circuits [p. 172]
-
Chia-Pin R. Liu, Jacob A. Abraham
This paper introduces a novel framework to synthesize static CMOS circuits at the transistor
level. A new class of binary decision diagrams (BDDs) which represent inverting Boolean
functions, called Transistor Mapped BDDs (TM-BDDs), is used in the synthesis process. There
is a one-to-one correspondence between a transistor netlist and its TM-BDD. Nodes in a
TM-BDD represent gate inputs and the edges represent the transistors in the netlist. TM-BDDs
can be optimized using BDD operations, and the data structure can retain device aspect ratios
and geometries for performance optimization. The synthesis process involves a transformation
from logic functions to transistor netlists using TM-BDDs. We show how a transistor netlist
can be automatically generated during a depth-first traversal on a TM-BDD. The synthesis
process is not only independent of any library, but also capable of generating a cell
library for a particular circuit. Experimental results demonstrating the reduction of
transistor counts are presented.
-
SINMEF-A Decomposition Based Synthesis Tool for Large FSMs [p. 176]
-
Carlos Humberto Llanos Quintero and Marius Strum
This paper describes the SINMEF environment, composed of the DECMEF and the SIS [9] systems,
used to synthesize large finite state machines (FSMs). The DECMEF system consists of a set
of tools to decompose a FSM into a set of cooperating sub-FSMs. An efficient cost function
is used to guide the decomposition process. The decomposed FSMs are state encoded and further
optimized and technology mapped using tools from the SIS system. Results obtained for FSMs
with more than 1000 states showed an improvement of as much as 60.42% in critical path
and 14.79% in area. Preliminary results show that the recursive use of the decomposition
system extends its application to FSMs with several thousands of states.
Keywords:
FSM, decomposition, non-deterministic transitions, redundant transitions, clustering technique.
-
An Approach for Testing Safety-Critical Software [p. 180]
-
Weiwei Li, Zhongwei Xu, Yan Jin
A novel approach for testing the effectiveness, efficiency, safety and relative
appropriateness of Computer Interlocking Software (CIS), a kind of safety-critical
software, is presented with a software platform developed to support this approach. A
brief description of the proposed approach is also included.
Key Words
Safety-Critical Software, Failure Severity Level, Failure Frequency, Software Safety Integrity
Level, Software Validation
-
Design Recovery for Incomplete Combinational Logic [p. 184]
-
Travis E. Doom, Anthony S. Wojcik, Moon-Jung Chung
Motivated by the problem of reengineering legacy digital circuits for which design information
is missing or incomplete, this paper presents a new technique for representing the relationships
among the internal components of a combinational circuit. This technique proves to be a powerful
tool for redesign, capable of representing internal Boolean relationships in a fully or partially
specified multiple-output combinational circuit with a single data structure.
-
Regression-Based Macromodeling for Delay Estimation of Behavioral
Components [p. 188]
-
A. Macii, E. Macii, G. Odasso, M. Poncino, and R. Scarsi
This paper presents a methodology for delay estimation of hardware components described
at the behavioral-level. The basis of the proposed technique is a well-known theoretical
result that relates the entropy of a logic function to the delay of a multi-level
implementation of the same function. We propose an improved model for delay estimation, and
we prove its validity by means of experiments performed on a set of standard benchmarks.
-
Efficiently Searching the Optimal Design Space [p. 192]
-
Stephen A. Blythe and Robert A. Walker
One of the primary advantages of a high-level synthesis
system is its ability to explore the design space. This
paper presents several methodologies for design space
exploration that compute all optimal tradeoff points for
the combined problem of scheduling, clock length
determination, and module selection. We discuss
how each methodology takes advantage of both the
structure within the design space itself as well as the
structure of, and interaction between, each of the three
subproblems.
-
A Bandpass Sigma-Delta for Software Low-Power and Low-Voltage Radio by
Using PATH Technique [p. 198]
-
Yiu (Simon) Wu, John Ling and Ward J. Helms
This paper proposes a PArallel Two patH (PATH) technique for oversampled bandpass
analog-to-digital converters in low-power and low-voltage environments to relax the settling
requirement and to increase signal-to-noise ratio. The design considerations for the
implementation are evaluated, along with strategies to overcome the possible problems. The converter is clocked
at 20MHz and digitizes a 200kHz bandwidth signal centered at 10MHz with 87dB Signal-to-Noise
Ratio (SNR) while suppressing the undesired mirror image signal by 40dB with a 1.8-V supply voltage.
-
No-Race Charge-Recycling Differential Logic (NCDL) [p. 202]
-
Seung-Moon Yoo and Sung-Mo (Steve) Kang
This paper describes No-race Charge-recycling
Differential Logic (NCDL) which realizes low
power computation with less sensitivity to input
signal skews. Performance comparison with
previous charge recycling logics is shown for a
2-input NAND logic. NCDL operates in push-pull
mode and achieves about 35% improvement in
power-delay product over full swing differential
logic without the pre-evaluation problems. Thus,
it shows increased effectiveness for the implementation
of random logic with input signals
arriving in an arbitrary sequence.
-
Linear Transconductors Using Low Voltage Low Power Square-Law CMOS Cells [p. 206]
-
Tuna B. Tarim and Mohammed Ismail
Two transconductors composed of two square-law CMOS cells are introduced in this paper.
The analysis of the cells is given. The transconductors operate in the saturation region with
a fully balanced input signal. Simulations were done for 0.8μm n-well process using BSIM3
model parameters. The first circuit has a trade-off between low voltage operation and low
power dissipation. The circuit has a cutoff frequency of 170MHz and Pdis=1.17mW for a bias
current of 120μA. The second transconductor aims to overcome the trade-off and to
improve the performance; the circuit has a cutoff frequency of 236MHz and Pdis=1.74mW for the
same bias current. However, it is possible to reduce the bias current, since the trade-off is overcome.
The transconductors have a THD of less than -56dB and -60dB, respectively, for a 1MHz, 0.5V peak-to-peak
sinusoidal input. A comparison between the two circuit performances is given.
-
Current Sensor on the Base of Permanent Pre-Chargeable Amplifier [p. 210]
-
Victor Varshavsky, Masayuki Tsukisaka
The sensitivity and delay of the amplifier are the key problems in the performance of Current Sensors (CS).
For large devices which consist of several cells, for example 32bit, the amplifier must react to
10-20mV.
The previous type of highly sensitive amplifier which is based on cascade and reference voltage[dill]
can react to this level of voltage. But this model is not stable in respect to technological and parametric
variation. In this paper, we suggest a triple cascade of inverters with feedback through an unsymmetrical pass
transistor, which amplifies 1mV without a reference voltage. Monte-Carlo SPICE simulation shows the
stability of this model under parametric variation. We prepare the schematic of a CS which includes
a control unit with shunt transistors and evaluate the delay.
-
Parallel Saturating Fractional Arithmetic Units [p. 214]
-
Navindra Yadav, Michael Schulte, John Glossner
This paper describes the designs of a saturating adder, multiplier, single MAC unit, and dual MAC unit
with one cycle latencies. The dual MAC unit can perform two saturating MAC operations in parallel
and accumulate the results with saturation. Specialized saturation logic ensures that the output
of the dual MAC unit is identical to the result of the operations performed serially with saturation
after each multiplication and each addition.
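The serial reference behaviour that the dual MAC unit is said to match can be sketched in software. This is only an illustrative model, not the authors' hardware: the Q15 operand format, the 32-bit accumulator, and all function names are assumptions made for the example.

```python
# Assumed formats (not stated in the abstract): Q15 operands, 32-bit Q31 accumulator.
ACC_MAX = 2**31 - 1
ACC_MIN = -2**31

def saturate(x, lo=ACC_MIN, hi=ACC_MAX):
    """Clamp x into [lo, hi] instead of letting it wrap around."""
    return max(lo, min(hi, x))

def sat_add(a, b):
    """Saturating addition of two accumulator-width values."""
    return saturate(a + b)

def sat_mul_q15(a, b):
    """Fractional multiply: Q15 x Q15 gives Q30, shifted to Q31, then saturated."""
    return saturate((a * b) << 1)

def sat_mac(acc, a, b):
    """One multiply-accumulate with saturation after the multiply and the add."""
    return sat_add(acc, sat_mul_q15(a, b))

def dual_mac_serial(acc, a0, b0, a1, b1):
    """Two MACs performed one after the other, saturating at every step."""
    return sat_mac(sat_mac(acc, a0, b0), a1, b1)
```

For example, `dual_mac_serial(0, -32768, -32768, -32768, 32767)` saturates the first product (-1.0 times -1.0) to `2**31 - 1` before the second product is added, giving 65535; summing the exact products first would give 65536, which is why the order of saturation matters.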
-
Residue Arithmetic Circuits Based on Signed-Digit Number Representation
and the VHDL Implementation [p. 218]
-
Shugang Wei, Kensuke Shimizu
Residue arithmetic circuits based on radix-2 signed-digit (SD) number representation, using integers
2^p and 2^p ± 1 as moduli of the residue number system (RNS), are presented. The modulo m addition,
m = 2^p or m = 2^p ± 1, is performed by a carry-free SD adder and the modulo m multiplier is constructed using
a binary modulo m SD adder tree. The implementation for the residue arithmetic circuits with VHDL
description is proposed. The modulo m adders and multipliers have about 530 and 5000 gates, respectively,
in cases of m = 2^16 ± 1.
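Moduli of the form 2^p and 2^p ± 1 are attractive because the modular reduction after an addition is cheap. The sketch below shows this with ordinary binary integers; it does not model the signed-digit representation or the carry-free adder of the paper, and the function names are hypothetical.

```python
def add_mod_2p(a, b, p):
    """Addition modulo 2^p: simply drop the carry out of bit p-1."""
    return (a + b) & ((1 << p) - 1)

def add_mod_2p_minus_1(a, b, p):
    """Addition modulo 2^p - 1 using an end-around carry."""
    m = (1 << p) - 1
    s = a + b
    s = (s & m) + (s >> p)      # fold the carry-out back into the low bits
    return 0 if s == m else s   # the all-ones pattern is a second encoding of 0

def add_mod_2p_plus_1(a, b, p):
    """Addition modulo 2^p + 1 with a single conditional subtraction."""
    m = (1 << p) + 1
    s = a + b
    return s - m if s >= m else s
```

Each function assumes its inputs are already reduced, that is, in the range 0 to m - 1.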
-
Model Evaluation Using Genetic Manipulation Techniques [p. 224]
-
Z. Stamenkovic, H.-Ch. Dahmen, and U. Glaeser
Formal verification is an important area in industry that is getting more and more attention.
The growing complexity of digital circuits and their use in safety-critical systems are the reasons
for the need of tools for checking the correctness of designs.
In this paper we present a new approach for model evaluation. With our approach we are able to
increase the belief of a designer in the right functionality of a circuit without the long runtimes
of classical model checking but with more reliability than testing a design via simulation with some
input patterns.
To achieve this goal we use our genetic manipulation technique: a combination of classical genetic
algorithms with a goal oriented mutation operator.
-
A Genetic Algorithm for Register Allocation [p. 226]
-
K.M. Elleithy and E.G. Abd-El-Fattah
In this paper we introduce a new genetic algorithm for register allocation. A merge operator is used
to generate new individual solutions. The number of steps required to examine all pairs in the population
matrix is n^2 (n is the population matrix size). Generating an offspring from
the parents needs m steps (m is the number of nodes). The total number
of steps required by the algorithm is n^2 m, that is, the genetic algorithm has a linear time complexity
in terms of number of nodes. The experimental results show optimal solutions in many of the graphs used
for testing.
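The step-count argument can be checked with a few lines of arithmetic. The function below only restates the count given in the abstract (all pairs of an n-entry population, m steps per offspring); its name is hypothetical and it is not the algorithm itself.

```python
def merge_steps(n, m):
    """Total steps: n^2 pair examinations, each producing an offspring in m steps."""
    pair_examinations = n * n
    return pair_examinations * m
```

With the population size n held fixed, doubling the number of nodes m doubles the count, which is the sense in which the complexity is linear in the number of nodes.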
-
Congestion Mitigation during Placement [p. 228]
-
Kanad Chakraborty and Natesan Venkateswaran
High post-placement congestion in complex ASICs and microprocessors may pose severe constraints
on the wiring resources, thereby causing wireability, timing and noise problems. Linear wirelength-based
mincut partitioning algorithms have some built-in advantages for reducing congestion. We
present a mathematical model of congestion and experimentally investigate various congestion
mitigation techniques used in conjunction with linear wirelength-based placement. The experimental
results validate our congestion model. Our placement tool, CPlace©, is a clustering-based mincut
partitioner that optimizes a linear wirelength objective.
-
A Spiffy Tool for the Simultaneous Placement and Global Routing for
Three-Dimensional Field-Programmable Gate Arrays [p. 230]
-
John Karro and James P. Cohoon
FPGAs are a useful and flexible alternative to custom design chips, but can suffer from severe
interconnection delay. The 3D-FPGA is an alternative to the two-dimensional architecture that has
been proposed to reduce these delay problems [2]. Here we present Spiffy - the first tool
specifically designed for the placement and global routing of 3D-FPGAs. Spiffy produces some of
the best results in the literature, and using Spiffy, we can show that when mapped to the 3D-FPGA
architecture, circuits tend to have considerably shorter net-length, making this new chip an
improvement over the standard architecture.
-
Formal Verification of Tree-Structured Carry-Lookahead Adders [p. 232]
-
Sae Hwan Kim, Shiu-Kai Chin
Quad trees, trees with four branches, are used to abstractly describe tree-structured carry-lookahead
adders using 4-bit components. The specification and implementation descriptions are parameterized and
describe tree-structured adders having arbitrarily large inputs and outputs. The descriptions are formally
verified using the HOL theorem prover.
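As a rough functional illustration of a quad-tree adder (not the HOL specification used in the paper), the sketch below splits the operands into four quarters recursively and combines group generate and propagate signals. The bit-list helpers and function names are assumptions, and the model is checked by testing rather than by proof.

```python
def to_bits(x, n):
    """Least-significant-bit-first list of the n low bits of x."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))

def quad_add(a_bits, b_bits, cin):
    """Return (sum_bits, group_generate, group_propagate); length must be a power of 4."""
    n = len(a_bits)
    if n == 1:
        a, b = a_bits[0], b_bits[0]
        return [a ^ b ^ cin], a & b, a ^ b
    q = n // 4
    sums, g, p, c = [], 0, 1, cin
    for i in range(4):                       # least significant quarter first
        s, gi, pi = quad_add(a_bits[i*q:(i+1)*q], b_bits[i*q:(i+1)*q], c)
        sums += s
        c = gi | (pi & c)                    # lookahead carry into the next quarter
        g, p = gi | (pi & g), pi & p         # group signals for the parent node
    return sums, g, p

def add(a, b, n=16):
    """Add two n-bit numbers; the carry-out with cin = 0 is the top-level generate."""
    sums, g, _ = quad_add(to_bits(a, n), to_bits(b, n), 0)
    return from_bits(sums) | (g << n)
```

The same recursion works for any width that is a power of four, which mirrors the parameterized descriptions mentioned in the abstract.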
-
Bounding Algorithms for Design Space Exploration [p. 234]
-
Samit Chaudhuri, Robert A. Walker
This paper describes several new algorithms for computing lower bounds on the length of the schedule
and the number of functional units in high-level synthesis.
-
Digital Neural Processing Unit for Electronic Nose [p. 236]
-
Hoda S. Abdel-Aty-Zohdy and Mahmoud Al-Nsour
In a biological nose, the environment usually suggests a number of common odors. The classification
process checks sensed information against existing knowledge. This similarity with Reinforcement Learning
neural networks suggests challenging implementation problems.
A VLSIC digital design and implementation of a Reinforcement Artificial Neural Network (RANN) for chemical
classification in an electronic nose is presented. The chip is designed to classify chemical gases among
four possible volatile organic compounds. The system consists of four neurons and twelve synapses [1].
A neuron has been implemented on a tiny chip, using 2.0μm n-well CMOS technology, at Orbit Semiconductors,
through the MOSIS facilities. Simulation results demonstrated proper operation. Stand alone experiments
are satisfactory, with off-chip weight storage and weight update. Electronic nose system testing is
currently under way.
-
A Low Power Charge-Recycling CMOS Clock Buffer [p. 238]
-
Xiaohui Wang and Wolfgang Porod
A low power CMOS clock buffer based on charge recycling technique is presented. To accomplish the
charge recycling process and avoid introducing the extra short circuit current during the recycling
phase, an extra switching circuit and control signal are utilized to keep inverters momentarily tri-state.
The feasibility of this design and its improved power efficiency are demonstrated by simulations.
-
A Multiple-Input Single-Phase Clock Flip-Flop Family [p. 240]
-
Richard F. Hobson and Allan R. Dyck
The design of a versatile CMOS semi-static true single-phase clock flip-flop family is presented. It
naturally supports multiple, multiplexed, inputs. Asynchronous Set/Reset are easily implemented. Switching
power is lower than for some other semi-static flip-flop techniques.
-
Methodology of Logic Synthesis for Implementation Using Heterogeneous
LUT FPGAs [p. 242]
-
I. Lemberski
A logic synthesis method for implementation on heterogeneous LUT FPGAs is proposed. As an example, the XILINX4000
architecture is considered. The method takes XILINX4000 architectural features (heterogeneous LUTs of 3
and 4 inputs) into account and includes a two-step decomposition. In the first step, the two-level logic
representation is transformed into a graph of nodes with at most 4 fanins (after this step, each node can be
mapped onto a 4-input LUT). In the second step, selected 4-fanin nodes are re-decomposed into 3-fanin nodes
to ensure mapping onto 3-input LUTs. The re-decomposition task is formulated as substituting exactly one
fanin for two fanins of a node.
-
VHDL Design of a Test Processor Based on Mixed-Mode Test Generation [p. 244]
-
Md. Altaf-Ul-Amin and Zahari Mohamed Darus
This paper presents the VHDL design of a prototype test processor, which can be used for functional
testing of digital ICs. The design allows the test processor to be controlled by a microcomputer.
The processor can generate mixed-mode (pseudo-random followed by deterministic) test vectors and can apply them
to the circuit under test (CUT). The test processor also receives the output responses of the CUT and compresses
them to a signature. The signature is then sent to the computer for comparison. The test processor supports
the testing of combinational as well as sequential circuits (with scan-path).
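The flow described here (pseudo-random vectors, then deterministic ones, with responses compressed into a signature) can be modelled in a few lines. This is a software sketch under assumptions: the LFSR width, tap positions, and function names are illustrative choices, and the paper's VHDL design may differ.

```python
def lfsr_patterns(seed, taps, width, count):
    """Fibonacci LFSR: return `count` successive `width`-bit states, starting at seed."""
    state, mask, out = seed, (1 << width) - 1, []
    for _ in range(count):
        out.append(state)
        fb = 0
        for t in taps:                      # XOR of the tapped state bits
            fb ^= (state >> t) & 1
        state = ((state << 1) | fb) & mask
    return out

def mixed_mode_vectors(seed, taps, width, n_random, deterministic):
    """Pseudo-random vectors followed by a list of deterministic vectors."""
    return lfsr_patterns(seed, taps, width, n_random) + list(deterministic)

def misr_signature(responses, taps, width):
    """Compress a response stream into one signature (multiple-input signature register)."""
    sig, mask = 0, (1 << width) - 1
    for r in responses:
        fb = 0
        for t in taps:
            fb ^= (sig >> t) & 1
        sig = (((sig << 1) | fb) ^ r) & mask
    return sig
```

With a 4-bit register and taps at bits 3 and 2, the generator cycles through all 15 non-zero states; in use, the signature of the CUT responses would be compared with the signature of a known-good circuit.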
-
An Incremental Floorplanner [p. 248]
-
Jim Crenshaw, Majid Sarrafzadeh, Prithviraj Banerjee, Pradeep Prabhakaran
One of the foremost problems in physical design for deep-submicron circuits is the need for
estimates that depend on future decisions. Estimation of area, timing, and coupling are required.
We propose a novel floorplanner, with a new wiring metric which can be updated quickly in small
increments. This provides tools with a way to influence the floorplan as they make changes without
a large running time penalty. We provide experimental results that show the incremental approach
to be generally 5 times faster than full floorplanning while maintaining good estimates.
-
A Greedy Router with Technology Targetable Output [p. 252]
-
R. Balakrishnan and R.F. Hobson
Our objective was to integrate an effective channel routing algorithm with the Chip Design
Language (CDL) algorithmic layout tool. CDL uses technology targetable layout techniques,
so that the output of the routing algorithm can easily be ported to different technologies.
We introduce the technology independent features of CDL and describe how a greedy router
can be interfaced to it. Specific features of interest include mapping from the grid based
router to the gridless CDL environment, and the automatic insertion of CDL feed-through cells
in multi-channel applications.
-
Routability Prediction for Hierarchical FPGAs [p. 256]
-
Wei Li and D.K. Banerji
This paper investigates the problem of routability
prediction in an FPGA that employs a hierarchical routing
architecture. Such an FPGA is called a hierarchical
FPGA (HFPGA). A novel model is proposed to analyze
various HFPGA configurations. A software tool has been
developed to predict the routability of circuits on specific
HFPGA architectures. The primary contribution of this work
is that routability prediction can be done immediately
after the technology-mapping step, rather than after
placement. The effect of connection block and switch
block flexibility on routability is also studied. The results
show that compared to a symmetrical FPGA
architecture, we can achieve the same degree of
routability on an HFPGA, with far fewer routing
switches.
-
Memory Unit Design for Real Time DSP Applications [p. 260]
-
Daniel CHILLET, Olivier SENTIEYS, Michel CORAZZA
Today, the design complexity for new applications
(such as telecommunication, multi media, internet), requires
new high level tools which enable us to translate the behavioral
description into hardware. All of the recent High Level
Synthesis tools are able to transform
high level specifications into an ASIC based on processing
and control units. In general, these tools do not handle
a real optimization of the memory unit. However, in many
applications, the hardware solution may be challenged
by the number and the complexity of memory
units. This paper proposes to complete the synthesis design
flow by including the memory unit synthesis. Our
methodology is integrated in the BSS (Breizh Synthesis
System http://www.enssat.fr/bss) project which is
a framework for the design of real-time constraint applications.
Session 7B:
MEMS
-
Design Automation of MEMS Systems Using Behavioral Modeling [p. 266]
-
Dennis Gibson, Carla Purdy, Alva Hare, Fred Beyette, Jr.
We propose a behavioral approach to designing MEMS devices. This approach differs from much
current research in that this approach would not require dimensional parameters for the device,
but instead would require a high level, functional or behavioral description. This paper examines
how such an approach would work using a case study of an optical processor manually designed
using the MUMPs process.
Keywords: MEMS, CAD for MEMS, behavioral modeling, design automation.
-
Blending Symbolic Matrix and Dimensional Numerical Simulation
Methodology for Mechatronics Systems [p. 270]
-
Robert L. Ewing
The methodology for the integration of design domains towards the purpose of controlling
dynamic mechatronics systems is the current challenge of the modern engineer. Scaling issues
for both the mechanical and electrical parameters are critical to the successful design and
implementation of a mechatronic system. In approaching the scaling design methodology
for future submicron fabrication, new disciplines of
symbolic matrix techniques and dimensional analysis must be developed and applied in the design
of these mechatronics systems. This paper presents both an overview of the techniques and insight
using computer aided design packages for the blending of symbolic matrix techniques using
the admittance matrix created by SPICE and dimensional analysis using Buckingham's Π parameters.
-
Numerical Tools for Fracture of MEMS Devices [p. 274]
-
N. Tayebi, A.K. Tayebi, and Y. Belkacemi
Numerical tools to model fracture in MEMS
devices are proposed. The two numerical
procedures are the Element Free Galerkin
method and the Displacement Discontinuity
Method. Experiments on MEMS fracture are used
to evaluate the numerical procedures. The test
specimens covered a range of geometries and
designs, including notches, holes and corners.
For some specimens both methods gave
acceptable results compared to experiments
(Ballarini et al. and Suwito), while for others
results were off by more than 15%. These findings
raise new questions about the applicability of
linear elastic fracture mechanics to model failure
of MEMS devices at microscopic scale.
Key words: CAD Tools, MEMS, Fracture Mechanics, Meshless Methods, Boundary
Element Methods
-
Formal Checking of Properties in Complex Systems Using Abstractions [p. 280]
-
Dinos Moundanos, Jacob A. Abraham
Only very small designs can be verified currently using property checking due to state-space
explosion. Abstractions have been developed to simplify the design in an attempt to address
this problem. However the properties themselves may involve large state spaces, and practical
property checking is generally confined to the control behavior. This paper describes an elegant
technique for verifying properties of complex designs where the abstraction is applied to both
the property and the design, thereby allowing us to verify properties which may deal with the
data space. We demonstrate the technique on a processor by checking properties which are
intractable using existing model checking techniques.
-
A Hierarchical Approach to the Formal Verification of Embedded Systems
Using MDGs [p. 284]
-
Subhashini Balakrishnan and Sofiène Tahar
With the increasing emergence of mixed hardware/software systems, it is important to ensure
the correctness of such a system formally, particularly for real-time and safety critical
applications. We present a hierarchical approach to modeling and formally verifying an embedded
system at higher levels of abstraction, using Multiway Decision Graphs (MDGs). We demonstrate
our approach on the embedded software for a mouse controller application on a commercial
microcontroller (PIC 16C71), using the MDG verification tools. Inconsistencies in the
assembly code with respect to the specification, as published in the application notes of
the manufacturer, were uncovered through our experiments.
-
Symbolic Multi-Level Verification of Refinement [p. 288]
-
Stefan Hendricx, Luc Claesen
VLSI-system design can, in general, be characterized in terms of the step-wise refinement
of intermediate solutions. Despite the fact that such refinements usually do not preserve
time-scales, current formal verification approaches mostly start from the assumption that both
specification and implementation utilize the same scales of time. Realizing the importance
of being able to cope with differences in timing granularity, this preliminary paper proposes
a symbolic methodology to verify that a low-level finite state machine is a refinement of a
high-level finite state machine. To illustrate our approach, the step-wise refinement (and
verification) of a simple microprocessor is presented.
-
Self-Checking of FPGA-Based Control Units [p. 292]
-
Ilya Levin and Vladimir Sinelnikov
The paper introduces a new technique for on-line checking of FPGA based Control Units (CUs).
This technique is based on the architecture comprising two portions: a self-checking CU and a
separate totally self-checking (TSC) checker. Each of these portions is implemented as a combination
of an Evolution block and an Execution block. Comparison of code vectors being transferred
between the blocks of the portions enables providing a totally self-checking property. The
self-checking CU is implemented in a form of one-rail network of interconnected pre-designed
LUT-based configurable logical blocks. The self-checking checker is a Sum-Of-Minterms based
checker. The proposed technique: a) does not require any encoding of output words; b) uses
one-rail design, thereby drastically decreasing the required overhead.
-
A Software Acceptance Testing Technique Based on Knowledge Accumulation [p. 296]
-
Yi Yu, Fangmei Wu
System acceptance testing in general relies on the
specification of system requirements, but for complex
systems, especially for complex safety systems, the issue
whether system requirements specified by users are
complete should be considered. This paper presents a
software acceptance testing technique based on
knowledge accumulation, which can help to expose the
software faults caused by the lack of knowledge. A
software test tool using the technique for the railway
signaling computer interlocking systems and some tested
results are also introduced in this paper.
Keywords
software acceptance testing, knowledge accumulation, railway signaling, interlocking
-
A Correlation Matrix Method of Clock Partitioning for Sequential Circuit
Testability [p. 300]
-
Yong Chang Kim, Vishwani D. Agrawal, Kewal K. Saluja
We propose a method of partitioning the set of all
flip-flops in a circuit for multiple clock testing. In the
multiple clock testing, flip-flops are partitioned into
different groups and each group of flip-flops has an
independent clock control. In our method, we use a
test generator assuming an independent clock control
for each flip-flop. We then determine correlation between
clock activity for all pairs of flip-flops. This information
is then used to obtain an optimal or near-optimal
partition of flip-flops. Through experiments, we
demonstrate that our partitioning method increases
fault coverage and reduces test length with almost no
hardware overhead or performance penalty.
-
A Novel Low Power Low Phase-Noise PLL Architecture for Wireless
Transceivers [p. 306]
-
Amr N. Hafez and M.I. Elmasry
A sample-and-hold stage placed in the feedback path of a PLL frequency synthesizer reduces
the division ratio, and hence the phase-detector phase-noise, without the need of multiple loops.
When used in conjunction with a DDS, this architecture simplifies the DDS design leading to a
low-power architecture. Furthermore, this architecture allows for a large loop bandwidth thus
suppressing the VCO phase-noise. The advantages of this architecture are highlighted and system-
and circuit-level simulations are presented.
-
NMOS Energy Recovery Logic [p. 310]
-
Chulwoo Kim, Seung-Moon Yoo, and Sung-Mo Kang
In this paper, we describe NMOS Energy Recovery Logic (NERL) which exhibits high throughput
with low energy consumption due to efficient energy transfer and recovery using adiabatic
and bootstrapping techniques. NERL shows full output voltage swing, insensitivity to output load capacitance,
less dependency on power-clock frequency and complementary outputs for balanced capacitance
load to the power-clock. We have designed an 8-bit CLA and an inverter chain using 0.6μm CMOS
technology and verified that NERL saves energy over ECRL by 2 to 3 times.
-
Noise Immunity of Digital Circuits in Mixed-Signal Smart Power Systems [p. 314]
-
Radu M. Secareanu, Ivan S. Kourtev, Juan Becerra, Thomas E. Watrobski,
Christopher Morton, William Staub, Thomas Tellier, and Eby G. Friedman
Experimental data describing circuit and physical design issues that influence the noise
immunity of digital latches in mixed-signal smart power circuits are described and discussed.
The principal result of this paper is the characterization of the conditions under which
substrate noise generated by high power analog circuitry affects digital latches. The
experimental data characterize a variety of different noise mitigation techniques for the
particular process technology, circuit structures, signal/clocking interdependencies, and
related conditions.
-
An All Digital BiCMOS Phase Lock Loop for VLSI Processors [p. 318]
-
Lim Chu Aun and S.M. Rezaul Hasan
A BiCMOS all digital phase lock loop is described. This design is suitable for applications
such as clock and frequency synthesis in VLSI processors where thermal stability is an important factor.
The main block of the design consists of a digitally controlled oscillator with wide frequency
range and high thermal stability compared to a CMOS design. An improved BiCMOS adder/subtractor was
also implemented to reduce worst-case propagation delay time. A small test chip was fabricated
using MOSIS Orbit 2μm low-cost analog CMOS process technology that provides
lateral NPN bipolar device option.
-
Low Power Techniques for Digital GaAs VLSI [p. 321]
-
J. F. Lopez, R. Sarmiento, A. Núñez, K. Eshraghian, S. Lachowicz,
and D. Abbott
This paper presents a survey of low-power digital Gallium Arsenide logic applicable to
high performance VLSI circuits and systems and proposes new design concepts in methodology
and architecture based on implementation of Pseudo-Dynamic Latched Logic in order to
achieve a reasonable power-delay-area tradeoff. The approach is highly suitable for self-timed
systems where the complexities of clock skew are avoided and power saving is achieved through
pipelined architectures. The emergence of low-power Complementary HIGFET (C-HIGFET)
technology enables the realisation of new high performance low-power architectures. The
viability of neu-GaAs (vGaAs) as applied to C-HIGFET is discussed and the concept of 'soft'
hardware, referred to as 'flexware', is introduced as a new design paradigm for GaAs.
-
A VLSI Architecture for ATM Switches with Algorithm-Agile Encryption [p. 325]
-
A. G. Wassal and M.A. Hasan
In this paper a VLSI architecture is proposed for an
algorithm-agile encryptor for ATM networks. The architecture
is based on a circular sorting queue that buffers
and switches incoming cells to the appropriate encryption
pipelines. It also handles multicast cells that require different
encryption algorithms for different destinations. Delay
and loss priority are analyzed for multi-class traffic processed
through the encryptor. The analysis results are necessary to
size the buffer properly and to choose an appropriate priority
scheme. An ASIC prototype of the sorting queue
that supports an aggregate traffic rate of up to 21.2 Gbps is
also presented.
-
On an Efficient Method for Estimating the Interconnection Complexity of
Designs and on the Existence of Region III in Rent's Rule [p. 330]
-
Dirk Stroobandt
The interconnection complexity of digital designs can be captured by the well-known Rent
exponent, described by Landman and Russo [2]. In this paper, we present an efficient method
for obtaining the Rent exponent of a design through a hierarchical partitioning algorithm.
Experimental results not only confirm the Landman and Russo observations of a region I and region
II, but also show a hitherto unknown region III.
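Rent's rule relates the number of terminals T of a block to the number of gates B it contains as T = t * B^p, where p is the Rent exponent. A minimal sketch of how p could be extracted from (B, T) pairs produced by a partitioning run is a least-squares line fit in log-log space. The function name and the synthetic data used to exercise it are assumptions, and the paper's own estimation method may differ; deviations from the power law such as region II or III would have to be excluded from the fit.

```python
import math

def fit_rent(points):
    """Fit T = t * B**p to (B, T) pairs by linear regression on log B versus log T."""
    xs = [math.log(b) for b, _ in points]
    ys = [math.log(t) for _, t in points]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    p = sxy / sxx                    # slope of the log-log line is the Rent exponent
    t = math.exp(my - p * mx)        # intercept gives the Rent coefficient
    return t, p
```

Fed with synthetic points generated from T = 4 * B^0.6, the fit returns t close to 4 and p close to 0.6.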
-
Monolithic Microprocessor and RF Transceiver in 0.25-micron FDSOI CMOS [p. 332]
-
E. McShane, K. Shenai, L. Alkalai, E. Kowala, V. Boyadzhyan, B. Blaes, and W.C. Fang
A monolithic RFIC in 0.25-micron fully-depleted SOI CMOS has been designed consisting of a
microcoded 8-bit 33-MHz microprocessor, a 400-MHz 8-bit ASK-modulated RF transceiver, and
two integrated dc-dc voltage converters for power management. This architecture exploits a
low-power (sub-2-V) digital process for mixed-signal VLSI in a die size measuring 2.2 mm x 2.2 mm.
-
Low Power Design of an Acoustic Echo Canceller Gmdfα Algorithm on
Dedicated VLSI Architectures [p. 334]
-
S. Gailhard, N. Julien, A. Baganne, and E. Martin
The acoustic echo cancellation with adaptive filters is a computationally intensive problem
that needs real time cost effective solutions for embedded systems. Low Power optimized signal
processing architectures are likely to provide such solutions in the future. In this paper,
we present different realtime optimized architectures of the popular Gmdfα algorithm, obtained
by a HLS CAD tool providing trade-off between area and power dissipation.
-
Proposal of Data-Driven Processor Architecture Qv-K1 [p. 336]
-
Teruhiko Kamigata, Koso Murakami, Makoto Iwata, Hiroaki Terada
This paper presents an extended SIMD form data operation for multi-media signal processing
and a performance evaluation of data-driven processor Qv-K1. By adding the proposed data-parallel
operation mechanism, the number of executed instructions is reduced compared with that of SIMD, so
the processing ability of this processor can be increased.
-
Accurate Resource Estimation Algorithms for Behavioral Systems [p. 338]
-
Srinivas Katkoori, Ranga Vemuri
Given a scheduled data flow graph, the functional, storage, and interconnect (multiplexors)
resources are analytically estimated taking into account the effects of post-scheduling tasks.
Complexity of the controller implementation is also estimated. The novelty of this work lies
in predicting the effects of the post-scheduling tasks on the final amount of resources, and the
effects of data path resource optimization on the controller complexity. Experimental results
show high correlation between estimated and actual numbers.
-
Assessing Defect Coverage of Memory Test Algorithms [p. 340]
-
Vonkyoung Kim and Tom Chen
This paper describes the defect coverage evaluation of memory testing algorithms. Realistic
CMOS defects were extracted from a 2 x 2 SRAM layout using an IFA tool, and circuit simulations
were performed to measure the defect coverage of eleven memory testing algorithms.
-
Exploiting Test Resource Optimization in Data Path Synthesis for BIST [p. 342]
-
Xiaowei Li, Paul Y.S. Cheung
Area and test time are two major overheads encountered during data path synthesis for BIST.
This paper presents an attempt towards testability enhancement in data path BIST synthesis by
considering these two factors simultaneously. It is achieved by incorporating two testability
constraints in data path synthesis. Experimental results are presented to demonstrate the
effectiveness of the proposed (data path) BIST synthesis approach.
-
Resonant Tunneling Transistors for Threshold Logic Circuit Applications [p. 344]
-
C. Pacha, P. Glösekotter, K. Goser, U. Auer, W. Prost, and F.-J. Tegude
Resonant tunneling transistors (RTT's) and linear threshold gates based on monostable-bistable
logic transition elements (MOBILE's) are promising candidates for nano-scale integrated circuits.
In this paper the design methodology of RTT logic gates is discussed and experimental results of
a monolithically integrated NAND-NOR gate are presented. To exploit the computational functionality
of threshold logic circuits a depth-2 full adder and a bit-level pipelined ripple carry adder are
proposed.
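
As a rough illustration of why threshold logic suits a depth-2 full adder, the sketch below uses the textbook two-level construction: a majority gate produces the carry, and a second gate reuses that carry with weight -2 to produce the sum. The weights, thresholds, and function names are assumptions chosen for illustration; the abstract does not give the authors' gate-level design.

```python
def threshold_gate(inputs, weights, threshold):
    """Return 1 when the weighted input sum reaches the threshold, else 0."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)


def full_adder(a, b, cin):
    """Depth-2 full adder built from two linear threshold gates."""
    # Level 1: the carry is the majority of the three input bits.
    carry = threshold_gate((a, b, cin), (1, 1, 1), 2)
    # Level 2: the sum gate sees the inputs plus the carry weighted by -2.
    total = threshold_gate((a, b, cin, carry), (1, 1, 1, -2), 1)
    return total, carry
```

Each output is available after at most two gate delays, independent of how the inputs are assigned.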
-
A Multilevel Cache Memory Architecture for Nanoelectronics [p. 346]
-
David Crawley
In this paper, we present a new multilevel cache memory architecture which uses only
near-neighbour connections, thus eliminating long tracks and rendering the system suitable
for nanoelectronic implementation. Operation of the memory is such that the most-recently
accessed data is kept closest to the read-write port.
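
The access policy described above can be modelled in software as a move-to-front store. The sketch below is only a behavioural model under stated assumptions: the class name, the capacity limit, and the choice to discard the entry farthest from the port when the store is full are hypothetical and not taken from the paper.

```python
class MoveToFrontStore:
    """Behavioural model: cells[0] is the cell next to the read-write port."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cells = []  # list of (key, value), nearest cell first

    def _find(self, key):
        for index, (stored_key, _) in enumerate(self.cells):
            if stored_key == key:
                return index
        return None

    def write(self, key, value):
        index = self._find(key)
        if index is not None:
            del self.cells[index]
        elif len(self.cells) == self.capacity:
            # Assumed policy: drop the entry farthest from the port.
            self.cells.pop()
        # Older entries each shift one cell away from the port.
        self.cells.insert(0, (key, value))

    def read(self, key):
        """Return (value, distance); (None, None) on a miss."""
        index = self._find(key)
        if index is None:
            return None, None
        entry = self.cells.pop(index)
        self.cells.insert(0, entry)
        return entry[1], index
```

In this model every reordering moves an entry by one position at a time relative to its neighbours, which is the property that makes the scheme compatible with near-neighbour wiring.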
-
ALPS: A Peak Power Estimation Tool for Sequential Circuits [p. 350]
-
F. Corno, M. Rebaudengo, M. Sonza Reorda, and M. Violante
Tools for evaluating the worst-case peak power consumption of sequential circuits are highly
useful to designers of low-power circuits. Previously proposed methods search for the initial
state and the pair of vectors with maximum consumption, without fully considering the
reachability of the initial state. This paper shows that this approach can lead to a significant
underestimation of the maximum peak power consumption, and proposes a new algorithm that overcomes
this drawback. Experimental results show that for many circuits the algorithm is able to provide
better results than those known up to now, while an approximate version is able to deal even with
the largest benchmark circuits.
-
Clustered Table-Based Macromodels for RTL Power Estimation [p. 354]
-
Roberto Corgnati, Enrico Macii, Massimo Poncino
Macromodeling is considered the most effective approach to RTL power estimation. Among the
macromodels presented in the literature, table-based ones have overcome some of the limitations
of conventional, equation-based solutions.
In this paper we propose some enhancements to the basic implementation of table-based macromodels
that improve the estimation accuracy while preserving the intrinsic robustness.
-
The Design of CMOS Gigahertz-Band Continuous-Time Active Lowpass
Filters with Q-Enhancement Circuits [p. 358]
-
Yuyu Chang, John Choma, Jr., and Jack Wills
A tunable second-order lowpass filter architecture capable of operating in the gigahertz
frequency range is proposed. Two Q-enhancement techniques are utilized to extend the Q
tuning range. Simulation results employing standard 0.5μm CMOS technology have successfully
verified that the center frequency tuning and the hybrid Q-tuning approach operate between 1.26GHz
and 2.3GHz center frequencies with Q larger than 1000. A tunable lowpass filter with a center
frequency at 2.07GHz with a Q equal to 31 is designed to have 44dB input dynamic range and
27.8 mW power dissipation.
-
A New Algorithm for RNS Magnitude Comparison Based on New Chinese
Remainder Theorem II [p. 362]
-
Yuke Wang, Xiaoyu Song, Mostapha Aboulhamid
Number comparison is a difficult and fundamental operation for residue number systems
(RNS). Previous algorithms use either some redundant modulus or big modulo operations. In
this paper, based on the New Chinese Remainder Theorem II, we present a new comparison
algorithm using smaller modulo operations and no redundant modulus.
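
To show why comparison is awkward in RNS, the sketch below compares two numbers from their residues using classical mixed-radix conversion (Garner's method), where every operation is taken modulo a single small modulus. This is a standard baseline and not the New Chinese Remainder Theorem II algorithm of the paper; the function names are illustrative, and the moduli are assumed to be pairwise coprime.

```python
def to_rns(x, moduli):
    """Residue representation of x for the given moduli."""
    return [x % m for m in moduli]


def mixed_radix_digits(residues, moduli):
    """Return digits d with x = d[0] + d[1]*m0 + d[2]*m0*m1 + ...

    Assumes pairwise coprime moduli. Requires Python 3.8+ for the
    modular inverse via pow(value, -1, modulus).
    """
    digits = []
    for r, m in zip(residues, moduli):
        partial, weight = 0, 1
        # Evaluate the already-known digits modulo the current modulus.
        for d, prev_m in zip(digits, moduli):
            partial = (partial + d * weight) % m
            weight = (weight * prev_m) % m
        digits.append(((r - partial) * pow(weight, -1, m)) % m)
    return digits


def rns_compare(x_res, y_res, moduli):
    """Return -1, 0, or 1 as x is less than, equal to, or greater than y."""
    dx = mixed_radix_digits(x_res, moduli)[::-1]  # most significant first
    dy = mixed_radix_digits(y_res, moduli)[::-1]
    return (dx > dy) - (dx < dy)
```

For example, with moduli (3, 5, 7) the call rns_compare(to_rns(23, moduli), to_rns(24, moduli), moduli) compares 23 and 24 without reconstructing either number in binary.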
-
Low Power Chip Interface Based on Bus Data Encoding with Adaptive
Code-Book Method [p. 368]
-
Satoshi Komatsu, Makoto Ikeda, Kunihiro Asada
An adaptive code-book encoding method is proposed for low-power chip interfaces.
In this method, data transition activity on bus signals is lowered by data encoding similar
to vector quantization (VQ). The data transferred on the bus are the quantized vector numbers
along with the Hamming difference between the original data and the quantized vector. Computer
simulation and measurement results show that this encoding method is effective for low-power
chip interfaces, especially for deep sub-micron VLSIs.
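
A minimal sketch of the encoding idea, assuming a fixed code book: each data word is sent as the index of its nearest code word plus the bitwise difference from that code word, and bus activity is measured as the number of bit transitions between consecutive transfers. The adaptive code-book update is not described in the abstract and is left out; all function names here are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def encode(word, codebook):
    """Return (index, diff): nearest code word and the XOR difference."""
    index = min(range(len(codebook)), key=lambda i: hamming(word, codebook[i]))
    return index, word ^ codebook[index]


def decode(index, diff, codebook):
    """Recover the original word from the transmitted pair."""
    return codebook[index] ^ diff


def transitions(words):
    """Total bit transitions between consecutive words on a bus."""
    return sum(hamming(a, b) for a, b in zip(words, words[1:]))
```

When the data cluster around the code words, the difference lines stay mostly at zero, so the combined activity of the index and difference lines can fall below that of the raw data bus. Whether it does depends on the data and on the code book.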
-
A 1.8V High Dynamic-Range CMOS High-Speed Four Quadrant Multiplier [p. 372]
-
Chi-Huing Lin and Mohammed Ismail
A low-voltage (<3V) CMOS four quadrant multiplier
is introduced which has an almost rail-to-rail
differential-input-swing with a low signal-distortion
(<1% for 100kHz signal). The proposed circuit is
composed of a pair of rail-to-rail differential-input
V-I converters and a pair of voltage-followers. This
topology of multiplier results in a high frequency capability
with low power consumption. In a 1.2μm n-well
CMOS process, the 3dB frequency of the multiplier is in
a range of 103MHz. Measured total power consumption
is around 0.52mW with supply voltage 2V. The multiplier
can operate at a minimum supply voltage of 1.8V.
-
A Second-Order Sigma-Delta Modulator with Built-in VGA to Improve SNR
and Harmonic Distortion [p. 376]
-
Xiaopeng Li and Mohammed Ismail
A modified architecture of the second-order switched-capacitor modulator is proposed. A simple
four-transistor variable gain attenuator is included in the architecture which
continuously adjusts the reference voltage of the quantizer feedback. This improves the
output SNR for small signal input and reduces the harmonic distortion for large signal input.
Simulation results show that it achieves higher dynamic range and lower harmonic distortions
compared with the traditional architecture.
-
A Novel Low Power Energy Recovery Full Adder Cell [p. 380]
-
R. Shalem, E. John, and L. K. John
A novel low power and low transistor count static energy recovery full adder (SERF) is
presented in this paper. The power consumption and general characteristics of the SERF
adder are then compared against three low power full adders, the transmission function
adder (TFA), the dual value logic (DVL) adder and the fourteen transistor (14T) full
adder. The proposed SERF adder design was proven to be superior to the other three designs
in power dissipation and area, and second in propagation delay only to the DVL adder. The
combination of low power and low transistor count makes the new SERF cell a viable option
for low power design.
-
Memory Chip BIST Architecture [p. 384]
-
Jacob Savir
This paper describes a random access memory (RAM, sometimes also called an array) test
scheme that has the following attributes:
1. Can be used in both built-in mode and off chip/module mode.
2. Can be used to test and diagnose naked arrays.
3. Fault diagnosis is simple and is "free" for some faults during test.
4. It is never subject to aliasing.
5. Depending upon the test length, it can detect many kinds of failures, like stuck-cells,
decoder faults, shorts, pattern-sensitive, etc.
6. If used as built-in feature, it does not slow down the normal operation of the array.
7. Does not require storage of correct responses. A single response bit always
indicates whether a fault has been detected. Thus, the storage requirement for the
implementation of the test scheme is zero.
8. If used as a built-in feature, the hardware overhead is very low.
-
A Fully Pipelined, 700MBytes/s DES Encryption Core [p. 386]
-
Ihn Kim, Craig S. Steele, Jefferey G. Koller
Fully-pipelined, 56-bit DES de/encryption and authentication at memory-bus bandwidths is now
feasible. We describe a custom, 7 square mm, 120mW core in 4-metal 0.35μm CMOS. Performance
allows on-the-fly encryption of 64-bit, 66MHz PCI traffic, and hence typical network traffic.
FPGA, synthesized, and 3-metal versions are compared.
-
Transistor Stuck-Open Fault Detection in Multilevel CMOS Circuits [p. 388]
-
Mostafa Abd-El-Barr, Yanging Xu and Carl McCrosky
The necessary and sufficient conditions for detecting
transistor stuck-open faults in arbitrary multi-level
CMOS circuits are shown. A method for representing a
two-pattern test for detecting a single stuck-open fault
using only one cube is presented. The relationship between
the D-algorithm and the conditions for detecting transistor
stuck-open faults in CMOS circuits is provided. The
application of the proposed approach in robust test
generation for transistor stuck-open faults in
a number of benchmark circuits is demonstrated. The
fault coverage achieved is as good as or better than those
reported using existing techniques.
keywords
Transistor stuck-open fault, two-pattern test, test pattern generation,
multi-level CMOS circuits testing, robust CMOS testing.
-
Advances Toward Molecular-Scale Electronic Digital Logic Circuits:
A Review and Prospectus-Abstract [p. 392]
-
James C. Ellenbogen
(Extended Abstract)
-
Transport in Split Gate MOS Quantum Dot Structures [p. 394]
-
S. M. Goodnick, J. Bird, D. K. Ferry, A. D. Gunther, M. D. Khoury,
M. Kozicki, M. J. Rack, T. J. Thornton, and D. Vasileska-Kafedezka
A novel technique has been developed for the fabrication of Si quantum dot structures with
controllable electron number through both top and side gates. We have tested devices ranging
in size from 40 to 200nm. By varying the density with the top gate, and controlling the
input and output barriers of the dot with the side gates, conductance peaks are observed
which map details of the energy level within the dot as well as the interaction of the
electrons with one another.