1990_ACT_Family_Field_Programmable_Gate_Array_Databook 1990 ACT Family Field Programmable Gate Array Databook
User Manual: 1990_ACT_Family_Field_Programmable_Gate_Array_Databook
Open the PDF directly: View PDF
.
Page Count: 200
| Download | |
| Open PDF In Browser | View PDF |
ACT™ Family
Field Programmable Gate Array
Databook
April 1990
ACT, Action Logic, Activator, Actionprobe and PUCE are trademarks of Actel Corporation.
Apollo, LCA, Mentor Graphics, NETED, DrCAD, DrCAD/SDT, DrCADNST, PAL, QuickSim, Sun, Sun 3, Sun 4, SYMED, Valid, ValidGED, ValidSIM,
Viewdraw, Viewlogic, Viewsim, and Workview are all trademarks or registered trademarks of their respective manufacturers.
Actel Corporation reserves the right to make changes to any products or services herein at any time without notice. Actel does not assume any
responsibility or liability arising out of the application or use of any product or service described except as expressly agreed to in writing by Actel.
© 1990 Actel Corporation
ACTTM Family
Field Programmable Gate Arrays
Product D a t a .
Reliability Report.
Applications
..
Article Reprints.
ACTTM Family
Field Programmable Gate Array
Databook
Order
of
Contents
Section 1: Product Data
ACT 1 Field Programmable Gate Arrays
1-1
ACT 1 Military Field Programmable Gate Arrays .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 1-27
Action Logic System FPGA Design Environment ..................................................... 1-41
ACT 2 Field Programmable Gate Arrays (Preliminary) ................................................. 1-47
Activator 2 Programmer!Tester/Debugger (Preliminary) ................................................ 1-49
Section 2: Reliability Report
Reliability Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..
2-1
Section 3: Applications
Which Actel array should you use?
3-1
Efficient Logic Conversions ......................................................................
3-3
The Actel Timer ................................................................................ 3-15
Using Actionprobe Diagnostic Tools ............................................................... 3-17
Metastability ................................................................................... 3-19
Using the Actel Debugger as a Functional Tester ..................................................... 3-21
Calculating Power in Actel's ACT 1 Arrays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-25
Three-Stating ACT 1010/1020 Designs ............................................................. 3-29
Multi-ASIC and System Simulation with Actel FPGAs .................................................. 3-31
Constructing RAM with ACT 1 Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-33
Using the DRAM Controller Supermacro .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-39
Using the DMA Controller Supermacro ............................................................. 3-43
Using the SCSI Interface Controller Supermacro ..................................................... 3-45
UART Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-49
Actel TA181 ALU ............................................................................... 3-61
Actel Fast Adders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-63
Eight-Bit Twos Complement Multiplier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-65
Section 4: Article Reprints
AR-1: Analyze FPLD Architectures, Performance, and Development Tools
to Optimize Design-Dependent Selection .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..
4-1
AR-2: A CMOS Electrically-Configurable Gate Array .................................................. 4-11
AR-3: An Architecture for Electrically-Configurable Gate Arrays ......................................... 4-23
AR-4: Dielectric-Based Antifuse for Logic and Memory ICs .............................. . . . . . . . . . . . . . .. 4-29
AR-5: Field Programmable Gate Array with Non-Volatile Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 4-35
AR-6: Testing Antifuse-Based FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 4-45
AR-7: Compare ASIC Capacities with Gate Array Benchmarks .......................................... 4-49
Product Data
Product Data
ACT 1 Field Programmable Gate Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..
1-1
ACT 1 Military Field Programmable Gate Arrays ................................................................ 1-27
Action Logic System FPGA Design Environment ............................................................... 1-41
ACT 2 Field Programmable Gate Arrays ....................................................................... 1-47
Activator 2 Programmerrrester/Debugger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-49
ACT™ 1
Field Programmable
Gate Arrays
Features
unique architecture offers gate array flexibility, high performance,
and instant turnaround through user programming. Utilizations as
high as 95% permit designs of 1200 gates for the ACT 1010, and
2000 gates for the ACT 1020.
• High Gate Count
- ACT™ 1010: 1200 gate array gates
(3000 PLD/LCA equivalent gates)
ACT 1010/1020 devices also provide system designers with unique
on-chip diagnostic probe capabilities, allowing convenient testing
and debugging. Additional features include an on-chip clock driver
with a hardwired distribution network. The network provides
efficient clock distribution with minimum skew.
- ACT 1020: 2000 gate array gates
(6000 PLD/LCA equivalent gates)
• Field Programmability Allows Instant Prototypes and
Production
The user-configurable IIOs are capable of driving at both TTL and
CMOS drive levels. The ACT lOlD/102O devices are available in
44-,68-, and 84-pin plastic leaded chip carrier and I-leaded cerquad
packages, and an 84-pin pin grid array package.
• Toggle Rates to 70 MHz
• I/O Drive to 8 rnA
• System-Level Performance to 40 MHz
• Gate Array Architecture Allows Completely Automatic Place
and Route
• Non-Volatile, Permanent Programming
• Built-In Clock Distribution Network
A security fuse may be programmed to disable all further
programming and protect the design from being copied or reverse
engineered.
The Action Logic System
• Built-In Diagnostic Probe Pins
• Low-Power CMOS Technology
• Fully Supported by Actel's Action Logic™ System
Description
ACT 1010 and ACT 1020 devices are the first members of a family
of field programmable gate arrays (FPGAs) offered by Actel. These
devices are implemented in silicon gate, 2-micron, two-level metal
CMOS, employing Actel's PLICE™ anti fuse technology. The
ACT 1010/1020 devices are supported by Actel's Action Logic
System, allowing logic design implementation with minimum
effort. The Action Logic System (ALS) interfaces with the resident
CAE system to provide a complete gate array design environment:
schematic capture, simulation, fully automatic place and route,
timing verification, and device programming. The Action Logic
System is available for 386 PC and ApolloTM and Sun™
workstations, running Viewlogic@, Mentor Graphics™, VaJid™,
and OrCADTM.
Figure 1. ACT 1020 Die
©
1990 Actel Corporation
ACT 1010
ACT 1020
1
Figure 2. Partial View of an ACT Device
ACT Device Structure
Device Organization
A partial view of an ACT device (Figure 2 above) depicts four logic
modules and distributed horizontal and vertical interconnect
tracks. PLICE antifuses, located at intersections of the horizontal
and vertical tracks, connect logic module inputs and outputs.
During programming, these antifuses are addressed and
programmed to make the connections required by the circuit
application.
ACT devices consist of a matrix of logic modules arranged in rows
separated by wiring channels (see Figure 1). This array is
surrounded by a ring of peripheral circuits including I/O buffers,
testability circuits, and diagnostic probe circuits providing real-time
diagnostic capability. Between rows of logic modules are routing
channels containing sets of segmented metal tracks with PLICE
antifuses. Each channel has 22 signal tracks. The resulting network
allows arbitrary and flexible interconnections between logic
modules and I/O modules.
The Actel Logic Module
The Actellogic module is an eight-input, one-output configurable
logic circuit chosen for the wide range of functions it implements
and for its efficient use of interconnect routing resources.
The logic module can implement the four basic logic functions
(NAND, AND, OR, and NOR) in gates of two, three, or four
inputs. Each function may have many versions, with different
combinations of active-low inputs. The logic module can also
implement a variety of D-Iatches, exclusivity function, AND-ORs,
and OR-ANDs. No dedicated hardwired latches or flip-flops are
required in the array since latches and flip-flops may be constructed
from logic modules wherever needed in the application.
I/O Buffers
Each I/O pin is configurable as an input, output, three-state, or
bidirectional buffer. Input and output levels are compatible with
standard TTL and CMOS specifications. Outputs sink or source
4 rnA at TTL levels. See Electrical Specifications for additional I/O
buffer specifications.
Probe Pin
ACT 1010/1020 devices have two independent diagnostic probe
pins. These pins allow the user to observe any two internal signals
by entering the appropriate net name in the diagnostic software.
Signals may be viewed on a logic analyzer using Actel's
Actionprobe™ diagnostic tools. The probe pins can also be used as
user-defined I/Os when debugging is finished.
ACT Array Performance
Temperature and Voltage Effects
Worst-case delays for ACT arrays are calculated in the same
manner as for masked array products. A typical delay parameter is
multiplied by a derating factor to account for temperature, voltage,
and processing effects. However, in an ACT array, temperature
and voltage effects are less dramatic than with masked devices.
ACT 1 FPGAs
The electrical characteristics of module interconnections on ACT
devices remain constant over voltage and temperature fluctuations.
As a result, the total derating factor from typical to worst case for an
ACT array is only 1.19 to 1, compared to 2 to 1 for a masked gate
array.
Logie Module Size
Logic module size also affects performance. A mask programmed
gate array cell with four transistors usually implements only one
logic level. In an ACT array's more complex logic module (similar
to the complexity of a gate array macro), implementation of
multiple logic levels within a single module is possible. This
eliminates interlevel wiring and associated RC delays. The effect is
termed "net compression."
Ordering Information
PG
1020
A
84
M
l~::::,
Package Type
Speed Bin
I
Die Revision
Part Number
Device Family
Packages
Type
Lead Count
44,68,84
PL
Plastic Leaded Chip Carrier
JQ
J-Leaded Cerquad Chip Carrier
44,68,84
PG
Ceramic Pin Grid Array
84
Applications
C
Commercial
Industrial
M
Military
8
8838*
·Contact your Actel representative for current status.
1-3
Recommended Operating Conditions
Device Resources
Device
User II0s
Parameter
Commercial
84 LCC 68 LCC 44 LCC 84 PGA
Temperature
Range (Note 1)
Oto +70
Power Supply
Tolerance
±5
Modules
Gates
1010
295
1200
N/A
57
34
57
1020
546
2000
69
57
34
69
Industrial
Military
-40 to +85 -55 to +125
±10
±10
Units
°C
%Vee
Note:
1. Ambient temperature (fA) used for commercial and industrial; case
temperature (fe) used for military.
Absolute Maximum Ratings
Power Dissipation
Free air temperature range
Symbol
Parameter
DC Supply Voltage1
Vee
Limits
Units
The following formula is used to calculate total device dissipation.
-0.5 to + 7.0
Volts
Total Chip Power (mW)
Where:
V,
Input Voltage
-0.5 to Vee +0.5
Volts
Va
Output Voltage
-0.5 to Vee + 0.5
Volts
F1
F2
1,K
Input Clamp Current
±20
mA
10K
Output Clamp Current
±20
mA
10K
Continuous Output Current
±25
mA
-65 to +150
°C
Storage Temperature
TSTG
Stresses beyond those listed under '~bsolute Maximum Ratings" may
cause permanent damage to the device. Exposure to absolute maximum
rated conditions for extended periods may affect device reliability. Device
should not be operated outside the Recommended Operating Conditions.
Note:
1. Vpp
=:
= 0.41 N*F1 + 0.17 M*F2 +
1.62 P*F3
= Average logic module switching rate in MHz.
= Average clock pin switching rate in MHz.
F3 = Average I/O switching rate in MHz.
M = Number of logic modules connected to the clock pin.
N = Total number of logic modules used on the chip.
(including M)
P = Number of outputs used loaded with SO pR
The second term, variables F2 and M, may be ignored if the
CLKBUF macro is not used in the design.
Vee, except during device programming.
Electrical Specifications
Industrial
Commercial
Military
Units
Parameter
Min.
(IOH
(IOH
(IOL
=:
=:
=:
-4 mAl
Max.
Min.
Max.
Min.
Max.
V
3.84
3.7
-3.2 mAl
0.33
4 mAl
3.7
0.40
-0.3
0.8
-0.3
0.8
-0.3
2.0
Vee + 0.3
2.0
Vee + 0.3
2.0
V
0.40
V
0.8
V
Vee
+ 0.3
V
Input Transition Time t A, tF2
500
500
500
ns
Cia I/O Capaeitanee2. 3
10
10
10
pF
Standby Current, lee4
10
20
25
mA
10
}JA
Leakage Current5
-10
10
-10
10
-10
loS Output Short
(Vo =: Vee)
20
140
20
140
20
140
mA
Circuit Currents -(V-0-=:-G-N-D-)------1-0------1-0-0------1-0-------1-00------10-------1-0-0---m-A-Notes:
1. Only one output tested at a time. Vee =: min.
2. Not tested, for information only.
3. Includes worst-case 84-pin PLeC package capacitance. V OUT =: 0 V, f = 1 MHz.
4. TYpical standby current =: 3 mA. All outputs unloaded. All inputs =: Vee or GND.
5. Va, V IN =: Vee or GND.
6. Only one output tested at a time. Min. at Vee = 4.5 V; Max. at Vee = 5.5 V.
1_4
ACT 1 FPGAs
Package Thermal Characteristics
Maximum junction temperature is 150°C.
The device junction to case thermal characteristic is Sjc, and the
junction to ambient air characteristic is 9ja. The thermal
characteristics for 9ja are shown with two different air flow rates.
Package Type
Plastic J-Ieaded (PL)
= 25/33 C/W (pG84 - still air) = 0.76
ajc
aja
Still air
44
15
13
12
52
45
44
40
35
33
°C
°C
°C
84
8
8
8
38
35
34
30
25
24
°C
°C
°C
84
8
33
20
°C
84
44
68
Ceramic Pin Grid Array (PG)
150 (Max) - 125 (Max Mil.)
Watts.
Pin Count
68
Cerquad J-Ieaded (JQ)
A sample calculation of the maximum power dissipation for a
ceramic PG84 package at military temperature is as follows:
a/a
300 ft/min.
Units
AIR FLOW
PACKAGE CASE
\
I
DEVICE JUNCTION
Functional Timing Tests
AC timing for logic module internal delays is determined after
place and route. The ALS Timer utility displays actual timing
parameters for circuit delays. ACT 1 devices are AC tested to a
"binning" circuit specification.
The circuit consists of one input buffer + n logic modules + one
output buffer (n = 16 for the ACT 1010; n = 28 for the ACT 1020).
The logic modules are distributed along two sides of the device.
These modules are configured as inverting and non-inverting
buffers. The modules are connected through programmed
antifuses with typical capacitive loading.
Propagation delay [tpD = (tPLH
AC test specifications.
+
tpHL)/2] is tested to the following
AC Test Specifications
Vee
= 5.0 V; TA = 25°C
Min.
Parameter
Max.
Units
1010
87
ns
1020
135
ns
Binning Circuit Delay (tpDB)
1-5
Output Buffer Performance Derating
Sink
Source
10
10
/
/,
,, "
/
"
,,"
Q3
Q4
Q5
0.6
4.0
3.6
VOL (Volts)
",,'"
~'
3.2
2.8
VOH (Volts)
-
Military, worst case values at 125°C, 4.5 V.
Commercial, worst case values at 70°C, 4.75 V.
Note:
The above CUlVes are based on characterizations of sample devices and are
not completely tested on all devices.
1_R
,
,~
2
2
0.2
V
'/
V
2.4
2.0
ACT 1 FPGAs
typical and critical (speed-sensitive) nets are given below. Most nets
win fall into the "typical" category.
Timing Characteristics
Timing is design-dependent; actual delay values .are de~e:mined
after place and route of the design using the ALS TImer. ~tIhty. The
following delay values use statistical estimates for wmng delays
based on 85% to 90% module utilization. Device utilization above
95% will result in performance degradation.
Less than 1% of all routing in a design requires the use of "long
tracks." Long tracks, long vertical or horizontal routing paths, are
used by the autorouter only as needed. Delays due to the use oflong
tracks range from 15 ns to 35 ns. Long tracks may be used to route
the least-critical nets in a given design.
With Actel's place and route program, the user can assign a net's
criticality level, based on timing requirements. Delays for both
In the following tables, timing is characterized over the
recommended operating conditions, unless specified otherwise.
Module Timing
5.0 V; TA
Vcc
25°C; tpo - 4.0 ns @ FO - 0
Single Logic Module Macros
(e.g., most gates, latches, multiplexors)f
FO
FO
Units
Parameter
Output Net
FO=1
FO-2
FO-3
tpo
Critical
7.0
7.5
8.0
11.0
Note 2
ns
!Po
Typical
8.2
8.7
10.0
11.2
14.0
ns
4
8
Dual Logic Module Macros
(e.g., adders, wide Input gates)1
Parameter
Output Net
FO = 1
FO = 2
FO = 3
FO - 4
tpo
Critical
12.0
12.5
13.0
16.0
Note 2
ns
!Po
Typical
13.2
13.7
15.0
16.2
19.0
ns
FO = 4
FO - 8
FO
8
Units
Sequential Element Timing Characteristics
Fan-out
Parameter
FO = 1
FO
=2
FO
=3
Units
tsu
Set Up Time, Data Latches
4.6
5.0
5.4
5.8
5.8
ns
tsu
Set Up Time, Flip-Flops
5.0
5.0
5.0
5.0
5.0
ns
tH
Hold Time
0
0
0
0
0
ns
tw
Pulse Width, Minimum3
10.0
10.0
10.0
10.0
10.0
ns
tpo
Delay, Critical Net
7.0
7.5
8.0
11.0
Note 2
ns
!Po
Delay, Typical Net
8.2
8.7
10.0
11.2
14.0
ns
Notes:
1. Most flip-flops exhibit single module delays.
2. Critical nets have a maximum fan-out of 6.
3. Minimum pulse width, tw, applies to CLK, PRE, and CLR inputs.
I
1/0 Buffer Timing
vee = 5.0V;TA = 25°C
INBUF Macros
Parameter
=2
=3
=4
=8
Units
From - To
FO = 1
Pad to Y
9.0
9.9
11.6
13.9
18.6
ns
Pad to Y
7.7
8.4
10.0
10.9
16.1
ns
FO
FO
FO
FO
CLKBUF (High Fan-out Clock Buffer) Macros
Parameter
FO
tpLH
= 20
FO
= 50
FO
= 200
11
15
25
12
16
25
Notes:
1. A clock balancing feature is provided to minimize clock skew.
Units
ns
ns
2. There is no limit to the number of loads that may be connected to the
CLKBUF macro.
OUTBUF, TRIBUFF & BIBUF Macros
CL = 50 pF
Parameter
From - To
CMOS
TTL
Units
tpHL
5.1
6.4
ns
tpLH
o to Pad
o to Pad
9.4
7.4
ns
tpHZ
E to Pad
6.7
4.4
ns
tpZH
E to Pad
8.4
6.3
ns
tpLZ
E to Pad
8.9
6.7
ns
tpZL
E to Pad
6.4
7.7
ns
TTL
Units
Change In Propagation Delay with Load Capacitance
Parameter
From - To
CMOS
tpHL
o to Pad
o to Pad
0.04
0.06
ns/pF
t pLH
0.09
0.05
ns/pF
tpHZ
E to Pad
0.10
0.06
ns/pF
tpZH
E to Pad
0.09
0.05
ns/pF
tpLZ
E to Pad
0.09
0.05
ns/pF
tpZL
E to Pad
0.04
0.05
ns/pF
Notes:
1. The BIBUF macro input section exhibits the same delays as the INBUF
macro.
2. Load capacitance delay delta can be extrapolated down to 15 pF
minimum.
Example:
Delay for OUTBUF driving a 100-pF TfL load:
tPHL = 6.4 + (0.06 .. (100-50» = 6.4 + 3.0 = 9.4 ns
tpLH = 7.4 + (0.05 .. (100-50» = 7.4 + 2.5 = 9.9 ns
ACT 1 FPGAs
Timing Derating
Operating temperature and voltage and device processing
conditions account for variations in array timing characteristics.
These variations are summarized into a derating factor for ACT
array typical timing specifications. Derating factors are shown
below: best-case reflects mInImUm operating voltage and
temperature and best-case processing; worst-case reflects
maximum operating voltage and temperature and worst-case
processing. Best-case derating is based on sample data only and is
not guaranteed.
Factor (x typical)
Commercial
Industrial
Military
Best-case
0.55
0.47
0.40
Worst-case
1.19
1.27
1.38
Temperature Derating Curve
Voltage Derating Curve
1.20
1.40
1.15
1.30
,,~
1.10
~
If
1.05
1.20
~
........
1.00
'-
0.95
~~
j
~
1.10
1.00
, ~'"
..........
0.90
0.90
0.80
0.85
4.5
4.75
5.0
5.25
,~
I'"
0.70
-60
0.80
5.5
Vee (Volts)
"'~~
-40 -20
0
20
40
Output Buffer Delays
To AC test loads (shown below)
GND
GND
tpHL
60
Temperature (0C)
o
~""
80
100
120
I
AC Test Loads
Load 2
(Used to measure rising/failing edges)
Load 1
(Used to measure propagation delay)
Vee
)>-----'l
T
GND
•
To the output under test
•
50PF
R to Vee for tpLZ/tpZL
R to GND for tpHz/tpZH
To the output under test
'>----.
R = 1 kO
50PF
T
Module Delays
Input Buffer Delays
sDan-out=2
A
Y
B
Vee
S,~V50%
500/:'r-...._ _G_N_D
_ _ _ _ __
Vee
Out
"V50 %
',50%
GND
t pLH
t pHL
Vee
Out
50~r-...
GND
t pHL
GND
/~O%
t pLH
D-1'ype Flip-FlOp and Clock Delays
PRE
D
Q
Fan-out = 2
Clk
ClR
(Positive edge triggered)
tsu = Setup time
tH = Hold time (0 ns)
tw = Minimum pulse width (10 ns)
Vee
Q
50%
ACT 1 FPGAs
Soft Macro LIbrary Overview
Macro Name
Description
Modules Req'd
Levels of Logic
Counters
4
CNT4A
17
4 bit loadable binary counter with clear
CNT4B
15
4 bit loadable bin counter wi elr, active low carry in & carry out
4
UDCNT4A
24
4 bit upldown cntr wi sync active low load, carry in & carry out
6
Decoders
2 to 4 decoder
DEC2X4
4
DEC2X4A
4
2 to 4 decoder with active low outputs
DEC3X8
8
3 to 8 decoder
3 to 8 decoder with active low outputs
DEC3X8A
8
DEC4X16A
20
DECE2X4
4
DECE2X4A
4
4 to 16 decoder with active low outputs
2
2 to 4 decoder with enable
2 to 4 decoder with enable and active low outputs
DECE3X8
11
3 to 8 decoder with enable
2
DECE3X8A
11
3 to 8 decoder with enable and active low outputs
2
Latches and Registers
DLC8A
8
Octal latch with clear
DLE8
8
Octal latch with enable
DLM8
8
Octal latch with mulitplexed inputs
REGE8A
20
Octal register with preset and clear, active high enable
2
REGE8B
20
Octal register wi active low clock, preset & clear, active high en.
2
One bit full adder
3
•
Adders
FA1
3
FADD8
37
8 bit fast adder
4
FADD12
62
12 bit fast adder
5
FADD16
78
16 bit fast adder
5
FADD24
120
24 bit fast adder
6
FADD32
160
32 bit fast adder
7
Comparators
ICMP4
5
4 bit identity comparator
2
ICMP8
9
8 bit identity comparator
3
MCMP16
93
MCMPC2
9
16 bit magnitude comparator
5
2 bit magnitude comparator with enables
3
MCMPC4
18
4 bit magnitude comparator with enables
4
MCMPCS
36
8 bit magnitude comparator with enables
6
Multiplexors
MX8
3
8 to 1 multiplexor
2
MX8A
3
8 to 1 multiplexor with an active low output
2
MX16
5
16 to 1 multiplexor
2
Multipliers
SMULT8
235
8 x 8 two's complement multiplier
Varies
1-11
Soft Macro Library Overview (continued)
Macro Name
Description
Modules Req'd
Levels of Logic
Shift Registers
SREG4A
8
4 bit shift register with clear
2
SREG8A
18
8 bit shift register with clear
2
TA138
12
3 to 8 decoder with 3 enables and active low outputs
2
TA139
4
2 to 4 decoder with an enable and active low outputs
TA151
5
8 to 1 multiplexor with enable, true, and complementary outputs
3
TA153
2
4 to 1 multiplexor with active low enable
2
TTL Replacements
2 to 1 multiplexor with enable
TA157
TA161
22
4 bit sync counter w/ load, clear, count enables & ripple carry out
TA164
18
8 bit serial in, parallel out shift register
TA169
25
4 bit synchronous up / down counter
6
TA181
31
4 bitALU
4
TA194
14
4 bit shift register
3
TA195
10
4 bit shift register
TA269
50
8 bit up/down cntr w/ clear, load, ripple carry output & enables
8
TA273
18
Octal register with clear
1
Parity generator and checker
4
TA280
9
TA377
16
Octal register with active low enable
189
Universal Asychronous Receiver / Transmitter
UARTs
UART
1-12
7-Tx 4-Rx
ACT 1 FPGAs
Hard Macro Library Overview
The following illustrations show all the available Hard Macros.
2-lnput Gates (Module Count = 1)
~ ~ ~
~ ~NAND~ ~NAND~
~ ~ ~
~ ~ ~
3-lnput Gates (Module Count
= 1, unless indicated otherwise)
I
® Indicates 2-module macro
... Indicates extra delay input
1-13
4-lnput Gates (Module Count = 1, unless indicated otherwise)
®
Indicates 2-module macro
... Indicates extra delay input
f)o~
nNO~
XOR Gates
(Module Count
= 1)
AND XOR Gates
(Module Count = 1)
1-14
XOR OR Gates
XOR AND Gates
(Module Count = 1)
(Module Count = 1)
ACT 1 FPGAs
AND OR Gates (Module Count = 1)
®
Indicates 2-module macro
... Indicates extra delay input
A
B
c
..
A
c
A
B
A
c
B
D
C
D
OR AND Gates (Module Count
= 1)
®
Indicates 2-module macro
.... Indicates extra delay input
ACT 1 FPGAs
Buffers (Module Count = 1)
I/O Buffers (I/O Module Count
= 1)
D
Multiplexors (Module Count
•
= 1)
A
A
B
B
DO
D1
y
D2
D3
c
D
Latches (Module Count
= 1)
-D---G--D----G-
--0 -0 -t:J -t:J
o Latches with Clear (Module Count = 1)
D
OLeA
G
G
CLR
CLR
o Latches with Enable (Module Count
Q
D
Q
OLe
= 1)
n1::J
Mux Latches (Module Count = 1)
OLME1A
A
n
Q
B
OLMA
S
G
Adders (Module Count
= 2)
A
A
A
B
B
B
A
B
CO
CO
CO
CO
.AS
.AS
.AS
.AS
HA1
HA1A
HA1B
AD
A
B
CI
B
CO
CI
.AS
FA1A
FA1B
HA1e
A1
CO
B
CO
.AS
CI
.AS
FA2A
Macros FA1A, FA1 B, and FA2A have two level delays from the inputs to the S outputs, as indicated by the.A
ACT 1 FPGAs
D Flip-FlopS (Module Count
= 2)
--D-----G-----D--~
-0 --0 -D -D
D Flip-Flops with Clear
a
D
DFC1
ON
DFC1A
D
a
D
DFC1B
ON
DFC1C
D
a
CLR
a
DFC1D
ON
DFC1E
CLR
CLR
D
CLR
CLR
CLR
D
ON
DFC1F
D
ON
DFC1G
D
CLR
CLR
D Flip-Flops with Preset
PRE
PRE
PRE
a
D
a
D
PRE
a
D
DFP1
DFP1A
PRE
D
a
DFP1D
PRE
PRE
D
ON
DFP1E
D
ON
DFP1F
DFP1B
D Flip-Flops with Preset and Clear
PRE
D
a
DFPC
PRE
D
a
DFPCA
CLR
CLR
ON
DFP1C
D
PRE
D
ON
DFP1G
I
D Flip-Flops with Enable (Module Count = 2)
D-~~
=t:J TI =t:J
PRE
D
PRE
PRE
D
Q
D
E
Q
E DFEB
E DFEC
ClR
ClR
ClR
JK Flip-Flops (Module Count = 2)
D-
n
PRE
J
Q
JKFPC
K
ClR
Mux Flip-Flops (Module Count = 2)
DFME1A
=rct-=ro=k:J~
A
Q
B
DFMB
uQ
B
~ ClK
S
CLKBUF Interface Macros (Module Count = 1)
DO
D1
y
D2
D3
Q
DFED
ACT 1 FPGAs
Packages
(Top View)
()
z
~
Cl
9 8 7 6 5 4 3 2 1 6867666564636261
39
38
37
36
35
34
33
32
31
GND
44-PIN
PLCC
Vee
Vpp
Vee
MODE
ClK or I/O
GND
GND
GND
68-PIN
PLCC
30
29
Vee
1819202122232425262728
()
z
Cl
60
59
58
57
56
55
54
53
52
51
50
49
48
47
46
45
44
PRS or I/O
PRA or I/O
DClKor I/O
SDI or I/O
~
Vpp
PRS or I/O
PRA or I/O
DClKor I/O
SDI or 110
Vee
MODE
ClK or 110
GND
N~293O~U3334~~~~~~~Qa
()
0
z
~
Cl
•
§;
0
a:
()
m
Cl
a..
z
PRA or I/O
DClK or I/O
SDI or I/O
NC
71
GND
GND
68
67
Vee
Vee
64
63
62
61
60
59
58
57
56
55
54
ClK or I/O
MODE
84-PIN
PLCC
Vee
Vee
Notes:
1. Vpp must be terminated to Vee, except during device programming.
2. MODE must be terminated to circuit ground, except during device
programming or debugging.
GND
GND
3. Unused 110 pins are designated as outputs by the Action Logic System
and are driven low.
4. All unassigned pins are available for use as 1I0s.
Packages (continued)
9 8 7 6 5 4 3 2 1 6867666564636261
39
38
37
36
35
34
33
32
31
30
29
GND
44·PIN
JQCC
Vee
Vpp
Vee
MODE
ClKor I/O
GND
0
z
GND
GND
Vee
ClK or 1/0
51
GND
Vee
~
PRB or 110
PRAor I/O
DClK or I/O
SDI or I/O
MODE
68·PIN
JQCC
1819202122232425262728
C!l
60
59
58
57
56
55
PRB or 1/0
PRA or 1/0
DClKor 1/0
SDI or I/O
Vpp
2728293031323334 3536 37383940414243
~
oz
C!l
o
C£I
a:
a..
1110967 654 3 2184638281807978777675
NC
GND
GND
84·PIN
JQCC
Vee
Vee
Notes:
1. V pp must be terminated to Vee, except during device programming.
2. MODE must be terminated to circuit ground, except during device
programming or debugging.
74
73
72
71
70
69
68
67
66
65
64
63
62
61
60
59
58
57
56
55
54
PRA or 110
DClKor 110
SDI or I/O
Vee
Vee
MODE
ClKor I/O
GND
GND
3. Unused 110 pins are designated as outputs by the Action Logic System
and are driven low.
4. All unassigned pins are available for use as 1I0s.
ACT 1 FPGAs
Packages (continued)
2
3
4
5
6
7
8
9 10 11
00000000000
00000000000
cOO.
000
00
D
0 0
00
000
E 000
84-PIN
CERAMIC
000
F
000
PIN GRID
ARRAY
000
GOOO
H
0 0
00
JOO
00000
K 00000000000
L
00000000000
A
B
•
Orientation Pin (C3)
•
Pin Assignments for the 84-Pln PGA
Signal
ACT 1010
ACT 1020
PRA
A11
A11
PRB
B10
B10
MODE
E11
E11
SD!
B11
B11
DCLK
C10
C10
Vpp
K2
K2
CLKor I/O
F9
F9
GND
B7, E2,E3, K5, F10,G10
B7,E2, E3, K5, F10,G10
Vee
B5,F1,G2,K7, E9,E10
B5,F1,G2, K7, E9, E10
N/C (No Connection)
B1, B2,C1,C2, K1,J2, L1,J10, K10, K11,C11,D10,D11
B2
Notes:
1. Vpp must be terminated to Vee, except during device programming.
2. MODE must be terminated to circuit ground, except during device
programming or debugging.
3. Unused 110 pins are designated as outputs by the Action Logic System
and are driven low.
4. All unassigned pins are available for use as liDs.
Package Mechanical Details
J-Leaded Cerquad Chip Carriers
.040" ±
Pin 1 Index Chamfer
.010"1!d=~=====t~
t
.050" BSC
--=:J
--1
.105" ± .010"
0.007"
0.011"
-a
.155"
.180"
Lead Count
44
D,E
E1,D1
.690" ± .005"
.650" ± .008"
68
.990" ± .005"
.950" ± .008"
84
1.190" ± .005"
1.150" ± .008"
Plastic J-Leaded Chip Carriers
o
01
0
~m
-.\ I- .029"
± .003"
.175" ± .010"
Lead Count
44
O,E
.690" ± .005"
E1,01
.655" ± .005"
68
.990" ± .005"
.955" ± .005"
84
1.190" ± .005"
1.155" ± .005"
ACT 1 FPGAs
Ceramic Pin Grid Array
~ ~ .050"
II
Pin #1
± .010"
(4 PLCS)
~
-,.018" ± .002"
_____L
.100" TYp.
-----r
~
1.100" + .020" Squ"",
----1
.120"
.140"
0000000000~r--r-
00000000000
00
000
00
00
00
000
000
000
000
000
000
00
00
00.
000
00
00000000000
I
1.000 SSC
cJOOOOOOOOOG--"•
Orientation Pin
ACT™ 1 Military
Field Programmable
Gate Arrays
Preliminary
Features
Benefits
• Military Data Sheet Devices
- Operation Over Full Temperature Range;
-55°C to + 125 °C, with ± 10% Voltage Supplies
Actel's ACT™ 1 field programmable gate array (FPGA) is an ideal
device for military applications. Designs are easily implemented
and programmed in minutes. This nonvolatile device offers a
risk-free solution for almost any type of logic integration problem.
The ACT 1 family includes the ACT 1010 device with a capacity of
1,200 gate array equivalent gates or 3,000 PLD/LCA™ gates, and
the ACT 1020 device with a capacity of 2,000 gate array gates or
6,000 PLD/LCA gates.
• Military STD 883-C Class B Devices·
- Assembled to Method 5004
- Dynamic Bum-In
- Qualification to Method 5005
- Group A, B, C, D Test Reports Available
• Packaging
- 84-Pin Ceramic PGA - Available Now
- 44-,68- and 84-Pin J-Leaded Cerquad Chip Carrier Available Q2, 1990
- 84-Pin Ceramic Quad Flat Pack - Planned
• Architecture/Technology
- Antifuse-based architecture offers design flexibility and
high gate utilization
- Architecture and antifuse provide high security for
customer designs
- CMOS process offers low power consumption
- Complete Test Coverage
• Design Tools
- Actel Action Logic™ System uses fully automatic place
and route algorithms and offers timing analysis to military
worst-case specifications
- Interface to Viewlogic@, Mentor Graphics™, Valid™, and
OrCAD™
• High Reliability
- Nonvolatile, Permanent Programming
- Antifuse reliability ( < 10 F1TS) exceeds that of the base
CMOS process
- Very low product failure rate: 91 F1TS - 0.091% failures
per 1000 hours at 70°C ambient temperature (90°C
junction temperature)
- Radiation tolerant CMOS antifuse process: preliminary
total dose tests better than 1.5 megarads (Si)
- Greater than 2000 V ESD on all pins (Class 2 test method
3015 MIL-STD-883C)
• Compliance to Para 1.21 MIL-STD-883 and MIL-M-3851O
pending.
©
1990 Actel Col'DOratlon
No cost risk - Once you have an Action Logic System, Actel's
CAE software and programming package, you can produce as
many chips as you like for just the cost of the device itself, no NRE
charges to eat up your development budget each time you want to
try out a new design.
No time risk - After entering your design, placement and routing
is automatic, and programming the device takes only about 5
minutes for an average design. You save time in the design entry
process by using tools that are familiar to you. The Action Logic
System (ALS) software interfaces to popular CAE software such as
Mentor Graphics, Valid, OrCAD, and Viewlogic and runs on
popular platforms such as ApolloTM, Sun 3™, Sun 4TM, and 386 PC
compatible machines.
No reliability risk - The PLICE™ antifuse is a one-time
programmable, nonvolatile connection. Since Actel devices are
permanently programmed, no down-loading from EPROM or
SRAM storage is required. Inadvertent erasure is impossible and
there is no need to reload the program after power disruptions.
Both the PLICE antifuse and the base process are radiation
tolerant. Fabrication using a low-power CMOS process means
cooler junction temperatures. Actel's non-PLD architecture
delivers lower dynamic operating current. Our reliability tests show
a very low failure rate of 91 F1TS at 90°C junction temperature
with no degradation in AC performance. Special stress testing at
wafer test eliminates infant mortalities prior to packaging.
No security risk - Reverse engineering of programmed Actel
devices from optical or electrical data is extremely difficult.
Programmed antifuses cannot be identified from a photograph or
by using a SEM. The antifuse map cannot be deciphered either
electrically or by microprobing. Each device has a silicon signature
that identifies its origins, down to the wafer lot and fabrication
facility.
No testing risk - Unprogrammed Actel parts are fully tested at
the factory. This includes the logic modules, interconnect tracks,
and I/Os. AC performance is assured by special speed path tests
and programming circuitry is verified on test antifuses. During the
programming process an algorithm is run to assure that all
antifuses are correctly programmed. In addition, Actel's
Actionprobe™ diagnostic tools allow 100 percent observability of
all internal nodes to check and debug your design.
I
Ordering Information
A
PG
1020
84
M
Package Type
Speed Bin
Die Revision
Part Number
Device Family
Packages
Type
Lead Count
PG
Ceramic Pin Grid Array
84
JQ*
J-Leaded Cerquad Chip Carrier
44,68,84
Applications
*
M
Military
B*
883 B
Important:
Contact your Actel representative for current availability.
ACT 1 Military FPGAs
Actel Military Product Flow
883C-Class B
883C Method
Screen
Step
883C-Class B
Requirement
Military
Datasheet
Requirement
1.0
Internal Visual
2010, Test Condition B
100%
100%
2.0
Temperature Cycling
1010, Test Condition C
100%
100%
3.0
Constant Acceleration
2001, Test Condition E
(min), Y1, Orientation only
100%
100%
4.0
Seal
a. Fine
b. Gross
1014
100%
100%
100%
100%
100%
100%
5.0
Visual Inspection
6.0
Pre Burn-in
Electrical Parameters
In accordance with Actel
applicable device specifications
100%
N/A
7.0
Bum-in Test
1015 Condition D
160 hours @ 125°C Min.
100%
N/A
S.O
Interim (post bum-in)
Electrical Parameters
In accordance with Actel
applicable device specifications
100%
100%
(as final test)
9.0
Percent Defective Allowable
5%
All Lots
N/A
Final Electrical Test
In accordance with Actel
applicable device specifications
a. Static Tests
(1) 25°C
(Subgroup 1, Table I, 5005)
(2) -55°C and +125°C.
(Subgroups 2, 3, Table I, 5005)
100%
100%
b. Dynamic and Functional Tests
(1) 25°C
(Subgroup 7, Table I, 5005)
(2) -55°C and + 125°C.
(Subgroups SA and SB, Table I, 5005)
100%
100%
c. Switching Tests at 25°C
(Subgroup 9, Table I, 5005)
100%
100%
10.0
11.0
Qualification or Quality
Conformance Inspection Test
Sample Selection (Group A)
5005
All Lots
N/A
12.0
External Visual
2009
100%
Actel specification
•
Recommended Operating Conditions
Device Resources
Temperature
Range (To)
Device Modules Gates 84 JQCC 68 JQCC 44 JQCC 84 PGA
1010
295
1200
N/A
57
34
1020
546
2000
69
57
34
57
69
VI
Total Chip Power (mW)
-0.5 to +7.0
Volts
Where:
-0.5 to Vee + 0.5
Volts
Voltage 1
Input Voltage
Vo
Output Voltage
-0.5 to Vee + 0.5
Volts
11K
Input Clamp Current
±20
mA
10K
Output Clamp Current
±20
mA
10K
Continuous Output Current
±25
mA
T STG
°C
±10
%Vee
The following formula is used to calculate total device dissipation.
Units
Limits
Parameter
DC Supply
-55 to +125
Power Dissipation
Free air temperature range
Vee
Units
Power Supply
Tolerance
Absolute Maximum Ratings
Symbol
Military
Parameter
User I/Os
-65 to + 150
Storage Temperature
= 0.41 N*F1 + 0.17 M*F2 +
1.62 P*F3
= Average logic module switching rate in MHz.
= Average clock pin switching rate in MHz.
F3 = Average I/O switching rate in MHz.
M = Number of logic modules connected to the clock pin.
N = Total number of logic modules used on the chip.
F1
F2
(including M)
P
°C
Stresses beyond those listed under '~bsolute Maximum Ratings" may
cause permanent damage to the device. Exposure to absolute maximum
rated conditions for extended periods may affect device reliability. Device
should not be operated outside the Recommended Operating Conditions.
= Number of outputs used loaded with 50 pE
The second term, variables F2 and M, may be ignored if the
CLKBUF macro is not used in the design.
Note:
1. Vpp = Vee, except during device programming.
DC Characteristics
Vee = 5.0V ± 10%; -55 °C:$;Te :$; +125 °C
Limits
Parameter
Test Conditions
VOL
Output Low Voltage
Vee = 4.5 V. IOL = 4 mA
Test one output at a time
VOH
Output High Voltage
Vee = 4.5 V. IOH = -3.2 mA
Test one output at a time
1,2,3
3.7
VIL
VIH
Input Low Level
1,2,3
-0.3
Input High Level
1,2,3
2.0
Icc
Standby Supply Current
Vee = 5.5 V. VIN = Vee or GND
Outputs unloaded
IlL
Input Leakage Current
Vee
loz
Output Leakage Current
Vee
los
Output Short Circuit Current
VOUT = Vee
Test one output at a time
Symbol
VOUT
= GND
Vee
Vee
= 5.5 V. VIN = Vee or GND
= 5.5 V. VOUT = Vee or GND
Min.
Max.
1,2,3
0.4
Units
V
V
0.8
Vee + .3
1,2,3
25
V
V
mA
1,2,3
-10
10
pA
1,2,3
-10
10
pA
20
140
mA
-10
-100
mA
1,2,3
= 4.5 V for min. limit
= 5.5 V for max. limit
Switching Characteristics
vee
= 5.0 V ±
Symbol
tpOB
10%; -55°C:$; Te:$; +125 °C
Parameter
Binning Circuit Delay
ACT 1010
ACT 1020
Limits
Test Conditions
Vee = 4.5V
VIH = 3 V. VIL = 0 V
VOUT = 1.5 V
Min.
9, 10, 11
Max.
Units
_17.::2:-::0_ _....;n:..:;:s:....-_
186
ns
ACT 1 Military FPGAs
Functional and Switching Tests
ACT 1010 and ACT 1020 devices can be tested functionally by
using a serial scan test method. Data is shifted into the SDI pin, and
the DCLK pin is used as a clock. The data is used to drive the inputs
of the internal logic and I/O modules, allowing a complete
functional test to be performed. The outputs of the modules can be
read by shifting out the output response. All units are tested for
functionality over the military temperature range. Group A
subgroups 7, SA, and 8B are tested in the same manner.
AC timing for logic module internal delays and pin-to-pin delays is
determined after place and route. The AlS Timer utility displays
actual timing parameters for circuit delays. Since these delays are
design-dependent and cannot be tested on unprogrammed devices,
Actel tests for AC performance by measuring the input-to-output
delay of a special path called the "binning circuit."
The binning circuit consists of one input buffer + n logic modules
+ one output buffer (n = 16 for the ACT 1010; n = 28 for the ACT
1020). The logic modules are distributed along two sides of the
device. These modules are configured as inverting and
non-inverting buffers. The modules are connected through
programmed antifuses with typical capacitive loading.
Actel uses a special benchmark design to correlate the binning
circuit delay to typical and worst-case design delays. Samples of
units are programmed to this benchmark circuit and all
programmed paths are measured for AC performance (including
the binning circuit delay). The measured delays are then compared
to the AlS Timer predictions. The binning circuit maximum delay
has been set to assure conformance to the predicted delays. Units
are sampled to confirm this correlation upon initial device
characterization and whenever a change is made that may affect
AC performance.
Package Thermal Characteristics
The device junction to case thermal characteristic is 6jc, and the
junction to ambient air characteristic is 6ja. The thermal
characteristics for 6ja are shown with two different air flow rates.
Maximum junction temperature is 150°C.
A sample calculation of the maximum power dissipation for a
ceramic PGS4 package at military temperature is as follows:
150 (Max) - 125 (Max Mil.) = 25/33 C/W (pGS4 - still air)
Watts.
Pin Count
6jc
6ja
Still air
Ceramic Pin Grid, PG84
84
8
33
20
°C
Cerquad J lead, JQ44
44
8
40
32
°C
Cerquad J lead, JQ68
68
8
38
30
°C
Cerquad J lead, JQ84
84
8
36
25
°C
Package Type
PACKAGE CASE
\
DEVICE JUNCTION
6ja
= 0.76
300 ft/m!n.
.. AIR FLOW
..
Units
•
Output Buffer Performance Derating
Source
Sink
10
10
,/
,
,
~,.
~
~~
~r
"
2
0.2
0.3
0.4
~
0.5
2
4.0
0.6
VOL (Volts)
""~
~
3.6
3.2
2.8
VOH (Volts)
Worst-case values at 125°C, 4.5 V.
Note:
The above CUlVes are based on characterizations of sample devices and are
not completely tested on all devices.
2.4
2.0
ACT 1 Military FPGAs
typical and critical (speed-sensitive) nets are given below. Most nets
will fall into the "typical" category.
Timing Characteristics
Timing is design-dependent; actual delay values .are de~e.rmined
after place and route of the design using t~e ALS TIme~ ~t1lIty. The
following delay values use statistical estImat~s for. ~n~g delays
based on 85% to 90% module utilization. DeVIce utIlIzatIon above
95% will result in performance degradation.
Less than 1% of all routing in a design requires the use of "long
tracks." Long tracks, long vertical or horizontal routing paths, are
used by the autorouter only as needed. Delays due to the use oflong
tracks range from 15 ns to 35 ns. Long tracks may be used to route
the least-critical nets in a given design.
With Actel's place and route program, the user can assign a net's
criticality level, based on timing requirements. Delays for both
In the following tables, timing is characterized over the
recommended operating conditions, unless specified otherwise.
Module Timing
5.0 V; TA
vee
25°C; tpD
4.0 ns @ FO - 0
Single Logic Module Macros
(e.g., most gates, latches, multlplexors)1
Output Net
Parameter
FO
=1
FO - 2
FO - 3
FO
4
FO
8
Units
t pD
Critical
7.0
7.5
8.0
11.0
Note 2
ns
t pD
Typical
8.2
8.7
10.0
11.2
14.0
ns
Dual Logic Module Macros
(e.g., adders, wide Input gates)1
Parameter
Output Net
FO = 1
FO - 2
FO - 3
tpD
Critical
12.0
12.5
13.0
16.0
Note 2
tpD
Typical
13.2
13.7
15.0
16.2
19.0
4
FO
FO
8
Sequential Element Timing Characteristics
Fan-out
Parameter
FO
=1
FO
=2
FO
=3
FO
=4
FO - 8
tsu
Set Up Time, Data Latches
4.6
5.0
5.4
5.8
5.8
tsu
Set Up Time, Flip-Flops
5.0
5.0
5.0
5.0
5.0
tH
Hold Time
0
0
0
0
0
tw
Pulse Width, Minimum
10.0
10.0
10.0
10.0
10.0
tpD
Delay. Critical Net
7.0
7.5
8.0
11.0
Note 2
tpD
Delay, Typical Net
8.2
8.7
10.0
11.2
14.0
Notes:
1. Most flip-flops exhibit single module delays.
2. Critical nets have a maximum fan-out of 6.
3. Minimum pulse width, tw, applies to CLK, PRE, and CLR inputs.
I/O Buffer Timing
vee = 5.0 V; TA = 25°C
INBUF Macros
Parameter
From - To
FO
=1
FO
=2
FO
=3
FO
=4
FO
=8
Units
Pad to Y
9.0
9.9
11.6
13.9
18.6
ns
Pad toY
7.7
8.4
10.0
10.9
16.1
ns
CLKBUF (High Fan-out Clock Buffer) Macros
Parameter
FO
= 20
FO = 50
FO
= 200
Units
11
15
25
ns
12
16
25
ns
Notes:
1. A clock balancing feature is provided to minimize clock skew.
2. There is no limit to the number of loads that may be connected to the
CLKBUF macro.
OUTBUF, TRIBUFF & BIBUF Macros
CL = 50 pF
Parameter
From - To
CMOS
TTL
Units
tpHL
D to Pad
5.1
6.4
ns
t pLH
D to Pad
9.4
7.4
ns
t pHZ
E to Pad
6.7
4.4
ns
tpZH
E to Pad
8.4
6.3
ns
tpLZ
E to Pad
8.9
6.7
ns
tPZL
E to Pad
6.4
7.7
ns
Change in Propagation Delay with Load Capacitance
Parameter
From - To
CMOS
TTL
Units
tpHL
D to Pad
0.04
0.06
ns/pF
tpLH
D to Pad
0.09
0.05
ns/pF
tpHZ
E to Pad
0.10
0.06
ns/pF
t pZH
E to Pad
0.09
0.05
ns/pF
t pLZ
E to Pad
0.09
0.05
ns/pF
tpZL
E to Pad
0.04
0.05
ns/pF
Notes:
1. The BIBUF macro input section exhibits the same delays as the INBUF
macro.
2. Load capacitance delay delta can be extrapolated down to 15 pF
minimum.
Example:
Delay for OurnUF driving a 100-pF TIL load:
tPHL = 6.4 + (0.06 .. (100-50» = 6.4 + 3.0 = 9.4 ns
tpLH = 7.4 + (0.05 .. (100-50)) = 7.4 + 2.5 = 9.9 ns
ACT 1 Military FPGAs
Timing Derating
Operating temperature and voltage and device processing
conditions account for variations in array timing characteristics.
These variations are summarized into a derating factor for ACT
array typical timing specifications. Derating factors for military
devices are shown below: best-case reflects minimum operating
voltage and temperature and best-case processing; worst-case
reflects maximum operating voltage and temperature and
worst-case processing. Best-case derating is based on sample data
only and is not guaranteed.
Factor (x typical)
Best-case
0.40
Worst-case
1.38
Voltage Derating Curve
Temperature Derating Curve
1.20
1.40
1.15
1.30
1.10
.s
~
1.05
1.00
I."'~
1.20
"'"
........
~
0.95
~ .....
j
1-0"
1.10
1,..-1-0"
I.;
1.00
, ~I,..-
.........
0.90
0.90
~I-o"
,~
•
0.80
0.85
0.70
-60
0.80
4.5
4.75
5.0
5.25
5.5
Vee (Volts)
-40 -20
0
20
40
80
80
100
120
Temperature (0C)
Output Buffer Delays
D
To AC test loads (shown below)
GND
GND
Vee
Out
1.5 V
GND
tpZH
1-35
AC Test Loads
Load 2
(Used to measure rising/failing edges)
Load 1
(Used to measure propagation delay)
GND
Vee
•
To the output under test
)>-----'l
I
•
50P
R to Vee for tpLZ/tpZL
Rto GND for tpHz/tpZH
'
To the output under test " } - - - - -
R = 1 kO
50PF
T
Input Buffer Delays
Module Delays
sDan-out=2
A
Y
B
Vee
s, ~ v50 %
2' ' ___G_N_D_ _ _ __
50 0
Vee
Out
"",V50%
"r-... 5O %
GND
t pHL
tpLH
Out
Vee
50~"
GND
t pHL
GND
..... VSO%
t pLH
D-Type Flip-Flop and Clock Delays
PRE
D
Q
Fan-out = 2
Clk
CLR
(Positive edge triggered)
tsu = Set up time
tH = Hold time (0 ns)
tw = Minimum pulse width (10 ns)
Vee
Q
1-36
50%
ACT 1 Military FPGAs
Packages
(Top View)
2
A
B
C
D
E
F
G
H
J
K
L
3
4
5
6
7
8
9 10 11
00000000000
00000000000
00.
000
00
00
00 (';
000
000
84·PIN
CERAMIC
000
000
PIN GRID
ARRAY
000
000
00
00
00
000
00
00000000000
00000000000
•
Orientation Pin (C3)
•
Pin Assignments for the 84-Pln PGA
Signal
ACT 1010
ACT 1020
PRA
A11
A11
PRB
B10
B10
MODE
E11
E11
SDI
B11
B11
DCLK
C10
C10
Vpp
K2
K2
CLKor I/O
F9
F9
GND
B7, E2, E3, K5, F10, G10
87,E2, E3, K5, F10, G10
Vee
85, F1,G2, K7, E9, E10
85,F1, G2, K7, E9, E10
N/C (No Connection)
B1, B2, C1, C2, K1, J2, L1, J10, K10, K11,
C11, D10, D11
82
Notes:
1. V pp must be terminated to Vee, except during device programming.
2. MODE must be terminated to circuit ground, except during device
programming or debugging.
3. Unused 110 pins are designated as outputs by the Action Logic System
and are driven low.
4. All unassigned pins are available for use as 1I0s.
1-37
Packages (continued)
:»
0
z
C!l
6 5 4 3 2 1 4443424140
GND
44-PIN
JQCC
Vee
Vpp
39
38
37
36
35
34
33
32
31
30
PRS or I/O
PRA or I/O
DClKor I/O
SDI or I/O
Vee
MODE
ClK or I/O
GND
GND
GND
68-PIN
JQCC
29
17
Vee
1819202122232425262728
Vpp
60
59
58
57
56
55
54
53
52
51
50
49
48
47
46
45
PRS or I/O
PRA or I/O
DClKor I/O
SDI or I/O
Vee
MODE
ClK or I/O
GND
44
27282930313233343536 37383940414243
g
o
z
C!l
NC
GND
GND
84-PIN
JQCC
Vee
Vee
6
m
a:
a..
74
73
72
71
70
69
68
67
66
65
64
63
62
61
60
59
58
PRAor I/O
DCLK or I/O
SDI or I/O
Vee
Vee
MODE
ClK or I/O
GND
GND
56
55
54
Notes:
1. Vpp must be terminated to Vee, except during device programming.
2. MODE must be terminated to circuit ground, except during device
programming or debugging.
Important:
Please contact your Acte1 representative for current package availability.
1-38
3. Unused 110 pins are designated as outputs by the Action Logic System
and are driven low.
4. All unassigned pins are available for use as liDs.
ACT 1 Military FPGAs
Package Mechanical Details
Ceramic Pin Grid Array
-.J I- .050" ± .010"
II (4 PLCS)
Pin #1
~
~
.018" ± .002"
_____L
.100" TYp.
-----r
~
1100" +020" Square
---1 ~UI. .I~~:
080
"
I
OOOOOOOOOOG-t--r00000000000
00
000
00
00
00
000
000
o 0 0 1.000 SSC
000
000
000
00
00
00.
000
00
00000000000
(j000000000G-t--~
•
Orientation Pin
•
Package Mechanical Details (continued)
J-Leaded Cerquad Chip Carriers
.040" ±
.010"1~'!IE"==l*f== F.!..~1LI.1...L.U..J...L.LoooU..,!~'::~-b.l-~_~Pin
t
.050" BSC
-~
0.007"
0.011"
--1 -
.105" ± .010"
I~
-0
.155"
.180"
D,E
Lead Count
44
68
84
.690" ± .005"
E1, D1
.650" ± .008"
.990" ± .005"
.950" ± .008"
1.190" ± .005"
1.150" ± .008"
Important:
Please contact your Actel representative for current package availability.
1 Index Chamfer
Action Logic™ System
FPGA Design Environment
Features
• Extensive
Gate Array
Macro Library
• Supports Popular
CAE Systems
• Design Validation,
Electrical Rules Check
• Fully Automatic
Place and Route
• Menu-Driven Pin Editor
for 1/0 Assignment
• Timing Analyzer Inspects All Critical
Paths and AC Specifications
Overview
Schematic Capture and Simulation
The Action Logic™ System is a high-productivity CAE
environment for implementing logic functions with ACTTM 1 field
programmable gate array devices. A menu-driven interface allows
users to complete ACT 1 designs, from concept to silicon, within
hours without costly NRE. The system includes a software design
environment and an Activator™ programmer, tester, and
debugger. Most popular CAE platforms are supported. Designers
use their workstation to capture schematics, simulate, verify, place
and route, perform timing analysis, program, and debug the chip.
Figures 1 through 4 show the Actel interface to popular CAE
design environments. Actel's on-line help screens and reference
manuals speed and simplify the design process.
Users enter their schematic using the Actel library. When
schematic capture is completed, functional simulation is performed
on the design.
Pin Editor
The Actel Pin Editor is an easy-to-use, menu-driven program for
user assignment of logic lias and package pins. The editor
automatically scrolls a list of all user-designated lias. To assign any
pin to an 1/0, the user enters the desired pin number next to the
1/0 node name. During each pin assignment, the Editor insures
each pin is a valid package pin and is not already assigned to
another chip I/O.
Gate Array Macro Library
Design Validation
The Actel macro library contains the logic function building blocks
necessary to create a design. The library includes macros ranging in
complexity from simple gates to complex functions. Each macro
has a graphic symbol or icon. Hard macros also contain placement
information, a netlist, and timing data.
Basic gates from the library are used to create soft macros, such as
counters, adders, and decoders. The Actellibrary contains over 200
different macros.
©
1990 AC1el Corporation
The Actel Design Validator examines an ACT design for
adherence to design rules specific to the ACT device chosen for the
design. Validation verifies routability and performs design rules
checks prior to routing. The Validator produces error and
information messages and system warnings regarding electrical
rules violations such as excessive fanout, shorted outputs, or
unconnected inputs. A warning is issued, for example, if a net
exceeds a fanout of ten; unconnected module inputs produce an
error message.
•
The Validator also provides statistical information about
routability, logic module count, I/O count, average fanout per net,
and array utilization. After passing through the validator, the
design is free of electrical rules violations.
Automatic Place and Route
Fully automatic place and route software minimizes design delay by
assigning macros to optimal locations in the chip. The system uses
the design's netlist, critical net information, and I/O assignments to
automatically place and route all the logic blocks within the circuit.
No manual intervention is required, even at high device utilization.
The software provides the user with data on actual wire lengths,
capacitive loading, and wiring congestion. The route program
assigns the shortest possible net segments to connect the library
macros with minimal delay, routing 100% of the nets automatically
for 85% to 95% logic module utilization.
and evaluated with the timer or a CAE simulator. Using this
information, the designer can optimize the design to meet timing
specifications. The Timer accepts as input the design's netlist and
delay information. The user directs the type of analysis and output.
Delay reports generated by the timing analyzer using post-route
numbers provide the final AC specifications for the design.
Device Programming and Functional Test
Programming is controlled by the Activator programming system.
Action Logic software generates a fuse map for the ACT device,
which is used by the Activator for programming. A series of
internal address registers are automatically loaded; they specify the
programming element (pLICETM antifuse) to be programmed. A
programming sequence is then initiated, creating a permanent link.
These steps are repeated until all interconnections are made.
The Activator supports functional testing using an I/O test vector
file. It also supports interactive circuit debugging.
Timing Analysis
The timing analyzer (Timer) is an interactive tool that determines
and highlights all critical and non-critical paths within a specified
design. After the layout phase is completed, the Timer extracts
accurate net delays. These delays are back-annotated to the netlist
In-Circuit Test and Debug
Once the device is programmed and functionally verified, it is ready
for operation in the user's system. Actionprobe™ diagnostic tools
may then be used to further evaluate circuit integrity.
Action Logic System FPGA Design Environment
Ordering Information
Configuration
Part
Number
ALS-113
ALS-110
Debugger,
Aetlonprobes
Platform/
CAE Host*
Vlewlogle
Vlewdraw
Place & Route,
Timer
386 PCI
OrCADTM
Viewlogic
X
X
386 PCI
OrCAD
Viewlagic
X
X
X
X
X
X
X
X
X
X
X
X
X
X
ALS-031
ApolioTM/
Mentor Graphics™
X
ALS-Q35
Apollol
Mentor Graphics
X
ALS-041-S3
Sun 3™/
Valid™
X
ALS-Q45-S3
Sun 31
Valid
X
ALS-041-S4
Sun 4™I
Valid
X
ALS-045-S4
Sun 41
Valid
X
ALS-051-S3
Sun 31
Viewlogic
X
ALS-055-S3
Sun 3/
Viewlogic
X
ALS-051-S4
Sun 41
Viewlogic
X
ALS-Q55-S4
Sun 41
Viewlogic
X
X
OPTIONS
ALS-016
High-Density Viewsim for 386 PC
ALS-017
Low-Density (up to 650 logic modules) Viewsim for 386 PC
ALS-114
Boolean Entry and Optimization for 386 PC
ALS-119
Activator 1 Programmer
ALS-219, 239, 249-S3, 249-84
Activator 2 Programmer
ALS-256-S3
SUN 3-based Workview CAE
ALS-256-S4
SUN 4-based Workview CAE
"Includes schematic, symbol, and functional simulation libraries
Important:AlI systems and options are subject to availability.
Please contact your Actel representative for current status.
Activator
II
Vlewloglc Design Environment
Viewsim
I
Viewdraw®:
Design Entry
I
Viewlogic Netlist
Unit Delay
Functional Simulation
Post-Route
Simulation
I
f
I
I
I
Actel
Libraries
I
.vsm file
VSM Utility
f
Back-annotated post-layout delays (.dtb file)
I
I
Makeadl:
Netlist Conversion
I
Valldator:
Design Validation
•
I
Export
t
Place &
Route
I
I
I
.-
r
I
Actel Design Environment
Activator:
Device
Programmer
J Timer: Timing
I Analysis
I
J
Debugger &
Actionprobe
,I
I
Your design on
an Actal FPGA
I
Figure 1. Actel - Vlewloglc Design Flow
Mentor Graphics Design Environment
QuickSim™
I
NETEDTM & SYMEDTM
Design Entry
I
I
I
Unit Delay
Post-Route
II--------.t
Functional Simulation
Simulation
~--------------~--~f----~
I I
Actel
L-_Li_b_rn_ry~~
Makeadl:
L-_N_et_li_st_C_orn_v_e_rn_io_n~
I
I
I
J
•
Valldator:
Design Validation
Actel DeSign Environment
I
I
Place &
Route
1
Export.als (Delays back-annotation)
It-------e---~J
I
J
I
Activator:
Device
Programmer
I
Figure 2. Actel - Mentor Graphics Design Flow
1-44
I
I
Timer: Timing
Analysis
Debugger &
Actionprobe
Your design on
an Actal FPGA
I
Action Logic System FPGA Design Environment
OrCAD Design Environment
I
OrCAD/SDTTM:
Design Entry
~
Actel
Library
I
I
OrCAD Netlist
I
RCA2ADL:
Netlist Conversion
I
Valldator:
Design Validation
o
~
I
OrCADNSTTM:
Functional Simulation
I I
Place &
Route
I
1
I
I
.1 Timer: Timing
1
Analysis
I
J
I
Actel Design Environment
Activator:
Debugger &
Actionprobe
Device
Programmer 1
I
I
I
::::::;:;: :.:::::::::::::::::::::;:::::::::::.:....
I
Your design on
an Actel FPGA
I
•
Figure 3. Actel - OreAD Design Flow
Valid Design Environment
ValidSIMTM
ValidGEDTM:
Design Entry
I
Valid Netlist
I
Unit Delay
Functional Simulation
I
Post-Route
Simulation
f
.::::::::::::::::::::::::::::::::::::::: :.: :.:.:.:.:.:.:.:.;.:.:.:.....:. ....................•.......•..........•.....•...;.;.;.:.:.:.:.:........
I
I
I
I
Actel
Libraries
I I
~
Makeadl:
Netlist Conversion
~
Valldator:
Design Validation
I
Place &
Route
Back-annotated post-layout delays
1
r
I
J
Actel Design Environment
I
1 Timer: Timing
Analysis
I
I
Debugger &
Activator:
Device
Actionprobe
Programmer 1
I
I
•
Your design on
an Actel FPGA
I
Figure 4. Actel - Valid Design Flow
1-45
1-46
ACT™2
Field Programmable
Gate Arrays
Preliminary
Product
Brief
Features
Enhanced Logic Module
• High Gate Count
- Up to 8000 gate array gates
(20,000 PLD/LCA equivalent gates)
The logic module used in the ACT 2 family is an enhanced version
of the ACT 1 module, allowing improved implementation of high
fan-in combinatorial macros. The ACT 2 logic modules are also
optimized to implement high-speed flip-flops and latches. Figure 1
depicts part of a state machine implemented with five ACT 2
modules. The critical path delay is only two module delays.
• 1232 Logic Modules
• 140 User-Programmable I/O Pins
• 176-Pin PGA Package
• Instant Prototypes and Production
• Non-Volatile, Permanent Programming
• I/O Drive to 8 mA
• System-Level Performance to 50 MHz
• Enhanced Channeled-Array Architecture
- Up to 750,000 Programming Elements
r-------------,
I
I
I
- Up to 36 Horizontal and 15 Vertical Tracks
• Enhanced Logic Modules
• Supports Wide-input Combinatorial Functions (Up to
five-Input AND gates)
IL
_____________ J
II
• Supports Single-Module Sequential Functions
• Two In-circuit Diagnostic Probe Pins Support 50 MHz
Analysis
• Two High-Speed Low-Skew Clock Networks
• Low-Power 1.2-}Jm CMOS Technology
- Standby Current < 300 pA
Figure 1. Part of State Machine Implemented In
Five ACT 2 Modules.
Description
The ACT™ 2 family is the second generation of field
programmable gate arrays from Actel. Based on Actel's patented
channeled array architecture, the ACT 2 family provides
significant enhancements to gate density and performance while
maintaining. upward compatibility from ACT 1 designs. The
devices are implemented in silicon gate, 1.2-}Jm, two-level metal
CMOS, and they employ Actel's PLICETM antifuse technology.
The unique architecture offers gate array flexibility, high
performance and instant turnaround through user programming.
Designs of up to 8000 gates can be implemented. The ACT 2 family
is supported by the Action Logic™ System (ALS). ALS is available
on SunTM, ApolloTM, and 386 PC platforms, with CAE interfaces to
Valid™, Viewlogic®, Mentor Graphics™, and OrCAD™.
©
1990 Actel Corporation
Enhanced Architecture
Routing efficiency with large gate counts is accomplished with
increased routing resources, increased antifuse programming
elements, and architectural enhancements. Horizontal routing
tracks/channel increase to 36 (vs 25 for ACT 1); vertical routing
tracks/column increase to 15 (vs 13 for ACT 1). All speed-critical
module-to-module connections are accomplished with only two
low-resistance antifuse elements. Most connections are
implemented with either two or three antifuse elements (as shown
in Figure 2). No connections are allowed with more than four
antifuse elements in the path. The result is predictable
performance with fully automatic placement and routing. Device
utilization is typically 85% to 95% of available logic modules and
80% of gates.
Low-Skew Network
uncommitted vertical segment
Two low-skew distribution networks are provided. Each network
can be driven by either of two dedicated I/O pins or from internal
logic.
/
Enhanced Programming and Test
The ACT 2 family provides the same guaranteed functionality as
ACT 1 provides. All routing tracks, logic modules, and program,
debug, and test circuits are fully tested in the factory. Verification of
correct antifuse programming is performed automatically with
Actel's Activator™ 2 program and debug hardware. The ACT 2
architecture implements an enhanced programming and test
algorithm. Programming time for designs up to 8000 gates is
comparable to programming time for 2000-gate designs
implemented with the ACT 1 array.
driver
/
dedicated
vertical
segment
Figure 2. Enhanced Routing Architecture
Programmable I/O Pins
Each I/O pin can be configured as an input, output, three-state, or
bidirectional buffer. Inputs are TIL- and CMOS-compatible.
Output drive levels meet 8-mA TIL and 4-mA RCT standards.
User-programmable slew rate is provided to limit simultaneous
switching noise. Optional transparent latches at the I/O pins are
provided for both inputs and outputs. I/O latches can be combined
automatically with latches in the array to implement master-slave
flip-flops, as depicted in Figure 3.
LOGIC MODULES
r-----------,I
I
/
INPUT MODULE
I
I
I
I
I
I
I
I
I
I
DLAT
t---I----t DLAT
6
1...---- ______
6
\
t-----t DLAT
r-----------,
OUTPUT MODULE
I
I
I
6
I...
...1
r-----------,
I
CLOCK MODULE
I
I
I
I
I
I
I
I
II... _ _ _ _ _ _ _ _ _ _ _ .J
Figure 3. Latched User II0s
I
I
I
I
6
I
I
I
I
_ _ _ _ _ _ _ _ _ _ ...1I
r---*---tDLAT
Activator™ 2
Programmer/Tester/Debugger
Features
• supports ACT™ 1 and ACT 2 Device Families
• SCSI Connectors Interface to 386 PC, Sun™ and ApolloTM
Workstations
• Adapter Modules for Each Package/Family Combination
• Up to Two Times Faster Than Activator™ 1 for ACT 1
Programming
• Supports Functional Verification with Actionprobe™
Diagnostic Pod
• Simultaneously Programs a Single Pattern in Up To Four
Identical Devices
• In-Circuit Probing of Up To Four Devices Simultaneously
Product Description
Activator 2 is Actel's state-of-the-art desktop programmer. It
utilizes Actel's Action Logic™ System (ALS) software and
PLICE™ antifuse technology to program custom-engineered
devices from Actel's ACT 1 and ACT 2 device families. SCSI
connectors, shipped with the unit, provide a convenient connection
for PC 386, Sun, and Apollo workstation support. Customized
adapter modules for each device type permit easy interchange of
programmable devices. Programming, Test, and Debug execute up
to two times faster than on the Activator 1; users can program up to
four devices at one time. The accompanying diagnostic pod and
Actel debug software support observation of all internal signals.
The unit is software-driven, providing flexibility of application and
a built-in barrier to obsolescence. Independently powered, the unit
provides desktop device programming, functional test, and
in-circuit debug.
©
Preliminary
Product
Brief
The unit supports different CAE platforms using RS232 and SCSI
connectors.
The Activator 2 may operate with four different adapter modules
inserted, or three of the sockets may be slaves to the first,
permitting simultaneous programming of four identical devices.
The Activator 2 then addresses anyone of the sockets for
programming.
Activator 2 Base Unit
The base unit contains the control board and the analog board.
The LED display on the top of the programmer shows when the
unit receives power. Four adapter ports located on the top of the
base unit accept the different adapter modules for each device type.
The adapter ports are identical, allowing interchangeability of
modules. RS232 and SCSI connectors are located on the back of
the unit.
The Adapter Module
The adapter modules customize pin configurations for each device;
the user need only switch adapter modules in order to program
another device. All four ports may be in use simultaneously, when
programming identical devices. The adapter modules are compact
and sturdy. Any module may be used on any available port.
The Diagnostic Pod
The diagnostic pod supports ALS diagnostics and connects to the
programmer via cable. The Activator permits the user to view any
internal circuit activity through Actel's on-chip diagnostic ports.
Up to four pods may be used simultaneously. A six-foot cord
connects the pod to the base unit and provides debugging
flexibility.
1990 Actel Corporation
1_4Q
I
Reliability Report
1IIIII
~_____________R_e_li_a_b_il_itY__R_e_p_o_rt___
ACT 1010/1020 Reliability Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..
2-1
ACT™ 101 0/1 020
Reliability Report
ACT™ 1010/1020 devices are 1200- and 2OO0-gate (respectively)
field programmable gate arrays (FPGAs). The programming
element is an Actel-invented PUCETM (programmable
Low-Impedance Circuit Element) antifuse. An antifuse is a
normally open device in which an electrical connection is
established by the application of a programming voltage. Although
ACT 1010/1020 products are one-time programmable devices,
their unique architecture features complete functional testability.
The ACT 1010/1020 device is processed using a standard 2 pm,
double metal, CMOS process to which three additional masking
steps have been added to implement the PUCE antifuse. A
description of the main process parameters is shown in Table 1.
Because ACT 1010/1020 devices are manufactured with a
conventional CMOS process, normal CMOS failure modes will be
observed. However, the addition of the antifuse adds another
structure that could affect the device's reliability.
By Steve Chiang
and
Ken Hayes
PLiCE Antifuse Reliability
The antifuse is a vertical, two-terminal structure. It consists of a
polysilicon layer on top, N + doped silicon on the bottom, and an
ONO (oxide-nitride-oxide) dielectric layer in-between. A Scanning
Electron Microscope (SEM) cross-section of the anti fuse is shown
in Figure 1. On the ACT 1010/1020 device, the size of the antifuse is
1.8 pm 2• This small size, along with a low programmed
on-resistance, typically 500 0, makes the PUCE antifuse a very
attractive alternative to EPROM, EEPROM, or RAM for use as a
programming element in a large programmable gate array. In the
unprogrammed state, the resistance of the antifuse is in excess of
100 MO. The ACT 1010 and ACT 1020 contain 112,000 and
186,000 antifuses respectively. However, typical applications
utilizing 85% of the available gates require programming only 2%
to 3% of the available antifuses.
Actel has completed numerous studies in order to quantify the
reliability of the antifuse. These studies lead to the conclusion that
the time to failure of the antifuse is substantially more than 40 years
under normal operating conditions and that the combined
contribution of all antifuses to the gate array product's hard failure
rate is less than 10 FITS (Failures-in-Time or 0.001% failures per
1000 hours).
•
Table 1. ACT 1010/1020 Process Description
CMOS, 2 pm, double metal, dual polysilicon, dual well, EPI wafer.
Dimensions
Width
Space
N+
P+
Cell Polysilicon
4.0pm
4.0
1.6
2.0 pm
2.0
3.6
Gate Polysilicon
Metal I
Metal II
1.6
4.0
4.2
2.4
2.0
2.8
Contact
Via
1.8 x 1.8
2.0 x 2.0
2.0
2.0
Figure 1. SEM Cross-Section of Antifuse
Thickness
Normal Gate Oxide
High Voltage Gate Oxide
Cell Polysilicon
Gate Polysilicon
Metal I
Metal II
Passivation
25nm
40
35
40
80
100
1100
Composition
Metal I
Metal II
Passivation
© 1990 Actel Corooratlon
98% AI, 1% Si, 1% Cu
98% AI, 1% Si, 1% Cu
300 nm Si02 , 800 nm SiN
The Unprogrammed Antifuse
In order to evaluate antifuse reliability, Actel has developed
models and collected data for both unprogrammed and
programmed antifuses 1, 2. We'll consider the unprogrammed
antifuse first. Since the antifuse is a dielectric sandwiched between
polysilicon and silicon, the model for its reliability, in the
unprogrammed condition, is the same as that used to evaluate
reliability ofMOS transistor gate oxides. The parameter we wish to
evaluate is the time to breakdown (tbd) of the dielectric. This
parameter is a function of the electric field across the dielectric as
well as temperature and has the following relationship 3.
tbd
= to. exp (GIE)
From experimental data, we can plot the log of the time to
breakdown of the antifuse at various temperatures versus the
reciprocal of the electric field across it and derive G from the slope
and to from the Y axis intercept. Actel has done this both on large
antifuse capacitors and on arrays of 28,000 antifuses. An example is
shown in Figure 2. From this we have derived a value for G of 510
MYIcm and a to of 1 x 10- 16 seconds. Also note the difference
between the data at 2S°C and 150°C.
(1)
where tbd is the time to breakdown in seconds, to is a constant, E is
the electric field in MYIcm, and G is the field acceleration factor in
MYIcm (G is a function of temperature).
By taking the log of both sides of equation 1 we have:
In (tbd)
= G • (l/E) +
(2)
In (to)
11
10
~
~
/
.1:
/
!
1
.c
;/~
oS
Q)
E
z..J
I
./
/'
/:~
k1.
.
V
If'
.
Legend
•
150°C
I
o
7.6
7.8
./
i/
/
**
E
:/'
/'
/
8.2
8.4
* 25°C-
I
8.8
8.8
100/E (cm/MV)
Figure 2. Field Acceleration of Antlfuse (0.04 mm2 Area Capacitor)
In order to quantify the temperature-dependence of the time to
breakdown, we use the Arrhenius equation to determine the
reaction rate (or semiconductor failure rate) of a given process
(failure mode) over temperature:
R
= Ro • exp(-EallcT)
(3)
where R is the reaction rate, Ro is a constant for a particular
process, T is the absolute temperature in degrees Kelvin, k is
Boltzmann's constant (8.62 x 10-5 eV/OK), and Ea is the activation
energy for the process in electron volts. To determine the
acceleration factor for a given failure mode at temperature T 2 as
compared with temperature T 1 we use equation 3 to derive:
(4)
where A is the acceleration factor.
= k. din (tbd)/d (lIT)
=G
• [1
+
S/k. {lIT - l/298}]
(6)
where S (in eV) characterizes the temperature-dependence of G.
Ea can be related to G (f) by:
Ea
= G. S/E - Eb
(7)
where Sand Eb are treated as fitting parameters between E.
andG.
From the data shown in Figure 2, as well as data on test arrays of
28,000 antifuses, we have derived a value for Ea ofO.2eY. This value
ofEa = 0.2eV is for a very high E field of 11 MV/cm, or 10 V across
the antifuse. With 5.5 Y, the E field is about 6 MY/cm and Ea is
approximately 0.6 eV. Values of Sand Eb were found from
equation 7 to be 0.01 eV and 0.24 eY, respectively.
By combining equations 1,5,6, and 7, we obtain an overall equation
for the time to breakdown for a given temperature and E field:
For a given time to breakdown of a dielectric, the expression,
Ea
G (f)
(5)
gives us the activation energf. The field acceleration factor, G, is
also temperature-dependent, i.e.
tbd = to· exp {(G/E) [1
(lIT - l/298)}
+
(S/k). (lIT - 1/298)] - (EJk) •
(8)
ACT 1010/1020 Reliability Report
factor of 36, or 4.1 equivalent years for a 1000-hour bum-in at
125°C. Note that this acceleration factor of 36 is close to the value
of 41.8 derived from the Arrhenius equation (equation 4) using an
activation energy of 0.6 e V and the same temperatures. We use 0.6
eV as a general semiconductor failure mode activation energy
when calculating failure rates from high-temperature operating life
(HTOL) later in this report.
By applying the values for the constants as defined above, the time
to breakdown for the antifuse dielectric can be derived for a given
temperature and electric field. In Table 2, we have used equation 8
to solve for the acceleration factors associated with powering up a
device at high voltage and/or temperature and comparing the
failure rate with more typical voltages or temperatures. We can see
the effect of temperature by comparing 125°C at 5.5 V with 55°C at
5.5 V in which the Actel model (equation 8) gives us an acceleration
Table 2. Acceleration Factor vs. Operating Conditions (Unprogrammed Antifuse)
to = 1 X 10-16 sec', G = 510 MV/cm, 8 = 0.01
eV.
TemperatureNoltage
Equivalent Years
for 1000-Hour
125°C Burn-In
High
Typical
Acceleration
Factor
125°C/5.5 V
12SoC/S.5 V
SsoC/S.5 V
9SoC/S.S V
36.0
3.9
4.1
0.4
Fixed Temperature
25°C/S.5 V
2soC/S.75 V
2S°C!S.75 V
25°C/S.2S V
25°C/S.2S V
2soC/S.5 V
48.7
1692
34.7
5.6
193.2
4.0
Varied Temperature and Voltage
12SoC/S.S V
12soC/S.7S V
12soC/S.7S V
S5°C/S.2S V
55°C/S.S V
95°C/5.S V
1526
883
96.5
174.2
100.8
11.0
12SoC/S.S V
12SoC/S.S V
SsoC/S.S V
9SoC/S.S V
41.8
4.2
4.8
0.5
Model
Fixed Voltage
Fixed 0.6 eV
(Activation Energy),
Voltage-Independent
We can also see from Table 2 that a small change in voltage is a
much more effective reliability screen for the unprogrammed
antifuse than is a change in temperature. For example, if we
compare 25°C at 5.75 V to 25°C at 5.25 V we see that just a half volt
change yields an acceleration factor of 1692, or 193.2 equivalent
years per 1000 hours at 5.75 V. This strong dependence on voltage
allows Actel to screen out antifuse infant mortality failures during
normal wafer sort testing at Actel simply by performing a special
test in which a higher than normal voltage is applied across all
antifuses. Because antifuse infant mortality failures can be detected
and effectively screened, ACT 1010/1020 devices have as high a
level of reliability as standard CMOS processed products.
Actel has collected data on over 350 antifuse test devices
representing eleven wafer runs. From this data, we have
determined that the contribution of the antifuse to overall device
reliability is less than 10 FITS. This conclusion is confirmed by the
reliability data taken on actual ACT 1010/1020 units which will be
discussed later in this paper.
Test devices were stressed by forcing a constant 5 rnA current from
polysilicon to N + diffusion through the antifuse at 250°C. Note
that this stress is far greater than what a programmed antifuse
would see in a device operating under normal conditions. Because •
the antifuse is used to connect two networks together, there is
usually no voltage across it, hence no current passes through. A
voltage will appear across the antifuse only momentarily while a
network switches from low-to-high or high-to-Iow.
0
3
N+
-
Diffusion
The Programmed Antifuse
A Kelvin test structure as shown in Figure 3 was used to evaluate
reliability of a programmed antifuse. Here, a strip of polysilicon
crosses an N + diffusion. The antifuse is located at their
intersection. There are metal-to-poly contacts at nodes 1 and 3 as
well as metal-to-N + contacts at nodes 2 and 4. A four-terminal
Kelvin structure is useful should a failure occur, because antifuse
opens can be separated from other problems (such as polysilicon or
contact opens) simply by checking for continuity on appropriate
pairs of nodes.
0
40 I
2
GND
+V
Poly-
1
0
Figure 3. Antifuse Kelvin Structure
units were then examined on an SEM, where the cause of failure
During the 5 rnA, 2S0°C stress, the voltage across the antifuse was
monitored. Figure 4 is a plot of the monitored voltage as a function
of stress time. A sudden increase in voltage indicates that an open
occurred. As can be seen from the figure, failures occurred at about
300 hours of stress. However, by probing on nodes 3 and 4 of the
Kelvin structure, we were able to measure continuity and
determine that the cause of failure was not the antifuse. The failed
was revealed as metal-ta-poly contact electromigration. This is a
well-known failure mode in CMOS, which has been determined to
have an activation energy of 0.9 eV. Using equation 4 we can predict
a lifetime under normal operating conditions in excess of 40 years
for this failure mode. The lifetime of the programmed antifuse is
even longer.
12
8l
~c:
,..-
Antifuse Accelerated Ufetest
250°C 5 mA Test
11
-J,
,·
~
10
fl
<
III
e
0
<
--
- I '...
,
~
"7- ::- ~
I'
,·
Q)
1
g
I
- ~;t:..
.....
'---. ~ .-....
-I
~
I-'
;7
~
·
I
·
~-
I
I
I
J
24 48 72 96 120 144 168 192 216240 264 288 312 336 360 384 408432 456
Elapsed Time (Hours)
Figure 4. Voltage Across Antlfuse versus Stress Time
ACT 1010/1020 Device Reliability
Device reliability was evaluated on four Actel products: a 64K
PROM (PROM64), a 300-gate FPGA (1003), a l200-gate FPGA
(ACT lOlO/lOl0A), and a 2OOO-gate FPGA (ACT 1020/l020A).
The FROM64 product uses the same process and antifuse as the
ACT 1010/1020. The 1003 is a test device, a smaller version of the
ACT 1010/1020, which was used for early characterization and
qualification. The ACT 1010A and ACT 1020A are 20% linear die
shrink versions of the ACT 1010 and ACT 1020, respectively. The
PROM64 units were packaged in 24-pin side-brazed packages; the
FPGA units were in 68- and 84-pin JLeC (ceramic J-Ieaded chip
carriers) and PLeC (plastic-leaded chip carriers) packages.
Package characteristics for ACT 1010/1020 devices are shown in
Table 3.
ACT 1010/1020 Reliability Report
Table 3. ACT 1010/1020 Package Characteristics
PlCC
Sumitomo 6300 H
Molding Compound
Filler Material
Fused silicon 70% by weight
Lead Frame Material
Copper
Lead Plating Composition
Tin or solder, 300 to 800 micro inches (}.lin)
Die Attach Material
Silver Epoxy
Die Coat
Dow Corning Q14939 silicon gel
Bond Wire
Gold, 1.3 mil diameter
Bond Attach Method
Thermosonic
JlCC
Body Material
Ceramic
Lid and Lead Material
Alloy 42 with minimum 60-pin gold plate
Bond Wire
99% Aluminum, 1% Si, 1.25 mil diameter
Bond Attach Method
Ultrasonic
Thermal Resistance (OC/Watt)
Package
44 PLCC
44 JLCC
68 PLCC
68JLCC
84 PLCC
84 JLCC
15
5
13
5
12
5
8
52
38
45
35
44
34
35
High Temperature Operating Life (HTOL) Test
The intent of HTOL is to dynamically operate a device at high
temperature (usually 125°C) and extrapolate the failure rate to
typical operating conditions. This test is defined by Military
Standard-883 in the Group C Quality Conformance Tests. The
Arrhenius relationship in equations 3 and 4 is used to do the
extrapolation. To use the Arrhenius equation, we need to know the
activation energy of the failure mode. Activation energies of
antifuse failure modes were discussed earlier. Table 4 gives the
activation energies of general semiconductor failure modes.
Table 4. Activation Energy of CMOS Failures
Failure Mechanism
Activation Energy
Ionic Contamination
1.0 eV
Oxide Defects
0.3 eV
Hot Carrier Trapping in Oxide
(Short Channels)
-0.06 eV
Silicon Defects
0.5 eV
Aluminum Electromigration
0.5eV
Contact Electromigration
Electrolytic Corrosion
0.geV
0.54 eV
Six different data patterns were programmed into the 64K PROMs
for HTOL testing: a diagonal of zeros (98% programmed); a
diagonal of ones (2% programmed); a topological checkerboard
pattern (50%); all zeros (100%); all ones (0%); and an incrementing
pattern (50%). During burn-in, all addresses are sequenced
through at a 1 MHz clock rate. The outputs are enabled and loaded
with a 100 ohm resistor to a 2 V supply. This results in an output
84 PGA
loading of equal to or greater than the data sheet specified limits of
IOH = -4 rnA and IOL = 16 rnA. In most cases, the PROMs were
burned-in at Vee = 5.5 V; 125°C. However, voltage acceleration
experiments were also done at 7 V; 125°C as well as at 8 V; 25°C.
The PROM is useful for antifuse reliability studies for several
reasons. First of all, we can program anywhere from 0% to 100% of
the antifuses although we program only 2% to 3% of the antifuses
for a given design on the ACT 1010/1020 device. Also, an antifuse
failure on the PROM is very noticeable, since the antifuse is
directly addressed. A weak antifuse would show an AC speed drift,
and a failed antifuse would read the wrong data.
To evaluate the ACT 1003/1010/1020 devices, we programmed an
actual design application into each device and performed a
dynamic bum-in by toggling the clock pins at a 1 MHz rate. The
designs selected utilized 85% to 97% of the available logic modules
and 85% to 94% of the lias. Outputs were loaded with 1.2 kO
resistors to V DD resulting in greater than 4 rnA of sink current as
each I/O toggled low. Under these conditions, each ACT 1010
typically draws about 100 rnA during dynamic burn-in. Most ofthis
current comes from the output loading while about 5 rnA is from
the device supply current. The thermal resistance Gunction to
ambient) of the 68- and 84-pin PLeC packages is about 45°C/Watt;
for the JLeC packages it is about 35°C/Watt. For a 125°C burn-in,
this results in junction temperatures of about 150°C for plastic
packages and 145°C for ceramic packages. Most burn-in was done
at 5.75 V (for voltage acceleration of the anti fuse ) and 125°C. Some
data was taken at 5.5 V; 125°C and at 5.5 V; 150°C.
A summary of the HTOL data collected by Actel is shown in Table
5. A failure is defined as any device which shows a functional
failure, exceeds data sheet DC limits, or exhibits any AC speed
drift. Among the parts tested, no speed drift, faster or slower, was
•
Both ofthe observed failures were normal CMOS failures and were
not caused by the antifuses. The 1003 device failed after 80 hours at
150°C. The unit was functional but had high 100 current (40 rnA vs.
4 rnA typical). Liquid crystal analysis revealed a hot spot outside the
antifuse area of the Chip. The other failed unit was nonfunctional,
with a high 100 current of about 40 rnA. This unit failed after 500
hours at 125°C. Both failures are believed to be the results of
junction degradation or silicon defects.
observed within the accuracy of the test set-up. Failure rates at
55°C and 70°C were extrapolated by using the Arrhenius equation
and a general activation energy of 0.6 eV. Poisson statistics were
used to derive a calculated failure rate with a 60% confidence level.
Use of Poisson statistics is valid for a failure rate which is low and a
failure mode which occurs randomly with time. At 55°C, the
calculated failure rate with a 60% confidence level was found to be
33 FITS (0.0033% failures per 1000 hours). This number was
derived from over 2.2 million device hours of data.
Table 5. HTOL (High Temperature Operating Life) Test
Device Hours
at 125°C
Failures
Equivalent
Device Hours
at 55°C (0.6 eV)
Equivalent
Device Hours
at 70°C (0.6 eV)
4
568.000
0
23.8 Million
9.4 Million
3
359,400
1
15.0
5.9
144
6
283.000
0
11.8
4.7
1020 JLCC
61
2
90,000
0
3.8
1.5
1020AJLCC
69
2
69,000
0
2.9
1.1
1010 PLCC
496
6
844,000
1
35.3
14.0
Units
Runs
PROM64
275
1003 JLCC
238
1010 JLCC
Device
1020 PLCC
Totals:
32
2
48,000
0
1315
20
2,261,400
2
2.0
0.8
94.6 Million
37.4 Million
Summary:
Observed FITS at 55°C:
21
Observed FITS at 70°C:
53
Calculated FITS to 60% confidence at 55°C:
33
Calculated FITS to 60% confidence at 70°C:
83
(Summary of Data as of February 20, 1990)
A total of 426 units from five wafer runs and six assembly lots were
used. Read points were taken at 96, 168, and 336 hours. There were
a total of four failures (Table 6). All four failures were caused by
bond wires lifting off bond pads. This was an assembly problem
that occurred only on our first lot of plastic units. The failures were
caused by high temperature, not by metal corrosion. The assembly
problem was corrected, with no further failures observed.
Unbiased Steam Pressure Pot Test
This test is used to qualify products in plastic packages. Units are
placed in an autoclave (pressure pot) and exposed to a saturated
steam atmosphere at 121°C and 15 psi. Problems with bonding,
molding compounds, or wafer passivation can cause metal
corrosion to occur in this atmosphere. Metal corrosion is detected
during a full electrical test of the device following exposure to the
autoclave environment.
Table 6. Unbiased Steam Pressure Pot Test
121 ee, 15 psi
Number of Failures
Product
Run Number
Package
Number of Units
96 Hours
168 Hours
1010
JB13
84 PLCC
34
0
3
0
1010
JB13
68 PLCC
71
0
0
1010
JB14
68 PLCC
71
0
0
0
1010
JB22
68 PLCC
71
0
0
0
336 Hours
1010
JB27
68 PLCC
50
0
0
0
1010A
TI24
68 PLCC
129
0
0
0
ACT 1010/1020 Reliability Report
As shown in Table 7, a total of 288 units were stressed. There were
Biased Moisture Life Test (85/85)
three failures. One failure was caused by two lifted bond wires; it
was from the same lot in which we saw failures in steam pressure
pot test. The second unit was functional but had high 100 current.
The 1000-hour failure was nonfunctional.
In this test, the units are placed in a chamber at a temperature of
85°C and a relative humidity of 85%. A voltage of 5.5 V is applied
to every other device pin while other pins are grounded. 5.5 V is
applied to V 00 while V ss is grounded. This test is effective at
detecting die related and plastic package related problems.
Table 7. Biased Moisture Life Test
85°C/85% Humidity with DC Alternate Pin Bias of 0 V to 5.5 V
Number of Failures
1000 Hours
Product
Run Number
Package
Number of Units
500 Hours
1010
JB13
68 PLCC
80
2
1010
JB14
68 PLCC
81
0
0
1010
JB22
68 PLCC
54
0
0
1010
JB26
68 PLCC
54
0
0
1010
JB27
68 PLCC
19
0
0
2000 Hours
0
0
temperature is -65°C to 150°C. For plastic packages, the range is
O°C to 125°C. Both programmed and unprogrammed units are
placed on temperature cycle. Data on 451 units is summarized
in Table 8.
Temperature Cycling
This test checks for package integrity by cycling units through
temperature extremes. For ceramic packages, the range of
Table 8. Temperature Cycling Test
-65°C to 150°C Ceramic; O°C to 125°C Plastic
Failures
Product
Run Number
Package
Number of Units
100 Cycles
200 Cycles
1000 Cycles
1010
JB13
68 PLCC
158
0
0
1010
JB14
68 PLCC
28
0
0
1010
JB26
68 PLCC
21
0
0
1010
JB28
68 PLCC
31
0
0
1020
JB22
84 PLCC
17
0
1010
1077
84 JLCC
20
0
1020
JB33
84 PGA
25
0
1010A
TI24
68 PLCC
176
0
Other Tests
Electrostatic Discharge (ESD)
Units were tested for sensitivity to static electricity by using the
human body model as described in MIL-883C (100 pF discharged
through 1.5 kO). Fifteen ACT 1010 units from three wafer runs
were tested. Nine representative I/O pins were checked on each
device. Since all I/O pins have the same layout on the chip, the nine
pins tested were selected based on their proximity to V ss, V DO, or
the comer of the chip. The MODE pin was also tested because it is
the only dedicated input on the chip. In addition, the three power
supplies (Vss, V oo, V pp) were tested. Three positive and three
negative pulses were discharged into each pin tested at each voltage
level. For inputs and I/Os, these six pulses were applied with three
different grounding conditions; V ss only grounded, V 00 only
0
0
0
grounded, and all other I/Os grounded. Thus each pin received a
total of eighteen pulses for each test voltage. Testing began at 1000
V and continued in 500 V increments. After pulsing was completed
at each voltage, the I-V characteristic of each pin was checked on a
digital curve tracer. Any significant change in the I-V curve from
the previous reading was considered a failure. The units were then
tested on a VLSI tester. Leakage currents were datalogged at 0 V
and 5.5 V. Any pin showing more than 250 nA of leakage current
was also considered to be a failure. For Ioo, a change of more than
250}JA was cause for rejection. No failures occurred through 2000
V. At 2500 V, five of the fifteen units failed on at least one pin.
Failure analysis revealed that the failures occurred on the
N-channel pulldown transistor of the output driver. With no
failures through 2000 V testing, the ACT 1010 (and the ACT 1020,
by virtue of identical I/O layout to the 1010 device) met the
requirements for the 2000 V ESD category of MIL-883C.
II
Latch-up
Latch-up is a well-known cause of failure in CMOS circuits.
Parasitic bipolar transistors are created by the P-channel transistor,
the N-channel transistors, the N-well, and the P-substrate. These
transistors are connected in a manner which effectively creates an
SCR. If a voltage on an external pin were to forward bias to the
substrate, the parasitic SCR can be latched to the on state, creating
a low-impedance path between V 00 and ground. A large amount of
current then flows through this path. This current can, at best,
make the device temporarily nonfunctional and, at worst, cause
permanent damage.
Several techniques are used by CMOS designers to reduce the
chance of latch-up. One of the most common techniques is the use
of guard rings to isolate P-channel and N-channel transistors. The
disadvantage of this method is that it requires additional silicon die
area. Another method is to use a substrate bias generator. Creating
a negative substrate bias means that an input must go even more
negative to cause latch-up. A third technique is to use EPI wafers to
achieve low substrate resistance, which lowers the chances of
triggering latch-up. Actel designers use both guard ring and EPI
wafer techniques for ACT 1010/1020 devices.
The latCh-up test method used is defined by JEDEC Standard No.
17. Each I/O pin on a tested device was forward biased in both
directions (to Vss and V oo) by forcing negative and positive
currents ranging from ±50 rnA to ±250 rnA in 50 rnA increments.
Following each stress, the device Ioo current was measured. If the
current exceeded the data sheet limit of 10 rnA, the unit would be
rejected. The device was also functionally tested.
Fifteen units from three different wafer lots were tested. Testing
was done both at room temperature and at a worst case
temperature of 135°C. All device I/Os and power supplies were
tested. No failures were detected through 250 rnA.
Conclusion
The data presented in this report establishes the excellent
reliability of Actel ACT 1010/1020 devices. Both the Actel models
and the test devices show that the antifuse is highly reliable and that
it detracts negligibly from overall product reliability.
References
1) E. Hamdy, et ai, "Dielectric Based Antifuse for Logic and
Memory ICs." IEDM paper, p. 786-789,1988.
2) S. Chiang, et ai, "Oxide-Nitride-Oxide Antifuse Reliability. "To
be published, Proc. Int. ReI. Phys. Symp., 1990.
3) J. Lee, I-Chen, and C. Hu, "Modeling and Characterization of
Gate Oxide Reliability," IEEE Trans. of Elec. Dev., Dec. 1988.
Applications
Applications
_
'------_----.J
Which Actel array should you use? ..... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3-1
Efficient Logic Conversions ........... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3-3
The Actel TImer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-15
Using Actionprobe Diagnostic Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-17
Metastability ............................................................................................ 3-19
Using the Actel Debugger as a Functional Tester. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-21
Calculating Power in Actel's ACT 1 Arrays ............................................................ . . . . . . .. 3-25
Three-Stating ACT 1010/1020 Designs.. . . ... . .. . . . .. .. . .. . . .. .. ... .. . ... . .. . . . .. . . . ... . . . . . . . . . . .. . . . . . .. ... 3-29
Multi-ASIC and System Simulation with Actel FPGAs ........................................................... 3-31
Constructing RAM with ACT 1 Macros ....................................................................... 3-33
Using the DRAM Controller Supermacro ..................................................................... 3-39
Using the DMA Controller Supermacro . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-43
Using the SCSI Interface Controller Supermacro . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-45
UART Design ........................................................................................... 3-49
Actel TA181 ALU . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-61
Actel Fast Adders ............................................................. . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-63
Eight-Bit Twos Complement Multiplier ....................................................................... 3-65
Contributors:
Sam Beal
John Day
Dennis McCarty
Faysal Sohail
OuatTran
Which Actel array should
you use?
Applications
Note
Determining which array to use is an important part of design
implementation. This applications note shows how to easily select
the correct Actel ACT™ 1 array for your design.
Note that both true and inverting inputs are provided on all 2-, 3-,
and 4-input functions, so inverters are rarely needed in Actel
designs.
The ACT 1 family of gate arrays contains the ACT 1010, a 1200gate array, and the ACT 1020, a 2OOO-gate array.
Design limits for Actel devices are 295 logic modules for the ACT
1010, and 547 logic modules for the ACT 1020. But device
utilization, the percentage of available modules used, must be
considered also. Designs using 85% or less of available logic
modules can be placed and routed easily. Device utilization above
85% may lead to problems, especially when designing for fast AC
performance.
The ACT 1010 contains 295 logic modules and 57 I/O modules.
The ACT 1020 contains 547 logic modules and 69 I/O modules.
Logic modules can be configured as gates, flip-flops, and other
logic functions. An I/O module can be designated as an input,
output, or bidirectional buffer.
Each array contains a master clock buffer. This buffer can drive
flip-flops or latches without fan-out restriction.
Logic Module Requirements
To determine the correct device for your design, first consider I/O
requirements-including simultaneously switching outputs (SSOs)
-then consider logic requirements.
2-input gate
In determining I/O requirements, consider the package
limitations, as shown in the table below. Power and ground are not
included in the limits given; separate bond pads and package pins
are provided for power and ground. SSOs are outputs that switch
within 10 ns of each other.
logic Modules
3-input gate
1
4-input gate
2
XOR gate
XOR OR gate
XOR AND gate
AND XOR gate
I/O LImits
Package
Macro Function
ACT 1010
I/Os
SSOs
ACT 1020
I/Os
SSOs
AND OR gate
OR AND gate
44-pin chip carrier
34
16
34
16
Buffer/Inverter
68-pin chip carrier
57
24
57
24
2- or 4-input Multiplexor
84-pin chip carrier
N/A
N/A
69
32
Adder
84-pin grid array
57
32
69
32
D-Iatch
2
D-Iatch with clear
To calculate the number oflogic modules required by a design, add
up the individual logic module requirements of each macro used.
The following table gives typical logic module requirements for
various hard macro functions. Consult the Actel Macro Library for
a complete listing of macro types and actual logic module
requirements. For calculation purposes, convert complex functions
(e.g, 74163) into gate and flip-flop schematic equivalents. Refer to
the manufacturer's databook for this information.
©
1990 Actel Corporation
D-Iatch with enable
Multiplexed latch
D flip-flops
2
Multiplexed D flip-flops
2
D flip-flop with enable
2
JK flip-flop
2
I
Efficient Logic Conversions
Introduction
In order to obtain the most efficient conversion of a logic design
into Actel's format, the designer should become familiar with the
available Actel macros. The Actel Macro Library manual lists
available Action Logic™ System macros, with AC performance,
fan-in, and levels of logic given.
Most common logic functions are readily implemented using Actel
macros. This applications note describes several common
functions created from Actel macros, and serves as a designer's
guide to efficient logic conversions. New macros are frequently
Applications
Note
added to the Actel Macro Library. Always check the current
software library shipped with the Action Logic System before
creating a macro.
All Actel macros are created with logic modules. Each logic module
is an eight-input, one-output circuit, which consists of three 2-to-1
multiplexors and one 2-input OR gate. See Figure l.
This circuit can implement a wide range of functions, and it
interconnects efficiently through routing resources as required by
the circuit application.
I
Figure 1. Logic Module Schematic
Actel macros fall into two categories, hard macros and soft macros.
Hard macros consist of one or two logic modules. Soft macros were
created for logic functions requiring more than two logic modules.
Hard macros have fixed logic module positions and wiring
©
1990 Actal Corporation
interconnections to maintain specified AC characteristics. Soft
macro logic modules are placed on the array as needed, and the
wiring path is routed to complete the interconnection. Soft macro
AC performance is determined after place and route.
Combinatorial Functions Existing In The Macro
Library
effective" function. Many macros have multiplexor functions built
in; Figure 2 illustrates how several logic functions are combined,
saving space and power.
Since the basic macro building block (the logic module) is created
from multiplexors, the multiplexor can be viewed as the most "cost
S
A
B
r---.....
VrI
Y
a
0
t>
DF1
Equivalent schematic of a DFM macro
I--"':"Y-~D
a
DF1
I--"':"Y-~ 0
a
DF1
..
..
GNDfia
A
B
B
S
DFM
GNDBfia
B
DFM
A
a
>-..:..Y--t 0
DF1
..
VeeAfia
B
DFM
B
a
>-..:..Y--t 0
DF1
..
S
S
veefia
B
B
A
S
DFM
Figure 2. Sample Logic Function Equivalents
Note from Figure 2 that the DFM multiplexed D-input flip-flop
makes use of its inputs to achieve other functions by simply
connecting Vee and GND as shown. The original circuitry
consisting of a gate and flip-flop would have used three logic
modules, while the more efficient version uses only two. This is a
very powerful technique, and can be extended to other functions in
the library.
Efficient Logic Conversions
Actel provides every combination of 2-, 3- and 4-input functions
with true and inverting inputs, so inverters are almost never
needed. Figure 3 illustrates the logic savings realized using Actel
macros in a standard circuit.
'U'.'--_------'
Net4
Standard Circuit
Actel Circuit
y
Figure 3. Sample Logic Conversion
The standard circuit requires five logic elements. The Actel circuit
requires only two logic modules; it runs faster and uses less power
than the standard circuit.
II
library. Consult the Actel Macro Library for a complete listing of
macro types and actual logic module requirements.
Counting Logic Modules
When converting a logic design into an Actel array, do not count
the number of gates in the original design; count logic module
requirements. Many standard logic functions can be implemented
with fewer Actel logic modules than expected, as shown in the
examples in this application note. Table 1 gives typical logic module
requirements for various hard macro functions. Many complex
functions, such as TTL replacements, exist as macros in the Actel
Both logic module and I/O module requirements must be
considered in selecting an array for a given design. Refer to the
ACT 1 data sheet for array specifications. For more information on
choosing an array for your design, refer to the applications note
"Which Actel array should you use?"
Table 1. Logic Module Requirements
Macro Function
Includes AND, NAND, OR and NOR gates.
2-input gate
NAND3 and OR3e require two logic modules.
3-input gate
4-input gate
Notes
Logic Modules
2
See Note 1.
XOR gate
XOR OR gate
XOR/XNOR followed by an OR gate.
XORAND gate
XOR/xNOR followed by an AND gate.
ANDXOR gate
AND gate followed by an XOR gate.
AND OR gate
12 types available.
OR AND gate
9 types available.
Least efficient of all macro types.
Bufferllnverter
2- or 4-input Multiplexor
1
Most efficient macro. 6 types available.
Adders
2
7 types available
D-Iatch
D-Iatch with clear
D-Iatch with enable
Multiplexed latch
1
2 input Multiplexed D-Iatch.
D flip-flop
2
22 types available. Single output only.
Multiplexed D flip-flop
2
D flip-flop with enable
2
Note:
1. The designer could use an average of 1.5 logic modules for the
four-input gate conversions, as 50% of these gates use only one logic
module.
Allows multiple clocked systems.
Efficient Logic Conversions
Creating Logic Functions With Hard Macros
One of the most efficient logic functions in the Actel Library is the
MXT macro. Figure 4 illustrates this function. The MXT macro
uses one logic module.
Many logic functions can be created using this macro, as illustrated
in Figures 4a, 4b, 4c, and 4e. Figure 4d illustrates set-to-l flip-flops
created from Actel macros DLC and DLCA.
DO
D1
D2
D3
Figure 4. MXT Macro
A--------,t
r - -__~----Q
Logic Function
B
II
B
Actel Circuit
SOA
S1
DO
GND
D1
Y
Voo
D2
D3
SOB
A
Figure 4a. NAND Gate Set - Reset Latch
Q
A-------\
r - -__~------Q
Logic Function
B
A
Actal Circuit
B
SOA
S1
DO
Voo
01
y
GNO
Q
02
03
SOB
Figure 4b. NOR Gate Set - Reset Latch
Figures 4a and 4b illustrate how two types of S-R latches can be
implemented using the MXT macro (one logic module) in place of
two logic elements. Using the MXT macro, the Actel circuit
provides improved speed and reduced power dissipation over the
standard circuit.
By manipulating the inputs tied to GND and V DO, further logic
functions can be built in. Figure 4c illustrates how additional gates
can be added to the NOR gate S-R latch using existing logic. Note
that two levels of logic in the standard circuit, from the A and C
inputs to the Q output, have been reduced to one level of logic,
improving AC performance.
Efficient Logic Conversions
A - - -__- - - - t
T)--~------
C -___.t--+_-{
Q
Logic Function
A
B
Actel Circuit
S1
SOA
DO
D1
VDD
Fan-In
Y
A
B
Q
D2
GND
C
2
1
1
D3
C
SOB
Figure 4c. NOR Gate Set - Reset Latch With Both Inputs Gated
The circuit in Figure 4c can prevent the illegal state (when both A
and B are active true) if the designer connects the B input to the C
input. B will override A, and when both inputs are removed
simultaneously, a logic 1 will be maintained at the output. Note that
the fan-in is not always represented by the schematic, as noted by
input C.
are created with Actel DLC and DLCA macros, while the Set-to-O
functions, shown in Figure 4e, are created with the MXT macro.
For the DLC macro, a logic 1 on the gate input will set the Q output
to a logic 1. For the DLCA macro, a logic 0 on the gate input will set
the Q output to a logic 1.
Figures 4d and 4e illustrate another set of useful logic functions:
Set-to-l and Set-to-O flip-flops with reset. The Set-to-l functions
VDD
D
Q
VDD
Q Output
D
OLC
Gate
Q
OLCA
G
Gate
G
CLR
CLR
Reset
Reset
Figure 4d. Set-to-1 Flip-Flops
Q Output
I
Actal Circuit
Reset
Gate
Logic Function
DO
GND
0
ON
ON Output
GND
Gate
D1
y
G
CLR
Voo
ON Output
D2
D3
Reset
Reset
Gate
GND
GND
D
Gate
G
ON
ON Output
DO
D1
y
CLR
Voo
ON Output
D2
D3
Reset
Reset
DO
GND
0
Gate
G
ON
ON Output
Voo
CLR
D1
y
ON Output
y
ON Output
D2
GND
D3
Reset
Gate
Reset
DO
GND
D
Gate
G
ON
ON Output
Voo
CLR
GND
D1
D2
D3
Reset
Gate
Figure 4e. Set-to-O Flip-Flops
Efficient Logic Conversions
Implementing Logic Functions Using
Multiplexors
The designer can combine several functions using one MXT
macro, as shown in Figure Sb. Note the savings of one level oflogic,
which saves power and improves speed.
Many functions can be implemented using the simple two-input
multiplexor. Figure Sa illustrates four common functions and their
multiplexor equivalents.
The function shown in Figure Sb exists as a macro (AO Ie) in the
Actel Library.
~
!
GND A
B
B
A
A
B
GND B
!
A
A
A
~
!
!
A
B
~
~
Vee
Vee
B
B
A
B
Figure Sa. Multiplexor Functional Equivalents
C---------.
~
A-- 10 kQ resistor. Probe pins may
be connected directly to a logic analyzer or oscilloscope.
II
Figure 1. In-Circuit Debug Setup
~ 1990 Actel Corporation
In-Circuit Probing
Changing the signal nodes is done simply by changing node names
with the Debugger software. The newly assigned signals are
connected automatically to the probe pins. Internal signals up to 10
MHz can be monitored externally. The internal signal passes
through an inverting buffer before reaching the probe pin.
The "Iep" (In-Circuit Probing) command connects the probe pins
to internal nodes. The syntax is :
ICP node_1 node_2
where "node_1" (node name) is connected electrically to PRA, and
"node_2" is connected electrically to PRB.
difference measured is the skew of the probe pins. This skew is
subtracted from subsequent delay measurements in the circuit. In
Figure 2a, for example, both PRA and PRB are connected
electrically to node NetO. The delay difference is the skew,
calculated as tSK = tpRA - tPRB. Using the Debugger software, the
slower probe (PRA) is assigned to node Net3. Figure 2b shows this
configuration. In this example, actual propagation delay is the
measured delay time between the output of GO and the output of
G3, minus the probe skew time. Actual delay is calculated as: tpD =
tpRA - tPRB - tSK
Note: Due to the propagation delay difference between rising and
falling signals, both probe pin signals must be either rising or
falling when they are used to calibrate for delays
measurement.
Probe Calibration
The probe circuitry does not introduce any additional loading to
the design, so the AC characteristic of the observed internal nodes
remains unchanged. And, because probe propagation delay is
independent of layout, probe delay remains unchanged for all
points in the device.
The skew of the probe pins can be measured, then used to calibrate
for accurate measurement of propagation delay. When both probe
pins are assigned to the same point on the device, the delay
G3
Figure 2a. Measuring Skew of Probe Pins
Pin Assignment for Dedicated 1/0 Pins
During device verification, dedicated pins SDI, DCLK, PRA, and
PRB are assigned as special I/O pin. These pins should be assigned
as non-critical, so the design is functional without them. After
device verification, these pins can be assigned as regular lIOs by
programming the security fuses. This disables the probe pins to
prevent unauthorized device probing.
G3
PRA
PRA
PRB
PRB
Figure 2b. Calibrating for Accurate Propagation Delay
Measurement
Applications
Note
Metastability
Actel Metastability Characteristics
MTBF
Designers often have asynchronous signals coming into
synchronous systems. Typically a flip-flop is used to synchronize the
incoming signal with the system clock.
If the asynchronous incoming signal does not meet the setup time
requirement for the flip-flop, there exists a window of time where
the incoming signal may cause the flip-flop to develop an unknown,
or metastable, logic condition. Figure 1 shows this window as two
Actual clock setup time of the flip-flop is shown as tsu; propagation
delay of the flip-flop is shown as tcq. Resolution time (tres) between
flip-flop output and the next clocked device is the amount of time
required for the metastable condition to stabilize.
The duration of a metastable condition is probabilistic. But the
designer can calculate how often a metastable state will last longer
than a given duration. Mean time between failures (MTBF) can be
calculated from the following equation:
where the constants depend on the ACT™ 1 device characteristics,
and:
MTBF = Mean time between failures (s)
fclk
= System clock frequency (Hz)
fdal = Incoming data rate (Hz)
e
= Natural log base
tres
= Resolution time (ns)
Cl = 1O-9/Hz
C2 = 4.60S2/ns
(Value of C2 derived from circuit simulations.)
----n-CLK--t:J
DATA IN
DATA IN
1
fClk * fdat * Cl e -C2°tres
Q1
•
--....&..""+'
CLK-----
"1"
Q1------../
Figure 1. Metastable Condition
©
1990 Actal Corporation
' __ _ _ _ "0"
Sample Calculation
Using the MTBF equation for a design with a system clock
frequency of 10 MHz and a data rate of 1 MHz, various resolution
times produce the results shown in Figure 2. Linear increases in
resolution time cause exponential increases in MTBR
1011
1010
109
108
107
~I
106
105
104
103
102
101
2
4
6
8
tres
Figure 2. Metastable MTBF as a Function of Resolution Time
ns
Using the Actel Debugger
as a Functional Tester
combination of the two. Command files and test vector files are
created with an ASCII text editor, then loaded into the Debugger.
User-defined macros may be created in a command file, then
executed in the Debugger.
Introduction
Actel's Activator™ programming and diagnostics unit, together
with Actel's Debugger software, provide powerful tools to
functionally test an ACT™ device. Device debugging begins after
design configuration and device programming. Debugging is
performed with the device inserted in the Activator unit. The user
accesses Debugger software from the Action Logic™ System
(ALS) main menu.
This applications note shows Debugger commands for a sample
design. The sample design is Actel's TT269 (TTL 74269), an
eight-bit binary loadable up/down counter with count enable.
Figure 1 shows the sample design; Table 1 shows the truth table for
the part. PO through P7 are parallel load inputs; 00 through 07 are
counter outputs; CLK is the counter clock; UD is the up/down
count selector. Internal nodes (nets) should be labeled during
design capture for easy reference during debugging.
Debugger functional test allows the user to observe any internal
node of the device. The user defines the device inputs with
Debugger menu commands or with a command file, or any
UD-I-----l
UO
PQ_I-----l
U1
Applications
Note
U13
P1-1----I
P2-1----I
P3-1----I
P4-1---~
P5-1---~
U2
U3
U4
U5
U6
P6-1----I
U7
P7-1---l
U8
ClK-I----!
PO
UD
00
P1
01
P2
02
P3
03
P4
04
P5
TT269
05
P6
Q6
P7
07
ClK
TC
CEP-I---~
CET-I---~
PE-I----!
PE
U12
Figure 1. TT269
©
1990 Actel Corporation
00
Q1
Q2
Q3
Q4
Q5
Q6
Q7
TC
QOUTO
QOUT1
QOUT2
QOUT3
QOUT4
QOUT5
OOUT6
OOUT7
TCOUT
I
Assigning Test Vectors
Table 1. TT269 1l'uth Table
Outputs
Input.
P[7:0]
ClK
x
a
X
FF
i
i
i
i
i
t
PE
CEP
CET
UD
a
a
x
x
x
x
X
X
X
1
X
X
X
a
a
1
X
0
0
0
X
0[7:0]
TC
a
FF
a
Hold
Hold
The default input radix for all test vectors is decimal. To specify a
binary, hex, or octal radix, add a Ob, Oh, or 00 prefix, respectively, to
the vector (e.g., OblOlO or Oh7e). Outputs are in binary format. To
interactively define input test vectors, use the Debugger menu.
Alternatively, use input command files to define test vectors. To
view outputs and internal nodes, print them to the PC screen
display or to an output file.
Increment
Decrement
r - - - - - - - . . , - - - - - - Action Logic System _ _ _ _ _ _ _ _ _ _ _-.,
~D_eb_u~g~g~e_r____~L INPUTl INPUT2 INPUT3 INPUT4 INPUT8 CLOCKIN
Assign
Low
High
Hi-Z
H INPUT5 INPUT6 INPUT7
TABADD INPUT2 INPUT5 CLOCKIN ______
Set up list
TABADD OUTPUTl OUTPUT2 OUTPUT3
~ of Signals
to observe
STEP
FAssign
Print
FPrint
Apply stimulus;
allow time for
response
H INPUT2
STEP
----------~
Setup
View
STEP
PRINT . . . ...-----_
..
_ _ _ _ _ _ __
INPUT2=O
INPUT5=!
Help
Exit
Revise input stimulus
Print values of signals
in tablist to pc screen
CLOCKIN=O
OUTPUTl=l
OUTPUT2=1
OUTPUT3=O
Design:
sample
Figure 2. Typical Debug Command Sequence
Using the Actel Debugger as a Functional Tester
The sample macro in part A of Figure 3, c lkl0, provides 10 clock
pulses to the pin CLK and prints the value of internal vector Q to a
specified output file after each clock pulse. The OUTFILE
command specifies the output file.
Defining User Macros
You may save time by creating user-defined macros for the
Debugger. These macros may contain a series of basic Debug
commands or may nest any combination of basic commands and
user-defined macros.
define (clkl0)
Part B of Figure 3 shows a nested macro, c lkl00, which executes
the clkl0 macro 10 times, providing 100 clocks to the CLK pin.
(repeat 10 (1 CLK)
define (clkl00)
(step)
(h CLK)
(step)
(fprint Q»
(repeat 10 (clkl0»
Figure 3. Sample Command Macros
Creating a Command File (TT269. cmd)
The command file (see Figure 4) applies test vectors to the TT2 6 9
counter. It redirects results to an output file, TT269. out and
compares the output vector Q of the counter to an existing results
file, TT269. cmp. The following notes correspond to each line in
the command file.
Lines 1 and 2: The vec tor command defines eight parallel load
input bits as vector P, and counter output as vector Q.
Line 3: The tabadd command defines the internal or external
nodes to be displayed or printed when the print or fprint
command is executed.
Lines 4, 5, and 6: The infile, outfile, and compfile
commands define input and output files. inf i Ie opens a file
containing input test vectors. outfile contains the output
results. compfile contains the data to be compared against the
current device status. Use the full path name of the file, and enclose
it in question marks.
Line 7: The define commands create user-defined macros. In
this example, the clkl0 macro provides 10 clock pulses to the
CLK input, fpr in t prints all nodes in the tabadd command to a
file defined by outfi Ie. fcomp compares the status of vector Q
to the file specified by compfile.
Lines 8, 9, and 10: Defines three user macros: up, down, and
load.
Loading a Command File
The loadfile command loads a defined command file
into the Debugger. Select setup I loadfile from the
Debugger menu, then enter the full path name of the command
file. For example:
/designs/TT269/TT269.cmd
Executing User-Defined Macros
To execute any previously defined macro, type the macro at the
command line while in the Debugger.
Line
1
(vector P PO PI P2 P3 P4 P5 P6 P7)
2
(vector Q QO Ql Q2 Q3 Q4 Q5 Q6 Q7)
3
(tabadd PE CEP CET liD P CLK Q TC)
4
(infile l/designs/tt269/tt269.pat")
5
(outfile l/designs/tt269/tt269.out")
6
(compfile l/designs/tt269/tt269.cmp")
7
(define (clkl0)
(repeat 10 (1 CLK) (step) (h CLK) (step) (fprint) (fcomp Q»)
8
(define (up)
CEP CET) (h PE liD) (step) (clkl0»
9
(define (down) ,(h PEl (l CEP CET liD) (step) (clkl0»
10
(1
(define (load)
(1 PE CLK) (step) (fassign P) (h CLK) (step»
Figure 4. Debug Command File (TT269. cmd)
•
Running the Sample Command File
Input Pattern File (TT269.pat)
In the following example, we will assign input test vectors from an
input pattern file named TT269.pat,andwriteoutputresults
to a file named TT269 . out. The command file (TT269 . cmd)
compares the counter's output vector Q to an existing results file,
TT269. cmp.
ObOOOOOOOO
Ob10000000
Ob01000000
Ob11000000
Output Compare File (TT269. cmp)
Output Results File (TT269. out)
S
T
E
P
00005:
00007:
00009:
00011:
00013:
00015:
00017:
00019:
00021:
00023:
00028:
00030:
00032:
00034:
00036:
00038:
00040:
00042:
00044:
00046:
00051:
00053:
00055:
00057:
00059:
00061:
00063:
00065:
00067:
00069:
00074:
00076:
00078:
00080:
00082:
00084:
00086:
00088:
00090:
00092:
pee u
P C
Q T
E E E D
L
K
C
P T
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
00000000
10000000
10000000
10000000
10000000
10000000
10000000
10000000
10000000
10000000
10000000
01000000
01000000
01000000
01000000
01000000
01000000
01000000
01000000
01000000
01000000
11000000
11000000
11000000
11000000
11000000
11000000
11000000
11000000
11000000
11000000
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
10000000
01000000
11000000
00100000
10100000
01100000
11100000
00010000
10010000
01010000
00000000
11111111
01111111
10111111
00111111
11011111
01011111
10011111
00011111
11101111
11000000
00100000
10100000
01100000
11100000
00010000
10010000
01010000
11010000
00110000
01000000
10000000
00000000
11111111
01111111
10111111
00111111
11011111
01011111
10011111
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
a
0
0
0
Ob10000000
Ob01000000
Ob11000000
Ob00100000
Ob10100000
Ob01100000
Ob11100000
Ob00010000
Ob10010000
Ob01010000
ObOOOOOOOO
Ob11111111
Ob01111111
Ob10111111
Ob00111111
Obll011111
Ob01011111
Ob10011111
Ob00011111
Ob11101111
Ob11000000
Ob00100000
Ob10100000
Ob01100000
Obll100000
Ob00010000
Ob10010000
Ob01010000
Ob11010000
Ob00110000
Ob01000000
Ob10000000
ObOOOOOOOO
Ob11111111
Ob01111111
OblOllll11
ObOOlll111
Ob11011111
Ob01011111
Obl001l1!1
Calculating Power in
Actel's ACT™ 1 Arrays
During the design of any system, the question, "How much power
will the device use?" always comes up. This is especially important
in battery-operated or other systems with limited available power.
This applications note will help you answer the power consumption
question.
Actel's ACT™ 1 array is CMOS-based, and so draws very little IDD
current. The amount of power supply current drawn is pirectly
related to the input clock frequency and the amount of logic
switching. Use the following formula for typical dynamic power
dissipation.
Note that powerisca1culated for a typical interconnect length and a
fan-out of three for each of the logic modules used in the design.
Total Chip Power (m W)
= 0.41 *N*F1 + 0.17*M*F2 + 1.62*P*F3
Where:
= Average logic module switching rate in MHz.
F1
F2 = Average clock pin switching rate in MHz.
Applications
Note
Example of Dynamic Power Calculation
If an ACT 1010 design used 200 logic modules of which 40 were
connected to the high fan-out clock buffer running at 20 MHz, and
the rest were running at 4 MHz, plus 50 lias (25 outputs, 25 inputs)
running at an average of 4 MHz, it would dissipate the following
amount of power:
Total Chip Power = 0.4IN*F1 + 0.17M*F2 + 1.62P*F3
= 0.41(200*4) + 0.17(40*20) + 1.62(25*4)
= S03.6mW
Power Dissipation Characteristics
The graphs in Figure 1 illustrate the power dissipation for both the
ACT 1010 and ACT 1020 devices as the operating frequency and
device utilization increases. These graphs assume that the logic
modules and the l/Os are operating at 1/4 the input frequency, and
the CLKBUF macro is used. The percent utilization applies to both
the I/Os and the number of logic modules.
F3 = Average I/O switching rate in MHz.
M
= Number oflogic modules connected to the clock pin.
N = Total number of logic modules used on the chip.
(including M)
P =
Number of I/O pairs used (input
with 50 pF of output loading.
+ output)
F1 term example: If a 4-bit ripple counter has a 10 MHz clock
input, the four outputs would be at 5, 2.5, 1.25, and 0.625 MHz, with
the average frequency of 2.1875 MHz «5 - 0.625) / 2) used for F1.
F2 is associated with the CLKBUF high fan-out clock buffer. F2
and M can be deleted if you are not using the CLKBUF macro in
your design.
F3 is the average switching rate for all the l/Os in the design.
N is the total number of logic modules used in the design, including
those that are connected to the CLKBUF macro.
:§) 1990 Actal Corporation
Package Limitations
The power dissipation values given in Figure 1 do not take into
account the power handling limitations of the package. The
following maximums apply:
Plastic J lead packages = 1.0 watts
Ceramic J lead packages = 1.S watts
Ceramic pin grid array = 2.0 watts
Quiescent Current
Quiescent, or standby, current ranges from 3 rnA, typical, to 10 rnA,
maximum. A voltage pump on the device runs constantly and
consumes power. All power dissipation calculations should include
standby current.
•
100
400
90
360
80
320
V
/V
ACT 1020
§'
70
S
t:
0
ACT 1020
/
60
~
vV
0
40
V
30
V
20
10
o
./
V
. / .".,.,V """'"
LP
o
~
V
V
20
30
40
240
en
200
vV
0
.......
j
/
LV
LV' V
160
120
I
I
80
50
60
70
I
80
40
90
V
/
ACT 1010
~
10
VI'
280
t:
0
'!'iii
/
a.
'iii 50
en
j
{
V
~
100
V
r1I
V
V
ACT 1010 _
V
lL L
20
10
30
40
50
60
Device Utilization (%)
= 1 MHz
f = 5 MHz
1500
1350
1200
S
t:
0
'!
'gj
900
ACT 1020
750
V
0
~
~
600
450
300
150
V
l/V
V
V
V
/v
./
V
L
L
1/
§' 1050
/"
l.<'
V
/
/V
I
ACT 1010
./
~
10
20
30
40
50
60
70
V
I----
V
Device Utilization (%)
f
V
80
Device Utilization (%)
f = 20 MHz
Figure 1. Power Dissipation Characteristics
90
100
70
80
90
10(
Calculating Power In Actel's ACT 1 Arrays
In the example design, power dissipation was calculated as 503.6
milliwatts at 25 °C and 5.0 V. If the voltage were taken down to 4.5
volts and the temperature elevated to 70°C, the example would
then dissipate the following amount of power:
Voltage and Temperature Derating
When the power supply or the ambient temperature deviates from
typical, derating factors must be applied to the total power
dissipation of the device.
503.6 X 0.87 X 1.01 = 442.5 mW.
System power supply derating factor:
VooM
4.50
4.75
5.00
5.25
5.50
Derating Factor:
0.87
0.94
1.00
1.07
1.13
An additional 15 milliwatts of power (nominal conditions) should
be added to account for standby current. This number is also
derated by the factors given for voltage and temperature.
Ambient temperature derating factor:
Temperature (0C)
-55
o
25
70
125
Derating Factor
0.98
0.99
1.00
1.01
1.03
ACT 1 Power Calculation Worksheet
Average switching rate of the logic modules
+
x
Multiplying factor of
+
x
0.41
t
Total number of logic modules used in the design
+
Clock buffer input frequency
+
Multiplying factor of
x
x
+
0.17
t
Total number of logic modules connected to the clock buffer
+
Average I/O switching rate
Multiplying factor of
+
+
x
x
1.62
t
Number of pairs of I/Os
+
Typical power dissipation of the voltage pump =
Subtotal power dissipation =
Multiplied by
Voltage derating
Temperature derating
Total device power dissipation, in milliwatts
=
•
15.0
II
Three-Stating
ACTTM 1010/1020 Designs
During board test and debug it is frequently desirable to place all
chip lias into a three-stated condition. This provides iso~ation
from other three-stating circuit devices connected to sIgnals
common to the ACT™ device. The three-stated condition also
allows board test for trace integrity or insertion damage to ACT
device pins.
Three-stating a design is easy using the unique debug features of
ACT device architecture. Three special pins on the ACT device
facilitate three-stating: MODE, SDI, and DCLK.
Three-State Pin Assignments
44-pln
PLCC/JQCC
68-pin
84-pln
PGA
84-pin
MOOE
34
54
66
SOl
36
56
72
811
OCLK
37
57
73
C10
E11
Applications
Note
Pins SDI and DCLK should remain unassigned by the user or
should be defined as input-only.
You may three-state all user-defined pins regardless of their
normal mode definition: input-only, output-only, or
three-stateable. Each pin can be temporarily three-stated for test
and debug purposes.
Figure 1 shows the sequence of three-stating. Seven data bits are
clocked into the device, using the SDI pin as data input, and DCLK
as clock. The MODE pin distinguishes "test" mode from "normal."
The data sequence clocked is {OOOlOll}. After clocking the
seventh bit, all user-defined pins enter a three-state condition until
MODE is taken low.
Actel Actionprobe™ diagnostic tools allow 100% observability of
internal nodes; ACT devices may also be isolated from external
board circuitry. Together, these two features provide a powerful
debugging feature previously unavailable in custom devices.
>1 )..Is
V1H
V1L
OCLK
SOl
V1H
V1L
0
V1H
V1L
MOOE 0
Figure 1. Three-state Timing
Notes:
1. 0 V:::;; V 1L ;5; 0.5 V; 3.0 V:::;; V IH :::;; Vee
2. Test mode configuration is a low frequency ( < 1.0 MHz) operation.
3. All setup and hold conditions (th' t s) ~ 250 ns.
©
1990 Ac1el Corporation
3-29
II
3-30
Multi-ASIC and
System Simulation
with Actel FPGAs
Multiple ASIC and board-level design simulation produces a
dilemma: while individual simulation of component ASIC designs
is possible, there is no easy system-level simulation process for
designs consisting of ASICs and other component types such as
TTL.
Actel offers complete timing simulation using Viewlogic®
Viewsim® for board-level designs with an unlimited number of
Actel field programmable gate arrays (FPGAs).
The technique is as follows:
• Design and simulate timing for each FPGA. This process
includes Viewsim functional simulation, place and route using
Actel's Action Logic™ System (ALS) software, and
back-annotated timing simulation.
• From a DOS window in the \ WORKVIEW directory, execute
the following command for each FPGA design:
Applications
Note
VSM YOURCHIP -W -DYOURCHIP.DTB
The command causes the Viewsim compiler to create a flat
WIR file for the design to the Builtin or Viewlogic simulation
primitive level. The. WIR file includes delay information from
the file YOURCHIP.DTB.
• Copy each .WIR file to the WIR subdirectory. The
VIEWDRAWINI file specifies the location of the WIR
subdirectory (on the line beginning with "DIR").
• Modify each .WIR file (using non-document mode), adding the
lines:
G GND
G VDD
between the last AP/ AS line and the first M line in the. wir file,
as shown in Figure 1.
II
Figure 1. Modifying the .WIR File
©
1990 Actel Corporation
• Edit the file \WORKVIEw\VIEWDRAWJNI to comment
out the Actel library pointer and to add pointers for the
other libraries to be used in the design. An example is shown in
Figure 2 below.
• Create symbols for each Actel FPGA in WorkvieW® .
• Capture the system-level schematics and simulate the design.
VIEWDRAW.INI (NEW)
VIEWDRAW.INI (OLD)
...
...
DIR 0 C: \DESIGNS
becomes
DIR 0 C: \DESIGNS
DIR 6 C: \AClVIEW
DIR 7 C: \WORKVIEw\BUILTIN
...
I DIR 6 C:\AClVIEW
DIR 7 C: \WORKVIEw\BUILTIN
DIR 11 C:\WORKVIEw\74LS
. ..
Figure 2. Modifying the VI EWDRAW.INI File
3-32
Constructing RAM with
ACTTM 1 Macros
Introduction
This applications note illustrates an efficient way to construct
RAM of varying width and depth on ACT™ 1 field programmable
gate arrays. Actel has created a number of soft macros for this
purpose. Below is a listing of macros available for RAM
construction. Details on these soft macros can be found in the Actel
Macro Library.
Function
Soft Macro
DEC4
2 Une Decoder with Write Control
DECS
3 Une Decoder with Write Control
DEC12
4 Une Decoder with Write Control
DEC16
4 Une Decoder with Write Control
RAM4
4 Bit Deep by 1 Bit Wide RAM Macro
SRAM4
4 Bit Deep by 1 Bit Wide Synchronous RAM Macro
DPRAM4
4 Bit Deep by 1 Bit Wide Dual Port RAM Macro
SDPRAM4
4 Bit Deep by 1 Bit Wide Synchronous Dual Port RAM Macro
RAM8
8 Bit Deep by 1 Bit Wide RAM Macro
SRAM8
8 Bit Deep by 1 Bit Wide Synchronous RAM Macro
DPRAM8
8 Bit Deep by 1 Bit Wide Dual Port RAM Macro
SDPRAM8
8 Bit Deep by 1 Bit Wide Synchronous Dual Port RAM Macro
RAM4X4
4 Bit Deep by 4 Bit Wide RAM Macro with Separate Read and Write Select
RAM4X4A
4 Bit Deep by 4 Bit Wide RAM Macro with Common Read and Write Select
RAM4X9
4 Bit Deep by 9 Bit Wide RAM Macro with Separate Read and Write Select
RAM4X9A
4 Bit Deep by 9 Bit Wide RAM Macro with Common Read and Write Select
RAM8X8
8 Bit Deep by 8 Bit Wide RAM Macro with Separate Read and Write Select
RAM8X8A
8 Bit Deep by 8 Bit Wide RAM Macro with Common Read and Write Select
The soft macros can be used individually or combined to create
RAM of any desired width and depth. The last six macros on the
above list are complete RAM functions, containing address
decoders, write control, and a reset. They can be used
independently of the other RAM macros, or they may be combined
as needed.
©
Applications
Note
1990 Actel Corporation
RAM macros are constructed from transparent data latches,
therefore the outputs will follow the inputs for as long as the write
line is held true. This minimizes the silicon area required for each
RAM function.
I
contains the address inputs, a low-true write input, and the RAM
bit select outputs.
Decoder Functions
The DEC4, DEC8, DEC12, and DEC16 decoders have been
created to interface easily to Actel RAM macros. Each decoder
DEC12
WRO
G[1:S]
WR1
WR2
WR3 G[9:12]
NWR
DECS
WRO
G[1:S]
WR1
WR2
NWR
DEC4
WRD
G[1:4]
WR1
NWR
DEC16
WRD
G[1:S]
WR1
WR2
WR3 G[9:16]
NWR
Outputs
Inputs
WRO
WR1
NWR
G[1:4]
X
X
D
1
0
0
0
1
0
0
0
D
0000
1000
0100
0010
0001
Sample Truth Table for the DEC4 Macro
Figure 1a. Decoder Macros
Figure la illustrates the four DECx macros, and shows a sample
truth table for the DEC4 macro. Note that the Gx outputs are
selected only when the write input, NWR, is at a logic zero. The
WAx
NWR write pulse must occur only while the address, determined by
the WRx signals, is valid. All of the DECx macros function
similarly.
_ _ _ _ _"""'
__
_ _ _ _J',
:: K_ _ _
Valid
address
:: K_ _ _ __
NWR
G[x] output _ _ _ _ _--10--/
Figure 1b. Decoder Macro Timing
Figure Ib illustrates the timing requirements for the DECx macros.
Soft macro timing is extracted after place and route of the design.
Constructing RAM with ACT 1 Macros
RAM Functions
All of the RAM building blocks are organized as 1-bit wide by
either 4 or 8 bits deep. These macros are "stacked" to create RAM
of the desired width. In addition, single or dual port, and
asynchronous or synchronous versions are available, as shown
below:
Synchronous
Asynchronous
Depth
Single Port
Dual Port
Single Port
Dual Port
4 Bits
RAM4
DPRAM4
SRAM4
SDPRAM4
8 Bits
RAM8
DPRAM8
SRAM8
SDPRAM8
The RAM4 and RAM8 macros have a reset input as an additional
feature.
Figure 2 illustrates the schematic diagram and symbol for the
SDPRAM4 macro. Note that every input presents a single unit load
to the driving source. This is true for every input on all RAM
macros. The G[x] inputs enable the clocking source to load data
into the latch whenever both the eLK and appropriate G[x] inputs
are at a logic 1. Multiple bits can be written to by controlling the
G[x] inputs; however, only one latch output can be read at a time.
RDO
RD1
DINA~
ClK~
~
DLME1A
Y
V
~v
A
B
Q
,..
S
E
~ G
'"
DLME1A
G[1:4)
-
G1
G2
G3
G4
A
A
A
A
~v
~v
~v
~v
A
B
,..
>::
'"
DO
'----
SEl~~
Y
y
r-.... ...........
,..
>::
'""
DLME1A
A
B
rl- OUT
~V
Q t-----
S
Symbo
E
G
-
'----
A
B
,...,
S
E
G
--C
II
.......
MX4
D2
..-
SO
D1
DLME1A
DINB-L~
S1
Q t-----
S
E
G
Q
V
Figure 2. SDPRAM4 Macro Schematic
-
SDPRA M4
RDO
RD1
OUT
DINA
DINB
SEl
ClK
G[1:4]
Constructing RAM Modules
By combining RAM soft macros, complete RAM modules can be
constructed. Figure 3 illustrates the RAM4X4 function as
constructed from Actel RAM and decoder soft macros.
Each RAM4 macro is 4 bits deep. Using four RAM4 macros as
shown creates a 4-bit wide, 4-bit deep RAM module. Note that the
DEC4 macro can drive up to 24 RAM4 macros. A RAM module of
any size can be constructed in similar fashion.
.----
~
NRST~
DO
Y
7'
RAM4
ROO
RD1
DIN
OUT f - - 00
NRST
G[1:4]
4.-D1
RAM4
RDO
RD1
DIN
OUT I--- 01
NRST
RAM Depth
G[1:4]
RD1 2 -
~
RAM4
4~ RDO
Y
7
WRO
WR1
NWR
-
~
..
•
D2
RD1
DIN
NRST
Symbol
OUT I--- 02
RAM4X4
WRO
WR1
NWR
RDO
RD1
NRST
G[1:4]
RD0 2 -
IRAM~d~
y
V
DEC4
WRO
G[1:4]
WR1
NWR
D3
RAM4
RDO
RD1
DIN
OUT
NRST
G[1:4]
Figure 3. RAM Module Example
-
03
DO
D1
02
D3
00
01
02
03
Constructing RAM with ACT 1 Macros
Figure 4 illustrates a 12-bit deep RAM module constructed from
RAM and decoder macros.
Constructing Deep RAM Modules
The DEC12 and DEC16 four line decoder macros allow the
construction of RAM modules of 12 or 16 bits deep, respectively.
The same technique applies to 16-bit deep RAM modules, using
the DEC16 macro and two RAM8 macros.
RD3
RAM8
RDO
RD1
RD2
DIN
OUT I - - NRST
RDO
RD1
RD2
Data in
Reset
AGY
S
G[1:8]
~
-
DEC12
WRO
WR1
G[1:8]
WR2
WR3 G[9:12]
NWR
MX2
RAM4
L - - - RDO
' - - - - RD1
lj::wldeX
DIN
OUT
NRST
I
y Output
12 Bltsdeep
G[1:4]
Figure 4. Deep RAM Module Example
All common inputs are tied together on the macros, completing a
I-bit wide by 12- or 16-bit deep RAM module. Up to 24 RAM
macros can be stacked without additional buffering.
II
Using the DRAM
Controller Supermacro
Introduction
Processor-based systems require control logic for DRAM accesses
and refresh. System designers can use a VLSI DRAM controller
chip with a built-in processor interface. The chips are easy to use,
but slow, and they operate in a limited number of configurations.
One alternative is to build a controller out of discrete logic,
including PAL@, delay lines, and standard components. Such a
controller can be customized to the system, but it requires several
components with the attendant assembly, board space, and failure
rate costs.
Another way to implement the controller is to use a masked gate
array. Although this provides a single-chip custom solution, large
NRE costs and lengthy delays are inherent problems in developing
masked gate arrays.
The ideal solution is to use an Actel field programmable gate array
(FPGA.) The Actellibrary contains a DRAM controller macro
that may be customized for the target system and then combined
with other logic required to implement the system.
This note explains the operation of the macro and how to use it in
developing a system. The design has been tested at a clock rate of
16 MHz and can operate easily with both 120 ns and 100 ns
DRAMs.
The sample implementation has two address channel inputs and
decodes two address bits to four RAS drivers. It has no byte
decoding so that it addresses four megabytes on a 32-bit boundary.
Architecture
The architecture of the controller is shown in Figure 1. The three
major blocks are the Refresh Timer, Memory Access Arbitrator,
and Memory Timing Control.
The refresh timer generates refresh requests on intervals set by
decoding counter outputs. The arbitrator is a state machine that
prioritizes memory requests and sends signals to the timing control
to initialize memory cycles. The timing controller asserts the
DRAM and address multiplexor control lines at the appropriate
times for the cycle being executed. A detailed operational
description follows.
REFRESH
CYCLE
REFRESH
REQUEST
REFRESH
TIMER
Applications
Note
RW,
RAS,
CAS
MEMORY
TIMING
CONTROL
MEMORY
ACCESS
ARBITRATOR
ACCESS
CYCLE
MEMORY
CYCLE
ACKNOWLEDGE
-
MEMORY
ACCESS
REQUESTS
}
REQUESTOR
AD DRESS BUS
Figure 1. DRAM Controller Supermacro Architecture
MEMORY
ADDRESS
BUS
I
Refresh Timer
The refresh timer consists of two counters, a five-bit binary
counter, and a three-bit Gray counter. The binary counter
generates a pulse every two microseconds. Its carry-out is used to
enable the three-bit Gray counter. The Gray counter outputs are
decoded to set the refresh request flip-flop and to reset the
counters. The Gray counter is used to avoid race conditions on
state decodes, which can cause spurious refresh requests.
The decode selected from the Gray counter determines the period
of refresh requests. The macro's range is from 2 to 14
microseconds. Longer periods could be achieved by adding a bit to
the binary counter with a loss of granularity in selecting the refresh
period.
The refresh request flip-flop remains set until the arbitrator begins
a refresh cycle. The memory controller uses CAS-before-RAS
refresh so no refresh address counter is required.
Arbitrator
The arbitrator accepts requests for memory cycles from the system
processor, 110 channel, or the refresh flip-flop. If the memory is
idle, the arbitrator goes to the state appropriate for the requestor
and proceeds through the states of the cycle as shown in Figure 2.
ADDSEL.
ACCSTRT
REFSTRT
REFEND
Figure 2. Memory Access Arbitrator State Graph
If the memory is not idle, the new request will wait until the current
cycle is complete. The processor has priority over the I/O channel,
and the channel haS priority over the refresh. Processor or 110
channel cycles begin by setting a flip-flop whose output (ADDSEL)
selects the appropriate address bus and asserting a signal
(ACCSTRT) to the timer to start the memory access. Refresh
cycles assert a signal (REFSTRT) that begins the refresh cycle.
DRAM precharge timing requirements are not violated by
adjacent memory cycles because RAS is turned off prior to the end
of a cycle. Mter that, the arbitrator returns to the idle state (SO)
before it can begin another cycle and reassert RAS. At a clock
frequency of 16 MHz, this pattern ensures a precharge time of at
least 90 ns. Faster clocks could require unused states to be
employed as timing buffers.
Using the DRAM Controller Supermacro
Memory Timing Control
The timing controller begins memory cycles when a pulse is
received from the arbitrator.
A refresh cycle causes the CAS signal to be asserted first and then
all the RAS lines a half cycle later. The refresh cycle is ended by the
arbitrator reaching a state and issuing a pulse (REFEND).
Processor or I/O memory cycles begin by asserting the RAS line
that is decoded from the two most significant address bits. The
following clock sets a flip-flop that causes the address multiplexor
to switch from the RAS address bits to the CAS. The multiplexor
control then sets a flip-flop to cause the CAS line to be set a short
time later.
The multiplexor line also goes to a flip-flop that will be set on the
next clock and causes both the arbitrator and the DRAM control
flip-flops to end the memory cycle.
The timing controller may require more customization than the
other modules in the macro to integrate it into other systems.
Faster clocks require timing adjustments to the memory control
signals. These adjustments can be made with additional flip-flops,
clocked on either edge of the clock, to minimize the length of the
cycle without violating any timing constraints.
I
Using the DMA
Controller Supermacro
Introduction
DMA controllers are used to supply main memory addresses and
to handle transfers for I/O channels in systems. They are
frequently implemented with standard VLSI controllers, but they
may not be adapted easily to some systems. Discrete logic offers a
custom solution, but it is expensive in terms of the number of
devices required.
Actel offers a field programmable gate array (FPGA) solution with
a DMA controller supermacro. The supermacro can be customized
to accommodate any system, and can be combined with other logic
in the FPGA to integrate other system functions.
Applications
Note
This note explains the DMA controller design and its use to control
the flow of data between an I/O channel and memory in a
processor-based system.
Architecture
The architecture of the controller is shown in Figure 1. It consists of
an address counter, a transfer counter, and a state machine
controller.
SYSTEM
DATA
BUS
ADDRESS
COUNTER
L--
CHANNEL
f---- MEMORY
ADDRESS
BLOCK
f---- TRANSFER
COMPLETE
TRANSFER
COUNTER
COUNTER
CONTROL
f---
TRANSFER
REQUEST SELECT -
DMA
CONTROL
SYSTEM
DATA BUS
ARBITRATION
MEMORY
ACCESS
REQUEST
Figure 1. DMA Controller Supermacro Architecture
The address counter provides addresses for channel memory
lccesses. The transfer counter counts the number of transfers in a
:>lock and interrupts the processor when transfers are complete.
The state machine controls access to the processor data and
;>
1990 Actel Corporation
address bus. It also serves as the interface between the system
memory controller and the I/O channel. Figure 2 shows the state
machine state graph.
•
PBGNT
ENDCY
INCREMENT
COUNTERS
(TCRLVR. ARLVR)
Figure 2. DMA Bus Interface State Graph
Operation
The processor writes the beginning address of the block transfer to
the address counter and memory index register. The counter has 20
bits to access 1 M bit memories. The register has two bits to allow
for accesses within one of four 1 M words. The counter and register
may be used with a 4 X 1 M word memory configuration, or may be
modified for any other configuration.
Loading the address counter/register is a two-step process, since
the data port on the DMA macro is 16 bits and the counter/register
is 22 bits. Upper and lower load controls are provided to load each
part of the address. A single control is used to load the
twos-complement of the block size to the transfer counter.
3-44
When the channel is ready for a transfer, it asserts a memory
transfer request (BREQ) to the DMA controller. The state
machine on the DMA controller performs the bus arbitration cycle
with the processor by asserting a processor bus request (PBREQ)
until it receives a processor bus grant (pBGNT). The processor bus
grant also goes to the I/O channel so it may deassert the original
request and drive data on the processor bus.
Next, the state machine requests a memory cycle (CMREQ) from
the memory controller. When the memory cycle is complete, the
DMA controller relinquishes the bus and increments the counters
for the next cycle. The cycle is repeated until the transfer counter
rolls over (fCRLVR), indicating the end of the block. The rollover
can be used to interrupt the processor to reload the counters for the
next transfer. There is also a rollover on the address counter
(ARLVR), which can be used to interrupt the processor.
Using the SCSI Interface
Controller Supermacro
Introduction
The rapid success of the SCSI interface as a standard has caused a
corresponding increase in the use of SCSI controller chips. NCR
manufactures several popular SCSI controllers. The NCR family of
controllers features internal FIFOs, parity pass, and a variety of
system bus interfaces.
Some of the devices can do multibyte system data bus transfers;
some are optimized for a standard bus. However, investments in
software and hardware development, as well as differences in price,
may preclude the use of these devices in some systems. In such
cases, a data transfer interface is needed between an eight-bit SCSI
interface controller and the system bus.
Applications
Note
This note explains how the Actel SCSI Interface Controller
supermacro controls the flow of data between the system bus and
an NCR 53C90B. The design can be adapted easily to a specific
system. The macro uses about 60% of an ACT™ 1010 and less
than a third of an ACT 1020.
Architecture
The architecture of the supermacro is shown in Figure 1. It consists
primarily of a double-buffered bidirectional transceiver and a state
machine. There is also a parity checker, which can be used to
interrupt a processor when a single-bit parity error is detected.
32-BIT BIDIRECTIONAL
TRANSCEIVER LATCHES
4:1 MULTIPLEXOR
SYSTEM
DATA
BUS
SCSI
CONTROLLER
DATA
BUS
..---...-<
I
TRANSCEIVER
CONTROL
TRANSFER
ACKNOWLEDGE
INTERFACE
CONTROLLER
SCSI
DATA
PARllY
ERROR
TRANSFER
REQUEST
PARllY
CHECKER
GENERATOR
SCSI
INTERFACE
CONTROL
Figure 1. SCSI Interface Controller Macro
The controller cycles data through, requests the processor for data
transfers, and passes control signals to the 53C90B to transfer data
and access internal registers.
©
1 QM .4r.tlDl
~nrnnr2tl"'n
The heart of the design is the state machine, which controls the
timing of all internal and external control lines. A state graph for
the state machine appears in Figure 2.
CLOCK BYTE 0
CLOCK BYTE 1
CLOCK BYTE 2
CLOCK BYTE 3
BREQF
Bl1E'OF • rs.w
DREQ = SCSI CONTROLLER DATA TRANSFER REQUEST
BREQ = REQUEST FOR DATA TRANSFER OVER THE SYSTEM BUS
XFER = INTERNAL SIGNAL CONTROLLING DATA TRANSFERS
BElWEEN THE TRANSCEIVERS
RW = READ/WRITE, WRITIEN INTO AN INTERNAL CONTROL
REGISTER BY THE PROCESSOR
GO
= READY TO TRANSFER, WRITIEN
CLOCK BYTE X = INTERNAL SIGNAL CONTROLLING BYTE TRANSFERS BElWEEN THE TRANSCEIVER AND THE SCSI CONTROLLER
Figure 2. SCSI Interface Controller State Graph
Using the SCSI Interface Controller Supermacro
Operation
Preparing for Data Transfer
The processor accesses the 53C90B internal registers by first
writing the appropriate data to the control1er, then reading or
writing the 53C90B.
The processor prepares the control1er to transfer data by writing
data to it that tel1s it the transfer direction and tens it to begin when
the 53C90B control1er asserts DREQ.
Writing
The processor writes a 32-bit word with four parity bits to the buffer
and prepares the system as detailed above. When the control1er
gets a DREQ signal from the 53C90B, it transfers the data in the
processor-side buffer to the SCSI-side buffer, then requests the
processor for more data. It then multiplexes each byte with its
parity bit out to the control1er data bus and issues an acknowledge
to the 53C90B.
If the request for data from the processor has been granted by the
time the last byte is sent, the process wil1 repeat, until the 53C90B
stops requesting data.
If the 53C90B stops asserting DREQ, the state machine wil1 wait
until the signal returns before sending the next byte. If the transfer
is of an odd number of bytes, or if the transfer stops on an odd-byte
boundary due to a failure, the processor can initialize the state
machine by activating the control1er chip-select line.
Reading
The flow of data from the control1er to the system bus begins when
the 53C90B asserts DREQ. The control1er latches each byte of a
word until the buffer on the 53C90B side is ful1. Then the control1er
transfers the word to the other buffer, sets the data transfer request
flag, and resumes latching bytes.
If the processor has responded to the data transfer request by the
time the next word is latched, then the process continues. If not, the
state machine wil1 stop until the buffer is read.
Parity Error Detection
The parity error detection logic checks each byte that passes
through when the DACK signal is asserted. If an error is detected, it
sets a flag that could be used to interrupt the processor. The flag
can only be cleared by the processor accessing the control1er.
I
Applications
Note
UART Design
peripherals (modems, terminals, printers, etc). This note
describes a general-purpose UART that utilizes 189 logic modules
(approximately 700 gates) in the ACT 1010.
Introduction
This applications note demonstrates the design of a Universal
Asynchronous Receiver and Transmitter (UART) circuit on an
Actel field programmable gate array (FPGA). The architecture of
ACT™ 10lQ/1020 arrays provides the flexibility needed for the
circuit design. Actel macros are used to speed and simplify the
design process.
Figure 1 shows the UART configured as a controller between a
computer and a peripheral device. Communications protocol is a
standard RS232 serial interface.
The UART transmission format is shown in Figure 2. A start bit
begins the transmission, followed by the data bits, and then an
optional even or odd parity bit. A programmed number of stop bits
concludes the transmission.
UART Description
The UART is a serial communication device that handles
transmission and reception of information between computers and
RD
TXD
WR
CPU
RXD
&
Serial
Peripheral
UART
Memory
CS,CD
DTR
DO-D7
DSR
II
Figure 1. UART System Block Diagram
MARK
~~_s_m_rt ~_D_0 .I__
__
__
D_1__.I__D_2__
~_D_3 ~_D_4 ~_D_5 ~_D_6 ~_p_a_r~
__
__
__
--I BAUD IFigure 2. UART Transmission Format
©
1990 Actel Corporation
__
Stop
serial data stream on the TXD output. It shifts out serial data at a
rate equal to 1/16 the clock frequency (pin CLKI6X).
UARTDesign
The UART application described in this note operates similar to an
INTEL 8251 operating asynchronously. See Figure 3 below.
When the CPU sends a data character, the UART automatically
adds a start bit, followed by 7 (or 8) data bits, an even or odd parity
bit, and a stop bit. The UART then transmits this "packet" as a
(Data Set Ready)
OSR
When the CPU receives a data character, a falling edge on the
UART's RXD input triggers the beginning of a start bit. The start
bit is again checked to prevent a false start. If the start bit is valid,
the data character is shifted in, followed by one stop bit. A serial
stream of data is then converted to parallel format, ready for CPU
retrieval.
00-07
OTR
(Data Terminal Ready)
RO
WR
CS
CD
Microprocessor
Interface
Control
00-07
Figure 3. UART Function
UART Design
Transmitter Design
Figure 4 shows the transmitter's buffer and shift register.
BITNUM
VDD
GND
II
COUNT
TXDONE
PARITYEN
TXDI
Figure 4. Transmitter Buffer and Shift Register
Because of the wide variety of latches available in the ACT 1
library, designers may build efficient registers for any application:
synchronous clear, gate-enable, active high or low outputs, and
active high or low gates.
The transmitter first latches the parallel data word DO-D7 from the
CPU in the buffer TlATCHO-TlATCH7. It then transfers the
data character to the shift register TSHIFfO-TSHIFf7 through
the B inputs. After the first clock (fXCLK), TXDONE goes low
selecting the A inputs of TSHIFfO-TSHIFf7. TXCLK then
serially shifts out the data character. A series of AND gates
(fGATEO-TGATE6) detect a logic high, indicating the stop bit,
propagating through the shift register.
Clocks
It is also possible to construct a two-phase clock system by
combining active high gate latches and active low gate latches. See
Figure 5 below.
Transmitter Buffer
The transmitter buffer TlATCHO-TlATCH7 uses transparent
latches to store a data character. The level-sensitive transparent
latch consumes only one ACT 1 logic module.
Latch
r- ~D-1
Latch
r-~-1
Latch
(Negative
Gate)
G
G
G
I
f
CLK
(Clock Network)
f
r---
Minimum
tpD + tLoglC + tsu
-I
Figure 5. Two-Phase Clock USing Transparent Latches
UART Design
The clock enable feature allows separation of data storage and
transfer while maintaining a common clock system. See Figure 6
below.
The ACT 1 macro library includes a large selection of flip-flops,
which are edge-sensitive storage elements. These flip-flops are
available with asynchronous clear, asynchronous preset, and clock
enable inputs. Negation bubbles are available on all inputs and
outputs.
Data In
Enable
CLK
Q
D
E DFE
Data Out
Primary Clock (CLK)
I
I
Desired Second Clock
~
f~----~
I
I
____~f
I
I
~~________E_na_b_le________~r--l
I
I
Oro. In
~
1..--I1lL...--_1
I
L
Data Out
Figure 6. Clock Enable Function
Transmitter Shift Register
The transmitter'S shift register TSHIFTO-TSHIFT7 uses flip-flops
with a built-in 2:1 multiplexor (MUX) on the D input. The built-in
MUX can emulate two-input logic functions, as shown in Figure 7.
y
The MUX flip-flop uses two logic modules (as does a standard
flip-flop). It is thus possible to construct a parallel load/serial shift
register without using additional logic.
GND
Q
D
DF1
•
A
B
A
B
Q
S DFM
CLK
elK
y
D
Q
DF1
•
A
Voo
B
elK
elK
Figure 7. Reducing Logic LevelS with MUX Flip-Flops
3-53
I
Transmitter Clock Circuit
The clock circuit is shown in Figure 8. It is implemented as a 4-bit
Gray code counter with both synchronous and asynchronous reset
(COUNT and RESET, respectively) and count enable (GOTX).
An external clock signal (CLK16X) drives the state machine. Gates
G 13 and G 14 decode the outputs 03-00 of the state machine and
generate an internal clock (TXCLK) for the transmitter shift
register. Only one bit changes at a time, eliminating glitches on the
transmitter clock (TXCLK).
TXRDY
BIT3
COUNT
A
RESET
Figure 8. Transmitter Clock Circuit
UART Design
Two-Level Logic
Decoders
The transmitter state machine takes advantage of the OR/AND
macro, a two-level logic function using only one logic module. In
combinatorial logic, use of two-level logic functions can
significantly reduce the module count. There are different types of
OR/AND macros with a wide choice of input combinations. Refer
to the ACT 1 family data sheet for a list of OR/AND and AND/OR
macros.
Decoder functions require signals and their complements. Using
negation assertion levels available on Actel macros simplifies the
logic path and conserves module usage. See the 2:4 decoder in
Figure 9.
It is important to note that propagation delay through a logic
module is independent of the implemented function. Thus the
designer should use negation bubbles and two-level logic macros
whenever possible.
A----~--------_A~
YO
B-----+----~--~B~
A
Y1
B
A
YO
B
A
Y2
B
A
Y1
B
A
Y3
B
A
Y2
B
A
Y3
B
Circuit with Actel Macros
Standard Circuit
I
Figure 9. 2:4 Decoders
Parity Generation
The transmitter parity generator is programmable for odd or even
parity. The parity generator circuit is shown in Figure 10.
From the shift register, data serially enters the parity generator on
the transmit line (fXDI). The generator inserts the parity bit just
prior to the stop bit. A 2: 1 MUX (PMUX) selects either data or
parity bit to the data transmit line TXD.
PARITYEN . - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - TXDI~------~~----------------------------------------------------~
S
TXDONE __
-t~..:..j
TXCLK __----'-..:..j
Y
GND
r-----------------~TXD
Figure 10. UART Parity Generator
3-55
Logic Module Conservation
Efficient utilization of Actel macros allows implementation of the
UART transmitter section using only 80 of the 295 available logic
modules on the ACT 1010 array. The parity generator uses the
XOR/AND macro. The XOR/AND, XOR/OR, and AND/XOR
functions each utilize only one logic module.
ACT 1 devices are particularly efficient at implementing
multiplexor functions. Both 2: 1 and 4: 1 MUXs can be implemented
in only one logic module each. MUXs implement internal bus
designs in place of internal three-state drivers on ACT 1 devices, as
shown in Figure 11.
A
B
y
0
Q
DFC1B
ClK
ClR
0
Q
DFC1B
ClK
ClR
•
GND
C
Figure 11. Internal Three-State Function Implemented with ACT 1 Multiplexor
Receiver Design
register, the parallel data character transfers to the buffer
(RlATCHO-RlATCH7) for CPU access.
Shift Register and Buffer
The receiver design takes advantage of the single-module
transparent latch with MUX, minimizing required circuitry.
Similarly, the built-in MUX can emulate other two-variable
functions.
Figure 12 on the next page shows the schematic of the receiver's
shift register and buffer. A logical "I" is loaded into the receiver
shift register RSHIFTO-RSHIFT9. After the first clock
(SHFfRX), XFERRX goes low, selecting the A inputs of the
MUX flip-flops. The low start bit precedes the serial data shifts into
the register. When a complete serial data character shifts into the
3-56
The MUX also can emulate an inverter, effecting a negation at the
data input of the transparent latch.
UART Design
~
-
RXDI
DOF1
r~
RFF2
~
V
--
XFERRX
A INV y
SHFTRX
~
INV
r---
~
Q
~~
BUF1
•
r----
~
Q
OFM
r--S
r--t>ClK
~
r---
~
DFM
RSHi'Ffa
L~
~
DFM
41---- S
G
~~ClK
Rl:ATcHs
r--
~
OFM
4.--S
4~
4'--~
.---1> ClK
Rs'i:iiFT6
~
DFM
4~ ..--1> ClK
I--S
4~~
RsH'i'i=T5
~
Q
S
B
.--~
4..--1> ClK
RsHiFf4
B
-~
.
DFM
S
.--~
~I>ClK
RsH'iFf3
Lr;:--
t>
Rs'HiFf2
Lr;;:--;;
B
BUF4
DFM
~.-
RSHIFT1
...
-
VDD
L~~
BrrCNT:~
B
DFM
S
~~
RSHIFTO
Ri:ATci:i1
RFFO
A
Q ~ ROO
~ RXCOUNT
BOLM
D
Q____.
~..__ S
DFC1
G
.---- ClK
Rl:ATcHo
ClR
~~~ClK
4'-~
Q~ RD1
DLM
r---
Q
DFM
S
DLM
Rl:ATcH2
L~
~
Q~ RD2
A
.
DFM
A INV y
Q~ RD3
DLM
RL:Ai=cH3
L-;;:--
B
RD4
r---
Q
B
Q~
DLM
Ai::ATcH4
Lr;:-B
Q~ RDS
DLM
RsHiFFr
L~
BUF2
RD6
r---
4t-t--S
B
Q~
DLM
S
G
Rl:ATcHa
S
4~t>ClK
B
f--II RD?
r--S
G
RsHiFf9
L~
B
Q
DLM
L-... I - -
MX2
B~
Y
CNTClR
~RESET
49B
AND3C
-~
Figure 12. Receiver Shift Register and Buffer
y
+. ...lXF~-
RX
Q
D
OFP1
~J- t>ClK
Ri=F1
•
Receiver Clock Circuit
The re~iver clock circuit is a 4-bit Gray code counter (similar to the
transmItter clock circuit) and is driven by the same external clock
signal (CLK16X). See Figure 13.
The state machine has both synchronous and asynchronous resets
(RXCOUNT, RESET). A filter circuit (RXDFILTER) prevents
false starts due to errors on the data line (RXDI). The receiver
clock (SHFfRX) is decoded from the outputs of the state machine
(00-03).
RXDI
RXDFILTER
c·LK-1-6X----------------1-1---------------------~~-bCLK
FFO
0r-4-----------~
GORX
FF1
a 00
BITO
QO,._------':..:..j
G16
a
01
_----=;..0--
BIT1
01~--"-------1
>-,-Y-+++-+-I A
B
DFMO
.---------------~H_~S
ClK
CLR
Figure 13. Receiver Clock Circuit
Y
SHFTRX
UART Design
serial data stream and compares it to the incoming parity bit.
The parity error flag (pARITYERROR) is set accordingly. See
Figure 14.
Parity Checker
The receiver parity checker is programmable to either odd or even
parity. The parity checker generates a parity bit from the incoming
RXDI
G3
N
~
XNOR
1-....---=--1
DFl
PARITYERROR
Y
Figure 14. Parity Error Generator
Design Complexity
Table 1. Logic Utilization
The UART transmitter section can be implemented using only 80
of the 295 available logic modules on the ACT 1010 device. The
receiver section requires only 78 logic modules.
Transmitter
These figures represent 27.1% utilization of the array for the
transmitter section and 26.4% utilization of the ACT 1010 for the
receiver section.
Table 1 provides a breakdown of logic utilization for the UART
design on Actel's FPGA.
Logic Modules
Buffer and Shift Register
45
Clock Circuit
26
Parity Generator
9
Total
Receiver
80
Lagle Modules
Receiver Shift Register and Buffer
42
Receiver Clock
30
Receiver Parity Checker
6
Total
78
I
Actel TA181 ALU
Introduction
Applications
Note
be accurate at the transistor level, it typifies an ASIC design
implementation.
The venerable TTL 74181 is a well-known and useful arithmetic
logic unit (ALU). Its four select lines, mode, and carry-in bits
combine to offer over 40 unique arithmetical and logical functions.
Many designers dislike losing such functionality as they move from
discrete to ASIC technology.
The Actel TA181 design requires only four logic levels (five for
A = B output). It uses components available from Actel's library.
~ctellogic .modul~ f1exib~lity enabl~s such "logic compression" by
Implementmg a WIde vanety of lOgIC functions.
For example, each bit pair in the ALU passes through a section of
logic, where select lines qualify the pair before outputting them to
the remaining logic. The qualification logic, as it appears in TTL
manuals, is shown in Figure 1. It uses eight gates and three logic
levels.
Designers may retain the function of the 74181 using Actel field
programmable gate arrays. Actel's macro library includes "TAI81,"
a soft macro functionally identical to the TTL component.
Implementation
The 74181 depicted in popular TTL data manuals has six logic
levels (seven for A =B output). While this representation may not
SO
S1
S2
S3
I
83---a_-l"'l
A3------------------~~~~~--~
Figure 1. TTL 74181 Qualification Logic
©
1990 Actal Corporation
Another example oflogic compression is in macro OAS, used in the
logic generating the F3 output. The macro implements the
expression:
Figure 2 shows Actel's version of qualification logic. By contrast, it
requires only two modules and one level of delay. The driving gate
is an OR function, as opposed to NOR in the TIL diagram.
Bubbling (inverting) inputs on each gate driven by the section
adjusts for this difference. All gates in the Actel library have
versions with bubbled inputs.
Y
One module implements two three-input ANDs to two-input OR
(A04A), and the other module implements two two-input ANDs
to three-input OR (AOSA). Combined, these modules create the
qualification logic.
Conclusion
Actel's ALU 181 design uses only 32 logic modules, only about 10%
of an ACT™ 1010 gate array. It is an excellent example of the Actel
logic module's adaptability, both in implementing logic functions
and in compressing logic, for efficient, high-performance designs.
The remaining logic in the TA181 macro is similar to that found in
the TIL 74181, with some improvements. M inputs require no
inverter, because M-driven gates have bubbled inputs. Single logic
module macros implement the AND-OR functions.
SO
S1
S2
= /((/A*/B*/C) + (/A*/D»
in a single module. OAS is a clear example of the savings in delay
and modules achievable with the versatile Actel logic module.
S3
r---------------,
B3~--__----------~--~_+--~~A~
L _______________
I
I
I
I
I
I
I
I
I
JI
r---------------,
~__________~~------~~A~
I
I
I
A3.r----------------~--------~~--====~~
L _______________
Figure 2. Actel's TA181 Qualification Logic
I
I
I
I
I
I
JI
Actel Fast Adders
Applications
Note
Introduction
Group Size
Adders, in a variety of sizes, perform a very common function in
digital designs. They are frequently implemented using
carry-look-ahead (CLA) logic to speed carry propagation.
However, as adder size increases, CLA logic requires increasingly
wider gates and more logic levels; such requirements degrade adder
performance.
Since carries ripple within groups, small group size is an advantage.
Small groups, however, require more groups for a given number of
bits, lengthening the carry propagation path. Thus, the design
trade-off is the number of delays to a group sum versus the number
of delays required to propagate a carry.
Implementation
Actel solved the CLA logic problem by designing its family of fast
adders using the Carry Select (CS) method. CS adders partition the
addition into groups of bits. Two additions are performed
simultaneously for all the groups. One addition is done with a
carry-in of one and the other with a carry-in of zero. The two group
sums and their carry-outs are connected to two-input multiplexors
(MUXs).
When determined, the carry from the lower groups selects the
group that supplies the sum to the outputs. The carry-out for the
group selects the sum and carry for the next higher group, and so
on.
The speed of group additions and carry propagation determine the
speed of the CS adder.
By optimizing group sizes based on adder macro delay parameters,
Actel adders are very fast.
Carry Propagation
As previously mentioned, carry propagation speed is a limiting
factor in carry select adder performance. Group carries must be
selected, just as the sums are. Unlike sums, carries select higher
carries and so forth. This creates the problem of cascading carries
through multiple logic levels.
The flexibility of Actel's logic module alleviates cascading carry
problems by implementing a cascaded two-input MUX in a single
module. The logic is shown in Figure 1. A two-input MUX drives
the select for another two-input MUX. The macro allows two-level
carry selection using only one module and in only one delay. All
Actel adder macros use the cascade MUX macro for rapid carry
propagation.
II
A
B
Figure 1. Cascade Multiplexor Macro
The Actel Fast Adder Family
The following table lists Actel adders, their speeds (driving one
load), and the number of logic modules used:
Number of Bits
Worst-Case
Delay
(ns)
Levels of Logic
FADD8
8
36
4
FADD12
12
45
5
63
FADD16
16
47
5
78
FADD24
24
56
6
120
FADD32
32
65
7
160
Macro Name
Number of Modules
37
Eight-Bit Twos
Complement Multiplier
conventional full adders for twos complement multiplication.
When the size of the multiplier and multiplicand are known, the
Baugh-Wooley formula can generate a series of product terms. In
this example, when both multiplier and multiplicand are eight bits,
M and N equal 8 in the formula.
Introduction
Multipliers are vital to systems design. In the past, designers
integrated VLSI multiplier chips into their designs. As ASICs
become more popular, ASIC-based systems increasingly utilize
multiplier macros. Previously, field programmable devices did not
have sufficient architectural flexibility and connectivity to
efficiently implement multipliers. Actel has designed an eight-bit
twos complement multiplier that easily fits on an ACT™ 1010 or
ACT 1020 field programmable gate array (FPGA).
Each product term contains "2" raised to a power. The product
terms are arranged into an array according to their binary power.
Adding the columns of the array results in the product of the two
original operands.
In the generic architecture, AND gates generate the product terms,
and the adders sum them together. The adder array has a
parallelogram shape. Dividing the parallelogram into four sections
eases the design process. Each section generates a set of partial
products, which are then added together.
Algorithm
High-performance multipliers form the product of two numbers
using combinatorial logic and an array of adders. Actel's multiplier
is based on the Baugh-Wooley! algorithm (Equation 1), and uses
P
am-1 b n-1 2m+n-2
m-2 n-2
+ "'"~
Applications
Note
~
"'"
i=O j=O
i+j
aib2
j
Equation 1. Baugh-Wooley Algorithm
Actel Multiplier Macro Architecture
The Actel multiplier macro architecture appears in Figure 1. It
consists of three major sections: the partial product generators,
Wallace tree adder, and final product adder.
The partial product generators implement the adder array sections
using Actel library components. Since the partial product
©
1990 Actel Corporation
generators have similar structure, the design was copied and
modified to create the others.
The Wallace tree adder sums the partial products and passes two
binary numbers on to the final product adder. The final product
adder performs arithmetic necessary to produce the product. The
following is a discussion of each element in detail.
I
Partial Product
Generators
Wallace Tree
Adder
Final Product
Adder
Multiplier Bus
Multiplicand Bus
Figure 1. Actel Multiplier Macro Architecture
Partial Product Generator
Each partial product generator requires as input four bits from
each operand. The bits pass through AND functions, which output
through an array of adders to generate eight partial products.
Because the partial product generators operate in parallel, all
partial products are produced simultaneously.
The schematic has an array of adders with inputs either from an
AND gate or the sum from another adder. The Actellibrary adder
FA2A adds a two-input AND term with inverted inputs to another
input and a carry-in. Most of the adders in the generator use FA2A,
requiring only two logic modules to implement both the AND and
full adder functions.
Wallace Tree Adder
The multiplier's Wallace tree section sums columns of terms from
partial product generators. Summing a column in binary arithmetic
requires tracking carries generated to the higher power-of-two
positions in the number. The Wallace tree uses three-bit adders to
input up to three bits of the column. It feeds the adder sums into
adders taking in other bits of the column, and feeds carry-outs to
the inputs of adders summing the column in the next higher bit
position, and so on. The tree output is a series of sum and carry-in
bits which may be added in a conventional adder as two binary
numbers, producing the final product.
Because both carry-outs and carry-ins are active-low on the Actel
adder macros, the adders can be used in a Wallace tree structure as
a three-bit adder which outputs a sum and a carry. The carry-out
from an adder in one column inputs directly to the carry-in of the
next higher column. Other than the adders, no logic is required to
design the Wallace tree.
Final Product Adder
The multiplier's last section is a binary adder. The Wallace tree
completely sums some lower columns of bits, which pass from the
tree to the product bus. The tree outputs higher bits as two binary
numbers to be added. The bits comprise the sum bit for each bit
position and its respective carry-in. The final adder adds them
together for the remaining bits of the product.
The final adder uses the same design technique as the Actel family
of fast adder macros. The architecture of the final product adder is
shown in Figure 2.
Eight-Bit Twos Complement Multiplier
Carry
Select Adders
A Bus
B Bus
Sum Bus
Figure 2. Final Product Adder Architecture
The Carry-Look-Ahead Problem
Adders are usually implemented using carry-look-ahead (CIA)
logic to speed carry propagation. However, as the size of the adder
increases, the CIA logic requires wider gates and more logic levels;
such requirements degrade adder performance.
Carry Select Addition
To solve the CIA problem, Actel designs its fast adders using the
Carry Select (CS) method. CS adders partition the addition into
groups of two or three bits. Two additions occur simultaneously for
all groups. One addition has a carry-in of one and the other a
carry-in of zero. The two group sums and their carry-outs connect
to two-input multiplexors.
When determined, the carry from the lower groups selects the
supply that sums the outputs. Next, the carry-out for the group
selects the sum and carry for the next higher group, and so on.
The CS adder speed is determined by group additions speed and
carry propagation speed.
Group Size
Since carries ripple within groups, small group size is an advantage.
Small groups, however, require more groups for a given number of
bits, lengthening the carry propagation path. Thus, the design
trade-off is the number of delays to a group sum versus the number
of delays required to propagate a carry.
By optimizing group sizes based on adder macro delay parameters,
Actel adders are very fast.
I
Carry Propagation
As previously mentioned, carry propagation speed is a limiting
factor in carry select adder performance. Group carries must be
selected, just as the sums are. Unlike sums, carries select higher
carries and so forth. This creates the problem of cascading carries
through multiple logic levels.
The flexibility of Actel's logic module alleviates cascading carry
problems by implementing a cascaded two-input MUX in a single
module. The logic is shown in Figure 3. A two-input MUX drives
the select for another two-input MUX. The macro allows two-level
carry selection using only one module and in only one delay. All
Actel adder macros use the cascade MUX macro for rapid carry
propagation.
A
B
Figure 3. Cascade Multiplexor Macro
The Actel Advantage
The eight-bit twos complement multiplier illustrates the type of
macros that can be implemented on Actel field programmable gate
arrays. The design method detailed here suits multipliers of many
different configurations. This technique also applies to 3X7, 9X4,
or 6X5 multipliers, depending on system requirements.
Other multiplier algorithms also apply. A magnitude array
multiplier design using the Cavanagh method 1 requires less logic
than the eight-bit multiplier above. Other alternatives include the
Booth and Bit-Pair Recording methods 1.
The multiplier macro may be designed as part of the logic in a field
programmable gate array. The remaining gate array modules are
available for other design purposes, either other arithmetic
functions or glue logic. For example, Actel's 8X8 multiplier macro
uses 241 of 546 modules on an ACT 1020. Nearly 60% of the chip is
still available for other purposes.
In any design, Actel provides the technology to complete the entire
gate array development process including schematic capture,
simulation, place and route, timing analysis, and device
programming.
1. Cavanagh, Digital Computer Arithmetic Design
Implementation, McGraw-Hill, New York. 1984.
and
Article Reprints
Article Reprints
l . . . - - - -_ _ _ _
_~
Article Reprint 1:
Analyze FPLD Architectures, Performance, and Development Tools to Optimize Design-Dependent Selection . . . . ..
4-1
Article Reprint 2:
A CMOS Electrically-Configurable Gate Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 4-11
Article Reprint 3:
An Architecture for Electrically-Configurable Gate Arrays ................................................. 4-23
Article Reprint 4:
Dielectric-Based Antifuse for Logic and Memory ICs .................................................... 4-29
Article Reprint 5:
Field Programmable Gate Array with Non-Volatile Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-35
Article Reprint 6:
Testing Antifuse-Based FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 4-45
Article Reprint 7:
Compare ASIC Capacities with Gate Array Benchmarks ................................................. 4-49
Article Reprint 1
ANALYZE FPLD ARCHITECTURES,
PERFORMANCE AND DEVELOPMENT
TOOLS TO OPTIMIZE DESIGNDEPENDENT SELECTION
by Dennis McCarty
Sr. Applications Engineer, Actel Corporation
Sunnyvale, California
I
4-1
Analyze FPLD Architectures,
Performance and Development
Tools to Optimize DesignDependent Selection
by Dennis McCarty, Sr. Applications Engineer
Actel Corporation, Sunnyvale, California
O
ver the past decade, fieldprogrammable logic devices
(FPLDs) have undergone dramatic
changes. The devices of the late
1970s and early 1980s primarily consisted of fuse-programmable PALs,
pioneered by Monolithic Memories
Inc. (now part of Advanced Micro
Devices), and a handful of alternative
versions from competitive manufacturers. The devices were used by
circuit designers to integrate discrete
TTL logic into higher density packages. The designer's goal was to take
four or five discrete logic chips and
replace them with one. And while
the existing device architectures were
not that sophisticated, that modest
goal was easily accomplished using
fairly inexpensive devices and programming hardware.
Consider, for example, the device
choices available to the designers of
ten years ago. Clearly, bipolar was
the most prevalent technology in use
with devices being "hard" programmed by blowing some sort of a
fuse - PALs, FPLAs and alternatives were good examples of that
genre. The differences between competing devices related to the number
ofI/O, the number of product terms
and packaging options.
Today, FPLD density levels have
increased as have the number and
types of devices that compete in the
logic-integration arena. Indeed, it's
common to find a dozen FPLDs on
the market today that can easily
4-2
accommodate the functionality of
20 random TTL logic devices. And
CMOS has replaced bipolar as the
dominant technology of choice for
FPLDs.
Perhaps the biggest change has
taken place in the area of device
programming and the enabling technology. While conventional bipolar
fuse technology continues in use
today, EPROM, EEPROM, static
RAM (SRAM) and antifuse programming technologies have quickly
been adopted by the majority of new
FPLD companies entering the industry. And device programmers have
been replaced with sophisticated
workstation-based development environments that combine powerful
new hardware and software that
speeds the design time for complex
circuit designs.
But as the complexity of the new
breed of FPLDs has increased for
designers, so has the difficulty in
determining the suitability of the
various devices for particular applications. Moreover, with new manufacturers emerging each year with
devices of different architectures and
design methodologies, the 1990s system designer's selection task is likely
to become more difficult as the
decade progresses. Even device names
have gotten more sophisticated and,
as a result, more confusing. Today,
the terms PLD, EPLD, FPGA,
PGA, LeA, PMD, PAL, PLA,
EEPLD as well as others are all
generally being used to refer to
devices that fall into the FPLD
category.
Interestingly, once armed with the
proper set of criteria, system designers can accurately and safely select
preferential devices. The guidelines
that follow take aim at achieving
that goal for designers. Specifically,
the criteria relates to comparing
device technologies, understanding
density issues and evaluating development tools.
Comparing Device
Technologies is Key
Because memory-based technologies are now in use, the question of
device volatility becomes an issue.
Specifically, the volatility of the programming element used to interconnect the logic on device determines
whether programming is permanent
or must be performed each time
power is applied. One-time programmabIe devices use fuses or antifuses.
Reprogrammable devices are either
electrically reconfigurable (EEPRO M
or SRAM technology), or optically
reconfigurable (EPROM technology).
Only the SRAM technology is
volatile in that, if power is interrupted, the device must be reloaded.
SRAM or EEPROM devices must
be used if the design requires that the
device be dynamically reconfigured
in operation.
If reconfiguration is not required,
then the advantages each technology
offers need to be considered. Some
designers are attracted to reprogrammability during the development
cycle because it allows them to try
design iterations without wasting
components. That is a consideration,
but it must also be noted that there is
a design time and production cost
penalty for the software and hardware support required to boot the
device. As always, when evaluating
systems, all the relevant costs should
be considered. For example, losing
parts to design iterations is not nearly
as costly as placement and routing
(P&R) that require manual intervention.
Article Reprint 1
In certain applications, such as
military, nonvolatility is preferred.
In other applications, such as
aerospace/ satellite environments,
radiation-tolerant nonvolatile technology may be the only option due
to the threat of a loss of program
information from radiation. The
vendors of devices being considered
for use in a hostile environment
should be asked what conditions
could adversely affect their products.
Beyond device volatility, performance becomes another element in the
selection criteria. But how is it possible to determine performance and
functionality without testing or, at
least, simulating the device? Actually,
the performance a design achieves
depends on how it is implemented
within a device. A qualified estimate
of how the design might be implemented usually can be made once
the architecture of the device is
understood. But while some manufacturers claim that their device performance is entirely predictable, estimates are valid only insofar as the
P&R performs as expected. It may
be easy to anticipate how the FPLD
software will implement a state
machine or counter. It is quite another
thing to guess how it will accommodate an adder or other large
combinatorial function.
The maximum clock frequency is
often determined by the number of
logic levels required between flipflops, the pad-to-flip-flop time, or
the flip-flop-to-pad time. The last
two times may be found in the device
data sheet. Estimating the number
of levels of logic in the longest path
in a design requires more thought.
The number of terms and/ or variables driving the input to a flip-flop
is one consideration.
Devices such as PALs, that have
AND-OR array feeding flip-flops,
can implement terms containing large
numbers of variables as easily as
terms containing only one. Other
devices have a set number of variables that its logic elements can
accept. Terms that exceed that number require more than one element,
causing elements to be cascaded to
implement the logic. Cascading logic
increases the delay path. Moreover,
some AND-O R array devices have a
fixed number of terms in their arrays.
If this number is exceeded, the extra
terms must be implemented elsewhere and brought to the array
plane. Bringing in terms from outside the array causes additional delay.
Routing delays are another consideration. Devices that fall into the
FPGA (field-programmable gate
array) camp have delays that increase
with fanout. The amount of additional delay due to fanout is published by the manufacturers in tabular or equation format. Other, more
fixed architectures like PALs, don't
exhibit fanout-dependent delays, but
have the problem of fixed routing
paths whose delays cannot be improved.
In either case, however, post P&R
timing analysis is required to determine the anticipated delays. For
example, a path for one application
might pass through blocks that were
not needed for a different circuit,
thus adding path delays. Thus, designers need to carefully anticipate
path differences and resultant delays
for each application.
The Path to a Masked
Device
While FPLD devices have proven
themselves as cost-effective production vehicles in all but extremely
high volume applications (greater
than 10,000 units per month), many
designers are taking advantage of
the FPLD's fast device turnaround
times and are employing them as
prototyping vehicles as well. Indeed,
it is often easier and faster to develop
an FPGA for a prototyping environment than it is to implement the
equivalent system design using discrete logic devices on a printed circuit
board. As such, many designers today
use programmable devices for design
prototyping before migrating the
design to a masked array or standard
cell for production. Use of an FPLD
as a temporary solution is an attrac-
2
tive alternative to the time and expense of developing a masked device.
The programmable device may even
be used to sustain production until
demand for the product justifies the
masked solution. This strategy can
effectively shorten product introduction time by months.
The ease with which the design
may be migrated from one technology to another determines the
risk, cost, and time involved in making the transfer. In general, the closer
the architecture of the device to that
of a masked device, the easier the
migration path. Some gate array
vendors have software that converts
netlists of programmed devices to
masks. The impact that the choice of
a programmable device would have
on a conversion should be considered
as early in the life cycle of the
product as possible.
The Gate- Density
Confusion
Perhaps no other criterion has
created as much controversy as has
the issue of device density of FPLDs.
Considering the difficulty of comparing different technologies, the issue
is understandable. Indeed, accurate
assessments of gate densities is crucial
in not only design but also in comparative cost analysis.
Because FPLD device manufacturers often use conventional masked
gate arrays and related density measurements as points of comparison,
any similarities that they can make
are often drawn, and the resulting
marketing hype adds to the confusion. Specifically, many FPLD manufacturers equate the density of their
devices in terms of "number of gates."
Unfortunately, manufacturers all
don't count gates the same way.
Other FPLD manufacturers don't
refer to gates at all. As a result,
FPLD device comparisons using
density measurements are error
prone.
The problem doesn't exist when
comparing conventional gate arrays.
Masked gate array vendors specify
their devices as having a certain
4
number of two-input NAND gates
and the percentage that can be actually utilized in a design. With masked
arrays, higher-input gates and other
logic functions are made from specified numbers of these two-input
gates. Such comparisons cannot be
made with FPLDs because each
manufacturer uses a unique architecture and functional logic module,
or block within the device to implement various circuit functions. Indeed, it is very difficult for designers
who are used to designing with
conventional logic, and even those
who have had experience with masked
gate arrays, to understand what they
can do with a more complex type of
logical element or module. Clearly,
designers need a more accurate way
to gauge FPLD device densities.
Only Two Basic Device
Types
The first step toward achieving
that goal is to understand that
despite what vendors claim, there
are only two basic types that fall into
the FPLO category - programmable
logic devices (PLOs) and fieldprogrammable gate arrays (FPGAs).
Understanding which devices are
one or the other is critical to proper
device selection. For example, PLDs
have a fixed interconnect architecture, comprise AND-OR array planes
that feed flip-flops and usually exhibit static power dissipation. Devices in this category include those
with labels such as PALs, FPLAs,
EPLDs and EEPLDs.
The distinguishing characteristics
among the different vendors, aside
from programming specifics (fuse,
EPROM, etc), is the number of flipflops that the device contains. An
FPGA, on the other hand, has a
flexible interconnect technology with
no fixed AND-OR plane, but does
incorporate an array oflogical building blocks that are often structured
loosely around a masked gate array
architecture. PGAs and LCAs are
names some manufacturers use to
describe their specific type of FPG As.
It's important for designers to
keep in mind that there are areas of
similarities between FPGAs and
PLOs that often add to the confusion. For example, both categories
of devices incorporate combinatorial
logic that feed other logical devices
(flip-flops, gates, etc). The combinatoriallogic that exists with FPGAs,
however, is more flexible and more
easily configurable than that within
the typical PLO. In addition, the
FPGA's combinatorial logic is in
more of an array configuration, so it
is interconnected differently.
The Elusive "Equivalent
Gate"
Once the PLDI FPGA distinction
is made, the second step in gauging
FPLO device densities involves understanding device architectures to
determine capacities rather than relying on manufacturers' stated density
claims. To be specific, the question
of gate densities or capacities with
either category can be reduced to
analyzing the amount of logic that
can be realized from the logic building blocks in the device, and the
amount of blocks on the device that
actually can be used in a design.
To facilitate the analysis, the concepts of Gate Array Equivalent Gates
and PLDI LCA Equivalent Gates
needs to be defined. A gate array
equivalent gate is the two-input
N ANO gate mentioned earlier and
can be considered as a real or usable
gate.
A PLD Equivalent Gate corresponds to the number of gates that
are assigned to the device by the
manufacturer. The number may reflect the total number of gates on the
device, or it may be based on some
assumptions about the effective use
of gates. Each building block on an
FPGA contains an inherent number
of gates. The way the building block
is used in a design, as well as the
architecture of the device itself, determines how many of the gates can
actually be used to implement logic.
The distinction doesn't arise with
masked gate arrays because the lowest level of logic is the gate itself.
3
When the level of granularity of the
logic element is higher, as it is in all
PLDs and FPGAs, the question of
how many of the gates on the device
can actually be used is important.
One Gate, Two Gates,
Three Gates, ...
With this understanding in mind
and armed with information derived
from material supplied by manufacturers, consider the counting methods
and derived device densities of two
representative FPGA vendors - Actel
and Xilinx - and PLD supplier, Altera.
Actel determines the gate capacities
of its FPGAs by comparing actual
designs and library macros with the
same logic implemented on a masked
gate array. Actel's ACTM 1 architecture is similar to a channeled gate
array in nature. Because the gate
counts for masked gate array designs
are easily determinable, Actel uses
the masked arrays as a convenient
standard to characterize its devices
in gate-array equivalent gates. The
Actel architecture consists of an
array oflogic modules with the rows
and columns separated by routing
tracks. While it is more complex
than a gate, the Actellogic module is
much simpler than the basic element
of competing architectures. Based
on comparisons to masked gate
arrays and analysis of actual designs
over the past two years, in practice
Actel's module is equivalent to 3.2
gates.
The FPGAs from Xilinx, meanwhile, are based on a Configurable
Logic Block (eLB) architecture and
the vendor counts gates using another
technique. Each CLB contains two
dedicated flip-flops and a combinatorial function generator. The generator is a static RAM that is loaded
during the configuration process.
Five inputs are used to address the
SRAM so that it may implement
any five-variable function or two
four-variable functions with some
restrictions.
Xilinx 110 Blocks (lOBs) contain
an input latch, an output flip-flop
and a three-state output buffer. When
Article Reprint 1
Xilinx calculates the densities of its
devices, the company counts all the
gates in all the lOBs on a device as
usable gates. The total number of
gates in an lOB is said to be 13.
Xilinx determines the density of
their devices by multiplying the
number of CLBs and lOBs in the
device by the number of gates they
are claimed to represent. The function generator on the CLB is evaluated as two four-input generators.
Each is specified as having a maximum gate count of between 20 to 22
gates and a minimum of one. There
would thus be between 2 and 44
gates per CLB function generator.
The vendor then assigns an "average
number of gates" to the function
generator based on an assumed 50
percent utilization efficiency.
The CLB is rated as having a total
of 42 gates, 30 gates from the function
generator and 12 gates from the two
flip-flops. These gates account for
nearly 88% of the total number of
gates on the XC3000 series devices.
After multiplying all possible gates
in the CLBs and lOBs on the device
the products are summed to give the
total number of gates. A fraction of
the total (between about 50% and
65%) is given as the "usable gate
count" by the manufacturer. This
arbitrary approach is similar to that
applied to PLD architectures.
Altera's MAX architecture represents perhaps the most architecturally sophisticated PLD and the subject
of this comparison. The largest device in that family, the EPM5128,
incorporates 128 flip-flops (the typical
PAL has only eight) and a bus
structure that differs from conventional PLDs. The firm's PIA (programmable interconnect array) bus
allows the company to interconnect
more cells together than conventional PLDs, but the architecture of the
basic unit that designers must deal
with has not changed. It is the ANDOR macrocell. The architecture of
the device consists of Logical Array
Blocks (LABs) that can be interconnected via the PIA bus. The bus acts
to connect LAB outputs to other
LAB inputs.
Each LAB contains 16 macrocells
and an expander. The macrocell
contains an AND-OR structure feeding into an exclusive OR gate which,
in turn, drives a flip-flop. There are
also inverter/ buffers in the macrocell
to allow for true or complementary
logic. The expander produces additional terms which may be fed to
any of the macrocells in its LAB.
Altera doesn't specify gate counts
for its device but rather uses macrocells to characterize capacity.
Trust DeSigns, Not Vendor
Claims
Although the accuracy of the gate
densities derived from the counting
schemes itemized above will not be
disputed, designers are well advised
to disregard manufacturers' claims.
Instead, by employing the analysis
techniques and equations that follow,
system designers will be able to
accurately gauge competitive device
densities by understanding the various
architectural distinctions and by
analyzing their own designs. Because
the ultimate goal is to measure the
number of gates a designer can
expect to be able to use on an
FPLD / LCA device rather than basing this number on assumptions or
small-scale macro implementations,
prior to the evaluation that follows,
it is interesting to review the results
of actual designs that were analyzed
relative to Actel, Xilinx and Altera
devices as to gate usage. The results
will serve as a check on the gatecounting equations being presented
later in the discussion. The results of
these tests are open to scrutiny by
any who request it.
For Actel, nine customer designs
were randomly selected and a group
of TTL macros from the macro
library for analysis. The designs included a DRAM controller, 32-bit
accumulator, graphics controller, data
transfer controller, and a SCSI interface controller. The TTL macros
included counters, multiplexors, and
decoders. Table I lists the macros
analyzed.
4
TABLE OF TTL MACROS
ANALYZED IN ACTEL AND
XILINX LIBRARIES
3-to-8 Decoder
8-to-1 MUX
4-to-1 MUX
74161
74164
74194
74377
74139
Table 1.
Four customer designs were similarly randomly selected for data for
Xilinx along with an AMD (a Xilinx
second source) published application
article and the same TTL macros
selected from the Actellibrary. The
Xilinx designs included a keyboard
interface controller, an RLL2,7 coder,
and a video controller.
Counting gates for Altera devices
was a different situation. When thc
Altera MAX+PLUS development
system compiles a design it uses the
resources on the device to implement
it, but the details of the implementation are hidden. When a macrocell is
reported by the development system
as having been used there is no way
to know how much of its logic was
used. It could range from a few gates
to all the logic and the flip-flop in the
macrocell. What is required to accurately measure the capacity of the
device would be designs whose
schematic gate counts were known
and, more importantly, that are
known to have utilized virtually all
the resources on the chip.
Fortunately, four such examples
exist. These are some capacity benchmark studies performed by Altera.
The four designs were examples of
common logic circuits that were
chosen so as not to favor any type of
architecture over another. The designs include a Data Path, Timer
Counter, State Machine and Arithmetic functions. The benchmark
measures how many of each type of
design could be accommodated by
the device being tested. Because the
4
number of gates in each of the
designs is known, as well as the
number of the designs that could be
placed on a fully-loaded device, the
capacity of the device can be found
by simply mUltiplying the gates per
design by the number of designs.
Test Results Analyzed
The designs were analyzed by
counting the number of gates and
building blocks in them. The gates
were next categorized according to
whether they were combinatorial
gates, flip-flop (i.e. flip-flop or sequentiallatch) gates, or combinatorial gates associated with a flip-flop.
The last category refers to combinatorial gates that immediately precede
the D input of a flip-flop in a design.
The classification is significant because it enabled the equations that
follow to be developed that allow
designers to estimate the number of
gates that could be accommodated
on a device. The estimate would be
based on the number of flip-flops in
the design as a percentage of the
total. The designs were used to
verify the formula as a predictive
tool.
For consistency, individual logical
elements were counted (source: LSI
Logic HCMOS Design Manual, 1986)
as follows:
Element
Inverter
Buffer
Gate, two-input*
Gate, three-input
Gate, four-input
Gate, five-input
Gate, exclusive-Or
MUX,2:1
MUX,4:1
Latch
Flip-flop, D
Flip-flop, JK
Adder, full
2.5
3.0
3.0
6.0
4.0
6.0
ACTEL
GRAPH 1. LINEAR MODEL
R c .17
10.0
10.0
~:~ES
(* Note: Bubbled inputs on gates
were counted as .5 gates unless all
inputs were bubbled which counted
for none.)
6~···-;.······_·_·a- .•
3 ......................
2
oL-__- L_ _ _ _L -_ _- L_ _
o
.2
.4
.6
~
FRACTION OF GATES USED AS FLIP·FLOPS
FFs
FF Gates
Combinatorial
Gates
Design I
Design 2
Design 3
Design 4
Design 5
Design 6
Design 7
Design 8
Design 9
Design 10
24
143
144
672
835
962
252
132
714
1103
740
356
116
846
461
572.5
628.5
l3l
622
486.5
833.5
412
17l
.5
1.0
1.5
2.0
MOOULE
Design
162
42
22
127
178
127
84
Gates
.5
Table 2 shows the gate counts for
the various Actel designs. The designs are ordered from the lowest
number of gates per module to the
highest as calculated by simply dividing the total number of gates in the
design by the total number of modules. As may be seen in Graph I,
there is variation in the utilization in
the modules with the average being
about 3.2 gates per module.
Table 3 shows the gate counts for
the various Xilinx designs. Due to
the large number of flip-flops in the
Xilinx architecture (there are two
per LCA which is why the fraction
Modules
Used
Gates
per Module
99
512
451
436
220
98
477
549
469
172
2.63
2.96
2.87
3.52
4.00
2.68
2.80
2.90
3.36
4.47
3.22
Total Gates
CLB
Used
Gates
per CLB
414.50
267.00
820.00
341.00
913.00
1535.50
Average
58.50
34.00
104.00
44.00
97.00
136.00
.76
7.09
7.85
7.88
7.75
9.41
11.29
8.55
Total Gates
260
1518
1296
1534.5
880.5
263
1336
1589.5
1573.5
768
Average
Table 2. Actel Design Gate Counts
Design
FFs
FF Gates
Combinatorial
Gates
Design I
Design 2
Design 3
Design 4
Design 5
Design 6
38.00
24.00
74.00
32.00
99.00
167.00
228.00
144.00
448.00
192.00
594.00
1136.00
186.50
123.00
372.00
149.00
319.00
399.50
Table 3. Xilinx Design Gate Counts
5
.8
Article Reprint 1
of flip-flops to CLBs ranges from
zero to two) it was expected that
designs would achieve densities in
proportion to the number of flipflops used. As may be seen in Graph
2, there is a strong (98%) correlation
between the fraction of the total
gates in the design that are used to
implement flip-flops and the number
of gates realized per CLB. The average number of gates per CLB for the
designs analyzed was 8.5. Table 4
shows the number of gates for each
Xilinx device at 8.5 gates per CLB. It
also includes CLB utilizations that
are typical according to the Xilinx
design guide.
guide, the analysis techniques and
formulas that follow can be used by
designers to determine how their
application-specific design will fit on
the various devices available, regardless of density claims.
PLDs will be considered first, to
see how much logic they can accommodate and what types of designs fit
best on that architecture. In general,
the typical PLD architecture contains
a huge number of combinatorial
functions and only relatively few flipflops. The limiting factor with this
technology is the number of flipflops.
Designers must first analyze their
own designs to determine the ratio
of combinatorial two-input gates to
flip-flop gates, keeping in mind that
a flip-flop function represents six
two-input gates. For simplicity, treat
a three-input gate as I Y2 two-input
gates, and a four-input gate as two
gates. Next, the following formula
can be used to determine if the
intended PLD will accommodate
the design. The equation accurately
determines the density of any PLD
device and is more precise than gate
count claims because it accounts for
flip-flop limitations of PLDs.
Table 5 shows the gate counts for
the Altera-generated benchmarks.
The average was 1559 gates. Graph 3
plots the total number of gates in the
benchmark against the fraction of
gates used as flip-flops. Another
data point provided by Altera comes
from the data sheet for the EPM5128,
where it claims that an equivalent to
a 74161 counter occupies 3% of the
device. The counter requires 54 gates
to implement, which would put the
capacity of the device at 1800 gates.
GRAPH 3. ALTERA
2000
1600
TOTAL
GATES
r------------,
R ~ .46
__________ 0- _________________________ ,,___ ;;_
1200
800
GRAPH 2. XILINX
400
12 . . . - - - - - - - - - - - . . . , . . ,
R
~
.98
o~--~--~--~
o
.2
.4
.6
FRACTION OF GATES USED AS FLIP-FLOPS
GATES
PER CLB
Architectures Determine
Actual Device Density
Because there are almost as many
ways to count gates as there are
manufacturers, had additional devices
been analyzed, alternate gate counting methods would have been described. With these test results as a
6
OL-~~
o
__
.2
~_~_~
.4
.6
.8
FRACTION OF GATES USED AS FLIP-FLOPS
Device
CLBs
Typical
Gates
per CLB
XC3020
XC3030
XC3042
XC3064
XC3090
64
100
144
224
320
8.5
8.5
8.5
8.5
8.5
Total
Gates
544
850
1224
1904
2720
Device
Usable
Utilization Gates
0.85
0.8
0.75
0_7
0.65
462
680
918
1333
1768
lOB
Gates
per lOB
lOB
Gates
Total
Gates
Advertised
Gates
64
80
96
120
144
3.25
3.25
3.25
3.25
3.25
208
260
312
390
468
670
940
1230
1723
2236
2000
3000
4200
6400
9000
Table 4. Typical Capacity of Xilinx Devices
Design
FF
FF Gates
Combinatorial
Gates
Total Gates
Fraction
FF Gates
Designs per
Device
Gates per
Device
Arithmetic
Timer Counter
State Machine
Data Path
8
24
12
16
48
144
72
96
245
143
67
73
293
287
139
169
0.16
0.50
0.52
0.57
5
6
1465
1722
1529
1521
1559
Table 5. Altera Benchmark Gate Counts for EPM5128
6
II
9
Average
•
Gt =«6 gates x #FF)/PercentFF»*
(Df), where:
Gt = upper gate limit on the PLD
#FF = the total number of flipflops in the PLD
PercentFF = the fraction of gates
that reside in flip-flops of the
actual circuit
Df = a derating factor, or maximum
device utilization factor available
on manufacturers' data sheets.
For PLDs, Df is equal to approximately 0.9.
The term, PercentFF, is a useful
way of distinguishing among designs
and is relatively easy to estimate
based on the number of flip-flops in
a design and the approximate total
gates. Independent research conducted by Stanford University on scores
of logic designs indicates that approximately 45% of all gates used in
logic designs are used to implement
flip-flops and the remainder is used
for combinatorial logic. According
to the LSI Logic handbook, six twoinput gates are required to implement a flip-flop. Thus, designers can
use 45% as a representative number
if knowledge about their own designs
is absent. Thus, using the equation
and this 45% figure, the upper bound
on the capacity of a PLD containing
128 flip-flops may be found to be:
Gt = «6xI28)/(.45»(0.9)
1700(0.9) = 1530 gates.
Thus, a PLD design that uses all
the flip-flops available could expect
to achieve densities of no greater
than 1700 gates (Df= 1.0), regardless
of what the manufacturer states when
PercentFF is 0.45. For values of
PercentFF between 0.35 and 0.55,
Gt ranges from 1975 to 1250. This
number is about a third the density
claimed for such devices by some
PLD vendors and consistent with
the actual test results detailed in
Table 5. While it might be possible to
find a design that uses the fixed PLD
architecture more effectively by creating flip-flops from combinatorial
logic, such circuits are generally much
slower than built-in flip-flops and
are not generally usable.
The FPG A architectures can be
evaluated using similar equations to
determine capacities. As indicated,
the two types of FPGA architecture
available are the LCA and ACT 1
family of channeled gate arrays.
LCAs that employ dedicated flipflops have a better balance between
combinatorial logic and flip-flops
than PLDs, but for most designs,
there are more flip-flops than necessary and combinatorial logic resources, thus, become the limiting factor
in logic density instead of flip-flops
as with PLDs.
The designer can determine how
well his circuit will fit on an LCA
device by examining his schematics
to see how many inputs and how
many gates were used on the logic
feeding a flip-flop. For example, if
there is a four-input block, how
many gates are typically feeding that
block : 3, 4, 5? In the selection of
vendor-generated TTL macros analyzed in the previous section, there is
an average of 3.6 combinatorial gates
per CLB. If four or fewer inputs
were used most of the time, then the
average number of gates prior to flipflops can be used in the following
formula to find the upper.bound for
the gate utilization on an LCA (if
you need more than four, say, five,
you get fewer gates and you need
two blocks):
Gt = «#ff x #CLB x
#pre-FFgates)/ (1-PercentFF»
(Of), where:
Gt = upper gate limit on the LCA
#ff= number of flip-flops per CLB
#CLB = number oflogic blocks in
the device
#pre-FFgates = number of twoinput gates before the flip-flop
(e.g. 3.6)
(1- PercentFF) = fraction of gates
in the user's combinatorial logic
Df = a derating factor, or maximum
device utilization factor available
on manufacturers' data sheets.
For LCAs, Dfranges between .55
to .85.
7
4-8
(Note: The formula ignores gates in
110 blocks. Flip-flops in II 0 generally improve performance, and effectively increase routing resources
rather than gate count because there
are no combinatorial functions in
the 110 for use with those flip-flops.
For designs with PercentFF in the
range 0.35 to 0.55, LCAs are usually
combinatorial gate limited.)
Using this formula on an LCA
with 144 CLBs and assuming an
average of three pre-FFgates, 45%
of the gates reside in the flip-flops,
and a utilization of 75%, a sample
calculation would be as follows:
Gt = «2xI44x3)/.55).75 = 1180
gates.
Thus, an LCA with 144 CLBs and a
claimed gate density of 4200 gates
has a maximum 1180 usable gates,
and only 1570 if fully utilized (Of =
1.0, under the above conditions).
The results of this equation show
results somewhat higher than found
in the actual design of Table 4.
Similar capacities can be calculated with ACT 1 devices. The ACT!
architecture uses a large number of
relatively small building blocks and
has no dedicated flip-flops. Therefore, neither the PLD nor LCA
formulas detailed above can be used.
The ACT 1 architecture isn't constrained either by flip-flops or combinatorial logic. However, certain
assumptions can be made and a
formula derived that will also give
designers consistently accurate capacity calculations. As in the previous cases, assume 45% of all gates
reside in device flip-flops. Again,
other values may be used based on
actual design characteristics.
Consider, for example, the ACT 1020,
a 2000-gate FPGA. This device incorporates a total of 546 logic modules and two are used to implement
one flip-flop. Most gates, multiplexors and latches use one module.
With these architectural considerations in mind, the following formulas
can be used to give designers a fairly
accurate capacity reading on an
ACT 1 device:
Article Reprint 1
#LM =#MF + #MC
Eq. I
Two logic modules can be used to
implement a flip-flop and some
combinatorial logic. The sequential
modules in the designs in the previous section attained 3.4 gates/
module.
Therefore, the 2000-gate ACT 1
device with 546 modules:
#MFx3.4/ (#MFx3.4 + #M Cx3.5)
=.45
Eq.2
Gt = (#MFx3.4 + #MCx3.5)(Df)
Eq.3
where:
#LM = total number of logic
modules on an Actel device
#MF = number of modules used
for flip-flops
#MC = number of modules used
for combinatorial functions
Gt = upper gate limit on the LCA
Df= the module utilization factor.
Solving Eq. 2 yields:
#MF = .84#MC.
Substituting this result into Eq.
gives:
546 = 1.84#MC
#MC = 296 and #MF = 546-296 =
250.
Finally:
Gt = (296x3.5 + 250x3.4)(.9)
(1886)(.9) = 1697 gates.
Equation 1 states that the sum of
the combinatorial and sequential
modules equals the total number of
modules.
Equation 2 states that the assumption that 45% of the gates in an
average design are used as flip-flops.
Equation 3 computes the expected
number of gates used in an ACT 1
device.
The logic module can implement
one to five combinatorial gates with
3.5 being typical. Of is listed by
Actel as ranging from .85 to .95, so
.90 will be used for these calculations.
It should be noted that the results
of this equation are also consistent
with the tested gate counts determined for Actel devices in Table 2
when the 3.4 gates/ module utilization (approx.) factor is used.
Using these formulas for PLDs,
the results indicated in Table 6 were
calculated for a representative number of devices from eight different
manufacturers.
Evaluating Development
Tools
While the equations enable designers to evaluate architectures, a
hands-on approach is often preferred
by engineers. And although device
manufacturers supply development
systems that allow designers to
analyze the design and program the
devices here too, as with technology
and density comparisons, care must
be taken to achieve accurate results.
F or example, when doing a competitive analysis of development systems,
beware of trying to save time by
doing an identical small design on
each. A small design is one that uses
only a small fraction of the chip. To
see what the system and the device
can really do, it is important to test
them to their limits. A simple
example may make device development and programming tools appear
to be stronger than they actually are.
Most systems vary a great deal in
cost and capability. Without becoming familiar with a system it is
difficult to understand its capabilities. Some of the most important
features to examine are the place
and route software, timing analysis
tools and platform support.
The P&R software is the heart of
Device
Dedicated
Flip-Flop2
Fraction
Utilized 1
No Gates 4
No 1/0
Reprogrammable
Actel
1010
1020
0
0
.90
.90
916
1697
57
69
N
N
Altera
5064
5128
64
128
.90
.90
770
1530
36
60
y
AMD
16R8
22VIO
8
10
1.0
1.0
105
135
16
22
N
N
Vendor
Y
Atmel
2500
48
1.0
640
38
Y
leT
7024
7040
40
48
1.0
1.0
530
640
22
30
Y
Plus Logic
2020
144
.9
1255
72
y3
Signetics
2552
52
1.0
575
53
y3
630
930
1180
1860
2270
64
80
96
120
144
y
Xilinx
3020
3030
3042
3064
3090
128
200
288
448
640
.9
.85
.80
.70
.65
I
y
Y
Y
Y
Y
Table 6. Usable Gates
8
4-9
the development system. The software assigns the logical resources of
the device to implement logic functions described by the design netlist.
It uses the routing resources (i.e.
tracks and interconnect elements) to
connect the logic functions together.
The extent to which the software
succeeds will determine the logic
capacity of the device and, to a large
extent, its performance. All FPGAj
PLD vendors include P&R software
wi th their systems, but the software's
ability to implement designs is constrained by the power of its algorithms, the architecture of the device,
and the amount of interconnect
resources available.
Place and route on a PLD is
simpler and faster than on an FPG A
due to the less flexible architecture
of PLD devices. PLDs usually place
and route successfully unless the
design exceeds the resources of the
device.
The FPG A architecture allows
for great P&R flexibility, but requires
more of the software because there is
more to do. The router may fail to
complete successfully requiring that
design be changed or, if allowed,
routed by hand. The designer has
much more control over FPGA implementations than PLDs by being
able to assign placement directly or
attaching criticality to nets in the
design.
FPGAs require that less of the
device area be devoted to wiring
resources than PLDs. As devices
scale to higher levels of integration,
PLDs will require a larger proportion of the device for wiring than
FPGAs. For this reason, FPGAs
will be better able to sustain larger
device densities.
The speed with which the router
operates is also important. If each
design iteration requires many hours
to run the P&R tool, then it significantly impacts the design cycle
time. PLD routing times are short as
there is relatively little for the software to do. Routing times for
FPGAs may range from minutes to
hours depending on the routing resources available, the sophistication
of the software and the complexity
of the design. The problem is exacerbated if the software runs for a long
time and still fails to route some
nets. At that point the designer may
be forced to P&R the entire design
manually.
Manual P&R is tedious and time
consuming. When manual intervention is required, every design iteration
requires the same amount of time as
the first and the development time is
multiplied by the number of iterations. While manual P&R allows the
designer complete control over the
design implementation, it requires
him to have detailed knowledge of
the physical architecture of the device.
Nevertheless, FPGA architectures
and tools are available that permit
fully automatic placement and routing in under 30 minutes.
Designers have little control over
the delays from the P&R in PLDs.
Moreover, the fixed path delays limit
any opportunities for timing optimization. PLD delays do not vary with
load, but are determined by the
individual delays of the device elements that make up the path. There's
more variation in FPGA delays than
in PLDs and the delays are load
dependent. Architectural flexibility
allows for delay optimization which
may be done manually or by using
software tools.
The timing analysis tools available
with the various systems vary considerably in both the quality and
quantity of the information they
provide about the system. The timing
analysis tools that come with the
development system should allow
the designer to examine timing
details of the design. If there is a
timing problem, the timing analyzer
should be specific enough to allow
the problem to be pinpointed so the
design or the P&R may be modified.
The more commands available and
the more powerful the commands
are the faster the designer can analyze
the design.
If the vendor-specific software
supports other CAE systems, there
should be a way to back-annotate
the delay information to the CAE
system simulator. Back-annotation
allows the design timing information
to be seen graphically and under
simulated operating conditions.
Finally, designers need to consider
platform support. Designers who
already have invested in hardware for
a CAE system may also want to use it
to develop programmable devices.
Designers need to make sure the
vendor's interface package for his
workstation is complete. For example, does it have back-annotation
available for effective use of the
simulator.
The End of the Line
To be sure, PLDs have changed
dramatically. Fortunately, with a
little common sense and the aid of
the guidelines and formulas detailed
here, designers can wade through
the hype and mounds of vendor
literature to determine the best FPLD
for the application. It's important to
keep in mind that all devices are
different and can suit a variety of
applications. Successfully choosing
one category of device for Design A
doesn't mean it will be the right
device for Design B. Every application requires a recalculation and
close scrutiny of the circuit. By
closely examining the device technologies, development tools and,
above all, gate capacities, the optimum device can easily be selected.
Notes:
1. Based on design experience. Smaller devices can usually use all available flip-flops. Larger devices cannot always take advantage of
embedded flip-flops.
2. Actel devices contain no dedicated flip-flops. Any logic module may be used as a latch and any two modules may be used as a flip-flop.
3. These devices contain input flip-flops which cannot be driven by internal combinatorial logic. The input flip-flops are counted only
for the gates they contain.
4. All gate counts assume 45% of users' gates are used for flip-flops.
4-10
Article Reprint 2
A CMOS ELECTRICALLY
CONFIGURABLE GATE ARRAY
by Khaled A. EI-Ayat, Abbas El Gamal, Richard Guo,
John Chang, Ricky K. H. Mak, Frederick Chiu,
Esmat Z. Hamdy, John McCollum, Amr Mohsen
•
Reprinted from
IEEE JOURNAL OF SOLID-STATE CIRCUITS
Vol. 24, No.2, April 1989
IEEE JOURNAL 01' SOLID-STAno CIRCUITS, VOL. 24, No.3, JUNE 1989
752
A CMOS Electrically Configurable
Gate Array
KHALED A. EL-AYAT, ABBAS EL GAMAL, SENIOR MEMBER, IEEE, RICHARD GUO, JOHN CHANG,
RICKY K. H. MAK, FREDERICK CHIU, ESMAT Z. HAMDY, MEMBER, IEEE,
JOHN McCOLLUM, AND AMR MOHSEN, SENIOR MEMBER, IEEE
Abstract -A novel CMOS electrically configurable gate array is described. The chip combines the flexibility, efficiency, extendability, and
perfonnance of mask-programmed gate arrays with the convenience of
user programmability. The implementation is facilitated by a novel twotenninal antifuse programmable element and a new configurable interconnect technology. The chip has been fabricated using 2-I'm n-well CMOS
technology with two-layer metallization.
I.
is facilitated by a new configurable interconnect technology [9] based on a novel one-time, two-terminal, programmable, low-impedance circuit element [10]. The electrical characteristics of this element are more suitable for
on-chip integration than other previously published antifuses (e.g., [11]). The chip has been fabricated using a
2-p.m n-well CMOS technology with two layers of metallization.
INTRODUCTION
M
ASK-programmed gate arrays have gained wide
acceptance and are used in an increasingly larger
number of customer-specific applications. This is primarily
due to a) their architectural flexibility that allows efficient
implementation of a wide range of applications, b) large
gate densities of lOOK gates or more [1], [2] which permits
high levels of integration, and c) high performance. The
architectural flexibility is due to the use of primitive basic
cells and a routing capability that allows the construction
of a large set of macrocell building blocks and general
wiring interconnect between the macrocells. Gate arrays,
however, suffer from long and costly development cycles
which are barriers towards wider usage in many applications, especially those with low-volume requirements or
restricted to short development cycles.
Programmable logic devices (PLD's) [3]-[6], on the other
hand, offer instant turnaround in the design cycle. However, their architecture and technologies limit their efficient integration of many applications to only hundreds of
gates. Some PLD architectures [7], [8] have integration
levels of several thousand gates, but due to architectural
and interconnect inefficiencies, the number of usable gates
is a small fraction of the total available gates.
This paper describes a CMOS electrically configurable
gate array that combines the architectural flexibility and
efficiency of gate arrays with the convenience of user
programming available from PLD's. The implementation
Manuscript received July 28, 1988; revised November 14, 1988.
K. A. EI-Ayat, A. EI Gamal, R. Guo, J. Chang, E. Z. Hamdy,
J. McCollum, and A. Mohsen are with Actel Corporation, Sunnyvale, CA
94086.
R. K. H. Mak and F. Chiu are with Data General Corporation,
Sunnyvale, CA 94086.
IEEE Log Number 8926932.
II.
CHIP AF.CHITECTURE
A. Overview
The chip has a channeled gate array architecture (Fig. 1)
consisting of configurable logic modules organized in rows
and columns and separated by wiring channels (vertical
~iring channels were omitted from Fig. 1 for clarity).
Unlike gate arrays, however, the channels contain predefined segmented metal tracks of different segment lengths
to accommodate the routing requirements. Antifuse elements are located at the intersection of the horizontal and
vertical wire segments and between adjacent track segments. Circuit connections and module configuration are
established by programming the appropriate antifuse element which then forms a low-impedance connection between the required two metal segments. The concept is
similar to vias between the first and second level of metallization in double-metallization technology. The logic
module is configurable and was carefully selected to implement a large set of combinatorial logic cells as well as
latches and flip-flops for sequential circuit applications.
The chip also has configurable I/O buffers that can be
configured as input, output, or bidirectional I/O. Also
shown in Fig. 1 are the periphery circuits necessary for
addressing, programming, and testing the array.
B. Wiring
Wiring channels in this architecture consist of a set of
segmented tracks of various lengths. A typical channel
track segmentation scheme is shown in Fig. 2, with vertical
tracks mostly omitted for clarity of the diagram. The
lengths of the track segments vary from a minimum of two
0018-9200/89/0600-0752$01.00 ©1989 IEEE
Article Reprint 2
EL-A YAT et af.: CMOS ELECTRICALLY CONflGURABLE GATE ARRAY
VO Buffers
P'og/TsVDiaq
6~ilP6
Fig. 1.
Block diagram of the chip.
Logic Module
Input Segments
Fig. 2.
Wiring channels and track segmentation.
module spans and progressively increase until they span
the entire length of the array. Short segments offer more
routing flexibility because their granularity affords a large
number of their type. Their disadvantage is that longer
nets suffer a degradation in performance due to the addition of interconnect antifuses in the circuit path. Careful
choice of the lengths of the horizontal and vertical wire
segments as well as the number of tracks in each channel
provides routing flexibility comparable to gate arrays [12].
The trade-offs between routing flexibility, performance,
and silicon area was carefully weighed in designing the
channel and its track segmentation. The number of tracks
per channel in this architecture is only slightly higher than
mask-programmed gate arrays with the same gate density.
The similarity of this architecture to gate arrays can be
best understood by drawing a parallel between the programmable antifuse elements and the vias introduced during the fabrication of gate arrays. The small size of the via
allows gate arrays to contain an extremely large number of
potential via sites, of which a very small fraction is needed
to implement any application. Similarly, the size of the
antifuse elements permit packing them as close as the
metal pitch of the employed technology. Thus, a very large
number of these elements can be included in the chip. As
an example, a typical 2000-gate mask gate array has approximately 90 000 potential via sites compared with
200 000 antifuse sites in the chip family described in this
paper. Again, only a few percent of these antifuse elements
753
need to be programmed to implement any application.
This allows gate array-like extension of the chip architecture, by adding more logic modules and incrementally
increasing the number of wiring tracks in each channel.
The interconnect architecture of this chip is further
illustrated by Fig. 3, which shows a small section of the
array. Two partial rows of modules (each row containing
three modules) with vertical and horizontal track segments
are shown. Antifuse elements are located at the intersection of the track segments as well as between adjacent
horizontal and vertical wire segments. To program an
antifuse element, 18 V is applied across its terminals while
all other elements are subjected to half that voltage. Two
examples of circuit path connections are shown in Fig. 3.
Path PI is a short path in which a vertical segment is
connected to a horizontal track by programming antifuse
FI. The path is completed by programming a second
antifuse F2 which connects the horizontal segment to
another vertical segment. The second example path P2
illustrates how arbitrarily longer nets can be constructed in
this architecture whenever the desired single segment length
is not available. Antifuses F3 and F5 are programmed in a
similar fashion to path PI. Antifuse F4, however, is programmed to connect two adjacent horizontal segments to
form a new longer horizontal segment. The isolation nchannel transistor Tl is part of the addressing and decoding circuitry used to address, decode, and program the
required antifuse elements. When programming antifuse
F4, Tl must be turned OFF to supply the full programming
voltage across its terminals.
C. Logic and I/O Module
The interconnect architecture described in this paper
can be used with a variety of logic modules. The choice of
the module involves trade-offs between logic capability,
routability, performance, and silicon areas. The configurable logic module used in this architecture is shown in
Fig. 4. It has eight inputs and one output, and was chosen
for its efficiency in implementing both combinatorial and
sequential circuits and for its optimum utilization of routing resources. The module implements a modified 4: 1
multiplexor function with inputs A-D, select inputs SA,
SB, SO, and SI, and output Y. To implement the required
logic function, the module is configured as the desired
macrocell by simply programming the appropriate antifuses at its input terminals to connect the inputs to the
required nets or to VDD/VSS '
The module is capable of implementing all two- and
three-variable functions and some four-variable functions
such as a 4: 1 multiplexor. It can also be configured to
implement latches and flip-flops. One module is needed to
implement a latch while two modules are needed to implement a flip-flop. No predetermined hardwired latches or
flip-flops are implemented or needed in this gate array.
Latches and flip-flops may be implemented anywhere in
this array to suit the requirements of the application. The
required macrocell connections are simply placed and in-
I
754
IEEE JOURNAL OF SOUD-STAn ClRClIITS. VOl..
24. NO.3. JUNE 1989
HORIZONTAL CONTROL
/
VERTIC~RACK
~,
LOGIC MOD~E
I
~,
!
ED
;==;
L
I
I
n!
-.J
PI
~:
L
;==;
it!
SEGMENTED
HORIZONTAL
TRACKS
wm,';;: -;;;"m'~".
)~ ~~
I
Fig. 3.
CONTIIOl
I
FI iI.
;:1:: ~.-...
f-""
]~ ~~
I
VERTICAL
I
~:
...... :---. PASS
..
-';}t: ~~'M
,m"",.
• ~4 ~4
~,
.-"
TRANS!STOR
:D.
!to
~~ ]f,~
I
I
Interconnect scheme with typical routing examples.
I/O
Fig. 4.
Logic module schematic.
TABLE I
TYPICAL MACROS AND
THEIR EQUIVALENT
MODULE IMPLEMENTATION
Logic Funcrioo
Modules
3·inptllOR
Fig. 5.
Bidirectional programmable I/O module.
4-inputAND
2·inpUI XNOR
Rows of Logic
Dlalch wirhclr
OFF wirh Pre &< Cir
terconnected wherever needed in the array. A sampling of
typical macros is shown in Table I. Also shown are the
required number of modules needed to implement the
macro.
The I/O architecture is also quite flexible as shown in
Fig. 5. Each I/O module may be configured as input,
output, or bidirectional I/O by simply programming the
appropriate antifuse elements. The I/O module is configured as an input buffer or output buffer by simply
programming antifuses F2 or Fl, which connect the
ENABLE input to Vss or VDD , respectively. To configure
Fig. 6.
Low·skew clock distribution network.
it as a bidirectional I/O the ENABLE line is driven from
the appropriate net in the array and the two anti fuses are
not programmed. Also, any I/O module can be driven
from virtually any net in the array. Naturally, appropriate
placements of the I/O relative to the application circuit
would always result in more efficient routing.
Article Reprint 2
EL-AYAT
et al.: CMOS ELECTRICALLY CONfJGURABLE GATE ARRAY
755
Adionprobe " Coklmn Select Register (4<4 bltl)
I
Adionprobe "
Row selact
register (6)
Ad~B
& user 110
Fig. 7.
In-circuit probing scheme.
D. Clock Distribution
Another important feature of this architecture is a highspeed clock distribution network which is routed to every
module in the array (Fig. 6). This provides the user with a
minimal skew distributed clock network that can be used
to map high-speed synchronous applications in this array.
It also relieves the user from the added burden of clock
network design in his gate array application. Since latches
and flip-flops can be built using any module in the array,
building high-speed synchroIi(;us sequential networks is
accomplished by using the clock network and the required
latch, flip-flop, or combinatorial macrocells. The speed of
the clock network is determined by the critical paths in the
application. For example, the frequency of a simple toggle
flop-flop has been measured at more than 70 MHz whereas
in a complex application the clock network will operate at
speeds up to 40 MHz.
and multiplexed I/O ports. The probe mode is used in the
following manner. The chip is first switched into the
required probe mode which selects one or both of the two
probing channels. The desired node address, consisting of
row and column select fields, is then serially shifted into
the chip and loaded into the required column and row
select registers. This column and row select address is used
to uniquely select the desired node in the circuit and feed
its logic value through a multiplexor and onto a designated
I/O port. Any test instrument such as an oscilloscope or
logic analyzer probe may then be used to observe the
required node in real time, while the chip is running in the
actual system environment. The diagnostic capability described is supported by a development system which provides the user with a convenient interface to his circuit and
eases the verification and debug cycle.
III.
IMPLEMENTATION
E. In-Circuit Diagnostic Capability
A. Technology
One of the key features of this chip is a novel probing
scheme that allows 100-percent observability of any node
in the array. This feature allows designers to access and
observe signals at any node in their schematic in real time,
which eases and simplifies the development and debugging
cycle of a user circuit.
The probing scheme which is shown in Fig. 7 consists of
two independent probing channels to allow simultaneous
real-time observance of two mdependent nodes in the
circuit. This is internally implemented by two separate sets
of column and row select registers, multiplexor channels,
The chip is implemented using a 2-JLm n-well CMOS
process with two layers of metallization (Table II) and four
transistor types. Two low-voltage/high-speed 250-A oxide
transistors are used in all performance-sensitive circuits,
such as logic modules and I/O buffers. They are also used
in other low-voltage circuit applications such as address
latches and control logic where their tighter design geometries are better suited. The two high-voltage 4OO-A oxide
transistors are used in all the high-voltage circuit applications such as decoders, level shifters, programming path
switching transistors, as well as isolation transistors to
I
756
24.
IEEE JOURNAL OF SOLID-STATE (') RUJITS. VOl ..
No.3. JUNE
1989
10
TABLE II
TECHNOLOGY OVERVIEW
(rnA)
Transistor Type
Gale Oxide
!.elf
Vollage
NLow Vollage
250 A
l.Iu
5V
PLow Vollage
250 A
1.2u
5V
N High Vollage
400 A
1.5u
lOV
P High Vollage
400 A
1.8u
20V
1.000
----+-I----;I-+-+-.---+---f-+--+--I
. 10001--+-1-+-t-+--+~----t+-t---t
-+-_J....
Id i vl--+--+_+--+----l__
f--~-4--r~-+_-+--+_-~~-
1--
TABLE III
ANTI fUSE ELECTRICAL CHARACTERISTICS
Paramecer
Programming Voltage VPP (vohs)
Programming Time (ms)
. 00ggo~0:;----''---'-----'--..l.---L-.l...--L-L.l...--L-2~0 . 00
VO
2.000/c:tiv
(V)
Value
18
(a)
< 10
(rnA)
Programming Current (ma)
<\0
On ResistanCe (ohms)
< lK
Off Resistanee (ohms)
> 100M
1.000
-
/
/
.1000
Idiv _.
isolate high-performance circuits from high-voltage circuits. The chip has 40 000 transistors evenly divided between low-voltage and high-voltage transistors.
The electrical characteristics of the antifuse element
used in this technology are summarized in Table III. The
two-terminal anti fuse (13] is a dielectric between an n +
diffusion and polysilicon and is compatible to CMOS,
BIMOS, and bipolar technologies. The anti fuse normally
exhibits a very high impedance ( > 100 Mn) in the OFF
(unprogrammed) state. When an 18-V programming voltage is applied across its terminals, a bidirectional lowresistance « 1 kn) connection is established. The programming current requirement is less than 10 rnA and the
programming time is less than 10 ms. Since this is an
antifuse technology, only a few percent of the total available antifuse elements need to be programmed and therefore, total application programming time is fairly short.
Fig. 8 further illustrates the characteristics of the antifuse.
Fig. 8(a) shows the I - V characteristic of the antifuse as it
transitions from the unprogrammed state (I = 0) to the
programmed state. In this example the antifuse programmed at 15.5 V. The maximum current through the
antifuse is determined by the antifuse impedance as w"ell as
the impedance of the external circuit. Fig. 8(b) shows the
I -- V characteristics of several programmed antif\lses_ The
slope of the I - V curves is the impedance of the programmed antifuse.
B. Circuit Design
A chip photomicrograph is shown in Fig. 9, and measures 240 X 360 mils in 2-lLm technology. The logic module
rows and the horizontal channels are illustrated. Each
channel contains matrices of anti fuse elements at the intersections of the horizontal and vertical track segments
implemented as two layers of metallization. Surrounding
/
I-
/
~-
/
V
._--
./
I
it
.0000
.0000
1.000
.1000/c:tiv
(V)
(b)
Fig. 8.
Antifuse /- V characteristic (a) before programming, and
(b) after programming.
Fig. 9.
Chip photomicrograph.
the array are all the periphery circuits: address latches,
decoders, high-voltage programming circuits, testability
circuits, and I/O modules necessary to program and test
the array.
The addressing scheme used to select an,d program a
particular antifuse consists of a serial address stream which
is shifted into the chip and latched. A distributed two-level
decode scheme is then used to select the unique track(s)
Article Reprint 2
EL-A YA T el
al. : CMOS ELECTRICALLY CONFIGURABLE
757
GATE ARRA Y
TABLE IV
r
,I!'
Vpp
Column Decoder
CHIP CHARACHRISTICS
:_~1-~f= 1~.tt-
Delays measured at 5 V. 25°C, fan-out = 3
l'
Group Decoder
Iv
.I1
Row Decoder
;j>-~-----
Antifuse /
T
PIckage type (PLCC)
84
SWldby current (ma)
<5
Number of Modules
295
,-Number of progTBmmable elements
Group Decoder :
Horizontal Control
Fig. 10.
240.360
Chip Size (mils)
~VTl
Vertical Control
Value
Parameter
Chip programming path.
that must be addressed to program the required antifuse.
This is best illustrated by the programming path example
shown in Fig. 10. To program the indicated antifuse element, the programming voltage must be transmitted and
applied across its two terminals. This is accomplished by a
two-level decode scheme consisting of a group decoder
followed by a column decoder, which transmits the programming voltage onto vertical track V. The voltage is
then propagated along the track by switching transistor
VTl to apply the high voltage on one terminal of the
antifuse. Similarly, the ground potential is switched by
another group decoder and row decoder and through
switching transistor HTI to the other terminal of the
antifuse. The number of switching transistors that must be
switched in the vertical and horizontal tracks varies with
the location of the anti fuse. All decoders and switching
transistors are driven from the address that was serially
shifted into the chip.
A circuit schematic of the decoder cell is shown in Fig.
11. Each decoder cell consists of a flip-flop, a low-to-high
voltage level shifter, high-voltage decoder transistor, and
miscellaneous control gates. The flip-flop is part of the
address latch used in programming and testing the chip.
The output of the flip-flop OUT is fed to the INI input of
the adjacent decoder cell, thus forming the serial address
shift register. The control gates are used to disable the
decoder, precharge the decoder output, or enable the flipflop output to drive the decoder. The gated signal then
passes through a low-to-high voltage level shifter which
drives the gate of the decoder high-voltage output transistor. The IN2 input of the flip-flop is used in test modes to
sense and latch the result of each test. The decoder cell
and some variations of it are used to implement all group,
column, and row decoders in the chip.
A summary of chip characteristics is shown in Table IV.
The chip has 295 configurable logic modules, 55 con figurable I/O modules, and over 112K anitfuse element sites.
This illustrates the touting flexibility of this chip and
technology. The module delay of 5 nll is for a typical net
configuration implementing a two- to four-variable function.
Another circuit technique used in the design is the
appropriate use of high-voltage and low-voltage transistors. All high-performance circuits such as logic and I/O
module use low-voltage high-performance transistors.
112.000
Clock network skew (ns)
3
Module performance (ns)
5
LalCh setup time (ns)
6
latch hold time (ns)
2
Number of user defined pins
Configurable J)O buffers
5S
TIL. Bidirectional
Input delay (ns)
6
OulpUt delay (ns)
12
OulpUt drive (rna)
4
However, since the terminals of the modules are exposed
to high voltage during programming, high-voltage isolation
transistors are used to isolate the high-performance part of
the circuit from the programming path (Fig. 5). During
programming the isolation transistors are turned OFF. During normal operating mode, the gates of isolation transistors are driven to a high voltage to lower the source/drain
resistance of the transistors and minimize the delay penalty
through the transistor.
IV .
TESTABILITY
Testing a one-time programmable gate array presents
some challenging problems [14]. Fortunately, testing is
facilitated by the normally open antifuse elements. AU
circuit connections are normally open-net connections,
module configuration, I/O module connections-and will
be established as the user-application circuit is programmed into the chip. All active circuits such as modules,
I/O buffers, latches, and decoders must be tested and
iheir functionality and performance guaranteed. In addition, all passive circuit elements such as wiring channels
must be free of defects such as opens and shorts.
Consequently, one of the critical design objectives was
to ensure that the architecture and the design are fully
testable [14]. This is critical in assuring high-quality chips
and high programming yield. Testability circuits and techniques were irtlplemented to permit complete testing of:
•
•
•
•
•
•
every logic and I/O module,
all vertical and horizontal tracks,
addressing and decoding circuits,
all programming and high-voltage circuits,
integrity of all anti fuses in the array, and
ac performance of the chip.
•
758
lHI: JOlJRNAI. Ot SOI.\I)-STATE ('IRCUITS, VOL.
24, NO.3, JUN[ 19K9
INl
LD
OUT
HIGH
VOLTAGE
l~pp)
I~TO
~
OR
~LER
RST.
Fig. 11.
Decoder cell u!.ed in the two-level decode scheme.
the example shown in Fig. 12 sensing and latching is
accomplished by column decode C and group decode D.
Also, each module has a sense circuit to sense its output
voltage during test. Finally, the shift-out and compare
phase consists of simply shifting the test result out on one
of the IjO pins and comparing it with the expected test
results. By using the above technique and the appropriate
decoders for driving and sensing the entire chip can be
tested. For example, the fuse integrity test is done by
simply driving vertical tracks and sensing horizontal tracks.
AC performance of the chip is tested by programming a
typical path in the chip consisting of redundant logic
modules and testing its speed_
The overhead of the testability circuits is minimized by
sharing the periphery circuits and switching transistors
inside the array between programming, testability, and
diagnostic modes. Appropriate test patterns are applied
that thoroughly exercise the chip before configuration. All
aspects of the chip are tested, including programming of
redundant antifuses and integrity of unprogrammed antifuses (antifuse programming is verified by the programming hardware as the application is programmed). This
assures high-quality chips and good programming yield_
Fig. 12.
Typical drive, sense and latch circuit path.
The testability technique consists of three phases: 1)
driving phase, 2) sense and latching phase, and 3) shift-out
and compare phase. Fig. 12 illustrates a typical drive, sense
and latch circuit path used in the various test modes. In
the driving phase, the chip is first driven into the required
testability mode by loading a control byte into the chip. A
unique test pattern is then serially loaded into the address
latches which drive the two-level decode mechanism explained above. The driving circuit in this test example
consists of group decoder A and column decoder B. This
technique permits driving any track with any desired test
voltage to suit the test pattern. Further, any number of
tracks may be simultaneously driven including, if needed,
all tracks in the array. Also shown in this example is a
logic module under test whose inputs are driven by the
vertical tracks excited by decoders A and B above. In the
sense and latching phase, sensing circuits implemented in
the array and in the periphery are used to sense the
required signals and latch them into the same periphery
address latches (data are latched on a different phase of
the clock to prevent contention with the test pattern). In
V.
ApPLICATIONS
Two application examples are included to illustrate the
flexibility and performance of the chip. The first example
(Fig. 13) is a standard 4 x 4 array multiplier. The input
vectors are X(O-3), Y(O-3) and the product term is P(O-7).
The basic macros used to implement the multiplier are the
full-adder macros FA1A and FA2A. Both macros implement the standard full-adder logic functions. The second
macro FA2A implements a full adder with inputs AO and
Al going through a NOR function before the sum and carry
functions are generated. This special macro is useful in
multiplier functions. Each macro uses one logic module to
implement the desired adder function and is configured by
anti fuses as explained in Section II-C.
The second example (Fig. 14) is a seven-decade frequency counter. The application simply measures the total
number of clocks on the" Frequency _ in" pin being measured, that occur during a particular reference time interval. The reference time interval is generated by the oscillator and the six-decade counter. The eight-decade counter is
used as the I rcquency counter and the control circuits
Article Reprint 2
EL-AYAT
759
et ul.: CMOS EtECTRICALtY CONHGlJRABLI' GATF ARRA Y
Fig. 13.
4 x 4 multiplier.
•
Fig. 15.
6 x De~ade Counrd
(Freq Divider) 1
l i ! !
Fig. 14.
Seven-decade frequency counter with prescaler.
generate the required timing and load signals. The frequency count result is then loaded into a set of output
latches. The oscillator circuit is shown in Fig. 15. It uses an
input buffer, an inverting logic module, and an output
buffer as well as external crystal and resistor components.
Crystal oscillator using on-chip inverter.
Fig. 16(a) illustrates the front-end prescaler used to scale
the Frequency _in pin being measured. The prescalar is
simply a divide-by-lO counter implemented as a divide-by-2
toggle flip-flop followed by a divide-by-5 shift counter.
Fig. 16(b) shows the logic diagram of the flip-flop macro
DFM used in the shift counter. Fig. 17 shows the captured
waveforms at the input and output of the prescaler. The
Frequency _in pin in this waveform is running at 80 MHz
and the prescaler output is running at 8 MHz. A prescaler
is used in this application to vary the frequency range
being measured. This application uses 284 out of 295 logic
modules, 37 I/O buffers, 60 flip-flops, and 32 latches.
760
IEEE JOURNAL Of SaUD-STATE CIRCUITS, VOl ..
24,
NO.3, JUNE
1989
VDD
(a)
Q
(b)
Fig. 16.
(a) Divide-by-lO prescaler. (b) Configuring flip-flops with MUX inputs to perform logic functions.
VI.
CONCLUSION
A CMOS electrically configurable gate array with a
channeled gate array architecture has been described. The
chip is made feasible by a new interconnect technology
and a novel two-terminal anti fuse element. The array has
295 configurable logic modules and over 112K anti fuse
elements. The wiring channels consist of segmented tracks
with antifuses at their intersections and between adjacent
segments. The I/O modules .are also configurable. The
chip is implemented in a 2-,...m n-well CMOS process with
two layers of metallization. The implementation of the
chip ensures that the design is fully testable.
Fig. 17.
ACKNOWLEDGMENT
PrescaIer input and output waveforms.
When mapped into a conventional mask-programmed gate
array this application would require 1200 gates.
The chip is supported with a computer-aided design and
development system specifically developed for this chip
family [12]. Designs are entered into the system as
schematics or net lists using a macrocell library. The
designs are then automatically placed and routed using
fully automated placement and routing software capable
of consistently mapping applications utilizing up to 95
percent of the array modules. Net delays are accurately
computed by the system from the post-layout data. The
system also contains development and diagnostic tools that
greatly simplify the design process.
The authors would like to acknowledge contributions by
K. Bushey, L. Chan, J. Chen, S. Chiang, S. Eltoukhy, J.
Greene, K. Hayes, H. Nguyen, B. Osann, J. Reyneri, E.
Rogoyski, A. Samie, F. Sohail, and E. Takyu. They would
like to acknowledge A. Chatterjee, J. Dorosti, and M.
Ghafghaichi of Data General for their excellent support in
manufacturing the chips.
REFERENCES
[1]
[2]
M. Takechi el al., "A 630K transistor CMOS gate array," in
ISSCC Dig. Tech. Papers, Feb. 1988, pp. 72-73.
R. Blumberg el al., "A 640K transistor sea of gates 1.2 I'm
HCMOS technology," in ISSCC Dig. Tech. Papers, Feb. 1988, pp.
74-75.
Article Reprint 2
EL-AYAT
et al.:
CMOS ELECTRICALLY CONFIGURABLE GATE ARRAY
J Pathak et al., "A 50MHZ CMOS programmable logic device," in
ISSCC Dig. Tech. Papers, Feb. 1988, pp. 144-145.
S. H. Bowden et al., "A 9ns electrically erasable CMOS programmable logic device," in ISSCC Dig. Tech. Papers, Feb. 1988,
pp.146-147.
[5) S. Wong et aI., "CMOS erasable programmable logic with zero
standby power," in ISSCC Dig. Tech. Papers, Feb. 1986, pp.
242-243.
[6) H. Hsieh et al., "A second generation user-programmable gate
array," in Proc. ClCC, May 1987, pp. 515-52l.
[7) H. Hsieh et al., "A 9OOO-gate user-programmable gate array," in
Proc. CICC, May 1988, pp. 15.3.1-15.3.7.
[8) K. H. Gudger et al., "A 2500 gate programmable logic device with
subdivisable macrocells," in Proc. CICC, May 1988, pp.
15.2.1-15.2.4.
[9) K. El-Ayat et aI., "A CMOS electrically configurable gate array,"
in ISSCC Dig. Tech. Papers, Feb. 1988, pp. 76-77.
[10) A. Mohsen et al., patent pending, 1988.
[11) H. Stopper, "A wafer with electrically programmable interconnections," in ISSCC Dig. Tech. Papers, Feb. 1985, pp. 268-269.
[12) A. El Gamal et aI., "An architecture for electrically configurable
gate arrays," in Proc. ClCC May 1988, pp. 15.4.1-15.4.4.
[13) E. Hamdy et al., "Dielectric based anti fuse for logic and memory
ICs," in IEDM Tech. Dig., Dec. 1988, pp. 786-789.
[14) K. El-Ayat et al., patent pending, 1989.
[3)
[4)
Khaled A. EI-Ayat received the B.Sc. degree in
electrical engineering from the University of
Cairo, Egypt, in 1968, the M.Sc. degree in electrical engineering and computer science from the
University of Toronto, Canada, in 1971, and the
Ph.D. degree in electrical engineering and computer science from the University of California,
Santa Barbara in 1977.
In May 1977 he joined Intel Corporation,
Santa Clara, CA, to work in the Microprocessor
Design Group, where he worked on the definition and development of industry standard microprocessor families such
as 8086, 80186, and 80386. As a Project Manager at Intel, he was
responsible for the design of the control structures of the 80386 M.icroprocessor. After leaving Intel, he cofounded Actel Corporation in
Sunnyvale, CA, and was the Program Manager responsible for development of electrically configurable gate arrays. His present research interests include configurable logic and application-specific architectures, VLSI
design, and microprocessor architectures. He has authored many articles
and holds patents covering Actel's architecture and testability techniques.
Presently he is a Chief Engineer working on the definition of the next
generation of products.
Abbas EI Gamal (S'71-M'73-SM'83) was born
in Cairo, Egypt, on May 30, 1950. He received
the B.S. degree in electrical engineering from
Cairo University, Cairo, Egypt, in 1972 and the
M.S. degree in statistics and Ph.D. degree in
electrical engineering both from Stanford University, Stanford, CA, in 1977 and 1978, respectively.
From 1978 to 1980 he was an Assistant Professor of EE systems at the University of Southern
California. Since 1980 he has been with the Electrical Engineering Department at Stanford University where he is currently an Associate Professor. From 1984 to 1986 he was the Director of
the Systems Research Lab, LSI Logic Corporation, Milpitas, CA. He is a
co-founder and Chief Scientist of Actel Corporation, Sunnyvale, CA.
Richard Guo received the B.S. degree in electrical engineering from
National Taiwan University, Taiwan, Republic of China, in 1978,
and the M.S. degree in electronics engineering from the University of
Southern California in 1981.
In 1981 he joined Intel Corporation where he worked as a Design
761
Engineer on 256K CMOS dynamic RAM's. In
1985 he joined Samsung where he worked as
Senior Engineer in the design of aiM CMOS
DRAM. In 1986 he joined Actel Corporation,
Sunnyvale, CA, as a Senior Design Engineer
where he worked on the design of electrically
configurable gate arrays. At present he is a Project Manager working on the next generation of
products.
John Chang received the B.S. degree in electronic
physics from National Chiao-Tung University,
Taiwan, Republic of China, in 1977, and the
M.S. degree in electronics engineering from the
University of Iowa in 1981.
In 1981 he joined Advanced M.icro Devices
where he worked as a Design Engineer in MOS
static RAM's. In 1986 he joined Actel Corporation, Sunnyvale, CA, as a Senior Design Engineer where he worked on the design of electrically configurable gate arrays.
Ricky K. "- Mak was born in Hong Kong on
January 13, 1959. He received the B.S. degree in
electrical engineering and computer sciences from
the University of California, Berkeley, in 1983.
He joined Data General Corporation, Sunnyvale, CA, in 1983 as an IC Design Engineer.
From 1986 to 1988 he was involved in the development of the configurable gate array. He is
currently working on 64-bit CPU chip design.
Frederick Chiu was born in 1960 and received
the B.S. degree in electrical engineering and computer science from the University of California,
Berkeley, in 1983.
He joined Data General Corporation, Sunnyvale, CA, in 1984 as a Design Engineer. Currently he is working on the design of a VLSI
CPU chip.
Esmat Z_ Hamdy (S'76-M'80) was born in Cairo,
Egypt, on February 2, 1950. He received the
B.Sc. and M.Sc. degrees from Cairo University,
Cairo, Egypt, in 1971 and 1975, respectively, and
the M.A.Sc. and Ph.D. degrees from the University of Waterloo, Waterloo, ant., Canada in 1977
and 1980, respectively, all in electrical engineering.
From 1971 to 1975 he served as an Instructor
in the Department of Electrical Engineering,
Helwan University, Helwan, Egypt. From 1975
to 1980 he was a Research and Teaching Assistant in the Department of
Electrical Engineering, University of Waterloo, where he was engaged in
the analysis, modeling, and design of high-density bipolar and MaS
integrated structures for LSIjVLSI technologies. He joined Intel Corporation, Aloha, OR, in 1981, where he was a Senior Staff Engineer/Project
Manager in the Technology Department involved in device physics and
circuit design of dynamic RAM's and submicrorneter CMOS. In 1985 hc
cofounded Actel Corporation, Sunnyvale, CA, where he is now Direct" ..
of the Technology Development. He has authored or co-authored over 15
papers on LSI/VLSI circuits including contributions to 12 L, singlcdevice-well (SDW) MOSFET's. and CMOS latch-up. He also has eight
pending or granted patents.
II
762
IUt Jllllkt.,,1. 01· SOI.lD-STATI'. CIRCUITS, VOl ..
Dr. Hamdy was the winner of the Best Student Paper at the i <)7'.'
International Electron Device Meeting (IEDM).
John McCollum received the B.A. degree in
physics from the University of California ~t
Berkeley.
He is currently employed at Actel Corporation. Sunnyvale, CA. where he is the Manager of
Process Development. He previously workcJ at
Intel Corporation where he worked on both de·
sign and process development. His interests in··
elude astronomy and electronics.
24, NO.3, JUNE 19X9
Amr Mohl>en (S'72-M'74-SM'X4) icceived the
Ph.D. degree in electrical enginecring and applied physics from the California institute of
Technology, Pasadena.
He is a founder and Chairman of the BoarJ of
Actcl Corporation. Sunnyvale, CA, and has more
than 20 years of e)(perience in the semiconJuctor
industry. Before founding Actel. he was a Senior
Engineering Manager in the Technol')gy Development Division at Intel Corporation. He ~lso
worked on charge-coupled device development at
Bell Laboratories and served as a consultant. He has autho. ,(\ more than
45 artides relating to semiconductors and is responsible fUi inve0tion~
covered by 20 patt;nts.
Actel Corporation 955 East Arques Avenue Sunnyvale, CA 94086 408,739,1010
Article Reprint 3
AN ARCHITECTURE FOR
ELECTRICALLY CONFIGURABLE
GATE ARRAYS
by Abbas EI Gamal, Jonathan Greene, Justin Reyneri,
Eric Rogoyski, Khaled A. EI-Ayat, Amr Mohsen
•
Reprinted from
IEEE JOURNAL OF SOLID-STATE CIRCUITS
Vol. 24, No.3, 1989
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 24, No.2, APRIL 1989
394
An Architecture for Electrically
Configurable Gate Arrays
ABBAS EL GAMAL, SENIOR MEMBER, IEEE, JONATHAN GREENE, JUSTIN REYNERI,
ERIC ROGOYSKI, KHALED A. EL-A Y AT, AND AMR MOHSEN, SENIOR MEMBER, IEEE
Ab.dracl - An architecture for electrically configurable gate arrays using
a two-terminal anti-fuse element is described. The architecture is extensible, and can provide a level of integration comparable to mask-programmable gate arrays. This is accomplished by using a conventional gate array
organization with rows of logic modules separated by wiring channels.
Each channel contains segmented wiring tracks. The overhead needed to
program the anti-fuses is minimized by an addressing scheme that utilizes
the wiring segments, pass transistors between adjacent segments, shared
control lines, and serial addressing circuitry at the periphery of the array.
This circuitry can also be used to test the device prior to programming and
observe internal nodes after programming. By providing sufficient wiring
tracks segmented into carefully chosen lengths and a logic module with a
high degree of symmetry, fully automated placement and routing is facilitated.
l
vertical
control
venical
track
segment
I
I
I
L -3
cross
fuse
I
n.
L
f-J
L
i---J
L -3
vertical
INTRODUCTION
logic
module
horizontal
track
segment
M
ASK-programmable gate arrays offer the architectural flexibility and efficiency to integrate thousands of gates, but require long development time and high
nonrecurring engineering costs. On the other hand, the
convenience of field programming is available with programmable logic device (PLD) technologies, but their architectures have not allowed integration of a wide variety
of applications exceeding a few hundred gates [1], [2].
We describe a novel gate array architecture [3] which
combines the flexibility of mask-programmable arrays with
the convenience of field programmability. Its implementation is made possible by a two-terminal electrically programmable anti-fuse offering low resistance in its conducting state and small area.
The architecture supports a design style similar to conventional gate arrays, including fully automatic placement
and routing algorithms attaining 85-95-percent utilization.
This required considerable emphasis on symmetry and
routability, which we touch on below.
The anti-fuse is so called because it irreversibly changes
from high to low resistance when "blown" by applying a
programming voltage across it. The anti-fuse, or fuse for
short, has an ON-state resistance of approximately 500 n.
The layout area of the fuse cell is generally limited by the
Manuscript received August 22, 1988; revised December 2, 1988.
The authors are with Actel Corporation, Sunnyvale, CA 94086.
IEEE Log Number 8826132.
L
f.=F
L
pi'"
tr~~::tor
I
I
I
I
L .....J
I.
L
p-
\.- f-J
horizontal
control
Fig. 1.
horizontal
pass transistor
horizontal
fuse
Interconnect architecture.
pitch of the first- and second-level metal lines that connect
to it; it is about the same size as a via.
This paper focuses on the architecture itself, which is
fairly independent of the exact details of the particular
CMOS technology and the anti-fuse. Other papers describe
more fully the anti-fuse [4], a CMOS circuit implementing
the architecture [5], and a study comparing the architecture's logic density to that of conventional gate arrays [6].
II.
PROGRAMMABLE INTERCONNECT ARCHITECTURE
The general architecture, shown in Fig. 1, exhibits the
familiar gate array organization: rows of logic cells interspersed with routing channels. There are, of course, several
key differences.
The tracks in the channels are not simply empty areas in
which metal lines can be arranged for a specific design.
Rather, they contain predefined wiring "segments" of various lengths. Other wiring segments pass through the channels vertically. Each input and output of a logic module is
connected to a dedicated vertical segment. Other vertical
segments just pass through the modules, serving as
feedthroughs between channels. (The number and lengths
of segments in Fig. 1 are only suggestive.)
0018-9200/89/0400-0394$01.00 CD1989 IEEE
Article Reprint 3
EL GAMAL
et al.:
ARCHITECTURE fOR ELECTRlCALLY CONflGURABLE GATE ARRAYS
395
1
Fig. 2.
L
I
Horizontal fuse programming.
I
I
jM- GND
,I=;
.....iF;
F2
FS
I
I
I
jM-
.....ifi
F3
F4
Fig. 4.
Fig. 3.
A sneak path.
Cross-fuse programming.
TABLE I
A fuse is located at each crossing of a horizontal and
vertical segment. Programming one of these "cross fuses"
provides a low-resistance bidirectional connection between
the segments. Other fuses are located between adjacent
horizontal segments within a track. When blown, these
"horizontal fuses" connect the two segments to form a
longer one. (Although not shown in the diagram, fuses
may also be provided to connect adjacent vertical segments.)
In order to program a fuse, we need to apply high
voltage across it. This is accomplished by an efficient
addressing scheme that uses the wiring segments themselves, pass transistors connecting adjacent segments, and
control logic at the periphery of the array. Fuse addresses
are shifted into the chip serially.
As shown in Fig. 1, each column of "horizontal pass
transistors" connecting horizontal tracks is controlled by a
shared "horizontal control" line running across the array.
Each row of "vertical pass transistors" is controlled by a
"vertical control" line. The peripheral circuitry can drive
the control lines and the segments at the end of each track.
Horizontal fuse programming is quite simple. In the
example of Fig. 2, we apply programming voltage Vpp
across the fuse Fl' All horizontal control lines except the
one in the column containing Fl are turned on by connecting them to V pp , and the appropriate track segments are
driven to GND and Vpp as shown. (Vertical fuses, if
present, are programmed similarly.) Cross-fuse programming uses both horizontal and vertical control lines as
shown in Fig. 3. Segments not driven to either GND or Vpp
are left l'recharged to Vpp/2. Thus the voltage across fuses
not being programmed is either zero or Vpp/2.
Some care is required to assure that a unique fuse is
addressed. Fig. 4 shows how previously blown fuses can
divert current along a "sneak path," in this case programming fuse Fs through previously blown fuses F3 and F4
instead of programming F2 • Fortunately, we are not interested in blowing an arbitrary pattern of fuses (this is not a
PROM!). For example, we are not concerned with programming a pattern that connects two outputs together
since this does not form a useful net. If we consider only
relevant patterns, it can be shown that programming the
4 transistor cells
modules
3 input NOR
4:1 mux, non-invening
D latch with clear
D flip-flop with clear/set
full adder
10
fuses in a carefully chosen order eliminates sneak paths. In
general, fuses must be programmed starting from the center of the chip and moving outward, channel by channel.
Determining the proper order is a bin sort operation, and
can be done by software in linear time.
The pass transistors and peripheral control logic are also
used to test the chip; this is discussed in detail later.
III.
CHOICE OF THE LOGIC MODULE
As outlined so far, the programmable interconnection
architecture could be used with a variety of logic modules.
Which would be best? This turned out to be a very
difficult question, involving subtle trade-offs among
routability, the logical capability of the module as perceived by the user, and delays due to capacitive loading in
the routing segments.
The complexity of the module must be balanced with
the routing overhead. Mask-programmed gate arrays provide very flexible and efficient routing. They therefore use
a simple four-transistor cell. On the other hand, routing is
very expensive in both area and delay with present programmable logic arrays. These generally use a module
capable of implementing more complex functions [2]. The
architecture outlined here has a cost of routing closer to a
conventional gate array, suggesting a logic module of intermediate size. Because this is about the same complexity as
conventional gate array hard macros, the designer can use
a library like the familiar gate array cell libraries; there is
no need to map logic into a more complex module. Table I
lists several typical gate array macros and the numbers of
four-transistor cells and logic modules required to implement them.
II
396
IEEE JOURNAL Of SOLlD·STATE CIRCUITS, VOL. 24, No.2, APRIL 1989
Programmed
Output
Segment
Horizontal Fuse
Logic Module
Output
Logic Module
A1
5A
OUT
Input
Input
Segment
~~~~§i~~
.~
81
Prograrruned
58
so _ _ _ _ _---.J
51
-------~
Fig. 5.
Module function.
Our chosen module, shown in Fig. 5, has eight inputs
and a single output. It is composed of three two-to-one
~ultiplexo~s, with an OR gate on the last stage's select
mput. Vanous macros, such as those in the table are
implemented by using an appropriate subset of the i~puts
and tying the remaining inputs high or low. Thus the
m~dule can implement all macros with two inputs, most
with three inputs, many with four inputs, etc.
The module's output is connected to a vertical segment
spanning several channels. Each input is connected to a
short vertical segment spanning one channel. Four of these
span the channel above the module, four the channel
below. The use of short segments for the inputs reduces
parasitic capacitance and hence delay.
Note that each input is accessible from either the channel above or below but not both. At first, this would
appear to limit routability compared to a conventional
"double-entry" gate array cell, in which signals may enter
from either channel. However, there is nearly always more
than one way to implement a macro. For example, there
maybe up to four distinct ways to implement a two-input
gate: with both signals connecting to inputs in the top
channel, with both signals connecting to inputs in the
bottom channel, with one signal in the top and the other in
the bottom channel, or vice-versa.
By letting the router choose an implementation that uses
inputs accessed from convenient channels, the benefits of
full double-entry symmetry are approached or sometimes
attained. The degree of symmetry possible for a particular
macro m implemented in a given module is reflected in the
following measure S:
S(m)
=
log2 (N(m))
where N(m) is the number of distinct possible implementations of the macro m. Full double-entry symmetry would
correspond to a value S( m) equal to the fan-in of the
macro. To evaluate the overall symmetry of a module, we
average S( m) over the macro library, weighted by relative
macro usage U( m) and the fan-in F( m):
LU(m)S(m)
LU(m)F(m) .
4-26
Rows of
Logic Modules
~~~~~~~~§~~~~~~
Fig. 6.
Cross Fuses
A routing path.
Thi~
is the effective fraction of macro inputs in a typical
design that have double-entry symmetry, and is an important criterion for choosing a module.
IV.
ROUTING
Fig. 6 illustrates the routing of a net. The vertical
segment connected to the driving module's output is connected by cross fuses to horizontal segments, which in turn
connect to the segments associated with module inputs. In
the top channel, a horizontal fuse is used to link two
segments into a longer one.
The resistance of the blown fuses and the parasitic
capacitance of the segments used form an RC tree, with
the driver of the net as the root. Note that each input is
driven through a maximum of three and generally two
fuses to limit the delay. (If the number of series stages in
the RC tree were allowed to increase further, the delay
through the routing would increase rapidly.) The maximum number of fuses and the segment lengths (hence
capacitances) can be altered to suit the chip dimensions
and the resistance of the fuse technology.
In rare instances, it is not possible to place the macros
so that all inputs on the net lie within the channels
spanned by the output segment of the driving module. To
handle this case, a few additional uncommitted long vertical segments are provided. The net is then routed from the
output segment to a horizontal segment, then to the long
vertical segment, then to another horizontal segment, and
finally to the necessary input segments. To keep the number of fuses in series limited to four, no horizontal fuses
are allowed in such nets. (If necessary, the architecture can
be extended to provide a special fuse connecting the output directly to a long vertical segment passing over the
driving module, thus eliminating the first horizontal segment and reducing the total number of fuses in series back
to three.)
A means must also be provided to connect internal
signals to the bonding pads of the chip. Each pad has a
dedicated bidirectional buffer, which connects to the array
through an associated I/O module. The I/O modules fit
in the outer columns and rows of the array next to the
logic modules. Each I/O module has two inputs, data and
enable, and an output. The data and enable signals are
Article Reprint 3
397
EL GAMAL et at.: ARCHITECTURE FOR ELECTRICALLY CONFIGURABLE GATE ARRAYS
sent to the output buffer of the associated bonding pad,
and the module's output comes from the input buffer of
the pad. Thus the I/O module can be configured to
provide input, output, tristate, or bidirectional capability.
To minimize clock skew due to differential routing delay, one entire track (or more if needed) in each channel is
set aside for clock distribution. These tracks are connected
directly to buffers, so that each input presents a similar
load driven through exactly one fuse.
An interesting theoretical question is whether more horizontal tracks are needed in each channel here (where the
lengths of the wiring segments must be predetermined)
than in mask-programmed routing (where the wiring is
customized for the design). Surprisingly, a high probability
of routability is obtained with only a few tracks above
channel density.
This requires a careful choice of the lengths of the
segments, based on statistics from an extensive suite of
design examples. This was done by first determining the
distribution of net lengths, i.e., the length each net would
run along each channel if the constraint of fixed segmentation were absent (as in a conventional gate array). The
distribution of physical segment lengths provided on the
chip was chosen to obey similar statistics. Then the segmentation was" tuned" manually based on actual routings
which obeyed the constraints it imposed.
To obtain good routing performance it is also necessary
to take advantage of the symmetry of the macros where
possible. For example, observe that if macro 4 in Fig. 6
permits its input to be routed from either the upper or
lower channel, there is a better chance of finding a free
horizontal segment to connect it.
V.
TESTING
To assure high programming yield, It IS necessary to
thoroughly test the chip for defects in the modules and
fuses prior to programming. With a simple addition, the
addressing circuitry used for programming suffices for this
purpose as well.
Continuity of the tracks is easily verified by turning on
all vertical and horizontal pass transistors, and using the
peripheral circuits to drive the tracks from one end and
read them back from the other. Testing for the absence of
shorts between adjacent tracks is done in a similar way by
applying a pattern of alternating ZERO'S and ONE'S.
Shorted or weak cross fuses are detected by turning on
all horizontal and vertical pass-transistor lines, grounding
all horizontal segments, and driving all vertical segments
to some stress voltage. Horizontal fuses are tested column
by column, with the same addressing method that is used
to program them.
To verify the functional operation of the modules, we
need to apply test vectors to their inputs and read their
outputs. A vector is applied simultaneously to an entire
row of modules by turning on all vertical pass transistors
except those in the row being tested. Data are applied to
. .-===:rs=.. " . . "
Probe
Column
1: . To
~xtemal
Enable
...
,~
~-o-----
Module
Row Select
Column Sense
i
Fig. 7.
Probe circuit.
the inputs in the channel above the row from the periphery
at the top of the array, and to the inputs in the channel
below the row from the bottom of the array.
Since the outputs of the modules share a vertical track
with outputs of other modules above and below them,
some other means is required to read the module outputs
at the array periphery. As shown in Fig. 7, a select line is
provided along each row of modules, and a sense line
along each column. Activating the select line for the row of
modules under test gates their output values onto the sense
lines. The sense lines are loaded into a shift register at the
top of the array.
This ability to read the output of any module at the
array periphery is highly useful after programming as well.
Only a small amount of extra circuitry is required to select
one of the sense lines and make its value available at an
external pin of the chip. Thus by shifting in the appropriate address, the user can observe any internal node of his
design externally in real time. This virtual probe can be
used and its address changed even as the programmed chip
is operating in the user's system.
VI.
IMPLEMENTATION: SILICON AND SOFTWARE
The architecture has been implemented in a CMOS
device. For details, including the speed of the module in
isolation and in an application, see ~5].
Computer-aided design tools have been developed to
support the architecture. Designs are entered as schematics
or net lists using a cell library.
The placement and routing algorithms are specific to the
architecture. As usual these are time consuming, taking up
to a few hours on a low-cost workstation. They achieve
lOO-percent routing completion. (Even expert users have
never been able to improve manually on the automatic
router.) The probability of successful routing can be predicted by analyzing some statistics of the design.
Because the nets are RC trees, delays are not a simple
function of capacitive load as with mask-programmed gate
II
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 24, No.2, APRIL 1939
398
arrays. Nevertheless, we are able to quickly calculate precise delays at each input for post-layout simulation and
timing verification.
ACKNOWLEDGMENT
The authors gratefully acknowledge the technical contributions of J. Chang, D. Gluss, R. Guo, D. How, and F.
Sohail.
REFERENCES
[1]
[2]
[3]
[4]
[5]
Eric Rogoyski received the B.S. degree in mathematics from the State University of New York at
Stony Brook in 1972.
He was at IBM Corporation from 1974 to
1982 with his last position in EDS at East
Fishkill, NY. He held various positions in physical design from 1982 to 1984 at the company he
co-founded, California Automated Design Inc.,
and from 1984 to 1986 at Mentor Graphics
Corporation. He is currently a software consultant at Actel Corporation, Sunnyvale, CA, supporting the architectural development and physical design of Actel's
configurable technology.
Abbas EI Gamal (S'71-M'73-SM'83) received
the B.Sc. degree in electrical engineering from
Cairo University, Egypt, in 1972, and the M.Sc.
degree in statistics and the Ph.D. degree in electrical engineering both from Stanford University,
Stanford, CA, in 1977 and 1978, respectively.
From 1978 to 1980 he was an Assistant Professor of electrical engineering at the University of
Southern California, Los Angeles. Since 1980 he
has been with the Electrical Engineering Department of Stanford University where he is currently an Associate Professor. From 1984 to 1986 he was Director of the
Systems Research Laboratory, LSI Logic Corporation, Milpitas, CA. He
is a co-founder and Chief Scientist of Actel Corporation, Sunnyvale, CA.
Khaled A. El-Ayat received the B.Sc. degree in
electrical engineering from the University of
Cairo, Egypt, in 1968, the M.Sc. degree in electrical engineering and computer science from the
University of Toronto, Canada, in 1971, and the
Ph.D. degree in electrical engineering and computer science from the University of California,
Santa Barbara, in 1977.
In May 1977 he joined Intel Corporation,
Santa Clara, CA, to work in the Microprocessor
Design Group, where he worked on the definition and development of industry standard microprocessor families such
as 8086, 80186, and 80386. As a Project Manager at Intel, he was
responsible for the design of the control structures of the 80386 Microprocessor. After leaving Intel, he cofounded Actel Corporation in Sunnyvale, CA, and was the Program Manager responsible for development of
electrically configurable gate arrays. His present research interests include
configurable logic and application-specific architectures, VLSI design,
and microprocessor architectures. He has authored many articles and
holds patents covering Actel's architecture and testability techniques.
Presently he is a Chief Engineer working on the definition of the next
generation of products.
Jonathan Greene received the Sc.B. degree in
biology from Brown University, Providence, RI,
and the Ph.D. degree in electrical engineering
from Stanford University, Stanford, CA, in 1983,
where he performed research on configurable
VLSI arrays, VLSI complexity, and information
theory.
During 1984 he was with Hewlett-Packard
Laboratories. From 1984 to 1986 he worked on
cell design automation and module compilation
at the LSI Logic Systems Research Laboratory'
in Palo Alto, CA. He is currently Manager of System Architecture at
Actel Corporation, Sunnyvale, CA.
Arnr Mohsen (S'72-M'74-SM'84) received the
Ph.D. degree in electrical engineering and applied physics from the California Institute of
Technology, Pasadena.
He is the founder, President, and Chief Executive Officer of Actel Corporation, Sunnyvale,
CA, and has more than 20 years of experience in
the semiconductor industry. Before founding Actel, he was a Senior Engineering Manager in the
technology division at Intel Corporation. He also
worked on charge-coupled device development at
Bell Laboratories and served as a consultant. He has authored more than
45 articles relating to semiconductors and is responsible for inventions
covered by 20 patents.
[6]
4-28
S. Wong, H. So, C. Hung, and 1. Ou, "CMOS erasable programmable logic with zero standby power," in ISSCC Dig. Tech.
Papers, Feb. 1986, pp. 242-243.
H. Hsieh et al., "A second generation user programmable gate
array," in Proc. Custom Integrated Circuits Conf., May 1987, pp.
515-52l.
A. El Gamal, K. El-Ayat, and A. Mohsen. "Programmable interconnect architecture," pending U.S. patent.
E. Hamdy et al., "Dielectric based anti fuse for logic and memory
ICs," in IEDM Tech. Dig. (San Francisco, CAl, 1988, pp. 786-789.
K. El·Ayat et al., "A CMOS electrically configurable gate array," in
ISSCC Dig. Tech. Papers, Feb. 1988, pp. 76-77.
B. Osann and A. El Gamal, "Compare ASIC capacities with gate
array benchmarks," Electron. Des., vol. 36, no. 23, pp. 93-98, Oct.
13, 1988.
Justin Reyneri received the B.S. and M.S.E.E.
mathematics degrees from Stanford University,
Stanford, CA, in 1978, and the Ph.D. degree in
electrical engineering, also from Stanford, in
1985, having done research in cryptology and
information theory.
He is now Manager of System Development at
Actel Corporation, Sunnyvale, CA, where he has
worked since 1986. Prior to that he worked at
LSI Logic's System Research Laboratory on automated data· path layout, and at Hellman Asso·
ciates on communications security systems.
Article Reprint 4
DIELECTRIC BASED ANTIFUSE
FOR LOGIC AND MEMORY ICS
by Esmat Hamdy, John McCollum, Shih-ou Chen,
Steve Chiang, Shafy Eltoukhy, Jim Chang,
Ted Speers, Amr Mohsen
I
A paper presented at the
ternational Electron Devices Meeting, 1988
DIELECTRIC BASED ANTIFUSE FOR LOGIC AND "E"ORY
ICs.
Esaat Haady, John McCollua, Shih-ou Chen, Steve Chianl,
Shafy Eltoukhy, Ji. Chang, Ted Speers, Aar Mohsen
Actel Corporation, 955 East Arques Ave., Sunnyvale CA 94086
ABSTRACT
This
paper
describes a
Proeraaaable Low
lapedanee Circuit Eleaent are
used; the low voltage high speed transistors
are used for the loeic and signal path while
786-IEDM 88
the high Yolta,e transistors are uaed for the
proaraaaing path. Fig. (2) i8 a photoaicroeraph
ot a
2000 gate progra••able gate array
utilizing 166.000 antituses. There are roughly
one hundred antifuses for each logic gate to
achieve a high decree ot interconnectivity and
gate utilization. Each antifuse occupies an
area equivalent to a contact or via. The
antituses
are
incorporated
into
a
cell
structure such that when the antifuse is
proeraaaed. it connects a aetal 1 and aetal 2
line U.2]. The cell structure size is thus
liaited by the aetal 1 and aetal 2, pitch.
Fi.. (3) is a photoaicroeraph of a 64K PRO"
designed using the sa.e technoloey with a
typical access tiae of 35 naee.
ANTI FUSE CHARACTERISTICS
When 16 volts is applied across the antifuse
through X-V aelect transistors [1.2], the
antituse is proeraa.ed to the conductive
state in about 200 usee aa shown in Fig. (4)Fie. (5) shows a tight resistanee distribution
of antituses proeraaaed with a current of
5 .A.
Once
the
dielectric
ia
ruptured.
the
resistanee Rl of the conductive state i.
deter.ined by the aize of the conductive
conduit Uink>.
The
size
of
the
link
is
deter.ined by the aaount of power dissipated
in the link which ael ts the dielectric. Since
the te.perature of the aolten core varies
inversely with its radius, the .olten core will
expand until it. te.perature drops below the
.eltina point of the dielectric. Since the size
of the link is .uch s.aller than the thickness
of the conductive silicon layers on both sides
of the dielectric. the te.perature ,radient
and the resultant heat conluction can b •
• odelled ~y a si.ple sphere. The resultant
equation for the link te.perature T I:
TI-I,V1/4lCktlarl
••••••••••••• (1)
Where I p is the pro,ra.aine current. V I ia the
voltage acrOB8 the link. r. ia the radius of
the link. ani ktll i8 the theraal conluctivity
of silicon. The calculation however beco.ea
CH2S28-8/88/0000-0786 $1.00 © 1988 IEEE
Article Reprint 4
co.pUcated if the ther.al conductivity of
slilcon I.
taken into consideration as a
function of its te.perature [3). Further.ore.
the
power
dissipation
i&
distributed
throu,hout
the
sphere. Hence
a
co.puter
sl.ulat.lon that breaks the sphere into 1 n.
thick conductive shells was used to calculate
the link te.perature TI and resistance RI vs
pro,ra••ln,
current
I,.
The
resultant
calculations are plotted In Fi.. (6). The link
radius 1'1 is deter.ined by the equilibriu. of
power dissipation and .eltine te.perature, as
expressed In the followin, equation [4):
1'1-21,<"'1+111/2)/87'" ••••••••• (2)
predicted liteti.e is .ore than 40 years.
Fig. (10) is a plot of ti.e to breakdown vs
lIelectric-field
for
unprogra••ed
discrete
antifuses. Extrapolation of Fia. (10) indicates
that the Iifeti.e of an unproera•• ed antifuse
under
continuous
5.5v
stress
exceed.
100 years, with a
failure rate lesl than
1 FIT. In addition to the accelerated discrete
antifuse device reliability data, 364 product
units have accu.ulated a total of 571.000
device hours of dyna.ic burn-in at 5.5 volt
and 125C (with so.e units reachine 2500 hours)
with
no
failures
or
change
in
a.e.
characteristics.
Thil
is
equivalent
to
a
failure rate of less than 100 FIT.
r
vhere
I' fai are the resistivities of the link
and silicon respectively durinc proera••inc
and the relation between link resistance RI
and
link
radius
1'1
can be
obtained as
follows[4]:
Fro. eqn.(2) and eqn.(3) we derive that RI is
thus inversely proportional to I, (Fig. (6»,
liven by the followi", equation:
RI=(87/tlp).(rl+1'li)/~JI+ 100M OHMS
Fig.(3) Photomicrograph of 64K PROM
Fig.(l) SEM cross-section of antifuse.
~
0
....~
~
20 f18
"15
0
-
>
"j
:I
10
,..
oC
a:
8a:
A.
f-
100
200
300
500
400
PROGRAMMING TIME
(j.Ls)
Fig.(4) Programming time of antifuse
with 18V programming voltage.
Fig.(2) Photomicrograph of 2000 gate
programmable gate array.
788·IEDM 88
Article Reprint 4
Programmed Antlfuae Rellatanee Histogram
Kelvin Structln
70''''-'('''
.---10
0
10
3
Dltt.
80
10
O~_---L
100
JQO
100100
-
-+10
2
4
Poly
l
0
,
Gnd
0
-++V
L.--
10011001100110017001100
Prognmmed AntIIuM AMlatance (ofwIs)
Fig.(8) Antifuse Kelvin structure.
Fig.(5) Programmed antifuse resistance
histogram, programmed with 5 rnA current.
ANTIFUSE ACCELERATED LIFETEST
250C 5mA TEST
1000
12
W
(/)
800
~
11
i=
10
,
r
E
!
Z
r(
C
100
I
i
¥I'
I
LL.
'Ii'
7
W
:J:
l-
i
(/)
(/)
400
0
I
II:
l.
0
C
1:>
200
w
Cl
~
C
I...I
0
>
10
24 48 72 116 120144 1&8 li2 216 240 264 288 312 336 360 384 408 432
456
Fusing Current Ip (mA)
ELAPSED TIME (HOURS)
Fig.(6) Programmed antifuse resistance
versus programming current.
R
Fig.(9) Voltage across antifuse versus
stress tim-e, with 5 rnA stress current.
(0)
11V
10V
8V
8V
7V
10
1.000
40yeara
&+OS
.1000
IdIw
'"
11 ~
w//,IU/~ ~
~ ~ I17A
t:"""== ......:
...
I"IIIIIIIIII ~
-
.l1OlIO
~
......... ~
-.10.00
11
1.0/dlw
IaAI
Fig.(7) Programmed antifuse resistance
versus reading current, with 100 ohms of
parasitic resistance in series.
liE (cmIMy)
Fig.(lO) Time to breakdown versus inverse
electric field.
IEDM 88-789
•
Article Reprint 5
FIELD-PROGRAMMABLE
GATE ARRAY WITH
NON-VOLATILE CONFIGURATION
by Andrew Haines
II
Reprinted from
MICROPROCESSORS AND MICROSYSTEMS
Volume 13, No.5, June 1989
4-35
Field-programmable gate array
with non-volatile configuration
Field-programmable gate arrays (FPGAs) allow fast turnaround'
mask-programmable gate arrays offer flexibility. Andrew Haines de~cribes
the new FPGA family from Actel which boasts speed and flexibility
Actel Corporation has developed a family of fie/dprogrammable, channelled gate arrays based on proprietary
architecture and interconnect technologies. The new
semicustom devices combine the flexibility of maskprogrammable, channelled gate arrays and the convenience
of fast-turnaround, electrically programmable logic devices.
A new programming element, the PLICE antifuse, which
can be produced in a standard CMOS fabrication facility,
permits significant architectural innovation.
microsystems
gate arrays
logic modules
PLiCE antifuse
In July 1988, Actel introduced the ACT 1 family of fieldprogrammable gate arrays (FPGAs). Ranging from 1200 to
6000 gates, these devices have a number of important
attributes that clearly distinguish them from previous
approaches: low cost; fast prototype development;
mas.~ed array de~sity; and applications flexibility. Using a
familiar schematiC capture and macro library design entry
procedure, comprehensive CAE support includes automatic placement and routing at 85-95% gate utilization.
The underpinning of the ACT 1 family is a new PLiCE
(programmable low-impedance circuit element) antifuse
programming element. The antifuses, non-volatile, twoterminal elements that exhibit low 'on' resistance when
programmed, provide the same wire-to-wire interconnect
functions as 'vias' provide in mask-programmable gate
arrays, and that transistor-based EPROM and RAM cells
and metal fuses provide in previous programmable logic
devices.
The very small size (1.25.u m per side) and low
programmed-state resistance of the PLiCE antifuse facilitate
the creation of a new field-programmable, channelled
gate-array architecture, with gate denSity, application
flexibility and performance comparable with masked
arrays, but with short development cycles.
The paper first reviews the key elements of FPGA
architectures and then, in this context, examines the
Actel Corporation, 955 E. Arques Avenue, Sunnyvale, CA 94086, USA
Paper received: 27 January 1989. Revised: 10 March 1989
ACT 1 architecture. Next the PLiCE programming element
which underlies the architecture is considered and finally
the characteristics of ACT 1 family devices are described
from the point of view of the user.
ARCHITECTURAL CONSIDERATIONS IN
FIELD-PROGRAMMABLE GATE ARRAYS
Consider the key elements of field-programmable gate
array architecture - the logic modules, the number of
potential interconnect elements, and the delay introduced
when interconnecting logic modules. Gate arrays consist
of an arrar of cells.or logic. modules s.ep~rated by routing
channels. The logIC capaCIty of a devICe IS determined by
the number and size of the logic modules, the effectiveness of the modules in implementing diverse logic
functions (such as flip-flops, latches and gates), and the
percentage of modules that can be wired together in a
given application. This percentage utilization is dependent
on the amount of wiring or routing resources that are
available on the chip. Performance is determined by the
intrinsic delay of the logic modules, the delay introduced
by interconnect wiring and the number of programming
elements required in a programming path.
A major consideration in the choice of a logic module is
the module size which can range from a few gates to tens
of gates. The ideal logic module is very small, i.e. made up
of a few gates which can be combined in many ways to
provide suitable logic functions. Choice of a small logic
module is constrained, however, because the smaller the
module the greater the wiring resources required on-chip
with which to tie them together. Adequate routing
resources may be impossible to achieve if the size of the
programmable interconnect element consumes too
much silicon area. While the choice of a large logic
module relieves routing resource requirements, larger
module sizes may be less efficient in logic function
implementation if a module containing tens of gates must
be sacrificed to perform simple AND, OR or INVERT
functions. Logic applications vary with respect to the
relative amount of combinatorial and sequential logic that
they contain 2• Regardless of its size, the logic module
0141-9331/89/05305-08 $03.00© 1989 Butterworth & Co. (Publishers) Ltd
Vol 13 No 5 June 1989
305
Article Reprint 5
must be designed such that it can implement a wide
variety of logic function with high efficiency.
While the size of the logic module contributes to the
routing resource requirements, equally important are the
routing resource requirements intrinsic to the user's
application. Functions with a regular structure and serial
data transmission, such as shift registers and counters,
have relatively low fan-out per gate and modest routing
requirements. Most random logic applications, such as
multipliers, accumulators and ALUs, have higher effective
fan-out and demand more routing resources for the same
number of gates. Bus-oriented architectures can exhibit
very high fan-out and are notorious for presenting difficult
routing problems.
Providing sufficient routing resources on chip to meet a
wide range of application requirements is a crucial
architectural problem. If resources are inadequate, users
will find that effective device capacity varies greatly
depending on the application. If this is so, the supplier's
designated capacity of '2000 gates' may actually materialize
only infrequently, if ever. Placement and routing, a key
part of the design cycle, is also adversely affected. Under
these conditions, automatic software may perform poorly,
forcing the user into frustrating and time-consuming interactive placement and routing.
For field-programmable devices of all types, performance
in specific applications is dominated by the delay
associated with interconnecting logic modules. Fieldprogrammable devices must have a programming element
in the interconnect pathways. Since these elements do
not have zero resistance, and capacitance is present in the
circuits, they introduce delays. Hence, with respect to
device performance, programming elements can be
thought of as delay elements. The lower the resistance of
the programming element, the faster the device will be.
Therefore, another critical architectural consideration is
the minimization of the number of delay elements in any
given interconnect path. A large or indeterminate number
of programming elements in signal paths leads to
intolerably slow or unpredictable performance.
In reality, the selection of a logic module, routing
resources, and device performance are closely coupled.
The trade-offs between choices for these elements find a
common focus in the choice of a programming element.
Once a programming element has been selected, many
possible architectures are immediately ruled out. For
example, if a large, medium resistance element (approximately 1 k ohm) such as a static RAM cell is chosen, then
the choice of a small logic module is precluded because
the large number of interconnect elements that would be
called for cannot fit on any reasonably sized chip. The
desired programming element must exhibit the smallest
possible size, low on-state resistance and compatibility
with conventional CMOS double-layer metal manufacturing
processes.
ACT 1 FAMilY ARCHITECTURE
Actel has developed a new architecture that takes
advantage of the small size and low resistance of the
PLiCE antifuse elements for electrically configuring devices.
The architecture can be extended to high gate counts
through the use of sub-micron geometries, providing a
level of integration comparable with mask-programmable,
channelled gate arrays. In fact, the device architecture
306
I/O buffers.
program
and test
I/O buffers. program and test
I I I
I I I I I I II
Routing channels
'I
Logic modules
1'fJ I I
d&1
II
I/O buffers. program and test
I/O buffers.
program
and test
Figure 1. Block diagram of ACT 1 array showing arrangement of logic modules, wiring channels and peripheral
circuits. Testability circuits and address circuitry for
programming are built into the periphery
Figure 2. Die photograph of ACn020, 2000-gate array.
Channelled gate array architecture with rows of logic
modules (narrow rows) separated by routing channels
(wide rows) is prominently displayed
exhibits the familiar, channelled gate array organization rows of logic cells interspersed with routing channels (see
Figures 1 and 2). The architecture differs from that of a
traditional gate array in the details of the horizontal and
vertical wiring tracks and, of course, the anti fuse
programming elements.
Figure 3 is a section of Actel's customer-configurable,
channelled gate array showing the tracks, antifuse elements
and interconnect architecture. The tracks within the
channels contain predefined horizontal wiring segments
of various lengths. Vertical tracks span several rows of
logic modules and adjacent horizontal channels. An
Microprocessors and Microsystems
•
Mask programmed gate array cells are 'double entry'
Vertica I
contro I
Vertic al
track
segmen t
I
I
"-
1
J
L ~
L
::::r
L
p-
Lf..-J
~~~:s~
L --.J
or
L~
Vertica l _
pass
transis
tor-1
Logic
modul e
I
Same effect achieved with configurable gate arrays, with
several ways to map a macro into a module
I
I
LI-::J'"
L~
Horizo ntal
track
segment
>~r--J'HOrizontal
control
HOrizontal
pass transistor
HOrizontal
fuse
Figure 3. Simplified organizational view of ACT 1 family
architecture showing logic modules, routing tracks and
programming elements
anti fuse programming element is located at each crossing
of a horizontal and vertical segment. Programming one of
these 'cross antifuses' provides a low resistance bidirectional
interconnect between the segments. Other antifuses are
located between adjacent horizontal segments within a
track. When programmed, these 'horizontal antifuses'
connect the two segments to form a longer one. Similarly,
anti fuse elements are also located between adjacent
vertical segments.
Vertical and horizontal pass transistors are also shown
in Figure 3. These pass transistors are used only during
anti fuse programming and, so, do not playa role in device
performance. Inclusion of these control transistors allows
the routing tracks themselves to carry programming
voltages during the programming sequence, thus saving
additional wiring. The control transistors, in turn, are
controlled from a shift-register-based addressing scheme
in the chip periphery. During programming, antifuse
'addresses' are shifted into the chip thus setting the pass
transistors to the appropriate state to deliver the 18 V
programming voltage to the selected antifuse.
Interconnection
An antifuse element is programmed by the application of
a programming voltage (Vpp)' An efficient addressing
Table 1. Effectiveness of the ACT 1 logic module in
implementing common logic functions
Function
Module efficiency
(% of chip required)
ACT1 01 0
lCA3020
1200 gates 2000 gates
(from Xilinx)
1 of 8 decoder (74138)
8: 1 MUX (74151)
3-bit binary counter
8-bit serial to parallel SR
4-bit Johnson controller
w ClK EN
Vol 13 No 5 June 1989
4.1
1.7
3.4
5.4
7.8
6.3
3.1
7.8
2.7
3.1
AB
A
B
ctJctJC=PctJ
B
NAND: 2 inputs, 4 maps
MUX4: 6 inputs, 8 maps
A
AB
full flexibility
partial flexibility
Figure 4. Inputs to the ACT 1 logic module. Like conventional masked arrays, these can be routed from the top
or bottom of the cell since most macro functions can be
mapped into the logic module several ways. A function
with two inputs, A and B, can have several mappings with
v3rious locations of A and B, thus giving the router more
choices
scheme, utilizing wiring segments, pass transistors connecting adjacent segments, and control logic at the
periphery of the array, provides selective application of
programming voltage to anti fuse elements. Antifuse
addresses are shifted into the chip serially. Further
information on ACT 1 family devices and architecture can
be found in References 3-5.
Logic modules
A configurable logic module is the basic logic building
block of Actel's channelled gate array deVices. Depending
on the size of the gate array, the family of fieldprogrammable, channelled gate arrays contains between
295 (ACT1010) and 1258 (ACT1260) configurable logic
modules. The flexible logic module of the ACT1 01 0 has
eight inputs and one output. With a small size, the
module is similar in capability to a conventional gate array
macro and can be configured into over 150 different
functions. The design of the module was carefully chosen
for its efficiency in implementing both combinatorial and
sequential circuits and for its optimum utilization of
routing resources (see Table 1). Because JK- and D-type
flip-flops can be configured from the logic modules, no
dedicated flip-flops are required on-chip.
Also, like a conventional gate array, the module is
'double entry' in that inputs can enter from the top or
bottom of the cell (see Figure 4). Each of the eight inputs is
connected to a short vertical segment spanning one
routing channel. Four inputs span the channel above the
module and four span the channel below. The module's
output is connected to a vertical segment spanning
several channels. The benefit of double-entry symmetry
relates to routing efficiency: by letting the routing
algorithm choose module access (from above or below),
up to 95% gate utilization is attained.
307
Article Reprint 5
ACT1260
•
OJ
~
Actel ACT 1020
D LSI logic LL 7220
40
12
30
-5
10
L
QI
a. 20
•
90% of the time), otherwise a heavy
design burden is placed on the end user and the design
cycle is considerably lengthened. Minimum routing
requirements can be theoretically determined given the
number of logic modules which must be interconnected
and the average number of I/Os per module 6 . Figure 5
compares the routing resources available in ACT 1 family
chips with those found in field-programmable arrays using
the logic cell array (LCA) architecture and with the
theoretical minimum needed to achieve 90% probability
of successful completion of automatic placement and
routing (APR).
Gate count
Because Actel's devices have a channelled gate array
architecture with a very large interconnect capacity similar
to conventional gate arrays, device gate capacity is similar.
The ACT 1 family of gate arrays offers 1200, 2000, 3000
and 6000 gate sizes with up to 142 I/O pins. Unlike
programmable logic devices, where 'equivalent gate'
308
Data path
Timer/counter State machine
Arithmetic
Benchmark type
Figure 6. Benchmarking results comparing the effective
gate capacity of ACT 1020, 2000-gate array with a LL7720,
2200-gate array
values vary dramatically depending on the application,
these configurable gate arrays maintain consistent capacity
and utilization across a broad variety of application types
just as in conventional arrays.
In the absence of industry-standard benchmarks, Actel
has developed four benchmarking standards that can be
used to test achitectural capacity with respect to different
application types 2. These include a data-path circuit, a
counter/timer, a state-machine function, and an arithmetic
structure. Each of these benchmarks is of the order of
150-300 gates as measured in LSI logic gate array terms an industry accepted means. Figure 6 compares the
capacity (in terms of benchmark circuits) of the 2000-gate
ACTl 020 against the industry standard LL7220 channelled
gate array. This comparison clearly indicates that the
effective gate count of the field-programmable channelled
gate array is comparable with conventional devices over a
broad range of application types.
PERFORMANCE CHARACTERISTICS OF THE
ACT 1 ARCHITECTURE
The low resistance of the PUCE antifuse (typically
500 ohm) and the ACT 1 interconnect architecture
minimize the impact of programming elements on device
performance. In 90% of cases the structure of the ACT 1
modules and wiring scheme limit the number of antifuses
found in a propagation path to two. Routing software
limits the number of antifuse elements to a maximum of
four.
While specific application examples can be used to
illustrate performance, a more informative view can be
gained from an understanding of delays using a statistical
approach. Any given path in a field-programmable array
can have a range of delays depending on how it is actually
routed on the chip. By taking a specific path and routing it
many times, a distribution of delays can be formed. The
resulting distribution illustrates the range of path delays
which can be expected over a whole series of applications.
The characteristics of this distribution provide a more
general view of performance than that from looking at
specific examples. If the distributions are very narrow,
delays will predictably fall into a narrow range. Conversely,
Microprocessors and Microsystems
II
Six level pipeline logic
CL~~Q
ACTl implementation
eight logic modules, five nets
CL~~Q
XC2018/70 implementation
five logic modules, four nets
Figure 7. Segment of pipeline circuit used for performance
modelling has six logic levels between flip-flop elements
20
i
w
A ACT 1010
• XC 2018170
15
10
-
Highly reliable one-time
programmed connection
-
10 times smaller than SRAM
cell
-
10 times smaller than EPROM
cell
-
Only 2-3% need be
programmed
PUCE poly
o~~~~~~~~~~~~~~~
42
50
58
66
74
82
90
98
106 114 122 130
Worst case delay (ns)
Figure 8. Results of Monte Carlo simulation showing
delay distributions for 50 placement and routings of six
logic-level pipeline path
the wider the distribution, the more likely it is that
unexpectedly large delays will occur during application
development. Such delays may not be revealed until the
design process is well under way, making early preroute
performance estimations of little value.
A convenient way to create such distributions is to use
the Monte Carlo simulation method that is commonly
used in the physical sciences, In this method, known
distributions of a set of independent variables are used to
create a distribution of a function of the independent
variables. For FPGA performance evaluation, actual net
delay statistics gathered from placement and routing a
number of designs can be used to simulate delays for a
complex logic path consisting of a number of nets. Figure
7 shows two paths, one implemented in a 2.0,um ACT 1
family array and the other implemented in a 1.2,um
XC2018 LCA. Each logic function separated by a wire is
implemented in a separate logic module. Such paths are
common in pipelined logic structures. Differences in logic
module capacity dictate the difference in the number of
logic modules used to implement the two cases, In
preparation for the path delay simulation, net delay
statistics were gathered from actual placement and
routings of nets with a fan-out of two. As in masked arrays,
nets in field-programmable devices with low fan-out tend
to be faster than nets with high fan-out. Figure 8 shows the
delay distributions for 50 placements and routings of the
two paths as generated by the Monte Carlo simulation
technique.
Vol 13 No 5 June 1989
Figure 9. Scanning electron microscope cross-section of
PLiCE antifuse showing conductive layers of polysi/icon
and diffusion separated by insulating layer
PLiCE antifuse
The PUCE antifuse is a programming link that has been
developed for use in integrated circuit applications.
Unlike a traditional fuse approach where programming
creates an open circuit, programming an antifuse creates a
very low resistance connection.
While many antifuse approaches have been developed
to date, most need very high breakdown voltages and
currents, are difficult to manufacture or do not meet the
reliability requirements of state-of-the-art integrated
circuits.
PLICE does not have these drawbacks and is the first
antifuse that can be closely controlled during manufacture
to deliver consistent and reproducible characteristics.
PUCE antifuse elements consist of a dielectric sandwiched
between two conductors (Figure 9). When a voltage is
applied across the element, the dielectric breaks down
and connects the conducting materials together. The
PUCE antifuse adds only three masking steps to a
conventional double-metal CMOS process and can be
produced in a typical CMOS fabrication facility using
conventional material and standard processing equipment
and technology.
The PUCE element is a two-terminal, non-volatile, onetime programmable device that normally exhibits a very
309
Article Reprint 5
Table 2.
PLiCE antifuse characteristics
Programming voltage (Vpp )
Programming time
Programming current
On resistance
Off resistance
18V
<10ms
100M ohm
high impedance (> 100M ohm) in the unprogrammed
state. When an 18 V programming voltage (Vpp) is applied
across its terminals, the bidirectional low-resistance
connection is established. Programming current is < 10 mA.
Conveniently, in an array based on anti fuse elements no
more than 3% of the available antifuse elements need to
be programmed. The characteristics of the PUCE antifuse
are listed in Table 2. For further information on the PUCE
antifuse see Reference 7.
Reliability
The reliability of the PUCE technology has been
demonstrated. Because an antifuse element possesses
both an 'on' state and an 'off' state, PUCE antifuse
reliability depends on the reliability of each state. The
reliability studies are of two types:
• the PLICE antifuse reliability by itself
• reliability of products containing PUCE antifuses.
The reliability of isolated circuit elements such as
transistors, diodes or antifuses are normally studied with
special test structures under operating conditions that
significantly accelerate failure mechanisms. Studies of this
type reveal reliability levels that would take years to
demonstrate with products operating at their environmental extremes.
To confirm the reliability of the 'off' state, accelerated
data were collected on the equivalent of millions of PUCE
antifuse device hours. The data confirm that the time to
failure for a PUCE anti fuse at normal operating conditions
is substantially more than 40 years and the combined
contribution of all antifuses to the gate array product's
hard failure rate is <1 FIT (failure in time; 0.0001%/1000 h).
In the case of the reliability of the 'on' state, accelerated
measurements show no intrinsic failure mechanism.
Product reliability depends not only on circuit elements
such as the PUCE antifuse, but also on other features of
the manufacturing and assembly process. Product reliability
studies show reliability levels normally encountered with
CMOS circuits.
Product reliability data from a sample of 614 devices
subjected to 125°C dynamic burn-in life testing for more
than 762 000 device hours indicate no observed PUCE
antifuse failures or change in AC characteristics, implying a
device failure rate of 82 FIT at 70°C.
wiring channels, are free of defects such as opens and
shorts.
Actel has implemented within its ACT gate arrays
testing circuits and techniques to permit complete testing
of every logic and I/O module, all vertical and horizontal
tracks, addressing and decoding circuits, all programming
circuits, and the integrity of all antifuses in the array. In
essence, ACT 1 arrays allow 100% testability of all
circuits.
The testability technique consists of a driving phase, a
sense and latching phase, and a shift out and compare
phase. In the driving phase, the chip is first driven into a
required test mode and then loaded with a specific test
pattem. This technique permits any number of tracks to
be driven simultaneously with any desired voltage that
suits the test pattern. In the sense and latching phase,
special circuits on the periphery of the chip are used to
sense the required signal and then latch them into the
same circuits. During the final phase, the test results are
shifted out and compared with expected test results.
Over 700 000 test patterns are applied to each 2000gate chip by the test equipment in order to exercise and
test the chip before configuration by the user. All aspects
of the chips are tested including the integrity of unprogrammed antifuses. As a result of the extensive factory
testing, the user is not responsible for generating test
vectors to screen devices for manufacturing faults .
DESIGN AUTOMATION SYSTEM
Productive application of field-programmable gate arrays
requires the availability of a sophisticated design automation system. The designer must rely on software tools
for nearly every aspect of the development cycle, from
design entry to performance analysis.
The design automation system, action logic system
(ALS), is comprised of three major components: the logic
design system; diagnostic and debug software; and the
Activator hardware for programming and functional test.
The capabilities of the logic design system include
schematic capture with the Actel macro library, simulation,
timing analysis, electrical rules checking and placement
and routing.
The design automation process for the ACT 1 family is
analogous to that of mask-programmed gate arrays (see
Figure 10). The designer begins with a schematic capture
phase that is followed by simulation. The simulated net
list is then transferred to the Actel-developed portion of
the system where the user makes his physical I/O
1 Schematic
2 Simulation
4 Design
validation
TESTABILITY FEATURES
Testing a one-time-programmable gate array requires
careful attention to on-chip testability features. All
antifuse connections are normally open; no routing
connections will be established until the user's application
circuit is programmed into the chip. The ACT 1 architecture
allows all active modules to be tested by adding
appropriate testability circuits. In addition, testability
circuits ensure that all passive circuit elements, such as the
310
Figure 10. Design sequence for ACT 1 family fie/dprogrammable gate arrays
Microprocessors and Microsystems
II
Single module macros
SOA
S1
S1 SO
1~
A
OR AND gates (Module count = 1)
MXT
D2
Y
D3
A~
g
B
Y~~X4
Y
OA2
AND XOR gates
A=Q(MOdUle count = 1)
B
C
~
~
~~
=t:J
~
Figure 11. Selection from Actel macro library showing
number of logic modules required for each function
assignment and perfonns an electrical rule check (validation).
The next step is placement and routing, followed by
timing analysis using back annotation. The remaining
steps include programming with programming hardware,
debug and functional test.
The initial task facing a user of the ACT 1 family is to
implement his logic function using the Actel macro library.
In this respect, the macro library becomes a major feature
of the user interface to the Actel devices. In practice, a
thorough knowledge of the macro library is more
important than a detailed understanding of the underlying
architectural features. The user can safely disregard details
such as the number of wiring tracks per channel or the
exact nature of the logic module, since they are successfully
hidden from the user of ACT 1 arrays by the complete
automation of the placement and routing process. This
minimizes both the time required to learn the system and
the length of the development cycle. ALS users claim that
a finished part can be available as little as 1.5 h after
completion of schematic capture 8 .
Library macros are implemented by making a specific
connection to the logic modules. These macros, similar in
appearance and function to those of conventional gate
arrays, provide the primary logic entry interface for ACT 1
devices. Each single logic module implements macros
ranging in complexity from simple gates to transparent
data latches and 4-to-1 multiplexers. All Actel gate arrays
are supported by a macro library of over 200 standard logic
functions. Also included in the library is a selection of commonly used TTL functions. Figure 11 shows some macros
which can be implemented in a single logic module.
Unlike conventional gate array libraries, the Actellogic
modules implement logic functions with inversions
on inputs as efficiently as non-inverting inputs. More
importantly, they do so with no increase in propagation
delays. Actel's library includes gate macros and any
combination of 'negation' bubbles on the inputs. Thus,
inverters and associated wiring, generally, are no longer
needed.
Perhaps the most interesting aspect of the design
automation system is the placement and routing algorithm,
which is specific to Actel's channelled gate array architecture. The placement and routing software which maps
a logic description to the physical device resources (the
logic modules, wiring segments and interconnect elements)
Vol 13 No 5 June 1989
is vital to effective use of a field-programmable gate array.
Ineffective software in this step substantially lengthens
the development cycle and decreases device utifization,
thereby increasing the cost per function delivered.
As mentioned above, the architecture supports a
design style similar to conventional mask-programmed
gate arrays. The system produces fully automatic placement and routing with device utilization levels of 85-95%.
In fact, Actel's algorithms frequently provide fully automatic placement and routing even at 100% module
utilization for the 1200- and 2000-gate devices. Thus,
designers need no longer spend additional time manually
completing placement and routing for their complex
designs. By eliminating this manual operation, design time
can be cut by weeks, improving productivity and lowering
overall system development costs.
The impressive performance of the placement and
routing operation is made possible by a channelled gate
array architecture with ample routing resources available
through the use of PLiCE antifuses as interconnect elements.
This architectural approach permitted Actel to develop
advanced software by building on years of algorithms
development for mask-programmable channelled gate
arrays.
The placement and routing algorithms operate in three
phases. The first phase includes constructive placement
and placement optimization similar to conventional gate
arrays. The second placement optimization improves the
characteristics of the placement using a simulated
annealing-style algorithm prevalent today in high-end
gate array routing software. The final phase, detailed
routing, is unique to the Actel architecture.
Within the diagnostic software, special debug software
with commands similar to those of a simulator are
available to the user. Also available under the diagnostic
subsystem are in-circuit diagnostic capabilities that permit
the user to obtain realtime information about the state of
the device while it is running in the actual product
environment.
A unique feature of Actel's diagnostic and debug
software is that it takes advantage of two Actionprobe
diagnostic circuits within the chips. The Actionprobe
circuits can connect any two points electrically in the
device to external pins (Figure 12). The connections are
not implemented by the one-time programmable antifuses,
A 100% observable circuit
D
01---&-----1
elK
Two programmable probes
> Probe A; OOANDB
> Probe B;
00
Figure 12. On-chip diagnostic probes are implemented
in ACT 1 devices and are accessible through the design
automation system
311
Article Reprint 5
but instead bypass transistors in the array so that the
probe points can be changed any time by commands
typed at the keyboard. Probing provides 100% observability
of all intemal nodes so that the user actually observes
electrical activity normally hidden within the gate array.
With the use of a special socket, 100% observability is
available even when a device is plugged into the user's
system and is operating at speed. Efficient verification and
debug can take place as the system generates millions of
stimuli per second for the device under test. The
Actionprobe feature shortens design verification and insystem debug; it also reduces the burden of test vector
generation by substituting the use of in-system stimuli.
The Activator is comprised of hardware and software
for programming and functional test. The Activator facilities
can be used to perform a number of specific tests in
addition to programming. For example, every chip is
tested by the Activator before programming to ensure that
the device is actually unprogrammed. In addition, the
Activator performs a verification test to ensure that
antifuses have programmed correctly. Programming time,
including the verification phase, is typically 5-7 min for a
fully utilized 1200-gate device and 8-12 min for the 2000gate part. Improvements in the programming algorithm
and updated programming hardware will permit the 6000gate devices to be programmed in under 10 min.
Programming times are short since few of the antifuses
need to be programmed to utilize a device fully. The
Activator system can also be used for functional testing
and capturing test vectors for simulation from known
good devices.
The ALS is currently available on a 80386-based PC and
provides an interface to, and includes, Viewlogic's
schematic capture and simulation capabilities. The
recommended hardware configuration is a standard 386based AT with 4 Mbyte of extended memory, a 40 Mbyte
hard disc, one serial and one parallel port. Users will use
Viewdraw for schematic capture and Viewsim for simulating
their designs. Actel's logic design system, activator and
diagnostic software are included with this version of the
ALS. Mentor Graphics workstations are also supported as
well as the OrCAD schematic capture system.
and flexibility of mask-programmable gate arrays with the
convenience of field programmability. Sophisticated
design verification capabilities have been made available
to users through new hardware- and software-based
diagnostic features, and design productivity is increased
by the development of novel placement and routing
algorithms that deliver fully automatic operation.
REFERENCES
2
York, T 'Gate array architectures' Microprocessors
Microsyst Vol12 No 6 Ouly/August 1988) pp 323-330
Osann, R et al. 'Compare ASIC capacities with gate
array benchmarks' Electronic Design (October 13,1988)
P 93
3
4
6
7
8
9
EI Gamal, A et al. 'An architecture for electrically
configurable gate arrays' IEEE Custom Integrated
Circuits Conf. 15.4.1 (May 1988)
ACT 1 family gate arrays, product information Actel
Corporation, Sunnyvale, CA, USA Oanuary 1989)
EI-Ayat, K et al. 'A CMOS electrically configurable
gate array' Int Solid State Circuits Conf. Digest of
Technical Papers (February 1988) pp 76-77
IBM J. Design Automation and Fault Tolerant Computing (May 1979)
Hamdy, E et al. 'Dielectric based anti fuse for logic
and memory ICs' Proc. Int Electron Devices Meeting
(December 1988) pp 786-789
Leonard, M 'Digital system design' Electronic Design
(january 12, 1989) P 88
Mohsen, A 'Desktop-configurable channelled gate
arrays' VLSI Systems Design (August 1988)
Andrew Haines is the
director of marketing at Actel
Corporation.
I
CONCLUSION
The PUCE anti fuse makes possible a new field programmable device which combines the high density, speed
312
Microprocessors and Microsystems
Article Reprint 6
TESTING
ANTIFUSE-BASED
FPGAS
by Khaled EI-Ayat, Andy Haines, Ken Hayes
II
Reprinted from
ELECTRONIC ENGINEERING TIMES
May 15,1989
This week:
Monday
Progra....ble logic
May 15,1989
Issue 538
Technology Trends liftout,
after page 42.
A CMP Publication
THE INDUSTRY NEWSPAPER FOR ENGINEERS AND TECHNICAL MANAGEMENT
FPGAs
Tes
By Khaled EI
quired, comparable to those found in a
conventional array. These are arranged in
horizontal rows and vertical columns.
Horizontal routing tracks alternate with
vent of even higher
the rows of logic modules. Vertical routing
density field-protracks run over the top of the logic modules
grammable gate arrays (FPGAs), the
and horizontal tracks. Around the outside
and
issue of thorough
of chip are 110 circuits, test circuits, progranuning circuitry and a shift-registerKen Hayes,
factory testing will
based antifuse addressing scheme.
become evermore
Product Engineering important. That's
The Actel-invented Plice antifuse was
Manager
because
inadspecifically developed for making logic
interconnections. An antifuse is any
Actel Corp.
equate device testprogramming element that is normally
Sunnyvale. Calif. ing prior to use
_ _..:.-_ _ _-J places a severe
open and makes a connection upon the
application of a programming sequence.
burden on the system manufacturer.
The antifuse has a natural advantage
To try to compensate for inadequate
over fuse-based programming with refactory testing, the system manufacturer
spect to testability. Normally open in
can institute testing at incoming inspecthe unprogrammed state, antifuses do
tion-a difficult task for complex LSI denot short chip circuitry together.
vices. If he doesn't, he must rely on boardOn the contrary, all elements of the
level testing, or worse, system-level
chip can be viewed as isolated building
testing. And, inevitably, some faulty deblocks. By building in special on-chip
vices will escape to become field failures. It
circuitry, all architectural elements can
is weD-known the cost of finding failures
be thoroughly tested prior to shipment
rises dramaticaUy at each of these stages.
and programming. Much of this circuitry
Masked-programmed gate arrays are atis also required for programmability and
tractive for their low recurring costs, high Non-voilltli. testing
gate counts and speeds. However, one Wrth at least one non-volatile, one-time pro- normal chip operation, so that testabilittle discussed disadvantage is that the granunab1e programming teclmology, a degree lity overhead is modest.
This multiplicity of function is especially
quality of test coverage is the responsibility of testability comparable to reprogrammable
of the user. A thorough job of test-vector devices is possible. Actel's ACT 1 arrays allow evident for the routing tracks. Routing
generation requires the use of fault grading for special test modes that perfonn fimctional tracks both horizontally and verticaUy are
or generation of automatic test vectors. testing of unprogrammed devices at essentially broken into segments of various lengths.
This makes possible the use of segments of
Neither is very common in practice. Fault 100 percent fauk coverage.
This testability is independent of the appropriate lengths for interconnections.
grading is generaUy expensive and adds
complexity to the design process. Auto- large number of equivalent gates (l,200 in Pass transistors are present between
matic vector generation is simplified if a the ACT 1010 and 2.000 in the ACT 1020). these segments and can be used to form
scan-design technique is adopted, but few In order to describe how this is accom- temporary connections between adjacent
engineers are wiDing to conform to the plished, a few points about the ACT 1 segments.
This aUows the routing tracks to be used
required design constraints. Automatic architecture need review.
The basic building block on the chip is for carrying programming voltages, acvector generation without scanning techthe logic module, which. is programmable cessing chip elements for testability purniques is stiU a developing discipline.
However, poorly tested devices are the and capable of implementing all two-input poses, as weD as providing their primary
likely result of inadequate fault coverage. It logic functions, most three-input functions, function as logic routing resources. The
has been shown that typical masked gate and some four- and six-input functions. A pass transistors are not used during normal
arrays with 70 percent fault coverage are single module implements a variety of device operation and do not contribute to
likely to have 0.1 to 0.2 defects per device. latches, and two modules implement more logic-signal propagation delays. The acMany field-programmable devices largely than 32 types of D and 1-K flip-flops. Each companying figure shows a simplified view
alleviate this problem because they can be mod1,l1e is effectively the same complexity of logic modules, routing tracks and pass
tested at the factory. Such factory testing, as a conventional gate-array macro. The transistors.
characteristic of standard products, generaUy ACTl020 contains 546 of these logic modimproves over time, resulting in high quality ules. In order to achieve 85 percent to 95 In the test mod..
levels without significant involvement by the percent utilization of these numerous small The architecture of the ACT 1 arrays aluser. This has not always been the case, building blocks, a large number of routing lows outstanding testability of unprohowever. Early one-time programmable de- tracks and progranuning points are re- granuned devices. All elements of the chip
Ayat,
Engineer
and
Andy Haines,
Director of Marketin
Chief
With the explosion
~i;~~:~ !~ns~z
vices using fuse-based technology often disappointed users with functional defect rates
of more than 10 percent for those parts that
programmed successfully.
This has been improved substantially by
the addition of on-chip test circuits to defect
rates of 0.5 percent to 1 percent. Whi1e this
represents a low defect rate for individual
chips, putting 10 such chips on a single board
contributes to a 5 percent to 10 percent
board-failure rate. Further improvement is
difIicuIt due to the fact that in conventional
one-time programmable architectures, complete functional testing of all circuits in an
unprogrammed part is not possible. In such
fuse-based architectures in the unpr0grammed state, many circuits are shorted
together by the unprogrammed fuses.
The introduction of reprogrammable logic devices based on EPROM, E'PROM and
RAM programming elements has raised
functional quality levels to nearly 100 percent. However, defect rates do not fall to
zero because of post -testing failure mechanisms such as incorrect progranuning,
marking, insertion and ESD damage.
can be thoroughly tested at the factory with
the exception -of the programmability of
specific antifuses needed for a particular
user-defined application. However, these
can be automaticaUy tested during the programming sequence. Thus, a complete test
is performed on each device without the
need for user-generated test vectors.
Testing begins with the shift-registerbased addressing scheme. Through the
shift register, commands can be shifted
into the device, enabling special modes
such as programming and functional testing. In addition, the shift register can address specific antifuses, logic module outputs, and read or drive voltage levels on
specific routing tracks. The shift register
can be both up-loaded and down-loaded by
means of a serial shift clock and a data pin.
This setup allows various patterns to
be shifted in and out again, verifying full
shift-register functionality. With the
shift register operational, all vertical
and horizontal routing tracks can be
tested for continuity and shorts. These
tests also ensure that all vertical and
horizontal pass transistors wiD turn on.
Further tests on these pass transistors
are performed to ensure that they have
satisfactory parametric values.
The ACT 1 arrays contain a dedicated
clock buffer that crosses the chip in the
horizontal routing channels. This buffer can
be tested by driving the chip's clock pin and
reading the proper levels on the sides of
the array via the shift register.
The arrays have two special pins that can
be used as user 110 channels or, in the test
mode, as outputs of two on-chip probes
that can route the output of any logic module to the external pins. This is possible on
either programmed or unprogrammed
parts, and selection of probe points does
not require the programming of antifuses.
In fact, the probes are reprograrnmable and
can be steered under software control to
any two points in the chip.
These probes are also available to the
user through design-verification software,
which is a standard part of the Actel design
system. The probes allow a great many
tests to be performed that would otherwise
be impossible.
For example, all input buffers can be
tested by driving them low on the outside
of the chip and probing the internal side of
Article Reprint 6
the buffer. Built-in test modes also pennit
driving all output buffers to high, low or
high-impedance. So it is possible to test
Vol, Voh, 101, loh, Vii, Vih and leakage on
all I/O lines.
One of the keys to thoroughness with
which ACT arrays can be tested is the
module test. Logic modules, in their unprogrammed state, can be viewed as eightinput/one-output combinatorial functions.
By using the horizontal and vertical pass
transistors, any of the eight module inputs
can be forced to a high or low. The output
of the module can then be read through
either probe pin. In this way, every logic
module on the chip can be tested with a 100
percent fault-coverage vector set. No antifuse programming is required to perform
these tests. In addition, the architecture
allows modules to be tested in parallel for
reduced test time.
tracks on chip. This, in turn, requires numerous programming elements. The small
size of the Plice antifuse makes sufficient
numbers possible. The ACT 1020 chip,
with its 2,OOO-gate equivalents, contains
186,000 antifuse programming elements.
Surprisingly, only 2 percent to 3 percent
need be programmed to fully utilize a chip.
Nonetheless, antifuse testing must play a
significant part in the overall device test
methodology.
The first antifuse test is the blank test.
The entire array of antifuses is easily tested to ensure that antifuses are in their
normally open state. The next and most
important antifuse test is the accelerated
life test. Dielectric breakdown in Plice antifuses is a strong function of voltage, and
this effect can be used to perform an effective life-test acceleration screen in place of
the more commonly used temperature-acceleration approach.
AIdIfIIM
In addition, the convenience of a voltageThe consistently high utilization and acceleration factor facilitates the perforflexibility of ACT 1 FPGAs is due in large mance of a life-test screen on every. chip
part to the copious number of routing and its antifuses as part of the normal test
INtI.,
sequence. The acceleration voltage is applied through the programming voltage pin
(VPP). After this test, the blank test is
again run to detect any antifuses that have
failed.
This test is very effective at catching
antifuse defects. Using devices screened in
this way, more than 700,000 device-hours
have been completed in 1250 C dynamic
burn-in reliability testing with no observed
antifuse failures.
Several tests are used to confirm that
programming circuitry works correctly.
The first such test is a basic junction stress
and leakage test. The programming mode
is enabled, and the programming voltage
(VPP) plus a guard band is applied to the
chip. No voltage is applied to the antifuses.
If programming current (lPP) exceeds a
normal value, the device is rejected. The
functionality of the programming circuitry
can be further verified by programming
various antifuses associated with special
test structures and other non-user accessible features.
A number of logic modules exclusively
available for test purposes are connected to
each other by progr.amming antifuses. This
structure also can be used for AC-performance characterization of the chip. In addition, a number of antifuses are configured
as a readable signature word. Bits in this
word are programmed with factory-{fefined
patterns. Successful completion of these
tests ensures basic functionality of on-chip
programming circuitry.
The description of the numerous ACT
1010 and 1020 test modes attest to the
outstanding testability of these devices. As
a result, statistical sampling currently
shows a programming yield of better than
98 percent.
Even more important, for all sampled
parts that successfully passed programming, no functional failures have been observed.
These tests were conducted with designs that utilized 92 percent to 97 percent
of the logic modules. Although this utilization is high, it is not infrequent among
users of Actel arrays.
955 East Arques Ave.
Sunnyvale, CA. 94086
408-739-1010
•
Copy"ght~
t 989 by CMP Publlcalions, Inc., 600 Community Drive, Manhasset, NY 11030. Reprinted Irom Electronic Engineering Times with permission.
Article Reprint 7
COMPARE ASIC CAPACITIES
WITH GATE ARRAY BENCHMARKS
by Bob Osann, Abbas El Gamal
II
Reprinted from
ELECTRONIC DESIGN
October 13, 1988
DESIGN APPLICATIONS
FOUR BASIC ApPLICATION CIRCUITS ACCURATELY
PREDICT IF AN ASIC DESIGN CAN BE IMPLEMENTED
IN A SPECIFIC GATE ARRAY
COMPARE ASIC CAPACITIES
WITH GATE ARRAY
BENCHMARKS
hen evaluating a gate array's capacity, designers
typically concentrate on only the circuit they are
working on. The trouble here is that this evaluation says nothing about whether the gate array
will implement the designer's next circuit, or the
one after that.
A functional method of testing gate array capacities must consider the characteristics of the different circuits being implemented. ASIC designers must
know if a circuit can fit into a certain gate array. The solution: benchmarking,
which takes routability and application differences into consideration. Toward
this end, Actel has developed a set of application circuit benchmarks that give
design engineers accurate predictions of ASIC capacities.
ASIC capacities are currently measured in terms of equivalent gates. An
equivalent gate is a two-input NAND gate made up of an array of unconnected
transistors. For example, manufacturers of CMOS gate arrays calculate the
amount of equivalent gates simply by counting the number of transistors and
dividing by four-the typical number of CMOS transistors per gate. This approach to calculating gate-array capacities is fairly reliable and consistent within a limited class of devices, but it doesn't relate to a meaningful standard that
I
I. AMEASURE FOR a benchmark circuit unit is taken by mapping the unit into a gate
array as often as possible in a step-and-repeat procedure.
BOB OSANN AND ABBAS EL GAMAL
Actel Corp., 955 East Arques Ave., Sunnyvale, CA 94086; (408) 739-1010.
Article Reprint 7
ASIC
BENCHMARKS
.2. 248 GATES ARE needed to implement the timer-counter benchmark circuit. This timer is typical of microprocessor-based systems.
applies across all ASIC types.
A better, more accurate measure
of ASIC capacity is a benchmark that
considers application differences.
Considering the broad variation of
applications, creating these benchmarks isn't a trivial task. For example, some applications contain a
great deal of combinatorial logic
while others are mainly sequential.
These inconsistencies mean that certain application circuits map into different ASIC gate-array types better
than others.
The key to making efficient use of
an ASIC's available gates is routability. Because all circuits aren't equally routable, numerous benchmark
circuits are needed. Another factor
in routability is the ratio of combinatorial to sequential logic-an important consideration for an ASIC that
comes with a fixed number of flipflop resources.
PIN-PER-GATE COUNTS
One way to roughly evaluate a circuit's routability is to consider how
many macro pins-per-gate each macro function in the circuit employs. A
"macro pin" refers to the input and
output connection points on a macrologic function-it has no relation to
an I/O pin on the device package. Se-
quential logic tends to use many
gates and relatively few pins per
gate, while combinatorial logic uses
a relatively high number of pins per
gate approaching the maximum of
three (two-input NAND).
In general, the greater number of
pins per gate, the more difficult the
circuit is to route. However, the pinper-gate routability measure is limited: An average pin-per-gate figure
doesn't convey the local congestion
that can arise in certain types of applications, putting excessive strain
on an ASIC's routing resources.
A useful evaluation is to actually
route the various benchmark circuits
into gate arrays. This approach is especially effective when applied to
ASIC architectures with routing resources less abundant than those of
channeled gate arrays. Specifically,
the benchmark circuits must offer
divergent pin-per-gate ratings to
give a valid picture of routability for
different applications.
Because most real-world applications include a variety of circuit
types with different routability levels, filling a chip with one type of circuit might seem artificial. This is
why there are four different benchmark circuits each mapped into one
chip. But even one application bench-
Reprinted with permission from ELECTRONIC DESIGN· October 13.1988
mark circuit alone gives a realistic
measurement as most application
circuits tend to be bimodal-their sequentiallogic and combinatorial logic generally have very different pinper-gate measurements. The comb inatoriallogic sections are commonly
routing-limited, while the sequential
logic sections are not. The different
benchmark circuits measure the
ASIC's ability to deal with each type
of logic.
An ideal benchmark circuit unit
should be small enough to fit into an
ASIC many times without leaving a
sizable amount of unused logic. On
the other hand, the units must be
large enough to realistically create
the routability challenges of actual
application circuits. Also, the circuits
should be designed such that no I/O
limitations are encountered, allowing the benchmark focus to remain
solely on logic capacity.
Any useful benchmark circuit
must relate to some type of standard. The standard against which
Actel measures its units is an LSI
Logic gate array. Among the most
widely used gate arrays in the industry, LSI Logic parts have become de
facto industry standards. With circuits constructed from LSI Logic
macros, capacity requirements in
Copyright 1988 VNU Business Publications. Inc.
•
ASIC
BENCHMARKS
terms of two-input NAND equivalents can be quickly computed in a
clear way.
To take application variability into
account, benchmark circuits must be
designed to represent all the different types of application circuits. The
CALC (comparative ASIC logic capacity) benchmark developed by Actel is actually a set of four circuits, including a data path, timer-counter,
state machine, and an arithmetic
unit. The CALC benchmarks are an
attempt at a broader, more disciplined determination of an ASIC's
capacity than an equivalent gate
reading.
STEp·AND·REPEAT
Each benchmark circuit is used to
create a measure by mapping it into
an ASIC as many times as possible in
a step-and-repeat procedure (Fig. 1).
The results demonstrate how well an
ASIC accommodates the type of circuit represented by the benchmark.
The number of successful iterations
supply a relative measure of device
capacity when the same benchmark
circuit and procedure is applied to
several devices.
The first CALC benchmark is the
data-path circuit unit. Most boardlevel designs-even those that heavily utilize programmable logic devices-use a significant number of
data path devices. The purpose of the
data-path circuit benchmark is to
steer signals from one point to another, and to occasionally store the signals. Common TTL datapath functions include octal registers and
latches, buffers, transceivers, and
multiplexers. Shift registers used to
perform parallel-to-serial and serialto-parallel conversions are also considered a form of data-path circuit
function.
This data-path benchmark circuit
employs a 4-to-1 multiplexer, an 8-bit
register, and an 8-bit shift register.
The octal4-to-1 multiplexer-similar
to an LS153 multiplexer-selects 8
bits from a 32-bit input bus. This
function approximates the many dynamic-RAM-based systems in which
a 4-to-1 multiplexer selects between
the processor address and the refresh address. Eight of the 32 data
BENCHMARK CIRCUITS IMPLEMENTED
bits entering the multiplexer are
kept separate in order to allow data
to be received from a preceding
benchmark unit when placing multiple, daisy-chained circuit units into a
device.
The outputs of the multiplexer
feed a continuously-clocked, 8-bit
holding register. The register then
feeds an 8-bit parallel-in/serial-out
shift register, similar to the popular
LS166 register used in many video
designs.
The benchmark shift register also
has a parallel 8-bit output used for
the daisy-chain connection to the
next benchmark unit. The shift register's operation is controlled by its
Shift/Load signal, which connects to
the Clock Enable inputs of the register's flip-flops.
The pin-per-gate distribution indi-
To other
benchmark units
3. THIS THIRD circuit, an averagesized controller/sequencer, contains a
large amount of non-standard logic. The
outputs of this state machine are
determined by its state diagram (see the
insert).
cates many sequential macros with
0.5 pins per gate, and on the other
end of the scale, 4:1 multiplexers
with the maximum of three pins per
gate. When implemented in an LSI
Logic LL7220 gate array, each datapath benchmark circuit needs about
157 gates.
The second benchmark circuit is
the timer-counter (Fig. 2). This particular circuit is typical of the many
timer functions found in microprocessor-based systems. While dedicated timer-counter chips such as the
8253 are popular, they are too general for many ASIC applications and
would require too many gates. A better solution is to build a specific timer-counter function.
The multiplexer used in this circuit
is similar to an LS157 multiplexer
and allows the LS161-type counter to
Article Reprint 7
l,i'iia:l,jQQliH,jii'i:~1
ASIC
BENCHMARKS
4. THE ARITHMETIC CIRCUIT, a basic multiplier/accumulator, employs the
highest number of gates out of all the benchmarks-295.
be loaded in two different ways: directly from incoming data or from
the preset value register. The desired preset data is loaded into the
counter when the "equal-to" comparator indicates a match between
the counter's state and the value in a
compare register.
Both the preset value and the compare value registers are similar to
LS377 octal registers with Clock-Enables except that these also have an
asynchronous reset function. The
timer-counter's basic sequence of operations (count, compare, load) is re-
5. ACOMPARISON OF pin
1
density distributions Is a good measure of
routability. Hllh plns·per-gate circuits
are generally difficult to route.
peated endlessly.
The timer-counter circuit employs
an intense amount of registers and
exclusive-Or (XOR) gates. Registers
hold the preset and compare values
as well as implementing operation of
the counter. The XOR gates, which
are low pin-per-gate functions, are
used in the counter and compare
functions.
In the pin-per-gate distribution,
the sequential functions have less
than one pin per gate, and the simple
gates have about 2.5 pins per gate.
The timer-counter benchmark circuit
requires about 248 gates to implement in an LL7220 gate array.
Benchmark number three is the
state machine circuit representing
an average-sized controller-sequencer that's found in many PLD applications (Fig. 3). Because of the large
amount of nonstandard logic involved, this is an important application for an ASIC.
This third benchmark circuit is a
Mealy machine, whose outputs are
determined from the total state of
the system. The outputs are registered and clocked to produce synchronous output data. The machine
has eight states and goes through 12
transitions.
The circuit's distinctive feature as
a benchmark is its use of many wide
decode stages (12 bits wide) in the
combinatorial logic section, which
demands a relatively large number
of pins per gate. Consequently, this
benchmark tests for how well an
ASIC can handle highly congested
circuits with high fan-in on each macro. Because they generally have a
limited number of internal pins per
module, the circuit poses a challenge
to module-based ASICs.
Many LS273-type registers are included on this benchmark circuit.
The combinatorial portion of a typical state-machine circuit has an extremely high pin-per-gate rating.
The state-machine benchmark utilizes about 153 gates in an LL7220
gate array.
The final benchmark is the arithmetic circuit, more specifically a
multiplier/accumulator (MAC) (Fig.
4). This circuit consists of a multiplier, an 8-bit adder, and an 8-bit register. Such arithmetic circuits are employed in FIR filters and other DSPtype applications.
A 4-by-4-bit multiplier accepts either outside data or data from a preceding benchmark circuit unit. The 8bit multiplication results are added
to previous accumulator results, and
then the current 8-bit accumulator
result is moved to the output register. The I-bit MAC input controls a
gating function on the adder's input.
The MAC circuit represents a combination of circuit types that test an
ASIC's ability to handle logic with
both high and low pin-per-gate
counts. The adders, as combinatorial
functions, are high pin-per-gate sections. However, the registers used in
both the multiplier and the output
register are low pin-per-gate functions.
The largest pin-per-gate frequency (three pins per gate) generally occurs with this type of benchmark circuit, owing to the adder tree used to
make up the array-multiplier (Fig.
5). The sequential content of this
benchmark has less than one pin per
gate, as expected. Although one of
the simplest benchmark circuits
from a conceptual point of view, the
arithmetic benchmark requires the
most gates in the LL7220 gate array-295.
STANDARD RESULTS
The CALC benchmark results for
LSI Logic gate arrays can be compared to the results for other ASICs
supplying a reliable standard. The
II
1,lfi'H:I,jQQ,iKliilll:~j
ASIC
BENCHMARKS
LL7220, one of the industry's most
successful arrays, is commonly referred to as a 2000-gate devicethough it actually contains 2224 2-input NAND gate equivalents. With
an 85% usability rating, the LL7220
supplies 1890 usable gates.
Actel's ACT1020 desktop-configurable, channeled gate array
scores about the same as the LSI
Logic LL7220 channeled gate array
for all but one of the benchmarks
(see the table). Therefore, even
though the ACT1020 gate array implements an application circuit in
logic modules instead of interconnected transistors, the device can be
thought of as a 2000-gate gate array.
These results were produced by
first stepping and repeating the
benchmark circuit, then using an
automatic router to map the circuit
into the device. The use of autorouting is critical because although it's
always possible to achieve better results with a manual place-and-route
procedure, few designers would prefer a manual procedure for an application circuit. Consequently, these
circuit benchmarks will produce typical outcomes only when autorouting
is used.D
Bob Osann, a marketing consultant to Actel Corp., was previously
a vice president of P-CAD, following the acquisition of Assisted
Technology where he served as president and chief executive officer.
Osann holds a BSEE from Cornell
University, and holds three patents in the areas of consumer and
industrial electronics.
Abbas El Gamal, a cofounder of
Actel Corp. and an associate professor at Stanford University, was
previously director of the Systems
Research Laboratory at LSI Logic
Corp. He holds a PhD in electrical
engineering from Stanford University and is credited with a patentfor Actel's device architecture.
~
955 East Arques Avenue, Sunnyvale, CA 94086. (408) 739-1010
ACTEL CORPORATION DIRECT SALES OFFICES
955 E. Arques Ave.
Sunnyvale, CA 94086
Tel: (408) 739-1010
8130 McFadden Ave., Suite 109
Westminster, CA 92683
Tel: (714) 373-4488
425 N. Martingale Rd., Suite 800
Schaumburg, IL 60173
Tel: (708) 706-3866
1740 Mass Avenue
Boxborough, MA 01719
Tel: (508) 635-0010
9101 Guilford Road, Suite 107
Columbia, MD 21046
Tel: (301) 604-0111
2350 Lakeside Blvd., Suite 850
Richardson, TX 75082
Tel: (214) 235-8944
DOMESTIC REPRESENTATIVES
ALABAMA
MISSOURI
Rep Inc. ...... ... . . . . .. . . . . . . . . . . . . . . . . .. . ... . . . . . . .. (205) 881-9270
John G. Macke Co. .................................. (314) 432-2830
ARIZONA
NEW JERSEY
Luscombe Engineering .............................. (602) 949-9333
Nexus ............................................... (201) 947-0151
CALIFORNIA
NEW MEXICO
Centaur (Calabasas) ................................. (818) 704-1655
Centaur (Irvine) ...................................... (714) 261-2123
Centaur (San Diego) ................................. (619) 278-4950
12 Inc. (Santa Clara) .................................. (408) 988-3400
12 Inc. (Orangevale) .................................. (916) 989-0843
COLORADO
Luscombe Engineering .............................. (303) 772-3342
Luscombe Engineering .............................. (505) 883-0333
NEW YORK
L-MAR Associates (Apalachin) ....................... (607) 687-1828
L-MAR Associates (E. Rochester) ...................... (716) 381-9100
L-MAR Associates (Poughkeepsie) ................... (914) 462-8025
NORTH CAROLINA
CONNECTICUT
Rep Inc. (Charlotte) .................................. (704) 563-5554
Rep Inc. (Morrisville) ................................. (919) 469-9997
CompRep Associates ................................ (203) 269-1145
OHIO
FLORIDA
Giesting & Associates (Cincinnati) ..................... (513) 385-1105
Giesting & Associates (Cleveland) .................... (216) 261-9705
Sales Engineering Concepts (Altamonte Springs) ...... (407) 682-4800
Sales Engineering Concepts (Deerfield Beach) ........ (305) 426-4601
GEORGIA
OREGON
L2 Ltd. . ............................................. (503) 629-8555
Rep Inc .............................................. (404) 938-4358
PENNSYLVANIA
ILLINOIS
Omega Sales ........................................ (215) 244-4000
Carlson Electronic Sales Associates .................. (708) 956-8240
TENNESSEE
INDIANA
Rep Inc. ................ .............. ...... ......... (615) 475-4105
Giesting & Associates.. . . . . . . .. . . . . . . . . .. . .. . . . . .. . .. (317) 844-5222
TEXAS
IOWA
KANSAS
OM Associates (Austin) .............................. (512) 794-9971
OM Associates (Houston) ....... .................... (713) 789-4426
OM Associates (Richardson) ......................... (214) 690-6746
OLE Electronics ..................................... (316) 683-6400
UTAH
Carlson Electronic Sales Associates .................. (319) 378-1450
MARYLAND
Luscombe Engineering .............................. (801) 565-9885
New Era Sales ................................. ,..... (301) 544-4100
WASHINGTON
MASSACHUSETTS
L2 LTD. .......... . . . . .. . . . . . . . . . . . . . .. .. ... . .. . . . .. .. (206) 827-8555
CompRep Associates ................................ (617) 329-3454
WISCONSIN
MICHIGAN
Carlson Electronic Sales Associates .................. (414) 476-2790
Giesting & Associates (Livonia) ....................... (313) 478-8106
Giesting & Associates (Comstock Park) ............... (616) 784-9437
CANADA
MINNESOTA
Gibb Electronic Sales ................................ (612) 935-4600
Clark-Hurman Associates (Quebec) .. .. . . ... . . . . . . . . .. (514) 426-0453
Clark-Hurman Associates (Ontario-Bramptan) ......... (416) 840-6066
Clark-Hurman Associates (Ontario-Nepean) ........... (613) 727-5626
DOMESTIC DISTRIBUTORS
WYLE LABORATORIES
ARIZONA
OREGON
Phoenix ............................................. (602) 437-2088
Hillsboro ................•........................... (503) 640-6000
CALIFORNIA
TEXAS
Calabasas ...........................................
Irvine ...............................................
Rancho Cordova ....................................
San Diego ...........................................
Santa Clara .........................................
(818)
(714)
(916)
(619)
(408)
880-9000
863-9953
638-5282
565-9171
727-2500
Austin .............................................. (512) 345-8853
Houston .........................................•.. (713) 879-9953
Richardson ...... " .................................. (214) 235-9953
UTAH
West Valley City ...........•......................... (801) 974-9953
COLORADO
WASHINGTON
Thornton ........................................... , (303) 457-9953
Redmond ........................................... (206) 881-1150
MASSACHUSETTS
Burlington .......................................... (617) 272-7300
PIONEERSTANDARDELECTRONICS ________________________________________
CONNECTICUT
NEW JERSEY
Norwalk ............................................. (203) 853-1515
Pine Brook ................•......................... (201) 575-3510
ILLINOIS
NEW YORK
Addison .. '" .............................. " ........ (312) 495-9680
Binghamton ......................................... (607) 722-9300
Fairport ............................................. (716) 381-7070
Woodbury ........................................... (516) 921-8700
INDIANA
Indianapolis ......................................... (317) 573-0880
KANSAS
OHIO
Lenexa ......... '" ....................... , " ........ (913) 492-0500
Dayton .............. '" ............ " .. '" .......•.. (513) 236-9900
MASSACHUSETTS
PENNSYLVANIA
Lexington ........................................... (617) 861-9200
Pittsburgh. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. (412) 782-2300
MICHIGAN
TEXAS
Grand Rapids .. ; .................................... (616) 698-1800
Livonia .............................................. (313) 525-1800
Austin .............................................. (512) 835-4000
Dallas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. (214) 386-7300
Houston ............................................ (713) 988-5555
MINNESOTA
Eden Prairie ............................... " ........ (612) 944-3355
MISSOURI
S1. Louis ............................................ (314) 432-4350
PIONEERTECHNOLOGIES _______________________________________________
ALABAMA
MARYLAND
Huntsville ........................................... (205) 837-9300
Gaithersburg ........................................ (301) 921-0660
CALIFORNIA
NORTH CAROLINA
San Jose ............................................ (408) 954-9100
Charlotte ............ '" ....... " ... " ............... (704) 527-8188
Durham ............................ " ............... (919) 544-5400
FLORIDA
Altamonte Springs ................................... (407) 834-9090
Clearwater .......................................... (813) 536-0445
Deerfield Beach ..................................... (305) 428-8877
PENNSYLVANIA
Horsham ............................................ (215) 674-4000
GEORGIA
Norcross ............................................ (404) 448-1711
INTERNATIONAL DISTRIBUTORS
AUSTRALIA
ITALY
Reptechnic (Nuetral Bay, NSW) ......................... (2) 953.9844
LASI Elettronica S.pA(Milan) ......................... (2) 66.10.1370
DENMARK
NETHERLANDS
Nordisk Electronik AS (Herlev) ......................... (42) 84.20.00
Transfer B.V. (Enschede) ............................... (53) 33.43.81
EGYPT
NORWAY
SEE (Cairo) ............................................. (2) 665.948
Nordisk Electronik AS (Hvalstad) ........................ (2) 84.50.70
ENGLAND
SPAIN
Gothic-Crellon Ltd. (Wokingham) ..................... (0734) 78.88.78
Manhattan Skyline Ltd. (Maidenhead) .................. (0628) 75.85.1
Semiconductores SA (Barcelona) ....................... (1) 742.2313
FINLAND
Traco AB (Farsta) ....................................... (8) 93.00.00
OY Fintronic AB (Helsinki) ............................. (0) 69.26.022
SWITZERLAND
SWEDEN
FRANCE
Omni Ray AG (Dietlikon) ............................... (1) 835.21.11
ASAP (Montigny Le Bretonneux) ...................... (1) 30.43.8233
Scaib SA (Rungis Cedex) ............................ (1) 46.87.2313
TAIWAN
ISRAEL
WEST GERMANY
A.S.T. Ltd. (Herzelia) ................................... (52) 58.33.55
bit-electronic AG (Munich) ........................... (089) 41.80.070
ECN/Dacom GmbH (Munich) ........................ (089) 96.09.080
KOREA
Eastern Electronics, Inc. (Seoul) ......................... (2) 566.0514
SEED TECH Corporation (Taipei) ........................ (2) 521.1100
Source Exif Data:
File Type : PDF File Type Extension : pdf MIME Type : application/pdf PDF Version : 1.3 Linearized : No XMP Toolkit : Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-21:37:19 Create Date : 2017:11:01 09:52:37-08:00 Modify Date : 2017:11:01 10:28:36-07:00 Metadata Date : 2017:11:01 10:28:36-07:00 Producer : Adobe Acrobat 9.0 Paper Capture Plug-in Format : application/pdf Document ID : uuid:99ab9040-46e9-2848-9421-2083d5bf31af Instance ID : uuid:36753fb2-3bce-3d49-94a8-b2d3d740a4e3 Page Layout : SinglePage Page Mode : UseNone Page Count : 200EXIF Metadata provided by EXIF.tools