1991_Cypress_Applications_Handbook 1991 Cypress Applications Handbook
User Manual: 1991_Cypress_Applications_Handbook
Page Count: 736
APPLICATIONS HANDBOOK

CYPRESS SEMICONDUCTOR

Cypress Semiconductor, 3901 North First St., San Jose, CA 95134 (408) 943-2600
Telex: 821032 CYPRESS SNJ UD, TWX: 910 997 0753, FAX: (408) 943-2741

Cypress Semiconductor, Cypress PLD Toolkit, and QuickPro II are trademarks of Cypress Semiconductor Corporation. IBM, IBM PC, and PC/XT are registered trademarks of the International Business Machines Corporation. SPARC is a registered trademark of SPARC International. Data I/O is a registered trademark of the Data I/O Corporation. PLD Test and ABEL are trademarks of the Data I/O Corporation. STAG is a registered trademark of Stag Microsystems Ltd.

Published August 1991

© Cypress Semiconductor Corporation, 1991. The information contained herein is subject to change without notice. Cypress Semiconductor Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in a Cypress Semiconductor Corporation product. Nor does it convey or imply any license under patent or other rights. Cypress Semiconductor does not authorize its products for use as critical components in life-support systems where a malfunction or failure of the product may reasonably be expected to result in significant injury to the user. The inclusion of Cypress Semiconductor products in life-support systems applications implies that the manufacturer assumes all risk of such use and in so doing indemnifies Cypress Semiconductor against all damages.

Preface

About This Book

This Applications Handbook is a learning tool for using Cypress devices. The application notes included here range from general product overview articles, such as "Understanding Dual-Port RAMs," to specific design examples.

The general overviews describe product-family characteristics and explain some of the products' capabilities. These application notes appear at the beginning of this Handbook. Next appear application examples that show how to use specific Cypress devices in the context of real designs. The application examples are organized by product type (e.g., SRAMs or EPLDs). Within each product type, examples are arranged by product number, using the product that is the article's primary focus.

Although your specific application might not appear explicitly in an application note, the design examples can still be useful to you. If the design example is similar to your application, you might be able to adapt the hardware or software to your design easily. Many of the application notes provide PLD software code for design tools from a variety of vendors, so that you can copy the code and use it as a skeleton for your own PLD designs.

Even if none of the examples relate directly to your design, they can stimulate new ideas by showing features or applications that might not have occurred to you. The information can also significantly reduce the learning curve normally associated with unfamiliar ICs.

Most of the designs described in this Handbook are based on actual circuits produced either by Cypress or by one of our customers. Application notes that discuss specific designs indicate whether the designs have been simulated and/or built and completely debugged.

If you have questions about any Cypress product, please contact your local Field Applications Engineer at the nearest direct sales office. A list of Cypress sales offices, representatives, and distributors is included at the back of this Handbook. For continuous on-line information about Cypress products, you can connect to the Cypress Bulletin Board at (408) 943-2954.

About Cypress Semiconductor

Since its incorporation in 1982, Cypress has successfully addressed diverse, high-performance niche markets by creating technologically sophisticated products, using innovative packaging, and emphasizing quality. Cypress is a complete semiconductor manufacturer, performing its own process development, circuit design, wafer fabrication, assembly, and test.
Its core CMOS and BiCMOS processes lead the industry with 0.8-micron design rules. Cypress ships over 200 products in seven product areas: SRAMs, PROMs, PLDs, logic devices, SPARC microprocessors and peripherals, multichip modules, and high-speed BiCMOS PLD and memory devices.

Cypress is an international company, with headquarters in San Jose, California and fabrication facilities in San Jose; Round Rock, Texas; and Bloomington, Minnesota. The company has started up five subsidiaries that are funded by Cypress but run as independent businesses, including Cypress Semiconductor (Texas) Inc., Aspen Semiconductor Corporation, Multichip Technology Incorporated, Ross Technology Inc., and Cypress Semiconductor (Minnesota) Inc.

Contents

General Information
  System Design Considerations When Using Cypress CMOS Circuits
  Power Characteristics of Cypress Products
  Tips for High-Speed Logic Design
  Protection, Decoupling, and Filtering of Cypress CMOS Circuits

Modules
  Choosing Packages in High-Density Module Designs
  The Multichip Family of Universal JEDEC ZIP/SIMM Modules

ECL and TTL BiCMOS
  Noise Considerations in High-Speed Logic Systems
  Using ECL in Single +5V TTL Systems
  BiCMOS TTL and ECL SRAMs Improve High-Performance Systems
  PLCC and CLCC Packaging for High-Speed Parts
  A New Generation of BiCMOS High-Speed TTL SRAMs
  Access Time vs. Load Capacitance for High-Speed BiCMOS TTL SRAMs
  Combining SRAMs Without an External Decoder
  BiCMOS TTL SRAMs Improve MIPS R3000 and R3000A Systems
  Memory and Support Logic for Next-Generation ECL Systems

SRAMs
  RAM I/O Characteristics
  Understanding Dual-Port RAMs
  Using Dual-Port RAMs Without Arbitration
  Using Cypress SRAMs to Implement 386 Cache

PROMs
  Pin-out Compatibility Considerations of SRAMs and PROMs
  Introduction to Diagnostic PROMs
  Interfacing the CY7C289 to the AM29000
  Interfacing the CY7C289 to the CY7C601

PLDs
  Introduction to Programmable Logic
  CMOS PAL Basics
  Are Your PLDs Metastable?
  PLD-Based Data Path For SCSI-2
  PAL Design Example: A GCR Encoder/Decoder
  T2 Framing Circuitry
  Using CUPL with Cypress PLDs
  Using ABEL to Program the Cypress 22V10
  Using ABEL to Program the CY7C330
  Using ABEL 3.2 to Program the Cypress CY7C331
  Using Log/IC to Program the CY7C330
  State Machine Design Considerations and Methodologies
  Understanding the CY7C330 Synchronous EPLD
  Using the CY7C330 in Closed-Loop Servo Control
  FDDI Physical Connection Management Using the CY7C330
  Bus-Oriented Maskable Interrupt Controller
  Using the CY7C330 as a Multi-channel Mbus Arbiter
  Using the CY7C331 as a Waveform Generator
  CY7C331 Application Example: Asynchronous, Self-Timed VMEbus Requestor
  Understanding the 361
  Using the CY7C361 as an Mbus Arbiter
  TMS320C30/VME Signal Conditioner Using the CY7C361
  DMA Control Using the CY7C342 MAX EPLD
  Interfacing PROMs and RAMs to High-Speed DSP Using MAX
  FIFO RAM Controller with Programmable Flags

Logic
  Understanding Small FIFOs
  Understanding Large FIFOs
  Designing with the CY7C439 Bidirectional FIFO (BIFO)
  Microcoded System Performance
  Systems with CMOS 16-Bit Microprocessor ALUs

RISC
  SPARC Software Advantages Over CISC
  Register Windows
  CY7C600 System Design Footnotes
  The Impact of Memory on High-Performance RISC Microprocessors
  High-Speed CMOS SPARC Design
  SPARC System Surface-Mount Design
  Memory System Design for the CY7C601 SPARC Processor
  Cache Memory Design
  Synchronous Trap Identification for CY7C600 Systems
  An Introduction to Mbus
  Multiprocessing System Boot-Up
  Porting UNIX to the CY7C604 or CY7C605
  Getting Started with Real-Time Embedded System Development
  SPARC as a Real-Time Controller
  Memory Protection and Address Exception Logic for the CY7C611 SPARC Controller
Bus Products
  VIC068 Special Features and Tips
  Interfacing the VIC068 to MC68020

Glossary

Index

General Information

System Design Considerations When Using Cypress CMOS Circuits

This application note describes some factors to consider when designing new systems using Cypress high-performance CMOS integrated circuits or when using Cypress products to replace either bipolar or NMOS circuits in existing systems. The two major areas of concern are device input sensitivity and transmission line effects due to impedance mismatching between the source and load.

To achieve maximum performance when using Cypress CMOS ICs, pay attention to the placement of the components on the printed circuit board (PCB); the routing of the metal traces that interconnect the components; the layout and decoupling of the power distribution system on the PCB; and perhaps most important of all, the impedance matching of some traces between the source and the loads. The latter traces must, under certain conditions, be analyzed as transmission lines. The most critical traces are those of clocks, write strobes on SRAMs and FIFOs, output enables, and chip enables.

Replacing Bipolar or NMOS ICs

Cypress CMOS ICs are designed to replace both bipolar ICs and NMOS products and to achieve equal or better performance at one-third (or less) the power of the components they replace. When high-performance Cypress CMOS circuits replace either bipolar or NMOS circuits in existing sockets, be aware of conditions in the existing system that could cause the Cypress ICs to behave in unexpected ways. These conditions fall into two general categories: device input sensitivity and sensitivity to reflected voltages.

Input Sensitivity

High-performance products, by definition, require less energy at their inputs to change state than low- or medium-performance products. Unlike a bipolar transistor, which is a current-sensing device, a MOS transistor is a voltage-sensing device. In fact, a MOS circuit design parameter called K' is analogous to the gm of a vacuum tube and is inversely proportional to the gate oxide thickness. Thin gate oxides, which are required to achieve the desired performance, result in highly sensitive inputs. These inputs require very little energy at or above the device input-voltage threshold (approximately 1.5V at 25°C) to be detected.

CMOS products might detect high-frequency signals to which bipolar devices would not respond. MOS transistors also have extremely high input impedances (5 to 10 MΩ), which make these transistors' gate inputs analogous to the input of a high-gain amplifier or an RF antenna. In contrast, because bipolar ICs have input impedances of 1000Ω or less, these devices require much more energy to change state than do MOS ICs. In fact, a typical Cypress IC requires less than 10 picojoules of energy to change state.

Thus, when Cypress CMOS ICs replace bipolar or NMOS ICs in existing systems, the CMOS ICs might respond to pulses of energy in the system that are not detected by the bipolar or NMOS products.

Reflected Voltages

Cypress CMOS ICs have very high input impedances and, to achieve TTL compatibility and drive capacitive loads, low output impedances. The impedance mismatch due to low-impedance outputs driving high-impedance inputs might cause unwanted voltage reflections and ringing under certain conditions. This behavior could result in less-than-optimum system operation.

When the impedance mismatch is very large, a nearly equal and opposite negative pulse reflects back from the load to the source when the line's electrical length (PCB trace) is greater than

$$l = \frac{t_r}{2\,T_{pd}}$$

where $t_r$ is the rise time of the signal at the source, and $T_{pd}$ is the one-way propagation delay of the line per unit length.

The input clamping diodes in bipolar IC families (e.g., TTL, LS, ALS, FAST, FACT) are inherent in the fabrication process. The P substrate is usually grounded, and N-wells are used for the NPN transistors and P-type resistors. The wells are reverse biased by connecting them to the VCC supply. As a result, a PN junction diode is formed between every input pin (cathode or N material) and the substrate (anode or P material). A negative voltage at an input pin due to either lead inductance or a voltage reflection forward biases the diode, which turns on and clamps the input pin to a Vf below ground (approximately -0.8V).

Historically, as circuit performance improved, the output rise and fall times of the bipolar circuits decreased to the point where voltage reflections began to occur even for short traces when an impedance mismatch existed between the line and the load. Most users, however, were unaware of these reflections because the reflections were suppressed by the diodes' clamping action.

Conventional CMOS processing results in PN junction diodes, which adversely affect the ESD (electrostatic discharge) protection circuitry at each input pin and cause an increased susceptibility to latch-up.
In addition, when the input pin is negative enough to forward bias the input clamping diodes, electrons are injected into the substrate. When a sufficient number of electrons are injected, the resulting current can disturb internal nodes, causing soft errors at the system level.

To eliminate this problem, all Cypress CMOS products use a substrate bias generator. The substrate is maintained at a negative 3V potential, so the substrate diodes cannot be forward biased unless the voltage at the input pin becomes a diode drop more negative than -3V. (See Figure 5 in "CMOS PAL Basics" for a schematic of the input protection circuits used on all Cypress CMOS products.) To the systems designer, this translates to approximately five times (3.8V divided by 0.8V = 4.75) the negative undershoot safety margin for Cypress CMOS integrated circuits versus those that do not use a bias generator.

Voltage reflections should be eliminated by using impedance matching techniques and passive components that dissipate excess energy before it can cause soft errors. Crosstalk should be reduced to acceptable levels by careful PCB layout and attention to details.

Crosstalk

The rise and fall times of the waveforms generated by Cypress CMOS circuit outputs are 2 to 4 ns between levels of 0.4 and 4V. The fast transition times and the large voltage swings could cause capacitive and inductive coupling (crosstalk) between signals if insufficient attention is paid to PCB layout. You can reduce crosstalk by avoiding running PCB traces parallel to each other. If this is not possible, run ground traces between signal traces.

In synchronous systems, the worst time for the crosstalk to occur is during the clock edge that samples the data. In most systems, it is sufficient to isolate the clock, chip select, output enable, and write and read control lines from each other and from data and address lines so that the signals do not cause coupling to each other or to the data lines.

It is standard practice to use ground or power planes between signal layers on multi-layered PCBs to reduce crosstalk. The capacitance of these isolation planes increases the propagation delay of the signals on the signal layers, but this drawback is more than compensated for by the isolation the planes provide.

The Theory of Transmission Lines

A connection (trace) on a PCB should be considered as a transmission line if the wavelength of the applied frequency is short compared to the line length. If the wavelength of the applied frequency is long compared to the length of the line, you can use conventional circuit analysis.

In practice, transmission lines on PCBs are designed to be as nearly lossless as possible. This simplifies the mathematics required for their analysis, compared to a lossy (resistive) line.

Ideally, all signals between ICs travel over constant-impedance transmission lines that are terminated in their characteristic impedances at the load. In practice, this ideal situation is seldom achieved for a variety of reasons.

Perhaps the most basic reason is that the characteristic impedances of all real transmission lines are not constants, but present different impedances depending upon the frequency of the applied signal. For "classical" transmission lines driven by a single-frequency signal source, the characteristic impedance is "more constant" than when the transmission line is driven by a square wave or a pulse.

According to Fourier series expansion, a square wave consists of an infinite set of discrete frequency components: the fundamental plus odd harmonics of decreasing amplitude. When the square wave propagates down a transmission line, the higher frequencies are attenuated more than the lower frequencies. Due to dispersion, the different frequencies do not travel at the same speed. Dispersion indicates the dependence of phase velocity upon the applied frequency (Reference 1 pg. 192). The result is that the square wave or pulse is distorted when the frequency components are added together at the load.

A second reason why practical transmission lines are not ideal is that they frequently have multiple loads. You can distribute the loads along the line at regular or irregular intervals or lump them together as close as practical at the end of the line.

The signal-line reflections and ringing caused by impedance mismatches, non-uniform transmission line impedances, inductive leads, and non-ideal resistors could compromise the dynamic system noise margins and cause inadvertent switching.

One system design objective is to analyze the critical signal paths and design the interconnections such that adequate system noise margins are maintained. There will always be signal overshoot and undershoot. The objective is to accurately predict these effects, determine acceptable limits, and keep the undershoot and overshoot within the limits.

The Ideal Transmission Line

An equivalent circuit for a transmission line appears in Figure 1. The circuit consists of subsections of series resistance (R) and inductance (L) and parallel capacitance (C) and shunt admittance (or parallel resistance, Rp). For clarity and consistency, these parameters are defined per unit length. Multiply the values of R, L, C, and Rp by the length of the subsection, l, to find the total value. The line is assumed to be infinitely long.

[Figure 1. Transmission Line Model]

If the line of Figure 1 is assumed to be lossless (R = 0, Rp = infinity), Figure 1 reduces to Figure 2.

A small series resistance has little effect upon the line's characteristic impedance. In practice and by design, the series resistance is quite small. For 1-ounce (0.0015-inch-thick), 10-mil-wide (0.010-inch) copper traces on G-10 glass epoxy PCBs, the trace resistance is between 0.5 and 0.3Ω per foot. 2-ounce copper has a resistance 50 percent lower than that of 1-ounce copper.

[Figure 2. Ideal Transmission Line Model]

Input or Characteristic Impedance

To calculate the characteristic impedance (also called AC impedance or surge impedance) looking into terminals a-b of the circuit in Figure 2, use the following procedure.

Let $Z_1$ be the input impedance looking into terminals a-b, with $Z_2$ for terminals c-d, $Z_3$ for terminals e-f, etc. $Z_1$ is the series impedance of the first inductor ($lL$) in series with the parallel combination of $Z_2$ and the impedance of the capacitor ($lC$). From AC theory:

$$X_L = j\omega l L$$

where $X_L$ is the inductive reactance, and

$$X_C = \frac{1}{j\omega l C}$$

where $X_C$ is the capacitive reactance. Then

$$Z_1 = X_L + \frac{Z_2 X_C}{Z_2 + X_C} \qquad \text{(Eq. 1)}$$

If the line is reasonably long, $Z_1 = Z_2 = Z_3$. Substituting $Z_1 = Z_2$ into Equation 1 yields

$$Z_1 = X_L + \frac{Z_1 X_C}{Z_1 + X_C}$$

or

$$Z_1^2 - Z_1 X_L - X_C X_L = 0 \qquad \text{(Eq. 2)}$$

Substituting the expressions for $X_C$ and $X_L$ yields

$$Z_1^2 - j\omega l L\, Z_1 = \frac{L}{C} \qquad \text{(Eq. 3)}$$

Equation 3 contains a complex component that is frequency dependent. You can eliminate the complex component by allowing $l$ to become very small and by recognizing that the ratio $L/C$ is constant and independent of $l$ or $\omega$:

$$Z_1 = \sqrt{L/C} \qquad \text{(Eq. 4)}$$

The AC input impedance of a purely reactive, uniform, lossless line is a resistance. This is true for AC or DC excitation.

Propagation Velocity and Delay

The propagation velocity (or phase velocity) of a sinusoid traveling on an ideal line (Reference 1 pg. 33) is

$$\alpha = \frac{1}{\sqrt{LC}}$$

The propagation delay for a lossless line is the reciprocal of the propagation velocity:

$$T_{pd} = \sqrt{LC} = Z_1 C \qquad \text{(Eq. 5)}$$

where $L$ and $C$ are once again the intrinsic line inductance and capacitance per unit length.

Adding additional stubs or loads to the line (Reference 2 pg. 129) increases the propagation delay by the factor $\sqrt{1 + C_D/C}$, where $C_D$ is the load capacitance. Therefore, the propagation delay of a loaded line, $T_{pdL}$, is

$$T_{pdL} = T_{pd}\sqrt{1 + \frac{C_D}{C}} \qquad \text{(Eq. 6)}$$

This application note shows later that a transmission line's unloaded or intrinsic propagation delay is proportional to the square root of the dielectric constant of the medium surrounding or adjacent to the line. Propagation delay is not a function of the line's geometry.

The characteristic impedance of a capacitively loaded line decreases by the same factor that the propagation delay increases:

$$Z_1' = \frac{Z_1}{\sqrt{1 + C_D/C}} \qquad \text{(Eq. 7)}$$

Note that the capacitance per unit length must be multiplied by the line length, $l$, to calculate an equivalent lumped capacitance.

The Condition for Voltage Reflection

It is relatively straightforward to obtain a closed-form solution for a transmission line's maximum allowable length, which, if exceeded, might cause a voltage reflection.

If the line is not terminated in its characteristic impedance, a reflection is guaranteed to occur. The reflection's amplitude depends on the amount of impedance mismatch between the line and the load and whether the rise time of the signal at the source equals or is greater (slower) than two times the propagation delay of the line.

The condition for a voltage reflection to occur is

$$l \ge \frac{t_r}{2\,T_{pdL}} \qquad \text{(Eq. 8)}$$

However, the actual physical length of the line is

[Eq. 9: equation not legible in the source]

Solving for the loaded propagation delay yields

[Eq. 10: equation not legible in the source]

The intrinsic capacitance of the line from Equation 5 is

$$C_0 = \frac{T_{pd}}{Z_0} \qquad \text{(Eq. 11)}$$

It is standard practice to use $C_0$ to designate the intrinsic line capacitance, $L_0$ the intrinsic line self inductance, and $Z_0$ the intrinsic line characteristic impedance.

Substituting the expressions from Equations 9, 10, and 11 into Equation 6 gives the relationship for the line length at which voltage reflections might occur.

[Eq. 12: equation not legible in the source]

Two conditions must be present for voltage reflections to occur: The line must be long, and there must be an impedance mismatch between the line and the load.

Solving Equation 12 for the line length, $l$, yields

$$l = \frac{t_r}{2\,T_{pd}}\sqrt{\frac{1}{1 + \dfrac{C_D Z_0}{t_r}}} \qquad \text{(Eq. 13)}$$

Equation 13 is very useful to the system designer. It is generic and applies to all products irrespective of circuit type, logic family, or voltage levels. The equation allows you to estimate when a line requires termination, using variables you can easily determine.

When driving a distributed or non-lumped load, the signal's rise time depends on the source, not the load as you might expect. The intrinsic, or unloaded, line propagation delay per unit length is a function of the dielectric constant and can be easily calculated. The intrinsic line characteristic impedance is a function of the dielectric constant and the PCB's physical construction or geometry and can also be calculated. Finally, you can estimate the equivalent (lumped) load capacitance by adding up the number of loads (device inputs) being driven and multiplying by 10 pF. For I/O pins, use 15 pF per pin.

Signal Transition Times

The standard Cypress 0.8µ (L drawn) CMOS process yields output buffers whose signals transition approximately 4V in 2 ns, or have a slew rate of 2V per nanosecond. The rise time/fall time is 2 ns. Products fabricated using the Cypress BiCMOS process have the same rise times. The Cypress ECL process yields products with 500-ps output signal rise times and fall times, or slew rates of 1V/0.5 ns = 2V per nanosecond.

Internal signal slew rates are 10V per nanosecond, but only for short (usually less than 500 mV) voltage excursions. Thus, high-frequency noise is generated on chip, which you can eliminate by using 100- to 500-pF ceramic or mica filter capacitors between VCC and ground.

The values in Table 1 come from using Equation 13 to calculate the line length at which voltage reflections might occur.
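As a numerical check, the sketch below evaluates the line-length relation in the form l = (t_r / 2·T_pd) · sqrt(1 / (1 + C_D·Z_0 / t_r)), which is how Equation 13 is read here. It uses the 50Ω line impedance and 2.27 ns/ft unloaded delay that the handbook quotes for Table 1. The function name and parameter names are illustrative and do not come from the handbook.

```python
import math


def critical_length_inches(t_r_ns, c_d_pf, z0_ohms=50.0, t_pd_ns_per_ft=2.27):
    """Line length (inches) at which Equation 13 predicts a reflection.

    t_r_ns          source rise time, ns
    c_d_pf          lumped load capacitance, pF
    z0_ohms         intrinsic line impedance (assumed 50 ohms, as for Table 1)
    t_pd_ns_per_ft  unloaded propagation delay (assumed 2.27 ns/ft, as for Table 1)
    """
    # pF * ohm gives picoseconds; scale to ns so the ratio with t_r is unitless.
    loading = (c_d_pf * z0_ohms * 1e-3) / t_r_ns
    # Unloaded critical length t_r / (2 * T_pd), in feet.
    unloaded_ft = t_r_ns / (2.0 * t_pd_ns_per_ft)
    # Capacitive loading shortens the critical length by sqrt(1 + loading).
    return 12.0 * unloaded_ft / math.sqrt(1.0 + loading)


for t_r in (2.0, 1.0, 0.5):
    for c_d in (10, 20, 40, 80):
        length = critical_length_inches(t_r, c_d)
        print(f"t_r = {t_r} ns  C_D = {c_d} pF  l = {length:.2f} in")
```

With those defaults the printed lengths agree with Table 1 to two decimal places, for example 4.73 inches at a 2-ns rise time and a 10-pF load.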
The calculations assume a 50Q intrinsic line characteristic impedance and that the PCB is multilayer, using stripline construction on G-lO glass epoxy material (dielectric constant of 5). These conditions result in an unloaded line propagation delay of 2.27 ns per foot. Table 1 reveals that decreasing the source rise time from 2 to 0.5 ns (a factor of 4) decreases the line length at which a voltage reflection might occur by a factor of 5 (4.73 divided by 0.93 = 5.09) for the same load (10 pF) and intrinsic propagation delay (2.27 ns/ft.). A second observation is that for signals with rise times of 0.5 ns, you should terminate all lines. reflects back from the load to the source, where the voltage either adds to or subtracts from the original signal. A mismatch between the source and line impedance might also cause a voltage reflection, which in tum reflects back to the load. Therefore, two reflection coefficients are defmed. For classical transmission lines driven by a single frequency source, the impedance mismatches cause standing waves. When pulses are transmitted and the source's output impedance changes depending upon whether a Low-to-High or a High-to-Low transition occurs, the analysis is complicated further. You can use classical transmission line analysiswhere pulses are represented by complex variables with exponentials - to calculate the voltages at the source and the load after several back and forth reflections. However, these complex equations tend to obscure what is physically happening. Energy Considerations Now consider the effects of driving the ideal transmission line with digital pulses and analyze the behavior of the line under various driving and loading conditions. The first task is to define the load and source reflection coefficient s. Figure 3 shows the circuit to be analyzed. The ideal transmission line of length I is driven by a digital source of internal resistance Rs and loaded with a resistive load RL. 
The characteristic impedance of the line appears as a pure resistance, Zo= ..JLIC to any excitation. The ideal case is when Rs = Zo = RL. The maximum energy transfer from source to load occurs under this condition, and no reflections occur. Half the energy is dissipated in the source resistance, Rs, and the other half is dissipated in the load resistance, RL (the line is lossless). If the load resistor is larger than the line's characteristic impedance, extra energy is available at the load and is reflected back to the source. This is called the underdamped condition, because the load under-uses the energy available. If the load resistor is smaller than the line impedance, the load attempts to dissipate more energy than is available. Because this is not possible, a reflection occurs that signals the source to send more energy. This is called the overdamped condition. Both the underdamped and overdamped cases cause negative traveling waves, which cause standing waves if the excitation is sinusoidal. The condition Zo = RL is called critically damped. The safest termination condition, from a systems design viewpoint, is the slightly overdamped condition, because no energy is reflected back to the source. Reflection Coefficients Another attribute of the ideal transmission line is reflection coefficients, which are not actually line characteristics. The line is treated as a circuit component, and reflection coefficients are defined that measure the impedance mismatches between the line and its source and the line and its load. The reason for defining and presenting the reflection coefficients becomes apparent later when it is shown that if the impedance mismatch is sufficiently large, either a negative or positive voltage Table 1. 
Line Length at which a Voltage Reflection Occurs

tr (ns)    CD (pF)    L (inches)
2          10         4.73
2          20         4.32
2          40         3.74
2          80         3.05
1          10         2.16
1          20         1.87
1          40         1.53
1          80         1.18
0.5        10         0.93
0.5        20         0.76
0.5        40         0.59
0.5        80         0.44

Line Voltage For a Step Function

To determine the line voltage for a step function excitation, you apply a step function to the ideal line and analyze the behavior of the line under various loading conditions. The step function response is important because any pulse can be represented by the superposition of a positive step function and a negative step function, delayed in time with respect to each other. By proper superposition, you can predict the response of any line and load to any width pulse. The principle of superposition applies to all linear systems.

According to theory, the rise time of the signal driven by the source is not affected by the characteristics of the line. This has been substantiated in practice by using a special coaxially constructed reed relay that delivers a pulse of 18A into 50Ω with a rise time of 0.070 ns (Reference 1 pg. 162).

The equation representing the voltage waveform going down the line (Figure 3) as a function of distance and time is

$$V_L(x, t) = V_A(t)\,U(t - x\,t_{pd}) \quad \text{for } t < T_0 \qquad \text{(Eq. 14)}$$

$$V_A(t) = V_S(t)\,\frac{Z_0}{Z_0 + R_S} \qquad \text{(Eq. 15)}$$

where
VA = the voltage at point A
x = the distance along the line to the point of interest
l = the total line length
tpd = the propagation delay of the line in nanoseconds per foot
To = l tpd, or the one-way line propagation delay
U(t) = a unit step function occurring at x = 0
VS(t) = the source voltage

Figure 3. Ideal Transmission Line Loaded and Driven

When the incident voltage reaches the end of the line, a reflected voltage, VL', occurs if RL does not equal Zo. The reflection coefficient at the load, ρL, can be obtained by applying Ohm's Law. The voltage at the load is VL + VL', which must be equal to (IL + IL')RL. But

$$I_L = \frac{V_L}{Z_0} \quad \text{and} \quad I_L' = -\frac{V_L'}{Z_0}$$

(The minus sign is due to IL' being negative; i.e., IL' is opposite to the current due to VL.) Therefore,

$$V_B = V_L + V_L' = \left(\frac{V_L}{Z_0} - \frac{V_L'}{Z_0}\right) R_L \qquad \text{(Eq. 16)}$$

By definition:

$$\rho_L = \frac{\text{reflected voltage}}{\text{incident voltage}} = \frac{V_L'}{V_L}$$

Solving for VL'/VL in Equation 16 and substituting in the equation for ρL yields

$$\rho_L = \frac{R_L - Z_0}{R_L + Z_0} \qquad \text{(Eq. 17)}$$

The reflection coefficient at the source is

$$\rho_S = \frac{R_S - Z_0}{R_S + Z_0} \qquad \text{(Eq. 18)}$$

Rearranging Equation 16 yields

$$V_B = V_L + V_L' = \left(1 + \frac{V_L'}{V_L}\right) V_L = (1 + \rho_L)\,V_L \qquad \text{(Eq. 19)}$$

Equation 19 describes the voltage at the load (VB) as the sum of an incident voltage (VL) and a reflected voltage (ρL VL) at time t = To.

When RL = Zo, no voltage is reflected. When RL < Zo, the reflection coefficient at the load is negative; thus, the reflected voltage subtracts from the incident voltage, giving the load voltage. When RL > Zo, the reflection coefficient is positive; thus, the reflected voltage adds to the incident voltage, again giving the load voltage. Note that the reflected voltage at the load has been defined as positive when traveling toward the source. This means that the corresponding current is negative, subtracting from the current driven by the source.

This piecewise analysis is cumbersome and can be tedious. However, it does provide an insight into what is physically happening and demonstrates that a complex problem can be solved by dividing it into a series of simpler problems. Also, eliminating the exponentials, which provide phase information in the classical transmission line equations, simplifies the mathematics.

To use the piecewise method, you must do careful bookkeeping to combine the reflections at the proper time. This is quite straightforward, because a pulse travels with a constant velocity along an ideal or low-loss line, and the time delay between reflected pulses can be predicted. The rules to keep in mind are that at any location and time the voltage or the current is the algebraic sum of the waves traveling in both directions. For example, two voltage waves of the same polarity and equal amplitudes, traveling in opposite directions, at a given location and time add together to yield a voltage of twice the amplitude of one wave. The same reasoning applies to all points of termination and discontinuities on the line. The total voltage or current is the algebraic sum of all the incident and reflected waves. Polarities must be observed. A positive voltage reflection results in a negative current reflection and vice versa.

a via from a signal plane through a ground plane to a second signal plane in a multilayer PCB or module. IC sockets and other connectors can also cause discontinuities.

Step Function Response of the Ideal Line

Before examining reflections at the source due to mismatches between the source and line impedances, consider the behavior of the ideal line with various loads when driven by a step function. The circuit for analysis appears in Figure 3. Figure 4 shows the voltage and current waveforms at point A (line input) and point B (the load) for various loads. (These values are drawn from Reference 1 pg. 158 - 159.) Note that Rs = Zo and that VA at t = 0 equals Vs/2. This means that no impedance mismatch exists between the source and the line; thus, there is no reflection from the source at t = 2To. To is the one-way propagation delay of the line.

The time-domain responses of the reactive loads are obtained by applying a step function to the Laplace transform of the load, then taking the inverse transform. Note that the reflection coefficient at the load is not the total reflection coefficient (a complex number) but represents only the real part of the load.
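As a rough numerical check of Equations 17 through 19, the short Python sketch below evaluates the load reflection coefficient and the resulting load voltage for several resistive terminations. The function names, the 50Ω line, and the 2.5V incident wave are illustrative assumptions, not values prescribed by the handbook.

```python
def reflection_coefficient(r_term, z0):
    """(R - Z0) / (R + Z0), the form shared by Equations 17 and 18."""
    if r_term == float("inf"):
        return 1.0  # open circuit: the limit of the expression as R grows
    return (r_term - z0) / (r_term + z0)


def load_voltage(v_incident, r_load, z0):
    """Equation 19: load voltage = (1 + rho_L) * incident voltage."""
    return (1.0 + reflection_coefficient(r_load, z0)) * v_incident


if __name__ == "__main__":
    z0 = 50.0          # assumed line impedance, ohms
    v_incident = 2.5   # assumed incident wave, volts (Vs/2 for a 5V source with Rs = Z0)
    for r_load in (0.0, 25.0, 50.0, 100.0, float("inf")):
        rho = reflection_coefficient(r_load, z0)
        print(f"RL = {r_load:>6} ohms  rho_L = {rho:+.3f}  "
              f"VB = {load_voltage(v_incident, r_load, z0):.3f} V")
```

A short circuit gives ρL = −1 (load voltage zero), a matched load gives ρL = 0, and an open circuit gives ρL = +1, which doubles the incident voltage at the load.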
The piecewise method eliminates the complex (jωt) terms by performing the bookkeeping involving the phase relationships, which the complex terms account for in classical transmission line analysis.

Note that for the open-circuit condition in Figure 4b, ZL = infinity, so that ρL = +1. The voltage is reflected from the load to the source (at amplitude Vo = Vs/2). Thus, at time t = 2To, the reflected voltage adds to the original voltage, Vo = Vs/2, to give a value of 2Vo = Vs. While the voltage wave is traveling down to and back from the load, a current of

$$I_0 = \frac{V_0}{Z_0} = \frac{V_S}{2 Z_0}$$

Ideal Transmission Line's Pulse Response

Consider next the behavior of the ideal transmission line when driven by a pulse whose width is short compared to the line's electrical length, that is, when the pulse width is less than the line's one-way propagation delay time, To. Figure 6 shows another series of response waveforms for the circuit in Figure 3, this time for a pulse instead of a step (drawn from Reference 1 pg. 160 - 161). Note that Rs = Zo and that VA at t = 0 equals Vs/2. This means that there is no impedance mismatch between the source and the line; thus, there is no reflection from the source at t = 2To.

Finite Rise Time Effects

Now consider the effects of step functions with finite rise times driving the ideal transmission line. During the rise time of a pulse, half the energy in the static electric field is converted into a traveling magnetic field and half remains as a static electric field to charge the line. If the rise time is sufficiently short, the voltage at the load changes in discrete steps. The amplitude of the steps depends on the impedance mismatch, and the width of the steps depends on the line's two-way propagation delay. As the rise time and/or the line gets shorter (smaller To), the result converges to the familiar RC time constant, where C is the static capacitance.
All devices should be treated as transmission lines for transient analysis when an ideal step function is applied. However, as the rise time becomes longer and/or the traces shorter, the transmission line analysis reduces to conventional AC circuit analysis.

exists. This current charges up the distributed line capacitance to the value Vs, then the current stops.

The waveforms at the source and load for the series RC termination shown in Figure 4g are of particular interest because this network dissipates no DC power; you can use this network to terminate a transmission line in its characteristic impedance at the input to a Cypress IC. Figure 4h represents the equivalent circuit of a Cypress IC's input. Combining both networks models a Cypress IC driven by a transmission line terminated in the line's characteristic impedance, when the values of R and C are properly chosen.

Reflections From Small Discontinuities

Figure 7 shows a pulse with a linear rise time and rounded edges driving the transmission line of Figures 5a and 5b. The expressions for Vr are derived on pages 171 and 172 of Reference 1. The reflection caused by the small series inductance is useful for calculating the value of the inductor, L', but little else. The reflection caused by the small shunt capacitor is more interesting. If this capacitor is sufficiently large, it can cause a device connected to the transmission line to see a logic Zero instead of a logic One.

Reflections Due to Discontinuities

Figure 5 illustrates three types of common discontinuities found on transmission lines. Any change in the characteristic impedance of the line due to construction, connectors, loads, etc., causes a discontinuity, which causes a reflection that directs some energy back to the source. The amount of energy reflected back is determined by the discontinuity's reflection coefficient. Because discontinuities are usually small by design, most of the energy is transmitted to the load.
In general, a discontinuity has series inductance, shunt capacitance, and series resistance. An example is

Figure 5. Reflections from Discontinuities with an Applied Step Function: (a) Series Inductance, (b) Shunt Capacitance, (c) Series Resistance

The Effect of Rise Time on Waveforms

Next, consider the ideal line terminated in a resistance less than its characteristic impedance and driven by a step function with a linear rise time. The stimulus, the circuit, and the response appear in Figures 8a, b, and c, respectively. Once again, note that because the source resistance equals the line characteristic impedance, there are no reflections from the source. The resulting waveforms are similar to those of Figure 4c when modified as shown in Figure 8c. The waveform's final value must be the same as before (Figure 4c).

The resultant wave at the line input (Vin) is easily obtained by superposition of the applied wave and the reflected wave at the proper time. In Figure 8, because the step function's rise time is less than the line's two-way propagation delay, the input wave reaches its final value, Vs/2. At t = 2To, the reflected wave arrives back at the source and subtracts from the applied step function (the load reflection coefficient is negative). Figure 9 illustrates waveforms for two relationships between the step function rise time and the propagation delay.

appears on the line and travels toward the load. After a one-way propagation delay time, To, the wave reflects back with an amplitude of ρL Vo. This first reflected wave then travels back to the source, and at time t = 2To, the wave reaches the input end of the line. At this time, the first reflection at the source occurs, and a wave of amplitude ρs (ρL Vo) reflects back to the load.
At time t = 3To, this wave again reflects from the load back to the source with amplitude

$$\rho_L\,\rho_S\,(\rho_L V_0) = \rho_S\,\rho_L^2\,V_0$$

This back and forth reflection process continues until the amplitudes of the reflections become so small that they cannot be observed. Then, the circuit is said to be in a quiescent state.

Multiple Reflections

Now consider the case of an ideal transmission line with multiple reflections caused by improper terminations at both ends of the line. The circuit and waveforms appear in Figure 10. The reflection coefficients at the source and the load are both negative; the source resistance and the load resistance are both less than the line characteristic impedance. When the switch is initially closed, a step function of amplitude

$$V_0 = V_{in} = \frac{V_S Z_0}{R_S + Z_0}$$

Effective Time Constant

Voltage reflections in small increments and of short durations approximate an exponential function, as indicated by the dashed line in Figure 10b. The smaller and narrower the steps become, the more closely the waveform approaches an exponential curve. The mathematical derivation is presented on pages 178 and 179 of Reference 1. The time constant is

$$k = -\frac{2T_0}{1 - \rho_S\,\rho_L} \qquad \text{(Eq. 20)}$$

Thus, the resultant voltage waveform at the load can be approximated by

$$V(t) = V_0\,e^{t/k} \qquad \text{(Eq. 21)}$$

For Equation 21 to be accurate, ρL and ρs must be reasonably large (approaching ±1) so that the incremental steps are small.

Figure 6. Pulse Response of Figure 3 for Various Terminations (VA = Vs/2, Io = Vo/Zo, To = l√(LC), ρL = (RL − Zo)/(RL + Zo))

Consider the case of a short-circuited transmission line driven by a step function with a source impedance unequal to the characteristic line impedance. The general case is shown in Figure 10a. For RL = 0 the reflection coefficients are

$$\rho_S = \frac{Z_S - Z_0}{Z_S + Z_0}, \qquad \rho_L = -1$$
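The back-and-forth bookkeeping described for a line mismatched at both ends can be sketched in a few lines of Python. This is only an illustration under assumed values (a 5V source, Rs = 10Ω, RL = 20Ω, Zo = 50Ω); the function names are hypothetical. It accumulates the load voltage at each arrival of the traveling wave and evaluates the time constant of Equation 20 in units of To.

```python
def load_voltage_steps(vs, rs, rl, z0, n_arrivals):
    """Load voltage after each wave arrival, as (time in units of T0, volts)."""
    rho_s = (rs - z0) / (rs + z0)      # source reflection coefficient
    rho_l = (rl - z0) / (rl + z0)      # load reflection coefficient
    wave = vs * z0 / (rs + z0)         # initial step V0 launched into the line
    v_load = 0.0
    steps = []
    for n in range(n_arrivals):
        v_load += wave * (1.0 + rho_l)     # incident plus reflected wave at the load
        steps.append((2 * n + 1, v_load))  # arrivals occur at T0, 3T0, 5T0, ...
        wave *= rho_l * rho_s              # reflect at the load, then at the source
    return steps


def effective_time_constant(t0, rho_s, rho_l):
    """Equation 20: k = -2*T0 / (1 - rho_s * rho_l)."""
    return -2.0 * t0 / (1.0 - rho_s * rho_l)


if __name__ == "__main__":
    for t_over_t0, volts in load_voltage_steps(5.0, 10.0, 20.0, 50.0, 6):
        print(f"t = {t_over_t0:2d} T0   V_load = {volts:.4f} V")
    print("k / T0 =", effective_time_constant(1.0, -2.0 / 3.0, -3.0 / 7.0))
```

With enough arrivals the accumulated load voltage settles at the DC divider value Vs·RL/(Rs + RL), which is a convenient sanity check on the bookkeeping.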
Because the product ρs ρL is a positive number, less than one, the time constant is a negative number, which indicates that the exponential decreases with time. This is usually the case in transient circuits. Both reflection coefficients must also have the same sign to yield a continually decreasing or increasing waveform. Opposite signs give oscillatory behavior that cannot be represented by an exponential function. The approximate time constant is

$$-k = \frac{2T_0}{1 - \rho_S\,\rho_L} = \frac{2T_0}{1 + \rho_S} = \frac{T_0\,(Z_S + Z_0)}{Z_S} \qquad \text{(Eq. 22)}$$

or

$$-k = T_0 + \frac{T_0 Z_0}{Z_S}$$

Recall that To = l√(LC) (one-way delay) and Zo = √(L/C), where l is the physical length of the line, and L and C are the per-unit-length parameters. Substituting these variables into Equation 22 yields

$$-k = T_0 + \frac{lL}{Z_S}$$

It is necessary to have Zs smaller than Zo. Thus, the reflection coefficients have the same sign to give exponential behavior. Opposite signs give oscillatory behavior. If Zs ≪ Zo, the exponential approximation becomes more accurate.

Figure 7. Reflections From Small Discontinuities with a Finite Rise Time Pulse: (a) Applied Pulse from Generator, (b) Reflections from Small Series Inductor L', (c) Reflections from Small Shunt Capacitance C'

Figure 8. Effect of Rise Time on Response of Mismatched Line with RL < Zo

Figure 9. Effects of Rise Time on Response for RL < Zo: (a) TR = 2To, (b) TR > 2To

From Transmission Line to Circuit Analysis

When a transmission line is terminated in its characteristic impedance, the line behaves like a resistor. It usually does not matter if you use transmission line or circuit analysis, provided that you take the propagation delays into account.
If Zs is very small compared to Zo, then To is negligible compared to lL/Zs, so that Equation 22 reduces to

$$-k = \frac{lL}{Z_S}$$

But lL is the total loop inductance, and Zs is the circuit's total series impedance. The time constant is then

$$-k = \frac{L'}{R_S}$$

This is the same time constant you would obtain by a circuit analysis approach if you considered the line a series combination of L' and Rs. By open-circuiting the line and performing a similar analysis, it can be shown that an RC time constant results.

Figure 10. Step Function Applied to Line Mismatched on Both Ends; Shown for Negative Values of ρs and ρL

Types of Transmission Lines

The types of transmission lines include

Coaxial cable
Twisted pair
Wire over ground
Microstrip lines
Strip lines

For a wire over ground, the characteristic impedance is approximately 120Ω. This value can vary as much as ±40 percent, depending upon the distance from the ground plane, the proximity of other wires, and the configuration of the ground.

Coaxial Cable

Coaxial cable offers many advantages for distributing high-frequency signals. The well-defined and uniform characteristic impedance permits easy matching. The cable's ground shield reduces crosstalk, and the low attenuation at high frequencies makes the cable ideal for transmitting the fast rise- and fall-time signals generated by Cypress CMOS ICs. However, because of its high cost, coaxial cable is usually restricted to applications that permit no alternatives. These applications usually involve clock distribution systems on PCBs or backplanes. Because coaxial cable is not easily handled by automated assembly techniques, its application requires human assemblers. This requirement further increases costs.

Coaxial cables have characteristic impedances of 50, 75, 93, or 150Ω. These values are the most common, although special cables can be made with other impedances. Coaxial cable's propagation delay is very low.
You can compute it using the formula

$$T_{pd} = 1.017\sqrt{\varepsilon_r} \ \text{(ns/ft)} \qquad \text{(Eq. 23)}$$

where εr is the relative dielectric constant and depends upon the dielectric material used. For solid Teflon and polyethylene, the dielectric constant is 2.3, and the propagation delay is 1.54 ns per foot. For maximum propagation velocity, you can use coaxial cables with dielectric Styrofoam or polystyrene beads in air. Many of these cables have high characteristic impedances and are slowed considerably when capacitively loaded.

Twisted Pair

You can make twisted pairs from standard wire (AWG 24 - 28), twisted about 30 turns per foot. The typical characteristic impedance is 110Ω. Because the propagation delay is directly proportional to the characteristic impedance (Equation 5), the propagation delay is approximately twice that of coaxial cable. Twisted pairs are used for backplane wiring, sometimes for driving differential receivers, and for breadboarding.

Microstrip Lines

A microstrip line (Figure 12) is a strip conductor (signal line) on a PCB separated from a ground plane by a dielectric. If the line's thickness, width, and distance from the ground plane are controlled, the line's characteristic impedance can be predicted with a tolerance of ±5 percent.

The formula given in Figure 12 has proven to be very accurate for width-to-height ratios between 0.1:1 and 3.0:1 and for dielectric constants between 1 and 15. The inductance per foot for microstrip lines is

$$L = Z_0^2\,C_0 \qquad \text{(Eq. 24)}$$

where Zo is the characteristic impedance and Co is the capacitance per foot. The propagation delay of a microstrip line is

$$T_{pd} = 1.017\sqrt{0.45\,\varepsilon_r + 0.67} \ \text{(ns/ft)} \qquad \text{(Eq. 25)}$$

Note that the propagation delay depends only upon the dielectric constant and is not a function of the line width or spacing. For G-10 fiberglass epoxy PCBs (dielectric constant of 5), the propagation delay is 1.74 ns per foot.

Strip Line

A strip line consists of a copper strip centered in a dielectric between two conducting planes (Figure 13).
If the line's thickness, width, dielectric constant, and distance between ground planes are all controlled, the tolerance of the characteristic impedance is within ±5 percent. The equation given in Figure 13 is accurate for W/(b − t) < 0.35 and t/b < 0.25. The inductance per foot is given by the formula

$$L = Z_0^2\,C_0$$

The propagation delay of the line is given by the formula

$$T_{pd} = 1.017\sqrt{\varepsilon_r} \ \text{(ns/ft)} \qquad \text{(Eq. 26)}$$

For G-10 fiberglass epoxy boards, the propagation delay is 2.27 ns per foot. The propagation delay is not a function of line width or spacing.

Wire Over Ground

Figure 11 shows a wire over ground. This configuration is used for breadboarding and backplane wiring.

Figure 11. Wire Over Ground

$$Z_0 = \frac{60}{\sqrt{\varepsilon_r}} \ln\left(\frac{4h}{d}\right)$$

Figure 12. Microstrip Line

$$Z_0 = \frac{87}{\sqrt{\varepsilon_r + 1.41}} \ln\left(\frac{5.98h}{0.8w + t}\right)$$

Line Termination Strategies

There are two general strategies for transmission line termination:

1. Match the load impedance to the line impedance
2. Match the source impedance to the line impedance

In other words, if either the load reflection coefficient or the source reflection coefficient can be made to equal zero, reflections are eliminated. From a systems design viewpoint, strategy 1 is preferred. Eliminating the reflection at the load (i.e., dissipating the excess energy) before the energy travels back to the source causes less noise, electromagnetic interference (EMI), and radio frequency interference (RFI).

Modern PCBs

Most PCBs employ microstrip, stripline, or some combination of the two. Microstrip construction on a double-sided board with power and ground nets can suffice for low- to medium-performance, low-density PCBs. For high-performance, high-density PCBs, stripline construction is preferred. Power planes isolate signal layers from each other and provide higher-quality power and grounds than those of a two-layer board. Manufacturing quality control assures that
the metalization is of uniform thickness and that the layers are properly laminated, thus ensuring uniform, predictable electrical characteristics.

Multiple Loads, Buses, and Nodes

In the case where multiple loads are connected to a transmission line, only one termination circuit is required. The termination should be located at the load that is electrically the greatest distance from the source. This is usually the load that is the greatest physical distance from the source. A point-to-point or daisy chain connection of loads is preferred.

Bidirectional buses should be terminated at each end with a circuit whose impedance equals the intrinsic, characteristic line impedance. The reason is that each transmitting device sees the characteristic impedance of the line when the device is transmitting.

Consider next a line that has three bidirectional nodes: one on each end and one in the middle. The middle node, when driving the line, sees an impedance equal to Zo/2, because the node is looking into two lines in parallel with each other. The end nodes, however, see an impedance of Zo. In this case, as in a backplane, each end of the line should be terminated in an impedance equal to Zo/2.

When to Terminate Transmission Lines

Transmission lines should be terminated when they are long. From the preceding analysis, it should be apparent that

$$\text{Long Line} > \frac{T_r}{2\,T_{pdL}}$$

where TpdL is the loaded propagation delay of the line per unit length. For Cypress CMOS and BiCMOS products, the rise time, Tr, is typically 2 ns. For stripline construction (multilayer PCBs), the line length at which voltage reflections occur has been shown to vary from 4.73 inches for a 10-pF load to 3.05 inches for an 80-pF load (see Equation 13 and Table 1).

Not all lines exceeding these lengths need to be terminated. Terminations are usually required on control lines (such as clock inputs, write and read strobe lines on SRAMs and FIFOs) and chip select or output-enable lines on RAMs, PROMs, and PLDs.
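The long-line criterion above lends itself to a quick calculation. The sketch below is a minimal illustration with hypothetical function names; the sample inputs (an 8-inch trace, a 2 ns rise time, and a loaded delay of 3.46 ns/ft) are borrowed from the handbook's unterminated line example and are not the conditions used to generate Table 1.

```python
def critical_length_inches(rise_time_ns, tpd_loaded_ns_per_ft):
    """Length in inches above which a line is 'long': Tr / (2 * TpdL)."""
    return 12.0 * rise_time_ns / (2.0 * tpd_loaded_ns_per_ft)


def is_long_line(length_inches, rise_time_ns, tpd_loaded_ns_per_ft):
    """True when the trace exceeds the critical length and may need termination."""
    return length_inches > critical_length_inches(rise_time_ns, tpd_loaded_ns_per_ft)


if __name__ == "__main__":
    limit = critical_length_inches(2.0, 3.46)
    print(f"critical length = {limit:.2f} in")
    print("8-inch trace is long:", is_long_line(8.0, 2.0, 3.46))
```

Whether a long line actually needs a termination still depends on the signal type, as the text notes for control lines versus address and data lines.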
Address lines and data lines on RAMs and PROMs usually have time to settle because they are normally not the highest-frequency lines in a system. However, if very heavily loaded, address and data bus lines might require terminations.

Types of Terminations

There are three basic types of terminations: series damping, pull-up/pull-down, and parallel AC terminations. Each has its advantages and disadvantages. Except for series damping, the termination network should be attached to the input (load) that is electrically the greatest distance from the source. Component leads should be as short as possible to prevent reflections due to lead inductance.

Figure 13. Strip Line Construction

$$Z_0 = \frac{60}{\sqrt{\varepsilon_r}} \ln\left(\frac{4b}{0.67\pi w\left(0.8 + \frac{t}{w}\right)}\right)$$

Series Damping

Series damping is accomplished by inserting a small resistor (typically 10 to 75Ω) in series with the transmission line, as close to the source as possible (Figure 14). Series damping is a special case of damping in which the series resistor value plus the circuit output impedance equals the transmission line impedance. The strategy is to prevent the wave reflected back from the load from reflecting back from the source. This is done by making the source reflection coefficient equal to zero.

The channel resistance (On resistance) of the pull-down device for Cypress ICs is 10 to 20Ω, depending upon the current-sinking requirements. Thus, subtract this value from the series damping resistor, Rd.

$$Z_0 = R_S + R_d \qquad \text{(Eq. 27)}$$

Figure 14. Series Damping Termination

A disadvantage of the series damping technique, as illustrated in Figure 15, is that during the two-way propagation delay time of the signal edges, the voltage at the input to the line is halfway between the logic levels, due to the voltage divider action of Rd. The "half voltage" propagates down the line to the load and then back from the load to the source. This means that no inputs can be attached along the line, because they would respond incorrectly during this time. However, you can attach any number of devices to the load end of the line because all the reflections are absorbed at the source. If two or more transmission lines must be driven in parallel, the value of the series damping resistor does not change.

The advantages of series termination are

Requires only one resistor per line
Consumes little power
Permits incident wave switching at the load after a To propagation delay
Provides current limiting when driving highly capacitive loads; the current limiting also helps avoid ground bounce

The disadvantages of series termination are

Degrades rise time at the load due to increased RC time constant
Should not be used with distributed loads

The low input current required by Cypress CMOS ICs results in essentially no DC power dissipation. The only AC power required is to charge and discharge the parasitic capacitances.

Pull-Up/Pull-Down Termination

The pull-up/pull-down resistor termination shown in Figure 16 is included for historical reasons and for the sake of completeness. For TTL driving long cables, such as ribbon cables, the values R1 = 220Ω and R2 = 330Ω are recommended by several bus interface standards. If the cable is disconnected, the voltage at point B is 3V, which is well above the 2V minimum High TTL specification. Because most control signals are active Low, a disconnected cable results in the unasserted state.

The maximum value of R1 is determined by the maximum acceptable signal rise time, which is a function of the charging RC time constant. The minimum value of R1 is determined by the amount of current the driver can sink. The value of R2 is chosen such that a logic High is maintained when the cable is disconnected. The equivalent Thevenin resistance is

$$R_T = \frac{R_1 R_2}{R_1 + R_2}$$

The value of R1 and R2 in parallel is slightly less than the cable's characteristic impedance. Ribbon cables with characteristic impedances of 150Ω are typical. If both resistors are used, DC power is dissipated all the time. If only a pull-down resistor (R2) is used,

Figure 16. Pullup/Pulldown

Figure 17. Parallel AC Termination

Commercially Available RC Networks

A variety of combinations of R and C values are available as series RC networks in SIP packages from at least two sources. Bourns calls these networks the Series 701 and 702 RC Termination Networks. You can obtain data sheets by calling the factory in Logan, Utah (801-750-7200) or a local sales office. Thin Film Technology also refers to the networks as RC Termination Networks. You can obtain data sheets by calling the factory in North Mankato, Minnesota at 507-635-8445.

R1 > 50Ω and 20Ω > R2 > 10Ω, depending upon speed and output current-sinking requirements.

Positive Step Function Response

The initial voltage on the capacitor is zero. At t = 0, the switch is moved from position 2 to position 1. At t = 0+, the capacitor appears as a short circuit, and the voltage V is applied through R1 to charge the load (R3C). The voltage across the capacitor, Vc(t), is

$$V_c(t) = V\left(1 - e^{-t/[(R_1 + R_3)C]}\right) \qquad \text{(Eq. 28)}$$

In theory, the voltage across the capacitor reaches V when t equals infinity. In practice, the voltage reaches 98 percent of V after 3.9 RC time constants. You can verify this by setting Vc(t)/V = 0.98 in Equation 28 and solving for t.

Negative Step Function Response

The capacitor is charged to approximately V. At t = 0, the switch is moved from position 1 to position 2, and the capacitor is discharged. The voltage across the capacitor, Vc(t), is

$$V_c(t) = V\,e^{-t/[(R_2 + R_3)C]} \qquad \text{(Eq. 29)}$$

The voltage decays to 2 percent of its original value in 3.9 RC time constants. You can verify this by setting Vc(t)/V = 0.02 in Equation 29 and solving for t.

The Ideal Case

Consider the ideal case, where R1 = R2 = 0. Let R3 = R in Equations 28 and 29.
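The settling figures quoted for Equations 28 and 29 can be reproduced with a short calculation. The sketch below is a hedged illustration: the function name is hypothetical, R stands for the total series resistance in the exponent, and the 47Ω and 5 ns inputs are simply the values the handbook uses for its ideal PCB case.

```python
import math


def time_to_fraction(fraction, r_ohms, c_farads):
    """Time for V(1 - exp(-t/RC)) to reach `fraction` of its final value V."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -r_ohms * c_farads * math.log(1.0 - fraction)


if __name__ == "__main__":
    # With R = C = 1 the result is expressed directly in RC time constants.
    settle = time_to_fraction(0.98, 1.0, 1.0)
    rise = time_to_fraction(0.9, 1.0, 1.0) - time_to_fraction(0.1, 1.0, 1.0)
    print(f"98 percent reached after {settle:.2f} RC")      # about 3.9 RC
    print(f"10-to-90 percent rise time = {rise:.2f} RC")    # about 2.2 RC

    # Largest C that keeps the 10-90 rise time within 5 ns for R = 47 ohms,
    # using the handbook's rounded factor of 2.2.
    c_max_pf = 5e-9 / (2.2 * 47.0) * 1e12
    print(f"C(max) = {c_max_pf:.1f} pF")
```

Because the discharge curve of Equation 29 is the mirror image of the charge curve, the same 3.9 RC figure applies to decaying to 2 percent of the starting voltage.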
If a positive pulse of width T is applied to the modified circuit of Figure 18, the pulse disappears if 4RC > T. Because the discharging time constant is the same as the charging time constant for the ideal case, a negative-going pulse of width T also disappears if 4RC > T. That is, if the applied signal is normally High and goes Low, as does the write strobe on an SRAM, the termination filters out all negative glitches less than 4 RC time constants in width. The maximum frequency that the circuit passes is 1 F(max.) = 2T Eq. 30 '\ 2 12 Source This is true because the charging and discharging time constants are equal for the ideal case. Capacitance for the Ideal Case = V ( 1- T The Real World To go from the ideal to the real world, calculate the values of Rl and R2 from the curves on the data sheet of the device driving the line. Rl is the slope of the output source current vs. output voltage between 2 and 4V. R2 is the slope of the output sink current vs output voltage between 0 and 0.8V. Add the value of Rl to 470 and calculate C, using Equation 32. Then check to see that the RC charging time constant does not violate some minimum positive pulse-width specification for the line. If so, reduce C. Add the value of R2 to 470 and calculate C. Then check to see if the discharging RC time constant violates some minimum pulse-width specification for the line. If so, reduce C. V For For V(t) _ V - V(t) _ V 0.1, - 0.9, t= t= Eq.32 For T = 5 ns, Table 2 can be constructed. This table indicates that 500 transmission lines on PCBs that are terminated with RC networks should use a 47Q resistor and a capacitor of 48 pF max; 47· pF is a standard value. This network eliminates glitches of 9 ns or less. The table's second column applies to wirewrapping construction, which is not recommended for systems operating at frequencies over 10 MHz. An exception is if the system consists of less than six MSI or SSI ICs. 
Eq.31 1- V(t) Load C = 2.2R J~~] ) RCln[ _ _l_] I:' The time for the signaf to transition from 10 to 90 percent of its final value is then T = 2.2 RC. Solving for C yields for t yields t= I V(t) Figure 18. Lumped Load; AC Termination The value of the capacitor, C, must be chosen to satisfy two conflicting requirements. First, the capacitor should be large enough to either absorb or supply the energy contained or removed when positive-going or negative-going glitches OCClJr. Second, the capacitor should be small enough to avoid either delaying the signal beyond some design limit or slowing the signal rise and fall times to more than 5 ns. A third consideration is the impedance caused by the capacitor's capacitive reactance, Xc. The digital waveforms applied to the AC termination can be expressed as a Fourier Series, so that they can be manipulated mathematically. However, because these signals are not periodic in the classical meaning of the word, it is not clear that the AC steady-state analysis model of Xc applies here. In most applications, the degradation of the signal's rise and fall times beyond 5 ns determines the maximum value of the capacitor. The procedure is to calculate the rise time between the 10- and 90-percent amplitude levels, equate this rise time to 5 ns, and solve for C in terms ofR: V(t) Rl 0.10RC. Schottky Diode Termination In some cases it can be expedient to use Schottky diodes or fast-switching silicon diodes to terminate 2.3RC. 1-17 Table 2. Termination Values for an Ideal Case PCB Wirewrapped Zo(O) 50 120 R (0) 47 110 C (max., pF) 48 20 RC (ns) 2.25 2.2 4RC (ns) 9 8.8 Vee Figure 19. Schottky Diode Termination lines. The diode switching time must be at least as fast as the signal rise time. Where line impedances are not well dermed, as in breadboards and backplanes, the use of diode terminations is convenient and can save time. A typical diode termination appears in Figure 19. 
The Schottky diode's low forward voltage, Vf (typically 0.3 to 0.45V), clamps the input signal to Vf below ground (lower diode) and to Vcc + Vf (upper diode). This significantly reduces signal undershoot and overshoot. Some applications might not require both diodes.

The advantages of diode terminations are:

    • Impedance-matched lines are not required.
    • The diodes replace terminating resistors or RC terminations.
    • The diodes' clamping action reduces overshoot and undershoot.
    • Although diodes cost more than resistors, the total cost of layout might be less because a precise, controlled transmission-line environment is not required.
    • If ringing is discovered to be a problem during system debug, the diodes can be easily added.

As with resistor or RC terminations, the leads should be as short as possible to avoid ringing due to lead inductance.

A few of the types of Schottky diodes commercially available are

    • 1N4148 (switching diode)
    • 1N5711
    • MBD101, MBD102 (Motorola)
    • SN74S1050/52/56 (TI, single-diode arrays)
    • SN74S1051/53 (TI, double-diode arrays)

Unterminated Line Example

The following example illustrates the procedure for calculating the waveforms when a Cypress PLD generates the write strobe for four Cypress FIFOs. The PLD is a PAL C 16L8 device and the FIFOs are CY7C429s. The equivalent circuit appears in Figure 20 and the unmodified driving waveform in Figure 21. The rise and fall times are 2 ns. The length of the stripline trace on the PCB is 8 inches and the intrinsic characteristic line impedance is 50Ω. The voltage waveforms at the source (point A) and the load (point B) must be calculated as functions of time. Stripline construction is used for this example because in most modern high-performance digital systems, the PCBs have multiple layers.

[Figure 20. Equivalent Circuit for Cypress PAL Driving]

[Figure 21. VA(t), Unmodified]

The equivalent on-channel resistance of the PLD pull-up device, 62Ω, is calculated using the output source current versus voltage graph, over the region of interest (2 to 4V), from the PAL C 20 series data sheet. The equivalent resistance of the pull-down device, 11Ω, is calculated in a similar manner, using the output sink current versus output voltage graph, over the region of interest (0.4 to 2V), also on the data sheet.

The equivalent input circuit for the FIFO is constructed by approximating the input and stray capacitance with a 10-pF capacitor and the input resistance with a 5-MΩ resistor. The input leakage current for all Cypress products is specified as a maximum of ±10 µA, which guarantees a minimum of 500 kΩ at Vin = 5V. Typical leakage current is 10 pA.

Because the PLD is driving four FIFOs in parallel, the equivalent lumped capacitance is 4 × 10 pF = 40 pF, and the equivalent lumped resistance is 5,000,000/4 = 1.25 MΩ.

The next step is to calculate the propagation delay and the loaded characteristic impedance of the line. The unloaded propagation delay of the line is calculated using Equation 26 with a dielectric constant of 5:

    $T_{pd} = 2.27$ ns/ft

To calculate the loaded line propagation delay, the intrinsic capacitance must first be calculated using Equation 5:

    $T_{pd} = Z_o C_o$

where Zo is the intrinsic characteristic impedance, and Co is the intrinsic capacitance per unit length.

    $C_o = \frac{T_{pd}}{Z_o} = \frac{2.27\ \text{ns/ft}}{50\ \Omega} = 45.4$ pF/ft

Because the line is loaded with 40 pF, Equation 6 is used to compute the loaded propagation delay of the line:

    $T_{pdL} = T_{pd}\sqrt{1 + \frac{C_D}{C_o}}$

    $T_{pdL} = 2.27\ \text{ns/ft} \times \sqrt{1 + \frac{40\ \text{pF}}{45.4\ \text{pF/ft} \times \frac{8\ \text{in.}}{12\ \text{in./ft}}}} = 3.46$ ns/ft

Note that the capacitance per unit length must be multiplied by the line length to arrive at an equivalent lumped capacitance.

The intrinsic line impedance is reduced by the same factor by which the propagation delay is increased (1.524; see Equation 7):

    $Z_o' = \frac{50}{1.524} = 32.8\ \Omega$

Initial Conditions

At time t = 0, the circuit shown in Figure 20 is in a quiescent state. The voltage at points A and B must be the same. By inspection:

    $V_A = V_B = (V_{CC} - V_T)\left(\frac{R_L}{R_S + R_L}\right) = (5 - 1)\left(\frac{1.25 \times 10^6}{62 + 1.25 \times 10^6}\right) = 4$V

At t = 0, the driving waveform changes from 4V to approximately 0V with a fall time of 2 ns. This is shown in Figure 20 by the switch arm moving from position 1 to position 2. The wave propagates to the load at the rate of 3.46 ns per foot and arrives there

    $T_o = 3.46\ \text{ns/ft} \times \frac{8\ \text{in.}}{12\ \text{in./ft}} = 2.3$ ns

later, as illustrated in Figure 22b. Because the reflection coefficient at the load is ρL = 1, a nearly equal and opposite polarity waveform is propagated back to the source from the load. The reflection arrives at t = 2To = 4.6 ns (Figure 22a). Note that the fall time is preserved. The reflection coefficient at the source is:

    $\rho_S = \frac{R_S - Z_o'}{R_S + Z_o'} = \frac{11 - 32.8}{11 + 32.8} = -0.498$

To simplify the calculations that follow, consider -0.5 to be the Low-level source reflection coefficient.

[Figure 22(a). Unterminated Line Example; VA(t)]

[Figure 22(b). Unterminated Line Example; VB(t)]

The magnitude of the reflected voltage at the source is then

    $V_{S1} = -4\text{V} \times (-0.5) = 2$V

This wave propagates from the source to the load and arrives at t = 3To. The wave adds to the 0V signal. The rise time is preserved, and thus the time required for the signal to go from 0 to 2V is

    $t_r = \frac{2\text{V} \times 2\ \text{ns}}{4\text{V}} = 1$ ns

The signal at the load thus reaches the 2V level at time t = 3To + 1 ns = 7.9 ns and remains at that level until the next reflection occurs at t = 5To.

The wave that arrives at the load at 3To reflects back to the source and arrives at t = 4To = 9.2 ns. The 2V level adds to the -4V level, for a total of -2V. The rise time is preserved, so that this level is reached at t = 4To + 1 ns = 10.2 ns and maintained until the next reflection occurs at t = 6To.

The 2V wave that arrives at the source at t = 4To reflects back to the load and arrives at t = 5To. The portion that is reflected back to the load is

    $V_{S2} = 2 \times (-0.5) = -1$V

This value subtracts from the 2V level to give 2 - 1 = 1V. Because the fall time is preserved, the time required for the signal to go from 2 to 1V is

    $t_f = \frac{1\text{V} \times 2\ \text{ns}}{4\text{V}} = 0.5$ ns

The 1V level is thus reached at time t = 5To + 0.5 ns = 12 ns.

At t = 6To, the 1V wave arrives back at the source, where it subtracts from the -2V level to give -1V. The rise time is tr = 1V × 0.5 ns/V = 0.5 ns. The signal at the source reaches the -1V level at t = 6To + 0.5 ns = 14.3 ns.

The 1V wave that arrives at the source at t = 6To is reflected back to the load and arrives at t = 7To. The portion that is reflected back is

    $V_{S3} = 1 \times (-0.5) = -0.5$V

This value subtracts from the 1V level to give 0.5V. The fall time is 0.25 ns. The 0.5V level remains until the next reflection reaches the load at t = 9To.

At t = 8To the 0.5V wave that reflects from the load at t = 7To arrives back at the source, where it subtracts from the -1V level to give -0.5V. The rise time is 0.25 ns. The portion that reflects back to the load is

    $V_{S4} = 0.5 \times (-0.5) = -0.25$V

The -0.25V signal arrives at the load at t = 9To = 20.7 ns and subtracts from the 0.5V signal to give 0.25V. This process continues until the voltages at points A and B decay to approximately 0V.

Observations

The positive reflection coefficient at the load and the negative reflection coefficient at the source result in an oscillatory behavior that eventually decays to acceptable levels. The voltage at point A reaches -1V after 6To delays and the voltage at point B reaches 0.5V after 7To delays.

The reflection at the load that causes the voltage to equal the TTL minimum One level (2V) at t = 3To causes a problem. The actual input voltage threshold level is 1.5V for TTL-compatible devices that do not exhibit hysteresis.

The voltage at the load falls from 4 to 0V in 2 ns, beginning at t = To. Because To = 2.3 ns, the voltage reaches zero at 2.3 ns + 2 ns = 4.3 ns. The 1.5V level occurs at

    $4.3\ \text{ns} - \frac{2\ \text{ns}}{4\text{V}} \times 1.5\text{V} = 3.55$ ns

The rising edge begins at t = 3To = 6.9 ns. The 1.5V level occurs at

    $6.9\ \text{ns} + \frac{2\ \text{ns}}{4\text{V}} \times 1.5\text{V} = 7.65$ ns

The time difference (7.65 - 3.55 = 4.1 ns) is long enough for the FIFO to interpret the signal as a Low.

Next, consider the width of the positive pulse that begins at the load at t = 3To. Because the rise time is preserved, the signal takes 1 ns to reach 2V, or 0.75 ns to reach 1.5V. The signal begins to fall at t = 5To, reaching 1.5V at

    t = 5To + 0.25 ns = 11.75 ns

The difference (11.75 - 7.65) is 4.1 ns, which is wide enough for the FIFO to interpret as a second clock. To eliminate this pulse, the line must be terminated.

Strobe Shortening Considerations

In this example the width of the negative strobe is 22 to 24 ns. If a CY7C429-20 FIFO is used, the write (or read) strobe must not be shorter than 20 ns. Even if the FIFO does not recognize the 4.1-ns negative pulse, the shortening of the write strobe by 5To = 11.5 ns is sufficient to violate the minimum negative-pulse-width specification.

This strobe-shortening phenomenon might also occur on other active-Low control lines such as output enables and chip selects. Clock lines must also be analyzed for this problem; in general, these lines should be terminated.

The Rising Edge of the Write Strobe

Now consider an analysis of the write strobe's rising edge to assure that the reflections associated with this edge do not cause multiple clocks or false triggering of the FIFO. At t = 22 ns, the rising edge of the write strobe begins, which is the equivalent of closing the switch in Figure 20 in the 1 position. For this analysis, it is convenient to start the time scale over at zero, as appears in Figures 22a and 22b.

If the forcing function were a step function, the equations of Figure 4h would apply. The time constant in the equation is

    $T = \frac{R Z_o' C_e}{R + Z_o'}$    Eq. 33

Because R >> Zo', T = Zo'Ce, where Zo' = 32.8Ω and Ce = 45.4 pF. This is the equivalent of saying that you can ignore the 1.25-MΩ device input resistance for transient circuit analysis. Substituting Zo' and Ce into the preceding equation yields a time constant of T = 1.489 ns.

Writing the equation for the voltages for the circuit of Figure 20, with V(t) as the forcing voltage, yields

    $V(t) = i Z_o' + \frac{1}{C_e}\int_0^t i\,dt$    Eq. 34

Also,

    $V(t) = K t\,U(t) - K(t - T_1)\,U(t - T_1)$    Eq. 35

where Kt is the rising edge of the write strobe (K = 2V/ns) applied at t = 0 using a unit step function, U(t); and -K(t - T1) represents an equal but opposite waveform applied at t = T1 (after the rise time) using a unit step function, U(t - T1).

Equating the expressions and taking the Laplace transforms of both sides yields

    $\frac{K}{s^2} - \frac{K e^{-T_1 s}}{s^2} = Z_o' I(s) + \frac{I(s)}{C_e s} = \left(Z_o' + \frac{1}{C_e s}\right) I(s)$    Eq. 36

The voltage at the load is

    $V_B(t) = \frac{1}{C_e}\int_0^t i\,dt$, or $V_B(s) = \frac{I(s)}{C_e s}$

Therefore,

    $\frac{K}{s^2} - \frac{K e^{-T_1 s}}{s^2} = \left(Z_o' + \frac{1}{C_e s}\right) C_e s\,V_B(s)$    Eq. 37

Solving for VB(s) yields

    $V_B(s) = \frac{\frac{K}{s^2}\left(1 - e^{-T_1 s}\right)}{C_e s\left(Z_o' + \frac{1}{C_e s}\right)}$    Eq. 38

which is equivalent to

    $V_B(s) = \frac{K\left(1 - e^{-T_1 s}\right)}{Z_o' C_e\, s^2\left(s + \frac{1}{Z_o' C_e}\right)}$    Eq. 39

Taking the inverse Laplace transform yields

    $V_B(t) = \left[K Z_o' C_e\left(e^{-t/Z_o' C_e} - 1\right) + K t\right] U(t) - \left[K Z_o' C_e\left(e^{-(t - T_1)/Z_o' C_e} - 1\right) + K(t - T_1)\right] U(t - T_1)$    Eq. 40

The first term in Equation 40 applies from time zero up to and including T1, and the second term applies after T1:

    $V_B(t) = K Z_o' C_e\left(e^{-t/Z_o' C_e} - 1\right) + K t$  for $t \le T_1$    Eq. 41

    $V_B(t) = K Z_o' C_e\left(1 - e^{T_1/Z_o' C_e}\right) e^{-t/Z_o' C_e} + K T_1$  for $t > T_1$    Eq. 42

where KT1 is the final value, which is 4V. Substituting the correct values for t yields

    $V_B(t = T_1 = 2\ \text{ns}) = \frac{2 \times 32.8 \times 45.4 \times 10^{-12}}{2 \times 10^{-9}}\left(e^{-1.489} - 1\right) + 2\ \text{V/ns} \times 2\ \text{ns} = -1.15 + 4 = 2.85$V

If the forcing function is a step function, the equation is

    $V_B(t) = 4\left(1 - e^{-t/Z_o' C_e}\right)$    Eq. 43

At t = 2 ns, VB = 3V, which is more than the 2.85V calculated using Equation 41.

At t = 22 ns + To, the voltage waveform begins to build up at the load and continues to build until the first reflection from the source occurs at t = 3To. However, Equation 42 is used to calculate the voltage at the load at t = 2To, because 1 To is used for propagation delay time:

    $V_B(t = 2T_o) = -\frac{2 \times 32.8 \times 45.4 \times 10^{-12}}{2 \times 10^{-9}}\left(1 - e^{-1.489}\right)\left(e^{-2}\right) + 4 = -1.489\,(0.774)(0.1353) + 4 = -0.156 + 4 = 3.84$V

The voltage at the load remains at this value until the first reflection from the source reaches the load at t = 3To.

Meanwhile, at t = To, the wave at the load reflects back to the source and arrives at t = 2To. The wave subtracts from the 4V level at the source, as illustrated in Figure 6c. The amplitude of the droop is given by

    $V_r = \frac{C' Z_o' V_o}{2 T_r}$    Eq. 44

for Rs = Zo'. If Rs does not equal Zo', Equation 44 must be modified. Instead of Vo/2, the voltage is

    $V_o\left(\frac{R_S}{R_S + Z_o'}\right)$

so that Equation 44 becomes

    $V_r = \frac{C' Z_o' V_o}{T_r}\left(\frac{R_S}{R_S + Z_o'}\right)$    Eq. 45

where C' = 40 pF, Zo' = 32.8Ω, Rs = 62Ω, Tr = 2 ns, and Vo = 4V. Substituting these values into Equation 45 yields

    $V_r = 1.716$V

Because 4V - 1.716V = 2.284V, the voltage does not drop below the minimum TTL VIH level of 2V, but the voltage does come close.

The reflection coefficient at the source is

    $\rho_S = \frac{R_S - Z_o'}{R_S + Z_o'}$

where Rs = 62Ω and Zo' = 32.8Ω, so that ρs = 0.308. The amount of voltage reflected from the source back to the load is then

    $V_{S1} = 1.716 \times 0.308 = 0.53$V

The 40-pF capacitor slows the rise of the waveform at the load. The reflection at the source caused by the load capacitor is insufficient to reduce the 4V level to less than the TTL One level (2V). The reflection coefficient at the source is small enough so that the energy reflected back to the load is insufficient to cause a problem.

References

1. Matick, Richard E. Transmission Lines for Digital and Communications Networks. McGraw-Hill, 1969.
2. Blood, Jr., William R. MECL System Design Handbook. Motorola Inc., 1983.

Power Characteristics of Cypress Products

This application note presents and analyzes the power dissipation characteristics of Cypress products. The knowledge and tools presented here will help you manage power when using Cypress CMOS products.

Design Philosophy

The design philosophy for all Cypress products is to achieve superior performance at reasonable power dissipation levels. The CMOS technology, circuit design techniques, architecture, and topology are carefully combined to optimize the speed/power ratio.

Power Dissipation Sources

Power is dissipated both inside and outside ICs. The internal and external power have a quiescent (or DC) component and a frequency-dependent component. The relative magnitudes of each depend upon the circuit design objectives. In circuits designed to minimize power dissipation at low to moderate performance, the frequency-dependent component is significantly greater than the DC component. In the high-performance circuits designed and manufactured by Cypress, the frequency-dependent power component is much lower than the DC component. This is because a large percentage of the internal power is dissipated in linear circuits such as sense amplifiers, bias generators, and voltage/current references, which are required for high performance.

Frequency-Dependent Power

CMOS circuits inherently dissipate significantly less power than either bipolar or NMOS circuits. The ideal CMOS circuit has no direct current path between Vcc and Vss. In circuits using other technologies, such paths exist, and DC power is dissipated while the device is in a static state.

The principal component of power dissipation in a power-optimized CMOS circuit is the transient power required to charge and discharge the capacitances associated with the inputs, outputs, and internal nodes. This component is commonly called CV² power and is directly proportional to the operating frequency, f.

The charge, Q, stored in a capacitor, C, that is charged to a voltage, V, is given by the equation:

    $Q = CV$    Eq. 1

Dividing both sides of Equation 1 by the time required to charge and discharge the capacitor (one period, or T) yields:

    $\frac{Q}{T} = \frac{CV}{T}$    Eq. 2

By definition, current (I) is the charge per unit time and

    $f = \frac{1}{T}$

Therefore,

    $I = CVf$    Eq. 3

The power (P = VI) required to charge and discharge the capacitor is obtained by multiplying both sides of Equation 3 by V:

    $P = VI = CV^2 f$    Eq. 4

It is standard practice to assume that the capacitor is charged to the supply voltage (Vcc), so that

    $P = V_{CC} I = C V_{CC}^2 f$    Eq. 5

The total power consumption for CMOS systems depends upon the operating frequency, the number of inputs and outputs, the total load capacitance, the internal equivalent (device) capacitance, and the static (quiescent) or standby power consumption. In equation form:

    $P_d = \left[C_{INT} F_{INT} + C_{load} F_{load}\right] V_{CC}^2 + I_{CC} V_{CC}$    Eq. 6

The first four quantities are frequency dependent, and the last is not. This same equation can be used to describe the power dissipation of every IC in the system. The total power dissipation is then the algebraic sum of the individual components.

The relative magnitudes of the various terms in the equation are device dependent. Note that Equation 6 must be modified if all of the internal nodes or all of the outputs are not switching at the same frequency.

Transient Power

Cypress devices incorporate N-well CMOS inverters that can affect the devices' transient power consumption. In an ideal N-well CMOS inverter, the P-channel pull-up transistor and the N-channel pull-down transistor (which are in series with each other between Vcc and Vss) are never on at the same time. Thus, there is no direct current path between Vcc and ground, and the quiescent power is very nearly zero. In the real world, when the input signal makes the transition through the linear region (i.e., between logic levels), both the N-channel and P-channel transistors are partially turned on. This creates a low-impedance path between Vcc and Vss whose resistance equals the sum of the N- and P-channel resistances.

DC or Static Power

In addition to conventional gates, Cypress devices contain sense amplifiers, input and output buffers, and bias and reference generators that all dissipate power. RAMs and FIFOs also have memory cells that dissipate standby power whether the IC is selected or not. PROM and PAL products have EPROM memory cells that do not dissipate as much standby power as a RAM cell.

Power-Down Options

Five Cypress static RAMs offer a power-down option that enables you to reduce the devices' power dissipation by approximately an order of magnitude when they are not accessed. The power-down technique disables or turns off the input buffers and sense amplifiers.

Power Dissipation Model

The rest of this application note presents power dissipation models for various Cypress CMOS products as well as information on each product's typical and worst-case power dissipation. The information is presented as functions of frequency, Vcc, and temperature.

A general-purpose power dissipation model for all Cypress ICs appears in Figure 1. To obtain power dissipation data on an IC, you must isolate the three components of power dissipation included in Equation 6 by controlling the IC's inputs.

[Figure 1. Power Dissipation Model]

The standby current (Icc) is measured with the inputs to the IC at 0.4V or less. Under this condition, the input buffers and unloaded output buffers draw only DC leakage currents. All other direct currents derive from the substrate bias generator, sense amplifiers, other internal voltage or current references, and NMOS memory circuits.

At Vin = 1.5V, the input buffers draw maximum Icc. To find the total input buffer Icc current, you measure the total current and subtract the quiescent current. You can then calculate the current per input buffer by dividing the total input-buffer current by the number of input buffers.

Input Buffers

Cypress products use three different types of input buffers. For purposes of illustration, they are referred to as types A, B, and C. Table 1 lists the buffer types used in various products. Figure 2 shows schematics and input characteristics for the three types of buffers. A circle on a transistor's gate means that the transistor is a P-channel device.

Table 1. Types of Input Buffers

    Buffer Type    Icc (max., mA)
    A              1.3
    B              0.8
    C              0.6

[Figure 2. Three Buffer Types]

As Figure 2 shows, the input buffers draw essentially zero Icc when Vin is 0.4V or less. This is also true when Vin is 4V or more, except for type A. In other words, if the inputs are driven rail to rail, the B and C input buffers dissipate power only during the input signal transitions.

Core and Output Buffers

The standby power dissipation of an IC's core derives from the substrate bias generator, reference generators, sense amplifiers, and polyload RAM cells or EPROM cells. This current is measured with Vin = 0V, so that the input buffers draw no current. Under these conditions, the output buffers draw only leakage current and dissipate essentially no power.

Programming either PROMs or PALs stores charge on the floating gate of an NMOS transistor, which increases the transistor's threshold voltage. This higher threshold prevents the transistor from turning on during normal operation; unprogrammed transistors do turn on. Therefore, unprogrammed PALs and PROMs draw more current and dissipate more power than programmed devices.

The output buffers on Cypress products have N-channel pull-up devices that cause the output voltage level to reach

    $V_{OH} = V_{CC} - V_T = 5\text{V} - 1\text{V} = 4\text{V}$

The capacitance of the output buffers, including stray capacitance, is typically 10 pF. If CL = 10 pF and VOH = 4V, then, again using Equation 3,

    $I_{CC}(f) = 40 \times 10^{-12} f$

for the output buffers.

Current Measurement

Figure 3 illustrates the instantaneous current drawn by a Cypress RAM. The instantaneous power is calculated by multiplying this current times the constant supply voltage, Vcc. Most of the power is dissipated during the access time. This is also true for PROMs and PALs.

The current measurement unit in an automatic tester integrates the instantaneous current over the measurement cycle and arrives at an equivalent average current. In other words, the average current, I2, during time Tcy equals the area between the instantaneous current, i(t), and the X axis during Tcy, divided by Tcy. Thus, because the "current pulse" is effectively spread over a longer time when the frequency is decreased, the average current is proportionately lower.

Note that the preceding calculations have not accounted for any DC loads. You must calculate these separately.

[Figure 3. RAM Icc. I1 = quiescent Icc; I2 = average Icc; i(t) = instantaneous Icc]

Product Characteristic Tables

Tables 2 through 5 allow you to calculate the current requirements for Cypress products. CINT is the equivalent device internal capacitance, Icc(Q) is the quiescent or DC current, and Icc(max) is the maximum Icc (as specified on the data sheet) for the commercial operating temperature range. Conditions are Vcc = 5V and TA = 25°C. Note that for the 16L8, 16R8, 16R6, and 16R4 PALs, the number of inputs and outputs is user configurable. All the PALs use type B buffers.

Table 2. Static RAMs

    Part No.       Buffer   No.      No.       CINT   Icc(Q)   Icc(max)
                   Type     Inputs   Outputs   (pF)   (mA)     (mA)
    CY7C122/123    A        16       4         24     50       90
    CY7C128        B        14       8         27     59       120
    CY7C147        B        15       1         34     28       90
    CY7C148/149    B        12       1         32     45       90
    CY7C150        B        18       4         20     44       90
    CY7C161/162    B        22       4         300    13       70
    CY7C164        B        20       4         300    13       70
    CY7C166        B        21       4         300    13       70
    CY7C167        C        17       1         75     25       70
    CY7C168/169    C        18       4         75     50       70
    CY7C170        B        18       4         50     33       90
    CY7C171/172    B        18       4         100    27       70
    CY7C185/186    B        25       8         330    13       100
    CY7C187        B        19       1         150    7        100
    CY7C189/190    B        10       4         21     32       90

Table 3. PROMs

    Part No.       Buffer   No.      No.           CINT   Icc(Q)   Icc(max)
                   Type     Inputs   Outputs [1]   (pF)   (mA)     (mA)
    CY7C225        B        12       8             32     35       90
    CY7C235        B        13       8             35     35       90
    CY7C245        B        13       8             35     50       90
    CY7C251        C        18       8             43     9.5      100
    CY7C254        C        18       8             43     35       100
    CY7C261/3/4    C        14       8             60     45       100
    CY7C268        C        19       1/8           60     60       100
    CY7C269        C        17       1/8           60     60       100
    CY7C281/282    B        14       8             35     35       100
    CY7C291/292    B        14       8             35     50       100

    [1] The number after the slash is the number of bidirectional pins.

Table 4. PALs

    Part No.             CINT   Icc(Q)   Icc(max)
                         (pF)   (mA)     (mA)
    PALC16L8/R8/R6/R4    40     25       45
    PLDC20G10            50     30       55
    PALC22V10            50     40       80
    PLD CY7C330          300    42       120

Table 5. Logic Products

    Part No.     Buffer   No.      No.           CINT   Icc(Q)   Icc(max)
                 Type     Inputs   Outputs [1]   (pF)   (mA)     (mA)
    CY7C401      B        6        6             53     30       75
    CY7C402      B        7        7             53     30       75
    CY7C403      B        7        6             53     30       75
    CY7C404      B        8        7             53     30       75
    CY7C408      B        11       12            100    42       135
    CY7C409      B        11       13            100    42       135
    CY7C428/9    C        14       12            190    18       80
    CY7C510      C        24       19/16         60     30       100
    CY7C516      C        28       16/16         60     30       100
    CY7C517      C        28       16/16         60     30       100
    CY3341       B        6        6             53     30       45
    CY7C601      C        25       19/64         950    89       600
    CY7C901      C        24       10/4          160    25       80
    CY7C909      C        21       5             80     25       55
    CY7C910      C        22       16            150    2.6      70
    CY7C911      C        13       5             80     25       55
    CY7C9101     C        36       22/4          70     30       60
    CY7C9116     C        22       1/20          1000   35       150
    CY7C9117     C        38       1/4           1000   35       150

SRAM Calculation Example

To illustrate how to use Tables 2 through 5, consider an example of estimating the typical Icc for the CY7C169-35 RAM at room temperature (TA = 25°C) and nominal Vcc. Assume the duty cycle is 100 percent at the specified access time. The procedure shown here calculates the typical and worst-case Icc with all inputs and outputs changing and with output loading of 10 pF.

From the RAM product characteristic table:

    Number of inputs = 18
    Number of outputs = 4
    CINT = 75 pF
    Icc(Q) = 50 mA

Because the input buffers on the CY7C169 are type C, the average current is 0.3 mA. If the input-signal level transitions are 4V and the transition rates are 2V/ns, the transition time is

    $T_t = \frac{4\text{V}}{2\ \text{V/ns}} = 2$ ns

The duty cycle is then

    $\frac{2\ \text{ns}}{35\ \text{ns}} = 0.057$

Each input buffer thus draws

    0.3 mA × 0.057 = 0.0171 mA

If all inputs change, the total transient input buffer current is

    18 × 0.0171 = 0.31 mA

To calculate the CVf input buffer current:

    I = CVf, with CIN = 5 pF, V = 4V, f = 1/35 ns
    I = 0.57 mA
    Total = 18 × 0.57 = 10.28 mA

To calculate the internal CVf current:

    I = CVf, with CINT = 75 pF, V = 5V, f = 1/35 ns
    I = 10.71 mA

To calculate the output CVf current:

    I = CVf, with COUT = 10 pF, V = 4V, f = 1/35 ns
    I = 1.15 mA
    Total = 4 × 1.15 = 4.6 mA

The quiescent current is 50 mA. The total current at Tcy = 35 ns is:

    Input transient    0.31 mA
    Input CVf          10.28 mA
    Internal CVf       10.71 mA
    Output CVf         4.6 mA
    Quiescent          50 mA
    Total Icc          75.9 mA (all inputs/outputs changing)

Note that the worst-case transient current is 25.9 mA. If half the inputs and outputs change, the worst-case transient current decreases to 12.95 mA, which gives a total current of 63 mA (typical Icc). Note also that the input CVf current and the output CVf current have the same values for a bipolar device.

Worst-, Worst-, Worst-Case Icc

Now consider a procedure for estimating Icc for worst-case Vcc and low temperature, in addition to all inputs and outputs changing. Then you can compare the result with the Icc specified on the data sheet.

Icc is greater at high Vcc, which is 5.5V, or 1.1 times the nominal 5V Vcc. Because the increase in Icc due to the lower temperature is 3 percent, the total increase is 13 percent. These factors apply to the internal CVf current (10.71 mA), the output CVf current (4.6 mA), and the quiescent current (50 mA), which together total 65.31 mA.

    Total Icc = Input transient Icc + Input CVf Icc + [Internal CVf + Output CVf + Icc(Q)] × 1.13
    Icc = 0.31 + 10.28 + [65.31] × 1.13 = 84.4 mA

This value is approximately 94 percent of the 90 mA specified on the data sheet. Note, however, that the data sheet Icc maximum does not include the output CVf current.

Icc-Versus-Frequency Characteristic

[Figure 4. Typical Icc vs f. Icc vs. frequency for PAL 16R8; all inputs/outputs change; Vcc = 5V, TA = 25°C, VIL = 0.8V, VIH = 2V; frequency axis 10 kHz to 100 MHz]

The Icc-versus-frequency curves for all Cypress products have the same basic shape, which is illustrated by the PAL 16R8 curve in Figure 4. The current remains essentially constant at the quiescent Icc value until the frequency increases to the point where the capacitances begin to cause appreciable currents. The location of this point depends upon the input, internal, and output capacitances; the number of inputs and outputs; the rate at which the inputs and outputs change; and the voltage levels the inputs and outputs are switched between. For Cypress products, this point is in the 1- to 10-MHz range.

The PAL 16R8 devices that were tested to obtain the data for the curve were exercised such that all inputs and outputs changed every cycle.
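The CVf bookkeeping of the CY7C169-35 example, and the way the dynamic terms fade at long cycle times so that Icc settles at its quiescent value, can be sketched in a few lines of Python. This is a minimal sketch, not a Cypress tool: the function and parameter names are illustrative, and the default values are simply the numbers used in the SRAM example (18 inputs, 4 outputs, CINT = 75 pF, Icc(Q) = 50 mA, 5 pF per input, 10 pF per output, 0.3 mA average type C buffer current, 2-ns transitions).

```python
def cvf_ma(c_pf, volts, period_ns):
    """Equation 3, I = C * V * f with f = 1/period; returns milliamps."""
    return c_pf * 1e-12 * volts / (period_ns * 1e-9) * 1e3

def sram_icc_ma(period_ns, n_in=18, n_out=4, c_int_pf=75.0, icc_q_ma=50.0,
                c_in_pf=5.0, c_out_pf=10.0, buf_avg_ma=0.3,
                t_transition_ns=2.0, derate=1.0):
    """Estimate Icc (mA) with all inputs and outputs changing every cycle.

    derate scales the internal CVf, output CVf, and quiescent terms, as in
    the worst-case procedure (1.13 for high Vcc plus low temperature).
    """
    # Input buffers conduct only during the transition: average * duty cycle.
    input_transient = n_in * buf_avg_ma * (t_transition_ns / period_ns)
    input_cvf = n_in * cvf_ma(c_in_pf, 4.0, period_ns)    # 4V input swing
    internal_cvf = cvf_ma(c_int_pf, 5.0, period_ns)       # internal nodes swing to Vcc
    output_cvf = n_out * cvf_ma(c_out_pf, 4.0, period_ns) # VOH = 4V
    return (input_transient + input_cvf
            + derate * (internal_cvf + output_cvf + icc_q_ma))

print(f"typical, 35 ns cycle:    {sram_icc_ma(35.0):.1f} mA")
print(f"worst case, 35 ns cycle: {sram_icc_ma(35.0, derate=1.13):.1f} mA")
# At a 100x longer cycle the dynamic terms nearly vanish and Icc approaches Icc(Q).
print(f"typical, 3500 ns cycle:  {sram_icc_ma(3500.0):.1f} mA")
```

The small differences from the hand calculation come from the intermediate rounding in the worked example (for instance, 1.15 mA per output rather than 1.14 mA).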
Curve A shows the total Icc for a 50-pF load on each of the eight outputs. Curve B shows the total Icc when the outputs are disabled. The B curve results from the input and the internal capacitances. In most applications, the actual operation of the device falls somewhere between the A and B curves.

You can extrapolate the A and B curves backwards until they intersect the quiescent current, which occurs at point C in Figure 4. Point C is approximately 5.6 MHz. This gives you an easy-to-use formula for calculating Icc.

For frequencies less than 5.6 MHz:

    Icc = Icc(Q) = 25 mA

For frequencies greater than 5.6 MHz:

    Icc = Icc(Q) + 3.5 mA/MHz (all outputs changing)

or

    Icc = Icc(Q) + 0.5 mA/MHz (no outputs changing)

[1] In Table 5, the number after the slash is the number of bidirectional pins.

Tips for High-Speed Logic Design

This application note provides tips and makes substantive suggestions for designing high-speed logic circuits that operate reliably. The tips and suggestions are organized under the headings:

    Noise Considerations
    Clock Distribution
    Buses and Memories
    Care and Feeding of PLDs
    PCB Effects
    Metastability and Crosstalk

As electronic system clock rates climb ever higher, logic designers who were engineering 10-MHz, 100-ns-cycle-time systems are finding themselves working with systems running at speeds upwards of 20 MHz, with 50-ns cycle times. These designers are discovering that techniques adequate for work at 10 MHz are no longer appropriate at 20 MHz and beyond.

At 10 MHz, you can utilize sluggish and relatively well-behaved LS TTL logic with its leisurely set-up and hold parameters; long propagation delays; forgiving output enable and disable times; and high output current-drive capacity. As an alternative, designers turned to faster bipolar logic families, but found that power dissipation rose proportionally. To save power and enhance reliability, today's designers are changing to CMOS components.
Designers are happy to find that CMOS can deliver the speed they require at the low power levels they desire. In the quiescent state, CMOS logic (AC/ACT/FCT) draws three to five orders of magnitude less power than bipolar logic (LS/ALS/AS). At 1 MHz, CMOS logic dissipates about 0.1 mW per gate, while LS TTL logic dissipates about 2.0 mW per gate. CMOS technology has truly rewritten the speed/power rules set forth in the bipolar era.

Plenty of challenges still face the high-speed logic designer, however. For example, high-performance logic families are sensitive to system noise and generate noise themselves. As a result of the effort to make these devices as fast as possible, they often have anemic output drive capacity.

Clock distribution becomes much more of an issue at high frequencies because skew and slow rise times degrade operating margins. As bus cycles tighten, it becomes increasingly difficult to avoid bus clashes (multiple devices driving a bus). Very fast SRAMs and FIFOs require read and write pulse widths that are very difficult to synthesize using synchronous logic; hence the appearance of self-timed memory devices. PLDs have become ubiquitous in modern board-level designs, but high-speed designers must carefully consider PLDs' relatively long propagation delays and slow switching speeds.

You can no longer think of printed circuit boards as an ideal electrical interconnect. In the high-speed realm, you must account for the effects of distributed capacitance, inductance, and propagation delay on the PCB. To mitigate the effects of ringing, resistive termination of critical signals becomes a practical necessity above 20 MHz. In the days of old, it wasn't appropriate to factor loading into propagation delays. Today, conservative designers account for loading when calculating worst-case propagation delays and worst-case signal skew. Heavy capacitive bypassing and low-inductance decoupling are essential to minimize switching noise above 20 MHz.
Metastability, a phenomenon not widely appreciated until recently, is a critical issue in high-frequency systems. It is essential to be able to resolve asynchronous events quickly and reliably in high-performance designs. Finally, crosstalk is a substantial concern with high-slew-rate and noise-sensitive CMOS logic.

Noise Considerations

High-speed CMOS logic tends to be noisier than LS TTL for two reasons: CMOS voltage swings are rail-to-rail, and small-geometry, dual-layer-metal CMOS technology makes possible faster edge rates (2V per ns and faster).

The classic ground-bounce noise situation arises when several outputs of a CMOS logic device switch from High to Low. The simultaneous switching causes a relatively large sink current from the load capacitance to flow to ground through the device package inductance. The potential developed momentarily across this inductance equals the product of the package inductance and the sink current's rate of change.

This ground-bounce voltage spikes the Low state held on the quiescent outputs. The spike can often exceed the input Low-level maximum voltage (0.8V), causing the downstream logic device to switch erroneously. Both the chip ground reference and the chip Vcc reference are spiked, but because more energy is switched through the ground-lead inductance, it is much more common to see a problem in a quiescent Low-state output.

Figure 1. Maintaining Duty Cycle Symmetry

Here are some procedures that minimize ground and Vcc bounce noise:

1. Pursue any steps that reduce the parasitic inductance between the package and ground and Vcc. These steps include using a PCB with ground and Vcc planes or, at the very least, power distribution elements. Avoid use of sockets, but do use low-inductance decoupling and bypass capacitors. On critical parts, use a standard ceramic decoupling capacitor (0.01 to 0.1 µF) along with a high-frequency filtering capacitor (approximately 470 pF). The Rogers Corp. Micro/Q 1000 Series high-frequency, low-inductance caps are optimal for this purpose. Surface-mount packages have lower package inductance than DIP packages. So-called rotated-die devices with center Vcc and ground pins also have lower inductance.

2. Whenever possible, design synchronous circuits. The ground bounce produced by an octal register, for instance, is triggered by the clock. If the register feeds another registered device, then the noisy output has until a set-up time before the next clock to settle. When you must drive an asynchronous signal with an octal driver, use an output pin close to the package ground pin. The output pin next to the Vcc pin can have as much as 50% more ground-bounce noise than the output pin next to the ground pin.

3. Use various techniques to slow switching or transition edge rates and, therefore, the sink and source currents' rate of change. This can be accomplished with series damping resistors or by increasing the inductance or capacitance between the driving device's output pin and the receiving device's input pin. PCB traces exhibit parasitic ground-path capacitance and inductance that depend on trace length and topology; these factors are thus difficult to predict. The most common technique is to use series damping resistors in the 25 to 35Ω range; 33Ω is a standard value. Series resistors also limit signal overshoot and undershoot.

4. Try to avoid running control signals through a device that drives data and address lines. When using a 10-output PLD such as a 22V10 in an 8-bit bus-oriented application, for instance, you might be tempted to use the extra two outputs for control signals. If the eight data lines switch simultaneously, however, the control lines will probably be disturbed.

Using devices that feature input hysteresis adds to the noise margin. Input hysteresis can typically provide 200 mV of additional noise immunity. Note that mixing logic families can compromise noise immunity margins.
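The ground-bounce relationship described under Noise Considerations (the potential across the package inductance equals that inductance times the sink current's rate of change) can be sketched numerically. This is a rough illustration only: the function name, the linear-ramp approximation of the current edge, and the example values of 5 nH and 20 mA per output are assumptions, not figures from this note.

```python
def ground_bounce_volts(package_inductance_h, current_step_a,
                        transition_time_s, outputs_switching=1):
    """Estimate ground bounce as V = L * dI/dt.

    The current change is modeled as a linear ramp over the transition
    time, which simplifies the shape of a real output edge.
    """
    di_dt = outputs_switching * current_step_a / transition_time_s
    return package_inductance_h * di_dt


# Assumed, illustrative values: 5 nH of ground-lead inductance and
# eight outputs, each sinking a 20 mA step in 2 ns.
bounce = ground_bounce_volts(5e-9, 20e-3, 2e-9, outputs_switching=8)

V_IL_MAX = 0.8  # input Low-level maximum voltage quoted in the text
print(f"estimated bounce: {bounce:.2f} V")
print(f"exceeds input Low-level maximum: {bounce > V_IL_MAX}")
```

Because the estimate scales linearly with inductance and with the number of outputs switching, it shows why the procedures above concentrate on lowering package and board inductance and on slowing edge rates.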
For comparison purposes, the noise margin for a specific logic family is the magnitude of the difference between the family's guaranteed input threshold and its guaranteed output voltage, for the Low and High states respectively:

Noise Immunity = Vil - Vol (Low state)
Noise Immunity = Vih - Voh (High state)

When possible, use a logic family that can drive (commercial) transmission lines directly. This specification is characteristic of devices that can switch sufficient current to guarantee so-called incident-wave switching. Switching that occurs on the incident wave is faster than having to wait for the reflected wave.

In addition to causing false triggering of downstream sequential logic and glitches in downstream combinatorial logic, ground-bounce noise can also cause registers in the bounced device to "forget" their stored state. This is due to the momentary disturbance in the chip's ground and Vcc reference.

The switching of multiple outputs can also skew the device's propagation delay by approximately 200 ps per switched output. With an octal or 10-bit device, this 1 to 2 ns additional delay should be included in worst-case timing analyses.

Clock Distribution

Adequate clock distribution is essential for 20-MHz and faster systems because skew can eat up precious nanoseconds and because high-speed logic devices are sensitive to clock waveform distortion and slow rise times.

All physical devices exhibit an edge-dependent propagation delay asymmetry; the Low-to-High edge propagates more quickly than the High-to-Low edge, or vice versa. For example, the clock-to-Q propagation delay for a Signetics 74F74 ranges from 3.8 to 6.8 ns Low to High, and 4.4 to 8.0 ns High to Low. The data sheet for the Texas Instruments 74AS1000 NAND driver specifies a 1-to-4-ns range for both Low-to-High and High-to-Low edges, but any specific physical device shows some asymmetry.

It is possible to maintain duty-cycle symmetry in a buffered-clock distribution network by cascading two inverting drivers.
The two drivers must both be in the same package, as shown in Figure 1. Because the two drivers are in the same package, their prop delay characteristics track, and the High-to-Low and Low-to-High differential delays tend to cancel.

Limit the fanout from a clock buffer to 8 to 15 devices. Fanout calculations must account for both AC and DC loading.

The AC characteristics for logic components are specified at 50 pF of load capacitance and occasionally at 300 pF of load capacitance. Propagation delays and output-enable times increase by approximately 1 ns for each 50 pF of additional load capacitance. The input capacitance of bipolar logic families is higher (approximately 10 pF) than that of CMOS (approximately 5 pF). If the sum of the capacitance being driven exceeds 50 pF, derate the driver's AC characteristics appropriately.

Input current is the important DC electrical characteristic for loading purposes. The driving device must be able to sink the sum of the Low-level input currents to which it is connected (Iol at Vol). The driving device must also be able to source the sum of the High-level input currents to which it is connected (Ioh at Voh).

The Low-level input current for bipolar logic families ranges from -400 to -100 µA, while the Low-level input current for modern CMOS logic families ranges from -5 to -1 µA. The High-level input current for bipolar logic families ranges from 50 to 20 µA, while the High-level input current for modern CMOS logic families ranges from 5 to 1 µA.

Because the Iol at Vol for bus drivers is often as high as 48 mA, and the Ioh at Voh is often as high as -24 mA, input current loading is seldom an issue, except when driving a parallel (resistor) terminated load. For example, a 220Ω pull-up resistor requires about 22 mA worst case (Vol = 0V, Vcc = 5V), and a 330Ω pull-down resistor requires about 15 mA worst case (Voh = 5V, Gnd = 0V).
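The AC and DC loading rules above reduce to simple arithmetic, sketched here. The function names and the 12-input example with 20 pF of trace capacitance are hypothetical; the 1 ns per 50 pF derating, the 5 pF and 5 µA CMOS input figures, and the 220Ω and 330Ω resistor values come from the text.

```python
def extra_delay_ns(total_load_pf, rated_load_pf=50.0, ns_per_50pf=1.0):
    """Added propagation delay for load capacitance above the rated load."""
    excess_pf = max(0.0, total_load_pf - rated_load_pf)
    return excess_pf / 50.0 * ns_per_50pf


def dc_load_ma(n_inputs, input_current_ua):
    """Total input current the driver must sink or source, in mA."""
    return n_inputs * input_current_ua / 1000.0


def termination_current_ma(resistor_ohms, volts_across=5.0):
    """Worst-case current through one termination resistor, in mA."""
    return volts_across / resistor_ohms * 1000.0


# Hypothetical clock net: 12 CMOS inputs (about 5 pF and 5 uA each)
# plus an assumed 20 pF of trace capacitance.
n_inputs = 12
load_pf = n_inputs * 5.0 + 20.0

print(f"load: {load_pf:.0f} pF, added delay: {extra_delay_ns(load_pf):.2f} ns")
print(f"input current: {dc_load_ma(n_inputs, 5.0):.3f} mA")
print(f"220-ohm pull-up: {termination_current_ma(220.0):.1f} mA")
print(f"330-ohm pull-down: {termination_current_ma(330.0):.1f} mA")
```

The comparison makes the point of the preceding paragraph concrete: the summed CMOS input currents are a small fraction of a milliamp, while a single parallel termination resistor draws tens of milliamps.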
Consider using an AC termination scheme if the additional current drawn by a parallel termination cannot be tolerated.

If a single buffer cannot safely supply a sufficient clock fanout, use parallel drivers (Figure 2). When distributing a clock signal, attempt to load each of the parallel lines equally. Unequal loading increases the skew between lines.

Figure 2. Parallel Clock Drivers

Buses and Memories

When you design buses in high-performance systems, it is important to consider the effects of AC and DC loading. The input and output capacitance of CMOS SRAMs, PROMs, and DRAMs ranges from 5 to 7 pF. This capacitance can become a concern with large memory arrays. Be especially careful when using SRAM modules, which might have high input and output capacitances due to the multiple devices connected to each signal line.

Because the signals that drive large memory arrays (such as the address, RAS, CAS, and data lines) tend to have long PCB traces, it is common practice to series-terminate these lines to minimize ringing, undershoot, and overshoot.

The input load or leakage current for CMOS SRAMs, PROMs, and DRAMs is approximately 10 µA, sink and source. When you use high-output-current bus drivers (24 mA Iol or greater), DC loading is rarely an issue.

As system cycle times shorten, it becomes more difficult to avoid bus clash situations. Bus clash or bus contention occurs on a shared bus when one three-state device finishes its output-enable time before a second device finishes its output-disable time. For a short period of time, both devices drive the bus. Because the output stages of memories and logic components can typically withstand at least 20 mA of current, the excess current does not shorten the devices' useful lives.

Bus clash does cause large positive and negative current changes in the device Vcc and ground paths, however. The demand for current induces Vcc and ground bounce noise just like the simultaneous switching situation previously discussed. Thus, avoid more than 5 ns of overlap in the worst-case output-enable and output-disable times.

You can use CMOS components' low input current to advantage on buses when hold time is deficient. For example, consider a CMOS memory connected to a CMOS octal register. The memory is read, the /OE (or the /CE) deasserted, and the data clocked into the register. Ordinarily, the data should be clocked into the register before /OE is deasserted because the memory's worst-case output-disable time could be very short.

When the memory is read in this case, however, the distributed capacitance presented by the register inputs, the PCB trace, and the memory's own outputs is charged. Because the memory's output leakage current and the register's input current are very low (5 to 10 µA), this distributed capacitance remains charged for some time. In effect, the data is held long enough to make up for the deficient timing.

High-speed SRAMs and FIFOs have timing requirements that are often difficult to meet using synchronous circuits. In such situations, there are asynchronous alternatives to consider. You can use the delay lines supplied by various manufacturers by combinatorially gating the output taps to synthesize the required signal. Delay lines are typically calibrated by comparing the input's rising edge to the various delayed outputs' rising edges; the delay times for the falling edges are less accurate. If a decoded signal uses falling edges, make sure that the design can tolerate a few nanoseconds of inaccuracy.

The Engineered Components Company makes a family of pulse-generator modules (PGMs), which issue a precise pulse when presented with a positive-going edge. The company offers standard PGMs, fast-recovery PGMs that have a higher maximum repetition rate, and delayed PGMs, which wait for a specified period before issuing the pulse. Both delay lines and PGMs have propagation delays that range from 5 to 10 ns.

Care and Feeding of PLDs

Programmable Logic Devices (PLDs) are exceedingly useful for designing high-performance systems, but their characteristics and shortcomings must be well understood.

The set-up time for most registered PLDs is usually just less than the propagation delay. This is because the signal to be latched must propagate through the AND array as well as the OR/XOR gate before reaching the flip-flop, while the clock is connected directly from the pin to the flip-flop. Accordingly, the hold time for this type of PLD is 0 ns minimum worst case and typically several nanoseconds negative.

This negative hold time implies that the PLD samples the state of the inputs as they existed several nanoseconds before the clock's rising edge. You can take advantage of this phenomenon when the device feeding the PLD is hold-time deficient with respect to the PLD clock.

PLD outputs usually do not have the drive capacity of standard logic. When you use a PLD to generate a critical signal, such as a FIFO-read or shift-out pulse, buffer the signal with a fast, hard-driving gate. Bear in mind, too, that identical equations implemented in the same PLD can exhibit different propagation delays due to different on-chip path lengths. PLD propagation delays are especially dependent on capacitive loading.

PCB Effects

The most conservative way to handle PCB signal distortion effects is to consider every substrate interconnect as a transmission line. In practice, this approach only works when the unloaded signal transition time approaches the round-trip substrate propagation delay.

For ordinary PCB materials (G-10 fiberglass epoxy), the round-trip propagation delay is approximately 0.3 ns per inch. Therefore, for 3-ns transition times, you should consider any PCB trace longer than 10 inches as a transmission line.

A transmission line presents a characteristic impedance and has distributed inductance and capacitance. You can minimize ringing on a transmission line by closely matching the output impedance of the driving device to the line's characteristic impedance.

According to the microstrip model, for a 10-mil-wide, 1-oz. copper line 1.5 mils thick over a ground plane separated by a dielectric of G-10 fiberglass epoxy 62.5 mils thick, the theoretical unloaded characteristic impedance is approximately 110Ω. In reality, PCB trace characteristic impedances can range from 50 to 200Ω. Capacitive loading reduces the characteristic impedance, increases the delay, and slows the rise time on a transmission line.

The conventional method for reducing reflections on transmission lines is with some form of termination, the most common being the so-called Thevenin type. This termination consists of a pull-up resistor to Vcc and a pull-down resistor to ground. The goal is to match the two resistors' Thevenin equivalent to the trace's characteristic impedance.

Table 1 lists common values for the pull-up and pull-down resistors. Both of the termination pairs shown in the table pull the line to a logic High of approximately 3V when the driver is disabled.

Table 1. Pull-Up and Pull-Down Values

Resistor Values | Thevenin Equivalent
220Ω pull-up, 330Ω pull-down | 132Ω
330Ω pull-up, 470Ω pull-down | 194Ω

Place the termination resistors as close as possible to the receiver. Keep in mind that many CMOS logic components have input and output clamp diodes to help damp overshoot and undershoot.

Metastability

The output of a latch or flip-flop can go into an undefined or metastable state (neither High nor Low) when the set-up time or hold time for the device is violated. The metastable condition typically occurs when an asynchronous signal is being synchronized. It occurs in all process technologies and is impossible to completely eliminate.

The two important metastability parameters to consider in design work are the mean time between failures (MTBF) at maximum operating frequency and the average or typical resolution or settling time, Tsw. The latter is the time the device takes to resolve from a metastable state to a stable state. These parameters and/or the equations for deriving them should be available from a device's manufacturer.

Metastability performance is proportional to a technology's Vih-to-Vil slew time. High-speed CMOS registers such as those found in Cypress PLDs have very fast slew times and typical settling times that range from 100 to 600 ps, depending on the device type.

By double-latching asynchronous inputs, you can dramatically increase a system's MTBF and reduce the probability of a metastable event causing system malfunctions. When determining the length of time to delay before clocking the second register, multiply the published typical settling time by two or three to create an extra margin of protection.

Crosstalk

Crosstalk is the undesirable coupling of a transition on an active line (talker) onto an inactive line (listener). The crosstalk amplitude is proportional to the talker edge rates, the physical proximity between signal lines, and the distance over which the two lines are parallel or adjacent.

Crosstalk results from two important physical causes: mutual impedance and velocity differences. Mutual impedance is due to the mutual inductance and capacitance between adjacent signal lines and is a transformer-like effect.

Velocity differences arise when a signal propagates along a conductor that is in contact with two materials of differing dielectric constants, such as fiberglass epoxy and air in PCBs. The wave propagating at the copper-to-epoxy interface travels slower than the wave propagating at the copper-to-air interface. A pulse develops whose duration is twice the difference in the arrival times of the two waves; thus, the magnitude of the disturbance increases when the length of the parallel or adjacent traces increases.

Due to CMOS's fast edge rates, crosstalk is a legitimate concern. You can take the following steps to reduce forward and reverse crosstalk:

1. Maximize the distance between traces, and minimize the distance over which traces are parallel or adjacent. When possible, make the signals on adjacent PCB layers perpendicular. Use the power and ground layers as shields between the signal layers. On two-layer PCBs, run ground lines between adjacent, parallel signal lines.

2. Make every other conductor a ground line when using flat ribbon cable. Protect critical signals such as clock lines with a dedicated ground strip on PCBs or with a ground twisted pair on backplanes.

3. Use Thevenin termination of a line to its characteristic impedance to reduce crosstalk amplitude by 50 percent.

Protection, Decoupling, and Filtering of Cypress CMOS Circuits

This application note explains how to protect your ICs with a low-cost zener diode and why it is good insurance against inadvertent voltage transients. Also explained is the reason why decoupling and high-frequency-filtering capacitors are required. A method is provided for determining the capacitors' values.

Zener Diode Protection

Linear power supplies can cause large voltage transients. When caused by the collapse of a magnetic field, the transient is negative. When the supply is turned on, the resulting transient is positive.

Some commercially available laboratory bench supplies behave the same way. When they turn on, they can overshoot several volts. When they turn off, lead inductance can cause a negative transient voltage at the Vcc pin. If sufficient energy is available, internal gate oxides can break down, either destroying or weakening the IC such that it might fail later.

You can avoid this problem by adding a 20¢ zener diode (also called a voltage-regulator diode) between Vcc and ground. Connect the diode's cathode to Vcc and the anode to ground (Figure 1). A 400-mW, 6.2V 1N525 or equivalent is recommended. You can also use the 1N753, a 500-mW, 6.2V zener diode.

If a voltage greater than the zener voltage (6.2V) occurs on Vcc, the diode breaks down, clamping the voltage to 6.2V and shunting the current to ground (Figure 2). The diode can be destroyed if the current multiplied by the zener voltage exceeds the diode's power rating. Because zener diodes always fail shorted, they cause the power supply to "crowbar" and thus protect the ICs.

A negative voltage on the Vcc line puts a forward bias on the diode. This turns on the diode, which clamps the voltage to approximately -0.8V. If the negative voltage times the current exceeds the diode's power rating, the diode fails shorted, as in the reverse-bias case, and protects the ICs.

Figure 1. Zener Diode Connection

Figure 2. Zener Diode Characteristic

High-Frequency Filtering

In addition to the protection offered by zener diodes, decoupling and high-frequency-filtering capacitors are required on high-performance CMOS circuits. To use these capacitors effectively, you must understand why they are required.

To realize the fast rise and fall times that Cypress CMOS integrated circuits are capable of achieving, the power-distribution system must be able to supply the instantaneous current required when the device outputs switch from Low to High. The energy converted to current is stored as charge on the local decoupling capacitors. They decouple or isolate the circuit from the power-distribution system.

It is standard practice to use one decoupling capacitor for each IC that drives a transmission line and one capacitor for every three devices that do not.

The PCB trace inductance plus the IC lead inductance can "current-starve" the output circuits, causing rise-time degradation. Remember that the current through an inductor cannot change instantaneously. Therefore, you must minimize any series inductance, including the lead inductance of the decoupling capacitors.

Decoupling-Capacitor Calculations

To determine the value of the decoupling capacitor, you must estimate the instantaneous current required when all the outputs of an IC switch from Low to High, assuming a reasonable droop of the voltage on the capacitor. The charge stored on the local decoupling capacitor is

$$Q = CV$$

Differentiating yields

$$i(t) = \frac{dQ}{dt} = C\,\frac{dV}{dt} \qquad \text{(Eq. 1)}$$

The characteristic impedance of a typical transmission line is 50Ω. Lines with a heavy capacitive load have lower characteristic impedances.

Next, assume that the IC is a nine-output FIFO, such as the CY7C429. The outputs reach

$$V_{CC} - V_t = 5\text{V} - 1\text{V} = 4\text{V}$$

Each output thus requires 4V/50Ω = 80 mA. Because the FIFO has nine outputs, it requires a total of 720 mA during the rise times of the outputs. Solving Equation 1 for C yields

$$C = i\,\frac{dt}{dV} \qquad \text{(Eq. 2)}$$

The last step is to assume a reasonable, tolerable droop in the capacitor voltage. Assume dV = 100 mV. Additionally, the signal rise and fall times are 2 ns. Substituting these values in Equation 2 yields

$$C = \frac{720 \times 10^{-3} \times 2 \times 10^{-9}}{100 \times 10^{-3}} = 14.4 \times 10^{-9}\ \text{F} = 0.0144\ \mu\text{F}$$

It is standard practice to use 0.01 to 0.1-µF decoupling capacitors. A 0.1-µF capacitor can supply 5A under the conditions assumed in the preceding calculations. Another way to look at the situation is that a 0.1-µF capacitor supplies 720 mA of instantaneous current in 2 ns with only 14.4 mV of voltage droop across the capacitor.

Decoupling capacitors for high-speed Cypress CMOS circuits should be of the high-K ceramic type with a low effective series resistance (ESR). Capacitors using Z5U dielectric are a good choice.

High-Frequency Filter Capacitors

The 0.1 to 0.01-µF decoupling capacitors usually do not provide high-frequency decoupling or filtering. These capacitors do not behave like capacitors at high frequencies because their series resonance frequency is not high enough. This is primarily because of lead inductance in their construction, which is a result of the capacitor's relatively large value.

For high-frequency filter analysis, you can use the simplified equivalent circuit of a capacitor shown in Figure 3. Rs is the effective series resistance (ESR), L is the effective series inductance (ESL), and C is the capacitance.

Figure 3. Simplified Capacitor Equivalent Circuit

The impedance of the simplified equivalent circuit is:

$$Z_C = R_s + j\omega L + \frac{1}{j\omega C} \qquad \text{(Eq. 3)}$$

$$Z_C = R_s + j\left[\omega L - \frac{1}{\omega C}\right] \qquad \text{(Eq. 4)}$$

The magnitude of the impedance is

$$|Z_C| = \sqrt{R_s^2 + \left[\omega L - \frac{1}{\omega C}\right]^2} \qquad \text{(Eq. 5)}$$

At the series resonant frequency:

$$\omega L = \frac{1}{\omega C} \quad \text{or} \quad \omega = \frac{1}{\sqrt{LC}}$$

At the resonant frequency, Zc = Rs, which is the minimum impedance. Figure 4 shows how the impedance varies with frequency.

Figure 4. Capacitor Impedance Versus Frequency (vertical axis: Z in Ohms; horizontal axis: frequency in Hz)

The series resistance usually increases as the capacitance decreases. Also, as the capacitance decreases, the inductance typically decreases, which means that the resonant frequency increases. This is usually due to the capacitor's physical construction. Note that a surface-mounted capacitor's lead inductance is at least an order of magnitude less than that of an axial-lead capacitor.

The next step in high-frequency filter analysis is to determine a typical system's expected high-frequency components. Begin by assuming that the circuit is driven by a series of digital pulses with finite rise and fall times, then perform a Fourier transform on the series to determine their frequency components.

Fourier Transform of a Periodic Pulse

Figure 5 illustrates a periodic pulse of amplitude A, period T, rise and fall times of tr, and pulse width of Tp, as measured between the 50-percent-amplitude points. The approximate frequency-domain transform appears in Figure 6. The amplitude of the frequency-domain voltage is a function of the signal's amplitude and duty cycle in the time domain.

The fundamental frequency, F0, is related to the pulse train's period. The first harmonic, F1, is of equal energy and is a function of the pulse width. The second harmonic, F2, contains half the energy of F0 and is a function of the pulse rise time.

The rise and fall times of Cypress's CMOS and BiCMOS circuits are 2 ns, by design. If a Cypress PLD is driving the write- or read-strobe inputs of a CY7C429-20 FIFO at the maximum frequency of 33.3 MHz (T = 30 ns) with a 10-ns/30-ns-duty-cycle signal (Tp = 10 ns), the following signal frequencies are generated:

$$F_0 = \frac{1}{\pi T} = \frac{1}{3.1416 \times 30 \times 10^{-9}} = 10.61\ \text{MHz}$$

$$F_1 = \frac{1}{\pi T_p} = \frac{1}{3.1416 \times 10 \times 10^{-9}} = 31.83\ \text{MHz}$$

$$F_2 = \frac{1}{\pi t_r} = \frac{1}{3.1416 \times 2 \times 10^{-9}} = 159.15\ \text{MHz}$$

Within the IC, signal rise and fall times can be as fast as 300 ps (picoseconds), which means that F2 = 1.061 GHz (1,061 MHz). In some ICs short timing pulses are generated internally, but they are usually longer than the 300-ps rise time, so the preceding F2 is the highest harmonic present.

Because the IC's data outputs can normally change no faster than those of the inputs, the outputs do not generate additional higher-frequency harmonics.

Parallel the Filter Capacitors

You cannot find a capacitor whose three series resonant frequencies correspond to F0, F1, and F2. Instead, select three separate capacitors with the appropriate resonant frequencies and connect them in parallel between Vcc and ground, as close to the IC as possible. The capacitors act as a bandpass filter, shunting the unwanted, high-frequency signals to ground.

The sum of the capacitors' values should be greater than or equal to the capacitance value given by Equation 2. The total high-frequency filtering capacitance is usually between 100 and 500 pF.
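The decoupling-capacitor sizing (Equation 2, C = i dt/dV) and the three pulse-train frequencies (F0 = 1/(πT), F1 = 1/(πTp), F2 = 1/(πtr)) can be reproduced with a few lines of arithmetic. The sketch below uses the nine-output FIFO example values from this note; the function names are hypothetical, and the resonance helper simply restates ω = 1/√(LC) as a frequency in hertz.

```python
import math


def output_switching_current(n_outputs, v_swing, z0_ohms):
    """Total current needed while n outputs each drive a line of impedance z0."""
    return n_outputs * v_swing / z0_ohms


def decoupling_capacitance(i_amps, dt_s, dv_volts):
    """Equation 2: C = i * dt / dV."""
    return i_amps * dt_s / dv_volts


def corner_frequencies_hz(period_s, pulse_width_s, rise_time_s):
    """F0, F1, F2 of a periodic pulse: 1/(pi*T), 1/(pi*Tp), 1/(pi*tr)."""
    return tuple(1.0 / (math.pi * t)
                 for t in (period_s, pulse_width_s, rise_time_s))


def series_resonant_frequency_hz(esl_henries, c_farads):
    """Series resonance of an ESL-C pair: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(esl_henries * c_farads))


# Values from the text: nine outputs, 4 V swing, 50-ohm lines,
# 2 ns edges, and 100 mV of allowed droop.
i_total = output_switching_current(9, 4.0, 50.0)
c_min = decoupling_capacitance(i_total, 2e-9, 100e-3)
f0, f1, f2 = corner_frequencies_hz(30e-9, 10e-9, 2e-9)

print(f"switching current: {i_total * 1e3:.0f} mA")
print(f"minimum decoupling capacitance: {c_min * 1e6:.4f} uF")
print(f"F0 = {f0 / 1e6:.2f} MHz, F1 = {f1 / 1e6:.2f} MHz, F2 = {f2 / 1e6:.2f} MHz")
```

When choosing the three parallel filter capacitors, `series_resonant_frequency_hz` can be used to check how close a candidate part's resonance falls to F0, F1, or F2, provided the manufacturer supplies an ESL figure.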
Low-Frequency Filter Capacitors

A solid tantalum capacitor of 10 µF is recommended for every 50 to 100 ICs to reduce power-supply ripple. Place this capacitor as close as physically possible to where the Vcc and ground enter the PCB or module.

Figure 5. Periodic Pulse Waveform

Figure 6. Fourier Transform of Periodic Pulse

Section Contents

Modules
Choosing Packages in High-Density Module Designs
The Multichip Family of Universal JEDEC ZIP/SIMM Modules

Choosing Packages in High-Density Module Designs

This application note describes the various packages in which high-density memory modules are available and reviews some of the application areas where specific packages find use. Module outline drawings accompany the text.

You can use high-density memory modules in place of multiple monolithic ICs to minimize space, achieve better performance, and obtain single-device solutions. These modules are now available in a variety of package styles, each of which satisfies different needs in high-performance systems. Table 1 summarizes the characteristics of the different package types.

There are two general module types. The first type uses plastic-encapsulated ICs mounted on an epoxy-fiberglass substrate. The monolithic ICs on the modules can be mounted in SOIC, VSOP, or SOJ packages, which are small-outline parts with either gull-wing or J-bend leads. The second module type offers hermetically sealed LCC (leadless chip carrier) ICs mounted on ceramic substrates.

Modules built on epoxy-fiberglass substrates offer economic advantages over modules with ceramic substrates. In general, however, ceramic substrates can accommodate more components than epoxy-fiberglass substrates. Further, when assembled using military-grade components, ceramic modules can be used in military applications. For all applications, the ceramic-substrate devices have better thermal characteristics than nonceramic types.

SIPs

The single in-line package, or SIP, is a vertically mounted module with a single row of pins along one edge for through-hole mounting. The pins are on a 100-mil pitch. Note in Figure 1 that the footprint of this plastic package is only 0.66 square inches.

SIPs are typically used in low-pin-count applications and are often used where high component density is required. These modules' vertical orientation and accommodation of components on both sides can increase component density by a factor of four or more over designs that use monolithics. In addition to meeting space constraints, this higher density can also improve memory system performance by reducing path lengths from chip to chip.

Another chief source of appeal for the SIP module is fast, easy access to state-of-the-art package technology. That is, a design's main circuit board can be implemented in conventional, high-yield, through-hole technology, while the system overall achieves superior component density and high performance by employing fully-tested modules whose fine-pitch, surface-mount components are mounted on a multilayer, tight-tolerance substrate.

Figure 1. SIP

Figure 2. Flat SIP

Flat SIPs

Flat SIPs are virtually identical to SIPs, except that their single rows of pins have a 90° bend (Figure 2). Therefore, flat SIPs lie close and parallel to the board on which they are mounted. Flat SIPs' advantage is their low profile; they are typically used where component height above the main board is constrained. Flat SIPs range in height from 0.300 to 0.38 inch.

ZIPs

ZIP modules are similar to SIPs.
Table 1. Module Package Characteristics

Package Type | Typical Pin Count (Min, Max) | Typical Height in in. (Min, Max) | Mil | Advantages | Disadvantages | Max Board Space in sq. in. (FR4, Cer)
SIP | 24, 50 | 0.5, 0.9 | N | Vertical orientation; FR4 or ceramic | Limited pin count | 1.2, 0.9
FSIP | 24, 50 | 0.2, 0.4 | N | Very low profile; mechanical stability; FR4 or ceramic | Lower density due to horizontal orientation | 2.7, 2.4
ZIP | 24, 100 | 0.5, 0.9 | N | Vertical orientation; JEDEC standard pinouts; pinout compatible with SIMM | | 1.2, N/A
SIMM | 24, 100 | 0.5, 0.9 | N | Vertical orientation; socket mounting; pinout compatible with ZIP | | 1.2, N/A
VDIP | 36, 104 | 0.5, 0.95 | Y | Vertical orientation | | 1.2, 0.9
DIP | 24, 60 | 0.17, 0.37 | Y | Low profile; excellent mechanical ruggedness | Horizontal | 2.9, 2.9
QUIP | 48, 200 | | Y | Low profile; excellent mechanical ruggedness; increased number of pins | Horizontal | 2.9, 2.9
QFP | 68, 144 | | Y | Surface mount; low profile; excellent mechanical ruggedness; large number of pins in small area | Surface mount technology required; horizontal; components on one side only | 3.1, 3.1
PGA | 68, 144 | | Y | Large number of pins in through-hole technology; low profile; excellent mechanical ruggedness | Multilayer boards; horizontal; components on one side only | 2.9, 2.9

Notes:
Mil entries indicate whether a hermetic, military version is available.
Board space is the mother board area that the package occupies when the module carries eight to 28 components.

Figure 3. ZIP

However, the ZIP module has pins on 100-mil centers along both sides of the substrate (Figure 3).
Pins on alternate sides are staggered by 50 mils. The dual row of staggered pins allows a higher connection density than the SIP, while maintaining lOO-mil spacing between adjacent pins. The staggering provides additional separation for the lead vias and supports between-lead traces. At the same time, pin count is doubled over that of SIPs. Many ZIP modules have a vertical dimension of 0.500 inch maximum. This low profile makes them candidates for VME systems, where there is a maximum allowable component height. Some module devices are available in· both ZIP and SIMM packages with the same form factor. The pin out is such that the footprint of some SIMM sockets matches the footprint of corresponding ZIP modules. This allows system prototypes to use socketed SIMMs, and production systems to use through-hole-soldered ZIPs, with no change in the motherboard. Some SIMMs and matching ZIPs have presencedetect pins, whose unique combination of no-connects or grounds can be used by external logic to identify the module's memory capacity. Thus, the system can determine the amount of memory present without user input. SIMMs DIPs Single in-line memory modules, or SIMMs, are also similar to SIPs, except that SIMMs have no pins for through-hole mounting (Figure 4). Instead, the module's bottom edge effectively acts as an edge connector, which is part of the substrate material. Contacts directly opposite each other are connected together. Some SIMMs have contacts on lOO-mil centers; others have 50-mil centers. The typical application for SIMM modules· requires socket-mounted components, either for repair or for upgrades in the field. Some SIMM sockets hold the SIMM at an angle, which reduces the height of the module on the board. DIP modules have identical footprints and similar form factors to standard IC DIPs. The modules are typically taller than the DIP packages used for monolithics. Components are mounted on both the top and bottom of the substrate. 
Generally, these modules are used in anticipation of monolithic devices that will someday fit the same footprint. DIP modules allow engineers to design-in monolithic devices that do not yet exist by employing the modules to meet immediate production needs. Practically, even after monolithic devices become· available, the modules generally continue to find utility while initial 0.125 DIA. +.001 2 PLCS 0.145 REF . . . - - - - - - - " - - " - - - - 3.35 (64 P I N S ) - - - - - - - - - - - - - . { Figure 4. SIMM 2-3 PIN 64 0.345 M L~II ~~ f ~ 0.175 0100 -1 L TvP I I -1L __ I I 1-. 0.015 0.013 0.1~ I I 0:025 TYP (a) ~I·-----------------------~------------------------~ DO D L.-_ _.... DO DO 0.050 lYP (b) Figure 5. VDIP (a) and HVDIP (b) production ramp-up of the monolithic devices keeps supplies short. through-hole mounting (Figure 5b). Components are hermetically encapsulated. Used in both low- and high-pincount applications, they are especially attractive when high component density is required on the main board. As with the plastic VDIP, pins on opposite sides of the module are aligned, and spacing in both directions is 100 mils. VDIPs VDIP modules typically have the largest pin out of any modules. Similar to ZIPs, VDIPs are vertically mounted modules with plastic-encapsulated components and epoxy-encapsulated chips (Figure 5a). VDIP modules have pins along both sides of the substrate, with the pins on alternate sides aligned. Spacing along each row and across the module is 100 mils. The dual row of pins allows a higher connection density than SIPs, while maintaining lOO-mil minimum spacing between adjacent pins. Like ZIPs, VDIPs are useful in high-pin-count devices, where the host board is designed to normal through-hole design rules. VDIPs help retain the density advantages of vertical packages, while providing a low profile. HDIPs Hermetic DIP (HDIP) modules have ceramic substrates with the same pin arrangements and footprints as standard IC DIPs (Figure 6). 
Hermetic components are mounted on both sides of the substrate. Hermetic DIP modules range in. size from 24-pin devices with 300-mil widths to 60-pin, 600-mil devices to 900-mil special modules. The QUIP The quad in-line package (QUIP, Figure 7) is similar to. the DIP except that the QUIP has a dual row of pins along the package edge. In-row and row-to-row spacing is 100 mils, with pins in adjacent rows aligned directly across from one another. The QUIP is a low-profile package with excellent mechanical ruggedness and the added advantage over DIPs of higher pin density for the same package length. Ceramic Modules For harsher environments, several types of modules are available with ceramic substrates and side-brazed leads.· These modules sometimes have sealed metal lids to protect directly-mounted IC chips or utilize hermetically sealed LCC-packaged ICs. Four hermetic packaging styles are available: HVDIPs, HDIPs, PGAs, and QFPs. PGAs and QFPs HVDIPs Pin grid arrays (pGAs, Figure 7) and quad flat packs (QFPs Figure 8) are ceramic-substrate packages similar to those used for monolithic devices, except that the Hermetic vertical DIPs (HVDIPs) are vertically mounting ceramic modules with pins along both edges for 2-4 provides the die-to-die interconnect and the connection to the I/O pins. modules' cavities house more than one die. Each die is individually bonded to pads. The customized substrate I 1.414 ~ • .. I II~bJCII~~ [[~: ~0230 0.285 II ~ -i I- 0.021 I I 0.100 -i I- TYP Figure 6. HDIP 1.010 O.tto ·000000000' -0 ·0 I ·0 e- 0' 0- 0..350 DIA WltDOW '0-+,-0- '0 •0 -0 llOOC 0'0'01 TYP -' DIA 0- 0 0· -000000000- A 000000000 / , 2 l "' ~ , 7 I , '0" "t BOno ... Vltw " A' TOP VIEW Figure 7. PGA Module SEAL RING t .115 MAX t .180 t HEAT SINK Figure 8. QUIP 2-5 .600 JL .100 TYP . ~~RESS 49' Packages in High-Density Module Designs ~COIDUcr~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. - ~---------!g----------- ... - 1-----------8g----------~· ~-------;~--------~ """, N ... 
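The ZIP pin geometry described earlier (two rows on 100-mil centers, with alternate sides staggered by 50 mils) can be illustrated with a short sketch. The function name and the pin-numbering convention (odd pins on one side, even pins on the other) are assumptions made for illustration only; they are not taken from this handbook.

```python
def zip_pin_positions(pin_count, row_pitch_mils=100, stagger_mils=50):
    """Return (pin_number, side, x_mils) tuples for a staggered two-row ZIP.

    Assumed numbering (hypothetical): pins alternate between side 0 and
    side 1 of the substrate as the pin number increases.
    """
    pins = []
    for n in range(pin_count):
        side = n % 2                                  # alternate sides
        x = (n // 2) * row_pitch_mils + side * stagger_mils
        pins.append((n + 1, side, x))
    return pins


pins = zip_pin_positions(64)
side0 = [x for _, side, x in pins if side == 0]

# Pins on the same side stay on 100-mil centers ...
same_side_pitch = side0[1] - side0[0]
# ... while consecutive pins along the edge are only 50 mils apart,
# so a ZIP carries twice as many pins as a SIP of the same length.
edge_pitch = pins[1][2] - pins[0][2]
print(same_side_pitch, edge_pitch, len(pins), len(side0))
```

Under these assumptions a 64-pin ZIP occupies about the same edge length as a 32-pin SIP, which is the doubling of pin count the text describes.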
The Multichip Family of Universal JEDEC ZIP/SIMM Modules

This application note describes three Cypress memory modules, their special features, and how to use the modules as universal memory building blocks. The three modules are the CYM1821, CYM1831, and CYM1841, which provide 512K, 2M, or 8M of static RAM.

The CYM1821, CYM1831, and CYM1841 provide the versatility to design many different systems with the same memory modules. The pin out and footprints allow you to use the same module in 8-, 16-, or 32-bit systems in ZIP or SIMM form factors using a single board layout.

History

The JEDEC Solid State Products Engineering Council approved four series of SRAM ZIP/SIMM module pin outs for balloting in December 1987 (see the Reference). The 64-pin module included definitions for 4 x 16K x 8, 4 x 64K x 8, and 4 x 256K x 8 generations. The JEDEC definition established the industry standard for the mechanical specifications and pin outs of the three generations of modules (Figure 1). The CYM1821, CYM1831, and CYM1841 follow the JEDEC definition.

Variable Width

The JEDEC pin-out definition includes four chip-enable pins that each control a byte-wide block of memory (Figure 2):

CS1, on pin 32, enables I/O0 through I/O7
CS2, on pin 31, enables I/O8 through I/O15
CS3, on pin 34, enables I/O16 through I/O23
CS4, on pin 33, enables I/O24 through I/O31

You can use each generation as a x32, x16, or x8 memory block by driving the chip enables as address pins and connecting the I/O pins in parallel. This scheme allows the memory configurations shown in Table 1.

Table 1. Memory Configurations

                     Word Width
Module      x8           x16          x32
CYM1821     64K x 8      32K x 16     16K x 32
CYM1831     256K x 8     128K x 16    64K x 32
CYM1841     1M x 8       512K x 16    256K x 32

Variable Depth

The three modules provide additional flexibility in memory depth. All three are 64-pin modules with compatible pin outs. The CYM1821 has four no-connect pins: 29, 30, 35, and 36. Pins 29 and 30 are address pins on the CYM1831, and pins 35 and 36 are still no-connects. On the CYM1841, pins 35 and 36 are address pins. This allows the modules to function as memory options in a design.

The module family's variable depth is enhanced by the inclusion of two presence-detect pins: PD0 and PD1 on pins 2 and 3, respectively. These pins provide unique logic conditions for a system to automatically sense the amount of memory present, which permits the system to adapt automatically to the module that is plugged in. The presence-detect pins are either tied to ground on the module or left open, according to the information in Table 2. Figure 3 shows a simple circuit that decodes the presence-detect pins and generates depth-indicator status signals.

Table 2. Presence-Detect Pins

Module       PD1     PD0
No Module    OPEN    OPEN
CYM1821      OPEN    GND
CYM1831      GND     OPEN
CYM1841      GND     GND

Layout Considerations

The three modules are available in either ZIP or SIMM form factors. Additional versatility is included in the module footprint to allow ZIP or SIMM modules to fit into the same board layout. The ZIP pins are arranged in the same hole pattern as a SIMM socket. If the board layout fits a SIMM socket, such as the AMP 821825-1, a ZIP plugs right in. This capability is useful for board prototyping and testing various memory depths in the same socket.
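The presence-detect and variable-width schemes in this note can be sketched in software as a pair of lookups. This is only an illustration: it assumes the host board pulls PD0 and PD1 up so that an open pin reads as logic 1 and a grounded pin reads as logic 0, and the function names are hypothetical. The table contents follow Table 1 and Table 2 of the note.

```python
# Keys are (PD1, PD0) logic levels as read by the host.
# Assumption: host pull-ups make an OPEN pin read 1; a pin tied to GND reads 0.
PRESENCE_DETECT = {
    (1, 1): None,        # both open: no module installed
    (1, 0): "CYM1821",   # PD1 open, PD0 grounded
    (0, 1): "CYM1831",   # PD1 grounded, PD0 open
    (0, 0): "CYM1841",   # both grounded
}

# Depth of each module when used as a x32 block.
DEPTH_X32 = {
    "CYM1821": 16 * 1024,
    "CYM1831": 64 * 1024,
    "CYM1841": 256 * 1024,
}


def identify_module(pd1, pd0):
    """Return the module name for the sensed PD1/PD0 levels, or None."""
    return PRESENCE_DETECT[(pd1, pd0)]


def configuration(module, width):
    """Return (depth_in_words, width) for a x8, x16, or x32 organization.

    Narrower words use the chip enables as extra address bits, so the
    depth grows by the same factor that the width shrinks.
    """
    if width not in (8, 16, 32):
        raise ValueError("width must be 8, 16, or 32")
    return DEPTH_X32[module] * (32 // width), width


module = identify_module(pd1=0, pd0=1)
if module is not None:
    depth, width = configuration(module, 16)
    print(module, depth // 1024, "K x", width)
```

A hardware decoder such as the depth-indicator circuit the note mentions performs the same two-bit decode; the software form is convenient when the pins are read through a status register.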
Figure 1. 64-Pin SRAM Module Pinout (4 x 16K x 8, 4 x 64K x 8, and 4 x 256K x 8 generations)

Reference

1. JEDEC Solid State Products Engineering Council, Committee Letter Ballot JC-42.3-88-9, 16 January 1988.

Figure 2. 64-Pin SRAM Module Block Diagram

Figure 3. Depth Indicator Circuit (decodes PD0 and PD1 into NO MODULE, 16K x 32, 64K x 32, and 256K x 32 indications)

Section Contents

ECL and TTL BiCMOS

Noise Considerations in High-Speed Logic Systems
Using ECL in Single +5V TTL Systems
BiCMOS TTL and ECL SRAMs Improve High-Performance Systems
PLCC and CLCC Packaging for High-Speed Parts
A New Generation of BiCMOS High-Speed TTL SRAMs
Access Time vs. Load Capacitance for High-Speed BiCMOS TTL SRAMs
Combining SRAMs Without an External Decoder
BiCMOS TTL SRAMs Improve MIPS R3000 and R3000A Systems
Memory and Support Logic for Next-Generation ECL Systems

Noise Considerations in High-Speed Logic Systems

This application note explains why ECL is a lower-noise logic family than TTL or CMOS logic, both internally at the circuit level and externally at the system level. Also presented are the implications of ECL for your design needs.

In state-of-the-art logic system design, clock frequencies of 50 MHz and beyond are not uncommon and give rise to many noise problems that were not significant in the past. Due to the nature of TTL/CMOS logic, operating at these faster clock rates is inherently noisier and requires high-power output drivers with their associated ground-bounce problems.

Fortunately, ECL solves these problems. It is built for speed and is available in a low-power BiCMOS process technology. Since ECL was designed for high-speed applications in 1962, a number of design iterations have improved ECL devices. Consequently, it is the premier high-speed logic family.

Additionally, built-in temperature and voltage compensation provides constant noise immunity in 100K devices, so that noise margins are flat. In these devices, temperature compensation is designed into the DC input thresholds by voltage regulation. A correction factor designed into the current source, along with added circuitry between the output transistors' bases, makes 100K ECL's output voltage levels insensitive to temperature. These corrections rely on opposing positive or negative temperature-tracking-coefficient circuits.
In both 100K and 10KH ECL devices, voltage compensation is done by regulating an internal reference voltage, supplying a constant current source, and making both functions independent of supply voltage. These compensations result in a 3x improvement over TTL noise immunity.

Additional anti-noise features include differential pairs, which prevent large current spikes when switching logic states, provide clean power supplies, and reduce ground bounce. Differential paths also cancel internal parasitic charging currents.

Finally, ECL's more constant power dissipation, which is independent of operating frequency, keeps power-supply surges to a minimum. Supply current drain is governed by the constant-current sources that provide operating current for the differential switches and level-shifting networks. Thus, ECL's current drain remains the same regardless of the state of the switches. The high ratio of ECL noise immunity to internally generated noise also contributes significantly to reliable system operation.

ECL's Internal Advantages

Internally, ECL steers current and compares input signals to a voltage level instead of switching transistors on and off over a wide voltage excursion, as do other logic families. ECL's small voltage swings and low-current switching in signal paths minimize crosstalk and noise generation (Figures 1 and 2). ECL generates less noise switching logic levels due to the smaller dV/dt in the I = C dV/dt equation, where C is the coupling capacitance between signal paths, I is crosstalk current, dt is the rise/fall time, and dV is logic swing.

Figure 1. Effects of Rise and Fall Time (I = C dV/dt; crosstalk current I is less for ECL than TTL due to smaller dV and dt. Levels shown: ECL -0.9V and -1.7V; TTL 3.6V and 0V)

Figure 2. Effects of Slew Rate (slew rate = dV/dt, where dt is the rise/fall time and dV is the logic swing; time is saved because the logic swings are smaller and the rise/fall time faster. Swings shown: CMOS 5.0V, TTL 3.6V, ECL 800 mV)

The smaller transitions also prevent the emitter-follower outputs from generating large current spikes when switching logic states, unlike TTL totem-pole outputs. TTL current spikes are also related to I = C dV/dt. For ECL, C is the capacitive load.

TTL ground bounce results from the current spikes and the inductance (L) between the board and the device's pins and bond wires. The bounce voltage (V) equals -L di/dt, which can be severe (see the Reference) and can cause the chip's ground to rise. Because ECL's emitter followers provide superior output current and the lower capacitance of characteristic-impedance transmission lines, ECL solves the problem of power-supply droop and spikes when a large number of transistors change state.

ECL in the System Environment

In the ECL system environment, low-impedance open-emitter outputs and high current capacity allow you to use board-level transmission-line techniques that reduce reflections and decrease roll-off of high-speed rise and fall times.

To understand these system-level advantages, consider that voltage-mode circuits have a High-state output impedance between 50 and 150 Ω and exhibit an output-stepped characteristic. They first reach 50 percent of the final value, then later reach the final value, which can be 3.5V and above.
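The two relations used in this note, I = C dV/dt for crosstalk current and V = -L di/dt for ground bounce, can be evaluated numerically as a rough sketch. The logic swings below are read from Figure 1 (ECL from -1.7V to -0.9V, TTL from 0V to 3.6V). The coupling capacitance, edge time, lead inductance, and current step are assumed values chosen only for illustration, and the same edge time is used for both families so that the effect of the swing alone is visible.

```python
def crosstalk_current(c_coupling, dv, dt):
    """Crosstalk current I = C * dV/dt, in amperes."""
    return c_coupling * dv / dt


def ground_bounce(l_path, di, dt):
    """Magnitude of the bounce voltage V = -L * di/dt, in volts."""
    return l_path * di / dt


# Swings taken from Figure 1 of the note.
ECL_SWING = 0.8      # volts, from -1.7 V to -0.9 V
TTL_SWING = 3.6      # volts, from 0 V to 3.6 V

# Assumed values, not taken from the note.
C_COUPLING = 2e-12   # farads of coupling between adjacent traces
EDGE_TIME = 2e-9     # seconds, same edge assumed for both families
L_LEAD = 10e-9       # henries of pin plus bond-wire inductance
DI = 20e-3           # amperes of switched output current

i_ecl = crosstalk_current(C_COUPLING, ECL_SWING, EDGE_TIME)
i_ttl = crosstalk_current(C_COUPLING, TTL_SWING, EDGE_TIME)
v_bounce = ground_bounce(L_LEAD, DI, EDGE_TIME)

print(f"ECL crosstalk current: {i_ecl * 1e3:.2f} mA")
print(f"TTL crosstalk current: {i_ttl * 1e3:.2f} mA")
print(f"Ground bounce:         {v_bounce * 1e3:.0f} mV")
```

With equal edge times the crosstalk current scales directly with the logic swing, so the ratio of the two currents equals the ratio of the swings regardless of the assumed capacitance.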
In contrast, ECL output impedances are less than 10 Ω and ensure a full-valued signal into transmission lines. The signal only needs to be 800 mV. Outputs are also capable of supplying 50 mA, which is required to drive passive terminations.

Because ECL gives you the built-in ability to drive controlled-impedance PCB traces, you can make trade-offs among power dissipation, speed considerations, and PCB trace width. Some ECL devices have skew-free differential or complementary outputs for common-mode noise rejection at the receiving end of either board traces or twisted-pair wire. As mentioned earlier, ECL's smaller logic transitions lower crosstalk between board-level signal traces, as well as at the IC level.

Logic Family of Choice

The factors described here make ECL the logic family of choice when designing systems at 50 MHz or greater clock/data rates. As a percentage of total logic swing, ECL provides superior noise margin in the system environment compared to both TTL and CMOS logic.

In a typical TTL/CMOS system, board-level noise can be 800 mV or higher due to ground bounce and other switching noise. The Reference explains this effect for both CMOS and TTL and includes actual measurements of ACL (CMOS) devices. In an ECL system, the noise is closer to 20 mV. As shown in Table 1 and Figure 3, ECL's overall system or board-level noise is at a much lower percentage of total signal swing than in TTL/CMOS systems. ECL is therefore less susceptible to logic errors at high speeds (Figures 4 and 5).

Figure 3. Identifying Specification Limits on Input and Output Voltage Levels (shows the output voltage level limits and the input transition region limits)
Note: VNH and VNL are the High- and Low-level device noise margins. Because ECL system noise is much lower than TTL system noise, the smaller ECL device noise margins are still better than the TTL margins.

ECL's full temperature and voltage compensation results in relatively constant signal levels and thresholds; improved noise margins over chip-to-chip temperature and voltage variations; and a tighter AC window in the system environment.

Another benefit of reduced noise generation is the improvement in electromagnetic interference (EMI) and radio-frequency interference (RFI) in ECL systems versus TTL and CMOS.

Table 1. System Noise Generated by Logic

Parameter                   TTL      CMOS     ECL (100K and 10K)
HIGH "1" (VNH) (mV)         400      400      140 to 175
LOW "0" (VNL) (mV)          400      400      140 to 175
Typ. System Noise (mV)      800      800      20
Logic Swing                 3.5V     5V       900 mV
Percent Noise               22.8%    16%      2.2%

Percent Noise = (system noise/logic swing) x 100

Reference

"EDN's advanced CMOS logic ground-bounce tests," David Shear, EDN, March 2, 1989.

Figure 4. ECL Signal (200.0 mV/div, 5.00 ns/div, 80.8538 MHz)

Figure 5. TTL Signal (600.0 mV/div, 5.00 ns/div, 80.6530 MHz)

Using ECL in Single +5V TTL Systems

The advent of very high speed, low-power, ECL-compatible BiCMOS SRAMs and PLDs is causing an evolution in high-performance systems. ECL's inherent speed and noise improvement is well documented, but questions and misconceptions concerning the devices might occur. These questions stem from the fundamental problems of mixing CMOS logic and bipolar ECL circuits on the same die and from interfacing ECL devices in single +5V supply CMOS/TTL systems.

Chip-Level Considerations

At the chip level, it is possible to integrate both ECL and CMOS logic with negligible noise coupling. This compatibility is mainly due to the absence of noisy high-drive output devices between the device's CMOS sections and the ECL I/Os. The combined ECL/CMOS chips exhibit very low interconnect capacitance between devices on-chip, and the drive requirements are minimal. The devices generate less noise than occurs between devices at the board level. The noise magnitude on the chip VEE line equals approximately 20 mV worst case, in contrast to 800 mV of noise in typical high-speed, board-level CMOS/TTL designs.

Further, the unique configuration that Cypress Semiconductor employs to connect the device ECL circuit ground (VCC) and ECL output ground (VCCA) reduces noise coupling between the internal CMOS circuitry and the ECL output drivers. Because the devices have a low overall noise level and employ internal supply decoupling, both the ECL and CMOS sections of Cypress devices run successfully on the same power pin.

Board-Level Considerations

At the board level, using ECL-I/O-type devices in single +5V TTL systems is possible with off-the-shelf level translators. These translators are made specifically to run standard ECL devices in a pseudo-ECL logic mode, with switching levels pulled up to the range between the +5V supply and ground.

ECL normally uses a -5.2V supply for 10K- and 10KH-compatible devices or a -4.5V supply for 100K-compatible devices. Pulling ECL circuits and memories up to a single positive 5V level instead of using the normal supply does not change any performance or absolute-value logic levels so long as all the ECL device VCC pins are tied to +5V, and the device VEE pins are tied to ground.

The translators have separate supply pins and either separate or common ground pins for the circuit's ECL and TTL portions. This feature isolates the noisy TTL supplies from the ECL section, which runs at much faster speeds and with tighter noise margins.

ECL-TTL-ECL Translation

The Brooktree Bt501 (10KH ECL compatible) and Bt502 (100K compatible) octal transceivers and translators perform bidirectional ECL/TTL transfers. These devices offer the option of supplying +5V only to the circuit's ECL portion (Figure 1). This arrangement makes it possible to design the system with only one power source and simplifies the task of adding ECL circuitry to a TTL board.

You can isolate the ECL section from the TTL section in much the same way you isolate analog and digital sections on a mixed-signal board. To isolate the TTL-generated noise from the ECL +5V supply lines, you must maintain separate ECL and TTL power lines; you can have common or separate ground planes. Employ normal power-supply decoupling for ECL devices.

Figure 1. Bt501/502 TTL/ECL Transceiver

Figure 2. ECL to TTL (a, MC10H350), TTL/NMOS to ECL (b, MC10H351)

Figure 4. ECL to TTL

The Brooktree devices have the advantage of providing eight sets of transceivers for both translation directions in one IC package. A disadvantage is speed. The Bt501/502 devices have maximum propagation delays of 7 ns when translating from TTL to ECL and 11 ns in the other direction. In some applications this might be too slow.

For a faster set of translators that run on single 5V supplies, try the Motorola MC10H350 and MC10H351 (Figure 2).
The MC10H350 only translates in the ECL-to-TTL direction, but it is faster than the Brooktree parts, with a 5-ns maximum propagation delay, and includes differential inputs. The MC10H351 is the TTL-to-ECL converter, offering a maximum delay of 2.1 ns. These devices have separate ECL and TTL supply pins, but have common ground-pin connections.

Another method of translation is to use all discrete components or a combination of discrete and integrated products. The purely discrete approach speeds up the translation but introduces the risk that noise from the TTL-to-ECL sections might feed through the power and ground connections. You also have to consider the lack of temperature and/or voltage compensation, which affects noise margins.

For translating TTL signals to ECL, use a simple voltage divider network, whose primary purpose is to reduce the TTL levels to ECL-level logic transitions (Figure 3). In the other direction, a high-speed PNP transistor increases the logic swing to accommodate TTL-logic-level transitions (Figure 4). A faster approach appears in Figure 5, where a differential pair consisting of two PNP transistors takes advantage of the ECL differential outputs.

The choice of transistors greatly influences the propagation delay through these translators. Motorola manufactures some very fast RF-type PNP transistors and matched PNP pairs that can serve well in the circuits shown.

Figure 3. TTL to ECL

Figure 5. ECL to TTL

CY10E383/101E383 Full-Duplex Translator

For the ultimate in speed and flexibility, the Cypress CY10E383/CY101E383 is a new-generation, full-duplex, TTL-to-ECL and ECL-to-TTL logic-level translator designed for high-performance systems (Figure 6). The CY10E383/CY101E383 has many features to satisfy a variety of applications. In the past, level translators suffered from having an insufficient number of channels or supply options.
This caused skew and noise problems that made the use of high-speed ECL logic levels in TTL systems highly undesirable.

The CY10E383/CY101E383 contains ten independent TTL-to-ECL translators and ten independent ECL-to-TTL translators for high-speed, bidirectional, full-duplex data-transmission, mixed-logic, and bus applications. The CY10E383/CY101E383 is especially well suited to driving ECL backplanes between TTL system boards. The translator is implemented with differential ECL I/O to provide balanced, low-noise operation over controlled-impedance buses between TTL and/or ECL subsystems. The part features a delay of only 2 ns max from TTL to ECL and 3 ns max from ECL to TTL, with minimum skew between channels.

The CY10E383/CY101E383 comes equipped with internal 2-kΩ pull-down resistors tied to VEE (ECL supply) to decrease the number of external components. For system testing purposes or for driving light differential loads, the pull-down resistors can serve as the only termination, thus eliminating up to 20 external resistors. You can also use standard ECL terminations with the internal pull-down resistors and still adhere to standard 10K/10KH and 100K logic levels.

Additionally, the translator contains an ECL VBB reference voltage output, which you can use to tie half of the ECL inputs for single-ended operation. The device is designed with ample ground pins to reduce bounce and has separate ECL and TTL power/ground pins to reduce noise coupling between logic families.

The CY10E383/CY101E383 can operate in single or dual supply configurations while maintaining absolute 10K/10KH and 100K level swings to be used with either TTL-type (+5V) or ECL-type (-5.2V) supplies or both. The translators are offered in standard 10K/10KH (10E) and 100K (101E, 100K levels with up to -5.2V power supply) ECL-compatible versions. The TTL I/O is fully TTL compatible. The CY10E383/CY101E383 is packaged in an 84-pin, surface-mountable PLCC.

Figure 6. CY10E383 Full-Duplex TTL/ECL/TTL Translator (block diagram showing differential ECL inputs, TTL outputs, TTL inputs, differential ECL outputs, and the ECL and TTL supplies)

Reference

Blood, William R., Jr., "Motorola MECL System Design Handbook," Motorola Semiconductor Products Inc., Fourth Edition, 1988.

BiCMOS TTL and ECL SRAMs Improve High-Performance Systems

A new BiCMOS process based on clean-slate approaches to implementing ECL or TTL logic with bipolar, BiCMOS, and CMOS transistors in single devices is revolutionizing the speed/density characteristics of SRAMs.

Historically, BiCMOS technologies were developed as either CMOS speed enhancers or bipolar power misers. The resulting BiCMOS processes were patches on either CMOS or bipolar process flows, and performance for the complementary bipolar or MOSFET components was sub-optimal.

In contrast, Cypress's STAR M2 is a third-generation, 0.8μ BiCMOS technology in which the baseline process is BiCMOS. In the STAR process, nonvolatile elements such as polysilicon loads and TiW fuses are easily incorporated into the baseline process. This results in high-density SRAMs, high-speed PLDs, and high-density EPROMs/PLDs.

Figure 1 shows a simplified cross section of the STAR M2 BiCMOS process. This 18-mask, double-poly, double-metal technology utilizes a thin epitaxial layer to achieve NPN Ft greater than 10 GHz and CMOS latch-up immunity. The MOSFETs both use lightly doped drains for high performance and reliability.

In contrast to the architectures of SRAMs made using first-generation BiCMOS processes, STAR's polysilicon bipolar emitter is the same poly used for MOS gates. This enhances NPN performance and decouples the NPN from the poly load module used for 4T SRAM cells.
By utilizing this poly load resistor, STAR allows for an 85-squaremicron memory cell. This third-generation process beats second-generation BiCMOS technologies in terms of product performance, density, and manufacturability. SRAMs area key technology driver, and BiCMOS fills the gap between the power-hungry pure bipolar ECL and the very high density, medium-speed CMOS. To indicate the performance of the Cypress process, Table 1 summarizes gate delays as a function of logic family and fanout. Table 1. Gate Delays Tpd (ps), psiFanout Gate Fanout = 1 CMOS 110 70 12 BiCMOS 240 ECL 95 30* *ECL delay varies with the square root of fanout Figure 1. STAR M2 BiCMOS Process, Simplified 3-7 5if;cvm:ss BieMOS SRAMs Improve High-Performance Sy·stems ~~~OR~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Read Cycle No. 1 too t - - toHA DATA )( )( ADDRESS our PREVIOUS DATA VALID Write Cycle No. I ~ '*XX)( DATA VALID ,r--I djD2 READ '__ _ Addr ___ J I Sh1ft Dl I i A In In _ ~,- ~D1n SEL u _ Uj WE 1: NUX RegIster Clock 2 N I DI = ~ ~ f-----, o >u D2 0 0 SEL u ~ ~ ~ ~ ~ o L-IA In _ NUX I~ .. I IWE Latch ~ 6 • 360 MHz I :g gt- e= ~ ~ u ~ o RAM '" tu N N V NUX ~D1n ~ 1: f Set/Reset D, W ~ N V RAM to §fj~~l , I ~ ~. " ,.. Clock 1 (180 MHz) 1: r ...... ~ o00 .... Btl08 ~ ~ ~ ~ tn tn ~ < E A o o I--- I:Il ~ SEL ShIft RegIster § ~ nWf~1--" lJJ ~ u1: o D 1n ~ MUX ~ DAC -, L-f-I DI RAM e ~ ~ 111 ~ ('D ~ J.4 I USEL ~ I.. == ~ MUX I EVEN READ --..J Addr ua L........f DI lJJ o o L-ISEL u 1: y "'C ('D If') If') __~===J=X~II~__1 I .L.---J- D2 MI ;> ~ I:Il 1-04 l&J rHD2 00 ~ J---l I Rr' ~ 111 I ?--- R/W ~ 111 i n ('D 00 t..od I:Il ('D "'" e t"I:l Samp Ie Clock Waveform Addresses J Waveform Data b [b] lPF ~~ rI • -I I L~>outPut Figure 5. Waveform Synthesis System ECL I/Os. Interconnect capacitance between devicell on the chip is very low, and drive requirements are minimal. Consequently, noise is not generated at the high levels encountered between discrete devices installed on a board. 
The noise magnitude on the chip VCC line is approximately 20 mV worst case, rather than the 800 mV encountered in typical high-speed, board-level CMOS/TTL designs. With a low overall noise level and internal supply decoupling, both the ECL and CMOS sections of Cypress devices run successfully on the same power pin.

Cypress employs a unique configuration to connect the device ECL circuit ground (VCC) and ECL output ground (VCCA). This configuration further reduces noise coupling between the internal CMOS circuitry and the ECL output drivers. The configuration also inhibits output oscillation in response to slow or noisy input signals.

ECL and TTL SRAM Applications

Applications for ECL and TTL SRAMs include graphics and image processing, waveform generation via direct digital synthesis (DDS), and fast µP systems. In video graphics, ECL memory stores color image information. In waveform generation and DDS, ECL memory stores digital representations of analog waveforms before they are fed to a digital-to-analog converter (DAC).

In a typical raster-graphics video system (Figure 4), 3.5-ns CY100E422 ECL SRAMs are used as color lookup tables (LUTs) to drive a Brooktree Bt108 video DAC. The SRAMs are interleaved to achieve the necessary speed and to supply the 8-bit words required for 3D solids shading. Motorola MC100E155s, which have clock-to-output delays of 1 ns, are used as 2:1 mux latches.

In operation, read and write addresses and data are fed to the SRAMs from the octal 2:1 multiplexer/latches, and the color pixel data from the memories is sent to the DAC. This path is one of three in which the DAC drives the intensity of the display's red, green, or blue (RGB) electron gun drivers. This system's 360-MHz speeds are sufficient to drive 2K x 2K displays.

The waveform synthesis system in Figure 5 can be controlled by either a microprocessor or a numerically controlled oscillator (NCO). Another part of the system writes waveform data to memory. Then the processor commands an address sequencer, whose output controls the memory, and the data read out is fed to the DAC, which outputs an analog waveform. This type of fast digital waveform synthesis finds many applications in satellite communications and in video and test equipment. The 8-, 10-, and 12-ns speeds of the TTL 16K x 4 CY7B166 SRAM have improved the throughput of such systems. The system could also use ECL BiCMOS for increased speed, but the resolution of available high-speed ECL DACs is not as high as that of available TTL DACs.

For analog-to-digital applications, ECL and TTL SRAMs are used with high-speed flash A/D converters. Some of these converters have ECL outputs, with clock rates that range from 20 MHz to 1 GHz. Other converters have TTL outputs as fast as 25 MHz. In applications such as HDTV, phased-array radar, digital oscilloscopes, and single-event digitizers, the SRAMs create high-speed specialty memories such as self-timed SRAM, pipelined SRAM, and interleaved SRAM.

[Figure 6. RISC System: CPU, instruction cache, data cache, secondary cache, floating-point controller and multiplier, and system bus controller, connected by the instruction, data, X, and Y buses]

Further applications for ECL and TTL SRAMs are found in high-performance workstations, file servers, and high-end embedded controllers. Figure 6 shows an example based on Mips Computer's 100K ECL version of a commercial RISC microprocessor, which has a clock frequency of 67 MHz and is rated in a general-purpose application mix at 55 MIPS. The cache and TAG blocks are implemented using ECL BiCMOS SRAMs. The system uses standard memories to provide two levels of data cache.
The primary caches include 64 Kbytes of storage for instructions and 16 Kbytes for data, using the fastest 64-Kbit, 8-ns CY100E494 SRAMs. Cache control is part of the integer unit. With primary 8-ns caching, the R6000 CPU can fetch both an instruction and a data word every cycle, instead of having to wait several cycles for main memory to keep pace. The slower 512-Kbyte secondary cache is made up of 20-ns devices.

A general-purpose cache-TAG implementation using standard ECL memories (Figure 7) uses two Cypress 1K x 4 CY100E474 ECL SRAMs (3.5 ns max access time). Two Motorola MC100E107 quint XOR/XNOR gates (800 ps max prop delay D to F) perform the compare function. The combined speed of the SRAMs and logic corresponds to a 4.5-ns address-to-match comparison time. Note that the outputs of ECL PLDs or logic are wire-ORed to save one additional component. Alternatively, one CY100E302 (16P4) ECL PLD could be programmed to implement the 8-bit compare function in approximately 3 ns and save board space. Other memory sizes (e.g., 16K x 4) could be used to increase depth, and word width could be optimized by cascading devices.

Figure 8 shows the critical path for a TTL 80386-based cache system with a two-phase clock. The path consists of a DRAM controller implemented in a gate array, address generation configured in PLDs, cache SRAM, and cache TAG. Table 2 shows how the speed of cache TAG and cache RAM affects path speeds.

Table 2. Path Speeds for 80386 Cache

                      Bus Cycle Time (MHz)
Device                33      40      50
Gate array            17.5    15.0    12.0
TTL PLDs              7.5     7.0     5.0
Cache RAM enable      15.0    12.0    10.0
Cache TAG             20.0    15.0    13.0
Total                 60.0    50.0    40.0
+ 2-phase clock       30.0    25.0    20.0

As bus cycle times decrease because of faster µP clocks, the speed of the cache TAG and cache RAM becomes very important in achieving path speeds.
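The arithmetic behind Table 2 can be checked with a short script. The sketch below is illustrative only: it assumes the table entries are delays in nanoseconds and treats the "Total" row as the budget for the path, neither of which the table states explicitly. The names `PATH_DELAYS_NS`, `TABLE_TOTAL_NS`, and the helper functions are hypothetical.

```python
# Device delays copied from Table 2, assumed to be in ns, keyed by bus speed in MHz.
PATH_DELAYS_NS = {
    33: {"gate_array": 17.5, "ttl_plds": 7.5, "cache_ram_enable": 15.0, "cache_tag": 20.0},
    40: {"gate_array": 15.0, "ttl_plds": 7.0, "cache_ram_enable": 12.0, "cache_tag": 15.0},
    50: {"gate_array": 12.0, "ttl_plds": 5.0, "cache_ram_enable": 10.0, "cache_tag": 13.0},
}

# "Total" row of Table 2, treated here as the time budget (an assumption).
TABLE_TOTAL_NS = {33: 60.0, 40: 50.0, 50: 40.0}


def path_total_ns(bus_mhz):
    """Sum of the four device delays on the critical path."""
    return sum(PATH_DELAYS_NS[bus_mhz].values())


def slack_ns(bus_mhz):
    """Budget left over after the four device delays are added up."""
    return TABLE_TOTAL_NS[bus_mhz] - path_total_ns(bus_mhz)


def memory_share(bus_mhz):
    """Fraction of the budget consumed by cache RAM enable plus cache TAG."""
    delays = PATH_DELAYS_NS[bus_mhz]
    return (delays["cache_ram_enable"] + delays["cache_tag"]) / TABLE_TOTAL_NS[bus_mhz]


if __name__ == "__main__":
    for mhz in sorted(PATH_DELAYS_NS):
        print(f"{mhz} MHz: path = {path_total_ns(mhz):.1f} ns, "
              f"slack = {slack_ns(mhz):.1f} ns, "
              f"memory share = {memory_share(mhz):.0%}")
```

Under these assumptions the cache RAM and cache TAG together take more than half of the budget at every bus speed, which is why faster SRAMs have such a direct effect on the path.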
For today's TTL µPs, BiCMOS TTL SRAM implementations reduce access times to meet cycle time requirements. An example uses the Cypress CY7B160 16K x 4 TTL BiCMOS SRAM. It is an 8-, 10-, and 12-ns device with internal decoding to enable easy memory expansion without sacrificing speed. Figure 9 shows how to use four devices to create an 8-, 10-, or 12-ns 64K x 4 memory and a 32K x 8 memory. Additional configurations are also easily implemented using no external decoding.

[Figure 9. Example Memory Configurations: a 64K x 4 RAM and a 32K x 8 RAM, each configured with four CY7B160s (A through D), with truth tables]

PLCC and CLCC Packaging for High-Speed Parts

The semiconductor industry is constantly searching for package options that enhance the capabilities of high-performance devices. For fast device performance with minimal ground bounce, electrical characteristics must include low inductance and capacitance from external pin to die bond-wire pad. A package should also furnish good thermal characteristics for reliability over extended temperature ranges.
Other major properties sought after are low cost, as well as standardized outline/pin configurations for compatibility, ease of manufacturing, and handling throughput. The package must also work with surface mount technology and have a small footprint to save board space.

The package that best meets all these requirements is the PLCC (plastic leaded chip carrier). In the past, utilization of PLCCs was not practical for high-power, bipolar devices. However, the advent of low-power bipolar and BiCMOS ECL-compatible SRAMs and PLDs now provides the opportunity for high-volume usage. As manufacturers switch from bipolar to BiCMOS, the lower power dissipation of high-density ECL SRAMs and complex PLDs promises to give PLCC packages a bright future. For military applications and extended temperature environments, or for devices with higher power dissipation, you can substitute the CLCC (ceramic leaded chip carrier).

The PLCC has many desirable qualities:

• Suitable for surface mounting with J-type leads
• Small footprint to save board space
• Low inductance and capacitance for high speed with little ground bounce
• Good thermal characteristics for reliability over temperature range
• Ease of manufacturing and handling for production throughput
• Low cost compared to CERDIP, flatpack, LCC
• Standard package outline and pin-configuration compatibility

The PLCC's J-type surface-mount leads have the advantage over gull-wing leads, which are susceptible to fatigue. J leads also enhance handling ease in test and burn-in fixtures. The PLCC's 1-pF capacitance compares favorably with the 3 and 6 pF for plastic DIPs and CERDIPs, and inductance is equally impressive: 2 nH versus 6 and 11 nH for plastic DIP and CERDIP. Unlike flatpacks, PLCCs are available in standard tooling. PLCCs come in a variety of pin configurations, from 18 to over 200 pins, versus a maximum of 40 pins for plastic DIPs.
The Ceramic Leaded Chip Carrier

For high-temperature environments and high-power devices, you can make use of the ceramic leaded chip carrier (CLCC, Y package), which can also be surface mounted. The Y package has the same footprint and J leads as the PLCC (Figure 1) and works well for the faster PLDs and SRAMs. If you do not know system temperature in the early stages of a design, you can substitute the Y package for the PLCC and vice versa, so long as the device's die junction temperature does not exceed 150°C. The Y package is slightly more expensive than the PLCC, but with a thermal resistance from junction to ambient (θJA) of 35°C/W at 500 LFPM, the Y package can dissipate heat more efficiently.

Reliability

Cypress's bipolar and BiCMOS products in PLCC and CLCC packages go through extensive burn-in and testing at elevated temperature to guarantee package integrity. Cypress strongly recommends 500-LFPM system forced air flow but guarantees reliability in systems with or without the flow if the ambient air does not cause the junction temperature (TJ) to exceed 150°C.

The PLCC's θJA is approximately 45°C/W. The SRAMs have power dissipation that ranges from 780 mW max for the CY100E422L-5 up to 1097 mW max for the CY10E474L-5. This dissipation results in junction temperature rises from 35 to 49°C. The 16P4-type PLD (CY100E302L-6) has a temperature rise of 39°C.

[Figure 1. Diagrams of 28-Lead Chip Carriers: 28-Lead Plastic Leaded Chip Carrier J64 and 28-Pin Ceramic Leaded Chip Carrier Y64, dimensions in inches]

The graphs are based on a linear method of interpreting the failures observed at burn-in and indicate the long-term reliability of Cypress devices.
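The temperature rises quoted in the Reliability discussion are consistent with multiplying each device's maximum power dissipation by the PLCC's θJA. The following minimal sketch reproduces that arithmetic and also derives the highest ambient temperature that keeps the junction at or below 150°C. The function and constant names are hypothetical, and the model assumes that the rise is simply power times θJA.

```python
THETA_JA_PLCC_C_PER_W = 45.0  # approximate PLCC junction-to-ambient thermal resistance
TJ_LIMIT_C = 150.0            # junction temperature limit quoted in the text


def temperature_rise_c(power_w, theta_ja_c_per_w=THETA_JA_PLCC_C_PER_W):
    """Junction temperature rise above ambient: power times thermal resistance."""
    return power_w * theta_ja_c_per_w


def junction_temp_c(power_w, ambient_c, theta_ja_c_per_w=THETA_JA_PLCC_C_PER_W):
    """Junction temperature: ambient temperature plus the rise."""
    return ambient_c + temperature_rise_c(power_w, theta_ja_c_per_w)


def max_ambient_c(power_w, theta_ja_c_per_w=THETA_JA_PLCC_C_PER_W, tj_limit_c=TJ_LIMIT_C):
    """Highest ambient temperature that keeps the junction at or below the limit."""
    return tj_limit_c - temperature_rise_c(power_w, theta_ja_c_per_w)


if __name__ == "__main__":
    # Maximum power figures quoted for the two SRAMs, in watts.
    for label, power_w in (("780-mW SRAM", 0.780), ("1097-mW SRAM", 1.097)):
        print(f"{label}: rise = {temperature_rise_c(power_w):.1f} C, "
              f"max ambient = {max_ambient_c(power_w):.1f} C")
```

The same helper can be called with the CLCC's 35°C/W to compare the two packages for a given power dissipation.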
You can use the graphs to determine MTBF and FITs for any Cypress device in any package after calculating the appropriate ΔT.

The 16P8-type PLD (CY10E301L-6) has a temperature rise of 47°C. The CLCC package's θJA equals 35°C/W for temperature rises of up to 55°C (CY10E474-3).

Finding Chip-Level Junction Temperature

The following relationship determines chip-level junction temperature for the PLCC package:

TJ = ΔT + TA

where ΔT = PD x θJA and θJA = θJC + θCS + θSA

To calculate worst-case junction temperature (TJ), use maximum supply VEE and IEE for power dissipation and maximum TA for the temperature range of interest. For the 10K/10KH CY10E301L in a PLCC, for example, device IEE = 170 mA max and VEE = 5.46 V max for PD = 928 mW. Add 15 mW per output for a total output PD = 120 mW. Therefore, the total PD = 1048 mW.

For a PLCC, θJA = 45°C/W at 500 LFPM, and θJA = 64°C/W for still air. For a CLCC, θJA = 35°C/W at 500 LFPM, and θJA = 54°C/W for still air. Because TJ = total PD x θJA + TA and TA = 75°C for the worst-case commercial temperature range, for the PLCC:

TJ = (1.048 W)(45°C/W) + 75°C = 122°C at 500 LFPM
TJ = (1.048 W)(64°C/W) + 75°C = 142°C in still air

This calculation is for absolute worst-case data sheet conditions. The burn-in temperature used by Cypress (TJ) is much higher than the device will ever see in a system. Note that most systems will not run at worst case due to guard-banding. For this reason, use VEENOM = 5.2 V or 4.5 V and IEENOM = (IEEMAX)(85%) for nominal-condition calculations.

The X-axis on the graphs indicates junction temperature. These values are determined by adding the ΔT to ambient temperature, as described earlier. As an example, Figures 2 and 3 note the following critical points for a CY10E301L ECL PLD under three different operating conditions:

Point A - 10K/10KH typical data sheet conditions: 25°C ambient, nominal VEE and IEE, 50Ω loads, 500 LFPM air flow, TJ = 64°C, FITs = 7, MTBF = 18,000 yrs.
Point B - 10K/10KH typical operating conditions: 55°C ambient, nominal VEE and IEE, 50Ω loads, 500 LFPM air flow, TJ = 94°C, FITs = 45, MTBF = 2800 yrs.

Point C - 10K/10KH absolute worst-case conditions: 75°C ambient, 5.46 V max and 170 mA max, 50Ω loads, 500 LFPM air flow, TJ = 122°C, FITs = 225, MTBF = 525 yrs.

The activation energy used for the MTBF and FITs information is 0.7 eV. This is an average number for die-surface-related defects, such as metal and oxide pinholes, but is very conservative for silicon defects or mechanical interfaces to packages, for which the number is usually 1.0 eV. A small change here results in a significant change in MTBF or FITs. A change to 0.8 eV equates to a 33% reduction in FITs rate or a 50% increase in MTBF.

The Packages of Choice

The PLCC and CLCC are accepted as the packages of choice by many manufacturers of high-speed devices. Motorola Semiconductor uses the PLCC as the only package for the company's very high speed ECLINPS ECL logic family, which stands for "ECL in picoseconds" and is pronounced "eclipse." This family has set-up times and propagation delays in the sub-nanosecond range, with power dissipation of over 1 W. Fully compatible with Cypress SRAMs and PLDs, the ECLINPS family includes many 10K, 10KH, and 100K standard logic gates, building blocks, and transceivers.

Real-World Values

Obviously, most systems do not operate at the worst-case conditions. Therefore, Figures 2 through 5 show graphs over different operating conditions to determine failures in time (FITs) and mean time between failure (MTBF) for a typical system or in a worst-case scenario.

[Plot: ECL PLD FITs vs. Tj; y-axis: FITs; x-axis: junction temperature (deg C), 60 to 150]
Figure 2.
Failures in Time vs Junction Temperature

[Plot: ECL PLD MTBF vs. Tj; y-axis: MTBF (years); x-axis: junction temperature (deg C), 60 to 150]
Figure 3. Mean Time Between Failures vs Junction Temp.

[Plot: ECL SRAM FITs vs. Tj; y-axis: FITs; x-axis: junction temperature (deg C), 60 to 150]
Figure 4. Failures in Time vs Junction Temperature

[Plot: ECL SRAM MTBF vs. Tj; y-axis: MTBF (years); x-axis: junction temperature (deg C), 50 to 150]
Figure 5. Mean Time Between Failure vs Junction Temp.

A New Generation of BiCMOS High-Speed TTL SRAMs

This application note profiles the Cypress CY7B166 family of TTL-I/O 64K SRAMs, which are ushering in a new era of high-performance memory devices. These are the world's fastest BiCMOS RAMs, with address access times as low as 8 ns. Arranged in 16K x 4 and 8K x 8 architectures, the devices are functionally equivalent to their industry-standard, TTL-compatible, CMOS counterparts; there is no difference in I/O logic-level min/max specifications. In addition, on-chip features provide superior ground-bounce characteristics and faster propagation delays than is possible with rail-to-rail output swings.

BiCMOS Technology

BiCMOS technology employs CMOS inputs for compatibility with existing products and bipolar on-chip bus interconnects and sense amplifiers to speed the internal access timing. The resulting throughput improvement allows more time for the outputs to slew load capacitance.

[Figure 1. 64K TTL SRAM Circuit Technology: block diagram with input and output buffers, control logic, and a legend marking the bipolar, CMOS, and BiCMOS sections]

Further, BiCMOS uses both CMOS and bipolar transistors on the outputs to optimize drive capability. Figure 1 shows how the parts of the memories are partitioned by technology.

On the outputs, two bipolar transistors drop two Vbe levels (approximately 1.6 V) to reduce the High-level output swing. One device is tied base to collector as a diode, and the other is the High-level drive transistor. Both transistors cause the output to conform to standard TTL logic levels (not CMOS rail-to-rail). This output structure appears in Figure 2. The diode is the bipolar transistor Q3, and Q2 is the High-level drive transistor. M18 is an output Low-level pull-down MOSFET (n-type). Keeping the output from swinging to the power supply rail saves time when changing states, as shown in Figure 3.

[Figure 2. CY7C166C BiCMOS Family I/O Architecture: TTL CMOS input and TTL BiCMOS output]

Figure 2 also shows the SRAM's input structure. The CMOS devices are M2 and M4. The input structure includes bipolar-type input clamping diodes, which act as ESD protection devices and meet MIL-STD-883C Method 3015 static discharge voltages of 2001 V. The inputs adhere to standard CMOS specifications. The outputs include the same diodes, which are an improvement over CMOS-type diodes. The diodes also clamp transmission-line reflections in mismatched board traces.

Compatibility and Improvements

To reduce ground-bounce noise problems associated with full-swing, high-speed CMOS devices (and TTL parts to a lesser degree), the CY7B166 SRAMs include an internal supply-bypass capacitor between the power supply pin and the ground pin.
In parallel with this capacitor, an inductor of equal value to package lead inductance cuts in half the overall inductance associated with output-swing ground bounce. Both the capacitor and inductor decrease the magnitude of the bounce on the output-logic swing's falling edge.

In conclusion, to illustrate BiCMOS compatibility and improvements, Figure 4 shows I/O waveforms for BiCMOS and CMOS devices. These waveforms show that no compatibility problems arise when substituting BiCMOS-type TTL devices for CMOS parts in a new or existing TTL-I/O system. On the contrary, upgrading from a CMOS 64K TTL-I/O SRAM to Cypress's BiCMOS device family increases speed and noise immunity and decreases noise generation, for an overall system improvement.

[Figure 3. Speed Increase from Reduced Logic Swing: slew rate = dv/dt, rise/fall time = dt, logic swing = dv; time is saved because the logic swings are smaller]

[Figure 4. CY7B166 BiCMOS Output vs 64K TTL CMOS Output: address and data waveforms for the TTL CMOS output and the TTL BiCMOS output, with the test setup (3X attenuation, 100Ω, 10 pF, 50Ω scope)]

Access Time vs Load Capacitance for High-Speed BiCMOS TTL SRAMs

Although many TTL and CMOS components are specified for 30- to 50-pF drive requirements, the actual characteristics of modern high-speed systems are quite different. In a system environment using good transmission lines and termination techniques, the drive requirement depends on the characteristics and length of the transmission lines, the number of succeeding device packages, and where devices are physically distributed along the line. For testing purposes, however, you can approximate the effective capacitance seen at the output of high-speed SRAMs as a lumped capacitance connected directly to the output. This lumped value is from 10 to 30 pF in most board-level systems.
This application note provides a technique for analyzing a system's load capacitance and shows how to determine access time (Taa) degradation. You can also determine other output-related specifications such as tDOE using this method.

The BiCMOS process has made available a new generation of 8-, 10-, and 12-ns TTL-I/O-compatible SRAMs. In the past, the most significant speed barrier in SRAMs was the propagation delay through the device. This delay is now becoming quite small. Consequently, the time the device takes to slew the output load capacitance is a substantial portion of overall delay and must be understood to determine optimum system timing. The techniques presented here can thus help you maximize your system's throughput.

The graph in Figure 1 represents the additional access time requirements for various lumped-output-capacitance values. This graph applies to the CY7B160, CY7B161, CY7B162, CY7B164, CY7B166, CY7B185, and CY7B186. Data is shown for the falling edge only because this edge is affected most by load capacitance. The graph is normalized to 20 pF and can be used for all speed grades. For the -8 devices with no capacitive load, for example, the Taa is 6.9 ns; at 20 pF it is 8 ns; and at 50 pF it is 8.8 ns.

[Figure 1. Normalized DELTA Taa vs Load Capacitance: 64K TTL BiCMOS SRAM; y-axis: DELTA taa (ns); x-axis: CL, total load capacitance (pF), 0 to 50]

The setup used to get the values in Figure 1 approximates 50Ω terminated transmission lines and is shown in Figure 2. To avoid loading the output, a 100Ω resistor is put in series with the termination resistor. This adds a 3X attenuation factor but does not alter the results.

[Figure 2. Test Load to Determine Taa vs CL]

Using the following measurement techniques and a reasonable number of device loads, you can derive any system's characteristic capacitance. This allows load adjustment to keep address access time degradation to a minimum. You can also use this technique to determine other specifications that depend on output rise and fall time, such as tDOE.

Measuring Load Capacitance

Now that the capacitance's effect on the device speed is known, the 20-pF approximation can be used to determine Taa. This requires a method for measuring the system load capacitance. A simple method is to use time-domain reflectometry (TDR), which determines capacitance on a transmission line by measuring the pulse reflection the capacitance causes.

The TDR test system (Figure 3) consists of a fast pulse generator and oscilloscope with 50Ω terminated inputs. The oscilloscope's channel A measures the reflected voltages, and channel B measures the setup of rise time, logic swing, and pulse width. A single device or a critical path with various loads can be measured to determine dynamic capacitive loading. Note that although the length of cables A and C is not critical to the measurement, the time it takes the pulse to traverse cable B must be much greater than the pulse's maximum rise time. This ensures that reflections are measured after the pulse has stabilized and not during a transition. Also note that the test point is any input or output to a PCB transmission line or device and that outputs must be forced to a Low state to be measured.

[Figure 3. Test Setup for TDR Capacitance Measurement: pulse generator (HP 8082A or equivalent), optional power divider, cables A, B, and C, digitizing oscilloscope (HP 54120 or equivalent) with channels A and B, and the device test pin on a dynamic test board]

Figure 4 shows the setup pulse on channel B with no device or board in the path. The setup waveform corresponds to the SRAM's output characteristics. Figure 5 shows an example reflection, indicating the ΔV reflected voltage measurement and position on cable B.

[Figure 4. Setup Pulse on Channel B (nothing in the path): VIH = +3.0 V, VIL = +0.4 V, tr = 2 ns, with 10%, 50%, and 90% levels marked]

[Figure 5. DELTA V Reflected Voltage Measurement: ΔV = maximum reflected voltage, located at 2X the length of cable B]

You can determine capacitance values from the test data using the following equation:

$$C_D = \frac{4\, t_r\, \Delta V}{Z_0\, V_I} \qquad \text{(Eq. 1)}$$

where ΔV = maximum reflected voltage at channel A, tr = rise time of incident pulse, Zo = 50Ω, and VI = logic swing of incident pulse. The equation includes the 2X attenuation factor introduced by the test circuit.

Measuring Capacitance Values Exactly

The line capacitance along with the load capacitance found using Equation 1 determine the total capacitance and time delay added to the access time. Two ways to determine the additional delay are to calculate the extra time and add the result to the no-load access time, or to calculate the load capacitance and use the graph in Figure 1. These approximations for total capacitance are adequate in most situations, but you can also measure actual line and load capacitance using a high-frequency LCR meter. Usually this equipment is unavailable and/or expensive due to the frequency range needed to get an accurate measurement.

Another approach is to determine the transmission line's capacitance per foot by analyzing the line's characteristics based on the type of line and board construction. A typical 50Ω microstrip line has approximately 35 pF/ft. (3 pF/in.) based on the equation:

$$C_0 = \frac{1.017 \times 10^{-9}\,(0.45\,\varepsilon_r + 0.67)^{0.5}}{Z_0} \qquad \text{(Eq. 2)}$$

where Zo = 50Ω and εr = 3 for G-10 fiberglass-epoxy PCBs.

The distributed load and line capacitance interact for an overall transmission-line propagation delay equivalent to:

$$t_{pd} = 1.017\,(\varepsilon_r)^{0.5}\left(1 + \frac{C_D}{C_0}\right)^{0.5}\ \text{ns/ft} \qquad \text{(Eq. 3)}$$

where CD = distributed capacitance.

This line length and load-dependent delay can be added to the no-load (0 pF) access time from Figure 1 to derive system timing. For example, for a 3-in. microstrip transmission line (Co = 35 pF/ft.) with a 12-ns device driving one load (5 pF), the total delay is:

No-load access time = taa @ 20 pF - 1.1 ns = 12 ns - 1.1 ns = 10.9 ns

$$t_{pd} = 1.017\,(3)^{0.5}\left(1 + \frac{5}{35}\right)^{0.5}\ \text{ns/ft} \times \frac{3\ \text{in.}}{12\ \text{in./ft}} = 1.88\ \text{ns/ft} \times 0.25\ \text{ft} = 0.47\ \text{ns}$$

taa total = 0.47 ns + 10.9 ns = 11.4 ns

Another way to determine delay is from the overall load capacitance, including line and distributed load:

$$C_{total} = C_0\left(1 + \frac{C_D}{C_0}\right)^{0.5} \qquad \text{(Eq. 4)}$$

where Ctotal = total line capacitance.

You can use Ctotal to determine the additional access time from the graph in Figure 1. For example, for a 3-in. 50Ω microstrip transmission line (Co = 35 pF/ft.) driving one load (CD = 5 pF/ft.), the total capacitance is:

$$C_{total} = 35\ \text{pF/ft}\left(1 + \frac{5}{35}\right)^{0.5} \times \frac{3\ \text{in.}}{12\ \text{in./ft}} = 9.35\ \text{pF}$$

Using the graph, the access time decreases by 0.41 ns, for an access time of 11.6 ns. If the line is 6 in. long with two loads, the total capacitance is 19.8 pF, for an increased access time of 11.95 ns. Using Equation 3 gives 11.4 ns and 11.9 ns for the two examples.
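The two worked examples above (a 3-in. line with one load and a 6-in. line with two loads) can be reproduced with a short script. This is a sketch under stated assumptions: it takes Co = 35 pF/ft and εr = 3 from the text, uses 5 pF per load (so two loads give CD = 10, the value that reproduces the quoted 19.8 pF), and the function names are hypothetical.

```python
import math

C0_PF_PER_FT = 35.0   # microstrip line capacitance quoted in the text
EPSILON_R = 3.0       # relative dielectric constant used in the worked example
NO_LOAD_TAA_NS = 12.0 - 1.1  # 12-ns device normalized from 20 pF to 0 pF


def loaded_line_delay_ns(length_in, c_load_pf, c0=C0_PF_PER_FT, er=EPSILON_R):
    """Eq. 3: propagation delay of the loaded line, scaled to its length."""
    tpd_ns_per_ft = 1.017 * math.sqrt(er) * math.sqrt(1.0 + c_load_pf / c0)
    return tpd_ns_per_ft * length_in / 12.0


def total_capacitance_pf(length_in, c_load_pf, c0=C0_PF_PER_FT):
    """Eq. 4: combined line and distributed load capacitance, scaled to line length."""
    return c0 * math.sqrt(1.0 + c_load_pf / c0) * length_in / 12.0


if __name__ == "__main__":
    for length_in, loads in ((3, 1), (6, 2)):
        c_load_pf = 5.0 * loads  # assumed: 5 pF per load
        delay_ns = loaded_line_delay_ns(length_in, c_load_pf)
        print(f"{length_in} in., {loads} load(s): "
              f"Ctotal = {total_capacitance_pf(length_in, c_load_pf):.2f} pF, "
              f"line delay = {delay_ns:.2f} ns, "
              f"taa total = {NO_LOAD_TAA_NS + delay_ns:.1f} ns")
```

The Eq. 4 result is the value to look up on the Figure 1 graph, while the Eq. 3 result is added directly to the no-load access time; the two methods should land within a few tenths of a nanosecond of each other, as in the text.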
Combining SRAMs Without an External Decoder

[Figure 1. 32 Kbit x 8: 32K x 8 RAM configured with four CY7B160s (A through D)]

Truth table (Figure 1)
A14 | A B C D
 0  | 1 0 1 0
 1  | 0 1 0 1

An internal decoder with four chip-enable inputs helps designers retain the 8/10-ns access times of the CY7B160 16K x 4 BiCMOS SRAM in multiple-chip memory configurations. Without this capability, denser memory arrays require external logic, which adds 3 or more nanoseconds to the access time.

This application note describes how to use the 16K x 4 SRAM to create 64K x 4, 32K x 8, or 64K x 8 memories without an external decoder. In the x4 configuration, only one Cypress CY7B160 is active at a time. In the x8 configurations, two chips are active at once. Devices that are deselected power down to less than 40 mA of standby current from a maximum operating current of 120 mA.

Figures 1, 2, and 3 show how two additional address lines, connected to the memories' chip-enable (CE) inputs, permit multiple-SRAM configurations without using an external decoder. You can use a fifth chip-enable input to power down all devices. The decoder works without external logic because two of the CE inputs (/CE2 and /CE3) are active Low, and two (CE4 and CE5) are active High. When any CE pin is pulled out of its active state, the chip is deselected. Any CE pin can deselect and power down the device independently of the other CE pins.

1991 Electronic Design. Reprinted by permission.
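The decoding rule described in this note (a chip is selected only while every chip-enable pin is in its active state) can be modeled in a few lines. The sketch below is a behavioral illustration for a four-chip 64K x 4 bank. The pin-to-signal wiring in `WIRING` is an assumption chosen to satisfy the rule, not a transcription of the handbook's schematic, and the fifth chip-enable input is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipWiring:
    """Signal tied to each chip-enable pin: 'A15', 'A14', 'GND', or 'VCC'."""
    ce2_n: str  # active Low
    ce3_n: str  # active Low
    ce4: str    # active High
    ce5: str    # active High


# Assumed wiring: each chip responds to exactly one (A15, A14) combination.
WIRING = {
    "A": ChipWiring("A15", "A14", "VCC", "VCC"),  # selected when A15 = 0, A14 = 0
    "B": ChipWiring("A15", "GND", "A14", "VCC"),  # selected when A15 = 0, A14 = 1
    "C": ChipWiring("A14", "GND", "A15", "VCC"),  # selected when A15 = 1, A14 = 0
    "D": ChipWiring("GND", "GND", "A14", "A15"),  # selected when A15 = 1, A14 = 1
}


def level(signal, a15, a14):
    """Logic level (0 or 1) of a named signal for the given address bits."""
    return {"A15": a15, "A14": a14, "GND": 0, "VCC": 1}[signal]


def is_selected(wiring, a15, a14):
    """A chip is enabled only when both active-Low pins are 0 and both active-High pins are 1."""
    return (level(wiring.ce2_n, a15, a14) == 0
            and level(wiring.ce3_n, a15, a14) == 0
            and level(wiring.ce4, a15, a14) == 1
            and level(wiring.ce5, a15, a14) == 1)


def active_chips(a15, a14):
    """Names of the chips enabled for one combination of the upper address bits."""
    return [name for name, wiring in WIRING.items() if is_selected(wiring, a15, a14)]


if __name__ == "__main__":
    for a15 in (0, 1):
        for a14 in (0, 1):
            print(f"A15={a15} A14={a14} -> {active_chips(a15, a14)}")
```

Because unused active-Low pins are tied to ground and unused active-High pins to VCC, each chip decodes its own address range and no external gate sits in the access path.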
[Figure 2. 64 Kbit x 8: eight CY7B160s (A through H) with the upper address lines wired to the chip-enable inputs]

Truth table (Figure 2)
A15 A14 | A B C D E F G H
 0   0  | 1 0 1 0 0 0 0 0
 0   1  | 0 1 0 1 0 0 0 0
 1   0  | 0 0 0 0 1 0 1 0
 1   1  | 0 0 0 0 0 1 0 1

[Figure 3. 64K x 4: 64K x 4 RAM configured with four CY7B160s (A through D)]

Truth table (Figure 3)
A15 A14 | A B C D
 0   0  | 1 0 0 0
 0   1  | 0 1 0 0
 1   0  | 0 0 1 0
 1   1  | 0 0 0 1

BiCMOS TTL SRAMs Improve MIPS R3000 and R3000A Systems

This application note analyzes the speeds required for the cache SRAMs used in RISC systems. The focus here is on the R3000/R3000A RISC processor architecture from MIPS Computer Systems Inc.

One of the goals of RISC-type machines is to execute one instruction per CPU cycle. To achieve this goal, RISC processors employ a compact and unified instruction set, a deep instruction pipeline, and careful adaptation to optimizing compilers. However, these benefits can be rendered useless without an efficient cache memory system composed of fast SRAMs.
Design Overview

A block diagram showing the memory components of an R3000A cache system appears in Figure 1. The memory system is designed for maximum bandwidth by utilizing separate instruction and data caches and an external write buffer for main memory. The high-speed cache is physically close to the processor and holds instructions and data that are repetitively accessed by the CPU; this reduces the number of times that slow main memory must be utilized.

The R3000 can handle up to 256 Kbytes in 64K entries. The processor provides cache control, which is direct mapped. The processor also provides tag control to verify that the correct data is read from cache. The controller can refill multiple words when a cache miss occurs. With separate data and instruction caches on the same bus, the processor can access or write data and instructions at the CPU's cycle rate.

The separate cache architecture for instruction and data memory means that the two caches are accessed alternately during each CPU cycle. This makes the cache access time equal to half the clock period. As the processor speed increases from 25 MHz for the R3000 to 33 and 40 MHz for the R3000A, the time allowed for instruction and data fetches from cache memory decreases. The clock period is 30 ns for the 33-MHz system and 25 ns for the 40-MHz system. This leaves 15 ns to access and/or read/write data for the 33-MHz system and only 12.5 ns for the 40-MHz system.

To further illustrate the cache timing, a sample read critical path in a 64-Kbyte cache system appears in Figure 2. Path 1 is the access time from the R3000 through the 373A latch and into the CY7B166 SRAMs. Path 2 is the time it takes data to be valid after an IRd signal is received from the R3000. The external latch between the R3000A and the cache address inputs provides part of the pipelining used in the R3000 system and also minimizes loading between the addresses of cascaded memories and the R3000.
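The half-cycle windows quoted above follow directly from the clock period. The short sketch below reproduces that arithmetic using the 30-ns and 25-ns periods given in the text; the function and variable names are illustrative only.

```python
def cache_window_ns(clock_period_ns):
    """Each cache is accessed in alternate half cycles, so it gets half the clock period."""
    return clock_period_ns / 2.0

# Clock periods as quoted in the text for the 33- and 40-MHz systems.
CLOCK_PERIOD_NS = {33: 30.0, 40: 25.0}

if __name__ == "__main__":
    for mhz, period in sorted(CLOCK_PERIOD_NS.items()):
        window = cache_window_ns(period)
        print(f"{mhz}-MHz system: {period} ns period, {window} ns per cache access")
```

Everything on a critical path (latch delay, SRAM access, board delay, and set-up time) has to fit inside this window, which is why the window shrinking from 15 ns to 12.5 ns matters so much for SRAM selection.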
The extra latch causes the memory's access time to become critical, however.

As shown in Figure 3, data is fetched from the data SRAM (path 2 for the data cache) while the address for the instruction SRAM is set up (path 1 for the instruction cache). During the next half cycle, the opposite operation is performed. This arrangement allows use of shared pins on the processor, which saves up to 64 I/O lines; however, bus bandwidth requirements are doubled. You must therefore keep signal lines short and loading as low as possible to minimize capacitance.

[Figure 1. R3000/R3000A System with High-Performance Cache. Block diagram not recoverable.]

[Figure 2. Data and Instruction Cache Critical Paths. Schematic showing Path 1 and Path 2 not recoverable.]

Table 1 lists the time constraints for critical paths 1 and 2 for different system speeds. This data indicates that fast SRAMs are essential to keep up with the 33- and 40-MHz processors. Fortunately, BiCMOS processes now fill this need with 8-, 10-, and 12-ns TTL-I/O-compatible SRAMs with reduced internal propagation delays and improved driving capability.

For a 40-MHz system, critical path 1 in Figure 2 includes 3.9 ns for a 74FCT373A latch, which leaves only 8.6 ns for the memory, board trace, and address set-up.
Fortunately, memory access can overlap into the read cycle by 3 ns.

Path 2 for a read cycle includes the time it takes the R3000 to send the IRd signal to the CY7B166, the CY7B166 OE-Low-to-data-valid time (tDOE), and the R3000's data set-up time. The set-up time for the 40-MHz R3000A is 4 ns, and the read signal takes 3 ns to generate. This leaves only 5.5 ns for tDOE and for slewing the output load capacitance.

CY7B166 16K x 4 BiCMOS SRAM

The CY7B166 SRAM is optimized using a BiCMOS process to achieve 12-, 10-, and 8-ns access times. Bipolar and CMOS technology combine to speed up critical paths and boost output drive (see the block diagram in Figure 1 of "A New Generation of BiCMOS TTL SRAMs"). CMOS technology reduces memory array size and keeps power to a minimum, while bipolar technology speeds up critical paths.

BiCMOS technology allows the inputs to be CMOS for compatibility with existing products, while the on-chip bipolar bus interconnects and sense amplifiers speed the internal access timing to allow more time for the outputs to switch. On the outputs, two bipolar transistors drop two Vbe levels (approximately 1.6V) to reduce the High-level output swing. One transistor is tied base-to-collector as a diode and the other transistor is the High-level drive transistor. Both transistors cause the output to conform to standard TTL-type logic levels (not CMOS rail to rail). (See "A New Generation of BiCMOS TTL SRAMs" for a diagram of this output structure.) The diode is the bipolar transistor Q3, and Q2 is the High-level drive transistor. M18 is an output Low-level pull-down MOSFET (n type).

Keeping the output from swinging to the power supply rail saves time when changing states and makes the ramp rate slower (as shown in Figure 3 of "A New Generation of BiCMOS TTL SRAMs").

The CY7B166's input side includes CMOS devices M2 and M4.
Input clamping diodes are also included to provide ESD protection and meet MIL-STD-883C Method 3015 static discharge voltages of 2001V. The inputs meet standard CMOS specifications.

To reduce ground-bounce noise problems associated with full-swing, high-speed CMOS devices (as well as TTL parts to a lesser degree), the CY7B166 incorporates an internal supply-bypass capacitor between the power supply pin and the ground pin. The device also includes an inductor, whose value equals that of the package lead inductance, in parallel with the bypass capacitor to cut the overall inductance associated with output-swing ground bounce in half. Both the capacitor and inductor decrease the magnitude of the bounce on the falling edge of the output logic swing.

Substituting BiCMOS-type TTL devices for CMOS parts in a new or existing TTL-I/O system creates no compatibility problems. Upgrading from a CMOS 64K TTL SRAM to Cypress's BiCMOS family of devices increases speed and noise immunity, while decreasing noise generation for overall system improvement.

Table 1. Delays Through Two Critical Paths

                          25 MHz    33 MHz    40 MHz
Path 1
  tAV    (R3000)          1.5 ns    1 ns      1 ns
  tpd    (373A)           5.5 ns    4.1 ns    3.9 ns
  tAA    (CY7B166)        12 ns     10 ns     8 ns
  Board delays*           2 ns      2 ns      1.5 ns
  Access cycle time       20 ns     15 ns     12.5 ns
Path 2
  tIRd\  (R3000A)         5.0 ns    3.75 ns   3.125 ns
  tDOE   (CY7B166)        6 ns      5 ns      4 ns
  tDS    (R3000/A)        6 ns      4.5 ns    4 ns
  Board delays*           3.0 ns    1.75 ns   1.375 ns
  Read cycle time         20 ns     15 ns     12.5 ns
Clock period              40 ns     30 ns     25 ns

* Board delays are critical as speed increases. The access time needed by the SRAM can overlap the path cycle time by 3 ns to make up for loss in board delays.

[Figure 3. Cache Interleaved Instruction/Data Timing: RISC µP clock (15-ns half cycle at 33 MHz, 12.5-ns half cycle at 40 MHz), address bus (Path 1), and data bus (Path 2). Waveforms not recoverable.]

Memory and Support Logic for Next-Generation ECL Systems

This application note describes the characteristics and use of ECL-I/O technology.
Available for many years, this technology is now breaking into mainstream applications due to innovative process technologies. The high power requirements and low device density that once banished ECL to high-speed niche markets are fading with advanced technology and circuit designs. Table 1 shows how performance and power utilization are evolving.

As system clocks pass 50 MHz, it becomes hard for TTL to provide the necessary low-noise drive capability for fast rise times, and ECL becomes essential. Happily, new BiCMOS SRAMs, gate arrays, and improved bipolar PLDs combine ECL I/O speed with higher density and lower power requirements. A bipolar ECL implementation of an industry-standard PLD such as the 16P4, for example, draws a modest 220 mA (max), while exhibiting propagation delays of 3 ns (333 MHz). These specifications are for Cypress's CY10E302 and CY100E302 10KH- and 100K-compatible devices. Low-power (170 mA) versions with 4-ns propagation delays are also available.

ECL and BiCMOS

BiCMOS combines bipolar ECL I/O with both bipolar and CMOS internal functions. This helps parts such as Cypress's CY10E474/CY100E474 1K x 4 static RAMs draw only 275 mA, while exhibiting access times of 3.5 ns. Low-power (190 mA) versions exhibit 7-ns access times.

This performance is based on new approaches to combining ECL and CMOS in single devices. Historically, BiCMOS technologies were developed as either CMOS speed enhancers or bipolar power misers. The resulting BiCMOS processes were based either on CMOS or bipolar process flows, and performance for the complementary bipolar or MOSFET components was less than optimal.

In contrast, Cypress's STAR M2 process is a third-generation, 0.8µ BiCMOS technology in which the baseline process is BiCMOS. (See Figure 1 in "BiCMOS TTL and ECL SRAMs Improve High-Performance Systems" for a simplified cross section of the STAR M2 BiCMOS process.) The STAR process utilizes a modular architecture. That is, polysilicon loads, TiW fuses, or other non-volatile elements are easily incorporated into the baseline process. This results in high-density SRAMs, high-speed PLDs, or high-density EPROMs/PLDs, respectively.

The STAR M2 process is an 18-mask, double-poly, double-metal technology that utilizes a thin epitaxial layer to achieve excellent production performance for NPNs (Ft greater than 10 GHz) and CMOS latch-up immunity. The MOSFETs use lightly doped drains for high performance and reliability. Unlike first-generation BiCMOS processes, which were limited to SRAMs, STAR's polysilicon bipolar emitter is the same poly used for MOS gates. This enhances NPN performance and decouples the NPN from the poly load module used for 4T SRAM cells. Use of this poly load resistor allows for an 85-square-micron memory cell and small die size. The advantages of the STAR M2 process over second-generation BiCMOS technologies include higher product performance and greater density and manufacturability.

[Table 1. ECL Families: compares Ext. Gate Delay (ps), Flip-Flop (MHz), Gate Power (mW), and Speed x Power (pJ) for the 10KH, 100K, ECLinPS, and Cypress STAR families. Cell values not recoverable.]

Applications for ECL and BiCMOS ECL

Applications for ECL PLDs and SRAMs include graphics and image processing, waveform generation, and direct digital synthesis (DDS). In the case of video, ECL memory stores images. In waveform generation and DDS, ECL memory stores digital representations of analog waveforms before feeding the information to a digital-to-analog converter (DAC).

In both image and waveform applications, PLDs are used for address generation/decoding, data manipulation, and clocking schemes/timing control. These functions previously had to be either built discretely with ECL gates or added onto the DAC or memory on the same die. However, high-speed video DACs (greater than 125 MHz) use bipolar process technology, which does not lend itself to high density due to power dissipation problems. It is easier to implement the functions in ECL PLDs and BiCMOS SRAMs.

For analog-to-digital conversion, ECL PLDs work with high-speed flash A/D converters (ADCs) that have ECL outputs. These converters' clock rates range from 20 MHz up to 1 GHz. Applications include HDTV, phased-array radar, digital oscilloscopes, and single-event digitization. Here, PLDs help create high-speed specialty memories such as self-timed SRAM, pipelined SRAM, and interleaved SRAM.

Using the design shown in Figure 1, you can implement a fast digital oscilloscope to display analog waveforms on a PC. The flash ADC contains a string of comparators that split the signal into a digital "thermometer" code. From there the digital codes are usually decoded into 8 bits, which are latched on the outputs every clock period. The flash A/D converter feeds BiCMOS SRAMs, which can be interleaved for maximum speed. The PLDs are programmed as address decoders and counters to change the ECL SRAM's address location every clock period. Similar to the way a cache memory works, the memory stores the digital information from the ADC at top speeds. After the memory is full, you can load the data at a slower rate to a PC or digital oscilloscope for manipulation and/or measurements in software.

[Figure 1. High-Speed A-to-D Application: ECL PLD address sequencer and ECL PLD distribution of R/W and clock signals. Block diagram not recoverable.]

Instead of using the ECL PLD to implement the SRAMs' address sequencer, it might once have been necessary to incorporate the sequencer as part of the memory chip or use discrete logic. Neither approach was satisfactory, in the one case because of power dissipation and in the other because of the speed limitations imposed by multiple levels of discrete logic.

Further applications for ECL PLDs and SRAMs are found in high-performance workstations, file servers, and high-end embedded controllers. In fact, the next generation of high-end workstations will require ECL support logic. Figure 2 shows an 80-MHz example based on Bipolar Integrated Technology's (BIT's) 10K-ECL version of Sun Microsystems Inc.'s SPARC processor. In this implementation, cache and tag memories use BiCMOS SRAMs, and the cache control, memory management unit (MMU), and cache data path (CDP) are implemented with ECL PLDs.

The BIT system is bipolar and consists of the main integer unit (IU), a floating-point coprocessor interface chip, a multiplier and accumulator floating-point chip set, and a register file chip. The IU can handle off-chip cache of almost any size with complementary sets of 30-Ω cache address drivers to split the cache into two banks. This minimizes trace length, reduces noise, and improves cycle time. The 4K or 64K BiCMOS ECL SRAMs implement the cache memory and reduce system power dissipation.

The IU has a 12.5-ns cycle time and provides a Data Ready clock signal that allows a 15-ns cache access time. This access time makes up for trace propagation delays. The design can use SRAMs with access times from 3 to 12 ns, depending on the required cache size and power requirements; these SRAMs can easily keep up with the IU, as can the 3-ns PLDs.

Designing with ECL

Because ECL PLD propagation delays are as short as 3 ns, and output rise/fall times are in the sub-nanosecond region, you must adhere to strict system layout guidelines. ECL speed and noise performance are enhanced with correct transmission-line design and power-supply bypassing techniques. The underlying objectives are to minimize the
capacitive loading that slows data, prevent ringing and reflections from impedance mismatch, and minimize voltage drops that add system noise and reduce noise margin.

[Figure 2. 80-MHz SPARC Implementation. Block diagram not recoverable.]

ECL-I/O circuits achieve the best possible match to transmission lines for maximum energy transfer. The output stage consists of a low-impedance, open-emitter transistor that can effectively drive different values of transmission-line Zo with the addition of a pull-down termination resistor. The pull-down resistor is also necessary for operation of the output transistor and can serve a dual role as the transmission-line termination. ECL input pins are connected to a transistor's high-impedance (DC) base, which appears as a small capacitive load to a properly terminated transmission line.

It is always a good idea to use transmission lines, but they are essential when line propagation delay to the receiving end and back again is greater than or equal to the signal's rise time. Basic calculations for different etched circuit board (ECB) lines appear in Figure 3, along with an equation for propagation delay through the transmission line. Table 2 lists common values for the dielectric constant.

Stripline is used in multi-layered boards and between ground planes; it consists of a trace buried between ground/power planes. The stripline calculations assume that W/(H - T) is less than 0.35 and that T/H is less than 0.25. Single and composite microstripline is used on the top and/or bottom of single- or double-ground boards; it consists of a trace on the surface, with the ground or power plane buried.
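The rule above (a transmission line is essential once the round-trip delay reaches the signal's rise time) can be turned into a rough maximum trace length. The sketch below assumes the microstrip delay expression that accompanies Figure 3, 1.016 x sqrt(0.475 er + 0.67) ns/ft, the G-10 epoxy dielectric constant of 4.7 from Table 2, and a hypothetical 1-ns rise time. The function names are illustrative, not part of any Cypress tool.

```python
import math

def microstrip_delay_ns_per_ft(er):
    """Propagation delay per foot of microstrip, using the Figure 3 expression."""
    effective_er = 0.475 * er + 0.67
    return 1.016 * math.sqrt(effective_er)

def max_lumped_length_in(rise_time_ns, er):
    """Longest trace (inches) whose round-trip delay stays below the rise time."""
    delay_ns_per_in = microstrip_delay_ns_per_ft(er) / 12.0
    return rise_time_ns / (2.0 * delay_ns_per_in)

if __name__ == "__main__":
    er_g10 = 4.7         # G-10 epoxy, per Table 2
    rise_time_ns = 1.0   # assumed rise time, for illustration only
    print(round(microstrip_delay_ns_per_ft(er_g10), 2), "ns/ft")
    print(round(max_lumped_length_in(rise_time_ns, er_g10), 2), "in")
```

Under these assumptions the delay works out to roughly 1.7 ns/ft, so a 1-ns edge already calls for transmission-line treatment on traces longer than about 3.5 inches, and sub-nanosecond ECL edges shorten that limit proportionally.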
Other common high-speed practices are to use equal line lengths from device to device and rounded corners on traces. Component lead lengths should be short, with surface-mount passive and active components used as much as possible.

[Figure 3. Zo for Microstrip and Stripline: cross sections showing trace width W, trace thickness T, and dielectric height H. Drawings and the stripline equation not recoverable.]

Microstrip characteristic impedance:

$$Z_0 = \frac{87}{\sqrt{\varepsilon_r + 1.41}} \ln\left(\frac{5.98H}{0.8W + T}\right)$$

Propagation delay:

$$\delta = 1.016\sqrt{\varepsilon_e}\ \text{ns/ft}, \qquad \varepsilon_e = 0.475\,\varepsilon_r + 0.67$$

where δ = propagation delay.

Table 2. Common Values for Dielectric Constant

Material             εr
Duroid               2.56
Quartz               3.78
G-10 (FG Epoxy)      4.7
Alumina              9.7
Silicon              11.7

Terminations

Transmission-line terminations must match the line's characteristic impedance to minimize reflections. The termination is usually also used as the pull-down resistor for the open-emitter ECL outputs. These outputs allow Zo values from 50 to 150 Ω. This means that ECL can accommodate 75-Ω video systems as well as 50-Ω communications systems. Some ECL outputs even allow 25-Ω transmission lines to drive doubly terminated 50-Ω bus lines in backplane applications.

Figure 4 shows the types of terminations with calculations. The different options have tradeoffs that include routing, power dissipation, loading, and ease of use.

The parallel termination (Figure 4A) is simple: The terminating resistor at the far end of the transmission line equals the line's Zo. In reality, the line and Rt always exhibit some mismatch caused by the ECL device's input capacitance. This termination offers the fastest performance and lowest power dissipation, but requires an additional power supply for the termination resistor (Rt). An advantage of parallel terminations over series terminations (Figure 4C) is that you can use the former with ECL loads distributed along the length of the transmission line. This is because the parallel termination is installed at the transmission line's receiving end and absorbs almost all reflections.

The Thevenin equivalent (Figure 4B) of the parallel termination (called the Thevenin termination) requires two resistors but needs no separate supply because the termination relies on the system power bus. Although this feature is convenient for small systems, the Thevenin termination draws 11 times more power per termination than does the parallel termination.

The series termination is potentially the most power-efficient. It matches Zo by means of a resistor (Rt) in series with the driving ECL gate's output impedance (which is 10 Ω in STAR devices). Instead of totally preventing any reflections at the far end of the line, the series termination allows pulses to be reflected by the high impedance there, absorbing them when they are reflected back to the near end.

The series termination's efficiency depends on the value of RE. The power dissipated by a small RE can exceed the power dissipated in the parallel termination. A large RE can slow negative-going transitions because the input capacitance of the following gates (typically 4 pF) is being charged through the resistor. A large RE can also reduce noise margins. Note that, in this case, RE does not have to equal the transmission-line impedance. Table 3 shows a tabulation of ECL output transistor power and RE power dissipation.

Because the series termination is installed at the near end of the transmission line, only lumped loads can be used. Distributed loads cause problems because the full value of the pulses is seen only at the far end of the line and not along the length of the trace, as with the parallel and Thevenin terminations. Typically, you can have up to 10 lumped loads at the end of the line. Thus, you must choose RE to supply enough current to drive the loads. However, you must also consider the voltage drop in the series terminating resistor. One way to minimize dissipation is to make the series termination drive two or more lines with lumped loads in parallel (as in Figure 4c).

[Figure 4. Three Types of Transmission Line Terminations: (a) parallel; (b) Thevenin, with R2 = 2.6 Zo (10KH) or 2.26 Zo (100K) and R1 = R2/1.6 (10KH), from a -5.2 or -4.5 V supply; (c) series, with Rt = Zo - 10 Ω and n = number of lines. Remaining schematic detail, including the 100K ratio for R1, not recoverable.]

Table 3. ECL Output Transistor Power and Terminating/Pull-Down Resistor Power

                                 Dissipation in
Terminating Resistor     ECL Output          Terminating
Value (Ω)                Transistor (mW)     Resistor (mW)

Parallel Termination
150                      5.0                 4.3
100                      7.5                 6.5
75                       10                  8.7
50                       15                  13

Thevenin Termination
82/130                   15                  140

Series Termination
2K                       2.5                 7.7
1K                       4.9                 15.4
680                      7.2                 22.6
510                      9.7                 30.2

Measurements

After prototyping transmission lines and terminations, you can make waveform measurements on a sample board to uncover any mismatches. Simple time domain reflectometry (TDR, Figure 5) can show the position of discontinuities or mismatches along the line and the type of reactance or termination needed to correct them. Discontinuities, such as gate input capacitances distributed along the line, appear as small glitches on the output waveform. The reflection's amplitude is proportional to the capacitance. You can therefore calibrate the test setup using a series of standard capacitances. Also, test equipment with TDR capability, which simplifies measurements, is available from HP.

Interfacing and Prototyping

With the increased use of ECL in new and next-generation systems, many connector and cable companies, such as W. L. Gore & Associates, are offering controlled-impedance coax ribbon cable and wrappable coax cable for prototypes and final design. Although most ECL system prototyping is done on PC boards, alternatives exist. ECL and mixed-TTL/ECL wire-wrapping boards with extensive ground planes are available from MUPAC Corporation.
You can use wrappable coax on these boards between signal pins, with additional connections to adjacent ground pins.

Programming ECL PLDs

Cypress's current ECL PLDs are bipolar devices with proven TiW fuses. This means that, unlike the company's erasable CMOS PLDs, the ECL PLDs are one-time fuse-programmable. You can program the devices using Data I/O, Stag, and Logical Devices PLD programmers; you can also use Cypress's QuickPro II. Development software, including simulation models, is available from Data I/O (ABEL) and Logical Devices (CUPL).

[Figure 5. Time Domain Reflectometry Setup: pulse generator (V = 800 mV, Tr = 2 ns, T = 50 ns), power divider, 200-MHz (or faster) scope, and termination; Vreflect is observed on the scope. Diagram not recoverable.]

SRAMs

Section Contents

RAM I/O Characteristics
Understanding Dual-Port RAMs
Using Dual-Port RAMs Without Arbitration
Using Cypress SRAMs to Implement 386 Cache

RAM I/O Characteristics

This application note describes the function and I/O standards of Cypress high-speed static RAMs. Manufactured using a speed-optimized CMOS technology, these RAMs meet and exceed the performance of competitive bipolar devices, while consuming significantly less power and providing superior reliability. While providing identical function, the RAMs exhibit slightly different input and output characteristics, which permit you to improve overall system performance. For more detailed information on these products, refer to the Cypress Data Book.
Generic I/O Characteristics Input and output characteristics fall generally into two categories: when the area of operation falls within the normal limits of Vee and Vss plus or minus approximately 600 mY, and abnormal circumstances, when these limits are exceeded. Under normal operating conditions, inputs switch between logic Zero and logic One. This application note considers operation in a positive-True environment, and therefore a One is more positive than a Zero. The RAMs provide TTL-compatible I/O. Therefore a One is 2.0V, while a Zero is 0.8V. To be considered a One, the input of a device must be driven greater than 2.0V, but not exceeding Vee + 0.6V. To be considered a Zero, the input must be driven to less than 0.8V, but not less than Vss - 0.6V. Output characteristics represent a signal that drives the input of the next device in the system. Because the RAM levels are TTL compatible, you can assume that the VIL and VIR values of 0.8V and 1.0V referenced above are valid. In consideration of noise margin, however, driving the input of the next stage to the required VIL or VIR is not sufficient. Noise margins of 200 to 400 mV are considered more than adequate. Thus, an adequate VOH is 2AV and VOL is OAV, providing a noise margin of 400 mY. Because the driven node consists of both a resistive and a capacitive component, output characteristics are specified such that the output driver is capable of sinking IOL at the specified VOL, and capable of sourcing IOH at VOH. Because the values of IOL and IOH differ depending on the device, these values are shown in Table 1. Outputs have one other characteristic to be aware of: output short-circuit current (Ios). This is the maximum current that the output can source when driving a One into Product Description The five parts represented in Figure 1 constitute three basic devices of 64, 1024, and 4096 bits. The CY7C189 and CY7C190 feature inverting and non-inverting outputs, respectively, in a 16 x 4-bit organization. 
Four address lines address the 16 words, which are accessed via separate input and output lines. Both of these 64-bit devices have separate active-Low select and write-enable signals. The 256 x 4 CY7C122 is packaged in a 22-pin DIP, and features separate input and output lines, both activeLow and active-High select lines, eight address lines, an active-Low output enable, and an active-Low write enable. Both the CY7C148 and CY7C149 are organized as 1024 x 4 bits and feature common pins for data input and output. Both parts have 10 address lines, a single activeLow chip select, and an active-Low write enable. The CY7C148 features automatic power-down whenever the device is not selected, while the CY7C149 has a highspeed, 15-ns chip select for applications that do not require power control. This family of high-speed static RAMs is available with access times of 15 to 45 ns with power in the 300- to 500-mW range. These RAMs are designed from a common core approach and share the same memory cell, input structures, and many other characteristics. The outputs are similar, with the exception of output drive and the common I/O optimization for the CY7C148 and CY7C149. 4-1 DO DO 0, D, 02 02 03 D3 AO Ao 00 00 A, A, 0, 0, 02 Ai 02 A2 0] 0] A3 A3 ES B Wl WE 7C190 7C189 AO 110 0 Ae A, AS A2 I/O, A7 A3 AS A. I/Oi AS A. 1/0 3 A~ AG CS A7 WE 7C148/9 7C122 Figure 1. RAM Block Diagrams vSS. You need to be aware of los for two reasons. First, Technology Dependencies and Benefits the output should be capable of supplying this current for some reasonable period of time without damage. Second, this is the current that charges the capacitive load when switching the output from a Zero to a One and will control the output rise time. Because memories such as these are often tied together, you also need to consider the output characteristics when the devices are deselected. 
All of the RAMs in the family feature three-state outputs; when deselected, the outputs are in a high-impedance condition that does not source or sink any current. In this condition, as long as the input is driven in its normal operating mode, a three-state output appears as an open, with less than 10 µA of leakage. Thus, to any other device driving this node, the output does not exist.

Some of the products described in this application note were originally produced in a bipolar technology. They have since been re-engineered in NMOS technology, and Cypress has now produced them in a speed-optimized CMOS technology. Both technology dependencies and benefits associated with each technology relate to the design of input and output structures. When you use these products, you should know about these characteristics and how they can benefit or impede a design effort.

One of the most obvious factors is that both NMOS and CMOS device inputs are high impedance, with less than 10 µA of input leakage. Bipolar devices, however, require that the driver of an input sink current when driving to VIL, but appear as high impedance at VIH levels. This is because the input of a bipolar device is the emitter of a bipolar NPN-type device with its base biased positive. The bias (1.5V) establishes the point at which the input changes from requiring current to be sourced to presenting a high impedance. This switching level is the reason that AC measurements are done at the 1.5V level. Although NMOS and CMOS device inputs do not change from low to high impedance, great care is taken to balance their switching threshold at 1.5V. This allows you to consider only capacitive loading for MOS device fanout, while bipolar has both a capacitive and a DC component.

The other input characteristic that differs between bipolar and MOS is the clamp diode structure, which exists in both MOS and bipolar. However, in MOS devices that use bias generator techniques (all high-speed MOS devices), the diode does not become forward biased until the input goes more negative than the substrate bias generator plus one diode drop. Because the bias generator is usually at about -3V, this factor removes the clamping effect.

Table 1. DC Parameters

Parameter  Description              Test Conditions              CY7C122       CY7C148/9     CY7C189/190   Units
                                                                 Min.   Max.   Min.   Max.   Min.   Max.
VOH        Output High Voltage      VCC = Min., IOH = -5.2 mA    2.4           2.4           2.4           V
VOL        Output Low Voltage       VCC = Min., IOL = 8.0 mA            0.4           0.4           0.4    V
VIH        Input High Voltage                                    2.1    VCC    2.0    VCC    2.0    VCC    V
VIL        Input Low Voltage                                     -3.0   0.8    -3.0   0.8    -3.0   0.8    V
IIL        Input Low Current        VCC = Max., VIN = VSS               10            10            10     µA
IIH        Input High Current       VCC = Max., VIN = VCC               10            10            10     µA
IOFF       Output Current (High Z)  VOL < VOUT < VOH, TA = Max.  -10    +10    -10    +10    -10    +10    µA

IOS, Output Short-Circuit Current (VCC = Max., VOUT = VSS): -70, -90, -275 mA for 0°C < TA < 70°C; -80, -90, -350 mA for -55°C < TA < 125°C.

CMOS/NMOS/Bipolar Input Characteristics

Although NMOS, CMOS, and bipolar technologies differ widely, the I/O characteristics are the TTL derivatives that have been covered above and are documented in Table 1. With the exception of the differences in input impedance between MOS and bipolar devices, all three technologies are used to produce TTL-compatible products.

Another group of devices provide a true CMOS interface, where signals swing from VSS + 1.5V. In addition, loads are primarily capacitive. Only devices produced in a CMOS technology are capable of behaving in this manner. CMOS devices can, however, handle both TTL and CMOS inputs. Devices such as the ones described in this application note have input characteristics such as those depicted in Figure 2. While operated in the TTL range, these devices perform normally. Operated in full CMOS mode, the devices save power because the current consumed in the input converter decreases as the input voltage rises above 3.0V or falls below 1.5V. Because the input signal is in the 1.5V-to-3.0V range only when transitioning between logic states, the power savings in a large array with true CMOS inputs can be significant. With input signals on over half of the pins of a device, significant savings in a large system can be realized by using CMOS input voltage swings even in TTL systems.

Figure 2. Input Voltage vs Current

Although this application note does not directly deal with the AC characteristics of high-speed RAMs, the input and output characteristics of these devices have a great deal to do with the actual AC specifications. Conventionally, all AC measurements associated with high-speed devices are done at 1.5V and assume a maximum rise and fall time. This eliminates the variations associated with the various usage configurations (as a figure of merit when testing the device), but does not mean that you can ignore these influences when designing a system. Maximum rise and fall time is usually included on every data sheet. For the products referred to in this application note, a 10-ns maximum rise and fall time is specified for all devices with access times equal to or greater than 25 ns. All devices with access times less than 25 ns have a 5-ns maximum rise and fall time.

Figure 3. Test Loads (AC load: R1 = 470Ω, R2 = 224Ω, 30 pF; high-impedance load: R1 = 470Ω, R2 = 224Ω, 5 pF; Thevenin equivalent: 152Ω, 1.62V)

The AC load and its Thevenin equivalent in Figure 3 represent the resistive and capacitive load components that the devices are specified to drive. With either of these loads, the device must source or sink its rated output current at its specified output voltage.
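As a quick check of the Figure 3 values, the Thevenin equivalent of the resistive divider can be computed directly. The sketch below assumes that R1 (470Ω) ties the output to the 5V supply and R2 (224Ω) ties it to ground, which is the arrangement that reproduces the 152Ω and roughly 1.62V shown in the figure. The function name `thevenin` and the first-order RC time constants are illustrative additions, not part of the original note.

```python
def thevenin(v_supply, r_to_supply, r_to_ground):
    """Return (open-circuit voltage, source resistance) of a resistive divider."""
    total = r_to_supply + r_to_ground
    v_th = v_supply * r_to_ground / total          # divider voltage with no load
    r_th = r_to_supply * r_to_ground / total       # the two resistors in parallel
    return v_th, r_th


# Assumed topology: R1 = 470 ohms to 5 V, R2 = 224 ohms to ground.
v_th, r_th = thevenin(5.0, 470.0, 224.0)

# First-order RC time constants for the two load capacitances in Figure 3.
tau_ac_ns = r_th * 30e-12 * 1e9     # AC load, 30 pF
tau_hiz_ns = r_th * 5e-12 * 1e9     # high-impedance load, 5 pF

print(f"V_th = {v_th:.2f} V, R_th = {r_th:.0f} ohms")
print(f"tau(30 pF) = {tau_ac_ns:.2f} ns, tau(5 pF) = {tau_hiz_ns:.2f} ns")
```

The computed values agree with the figure to within rounding, and the 5-pF load has a time constant one sixth that of the 30-pF load.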
The capacitance stresses the ability of the device output to source or sink sufficient current to slew the outputs at a high enough rate to meet the AC specifications. The high-impedance load is a convenience to testing when trying to determine how rapidly the output enters a high-impedance mode. The resistive divider charges the capacitance until equilibrium is reached. Allowing for noise margin, testing for a 500 mV change is normal. By using a smaller capacitance than normal, you can make the change occur more quickly, allowing a more accurate determination of entry into the high-impedance state.

Switching-Threshold Variations

Along with input rise and fall times, switching-threshold variations can affect the performance of any device. Input rise and fall times are under your control and are primarily affected by capacitive loading or the driver and bus termination techniques. Switching threshold is affected by process variations, changes in VCC, and temperature. Compensation of these variables is the responsibility of the manufacturer, both at the design stage and during the manufacture of the device. Combined threshold shifts over full military temperature ranges and process variations average less than 100 mV. This translates directly to VIL and VIH variations that track well within the noise margins of normal system design, particularly because the VOL and VOH changes track to the same 100 mV.

Electrostatic Discharge

Because of extremely high input impedance and relatively low breakdown voltage (approximately 30V), MOS devices have always suffered from destruction caused by ESD (electrostatic discharge). This problem has had two effects. First, major efforts to design input protection circuits without impeding performance have resulted in MOS devices that are now superior to bipolar devices. Second, care in handling semiconductors is now common practice. Interestingly enough, bipolar products that once did not suffer from ESD have now become sensitive to the phenomenon, primarily because new processing technology involving shallow junctions is in itself sensitive. MOS devices are in many cases now superior to bipolar products. A sampling of competitive bipolar and NMOS 64-bit, 1-Kbit, and 4-Kbit products reveals breakdown voltages as low as ±150V and greater than ±2001V.

The circuit in Figure 4 protects Cypress products against ESD. The circuit consists of two thick-oxide field effect transistors wrapped around an input resistor and a thin-oxide device with a relatively low breakdown voltage (approximately 12V). Large input voltages cause the field transistors to turn on, discharging the ESD current harmlessly to ground. The thin-oxide transistor breaks down when the voltage across it exceeds 12V; this transistor is protected from destruction by the current limiting of Rp. The combination of these two structures provides ESD protection greater than 2250V, the limit of available testing equipment. Repeated applications of this stress do not cause a degradation that could lead to eventual device failure, as observed in functionally equivalent devices.

Figure 4. Input Protection Circuit

CMOS Latch-Up

The parasitic bipolar transistors shown in Figure 5 result in a built-in silicon-controlled rectifier (Figure 6). Under normal circumstances the substrate resistor RSUB is connected to ground. Therefore, whenever the signal on the pin goes below ground by one diode drop, current flows from ground through RSUB, forward biasing the lower transistor in the effective SCR. If this current is sufficient to turn on the transistor, the upper PNP transistor is forward biased, which turns on the SCR and normally destroys the device.

Two possible solutions are to decrease the substrate resistance or add a substrate bias generator (Figure 7). The bias generator technique has several additional benefits, such as threshold voltage control, which increases device performance. The bias generator is thus employed in all Cypress products. Also used are guard rings, which effectively isolate input and output structures from the core of the device and thus decrease the substrate resistance by short-circuiting the current paths.

Figure 5. CMOS Cross Section and Parasitic Circuits

Latch-up can be induced at either the inputs or outputs. In true CMOS output structures such as the ones previously discussed, the output driver has a PMOS pull-up resistor that creates additional vertical bipolar PNP transistors, which compound the latch-up problem. Additional isolation using the guard ring technique can solve this problem at the expense of additional silicon area. Because all the devices of concern here require TTL outputs, the problem is totally eliminated through the use of an NMOS pull-up resistor.

Inducing Latch-Up for Testing Purposes

Exercise care in testing for latch-up because it is typically a destructive phenomenon. The normal method is to power the device under test with a current-limited supply, so that when latch-up is induced, insufficient current exists to destroy the device. Once this setup exists, driving the inputs or outputs with a current and measuring the point at which the power supply collapses allows nondestructive measurement of latch-up characteristics.
In actual testing, with the device under power, individual inputs and outputs are driven positive and negative with a voltage. The current is measured at which the device latches up. This provides the DC latch-up data for each pin on the device as a function of trigger current. Measuring the latch-up characteristics of devices should encompass ranges of reasonable positive and negative currents for trigger sources. Depending on the device, latch-up can occur at sink or source currents as low as a few milliamperes to as high as several hundred milliamperes. Devices that latch at trigger currents of less than 20 to 30 mA are in danger of encountering system conditions that cause latch-up failure.

Figure 6. Parasitic SCR and Bias Generator

Figure 7. Bias Generator Characteristics

Competitive Devices

Although few devices compete directly with the Cypress devices covered in this application note, the latch-up characteristics of the closest functionally similar devices were measured. The results show devices that latch up at trigger currents as low as 10 mA all the way to devices that can sustain greater than 100 mA without latch-up. The Cypress devices covered in this application note can sustain greater than 200 mA without incurring latch-up, which is far more than it is possible to encounter in any reasonable system environment.

Eliminating Latch-Up in Cypress RAMs

The latch-up characteristic inherently exists in any CMOS device. Thus, rather than change the laws of physics, semiconductor manufacturers design to minimize latch-up effects over the operating environment that the device must endure. The environmental variables include temperature, power supply, and signal levels, as well as process variations.

Several techniques are employed to eliminate the latch-up phenomenon. One approach is to move the trigger threshold outside the operating range so that the voltage level never approaches this threshold. This can be done using low-impedance, epitaxial substrates and/or a substrate bias generator. The use of a low-impedance substrate increases the undershoot voltage required to generate the trigger current that causes latch-up. A substrate bias generator has two effects that help to eliminate latch-up. First, by biasing the substrate at a negative (-3.0V) voltage, the parasitic devices cannot be forward biased unless the undershoot exceeds -3.0V by at least one diode drop. Second, if undershoot is this severe, the impedance of the bias generator itself is sufficient to deter enough trigger current from being generated.

The bias generator has one additional noticeable characteristic: It effectively removes the input clamp diode. This is due to the anode of the diode connecting to the substrate that is at -3.0V. Therefore, even though the diode exists, as shown in Figure 4, DC signals of -3.0V do not forward-bias the diode and exhibit the clamp condition. The benefits of the bias generator are apparent in higher noise tolerance, as substrate currents due to input undershoot do not occur.

Figures 8 and 9 represent the voltage and current characteristics of the devices discussed in this application note. Figure 8 is characteristic of an input pin, and Figure 9 an output pin in a high-impedance state. In Figure 8, the input covers +12V to -6V, well outside the +7V to -3V specification. Figure 4 helps explain these characteristics. When the input voltage goes negative, the thin-oxide transistor acts as a forward-biased diode, and the slope of the curve is set by the value of Rp. As the input voltage goes positive, only leakage current flows. The output characteristics in Figure 9 show the same phenomenon, except that, because this is not an input, no protection circuit exists and therefore no Rp exists. An equivalent thin-film device acts as a clamp diode that limits the output voltage to approximately -1V at -5 mA.

Figure 8. Input V/I Characteristics (VCC = 5.0V)

Figure 9. Output V/I Characteristics (VCC = 5.0V). Note: Output is in a High Impedance Condition.

Understanding Dual-Port RAMs

This application note examines the evolution of multi-port memories and explains the operation and benefits of Cypress's dual-port RAMs. A dual-port RAM is a random-access memory that can be accessed simultaneously by two independent entities. In digital ICs, this implies a dual-port memory cell that can be accessed at the same time using two independent sets of address, data, and control lines.

A Brief History of Multi-Port Memories

The first multi-port memories were probably used in the CPU of the first computers. Many two-operand instructions are efficiently implemented using dual-port registers for the operands and the result. For example, consider Equation 1, which describes a typical two-operand operation in the ALU (arithmetic logic unit) of a CPU:

(C) = (A) [OPERATOR] (B)    Eq. 1

A and B could be either the operands (i.e., the data) or the addresses of the operands, in which case the data could be either in memory or in registers. In any case, Equation 1 describes two pieces of data, A and B, being operated upon by the OPERATOR and the results designated as C. C could also be the data, a register, or a memory location. OPERATOR could be arithmetic or logical.

The Combinatorial ALU

The 74181 was the first integrated circuit ALU. In this IC, the 4-bit operands, A and B, are operated upon according to a 4-bit command; the result, C, is output. The chip also provides a carry-in input, a carry-out output, and A = B outputs. A mode-control pin selects either logical or arithmetic operations. The 74181 is combinatorial; no storage is provided. Early computers used the contents of a memory location as one operand and an accumulator in the CPU as the second operand. The results were usually stored in the accumulator.

Bringing the Registers On Chip

The 67901 was the first 4-bit slice that brought 16 4-bit registers onto the chip. The MMI 67901 was second-sourced by AMD and became the 2901. At one point, five vendors offered this industry-standard bipolar ALU. The Cypress CMOS CY7C901 is the highest-performance, TTL-compatible, 4-bit slice that is form, fit, and functionally equivalent to the original 901.

The 16-word deep, 4-bit wide register array is functionally equivalent to a 16 x 4 dual-port memory. Four A address lines and four B address lines select the contents of two of the 16 registers, whose outputs are applied to transparent latches. The latch outputs are then applied to 3:1 multiplexers, whose outputs drive the ALU inputs. The ALU outputs can be sent off chip, entered into a temporary register (Q), or written back into the register file, thus replacing one of the operands. This architecture is shown in the CY7C901 block diagram in the Cypress data book.

Figure 1. CY7C901 Dual-Port Memory (Simplified)

CY7C901 Dual-Port Memory Operation

A simplified CY7C901 block diagram appears in Figure 1. The device's A and B addresses select the contents of two registers, whose outputs are applied to two 4-bit latches. When the clock (CP) is High, the latch outputs follow their data inputs (i.e., are transparent). When the clock is Low, the ALU outputs are written (WE) into the register array at the location specified by the A or B addresses, depending upon the instruction being executed. A Low on the clock causes the data in the latches not to change, so that the ALU outputs are stable when they are written back into the register array.
Note that the CY7C901 does not perform the three-port function described by Equation 1. In the CY7C901, the C operand equals either the A or B operand, depending upon the instruction being executed. In fact, the A and B addresses can be the same. An old programming trick is to Exclusive-OR the contents of a register with itself, which clears the register. Additionally, the CY7C901's dual-port memory does not use a dual-port memory cell. This type of cell is not required because the CY7C901 does not need the ability to simultaneously write independently to two separate memory locations.

If the system contains only one processor, the data buffers are not shared, and the system needs neither a virtual nor a physical dual-port RAM. Control information associated with each data buffer tells the communications controller the number of words in the buffer and the starting address of the data in the buffer. The control information resides in one or more memory locations whose addresses have been previously agreed upon by the two processors.

This simple software-based buffer example requires a second level of control: a mechanism or procedure that prevents the two microprocessors from getting in each other's way. In other words, the system needs a procedure control mechanism. Another way of analyzing this requirement introduces the concept of data ownership. Say, for example, that processor A assembles and stores messages and thus owns the data while performing these tasks. Likewise, the communications processor B owns the data while performing its tasks. The procedure control mechanism amounts to a technique for transferring data ownership between processor A and B.

In large systems, where many processors perform many different operations, the processing of the information is called a job or a procedure. The procedure is divided into many tasks, which can be performed by different processors. The tasks can either be scheduled and assigned by a processor dedicated to that task or be performed by any available processor. These alternatives are referred to as autocratic and egalitarian systems, respectively. The term egalitarian implies that the processors are treated equally. In either case, the processors must have access to a shared memory location used for message passing. Synchronizing sequential processes is the cornerstone of concurrent programming, which applies to multi-tasking, single-processor systems; distributed-processor networks; and tightly-coupled multiprocessor systems.

Dual-Port Memory Using Single-Port RAM

Before the dual-port memory cell existed, designers created dual-port RAMs from single-port RAMs by adding a multiplexer between the RAM and the two entities that shared the RAM. Figure 2 illustrates a block diagram of such an arrangement. Two processors, MP1 and MP2, share the RAM. If each processor has access to the RAM half the time, the resource is shared equally and is said to be allocated according to a fairness doctrine. This time division multiplexing assures that there is no contention for the RAM.

Figure 2. Dual-Port Memory Using Single-Port RAM

However, performance suffers if the RAM's access time does not equal 1/2 or less of the processors' clock period, assuming that the processors are clocked from the same source. For example, consider two processors clocked from the same 25-MHz source, for a period of 40 ns. Because the processors are closely coupled, only one operating system is in memory. In this case, the maximum access time of the dual port has to be 20 ns or less. The highest-speed dual-port RAM available has a 25-ns access time. Therefore, each processor suffers a worst-case 20% performance degradation.
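The arithmetic behind the 20% figure can be sketched as follows. The sketch assumes that each of the two processors receives an equal time slot and that, when the RAM is slower than the slot, the whole cycle stretches until every sharer has had one full access. The function name `tdm_degradation` and its return layout are hypothetical, introduced only for illustration.

```python
def tdm_degradation(clock_hz, ram_access_ns, sharers=2):
    """Return (clock period, time slot per sharer, fractional slowdown)."""
    period_ns = 1e9 / clock_hz
    slot_ns = period_ns / sharers            # time each sharer is allotted
    if ram_access_ns <= slot_ns:
        return period_ns, slot_ns, 0.0       # RAM is fast enough: no penalty
    stretched_ns = ram_access_ns * sharers   # cycle long enough for all accesses
    degradation = 1.0 - period_ns / stretched_ns
    return period_ns, slot_ns, degradation


period, slot, loss = tdm_degradation(25e6, 25.0)
print(f"period = {period:.0f} ns, slot = {slot:.0f} ns, loss = {loss:.0%}")
```

With a 25-MHz clock and a 25-ns RAM, the 40-ns cycle stretches to 50 ns, so each processor runs at 40/50 of its nominal rate.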
Dual-Port RAM Applications

The first applications for dual-port memories were for CPU register files. Dual-port RAMs can also serve as data or instruction cache memories. However, the largest usage of dual-port RAMs is in communications, which includes the exchange of data between processors, processes, and systems.

Virtual Dual-Port RAM

Communication between systems does not require physical dual-port RAMs. Instead, a conventional RAM memory is partitioned into virtual data-storage areas (buffers), usually to store at least two data packets. These buffers are shared between the communications controller and the intelligent element that assembles the packets and stores them (usually a microprocessor). The communications controller can also be a microprocessor. It reads the data from memory, converts the data from parallel to serial form, encodes the data, converts the data to analog form, and sends the data out over the communications channel on the transmit side.

Message Passing

In the two-processor system under consideration, synchronization can be achieved by using a lockword or lockvariable. The lockvariable can apply either to data (as in this example) or to executable instructions. The lockvariable is a location in shared memory that is operated upon using two synchronization primitives: LOCK (v) and UNLOCK (v), where (v) is the location operated upon. These are simple binary switch operations.

If a processor wishes to lock or own a critical section of code or data, the processor indivisibly sets the lockvariable if testing shows the lockvariable to be zero. If the lockvariable is not zero, then the operation is repeated until the lockvariable is zero. To unlock the critical section, a processor sets the lockvariable to zero and continues.

Most modern processors have indivisible read/modify/write instructions, also called test and set (TAS) instructions. In Reference 1, however, E. W. Dijkstra shows that lockvariables can be implemented without using a read/modify/write instruction. And in Reference 2, he develops the semaphore, a technique for managing a queue of tasks waiting for a resource. Lockvariables surround or bracket semaphores and thus provide entry and exit control on a mutual exclusion basis.

Note that this procedure does not require the use of a dual-port RAM. The procedure does require each processor to perform a TAS instruction, clear the lockvariable, and send a message to the other processor. Sending a message implies writing to a location in shared memory. To know that a message is waiting, the processor receiving the message must either read the memory location periodically (referred to as polling a mailbox) or the act of writing to the mailbox must generate an interrupt to the receiving processor. The interrupt-driven alternative is usually preferred because the receiving processor does not have to waste time in a polling sequence.

Dual-Port RAM Cell History

The first dual-port RAM ICs to use a dual-port RAM cell were the Synertek SY2130 and SY2131, introduced in 1983. These products are organized as 1024 words of 8 bits and use n-channel, double-poly silicon technology to achieve 100-ns access times. The SY2130 has an automatic power down feature controlled by the chip enables, and the SY2131 does not. The smaller (512 x 8) SY2132 and SY2133 were similar but unsuccessful.

The original dual-port RAMs include two mailboxes for message passing. When written to from one port, a mailbox generates an interrupt to the opposite port. Additionally, on-chip arbitration logic generates a busy signal to the loser when both left and right ports address the same memory location. If the loser was attempting to write, the write is suppressed.

Most of the dual-port RAMs on the market today are functionally equivalent to the original Synertek products. The "new features" added to several dual-port RAM products by Motorola and Integrated Device Technology (IDT) include dedicated semaphore registers. These semaphores are unnecessary, however, and the products that use them do not have second sources.

The SY2130 was second-sourced by IDT in 1984 and Advanced Micro Devices (AMD) in 1985. IDT also doubled the density to 2K x 8 and called the new part the IDT7132. Due to pin limitations (48 pins), the interrupt functions were deleted.

The AMD part (Am2130, 1024 x 8) had at least three logic errors. A busy-going-active indication failed to reset the interrupt when both ports addressed the same mailbox location. Additionally, busy going inactive failed to retrigger the address transition detection circuitry at all locations. And finally, when contention occurred and both ports were attempting to write, the losing port was not prevented from writing. The data sheet for this device does not explicitly state these conditions, but they must occur for the device to make logical sense (more on this later).

In 1985 IDT added slave companion parts to the company's dual-port family. The IDT7140 (1024 x 8) is the slave to the IDT7130, and the IDT7142 (2K x 8) is the slave to the IDT7132.

Typical TAS Instruction

The current example assumes that the processors have a TAS instruction. A typical TAS instruction operates as follows: Read, test, and set to X. The addressed memory location is read, and if its contents are zero, the value X is written into that location. If the contents are not zero, the contents are returned to the processor, and the value in the memory location is not disturbed.

The usual convention is that a value of zero in the lockvariable means that the resource associated with it is available. A non-zero value means that another processor temporarily owns the resource and that the resource is not available. After performing the task associated with the lockvariable, the processor sets the lockvariable's value to zero. The system is initialized with all lockvariables set to zero.
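The TAS behavior and the LOCK and UNLOCK primitives described above can be modeled in a few lines. This is only a software sketch: the class `SharedMemory`, the helpers `lock`, `unlock`, and `demo`, and the use of a Python `threading.Lock` to stand in for the indivisible read/modify/write bus cycle are all assumptions made for illustration, not a description of any particular processor's instruction.

```python
import threading
import time


class SharedMemory:
    """Toy shared memory whose test_and_set() is indivisible."""

    def __init__(self, size):
        self._cells = [0] * size
        self._cycle = threading.Lock()   # stands in for the indivisible bus cycle

    def test_and_set(self, addr, x):
        """Read the cell; if it is zero, write x. Always return the old contents."""
        with self._cycle:
            old = self._cells[addr]
            if old == 0:
                self._cells[addr] = x
            return old

    def clear(self, addr):
        with self._cycle:
            self._cells[addr] = 0

    def read(self, addr):
        with self._cycle:
            return self._cells[addr]


def lock(mem, addr, owner_id):
    # LOCK(v): repeat the TAS until it reports that the lockvariable was zero.
    while mem.test_and_set(addr, owner_id) != 0:
        time.sleep(0)                    # yield so the current owner can finish


def unlock(mem, addr):
    # UNLOCK(v): set the lockvariable back to zero.
    mem.clear(addr)


def demo(iterations=200):
    """Two 'processors' (threads) update a counter inside the critical section."""
    mem = SharedMemory(4)
    lockvar = 0                          # address of the lockvariable
    counter = {"value": 0}

    def worker(owner_id):
        for _ in range(iterations):
            lock(mem, lockvar, owner_id)
            counter["value"] += 1        # critical section
            unlock(mem, lockvar)

    threads = [threading.Thread(target=worker, args=(i,)) for i in (1, 2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"], mem.read(lockvar)
```

Calling `demo()` runs both workers to completion; because every increment happens while the lockvariable is owned, no update is lost, and the lockvariable is back at zero when both have finished.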
In the current example, processor A performs a TAS operation on the lockvariable and, finding the lockvariable zero, sets the lockvariable to a one. This tells processor B that the message is in the process of being assembled in the memory buffer area and is not ready to be transmitted. Processor A then assembles the message. After the message is assembled, processor A clears the lockvariable, sends a message to processor B saying that the message is ready to be transmitted, and gives the data's location and the number of bytes to be sent.

Processor B reads the message from processor A and performs a TAS operation on the lockvariable; finding the lockvariable zero, processor B sets it to a two. This tells processor A that the message is in the process of being transmitted. Processor B then transmits the message and clears the lockvariable. Processor B sends processor A a message that the transmission task has been completed. After receiving the message from processor B, processor A performs a TAS operation on the lockvariable; finding the lockvariable zero, processor A concludes that the message has been successfully transmitted.

The slave device provides word-width expansion. Busy is an input to the slave from the master, and the slave contains no arbitration logic. One master can drive many slaves. This arrangement avoids the classic deadly embrace problem described in the next section.

The Deadly Embrace

The deadly embrace can occur when two masters are connected in parallel to make a wider word. If the left and right port addresses match, and the left and right port chip enables then become active to both chips at approximately the same time, it is possible to have one port of one master lose and the opposite port of the other master also lose. In other words, if an address match occurs and both ports are enabled during a small time window, or aperture of uncertainty, the dual-port RAM cannot determine which port wins or loses.

Under these conditions, if the corresponding left and right port busy pins are connected together, both ports of both masters are active (Low). This condition occurs because the busy outputs are open drain, and the loser pulls the node Low. This condition is the simplest example of the deadly embrace. So far as the external world is concerned, both ports are busy, and the system remains locked up indefinitely, with each port waiting to be released by the other. Each master's arbiter section thinks it has lost the arbitration and is waiting to be released by the other.

In general, the deadly embrace occurs under two conditions: a processor requires one or more resources to perform a task, and one or more of the required resources is temporarily owned by another processor, which requires one more of the same resources to perform its task. For example, if processor A owns resource X and processor B owns resource Y, and both resources are required to accomplish the task, a stalemate occurs in which each processor waits for the other to relinquish the required resource. This is the simplest example. The concept extends to n processors and m resources.

The solution to the deadly embrace depends upon whether the system is autocratic or egalitarian, the tasks' priorities, etc., and is beyond the scope of this discussion. In the case of dual-port RAMs, however, the solution is simple: Do not cascade two masters in width; use a master and a slave.

The Cypress Dual-Port RAM Family

Table 1 lists the members of the Cypress dual-port RAM family. The package designator D26 stands for 600-mil ceramic DIP, and P25 stands for 600-mil plastic DIP. The 48-pin ceramic leadless chip carrier (LCC) is designated as L68. The 52-pin packages are designated as L69 for ceramic LCC and J69 for plastic LCC (PLCC).
Note that the interrupt function is not available at the 2048 x 8 level in a 48-pin package. This is due to pin limitations. At the 2-Kbyte level, each port requires an additional address pin for the address's most significant bit.

The M/S column in Table 1 indicates whether the device is a master or slave. The difference between these devices is that the masters have arbitration logic and the slaves do not. The busy signals are outputs from the master and inputs to the slave. (The ramifications of this are examined later.)

Table 1. The Cypress Dual-Port RAM Family

                                  Package Options
Configuration  Part Number  M/S   48-pin DIP         48-pin      52-pin Square
                                  Ceramic  Plastic   Square LCC  LCC    PLCC
1K x 8         CY7C130      M     D26      P25       L68         ---    ---
               CY7C131      M     ---      ---       ---         L69    J69
               CY7C140      S     D26      P25       L68         ---    ---
               CY7C141      S     ---      ---       ---         L69    J69
2K x 8         CY7C132      M     D26      P25       L68         ---    ---
               CY7C136      M     ---      ---       ---         L69    J69
               CY7C142      S     D26      P25       L68         ---    ---
               CY7C146      S     ---      ---       ---         L69    J69

Note: The interrupt function is not available at the 2K x 8 level in a 48-pin package.

When both ports address the same memory location and both chip enables are active (Low), contention occurs for that address. An arbitration is then performed, and ownership of the memory location is assigned to the winner. An active (Low) busy signal notifies the loser of the arbitration.

Dual-Port RAM Functional Description

An important aspect of the Cypress dual-port RAMs is their interrupt logic. A simplified logic diagram of this logic appears in Figure 4, with the chip enables deleted. A port's chip enable must be asserted for the port to either read from or write to any location, including the mailboxes. Note that you can use the mailbox locations as conventional memory by not connecting the interrupt line to the appropriate processor.

The upper two memory locations (7FF and 7FE for 2K x 8; 3FF and 3FE for 1K x 8) can be used for message passing. The highest memory location serves as the mailbox for the right processor. When the left processor writes to this mailbox, the interrupt (request) to the right processor, INTR, goes Low. When the right processor reads its mailbox, the flip-flop is reset, and INTR goes High. The second highest memory location serves as the mailbox for the left processor. When the right processor writes to this mailbox, the interrupt (request) to the left processor, INTL, goes Low. When the left processor reads its mailbox, the flip-flop is reset, and INTL goes High. Note that each port can read the other port's mailbox without resetting the associated flip-flop.

If your application does not require message passing, leave the appropriate pin open. Do not connect a pull-up resistor to the pin, and do not connect the pin to the processor's interrupt request pin.

Note that the active state of the busy signal prevents a port from setting the interrupt to the winning port. Additionally, an active busy signal to a port prevents that port from reading its own mailbox and thus resetting the interrupt. These operations are ramifications of the data-ownership concept.

If both ports address the same memory location at the same time, the master performs an arbitration, so that one port wins and the other loses. Because each of the two ports can be in either the reading or writing state, there are four possible combinations of ports and states (Table 2).

Figure 3. Dual-Port RAM Block Diagram

Cypress Dual-Port RAM Operation

A simplified block diagram of the Cypress dual-port RAM appears in Figure 3. The device interface includes three types of signals: address, data, and control. There are two sets of these signals: those of the left port and those of the right port. Each signal has either the subscript L or R to designate left or right, respectively.
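The mailbox interrupt behavior described above for the two highest memory locations can be captured in a small behavioral model. The sketch below is deliberately simplified and its names are hypothetical: `MailboxModel`, the port labels `"L"` and `"R"`, and the Boolean attributes `int_l` and `int_r` (True meaning High, or inactive) are illustrative choices. Chip enables, busy, and arbitration are left out, so the model does not cover the case where an active busy signal blocks setting or resetting an interrupt.

```python
class MailboxModel:
    """Behavioral sketch of the two mailbox interrupt flip-flops."""

    def __init__(self, size=2048):
        self.mem = [0] * size
        self.right_mailbox = size - 1    # highest location (7FF for 2K x 8)
        self.left_mailbox = size - 2     # second highest location (7FE)
        self.int_r = True                # INTR level: True = High (inactive)
        self.int_l = True                # INTL level: True = High (inactive)

    def write(self, port, addr, value):
        self.mem[addr] = value
        if port == "L" and addr == self.right_mailbox:
            self.int_r = False           # left writes right's mailbox: INTR Low
        elif port == "R" and addr == self.left_mailbox:
            self.int_l = False           # right writes left's mailbox: INTL Low

    def read(self, port, addr):
        # Only the owner reading its own mailbox resets the flip-flop.
        if port == "R" and addr == self.right_mailbox:
            self.int_r = True
        elif port == "L" and addr == self.left_mailbox:
            self.int_l = True
        return self.mem[addr]


ram = MailboxModel(2048)
ram.write("L", 0x7FF, 0x5A)              # left processor posts a message
pending_after_write = not ram.int_r      # interrupt to the right port is active
ram.read("L", 0x7FF)                     # left reads it back: no reset
still_pending = not ram.int_r
message = ram.read("R", 0x7FF)           # right reads its mailbox: reset
```

In this sequence the interrupt to the right port stays active after the left port reads the mailbox back and is released only when the right port reads its own mailbox.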
The address pins are designated A0 through A9 (1024 x 8) and A0 through A10 (2048 x 8), where A0 is the least significant bit (LSB) and A9 or A10 is the most significant bit (MSB). The address pins are unidirectional inputs to the device; their states specify the memory location to be read from or written into.

The data pins are designated I/O0 through I/O7, where I/O0 is the LSB and I/O7 is the MSB. The data pins are bidirectional; their states represent either the data to be written or the data to be read.

The control pins are chip enable (CE), read/write (R/W), and output enable (OE). Two flags are also provided, INT and BUSY; both have open-drain outputs and require external pull-up resistors.

A Low on the chip enable input allows that port to become functional. Data is either read from the internal dual-port RAM array or written into it, depending upon the state of the read/write signal; a Low initiates a write operation. The three-state data output drivers are enabled by a Low output enable. When one port writes to a pre-determined mailbox, an interrupt to the other port is generated.

Figure 4. Interrupt Logic

Table 2. Functional Operation of Dual-Port Masters

                                  Result of Operation After Arbitration (Master)
Case  Left Port  Right Port       Cypress and IDT                  AMD
1     Read       Read             Both ports read                  Both ports read
2     Read       Write            Loser prevented from writing.    Loser writes; winner, if
3     Write      Read             If loser is reading and ports    reading, might have corrupted
4     Write      Write            are asynchronous, data read      data and not know it
                                  might not be valid

Both Ports Reading

If both ports read the same location at the same time, you would expect both ports to read the same data. This is true for all dual-port ICs. When arbitration occurs as a result of contention in a Cypress dual-port RAM, the port that wins the arbitration gets temporary ownership of the memory location. The losing port can read the memory location but is told by the busy signal that it lost the arbitration.

To guarantee data integrity in a multiprocessor system, it is standard practice to apply the concept of data ownership. This ownership can apply to executable code, data, or control locations in memory. The control locations in memory can be associated with a resource, such as a printer, tape drive, disk drive, or communications port.

One Port Reading, the Other Writing

In the AMD dual-port RAM, the losing port is not prevented from writing. In the Cypress and IDT devices, the losing port is prevented from writing. All dual-port RAMs assert a busy signal to the losing port, so that this port can tell that the data might be corrupted. In the Cypress dual-port RAMs, the losing port is prevented from writing so that the data cannot be corrupted. Busy is asserted to the losing port, so that the port can tell that its read or write operation might not have been successful.

Both Ports Writing

In the AMD dual-port RAMs, both ports are allowed to write. Busy is asserted to the losing port, indicating that the data might be corrupted. However, the winning port is not told that the data it just wrote might be corrupted by the writing of the losing port. This situation can cause system errors. In the Cypress and IDT dual-port RAMs, the losing port is prevented from writing, so that the data cannot be corrupted. Busy is asserted to the losing port, indicating that its write operation was unsuccessful.

Arbitration Logic

Figure 5 shows the arbitration logic used in Cypress dual-port RAM masters. The arbitration logic has three functions: to decide which port wins and which loses if the addresses are equal simultaneously; to prevent the losing port from writing; and to provide a busy signal to the losing port.

The arbitration logic consists of left and right address equality comparators with their associated delay buffers; the arbitration latch formed by the cross-coupled, three-input NAND gates labeled L and R; and the gates that generate the busy signals.

Figure 5. Arbitration Logic

Operation With Unequal Addresses

When the addresses of the right and left ports are not equal, the outputs of the address comparators (nodes A and B) are both Low, and the outputs of the gates labeled L and R (nodes C and D) are both High. This condition forces both Busy signals High and both Write Inhibit signals High. The arbitration latch does not function as a latch.

Left Port Camped on an Address

Next, consider the condition where the left-port address and chip enable are quiescent, and the right-port address changes to an address equal to that of the left port. Nodes A and B are initially Low. Because the right-port address does not go through the delay buffer, the output of the right-address comparator (node B) goes High before node A goes High by a delay interval, d. The delay must be greater than the delay through the R gate, so that when node B goes High, node D goes Low, causing node C to remain High. CE(R) and CE(L) are both High; they are the inverse of the chip enable inputs.

Node D going Low causes the output of the BR gate to go Low, which tells the right port that the memory location it just addressed belongs to the left port. A write-inhibit signal is also generated that prevents the right port from writing into the addressed memory location.

In summary, when the right port addresses a memory location that is already being addressed by the left port, a delay occurs that equals the sum of the propagation delays of the right-address comparator, the R gate, the BR gate, and the output driver (not shown in the diagram).
Then the busy signal to the right port is asserted. Nodes A, B, and C are now High, and node D is Low. Due to the symmetry of the arbitration logic, the device operates the same way when either the right or the left port is camped on an address.

Right and Left Addresses Equal Simultaneously

In the general case, it is possible to have both ports access the same memory location simultaneously, unless this is guaranteed not to occur by the design of the system. When nodes A and B go from Low to High at exactly the same instant, the arbitration latch settles into one of two states and determines which port wins and which port loses. The latch is designed such that its two outputs are never Low at the same time. It also has a very fast switching time.

The dual-port RAM imposes a minimum time difference between either of two events: the two chip enables going from inactive to active and the two sets of addresses going from mismatch to equal. If the events are close together in time, the probability of each port either winning or losing the arbitration is approximately equal. The minimum time difference is called port set-up time for priority and is abbreviated tPS on the data sheets. The specified value is 5 ns. (Note, though, that Cypress product engineers have measured tPS at room temperature and nominal VCC (5V) and found a value of approximately 200 ps.) In other words, if one port addresses a memory location 5 ns before the other port, the first port is guaranteed to win. If not, the result of the subsequent arbitration is unpredictable.

Other Key Busy Parameters

Several other key parameters are specified with respect to the busy signal. For example, Busy Low from address match, tBLA, is the maximum time it takes busy to go Low, as measured from the time the two port addresses are the same. This is the time from an address match until the losing port is notified that it has lost the arbitration. Obviously, the sooner this occurs the better. If the value of tBLA is greater than the memory cycle time, another cycle must be added to detect the condition, which can severely reduce performance. This time is less than the minimum cycle time for all speed grades of all Cypress dual-port RAMs.

Another parameter, Busy High from address mismatch, tBHA, is the maximum time it takes busy to go from Low to High, as measured from the time the two port addresses do not match until the busy signal goes High. The comments of the preceding paragraph also apply here.

The next two parameters are similar to the preceding two. The difference is that the chip enable controls the busy signal. The parameters are Busy Low from CE Low, tBLC, and Busy High from CE High, tBHC. Both of these parameters are less than the minimum cycle time for all speed grades of all Cypress dual-port RAMs.

Busy High to valid data, tBDD, is the maximum time it takes the data to become valid to the losing port after Busy goes High. This parameter's value equals the address access time, tAA, because a read cycle is initiated to the losing port when its Busy signal transitions from Low to High. An action by either port can cause the busy transition. The winning port can either change its address or deassert its chip enable.

Figure 6. Busy Timing

To illustrate the last two parameters, Figure 6 shows the timing for the right port performing a write operation and the left port asynchronously moving to the same address and attempting to perform a read operation. The first parameter of interest is tDDD, which is the maximum time between the stabilization of the data to be written by the winning port and that same data becoming valid at the outputs of the port that received the Busy. The second parameter of interest is tWDD, which is the maximum time between the High-to-Low transition of the winning port's write strobe and the data becoming valid at the outputs of the port that received the Busy.

It is possible for the losing port to read either the old data, the new data, or some random combination of the two under these circumstances: the two ports are operating asynchronously (i.e., with independent clocks), and the conditions illustrated in Figure 6 occur (winning port writing and losing port reading). If the read occurs early with respect to the write, old data is read. If the read occurs late with respect to the write, new data is read. And, if the read occurs at the same time the data is changing from old to new, the data read is not predictable.

However, all is not lost. There are two general solutions. Both use the fact that the busy signal is asserted to the losing port, telling the port in this instance that the data it is reading might not be valid.

One solution is to use the High-to-Low transition of the busy signal to the losing port to generate an interrupt to the processor (or state machine) so that the operation can be repeated. The drawback of this technique is that a snapshot of the states of the losing port's address lines and read/write line must be taken, so that the processor can tell what load/store operation caused the interrupt. Taking this snapshot requires latches or flip-flops for the data and control logic for doing the sampling, and the technique uses up an interrupt line.
The processor must also be able to read the sampled data later.

A second solution is to use the Low level of the Busy signal to the losing port to prompt one of three types of delays: delay the reading of data until the data becomes valid, which occurs an access time after the Low-to-High transition of Busy; insert wait states until Busy goes High; or stretch the clock until Busy goes High. Any of these methods probably requires less hardware and control logic than the preceding approach.

Use of these methods does mean that the Busy signal must eventually go from Low to High. This happens when the winning port either changes its address or deasserts its chip enable. For this reason, as well as for system noise immunity and power-saving considerations, it is recommended that blocks of addresses be decoded to generate chip enables for the dual-port RAMs.

Because the losing port has no control over the winning port in the general case, however, a question arises: What can the losing port do to successfully read the data just written, assuming the winning port does not change its address, write, or chip-enable signals? There are two possible operations:

1. Change an address line to a different address, then change back to the original address. This toggles the busy signal to the losing port.
2. Change the state of the chip enable. This also toggles the busy signal to the losing port.

Address Transition Detection

Why does changing the address or chip enable allow a losing port to read data successfully?

All Cypress dual-port RAMs, both masters and slaves, use a circuit design technique called Address Transition Detection (ATD) to improve performance and reduce power dissipation. ATD improves performance by equilibrating differential paths, pre-charging critical nodes, and forcing the outputs to a high-impedance state. Equilibration and pre-charging bias critical nodes to voltage levels approximately in the mid-point of the small-signal operating range; when the data is sensed, it takes a shorter amount of time to transition to the Zero or One level. Forcing the outputs to their high-impedance states improves speed slightly, but more importantly, the technique reduces output switching noise by eliminating crowbar current and separating the output current into two pulses instead of one.

ATD minimizes power consumption because it turns on power-hungry circuits only when they are required. Slightly over 50 percent of a RAM's circuits are linear, and approximately 70 percent of the power is dissipated in the sense amplifiers during a read operation. When the RAM is operating at its maximum frequency, the ATD circuits are constantly triggered, so the power savings are minimal. At lower speeds or smaller duty cycles, however, the power savings are significant.

A typical ATD sequence is illustrated in Figure 7. The event that triggers the ATD sequence for either port is the transition of any address, chip-enable, or read/write signal. Equilibration and pre-charging are performed next, followed by either turning on the sense amplifiers and latching the data (read operation) or pulling the BIT and complementary BIT lines to the required levels (write operation) at the addressed location. The master clock pulse lasts from 7 to 11 ns, depending upon temperature, supply voltage, and the distributions of IC processing parameters. At the end of the pulse, the data is latched and the appropriate circuits are turned off.

Figure 7. Simplified ATD Sequence (Idle, Detect Event, Turn-On Circuits, Perform Operation, Turn-Off Circuits)

Master Stand-Alone Operation

Figure 8 presents a block diagram of a system using two 8-bit microprocessors, the Cypress CY7C132 dual-port RAM, static RAM, and EPROM. The address lines of each microprocessor are decoded to generate the chip enables to the dual-port RAM, the SRAM, and the EPROM. Note that pull-up resistors are required on the interrupt requests to the microprocessors and the busy signals, which go to the microprocessors' wait inputs.

Figure 8. Typical 8-Bit Microprocessor

Slave Word-Width Expansion

The block diagram in Figure 9 shows how to interconnect a CY7C132 (2K x 8) master and a CY7C142 (2K x 8) slave to form a 16-bit-wide word. The diagram does not show the interfaces to the processors or the connections for the interrupt signals. As previously explained, the interrupt outputs are not available at the 2K x 8 level in the 48-pin DIP due to pin limitations. In the LCC and PLCC packages, the interrupt outputs are available from both the master and the slave devices. You can use either one. You do not have to tie the corresponding interrupt pins of the master and the slave together.

Delaying the Write Strobe

In width expansion, the write signals to the slave devices must be delayed by an interval at least equal to tBLA, which is the time required for the master to assert the busy signal to the slave after an address match. The delay prevents the slave data at the address in contention from being overwritten. Both the write and read cycle times must be increased by this amount of time. In equation form:

tWC = tPWE + tBLA     (Eq. 2)

where the delay must be at least equal to tBLA. Note that if you add more slaves to make a wider word (e.g., 24 or 32 bits), the delay elements' outputs can connect directly to the write-strobe inputs. Additional delay elements are not required.

Slave Stand-Alone Operation

Some applications might require that you give one port permanent and absolute priority over the other. You can easily do this by implementing the memory using only slave dual-port RAMs. The Busy input to the priority port must be tied High by either connecting it directly to VCC or to VCC through a 10-kΩ pull-up resistor. You can connect the low-priority port's Busy input to the high-priority port's read/write input. In this configuration, the busy (read/write) signal to the lower-priority port always prevents the port from writing when the high-priority port is writing to any location. The data of the lower-priority port is overwritten when the two ports operate asynchronously, the lower-priority port is writing, and the higher-priority port simultaneously writes.

This is not a very elegant solution because the Busy input to the low-priority port is not qualified by comparing the addresses of the two ports or their chip enables. However, this approach suggests how the slave dual-port RAMs can be used with external arbitration logic. The busy inputs can be used by control logic or under program control to dynamically change the port priorities. If the lower-priority port is read only, you can tie its Busy input High by either connecting it directly to VCC or to VCC through a pull-up resistor.

Dual-Port Design Example

The following design example illustrates the methodology to follow when designing with Cypress dual-port RAMs. In this example, a dual-port memory is used for message passing and bus snooping for many bus masters on a 32-bit-wide system bus. The dual-port RAMs interface to a 32-bit system bus on the right side and a 16-bit processor on the left side. From the right port, the memory appears as 8K 32-bit words, and from the left port the memory appears as 16K 16-bit words. The memory has the following characteristics:

1. The memory location corresponding to address zero for both ports is the same.
2. The data read from and written to the memory from both ports is in the same order. Thus, D0 of the right port corresponds to D0 of the left port. Additionally, D16 of the right port appears as D0 of the left port in address location 2048.
3. The minimum cycle time is 35 ns.
4. To conserve power, blocks of addresses are decoded to generate the required chip selects.
5. The CY7C132 and CY7C142 dual-port RAMs are used. Part of the design task is to specify the number of masters and slaves required and the way they must be interconnected.
6. The appropriate Busy signals must be generated to the correct port when contention occurs.
7. All possible mailbox locations that can be used for message passing are used.
8. The right port signals are AR0...AR12, DR0...DR31, CER, and BusyR. The left port signals are AL0...AL13, DL0...DL15, CEL, and BusyL.

A simplified logic diagram of the memory appears in Figure 10. A total of 16 2K x 8 dual-port RAMs are required. The devices labeled MA (master, bank A) through MD (master, bank D) are CY7C132 masters. The devices labeled SU (slave, upper half-word) and SL (slave, lower half-word) are CY7C142 slaves. The memory consists of four masters and twelve slaves, along with the required control logic.

From the right port, the memory is configured as 8K 32-bit words, with a master controlling three slaves. The one-of-four decoder labeled RB (right bank) generates chip-enable signals for each bank of 2K 32-bit words.
Data is written (sampled) on the bus side, and the only reads performed are from the mailbox locations. A general-purpose, right-port, control-logic block generates control signals that conform to the timing diagram shown in Figure 11. The diagram does not show the generation of the output-enable control signals, but they are similar to the RB decoder signals. If your application does not require message passing to the right port, you can tie the right-port output-enable pins of all of the dual-port RAMs directly to VCC.

Figure 9. Expansion (2K x 16) With Slave

From the left port, the memory is configured as 16K 16-bit words. For this organization, you might think that the slave dual-port RAMs in the second column from the right in Figure 10 should be masters. If this were the case, however, you would have to defeat the arbitration logic in them when the right port addressed the same address; this would add logic, reduce the speed, and complicate the design. Therefore, this design uses a combination of left-bank decoding (LB, 1-of-4 decoder) and upper-lower 16-bit word decoding (UL, 1-of-8 decoder) to cause the bank master to arbitrate when the right port is addressing the same bank as the left port (more on this later).

Figure 10. Logic Diagram for Dual-Port Example

Right-Port Operation

For purposes of this discussion, "word" refers to the 32-bit word at the right-port system-bus interface. At the 16-bit processor interface, the 32-bit word is referred to as either the lower half-word (right-port bits 0 through 15) or the upper half-word (right-port bits 16 through 31).

The bank-selection process employs the chip enables. Specifically, the 1-of-4 RB decoder decodes the four combinations of the upper two right-port address-bus signals and generates four active-Low chip enables, one for each bank of four dual-port RAMs. Bank A contains addresses 0 through 2047, bank B contains addresses 2048 through 4095, bank C contains addresses 4096 through 6143, and bank D contains addresses 6144 through 8191. In other words, bank A addresses 0 to 2K, bank B 2K to 4K, bank C 4K to 6K, and bank D 6K to 8K. The lower 11 right-port address lines, AR(0:10), are connected to the A0 through A10 right-port address pins of all the dual-port RAMs.

Figure 11. Dual-Port Timing for Example

Figure 11 does not show the generation of the write strobe, but does show the signal's timing. The write enable is applied directly to all the masters in parallel, then buffered, and then applied to all the slaves. The minimum propagation delay of the buffer must be at least as large as tBLA, which is the time required for the master to assert the busy signal to the slaves after an address match occurs.

Note that all the right-port output-enable pins are connected together. These pins should be driven if reading is required; otherwise connect them to VCC.

The open-drain busy outputs of the right-port masters must be pulled up to VCC using resistors. A value of 330Ω is recommended. The master busy outputs connect to all the right-port slave busy inputs for each bank.

For the data bus interface, the I/O pins of each RAM column connect to their respective I/O pins on each bank. This OR-tie connection is allowed because the bank-selection chip enable causes the output buffers of the unselected banks to go to the high-impedance state.

Left-Port Operation

The 1-of-4 decoder labeled LB performs bank selection for the left port. The upper two left-port address lines, AL13 and AL12, decode bank-select chip-enable signals for the four masters only. Bank A corresponds to addresses 0 through 4095, bank B corresponds to addresses 4096 through 8191, bank C corresponds to addresses 8192 through 12,287, and bank D corresponds to addresses 12,288 through 16,383.

To perform upper and lower half-word selection, the 1-of-8 decoder labeled UL decodes the upper three left-port address signals. The decoder then generates eight chip-enable signals with a resolution of 2048. The chip enables connect to the slaves' chip-enable and output-enable pins (2048 resolution) and to the masters' output enable. Because the master chip-enable resolution is 4096, the master arbitrates for two blocks of 2048 16-bit half-words. The lower eleven left-port address lines, AL(0:10), connect to left-port address pins A0 through A10 of all the dual-port RAMs.

At the 16-bit interface, writing is only required if the left port wishes to send a message to the right port. Otherwise, you can connect the left-port write pins of all the dual-port RAMs to VCC.

To implement the left-port data bus interface, the left port's data I/O pins are connected together in the same manner as those of the right port for all RAMs in the same column. In addition, to multiplex a 32-bit data word to a 16-bit half-word, the least-significant bytes and the most-significant bytes of each 2048-word group are connected together. The UL decoder that controls the left-port output enable performs the selection.

If you use the masters' interrupt pins, pull them up to VCC through a 330Ω resistor and connect them to the processor interrupt-request input. You can leave the slaves' interrupt pins unconnected.
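The bank and half-word selection described for the two ports can be summarized as address arithmetic. The following is a minimal sketch, not the decoder logic itself: the function names are hypothetical, and the assignment of even-numbered 2048-blocks to the lower half-word and odd-numbered blocks to the upper half-word is an assumption inferred from the statement that D16 of the right port appears as D0 of the left port at left address 2048.

```python
BANKS = "ABCD"


def decode_right(ar):
    """Right port: 13-bit word address -> (bank, 11-bit chip address).

    Mirrors the RB 1-of-4 decoder on the upper two address bits (AR12, AR11).
    """
    if not 0 <= ar < 8192:
        raise ValueError("right-port address must be 0..8191")
    return BANKS[ar >> 11], ar & 0x7FF


def decode_left(al):
    """Left port: 14-bit half-word address -> (bank, half, 11-bit chip address).

    The LB 1-of-4 decoder uses AL13 and AL12 (4096 resolution); the UL 1-of-8
    decoder uses AL13..AL11 (2048 resolution). The lower/upper mapping of the
    UL outputs is an assumption, as noted in the lead-in.
    """
    if not 0 <= al < 16384:
        raise ValueError("left-port address must be 0..16383")
    bank = BANKS[al >> 12]   # LB decoder output: selects the bank master
    ul_select = al >> 11     # UL decoder output: one of eight 2048-blocks
    half = "upper" if ul_select & 1 else "lower"
    return bank, half, al & 0x7FF
```

Under these assumptions, left address 2048 selects the upper half-word of word 0 in bank A, which is the same dual-port RAM location the right port reaches at address 0 on bits 16 through 31.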
If the control signal connections from their source to the dual-port memory constitute electrically long lines, they might require proper termination to avoid voltage reflections due to impedance mismatches. Refer to the application note "Systems Design Considerations When Using Cypress CMOS Circuits" in this book for further information.

References

1. Dijkstra, E.W., "Solution of a Problem in Concurrent Programming Control." CACM, Vol. 8, No. 9, Sept. 1965, p. 569.
2. Dijkstra, E.W., "Co-operating Sequential Processes." Programming Languages, F. Genuys (Ed.), Academic Press, New York, 1968, pp. 43-112.

Using Dual-Port RAMs Without Arbitration

This application note offers several ways to implement dual-port RAMs to facilitate communication between processors. The applications covered include communication with general-purpose processors; video and radar equipment; digital signal processors; and bit-slice processors.

The most common application for dual-port RAMs is to provide a high-speed memory resource that can be shared between two processors in a system. Figure 1 illustrates how the two processors communicate by passing data and commands via the shared memory. Both processors benefit by having access to the dual-port RAM because it is mapped just like any other memory device on the board. Fast, local access to the shared memory eliminates the need to arbitrate for and access the system bus when reading or writing a common resource area such as a shared memory card. In fact, many multiprocessor embedded-control systems implement dual-port RAMs for interprocessor communication and eliminate the system bus entirely. Removing the burden of a system bus, which only exists to hook the processors together, reduces the complexity of the system as well as the part count and power consumption.

Figure 1. Dual-Processor Communication

Dual-Port Overview

Incorporating dual-port RAMs into a design such as the dual-processor example is straightforward. But it is important to consider the case of an address contention or busy situation that can arise when both processors simultaneously attempt to access the exact same location. Cypress dual-port RAMs have several mechanisms that simplify simultaneous access.

The simplest approach to resolving contention is to use the dual-port RAM's Busy output lines. Both right and left ports provide a Busy output signal. The arbitration logic inside the dual-port RAM activates Busy when the logic senses a match between the left and right address lines. Assertion of Busy indicates that both ports have attempted to access the same location in the RAM. In the case of a dual-processor system, these signals can easily be gated with the processor's local Wait signal to generate a hold to the microprocessor until Busy is deasserted. Adding an occasional wait state to a microprocessor generally has no effect on the overall system performance.

Gating the Wait line and generating a hold to the processor resolves the logical problem of simultaneous address conflicts but does not address the system-level issues that can cause the conflicts. The two-processor example serves to illustrate a common underlying cause of a Busy state. Say that processor A attempts to read an array of data that was generated by processor B, but the system contains no mechanism to alert processor A when the data is ready or valid. Therefore, processor A might be updating a RAM location while processor B is reading the same address or vice versa. This lack of overall synchronization or interprocessor communication can manifest as stale data or incomplete arrays of data in the shared memory. In a few cases, stale or incomplete data is tolerable, but in most cases it is fatal.

Locking a processor or processors out of specific memory areas until data is available guarantees that processors never receive stale data. To implement such address-space restrictions, you must provide a level of access protection above the basic gating-of-Busy technique. In most cases, you must add external hardware that signals the processors when new data is available or when permission is granted to access a certain area of the dual-ported device.

Interrupts serve well as a simple means of alerting or synchronizing interdependent system elements that pass data via a shared memory. Cypress dual-port RAMs provide interrupt outputs to simplify the task of interrupting or signaling the processors; this relieves you of the need to create your own interrupt mechanism. Assertion and deassertion of these interrupt lines is accomplished by performing write and read operations to special locations within the dual-port RAM. Table 1 lists the read and write operations.

Table 1. CY7C132 Interrupt Line Usage

Function                         Result
Write to Left Address 7FFh       Asserts Int_right
Read from Right Address 7FFh     Removes Int_right
Write to Right Address 7FEh      Asserts Int_left
Read from Left Address 7FEh      Removes Int_left

The data word written to these locations (3FEh and 3FFh in the 1K x 8 devices; 7FEh and 7FFh in the 2K x 8 devices) can be used as a status word or semaphore. This word is presented to the data bus during the read operation of an interrupt removal cycle. The status word provides additional system-level information that augments the hardware interrupt signal by passing along some meaning with the actual interrupt event. More simply, the interrupt line alerts the processor that some action is required, and the status word provides additional information about exactly what happened or what needs to be done. The actual meaning of the status byte is defined by the system designer. Generally, the status byte is used to indicate that data is ready, to lock a processor out of a specific range of addresses, or to prompt a processor for new data. Using the interrupt, along with status information, is an easy way of avoiding busy conditions by synchronizing processes or restricting address spaces via software.

You now have two main options for dealing with simultaneous address situations: use Busy in a strictly hardware solution, or couple interrupts with status words for a software solution. Regardless of your preference for a hardware or software approach, Cypress dual-port RAMs provide all signals and functions necessary to ensure a simple and effective system solution that maintains data integrity and system sanity.

Using Dual-Port RAMs Without Arbitration

Wait states and interrupts are a good solution for systems with microprocessor-like elements that are not affected by an occasional wait state. However, a much broader class of systems and applications cannot tolerate any type of data flow interruption or busy condition. Typically, these systems are dedicated function units that are rigidly pipelined and operate on continuous or nearly continuous streams of data.

A high-speed video processor is a good example of a system whose elements cannot be wait-stated due to the requirement that a data word or pixel be processed in every clock cycle. The block diagram in Figure 2 shows a video data transform or look-up table. This implementation uses a very common dual-banked or "ping-pong" RAM to realize a look-up-table translation function (Figure 3). A continuous stream of video data drives the address lines of RAM bank 0. The output or transformed data of bank 0 flows downstream to the post-processor units. Meanwhile, as continuous video data flows through RAM bank 0, the transform table of bank 1 is updated by a processor element, without interfering with the video data flow.
Dual banks make it impossible for a busy condition or address conflict to exist, because each system element essentially has its own discrete dedicated RAM. The processor finishes updating the look-up table, then swaps RAM banks by toggling the bank-select line. The PAL then changes the state of the buffer-enable signals, which redirects the data flow pattern of the two RAM banks. The ping-pong arrangement is effective, but the implementation is very costly in terms of real estate. The RAM design requires at least 11 very high speed devices, using standard static RAMs.

Figure 2. Video Look-Up Table

Table 2. Dual-Port vs. Ping-Pong RAM

Device             Qty    Power (mA)    Size (sq. in.)
FCT244               6        15            0.4
FCT245               2        10            0.4
PAL16L8-D            1       180            0.4
20 ns x8 RAM         2       140            0.52
Total               11       570            4.64
CY7C142-35           1       120            1.5

Figure 4. Video Lookup with Segmented Dual-Port RAM

Replacing the buffers, logic, and SRAMs with a single dual-port RAM (Figure 4) simplifies the design substantially. Video data utilizes the device's left port, while the processor communicates with the right port. Having two ports eliminates the need for any type of data and address steering buffers. During processor update cycles, however, there remains the problem of simultaneous address accesses and busy conditions. RAM segmentation eliminates the possibility of a busy conflict and provides the key to implementing a dual-banked RAM within a single dual-port RAM. A single inverter segments the RAM. The Bank_select signal from the processor drives the left address-port MSB, and the Bank_select signal's inverse drives the right MSB. The dual-port RAM is now segmented into two 1K address spaces that do not overlap. The RAM appears as two totally separate RAMs, as it did in the ping-pong implementation.
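The segmentation scheme can be checked with a small model. The following Python sketch is illustrative only: it assumes an 11-bit address (two 1K banks, as described above), and the names left_address, right_address, and busy are hypothetical, not part of any Cypress tool. It shows that driving one port's address MSB with Bank_select and the other port's MSB with its inverse keeps the two ports in different banks, so the address match that asserts Busy cannot occur.

```python
ADDR_BITS = 11                 # assumed: 2K locations = two 1K banks
MSB = 1 << (ADDR_BITS - 1)     # address line that selects the bank
BANK_SIZE = MSB                # 1K locations per bank


def left_address(bank_select: int, offset: int) -> int:
    """Left (video) port: Bank_select drives the address MSB directly."""
    return (MSB if bank_select else 0) | (offset % BANK_SIZE)


def right_address(bank_select: int, offset: int) -> int:
    """Right (processor) port: the inverted Bank_select drives the MSB."""
    return (0 if bank_select else MSB) | (offset % BANK_SIZE)


def busy(left: int, right: int) -> bool:
    """Model of the arbitration logic: Busy only on an exact address match."""
    return left == right


if __name__ == "__main__":
    for bank_select in (0, 1):
        left = left_address(bank_select, 5)
        right = right_address(bank_select, 5)
        print(bank_select, hex(left), hex(right), busy(left, right))
```

Toggling bank_select in this model swaps which 1K bank each port sees, which mirrors the bank swap of the ping-pong arrangement without any steering buffers.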
Again, because the left address can never equal the right address due to the opposite state of their MSBs, a busy condition is not possible.

Using a dual-port RAM does more than simplify the design. Table 2 shows the tremendous savings in real estate and power consumption. Specifically, a single dual-port device reduces the board area by 68 percent and reduces the power consumption by almost 80 percent. In terms of MTBF, system reliability benefits greatly from having fewer components and significantly lower power dissipation. The multitude of buffers and transceivers that steer data and address signals in a ping-pong memory array take up relatively large amounts of board space as well as adding to the data propagation delay. The latter forces you to use very high speed RAMs. Dual-port RAMs do not suffer from the burden of buffer delays and can therefore operate at significantly lower speeds.

Figure 3. Ping-Pong RAM Array

Figure 5. Data Descrambler

Figure 6. CPU/Pipelined Processor Interface

Handling Video or Radar Data

Many types of high-speed data-processing applications can benefit from the use of dual-port RAMs. For example, high-speed video or radar data is often transmitted in nonsequential or cross-interleaved order. The receiver must first descramble or reorder the data before the data can be used. Again, the incoming data stream cannot be stopped in the event of an address contention. Figure 5 shows that a dual-port RAM is an ideal solution for this type of problem.

Incoming data is written into the RAM's left port in the received order. The pixel counter provides sequential addresses to the left side of the dual-port RAM and increments after each pixel. At the end of the first line, the counter reaches terminal count and initiates a bank toggle via a T-type flip-flop. After the banks switch, the new data is accessible via the right port. A FIFO stores the reordering sequence and thus drives the right port's address lines to read out the stored video data. PROMs and counters can also implement the descrambling function, but this approach requires more parts and is much less flexible. Using a FIFO eliminates the need to generate addresses for the reordering sequence table. The CPU initializes the descrambling FIFO at boot up. Initialization is only required once because the FIFO utilizes its retransmit function (described in the CY7C429 FIFO data sheet), unless the data ordering changes. Because this design implements the dual-port RAM as a segmented memory, you can ignore the problems caused by address contention.

For DSPs and Bit-Slice Processors

Interfacing a system's CPU to a high-speed, pipelined digital signal processor or bit-slice processor is another common system interface problem. Coefficients and commands must be passed to the pipelined processor, and final results read back by the CPU. Dual banks of RAM often furnish a solution because they provide a shared memory space that both system elements can use without address contention. Because the machines involved are rigidly pipelined, they cannot easily be stopped or interrupted. Thus, a single, segmented, dual-port RAM (Figure 6), or several dual-port RAMs in parallel with no additional glue logic, provides a simple, cost-effective solution to this problem. If two banks of data are too restrictive, you can segment the dual-port RAM into multiple address spaces by restricting more of the upper-address-line pairs.
This scheme allows the processor to easily and quickly communicate with the pipeline processor without using large amounts of real estate and power.

Using Cypress SRAMs to Implement 386 Cache

Because the 80386 is the most commonly used 32-bit microprocessor available today, this application note discusses some 386 cache implementations that take advantage of special features offered by Cypress's SRAM products. This application note does not offer a broad treatment of cache memories, however, and it assumes that you have a fundamental understanding of cache memories and the terminology associated with them.

Mainframe computers have used cache memories for several years. Desktop systems did not require caches until the advent of 32-bit microprocessors, such as the 80386, that run at clock frequencies of 20 MHz and above. A cache allows you to make full use of the microprocessor's available throughput, because the processor's bandwidth is greater than the bandwidth available from commonly available DRAMs.

In a memory hierarchy, a cache is a small, fast memory placed between the processor and main memory. A cache stores the most often used data and instructions to avoid accesses to main memory. Because of speed requirements, a cache is usually implemented with fast static RAM. The goal, then, is to implement the memory subsystem such that the processor's effective average access time approaches that of the cache, while the memory subsystem's cost per bit approaches that of the main memory.

Computer programs exhibit temporal and spatial locality, which make cache memories possible. Temporal locality refers to a program's tendency to re-reference the elements referenced in the recent past. Loops, temporary variables, and stacks are examples of constructs that conform to this property.
Spatial locality refers to a program's tendency to access a portion of the address space in the neighborhood of the last reference. Sequential program execution and repeated access to array variables are examples of this property.

In addition to discrete cache implementations, several VLSI cache controllers are available today for the 80386. This application note describes two of the most popular: the Intel 82385 and the Chips and Technologies 82C307. A discrete cache implementation using Cypress products is covered first.

Discrete Implementation

You can implement a cache memory without using a VLSI cache controller. This discrete approach has the advantage of allowing you to custom tailor the cache subsystem to your specific requirements instead of being limited by a VLSI cache controller's capabilities. You can implement a low-cost cache subsystem or a cache with higher performance characteristics than can be achieved with today's VLSI cache controllers.

The discrete approach also has drawbacks. It makes high-speed caches more difficult to implement due to the delays incurred by the discrete ICs' input and output buffering, as well as trace delays introduced by the printed circuit board. Discrete solutions can also increase board-space and power requirements, and transmission line and noise effects become a more significant problem.

Figure 1 shows a block diagram of a simple, 64-Kbyte, direct-mapped, write-through cache. You can implement the control logic in programmable logic or a gate array (which are not detailed here). The cache tag, or directory into the cache data, is implemented in the CY7C150 1K X 4 resettable SRAM. The CY7B185 8K X 8 SRAM serves as the cache data RAM. CY7C408A 64 X 8 FIFOs are used as write buffers, which reduce the number of processor stalls in the write-through cache. This example assumes that no memory references are made above 1 Gbyte. Thus, only the lower 30 address bits of the 80386 are used.
Because the tag directory has 1K entries, and the data cache is organized as 8K X 32, the line size for this example is eight words or 32 bytes.

The 80386 supports two modes of local bus operation: pipelined and non-pipelined. With address pipelining enabled, the processor puts the address of the next memory access on the bus during the current access. This effectively gives the memory subsystem an extra clock cycle to decode the address. This approach has two drawbacks, however. First, entering pipeline mode incurs an additional wait state. Wait states also occur during branches, after periods when the processor's pre-fetch queue is full, and after another bus master, such as a DMA controller, relinquishes the local bus to the processor. The second drawback is that the address and some of the control signals must be externally latched, requiring additional board space and complexity. Thus, for simplicity and increased performance, the 80386's address pipelining feature is disabled in this example.

Figure 1. A Discrete Cache Implementation

During memory read accesses, address bits 5 through 14 index one of the entries in the tag RAM. Simultaneously, address bits 2 through 14 access the data RAM. After time tAA of the tag RAM, the address tag appears at the comparator inputs. This tag is qualified by the valid bit and compared with 80386 address bits 15 through 29. The match output is fed to the control logic.
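The address split just described can be written out explicitly. The Python sketch below is a simplified software model, not the control logic of the design; the names split_address and TagDirectory are hypothetical. It uses only the bit fields given in the text: bits 2 through 14 address the 8K X 32 data RAM, bits 5 through 14 index the 1K-entry tag RAM, and bits 15 through 29 form the tag that is compared when the entry is valid.

```python
def split_address(addr: int):
    """Return (tag, tag_index, data_index) for a 30-bit 80386 address."""
    addr &= (1 << 30) - 1                # only the lower 30 bits are used
    data_index = (addr >> 2) & 0x1FFF    # bits 2..14: one of 8K 32-bit words
    tag_index = (addr >> 5) & 0x3FF      # bits 5..14: one of 1K tag entries
    tag = (addr >> 15) & 0x7FFF          # bits 15..29: value held in the tag RAM
    return tag, tag_index, data_index


class TagDirectory:
    """Direct-mapped tag store with one valid bit per 32-byte line."""

    def __init__(self):
        self.tags = [0] * 1024
        self.valid = [False] * 1024

    def is_hit(self, addr: int) -> bool:
        tag, index, _ = split_address(addr)
        return self.valid[index] and self.tags[index] == tag

    def fill(self, addr: int) -> None:
        """Record a line fill after a read miss."""
        tag, index, _ = split_address(addr)
        self.tags[index] = tag
        self.valid[index] = True

    def flush(self) -> None:
        """Invalidate every entry, as a tag RAM reset would."""
        self.valid = [False] * 1024


if __name__ == "__main__":
    directory = TagDirectory()
    print(split_address(0x8024))
    directory.fill(0x8024)
    print(directory.is_hit(0x8038), directory.is_hit(0x10024))
```

Because the tag index ignores address bits 2 through 4, all eight words of a 32-byte line share one tag entry, which is why a fill for one word makes the neighboring words in the line hit as well.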
If a match is found, and the cache line is valid (i.e., a read hit occurred), the cache RAMs supply data to the 80386, and the cache control logic asserts /386 RDY. If a match is not detected, or the cache line is invalid (i.e., a read miss occurred), the output enable of the cache RAMs is de-asserted, and a main memory access is initiated. The cache control logic causes the cache line to be updated from main memory. The control logic then updates the valid bit and supplies requested data as well as /386 RDY to the processor.

This cache implements a write-through, no-write-allocate policy. Therefore, for write hits, both the cache RAM and main memory are updated before the 386 can continue execution. On write misses, only main memory is updated. Write buffers between the processor and main memory improve write performance. During write cycles, the processor writes to the write buffers, and the cache control logic updates main memory as a background task. While main memory is updated, the processor can continue executing as long as it executes read hit cycles or write cycles and as long as the write buffer has room. After a read miss, the processor halts until the write buffer has been completely flushed to main memory. Otherwise, the processor might access stale data from main memory.

The write buffers are implemented with Cypress CY7C408A 64 X 8 FIFOs. This device features speeds up to 35 MHz. It is deep enough that a full write buffer condition seldom occurs, and its output enable makes external three-state devices unnecessary.

The CY7C150 SRAM has two features that are beneficial in cache tag applications. First, access time is very fast. This product is available with a tAA as fast as 10 ns. This speed is important, because the tag logic can prove to be the critical speed path in the design. Second, the CY7C150 has a memory reset function that allows the contents of the entire tag to be flushed within two memory cycles. Therefore, a cache flush operation can be performed much faster than if the processor had to invalidate the tag RAM on a line-by-line basis.

The CY7B185 SRAM is fabricated in Cypress's high-performance BiCMOS process and is organized as 8K X 8. The device is available with access times as fast as 10 ns and comes with a variety of packaging options. This part's X 8 width allows you to implement the entire data cache with only four devices. Cypress provides a wide variety of memory width and depth configurations, all available with fast access times. You can thus implement the configuration that best suits your specific design requirements.

82385 Implementation

The 82385 is a VLSI cache controller offered by Intel that is specifically designed to work with the 80386. The device supports a 32-Kbyte cache and can be configured to operate in direct-mapped or two-way set-associative modes by strapping the 2W/D# pin. Appendix A provides information for strapping the 82385.

The CY7C184 cache RAM connects directly to the Intel 82385 and 80386 with no external glue logic. You can configure the CY7C184 as a 2 X 4K X 16-bit device for set-associative implementations or as an 8K X 16 device for direct-mapped implementations. During read misses, the 82385 invokes the 80386's pipeline mode to reduce the miss penalty. Therefore, the processor's address must be externally latched. The CY7C184 contains address latches, eliminating the need for discrete latches. Using discrete 4K X 4 SRAMs to implement the two-way set-associative configuration would require 18 ICs for the data cache and address latches. Two CY7C184s, each in a space-saving 52-pin PLCC package, implement the same function.

The CY7C184 is configured by strapping the MODE pin High for set-associative operation or Low for direct-mapped operation. In set-associative mode, address bit A12 is a Don't Care and should be externally grounded. Figures 2 and 3 show the connections for two-way set-associative and direct-mapped modes, respectively.

Table 1 illustrates some worst-case timing calculations for a 33-MHz system. As the CY7C184 data sheet shows, the -25 part meets or exceeds all the worst-case requirements. For the 33-MHz configuration, there is no difference in the 82385 timing specifications for set-associative and direct-mapped operation. Therefore, set-associative operation is recommended, because it yields higher hit rates. For some lower-speed grades of the 82385, the timing is less stringent for direct-mapped operation. Therefore, a slower, less-expensive cache can be implemented for direct-mapped operation. Thus, you must make a price/performance decision.

Table 1. Worst-Case Timing Calculations with the 82385

CALE             : 82385 Cache Address Latch Enable
CS(3:0)#         : Cache Select 3:0
COEA#, COEB#     : Cache Output Enables A, B
WEA#, WEB#       : Cache Write Enables A, B

Read Timing

tAA (max), non-pipelined mode
  4 CLK2 periods = 4 x 15 ns               60 ns
  CALE valid from CLK2 (max)             - 15 ns
  386 data set up time (max)             -  5 ns
  tAA (max)                                40 ns

COE(A,B)#, CS(3:0)# to Data Valid
  4 CLK2 periods = 4 x 15 ns               60 ns
  CS(3:0)# valid from CLK2               - 25 ns
  386 data set up time                   -  5 ns
  COE(A,B)#, CS(3:0)# to data valid        30 ns

tOE (max)
  2 CLK2 periods = 2 x 15 ns               30 ns
  COE(A,B)# active from CLK2 (max)       - 15 ns
  386 data set up time (max)             -  5 ns
  tOE (max)                                10 ns

Write Timing

  WEA#, WEB# pulse width (min)             20 ns
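Each entry in Table 1 follows the same arithmetic: the time available in whole CLK2 periods, less the controller's worst-case output delay from CLK2, less the 386 data setup time. The short Python sketch below restates that budget using the Table 1 figures; the function name max_access_time_ns and the dictionary keys are hypothetical and chosen only for illustration.

```python
def max_access_time_ns(clk2_periods, clk2_period_ns, control_delay_ns, data_setup_ns):
    """Worst-case time the cache RAM has to deliver valid data."""
    return clk2_periods * clk2_period_ns - control_delay_ns - data_setup_ns


# Figures taken from Table 1 (33-MHz 82385 system, 15-ns CLK2 period).
BUDGETS_82385 = {
    "tAA": max_access_time_ns(4, 15, 15, 5),          # CALE valid from CLK2
    "CS to data valid": max_access_time_ns(4, 15, 25, 5),
    "tOE": max_access_time_ns(2, 15, 15, 5),          # COE active from CLK2
}

if __name__ == "__main__":
    for name, budget in BUDGETS_82385.items():
        print(f"{name}: {budget} ns")
```

The same function can be reused for other clock rates or controllers by substituting the CLK2 period and the controller delays from the relevant data sheet.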
Figure 2. Set Associative Operation with the 82385

Figure 3. Direct Mapped Operation with the 82385

82C307 Implementation

The 82C307 is a combination cache/DRAM controller offered by Chips and Technologies. The device is part of a chip set designed to offer a high-performance IBM PC/AT-compatible system with a minimum number of components. Because the 82C307 has a two-way set-associative cache-mapping policy, strap the CY7C184 MODE pin High for proper operation.

The cache organization for the 82C307 is 2 X 4K X 32 bits. Two CY7C184s implement the entire data cache. The 82C307 also makes use of the CY7C184's built-in address latches when pipelined mode is required. The 82C307 has a programmable feature that allows either chip select or output enable to be supplied to the cache data RAM. This feature should always be programmed to generate a chip select when using the CY7C184.

Figure 4 illustrates how to use the CY7C184 with the 82C307. The Chips and Technologies 82C306 is used to latch the 80386 byte enables. Table 2 illustrates some worst-case read timing calculations for a 25-MHz system in both non-pipelined and pipelined modes. As the CY7C184 data sheet shows, the -25 part meets or exceeds the worst-case requirements for non-pipelined mode, and the -45 part does the same for pipelined mode. Again, you must make a price/performance decision based on these options.

Table 2. Worst-Case Timing Calculations with the 82C307

CRD(1:0)#        : 307 Cache Read 1:0

Read Timing

tAA (max), non-pipelined mode
  4 CLK2 periods = 4 x 20 ns               80 ns
  CRD(1:0)# from CLK2 (max)              - 12 ns
  386 data set up time (max)             -  7 ns
  tAA (max)                                61 ns

tOE (max), non-pipelined mode
  1.5 CLK2 periods = 1.5 x 20 ns           30 ns
  CRD(1:0)# active from CLK2 (max)       - 12 ns
  386 data set up time                   -  7 ns
  tOE (max)                                11 ns

tAA (max), pipelined mode
  6 CLK2 periods = 6 x 20 ns              120 ns
  CRD(1:0)# from CLK2 (max)              - 12 ns
  386 data set up time (max)             -  7 ns
  tAA (max)                               101 ns

tOE (max), pipelined mode
  3 CLK2 periods = 3 x 20 ns               60 ns
  CRD(1:0)# active from CLK2 (max)       - 12 ns
  386 data set up time (max)             -  7 ns
  tOE (max)                                41 ns

Figure 4. Operation with the 82C307

PCB Layout Considerations

As with any high-speed system, you must pay careful attention to the layout phase of a 386 cache project. The following rules of thumb help reduce noise problems and radiated EMI.

A multilayer board with both power and ground planes is strongly recommended. Power and ground planes provide good, low-inductance paths for the power connections to the devices on the PCB. These paths help minimize ground bounce and other noise problems. Sandwiching power or ground planes between signal layers greatly improves the circuit board's noise characteristics. Ground-loop currents are minimized, which reduces capacitive and inductive signal coupling. A maximum center-to-center spacing of 8 mils between signal and power layers is recommended.

Good high-frequency decoupling on power and ground connections is very important for reliable high-speed operation. High-frequency bypass capacitors with NPO or X7R dielectrics are recommended. These devices store charge and supply instantaneous power required by the active devices on the PCB. For the CY7C184, one 0.1-µF and one 0.01-µF capacitor are recommended per device. Surface-mount capacitors are preferred because of the lower lead inductance these devices exhibit. Additionally, you can place surface-mount devices on the back of the PCB in the center of the device they are intended to decouple. This placement reduces the inductance between the capacitor leads and the active device's power and ground connections.

Avoid sockets whenever possible because of the extra inductance introduced. If sockets are necessary, high-quality sockets with gold-plated contacts are recommended.

Pay careful attention to the routing of traces. In general, traces should be kept as short as possible to reduce transmission-line effects. Point-to-point connections are recommended, as opposed to stubbed or tree-type connections. The latter cause discontinuities in the transmission line, which create reflections. Instead of 90° bends, traces should be curved, or use two 45° bends. This helps reduce EMI.

Critical signals, such as clocks and control lines, should be routed first. Whenever possible, keep these signals on the same layer, because vias cause transmission-line discontinuities. Routing these signals on the inner layers reduces radiated emissions. To minimize transmission-line effects, keep these traces to a maximum of six inches in length. To minimize crosstalk, a center-to-center minimum spacing of 16 mils is recommended for critical traces.

The signal quality of the system clock is a very important consideration. Pay careful attention to clock loading and skew. For high-speed clocks, it is usually recommended to supply each clock input from a separate driver. The clock drivers should be in a monolithic package, such as a hex inverter, so that clock skew is minimized. Keeping clock traces approximately the same length also helps minimize clock skew.
Series damping resistors in the 10 to 27 Ω range might be required on clock traces to achieve good signal quality. If so, use as low a value as possible. Experimentation determines the optimal value.

Once control lines have been routed, address and data lines can be routed. These signals are somewhat less critical, because some settling time is usually provided in the worst-case timing. However, these signals should still be routed point to point, and trace length should be minimized.

Appendix A. Strapping Information for Different Steppings of the Intel 82385

Intel manufactures different versions (steps) of the 82385 cache controller. For example, the C step activates the output enables to the cache RAMs whenever the write enable signals are asserted. Step B, on the other hand, inhibits OE# while WE# is Low. Step SB, one of the new revisions, allows you to control the state of the OE# output during write cycles. Cypress recommends that pin A14 be tied Low; then OE# is de-asserted during write. There are two reasons for this:

Although the 7C183 three-states its outputs (tHZWE = 15 - 20 ns) after WE# is asserted, even if the OE# input is active, the write pulse width (tPWE) in some systems might not be long enough to satisfy the tSD requirement after tHZWE is satisfied.

Assuming tPWE is long enough to satisfy tSD, you must contend with another problem. After the 7C183 three-states its outputs, the noise caused by its buffers drives the VIH level to 3V. In other words, any inputs less than 3V might not be recognized as a High level. If you want to avoid this condition, pull OE# High 10 ns before asserting WE#.

Section Contents

PROMs
Pin-out Compatibility Considerations of SRAMs and PROMs
Introduction to Diagnostic PROMs
Interfacing the CY7C289 to the AM29000
Interfacing the CY7C289 to the CY7C601

Pin-Out Compatibility Considerations of SRAMs and PROMs

This application note discusses the non-electrical parameters of pin-out and programming involved in finding socket-compatible second sources for PROMs. Included here is an example of a verified conversion from the Motorola 68764 to the Cypress 7C264, a PROM conversion that is not address-line compatible.

An SRAM Comparison

To understand how to choose second-source PROMs, consider a comparison with the process of choosing second-source SRAMs. Ignoring the AC characteristics, finding a second source for an SRAM is relatively simple. So long as the power, ground, control (chip select, read, write), address, and data lines are on the same pins, the devices should be compatible. Specifically, on SRAMs, the address and data lines need not be numbered identically between the two devices for them to function identically in the same socket. As an example, on several Cypress SRAMs, the address pin numbering is not the same as on some of our competitors' devices.

Consider a simplified example that illustrates why address pin numbering is not a problem: Assume you have a new device, the 2-bit x 4-location SRAM shown in Figure 1. Note that the inferior pin-out chosen by the Brand "X" 2 x 4 assigns address line 2 (A2) to pin 1, while the superior pin-out used by the Cypress device has A1 at pin 1, etc.

Assume that your engineering staff designed an infrared scanning-pattern-recognizing toaster oven using the Brand "X" SRAM, working only from the device's data sheet. Just as your company is about to ramp into volume production, Brand "X" sends out an End Of Life notice on the 2 x 4, because the company is converting all of its capacity to making DRAMs.
At this point, because you have no desire to lay out a new PCB, take a look at how the Cypress and Brand "X" SRAMs would look in your design (Figure 2). In the figure, µP designates a microprocessor interfacing to the SRAM.

Figure 1. Example 2x4 Simplified SRAMs

Figure 2. Example System with 2x4 SRAMs (on the Brand "X" board, the µP signals A2, A1, D2, and D1 connect to socket pins 1, 2, 3, and 4, respectively)

The important thing to notice in Figure 2 is that the data read from an address generated by the microprocessor is the same as data written to the same location earlier. With an SRAM, any inconsistency between the address and data line numbering does not matter because the data read is the same as the data previously written.

To illustrate the point further, suppose that you write a value of 1 (µP:D2,D1 = 0,1) at location 2 (µP:A2,A1 = 1,0). If you read location 2, you obtain the value 1 that was written, because the address presented to the SRAM during the read is the same as the address for the previous write. Similarly, the data read is in the same bit order as presented during the previous write to the location. So far as the system is concerned, the two SRAM devices are compatible.

Although not significant to the system, the devices differ in where they internally store the data. In the Cypress device, the µP address of 2 (µP:A2,A1 = 1,0) actually stores the data at SRAM location 1 (Cypress:A2,A1 = 0,1). The Brand "X" RAM physically stores the data at address 2. The address translation is transparent to the µP, however. Because the same location is accessed for the subsequent reads, the difference in address numbering between the two devices does not matter to the system.
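The transparency argument can be demonstrated with a toy model of the 2 x 4 SRAM. In the Python sketch below, the class name Sram2x4 and the helper swap_two_bits are hypothetical; the flag swapped_pins=True models the Cypress device plugged into the Brand "X" socket, where both the two address lines and the two data lines reach the device in swapped order.

```python
def swap_two_bits(value: int) -> int:
    """Exchange bit 0 and bit 1 of a 2-bit value."""
    return ((value & 0b01) << 1) | ((value & 0b10) >> 1)


class Sram2x4:
    """2-bit x 4-location SRAM as seen from the board's socket pins."""

    def __init__(self, swapped_pins: bool):
        self.cells = [0, 0, 0, 0]      # internal storage, device numbering
        self.swapped_pins = swapped_pins

    def _to_device(self, value: int) -> int:
        return swap_two_bits(value) if self.swapped_pins else value

    def write(self, up_address: int, up_data: int) -> None:
        self.cells[self._to_device(up_address)] = self._to_device(up_data)

    def read(self, up_address: int) -> int:
        # The same pin swap applies on the way back out to the microprocessor.
        return self._to_device(self.cells[self._to_device(up_address)])


if __name__ == "__main__":
    brand_x = Sram2x4(swapped_pins=False)
    cypress = Sram2x4(swapped_pins=True)
    for device in (brand_x, cypress):
        device.write(2, 1)
        print(device.read(2), device.cells)
```

Both devices return the value 1 from location 2, even though the Cypress model holds the word in a different internal cell and bit order. The swap cancels because every write and every read passes through the same wiring.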
Similarly, any numbering difference on the data lines does not matter either. All writes and reads are generated in your system; thus, so long as the address and data lines are on the same pins, differences in the numbering do not matter.

Second-Sourced PROMs

For PROMs, the scenario becomes slightly more complex. Because you program PROMs using a programmer that is separate from the system in which they are used, it is more difficult to substitute PROMs that do not have the same address- and/or data-pin numbering.

Assume, for example, that the high-tech toaster oven's 2 x 4's are PROMs. If you program each location with data, you find that the Cypress device does not work properly when used in the Brand "X"-designed socket. In this case, the PROM programmer puts the data at location 2, and the board reads this data when the microprocessor requests the data at location 1. Additionally, the data bits are swapped on this read. What a mess! It becomes apparent that it is easiest to replace this PROM with a device that has the same address- and data-line numbering.

There are methods that allow you to use the Cypress 2 x 4 PROM in the Brand "X" socket, however. The objective in trying to make the Cypress PROM work in the foreign pin-out socket is to have the system read the same data as when the Brand "X" device is used. In the 2 x 4 example, you encounter two problems: mismatches in the numbering of address lines and data lines.

Correcting Data-Line Mismatch

First consider the data-line mismatch. As it stands, data programmed in as bit1,bit2 is read as bit2,bit1. You could fix this problem by swapping the printed traces for D1 and D2. Unfortunately, this would also disallow the use of the Brand "X" device. If you could internally swap the data bits programmed into the Cypress device, they would be in the correct order when read. You can, in fact, swap the data bits in the Cypress device through several means.
First, you might modify your programming adapter such that D2 and D1 are swapped when programming the part. Then when the device is read, you get the bits in the same order as presented by the Brand "X" device. This is not a recommended method of solving the problem, because modifying programmers tends to make the manufacturer of the programmer unhappy.

A second method of solving this problem is to alter the binary image of the PROM contents such that bits D1 and D2 are swapped in a file on your computer's disk; this altered binary image file is then used to program the Cypress PROM. This approach is less likely to cause damage than modifying a programmer, but requires some skill in altering the binary file.

Finally, the easiest solution to this problem is to trick the PROM programmer into swapping the bits for you. If you set your programmer for the Cypress device type, read a programmed Brand "X" device into memory, then program the Cypress part with the image in programmer memory, the bits are swapped for you.

1) Brand "X" 2 x 4       : Bit 2, Bit 1
2) Programmer (Cypress)  : Bit 1, Bit 2
3) Cypress 2 x 4         : Bit 1, Bit 2
4) System Board µP       : Bit 2, Bit 1

Figure 3. PROM Bit Swapping with Programmer

You can see how this bit swapping works by examining Figure 3. The bits in the Brand "X" device are stored in the order Bit2,Bit1, the same order in which the toaster's µP reads them. When you set the programmer to read the Cypress part, the data lines are logically swapped from the Brand "X" ordering. Thus, when you read the Brand "X" part, the data bits are swapped as shown. When the Brand "X" part is removed from the socket, and the Cypress device is plugged in and programmed, the bits are programmed into the Cypress part in this same "reversed" order. When you place the Cypress part into your board, the bits are swapped again due to the difference in numbering between the Cypress part and the board layout, and the µP gets the data in the correct order.
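The second method, altering the binary image before programming, can be sketched as follows for the 2 x 4 example. This is a minimal illustration under stated assumptions: the function names, the wiring tuples, and the convention that bit 0 stands for A1 or D1 and bit 1 for A2 or D2 are all hypothetical, and the image is a plain list of words rather than any particular programmer file format. The same routine also relocates whole words, which covers the address-line numbering difference as well.

```python
def permute_bits(value, wiring):
    """Move bit i of value to bit position wiring[i]."""
    out = 0
    for src, dst in enumerate(wiring):
        if (value >> src) & 1:
            out |= 1 << dst
    return out


def convert_image(image, addr_wiring, data_wiring):
    """Build the image to program into the substitute part.

    image is indexed by system address. System address bit i reaches
    device address bit addr_wiring[i]; likewise for the data bits.
    """
    converted = [0] * len(image)
    for sys_addr, word in enumerate(image):
        dev_addr = permute_bits(sys_addr, addr_wiring)
        converted[dev_addr] = permute_bits(word, data_wiring)
    return converted


def board_read(device_image, sys_addr, addr_wiring, data_wiring):
    """Model what the system sees when it reads the part in the socket."""
    dev_word = device_image[permute_bits(sys_addr, addr_wiring)]
    sys_word = 0
    for sys_bit, dev_bit in enumerate(data_wiring):
        if (dev_word >> dev_bit) & 1:
            sys_word |= 1 << sys_bit
    return sys_word


SWAPPED = (1, 0)  # the two address pins and the two data pins trade places
brand_x_image = [0b01, 0b11, 0b00, 0b10]  # arbitrary example contents
cypress_image = convert_image(brand_x_image, SWAPPED, SWAPPED)
```

Reading every location of `cypress_image` through `board_read` with the same wiring returns the original `brand_x_image`, which is the stated objective: the system reads the same data as when the Brand "X" device is used.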
Correcting Address-Line Mismatch

The second problem in substituting PROMs is the difference in address-line numbering. You can resolve this problem in exactly the same manner as the data swap problem. By simply setting the programmer to the Cypress device type, reading the Brand "X" part, then programming the Cypress part, any addressing differences are solved. The locations of data words are swapped to allow for the difference in pin-outs, just as the bits were swapped in the data-line mismatch.

Working with PROM Programmers

Many programmers allow you to read a device different than the part selected, complaining only during programming if the device types do not match. With such a programmer, carrying out the procedures to convert a PROM should not present a problem.

Some programmers, however, do not allow you to read a device if it is different from the part selected. These programmers prevent the conversion method from working. Fortunately, the Cypress CY3000 QuickPro programmer does permit use of the conversion method. Cypress Field Applications Engineers, sales offices, and distributors can use their QuickPro programmers to generate a Cypress master PROM that you can use as a source for copying with uncooperative programmers.

Conversion Example

As an example of a PROM conversion, consider the Motorola 68764 8K x 8 PROM. It has a similar pin-out to the Cypress CY7C264, with the exception of address lines 10, 11, and 12.

Pin   Cypress 7C264   Motorola 68764
21    A10             A12
19    A11             A10
18    A12             A11

Figure 4. Cypress 7C264 vs. Motorola 68764 Pin-out

To program a Cypress CY7C264 to work properly in a socket designed to accept the Motorola device, use this procedure:

1. Invoke the Cypress QuickPro or other appropriate programmer and select the Cypress CY7C264 as the device to be programmed.
2. Place the Motorola part in the programmer adapter socket and read the device. Optionally, write the device contents to a disk file.
3. Place a Cypress CY7C264 in the programmer adapter socket, and program the part. Optionally, you can read the contents of the disk file as the source for programming.

The programmed device now works in the socket designed for the Motorola part.

Introduction to Diagnostic PROMs

This application note provides a basic understanding of the concept of a diagnostic PROM, as well as a brief introduction to possible applications. Beginning with a short tutorial on system diagnostics, this application note presents the reason for incorporating diagnostics into a design and the special testability problems associated with sequential designs. The concept of shadow-register-based diagnostics is presented, and the benefits of this approach are outlined.

Next, a description of diagnostic PROMs is given. This covers the similarity of diagnostic PROMs to standard registered PROMs, as well as the fundamental operation of a diagnostic PROM. Next is a description of the Cypress CY7C268 and CY7C269 8K x 8 diagnostic PROMs. An application example is also included.

Introduction to System Diagnostics

As electronic systems continue to grow in size, function, and complexity, it is becoming increasingly difficult to test them and determine their reliability, as well as to service the end product in the field. One way to simplify the task of testing electronic systems is to design some form of testability into the system.

Controllability and observability are the key points of testability. These two qualities are easily obtained for a combinatorial system in which the outputs are strictly a function of the current inputs. Test vector methods are easily devised and implemented for combinatorial systems. But, for a sequential system, in which the outputs are a function of both the current inputs and the previous state(s), controllability and observability can be lost due to lack of access to the internal states of the machine. Consequently, building testability into a system means being able to control and observe all possible states of the system.

Consider the simple sequential machine in Figure 1. Access to internal states is either denied or difficult to obtain. The obvious way to add testability to this system is to permit access to these internal states. One way to gain this access is through addition of a diagnostic shadow register, as shown in Figure 2. Observability is effected by adding a serial data output path (SDO) to allow shifting internal state information out of the system. Controllability is gained by permitting a serial data input path (SDI) to set the state of the internal registers. As a result, relatively simple test vector methods can be used to test the system.

[Figure 1. Simple Sequential Machine]

[Figure 2. Simple Sequential Machine with Diagnostic Capability]

[Figure 3. Complex Sequential Machine]

Consider, for example, the complex sequential machine shown in Figure 3. This system would be virtually impossible to test in the current configuration because you cannot control or observe the machine's internal states. To increase this machine's testability, observability must be added at points O1, O2, and O3. If this were accomplished, you would be able to observe the internal states of the machine. Additionally, controllability must be added at points C1, C2, and C3. This would allow you to set the internal states of the machine. This controllability and observability can be attained by adding shadow registers, as depicted in Figure 4. The result is a complex sequential machine with a high degree of testability.
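The shadow-register idea above can be sketched in a few lines. This is a toy model under stated assumptions, not a description of any Cypress device: the class `ScanMachine`, its method names, the MSB-first shift order, and the 3-bit counter used as the machine under test are all hypothetical.

```python
class ScanMachine:
    """Toy sequential machine with a diagnostic shadow register."""

    def __init__(self, next_state, width):
        self.next_state = next_state  # next-state function of the machine
        self.width = width            # number of state bits
        self.state = 0                # internal state register
        self.shadow = 0               # diagnostic shadow register

    def clock(self):
        """One normal system clock: advance the internal state."""
        self.state = self.next_state(self.state)

    def shift(self, sdi):
        """Shift one bit in at SDI; return the bit that falls out at SDO."""
        sdo = (self.shadow >> (self.width - 1)) & 1
        mask = (1 << self.width) - 1
        self.shadow = ((self.shadow << 1) | (sdi & 1)) & mask
        return sdo

    def shadow_to_state(self):
        self.state = self.shadow  # controllability: preset the state

    def state_to_shadow(self):
        self.shadow = self.state  # observability: capture the state


def scan_in(machine, value):
    """Serially load a state value, most significant bit first."""
    for i in reversed(range(machine.width)):
        machine.shift((value >> i) & 1)
    machine.shadow_to_state()


def scan_out(machine):
    """Capture the state and shift it out serially."""
    machine.state_to_shadow()
    value = 0
    for _ in range(machine.width):
        value = (value << 1) | machine.shift(0)
    return value


def run_test(machine, start, cycles, expected):
    """Set a state, clock a known number of cycles, compare the result."""
    scan_in(machine, start)
    for _ in range(cycles):
        machine.clock()
    return scan_out(machine) == expected


good = ScanMachine(lambda s: (s + 1) % 8, width=3)
stuck = ScanMachine(lambda s: s if s == 7 else s + 1, width=3)
```

Calling `run_test(good, 5, 4, 1)` presets state 5, clocks four times, and checks for the known-correct state 1. The same vector applied to `stuck`, a counter that fails to roll over, does not match, which is how a fault would be detected.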
As a result of these actions, simple test vector methods can now be used to fully test the machine. For instance, the state of the register at point C1 can be set, the machine can be clocked through some known number of cycles, and the state of the machine can be observed at points O1, O2, and O3. The state the machine should be in at a specific time at each observation point (the machine's "known-correct" state) can then be compared with the observed machine state. This comparison determines if the machine is functioning correctly, and if it is not, which "machine primitive" is not functioning correctly (fault detection).

Note that this approach to sequential design also permits testing to see what the machine would do if a glitch caused a jump into an unused state. This capability makes the design task of forcing the machine back into a known state much less complex.

The real advantage of this approach is that it requires no changes in architecture, requires minimal hardware changes, and results in a minimal (5 - 10 percent) area penalty when integrated into existing integrated circuits.

Diagnostic PROMs

Diagnostic PROMs are a relatively minor migration from standard registered PROMs. A block diagram of a diagnostic PROM appears in Figure 5. The addition of diagnostic capability to a registered PROM includes the addition of:

- Shadow register
- Multiplexer
- MODE pin
- SDI (Serial Data In) pin
- SDO (Serial Data Out) pin
- Diagnostic clock

The shadow register is dynamically configured, based on the value of the mode signal. If the mode is set to input data to the PROM, the shadow register is configured as serial-in, parallel-out; if you want to extract information from the PROM, the shadow register is configured as parallel-in, serial-out. The shadow register thus serves two purposes. First, it can be configured to serially receive state information that will appear at the outputs during the next cycle.
This feature allows you to preset a condition to be sent through the part of the system fed by the PROM; i.e., you can insert state information into the system. This feature adds controllability to the system. The second purpose that the shadow register serves is to allow you to transfer state information from the register and to serially shift that data out of the PROM. This feature adds observability by allowing you to observe the state of the PROM's pipeline register at any given time.

[Figure 4. Complex Sequential Machine with Diagnostic Capability (MODE, SDI, SDO, and DCLK for each "machine primitive")]

Including the features listed above in a registered PROM can therefore add testability to any system. Note that this increase in function is effected without loss of other desirable registered-PROM features, such as programmable initialization, programmable output enable, etc.

[Figure 5. Diagnostic PROM Block Diagram]

Cypress Diagnostic PROMs

Cypress Semiconductor manufactures two diagnostic PROMs: the CY7C268 and CY7C269. These 64K, byte-wide diagnostic PROMs are manufactured in CMOS for an optimum speed/power tradeoff. Both PROMs contain an edge-triggered pipeline register and on-chip diagnostic shift register. Both PROMs can withstand 2001 V ESD. Both PROMs are produced in Cypress's EPROM-based process, which allows testing for 100-percent programmability. Both PROMs are available in PLCC/LCC and dual-inline packages, and both PROMs are available in a windowed package for reprogrammability.

The CY7C268 features full diagnostic capability and is available in 32-lead PLCC/LCC or 32-pin 0.5-inch DIPs. The CY7C269 features limited diagnostic capability and is available in 28-lead PLCC/LCC or 28-pin 0.3-inch DIPs. For an in-depth description of the PROMs' functions, refer to the data sheets. The following discussion briefly describes the diagnostic functions available in each device.

CY7C268

A condensed block diagram of the CY7C268 appears in Figure 6. Table 1 lists the pin names and functions of the CY7C268.

Table 1. CY7C268 Pin Functions

Name     I/O  Function
A0-A12   I    Address Input
O0-O7    O    Data Lines
ENA      I    Synchronous or Asynchronous Output Enable
INIT     I    Asynchronous Initialize
MODE     I    Sets PROM to Operate in Pipelined or Diagnostic Mode
DCLK     I    Diagnostic Clock (Used to Clock the Shadow Register)
PCLK     I    Pipeline Clock (Used to Clock the Output Registers)
SDI      I    Serial Data In (Used to Serially Shift Data into the Diagnostic Register)
SDO      O    Serial Data Out (Used to Serially Shift Data Out of the Diagnostic Register)

[Figure 6. Condensed Block Diagram of the CY7C268]

Note that full diagnostic capability is realized through the use of four control signals: SDI (Serial Data In), SDO (Serial Data Out), MODE, and DCLK (diagnostic clock). Including both DCLK and PCLK ensures that serial data can be shifted into or out of the diagnostic register while the PROM is operating in normal pipeline fashion. As a result, the CY7C268 has three possible modes of operation:

- Normal (pipelined)
- Diagnostic
- Pipelined and diagnostic simultaneously

Table 2 summarizes the operational modes of the CY7C268.

Table 2. CY7C268 Operational Modes

Data Flow Description   Mode  ENA[1]  SDI      SDO  DCLK         PCLK
Normal Operation[1]     L     H,L     Data In  SDO  --           Rising Edge
Shadow to Pipeline[1]   H     H,L     X        SDI  --           Rising Edge
Pipeline to Shadow      H     L       L        SDI  Rising Edge  --
Data In to Shadow       H     H       L        SDI  Rising Edge
Shift Shadow Reg.[1]    L     H,L     Data In  SDI  Rising Edge  --
No Operation[1]         H     H,L     H        SDI  Rising Edge  --

Note:
1. For the asynchronous-enable operation, data out is enabled on the first Low-to-High clock transition after E is brought Low. When E goes from Low to High (enable to disable), the outputs go to the high-impedance state after a propagation delay if the asynchronous enable was programmed. If the synchronous enable was selected, a Low-to-High transition is required.

CY7C269

A condensed block diagram of the CY7C269 appears in Figure 7. The CY7C269 has reduced diagnostic function relative to the CY7C268. The CY7C269 is ideal for applications requiring limited diagnostics with a premium on board-space conservation. This PROM is available in 28-pin, 300-mil DIPs (windowed or opaque) and in 28-lead PLCC/LCC packages. The pin names and functions of the CY7C269 are listed in Table 3.

Table 3. CY7C269 Pin Functions

Name     I/O  Function
A0-A12   I    Address Input
O0-O7    O    Data Lines
E,I      I    Enable or Initialize
Clock    I    Pipeline and Diagnostic Clock
MODE     I    Sets PROM to Operate in Either Diagnostic or Regular Pipelined Mode
SDI      I    Serial Data In
SDO      O    Serial Data Out

[Figure 7. Condensed Block Diagram of the CY7C269]

Note that limited diagnostic capability is realized through inclusion of three diagnostic signals: MODE, SDI, and SDO. Because there is only one clock, the regular and diagnostic modes are mutually exclusive. Table 4 summarizes the operating modes of the CY7C269.

Design Example

As an example of using diagnostic PROMs, consider the complex sequential machine presented earlier. This machine could be easily implemented using CY7C268s or CY7C269s, as shown in Figure 8. Note that the block labeled "diagnostic control" could consist of PLDs, PROMs, a sequencer, or a small microcontroller. Choosing between the CY7C268 and the CY7C269 is based on the complexity of the diagnostic function required.
For full diagnostics that can function simultaneously with regular pipelined operation, use the CY7C268. For an application where limited diagnostic capability is required (perhaps only a function at power-up or some other well-defined time), use the CY7C269.

Table 4. CY7C269 Operating Modes

Data Flow Description    Mode  E,I        Clock        SDI      SDO
Normal Operation         L     - [1],[2]  Rising Edge  X        High Z
Shadow to Pipeline       H     L          Rising Edge  L        SDI
Pipe or Bus to Shadow    H     L          Rising Edge  H        SDI
Shift Shadow             H     H          Rising Edge  Data In  SDO

Notes:
1. The E or I function is selected during programming.
2. If I is selected, the outputs are always enabled. If E is selected, the outputs are enabled synchronously or asynchronously, as programmed.
3. If I is selected, the outputs are always enabled. If E is selected, during diagnostic operation the data outputs remain in the state they were in when the mode was entered. When enabled, the data outputs reflect the outputs of the pipeline register. Any changes in the data in the pipeline register appear on the output pins.

Figure 8.
Complex Sequential Machine Implemented with Cypress Diagnostic PROMs

Interfacing the CY7C289 to the AM29000

This application note describes how to use high-speed Cypress CY7C289 PROMs to design an instruction memory system with virtually zero wait states for a 33-MHz AMD AM29000. The design includes 1 Mbyte of CY7C289 PROMs in addition to the interface circuitry used to support processor bursts. A logic schematic and the equations for the PLDs used in the memory interface are included.

Traditionally, PROMs have been much slower than RAMs. System designers used PROMs only for the boot process, immediately transferring the information into RAMs once power-up was complete. This inefficient solution wasted a considerable amount of board space, but system performance was generally considered more important.

The need for this tradeoff is now evaporating. Cypress PROMs have narrowed the speed gap between RAMs and PROMs to almost nothing. The CY7C289 PROMs use a fast-column-access architecture to produce on-page access times of just 20 ns (for registered mode) at a 512-Kbit (64K x 8) density. This architecture takes advantage of the burst mode feature common in many current microprocessors. Because most 32-bit processors burst just 16 bytes in a wrap-around fashion, the burst mode accesses fall within a single page of the CY7C289 PROMs. Thus, each access in a burst to the PROM is always completed in 20 ns.

Even with a processor that generates bursts considerably longer than 16 bytes, the CY7C289 can supply all the data in a burst from a single page. An excellent example of this capability is the 29000 instruction memory design described in this application note. Even though 29000 bursts can be up to 1 Kbyte long, the memory design described here never requires a wait state during a processor burst. Wait states are only required during an initial access, and the maximum wait in a 33-MHz system is just two clock cycles.

Figure 1 displays a block diagram of the instruction memory system design for the 29000. The design has three basic blocks: the 29000 microprocessor, the control logic, and 1 Mbyte of CY7C289 PROM.

CY7C289 PROMs

The CY7C289 is one of four new 64K x 8 reprogrammable PROMs offered by Cypress Semiconductor. Two of these PROMs, including the CY7C289, feature the unique fast-column-access architecture. On these devices, the PROM array is divided into 1024 pages that are each 64 bytes long. Any consecutive access to the same CY7C289 page requires just 20 ns to complete. If an access crosses an internal PROM page, the device delivers data within 65 ns. To indicate an internal page crossing to the external circuitry, the CY7C289 generates a WAIT\ signal.

Along with the unique array architecture, the CY7C289 provides a variety of programmable features to simplify the memory interface. Among these programmable features is the ability to capture the input address with on-chip registers or latches. If you select the address latch option, the address flows into the PROM during the active portion of the ALE signal and is captured when ALE is deasserted (the ALE signal's polarity is programmable). This option is appropriate for most CISC processors, which supply a valid address after the system clock's rising edge. The ALE option can improve system performance by allowing the PROM to capture the address as soon as it becomes available, as opposed to waiting for the system clock's next rising edge. The drawback to the address latch option is that external logic might be required to generate the ALE signal.

If you select the CY7C289's registered option, the address at the input is captured at the CLK input's rising edge. The advantage of the registered mode is that the memory interface is often simpler. This configuration is particularly useful when interfacing to RISC processors.
Most of these processors generate addresses around the rising edge of a system clock, making it easy to capture the address with the CY7C289 input registers. (See the application note, "Interfacing the CY7C289 to the CY7C601.")

Another important CY7C289 feature is the ability to program the polarity of two chip selects (CS1 and CS2), which facilitates automatic bank selection for up to four banks of PROM. Proper use of the chip selects also allows you to extend the PROM page's length beyond 64 words when using multiple banks of PROM. This capability improves the system's performance by effectively increasing the size of a page in the CY7C289s (more on this later).

[Figure 1. AM29000 Instruction Memory Block Diagram: the AM29000, the control logic (ALE logic, counter, and multiplexer), and four banks of CY7C289 PROM]

Here is a complete list of the programmable features available on the CY7C289:

- The input address can be either registered at CLK's rising edge or latched by the ALE input.
- You can program the polarity of both chip selects (CS1 and CS2).
- You can program the address set-up and hold window.
- You can program the WAIT output's polarity.
- You can program the ALE input's polarity.
- The WAIT output can be generated off the falling or rising edge of CLK for the registered-mode CY7C289.

You can set each of these options by appropriately programming a reserved PROM location. Therefore, the devices are configured at the same time the array is programmed.

AM29000 Microprocessor

The 29000 is a 32-bit general-purpose microprocessor used mainly in embedded controller applications. The version used in this design operates at 33 MHz, the highest-speed 29000 currently available. The processor's pipelined RISC architecture attempts to execute an instruction in every clock cycle. To do this, the 29000 relies heavily on burst-mode accesses.

The 29000 contains three buses, one each for address, data, and instruction. During a normal access, the three-bus architecture behaves essentially like a two-bus system (address and data), because the dedicated instruction and data buses must wait for the shared address bus. In burst mode, however, only the initial data or instruction address is sent to the corresponding memory space, and the task of incrementing the address is left to the interface circuitry. Thus, during burst, both the data and instruction buses can operate simultaneously without having to wait for the shared address bus. In other words, a 29000 in burst mode can fully utilize the separate data and instruction bus architecture.

Although the 29000 achieves maximum throughput during bursts, AMD did impose a limit on a burst's length: The 29000 only performs bursts within a 1-Kbyte boundary. Therefore, an 8-bit counter suffices to increment the burst addresses. Note that a 29000's maximum burst length depends on where it begins. For example, if a burst begins four words before the end of a 1-Kbyte boundary, the burst can at most be four words long.

Control Logic

The memory interface logic required for this design is detailed in Figure 2 and appears symbolically in Figure 3. In addition, PLD ToolKit source code for the PLDs used in the control logic appears in Appendices A through D.

[Figure 2. Control Logic for the 29000 Instruction Memory]

Referring to the block diagram in Figure 1, note that the interface circuitry performs two primary functions. One is to generate all necessary interface signals, and the other is to increment the instruction address to support processor bursts. The hardware required to implement the interface consists of two SSI devices (a 74F74 and a 74F112) and four small PLDs. The 22V10 PLD is a 15-ns version, while the remaining three PLDs (one 16R4 and two 16L8s) have a maximum propagation delay of 5 ns.

PROM Configuration

In this application, 16 CY7C289 PROMs constitute a 1-Mbyte instruction memory, distributed in four banks. The CY7C289s' 22-ns access time (in latch mode) allows on-page accesses to complete in a single clock cycle at 33 MHz. Proper use of the programmable chip selects ensures that all burst accesses fall within the same PROM page and never require a processor wait cycle.
The CY7C289s are configured with address latches to take full advantage of the 29000's mid-clock address release. Latch mode minimizes the number of wait cycles during a single access or during a burst's first access. The set-up and hold window for the address latches should be programmed to minimize the hold time required after latch close. This setting is critical to proper operation of the address increment circuitry.

The CY7C289s' chip selects are programmed on a bank-to-bank basis, such that each bank has a unique polarity combination of CS1 and CS2. This arrangement permits PROM bank selection without external address decoding. The other applicable programmable features on the CY7C289 are the polarity of the WAIT\ and ALE signals. In the design implemented here, WAIT\ is active Low and ALE is active High.

Memory Interface

The 29000 has a few peculiarities that affect the memory system design. For example, the instruction bus is unidirectional. The 29000 can only READ from instruction memory. This limitation makes it difficult to use RAMs for instruction memory, because there is no mechanism to load the instructions into the RAMs to begin with, but the nonvolatile nature of PROMs makes them ideal for this application.

One way to use RAM for the instruction memory is to trick the 29000 into thinking it is writing to data memory (the data bus is bidirectional), but route the information back to the RAMs in instruction memory. Implementing this memory subsystem requires two 32-bit 2:1 multiplexers on the data and instruction buses, in addition to the associated glue logic necessary to control the transfer. To use the memory subsystem, the system copies the instruction information (from boot PROMs located elsewhere on the board) onto the data bus and subsequently into the RAMs on the instruction side of memory. This solution is costly, wastes board space, and slows system operation by adding multiplexer delays into both the instruction- and data-bus paths.
A much better solution is to use PROMs in the instruction memory. Because they are nonvolatile, the instruction information is programmed into the device prior to assembling the system, eliminating the extensive logic needed to write to the instruction bus. Further, with the CY7C289's high speed, the system has no need for shadow RAMs. The resulting circuit occupies much less board space than the RAM-based version and provides better system performance. Moreover, lessening the number of components improves the circuit's reliability.

Another unusual 29000 feature is the processor's ability to suspend bursts to instruction memory. At any time during an instruction burst, the 29000 can suspend the sequence by deasserting the burst request signal (IBREQ\). The instruction memory must respond by discontinuing its operation while the IBREQ\ signal is inactive. When the processor reasserts IBREQ\, the memory system must resume from the point at which the burst was suspended. Note that because the 29000 does not send a new address at this point, the interface logic has to remember the address at which the processor suspended the burst. An instruction burst is not complete until the 29000 asserts the instruction request signal (IREQ\) and sends a new address. The interface logic described in this design fully supports suspended bursts.

Address Connection Scheme

For the most part, the placement of the CY7C289 PROMs in this design is straightforward. However, there are two important memory design features that bear clarification. The first is the address connection scheme used for the CY7C289 PROMs. In Figure 3's display of the address input to the CY7C289s, notice that the addresses fed to the PROMs are not entirely sequential. This non-sequential addressing scheme is used with the chip selects to extend the effective PROM page length to 1 Kbyte, and thus achieve no-wait-state burst performance. To understand how this is done, consider some internal details of the CY7C289.
In this PROM, the lowest six address inputs (A0 - A5) designate a specific byte within a 64-byte internal PROM page. Inputs A6 - A15 select one of 1024 PROM pages. When any of the inputs at pins A6 - A15 changes, a new page is selected, and the CY7C289 asserts the WAIT\ output.

You can think of the CY7C289's chip selects (CS1 and CS2) as additional address inputs in a multi-bank memory system. Like A0 - A5, changes at these inputs do not result in an internal CY7C289 page change. With four banks of PROM, you have a total of 8 address bits (A0 - A5, CS1, and CS2) that do not affect the internal PROM page, as opposed to just 6 (A0 - A5) when using one bank of PROM. The 8 bits of on-page addresses translate into a PROM page length of 256 words, or 1 Kbyte, which equals the 29000's maximum possible burst.

The schematic in Figure 3 reveals how this page-lengthening scheme is implemented. Note that all the outputs from the control logic (O2 - O9) connect to CY7C289 inputs that do not cause a page change (A0 - A5, CS1, and CS2). The lowest address connected directly from the CPU to the PROMs is A10. The 29000 is guaranteed never to change A10 - A31 during a burst, because this would constitute crossing a 1-Kbyte boundary. All the addresses that can change during a burst connect to A0 - A5, CS1, and CS2; thus, the CY7C289 never crosses an internal page, and never causes a wait state, during a 29000 burst.

Figure 3. AM29000 Instruction Memory Design

The chip selects in this design effectively quadruple the PROM page length, allowing a greater percentage of single accesses and all burst accesses to finish within a single clock cycle. To make the extended page useful, note that you need to locate sequential code on the same PROM page. Because this design extends each PROM page across all four banks, you must segment code into page-length blocks; this is analogous to using interleaved DRAMs.
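The address split behind the extended page can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original design files: the function names are hypothetical, banks are numbered 0 - 3 rather than 1 - 4, and routing the two highest on-page bits (CPU A8 and A9) to CS1 and CS2 is an assumption chosen to match the rule that successive 64-word blocks go to successive banks.

```python
WORDS_PER_BANK_PAGE = 64   # one 64-byte internal page per byte-wide PROM
BANKS = 4                  # selected through CS1/CS2


def map_address(byte_addr):
    """Split a CPU byte address into (bank, prom_page, column).

    column    -> PROM inputs A0 - A5   (CPU A2 - A7)
    bank      -> chip selects CS1/CS2  (CPU A8 - A9, assumed ordering)
    prom_page -> PROM inputs A6 - A15  (CPU A10 - A19)
    """
    word = byte_addr >> 2              # CPU A0 - A1 pick a byte within the word
    column = word & 0x3F
    bank = (word >> 6) & 0x3
    prom_page = (word >> 8) & 0x3FF
    return bank, prom_page, column


def segment_code(words):
    """Deal consecutive 64-word blocks out to banks 0, 1, 2, 3, 0, ..."""
    images = [[] for _ in range(BANKS)]
    for start in range(0, len(words), WORDS_PER_BANK_PAGE):
        bank = (start // WORDS_PER_BANK_PAGE) % BANKS
        images[bank].extend(words[start:start + WORDS_PER_BANK_PAGE])
    return images
```

With this split, all 256 words that share the same A10-and-above address bits land on the same internal PROM page in every bank, which is why only an address change above A9 can trigger WAIT\.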
Because each CY7C289 PROM has a 64-byte internal page, your code must be separated into 64-word blocks. In other words, place the first 64 words of code in bank 1, the next 64 words in bank 2, and so on. You can accomplish this segmentation with a simple program.

The ALE input controls the input of addresses to the CY7C289s. The CY7C289's latch mode takes full advantage of the 29000's mid-clock address release and minimizes wait states during the initial access. The drawback to latch mode is that an ALE signal must be generated externally. In this design, the 16R4 creates ALE based on the input clock, IRDY\, WAIT\, and IBREQ\. The remaining signals generated by the 16R4 control the burst counter logic implemented in the 22V10 and the 16L8s.

Note that most of the logic displayed in Figure 2 is required regardless of the memory device you choose. Implementing the burst-counter interface to the 29000 requires the 22V10, both 16L8s, and a portion of the 16R4. Thus, only the two SSI components and part of one 16R4 are needed to create the appropriate communication signals between the 29000 and the CY7C289 PROMs.

Using the WAIT\ Signal

The second memory design issue that bears clarification is the connection of the CY7C289's WAIT\ signal. The CY7C289 asserts this signal when the input address crosses an internal page boundary (at least one of the inputs A6 - A15 changes). WAIT\ tells the 29000 that the PROMs need an additional clock cycle to deliver the requested instruction.

Note in the schematic in Figure 3 that only one WAIT\ output connects to the control logic. This is because all the PROMs examine the same upper-order address inputs to determine if an internal page has been crossed. Therefore, only one PROM is required to identify a page crossing and assert the WAIT\ signal, even if the chip selects (CS1 and CS2) are deasserted at the time.
The only time the PROM does not generate WAIT\ is when the chip enable signal (CE\) is inactive during an address change.

System Timing

Figures 4 and 5 illustrate the communication between processor and memory that supports burst mode and inserts wait states. The 29000 generates the instruction request (IREQ\) and instruction burst request (IBREQ\) signals to initiate instruction accesses. To begin an access, the 29000 asserts IREQ\ and places a valid address on the address bus a maximum of 12 ns after CLK's rising edge. If this is the beginning of an instruction burst, the processor asserts IBREQ\ no more than 10 ns after the system clock's falling edge.

At each subsequent rising edge of CLK, the 29000 samples the instruction ready (IRDY\) input before reading data. Therefore, by deasserting IRDY\, the external memory system can hold the CPU until an access is completed. When the access is finished, the memory system must assert IRDY\ at least 10 ns before CLK's next rising edge. The data must appear on the bus at least 4 ns before CLK's rising edge.

Burst Counter

The 22V10 and the two 16L8s support the 29000's burst mode capability. The 22V10 implements an 8-bit loadable counter, which loads a new address from the 29000 when the IREQ\ signal is asserted. On each subsequent clock rise in which IBREQ\ is active, the 22V10 increments the current address and delivers the result to a multiplexer. Note that the clock to the 22V10 is not the system clock. The 16R4 generates a special counter clock that properly times the loading of the counter and halts the count during a suspended burst.

The pair of 16L8s are utilized primarily as two high-speed multiplexers. Each 16L8 implements a 4-bit 2:1 multiplexer that selects the instruction address from the 29000 or the counter. During an initial access (IREQ\ Low), the 16L8s feed the processor address to the instruction memory. When the 29000 is bursting, the counter address is routed to the PROMs.
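The counter and multiplexer behavior described above can be modeled in Python as a rough behavioral sketch. It is not cycle-accurate and does not model the counter clock produced by the 16R4. The class and function names are hypothetical. The `mux_o_bit` function transcribes the six product terms listed for one output bit in Appendix B, assuming each name stands for the logic level on the corresponding pin (True = High), and `mux_reference` restates the same logic as an explicit 2:1 selector.

```python
from itertools import product


class BurstCounter:
    """8-bit loadable counter: load on IREQ\\, count on IBREQ\\, else hold."""

    def __init__(self):
        self.count = 0

    def clock(self, cpu_addr, ireq_asserted, ibreq_asserted):
        if ireq_asserted:                 # new access: take the CPU address
            self.count = cpu_addr & 0xFF
        elif ibreq_asserted:              # burst in progress: next word
            self.count = (self.count + 1) & 0xFF
        return self.count                 # suspended burst: value is held


def mux_o_bit(reset, ireq, kill, ale, a, c):
    """One 16L8 output bit; the equation defines when the output is Low."""
    low = ((not reset and not a)
           or (not ireq and not kill and ale and not a)
           or (reset and not ale and not c)
           or (reset and kill and not c)
           or (reset and ireq and not c)
           or (not a and not c))
    return not low


def mux_reference(reset, ireq, kill, ale, a, c):
    """Same function as a 2:1 multiplexer between CPU bit a and counter bit c."""
    select_cpu = (not reset) or (not ireq and not kill and ale)
    return a if select_cpu else c


# Exhaustive comparison over all 64 input combinations.
EQUIVALENT = all(mux_o_bit(*v) == mux_reference(*v)
                 for v in product([False, True], repeat=6))
```

The final product term (`not a and not c`) does not change the truth table; it is the consensus term of the two selected paths, which typically keeps the output stable while the select inputs switch.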
Signal Generation

The primary function of the remaining interface logic (the 16R4, 74F74, and 74F112) is to generate the necessary system control signals. These signals include the instruction ready signal (IRDY\) to the 29000 and the address latch enable signal (ALE) for the PROMs.

The IRDY\ input to the 29000 halts the processor when accessing slower memory. Based on the WAIT\ output from the CY7C289s, the interface circuitry deasserts IRDY\ when the PROM requires more time to complete an access. Because this design never requires a wait during a burst access, the control logic simply holds IRDY\ Low while a burst is in progress. For the PROM interface, IRDY\ is only used during a single access or during a burst's initial access.

No-Wait Timing

The control logic in this design generates IRDY\ based on the WAIT\ output from the CY7C289 PROMs and the IREQ\ and IBREQ\ signals from the processor. During a single access or a burst's first access, the interface automatically inserts one wait cycle due to the 29000's late delivery of valid address; completing a single access without a wait state would require a 12-ns PROM access time. The interface inserts the wait state by deasserting IRDY\ in the cycle in which IREQ\ was asserted. In the scenario illustrated in Figure 4, this access falls on the same page as the previous PROM access and therefore does not generate a WAIT\. The interface logic asserts IRDY\ in the following cycle, and data is delivered prior to CLK's next rising edge. The CY7C289 PROMs' 22-ns on-page access time is well within the 44-ns window that results from the single inserted wait state.

Figure 4. Instruction Memory Timing (WAIT deasserted). Signals shown: CPUCLK, ADR, IREQ, IBREQ, IRDY, DATA, ALE, WAIT.

Once the initial access is delivered, the memory can complete each burst access within a single cycle. The control logic therefore keeps IRDY\ asserted as long as the IBREQ\ signal from the CPU is active. In Figure 4, note that the 29000 temporarily deasserts IBREQ\, which is the method the processor uses to suspend an instruction burst. In response, the instruction memory suspends data delivery until IBREQ\ is reasserted. When IBREQ\ reasserts, the data is delivered from the point at which the burst was suspended, as illustrated in the timing diagram in Figure 4.

To govern the operation of the instruction PROMs, the control logic generates the address latch enable signal (ALE), also shown in Figure 4. In this design, the ALE input is programmed as active High. Thus, when the ALE input is active, the latch is transparent, and the address at the input flows into the PROM. On the transition of ALE from High to Low, the PROMs latch the address and ignore further changes to the address while ALE is Low.

In this design, the ALE input remains active (open) until a burst sequence begins. During a burst, the ALE signal advances the counter and controls the loading of the counter address into the PROM. Because ALE's falling edge increments the count, the PROM's address inputs change only after the address latch closes.

Note in the schematic in Figure 3 that the 16R4 generates the clock input to the AM29000. This clock arrangement ensures that the ALE and CPUCLK signals track each other and are as closely synchronized as possible.

PROM Wait Timing

If WAIT\ is asserted during a single access or during the initial access of a burst, the control logic inserts one additional wait cycle (Figure 5). This wait cycle occurs if a PROM address crosses a page boundary; the WAIT\ signal is then asserted a maximum of 21 ns after the address is loaded. The control logic displayed in Figure 2 uses this WAIT\ output's falling edge to send an additional wait signal to the 29000. This wait signal is created by keeping the IRDY\ signal High for one additional cycle. As shown in Figure 5, this added wait provides a total of 74 ns for the PROM to complete the access. An access that involves crossing an internal PROM page actually requires only 65 ns. Note once again that after the initial data has been delivered, all subsequent burst accesses are delivered within a single clock cycle.

Figure 5. Instruction Memory Timing (WAIT asserted). Signals shown: CPUCLK, ADR, IBREQ, IRDY, DATA, ALE, WAIT.

Appendix A. PLD Toolkit Source Code for the 16R4

C16R4;
{ Norman Taffe
  Cypress Semiconductor
  April 23, 1990
  Control Logic for CY7C289 PROM interface to the AMD 29000. }
CONFIGURE;
{inputs}
CLK, CLKIN, RESET, IREQ, WAIT, INLOAD, WAITOUT, DIBREQ, KILL, OE(node=11),
{outputs}
IRDY(node=12), ALE, PRESET, CLRJK, DKILL, DIREQ, COUNTCLK, CPUCLK,
EQUATIONS;
!PRESET = <sum> !INLOAD & !RESET;
!DIREQ = <sum> !IREQ & !RESET;
!ALE = <oe> <sum> !DIREQ & !WAITOUT & !CLKIN & !RESET
     # !DIBREQ & !WAITOUT & !CLKIN & !RESET;
!CPUCLK = <oe> <sum> !CLKIN & !CLKIN & !CLKIN & !CLKIN
     # !CLKIN & !CLKIN & !CLKIN & !CLKIN;
!IRDY = <oe> <sum> WAIT & !WAITOUT & !DIBREQ & !RESET
     # WAIT & !WAITOUT & !PRESET & !RESET
     # !WAITOUT & !DIBREQ & !DKILL & !RESET
     # !WAITOUT & !PRESET & !DKILL & !RESET
     # !CLRJK & !RESET;
!CLRJK = <sum> !PRESET & WAITOUT & !RESET;
!COUNTCLK = <sum> !WAITOUT & DIBREQ & PRESET & CLRJK & DIREQ
     # CLKIN & !WAITOUT & PRESET & CLRJK;
!DKILL = <sum> !CLRJK & !RESET;

Appendix B. PLD Toolkit Source Code for the Upper 16L8

C16L8;
{ Norman Taffe
  Cypress Semiconductor
  April 23, 1990
  Control Logic for 285/9 PROM interface to AMD 29000.
}
CONFIGURE;
{inputs}
RESET, IREQ, KILL, ALE, A2, A3, A4, A5, C2, C3(node=11), DIBREQ(node=16), C4, C5,
{outputs}
O2(node=12), O3, O4, O5, CE(node=19),
EQUATIONS;
!O2 = <oe> <sum> !RESET & !A2
     # !IREQ & !KILL & ALE & !A2
     # RESET & !ALE & !C2
     # RESET & KILL & !C2
     # RESET & IREQ & !C2
     # !A2 & !C2;
!O3 = <oe> <sum> !RESET & !A3
     # !IREQ & !KILL & ALE & !A3
     # RESET & !ALE & !C3
     # RESET & KILL & !C3
     # RESET & IREQ & !C3
     # !A3 & !C3;
!O4 = <oe> <sum> !RESET & !A4
     # !IREQ & !KILL & ALE & !A4
     # RESET & !ALE & !C4
     # RESET & KILL & !C4
     # RESET & IREQ & !C4
     # !A4 & !C4;
!O5 = <oe> <sum> !RESET & !A5
     # !IREQ & !KILL & ALE & !A5
     # RESET & !ALE & !C5
     # RESET & KILL & !C5
     # RESET & IREQ & !C5
     # !A5 & !C5;
!CE = <oe> <sum> !IREQ # !DIBREQ;

Appendix C. PLD Toolkit Source Code for the Lower 16L8

C16L8;
{ Norman Taffe
  Cypress Semiconductor
  April 23, 1990
  Control Logic for 285/9 PROM interface to AMD 29000. }
CONFIGURE;
{inputs}
RESET, IREQ, KILL, ALE, A6, A7, A8, A9, C6, C7(node=11), C8(node=17), C9,
{outputs}
O6(node=12), O7, O8, O9, ALEBAR,
EQUATIONS;
!O6 = <oe> <sum> !RESET & !A6
     # !IREQ & !KILL & ALE & !A6
     # RESET & !ALE & !C6
     # RESET & KILL & !C6
     # RESET & IREQ & !C6
     # !A6 & !C6;
!O7 = <oe> <sum> !RESET & !A7
     # !IREQ & !KILL & ALE & !A7
     # RESET & !ALE & !C7
     # RESET & KILL & !C7
     # RESET & IREQ & !C7
     # !A7 & !C7;
!O8 = <oe> <sum> !RESET & !A8
     # !IREQ & !KILL & ALE & !A8
     # RESET & !ALE & !C8
     # RESET & KILL & !C8
     # RESET & IREQ & !C8
     # !A8 & !C8;
!O9 = <oe> <sum> !RESET & !A9
     # !IREQ & !KILL & ALE & !A9
     # RESET & !ALE & !C9
     # RESET & KILL & !C9
     # RESET & IREQ & !C9
     # !A9 & !C9;
!ALEBAR = <oe> <sum> ALE;

Appendix D. PLD Toolkit Source Code for the 22V10

C22V10;
{ Norman Taffe
  Cypress Semiconductor
  April 25, 1990
  8-bit counter for AMD29000 PROM interface.
}
CONFIGURE;
{inputs}
CLK, A2, A3, A4, A5, A6, A7, A8, A9, KILL, IREQ,
{outputs}
O9(node=14), O8, O7, O6, O5, O4, O3, O2, Q1(noreg),
EQUATIONS;
!Q1 := <sum> !KILL & !IREQ;
!O9 := <oe> <sum> O2 & O3 & O4 & O5 & O9 & O8 & O7 & O6 & Q1
     # !O2 & !O9 & Q1
     # !O3 & !O9 & Q1
     # !O4 & !O9 & Q1
     # !O5 & !O9 & Q1
     # !O9 & !O6 & Q1
     # !O9 & !O7 & Q1
     # !O9 & !O8 & Q1
     # A2 & A3 & A4 & A5 & A6 & A7 & A8 & A9 & !Q1
     # !A2 & !A9 & !Q1
     # !A3 & !A9 & !Q1
     # !A4 & !A9 & !Q1
     # !A5 & !A9 & !Q1
     # !A6 & !A9 & !Q1
     # !A7 & !A9 & !Q1
     # !A8 & !A9 & !Q1;
!O8 := <oe> <sum> O2 & O3 & O4 & O5 & O8 & O7 & O6 & Q1
     # !O2 & !O8 & Q1
     # !O3 & !O8 & Q1
     # !O4 & !O8 & Q1
     # !O5 & !O8 & Q1
     # !O8 & !O6 & Q1
     # !O8 & !O7 & Q1
     # A2 & A3 & A4 & A5 & A6 & A7 & A8 & !Q1
     # !A2 & !A8 & !Q1
     # !A3 & !A8 & !Q1
     # !A4 & !A8 & !Q1
     # !A5 & !A8 & !Q1
     # !A6 & !A8 & !Q1
     # !A7 & !A8 & !Q1;
!O7 := <oe> <sum> O2 & O3 & O4 & O5 & O7 & O6 & Q1
     # !O2 & !O7 & Q1
     # !O3 & !O7 & Q1
     # !O4 & !O7 & Q1
     # !O5 & !O7 & Q1
     # !O7 & !O6 & Q1
     # A2 & A3 & A4 & A5 & A6 & A7 & !Q1
     # !A2 & !A7 & !Q1
     # !A3 & !A7 & !Q1
     # !A4 & !A7 & !Q1
     # !A5 & !A7 & !Q1
     # !A6 & !A7 & !Q1;
!O6 := <oe> <sum> O2 & O3 & O4 & O5 & O6 & Q1
     # !O2 & !O6 & Q1
     # !O3 & !O6 & Q1
     # !O4 & !O6 & Q1
     # !O5 & !O6 & Q1
     # A2 & A3 & A4 & A5 & A6 & !Q1
     # !A2 & !A6 & !Q1
     # !A3 & !A6 & !Q1
     # !A4 & !A6 & !Q1
     # !A5 & !A6 & !Q1;
!O5 := <oe> <sum> O2 & O3 & O4 & O5 & Q1
     # !O2 & !O5 & Q1
     # !O3 & !O5 & Q1
     # !O4 & !O5 & Q1
     # A2 & A3 & A4 & A5 & !Q1
     # !A2 & !A5 & !Q1
     # !A3 & !A5 & !Q1
     # !A4 & !A5 & !Q1;
!O4 := <oe> <sum> O2 & O3 & O4 & Q1
     # !O2 & !O4 & Q1
     # !O3 & !O4 & Q1
     # A2 & A3 & A4 & !Q1
     # !A2 & !A4 & !Q1
     # !A3 & !A4 & !Q1;
!O3 := <oe> <sum> O2 & O3 & Q1
     # !O2 & !O3 & Q1
     # A2 & A3 & !Q1
     # !A2 & !A3 & !Q1;
!O2 := <oe> <sum> O2 & Q1
     # A2 & !Q1;

Interfacing the CY7C289 to the CY7C601

This application note describes how to use high-speed CY7C289 PROMs to design an instruction memory for a 40-MHz CY7C601 RISC processor. The design features 1 Mbyte of PROM and requires no interface circuitry. Utilizing a unique fast-column-access architecture, the CY7C289 supplies data in a 40-MHz system with only occasional wait states. A schematic of the design is included at the end of this application note.

Because microprocessor performance improvements have outpaced access-time advances in high-density memory devices, system designers have resorted to memory interleaving and high-speed SRAM caches to more fully utilize a processor's performance capability. In embedded control applications, the alternative has been to compromise system performance by slowing every processor access to PROM memory with wait states or by using PROMs only for the boot process and running instruction code from SRAMs.

The necessity for faster, nonvolatile memory in high-performance embedded applications has prompted Cypress to design high-speed PROMs that you can easily interface to a variety of microprocessors. Using the CY7C289, high-speed embedded applications can run code directly from PROM and eliminate the extra board space, cost, and logic required to transfer code into "shadow" RAMs.

To achieve this level of performance, the CY7C289 PROMs employ an innovative architecture that accentuates local speed. The memory array is split into 64-byte pages that allow on-page access times of just 20 ns in a 512-kbit (64K x 8) PROM. This performance equals that of the fastest static RAMs at similar densities. SRAM-like performance, combined with the non-volatility of EPROM technology, makes these devices ideal for high-performance embedded control applications.

Another important CY7C289 feature is the availability of on-chip address registers. The CY7C601 memory design presented in this application note is an example of the address registers' usefulness. Like many RISC architectures, the CY7C601 delivers its address and memory signals unlatched prior to the system clock's rising edge. Ordinarily, you must latch these signals externally with several 74F74s or the like. However, the CY7C289's on-chip registers capture the address bits at the system clock's rising edge. This feature, together with the CY7C289's automatic WAIT-signal generation, allows for a straightforward connection between the memory and the processor.

Figure 1 displays a block diagram of the instruction memory system design for the CY7C601. As the diagram shows, the design has only two major components: the CY7C601 32-Bit RISC Processor and one Mbyte of CY7C289 PROM.

CY7C289 PROMs

The CY7C289 is part of a high-density (512K), high-speed CMOS PROM family offered by Cypress Semiconductor. The CY7C289, along with another of the family members, features a unique fast-column-access architecture. The PROM array is arranged into 1024 pages, each 64 bytes long. Consecutive accesses to the same page require only 20 ns to complete. When an access crosses a page within the PROM, the data is delivered in 65 ns. The CY7C289 generates a WAIT signal to alert external circuitry of an off-page access.

The CY7C289 emphasizes fast local accesses within a 64-byte page. The principle behind the CY7C289 derives from a statistical approach to performance improvement. Many microprocessors linearize memory access requests because of on-chip cache burst-fill modes or instruction pre-fetch queues, in effect localizing the instruction fetch sequences. In the CY7C289, Cypress uses the fast-column-access architecture to improve local performance and take advantage of instruction stream linearity and locality. Fast access is possible when consecutive PROM retrievals are within the current page. When a memory cycle requests data that is not on the current page, the chip must power up the correct page.
Because processor code tends to be linear in nature, though, PROM accesses usually fall on the same PROM page and therefore require only 20 ns to complete.

Figure 1. Block Diagram of CY7C601 Memory Design. Labels shown: CY7C601 (D0 - D31, MHOLD, MDS, A2 - A9, A10 - A31) and CY7C289 PROM, 4 banks (D0 - D31, A0 - A5, CS1, CS2, A6 - A15, WAIT).

Along with the unique array architecture, the CY7C289 simplifies system design by providing the on-chip logic necessary to generate a WAIT signal. This signal is used to automatically insert microprocessor wait states during an off-page access.

To simplify the memory interface with a variety of microprocessors, the CY7C289 contains a rich set of programmable features. For example, you can latch the input address with the ALE input or register the address at CLK's rising edge. The CY7C289 provides a programmable bit to select between latched and registered address inputs. The default is registered inputs, which samples the address on CLK's rising edge and captures the address in the address register. This configuration suits most RISC processors, which generate addresses around the system clock's rising edge.

When in LATCH mode while the ALE pin is active, the PROM recognizes any address changes and latches the address into the address registers on the user-defined edge of ALE. This option is particularly useful when interfacing with CISC processors (see Reference). Most CISC processors generate a valid address some time following the system clock's rising edge. Instead of waiting for the next rising clock edge (and sacrificing performance), you can capture the address immediately using the ALE input. The drawback to LATCH mode is that it might require external interface circuitry. If you do select the ALE function, you can define the ALE signal's polarity, with the default being positive.

To eliminate external bank decoders, the CY7C289 includes two programmable chip selects (CS1 and CS2).
The polarity of these inputs is user programmable, facilitating automatic bank selection of up to four banks of PROM.

The programmable chip selects provide an additional advantage for multibank PROM designs. If you arrange them correctly, you can effectively extend the length of the CY7C289 pages from 64 to as many as 256 words. This extension improves system performance by increasing the likelihood of on-page PROM accesses (more on this feature later).

The CY7C289 includes these programmable features:

1. You can either register the input address at CLK's rising edge or latch the address using the ALE input.
2. You can program the address set-up and hold window.
3. You can program the WAIT output's polarity.
4. You can program the ALE input's polarity.
5. You can generate the WAIT output from CLK's falling or rising edge for the registered-mode CY7C289.
6. You can program the polarity of both chip selects (CS1 and CS2).

Each of these options is set by appropriately programming a reserved PROM location. Therefore, the devices are configured at the same time the array is programmed.

PROM Configuration

In this application, four banks (16 CY7C289s) of PROM are used to provide 1 Mbyte of memory. Like most RISC architectures, the CY7C601 sends out valid address information immediately preceding a rising clock edge (and removes it soon afterward). Thus, the CY7C289s are configured in registered mode. The on-chip address registers capture the input at CLK's rising edge and ignore all unclocked address changes.

The chip selects on the CY7C289s are programmed on a bank-to-bank basis. Each bank is programmed with a unique polarity combination of CS1 and CS2 to permit PROM bank selection without external address decoding.

The other programmable features relevant to this design involve the CY7C289's WAIT signal. For compatibility with the CY7C601, the WAIT signal should be active Low and generated with respect to CLK's falling edge.
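The decoder-free bank selection described above can be illustrated with a small Python sketch. The polarity table below is hypothetical: the text only requires that each bank be programmed with a unique CS1/CS2 polarity combination, so the particular assignment, the bank numbering, and the function name are assumptions made for illustration.

```python
from itertools import product

# Hypothetical programming: the (CS1, CS2) input levels that enable each bank.
BANK_POLARITY = {
    0: (0, 0),
    1: (1, 0),
    2: (0, 1),
    3: (1, 1),
}


def selected_banks(cs1, cs2):
    """Return every bank whose programmed polarity matches the CS inputs."""
    return [bank for bank, polarity in BANK_POLARITY.items()
            if polarity == (cs1, cs2)]


# Because the four combinations are unique, exactly one bank responds to
# each pair of levels, with no external decoder between the CPU and PROMs.
ONE_BANK_PER_CODE = all(len(selected_banks(cs1, cs2)) == 1
                        for cs1, cs2 in product((0, 1), repeat=2))
```

Wiring the same two address lines to CS1 and CS2 of every bank is then sufficient; the polarity programmed into each device performs the decoding.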
CY7C601 Microprocessor

The CY7C601 is a 32-bit general-purpose microprocessor that offers extremely high performance for embedded controller applications. The system described in this application note, for example, operates at 40 MHz. The CY7C601 is Cypress's CMOS implementation of Sun Microsystems' SPARC (Scalable Processor Architecture). This architecture achieves 29 MIPS by executing most instructions in a single clock cycle.

A CY7C601 architectural feature that affects the memory interface is an internal pipeline. To achieve an instruction execution rate approaching one instruction per clock cycle, the CY7C601 uses a four-stage instruction pipeline. All four stages operate in parallel, working on up to four different instructions at a time. The stages are:

1. Fetch: The processor sends out the instruction address to fetch an instruction.
2. Decode: The instruction is placed in the instruction register and decoded. The processor reads the operands from the register file and computes the next instruction address.
3. Execute: The processor executes the instruction and saves the results in temporary registers.
4. Write: The processor writes the result to the destination register.

A basic single-cycle instruction enters the pipeline and completes four cycles later. Normally, once the pipeline is full, an instruction is executed during every clock cycle. The existence of the instruction pipeline affects the memory interface (as described in the System Timing section of this application note). Otherwise, the memory interface design is straightforward.

PROM Interface

Because this design involves no glue logic, the CY7C289 PROM's circuit connections are relatively straightforward. The CY7C601 communicates with external memory via a 32-bit address bus and a 32-bit data/instruction bus. Note, in Figure 2, however, that the addresses fed to the PROMs are not entirely sequential. The reason for the nonsequential addresses lies in the way the CY7C289 is organized.

To improve the system's performance, the CY7C289 chip selects (CS1 and CS2) are used to extend the effective PROM page length to 256 32-bit words (1 Kbyte). To understand how this is done, consider that the CY7C289's lowest six address inputs (A0 - A5) designate a specific byte within a 64-byte internal PROM page. The CY7C289 uses inputs A6 - A15 to select one of 1024 PROM pages. When any of the inputs at pins A6 - A15 changes, a new page is selected and the CY7C289 asserts the WAIT\ output.

You can think of the CY7C289's chip selects as additional address inputs in a multibank memory system. As with A0 - A5, changes at the chip select inputs do not result in an internal page change. With four banks of PROM, you have a total of 8 address bits (A0 - A5, CS1, CS2) that do not affect the internal PROM page, as opposed to just 6 (A0 - A5) when using one bank of PROM. The 8 bits of on-page addresses translate into a PROM page length of 256 words or 1 Kbyte.

The schematic in Figure 2 reveals how this page-lengthening scheme is implemented. Note that the lowest 8 address bits from the CPU (A2 - A9) connect to the CY7C289 inputs that do not cause a page change (A0 - A5, CS1, CS2). The lowest address that connects directly from the CPU to the PROMs is A10. The chip selects in this design have effectively quadrupled the PROM page length, allowing a greater percentage of PROM accesses to complete within a single clock cycle.

Figure 2. CY7C601 Memory Design

Note that the extended-page-length feature of this design affects the software that runs on the system. To make the extended page useful, sequential code needs to be located on the same PROM page. In this design, where each PROM page extends across all four banks, code must be segmented into page-length blocks. This situation is analogous to interleaving DRAMs. Because each CY7C289 PROM has a 64-byte internal page, the user's code must be separated into 64-word blocks.
In other words, place the first 64 words of code in bank 1, the next 64 words in bank 2, and so on. A simple program can accomplish this segmentation.

Another design issue that bears clarification is the connection of the WAIT\ signal generated by the CY7C289. This signal is asserted when the input address crosses an internal page boundary on the PROMs. WAIT\ connects directly to the CPU's Memory Hold A (MHOLDA\) and Memory Data Strobe (MDS\) inputs to tell the CY7C601 that an additional clock cycle is required to deliver the requested instruction from PROM. In the schematic in Figure 2, only one WAIT\ output is connected to the CY7C601. This is because all 16 PROMs examine the same upper-order address inputs to determine if an internal page has been crossed. Therefore, only one PROM is needed to assert the WAIT\ signal when an off-page access is detected.

It is important to note that the PROM will not generate WAIT\ if the chip enable signal (CE\) is inactive when the address changes. This ensures that when the CPU addresses some other portion of memory, such as RAM, the internal PROM page does not change and a WAIT\ signal is not generated.

System Timing

This section provides a brief description of the CY7C601 timing interface to the CY7C289 PROMs. The timing diagram in Figure 3 illustrates a typical communication sequence between the CPU and the PROMs. The memory interface's timing depends on whether or not the access is on the same page as the previous access.

The case where an internal PROM page is crossed is illustrated in the left side of Figure 3. Address 1 (displayed as A1) is an access to PROM that causes an internal page change. WAIT\ is asserted by the CY7C289 to freeze the processor until the PROMs can deliver valid data. Note in Figure 3 that WAIT\ is not asserted until the next processor clock cycle. This delay is possible, using either MHOLDA\ or MHOLDB\, because of the CY7C601's pipelined architecture.
The delay allows memories or interface logic more time to examine the address and determine if a wait state is required. The processor samples MHOLDA\ on the processor clock's falling edge. An active MHOLDA\ indicates that the address in the previous clock cycle requires at least one wait state to complete. However, as shown in Figure 3, by the time MHOLDA\ is detected active, the processor has already read the data corresponding to A1. Reading this false data is perfectly acceptable due to the CY7C601's internal instruction pipeline. The CPU has the time to invalidate the erroneous data before it reaches the execution stage. The MDS\ signal strobes in the correct data when the data becomes available.

The CY7C289s are configured to generate the WAIT\ signal with respect to CLK's falling edge to ensure proper operation of the wait-state mechanism. If the rising-edge option were selected, it is possible that the WAIT\ signal would be generated too early by the PROMs. Consequently, the CY7C601 would recognize an active level on MHOLDA\ during the first cycle and invalidate the data from the bus cycle prior to the PROM access. Generating the WAIT\ signal from the falling edge ensures that the CPU does not detect the hold until the access's second cycle.

Another important aspect of the memory interface's operation during a PROM page change is that WAIT\ connects directly to MDS\ as well as to MHOLDA\. This arrangement causes MDS\ to be asserted for two clock cycles instead of just one, but this does not affect the system's operation. Although the CY7C601 copies data erroneously during the first cycle of MDS\, the erroneous data is overwritten with valid data in the next cycle. This approach works because MHOLDA\ remains asserted and does not allow the internal pipeline to advance until the correct data arrives. The advantage to feeding WAIT\ directly into MDS\ is that it avoids the use of any external logic for the memory interface.
CY7C601 Interface

As shown in Figure 2, the instruction memory interface requires only two control inputs (MHOLDA\, MDS\). MHOLDA\ freezes the clock to the instruction pipeline during a cache miss (for systems with cache) or when accessing a slow memory, such as the 65-ns page-miss operation in the CY7C289. Whenever the CY7C289 generates a WAIT\ signal, MHOLDA\ is asserted and the instruction pipeline is frozen. The processor freezes with the next instruction's address on the address bus. MHOLDA\ must be presented to the CY7C601 at the beginning of each processor clock cycle and be stable during the processor clock's falling edge.

The other control signal, MDS\, signals the processor when slow or missed (cache-miss) data is ready on the bus. The signal must be asserted only while the processor is frozen by either MHOLDA\ or Memory Hold B (MHOLDB\). Assertion of MDS\ enables the clock to the on-chip instruction register during an instruction fetch and effectively strobes the valid data into the CPU.

Figure 3. Memory Interface Timing. Signals shown: CLK, ADR, MHOLDA, MDS, DATA, WAIT.

Figure 3 also displays some of the speed requirements that must be met in the instruction memory interface. In the case of an internal page change, the CY7C289 PROMs require two wait cycles to complete an access. The 40-MHz CY7C601 requires 2 ns of data set-up time before the system clock's rising edge. This sequence results in a total of 73 ns available for the memory to return valid data. The CY7C289 meets this requirement with the 65-ns off-page access time.

The relatively trivial timing of sequential accesses falling on the same PROM page is illustrated in the right portion of Figure 3. The PROM latches A2 into the on-chip registers at CLK's rising edge and delivers data a maximum of 20 ns later.
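The timing budget quoted above follows from simple arithmetic, which the Python sketch below reproduces. It is a simplified model under stated assumptions: access time is measured from the rising clock edge that registers the address, the first fetch after power-up is treated as a page miss, and every access goes to PROM. The function names are hypothetical.

```python
import math

CLK_PERIOD_NS = 25.0        # 40-MHz system clock
DATA_SETUP_NS = 2.0         # data set-up before the rising clock edge
ON_PAGE_ACCESS_NS = 20.0    # access within the current PROM page
OFF_PAGE_ACCESS_NS = 65.0   # access that changes the internal page
WORDS_PER_EXTENDED_PAGE = 256


def available_ns(waits):
    """Time from the address clock edge until data must be valid."""
    return (waits + 1) * CLK_PERIOD_NS - DATA_SETUP_NS


def wait_cycles(access_ns):
    """Smallest number of wait cycles that covers the access time."""
    cycles = math.ceil((access_ns + DATA_SETUP_NS) / CLK_PERIOD_NS)
    return max(cycles, 1) - 1


def fetch_cycles(word_addresses):
    """Total clock cycles to fetch a sequence of word addresses."""
    total = 0
    current_page = None                   # assumption: first fetch misses
    for word in word_addresses:
        page = word // WORDS_PER_EXTENDED_PAGE
        if page == current_page:
            total += 1 + wait_cycles(ON_PAGE_ACCESS_NS)
        else:
            total += 1 + wait_cycles(OFF_PAGE_ACCESS_NS)
            current_page = page
    return total
```

Under these assumptions, `available_ns(2)` gives the 73-ns window cited for an off-page access, and a straight-line fetch of 512 words costs only two page misses.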
Reference

For information on using the CY7C289 in latched mode, see the application note entitled "Interfacing the CY7C289 to the AM29000."

Section Contents

PLDs
Introduction to Programmable Logic
CMOS PAL Basics
Are Your PLDs Metastable?
PLD-Based Data Path For SCSI-2
PAL Design Example: A GCR Encoder/Decoder
T2 Framing Circuitry
Using CUPL with Cypress PLDs
Using ABEL to Program the Cypress 22V10
Using ABEL to Program the CY7C330
Using ABEL 3.2 to Program the Cypress CY7C331
Using Log/IC to Program the CY7C330
State Machine Design Considerations and Methodologies
Understanding the CY7C330 Synchronous EPLD
Using the CY7C330 in Closed-Loop Servo Control
FDDI Physical Connection Management Using the CY7C330
Bus-Oriented Maskable Interrupt Controller
Using the CY7C330 as a Multi-channel Mbus Arbiter
Using the CY7C331 as a Waveform Generator
CY7C331 Application Example: Asynchronous, Self-Timed VMEbus Requestor
Understanding the 361
Using the CY7C361 as an Mbus Arbiter
TMS320C30/VME Signal Conditioner Using the CY7C361
DMA Control Using the CY7C342 MAX EPLD
Interfacing PROMs and RAMs to High-Speed DSP Using MAX
FIFO RAM Controller with Programmable Flags

Introduction to Programmable Logic

Why Use a PLD?

ASICs (Application Specific Integrated Circuits) are one of the fastest growing segments of the semiconductor market for good reason. In addition to increasing packaging density and reducing board real estate by integrating SSI/MSI logic functions, ASICs reduce power requirements, improve reliability, and provide product secrecy. ASICs include several different types of devices: full-custom devices, standard cells, gate arrays, and PLDs.

Full-custom devices offer the greatest degree of integration, but they are expensive, and the development cycles can be on the order of nine months to a year. Full-custom designs are justified only for very large volume applications. Standard cell devices can be turned around much more quickly (in about four months) and cost less than full-custom devices. However, the level of integration, and thus the speed, are lower than with the full-custom product. Gate arrays offer even less dense integration, but because only two metal masks must be fabricated, the design turnaround can be as short as six weeks.

One drawback of all these ASICs is that the design logic must be set at the start of the fabrication cycle. If the design changes, the whole product cycle must start over. In addition, because each device is application specific, you must watch inventory very carefully to make sure that just enough of each device is ordered to meet demand.

An alternative to custom or semicustom devices is the PLD (Programmable Logic Device). Although PLDs do not offer the same level of integration as the other ASICs, the board-space reduction is still significant. The reduction factor is application dependent and ranges from between 4:1 and 10:1 for smaller PLDs (20 to 24 pins) to 75:1 for high-density/pin-count devices such as the LCA or MAX families. Additional benefits include reduced parts inventory, faster design and turnaround times, and simplified timing considerations.

Because a PLD is sold as a "generic" array of logic, customized by the user, you can use the same PLD in many different applications, spanning any number of projects. Cypress's PLDs are based on EPROM technology, thus making them EPLDs, which are erasable using an ultraviolet light source. You can make design changes at any point in the product cycle more easily than you can with other ASICs. The design cycle of a moderately complex PLD can be a week or less, and after the one-time purchase of a good development software package and programmer, the parts are relatively inexpensive. PLDs simplify logic timing because all logical functions take approximately the same path through the device. Thus, the same propagation delays apply to all device outputs (more on this later).

PLD Technology

All Cypress EPLD families except the CY7C360 family utilize the familiar sum-of-products architecture. You can implement Boolean transfer functions of this form by programming the AND array whose output terms feed a fixed OR array.
This scheme can implement most combinatorial logic functions and is limited only by the number of product terms available in the AND-OR array. PLDs come in a variety of different sizes and with additional architectural features such as flip-flops.

TTL PLDs use a fuse as their programmable element. During the manufacturing process, fuses are built into all the connections between input pins and product terms. All unwanted connections are then blown during the programming process. Bipolar products are programmed using 20V pulses from 50 µs to 100 ms long. These 100- to 300-mA pulses blow unwanted fuses. Fuses are blown one at a time so that the heat generated does not damage or weaken the IC. Because of the high currents required, bipolar PLDs have to be programmed one at a time. Because physical fuses are blown, you can program these devices only once.

In contrast, the Cypress CMOS EPLD family uses an EPROM cell instead of fuses. This structure allows Cypress to functionally test and then erase all devices prior to packaging, thus facilitating 100-percent programming yields. The EPROM cell used by Cypress serves the same purpose as the fuse used in most bipolar PLD devices. Before programming, the AND gates (product terms) are connected via the EPROM cells to both true and complement inputs.
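The sum-of-products structure described above (a programmable AND array feeding a fixed OR) can be modeled in a few lines. This is a minimal behavioral sketch, not a model of any particular Cypress device: the function names, the two-input example, and the convention that `True` means "cell still connects the input to the term" are all assumptions made for illustration.

```python
# Behavioral sketch of a programmable AND array feeding a fixed OR gate.
# A cell value of True means the input line is still connected to the term.

def product_term(inputs, true_cells, comp_cells):
    """AND together every true/complement input still connected to the term."""
    result = True
    for value, keep_true, keep_comp in zip(inputs, true_cells, comp_cells):
        if keep_true:
            result = result and value
        if keep_comp:
            result = result and (not value)
    return result


def sum_of_products(inputs, terms):
    """Fixed OR of all product terms assigned to one output."""
    return any(product_term(inputs, t, c) for t, c in terms)


# Hypothetical two-input example: A XOR B built from two product terms.
XOR_TERMS = [
    ([True, False], [False, True]),   # A AND (NOT B)
    ([False, True], [True, False]),   # (NOT A) AND B
]

# A blank term is connected to both the true and complement of every input,
# so it contains A AND (NOT A) and can never be asserted.
BLANK_TERM = ([True, True], [True, True])

for a in (False, True):
    for b in (False, True):
        print(a, b, sum_of_products((a, b), XOR_TERMS))
```

The blank term illustrates why an unprogrammed product term contributes nothing to the OR: with both polarities of an input connected, the AND can never be true.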
Figure 4. The 16L8 Block Diagram.

The official, standardized version of a fuse map is called a JEDEC map. This map can contain various informational fields and/or comments in addition to the 1s and 0s. Figure 5 shows the JEDEC map that implements the function shown in Figures 2 and 3. Each number starting with L in the leftmost column represents the first fuse number in that row. An N denotes a note or comment. QF precedes the total number of fuses in this device (QF2048 in this example). F0 means that the fuse default is 0, or unprogrammed.
G0 specifies an unprogrammed security fuse, whereas G1 denotes a programmed security fuse (more on this later). C precedes a checksum value for the file. An * specifies the end of a field. A JEDEC file can also contain test vectors, which are not shown here. For more information on the JEDEC Standard, refer to "JEDEC Standard No. 3-A, Standard Data Transfer Format Between Data Preparation System and Programmable Logic Device Programmer" available from:

Solid State Products Engineering Council
2001 Eye Street N.W.
Washington, DC 20006

Most PLD design packages compile the design and translate it into a JEDEC map. The map is then downloaded to the programming hardware, which programs the device(s) accordingly.

Figure 5. A 16L8 JEDEC Map (32-bit fuse rows starting at L0000, each followed by an N comment naming the OE or sum product term and its output pin).

First-Generation PLDs

The first PLDs were strictly combinatorial logic with three-state outputs, like the PALC16L8. Then D flip-flops, a clock input, and internal feedback were added, allowing a single PLD to implement sequential logic or state machines. The 16L8, 16R4 (four registered outputs), 16R6 (six registered outputs), and 16R8 (eight registered outputs) became industry-standard parts.

Figure 6. The 22V10 Macrocell.

Testability was a problem in some of the earlier devices. Because a blank device had all fuses intact, output enables were all turned off, configuring all device pins as inputs. This scheme made it difficult to test blank devices and to check whether the fuses could be blown without actually blowing any of them. To get around these problems, a phantom array was added to the device. The 16L8, for example, has 256 additional bits in its phantom array. These bits are used to test the PLD functionally and verify dynamic (AC) operation after the chip is packaged, without using the normal array. The phantom array is so named because it does not function in regular operating mode. The device must be in a special mode to access the phantom array. The phantom array is usually programmed and verified as part of the final electrical test procedure during the manufacturing process. This procedure verifies both the PLD programmability and function. Cypress's EPLDs are programmed, tested, and then erased before they are packaged. You can also use the phantom array as part of incoming inspection.

Another feature of today's PLDs is register preload, which loads data into the registers of registered devices for testing purposes. This arrangement greatly simplifies and shortens the testing procedure. You can use this feature to check illegal state resolution, that is, a state machine's ability to pass from an accidental illegal state to a legal one. Preloading is accomplished by applying a super-voltage (usually in the range of 12 to 14V) pulse of at least 100-µs duration to a specific pin, while holding a second pin at VIH. The super voltage acts as a write strobe, which clocks data applied to the I/O pins into the corresponding registers.

A security fuse has also become a standard PLD feature. In addition to writing a fuse map into a device, any good device programmer can read a device's fuse map. This capability tends to negate the PLD's advantage of hiding proprietary logic from observers.
But if you do not want your PLDs to be read by a programmer, you can program the security bit, which disconnects the lines used to verify the array. In a Cypress EPLD, the security EPROM cell is designed to retain its charge longer than any of the other cells in the array.

The Programmable Macrocell

The basic 20-pin PLDs of the past still had some limitations. For instance, they provided no way to control output-pin polarity without doing DeMorgan operations on the logic equations. Quite often the DeMorgan version has too many product terms to fit in such a device, even after several hours of reduction using a logic-optimization program. Another drawback is that you have to stock a variety of the basic 20-pin PLDs and/or their 24-pin equivalents to get the best fit for a given design. Often extra registers are left unused when the design is finished. Even though these PLDs tend to be pin limited, the pins associated with the extra registers end up being wasted because you cannot use them for anything else.

The 24-pin 22V10 overcame earlier limitations and revolutionized PLDs by introducing the programmable macrocell (Figure 6). The programmable macrocell allows you to select one of four output configurations: combinatorial inverting, combinatorial noninverting, registered inverting, and registered noninverting. You can use the "output" pin as an input or for bidirectional I/O if you specify the macrocell as combinatorial. Each of the 22V10's ten I/O pins has all four configuration options. You select the option using two fuses, or cells, identical to those in the array. These 20 bits (two for each of ten macrocells) appear at the bottom of the fuse map that represents the array.

Another innovation of the 22V10 is that some pins have a larger sum of products than others, an approach called variable product term distribution. In the 22V10, I/O pins have sums from eight to 16 product terms wide.
This variable distribution accommodates applications such as D flip-flop counters, where several outputs require a large number of product terms.

The 22V10 offers yet another improvement over PLDs such as the 16R8, which powers up with all registers in the reset state. The only way you can change this is by clocking in new data. The 22V10 avoids this problem by adding two extra product terms. One sets all registers, the other resets all registers. Because the set and reset are each a product term, they can be programmed to be the AND of any array input(s). For additional flexibility, the set is designated as a synchronous operation, and the reset is asynchronous.

Because of the 22V10's versatility, it has become something of an industry standard. It is available in TTL, CMOS, and GaAs. Many companies have introduced similar architectures with slightly different features. For example, the Cypress PLDC20G10 uses a similar macrocell that adds the capability to choose between a product-term output enable and a pin-controlled output enable. To make the PLDC20G10 faster and less expensive than the 22V10, Cypress has reduced the array to nine product terms per I/O macrocell and removed the set and reset product terms.

Another device introduced around the same time as the 22V10 is the 20RA10, which targets asynchronous registered applications. Like the 22V10, the 20RA10 has I/O pins with programmable polarity bits. You can configure the 20RA10's I/O pins as registered or combinatorial, but not with dedicated fuses. Instead, each I/O pin has a sum of four product terms that connects, through a polarity switch, to the D input of a flip-flop. Each of these flip-flops has dedicated product terms connected to its clock, set, and reset functions. When both the set and reset of a flip-flop are asserted (High), the flip-flop becomes transparent, thus making the output combinatorial.
In addition, the 20RA10 has an unusual output-enable scheme. Pin 13 is inverted and ANDed with an output-enable product term. If pin 13 is High, all I/O pins are at high impedance. The 20RA10 also offers a synchronous register preload in operating mode. When pin 1 goes Low, any data driven onto an I/O pin is latched into the corresponding flip-flop. A 20RA10 I/O macrocell is illustrated in Figure 7. This device's flexibility and asynchronous nature make it ideal for bus-arbiter and interrupt-controller applications.

Second-Generation PLDs

The architectural features introduced by the 22V10 greatly enhance PLD flexibility, but this device still has some limitations. It offers only D-type flip-flops, for example, which are cumbersome for applications such as counters. Further, each flip-flop and its feedback still use a pin, even if the flip-flop's output is not needed outside the PLD. Bidirectional, registered pins cannot be implemented. High-speed applications often require flip-flops outside the PLD's inputs to latch data because propagation delays impose relatively long setup times for output flip-flops.

Cypress solves all these problems with the CY7C330. In addition to the output registers on the I/O pins, each pin except power and ground has an input register with a choice of two clocks. This input macrocell makes the 28-pin CY7C330 ideal for pipelined control and high-speed state machine applications.

Another CY7C330 feature is its ability to emulate T and JK flip-flops, a useful alternative in counter designs. In each I/O macrocell, the sum of products from the array drives one input of an exclusive-OR (XOR) gate. The second input to the XOR gate is another product term. This gate's output connects to the D input of the output flip-flop in the macrocell (Figure 8). If the flip-flop's Q output is fed back and connected to the single product term driving the XOR gate, the sum-of-products acts as the T input of a T flip-flop.
The macrocell can also emulate a JK flip-flop in this way, using the relation T = J·(NOT Q) + K·Q. If you require a D flip-flop, you can use the XOR gate to control polarity.

Figure 7. The 20RA10 Macrocell.

Figure 8. The CY7C330 Macrocell.

Close examination of Figure 8 reveals two paths into the array. The first is a multiplexer that selects feedback from either the output register or the input register's Q output. This multiplexer is called the feedback mux. The inputs to the second path, called the shared input mux, are the Q outputs of input registers belonging to adjacent I/O macrocells. This path allows you to feed back the Q output of a macrocell's output register, and still utilize the pin associated with that macrocell as an input. You can do this for six of the 12 I/O macrocells. If you need more registers for an application, the CY7C330 contains four additional buried registers. These registers are identical to the output register portion of the I/O macrocell, except they are not connected to any pin.

Just as the CY7C330 can be considered as an extended, enhanced version of the 22V10, the CY7C331 represents an extension of the 20RA10. The 20RA10 has many of the same limitations as the 22V10, with the additional limitation that the sum of products is only four product terms wide. The CY7C331 has 12 I/O macrocells. In addition to the 20RA10-like output flip-flops, the CY7C331 has identical flip-flops in the input path. As in the 20RA10, each flip-flop has a product-term-controlled clock, set, and reset. If the set and reset product terms are both asserted, the flip-flop becomes transparent.
The 20RA10 polarity fuse has been replaced in the CY7C331 by an XOR gate, which has as inputs the sum of products and a dedicated product term. Thus, you can control the output's polarity or have the flip-flops emulate T or JK flip-flops, as in the CY7C330. The CY7C331 macrocell appears in Figure 9. Like the 22V10 and CY7C330, the CY7C331 has variable-product-term distribution with sums from four to 12 product terms wide. The CY7C331 borrows the shared input mux and output enable schemes from the CY7C330. The CY7C331 does not support the 20RA10's operating mode preload, but you can preload the CY7C331's registers using a super voltage.

The CY7C331 is designed especially for self-timed applications such as high-speed I/O interfaces. The device supports self-timed designs with programmable clock inputs, well-controlled internal timing relationships, and ultra-fast metastable resolution. No other PLD has this self-timed capability.

Another PLD architectural trend is to put registered inputs in combinatorial devices. These PLDs generally serve in sophisticated decoding applications, where the address or data is only stable for a short time. In the past, an MSI chip with latches or flip-flops was used to capture transient data, and the latched data fed into a PLD. Now PLDs such as the CY7C332 feature an input macrocell that you can program as combinatorial, registered, or latched. You have a choice of two clocks, and you can program the clock polarity as well.

The CY7C332 I/O macrocell (Figure 10) incorporates the input macrocell and a combinatorial output path. The latter includes a variable sum of products that drives one input of an XOR gate; a dedicated product term drives the XOR's other input. An output-enable mux allows a product term or pin 14 to control the output enable. This combinatorial output path can act as an input to the programmable-input register/latch, thus allowing you to create state machines.
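The XOR-gate flip-flop emulation described above for the CY7C330 and CY7C331 macrocells can be illustrated with a short behavioral sketch. It models only the logic relation (D = sum of products XOR product term, with Q fed back into the product term), not device timing, and the function names are hypothetical.

```python
# Behavioral sketch: a D flip-flop preceded by an XOR gate emulating
# T and JK flip-flops. Each function returns the next state of Q.

def d_input(sum_of_products, xor_product_term):
    """The XOR gate output that feeds the D input of the flip-flop."""
    return bool(sum_of_products) ^ bool(xor_product_term)


def t_step(q, t):
    """T flip-flop: Q is fed back to the XOR product term, the sum is T."""
    return d_input(t, q)


def jk_step(q, j, k):
    """JK flip-flop via T = J*(NOT Q) + K*Q, then the same XOR with Q."""
    q, j, k = bool(q), bool(j), bool(k)
    t = (j and not q) or (k and q)
    return d_input(t, q)


def invert_polarity(sum_of_products):
    """Polarity control: tie the XOR product term high to invert the sum."""
    return d_input(sum_of_products, True)


for q in (False, True):
    for j in (False, True):
        for k in (False, True):
            print("Q=%d J=%d K=%d -> next Q=%d" % (q, j, k, jk_step(q, j, k)))
```

Running the loop reproduces the usual JK behavior (hold, reset, set, toggle), which is a quick way to convince yourself that the relation for T is written correctly.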
Figure 9. The CY7C331 Macrocell.

Figure 10. The CY7C332 Macrocell.

High-Density PLDs

Because of its low power consumption, CMOS can achieve higher integration than can bipolar technologies. Several manufacturers are taking advantage of this fact to produce very high density PLDs. The CY7C342, for example, is a 68-pin member of the new MAX family and contains 128 flip-flops and over 1000 product terms. Up to 256 additional latches can be configured using expander product terms. Each of these product terms is called a logic array block (LAB). The CY7C342 contains eight LABs, which connect together via a programmable interconnect array (PIA).

The CY7C342 macrocell (Figure 12) contains a sum of three product terms driving one input of an XOR. The other XOR input is a dedicated product term. The XOR drives a programmable flip-flop.
CMOS PAL Basics

Tables 3 and 4. Programming address tables (VILP/VIHP levels on pins 2 through 9 that select the product terms and the input terms 0 - 31; D0 - D7 carry the programmed data input).

Figure 2. Functional Logic Diagram of PAL C 16L8A

Programming Operation

In a PAL C device, pins 5 - 9 are decoded (Table 4) in a 1-of-32 decoder, whose outputs correspond to the inputs labeled 0 - 31 in Figure 2.
For programming, 15V is applied to the bottom of the word line through a weak depletion-mode device. The EN (enable) signal to all of the three-state drivers is Low, which prevents the normal PAL input signals from driving the word lines during programming. The D0 - D7 inputs (pins 19 - 12) drive the program transistors (0, 8, 16, 24, etc.) as selected by pins 2, 3, and 4 (Table 3). To disconnect a word line from a bit line, the program transistor is forward biased, which increases the threshold of the read transistor.

Verify Operation

To verify the programmed cells, the device must go from the Program PAL mode to the Program Inhibit mode to the Program Verify mode. This is accomplished by reducing the voltage on pin 11 to VIHP (3V) and then to VILP (0.4V). Inside the device (Figure 4), the voltage changes disable the 1-of-32 decoder, bring the EN signal Low, and put 31 of the 32 input term lines at 0V. The line being verified is at 5V. The input address lines (pins 2 through 9) do not need to change when going from Program to Verify mode. Because the ones that were programmed cause the thresholds of the R transistors to increase, these transistors do not turn on during Verify mode. The unprogrammed transistors do turn on, however; the complement (inverse) of the data programmed is thus read during verify.

Regular (Normal) PAL Operation

The PAL implements the programmed function when no supervoltages are applied to any of the pins. During regular PAL operation, the 1-of-32 decoder and the D0 - D7 decoder are disabled, the EN signal is High, and all 32 input term lines are at 5V. Under these conditions, the data at the PAL C input pins is applied to all 64 of the product term lines. If any of the P transistors (16 per product term line) have not been programmed, they turn on and pull the lower input of the corresponding sense amplifier (SA) to 2V or less. Because this voltage is lower than the reference (Vref), the sense amplifier's output is Low. The reference is an unprogrammed EPROM cell that tracks the same process, voltage, and temperature variations that affect all the cells in the array. The reference is approximately 3V at room temperature and nominal Vcc (5V).

Figure 3. 16L8 Device Simplified Block Diagram

Phantom PAL Operation

The PAL is in the Phantom PAL operation mode when a supervoltage (Vpp = 13.5V) is applied to pin 6. The phantom array is programmed as shown in Figure 2. When the device is in Phantom PAL mode, you can measure the worst-case propagation delay from the pin 2 input to the outputs (pins 12 through 17). The truth table for the phantom array appears in Table 5.

Table 5. Phantom Array Truth Table (inputs: pins 2, 3, and 4; outputs: pins 19 through 12)

Figure 4. Programming Method (the 1-of-32 input-term decoder on pins 5 - 9 and the D0 - D7 inputs are used for programming only; the word-line supply is 5V for normal and verify operations and 15V for programming)

Reliability

Reliability is designed into all Cypress products from the beginning by using design techniques to eliminate latchup and improve ESD and by paying careful attention to layout. All products are tested for all known types of CMOS failure mechanisms. Failure mechanisms can be classified either as those generic to CMOS technology or as those specific to EPROM devices. Table 6 lists both categories of failures, their relevant activation energies, Ea in electron volts, and the detection method used by Cypress. In both cases, the mechanisms are aggravated by HTOL (high temperature operating life) tests and HTS (high temperature storage) tests.

Specific EPROM failure mechanisms include charge loss, charge gain, and electron trapping. Thermal energy and field emission effects accelerate charge loss. Thermal charge loss failures usually occur on random bits and are often related to latent manufacturing defects. In many instances a dramatic difference between typical and worst-case bits is observed. Field emission effects are generally detected as weakly programmed cells. The high voltages used to program a selected bit might disturb an unselected bit as a result of a defect.

Charge gain is due to electrons accumulating on a floating gate as a result of bias or voltage on the gate. This results in a reduced read margin. The effects of this mechanism are generally negligible.

Electrons might become trapped in the gate oxide during programming and cause diminished reprogrammability. For one-time-programmable devices, this failure mode has little significance. This is because Cypress PAL C devices are programmed only three times: twice during manufacture and once by the customer.

HTOL Testing

High temperature operating life test (or burn-in) detects most generic CMOS failure mechanisms. Units are placed in sockets under bias conditions with power applied and at elevated temperatures for a specific number of hours. This test weeds out the "weak sisters" that would fail during the first 100 to 500 hours of operation under normal operating temperatures. HTOL tests are also used to measure parameter shifts to predict and screen for failures that would occur much later.

HTS Testing

High temperature storage tests are used to thermally accelerate charge loss. These tests are performed at the wafer level and under unbiased conditions. Both pass/fail data and shifts in thresholds are measured. For a more detailed discussion of charge loss screening, see the References.

The generally accepted screening method for identifying charge loss is a 168-hour bake at 250°C. This correlates with more than 220,000 years of normal operation at 70°C using a failure activation energy of 1.4 eV.
The sample size chosen guarantees that at least 99 percent of the units will not fail during their useful operating life. Initial Qualification The process in general and the PAL C design specifically was qualified using HTS (bake) at 250·C for 256 hours, in conjunction with an HTOL .test at 125·C for 1000 hours. In the qualification process, four wafers were erased using ultraviolet light, and the linear thresholds of the cell's read transistors measured at 25 sites on each wafer. The wafers were then programmed, and the linear thresholds measured and recorded. The wafers were alternately baked at 250·C and the linear thresholds measured and recorded at 0.25,0.5, 1,2, 4, 8, 16, 32, 64, 128, and 256 hours. The number of device hours was therefore 100 x 256 = 25,600. . The results of this process revealed that the average threshold reduction due to charge loss was 0.66V. The range was 8 to 10 percent of the average initial threshold of 7.7V. This reduced threshold is more than 4V above the sense amplifier voltage reference. There were no failures. If the charge loss failure activation energy is assumed to be 1.4 ev, the HTS time of 256 hours at 250·C trans- lates to 438,356 years of. operation at 70·C. This time translation was computed using the industry-standard Arrhenius equation, which converts the time to failure (operating lifetime) at one temperature and time to another temperature and time. To summarize the results: Sample size: 100 Device hours: 25,600 HTSconditions: 256 hours at 250·C Average initial threshold: 7.7V Average threshold decrease: 0.66V Standard deviation: 0.12 Lifetime (1.4 ev): 438,356 years at 70·C These results confirm that the data retention characteristics of the EPROM cell used in all Cypress PALs and PROMs guarantees a minimum operating lifetime of 438,356 years for activation energies of 1.4 ev. 
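The Arrhenius time translation used for the HTS figures above can be sketched numerically. The following is a minimal sketch, assuming the usual acceleration-factor form exp((Ea/k)(1/T_use - 1/T_stress)) with temperatures in kelvin; the function names and the Boltzmann-constant value are illustrative choices, not taken from the handbook. With these constants the 168-hour bake comes out a little over 220,000 years, in line with the figure quoted above, while the 256-hour bake comes out on the order of 350,000 years, somewhat below the quoted 438,356 years, which suggests the handbook used slightly different constants or rounding.

```python
import math

# Boltzmann constant in eV/K (assumed value; the handbook does not state the one it used).
BOLTZMANN_EV_PER_K = 8.617333e-5
HOURS_PER_YEAR = 8760.0


def acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between a use and a stress temperature (deg C)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


def equivalent_years(stress_hours, ea_ev, t_use_c, t_stress_c):
    """Years at the use temperature equivalent to stress_hours at the stress temperature."""
    factor = acceleration_factor(ea_ev, t_use_c, t_stress_c)
    return stress_hours * factor / HOURS_PER_YEAR


if __name__ == "__main__":
    # Conditions from the text: Ea = 1.4 eV, bake at 250 C, operation at 70 C.
    for hours in (168, 256):
        years = equivalent_years(hours, 1.4, 70.0, 250.0)
        print(f"{hours} h at 250 C ~ {years:,.0f} years at 70 C")
```

Because the activation energy sits in the exponent, small changes in Ea or in the assumed temperatures move the result by large factors, so the lifetime figures are best read as order-of-magnitude estimates.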
Production Screen

Units from the same population were assembled without being subjected to HTS and were subjected to an HTOL of 150°C for 1000 hours. The units were tested at 12, 24, 48, 96, 168, 336, and 1008 hours and the measurements recorded. Variations in the thresholds of the EPROM cells were measured and correlated to the units tested in the HTS test to determine a maximum acceptable rate of charge loss. This data allows Cypress to guarantee data retention over the devices' normal operating lifetime.

Table 6. Generic CMOS Failure Mechanisms

Mechanism                        Activation Energy (eV)   Detection Method
Surface charge                   0.5 to 1.0               HTOL, fabrication monitors
Contamination                    1.0 to 1.4               HTOL, fabrication monitors
Electromigration                 1.0                      HTOL
Micro-cracks                     -                        Temperature cycling
Silicon defects                  0.3                      HTOL
Oxide breakdown                  0.3                      High-voltage stress, HTOL
Hot electron injection           -                        LTOL (low-temperature operating life)
Fabrication defects              -                        Burn-in
Latchup                          -                        High-voltage stress, burn-in, characterization
ESD                              -                        Characterization
Charge loss                      0.8 to 1.4               HTS (high-temperature storage)
Charge gain (oxide hopping)      0.3 to 0.6               HTOL
Electron trapping in gate oxide  -                        Program/erase cycle

Table adapted from "An Evaluation of 2708, 2716, 2532, and 2732 Types of U-V EPROMs, Including Reliability and Long Term Stability," Danish Research Center for Applied Electronics, Nov. 1980.

PAL C Advantages Over Bipolar PALs

The most pertinent data sheet parameters of Cypress PAL C devices are compared with those of representative bipolar PALs in Table 1. The supply current and propagation delay specifications are compared under identical test conditions. The output current sinking specifications are also identical. Cypress PAL C devices are clearly superior to bipolar PALs.

The lower power advantage of the PAL C results in several benefits:

- Lower capacity power supplies, which therefore cost less
- Reduced cooling requirements
- Increased long term reliability due to lower die junction temperatures

You can further reduce the power dissipation by driving the PAL C inputs between 0.5V or less and 4V or more. This reduces the power dissipation in the input TTL-to-CMOS buffers, which dissipate power when their inputs are between 0.8 and 3V.

PAL C Technology

The PAL C devices' 0.8-µm, double-layer-polysilicon, single-layer-metal, N-well, CMOS technology has been optimized for performance. Careful attention to design details and layout techniques has resulted in superior-performance products with improved ESD input protection and improved latch-up protection.

The circuit shown in Figure 5 is used at every input pin in all Cypress products to provide protection against ESD. This circuitry withstands repeated applications of high voltage without failure or performance degradation. This is accomplished by preventing the high ESD voltage from reaching the internal transistors' thin gate oxides.

The circuit consists of two thick-oxide field transistors wrapped around an input resistor (Rp) and a thin-oxide gate transistor with a relatively low breakdown voltage (12V). Large input voltages cause the thick-oxide transistors to turn on, discharging the ESD current to ground. The thin-oxide transistor breaks down when the drain-to-source voltage exceeds 12V. This transistor is protected from destruction by the current-limiting action of Rp. Experiments confirm that this input protection circuitry results in ESD protection in excess of 2000V.

[Figure 5. Input Protection Circuit]

Latch-up

Latch-up is a regenerative phenomenon that occurs when the voltage at an input or output pin is either raised above the power supply voltage potential or lowered below the substrate voltage potential, which is usually ground. Current rapidly increases until, in effect, a short circuit from VCC to ground exists. If the current is not limited, it will destroy the device, usually by melting a metal trace.

The CMOS processing used to fabricate both N- and P-channel MOS transistors also inherently creates parasitic bipolar transistors (both NPNs and PNPs). Latch-up is caused when these parasitic transistors are inadvertently turned on.

So long as the voltages applied to the package pins of the CMOS IC remain within the limits of the power supply voltages (usually 0 to 5V), the parasitic bipolar transistors remain dormant. However, when either negative voltages or positive voltages greater than the VCC supply voltage are applied to input or output pins, the parasitic bipolar transistors might turn on and cause latch-up.

Figure 6 shows a cross section of a typical CMOS inverter using a P-channel pull-up transistor and an N-channel pull-down transistor. Also shown is an N-channel output driver that is isolated from the CMOS inverter by a guard ring (channel stopper). The latter is necessary to prevent parasitic MOS transistors between devices. P+ guard rings surround N-channel devices, and N+ guard rings surround P-channel devices. The parasitic SCR (PNPN) and bias generator appear in Figure 7, which does not show the output driver schematic.

For latch-up to occur, two conditions must be satisfied: The product of the betas of the NPN and PNP transistors must be greater than one, and a trigger current must exist that turns on the SCR.

Because the SCR structure in bulk CMOS cannot be eliminated, the task of preventing latch-up is reduced to keeping the SCR from turning on. If either Rwell or Rsub equals 0, the SCR cannot turn on. This is because the base and emitter of the PNP transistor are tied together and thus the base/emitter junction cannot be forward biased; and the base/emitter junction of the NPN cannot be forward biased because the base is connected to ground. Note, however, that the NPN can be turned on by a negative voltage on the output pin if the right end of Rsub is grounded.

[Figure 6. Parasitic SCR and Bias Generator]

[Figure 7. CMOS Cross Section and Parasitic Circuits]

Preventing Latch-Up

The traditional cures for latch-up include increased horizontal spacing, diffused guard rings, and metal straps to critical areas. These solutions are obviously opposite to the goal of greater density.

A brute-force approach that has been successful in reducing latch-up has been to increase the conductivity of the N well and the substrate. Changing the well conductivity is unacceptable because it affects the characteristics of the P-channel MOS transistors. Using an epitaxial layer to reduce the substrate resistivity (Rsub) is also unacceptable because the price per wafer with a P+ epi-layer is approximately three times the cost of the industry-standard 5-inch, [...] per square, P- wafer.

Cypress uses several design techniques in addition to careful circuit layout and conservative design rules to avoid latch-up.

Conventional CMOS technology uses a P-channel MOS transistor as a pull-up device on the output drivers. This has the advantage of being able to pull the output voltage High to within 100 mV of the positive voltage supply. However, this is of marginal value when TTL compatibility is required. In addition, the P-channel pull-up transistor is sensitive to overshoot and introduces another vertical PNP transistor that further compounds the latch-up problem. Cypress uses N-channel pull-up transistors that eliminate all of these problems and still maintain TTL compatibility.

Cypress is the first company to use a substrate bias generator with CMOS technology. The bias generator keeps the substrate at approximately -3V DC, which serves several purposes.

The parasitic diodes shown in Figure 5 cannot be forward biased unless the voltage at an input pin is at least one diode drop more negative than -3V. This translates into increased device tolerance to undershoot at the input pins caused by inductance in the leads. If the undershoot is larger than 3V, the output impedance of the bias generator itself is sufficient to prevent trigger current from being generated.

The same reasoning applies to negative voltages at the output pins (Figure 7). To turn on the NPN transistor, the voltage at the output pin must be at least one VBE more negative than -3V.

To protect the core of the die from free-floating holes and stray currents, Cypress uses a diffused collection guard ring that is strapped with metal and connected to the bias generator. This provides an effective wall against transient currents that could cause mis-reading of the EPROM cells.

References

Woods, Murray H. "An E-PROM's integrity starts with its cell structure," Electronics magazine, August 14, 1980, pg. 132.

Rosenberg, Stuart. "Tests and screens weed out failures, project rates of reliability," Electronics magazine, August 14, 1980, pg. 136.

Are Your PLDs Metastable?

This application note provides a detailed description of the metastable behavior in PLDs from both circuit and statistical viewpoints. Additionally, the information on the metastable characteristics of Cypress PLDs presented here can help you achieve any desired degree of reliability.

Metastable is a Greek word meaning "in between." Metastability is an undesirable output condition of digital logic storage elements caused by marginal triggering.
This marginal triggering is usually caused by violating the storage elements' minimum set-up and hold times.

In most logic families, metastability is seen as a voltage level in the area between a logic High and a logic Low. Although systems have been designed that did not account for metastability, its effects have taken their toll on many of those systems.

In most digital systems, marginal triggering of storage elements does not occur. These systems are designed as synchronous systems that meet or exceed their components' worst-case specifications. Totally synchronous design is not possible for systems that impose no fixed relationship between input signals and the local system clock. This includes systems with asynchronous bus arbitration, telecommunications equipment, and most I/O interfaces. For these systems to function properly, it is necessary to synchronize the incoming asynchronous signals with the local system clock before using them.

Figure 1 shows a simple synchronizer, whose asynchronous input comes from outside the local system. The synchronizer operates with a system clock that is synchronous to the local system's operation. On each leading edge of this system clock, the synchronizer attempts to capture the state of the asynchronous input. Figure 2 shows the expected result. Most of the time, this synchronizer performs as desired.

[Figure 2. Expected Synchronizer Output]

Digital systems are supposed to function properly all the time, however. Because there is no direct relationship between the asynchronous input and the system clock, at some point the two signals will both be in transition at very nearly the same instant. Figure 3 shows some of the synchronizer's possible metastable outputs when this input condition occurs. These types of outputs would not occur if the synchronizer made a decision one way or the other in its specified clock-to-output time.
A flip-flop, when not properly triggered, might not make a decision in this time. When improperly triggered into a metastable state, the output might later transition to a High or a Low or might oscillate.

When other components in the local system sample the synchronizer's metastable output, they might also become metastable. A potentially worse problem can occur if two or more components sample the metastable signal and yield different results. This situation can easily corrupt data or cause a system failure.

[Figure 1. Simple Synchronizer]

[Figure 3. Possible Metastable States of Synchronizer (metastable, resolve to 0; metastable, resolve to 1; metastable, oscillating output)]

Such system failures are not a new problem. In 1952, Lubkin (Reference 1) stated that system designers, including the designers of the ENIAC, knew about metastability. The accepted solution at that time was to concatenate an additional flip-flop after the original synchronizer stage (Figure 4). This added flip-flop does not totally remove the problem but does improve reliability. This same solution is still in wide use today.

[Figure 4. Two-Stage Synchronizer]

Recovery from metastability is probabilistic. In the improved synchronizer, the first flip-flop's output might still be in a metastable state at the end of the sample clock period. Because the flip-flops are sequential, the probability of propagating a metastable condition from the second flip-flop stage is the square of the probability of the first flip-flop remaining metastable for its sample clock period. This type of synchronizer does have the drawback of adding one clock cycle of latency, which might be unacceptable in some systems.

As system speeds increase and as more systems utilize inputs from asynchronous external sources, metastability-induced failures become an increasingly significant portion of the total possible system failures. So far, no known method totally eliminates the possibility of metastability. However, while you cannot eliminate metastability, you can employ design techniques that make its probability relatively small compared with other failure modes.

Explanation of Metastability

In a flip-flop, a metastable output is undefined or oscillates between High and Low for an indefinite time due to marginal triggering of the circuit. This anomalous flip-flop behavior results when data inputs violate the specified set-up and hold times with respect to the clock.

In the case of a D-type flip-flop, the data must be stable at the device's D input before the clock edge by a time known as the set-up time, ts. This data must remain stable after the clock edge by a time known as the hold time, th (Figure 5).

The data must satisfy both the set-up and hold times to ensure that the storage device (register, flip-flop, latch) stores valid data and to ensure that the outputs present valid data after a maximum specified clock-to-output delay tco_max. As used in this application note, tco_max refers to the interval from the clock's rising edge to the time the data is valid on the outputs. In most cases, tco_max equals the maximum tco found in data sheets, as opposed to the average or typical tco value.

[Figure 5. Triggering Modes of a Simple Flip-Flop]

If the data violates either the set-up or hold specifications, the flip-flop output might go to an anomalous state for a time greater than tco_max (Figure 5). The outputs can take anywhere from an additional few hundred picoseconds to tens of microseconds to reach a valid output level. The amount of additional time beyond tco_max required for the outputs to reach a valid logic level is known as the metastable walk-out time. This walk-out time, while statistically predictable, is not deterministic.

Figure 6, from Reference 2, shows the variation in output delay with data input time. The left portion of the graph shows that when the data meets the required set-up time, the device has valid output after a predictable delay, which equals tco. The middle portion of the graph indicates the metastable region. If the data transitions in this region, valid output is delayed beyond tco_max. The closer the input transitions to the center of the metastable region, violating the device's triggering requirements, the longer the propagation delay. If the data transitions after the metastable region, the device does not recognize the input at that clock edge, and no transition occurs at the output.

As given in Reference 3, you can predict the region tw, where data transitions cause a propagation delay longer than t, from the formula:

$$t_w = t_{co}\,e^{[\ldots]} \qquad \text{(Eq. 1)}$$

where τ depends on device-specific characteristics such as transistor dimensions and the flip-flop's gain-bandwidth product.

Figure 7 shows another way of looking at metastability. A flip-flop, like any other bistable device, has two minimum-potential energy levels, separated by a maximum-energy potential. A bistable system has stability at either of the two minimum-energy points. The system can also have temporary stability (metastability) at the energy maximum. If nothing pushes the system from the maximum-energy point, the system remains at this point indefinitely.

A hill with valleys on either side is another bistable system. A ball placed on top of the hill tends to roll toward one of the minimum-energy levels. If left undisturbed at the top, the ball can remain there for an indeterminate amount of time. As this figure indicates, the characteristics of the top of the hill as well as natural factors affect how long the ball stays there. The steepness of the hill is analogous to the gain-bandwidth product of the flip-flop's input stage.

[Figure 7. Graphical View of a Bistable System]

Causes of Metastability

Systems with separate entities, each running at different clock rates, are called globally asynchronous systems (Reference 4). The entities might include keyboards, communication devices, disk drives, and processors. A system containing such entities is asynchronous because signals between two or more entities do not share a fixed relationship.

Metastability can occur between two concurrently operating digital systems that lack a common time reference. For example, in a multiprocessing system, it is possible that a request for data from one system can occur at nearly the exact moment that this signal is sampled by another part of the system. In this case, the request might be undefined if it does not obey the set-up and hold time of the requested system.

When globally asynchronous systems communicate with each other, their signals must be synchronized. Arbitration must occur when two or more requests for a shared resource are received from asynchronous systems. An arbiter decides which of two events should be serviced first. A synchronizer, which is a type of arbiter with a clock as one of the arbitrated signals, must make its decision within a fixed amount of time. A device can synchronize an input signal from an external, asynchronous device in cases such as a keyboard input, an external interrupt, or a communication request.

Care must be taken when two locally-synchronous systems communicate in a globally-asynchronous environment. A synchronization failure occurs when one system samples a flip-flop in the other system that has an undefined or oscillating output. This event can distribute non-binary signals through a binary system (Reference 5).

In synchronizers, the circuit must decide the state of the data input at the clock input's rising edge. If these two signals arrive at the same time, the circuit can produce an output based on either decision, but must decide one way or the other within a fixed amount of time.

Attacking Metastability

The design of synchronous systems is much different than the design of globally-asynchronous systems. The design of a synchronous digital system is based on known maximum propagation delays of flip-flops and logical gates. Asynchronous systems by definition have no fixed relationship with each other, and therefore, any propagation delay from one locally-synchronous system to the next has no physical meaning.

Two different methods are available to produce locally-synchronous systems from globally-asynchronous systems. The first method involves creating self-timed systems. In a self-timed system, the entity that performs a task also emits a signal that indicates the task's completion. This handshaking signal allows the use of the results when they are ready instead of waiting for the worst-case delay. Such handshaking signals allow communications between locally-synchronous systems.

The advantage of the self-timed method is that it permits machines to run at the average speed instead of the worst-case speed. The disadvantages are that a self-timed system must have extra circuitry to compute its own completion signals and extra circuitry to check for the completion of any tasks assigned to external entities. Petri nets, data flow machines, and self-timed modules all use the self-timed method of communication among locally-synchronous systems. Self-timed structures do not completely eliminate metastability, however, because they can include arbiters that can be metastable. Most systems do not include self-timed interfaces due to the additional circuitry and complexity.

The second method of producing locally-synchronous systems from globally-asynchronous systems is the simple synchronizer. This is the most common way of communicating between asynchronous objects.
The metastability errors that might arise from these systems must be made to play an insignificant role when compared with other causes of system failure.

Many metastability solutions involve special circuits (References 6 and 7). Some of these solutions do not reduce metastability at all (References 8 and 13). Others, however, do reduce metastability errors by pushing the occurrence of metastability to a place where sufficient time is available for resolving the error. Most of these circuits are system dependent and do not offer a universal solution to metastability errors.

The easiest and the most widely used solution is to give the synchronizing circuit enough time to both

[Figure residue: vertical axis "Valid data output time", with the normal delay and tw marked]

... SYNC;
{Have two registers hold the true and inverted sense of the synchronization register}
/F2 /SYNC;
/ERROR /RESET * F1 * /F2 + /RESET * /F1 * F2 + RESET * ERROR;
{ERROR# goes low when the XOR of F1 and F2 is false, ERROR# also toggles on RESET}
/TSYNC TSYNC;
{Fmax reg toggles on every clock}
/T1 TSYNC;
/T2 /TSYNC;
{Have two registers hold the true and inverted sense of Fmax reg}
/FAIL T1 * /T2 + /T1 * T2;
{FAIL# goes low when the XOR of T1 and T2 is false, indicating Fmax has been exceeded}

Figure 23. PLD Equations for Metastability Testing

because the first synchronization stage can synchronize the asynchronous input signal, and the second synchronization stage can perform a Boolean function on a combination of the input and output signals. Boolean functions can be performed at either stage; the metastability characteristics listed in Table 1 apply to PLD registers' asynchronous inputs that are used directly as well as asynchronous inputs used as a Boolean combination of existing inputs and outputs.

When implementing a two-stage synchronizer in a PLD, the probability that a synchronizer is metastable after the second stage of synchronization is the square of the probability that a synchronizer is metastable after the first stage of synchronization. The MTBF equation is

$$\mathrm{MTBF} = \left(\frac{e^{t_r/t_{sw}}}{f_c f_d W}\right)^{2}$$

From this result, the equation for tr becomes

$$t_r = \frac{t_{sw}\left(\ln(\mathrm{MTBF}) + 2\ln(f_c f_d W)\right)}{2}$$

Using this result for a two-stage synchronizer in a Cypress PALC22V10C, the tr for a 10-year MTBF is reduced from 13.0 ns to

$$t_r = (0.5)(0.547\times10^{-9}\,\mathrm{s})\left[\ln(315\times10^{6}\,\mathrm{s}) + 2\ln(90.9\times10^{6}\times90.9\times10^{6}\times8.08\times10^{-15})\right] = 7.65\ \mathrm{ns}$$

The maximum fc increases from 41.6 MHz to

$$f_c = \frac{1}{\dfrac{1}{f_{max}} + t_r} = \frac{1}{\dfrac{1}{90.9\ \mathrm{MHz}} + 7.65\ \mathrm{ns}} = 53.6\ \mathrm{MHz}$$

This example shows that if the cycle of latency caused by the additional synchronization stage is acceptable, you can dramatically increase the synchronizer's maximum operating frequency.

References

1. Lubkin, S., (Electronic Computer Corp.), "Asynchronous Signals in Digital Computers," Mathematical Tables and Other Aids to Computation, Vol. 6, No. 40, Oct 1952, pp. 238 - 241.
2. Nootbaar, Keith, (Applied Microcircuits Corp.), "Design, Testing, and Application of a Metastable-Hardened Flip-Flop," WESCON 87 (San Francisco, CA, Nov. 17 - 19, 1987), Electronic Conventions Management, Los Angeles, CA 90045.
3. Stoll, Peter A., "How to Avoid Synchronization Problems," VLSI Design, November/December 1982, pp. 56 - 59.
4. Chapiro, Daniel M., Globally-Asynchronous Locally-Synchronous Systems, Department of Computer Science Report No. STAN-CS-84-1026, October 1984.
5. Horstmann, Jens U., Eichel, Hans W., Coates, Robert L., "Metastability Behavior of CMOS ASIC Flip-Flops in Theory and Test," IEEE Journal of Solid-State Circuits, Vol 24, No 1, Feb 1989, pp. 146 - 157.
6. Wormald, E.G., "A Note on Synchronizer or Interlock Maloperation," Professional Program Session Record 16, WESCON 87, November 17 - 19, 1987, Electronic Conventions Management, Los Angeles, CA 90045.
7. Pechouchek, Miroslav, "Anomalous Response Times of Input Synchronizers," IEEE Trans. Computers, Vol. C-25, No. 2, Feb 1976, pp. 133 - 139.
8. Chaney, T. J., "Comments on 'A Note on Synchronizer or Interlock Maloperation,'" IEEE Trans. Computers, Vol C-28, No 10, Oct. 1979, pp. 802 - 804.
9. Couranz, George R., Wann, Donald F., "Theoretical and Experimental Behavior of Synchronizers Operating in the Metastable Region," IEEE Trans. Computers, Vol C-24, No. 6, June 1975, pp. 604 - 616.
10. Veendrick, Harry J.M., "The Behavior of Flip-Flops Used as Synchronizers and Prediction of Their Failure Rate," IEEE Journal of Solid-State Circuits, Vol SC-15, No. 2, April 1980, pp. 169 - 176.
11. Kacprzak, Tomasz, Albicki, Alexander, "Analysis of Metastable Operation in RS CMOS Flip-Flops," IEEE Journal of Solid-State Circuits, Vol SC-22, No 1, Feb 1987, pp. 57 - 64.
12. Flannagan, Stephen T., "Synchronization Reliability in CMOS Technology," IEEE Journal of Solid-State Circuits, Vol. SC-20, No. 4, Aug 1985, pp. 880 - 882.
13. Wakerly, John F., A Designers Guide to Synchronizers and Metastability, Center for Reliable Computing Technical Report, CSL TN #88-341, February 1988, Computer Systems Laboratory, Departments of Electrical Engineering and Computer Science, Stanford University, Stanford, CA.
14. Freeman, Gregory G., Liu, Dick L., Wooley, Bruce, and McCluskey, Edward J., Two CMOS Metastability Sensors, CSL TN #86-293, June 1986, Computer Systems Laboratory, Electrical Engineering and Computer Science Departments, Stanford University, Stanford, CA.
15. Rubin, Kim, "Metastability Testing in PALs," WESCON 87 (San Francisco, CA, Nov. 17 - 19, 1987), Electronic Conventions Management, Los Angeles, CA 90045.
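The two-stage synchronizer arithmetic worked through above can be checked numerically. This is a minimal sketch, assuming the single-stage relation MTBF = e^(tr/tsw) / (fc fd W) and treating an n-stage synchronizer as that expression raised to the power n; the text only states the squared, two-stage case, so the general `stages` parameter and the function names are illustrative assumptions. The device constants are the PALC22V10C values quoted in the example.

```python
import math


def resolve_time(mtbf_s, t_sw, f_c, f_d, w, stages=1):
    """Resolution time t_r (seconds) needed to reach a target MTBF.

    Assumes MTBF = (exp(t_r / t_sw) / (f_c * f_d * w)) ** stages,
    solved for t_r. stages=2 matches the two-stage equation in the text.
    """
    return t_sw * (math.log(mtbf_s) / stages + math.log(f_c * f_d * w))


def max_clock(f_max, t_r):
    """Highest clock frequency that still leaves t_r of resolution time."""
    return 1.0 / (1.0 / f_max + t_r)


if __name__ == "__main__":
    T_SW = 0.547e-9   # s, from the worked example
    W = 8.08e-15      # s, from the worked example
    F_MAX = 90.9e6    # Hz, used for both f_c and f_d as in the example
    MTBF = 315e6      # s, roughly 10 years

    for n in (1, 2):
        t_r = resolve_time(MTBF, T_SW, F_MAX, F_MAX, W, stages=n)
        f_c = max_clock(F_MAX, t_r)
        print(f"{n}-stage: t_r = {t_r * 1e9:.2f} ns, max f_c = {f_c / 1e6:.1f} MHz")
```

Plugging in the example's numbers reproduces the 13.0 ns and 7.65 ns resolution times and the corresponding clock limits to within rounding.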
Metastability Graphs of Cypress Devices CYPRESS PALC16R8-25 1.0E+09 1.0E+08 M T B F i 1.0E+07 1.0E+06 1.0E+05 1.0E+04 n s 1.0E+03 c 1.0E+01 e 0 n d s 1.0E+02 1.0E+OO 1.0E-01 1.0E-02 1.0E-03 1.0E-04 0 2 4 6 8 10 12 14 16 tr (ns) • 1/fc - 1/fmax CYPRESS PLDC18G8-12 1.0E+09 1.0E+08 M T B F 1.0E+07 1.0E+06 1.0E+05 1.0E+04 n 1.0E+03 s 1.0E+02 c n d s 1.0E+01 e 0 1.0E+OO 1.0E-O 1 1.0E-02 1.0E-03 1.0E-04 0 1 2 34567 tr (ns) • 1/fc - 1/fmax 6-34 8 9 10 Appendix. Metastability Graphs of Cypress Devices CYPRESS PALC20G10-20 1.0E+09 1.0E+08 M 1.0E+07 T 1.0E+06 F 1.0E+05 B i n 1.0E+04 1.0E+03 s e c 1.0E+02 0 1.0E+OO d 1.0E-01 n s 1.0E+01 1.0E-02 1.0E-04 o 1 5 234 6 tr (ns) • 1/fc - 1/fmax CYPRESS PALC20RA 10-15 1.0E+09 1.0E+08 M T B F i n s e c 0 n d s 1.0E+07 1.0E+06 1.0E+05 1.0E+04 1.0E+03 1.0E+02 1.0E+01 1.0E+OO 1.0E-01 1.0E-02 1.0E-03 1.0E-04 1.0E-05 0 1 234 5 tr (ns) • 1/fc - 1/fmax 6-35 6 7 .-. 45i~~~~~~~~~~~~~~~~~~~~A~r~e~Y~O~U~r~p~L~D~S~~~et~a~s~ta~b~l~e~? Appendix. Metastability Graphs of Cypress Devices CYPRESS PALC22V10-20 1.0E+09 1.0E+08 M 1.0E+07 F 1.0E+05 T B i n s e c 0 n d s 1.0E+06 1.0E+04 1.0E+03 /" 1.0E+02 1.0E+0 1 1.0E+00 /" 1.0E-0 1 1.0E-02 1.0E-03 111;;;;;;;I;;;;;;;;;;1;;;;;;;;IIII;;;;;;;;;;;;;;;;;;1 o 1 2 3 4 5 tr (ns) • 1/fc - 1/fmax CYPRESS PALC22V10B-15 M T B F i n s e c 0 n d s 1.0E+09 1.0E+08 1.0E+07 1.0E+06 1.0E+05 1.0E+04 1.0E+03 1.0E+02 1.0E+0 1 1.0E+00 1.0E-01 1.0E-02 1.0E-03 1.0E-04 1.0E-05 1.0E-06 0 2 4 6 tr (ns) • 1/fc - 1/fmax 6-36 8 10 Are Your PLDs Metastable? Appendix. Metastability Graphs of Cypress Devices CYPRESS PALC22V10C-10 1.0E+09 ...!" 1.0E+08 M 1.0E+07 B 1.0E+06 T F i 1.0E+04 s e c 1.0E+03 ./ ./ ./. 
[Graphs omitted. Same axes (MTBF in seconds against tr in ns) for CYPRESS CY7C330-66, CY7C331-20, CY7C332-15, and CY7C344-20.]

PLD-Based Data Path for SCSI-2

This application note begins by describing the major differences between the original SCSI standard and the new SCSI-2 document, with special emphasis on SCSI-2's high-speed signal timing. This information is then put to use in a PLD-based, high-speed data-path design for a SCSI-2 host bus adapter.

Small Computer System Interface

The SCSI-2 standards document is based on the original SCSI-1 standard (ANSI X3.131-1986) developed by the X3T9.2 Accredited Standards Technical Subcommittee. The SCSI-2 specification, generated by this same subcommittee, offers substantial improvements over the existing SCSI-1 standard in documentation, function, performance, interoperability, and command-set standardization.

With the new SCSI-2 ANSI standard, companies that use SCSI for their peripheral I/O now face a difficult decision: Which of the new capabilities offered by SCSI-2 should they support? The changes in the SCSI-2 document affect both hardware and software. Although it is possible to implement the changes affecting software drivers over time, as these new features appear in peripherals delivered to the marketplace, companies must decide now which hardware features a host bus adapter (HBA) should support. After deliveries to customers, hardware changes made as field upgrades or retrofits always bear high costs and often present a negative picture to the customer.

The physical differences between the original SCSI and the new standard fall into four main categories: SCSI-1 options that are now requirements, new connector/cable options, faster transfer rates, and wider data buses.

SCSI-1 Options

To be considered SCSI-2 compliant, an HBA must support both the parity and arbitration options of SCSI-1. SCSI-2-compliant HBAs should be software configurable by SCSI device address to allow use of older SCSI-1 peripherals that do not have both capabilities.

Connectors/Cables

SCSI-2 documents a 50-mil-pitch connector system. This connector family allows fully shielded assemblies for the 50-wire A cable and optional 68-wire B cable. Many SCSI manufacturers use this micro-D-type connector in volume. You can use the cable/connector scheme in a mix-and-match system with SCSI-1 connector/cable types through the use of adapter cables that have different connector types on each end.

One of the de facto (non-ANSI-standard) SCSI cable schemes, the 25-pin D-sub connector made popular by the Apple Macintosh, does not support SCSI's differential signal implementation. This cable system achieves its low pin count by removing a large number of the ground signals specified for single-ended operation. Because the single-ended transmission scheme is not recommended for SCSI-2's fast synchronous information transfer mode, users of this connector/cable system limit the data rates, cable lengths, and noise margins at which they can operate.

Transfer Rates

SCSI supports two types of information transfer: asynchronous (interlocked) and synchronous (data streaming/offset interlock).

In asynchronous transfers, a four-way handshake occurs between the SCSI peripheral (target) and the HBA (initiator) for each piece of information transferred on the SCSI bus. The SCSI bus's REQ (request) and ACK (acknowledge) control signals are used in this handshake operation, with the SCSI I/O signal determining the direction of information flow. This asynchronous transfer mode is the default mode for all SCSI devices and is required for all MESSAGE, COMMAND, and STATUS transfers. On SCSI systems implemented with very short cables and fast turn-around times in both the target and the initiator, theoretical burst-transfer rates can exceed 10 Mtransfers/s. None of the commercial LSI SCSI controller chips available at this time supports this high rate for asynchronous transfers. Most of these controllers handle asynchronous transfers at 50 Ktransfers/s to 3 Mtransfers/s.

SCSI-2 implements the synchronous transfer mode to remove device turn-around time and cable and transceiver delays as factors affecting transfer rates. Unlike asynchronous transfers, which are limited by the interface's four-way path delay, synchronous transfers are limited by interface skew, the difference in transmission delays among signals on the interface. SCSI-2 allows use of the synchronous method only for data transfers and only after enabling it with a SCSI MESSAGE negotiation between the initiator and target.

Synchronous transfers exist in SCSI-1, but few commercial LSI SCSI controllers or peripherals implement this capability. The SCSI-1 implementation defines synchronous transfers for data transfer periods of 200 ns and slower. This specification limits the synchronous data rate to 5 Mtransfers/s. With tighter-tolerance parts and cables with low pair-to-pair skew now available, SCSI-2 defines an additional form of synchronous data transfer with a 100-ns minimum period. This change pushes the SCSI-2 maximum data rate to 10 Mtransfers/s. Because of the tighter timing defined for the fast synchronous transfer mode, the SCSI-2 document does not recommend this mode's use with single-ended transceivers, even for short cable lengths.

Wide Data Bus

The last hardware addition allows use of wider SCSI data buses. In SCSI-1 the interface's data-bus portion was only eight bits wide. SCSI-2 allows two additional bus widths of 16 and 32 bits. Because of these different bus widths, SCSI-2 information transfer rates are usually specified in transfers/second rather than bytes/second. You determine the bytes/second rate by multiplying the SCSI data-bus width in bytes by the number of transfers per second on the interface.

The wide SCSI bus is currently defined as a secondary 68-signal B cable that can contain an additional three bytes of bus width. Because this B cable contains only the SCSI control signals necessary for information transfer, you must use it in conjunction with a 50-signal A cable for proper communications. Use of the wide SCSI option at the maximum 32-bit data-bus width, along with the fast synchronous transfer mode, provides data transfer operations as high as 40 Mbytes/s.

New Problems

SCSI users who require no more performance than they currently have need not make any changes to accommodate SCSI-2. The SCSI-1 standard's capabilities exist as a subset of SCSI-2. However, users experiencing an I/O bottleneck imposed by their current SCSI implementation must implement one or more of the new SCSI-2 features to get additional performance.

The vast majority of the SCSI-2 changes are not really changes at all, just better definitions of items documented in the existing SCSI-1 standard. The arbitration and parity capabilities carry over unchanged from the SCSI-1 standard. The connectors and cables are now well defined, with multiple component sources. The wide bus options require only a replication of existing data-path hardware, but the data-path hardware itself has undergone a significant change. The new fast synchronous data-transfer mode requires much tighter timing control than was necessary with SCSI-1.

If you plan on using the fast synchronous transfer capability, you must contend with differential transceivers, low-skew cables, three data-transfer modes (asynchronous, synchronous, and fast synchronous), and short set-up and hold times.

With all these challenges, it might seem doubtful whether anyone will use the fast synchronous transfer mode. However, a system analysis shows that implementing fast synchronous mode will cost less than any of the wide-bus implementations and still yield a burst data rate as high as 10 Mbytes/s with the standard 50-pin cables. This data rate is twice the maximum offered in SCSI-1 and equal to that offered by the competing Intelligent Peripheral Interface (IPI) in its 2-byte-wide standard implementation. The wide-bus requirement of a second cable also causes problems in weight, cost, and space. Many of the newer 3.5-in. peripherals just do not have room for an additional 68-pin connector.

SCSI Transfer Timing

Of the 23 different interface timing values specified in the SCSI-2 document, 11 apply directly to the different forms of information transfer. These values are:

    Cable skew delay                                10 ns
    Deskew delay                                    45 ns
    Synchronous REQ/ACK assertion period            90 ns
    Synchronous data hold time                      45 ns
    Synchronous REQ/ACK negation period             90 ns
    Synchronous/fast synchronous transfer period    Selectable
    Fast synchronous REQ/ACK assertion period       30 ns
    Fast synchronous cable skew delay               5 ns
    Fast synchronous deskew delay                   20 ns
    Fast synchronous data hold time                 10 ns
    Fast synchronous REQ/ACK negation period        30 ns

Of these 11 timing values, only the cable skew delay and the deskew delay apply to the asynchronous mode of information transfer. The remaining values apply to the two modes of synchronous data transfers. These timing values are all specified for the transmitting end of the SCSI interface. Sufficient margins are included in these values to allow proper interface operation under worst-case configurations of transmitters, receivers, and cables. The fast synchronous mode cuts many of the timing parameters by half or more from those of the synchronous mode. Because the interface must still operate over the same distance (up to 25 m), usage of fast synchronous mode demands tighter tolerances for many of the electrical components.

SCSI Transfers

All information transfers on the SCSI bus are controlled by the target device. The initiator cannot send or receive information until it first has received a valid REQ signal from the target device.

Asynchronous Mode Transfers

The interface timing for asynchronous transfers is common to all SCSI devices. Because MESSAGE, COMMAND, and STATUS transfers require support for this mode, all SCSI devices must support it. The interface timing for asynchronous operation varies slightly, depending on whether the SCSI initiator or SCSI target is sending information.

When the target sends information, it must first place the correct data on the SCSI bus, delay a minimum of 55 ns, then assert REQ. The 55-ns delay accounts for all possible data-transmission-time variations caused by transceivers, bias and termination networks, cables, and the information present on them. Because the data has been on the SCSI bus for at least this long prior to REQ's assertion, the initiator knows that the data present at its inputs is supposed to be valid when it receives the asserted REQ signal. Because no set-up time is guaranteed at the initiator, it should not assert its ACK signal to respond to the REQ signal until after delaying long enough to ensure that it (the initiator) can properly capture the data (Figure 1).

[Figure 1. Asynchronous Transfer Timing, Target Transmit]

When the initiator sends information, it must first wait until it receives the REQ signal from the target. This is necessary because the bus phase, which determines the information to be sent and the direction of the SCSI bus, does not begin until the REQ signal is asserted for that phase's first transfer. After receiving this first REQ, the initiator can place its data on the SCSI bus, delay a minimum of 55 ns, and respond by asserting ACK. The SCSI target must delay its negation of REQ until it has captured the data.

Because the initiator is not supposed to drive the SCSI bus until a transfer's first REQ occurs, the total delay for this first transfer is longer than the delay for a first transfer from the target to the initiator. To get around this longer delay, many initiators prestage the data for subsequent transfers. The initiator does this by driving the data bus with the next byte of information as soon as the REQ signal from the previous transfer goes Low (Figure 2).

[Figure 2. Asynchronous Transfer Timing, Initiator Transmit]

Synchronous Mode Transfers

The synchronous mode of information transfer is an option for SCSI-1 and SCSI-2 devices. This mode is only usable for data transfers and is not valid for MESSAGE, COMMAND, and STATUS transfers. SCSI target devices with the ability to use synchronous mode default to asynchronous transfer mode following either a SCSI reset or power-up sequence. To allow synchronous transfers to occur, the target device must first be placed into synchronous mode through a MESSAGE negotiation sequence with an initiator. This sequence sets both the minimum synchronous transfer period and a maximum REQ/ACK offset count.

The synchronous transfer period specifies the minimum period between successive leading edges of any two consecutive REQ pulses or ACK pulses while operating with synchronous transfers. If the negotiated period is less than 200 ns but not less than 100 ns, the data transfer is specified as operating in the fast synchronous mode and must meet the interface timing requirements specified for fast synchronous transfers. If the negotiated period is 200 ns or longer, the data transfer is specified as operating in the synchronous mode and must meet the interface timing requirements specified for synchronous transfers. If the negotiated period is ever set to zero, the data transfer mode reverts to asynchronous.

Unlike asynchronous transfers, where REQ and ACK are directly interlocked to each other to control the transfer's speed, synchronous mode data transfers impose no direct timing relationship between the target's REQ pulses and the initiator's ACK pulses. Instead, the initiator uses a count relationship, known as the REQ/ACK offset count, to slow the transfer. Maintained by both the initiator and the target, this count keeps track of the difference between the number of REQ and ACK pulses. When the count in the target device reaches the negotiated maximum value (Figure 3), the target device stops sending REQ pulses until the initiator brings the count below the maximum by returning an ACK pulse. A proper synchronous transfer requires that an equal number of REQ and ACK pulses be sent.

[Figure 3. REQ/ACK Offset Count]

The timing relationships of the REQ and ACK pulses and the data passed with them are specified by the two values used for asynchronous transfers and the values specifically identified for synchronous transfers. SCSI synchronous-mode transfers do not require a 50-percent duty cycle for REQ or ACK timing. When operated at or near the maximum transfer rate, the required interface timings approach this ratio, but at slower rates the duty cycle is allowed wide variability.

When the target sends information in synchronous mode, the target must place its data on the SCSI bus a minimum of 55 ns before asserting REQ. The target can then remove or change the data a minimum of 100 ns following REQ's assertion. REQ must remain active for a minimum of 90 ns and, once negated, cannot be reasserted for a minimum of 90 ns. In addition to these requirements, the minimum negotiated period must be maintained. A data transfer is completed when the target has no more data to send and the REQ/ACK offset count has returned to zero. As with the asynchronous transfer mode, the specified delays guarantee valid data at the initiator on REQ's leading edge and not before (Figure 4).

[Figure 4. Synchronous Transfer Timing, Target Transmit]

When the initiator sends information to the target, the initiator must wait until it receives the REQ signal from the target. Once the initiator receives REQ's leading edge, the REQ/ACK offset count in the initiator is no longer zero. So long as the initiator has data available to send and the REQ/ACK offset count is nonzero, the initiator can continue to send data to the target. The timing for this transfer (Figure 5) is like that of the transfer from the target described above. Synchronous mode provides valid data at the SCSI bus's receiving end during a 45-ns interval immediately following REQ/ACK reception.

[Figure 5. Synchronous Transfer Timing, Initiator Transmit]

Fast Synchronous Mode Transfers

Fast synchronous transfers function the same as synchronous transfers but with different timing parameters. These transfers only exist for REQ/ACK pulse periods shorter than 200 ns and longer than or equal to 100 ns. With fast synchronous transfers, the REQ/ACK minimum assert and negate times decrease to one third their previous size. Thus, SCSI-2 permits REQ and ACK pulses as short as 30 ns when operating in fast synchronous mode. Additionally, the minimum data set-up time prior to transmitting a REQ/ACK pulse decreases to 25 ns, and the data hold time after REQ/ACK's leading edge is only 35 ns. This timing provides data specified as valid, at the receive end of the SCSI bus, for only 10 ns immediately following REQ/ACK reception. See Figures 6 and 7 for fast synchronous mode timing diagrams.

SCSI-2 Data Path Design

Synchronous and asynchronous data transfers, 10-ns timing windows, fixed and variable delays, and programmable pulse widths are all necessary functions of a SCSI-2 data path. The simpler techniques used with SCSI-1's 45-ns data-availability windows are quite different from those needed to operate with SCSI-2's 10-ns windows. Fortunately, designing a data path that handles all possible SCSI-2 information transfer modes is not as difficult as it might appear. By carefully selecting some of the newer PLD and interface parts, you can implement the design quite efficiently.
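The negotiated-period thresholds and the bytes-per-second rule described in the preceding sections can be summarized in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and treating a period of zero as the asynchronous case simply follows the wording of this note.

```python
def transfer_mode(period_ns: int) -> str:
    """Classify a negotiated transfer period using the thresholds quoted in the text."""
    if period_ns == 0:
        return "asynchronous"          # the note says a zero period reverts to asynchronous
    if period_ns >= 200:
        return "synchronous"           # 200 ns or longer
    if period_ns >= 100:
        return "fast synchronous"      # at least 100 ns but less than 200 ns
    raise ValueError("periods shorter than 100 ns are not defined in the text")


def burst_rate_mbytes_per_s(period_ns: int, bus_width_bits: int = 8) -> float:
    """Bytes/second = bus width in bytes x transfers/second, reported in Mbytes/s."""
    transfers_per_s = 1e9 / period_ns
    return transfers_per_s * (bus_width_bits // 8) / 1e6


if __name__ == "__main__":
    for period, width in [(200, 8), (100, 8), (100, 32)]:
        print(period, width, transfer_mode(period), burst_rate_mbytes_per_s(period, width))
```

With these definitions, a 100-ns period on the 8-bit bus gives the 10 Mbytes/s figure quoted above, and the same period on a 32-bit bus gives 40 Mbytes/s.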
[Figure 6. Fast Synchronous Transfer Timing, Target Transmit]

[Figure 7. Fast Synchronous Transfer Timing, Initiator Transmit]

To successfully meet the needs of fast transfer rates and operability for a wide variety of peripherals, the SCSI-2 design must be capable of:

• Asynchronous data transfers at up to 5 Mtransfers/s
• Synchronous data transfers at a maximum transfer rate of 5 Mtransfers/s, with selectable lower transfer rates for peripherals that cannot operate at the maximum synchronous rate
• Fast synchronous data transfers at a maximum transfer rate of 10 Mtransfers/s, with selectable lower transfer rates between 10 and 5 Mtransfers/s for peripherals that cannot operate at the maximum fast synchronous rate, yet can operate faster than the maximum synchronous rate
• Operation with differential transceivers

Design Partitioning

Correct partitioning is probably the most critical part of achieving an efficient implementation of any SCSI design. When partitioning the design, list the necessary functions and, where possible, combine multiple functions into a single, more global function. A SCSI-2 data path must include these functions:

• SCSI interface transceivers
• Receive data register
• Transmit data register
• Receive data buffer
• Transmit data buffer
• REQ/ACK offset counter
• Asynchronous receive control
• Asynchronous transmit control
• Synchronous receive control
• Synchronous transmit control
• Fast synchronous receive control
• Fast synchronous transmit control

Although the transmit and receive control functions must operate with different timing values, the asynchronous, synchronous, and fast synchronous control operations for receive or transmit must perform the same function: receiving or transmitting information. Grouping the receive and transmit control functions into two separate and more generalized functional units reduces the design's complexity.

The necessary operations of the receive control function are:

• Clocking information into the receive data register
• Returning and removing the ACK signal at the proper time
• Writing the received data into the data buffer

The necessary operations of the transmit control function are:

• Reading the data from the data buffer and clocking the data into the transmit data register
• Returning and removing the ACK signal at the proper time
• Timing the necessary data set-up time
• Timing the necessary data hold time
• Timing the necessary ACK assertion time
• Timing the necessary ACK negation time

The data buffer function is another area where some consolidation can occur. Because the SCSI interface cannot send and receive data at the same time, a single common buffer is used for both transmit and receive functions. With these functions combined, the design now comprises seven functions:

• SCSI interface transceivers
• Receive data register
• Transmit data register
• Data buffer
• REQ/ACK offset counter
• Receive control
• Transmit control

SCSI Interface Transceivers

The SCSI interface supports both single-ended and differential transceiver types. The single-ended variety is most common today because it is relatively inexpensive and most commercial LSI SCSI controller chips incorporate this type. Single-ended transceivers suit cable lengths less than 6 m long and synchronous data rates of 5 Mtransfers/s or less.

SCSI devices using fast synchronous mode require differential transceivers. This transceiver type meets the electrical specifications of the EIA RS-485 standard. Operating from a single +5V supply, these transceivers can handle large swings in common mode noise, are guaranteed glitch free during power-up and -down operations, and have short-circuit and thermal-shutdown protection.
SCSI applications that use cables longer than 6 m also require differential transceivers. Although currently limited in the SCSI standard to operation at no more than 25 m, this transceiver type can drive signals much farther, as shown by the Intelligent Peripheral Interface usage of the same parts at 65 m.

Differential transceivers have one other advantage that is often overlooked. Because two differential signals determine the output state of each receiver, it is possible to achieve either active High or active Low TTL inputs and outputs by reversing the connection of the + and - differential signal lines on the SCSI bus. This programmable inversion can often eliminate the need for an inverter, and its associated delay, from many of the differential signal paths.

All existing SCSI applications that use differential transceivers place these parts external to the LSI SCSI controller chips. This practice is due primarily to the transceivers' power dissipation and partially analog operation. Until recently you could only get differential transceivers in singles: one transmitter and receiver in an 8-pin part. This packaging required 18 parts to implement the transceivers for a SCSI-1 bus. Due to the growing usage of these parts and improvements in power control technology, manufacturers now offer triple and quad transceiver parts. Some of these parts are designed specifically for the SCSI environment. To allow for the selection and arbitration sequences, for example, the transceivers have separate transmitter enables that allow individual transmitters to be turned on within the part. These transceivers meet all signal and skew requirements of the SCSI-2 fast synchronous mode.

Receive Data Register

The information from the transceivers is used for arbitration, selection, and reselection sequences, as well as information transfers. Of the transfer sequences, the fast synchronous transfer mode has the most stringent timing concerns.
Because of the fast synchronous mode's 10-ns data-availability window, the receive data register must have a very short set-up and hold time. The 74F823, a 9-bit D-type register, fits this application nicely. With a maximum set-up-and-hold-time total of 5.5 ns, the register leaves room for a 4.5-ns skew in clock timing for proper SCSI-2 operation. Because of this timing, the clock path to the receive data register can afford only a single gate delay. To meet the defined 10-ns data window and work with the 74F823, the single gate must have a minimum propagation delay of 3 ns and a maximum delay of 7.5 ns for the Low-to-High output transition. Depending on the gating function needed, parts such as the 74F08, 74F11, or 74F32 meet the timing window.

Transmit Data Register

The same part type, 74F823, also works on the transmit side of the interface. Because both the transmit and receive data registers are as wide as the full SCSI data bus, they implement a nearly seamless design.

Data Buffer

You can implement the data buffer for a SCSI interface in many ways. Host bus adapters that support data-caching functions might require a large piece of memory. Because the data cache usually exists several logic levels away from the physical SCSI interface, the HBA needs a smaller piece of memory to act as a "rubber band" between the SCSI target and the host or HBA memory. Using such a front-end buffer allows data to move quickly on the SCSI physical interface.

Because the SCSI interface is asynchronous to most of the logic activity in any HBA, the cleanest form of this front-end data buffer has an asynchronous interface, which permits the buffer to accept data as the data becomes available. Memories of this type fall into two categories: dual-port RAMs and FIFOs. The latter is an excellent fit because the information transferred over the SCSI interface is order dependent and does not contain memory-address information.
The FIFO eliminates any need for address-sequencing logic for moving information in and out of the data buffer. The data buffer must also be bidirectional to allow the HBA to send and receive information. You can create a. bidirectional FIFO using unidirectional FIFO memories with external bus-steering and control logic. Unfortunately, a bidirectional FIFO built in this manner requires many extra parts, power, and board space. A much better· choice is to use a monolithic bidirectional FIFO. Although most available bidirectional FIFOs are register programmable and require a· processor connection to control their operation, the Cypress CY7C439 bidirectional FIFO does not. This 2K x 9-bit FIFO supports the full 9-bit SCSI data bus, in addition to the pin programmability necessary for simple state machine control. REQIACK Offset Counter The HBA uses the REQ/ ACK offset counter (Figure 8) for synchronous and fast synchronous transfers. The counter keeps track of how many unanswered REQ pulses the HBA has received and must respond to. Both transmit and receive operations employ this logic. a 6-46 PLD-Based Data Path For SCSI-2 lated from the information in the SCSI-2 document. You can approximate the remaining values to arrive at a number accurate to within a power of 2 (1 counter bit). The cables specified for the SCSI interface use a solid dielectric whose Vp ranges from 60 to 66 percent. Additionally, the use of twisted-pair cables is strongly recommended to reduce crosstalk. When wires are twisted together to form a cable, longer wires are needed to reach a specific physical cable length. Depending on the amount of twist in the pairs, the longer wires can lengthen the physical signal from 2 to 30 percent. The cables specified for fast synchronous transmission have a very tight pair-to-pair signal skew specification that is partially achieved by having a very Just how big a counter is needed? 
Although it would be easy to pick an arbitrary number, you can calculate the size of the counter needed to keep the SCSI interface operating at its peak rate. This task requires a counter of N bits, where R outstanding REQ and ACK pulses can be active, such that R=2N-1. This same R valu~applies to the target device as the maximum REQIACK offset count. The value of R depends on the SCSI cable's length, the velocity of the cable signals' propagation (Vp), the fastest synchronous period to be used, the turnaround time of a REQ pulse to an ACK pulse in the initiator, and the recognition time for an ACK pulse in the target. Many of these values are specified or can be calcu- .. --- PAL22V10C a.aac II!I1..IN ...uN 1P..INt E!CI!D 1 1 1 JCr..JIIMI /tI' taUN 41 1 I IXMLDII 1 I D£MI I 1 III QI II! 1 I 1-----I~-7 L Figure 8. REQIACK Offset Counter 6-47 DII'I" PLD-Based Data Path For SCSI-2 2. Each generated ACK pulse generates a single count down. 3. The counter does not change if REQ and ACK are recognized simultaneously. Although the simplest approach would be to run the REQ signal from the receiver straight into the metastable-prevent circuit, this could cause problems in some systems. Because the REQ signal is allowed to be as narrow as 30 ns at the cable's transmitting end, this pulse might shrink under some conditions such that the received pulse is less than the 20-ns sample period (plus set-up and hold time). This situation could occur under worst-case conditions of intersymbol interference, cable imbalance, and bias distortion, causing the the REQ/ ACK offset counter to miss the REQ pulse and create a transmission error. To make sure the counter does not miss the REQ pulse, you need to add a D flip-flop, configured as an edge detector, just before the metastable-prevent circuit. This flip-flop forces the received REQ signal to remain at the counter input until it is recognized. 
loose twist in the signal pairs. In these cables, each line's internal physical signal length is approximately 2 to 10 percent longer than the external physical length. With a maximum external cable length of 25 m, the calculated one-way maximum signal delay through the cable is

t = (25 m + 2.5 m) * 5.56 ns/m
t = 153 ns

Because the SCSI target does not know that an ACK has occurred until the ACK propagates to the target's end of the cable, this one-way delay must be doubled to allow for the return path time.

In addition to cable delay, the transceivers themselves contribute a major portion of the total loop delay. The data sheet for a DS36954 quad differential transceiver lists a maximum delay value of about 20 ns for each transmitter and receiver that the REQ and ACK signals pass through. This delay adds 80 ns to the loop delay.

The next delays to consider are the turnaround and recognition times in the initiator and target. These delays must be approximated by examining the operations that must occur. Because both the REQ and ACK signals are asynchronous when they are received, they must go through a metastable-prevent circuit before they can be used. The faster forms of TTL-compatible logic can execute a metastable prevention procedure in less than 20 ns and still provide a reasonable MTBF. Following this procedure, a counter must operate on the signal and generate a status value, which determines whether the transfer can proceed or must suspend. For worst-case operations, a miss must be assumed for the first stage of the metastable-prevent circuit. This assumption yields a maximum REQ/ACK offset counter delay of 80 ns.

The REQ/ACK send delay is the last piece of the delay loop. The REQ/ACK send delay assumes the necessary data set-up time before generation of the REQ or ACK pulse to send the data. For the fastest transmission mode, this delay could be as long as 70 ns.

Adding these values yields a loop delay of

306 ns   Cable delay (2 x 153 ns)
 80 ns   Transceiver delay
 80 ns   Initiator REQ/ACK offset counter delay
 80 ns   Target REQ/ACK offset counter delay
 70 ns   Data set-up delay
616 ns   Total loop delay

Considering this figure and the 100-ns minimum period for fast synchronous transmission, achieving continuous data flow demands that there be at most six outstanding REQ pulses at the target. This task requires a minimum of a 3-bit REQ/ACK offset counter to maintain data streaming for fast synchronous transfers. This counter must operate under the following rules:

1. Each received REQ pulse generates a single count up.

Although you can build the REQ/ACK counter with a small handful of MSI/SSI parts, a superior approach is to use a single Cypress PAL22V10C PLD. This one part can include the entire 3-bit up/down counter, two single-count-per-pulse filters, and both REQ and ACK metastable-prevent structures. Because of the PAL22V10C's synchronous operation, the asynchronous edge-detector function still requires a single 74F74 flip-flop external to the PAL22V10C REQ/ACK offset counter. The equation list for this PLD appears in Appendix A.

Receive Control

Data reception from the SCSI bus is handled the same way for all modes of information transfer. This is possible because the information on the SCSI bus is always valid at REQ's leading edge for asynchronous, synchronous, and fast synchronous transfer modes. Every received REQ pulse can thus clock the receive data register. Even when the initiator sends data to the target, and therefore clocks invalid data into the receive data register, the next REQ pulse overwrites the invalid data.

It is necessary to delay the received REQ signal's leading edge by a gate delay that matches the 74F823 Received Data Register's set-up and hold times. The 74F08 fits nicely here with a 3-ns minimum delay on Low-to-High transitions and a 6.6-ns maximum delay. This delay still gives a 900-ps margin for fast synchronous transfers, judging from worst-case commercial specifications.

Because timing is so tight when doing fast synchronous transfers, take care to avoid destroying any designed-in margins with poor circuit layout. The standard FR4 substrates used for most circuit boards exhibit a dielectric constant of about 5. With this high number, circuit trace delay exceeds 2 ns/foot. To prevent information transfer errors, make sure the REQ signal's routing length to the receive data register is never more than 5 in. shorter or longer than any of the data-path signals.

Once information has been captured in the receive data register, it must be written into the data buffer. The I/O signal in this state indicates that the SCSI bus direction is set for input to the initiator. With these conditions met and REQ present, a FIFO write operation must occur. For a correct write to occur, the CY7C439 FIFO requires a pulse on the ISTBB pin with a minimum width of 30 ns.

With SCSI-1 peripherals, you could build a small asynchronous state machine to generate a write FIFO pulse of this minimum width; the state machine could utilize the false state of the REQ signal that occurs after each REQ pulse. If you use this method, you need some external logic to terminate the last write to the memory.

To support SCSI-2 peripherals that use fast synchronous transfers, you need a different method. Because the REQ pulse's transmitted false state for fast synchronous transfers can be as small as 30 ns, a pulse of this same width cannot be guaranteed at the receive end.
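The loop-delay budget worked out in the REQ/ACK offset discussion can be re-checked with a few lines of arithmetic. The following is only a sketch: the constant and function names are illustrative, the four-transceiver count is inferred from the 80-ns total and the 20-ns per-device figure, and the one-way cable delay is rounded to the 153 ns quoted in the text.

```python
# Hedged sketch: re-derives the loop-delay budget quoted in the text.
# All names are illustrative; none come from the handbook.

CABLE_LENGTH_M = 25.0        # maximum external cable length
TWIST_OVERHEAD = 0.10        # worst-case 10 percent extra internal length
PROP_DELAY_NS_PER_M = 5.56   # propagation delay figure used in the text
TRANSCEIVER_NS = 20          # per transmitter or receiver (DS36954 figure)
TRANSCEIVER_HOPS = 4         # assumption: four devices, giving the 80 ns total
OFFSET_COUNTER_NS = 80       # worst-case counter delay at each end
DATA_SETUP_NS = 70           # REQ/ACK send delay, fastest mode
FAST_SYNC_PERIOD_NS = 100    # minimum fast synchronous period


def one_way_cable_delay_ns() -> int:
    """One-way cable delay, rounded to whole nanoseconds."""
    internal_length_m = CABLE_LENGTH_M * (1 + TWIST_OVERHEAD)
    return round(internal_length_m * PROP_DELAY_NS_PER_M)


def loop_delay_ns() -> int:
    """Sum of the round-trip contributions listed in the text."""
    cable = 2 * one_way_cable_delay_ns()
    transceivers = TRANSCEIVER_HOPS * TRANSCEIVER_NS
    counters = 2 * OFFSET_COUNTER_NS
    return cable + transceivers + counters + DATA_SETUP_NS


def outstanding_req_pulses() -> int:
    """Whole REQ periods that fit inside one loop delay."""
    return loop_delay_ns() // FAST_SYNC_PERIOD_NS


def counter_bits_needed() -> int:
    """Bits needed to hold the outstanding-REQ count."""
    return outstanding_req_pulses().bit_length()


if __name__ == "__main__":
    print(one_way_cable_delay_ns(), loop_delay_ns(),
          outstanding_req_pulses(), counter_bits_needed())
```

Changing any single constant (for example a shorter cable) shows directly how much offset-counter depth a particular installation would need.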
You can choose among many methods for generating fixed-width pulses: delay lines, TTL delay elements (74LS31), strings of gates, counter chains, one-shots, and standard TTL parts feeding R-C circuits. Each circuit type has its inherent problems. One-shots are notorious for not triggering at all or mistriggering, lumped-constant delay lines have high field failure rates, and TTL delay elements vary too widely for a manufacturable design. In this case, however, a new type of reprogrammable CMOS synchronous state machine PLD, the Cypress CY7C361, can easily generate the required pulse.

The CY7C361 is a programmable state machine that allows multiple concurrent and interacting state machines to operate in the same part. Based on a Petri Net or token-passing philosophy, the CY7C361 can contain as many state machines as its state registers, inputs, and outputs support. This part contains 32 separate state registers that can operate at internal frequencies as high as 125 MHz. The CY7C361 also contains an internal clock doubler, which makes it unnecessary to generate and distribute frequencies upwards of 100 MHz in a TTL environment. Because this part is designed for interface operations, it also contains metastable-hardened input structures.

By operating from the same 50-MHz clock used with the REQ/ACK offset counter (doubled internally to 100 MHz), a CY7C361-based 4-state machine can generate a 40-ns pulse to write the information into the FIFO memory.

The state machine must account for the procedure used to govern writes to the FIFO. Although FIFO writes can occur even if the FIFO is half full, as determined by the FIFO status flags, the ACK signal that allows the interface to continue operation is held up until the host reads enough information from the FIFO to bring the FIFO state below half full. This governing procedure is used for asynchronous and synchronous operations.
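The 40-ns figure follows from the doubled clock and the number of active states. The sketch below models the write-strobe generator as an idle state plus four active states, which is an assumption made for illustration; it does not model the actual CY7C361 register types, and the function names are hypothetical.

```python
# Hedged sketch: behavioral model of a fixed-width strobe generator.
# It is not CY7C361 source; names and structure are illustrative only.

def clock_period_ns(external_mhz: float, doubled: bool = True) -> float:
    """Internal clock period after the optional on-chip doubler."""
    internal_mhz = external_mhz * (2 if doubled else 1)
    return 1000.0 / internal_mhz


def simulate_write_strobe(trigger_samples, active_states: int = 4):
    """Return the strobe level seen on each internal clock.

    trigger_samples holds one boolean per clock: True when REQ is
    present with the bus direction set to IN. The machine waits in
    state 0, then walks through the active states and returns to 0.
    """
    state = 0
    strobe = []
    for trigger in trigger_samples:
        if state == 0:
            state = 1 if trigger else 0
        elif state < active_states:
            state += 1
        else:
            state = 0
        strobe.append(state != 0)
    return strobe


if __name__ == "__main__":
    period = clock_period_ns(50.0)
    levels = simulate_write_strobe([False, True, False, False, False, False, False])
    print(f"strobe width = {sum(levels) * period:.0f} ns")
```

With a 50-MHz external clock the strobe stays active for four 10-ns periods, which clears the 30-ns minimum pulse width the FIFO requires.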
For synchronous operations, data continues to be written into the FIFO even after reaching the half-full state. Although ACK pulses are no longer returned to the target when the FIFO is at or above half full, the FIFO writes are only suspended when the REQ/ACK offset counter in the target reaches its maximum and the target stops sending REQ pulses.

Figure 9 shows the simple state diagram for writing information into the FIFO. The diagram includes four active states (1 - 4) and a reset state (0). When in the reset state, the CY7C361 continuously watches for a REQ signal to occur while the SCSI bus's I/O signal is asserted (SCSI bus direction = IN). When this condition occurs, the state machine advances to state 1 and continues through states 2, 3, 4 and back to reset. The CY7C361 implements this state machine using three of the 32 available state registers, labeled here as W0, W1, and W2. State registers W1 and W2 also serve as FIFO strobe-delay states for FIFO read operations.

Figure 10 shows, through three FIFO write cycles, how the CY7C361's state registers change to achieve a fixed 40-ns delay. The outputs of the three state registers are logically ORed together in the CY7C361. Unlike many other register-based state machines, the CY7C361's internal design allows you to OR together adjacent but nonoverlapping state-register outputs to generate a glitch-free output signal.

Next to each state register label in Figure 10 is either an s, t, or w. These letters represent which of the three possible CY7C361 state register configurations is used for that specific state register. An s (start) specifies that the state register becomes active for exactly one clock cycle each time the required input conditions are met. A t (toggle) specifies that the state register changes state on each clock cycle while the required input conditions are met. The t-type state registers allow very efficient construction of counters.
The last type of state register, w (wait for terminate), is set only by a carry-in signal generated in the immediately preceding state register; the w-type state register is cleared when its required input conditions are met.

Figure 9. FIFO Write State Register Timing

Transmit Control

Transmitting information to the SCSI target is by far the most complex function. The procedure requires controlled interval timing for reading data from the FIFO data buffer, placing the data in the transmit data register and on the SCSI bus, and generating multiple-width ACK pulses. Because of these operations' controlled timing and concurrency, the CY7C361 is again called into service. The earlier application of this part used three of the 32 available state registers. The transmit function uses many of the part's remaining states to generate the necessary delays for asynchronous, synchronous, and fast synchronous transfers.

For the SCSI transmit cycles to occur at the maximum rate, the HBA must stage or pipeline data so that the data is immediately available for transmitting. This operation requires that the HBA handle concurrent asynchronous events. As one transfer is occurring on the SCSI bus, the next piece of information must be read out of the FIFO and be available for the next bus transfer. These FIFO read functions operate in two very similar sequences: one for asynchronous SCSI writes and one for synchronous SCSI writes.

Figure 11 shows the state diagram for FIFO read operations. This state diagram has a similar reset state (0) and the same delay states (2, 3, and 4) as the FIFO write state machine. The two entry states are for asynchronous (1) and synchronous (7) SCSI write operations. For asynchronous SCSI writes, the FIFO read starts when synchronous operations are not enabled, data is available in the FIFO, the bus direction is set to out, and a FIFO read is not currently active.
For synchronous SCSI writes, the FIFO read starts when the REQ/ACK offset counter is non-zero, synchronous operations are enabled, data is available in the FIFO, the bus direction is set to out, and a FIFO read is not currently active.

The FIFO read operation uses five more state registers in the CY7C361. The state-register timing diagram in Figure 12 shows these new states:

R0    starts the FIFO read for asynchronous SCSI writes
RS0   starts the FIFO read for synchronous SCSI writes
R1    serves as the FIFO strobe signal (ORed with state registers W0, W1, and W2) and notes internally that a FIFO read is currently active
ES    ends the FIFO read when the minimum delay has passed (delay states 2, 3, and 4) and the transmit data register contains no valid data
DATA  specifies that the transmit data register contains valid data

Figure 10. FIFO Load State Diagram
Figure 11. FIFO Read State Diagram

Figure 12 shows two sequences: R0 starts an asynchronous FIFO read, and RS0 starts a synchronous FIFO read. In normal operation, consecutive FIFO read cycles are of the same type and overlap with data being available in the transmit data register. Because the FIFO output does not change (following the minimum output delay time) until the FIFO strobe is removed, this strobe's trailing edge is used to directly clock the data from the FIFO into the transmit data register.

With the FIFO data now in the transmit data register and driven out onto the SCSI bus, the HBA must generate specific and precise delays to allow the ACK signal to be sent at the proper time. From the time that the data clocks into the transmit data register, a minimum of 60 ns must be timed for asynchronous SCSI writes, 40 ns for fast synchronous SCSI writes, and 90 ns for synchronous SCSI writes. To create these delays and permit programmable synchronous data rates slower than the maximum allowed, part of the CY7C361 is used to create a loadable delay counter.
This counter operates as a hardware subroutine within the CY7C361, providing all the necessary delays for ACK timing. For asynchronous SCSI writes, the state machine calls the delay routine as soon as information is placed in the transmit data register. When the timer times out (returns to zero), the ACK signal is sent. For synchronous SCSI writes, the state machine calls the delay routine both to set and remove the ACK signal.

Figure 12. FIFO Read State Register Timing

The CY7C361 implements the delay hardware as a 4-bit count-up toggle counter, which provides 15 different synchronous timing periods ranging from 100 to 380 ns. Table 1 lists the values that load into the counter to provide these periods. The load value for the counter enters the CY7C361 via four input pins. When the delay subroutine is called, the signal levels on these four pins load into four state registers, which in turn load into the counter.

Figure 13 shows the state transition diagram for the delay counter. From the reset state (0) the delay counter enters a load state (L). Because the delay counter has 15 possible start points, the load state must have 15 possible exits. When the counter has reached its maximum value (1111), the counter enters an exit state (X) to toggle the ACK signal on or off.

Figure 13. Loadable Delay Counter State Diagram

This loadable delay counter uses nine more state registers in the CY7C361. Four of these state registers (CE0, CE1, CE2, and CE3) serve as counter enable bits that load the four toggle state registers (CT0, CT1, CT2, and CT3). The ninth state register is used for the exit state (CTX).

Two count sequences appear in Figure 14. The first sequence shows the shortest timing interval, created by loading 1111 into the counter. The second sequence shows a longer delay, which results from loading 0101. Because the delay counter has overhead states, the shortest interval the counter can time is 30 ns. To get the widest range of synchronous transfer periods from the delay counter, a fill state is generated at the start of each ACK cycle to stretch this minimum interval to 40 ns. The 40-ns interval determines the shortest possible ACK Low interval when operating in fast synchronous transfer mode with a 100-ns period.

Figure 14. Delay Counter State Register Timing

To generate the ACK delay for asynchronous mode, the SCSI specification for writes requires two more delay states to get 60 ns. This added delay is achieved by setting the delay counter inputs to 1101.

Figure 15 shows the state diagram for asynchronous SCSI write operations. The first active state (1) is the fill state, which the state machine enters as soon as the FIFO read completes and valid data is in the transmit data register. The delay subroutine call appears as a single state (2) that loops until the delay is complete. Once the delay counter times out, the state machine advances to state 3, where ACK is transmitted. The state machine remains in this state until the REQ signal is removed. This clears the ACK signal and returns the state machine to the reset state (0).

Figure 15. Asynchronous Write State Diagram

The state register timing for this sequence appears in Figure 16. This timing diagram shows not only the state registers used for generating the ACK signal, but all the state registers used in the CY7C361. You can therefore see the interaction of the FIFO read, delay counter, and asynchronous ACK control state machines. Figure 16 shows three transfers.

The ES state, which ends the FIFO read operation, starts the ACK delay state machine. As soon as this state machine is started, the next FIFO read is also started. The ACK cycle is terminated by the ATA state register, which monitors the REQ and ACK signals. When the ACK cycle completes, the next FIFO data is clocked into the transmit data register, and another ACK cycle is started.

The state diagram for synchronous transfers appears in Figure 17. This sequence starts the same as an asynchronous transfer, except that the termination of the first ACK delay starts a second delay to remove the ACK signal. When this second delay times out, the ACK state ends. Meanwhile, the ongoing FIFO read operation has put data into the FIFO. The end of the ACK state prompts the FIFO read to complete and start the next ACK cycle. Two fill states, 4 and 5, are ORed with the ACK state to meet the timing requirements of synchronous transfers. By carefully selecting the data-enable, set-up, hold, and ACK duty cycle, you can use the same state machine for synchronous and fast synchronous transfers.

Figure 17. Synchronous Write State Diagram

Figure 18 shows the state register timing for three transfers in fast synchronous mode, with a 100-ns data period. Compare these transfers with Figure 19, which shows the state register timing for two synchronous transfers, with a 200-ns data period. The only difference between the two types of transfers is the amount of time spent in the delay counter. Additionally, the FIFO read portion of the waveforms shows that the synchronous FIFO read state register, RS0, starts the FIFO read instead of the R0 state register used with asynchronous SCSI writes.

As configured thus far, the CY7C361-based state machine generates the FIFO strobe signal for FIFO read and write operations and the ACK signal for asynchronous, synchronous, and fast synchronous SCSI writes. As for SCSI read operations, the HBA generates the ACK signal for asynchronous reads by returning the REQ signal as ACK. For synchronous reads, however, the HBA must use a different mechanism.
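The load values in Table 1 appear to follow a simple linear pattern: loading 1111 gives the 100-ns period, and each step down in load value adds 20 ns. The sketch below encodes that observation; the formula is inferred from the table rather than stated in the text, and the function names are hypothetical. Table 1 truncates some data rates (for example 5.55 rather than 5.56), so computed rates can differ in the last digit.

```python
# Hedged sketch: relationship between delay-counter load value,
# synchronous period, and data rate, inferred from Table 1.

def sync_period_ns(load_value: int) -> int:
    """Synchronous period for a 4-bit load value (0000 is illegal)."""
    if not 1 <= load_value <= 15:
        raise ValueError("load value must be 0001 through 1111")
    return 100 + 20 * (15 - load_value)


def data_rate_mbytes_per_s(period_ns: int) -> float:
    """One byte per period on the 8-bit data path."""
    return 1000.0 / period_ns


def transfer_mode(period_ns: int) -> str:
    """Table 1 labels periods below 200 ns as fast synchronous."""
    return "Fast Synchronous" if period_ns < 200 else "Synchronous"


if __name__ == "__main__":
    for load in range(15, 0, -1):
        period = sync_period_ns(load)
        print(f"{load:04b}  {period:3d} ns  "
              f"{data_rate_mbytes_per_s(period):5.2f} MB/s  "
              f"{transfer_mode(period)}")
```

Running the loop reproduces the 15 rows of the table from the shortest period to the longest.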
The ACK sequence needed for synchronous SCSI reads has the same timing as the ACK generated for SCSI writes, except that the initiator places no data on the SCSI bus. Because the CY7C361 outputs do not control the enables of the SCSI transceivers, or the receive and transmit data registers, the same ACK control state sequence used for synchronous SCSI writes can also serve for synchronous SCSI reads.

The return of an ACK on a SCSI read is based on the FIFO having room rather than the HBA having data available. Thus, a new state register must be added to start the ACK cycle. Additionally, a signal is needed to decrement the REQ/ACK offset counter. Although you might expect to use the output ACK signal for this purpose, it does not occur early enough in the cycle to count down the REQ/ACK offset counter before the next ACK cycle is ready to start.

Figure 20 shows the state register timing for a fast synchronous SCSI read operation. The DOWN state

Table 1. Synchronous Data Rates

Load Value   Synchronous Period   Data Rate    Data Transfer Mode
1111         100 ns               10.0 MB/s    Fast Synchronous
1110         120 ns               8.33 MB/s    Fast Synchronous
1101         140 ns               7.14 MB/s    Fast Synchronous
1100         160 ns               6.25 MB/s    Fast Synchronous
1011         180 ns               5.55 MB/s    Fast Synchronous
1010         200 ns               5.00 MB/s    Synchronous
1001         220 ns               4.54 MB/s    Synchronous
1000         240 ns               4.16 MB/s    Synchronous
0111         260 ns               3.84 MB/s    Synchronous
0110         280 ns               3.57 MB/s    Synchronous
0101         300 ns               3.33 MB/s    Synchronous
0100         320 ns               3.12 MB/s    Synchronous
0011         340 ns               2.94 MB/s    Synchronous
0010         360 ns               2.77 MB/s    Synchronous
0001         380 ns               2.63 MB/s    Synchronous

Figure 16. Asynchronous Write State Register Timing

The solution is to construct a small latch external to the CY7C361. The latch allows the ACK signal to be generated as soon as possible, but only transmitted on the SCSI bus after the REQ signal is received. The latch's output prompts the CY7C361 to terminate the current ACK when the CY7C361 sees an external ACK present and REQ not active.

Now another SCSI possibility must be considered. When the HBA receives information on the SCSI bus in asynchronous mode, the ACK signal is just a repeated REQ signal. The repeated REQ must still be qualified by the half-full signal from the FIFO. This extra qualifier requires use of another latch to handle the following sequence: If an ACK is returned when the FIFO half-full state is reached, the ACK being sent remains active until REQ is removed, but another ACK is not sent until the half-full flag changes and REQ is present.

This same circuit must also give synchronous transfers a bypass path for generating an ACK pulse that is not tied to REQ. By gating the latch with the SYNC signal, which specifies synchronous operation, you can disable the latch for synchronous operations and enable a different path at the same time.

These complex gating functions are again an excellent fit for a PLD. Because the ACK signal is part of the asynchronous transfers' round-trip path, this application needs a fast part to limit the delays and skew between data and clock.
The best choices are probably parts running at 10 ns or faster, such as the members of the PAL18G8, PAL20G10, or PAL22V10C families. Although many of these parts are only available with active-Low outputs, you can correct the signal polarity at the SCSI transceiver by reversing the differential signal lines. Figure 21 shows the necessary gating function for the ACK signal.

Figure 20. External ACK Gating/Latch
Figure 21. Fast Synchronous Read State Register Timing

Putting It All Together

All the necessary SCSI-2 data-path functions are now accounted for. Interconnecting these pieces as shown in the overall schematic (Figures 22 and 23) completes the data-path design. The first sheet of this schematic (Figure 22) details the physical SCSI interface connection and interface transceivers. The second sheet (Figure 23) contains the data-path logic functions.

Although these two pages form a very compact and fast data path, additional functions are needed to complete the host bus adapter. For example, you need control circuitry to operate the transceiver enables; read and write the FIFO on the host bus side; monitor the SCSI bus and change the FIFO direction when necessary; control the selection/reselection sequences; and similar operations.

This data-path design meets its performance goals with a minimal amount of circuitry. Because much of it is implemented in PLD-type devices, you do not have to redesign the HBA to handle almost any change to SCSI-2 or future SCSI versions that affects interface timing; instead, you can simply reprogram the existing parts. This PLD flexibility provides the faster time to market necessary to remain competitive in today's markets.

Figure 22. SCSI-2 Host Bus Adapter schematic, sheet 1 of 2: physical interface
Figure 23. SCSI-2 Host Bus Adapter schematic, sheet 2 of 2: data path and control
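The REQ/ACK offset counter listed in Appendix A can be summarized as a behavioral model: one count up per REQ pulse, one count down per ACK pulse, no change when both arrive together, and a clear whenever synchronous mode is off. The sketch below is an approximation under stated assumptions: the class and method names are hypothetical, the registered pipeline delay of the real PLD is ignored, and the wrap at 3 bits mirrors the fact that the equations contain no saturation terms.

```python
# Hedged sketch: behavioral model of the 3-bit REQ/ACK offset counter.
# Not a translation of the PLD source; pipeline latency is not modeled.

class ReqAckOffsetCounter:
    def __init__(self) -> None:
        self.count = 0
        self._prev_req = False
        self._prev_ack = False

    def clock(self, req: bool, ack: bool, sync: bool = True) -> int:
        """Advance one clock and return the new count."""
        if not sync:
            # Every equation term is gated by SYNC, so the counter clears.
            self.count = 0
            self._prev_req = False
            self._prev_ack = False
            return self.count
        # Single-count-per-pulse filters: react to the first clock only.
        up = req and not self._prev_req
        down = ack and not self._prev_ack
        self._prev_req, self._prev_ack = req, ack
        if up and not down:
            self.count = (self.count + 1) % 8
        elif down and not up:
            self.count = (self.count - 1) % 8
        return self.count

    @property
    def diff(self) -> bool:
        """DIFF output: true while any REQ is still unanswered."""
        return self.count != 0


if __name__ == "__main__":
    counter = ReqAckOffsetCounter()
    for req, ack in [(True, False), (True, False), (False, False), (False, True)]:
        print(counter.clock(req, ack), counter.diff)
```

A long REQ pulse that spans several clocks still produces a single count, which is the behavior the single-count-per-pulse filters are meant to guarantee.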
Appendix A. PLD Toolkit Source Code for the REQ/ACK Offset Counter

C22V10; {SCSI2DIF.CYP}

{***********************************************************************
*
* difference counter - keeps track of how many REQUEST pulses
* have been received vs. how many ACKNOWLEDGE pulses have been
* sent. The single output DIFF, is used only during synchronous
* data transfers. When DIFF = 1 there exists a received REQUEST
* pulse that has not been responded to by an ACKNOWLEDGE pulse.
*
* The circuit contains two metastable prevent circuits to
* capture the REQUEST and ACKNOWLEDGE signals. These signals
* are filtered to be enables to a 3 bit up down counter. These
* signals can occur at the same time. If they do the counter
* should not count. Only one count cycle is allowed per enable.
*
************************************************************************}

CONFIGURE;

CLOCK(node=1),          {50 MHz system clock (20ns period)}
REQ(node=2),            {SCSI Request signal, used for count up}
SELECTED(node=3),       {used to reset the registers and counter}
!CT_DOWN(node=4),       {down count pulse from CY7C361}
SYNC(node=5),           {synchronous operation enabled}

{outputs}
DIFF(node=14,noreg,ninv),
Q2(ninv),
Q1(ninv),
Q0(ninv),
DOWN(ninv),
DOWN_INH(ninv),
ACK_IN(ninv),
/UP,
UP_INH(ninv),

EQUATIONS;

REQ_IN   = REQ & SYNC;
UP_INH   = REQ_IN;
UP       = REQ_IN & !UP_INH;

ACK_IN   = !CT_DOWN;
DOWN_INH = ACK_IN;
DOWN     = ACK_IN & !DOWN_INH;

{3 bit counter}
Q0 = SYNC * UP & !DOWN & !Q0
   # SYNC * DOWN & !UP & !Q0
   # SYNC * UP & DOWN & Q0
   # SYNC * !UP & !DOWN & Q0;

Q1 = SYNC * UP & !DOWN & !Q1 & Q0
   # SYNC * UP & !DOWN & Q1 & !Q0
   # SYNC * DOWN & !UP & !Q1 & !Q0
   # SYNC * DOWN & !UP & Q1 & Q0
   # SYNC * UP & DOWN & Q1
   # SYNC & !UP & !DOWN & Q1;

Q2 = SYNC * UP & !DOWN & !Q2 & Q1 & Q0
   # SYNC * UP & !DOWN & Q2 & !Q1
   # SYNC * UP & !DOWN & Q2 & !Q0
   # SYNC * DOWN & !UP & !Q2 & !Q1 & !Q0
   # SYNC * DOWN & !UP & Q2 & Q1
   # SYNC * DOWN & !UP & Q2 & Q0
   # SYNC * UP & DOWN & Q2
   # SYNC & !UP & !DOWN & Q2;

DIFF = Q2 # Q1 # Q0;

Appendix B. PLD Toolkit Source Code for ACK and FIFO Strobe Control

CY7C361;

{****************************************************************
*
* SCSI2 FIFO and ACK timing controller. Supports asynchronous
* writes and synchronous and fast synchronous reads and writes
*
*****************************************************************}

CONFIGURE;

{reset control}
/RESET(node=3,ireg),        {low asserted reset, single reg}
GLBRST(node=64),            {global reset control node}

{clock control}
CLKIN(node=4),              {system clock}
CLKDB(node=74,dbl_clk),     {enable clock doubler}
IENA(node=29),              {input clock enable for nodes 3,5,6,9}
IENB(node=30),              {input clock enable for nodes 10,11,12,13}
IENC(node=31),              {input clock enable for nodes 1,2,14,15}

{inputs}
ZERO(node=73),              {internal tie point for enables}
REQ(node=5,iireg),          {asynchronous SCSI request signal}
ACK_IN(node=6,iireg),       {gated ACK output signal, latched by REQUEST}
IO_IN(node=10,ireg),        {SCSI bus set to 0=out, 1=in}
DIFF(node=9,ireg),          {difference count <> 0}
HF(node=11,iireg),          {room for data in FIFO - write}
EF(node=12,iireg),          {data in fifo - read}
SYNC(node=13,ireg),         {synchronous transfer mode}

{counter inputs}
C0(node=1,ireg),            {LSB (bit 0) of ACK length counter}
C1(node=2,ireg),            {bit 1 of ACK length counter}
C2(node=14,ireg),           {bit 2 of ACK length counter}
C3(node=15,ireg),           {MSB (bit 3) of ACK length counter}
                            {all low is an illegal value for C0,C1,C2,C3}

{outputs}
/ACK_OUT(node=16),          {ACKNOWLEDGE signal, used for asynchronous SCSI
                             writes and synchronous SCSI reads/writes}
/CT_DOWN(node=17),          {count down pulse for DIFF counter}
/FIFO_STRB(node=18),        {FIFO strobe for SCSI writes, FIFO reads}

{state nodes}
{FIFO Write State Machine}
W0(node=32,start),          {starts FIFO write sequence}
W1(node=33,tog),            {delay state for FIFO strobe}
W2(node=36,tog),            {delay state for FIFO strobe}

{FIFO Read State Machine}
RS0(node=34,start),         {start of sync FIFO read}
R0(node=37,start),          {start FIFO read strobe}
R1(node=35,tog),            {stays active until transmit register is marked
                             as empty, uses delay states from FIFO write
                             machine}
ES(node=38,start),          {ends FIFO read strobe and sets data in output
                             latch}
DATA(node=39,cin,term),     {data in output latch}
ACKS1(node=43,start),       {start first ACK delay}
ACK(node=47,tog),           {ACK active}
ATA(node=42,start),         {ACK Terminate, Async}
ACKS2(node=47,start),       {start second ACK delay}
ACKA(node=40,start),        {synchronous ACK stretch 1}
ACKB(node=41,start),        {synchronous ACK stretch 2}
AS(node=44,start),          {start ACK for sync SCSI read}
DOWN(node=45,cin,term),     {count down pulse for SCSI reads}

{4 bit loadable counter}
CE0(node=54,start),         {load counter bit 0}
CT0(node=56,tog),           {counter bit 0}
CE1(node=57,start),         {load counter bit 1}
CT1(node=58,cin,tog),       {counter bit 1}
CE2(node=59,start),         {load counter bit 2}
CT2(node=60,cin,tog),       {counter bit 2}
CE3(node=61,start),         {load counter bit 3}
CT3(node=62,cin,tog),       {counter bit 3}
CTX(node=63,start),         {terminal count reached (1111)}

EQUATIONS;

{CONTROL}
GLBRST = RESET;             {global reset set to RESET signal}
IENA = /ZERO;               {allow input clocks}
IENB = /ZERO;               {allow input clocks}
IENC = /ZERO;               {allow input clocks}

{STATES}
{start} W0 = IO_IN * REQ;
        {W0 starts all FIFO write sequences when a REQuest is received
         with the bus direction set to IN, used as part of the FIFO STBX
         signal for FIFO writes}

{tog} W1 = /W0 * /W1 * /W2 * /R0 * /RS0;
        {W1 is triggered by W0 and continues to toggle until W1 and W2
         return to 0, used as part of the FIFO STBX signal for FIFO writes}

{tog} W2 = W1;
        {W2 is triggered by W1 for two clocks, used as part of the FIFO
         STBX signal for FIFO writes}

{start} RS0 = /IO_IN * SYNC * EF * DIFF * /R1;
        {synchronous FIFO read started when the bus is in the proper
         direction, synchronous mode is active, data is in the FIFO, at
         least one ACK is pending (DIFF) and a read is not in progress}

{start} R0 = /IO_IN * /SYNC * EF * /R1;
        {asynchronous reads are started when the bus is in the proper
         direction (OUT), synchronous mode is not active, there is data in
         the FIFO (EF) and a read is not in progress (R1)}

{tog} R1 = /R0 * /RS0 * /ES;
        {set a read in progress with R0 or RS0, end same with ES when read
         is complete and no data is in the output latch, used as the FIFO
         STBX signal for FIFO reads}

{start} ES = R1 * /W1 * /W2 * /DATA;
        {end read strobe and sets DATA in output latch}

{cin,term} DATA = ACK /CTX * /ATA;
        {data in latch set when FIFO read is ended and cleared by end of
         ACK cycle}

{start} AS = IO_IN * HF * DIFF * /DOWN * /ACK;
        {start new ACK cycle if DIFF<>0 and cycle not active with room in
         FIFO}

{cin,term} DOWN = CTX;
        {end counter down count}

{start} ACKS1 = /ES * /AS;
        {start the delay counter for the leading edge of the ACK signal}

{tog} ACK = /CTX * /ATA;
        {turn ACK on and off}

{start} ATA = ACK_IN * /SYNC * /REQ;
        {ACK Terminate Async is triggered when an external ACK is present
         and REQUEST has dropped, this occurs a minimum of 3 clocks after
         ACK is set due to metastable prevent pipeline delays. One more
         cycle occurs to remove ACK and DATA}

{start} ACKS2 = SYNC * /ACK * CTX;
        {used only in synchronous modes, starts delay counter for
         terminate of ACK}

{start} ACKA = CTX * ACK;
        {lengthen the ACK signal by two clock periods to allow data to
         change at the trailing edge of output ACK signal}

{start} ACKB = ACKA;
        {lengthen the ACK signal by 2nd clock}

{start} CE0 = C0 /ACKS1 * /ACKS2;   {latch bit 0 of counter for preset}
{start} CE1 = C1 /ACKS1 * /ACKS2;   {latch bit 1 of counter for preset}
{start} CE2 = C2 /ACKS1 * /ACKS2;   {latch bit 2 of counter for preset}
{start} CE3 = C3 /ACKS1 * /ACKS2;   {latch bit 3 of counter for preset}

{tog} CT0 = /CT0 * /CT1 * /CT2 * /CT3 * /CE0;
        {toggle bit 0 of counter when any bit set}

{cin,tog} CT1 = CT0;
        {toggle bit 1 of counter when bit 0 is set}

{cin,tog} CT2 = CT0 * CT1;
        {toggle bit 2 of counter when bits 1 and 0 are set}

{cin,tog} CT3 = CT0 * CT1 * CT2;
        {toggle bit 3 of counter when bits 0, 1, and 2 are set}

{start} CTX = CT0 * CT1 * CT2 * CT3;
        {counter has completed count up to 1111}

{OUTPUTS}
/C0         {disable output driver to allow C0 as input}
/C1         {disable output driver to allow C1 as input}
/C2         {disable output driver to allow C2 as input}
/C3         {disable output driver to allow C3 as input}

ACK_OUT   = /ACK * /ACKA * /ACKB;       {ACKNOWLEDGE signal}
FIFO_STRB = /W0 * /W1 * /W2 * /R1;      {FIFO read/write strobe}
CT_DOWN   = /AS * /DOWN * /R1;          {count down input for difference counter}

PAL Design Example: A GCR Encoder/Decoder

Quarter-inch tape cartridges are used extensively to back up or archive data from hard disks. Most drives are operated in a continuous or streaming mode (for reasons discussed later). Data is recorded at 10,000 FRPI (flux reversals per inch) in a serpentine manner on seven to 14 channels. The tape moves at 30 to 90 ips (inches per second), and the error rates achieved are one in 10^9 or 10^10.
A cartridge holds 2000 to 3000 feet of O.OOl-inchthick tape and stores 20 to 80 Mbytes of data. This application note describes the procedure used to encode/decode serial digital data for recording/reading from one-quarter-inch magnetic tape. The design presented here uses a Cypress CMOS PAL C 16R6 to implement the logic. Digital data encoding and decoding is often used to increase the reliability of data transmission and storage. One such area is the transformation between data stored on one-quarter inch magnetic tape and serial digital data. A Little History A Typical System The recording format and the Group Code Recording (GCR) code used in this design have been adopted and incorporated in a series of standards. The standards are set by the QIC (Quarter Inch Cartridge) Committee, composed of manufacturers and users of quarter-inch tapes and cartridges. The committee's purpose is to ensure compatibility between manufacturers and reliability to end users. Figure 1 shows a block diagram of a typical tape drive system. The interface with the host (or host adapter) is bidirectional. The interface has a byte-wide data path and 10 to 20 control signals, depending upon the interface standard. Data rates are 300 KBytes/s to 1 MBytes/s. The formatter or tape controller performs serial/parallel conversion and encoding/decoding as well as error checking; in some cases, the data is also error corrected. Control is usually provided by· a state machine, which PULSE OtT. TAPE rORIo4ATIER OR CONTROLLER rORIo4ATIER DRIVE QIC-50 QIC-59 QIC-02 SCSI IPI Interface Standards Interface Standards QIC-24/36 Figure 1. Typical Tape Drive System 6-63 HOST 1+-...,...-+1 ADAPTER HOST HOST GCR Encoder/Decoder o o o o o o '--------'Rl_----' o o o o READING FRO... TAPE Figure 2. GCR Signal handles the handshaking with the host as well as control of the tape. Data is written in blocks of various lengths (depending upon the standard), and a read-after-write check is usually performed. 
Buffer storage of at least two blocks of data is usually provided using static RAMs, FIFOs, or some combination of the two.

The drive electronics include digital signals for controlling and sensing the tape motion and analog signals for the read and write paths. The interface between the drive electronics and the formatter is digital and varies depending on the standard used.

The data separator locks to the data by finding the phase difference between its own frequency and the peak detector's data output, then adjusting a voltage-controlled oscillator (VCO) until the VCO's frequency equals that of the data. The reference clock's frequency must be at least twice (2f) that of the highest frequency to be read (f). The PLL is synchronized to the 2f reference frequency when not in use.

Before a block of data is recorded, a string of Ones, called the preamble, is recorded. When the command to read is given, the 2f reference frequency is removed from the data separator, and the signal from the peak detector is applied. The PLL then attempts to lock to the preamble, a procedure called getting bit sync. Just after the preamble, a code violation is recorded so that the formatter can recognize where valid data begins. The detection of the code violation is referred to as obtaining byte sync.

PLLs typically exhibit frequency and phase offsets during preamble acquisition. Phase errors also occur after lock, during the reading of the data field. Differences in tape speed during record and playback (as well as from unit to unit) result in frequency differences between the 2f reference and the data read from the tape. Random phase errors caused by noise, intersymbol interference (bit crowding), timing errors, and other transients might also knock the PLL out of lock.

The data separator's PLL is susceptible to these errors because it must satisfy two conflicting conditions: it must lock quickly enough to detect the preamble, but it must not over-correct phase for a single misaligned bit.
Strings of Zeros cause the PLL's phase to shift. If the shift is larger than the bit window, an error occurs. The QIC-24 standard calls for up to a 37-percent bit-shift tolerance, which means that the data separator must be able to recognize a One (flux reversal) that deviates ±18.5 percent from its expected time position without causing a data error. To achieve this performance, a 4-bit binary nibble is encoded into a 5-bit GCR code word, which is written onto the tape.

Reading and Writing on Tape

To write on the tape, a current of 100 mA or less is used to change the direction of magnetization. To read from the tape, a coil of wire (the read head) is held against the tape; changes in direction of the tape's magnetic flux induce a voltage (10 mV or less) in the coil.

Recording Codes

All codes used for recording on magnetic media are classified as Franaszek Run Length Limited (RLL) codes of the form (D, K), where D = the minimum number of Zeros between consecutive Ones, and K = the maximum number of Zeros between consecutive Ones. D controls the highest frequency that can be recorded, and K controls the lowest frequency. Using the Franaszek notation, the GCR code is (0, 2): Ones may be adjacent, and no more than two Zeros separate them. As illustrated in Figure 2, a flux reversal signifies a One, and the absence of a flux reversal signifies a Zero. This is true for all codes.

Peak Detection and Data Separation

GCR recording equipment detects peaks instead of zero crossings because peak-detection circuits are less sensitive to noise. The output of the peak detector goes to the most critical analog circuit in the drive: the data separator. The data separator provides Ones and Zeros that occur at a precise frequency. The circuit does this using a phase-locked loop (PLL). First the data separator synchronizes itself to a crystal-controlled reference clock. Then the circuit attempts to lock itself to the maximum data frequency on the tape.
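As a rough illustration of the (D, K) notation described under Recording Codes, the Python sketch below measures the shortest and longest runs of Zeros between consecutive Ones in a bit string. The function name zero_run_limits and the sample pattern are assumptions made for this illustration; they are not part of the original design.

```python
def zero_run_limits(bits):
    """Return (d, k): the shortest and longest run of Zeros found
    between consecutive Ones in the bit string `bits`."""
    # Positions of every One (flux reversal) in the stream.
    ones = [i for i, b in enumerate(bits) if b == "1"]
    if len(ones) < 2:
        raise ValueError("need at least two Ones to measure a run")
    # Number of Zeros separating each pair of neighboring Ones.
    gaps = [b - a - 1 for a, b in zip(ones, ones[1:])]
    return min(gaps), max(gaps)


sample = "1101001"  # arbitrary test pattern, not tape data
print(zero_run_limits(sample))  # (0, 2)
```

A single stream only shows the runs it happens to contain; the D and K of a code are the limits taken over every sequence the code can legally produce.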
The Purposes of GCR Code

The 5-bit GCR code format encodes data such that no more than two consecutive Zeros occur in the serial data. This encoding relaxes the performance requirements of the PLL and loop filter, so that the system can achieve the desired performance.

GCR encoding also compensates for the speed variation of the tape due to:

- Mechanical tolerances in cartridges and tape thickness (±3 percent)
- Tape elasticity and wear
- Motor speed variation
- Temperature and humidity

These static tolerances can result in a ±10-percent tape-speed variation. In addition to the static tolerances, instantaneous speed variations (ISVs) occur. These result from discontinuous tape release at the unwind spool (10 - 20 percent), guide/back stick slip (5 percent), and shuffle ISV (vibration) due to start/stop (5 - 30 percent). The shuffle ISV can be avoided by operating the tape in a continuous (streaming) mode. If these dynamic tolerances are added together they can result in ±15-percent speed variation.

The electronics in the tape controller and the drive are designed to compensate for the tape-speed variations due to mechanical tolerances. The compensation is accomplished by:

- Data encoding and error detection and correction
- PLL design
- Bit-window tolerance

Table 1. GCR Code

Line Number     4-Bit Code       5-Bit Code
(For Ref.)      D3 D2 D1 D0      Y3 Y2 Y1 Y0 SO
     0           0  0  0  0       1  1  0  0  1
     1           0  0  0  1       1  1  0  1  1
     2           0  0  1  0       1  0  0  1  0
     3           0  0  1  1       1  0  0  1  1
     4           0  1  0  0       1  1  1  0  1
     5           0  1  0  1       1  0  1  0  1
     6           0  1  1  0       1  0  1  1  0
     7           0  1  1  1       1  0  1  1  1
     8           1  0  0  0       1  1  0  1  0
     9           1  0  0  1       0  1  0  0  1
    10           1  0  1  0       0  1  0  1  0
    11           1  0  1  1       0  1  0  1  1
    12           1  1  0  0       1  1  1  1  0
    13           1  1  0  1       0  1  1  0  1
    14           1  1  1  0       0  1  1  1  0
    15           1  1  1  1       0  1  1  1  1
                A3 A2 A1 A0      B0 B1 B2 B3 B4

The GCR code used in this design is part of the QIC-24 Standard and is also the ANSI X3.54 standard (1976). The MSB (leftmost bit) is recorded first.
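The mapping in Table 1 can be exercised with a short Python sketch. The dictionary below restates the table; the names GCR_ENCODE, GCR_DECODE, encode_nibbles, and decode_word are hypothetical helpers introduced only for illustration. The sketch models the code conversion alone, not the PAL's registers, clocks, or control lines.

```python
# 4-bit value -> 5-bit GCR word, MSB (recorded first) on the left.
GCR_ENCODE = {
    0x0: 0b11001, 0x1: 0b11011, 0x2: 0b10010, 0x3: 0b10011,
    0x4: 0b11101, 0x5: 0b10101, 0x6: 0b10110, 0x7: 0b10111,
    0x8: 0b11010, 0x9: 0b01001, 0xA: 0b01010, 0xB: 0b01011,
    0xC: 0b11110, 0xD: 0b01101, 0xE: 0b01110, 0xF: 0b01111,
}

# Reverse lookup: only 16 of the 32 possible 5-bit words are used.
GCR_DECODE = {word: nibble for nibble, word in GCR_ENCODE.items()}


def encode_nibbles(nibbles):
    """Return the serial bit string recorded for a sequence of nibbles."""
    return "".join(format(GCR_ENCODE[n], "05b") for n in nibbles)


def decode_word(word):
    """Return (nibble, invalid) for one 5-bit word read back."""
    if word in GCR_DECODE:
        return GCR_DECODE[word], False
    return None, True  # word does not appear in Table 1


stream = encode_nibbles([0xA, 0x9])
print(stream)                # 0101001001
print("000" in stream)       # False
print(decode_word(0b01010))  # (10, False)

# Exhaustive check: no pair of adjacent code words produces three Zeros in a row.
assert all("000" not in encode_nibbles([a, b])
           for a in range(16) for b in range(16))
```

Checking every pair of adjacent words is enough here because every word in the table contains at least one One, so a run of Zeros can never span more than one word boundary.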
Note that there are a maximum of two consecutive Zeros in the 5-bit code recorded on the tape.

Sequence of Operations

During a write operation, the following sequence occurs:

1. Idle (hold)
2. Convert 4-bit parallel input to 5-bit GCR code and load into 5-bit register
3. Shift out 5 bits to write amplifier

During a read operation, the following sequence occurs:

1. Idle (same as during write)
2. Shift in 5 bits
3. Detect sync mark, set/clear invalid flag, convert 5-bit serial input to 4-bit binary value, and load value into register

Note that the read clock and the write clock are not the same. Additionally, the logic must keep up with the tape data rate. Finally, the read and write operations are mutually exclusive. This means that the storage elements (D flip-flops) can be time-shared and that read and write operations require five clocks.

The GCR design requires a total of five states because the idle state is common to both read and write operations. Therefore, the design requires three control lines. It is convenient to designate one control line as an enable line (active Low) and the other two lines as mode-control signals.

This application note does not describe the control of these lines or the required clock synchronization. This is because at the next level of control, you must implement in hardware the responses to error conditions. These response choices tend to be application dependent as well as subjective.

The diagrams in Figure 3 show the flow of data under control of the ENABLE signal and the M0 and M1 mode-control signals.

Design Procedure

The procedure for designing the GCR circuits is to map the code conversions using Venn diagrams and write the logic equations as the sum of products, or in minterm form. Because the design requires six flip-flops, the logic is implemented using a CY7C16R6 PAL. Because the PAL has inverting output buffers, the Zeros are mapped instead of the Ones. The D flip-flops require an extra term to hold their states when the ENABLE is HIGH.
For a conventional D flip-flop, for example, the form of the logic equations is:

D = ENABLE 1 (Q)       ; RECIRCULATE PRESENT STATE
  + ENABLE 2 (F2)      ; FUNCTION 2
  + ENABLE 3 (F3)      ; FUNCTION 3

where the ENABLE controls are mutually exclusive.

4-bit to 5-bit Conversion for Y3 Output

At the bottom of Table 1, the 5-bit code columns are labeled B0 through B4 to help show how the 4-bit code is mapped. In addition, the line numbers are labeled 0 through 15, which correspond to the values of the 4-bit binary code.

Figure 4a shows how the 4-bit binary code is mapped on the Venn diagram. For example, reference line zero, which corresponds to binary value zero, is located in the lower right hand corner. The Venn diagram in Figure 4b shows the conversion for the Y3 output, which is labeled the B0 input to the D flip-flop. Note that the parallel nibble (see Figure 3) is reversed end for end so that the MSB is written first when the nibble is shifted out.

[Figure 3. Data Flow Diagrams: data flow through Y3, Y2, Y1, Y0, and SO for each operation]

ENABLE  M1  M0   OPERATION
  1     X   X    Hold
  0     0   0    Serial shift in
  0     0   1    Serial shift out
  0     1   0    Convert 5-bit to 4-bit
  0     1   1    Convert 4-bit to 5-bit

[Figure 4. 4- to 5-Bit Conversions: (a) Binary Values, (b) Y3 Map, (c) Y2 Map, (d) Y1 Map, (e) Y0 Map, (f) SO Map. Equations for the Zeros, in the notation of Appendix A:
/Y3 = /B0 = D3*D0 + D3*D1
/Y2 = /B1 = /D3*D1 + /D3*D2*D0
/Y1 = /B2 = /D2
/Y0 = /B3 = D2*/D1*D0 + D3*/D1*D0 + /D3*/D1*/D0
/SO = /B4 = D1*/D0 + D3*/D0]

In Figure 4b, the Ones and Zeros in column B0 are mapped. For example, reference line zero has the value One in column B0 of Table 1.
Therefore, a One is placed in the square corresponding to binary value zero in Figure 4b. In a similar manner, reference line 15 has a value of Zero in column B0, so a Zero is placed in the square corresponding to binary value fifteen.

Writing the Equation

If the output of the 16R6 PAL were positive-true logic, the equation would include all the Ones on the Venn diagram. However, because the PAL output is negative logic (active Low), the equation includes all the Zeros. When the PAL inverts the signals, the Zeros are changed to Ones, so that the final outputs are positive-true logic. By inspection:

/B0 = D3*D0 + D3*D1

or,

/Y3 = D3*D0 + D3*D1

5- to 4-bit Conversion for Y Outputs

A 5- to 4-bit conversion for Y outputs requires two 16-square Venn diagrams, because 2^5 = 32 possible binary values exist. Note in Table 1, however, that the 5-bit code columns do not use all 32 possible combinations. The unused combinations are Don't Cares, which are represented by Xs in the Venn diagrams. Don't Cares can be either Ones or Zeros, which further reduces or simplifies the logic equations. The procedure is to plot the Ones and Zeros, put Xs in the blank squares, and write the equations for the Zeros (Figure 5).

[Figure 5. 5- to 4-Bit Conversions: (a) Y3 Map, (b) Y2 Map, (c) Y1 Map, (d) Y0 Map, each drawn as two 16-square diagrams for SO = 0 and SO = 1. Equations for the Zeros, in the notation of Appendix A:
/Y3 = /A3 = /Y2 + Y3*SO
/Y2 = /A2 = /Y1
/Y1 = /A1 = /Y0 + Y3*Y2
/Y0 = /A0 = /SO + Y3*Y2*/Y0]

Serial Shift In

During serial shift in (both mode control signals Low), the data separator's data output goes to the formatter's input. The signal is called SIN and is applied to the SOUT flip-flop's D input. The SOUT flip-flop's output goes to the Y0 flip-flop's D input, whose output goes to the Y1 flip-flop's input, etc. After five read clocks, the MSB of the 5-bit GCR coded data is in Y3, and the LSB is in SOUT.

Serial Shift Out

During a write operation, after the 4-bit data is converted to 5-bit data and reversed, the data is shifted out using the write clock and written on tape. The shift direction is opposite to that in serial shift in. Note that the data is right-shifted "end around" (see Figure 3) so that after five write clocks the same data appears in the register.

Invalid Flag (INV Flip-Flop)

The Invalid flip-flop is set to a One when an invalid 5-bit code is read from the tape. This tells the tape formatter that the next data read is the beginning of the data block. Because INV is a negative-true signal, the logic equations are written for Ones on the Venn diagram. The 16 binary values not listed in Table 1 are plotted as Ones in Figure 6. Squares corresponding to valid 5-bit codes contain Zeros; the rest of the squares contain Ones. The equation for the Ones is:

INV = /Y0*/SOUT + /Y3*/Y2 + /Y3*/Y1*/Y0 + Y3*Y2*Y1*Y0*SOUT

The Invalid flip-flop is enabled by a signal called CIF (Control Invalid Flag) and reset when CIF is Low.

[Figure 6. Binary Values Not Listed in Table 1]

Synchronization Mark Detection

Bit synchronization is achieved when the illegal 5-bit code of all Ones is read from the tape. This condition is the logical AND of all 5 bits, or

BS = Y3*Y2*Y1*Y0*SOUT

Implementing the Design

Once the conceptual design is complete, it must be reduced to practice. This process has two main steps:

- Describe the logic using a high-level language
- Program the PAL

Several design programs that run on the IBM PC (or equivalent) or the VAX computer are available from either semiconductor manufacturers or from third-party software vendors. The first such program, called PALASM (PAL Assembler), was developed by Monolithic Memories. The program enables you to describe the logic in terms of Boolean equations, truth tables, or state diagrams using a language whose syntax is comparable to a microcomputer assembly language.

Appendix A shows the equations for the GCR design, written in the PALASM syntax. This ASCII file was created using WordStar in the non-document mode. The PALASM file (GCREX.PAL) is then translated to the syntax of the ABEL design program using the TOABEL program. The format of the command is:

TOABEL -IB:GCREX -OB:GCREXT

The TOABEL program converts the GCREX.PAL file to a file named GCREXT.ABL, whose listing appears in Appendix B. ABEL consists of an executive and several overlay programs that are executed by typing in:

ABEL B:GCREXT

The ABEL program was developed by a programmer manufacturer, Data I/O Corporation. ABEL can simplify a source file (logic reduction), perform logic simulation, and generate test vectors. Table 2 lists the ABEL programs.

Table 2. ABEL Programs

PROGRAM NAME    FUNCTION
PARSE           Read source file; check syntax; expand macros; act upon assembler directives
TRANSFOR        Convert the description to an intermediate form
REDUCE          Perform logic reduction
FUSEMAP         Create the programmer load (JEDEC) file
SIMULATE        Simulate the operation of a programmed device
DOCUMENT        Create a design documentation file

The ABEL output files for this design based on the PAL C 16R6 (Figure 7) are:

- GCREXT.LST
- GCREXT.OUT
- GCREXT.DOC (see Appendix C)
- GCREXT.SIM (This design was not simulated.)
- P16R6.JED (see Appendix D)

The last file is in JEDEC (JC-42.1-81-62) format and is suitable for loading into a PLD programmer. The listing appears in Appendix D. The DOCUMENT program output appears in Appendix C. Note that, although the file list includes a simulation file, this design was not simulated.

The CY7C16R6 that implements the design was programmed using the Data I/O model 29B programmer operated in the remote mode to the PC. The design was then verified by testing the device on the bench.

[Figure 7. PAL C 16R6: 20-pin pinout. 1 CK, 2 M1, 3 M0, 4 D3, 5 D2, 6 D1, 7 D0, 8 EN, 9 CIF, 10 GND, 11 E, 12 SIN, 13 INV, 14 Y0, 15 Y1, 16 Y2, 17 Y3, 18 SOUT, 19 BS, 20 VCC]

PAL Advantages

This design example illustrates the space-saving advantage of Cypress CMOS PALs. The FUSEMAP program printed out that 40 of the device's 64 available product terms were used. If the PALASM input equations shown in Appendix A are implemented in two-input gates, approximately 30 gates are required for each of the six D flip-flop inputs, or a total of 6 x 30 = 180 two-input gates. The logic equations alone would then require 180/4 = 45 14-pin DIPs. The six flip-flops would require three 14-pin DIPs, for a total of 48 DIPs. Thus, one 20-pin Cypress PAL replaces approximately 50 14-pin DIPs.

This design also illustrates the Cypress PAL's power-saving advantage. The 16R6 PAL's maximum Icc current, under worst-case conditions, is 45 mA. In contrast, the total Icc for 50 TTL packages would be 500 mA, assuming 10 mA for the typical Icc per package. The worst-case Icc for the TTL system could be as high as 20 mA per DIP, which would mean a total of 1 A for the system. The Cypress CMOS PAL reduces system power by a factor of 10 to 15, depending upon whether typical or worst-case numbers are compared.

Appendix A. PALASM Equations

    PAL16R6        DESIGN EXAMPLE        FILENAME: GCREX.PAL
    PAT001         BRUCE WENNIGER        9/17/85
    4B-5B ENCODER/DECODER
    CYPRESS SEMICONDUCTOR
    CK M1 M0 D3 D2 D1 D0 /EN /CIF GND
    /E SIN /INV Y0 Y1 Y2 Y3 SOUT /BS VCC

    /SOUT := EN*/SOUT                  +  ; HOLD/RECIRCULATE
             /EN*/M1*/M0*/SIN          +  ; SERIAL SHIFT IN
             /EN*/M1* M0*/Y0           +  ; SERIAL SHIFT OUT
             /EN* M1*/M0*/SIN          +  ; CONV. SIN & LOAD
             /EN* M1* M0* D1*/D0       +  ; CONV. PAR. & LOAD
             /EN* M1* M0* D3*/D0          ; DITTO

    /Y0 :=   EN*/Y0                    +  ; HOLD
             /EN*/M1*/M0*/SOUT         +  ; SERIAL SHIFT IN
             /EN*/M1* M0*/Y1           +  ; SERIAL SHIFT OUT
             /EN* M1*/M0*/SOUT         +  ; CONV. SIN & LOAD
             /EN* M1*/M0* Y3* Y2*/Y0   +  ; DITTO
             /EN* M1* M0* D2*/D1* D0   +  ; CONV. PAR. & LOAD
             /EN* M1* M0* D3*/D1* D0   +  ; DITTO
             /EN* M1* M0*/D3*/D1*/D0      ; DITTO

    /Y1 :=   EN*/Y1                    +  ; HOLD
             /EN*/M1*/M0*/Y0           +  ; SERIAL SHIFT IN
             /EN*/M1* M0*/Y2           +  ; SERIAL SHIFT OUT
             /EN* M1*/M0*/Y0           +  ; CONV. SIN & LOAD
             /EN* M1*/M0* Y3* Y2       +  ; DITTO
             /EN* M1* M0*/D2              ; CONV. PAR. & LOAD

    /Y2 :=   EN*/Y2                    +  ; HOLD
             /EN*/M1*/M0*/Y1           +  ; SERIAL SHIFT IN
             /EN*/M1* M0*/Y3           +  ; SERIAL SHIFT OUT
             /EN* M1*/M0*/Y1           +  ; CONV. SIN & LOAD
             /EN* M1* M0*/D3* D1       +  ; CONV. PAR. & LOAD
             /EN* M1* M0*/D3* D2* D0      ; DITTO

    /Y3 :=   EN*/Y3                    +  ; HOLD
             /EN*/M1*/M0*/Y2           +  ; SERIAL SHIFT IN
             /EN*/M1* M0*/SOUT         +  ; SERIAL SHIFT OUT
             /EN* M1*/M0* Y3* SOUT     +  ; CONV. SIN & LOAD
             /EN* M1*/M0*/Y2           +  ; DITTO
             /EN* M1* M0* D3* D0       +  ; CONV. PAR. & LOAD
             /EN* M1* M0* D3* D1          ; DITTO

    INV :=   /CIF* INV                       +  ; HOLD INV FLAG (ACTIVE LOW)
             CIF* M1*/M0*/Y3*/Y2             +  ; SET IF INVALID
             CIF* M1*/M0*/Y3*/Y1*/Y0         +  ; DITTO
             CIF* M1*/M0*/Y0*/SOUT           +  ; DITTO
             CIF* M1*/M0* Y3* Y2* Y1* Y0* SOUT  ; DITTO

    BS = Y3* Y2* Y1* Y0* SOUT                   ; BIT SYNC. (ACTIVE LOW)

Appendix B. ABEL Listing

    module gcrext; flag '-r0';
    title
    'PAL16R6  DESIGN EXAMPLE  FILENAME: GCREX.PAL
    PAT001  BRUCE WENNIGER  9/17/85
    4B-5B ENCODER/DECODER
    CYPRESS SEMICONDUCTOR
    -Translated by TOABEL-';

        P16R6 device 'P16R6';

    "declarations
        TRUE,FALSE = 1,0;
        H,L = 1,0;
        X,Z,C = .X.,.Z.,.C.;

        GND,VCC pin 10,20;
        CK,M1,M0,D3,D2,D1,D0,EN,CIF,E pin 1,2,3,4,5,6,7,8,9,11;
        INV,Y0,Y1,Y2,Y3,SOUT pin 13,14,15,16,17,18;
        SIN,BS pin 12,19;

    equations

    !SOUT := !EN & !SOUT                        " HOLD/RECIRCULATE
           # EN & !M1 & !M0 & !SIN              " SERIAL SHIFT IN
           # EN & !M1 & M0 & !Y0                " SERIAL SHIFT OUT
           # EN & M1 & !M0 & !SIN               " CONV. SIN & LOAD
           # EN & M1 & M0 & D1 & !D0            " CONV. PAR. & LOAD
           # EN & M1 & M0 & D3 & !D0 ;          " DITTO

    !Y0 := !EN & !Y0                            " HOLD
         # EN & !M1 & !M0 & !SOUT               " SERIAL SHIFT IN
         # EN & !M1 & M0 & !Y1                  " SERIAL SHIFT OUT
         # EN & M1 & !M0 & !SOUT                " CONV. SIN & LOAD
         # EN & M1 & !M0 & Y3 & Y2 & !Y0        " DITTO
         # EN & M1 & M0 & D2 & !D1 & D0         " CONV. PAR. & LOAD
         # EN & M1 & M0 & D3 & !D1 & D0         " DITTO
         # EN & M1 & M0 & !D3 & !D1 & !D0 ;     " DITTO

    !Y1 := !EN & !Y1                            " HOLD
         # EN & !M1 & !M0 & !Y0                 " SERIAL SHIFT IN
         # EN & !M1 & M0 & !Y2                  " SERIAL SHIFT OUT
         # EN & M1 & !M0 & !Y0                  " CONV. SIN & LOAD
         # EN & M1 & !M0 & Y3 & Y2              " DITTO
         # EN & M1 & M0 & !D2 ;                 " CONV. PAR. & LOAD

    !Y2 := !EN & !Y2                            " HOLD
         # EN & !M1 & !M0 & !Y1                 " SERIAL SHIFT IN
         # EN & !M1 & M0 & !Y3                  " SERIAL SHIFT OUT
         # EN & M1 & !M0 & !Y1                  " CONV. SIN & LOAD
         # EN & M1 & M0 & !D3 & D1              " CONV. PAR. & LOAD
         # EN & M1 & M0 & !D3 & D2 & D0 ;       " DITTO

    !Y3 := !EN & !Y3                            " HOLD
         # EN & !M1 & !M0 & !Y2                 " SERIAL SHIFT IN
         # EN & !M1 & M0 & !SOUT                " SERIAL SHIFT OUT
         # EN & M1 & !M0 & Y3 & SOUT            " CONV. SIN & LOAD
         # EN & M1 & !M0 & !Y2                  " DITTO
         # EN & M1 & M0 & D3 & D0               " CONV. PAR. & LOAD
         # EN & M1 & M0 & D3 & D1 ;             " DITTO

    !INV := CIF & !INV                                   " HOLD INV FLAG
          # !CIF & M1 & !M0 & !Y3 & !Y2                  " SET IF INVALID
          # !CIF & M1 & !M0 & !Y3 & !Y1 & !Y0            " DITTO
          # !CIF & M1 & !M0 & !Y0 & !SOUT                " DITTO
          # !CIF & M1 & !M0 & Y3 & Y2 & Y1 & Y0 & SOUT ; " DITTO

    !BS = Y3 & Y2 & Y1 & Y0 & SOUT ;                     " BIT SYNC.

    end _gcrext;

Appendix C.
Document File Page 1 ABEL(tm) Version 1.10 - Document Generator 17-Sept-85 8:30 AM PAL16R6 DESIGN EXAMPLE FILENAME: GCREX.PAL PATOOI BRUCE WENNIGER 9/17/85 4B-5B ENCODER/DECODER CYPRESS SEMICONDUCTOR -Translated by TOABELEquations for Module _gcrext Device P16R6 Reduced Equations: SOUT:= !(IEN & !SOUT #EN & !MO & !Ml & !SIN # EN & MO & !Ml & lYO #EN & IMO &Ml & ISIN # IDO &Dl &EN & MO &Ml # IDO & D3 & EN & MO & Ml); YO := 1(IEN & lYO # EN & IMO & IMI & ISOUT # EN & MO & IMI & lYl #EN & IMO & Ml & ISOUT # EN & IMO & Ml & lYO & Y2 & Y3 # DO & IDI & D2 & EN & MO & Ml # DO & IDI & D3 & EN & MO & Ml # IDO & IDI & ID3 & EN & MO & Ml); Yl := I(lEN & lYI # EN & lMO & IMI & lYO # EN & MO & lMI & lY2 # EN & lMO & Ml & lYO # EN & IMO & Ml & Y2 & Y3 # ID2 & EN & MO & Ml); Y2 := 1(IEN & lY2 # EN & IMO & IMl & lYl #EN &MO & lMI & lY3 # EN & lMO & Ml & lYl # Dl & lD3 & EN & MO & Ml # DO & D2 & ID3 & EN & MO & Ml); Y3 := 1(IEN & lY3 # EN & lMO & IMI & lY2 #EN &MO& IMI & ISOUT # EN & IMO & Ml & SOUT & Y3 # EN & IMO & Ml & lY2 # DO & D3 & EN & MO & Ml # Dl & D3 & EN & MO & Ml); INY := I(ClF & lINY 6-73 Appendix C. Document File (Continued) Page 2 17 Sept-85 8:30 AM ABEL(tm) Version 1.10 - Document Generator PAL16R6 DESIGN EXAMPLE FILENAME: GCREX.PAL PATOOI BRUCE WENNIGER 9/17/85 4B-5B ENCODER/DECODER CYPRESS SEMICONDUCTOR -Translated by TOABELEquations for Module _gcrext Device P16R6 # # # # BS = IClF& IClF& IClF & IClF & IMO&Ml IMO&Ml IMO & Ml IMO & Ml & IY2& IY3 & IYO& IYI & IY3 & ISOUT & IYO & SOUT & YO & Yl & Y2 & Y3); I(SOUT & YO & Yl & Y2 & Y3); Chip diagram for Module _gcrext Device P16R6 PALC16R6 CK 1 ~1 2 as ~O 3 03 .- SOUT 02 5 Y2 Vee Y3 01 6 Yl DO 7 YO EN elr 8 GND INV 9 12 SIN 10 11 E end of module _gcrext 6-74 GCR Encoder/Decoder Appendix D. 
JEDEC File ABEL(tm) Version 1.10 JEDEC fIle for: P16R6 Created on: 17-Sep-85 8:30 AM PAL16R6 DESIGN EXAMPLE FILENAME: GCREX.PAL PATOOI BRUCE WENNIGER 9/17/85 4B-5B ENCODERIDECODER CYPRESS SEMICONDUCTOR -Translated by TOABEL-* QP20* QF2048* LOOOO 11111111111111111111111111111111 11111101110111011101110111111111 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 11111110111111111111111110111111 10111011111111111111111101111110 10110111111111111111111001111111 01111011111111111111111101111110 01110111111111110111101101111111 01110111011111111111101101111111 00000000000000000000000000000000 00000000000000000000000000000000 11111111111011111111111110111111 10111011111111101111111101111111 10110110111111111111111101111111 01111001110111111111111101111111 01111011111111101111111101111111 01110111011111111111011101111111 01110111011111110111111101111111 00000000000000000000000000000000 00000000000000000000000000000000 11111111111011111111111110111111 10111011111111101111111101111111 10110110111111111111111101111111 01111001110111111111111101111111 01110111111111011111111101111111 01110111011111111111011101111111 01110111011111110111111101111111 00000000000000000000000000000000 00000000000000000000000000000000 11111111111011111111111110111111 10111011111111101111111101111111 10110110111111111111111101111111 01111001110111111111111101111111 01111011111111101111111101111111 01111111111111111111011101111111 01110111011111110111111101111111 00000000000000000000000000000000 11111111111111101111111110111111 10111011111111111110111101111111 10110111111011111111111101111111 01111011111111111110111101111111 01110111101111110111111101111111 JEDEC Listing (Continued) 01110111101101111011011101111111 00000000000000000000000000000000 00000000000000000000000000000000 11111111111111111110111110111111 10111011111111111111111001111111 
10110111111111101111111101111111 01111011111111111111111001111111 01111011110111011111111101111111 01110111111110111111111101111111 00000000000000000000000000000000 00000000000000000000000000000000 11111111111111111111111010111111 10111010111111111111111101111111 10110111111111111110111101111111 01111011111111111111111101111111 01111011110111011111111001111111 01110111111101111011011101111111 01110111011111111011011101111111 01110111101111111011101101111111 11111111111111111111111111100111 01111011111011101111111111111011 01111011111011111110111011111011 01111010111111111111111011111011 01111001110111011101110111111011 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000* C8E51* D15A

T2 Framing Circuitry

This application note describes the design of a T2-based transmission system. This system adds control characters to an image processor's data stream so that the resulting output can be slotted into a T2 channel. DS-2 transmission equipment is then used to relay this information onward. At receiving locations, the control bits are used to synchronize the site's circuitry to the incoming characters. The data is then restored to its original form, before being routed to its final destination. A block diagram of this system appears in Figure 1.

Overview of T1 and T2

Digital transmission systems in North America are hierarchical in structure. Each carrier is multiplexed into higher bandwidth carriers. The lowest level is known as T1. This typically consists of 24 64-Kbit/s pulse code modulation (PCM) telephone channels multiplexed together into frames. A single framing (F) bit precedes every T1 frame to allow for features such as synchronizing channels, sending control characters, and generating cyclic redundancy code (CRC) bits. Thus, each frame contains 24 8-bit channels plus an additional framing bit, for a total of 193 bits per frame. The bit rate for a T1 channel equals the rate of a bit in the frame multiplied by the total number of bits in the frame:

8,000 x 193 = 1.544 Mbits/s

The maximum data rate in a T1 channel is therefore

1.544 x 192/193 = 1.536 Mbits/s

You can achieve this maximum data rate for T1 transmission when using the Extended Super Frame format. This format dedicates all 8 bits of every channel to user data instead of reserving the eighth bit for channel signaling. Figure 2 illustrates the composition of a T1 frame.

[Figure 2. T1 Frame Structure: F bit followed by Ch 1, Ch 2, through Ch 24. F = F bit (one bit); channel data = 8 bits; number of channels = 24]

The next level in the digital communications hierarchy is referred to as T2. Four T1 frames constitute a T2 Multi-frame. These frames are arranged as four sub-frames, each having six blocks of 49 bits. The leading character of every block is used for control purposes, and the following 48 bits consist of data. In total, a Multi-frame comprises 4 x 6 x 49 = 1176 characters. This format includes three control features:

- Multi-frame alignment, provided by a 0111 pattern in each of the four sub-frames. These four bits are referred to as M bits. The fourth M bit location can also serve as an alarm service digit, if required.
- Frame alignment, implemented by alternating between logic level 0 and 1. Each sub-frame contains two of these bits, which are referred to as F bits.
- Justification: Three bits, referred to as Stuffing Indicator bits (C), are inserted into every sub-frame for justification purposes. Positive, negative, and no justification are possible by inserting the correct code into the relevant locations.

[Block diagram labels: IMAGE PROCESSOR, TRANSMIT INTERFACE, DS-2/T2 LINK]
Figure 1.
System Overview [remaining block diagram labels: T2 RECEIVER INTERFACE, FINAL DESTINATION]

Figure 3 shows how the data and control bits interleave. Figure 4 illustrates the sequence in which control bits occur. The bit rate of T2 information is 6.312 Mbits/s. The corresponding data rate in a T2 channel is therefore:

6.312 x 48/49 = 6.183 Mbits/s

Further levels exist within the communication hierarchy, but they are not relevant to this design. CCITT G.743 provides further details of all the framing structures mentioned here.

[Figure 3. T2 Frame Structure: T2 multi-frame. M = multi-frame alignment bits (M1, M2, M3 and M4); C = stuffing indicator (Cj1, Cj2 and Cj3); F = frame alignment bits (F0 = 0, F1 = 1)]

[Figure 4. Control Bit Sequence: M1, C11, F0, C12, C13, F1, M2, C21, F0, C22, C23, F1, M3, C31, F0, C32, C33, F1, M4, C41, F0, C42, C43, F1]

Transmitter Site Circuitry

In the example T2 system, the machine from which data originates can operate at frequencies as high as 10 MHz. The data is sourced to the T2 system at 6.183 MHz, which is the data rate of a T2 line. At 10 MHz, stopping and starting artifacts would arise from the disparity between the source and the transmission medium. The output from the transmitter circuitry is maintained at 6.312 MHz to allow the inclusion of control characters into the data stream. Phase-locked-loop design techniques ensure that the clocks in the T2 system and the data source are tightly coupled.

Figure 5 shows the transmitter block diagram. Information feeds into a FIFO, IC1, under control of TXCLKIN (6.183 MHz, the source's clock). TXCLKOUT (6.312 MHz) retrieves data from IC1. IC2 (TXCNTRL PAL) controls the insertion of control bits into the data stream at every 49th time slot. IC4 is a PROM that holds a unique 24-bit control pattern. A counter, IC3 (PROMADDR PAL), provides the address to the PROM. IC5 (DIVBY49 PAL) is programmed as a counter that increments on successive clock pulses. When this counter reaches its terminal count (49), a carry-out signal is produced (FBITLOAD), which serves three purposes:

- It causes the counter to reload its base count (zero)
- It indicates that a control bit has to be inserted into the data stream
- It serves as an input to the state machine in the IC2 PLD, which is the control-bit sequencer that governs when the PROM address generator has to be incremented. A decode of one of the sequencer's states, INCFADD, causes the PROM address to increase by one.

The listings for the design's PALs appear in Appendices A through J.

[Figure 5. Transmitter Site Circuitry: block diagram of the image processor, phase-locked loop circuitry, FIFO (IC1), TXCNTRL PAL (IC2), PROMADDR PAL (IC3), PROM (IC4), and DIVBY49 counter (IC5); OPDATA goes to the DS-2 interface]

Receiver Site Circuitry

The most obvious way to detect a valid pattern of T2 data and control characters is to serially shunt them through a shift register with 1176 stages. Outputs from the first, 50th, 99th, etc. through the 1128th location can then be continuously monitored for the relevant character sequence. This approach is very wasteful in terms of circuitry because monolithic shift registers provide either eight or 16 stages.

Fortunately, you can achieve the same result with one FIFO and two PALs. The principle is to arrange the incoming information so that a pattern recognition circuit periodically samples the most recent, the 50th, the 99th, the 148th, and the 197th bits. This circuitry then compares the information to that expected. When a complete frame of control characters has been detected, the incoming information is frame aligned with the circuitry at the receiver site.

The devices required to implement these tasks appear in Figure 6. IC1 (IPFIFO) is a FIFO whose input source is the data and control character stream from the transmit site. The FIFO holds the most recent 196 bits of information entering the receiver circuitry. IC2 (DATASORT PAL) provides the commands that control this operation and acts as an intermediate buffer stage between the information presented to the FIFO and the characters subsequently read from that device. The outputs of IC3 (CLKGEN PAL) are the Read and Write clocks for the FIFO.

IC4 (ALIGNDET PAL) and IC5 (FRAMCHEK PAL) perform pattern recognition. IC4 compares the expected control bit pattern to the stream of characters appearing at the FIFO's outputs. IC5 interprets the results and sets a flag whenever frame alignment is attained. IC5 also indicates if alignment is subsequently lost. Frame alignment is declared when four pre-determined bit patterns have been recognized. Thereafter, the circuit makes continuous checks to ensure that alignment is maintained. In total, the circuit seeks 12 bit patterns. If any check yields a negative result, alignment has been lost. A locally generated reset pulse then sets the relevant circuitry to its initial state, and the process of alignment detection begins once again.

For a short period following the application of power, an initialization signal, RESET, is active. This signal ensures that the outputs of IC5 (FRAMCHEK PAL) and IC6 (DSCOUNT PAL) are driven to their initial states and the FIFO (IC1) has all of its internal memory locations and control registers cleared to zero. Once the power-up

[Block diagram: IPFIFO (IC1), CLKGEN (IC3), ALIGNDET (IC4), FRAMCHEK (IC5), OPFIFO, and DSTAGGR; signals shown include CLK3, /CLK3, START, PUR, RESET, ALIGNF, MTRUE, FTRUE, RDCLK, WRCLK, and OPHFULL]
Figure 6.
Receiver Site Circuitry 6-78 DATAOUT TO FINAL DESTINATION 04 '" '15 03 02 '47 '48 rr II 01 . 48 IPFIFO Figure 7. YX Sequencer routine has completed, the process of writing information into the FIFO commences. All data entering the receiver is initially fed to the first input stage of the FIFO (D4) via a register in IC2. This ensures that the FIFO's set-up parameter is not violated. Every time a character enters the FIFO, a counter in IC6 increments once. When the terminal count (49) is reached, the counter's carry-out pin (NEXT) goes active. This condition causes the YX sequencer in IC2 to move from its initial state 0 position to state 1 (Figure 7). A decode of this state enables the strobe RD, which retrieves stored data from the FIFO. Thereafter, the data from the FIFO's first output stage (A4) is coupled, via IC2, to the FIFO's second input port (03). After two further occurrences of NEXT going active, the FIFO's second and third output stages (A3 and A2) are coupled to the third and fourth input ports (02 and D1), respectively. The YX sequencer goes to state 2. Figure 8 shows the FIFO's contents when NEXT becomes active for the fourth and final time. At this point, the pattern recognition circuitry can be enabled. IC2's five data output pins (D4 - DO) effectively perform the same function as a shift register with 197 stages. IC4 monitors this information until it detects the first ocare currence of 01000. These control bits M1IFlIC43/C42IFO, which are the signals present on D4 DO of the FIFO after the transmission of 1176 characters. This pattern could correspond to the detection of 01000 in IC4 for the first time. However, it is also quite probable that this sequence could randomly occur in the data stream. Thus, further checks are needed before assuming that the valid recognition pattern has been detected. As soon as the receiver recognizes the 01000 pattern, a signal labeled START goes active. This term enables a six-stage counter in IC7 (CBITCNTR PAL). 
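The framing scheme and the five-tap pattern check described above can be pictured with a short Python model. This is only an illustrative sketch, not part of the original design: the function names are hypothetical, the payload is arbitrary, and every C (stuffing indicator) bit is assumed to be 0. The M and F values and the control-bit order come from the text and Figure 4.

```python
# Control-bit order as listed in Figure 4, one control bit per 49-bit block.
CONTROL_ORDER = [
    "M1", "C11", "F0", "C12", "C13", "F1",
    "M2", "C21", "F0", "C22", "C23", "F1",
    "M3", "C31", "F0", "C32", "C33", "F1",
    "M4", "C41", "F0", "C42", "C43", "F1",
]
DATA_BITS_PER_BLOCK = 48
BLOCK_BITS = DATA_BITS_PER_BLOCK + 1


def control_value(name):
    """Return the value of one control bit.

    M bits follow the 0111 pattern and F0/F1 are 0/1, as the text states.
    Assumption: every C (stuffing indicator) bit is 0.
    """
    if name.startswith("M"):
        return 0 if name == "M1" else 1
    if name.startswith("F"):
        return int(name[1])
    return 0


def build_multiframe(data_bits):
    """Interleave 24 control bits with 24 x 48 data bits (1176 bits total)."""
    if len(data_bits) != len(CONTROL_ORDER) * DATA_BITS_PER_BLOCK:
        raise ValueError("a multi-frame carries exactly 1152 data bits")
    frame = []
    for block, name in enumerate(CONTROL_ORDER):
        start = block * DATA_BITS_PER_BLOCK
        frame.append(control_value(name))
        frame.extend(data_bits[start:start + DATA_BITS_PER_BLOCK])
    return frame


def sample_taps(stream, newest, spacing=BLOCK_BITS, taps=5):
    """Return the newest bit and the bits 49, 98, 147 and 196 positions earlier."""
    return [stream[newest - k * spacing] for k in range(taps)]


data = [1] * 1152                      # arbitrary payload for illustration
multiframe = build_multiframe(data)
stream = multiframe + multiframe       # two consecutive multi-frames

# Newest bit = M1 of the second multi-frame; older taps are F1, C43, C42, F0.
pattern = sample_taps(stream, newest=len(multiframe))
t2_data_rate = 6.312 * DATA_BITS_PER_BLOCK / BLOCK_BITS  # Mbits/s
```

In this model the five taps play the role of D4 through D0: when M1 is the newest bit, the older taps land on F1, C43, C42, and F0, which is the start pattern the receiver searches for.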
Once enabled, the IC7 counter counts to 48, then issues a carry-out signal (LD49). A seven-state EDC sequencer (Figure 9) in IC5 recognizes every occurrence of this signal and thus always moves to its next stable position (state 1 in this case).

Figure 8. Contents of Receiver Site FIFO

A second check is made in IC4 to determine whether the second valid control bit pattern has been detected. IC4 uses the control bits E, D, C, MA, and MB from IC5's EDC and M sequencer (Figure 10) to determine whether the incoming data has been aligned. These control bits represent the state of the sequencers in IC5 and determine the control sequence that should exist on the D4 - D0 inputs. The second valid control pattern, 10001, is now sought on the bits F1, C43, C42, F0, and C41. If the pattern is not detected, a global reset is issued, and the search for the 01000 pattern recommences. Conversely, if the 10001 pattern is detected, the EDC sequencer assumes state 3. Further, FTRUE becomes true. This signal exists for one clock period and causes the sub-frame detector implemented by the F sequencer (Figure 11) to move to its next stable state.

A further 147 clocks are allowed to elapse before the next control bit pattern check is carried out. By this time, the EDC sequencer is in state 6. The occurrence of a 11000 pattern for M4, F1, C33, C32, and F0 provides further proof that alignment has been attained, and the F sequencer moves to its next stable position, state 3. As before, a negative result causes the circuit to issue a global reset. The checking process would then continue with a 10000 pattern for F1, C33, C32, F0, and C31 being sought when a further 49 clock periods had elapsed (EDC sequencer in state 7). In this case, the occurrence of the correct pattern causes the M sequencer (multi-frame detector) to progress from its state 0 start position to state 1. The F sequencer's state diagram shows that the sequencer assumes state 2 after the next occurrence of an active LD49 signal, followed one clock period later by a return to the start position, state 0.

Figure 9. EDC Sequencer

Figure 10. M Sequencer

Figure 11. F Sequencer

As stated previously, the declaration of alignment is made only when four consecutive bit patterns - commencing with the start condition on M1, F1, C43, C42, and F0 - have been sequentially detected. When these criteria have been satisfied, the ALIGNF flag is raised. This flag is held in its active state until one of the ensuing checks produces a negative result. In such an event, the RESET term goes active, thereby forcing certain areas of the receiver's circuitry into the same conditions as occurred at power-up.

Immediately following the receiver's alignment of the incoming data stream, the ensuing information is written into a second FIFO (IC8, OPFIFO). This action is a preface to restoring the data to its original form, i.e., removing the control bits added by the transmitter. Once this operation has been completed, the data can be passed to its final destination.

Control bits are not written to IC8 (OPFIFO); they coincide with the occurrence of an active LD49 (counter carry-out) signal. Thus, although a data bit is read out of the IPFIFO, the occurrence of LD49 prevents a write strobe (WRCLK) from being generated and the data bit from being written into OPFIFO.

The process of removing data from IC8 commences as soon as that device is half full, indicated by OPHFULL. This prevents invalid data from being passed to the next stage when the FIFO empties.

The frequencies of the FIFO's write (WRCLK) and read (RDCLK) strobes are 6.312 and 6.183 MHz, respectively.

As in the transmitter's design, the receiver's source (IPCLK, 6.312 MHz) and sink (PLLOPCLK, 6.183 MHz) clocks must be locked together.
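The control-bit removal described above can be sketched in Python as follows. This is a simplified behavioral model under stated assumptions, not the PAL logic itself: the stream is assumed to be frame aligned already, the helper names and test payload are hypothetical, and the slot counter merely stands in for the divide-by-49 counter whose carry-out (LD49) suppresses the OPFIFO write strobe.

```python
BLOCK_BITS = 49  # one control bit followed by 48 data bits


def insert_control_bits(data_bits, control_bits):
    """Transmit side: put one control bit in front of every 48 data bits."""
    framed = []
    for i, control in enumerate(control_bits):
        framed.append(control)
        framed.extend(data_bits[i * 48:(i + 1) * 48])
    return framed


def remove_control_bits(aligned_stream):
    """Receive side: copy every bit except the one in each control slot.

    Assumption: position 0 of every 49-bit block holds a control bit,
    i.e. frame alignment has already been declared.
    """
    recovered = []
    slot = 0
    for bit in aligned_stream:
        write_enable = slot != 0  # no write strobe in the control slot
        if write_enable:
            recovered.append(bit)
        slot = (slot + 1) % BLOCK_BITS
    return recovered


# Hypothetical payload: four blocks of 48 data bits.
payload = [int((i * 7) % 3 == 0) for i in range(4 * 48)]
framed = insert_control_bits(payload, control_bits=[0, 0, 0, 1])
recovered = remove_control_bits(framed)
```

Because one bit in every 49 is discarded, the recovered stream carries 48/49 of the line rate, which is why the OPFIFO is written at 6.312 MHz but read at 6.183 MHz.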
A phase-lock-loop circuit keeps the receiver's source and sink clocks locked together.

IC3 provides the control and strobe signals for removing control bits from the data stream. The equations in the source code for this device (Appendix F) show how LD49 gates the write strobe and how OPHFULL enables the read strobe.

Other Considerations

The T2 system requires interfaces at both the transmit and receive sites between the hardware described here and the relevant DS-2 equipment. Rockwell's industry-standard DX-33B-4 (CLNS-95-297) and DX-33K-3 (CLNS-95308) boards suit this task. The latter is fitted with a termination network that matches the receiver's input impedance to that of the transmission medium.

Parts Lists

Transmitter:
IC1 = CY7C433
IC2 = CY7C22V10
IC3 = CY7C22V10
IC4 = CY7C225
IC5 = CY7C22V10

Receiver:
IC1 = CY7C433
IC2 = CY7C22V10
IC3 = CY7C22V10
IC4 = CY7C22V10
IC5 = CY7C22V10
IC6 = CY7C22V10
IC7 = CY7C22V10
IC8 = CY7C433

Appendix A. PAL Equations For TXCNTRL

PAL22V10
T2 TRANSMITTER CONTROLLER (IC2)
CYPRESS SEMICONDUCTOR
/TXCLKIN /PUR /TXCLKOUT HFULL FBIT /FBITLOAD FIFODATA NC8 NC9 NC10 NC11 GND
NC13 WRCLK /RDCLK /ENREAD OPDATA /EN /B /A /INCFADD NC22 NC23 VCC

EQUATIONS

WRCLK = TXCLKIN
RDCLK = TXCLKOUT*ENREAD*/PUR
ENREAD := /ENREAD*HFULL*/PUR
        + ENREAD*/PUR

; TXCLKIN = 6.183 MHz
; TXCLKOUT = 6.312 MHz
; HFULL = FIFO HALF-FULL FLAG
; PUR = POWER-ON-RESET SIGNAL
; WRCLK = FIFO SHIFT-IN
; RDCLK = FIFO SHIFT-OUT

OPDATA := /OPDATA*FBIT*EN
        + /OPDATA*FIFODATA*/EN
        + OPDATA*/FBIT*EN
        + OPDATA*FIFODATA*/EN

; FBIT = FRAMING BIT FROM PROM
; EN = SELECTS DATA OR FRAMING BIT
; FIFODATA = DATA RETRIEVED FROM FIFO
; OPDATA = DATA PASSED TO DS-2 INTERFACE
; FBITLOAD = FRAMING BIT TO BE INSERTED INTO DATA STREAM
; INCFADD = CAUSES PROM ADDRESS TO BE INCREMENTED
; BA SEQUENCER = CONTROLS SELECTION OF FRAMING BITS

A := /B*/A*FBITLOAD*/PUR
   + /B*A*/PUR
B := /B*A*/PUR
   + B*/A*/PUR
EN = /B*A
INCFADD = B*A

; STATE DIAGRAM FOR BA SEQUENCER
;
;                        EN           INCFADD
;  PUR        FBITLOAD
; ---->| 0 |----->-----| 1 |---->----| 3 |---->----| 2 |
;        |                                           |
;        +---------------------<---------------------+

Appendix B. PAL Equations For DIVBY49

PAL22V10
DIVIDE BY 49 COUNTER (IC 5)
CYPRESS SEMICONDUCTOR
/TXCLKOUT NC2 NC3 NC4 NC5 NC6 NC7 NC8 NC9 NC10 NC11 GND
NC13 /FBITLOAD Q0 Q1 Q2 Q3 Q4 Q5 NC21 NC22 NC23 VCC

EQUATIONS

Q0 := /Q0*/FBITLOAD
Q1 := /Q1*Q0*/FBITLOAD
    + Q1*/Q0*/FBITLOAD
Q2 := /Q2*Q1*Q0*/FBITLOAD
    + Q2*/Q1*/FBITLOAD
    + Q2*/Q0*/FBITLOAD
Q3 := /Q3*Q2*Q1*Q0*/FBITLOAD
    + Q3*/Q2*/FBITLOAD
    + Q3*/Q1*/FBITLOAD
    + Q3*/Q0*/FBITLOAD
Q4 := /Q4*Q3*Q2*Q1*Q0*/FBITLOAD
    + Q4*/Q3*/FBITLOAD
    + Q4*/Q2*/FBITLOAD
    + Q4*/Q1*/FBITLOAD
    + Q4*/Q0*/FBITLOAD
Q5 := /Q5*Q4*Q3*Q2*Q1*Q0*/FBITLOAD
    + Q5*/Q4*/FBITLOAD
    + Q5*/Q3*/FBITLOAD
    + Q5*/Q2*/FBITLOAD
    + Q5*/Q1*/FBITLOAD
    + Q5*/Q0*/FBITLOAD
FBITLOAD = Q5*Q4*/Q3*/Q2*/Q1*/Q0

; T2CLKOUT = 6.312 MHz
; Q0-Q5 = COUNTER OUTPUTS
; FBITLOAD = USED TO INSERT FRAMING BITS INTO DATA STREAM
;            (EVERY FORTY-NINTH LOCATION)

Appendix C. PROM Equations

FILENAME: PROM
CONTROL BIT GENERATOR (IC 4)
CYPRESS SEMICONDUCTOR

ADDRESS (HEX)   PROM CONTENTS (HEX)
00              00
01              00
02              00
03              00
04              00
05              01
06              01
07              00
08              00
09              00
0A              00
0B              01
0C              01
0D              00
0E              00
0F              00
10              00
11              01
12              01
13              00
14              00
15              00
16              00
17              01

Appendix D.
PAL Equations For PROMADDR

PAL22V10
FILENAME: PROMADDR
PROM ADDRESS GENERATOR (IC 3)
CYPRESS SEMICONDUCTOR
/TXCLKOUT /PUR /INCFADD NC4 NC5 NC6 NC7 NC8 NC9 NC10 NC11 GND
NC13 A0 A1 A2 A3 A4 /RELOAD NC20 NC21 NC22 NC23 VCC

EQUATIONS

A0 := /A0*INCFADD*/RELOAD
    + A0*/INCFADD*/RELOAD
A1 := /A1*A0*INCFADD*/RELOAD
    + A1*/A0*/RELOAD
    + A1*/INCFADD*/RELOAD
A2 := /A2*A1*A0*INCFADD*/RELOAD
    + A2*/A1*/RELOAD
    + A2*/A0*/RELOAD
    + A2*/INCFADD*/RELOAD
A3 := /A3*A2*A1*A0*INCFADD*/RELOAD
    + A3*/A2*/RELOAD
    + A3*/A1*/RELOAD
    + A3*/A0*/RELOAD
    + A3*/INCFADD*/RELOAD
A4 := /A4*A3*A2*A1*A0*INCFADD*/RELOAD
    + A4*/A3*/RELOAD
    + A4*/A2*/RELOAD
    + A4*/A1*/RELOAD
    + A4*/A0*/RELOAD
    + A4*/INCFADD*/RELOAD
RELOAD = PUR + A4*A3*/A2*/A1*/A0

; T2CLKOUT = 6.312 MHz
; PUR = POWER-ON-RESET
; INCFADD = INCREMENT ADDRESS COUNT
; A0-A4 = PROM ADDRESS
; RELOAD = LOAD COUNTER WITH BASE COUNT

Appendix E. PAL Equations For DATASORT

PAL22V10
FILENAME: DATASORT
ARRANGE DATA READY FOR PATTERN DETECTOR (IC 2)
CYPRESS SEMICONDUCTOR
/ICLK3 A4 A3 A2 A1 INPUT /NEXT /PUR NC9 NC10 NC11 GND
NC13 /Y /X /DSTAGGR D4 D3 D2 D1 D0 REGIN NC23 VCC

EQUATIONS

X := /X*/Y*NEXT*/DSTAGGR*/PUR
   + X*/Y*/PUR
   + X*/NEXT*/PUR
   + X*DSTAGGR*/PUR
Y := /Y*X*NEXT*/DSTAGGR*/PUR
   + Y*X*/PUR
   + Y*/NEXT*/PUR
   + Y*DSTAGGR*/PUR
DSTAGGR := /DSTAGGR*Y*/X*NEXT*/PUR
         + DSTAGGR*/PUR

; STATE DIAGRAM FOR YX SEQUENCER
;
;  PUR       NEXT*/DSTAGGR     NEXT*/DSTAGGR     NEXT*/DSTAGGR
; ---->| 0 |------->-------| 1 |------->-------| 3 |------->-------| 2 |
;        |                     NEXT*/DSTAGGR                         |
;        +-----------------------------<-----------------------------+
;
; YX SEQUENCER = CONTROLS ARRANGEMENT OF DATA IN FIFO
; DSTAGGR = INDICATES WHEN DATA READY FOR PATTERN RECOGNITION
; PUR = POWER-ON-RESET
; NEXT = COUNTER O/P, CONTROLS DATA ORGANISATION INTO/OUT OF FIFO

REGIN := INPUT
D0 := /DSTAGGR
    + /D0*A1*DSTAGGR*/Y*/X
    + D0*Y
    + D0*X
    + D0*A1
D1 := /Y*/DSTAGGR
    + /D1*Y*X*/DSTAGGR
    + /D1*Y*/X*A2*/DSTAGGR
    + /D1*/Y*/X*A2*DSTAGGR
    + D1*A2
    + D1*X
    + D1*Y*DSTAGGR
D2 := /Y*/DSTAGGR
    + /D2*Y*/X*A3*/DSTAGGR
    + /D2*/Y*/X*A3*DSTAGGR
    + D2*A3
    + D2*Y*DSTAGGR
    + D2*X*DSTAGGR
D3 := /Y*/X*/DSTAGGR
    + /D3*X*A4*/DSTAGGR
    + /D3*Y*/X*A4*/DSTAGGR
    + /D3*/Y*/X*A4*DSTAGGR
    + D3*A4
    + D3*X*DSTAGGR
    + D3*Y*DSTAGGR
D4 := REGIN

; D0-D4 = OUTPUTS TO PATTERN RECOGNITION CIRCUITRY, ALSO
;         REGISTERED DATA BEING FED BACK INTO FIFO I/P STAGES
; A0-A4 = FIFO OUTPUTS BEING FED TO REGISTER
; INPUT = SERIAL DATA STREAM FROM RECEIVER I/P STAGE
; REGIN = REGISTERED I/P DATA

Appendix F. PAL Equations For CLKGEN

PAL22V10
FILENAME: CLKGEN
CLOCK GENERATOR FOR DATA SORTING CIRCUITRY AND OPFIFO (IC 3)
CYPRESS SEMICONDUCTOR
/IPCLK /Y /X /DSTAGGR PLLOPCLK /PUR OPHFULL /LD49 /ALIGNF NC10 NC11 GND
NC13 /CLK3 /ICLK3 /WR /RD /CLK4 /ICLK4 /ENREAD RDCLK WRCLK NC23 VCC

EQUATIONS

CLK3 = IPCLK
ICLK3 = /IPCLK
CLK4 = PLLOPCLK
ICLK4 = /PLLOPCLK
WR = /ICLK3
RD = /Y*X*/DSTAGGR*ICLK4
   + Y*/DSTAGGR*ICLK4
   + /Y*/X*DSTAGGR*ICLK4

; IPCLK = MASTER CLOCK FROM DS-2 INTERFACE (6.312 MHz)
; YX SEQUENCER = USED TO CONTROL WHEN FIFO DATA RETRIEVED
; DSTAGGR = USED TO CONTROL WHEN FIFO DATA RETRIEVED
; PLLOPCLK = O/P FROM PHASE LOCK LOOP CIRCUIT (6.183 MHz),
;            DERIVED FROM 6.312 MHz MASTER CLOCK
; CLK3/ICLK3 = DERIVATION OF MASTER CLOCK (6.312 MHz)
; CLK4/ICLK4 = DERIVATION OF PHASE LOCKED O/P (6.183 MHz)
; WR = I/P STAGE FIFO SHIFT-IN
; RD = I/P STAGE FIFO SHIFT-OUT

WRCLK = ALIGNF*/LD49*/CLK3
RDCLK = ENREAD*CLK4
ENREAD := /ENREAD*OPHFULL*/PUR
        + ENREAD*/PUR

; WRCLK = SHIFT-IN SIGNAL TO O/P STAGE FIFO
; RDCLK = SHIFT-OUT SIGNAL TO O/P STAGE FIFO
; ENREAD = CONTROLS WHEN DATA CAN BE READ FROM O/P STAGE FIFO
; ALIGNF = "ALIGNMENT" INDICATOR
; LD49 = O/P STAGE FIFO SHIFT-IN DISABLE TERM
; OPHFULL = INDICATES WHEN O/P STAGE FIFO IS HALF FULL
; PUR = POWER-ON-RESET

Appendix G. PAL Equations For DSCOUNT

PAL22V10
FILENAME: DSCOUNT
DIVIDE-BY-49 COUNTER FOR DATA SORTING PROCESS (IC 6)
CYPRESS SEMICONDUCTOR
/ICLK3 /PUR /DSTAGGR NC4 NC5 NC6 NC7 NC8 NC9 NC10 NC11 GND
NC13 Q0 Q1 Q2 Q3 Q4 Q5 /NEXT NC21 NC22 NC23 VCC

EQUATIONS

Q0 := /Q0*/DSTAGGR*/NEXT
    + Q0*DSTAGGR*/NEXT
Q1 := /Q1*Q0*/DSTAGGR*/NEXT
    + Q1*/Q0*/NEXT
    + Q1*DSTAGGR*/NEXT
Q2 := /Q2*Q1*Q0*/DSTAGGR*/NEXT
    + Q2*/Q1*/NEXT
    + Q2*/Q0*/NEXT
    + Q2*DSTAGGR*/NEXT
Q3 := /Q3*Q2*Q1*Q0*/DSTAGGR*/NEXT
    + Q3*/Q2*/NEXT
    + Q3*/Q1*/NEXT
    + Q3*/Q0*/NEXT
    + Q3*DSTAGGR*/NEXT
Q4 := /Q4*Q3*Q2*Q1*Q0*/DSTAGGR*/NEXT
    + Q4*/Q3*/NEXT
    + Q4*/Q2*/NEXT
    + Q4*/Q1*/NEXT
    + Q4*/Q0*/NEXT
    + Q4*DSTAGGR*/NEXT
Q5 := /Q5*Q4*Q3*Q2*Q1*Q0*/DSTAGGR*/NEXT
    + Q5*/Q4*/NEXT
    + Q5*/Q3*/NEXT
    + Q5*/Q2*/NEXT
    + Q5*/Q1*/NEXT
    + Q5*/Q0*/NEXT
    + Q5*DSTAGGR*/NEXT
NEXT = PUR + Q5*Q4*/Q3*/Q2*/Q1*/Q0*/DSTAGGR

; ICLK3 = 6.312 MHz CLOCK DERIVED FROM DS-2 INTERFACE
; DSTAGGR = INDICATES WHEN DATA IS READY TO BE INTERROGATED BY
;           PATTERN RECOGNITION CIRCUITRY
; PUR = POWER-ON-RESET
; Q0-Q5 = O/P STAGES OF COUNTER
; NEXT = LOAD-ALL-ZEROES COMMAND TO COUNTER

Appendix H. PAL Equations For CBITRCNT

PAL22V10
FILENAME: CBITRCNT
CONTROL BIT REMOVAL INDICATOR/COUNTER (IC 7)
CYPRESS SEMICONDUCTOR
/CLK3 /RESET /START NC4 NC5 NC6 NC7 NC8 NC9 NC10 NC11 GND
NC13 Q0 Q1 Q2 Q3 Q4 Q5 /LD49 NC21 NC22 NC23 VCC

EQUATIONS

Q0 := /Q0*START*/LD49
    + Q0*/START*/LD49
Q1 := /Q1*Q0*START*/LD49
    + Q1*/Q0*/LD49
    + Q1*/START*/LD49
Q2 := /Q2*Q1*Q0*START*/LD49
    + Q2*/Q1*/LD49
    + Q2*/Q0*/LD49
    + Q2*/START*/LD49
Q3 := /Q3*Q2*Q1*Q0*START*/LD49
    + Q3*/Q2*/LD49
    + Q3*/Q1*/LD49
    + Q3*/Q0*/LD49
    + Q3*/START*/LD49
Q4 := /Q4*Q3*Q2*Q1*Q0*START*/LD49
    + Q4*/Q3*/LD49
    + Q4*/Q2*/LD49
    + Q4*/Q1*/LD49
    + Q4*/Q0*/LD49
    + Q4*/START*/LD49
Q5 := /Q5*Q4*Q3*Q2*Q1*Q0*START*/LD49
    + Q5*/Q4*/LD49
    + Q5*/Q3*/LD49
    + Q5*/Q2*/LD49
    + Q5*/Q1*/LD49
    + Q5*/Q0*/LD49
    + Q5*/START*/LD49
LD49 = Q5*Q4*/Q3*/Q2*/Q1*/Q0 + RESET

; CLK3 = 6.312 MHz CLOCK DERIVED FROM THE DS-2 INTERFACE
; RESET = LOCALISED RESET GENERATED WHEN "ALIGNMENT" IS LOST
; START = INDICATES THAT THE FIRST CONTROL BIT SEQUENCE (01000)
;         HAS BEEN DETECTED
; Q0-Q5 = COUNTER O/P STAGES
; LD49 = LOAD-ALL-ZEROES COMMAND

Appendix I. PAL Equations For ALIGNDET

PAL22V10
FILENAME: ALIGNDET
FRAME ALIGNMENT DETECTOR (IC 4)
CYPRESS SEMICONDUCTOR
/CLK3 D0 D1 D2 D3 D4 /PUR /E /D /C /LD49 GND
NC13 NC14 /MTRUE /FTRUE /START /RESET NC19 NC20 NC21 /MB /MA VCC

EQUATIONS

START := /START*/D4*D3*/D2*/D1*/D0
       + START*/RESET
FTRUE = /E*/D*C*LD49*/D4*/D1*START
      + E*D*/C*LD49*D4*/D1*START
MTRUE = E*D*C*LD49*D4*D3*/D0*START*/MB
      + E*D*C*LD49*D4*D3*/D0*START*MB*MA
      + E*D*C*LD49*/D4*D3*/D0*START*MB*/MA
RESET = PUR
      + E*D*C*LD49*/D4*START*/MB
      + E*D*C*LD49*/D3*START*/MB
      + E*D*C*LD49*D0*START*/MB
      + E*D*C*LD49*/D4*START*MB*MA
      + E*D*C*LD49*/D3*START*MB
      + E*D*C*LD49*D0*START*MB
      + E*D*C*LD49*D4*START*MB*/MA
      + /E*/D*C*LD49*D4*START
      + /E*/D*C*LD49*/D1*START
      + E*D*/C*LD49*/D4*START
      + E*D*/C*LD49*D1*START

; CLK3 = 6.312 MHz CLOCK DERIVED FROM DS-2 INTERFACE
; D0-D4 = DATA CHANNELS ON WHICH CONTROL-BIT-PATTERN-RECOGNITION
;         IS CARRIED OUT
; PUR = POWER-ON-RESET
; EDC = SEQUENCER USED WHEN SEEKING "ALIGNMENT"
; LD49 = INDICATES WHEN COMPARISON BETWEEN DATA CHANNELS AND
;        EXPECTED PATTERN SHOULD BE CARRIED OUT
; MTRUE = MULTI-FRAME DETECTION INDICATOR
; FTRUE = SUB-FRAME DETECTION INDICATOR
; START = INDICATES THAT THE FIRST CONTROL BIT PATTERN HAS BEEN DETECTED
; RESET = ASSERTED WHEN ACTUAL AND EXPECTED CONTROL BIT PATTERNS
;         ARE NOT IN AGREEMENT
; MBMA = SEQUENCER ASSOCIATED WITH MULTI-FRAME DETECTION

Appendix J.
PAL Equations For FRAMCHEK

PAL22V10
FILENAME: FRAMCHEK
FRAME ALIGNMENT CHECKER AND OPFIFO WRITE CONTROLLER (IC 5)
CYPRESS SEMICONDUCTOR
/CLK3 /RESET /MTRUE /FTRUE /LD49 NC6 NC7 NC8 NC9 NC10 NC11 GND
NC13 /MB /MA /FB /FA /E /D /C /ALIGNF NC22 NC23 VCC

EQUATIONS

MB := /MB*MA*MTRUE*/RESET
    + MB*MA*/RESET
    + MB*/MTRUE*/RESET
MA := /MA*/MB*MTRUE*/RESET
    + MA*/MB*/RESET
    + MA*/MTRUE*/RESET

; M SEQUENCER STATE DIAGRAM
;
;  RESET       MTRUE          MTRUE          MTRUE
; ---->| 0 |---->----| 1 |---->----| 3 |---->----| 2 |
;        |                 MTRUE                   |
;        +-------------------<---------------------+

FB := /FB*FA*FTRUE*/RESET
    + FB*/FA*/RESET
    + FB*/E*/RESET
    + FB*D*/RESET
    + FB*/C*/RESET
FA := /FA*/FB*FTRUE*/RESET
    + FA*/FB*/RESET
    + FA*/E*/RESET
    + FA*D*/RESET
    + FA*/C*/RESET

; F SEQUENCER STATE DIAGRAM
;
;  RESET       FTRUE          FTRUE          E*/D*C
; ---->| 0 |---->----| 1 |---->----| 3 |---->----| 2 |
;        |                 E*/D*C                  |
;        +-------------------<---------------------+

E := /E*D*/C*LD49*/RESET
   + E*D*/RESET
   + E*/C*/RESET
PAL Equations For FRAMCHEK (cont.)

D := /D*/E*C*LD49*/RESET
   + D*/E*/RESET
   + D*/C*/RESET
   + D*/LD49*/RESET
C := /E*/D*/C*LD49*/RESET
   + E*D*/C*LD49*/RESET
   + /E*/D*C*/RESET
   + E*D*C*/RESET
   + /E*C*/LD49*/RESET

; EDC SEQUENCER STATE DIAGRAM
;
;  RESET     LD49       LD49       LD49       LD49       LD49       LD49
; ---->| 0 |-->--| 1 |-->--| 3 |-->--| 2 |-->--| 6 |-->--| 7 |-->--| 5 |
;        |                                                           |
;        +-----------------------------<-----------------------------+

ALIGNF := /ALIGNF*E*/D*C*/RESET
        + ALIGNF*/RESET

; ALIGNF STATE DIAGRAM
;
;                         ALIGNF
;  RESET       E*/D*C
; ---->| 0 |----->-----| 1 |
;
; SEQUENCE OF EVENTS PRIOR TO ALIGNMENT DECLARATION:
;
; START-LD49-STRUE-LD49-LD49-LD49-STRUE-LD49-LD49-MTRUE
;
; CLK3 = 6.312 MHz CLOCK DERIVED FROM DS-2 INTERFACE
; RESET = ISSUED IF ACTUAL AND EXPECTED CONTROL BIT PATTERNS DO
;         NOT AGREE
; MTRUE = MULTI-FRAME DETECTION INDICATOR
; FTRUE = SUB-FRAME DETECTION INDICATOR
; LD49 = INDICATES WHEN COMPARISON BETWEEN ACTUAL AND EXPECTED
;        CONTROL BIT PATTERNS SHOULD TAKE PLACE
; MBMA = SEQUENCER ASSOCIATED WITH MULTI-FRAME DETECTION
; FBFA = SEQUENCER ASSOCIATED WITH SUB-FRAME DETECTION
; EDC = SEQUENCER USED IN DETERMINATION OF "ALIGNMENT"
; ALIGNF = WHEN TRUE INDICATES "ALIGNMENT" HAS BEEN ATTAINED

Using CUPL With Cypress PLDs

This application note covers the following topics:

- CUPL package components
- CUPL programming language syntax
- CUPL examples, using Cypress PLDs
- CUPL compiling

A high-level universal language for programmable logic devices (PLDs), CUPL works with schematic capture packages such as SCHEMA and OrCAD-SDT and can port to UNIX-based systems.

CUPL Package Components

The CUPL package consists of CUPL (Universal Compiler for Programmable Logic), CSIM (CUPL Simulator), CBLD (CUPL Build), and PTOC (PALASM to CUPL Translator).

CUPL

The major component of the CUPL package is the CUPL program. This file allows you to compile logic description files that can be downloaded to a device programmer. CUPL supports Cypress's entire 20-pin PAL family, the PAL C 22V10, the PAL C 20G10, and the CY7C33x family of parts. In addition to providing a programming syntax similar to that of other PLD programming packages, CUPL helps implement lists, address ranges, and bit fields efficiently. CUPL includes state machine syntax (SMS) and truth-table input capability, allowing you to enter complex designs easily into Cypress's PLDs. CUPL also has four levels of minimization for logic reduction.

CUPL comes with a menu-driven interface and a DOS command-line interface (the latter is explained in the last section of this application note). The menu interface integrates all the features necessary for efficient design implementation, including a program and JEDEC file editor, compiler, and simulator (Figure 1).

Figure 1. Menu Interface Screen (menu entries include: Compile CUPL file, Look at DOC file, Review error LST file, JEDEC file editor, Input simulation file, Simulate CUPL file, View Simulation Results, Device Selection, Help (CUPL Quick Reference), Tutorial for PLD's, Quit)

CSIM

CSIM, the file simulator for CUPL, takes an ASCII file as input (filename.SI) and outputs a file called filename.SO. The input file functionally describes the part by specifying the device's input and expected output. The output file contains a comparison of the device's expected output with its actual output; this is based on a file created by CUPL during compilation called the absolute file, filename.ABS. The comparison file contains the original header information found in filename.SI, all vectors that compared positively, and all discrepancies. CSIM flags the discrepancies with the values determined from the original logic equations. The CSIM command line is shown in Figure 2. When running CSIM with the -w or -d flag, you can change the view of the waveform by using the keys shown in Figure 3.

CSIM [flags] [library] source

Where:
[flags] may have the following values
  -l  create listing file
  -j  append test vectors to JEDEC file
  -v  display simulation to screen
  -u  use specified library
  -w  (MS-DOS only) create listing file and display waveforms
  -d  (MS-DOS only) display an existing simulation output file in waveform format
[library] is the name of the library that contains the device which was used when CUPL compiled the original source file.
source is the name of the ASCII source file

Figure 2. CSIM Command Line

Right arrow   Scroll Right
Left arrow    Scroll Left
Up arrow      Scroll Up
Down arrow    Scroll Down
F1            Decrease scale horizontally
F2            Enlarge scale horizontally
F3            Grid on/off
F4            Exit to DOS
F5            Shift screen left
F6            Shift screen right
F9            Create waveform hardcopy
F10           Waveform legend

Figure 3. CSIM Waveform Viewing Commands

CBLD

The CBLD program allows you to maintain and personalize CUPL device libraries. Figure 4 shows the CBLD command line. You can use CBLD to create custom library files consisting, for example, of only the parts you currently use. The structure of this ASCII text file appears in Figure 5. CBLD also checks to see if the current CUPL version matches the current version of the device library. If the key in the library does not match the CUPL version, CBLD generates an error message, and compilation is aborted. The file CUPL.DL contains a description of all devices supported by the current version of CUPL.

CBLD [flags] [build] [library] [devices]

Where:
[flags] may have the following values
  -b  generate library using build file
  -l  list long contents of library
  -m  list allowable macros by pin
  -t  list short contents of library
  -u  use specified library
  -e  list allowable extensions for devices
[build] is the name of the build file to be used with the -b option flag
[library] is a device library name and path name to be used with the -u option
[devices] is one or more device names to be used with the -t or -l option

Figure 4. CBLD Command Line

TARGET library
SOURCE library1 devices | *
SOURCE library2 devices | *

Where:
TARGET identifies the new library.
SOURCE identifies the source libraries.
library indicates the target library name.
library1 and library2 indicate source library names.
devices describes devices that are contained in the libraries.
* is used to describe all devices in a library.

Figure 5. CBLD Custom Library Build File Format

CUPL Programming Language Elements

The CUPL programming language's elements and syntax are very similar to those of other languages. Reserved words that cannot be used as variable names are listed in Figure 6.

APPEND      FORMAT      PIN
ASSEMBLY    FUNCTION    PINNODE
ASSY        IF          PRESENT
COMPANY     JUMP        REV
CONDITION   LOC         REVISION
DATE        LOCATION    SEQUENCE
DEFAULT     MACRO       SEQUENCED
DESIGNER    MIN         SEQUENCEJK
DEVICE      NAME        SEQUENCERS
ELSE        NODE        SEQUENCET
FIELD       OUT         TABLE
FLD         PARTNO

Figure 6. CUPL Reserved Words

You can use alternate number bases in CUPL by putting the base's name within single quotes immediately before the number. The designations for the supported number bases appear in Table 1. For example, to assign the hexadecimal value 16 to the variable "A," write:

A = 'h'16

You can place an "X" within any number to indicate a Don't Care value. Appendix A shows an example of using the Don't Care specification within truth tables.

Table 1. Number Base Representation

Base Name     Base  Prefix
Binary        2     'b'
Octal         8     'o'
Decimal       10    'd'
Hexadecimal   16    'h'

Comments are delimited with /* and */. The CUPL compiler ignores everything between these characters. For example, to put a paragraph of explanation within a program, enclose the entire paragraph in a set of comment delimiters. You do not have to put delimiters on every line, as in some packages.

CUPL also supports list notation. Enclose all items in the list in square brackets:

[variable, variable, variable, ...]

When using sequentially numbered lists, you can abbreviate the format to

[variablem..n]

CUPL's format can be considered in three major parts: the header, pin/node definition, and equations sections. The header section contains general information about the design. The pin/node section assigns variable names to the device's pins and nodes. The equations section declares the device's function and can include truth tables, state machine syntax, Boolean equations, or a combination of these three. (Sample CUPL programs are listed in the appendices and are described later in this application note.)

Header Section

Figure 7 shows the header format. The NAME descriptor must be followed by the name for the JEDEC map output, and the DEVICE descriptor must specify the device library for use during compilation. If you specify a different device file on the command line when you invoke the compiler, this file overrides the name found after DEVICE in the programming file.

NAME;
PARTNO;
REVISION;
DATE;
DESIGNER;
COMPANY;
ASSEMBLY;
LOCATION;
DEVICE;
FORMAT;

Figure 7. CUPL Header Format

Pin/Node Section

The pin declaration assigns specific pins to variable names using the format

PIN pin_n = [!]var;

Both pin_n and var can be lists. Use the "!" with inputs to indicate an active Low. The compiler chooses the signal's inverted sense when it is indicated as active in the logic equations. Use the "!" with outputs to indicate an active-Low output, and write the equations in a logically true form. In this case, the compiler performs DeMorgan's Theorem on the output variable to ensure that the output is a Low-asserted signal.

The NODE declaration statement tells the compiler that a variable is needed to hold some kind of state information within the device. This variable's outputs are not assigned to any output pin. You can use the NODE statement to assign variable names, and thus functions, to the buried registers in the CY7C330. Or you might use the NODE statement to arbitrarily assign a variable name to any unused macrocell in a PAL C 22V10. This statement has the form

NODE [!]var;

Because the NODE statement arbitrarily assigns a register to the specified variable name, it might be more desirable to force the assignment of a variable to a specific node. You can do this with the PINNODE statement:

PINNODE node_n = [!]var;

The FIELD assignment assigns a group of signals to one variable name. This feature is useful for address decoding and with truth tables, as shown in Appendix A. The FIELD statement has the form:

FIELD var = [var, var, ..., var]

The MIN declaration overrides the minimization level for a specific variable. This is useful, for example, in designs where a portion of the design should not be minimized. The MIN declaration has the form

MIN var[.ext] = level;

CUPL also contains several preprocessor commands that operate on the source file before the file is passed on to the parser. These commands perform functions such as string substitution, file inclusion, and conditional compilation. The commands allow you to develop general-purpose descriptions or modular portions of descriptions and customize them for different applications. Appendix D shows how to use the preprocessor command $DEFINE to assign numbers to state variables.

Table 2. CUPL Logical Operators

Operator  Example  Description
!         !A       NOT
&         A&B      AND
#         A#B      OR
$         A$B      XOR

Ext      Side  Description
.D       L     D input of D flip-flop
.L       L     D input of latch
.J       L     J input of JK flip-flop
.K       L     K input of JK flip-flop
.S       L     S input of SR flip-flop
.R       L     R input of SR flip-flop
.T       L     T input of T flip-flop
.DQ      R     Q output of D flip-flop
.LQ      R     Q output of a latch
.AP      L     Asynch preset of flip-flop
.AR      L     Asynch reset of flip-flop
.SP      L     Synch preset of flip-flop
.SR      L     Synch reset of flip-flop
.CK      L     Programmable clock of flip-flop
.OE      L     Programmable OE
.CA      L     Complement array
.PR      L     Programmable preload
.CE      L     CE input of enabled D-CE type flip-flop
.LE      L     Programmable latch enable
.OBS     L     Programmable observability of buried nodes
.BYP     L     Register bypass
.DFB     R     D feedback selection
.LFB     R     Latch feedback selection
.TFB     R     T feedback selection
.IO      R     Pin feedback selection
.INT     R     Internal feedback selection
.CKMUX   L     Clock MUX selection
.OEMUX   L     Tri-state MUX selection
.TEC     L     Technology-dependent fuse selection
.IMUX    L     Input MUX selection of two pins
.T1      L     T1 input of 2-T flip-flop
.T2      L     T2 input of 2-T flip-flop
.IOD     R     Pin feedback path through D register
.IOL     R     Pin feedback through latch
.IOCK    L     Clock for pin feedback register
.IOAR    L     Asynchronous reset for pin feedback register
.IOAP    L     Asynchronous preset for pin feedback register
.IOSR    L     Synchronous reset for pin feedback register
.IOSP    L     Synchronous preset for pin feedback register
.ARMUX   L     Asynchronous reset MUX selection
.APMUX   L     Asynchronous preset MUX selection
.LEMUX   L     Latch enable MUX selection

CUPL Programming Language Syntax

This section focuses on CUPL's equation section. The program's logical and arithmetic operators (Tables 2 and 3, respectively) resemble those used in other programming languages.

A variable's function depends on the extension added to it in the logic equation. These extensions define such capabilities as flip-flop descriptions and programmable three-state enables. The first column of Figure 8 lists the extension that is used after the variable name. The second column indicates the side of the equation on which the extension is used. The third column briefly describes the extension's function.
For example, the .OE extension controls the output-enable function for all Cypress PLDs with I/O pins; the .CKMUX extension selects the source for the inputregister clock in the CY7C330 and CY7C332; and .D selects registered output on devices that have both combinatorial and registered outputs. To see the extensions you can use with a specific Cypress part, use the CBLD program. To see all the possible extensions for use when programming the PAL C 22VlO, for example, the command line is CBLD -e CUPL P22VlO You can use the APPEND statement to assign more than one expression to a variable. This is the same as logically ORing the variable's present state with the expression that follows the APPEND statement. The latter has the form APPEND [!]var[.ext] = expr; CUPL also has several powerful set operations that you can use to increase code readability and decrease the amount of equation input. These set operations serve in the equations section to simplify equation input. For example, [varl, var2, var3] & var4; equates to [varl&var4, var2&var4, var3&var4] Table 3. CUPL Arithmetic Operators Operator + * I % ** Figure 8. CUPL Variable Extensions 6-96 Example A+B A-B A*B AlB A%B A**B Operation Add Subtract Multiply Divide Modulus Exponent Priority 1 1 2 2 2 3 TABLE var list 1 { -- input_1 input_2 => => => output_1 output_n Where: var_list_1 are the input variables var list 2 are then output variables inp~t_n is the value of the inputs (hex by default) output_n is the value of the outputs PRESENT 'b'Ol NEXT 'b'10; Figure 10. Unconditional Next State Diagram tion to state_m, else if expr_n is true, transition to state_y, else transition to state z. PRESENT state n IF expr NEXT state_ m; Figure 9. Truth Table Entry Format J Use set operations such as this with caution to ensure that when CUPL expands an expression, the result represents the minimum amount of logic needed to completely specify the desired operation. 
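The expansion of a set operation can be pictured with a short Python sketch. It is only an illustration of the element-by-element expansion described above, not CUPL's own implementation; the helper names `expand_set` and `evaluate_and` are hypothetical.

```python
def expand_set(operator, members, operand):
    """Return one expression string per set member, e.g. 'var1&var4'."""
    return [f"{member}{operator}{operand}" for member in members]


def evaluate_and(member_values, operand_value):
    """Evaluate [m1, m2, ...] & operand for concrete True/False values."""
    return [bool(value and operand_value) for value in member_values]


expanded = expand_set("&", ["var1", "var2", "var3"], "var4")
print(expanded)
print(evaluate_and([True, False, True], True))
```

Each set member yields its own expression, so a set with many members applied to a wide operand produces correspondingly more logic.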
To see if a set of variables equals a constant, type

    [var1, var2, var3]:constant

Or to check whether a set of variables lies between a range of constants, type

    [var1, var2, var3]:[constant_lo..constant_hi]

CUPL supports truth tables with the format shown in Figure 9. Truth tables are one of the easiest ways to express device function, and they are among the most easily modified methods of design entry. You specify the input and output variable lists, then specify a one-to-one assignment from the value of the input variable list to the value of the output variable list. You can use Don't Care values in the input specifications to make design entry easier. An example of truth tables with Don't Care values is shown in Appendix A.

The state machine syntax of CUPL has the general form of

    SEQUENCE state_var_list {
        PRESENT state_1
            statements;
        .
        .
        PRESENT state_n
            statements;
    }

where SEQUENCE is the state space, and PRESENT indicates the device's present state and the function the machine should perform based on that state. The state machine syntax can be divided into six parts:

1. Unconditional Next Statement (Figure 10): If the machine is in state_n, then transition to state_m.

    PRESENT state_n
        NEXT state_m;

2. Conditional Next Statement (Figure 11): If the machine is in state_n and if expr_1 is true, then transition to state_m, else if expr_n is true, transition to state_y, else transition to state_z.

    PRESENT state_n
        IF expr_1 NEXT state_m;
        IF expr_n NEXT state_y;
        [DEFAULT NEXT state_z;]

3. Unconditional Synchronous Output Statement (Figure 12): This statement describes a transition from the present state to a next state with a synchronous output accompanying the transition.

    PRESENT state_n
        NEXT state_m OUT [!]var ... OUT [!]var;

4. Conditional Synchronous Output Statement (Figure 13): This statement describes a conditional transition with its associated synchronous outputs.

    PRESENT state_n
        IF expr NEXT state_1 OUT [!]var ... OUT [!]var;
        IF expr NEXT state_n OUT [!]var ... OUT [!]var;
        [DEFAULT NEXT state_m OUT [!]var;]

5. Unconditional Asynchronous Output Statement (Figure 14): This statement describes the asynchronous outputs associated with a specific state.

    PRESENT state_n
        OUT [!]var ... OUT [!]var;

6. Conditional Asynchronous Output Statement (Figure 15): This statement describes a conditional asynchronous output associated with a specific state and a specific input.

    PRESENT state_n
        IF expr OUT [!]var ... OUT [!]var;
        IF expr OUT [!]var ... OUT [!]var;
        [DEFAULT OUT [!]var ... OUT [!]var;]

    PRESENT 'b'01
        IF INPUTA  NEXT 'B'10;
        IF !INPUTA NEXT 'B'11;

Figure 11. Conditional Next Statement Diagram

    PRESENT 'b'01
        NEXT 'B'10 OUT Y OUT !Z;

Figure 12. Unconditional Synchronous Output Diagram

    PRESENT 'b'01
        IF INPUTA  NEXT 'B'10 OUT Y;
        IF !INPUTA NEXT 'B'11 OUT !Z;

Figure 13. Conditional Synchronous Output Diagram

    PRESENT 'b'01
        OUT Y OUT !Z;

Figure 14. Unconditional Asynchronous Output Diagram

    PRESENT 'b'01
        IF INPUTA  OUT X;
        IF !INPUTB OUT Y;
        DEFAULT    OUT Z;

Figure 15. Conditional Asynchronous Output With Default

CUPL Examples Using Cypress PLDs

The two examples described here both implement the functions of a Thunderbird's (T-Bird's) tail lights, including the sequentially flashing directional signals. The examples present this function in both the truth table and state machine formats to give you models of these CUPL syntax structures.

Truth Table Example

The first example shows how to configure a 22V10 so that it makes two three-segment T-Bird tail lights perform flashing, braking, left turn, right turn, and a combination of these functions.

Consider the truth table example first. This example illustrates both the Truth Table syntax and CUPL's pin declarations. Note that when you use a truth table, you must assign all inputs to a variable name using the FIELD statement. Similarly, you must assign all outputs to a variable name. All the inputs and outputs in the body of the truth table must be specified without commas, brackets, or variables. The CUPL 3.2 source code for this example is shown in Appendix A.

CUPL's simulator verifies that this truth table operates correctly. When compiling the source code, you must use the -A flag to produce an absolute file for the simulator's use. The simulator also needs an input file, filename.SI, which contains the test vectors. To simulate a design with output going to both the screen and a listing file, filename.SO, type

    CSIM -L -v FILENAME

Appendices B and C list the input and output simulation files, respectively.

State Machine Examples with the CY7C330

The second example performs the same function as the first, but is coded in CUPL's state machine syntax instead of truth tables. This second example also differs in that it employs Cypress's CY7C330.

The CY7C330 is a high-performance, erasable programmable logic device (EPLD). Through the use of the user-configurable output macrocell, bidirectional I/O capability, input registers, and three separate clocks, Cypress has tailored the CY7C330's architecture to implement high-performance state machines. This 28-pin device contains 11 dedicated input macrocells, whose input registers can be controlled by either of two input-register clocks. The 12 I/O macrocells (see Figure 1 in "Using ABEL to Program the CY7C330") contain an output register that is controlled by a dedicated state-register clock, output-enable control, an exclusive-OR product term, an input register, and feedback selection. Each macrocell has between nine and 19 product terms you can use for design implementation. Each pair of macrocells also has a shared input multiplexer, which allows you to bury an output register while still utilizing the I/O pin as a device input. The CY7C330's output enable can be controlled by either pin 14 or a product term. The device also provides four buried registers that can hold state information.

The T-Bird design requires only four flip-flops [Q0..3] to specify all possible tail-light combinations. Note that assignments such as LEFT.D = 'b'001 are not allowed in the main body of the state machine structure. Instead, all outputs must be handled individually with the OUT command. The source code for this example appears in Appendix D.

An additional CY7C330 example shows the extended function of this PLD family. The CY7C330, unlike the PAL C 22V10, has more nodes than pins. Thus, the additional nodes must be assigned node numbers so that they can be referenced in the design. Table 4 lists the node names. Numbers 33 to 44 refer to the output register associated with each pin. IMUX1 refers to the shared input multiplexer between pins 28 and 27.

The second CY7C330 design example is an up/down counter with preloadable limits. The lower limits are loaded into the dedicated input registers on the rising edge of the lower-limit clock (LLC), and the upper limits are loaded into the I/O macrocells' input registers on the rising edge of the upper-limit clock (ULC). The waveforms for preloading the upper limits and lower limits are shown in Figure 16.

Another important CY7C330 feature is the XOR product term. During DeMorgan minimization, CUPL uses the XOR term to invert an equation's polarity when an active-Low output signal is specified. Using the XOR term in this example greatly reduces the number of product terms needed to specify the design. By connecting the signal name to the XOR product term, as shown in the equations, the equations represent a T flip-flop. For example, the equations for CNT2 specify that the flip-flop toggles (a) when preloading the lower limit, for CNT2 not equal to LL2, (b) when preloading the upper limit, for CNT2 not equal to UL2, (c) when counting UP, for CNT0 and CNT1 High, and (d) when counting DOWN, for CNT0 and CNT1 Low. It is important to keep in mind that UP, UEQUAL, and LEQUAL are Low-asserted internal signals. The part utilization for this design is shown in Appendix E. The CUPL design file appears in Appendix F.
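The T flip-flop form used by the counter, in which each register input is the register's own output XORed with a toggle condition, can be sketched in Python. This is a behavioural model only, under stated assumptions: it uses a positive-logic `counting_up` flag rather than the Low-asserted UP signal, it treats all eight bits uniformly, and the names `next_count`, `preload_done`, `load_lower`, and `load_upper` are hypothetical.

```python
WIDTH = 8  # counter bits CNT0..CNT7


def bit(value, k):
    """Return bit k of an integer as 0 or 1."""
    return (value >> k) & 1


def next_count(count, *, preload_done, counting_up,
               load_lower=False, lower=0, load_upper=False, upper=0):
    """One clock of a counter whose bits follow D = Q XOR toggle."""
    result = 0
    for k in range(WIDTH):
        q = bit(count, k)
        lower_bits = [bit(count, j) for j in range(k)]
        toggle = (
            # (a) preloading the lower limit: toggle if the bit differs from it
            (not preload_done and load_lower and q != bit(lower, k))
            # (b) preloading the upper limit: toggle if the bit differs from it
            or (not preload_done and load_upper and q != bit(upper, k))
            # (c) counting up: toggle when every lower-order bit is High
            or (preload_done and counting_up and all(lower_bits))
            # (d) counting down: toggle when every lower-order bit is Low
            or (preload_done and not counting_up and not any(lower_bits))
        )
        result |= (q ^ int(toggle)) << k
    return result


print(next_count(4, preload_done=True, counting_up=True))
print(next_count(8, preload_done=True, counting_up=False))
```

Because bit 0 has no lower-order bits, its toggle condition reduces to "preloading is done", so it toggles on every count clock in either direction.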
When preloading is done, the counter counts upward from the last loaded limit until the other limit is reached. The counter then counts in the opposite direction until reaching the other limit. The waveforms for counting between the preloaded limits of 4 and 8 are shown in Figure 17.

    [Waveform diagrams not reproduced. Signals shown: CLK, LLC, ULC, LL0-LL7, LPL, UL0-UL7, UPL, CNT0-CNT7, UEQUAL, UP, PLDONE, LEQUAL, !CNTOE, !RESET.]

Figure 17. Up Down Counter Operation Waveforms

Table 4. Cypress CY7C330 Node Assignments

    PIN      NODE
    BR0      29
    BR1      30
    BR2      31
    BR3      32
    28       33
    27       34
    26       35
    25       36
    24       37
    23       38
    20       39
    19       40
    18       41
    17       42
    16       43
    15       44
    IMUX1    45
    IMUX2    46
    IMUX3    47
    IMUX4    48
    IMUX5    49
    IMUX6    50

If the input register on a specific pin is not being used, you can reference the output register by referring to the I/O pin name. This is shown on pins 20 and 23.

The CY7C330's shared input multiplexer is used to select an additional input into the product term array from either of a macrocell pair's input registers (and thus either macrocell's I/O pin). When referencing this input-signal name in the equations section, you must use the MUX name instead of the actual input signal name.

CY7C332 State Machine Example

The last example uses the Cypress CY7C332. This versatile combinatorial PLD has 25 array inputs: 13 dedicated inputs and 12 I/O inputs. Each input has a macrocell that you can configure as a register, latch, or simple buffer. Outputs have polarity and three-state control product terms. Figure 18 shows the I/O macrocell. Each macrocell has up to 19 product terms to accommodate complex applications.

    [I/O macrocell diagram not reproduced. Labels shown: OE product term, sum of products, XOR, to input, to I/O pin buffer, OE (pin 14), CLK1, CLK2.]

Figure 19. The CY7C332 I/O Macrocell

In this example, the CY7C332 serves as a simple decoder (Appendix G). The device decodes a group of address lines to select one of four "windows" in memory. Inputs are implemented in each of the possible macrocell configurations. When reviewing the example code, it is important to note the use of the .CKMUX, .LEMUX, .DQ, and .LQ extensions.

CUPL Compilation

The input to the CUPL system is an ASCII text file with extension .PLD. The various outputs include a .JED file for programming, a .LST error listing, and a .DOC equations-and-utilization file. You can compile a file either from the DOS command line, or from the CUPL menu structure. The compilation command and its description are shown in Figure 19.

    cupl [-flags] source [library] [device]

Where -flags is the following set of compiler options:

    -j     JEDEC download format
    -n     use source filename as JEDEC filename
    -h     ASCII-HEX download format
    -i     HL download format
    -a     create absolute file (for simulation purposes)
    -l     create listing file
    -x     create expanded product-terms in documentation file
    -f     create fuse plot/chip diagram in documentation file
    -p     create PDEF database interchange format file
    -b     create Berkeley PLA format file
    -d     deactivate unused OR terms
    -r     disable product term merging
    -g     program security fuse
    -u     use specified library for compilation
    -s     perform logic simulation after compile
    -e     create expanded macro definition file
    -x     generates a part usage documentation file (filename.DOC)
    -w     perform simulation with waveform output (PC only)
    -m0    no minimization
    -m1    quick minimization (default)
    -m2    minimization level 2 (Quine-McCluskey)
    -m3    minimization level 3 (Presto)
    -m4    minimization level 4 (Espresso)
    -c     create PALASM format file

Library is the library name, including the path, that should be used other than the default library. This option is used in conjunction with the -u flag.

Device is the CUPL mnemonic name of the device which should be used when compiling the source file. This option overrides the name used in the CUPL source file.

Source is the user-created ASCII logic description file (filename.PLD).

Figure 20. CUPL Compilation

References

This Applications Handbook provides a more detailed explanation of the up/down counter example using the CY7C330 in "Understanding the CY7C330 Synchronous EPLD." More information on the CY7C33x family can be found in Cypress's BiCMOS/CMOS Data Book.

Appendix A. T-Bird Truth-Table CUPL Code for PALC22V10

    Name        TBIRD_TT.PLD;
    Partno      PALC22V10;
    Revision    01;
    Date        04-08-90;
    Designer    Joe Designer;
    Company     Cypress Semiconductor;
    Location    U1;
    Assembly    Test;
    Device      P22V10;

    /*
    This program implements the control signals for the tail lights of a
    Thunderbird. The lights have three segments for both the left and right
    tail light. The control signals into the device include a Left and Right
    signal, a Flash signal (Hazard), a brake signal, and an ignition signal (IGN).
    The outputs of the device are the six separate tail light segments.
    A Truth Table is used to specify the control logic.
    */

    /* INPUTS */
    PIN 1        = CLK;       /* Clock for Device */
    PIN 4        = LT;        /* Left turn signal */
    PIN 5        = RT;        /* Right turn signal */
    PIN 6        = BRAKE;     /* Brake signal */
    PIN 7        = FLASH;     /* Hazard flash signal */
    PIN 8        = IGN;       /* Ignition input */

    PIN 21       = RI;        /* Right inside tail light */
    PIN 22       = RM;        /* Right middle */
    PIN 23       = RO;        /* Right outside */
    PIN 16       = LI;        /* Left inside */
    PIN 15       = LM;        /* Left middle */
    PIN 14       = LO;        /* Left outside */
    PIN [17..20] = [Q0..3];   /* State variable holders */

    FIELD INPUTS  = [IGN,FLASH,LT,RT,BRAKE,LO,LM,LI,RI,RM,RO];
    FIELD OUTPUTS = [LO.D,LM.D,LI.D,RI.D,RM.D,RO.D];

    TABLE INPUTS => OUTPUTS {

    /* Quiescent state */
    'B'11000XXXXXX => 'B'0;
    'B'01XX0XXXXXX => 'B'0;

    /* Flash */
    'B'X0XXX111111 => 'B'0;
    'B'X0XXX000000 => 'B'111111;

    /* Brake */
    'B'X1001XXXXXX => 'B'111111;

    /* Left turn */
    'B'11100000XXX => 'B'001000;
    'B'11100001XXX => 'B'011000;
    'B'11100011XXX => 'B'111000;
    'B'11100111XXX => 'B'0;

    /* Right turn */
    'B'11010XXX000 => 'B'000100;
    'B'11010XXX100 => 'B'000110;
    'B'11010XXX110 => 'B'000111;
    'B'11010XXX111 => 'B'0;

    /* Left turn and brake */
    'B'11101000XXX => 'B'001111;
    'B'11101001XXX => 'B'011111;
    'B'11101011XXX => 'B'111111;
    'B'11101111XXX => 'B'000111;

    /* Right turn and brake */
    'B'11011XXX000 => 'B'111100;
    'B'11011XXX100 => 'B'111110;
    'B'11011XXX110 => 'B'111111;
    'B'11011XXX111 => 'B'111000;

    /* Both turn - light flash in reverse sequence */
    'B'11110000000 => 'B'111111;
    'B'11110111111 => 'B'011110;
    'B'11110011110 => 'B'001100;
    'B'11110001100 => 'B'0;

    /* Illegal condition - All on */
    'B'11111000000 => 'B'100001;
    'B'11111100001 => 'B'010010;
    'B'11111010010 => 'B'001100;
    'B'11111001100 => 'B'0;
    }
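The way a truth-table row with Don't Care positions selects the next output value can be sketched in Python. The sketch uses only the quiescent and left-turn rows of the T-Bird table and assumes, as the .D outputs in the OUTPUTS field suggest, that the six outputs are registered and fed back as the last six input bits. The names `ROWS`, `matches`, `lookup`, and `clock` are hypothetical.

```python
# (input pattern: IGN FLASH LT RT BRAKE LO LM LI RI RM RO, next LO LM LI RI RM RO)
# 'B'0 outputs are written out as six zero bits.
ROWS = [
    ("11000XXXXXX", "000000"),  # quiescent state
    ("11100000XXX", "001000"),  # left turn, inside segment on
    ("11100001XXX", "011000"),  # left turn, inside and middle on
    ("11100011XXX", "111000"),  # left turn, all three on
    ("11100111XXX", "000000"),  # left turn, back to off
]


def matches(pattern, bits):
    """True when every pattern position is X or equals the input bit."""
    return len(pattern) == len(bits) and all(
        p in ("X", b) for p, b in zip(pattern, bits)
    )


def lookup(bits):
    """Return the outputs of the first matching row, or None."""
    for pattern, outputs in ROWS:
        if matches(pattern, bits):
            return outputs
    return None


def clock(controls, outputs):
    """One clock edge: current outputs are fed back as the last six inputs."""
    return lookup(controls + outputs)


state = "000000"
sequence = []
for _ in range(4):
    state = clock("11100", state)  # IGN=1 FLASH=1 LT=1 RT=0 BRAKE=0
    sequence.append(state)
print(sequence)
```

With the left-turn controls held, the sketch steps through the same inside, middle, outside, off progression that the left-turn rows describe.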
Appendix B. T-Bird Simulator Input

    Name        TBIRD_TT.PLD;
    Partno      PALC22V10;
    Revision    01;
    Date        04-08-90;
    Designer    Joe Designer;
    Company     Cypress Semiconductor;
    Location    U1;
    Assembly    Test;
    Device      P22V10;

    ORDER: "INPUTS", CLK, IGN, FLASH, LT, RT, BRAKE,
           "OUTPUTS", LO, LM, LI, RI, RM, RO;

    VECTORS:
    $MSG " QUIESCENT STATE - 1";
    C11000 LLLLLL
    $MSG " QUIESCENT STATE - 1";
    C01XX0 LLLLLL
    $MSG "";
    $MSG " FLASH HIGH";
    C00XXX HHHHHH
    $MSG " FLASH LOW";
    C00XXX LLLLLL
    $MSG " FLASH HIGH";
    C00XXX HHHHHH
    $MSG "";
    $MSG " BRAKE";
    CX1001 HHHHHH
    $MSG "";
    $MSG " LEFT TURN OFF";
    C11100 LLLLLL
    $MSG " LEFT TURN 1";
    C11100 LLHLLL
    $MSG " LEFT TURN 2";
    C11100 LHHLLL
    $MSG " LEFT TURN 3";
    C11100 HHHLLL
    $MSG " LEFT TURN OFF";
    C11100 LLLLLL
    $MSG "";
    $MSG " RIGHT TURN 1";
    C11010 LLLHLL
    $MSG " RIGHT TURN 2";
    C11010 LLLHHL
    $MSG " RIGHT TURN 3";
    C11010 LLLHHH
    $MSG " RIGHT TURN OFF";
    C11010 LLLLLL
    $MSG "";
    $MSG " BRAKE AND LEFT TURN 1";
    C11101 LLHHHH
    $MSG " BRAKE AND LEFT TURN 2";
    C11101 LHHHHH
    $MSG " BRAKE AND LEFT TURN 3";
    C11101 HHHHHH
    $MSG " BRAKE AND LEFT TURN OFF";
    C11101 LLLHHH
    $MSG "";
    $MSG " BRAKE AND RIGHT TURN OFF";
    C11011 HHHLLL
    $MSG " BRAKE AND RIGHT TURN 1";
    C11011 HHHHLL
    $MSG " BRAKE AND RIGHT TURN 2";
    C11011 HHHHHL
    $MSG " BRAKE AND RIGHT TURN 3";
    C11011 HHHHHH

Appendix C. T-Bird Simulator Output

    CSIM: CUPL Simulation Program
    Version 3.2a Serial# MD-32A-6295
    Copyright (C) 1983,1989 Logical Devices, Inc.
    CREATED Mon Apr 09 09:32:04 1990

    LISTING FOR SIMULATION FILE: tbird_tt.si

     1: Name        TBIRD_TT.PLD;
     2: Partno      PALC22V10;
     3: Revision    01;
     4: Date        04-08-90;
     5: Designer    Joe Designer;
     6: Company     Cypress Semiconductor;
     7: Location    U1;
     8: Assembly    Test;
     9: Device      P22V10;
    10:
    11:
    12: ORDER: "INPUTS", CLK, IGN, FLASH, LT, RT, BRAKE,
    13:        "OUTPUTS", LO, LM, LI, RI, RM, RO;
    14:

    Simulation Results

    QUIESCENT STATE - 1
    0001: INPUTS- C11000 OUTPUTS- LLLLLL
    QUIESCENT STATE - 1
    0002: INPUTS- C01XX0 OUTPUTS- LLLLLL

    FLASH HIGH
    0003: INPUTS- C00XXX OUTPUTS- HHHHHH
    FLASH LOW
    0004: INPUTS- C00XXX OUTPUTS- LLLLLL
    FLASH HIGH
    0005: INPUTS- C00XXX OUTPUTS- HHHHHH

    BRAKE
    0006: INPUTS- CX1001 OUTPUTS- HHHHHH

    LEFT TURN OFF
    0007: INPUTS- C11100 OUTPUTS- LLLLLL
    LEFT TURN 1
    0008: INPUTS- C11100 OUTPUTS- LLHLLL
    LEFT TURN 2
    0009: INPUTS- C11100 OUTPUTS- LHHLLL
    LEFT TURN 3
    0010: INPUTS- C11100 OUTPUTS- HHHLLL
    LEFT TURN OFF
    0011: INPUTS- C11100 OUTPUTS- LLLLLL

    RIGHT TURN 1
    0012: INPUTS- C11010 OUTPUTS- LLLHLL
    RIGHT TURN 2
    0013: INPUTS- C11010 OUTPUTS- LLLHHL
    RIGHT TURN 3
    0014: INPUTS- C11010 OUTPUTS- LLLHHH
    RIGHT TURN OFF
    0015: INPUTS- C11010 OUTPUTS- LLLLLL

    BRAKE AND LEFT TURN 1
    0016: INPUTS- C11101 OUTPUTS- LLHHHH
    BRAKE AND LEFT TURN 2
    0017: INPUTS- C11101 OUTPUTS- LHHHHH
    BRAKE AND LEFT TURN 3
    0018: INPUTS- C11101 OUTPUTS- HHHHHH
    BRAKE AND LEFT TURN OFF
    0019: INPUTS- C11101 OUTPUTS- LLLHHH

    BRAKE AND RIGHT TURN OFF
    0020: INPUTS- C11011 OUTPUTS- HHHLLL
    BRAKE AND RIGHT TURN 1
    0021: INPUTS- C11011 OUTPUTS- HHHHLL
    BRAKE AND RIGHT TURN 2
    0022: INPUTS- C11011 OUTPUTS- HHHHHL
    BRAKE AND RIGHT TURN 3
    0023: INPUTS- C11011 OUTPUTS- HHHHHH

Appendix D.
T-Bird State-Machine CUPL Code for CY7C330

    Name        TBIRD_SM.PLD;
    Partno      CY7C330;
    Revision    01;
    Date        04-07-90;
    Designer    Joe Designer;
    Company     Cypress Semiconductor;
    Location    U1;
    Assembly    Test;
    Device      P7C330;

    /*
    This program implements the control signals for the tail lights of a
    Thunderbird. The lights have three segments for both the left and right
    tail light. The control signals into the device include a Left and Right
    signal, a Flash signal (Hazard), a brake signal, and an ignition signal (IGN).
    The outputs of the device are the six separate tail light segments.
    A State Machine is used to specify the control logic.
    */

    PIN 1  = CLK;                /* Clock for Device */
    PIN 2  = INCLK;              /* Clock for Inputs */
    PIN 4  = LT;                 /* Left turn signal */
    PIN 5  = RT;                 /* Right turn signal */
    PIN 6  = BRAKE;              /* Brake signal */
    PIN 7  = FLASH;              /* Hazard flash signal */
    PIN 9  = IGN;                /* Ignition input */

    PIN 28 = RI;                 /* Right inside tail light */
    PIN 27 = RM;                 /* Right middle */
    PIN 26 = RO;                 /* Right outside */
    PIN 25 = LI;                 /* Left inside */
    PIN 24 = LM;                 /* Left middle */
    PIN 23 = LO;                 /* Left outside */
    PINNODE [29..32] = [Q0..3];  /* State variable holders */

    FIELD OUTPUTS = [LO,LM,LI,RI,RM,RO];
    OUTPUTS.OE    = 'B'1;
    OUTPUTS.SR    = 'B'0;
    OUTPUTS.SP    = 'B'0;

    /* Using the $DEFINE statement to assign variable name to state values */
    $DEFINE S0  'B'0000
    $DEFINE S1  'B'0001
    $DEFINE S2  'B'0010
    $DEFINE S3  'B'0011
    $DEFINE S4  'B'0100
    $DEFINE S5  'B'0101
    $DEFINE S6  'B'0110
    $DEFINE S7  'B'0111
    $DEFINE S8  'B'1000
    $DEFINE S9  'B'1001
    $DEFINE S10 'B'1010
    $DEFINE S11 'B'1011
    $DEFINE S12 'B'1100
    $DEFINE S13 'B'1101
    $DEFINE S14 'B'1110
    $DEFINE S15 'B'1111

    /* The state machine construct where Q0..3 are the state variables */
    SEQUENCE [Q0..3] {

    /* Initial state all lights off */
    PRESENT S0
        OUT !LO.D OUT !LM.D OUT !LI.D OUT !RI.D OUT !RM.D OUT !RO.D;
        IF (FLASH) NEXT S15;
        IF (BRAKE & !(LT # RT)) NEXT S15;
        IF (IGN & LT & !BRAKE) NEXT S1;
        IF (IGN & RT & !BRAKE) NEXT S4;
        IF (IGN & LT & BRAKE) NEXT S7;
        IF (IGN & RT & BRAKE) NEXT S11;
        DEFAULT NEXT S0;

    /* Left turn */
    PRESENT S1
        OUT !LO.D OUT !LM.D OUT LI.D OUT !RI.D OUT !RM.D OUT !RO.D;
        IF (IGN & LT) NEXT S2;
        DEFAULT NEXT S0;
    PRESENT S2
        OUT !LO.D OUT LM.D OUT LI.D OUT !RI.D OUT !RM.D OUT !RO.D;
        IF (IGN & LT) NEXT S3;
        DEFAULT NEXT S0;
    PRESENT S3
        OUT LO.D OUT LM.D OUT LI.D OUT !RI.D OUT !RM.D OUT !RO.D;
        NEXT S0;

    /* Right Turn */
    PRESENT S4
        OUT !LO.D OUT !LM.D OUT !LI.D OUT RI.D OUT !RM.D OUT !RO.D;
        IF (IGN & RT) NEXT S5;
        DEFAULT NEXT S0;
    PRESENT S5
        OUT !LO.D OUT !LM.D OUT !LI.D OUT RI.D OUT RM.D OUT !RO.D;
        IF (IGN & RT) NEXT S6;
        DEFAULT NEXT S0;
    PRESENT S6
        OUT !LO.D OUT !LM.D OUT !LI.D OUT RI.D OUT RM.D OUT RO.D;
        NEXT S0;

    /* Brake and Left Turn */
    PRESENT S7
        OUT !LO.D OUT !LM.D OUT LI.D OUT RI.D OUT RM.D OUT RO.D;
        IF (IGN & LT) NEXT S8;
        DEFAULT NEXT S0;
    PRESENT S8
        OUT !LO.D OUT LM.D OUT LI.D OUT RI.D OUT RM.D OUT RO.D;
        IF (IGN & LT) NEXT S9;
        DEFAULT NEXT S0;
    PRESENT S9
        OUT LO.D OUT LM.D OUT LI.D OUT RI.D OUT RM.D OUT RO.D;
        IF (IGN & LT) NEXT S10;
        DEFAULT NEXT S0;
    PRESENT S10
        OUT !LO.D OUT !LM.D OUT !LI.D OUT RI.D OUT RM.D OUT RO.D;
        IF (IGN & LT) NEXT S7;
        DEFAULT NEXT S0;

    /* Brake and Right Turn */
    PRESENT S11
        OUT LO.D OUT LM.D OUT LI.D OUT RI.D OUT !RM.D OUT !RO.D;
        IF (IGN & RT) NEXT S12;
        DEFAULT NEXT S0;
    PRESENT S12
        OUT LO.D OUT LM.D OUT LI.D OUT RI.D OUT RM.D OUT !RO.D;
        IF (IGN & RT) NEXT S13;
        DEFAULT NEXT S0;
    PRESENT S13
        OUT LO.D OUT LM.D OUT LI.D OUT RI.D OUT RM.D OUT RO.D;
        IF (IGN & RT) NEXT S14;
        DEFAULT NEXT S0;
    PRESENT S14
        OUT LO.D OUT LM.D OUT LI.D OUT !RI.D OUT !RM.D OUT !RO.D;
        IF (IGN & RT) NEXT S11;
        DEFAULT NEXT S11;

    /* Brake and/or flash tail lights on */
    PRESENT S15
        OUT LO.D OUT LM.D OUT LI.D OUT RI.D OUT RM.D OUT RO.D;
        IF (BRAKE & !(RT # LT)) NEXT S15;
        DEFAULT NEXT S0;
    }
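The PRESENT / IF / DEFAULT structure of the listing can be modelled in a few lines of Python. The sketch covers only states S0 to S3 and S15 (the branches from S0 to the right-turn and brake states are left out), and it assumes that at most one IF condition is true at a time, taking the first match; CUPL itself may combine overlapping conditions differently. The names `MACHINE`, `inputs`, `next_state`, and `run` are hypothetical.

```python
def inputs(**active):
    """Build an input dictionary with every control signal Low by default."""
    signals = dict(IGN=False, FLASH=False, LT=False, RT=False, BRAKE=False)
    signals.update(active)
    return signals


# state -> (outputs LO LM LI RI RM RO, [(condition, next state)], default next state)
MACHINE = {
    "S0": ("000000", [
        (lambda i: i["FLASH"], "S15"),
        (lambda i: i["BRAKE"] and not (i["LT"] or i["RT"]), "S15"),
        (lambda i: i["IGN"] and i["LT"] and not i["BRAKE"], "S1"),
    ], "S0"),
    "S1": ("001000", [(lambda i: i["IGN"] and i["LT"], "S2")], "S0"),
    "S2": ("011000", [(lambda i: i["IGN"] and i["LT"], "S3")], "S0"),
    "S3": ("111000", [], "S0"),  # unconditional NEXT S0
    "S15": ("111111", [
        (lambda i: i["BRAKE"] and not (i["RT"] or i["LT"]), "S15"),
    ], "S0"),
}


def next_state(state, i):
    """Return the first IF target whose condition holds, else the default."""
    _, branches, default = MACHINE[state]
    for condition, target in branches:
        if condition(i):
            return target
    return default


def run(state, i, clocks):
    """Clock the machine and record (state, outputs) after each edge."""
    trace = []
    for _ in range(clocks):
        state = next_state(state, i)
        trace.append((state, MACHINE[state][0]))
    return trace


print(run("S0", inputs(IGN=True, LT=True), 4))
print(run("S0", inputs(FLASH=True), 3))
```

Holding FLASH alternates between S0 and S15, which is how the listing produces the hazard flash with a single all-on state.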
(RT # LT)) NEXT S15; DEFAULT NEXT SO; 6-111 S}:CY>= -==-- Using CUPL With Cypress PLDs SEMICOIDUCTOR Appendix E. UplDown Counter Part Utilization CY7C330 Resources Planning Sheet Project : Up/Down Counter with Limits Input Input Register Register Pin Function Clock 1 State Clk 2 Clk 1 Clk 2 3 4 LLO 1 LLI 1 5 LL2 1 6 1 7 LL3 VSS 8 LL4 1 9 LLS 1 10 LL6 1 11 LL7 1 12 PRELOAD LOW 13 1 14 COUNTER OE 15 ULI 2 16 Reset 1 17 UL3 2 18 UL6 2 19 UL4 2 20 VSS 21 VCC 22 23 2 24 ULS 25 UL7 2 26 UL2 2 27 PRELOAD HIGH 2 28 ULO 2 None HI H2 None None H3 H4 None Register Function CNTl CND CNT4 CNT6 CND CNT5 CNT2 CNTO Up Equals UH Prel'Done Down Equals Up Count Notes :Input Register Clock #1 is pin 2 #2 is pin 3 See the Application Note for the meaning of the pin names. Output Enable = 14 means the asynchronous pin 14 direct enable. Z means the pin is never active 6-112 Output Enable Pin Z Pin Z Pin Pin 14 14 14 14 Pin 14 Pin 14 Z Pin 14 Z Pin 14 None None None None # of PTerms 9 19 11 17 13 15 15 13 17 11 19 9 19 11 17 13 ~ ~~~OID~~~~~~~~~~~~~~~~U~si~n~g~C~U~P~L~VVJ~lt~h~C~yp~r~e~s~s~P~L~D~s Appendix F. UplDown Counter CUPL Code for the CY7C330 Name Partno Revision Date Designer Company Location Assembly Device COUNTER.PLD; PALC22V10; 01; 02-25-90; Joe Designer; Cypress Semiconductor; U1; COUNTER; P7C330; 1* This design is an up/down counter with prelaodable limits. The Lower limits are loaded into the dedicated input registers on the rising edge of LLC and the upper limits are loaded into the input registers found in the 1/0 macrocells on the rising edge of ULC. The counter begins counting, when pre loading is done upwards until the upper limit is reached, and then, begins counting downward. This design, because the equations are already minimized and in sum of products form, should be compiled with the -MO flag (no minimization). *1 PIN PIN PIN 1 2 3 PIN PIN PIN [4 .. 7] [9 .. 12]= 13 CLK; LLC; ULC; [LLO .. 3]; [LL4 .. 
7]; LPL; 1* Clock used for counting *1 1* Clock for pre loading lower limit *1 1* Clock for pre loading upper limit *1 1* Lower limit hold registers */ 1* Lower limit preload indications *1 1* Counter output registers. Pin assignments are based on the number of product terms are available on that pin. *1 1* 1* 1* 1* 1* 1* *1 *1 *1 *1 *1 *1 PIN PIN PIN PIN PIN PIN PIN PIN PIN PIN PIN 28 15 26 17 19 24 20 23 18 25 27 CNTO; CNT1; CNT2; CNT3; CNT4; CNT5; CNT6; CNT7; UL6; UL7; UPL; PINNODE PINNODE PINNODE PINNODE 29 30 31 32 UEQUAL; PLDONE; LEQUAL; UP; 1* Upper limit has been reached *1 1* Preloading has finished *1 1* Lower limit has been reached *1 1* Count direction *1 PIN PIN 16 14 !RESET; !CNTOE; 1* Reset signal clears all registers *1 1* 1/0 pin OE used for loading upper limit *1 Also Also Also Also Also Also used used used used used used for for for for for for Upper Upper Upper Upper Upper Upper limit limit limit limit limit limit loading loading loading loading loading loading 1* Used for Upper limit loading *1 1* Used for Upper limit loading *1 6-113 Appendix F. UplDown Counter Code for CUPL (cont) PINNODE PINNODE PINNODE PINNODE PINNODE PINNODE 45 46 47 48 49 50 ULO; UL2; UL5; UL4; UL3; UL1; ULO.IMUX UL2.IMUX UL5.IMUX UL4.IMUX UL3.IMUX UL1.lMUX 1* 1* 1* 1* 1* 1* Shared Shared Shared Shared Shared Shared 1* 1* 1* These definitions are used to *1 indicate which pin will be fed *1 through the share feedback Mux.*1 CNTO.IOD; CNT2.IOD; CNT5.IOD; CNT4.IOD; CNT3.IOD; CNTl. lOD; ULO UL2 UL5 UL4 UL3 UL1 1* 1* CNTO.lOD; CNT2.IOD; CNT5.lOD; CNT4.IOD; CNT3.lOD; CNTl.lOD; UPL.CKMUX LPL.CKMUX RESET.CKMUX [CNTO .. 5] .CKMUX [UL6 .. 7] .CKMUX [LLO .. 7] .CKMUX [CNTO .. 7] . SR [CNTO .. 
7] .OEMUX ULC; LLC; LLC; ULC; ULC; LLC; RESET.DQ; CNTOE; input input input input input input MUX MUX MUX MUX MUX MUX definition definition definition definition definition definition *1 *1 *1 *1 *1 *1 These definitions are used to Simulate the design properly *1 *1 1* 1* 1* Pin 3 will be used for upper preload */ Pin 3 will be used for upper preload *1 Pin 2 will be used for lower preload */ 1* Count register will be reset by pin 16 1* OE will be controlled by pin 14 */ *1 1* Count equations. Note how the use of the XOR terms significantly reduces the number of product terms that are needed. This allows this complex design to fit fit into the device. *1 CNTO.D $ ** ** = CNTl.D $ ** ** * CNTO PLDONE !LLO.DQ & LPL.DQ & CNTO !CNTO.IOD& ULO & UPL.DQ LLO.DQ & LPL.DQ & !CNTO CNTO.lOD& !ULO & UPL.DQ ; CNTl !LLl.DQ & LPL.DQ & !PLDONE & CNT1 LLl.DQ & LPL.DQ & !PLDONE & !CNT1 UPL.DQ & !PLDONE & lULl & CNT1 UPL.DQ & !PLDONE & ULl & !CNTl CNTO.IOD& PLDONE & !UP !CNTO.IOD& PLDONE & UP 6-114 Appendix F. UplDown Counter Code for CUPL (cont) = CNT2.D $ ** ** * CNT3.D = $ * ** ** CNT4.D = $ *** ** CNTS.D = $ * ** * * CNT6.D = $ ** ** * CNT7.D = $ ** * * !CNT1; * CNT2 ! LL2 . DQ & LPL. DQ & CNT2 & ! PLDONE LL2.DQ & LPL .DQ & ! CNT2 & ! PLDONE UPL.DQ & CNT2 & !UL2 & !PLDONE UPL.DQ & !CNT2 & UL2 & !PLDONE CNTO.IOD& PLDONE & !UP & CNTl !CNTO.IOD& PLDONE & UP & !CNT1; CNT3 !LL3.DQ & LPL.DQ & !PLDONE & CNT3 LL3.DQ & LPL.DQ & !PLDONE & !CNT3 UPL.DQ & !PLDONE & !UL3 & CNT3 UPL.DQ & !PLDONE & UL3 & !CNT3 CNTO.IOD& CNT2 & PLDONE & !UP & CNTl !CNTO.IOD& !CNT2 & PLDONE & UP & !CNT1; CNT4 !LL4.DQ & LPL.DQ & !PLDONE & CNT4 LL4.DQ & LPL.DQ & !PLDONE & !CNT4 UPL.DQ & !PLDONE & !UL4 & CNT4 UPL.DQ & !PLDONE & UL4 & !CNT4 CNTO.IOD& CNT2 & PLDONE & !UP & CNT3 & CNTl !CNTO.IOD& !CNT2 & PLDONE & UP & !CNT3 & !CNT1; CNTS !LLS.DQ & LPL.DQ & CNTS & !PLDONE LLS . DQ & LPL . DQ & ! CNT S & ! 
PLDONE UPL.DQ & CNTS & !ULS & !PLDONE UPL.DQ & !CNTS & ULS & !PLDONE CNTO.IOD& CNT2 & PLDONE & CNT4 & !UP & CNT3 & CNTl !CNTO.IOD& !CNT2 & PLDONE & !CNT4 & UP & !CNT3 & !CNT1; CNT6 ! LL6 .DQ & LPL .DQ & ! PLDONE & CNT6 LL6.DQ & LPL.DQ & !PLDONE & !CNT6 UPL.DQ & !PLDONE & CNT6 & !UL6.DQ UPL.DQ & !PLDONE & !CNT6 & UL6.DQ CNTO.IOD& CNT2 & CNTS & PLDONE & CNT4 & !UP & CNT3 & CNTl !CNTO.IOD& !CNT2 & !CNTS & PLDONE & !CNT4 & UP & !CNT3 & !CNT1; CNT7 !LL7.DQ & LPL.DQ & CNT7 & !PLDONE LL7.DQ & LPL.DQ & !CNT7 & !PLDONE UPL.DQ & !UL7.DQ & CNT7 & !PLDONE UPL.DQ & UL7.DQ & !CNT7 & !PLDONE CNTO. IOD& CNT2 & CNTS & PLDONE & CNT6 & CNT4 & !UP & CNT3 & CNTl !CNTO.IOD& !CNT2 & !CNTS & PLDONE & !CNT6 & !CNT4 & UP & !CNT3 & 6-115 Appendix F. UplDown Counter Code for CUPL (cont) /* Direction of count */ UP.D UP !UEQUAL & !UP & PLDONE !LEQUAL & UP & PLDONE UPL.DQ & !PLDONE & !UP LPL.DQ & !PLDONE & UP; $ 41: 41: 41: /* Has the lower limit been reached */ LEQUAL.D 41: 41: 41: 41: 41: 41: 41: 41: 41: 41: 41: 41: 41: 41: 41: LL6.DQ & !CNT6 !LL7.DQ & CNT7 LL7.DQ & !CNT7 LL3.DQ & !CNT3 !LLS.DQ & CNTS LLS.DQ & !CNTS !LL1.DQ & CNTl LLO.DQ & !CNTO !LL2.DQ & CNT2 !LL4.DQ & CNT4 LL4.DQ & !CNT4 !LLO.DQ & CNTO LL1.DQ & !CNTl !LL6.DQ & CNT6 !LL3.DQ & CNT3 LL2.DQ & !CNT2; 1* Has pre loading finished */ PLDONE.D !LPL.DQ = & !UPL.DQ /* Has the upper limit been reached */ UEQUAL.D !CNT6 & UL6.DQ !UL7.DQ & CNT7 & !CNT7 41: UL7.DQ & !CNT3 41: UL3 & !ULS 41: CNTS & ULS 41: !CNTS & CNTl 41: lULl 41: !CNTO.rOD & ULO & !UL2 41: CNT2 & CNT4 41: !UL4 & !CNT4 41: UL4 & !ULO 41: CNTO. rOD ULl & !CNTl 41: & !UL6.DQ 41: CNT6 & CNT3 41: !UL3 & UL2; 41: !CNT2 41: 6-116 Appendix G. Decoder CUPL Code Name Partno Revision Date Designer Company Location Assembly; Device DCOLUMNS332.PLD; P7C332; 01; 10-09-90; Joe Designer; Cypress Semiconductor; 332 DCOLUMNSR; P7C332; 1* This design is a simple decoder. Agroup of address lines are decoded to select one of 4 "windows" in memory. 
The inputs have been configured in each of their possible configurations. Although this application would not be used in a real design, this example shows how to configure the input registers in each of their possible modes. *1 PIN PIN PIN PIN PIN PIN PIN PIN PIN 1 2 [3 .. 7] [9 .. 13] 14 [15 .. 20] [23 .. 26] 27 28 [!WINDOWO .. 3] .OEMUX NOTME.OE [AD16 .. 19] .CKMUX [AD20 .. 23] .CKMUX [AD24 .. 27] .LEMUX [AD28 .. 31] .LEMUX 1* CLK; LTCHEN; AD16 .. 20]; [AD21. .25]; !COE; [AD26 .. 31]; ! [WINDOWO .. 3] ; !NOTME; !DCDEN; 1* 1* COE; 'b'l; CLK; !CLK; LTCHEN; !LTCHEN; Window selection Equations Clock pin *1 Latch enable pin */ I*Address lines *1 1* Output enable *1 1* 1* 1* Window selection output No window selected *1 Decode enable *1 1* 1* 1* 1* 1* 1* OE controlled by pin 14 *1 Notme always on bus *1 Clocked on rising edge *1 Clocked on falling edge *1 Latched when high *1 Latched when low */ *1 *1 WINDOWO DCDEN.DQ & AD31.LQ & AD30.LQ & AD29.LQ & AD28.LQ & AD27.LQ & AD26.LQ & AD25.LQ & AD24.LQ & AD23.DQ & AD22.DQ & AD21.DQ & AD20.DQ & !AD19.DQ & !AD18.DQ & !AD17.DQ & !AD16.DQ; WINDOW1 DCDEN.DQ & AD31.LQ & AD30.LQ & AD29.LQ & AD28.LQ & AD27.LQ & AD26.LQ & AD25.LQ & AD24.LQ & AD23.DQ & AD22.DQ & AD21.DQ & AD20.DQ & !AD19.DQ & AD18.DQ & !AD17.DQ & !AD16.DQ; WINDOW2 DCDEN.DQ & AD31.LQ & AD30.LQ & AD29.LQ & AD28.LQ & AD27.LQ & AD26.LQ & AD25.LQ & AD24.LQ & AD23.DQ & AD22.DQ & AD21.DQ & AD20.DQ & AD19.DQ & !AD18.DQ & !AD17.DQ & !AD16.DQ; 6-117 Appendix G. 
Decoder CUPL Code (cont)

WINDOW3 = DCDEN.DQ & AD31.LQ & AD30.LQ & AD29.LQ & AD28.LQ
        & AD27.LQ & AD26.LQ & AD25.LQ & AD24.LQ
        & AD23.DQ & AD22.DQ & AD21.DQ & AD20.DQ
        & AD19.DQ & AD18.DQ & !AD17.DQ & !AD16.DQ;

NOTME   = 'b'1
        $ DCDEN.DQ & AD16.DQ
        # DCDEN.DQ & !AD16.DQ & AD17.DQ
        # DCDEN.DQ & !AD31.LQ
        # DCDEN.DQ & !AD30.LQ
        # DCDEN.DQ & !AD29.LQ
        # DCDEN.DQ & !AD28.LQ
        # DCDEN.DQ & !AD27.LQ
        # DCDEN.DQ & !AD26.LQ
        # DCDEN.DQ & !AD25.LQ
        # DCDEN.DQ & !AD24.LQ
        # DCDEN.DQ & !AD23.DQ
        # DCDEN.DQ & !AD22.DQ
        # DCDEN.DQ & !AD21.DQ
        # DCDEN.DQ & !AD20.DQ;

Using ABEL to Program the Cypress 22V10

Introduction

This application note presents a compilation of examples using the popular PALC22V10 programmable logic device. The examples demonstrate the 22V10's advanced features and some of the high-level logic description techniques of the ABEL programming language.

Each of the first seven examples illustrates a specific 22V10 feature and lists the ABEL programming language statements necessary to implement the feature. The ABEL files also contain test vectors that exercise the feature. The remaining examples describe complete 22V10 designs that combine many of the individual features. All the examples have been tested, and you can obtain the code for them on floppy disk from Cypress Semiconductor.

The design examples provided are:

  Asynchronous reset/synchronous preset from single inputs
  Asynchronous reset/synchronous preset from product terms
  Asynchronous reset/synchronous preset used to load predetermined non-zero values, employing istype statements
  Output-enable control from a single input
  Output-enable control from product terms
  Using 16 product terms: an 8-bit identity comparator
  Using feedback to realize more than 16 product terms in a 9-bit single-output identity comparator
  Bidirectional I/O-bus interface with answer-back
  10-bit address generator/multiplexer
  Three state machines in one 22V10

You can use these examples as a design reference. They are excellent tools for designers new to programmable logic as well as for veteran PLD users. Add the files to your ABEL source-file library, and include any part of the files in your own designs. You can use the files as a template by editing them using any text editor in the non-document mode.

Conversion to the CUPL or PLD ToolKit programming language is easily accomplished due to these languages' syntactical similarity. For conversion to other languages, consult your user's guide.

Notes on the ABEL Programming Language

Before examining the application examples, consider an introduction to the structure and syntax of the ABEL programming language. A rudimentary understanding of the ABEL language is necessary to fully appreciate the example files included here.

An ABEL source file provides the information necessary to describe a PLD design's logical operation. You can see these files' keywords and structure in any of the examples. The ABEL language processor processes source files to generate a JEDEC programming file and design documentation. The language processor also uses test vectors, which you generate as part of the source file, to test the design's function.

ABEL Design Entry Methods

The ABEL programming language offers three methods for defining the logical operation of a given design. These methods are:

  Boolean equation
  Truth table
  State diagram

A source file can include any or all of these design entry methods. The following sections describe the Boolean equation, truth table, and state diagram entry methods as well as the operators and notation conventions used in the source files.

ABEL Operators and Notation Conventions

In addition to the standard AND and OR logical operators, ABEL supports several high-level logic definitions.
ABEL interprets the "+" and "*" signs, which in standard Boolean notation stand for OR and AND operations, respectively, as arithmetic addition and multiplication. This convention greatly simplifies the design of counters and ALU logic.

Table 1 shows the logical operators ABEL supports. The labels A, B, and C in the examples can be either individual pins or a set of pins, as defined in the source file.

Table 1. ABEL Logical Operators

Operator   Definition               Example
!          NOT: ones complement     C = !A;
&          AND                      C = A & B;
#          OR                       C = A # B;
$          XOR: exclusive OR        C = A $ B;
!$         XNOR: exclusive NOR      C = A !$ B;

Note that you can use these operators with operands of more than one bit on a bit-by-bit basis. For example, logically ORing hexadecimal values of 8 and 2 yields hexadecimal value A:

^h08 # ^h02 = ^h0A

Specifying Alternate Number Bases

The ^h symbols in the example above instruct the language processor to interpret the value following the symbol as base 16 (hex). The default number base in ABEL is decimal, but you can change the base for individual expressions with ^b for binary, ^o for octal, ^d for decimal, or ^h for hexadecimal.

You can also use the "@radix" command to change the default number base to binary, octal, decimal, or hexadecimal for all subsequent statements in a source document. All the source files in this application note include the command "@radix 16" to set the number base to hexadecimal.

Arithmetic Operators

ABEL provides arithmetic operators to allow for easy implementation of math and shifting functions. Table 2 lists the arithmetic operators supported by ABEL. Shifting operations are unsigned, and zeros are shifted into the side of the expression opposite the direction of the shift.

Also note that ABEL interprets the symbol "/" as an unsigned division operation. Other programmable logic languages use this symbol to indicate inversion. The symbol "%" gives the remainder of the division operation performed by "/".

Table 2. ABEL Arithmetic Operators

Operator   Definition               Example
-          twos complement          C = -A;
-          subtraction              C = A - B;
+          addition                 C = A + B;
*          multiplication           C = A * B;
/          integer division         C = A / B;
%          remainder                C = A % B;
<<         shift left               C = A << 2;  (shift left 2 bits)
>>         shift right              C = A >> 2;  (shift right 2 bits)

Relational Operators

Relational operators perform various comparisons of elements in an expression and yield a Boolean true or false based on the result of the comparison. These operators greatly simplify the description of magnitude comparisons and reduce an identity comparison to a single statement. All relational operations are unsigned; take care when you represent negative numbers in twos complement. Table 3 lists the relational operators.

Table 3. ABEL Relational Operators

Operator   Definition               Example
==         equal                    C = (A == B);
!=         not equal                C = (A != B);
<          less than                C = (A < B);
<=         less than or equal       C = (A <= B);
>          greater than             C = (A > B);
>=         greater than or equal    C = (A >= B);

Relational operators are frequently used where ranges of values cause a given output. For example, if you want to decode an active-low chip-select line (CS1) for any address from ^h2000 to ^h2FFF, you can write the logic for this output in a single line:

!CS1 = (ADD >= ^h2000) & (ADD <= ^h2FFF);

Assignment Operators

Note that all example operations shown so far are for purely combinatorial outputs. The structure for combinatorial equations is:

OUTPUT(s) = Expression(s) and/or Condition(s);

The assignment operator is the "=" sign, meaning that OUTPUT(s) combinatorially follow the evaluation of the expressions and conditions. If an output or set of outputs is registered (changing synchronously with the clock's rising edge), use the assignment operator ":=". The structure of a registered equation, shown below, is essentially the same as a combinatorial equation but with this assignment operator:

OUTPUT(s) := Expression(s) and/or Condition(s);

Operator Priority

Operators in an expression are evaluated using a priority hierarchy. If two or more operators with equal priority appear in a single expression, they are evaluated in the order listed, from left to right within the expression. Table 4 lists the priority of all operators.

You can use parentheses as in normal mathematics to alter the order of evaluation. ABEL performs the operation in the innermost parentheses first.

Table 4. ABEL Operator Priority

Priority         Operator                 Definition
Highest          -                        Twos complement, not subtraction
                 !                        NOT
Second highest   &                        AND
                 <<                       Shift left
                 >>                       Shift right
                 *                        Multiply
                 /                        Unsigned division
                 %                        Remainder from division
Third highest    +                        Add
                 -                        Subtract
                 #                        OR
                 $                        XOR
                 !$                       XNOR
Lowest           ==, !=, <, >, <=, >=     All relational operators

Special Constants

ABEL supports several special constants that ease the writing of equations and test vectors. Table 5 lists these special constants and their functions.

Table 5. ABEL Special Constants

Special Constant   Definition
.C.                Clock: causes a low-high-low transition at a selected input for testing.
.F.                Floating input or output
.K.                Same as .C., but high-low-high
.P.                Register preload
.X.                Don't care condition
.Z.                Tests input or output for high impedance

To use several of these constants in an abbreviated form and enable the symbols H and L to represent binary Ones and Zeros, place the following statement in the labels section of the source document, as in the examples in this application note:

H,L,X,C,Z = 1,0,.X.,.C.,.Z.;

Logic Reduction Levels

At the beginning of every source file in this brief appears the statement

flag '-r4'

This statement signals the language processor to use logic reduction level 4. In cases where you need propagation delays of a specific length, use the statement

flag '-r0'

which tells the language processor to use no reduction. ABEL provides four logic reduction levels, as listed in Table 6.
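The bit-by-bit OR example and the chip-select range decode described under the operator sections above can be checked against a small behavioral model. The sketch below is written in Python rather than ABEL and is only an illustration; the function name cs1_n and the 16-bit width assumed for ADD are not taken from the application note.

```python
def cs1_n(add: int) -> int:
    """Behavioral model of  !CS1 = (ADD >= ^h2000) & (ADD <= ^h2FFF);

    Returns the level on the active-low CS1 pin: 0 when the address is
    inside the decoded range, 1 otherwise. ABEL relational operators are
    unsigned, so the address is masked to an assumed 16-bit width first.
    """
    add &= 0xFFFF                       # assumed bus width (hypothetical)
    selected = (add >= 0x2000) and (add <= 0x2FFF)
    return 0 if selected else 1


# Bit-by-bit OR on multi-bit operands:  ^h08 # ^h02 = ^h0A
HEX_OR_EXAMPLE = 0x08 | 0x02

if __name__ == "__main__":
    for address in (0x1FFF, 0x2000, 0x2FFF, 0x3000):
        print(hex(address), cs1_n(address))
```

The model treats the whole address as one unsigned value, which mirrors how a single relational expression on a set of pins replaces a hand-written product-term decode.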
ABEL Design Entry: Boolean Equations

Boolean equations are the most common method of design entry. To use them, you give a name to each pin required for the application. If a design requires the special functions available in many devices (for example, reset and preset), you also identify and name the nodes that control these functions. (The 22V10 has two such nodes: asynchronous reset at node 25 and synchronous preset at node 26.) Groups of pins and/or frequently used constants can also be given labels to facilitate writing equations.

Following the keyword EQUATIONS in the source file, you describe the required logic with Boolean equations that use the pin, node, and/or label names. If an output has an output-enable term associated with it, you can write an equation for that term by using the pin name with the extension ".OE" followed by the equation for the term. An example of this is:

OUT1.OE = !RD & (INPUTS == 0);

This statement enables OUT1 if pin RD is Low and the group of pins (can be any number of pins) labeled INPUTS are all Low. If these conditions are not met, the output remains three-stated.

The 22V10 has a separate combinatorial output-enable product term for each I/O pin. The output enable is therefore easily controlled either by a single selectable pin or by a product term. To make an output enable synchronous or to expand the number of product terms available, you can dedicate an I/O macrocell to realize the appropriate logic; the macrocell's output feeds back to control the output-enable product term. This method causes additional propagation delay, however, due to the extra pass through the AND/OR array.

The use of the enable equations is purely optional; in the absence of these equations, the ABEL language processor automatically enables any I/O pin defined in the Boolean equations as an output and disables any I/O specified as an input. The outputs appear on the left side of the equations.
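The effect of the output-enable equation above can be illustrated with a short behavioral sketch. It is written in Python, not ABEL, and the function name out1_pin and the use of the string "Z" for a three-stated pin are assumptions made for this illustration.

```python
def out1_pin(q: int, rd: int, inputs: int) -> str:
    """Level seen at pin OUT1 for  OUT1.OE = !RD & (INPUTS == 0);

    q      -- value the macrocell would drive (0 or 1)
    rd     -- level on pin RD
    inputs -- the INPUTS set read as one unsigned number
    Returns "0" or "1" when the pin is driven, "Z" when it is three-stated.
    """
    enabled = (rd == 0) and (inputs == 0)
    return str(q) if enabled else "Z"


if __name__ == "__main__":
    print(out1_pin(1, rd=0, inputs=0))   # enable term true, pin driven
    print(out1_pin(1, rd=1, inputs=0))   # RD high, pin three-stated
```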
This application note outlines the operators and syntax of all Boolean equations. You can find additional information in the ABEL Language Reference and User's Guide supplied with the ABEL software.

Table 6. ABEL Logic Reduction Levels

Level   Statement     Description
0       flag '-r0'    No reduction. All equations must be in sum-of-products form.
1       flag '-r1'    Equations are expanded to sum-of-products form and reduced with standard Boolean algebra. This is the default.
2       flag '-r2'    Includes level 1 reduction plus the PRESTO algorithm. This process is iterative, so processing time is increased significantly.
3       flag '-r3'    The PRESTO algorithm is performed on a pin-by-pin basis. This is faster than standard PRESTO reduction.
4       flag '-r4'    This reduction level uses the ESPRESSO reduction algorithm.

ABEL Design Entry: Truth Tables

A truth table is a list of input combinations and the resulting outputs. Normally, the inputs are listed in ascending binary order from the minimum value to the maximum value. This format takes all possible input situations into account and prevents any undefined input combinations from producing undesirable outputs.

The keyword TRUTH_TABLE marks the beginning of the table within the source file. Immediately following the keyword, you list the input(s) and output(s) labels in parentheses with an arrow (a minus sign and a greater-than sign, "->") between the inputs and outputs. If you specify more than one input or output, you must enclose the set in square brackets "[ ]". Figure 1 shows the statements required to implement a 3-to-8-line decoder. Note the use of the set identifier Q7..Q0. This can be written out as Q7,Q6,Q5,Q4,Q3,Q2,Q1,Q0.

truth_table ([I2,I1,I0] -> [Q7..Q0])
             [0,0,0] -> [0,0,0,0,0,0,0,1];
             [0,0,1] -> [0,0,0,0,0,0,1,0];
             [0,1,0] -> [0,0,0,0,0,1,0,0];
             [0,1,1] -> [0,0,0,0,1,0,0,0];
             [1,0,0] -> [0,0,0,1,0,0,0,0];
             [1,0,1] -> [0,0,1,0,0,0,0,0];
             [1,1,0] -> [0,1,0,0,0,0,0,0];
             [1,1,1] -> [1,0,0,0,0,0,0,0];

Figure 1. Truth Table for 3:8 Line Decoder

The main advantage of the truth table entry method lies in writing test vectors. You can block-copy the entire truth table to the source file's test-vector section.

Any design specified by a truth table can also be entered as Boolean equations. For example, the output Q6 in the above example could be represented by the Boolean equation:

Q6 = I2 & I1 & !I0;

ABEL Design Entry: State Diagrams

One of the most powerful features of the ABEL programming language is its ability to compile state diagrams directly. By allowing direct state-diagram entry, ABEL frees you from the tedious task of generating Boolean equations with the expressions and conditions that cause each possible transition for each individual state register.

You can implement several state machines in a single device, and you might have a set of outputs for each state machine. The state diagram for each set of outputs begins with the keyword STATE_DIAGRAM, followed by the pin names or labels that make up the state outputs. You then list each state, followed by any operations to be performed while in that state and at least one transition statement. A transition statement can be in any of three forms:

  GOTO, for unconditional transitions to the next state
  IF..THEN..ELSE, for two-way branching
  CASE..ENDCASE, for N-way branching

You can chain IF..THEN..ELSE statements to achieve N-way branching, but the CASE..ENDCASE construct accomplishes the same objective with less typing. By using labels for state outputs and condition inputs, you can implement even the most complex designs with ease.

As an example, consider a bidirectional, 3-bit counter with inputs UP and DOWN and outputs Q2, Q1, and Q0. If exactly one of UP and DOWN is High, the counter counts in the direction specified. If both UP and DOWN are High, the counter holds the current count. If both UP and DOWN are Low, the counter resets to zero. In addition, output MAX is High if the counter is in the UP mode and the count equals 7 or if the counter is in the DOWN mode and the count equals zero. Convenient labels for implementing this design appear in Figure 2, and Figure 3 lists the source code for the state diagram.

"labels
OUTS   = [Q2..Q0];
MODE   = [UP,DOWN];
CNTUP  = ^b10;   CNTDWN = ^b01;
RST    = ^b00;   HOLD   = ^b11;
S0 = ^b000;   S1 = ^b001;   S2 = ^b010;   S3 = ^b011;
S4 = ^b100;   S5 = ^b101;   S6 = ^b110;   S7 = ^b111;

Figure 2. State Machine Labels for Counter Example

state_diagram OUTS

state S0 : MAX = (MODE == CNTDWN);
           case (MODE == CNTUP)  : S1;
                (MODE == CNTDWN) : S7;
                (MODE == HOLD)   : S0;
                (MODE == RST)    : S0;
           endcase;

state S1 : MAX = 0;
           case (MODE == CNTUP)  : S2;
                (MODE == CNTDWN) : S0;
                (MODE == HOLD)   : S1;
                (MODE == RST)    : S0;
           endcase;

state S2 : MAX = 0;
           case (MODE == CNTUP)  : S3;
                (MODE == CNTDWN) : S1;
                (MODE == HOLD)   : S2;
                (MODE == RST)    : S0;
           endcase;

state S3 : MAX = 0;
           case (MODE == CNTUP)  : S4;
                (MODE == CNTDWN) : S2;
                (MODE == HOLD)   : S3;
                (MODE == RST)    : S0;
           endcase;

state S4 : MAX = 0;
           case (MODE == CNTUP)  : S5;
                (MODE == CNTDWN) : S3;
                (MODE == HOLD)   : S4;
                (MODE == RST)    : S0;
           endcase;

state S5 : MAX = 0;
           case (MODE == CNTUP)  : S6;
                (MODE == CNTDWN) : S4;
                (MODE == HOLD)   : S5;
                (MODE == RST)    : S0;
           endcase;

state S6 : MAX = 0;
           case (MODE == CNTUP)  : S7;
                (MODE == CNTDWN) : S5;
                (MODE == HOLD)   : S6;
                (MODE == RST)    : S0;
           endcase;

state S7 : MAX = (MODE == CNTUP);       "MAX is High at count 7 in UP mode
           case (MODE == CNTUP)  : S0;
                (MODE == CNTDWN) : S6;
                (MODE == HOLD)   : S7;
                (MODE == RST)    : S0;
           endcase;

Figure 3. ABEL Source Code for Counter Example

You can add another statement, WITH..ENDWITH, to any transition statement to set additional outputs to any given state when the transition preceding the WITH..ENDWITH statement is executed. In the previous state diagram, for example, assume the transition from state S5 to S6 is to set a pin called FLAG. To achieve this result, the S5 diagram is modified as shown in Figure 4.

PALC22V10 Design Examples

The design examples presented here exploit the various features of the 22V10 PLD. The first seven designs focus on specific features and illustrate the techniques for using and testing these features. The last three designs combine several of the features to demonstrate the device's versatility. It is the 22V10's tremendous versatility that has made it the most popular of all Cypress PLDs. Each of the last three designs, if implemented in SSI and MSI TTL, would require from seven to 13 packages.

Asynchronous Reset/Synchronous Preset

As shown in Figure 5, this example defines pins 2 and 3 to be the asynchronous reset and synchronous preset inputs, respectively. Eight inputs defined as INPUT7..INPUT0 are given the label INPUTS. Eight corresponding outputs, OUTPUT7..OUTPUT0, are labeled OUTPUTS. Note how the use of labels enables the logic for all eight outputs to be written in a single equation. The equation:

OUTPUTS := INPUTS;

causes the data at INPUTS to be registered in OUTPUTS on the rising edge of CLK. The assignment operator ":=" indicates that the operation is clocked (registered). The 22V10 clock input is, by definition, pin 1.

The pin assignments section identifies the predefined node numbers for the reset and preset functions. The equations for the nodes, in terms of the selected pins, are then written in the file's equations section.

Asynch. Reset and Synch. Preset from Product Terms

This example (Figure 6) implements an asynchronous reset and synchronous preset, as does the example in Figure 5. In this case, however, product terms activate the reset and preset nodes. Specifically, the reset node is High (active) only when INPUTS equal 55 hex. Similarly, INPUTS equaling AA hex control the preset term.
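The difference between the asynchronous reset term and the synchronous preset term can be seen by replaying a vector sequence through a behavioral model. The sketch below is Python, not ABEL; the class name Reg8 and its apply method are hypothetical, and the model assumes that an active reset overrides the clock. The vector list follows the test vectors of the Rst_Pre2 listing (Figure 6).

```python
class Reg8:
    """Eight registers sharing one asynchronous reset and one synchronous preset."""

    def __init__(self) -> None:
        self.q = 0x00                      # the 22V10 registers reset at power-up

    def apply(self, inputs: int, clock_edge: bool) -> int:
        reset = inputs == 0x55             # reset  = (INPUTS == 55);   asynchronous
        preset = inputs == 0xAA            # preset = (INPUTS == 0AA);  synchronous
        if reset:
            self.q = 0x00                  # acts immediately, no clock needed
        elif clock_edge:
            # OUTPUTS := INPUTS;  unless the preset term is true at the edge
            self.q = 0xFF if preset else (inputs & 0xFF)
        return self.q


# (rising clock edge?, INPUTS, expected OUTPUTS)
FIGURE6_VECTORS = [
    (True,  0x00, 0x00),   # clock in 0
    (False, 0xFF, 0x00),   # no clock: registers hold old data
    (True,  0xFF, 0xFF),   # clock in FF
    (False, 0x55, 0x00),   # reset acts without a clock
    (False, 0xAA, 0x00),   # no change: preset waits for the clock
    (True,  0xAA, 0xFF),   # preset acts on the clock edge
]

if __name__ == "__main__":
    reg = Reg8()
    for edge, inputs, expected in FIGURE6_VECTORS:
        print(edge, hex(inputs), hex(reg.apply(inputs, edge)), hex(expected))
```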
Note how the test vectors distinguish and test the synchronous versus the asynchronous operations.

state S5 : MAX = 0;
           case (MODE == CNTUP)  : S6 with FLAG := 1; endwith
                (MODE == CNTDWN) : S4;
                (MODE == HOLD)   : S5;
                (MODE == RST)    : S0;
           endcase;

Figure 4. WITH..ENDWITH Example

"Cypress Semiconductor Corp.  11/10/1987
module Rst_Pre1                            "Module name
flag '-r3'                                 "Logic Reduction level r3, fast PRESTO
title 'Asynchronous Reset / Synchronous Preset Control From A Single Input'

U1 device 'P22V10';                        "Device designator and type

"Pin assignments
CLK  pin 1;                                "Clock input
RST  pin 2;                                "Defines async reset pin
PRE  pin 3;                                "Defines sync preset pin
INPUT7,INPUT6,INPUT5,INPUT4      pin 4,5,6,7;
INPUT3,INPUT2,INPUT1,INPUT0      pin 8,9,10,11;
OUTPUT7,OUTPUT6,OUTPUT5,OUTPUT4  pin 23,22,21,20;
OUTPUT3,OUTPUT2,OUTPUT1,OUTPUT0  pin 19,18,17,16;
reset,preset  node 25,26;                  "Pre-assigned node #s

"Labels
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTS    = [INPUT7..INPUT0];
OUTPUTS   = [OUTPUT7..OUTPUT0];

@radix 16;                                 "This command forces the default number base to HEX.

equations
reset    = !RST;                           "Async reset when pin RST low
preset   = PRE;                            "Sync preset if pin PRE is high during the rising edge of CLK
OUTPUTS := INPUTS;                         "The := indicates that this is a clocked (synchronous) operation

test_vectors                               "Test reset and preset
([CLK,RST,PRE,INPUTS] -> OUTPUTS)
 [C,H,L,55]  -> 55;                        "Test outputs by clocking in 55
 [L,H,L,0AA] -> 55;                        "Test registers hold old data (55)
 [C,H,L,0AA] -> 0AA;                       "Clock AA (leading zero necessary for hex digits A-F)
 [C,H,L,0FF] -> 0FF;                       "Set all outputs high (FF)
 [L,L,L,0FF] -> 0;                         "RST low asynchronously
 [C,H,H,0]   -> 0FF;                       "PRE high synchronously
end Rst_Pre1

Figure 5. Reset/Preset from Single Pins

"Cypress Semiconductor Corporation, 11/10/1987
module Rst_Pre2                            "Module name
flag '-r3'                                 "Logic Reduction level r3, PRESTO algorithm by pin
title 'Asynchronous Reset / Synchronous Preset Example 2, Reset and Preset generated from Product terms'

"************************************************************
"* This Example will Asynchronously Reset all registers when the inputs equal 55 and
"* Synchronously Set all registers when the inputs equal AA
"************************************************************

U1 device 'P22V10';                        "Device designator and type

"Pin assignments
CLK  pin 1;                                "Clock input
INPUT7,INPUT6,INPUT5,INPUT4      pin 4,5,6,7;
INPUT3,INPUT2,INPUT1,INPUT0      pin 8,9,10,11;
OUTPUT7,OUTPUT6,OUTPUT5,OUTPUT4  pin 23,22,21,20;
OUTPUT3,OUTPUT2,OUTPUT1,OUTPUT0  pin 19,18,17,16;
reset,preset  node 25,26;                  "Pre-assigned node #s

"Labels
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTS    = [INPUT7..INPUT0];
OUTPUTS   = [OUTPUT7..OUTPUT0];

@radix 16;                                 "command forces the default number base to be HEX

equations
reset    = (INPUTS == 55);                 "Async reset when input = 55
preset   = (INPUTS == 0AA);                "Sync preset if inputs = AA during the rising edge of CLK
OUTPUTS := INPUTS;                         "The := indicates that this is a clocked (synchronous) operation

test_vectors                               "Test reset and preset
([CLK,INPUTS] -> OUTPUTS)
 [C,0]   -> 0;                             "Test outputs by clocking in 0
 [L,0FF] -> 0;                             "Test registers hold old data (0)
 [C,0FF] -> 0FF;                           "Clock in FF (note leading zero for hex digits A thru F)
 [L,55]  -> 0;                             "Reset acts asynchronously on inputs = 55
 [L,0AA] -> 0;                             "No change, PRE is synchronous
 [C,0AA] -> 0FF;                           "PRE acts synchronously on inputs = AA
end Rst_Pre2

Figure 6. Reset/Preset From Product Terms

Reset and Preset Load Predetermined Values

The examples in Figures 5 and 6 use the macrocells' positive, registered output for the pins represented by OUTPUTS. Under this arrangement, the asynchronous reset causes all outputs to go Low and the synchronous preset causes them to go High. This example demonstrates how you can use istype statements in the pin assignments section to set any pattern of Ones and Zeros, either asynchronously with reset or synchronously with preset.

To understand this operation, note in Figure 7 that the 22V10 provides four paths from the macrocells to the I/O pins: the Q and Q\ outputs of the macrocell's register and the true and inverted combinatorial terms that bypass the register. All these paths pass through a 4:1 multiplexer, which is controlled by architecture bits C0 and C1.

The istype statements allow you to select which channel of the multiplexer is routed to the I/O pin. Table 8 shows the choices available.

Table 8. Macrocell Configuration Selections

C1   C0   Configuration        istype Values
0    0    Reg, Active Low      'neg,reg'
0    1    Reg, Active High     'pos,reg'
1    0    Comb, Active Low     'neg,com'
1    1    Comb, Active High    'pos,com'

An additional parameter in the istype statement allows you to select feedback paths. The choices are feed_term, feed_reg, and feed_pin. An example showing this parameter is:

OUTPUT6 istype 'pos,com,feed_pin';

Specifying a feedback path for the 22V10 is redundant, however. This is because the 22V10 selects a feedback path using the same architecture bit (C1) that controls the selection of registered or combinatorial outputs. The 22V10 does not offer a feedback path from product terms.

Note from the test vectors in Figure 8 that the use of istype statements does not affect the outputs' polarity as described by the Boolean equations. Conversely, if you define an output as active Low through a Boolean equation, as in:

!OUTPUT6 := INPUT6;

the state of the register is inverted for normal operation and for reset and preset conditions.

A final note on using istype statements in conjunction with the reset node: The 22V10 resets when Vcc is first applied to the chip. Istype statements and active-Low Boolean equations give you the opportunity to force the device's outputs to any desired state upon power up.

Figure 7. The PALC22V10 Macrocell
(Diagram labels: SUM OF PRODUCTS, ASYNC RESET, GLOBAL CLOCK, SYNC PRESET, OUTPUT ENABLE PTERM, FEEDBACK TO ARRAY, C1, C0, TO I/O PIN.)

"Cypress Semiconductor Corporation, 11/10/1987
module Rst_Pre3                            "Module name
flag '-r3'                                 "Logic Reduction level r3, PRESTO algorithm by pin
title 'Asynchronous Reset/Synchronous Preset Example 3, Using Reset and Preset to Load to Predetermined States'

"************************************************************************
"* This Example will Asynchronously Load a Value of 55 and Synchronously Load a *
"* Value of AA by using 'istype' statements to invert alternating output registers *
"************************************************************************

U1 device 'P22V10';                        "Device designator and type

"Pin assignments
CLK  pin 1;                                "Clock input
RST  pin 2;                                "Defines async reset pin
PRE  pin 3;                                "Defines sync preset pin
INPUT7,INPUT6,INPUT5,INPUT4      pin 4,5,6,7;
INPUT3,INPUT2,INPUT1,INPUT0      pin 8,9,10,11;
OUTPUT7,OUTPUT6,OUTPUT5,OUTPUT4  pin 23,22,21,20;
OUTPUT3,OUTPUT2,OUTPUT1,OUTPUT0  pin 19,18,17,16;
OUTPUT7,OUTPUT5,OUTPUT3,OUTPUT1  istype 'pos,reg';   "Odd regs positive logic
OUTPUT6,OUTPUT4,OUTPUT2,OUTPUT0  istype 'neg,reg';   "Even regs negative
reset,preset  node 25,26;                  "Pre-assigned node #s

"Labels
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTS    = [INPUT7..INPUT0];
OUTPUTS   = [OUTPUT7..OUTPUT0];

@radix 16;                                 "command forces the default number base to be HEX

equations
reset    = !RST;                           "Async reset when pin RST low
preset   = PRE;                            "Sync preset if pin PRE is high during the rising edge of CLK
OUTPUTS := INPUTS;                         "The := indicates that this is a clocked (synchronous) operation

test_vectors                               "Test Reset and Preset
([CLK,RST,PRE,INPUTS] -> OUTPUTS)
 [C,H,L,55]  -> 55;                        "Test outputs by clocking in 55
 [L,H,L,0AA] -> 55;                        "Test registers hold old data (55)
 [C,H,L,0AA] -> 0AA;                       "Clock in AA (note the leading zero necessary for hex digits A thru F)
 [C,H,L,0FF] -> 0FF;                       "Set all outputs high (FF)
 [L,L,L,0FF] -> 55;                        "RST low asynchronously (bits 6,4,2,0 inverted)
 [C,H,H,0]   -> 0AA;                       "PRE high synchronously (bits 6,4,2,0 inverted)
end Rst_Pre3

Figure 8. Resetting and Presetting to Predetermined Values

Output Enable Controlled by One Pin

The example in Figure 9 defines pin 2 as the output enable pin for all outputs. Note the use of special constant ".Z.", which is redefined as simply "Z" in the file's labels section. The constant is used in the test vectors to verify that the outputs are three-stated (high-Z) under the appropriate conditions.

"Cypress Semiconductor Corporation  November 10, 1987
module Out_Enable1                         "Module name
flag '-r3'                                 "Logic Reduction level r3
title 'Output Enable from Single Input Example'

"***********************************************
"* This example demonstrates the Output Enable *
"* Function being controlled by a single input *
"***********************************************

U1 device 'P22V10';                        "Device designator and type

"Pin assignments
CLK  pin 1;                                "Clock input
OE   pin 2;                                "Output enable input
INPUT7,INPUT6,INPUT5,INPUT4      pin 4,5,6,7;
INPUT3,INPUT2,INPUT1,INPUT0      pin 8,9,10,11;
OUTPUT7,OUTPUT6,OUTPUT5,OUTPUT4  pin 23,22,21,20;
OUTPUT3,OUTPUT2,OUTPUT1,OUTPUT0  pin 19,18,17,16;
reset,preset  node 25,26;                  "Pre-assigned node #s

"Labels
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTS    = [INPUT7..INPUT0];
OUTPUTS   = [OUTPUT7..OUTPUT0];
OUTENA    = [OUTPUT7.OE,OUTPUT6.OE,OUTPUT5.OE,OUTPUT4.OE];
OUTENB    = [OUTPUT3.OE,OUTPUT2.OE,OUTPUT1.OE,OUTPUT0.OE];

@radix 16;                                 "This command forces the default number base to be HEX

equations
OUTENA   = !OE;                            "Outputs enabled only if pin OE is low
OUTENB   = !OE;
OUTPUTS := INPUTS;

test_vectors                               "Test output enables
([CLK,OE,INPUTS] -> OUTPUTS)
 [C,L,55]  -> 55;                          "Test outputs by clocking in 55 (outputs enabled)
 [L,H,0AA] -> Z;                           "Test outputs go to high-Z state on OE high
 [L,L,0AA] -> 55;                          "Test registers hold old data (55)
 [C,L,0AA] -> 0AA;                         "Clock in AA (note the leading zero necessary for hex digits A thru F)
 [C,H,0FF] -> Z;                           "Set all outputs high (FF) but tri-stated
 [L,L,X]   -> 0FF;                         "Turn outputs on and read FF
end Out_Enable1

Figure 9. Output Enable Controlled by a Single Input

"Cypress Semiconductor Corp.  11/10/1987
module Out_Enable2                         "Module name
flag '-r3'                                 "Logic Reduction level r3
title 'Output Enable From a Product Term Example'

"***********************************************
"* This example demonstrates the Output Enable *
"* Function being controlled by a product term *
"***********************************************

U1 device 'P22V10';                        "Device designator and type

"Pin assignments
CLK,OE  pin 1,2;                           "Clock and Output Enable inputs
INPUT7,INPUT6,INPUT5,INPUT4      pin 4,5,6,7;
INPUT3,INPUT2,INPUT1,INPUT0      pin 8,9,10,11;
OUTPUT7,OUTPUT6,OUTPUT5,OUTPUT4  pin 23,22,21,20;
OUTPUT3,OUTPUT2,OUTPUT1,OUTPUT0  pin 19,18,17,16;
reset,preset  node 25,26;                  "Pre-assigned node #s

"Labels
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTS    = [INPUT7..INPUT0];
OUTPUTS   = [OUTPUT7..OUTPUT0];

@radix 16;                                 "This command forces the default number base to be HEX

equations
"Each Output individually enabled if the corresponding digital code is applied at
"inputs and OE is low
OUTPUT0.OE = (INPUTS == 0) & !OE;
OUTPUT1.OE = (INPUTS == 1) & !OE;
OUTPUT2.OE = (INPUTS == 2) & !OE;
OUTPUT3.OE = (INPUTS == 3) & !OE;
OUTPUT4.OE = (INPUTS == 4) & !OE;
OUTPUT5.OE = (INPUTS == 5) & !OE;
OUTPUT6.OE = (INPUTS == 6) & !OE;
OUTPUT7.OE = (INPUTS == 7) & !OE;
OUTPUTS := INPUTS;

"Loads 55, checks OE high overrides
"all enable terms, then enables and
"checks all outputs one at a time
test_vectors
([CLK,OE,INPUTS] -> [OUTPUT7..OUTPUT0])
 [C,H,55] -> [Z,Z,Z,Z,Z,Z,Z,Z];
 [L,H,0]  -> [Z,Z,Z,Z,Z,Z,Z,Z];
 [L,L,0]  -> [Z,Z,Z,Z,Z,Z,Z,1];
 [L,L,1]  -> [Z,Z,Z,Z,Z,Z,0,Z];
 [L,L,2]  -> [Z,Z,Z,Z,Z,1,Z,Z];
 [L,L,3]  -> [Z,Z,Z,Z,0,Z,Z,Z];
 [L,L,4]  -> [Z,Z,Z,1,Z,Z,Z,Z];
 [L,L,5]  -> [Z,Z,0,Z,Z,Z,Z,Z];
 [L,L,6]  -> [Z,1,Z,Z,Z,Z,Z,Z];
 [L,L,7]  -> [0,Z,Z,Z,Z,Z,Z,Z];
end Out_Enable2

Figure 10. Separate Output Enables Controlled by Product Terms

Product-Term-Based Output Enable

While Figure 9 illustrates gang control of all output enables via an input pin, Figure 10 shows several outputs with individual output enables generated from separate product terms. As with reset and preset, you can make output enables synchronous or extend the number of product terms by using a macrocell to generate the necessary logic and looping back the term via a feedback path. This method incurs additional propagation delay due to passing through the AND/OR array twice, however. The special constant ".Z."
is used in the test vectors for this design to verify the operation of outputs in the three-stated (high-Z) mode.

An 8-Bit Identity Comparator

This example (Figure 11) shows how the 22V10's variable-product-term architecture lets you directly implement logic that would otherwise require multiple feedback terms in standard PLDs. The 22V10 offers a maximum of 16 product terms per output, compared to only eight product terms per output for standard 20-pin PLDs.

An n-bit comparator requires 2^n (two to the power n) product terms to implement. This example achieves 8-bit comparison by decomposing the 8 bits into two 4-bit comparisons and using I/O pins 18 and 19 for each 4-bit comparison. These pins have 16 product terms each. The results of each 4-bit comparison are available at the pins one tpd after a match is detected.

Note in Figure 11 how the inputs and outputs are used in more than one label. This practice facilitates writing equations and test vectors for the individual 4-bit fields and the complete 8-bit fields.

Single-Output, 9-Bit Identity Comparator

This example is very similar to the example in Figure 11, except that it rearranges the DATA inputs to AND the two 4-bit comparator outputs with the result of the single, 9th-bit compare. The result is a single DATA = INPUTS output called INEQDATA. The source code for this example appears in Figure 12.

The disadvantage of this implementation is that it incurs an additional tpd by feeding the individual 4-bit comparator outputs back through the AND/OR array. Note that although the terms fed back to INEQDATA represent 34 (16 + 16 + 2) product terms, only three of the eight product terms available at I/O pin 23 are used; each of the three individual compares has already been reduced to a single signal by the time it reaches the AND/OR array for pin 23. You can also use the extra product terms along with a separately defined input for cascading the design to n-bit length.

Bus Interface Data Trap with Answer-Back

This example demonstrates the 22V10's bidirectional I/O capabilities (Figure 13). In this example, an 8-bit pattern is supplied to INPUTS and is continuously compared to the data on DATA7..DATA0. This design is intended for an application in which DATA7..DATA0 is a Z80 microprocessor's data bus. If the interrupt is enabled (pin INTRENBL is High), the 8-bit comparator output drives pin INTR active (Low). In response, the Z80 drives pin IDREQ High. This action asks the device that initiated the interrupt to place its 8-bit ID code on the data bus. In this example, the ID code used is ^h55. You can use any code by modifying the equation for DATA in the source file.

Counter/Address Generator/Multiplexer

This 10-bit counter, address generator, and multiplexer example (Figure 14) implements the address-generation circuitry for the front end of a high-speed data-acquisition module. The design requires two modes of operation: In ACQUIRE mode, counters generate the ten address lines. In READ mode, a microprocessor's address lines generate the same addresses.

A discrete version of this application employs quad 2:1 multiplexers to select whether the counters or the microprocessor provide the address information. The entire discrete circuit, excluding the SRAM being addressed, consists of 11 SSI and MSI TTL components. The example given here implements the equivalent circuitry in a single 22V10.

Note how the MODE pin in the equations for the AOUT outputs controls the source of the addresses. Also note the use of the asynchronous reset node: the reset term is generated when MODE is set for microprocessor access (Low) and the processor address itself is zero. Although the effect at the outputs (all outputs = zero) is the same as if the reset term were not included, the asynchronous reset gives the processor a way to reset all the registers to a known state before allowing the counters to free-run again.

Timing Diagram

One of the more interesting features of the ABEL SIMULATE program is its ability to generate timing diagrams for specified pins based on the test vectors in a source file. Although a timing diagram does not show propagation delays, it can help you verify a device's in-circuit operation with a logic analyzer. The SIMULATE output file shown in Figure 15 is generated with the command line:

simulate -iaddmux.out -oaddmux.sim -t4 -w1,2,3,4,5,13,14,15,16,17,18

"Cypress Semiconductor Corporation                November 10, 1987
module AllTerms                                   "Module name
flag '-r3'                                        "Logic Reduction level r3, PRESTO algorithm by pin
title 'Using 16 Product Terms; An 8-bit Identity Comparator'
"***************************************************************************
"* In this design, an 8-bit word is presented at I/O pins 23,22,21,20,17,16,15 and 14.
"* These pins are used for inputs only in this example. The 8-bit word is compared, 4 bits
"* at a time, to inputs INPUT7..0. Combinatorial outputs COMPHI and COMPLO show
"* the result of each 4-bit comparison. Pins 19 and 18 are used as the comparator outputs
"* since these pins have enough Product Terms (16) for the required 4-bit comparisons.
"***************************************************************************
U1 device 'P22V10';                               "Device designator and type
"Pin assignments
CLK                          pin 1;               "Clock input (NOT used)
INPUT7,INPUT6,INPUT5,INPUT4  pin 4,5,6,7;
INPUT3,INPUT2,INPUT1,INPUT0  pin 8,9,10,11;
DATA7,DATA6,DATA5,DATA4      pin 23,22,21,20;
DATA3,DATA2,DATA1,DATA0      pin 17,16,15,14;
COMPHI,COMPLO                pin 19,18;           "Comparator outputs
reset,preset                 node 25,26;          "Pre-assigned node #s
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTSH   = [INPUT7..INPUT4];                     "High-order nibble
DATAH     = [DATA7..DATA4];
INPUTSL   = [INPUT3..INPUT0];                     "Low-order nibble
DATAL     = [DATA3..DATA0];
DATA      = [DATA7..DATA0];                       "All 8 bits
INPUTS    = [INPUT7..INPUT0];
@radix 16;
equations
COMPHI = (INPUTSH == DATAH);                      "High-order nibble compare
COMPLO = (INPUTSL == DATAL);                      "Low-order nibble compare
test_vectors
([DATA,INPUTS] -> [COMPHI,COMPLO])
[0,0]     -> [H,H];   [1,1]     -> [H,H];   [2,2]     -> [H,H];
[4,4]     -> [H,H];   [8,8]     -> [H,H];   [0F,0F]   -> [H,H];
[0E,0E]   -> [H,H];   [0D,0D]   -> [H,H];   [0B,0B]   -> [H,H];
[7,7]     -> [H,H];   [0,0F]    -> [H,L];   [0F0,0FF] -> [H,L];
[0F0,0]   -> [L,H];   [0F0,0F]  -> [L,L];
end AllTerms

Figure 11. Using 16 Product Terms: An 8-Bit Identity Comparator

"Cypress Semiconductor Corporation                November 10, 1987
module CompFB                                     "Module name
flag '-r3'                                        "Logic Reduction level r3, PRESTO algorithm by pin
title 'Using Feedback to Realize more than 16 Product Terms;
A Single Output, 9-bit Identity Comparator'
"****************************************************************************
"* In this design, a 9-bit word is presented at pins 23,22,21,20,17,16,11,10 and 9.     *
"* These pins are used for inputs only in this example. The 8 LSBs of the 9-bit word are *
"* compared, 4 bits at a time, to inputs INPUT7..0. Combinatorial outputs COMPHI and     *
"* COMPLO show the results of each 4-bit comparison. Pins 19 and 18 are used as the      *
"* comparator outputs since these pins have enough Product Terms (16) for the required   *
"* 4-bit comparison. The MSBs (bit 8) of DATA and INPUTS are compared at output COMPMSB. *
"* Outputs COMPMSB, COMPHI, and COMPLO are ANDED together to form output                 *
"* INEQDATA.                                                                             *
"****************************************************************************
U1 device 'P22V10';                               "Device designator and type
"Pin assignments
INPUT8,INPUT7,INPUT6,INPUT5,INPUT4  pin 1,2,3,4,5;
INPUT3,INPUT2,INPUT1,INPUT0         pin 6,7,8,9;
DATA8,DATA7,DATA6,DATA5,DATA4       pin 10,11,13,14,15;
DATA3,DATA2,DATA1,DATA0             pin 16,17,20,21;
COMPH,COMPL,COMPMSB,INEQDATA        pin 19,18,22,23;   "Comparator outputs
reset,preset                        node 25,26;        "Pre-assigned node #s
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTSH   = [INPUT7..INPUT4];                     "High-order nibble
DATAH     = [DATA7..DATA4];
INPUTSL   = [INPUT3..INPUT0];                     "Low-order nibble
DATAL     = [DATA3..DATA0];
DATA      = [DATA8..DATA0];                       "All nine bits
INPUTS    = [INPUT8..INPUT0];
@radix 16;
equations
COMPH    = (INPUTSH == DATAH);                    "High-order nibble compare
COMPL    = (INPUTSL == DATAL);                    "Low-order nibble compare
COMPMSB  = (INPUT8 == DATA8);                     "MSB compare
INEQDATA = COMPH & COMPL & COMPMSB;               "Logical AND of all comparisons
test_vectors
([DATA,INPUTS] -> [COMPH,COMPL,COMPMSB,INEQDATA])
[0,0]     -> [H,H,H,H];   [111,111] -> [H,H,H,H];
[22,22]   -> [H,H,H,H];   [44,44]   -> [H,H,H,H];
[88,88]   -> [H,H,H,H];   [1FF,1FF] -> [H,H,H,H];
[1FF,0FF] -> [H,H,L,L];   [0,100]   -> [H,H,L,L];
[1FE,1FF] -> [H,L,H,L];   [1FE,1EE] -> [L,H,H,L];
end CompFB

Figure 12.
Realizing More Than 16 Product Terms Through Feedback: A 9-Bit, Single-Output Identity Comparator

"Cypress Semiconductor Corp., 11/10/1987
module BiDirect                                   "Module name
flag '-r3'                                        "Logic Reduction level r3, PRESTO algorithm by pin
title 'Bi-Directional I/O  A Bus Interface Data Trap with Answer-Back'
"****************************************************************************
"* This example compares the pattern at pins INPUTS to the data on data bus pins *
"* DATA7..DATA0. Pin INTR is driven low if they match and INTRENBL (interrupt     *
"* enable) is high. Input IDREQ is then driven high, requesting ID code (^h55 in  *
"* this example) to be put on the data bus                                        *
"****************************************************************************
U1 device 'P22V10';
IDREQ, INTRENBL              pin 2,3;             "Output Enable, Interrupt Enable
COMPL, INTR                  pin 19,18;           "Used in comparison of 4 LSBs
INPUT7,INPUT6,INPUT5,INPUT4  pin 4,5,6,7;
INPUT3,INPUT2,INPUT1,INPUT0  pin 8,9,10,11;
DATA7,DATA6,DATA5,DATA4      pin 23,22,21,20;
DATA3,DATA2,DATA1,DATA0      pin 17,16,15,14;
reset,preset                 node 25,26;          "Pre-assigned node #s
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
INPUTS = [INPUT7..
INPUT0];                                          "All inputs
INPUTH  = [INPUT7..INPUT4];                       "High order nibble of INPUTS
INPUTL  = [INPUT3..INPUT0];                       "Low order nibble of INPUTS
DATA    = [DATA7..DATA0];                         "All data I/Os
DATAH   = [DATA7..DATA4];                         "High order nibble of DATA
DATAL   = [DATA3..DATA0];                         "Low order nibble of DATA
DATAOEA = [DATA7.OE,DATA6.OE,DATA5.OE,DATA4.OE];
DATAOEB = [DATA3.OE,DATA2.OE,DATA1.OE,DATA0.OE];
IDCODE  = ^h55;                                   "Identification code
equations
DATAOEA = IDREQ;                                  "Enables ID output onto data bus
DATAOEB = IDREQ;
DATA    = IDCODE;                                 "Identification code for device (^h55)
COMPL   = (DATAL == INPUTL);                      "4 LSBs compare
!INTR   = (DATAH == INPUTH) & COMPL & INTRENBL;   "INTR active low, All bits equal and
                                                  "interrupt enabled (INTRENBL high)
test_vectors
([IDREQ,INTRENBL,DATA,INPUTS] -> [COMPL,INTR,DATA])
[L,H,^h0F,^h1F]   -> [H,H,X];                     "Low nibble equal, high not equal
[L,H,^h0F0,^h0F1] -> [L,H,X];                     "High nibble equal, low not equal
[L,L,^h0AA,^h0AA] -> [H,H,X];                     "Test Interrupt Enable
[L,H,^h0AA,^h0AA] -> [H,L,X];                     "DATA = INPUTS, INTR goes active (low)
[L,H,^h55,^h55]   -> [H,L,X];
[H,H,Z,X]         -> [X,X,IDCODE];                "DATA pins output IDCODE (^h55)
end BiDirect

Figure 13. BiDirectional I/O: Bus Interface Data Trap with Answer-Back

"Cypress Semiconductor Corporation                November 10, 1987
module AddGenMux
flag '-r3'
title '10-bit Address Generation / Multiplexer IC'
"********************************************************
"* This PLD design generates Address signals A0-A9.     *
"* If Control signal MODE is high, the address signals  *
"* are the output of a 10-bit counter. If MODE is low   *
"* the device passes uP Address lines UPADD0-UPADD9     *
"********************************************************
AdrsGen device 'P22V10';
CLK                            pin 1;             "System Master Clock
A0,A1,A2,A3,A4,A5,A6,A7,A8,A9  pin 14,15,16,17,18,19,23,22,21,20;   "Address Outputs
UPADD0,UPADD1,UPADD2,UPADD3    pin 2,3,4,5;       "uP Address Lines
UPADD4,UPADD5,UPADD6,UPADD7    pin 6,7,8,9;
UPADD8,UPADD9                  pin 10,11;
MODE                           pin 13;
reset,preset                   node 25,26;
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
AOUT      = [A9..A0];
UPADD     = [UPADD9..UPADD0];
@radix 16;
equations                                         "Boolean equations
reset = (UPADD == 0) & !MODE;                     "Reset if uP Address = 00 and MODE is low
AOUT := ((AOUT + 1) & MODE)                       "Count up if MODE high or
      # (UPADD & !MODE);                          "Pass UPADD if MODE low
test_vectors                                      "Check Operation
([CLK,UPADD,MODE] -> AOUT)
[X,0,L]   -> 0;                                   "Checks Reset Function
[C,X,H]   -> 1;     [C,X,H]   -> 2;     [C,X,H]   -> 3;     [C,X,H]   -> 4;
[C,X,H]   -> 5;     [C,X,H]   -> 6;     [C,X,H]   -> 7;     [C,X,H]   -> 8;
[C,X,H]   -> 9;     [C,X,H]   -> 0A;    [C,X,H]   -> 0B;    [C,X,H]   -> 0C;
[C,X,H]   -> 0D;    [C,X,H]   -> 0E;    [C,X,H]   -> 0F;    [C,X,H]   -> 10;
[C,111,L] -> 111;   [C,222,L] -> 222;   [C,44,L]  -> 44;    [C,88,L]  -> 88;
[C,2EE,L] -> 2EE;   [C,155,L] -> 155;   [C,1DD,L] -> 1DD;   [C,3BB,L] -> 3BB;
[C,377,L] -> 377;   [C,2AA,L] -> 2AA;   [C,3FF,L] -> 3FF;   [C,222,H] -> 00;
[C,0FF,L] -> 0FF;   [C,X,H]   -> 100;             "Load to states where all 8 LSBs
[C,1FF,L] -> 1FF;   [C,X,H]   -> 200;             "are high (uP mode), then toggle in
[C,2FF,L] -> 2FF;   [C,X,H]   -> 300;             "counter mode
[C,3FF,L] -> 3FF;   [C,X,H]   -> 0;
end AddGenMux

Figure 14. 10-Bit Address Generator/Multiplexer

The "-i" indicates the input file, which in this case is the intermediate output file created by ABEL's FUSEMAP program. The "-o" tells SIMULATE which file to write the results into. The "-t4" specifies the trace level where waveforms are displayed, and the "-w1..18" indicates which pins to show in the waveform output.

You can find more information on SIMULATE in the ABEL User's Guide and Language Reference supplied with the ABEL software from Data I/O.

ABEL Version 2.00b   Data I/O Corp.
Address Generation / Multiplexer IC
Simulate device AdrsGen, type 'P22V10'
[Waveform listing for test vectors V0001 through V0022, showing CLK, the UPADD inputs, MODE, and the A outputs selected on the command line; the waveform itself is not recoverable from the source.]

Figure 15. ABEL Simulated Waveform

Three State Machines in One 22V10

This final example demonstrates the power of the 22V10 when used as a synchronous state machine. The application involves the redesign of a radar system's timing circuitry. The system performs 12 discrete Fourier transforms on each set of quadrature data returned in three antenna beams that are gated for nine ranges. The nonbinary nature of these numbers (three beams, nine ranges, and 12 speed bins) makes generating the timing signals with counter circuits cumbersome.

This example creates three state machines in a single 22V10. As you can see from the state diagrams (Figure 16), the filter state machine is free running. The beam state machine changes state only when the filter outputs are in their maximum condition. Similarly, the gate state machine changes state only if both the filter and beam outputs are at their maximum values.

Note the combined use of Boolean equations and state diagrams. A separate state diagram describes each state machine, but the transitions depend upon the condition of the other state outputs. Also of note is the extensive use of labels for pins, groups of pins, and the state outputs. This approach greatly simplifies the writing of the state diagrams and test vectors.

When this design was first compiled, the ABEL FUSEMAP routine indicated several outputs that had too many terms for the physical array of the corresponding I/O pin. The design was made to fit by carefully arranging the I/Os. The flag "-r3" reduction statement made the fit possible without the tedium of generating and manually reducing Boolean equations from the state diagrams.

The test vectors for this design are of particular interest. Note how the @REPEAT command cycles through 35 states to make the gate state outputs toggle. This powerful command helps describe 325 test vectors in a concise and manageable manner.

"Cypress Semiconductor Corporation                November 10, 1987
module Statexam
flag '-r3'
title 'Timing Generation TRIPLE State Machine for DFT Processor
using a Cypress Semiconductor PAL C22V10'
"*********************************************************************
"* BEAM STATES - 0,1,2 (3 not used), GATE STATES - 0,1,2,4,5,6,8,9,A
"* (3,7,B,C,D,E,F not used), FILTER STATES - 0,1,2,4,5,6,8,9,A,C,D,E
"* (3,7,B,F not used)
"*********************************************************************
U1 device 'P22V10';
SYSCLK               pin 1;
START                pin 2;                       "Used for reset/power-up
AB0,AB1,AB2,AB3,AB4  pin 23,14,22,15,21;          "Pins are non-sequential to take advantage of
AB5,AB6,AB7,AB8,AB9  pin 16,18,19,20,17;          "The variable number of product terms in the 22V10
reset,preset         node 25,26;                  "Pre-assigned node #s
AB0,AB1,AB2,AB3,AB4  istype 'pos,reg';            "Unnecessary because ABEL will set architecture bits
AB5,AB6,AB7,AB8,AB9  istype 'pos,reg';            "automatically - shown for example purposes only
H,L,X,C,Z = 1,0,.X.,.C.,.Z.;
ABall     = [AB9..AB0];
FILT      = [AB3..AB0];
BEAM      = [AB5,AB4];
GATE      = [AB9..AB6];
@radix 16;
"Filter States - note missing states
F0 = 00; F1 = 01; F2 = 02; F3 = 04; F4 = 05; F5 = 06;
F6 = 08; F7 = 09; F8 = 0A; F9 = 0C; F10 = 0D; F11 = 0E;
"Beam States
B0 = 00; B1 = 01; B2 = 02;
"Gate States
G0 = 00; G1 = 01; G2 = 02; G3 = 04; G4 = 05; G5 = 06;
G6 = 08; G7 = 09; G8 = 0A;
equations
reset = START;                                    "Initialize to all lows on START
state_diagram FILT
State F0:  GOTO F1;
State F1:  GOTO F2;
State F2:  GOTO F3;
State F3:  GOTO F4;
State F4:  GOTO F5;
State F5:  GOTO F6;
State F6:  GOTO F7;
State F7:  GOTO F8;
State F8:  GOTO F9;
State F9:  GOTO F10;
State F10: GOTO F11;
State F11: GOTO F0;
state_diagram BEAM
State B0: case (FILT == ^b1110) : B1;
               (FILT != ^b1110) : B0;
          endcase;
State B1: case (FILT == ^b1110) : B2;             "Increment ONLY if
               (FILT != ^b1110) : B1;             "FILT is at max (0E)
          endcase;
State B2: case (FILT == ^b1110) : B0;
               (FILT != ^b1110) : B2;
          endcase;

Figure 16. Triple State Machine (part 1)
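The coupling between the three machines in Figure 16 can be checked outside ABEL with a small behavioral model. The following Python sketch is an illustration only, not part of the original design files: the function names (`step`, `run`, `pack`) are hypothetical, and the model assumes the gate machine advances through G0 to G8 and rolls over exactly as the test vectors in the continuation of Figure 16 describe. The state encodings are the ones declared in the listing.

```python
# Behavioral sketch of the three coupled state machines (illustrative names,
# not generated by ABEL). Encodings are taken from the Statexam listing.
FILT_CODES = [0x0, 0x1, 0x2, 0x4, 0x5, 0x6, 0x8, 0x9, 0xA, 0xC, 0xD, 0xE]  # F0..F11
BEAM_CODES = [0x0, 0x1, 0x2]                                                # B0..B2
GATE_CODES = [0x0, 0x1, 0x2, 0x4, 0x5, 0x6, 0x8, 0x9, 0xA]                  # G0..G8


def step(gate, beam, filt):
    """Return the (gate, beam, filt) state indices after one SYSCLK edge."""
    filt_at_max = FILT_CODES[filt] == 0xE   # FILT == ^b1110
    beam_at_max = BEAM_CODES[beam] == 0x2   # BEAM == ^b10
    next_filt = (filt + 1) % len(FILT_CODES)            # filter is free running
    next_beam = (beam + 1) % len(BEAM_CODES) if filt_at_max else beam
    next_gate = (gate + 1) % len(GATE_CODES) if (filt_at_max and beam_at_max) else gate
    return next_gate, next_beam, next_filt


def run(clocks, start=(0, 0, 0)):
    """Clock the model 'clocks' times from 'start' (the reset state) and return the trace."""
    trace = [start]
    for _ in range(clocks):
        trace.append(step(*trace[-1]))
    return trace


def pack(gate, beam, filt):
    """Pack state indices into the 10-bit word [AB9..AB0]: GATE=AB9..6, BEAM=AB5..4, FILT=AB3..0."""
    return (GATE_CODES[gate] << 6) | (BEAM_CODES[beam] << 4) | FILT_CODES[filt]


if __name__ == "__main__":
    trace = run(12 * 3 * 9)
    print("state after 36 clocks:", trace[36])
    print("distinct states visited:", len(set(trace[:-1])))
```

In this model the beam index first changes on the 12th clock and the gate index on the 36th, which is why 35 don't-care vectors between explicit gate checks are enough to exercise every gate transition.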
state_diagram GATE
State G0: case ((BEAM == ^b10) & (FILT == ^b1110)) : G1;   "Increments ONLY if BEAM and FILT are at max
               ((BEAM != ^b10) # (FILT != ^b1110)) : G0;
          endcase;
State G1: case ((BEAM == ^b10) & (FILT == ^b1110)) : G2;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G1;
          endcase;
State G2: case ((BEAM == ^b10) & (FILT == ^b1110)) : G3;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G2;
          endcase;
State G3: case ((BEAM == ^b10) & (FILT == ^b1110)) : G4;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G3;
          endcase;
State G4: case ((BEAM == ^b10) & (FILT == ^b1110)) : G5;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G4;
          endcase;
State G5: case ((BEAM == ^b10) & (FILT == ^b1110)) : G6;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G5;
          endcase;
State G6: case ((BEAM == ^b10) & (FILT == ^b1110)) : G7;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G6;
          endcase;
State G7: case ((BEAM == ^b10) & (FILT == ^b1110)) : G8;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G7;
          endcase;
State G8: case ((BEAM == ^b10) & (FILT == ^b1110)) : G0;
               ((BEAM != ^b10) # (FILT != ^b1110)) : G8;
          endcase;
test_vectors                                      "Verifies device's operation
([SYSCLK,START] -> [GATE,BEAM,FILT])
[X,H] -> [G0,B0,F0];
[C,L] -> [G0,B0,F1];   [C,L] -> [G0,B0,F2];   [C,L] -> [G0,B0,F3];   [C,L] -> [G0,B0,F4];
[C,L] -> [G0,B0,F5];   [C,L] -> [G0,B0,F6];   [C,L] -> [G0,B0,F7];   [C,L] -> [G0,B0,F8];
[C,L] -> [G0,B0,F9];   [C,L] -> [G0,B0,F10];  [C,L] -> [G0,B0,F11];
[C,L] -> [G0,B1,F0];   [C,L] -> [G0,B1,F1];   [C,L] -> [G0,B1,F2];   [C,L] -> [G0,B1,F3];
[C,L] -> [G0,B1,F4];   [C,L] -> [G0,B1,F5];   [C,L] -> [G0,B1,F6];   [C,L] -> [G0,B1,F7];
[C,L] -> [G0,B1,F8];   [C,L] -> [G0,B1,F9];   [C,L] -> [G0,B1,F10];  [C,L] -> [G0,B1,F11];
[C,L] -> [G0,B2,F0];   [C,L] -> [G0,B2,F1];   [C,L] -> [G0,B2,F2];   [C,L] -> [G0,B2,F3];
[C,L] -> [G0,B2,F4];   [C,L] -> [G0,B2,F5];   [C,L] -> [G0,B2,F6];   [C,L] -> [G0,B2,F7];
[C,L] -> [G0,B2,F8];   [C,L] -> [G0,B2,F9];   [C,L] -> [G0,B2,F10];  [C,L] -> [G0,B2,F11];
[C,L] -> [G1,B0,F0];                              "Gate output changes state here
@REPEAT ^d35 {[C,L] -> [X,X,X];}   [C,L] -> [G2,B0,F0];
@REPEAT ^d35 {[C,L] -> [X,X,X];}   [C,L] -> [G3,B0,F0];
@REPEAT 11035 {[C,L] -> [X,x,x]; } [C,L] -> [G4,BO,FO];@REPEAT 11035 {[C,L] -> [X,x,X]; } [C,L] -> [G5,BO,FO]; @REPEAT 11035 {[C,L] -> [X,x,X]; } [C,L] -> [G6,BO,FO];@REPEAT 11035 {[C,L] -> [X,x,X]; } [C,L] -> [G7,BO,FO]; @REPEAT 11035 {[C,L] -> [X,x,X]; } [C,L] -> [G8,BO,FO]; @REPEAT II 035 {[C,L] -> [X,x,x];} [C,L] -> [GO,BO,FO]; "Check the final state rolls over to the first "This completes a run-through of ALL states, the following 2 vectors retest reset (STAR1) [C,L] -> [GO,BO,Fl]; [C,H] -> [GO,BO,FO]; end Statexam Figure 16. Triple State Machine (continued) 6-138 CYPRESS SEMICONDUCTOR Using ABEL to Program the CY7C330 put-enable line. Control of the CY7C330's output enable can originate from the product term array or from pin 14. You can program the choice on a register-by-register basis. (The I/O macrocell section of this application note gives more information on controlling the output enable.) You can control the input-register clock mux in two ways. The most descriptive way is to use the".C" suffix, as shown in the DEM0330.ABL example file supplied with ABEL. This method works for the dedicated input registers (pins 4 - 7 and 9 - 14) but does not work in ABEL 3.1 for the input registers in the I/O macrocells. The reason for this problem is that for the 12 I/O macrocells, ABEL thinks the clock mux is for the output or state register and not the input register. Thus, the recommended method for controlling the input-register clock mux is to use macro commands. The macro file supplied with ABEL 3.0 does not include the complete macro list needed to program all the clock muxes, but you can get the complete file from Cypress. This file, P330.INC, contains the macros needed to program all the clock muxes, including the input registers. A listing of the macro file appears in Appendix A. ABEL versions 3.1 and higher come with the complete macro flIe. After you reference the macro file in the ABEL source flIe, the command CLK2 must enable the pin-3 clock. 
Then you set specifIc clock muxes by entering CLK2 n, where n is the input register's pin number. For example: LIBRARY 'p330'; "allows use of p330.inc macro file CLK2; "enables pin 3 as a clock input CLK2 5; - "pin 5 input reg uses the pin 3 clock CLK2 15; - "pin 15 input reg uses the pin 3 clock This application note describes how to access all the features of the Cypress CY7C330 using ABEL. Examples show how to put the features to work. ABEL is a versatile logic design tool that can program over 300 different devices. The Cypress CY7C330 is a powerful PLD. Features such as input and buried registers allow the CY7C330 to fit into a wide variety of applications. Although, the same features can make programming the device a challenge, this application note should minimize the challenge. ABEL 3.0 Bug If you are still using ABEL 3.0 and trying to program the CY7C330 for the fIrst time, note that the supplied device driver has a fatal flaw. Both Cypress and Data I/O offer updated device drivers. ABEL 3.1 also supplies a correct device flIe, with a new name. P330 was used for revision 3.0, and P330A for 3.1, although 3.1 still compiles with the P330 device name. The only difference between these two device flIes is the syntax for specifying the shared feedback mux. Input Registers The CY7C330 contains 11 dedicated input registers. An input register is also associated with each one of the 12 output registers (more on this later). Pin 3 can serve as an input register or a clock input. In fact, ten of the 11 input registers can be clocked from two different sources: pins 2 or 3. You can program the choice of the clock source individually, ona register-byregister basis. If an application requires only one input clock source, you can use pin 3 as a normal input. If an application requires both input clocks, however, you must use pin 3 as a clock input. A confIguration bit must be changed to enable pin 3 as a clock input. 
Like pin 3, pin 14 is a dual-function pin; it can be used as a registered input or a global, asynchronous, out- 6-139 You do not need a macro statement to specify the use of clock 1 (pin 2) for input registers, because clock 1 is the clock mux default setting for both the dedicated input registers and the I/O macrocell input registers. ABEL handles the accessing of data from one of the dedicated input registers (pins 3 - 14) the same as for a straight buffered input. The only difference is that for the dedicated input registers, input data is not available in the product term array until after the appropriate input clock pulse is received. Controlling the Output Enable You specify an output enable by appending the suffIx ".oE" to the appropriate pin name. You must define whether control of the output enable mux comes from pin 14 or the product term array. Configuration bit CO controls this choice, and you make the selection using the 1STYPE statement: OUT1,OU'f2,OUT3,OUT4 pin 15,16,17,18 ; "I/O pins OUT1.0E,OUT2.0E ISTYPE 'EQN'; "OE is product-term controlled OUT3.0E,OUT4.0E 1STYPE 'PIN'; "OE is controlled by pin 14 OUT1, INP1, INP2 PIN 16, 5, 6; OUT1.PR = INP1 ; "preset all output nodes on INP1=1 OUT.RE = INP2; "reset all output nodes on INP2=1 The second way to utilize set and reset is to employ the node notation shown in the following code, in which the set and reset product terms are designated node 30 and 29, respectively. SET, RESET NODE 30, 29 ; SET = INPl; "preset all output nodes on INP1=1 RESET = INP2; "reset all output nodes on INP2=1 Even though the reset and preset functions are synchronous, an error occurs while parsing the equations if you use the ":=" notation, which signifies a registered operation. Using the MacroceII as an Output Only When using the I/O macrocell as an output, you need to consider two parameters. The fIrst is the setting of the macrocell feedback mux, as controlled by configuration bit Cl. 
The second parameter is the control of the output enable, as described in the previous section. As with the output-enable control, you set the configuration bit for the feedback mux using the 1STYPE statement. When the input register is not used, data from the output register is typically fed back to the product-term array through the macrocell feedback mux. When this feedback arrangement is used, 1STYPE is followed by the FEED_REG attribute: OUTl PIN 15; "located in initial pin definitions When controlling the output enable with a product term, you have the option of setting it always on, always off, or making it a combination of some number of inputs or outputs. All three choices are illustrated in this code: [OUT1.0E,OUT2.0E] = [1,1]; "permanently enable outputs OUT3.0E= 0; "permanently disable output OUT4.0E = IN1 & IN2 & OUT1 ; "OE controlled by IN1, IN2, OUT1 OUT1 1STYPE 'FEED REG'; "sets C1=0, allowing feedback mux "to pass data from state register Using Set and Reset The CY7C330 has global synchronous set and. reset capability. When used, it sets or resets all 12. state registers and the four buried registers. Watch out for two conditions when using set or reset: First, when you reset the registers, all the outputs go High if they are· enabled because of the inverter between the state register and the output (Figure 1). Second, be aware that the reset does not occur for two clock pulses if an input is designated as the set/reset pin. This occurs because the reset data must be clocked into the product term array using one of the two input clocks fIrSt. The output registers must then be clocked to cause the reset or set to occur. You can accesS the CY7C330's set and reset capability in two ways: First, you can append the suffix "PR" for preset or ".RE"for reset to any output-pin or buried-register node name. 
The syntax is: OUT1:= INP1 $ «INP1 & INP2 )# INP3); "sample eq from'equations' section The ABEL default for the feedback mux configuration bit (C1) is to take data from the state register. Thus the "1STYPE 'FEED_REG';" statement is not· required, but it is recommended that the defaults be documented. Using the MacroceII as an Input Only When you use the I/O macrocell as an input register, the syntax differs from that of the previous example. Specifically; the output buffer most be three-stated, and the macrocell feedback mux must be set to accept data from the input register(Cl must be set to 1). The followingexample assiImes that the output register is not used at 6-140 Table 1. Node Numbers for Shared Input Multiplexers Node Number 35 36 37 38 39 cell. A configuration bit (C3) controls whether the mux's input is from an even- or odd-pin-number macrocell. The ABEL default is that the data is supplied from the evenpin-number macrocell. Changing to an odd pin requires that you invoke macros located in the P330.INC file. (The example in the next section shows how to make this change.) The purpose of the shared input mux is to provide another input path to the product-term array, when registered feedback is used, without losing input capability. Mux Between Pins 15, 16 17, 18 19,20 23,24 25,26 all. Keep in mind that the input register clock defaults to clock 1 (pin 2) unless specifically changed. INPl, INP2, oun PIN 5, 15, 16 ; Using the Input and Output Registers When using both the input and output registers in the I/O macrocell, the most difficult task is to get the data into the product-term array. You can use two muxes to feed data from the registers into the product-term array. The state-register information must be fed back through the feedback mux controlled by configuration bit Cl. You can route inputregister data through the feedback mux or through the shared input mux (Figure 1). 
The state-register output is referred to by the pin name associated with the macrocell. The data clocked into the input register is referred to by using the node name assigned to the shared input mux. Table 1 lists the node numbers of the shared input muxes. In ABEL, the configuration bit controlling the shared input mux (C3) defaults to an even I/O pin. When the input data is on an odd pin, you can use a macro in the P330.INC macro file to change the C3 configuration bit. INP2 ISTYPE 'FEED PIN'; "set Cl=l, allowing feedback mux to "take data from the input register INP2.0E ISTYPE 'EQU'; "set CO=O for product term OE EQUATIONS INP2.0E = 0; "three-state output buffer permanently oun:= INPI & INP2; Shared Input Mux Each pair of I/O macrocells has a shared input mux. This mux feeds data from the input pin into the productterm array if both registers are fed back in an I/O macro- SET RESET I C L K1 ICLKO oeLl OE OE PTL~E~R~M ________________________+-~~-+-r~ T , TO C3 FROM ADJACENT MACROCELL Figure 1. The CY7C330 Macrocell 6-141 PIN The following example also uses clock 2 (pin 3) to clock the input register: BREG PIN 15; "BREG is output register for pin 15 INP1 NODE 35; "INP1 is the input register for pin 15 BREG ISTYPE 'FEED REG'; "C1 is set to 0, mux routes Q of BREG BREG ISTYPE 'EQN'; "OE is product term controlled LIBRARY 'P330' ; "enables use of the P330.INC file CLK2; "enables pin-3 clock CLK2 15; - "enables CLK2 on pin-15 input reg FEEDPIN 15; "shared input InUX control bit (C3) set "This gives pin 15 an input path EQUATIONS BREG.OE = 0; "disable output BREG := BREG $ (INP1 & INP2); "BREG is fed back and INP1 is an input The Exclusive-OR Gate The CY7C330 provides an exclusive-OR (XOR) gate on the D input of the 12 IJO-macrocell output registers and the four buried registers. You can use this gate for two purposes. First, you can invert the polarity of a signal going into the output register. 
This inversion is accomplished by setting one of the XOR inputs to a logic 1, using the ABEL "$" symbol for XOR. In ABEL, you can use the following format: OUT1 := 1 $ (INP1 & INP2 & INP3); In ABEL versions before 3.1, however, the reduction algorithms do not recognize a 1 mixed with variables in an equation. The equivalent expression for earlier versions is: OUT1 := (INP1 # !INP) $ (INP1&INP2&INP3); The second use for the XOR gate is to emulate JK or T flip-flops in software. T flip-flops are more efficient than D flip-flops for implementing counters and state machines. You can emulate T -type flip-flops by feeding back the output register's Q output and tying it to the XOR product term. The sum-of-products input to the XOR becomes the T input (Figure 2). You can configure this emulation with Boolean equations: 1FLOP:= TFLOP $ (T input expression); where "T input expression" is a legal sum-of-products expression. A JK flip-flop is emulated using the same configuration, and the relationship: T=J!Q#KQ The second way to configure an output flip-flop as a T-type flop is to use an ISTYPE statement such as the one in the next example. The following syntax describes a simple 2-bit counter: PIN 1, 2,3, 14; CLK, INSTB, fOE QO, Ql PIN 28, 27; QO, Ql ISTYPE 'REG_ T' ; QO.OE, Q1.0E ISTYPE 'PIN'; CNT = [Ql,QO]; EQUATIONS QO.OE= OE; Q1.0E = OE; CNT = (CNT + 1); SET RESET ICLKI ICLKO OCLK OE OE P~~---------------------r4-~-+~~ S U" - - " - ' - - - t - - - i 1 - - ' TO INPUT MUX Figure 2. The CY7C330 Macrocell as a T -Type Flip-Flop 6-142 ABEL compensates for the lack of inversion in the output by inverting the data coming out of the input register. "inputs CKS, CK1, CK2, INP PIN 1, 2, 3, 4; "output OUT PIN 15; EQUATIONS OUT := INP; OE (FROM PIN 14) CllO C l K1 TEST VECTORS elK! ([CKS~CK1,CK2,INP] -> [OUT]) [0, C, 0, 0] -> [X]; [C, 0, 0, X] -> [0]; [C, 0, 0, X] -> [0]; [0, C, 0, 1] -> [0]; [C, 0, 0, X] -> [1]; [C, 0, 0, X] -> [1]; SR 55 Figure 3. 
A Buried Register

Buried Registers

As mentioned before, the CY7C330 contains four buried registers. You access these registers by assigning a name to the buried register node number. Table 2 lists the node numbers, and Figure 3 shows a diagram of a buried register. To use a buried register, assign a name to the node and use it as if it were a normal output. The only difference is that the I/O macrocell has an inverter between the state register and the output pin, which causes ABEL to handle the polarity differently (more on this in the next section).

    END

When using state machine syntax, ABEL does not handle the polarity of the buried registers correctly. Not only do the equations not work, but the simulation also fails. You can easily fix the problem, however, by negating the names in the node declaration:

    CLK1, CLK2, CLK3  PIN 1, 2, 3;
    INP, OUT          PIN 4, 15;
    "hidden register declaration (negated)
    !C1, !C2, !C3     NODE 31, 32, 33;

As with the state machine syntax, when using the "COUNT = COUNT + 1" syntax, you also must invert the polarity of any buried registers. The easiest place to accomplish the inversion is at the node definitions statement, as shown in the previous example. Additionally, refer to the counter example at the end of this application note.

Polarity Conventions

As shown in later examples, you typically do not have to worry about signal polarity except when sending data to an output pin. This is because all data enters the product-term array in both the non-inverted and inverted states. ABEL chooses the right polarity to obtain the output as specified by the equations.

When you export data from the device via an output pin, polarity is more critical, especially when using the set or reset. As shown by the block diagrams, the macrocell includes an inverter between the output register and output pin. Therefore, if you use the reset capability, the registers' Q output goes Low, and the output pins go High.
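The effect of the fixed inverter on reset and preset can be checked with a small behavioral model. The sketch below is written in Python rather than ABEL and is only an illustration: the `Macrocell` class and its method names are hypothetical, and the starting value of Q is an assumption, not device data.

```python
from dataclasses import dataclass


@dataclass
class Macrocell:
    """Hypothetical model: a state register whose Q drives the pin through a fixed inverter."""

    q: int = 0  # assumption: Q starts Low, as it would after a reset

    def reset(self) -> None:
        """Reset drives the register's Q output Low."""
        self.q = 0

    def preset(self) -> None:
        """Preset drives the register's Q output High."""
        self.q = 1

    @property
    def pin(self) -> int:
        """The pin carries the inverse of Q because of the inverter in the output path."""
        return 1 - self.q


cell = Macrocell()
cell.reset()
after_reset = (cell.q, cell.pin)    # (Q, pin) once reset has been applied
cell.preset()
after_preset = (cell.q, cell.pin)   # (Q, pin) once preset has been applied
print(after_reset, after_preset)
```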
If your application requires all the outputs to start out Low, use preset instead of reset. In the following example, the output is defined as positive, and a 1 and a 0 are passed through the device.

State Machine Syntax

ABEL supports state machine syntax on the CY7C330. The only drawback is that you can only use the toggle flip-flop emulation mode for very simple state machines. Up to revision 3.1, the results of using state machine syntax with T flip-flop emulation are unpredictable. The T flip-flop is efficient for state machines because it holds its state unless told otherwise and thus needs a product term only for a state change. In contrast, a state machine using D flip-flops needs a product term both to change states and to hold states. Even with this limitation, the CY7C330 contains from nine to 19 product terms per output and usually handles a medium-size state machine with ease.

Table 2. Node Numbers of Buried Registers

    Buried Register   Node Number   Product Terms
    1                 31            13
    2                 32            17
    3                 33            11
    4                 34            19

Simulation Caveat

Be aware of a limitation to what ABEL can simulate. Specifically, when writing simulation test vectors, you can use only one of the three clock lines on a single test-vector line. The following example does not simulate correctly:

    TEST_VECTORS ([CKS,CK1,CK2,INP] -> [OUT])
    [C, C, 0, 0] -> [0];

The following modified version does simulate correctly:

    TEST_VECTORS ([CKS,CK1,CK2,INP] -> [OUT])
    [0, C, 0, 0] -> [X];
    [C, 0, 0, X] -> [0];

ABEL supports the preload function. Refer to the 15-bit counter example for more information on how to use it.

16-Bit Up/Down Counter

This application, COUNTER6, is an example of a 15-bit up counter with a terminal-count output. The application shows how to use ABEL's "COUNT = COUNT + 1" syntax and corrects the polarity problem that crops up when combining normal I/O macrocell output registers and buried registers. This example also illustrates how to use the preload function.
The ABEL source code for this example appears in Appendix B.

State-Machine-Based Modulo-11 Counter

This example is a state machine application implementing a modulo-11 counter using state machine syntax. This example again shows how to handle polarity using both normal registers and buried registers. Appendix C lists the ABEL source code for this example.

Appendix A. P330.INC Macro Listing

    " P330.INC
    "The following select Clock 2 (pin 3) for the Output Macrocell Input register.
    CLK2_28 macro () {FUSES[17030] = 1;}
    CLK2_27 macro () {FUSES[17034] = 1;}
    CLK2_26 macro () {FUSES[17037] = 1;}
    CLK2_25 macro () {FUSES[17041] = 1;}
    CLK2_24 macro () {FUSES[17044] = 1;}
    CLK2_23 macro () {FUSES[17048] = 1;}
    CLK2_20 macro () {FUSES[17051] = 1;}
    CLK2_19 macro () {FUSES[17055] = 1;}
    CLK2_18 macro () {FUSES[17058] = 1;}
    CLK2_17 macro () {FUSES[17062] = 1;}
    CLK2_16 macro () {FUSES[17065] = 1;}
    CLK2_15 macro () {FUSES[17069] = 1;}
    "The following enables clock 2 (pin 3)
    CLK2    macro () {FUSES[17070] = 1;}
    CLK2_4  macro () {FUSES[17072] = 1;}
    CLK2_5  macro () {FUSES[17073] = 1;}
    CLK2_6  macro () {FUSES[17074] = 1;}
    CLK2_7  macro () {FUSES[17075] = 1;}
    CLK2_9  macro () {FUSES[17076] = 1;}
    CLK2_10 macro () {FUSES[17077] = 1;}
    CLK2_11 macro () {FUSES[17078] = 1;}
    CLK2_12 macro () {FUSES[17079] = 1;}
    CLK2_13 macro () {FUSES[17080] = 1;}
    CLK2_14 macro () {FUSES[17081] = 1;}
    "The following program the C3 bit in the Output Macrocell and select feedback from the lower pin.
    FEEDPIN_27 macro () {FUSES[17031] = 1;}
    FEEDPIN_25 macro () {FUSES[17038] = 1;}
    FEEDPIN_23 macro () {FUSES[17045] = 1;}
    FEEDPIN_19 macro () {FUSES[17052] = 1;}
    FEEDPIN_17 macro () {FUSES[17059] = 1;}
    FEEDPIN_15 macro () {FUSES[17066] = 1;}

Appendix B. ABEL Source Code for the 16-Bit Counter Example

    module counter6
    title 'Counter application for CY7C330 application note - Cypress Semiconductor June 19, 1989'

    counter6 device 'p330';

    " This is an example of a 15-bit counter showing:
    " 1. How to handle the polarity when combining normal output registers and buried regs.
    " 2. How to use the 'count = count + 1' syntax.
    " 3. How to use preload for simulation vectors and handle the polarity inversion for the
    "    buried registers.

    " input pins
    clk,clk1,clk2,preset  pin 1,2,3,4;
    " output pins
    c0,c1,c2,c3,c4,c5,c6  pin 15,28,26,17,24,19,20;
    c11,c12,c13,c14       pin 25,18,16,27;
    tci                   pin 23;
    spreset               node 30;
    !c7,!c8,!c9,!c10      node 31,32,33,34;

    " macros
    c_cntr  = [c14,c13,c12,c11,c10,c9,c8,c7,c6,c5,c4,c3,c2,c1,c0];
    " this is used to handle the preload inversion of the buried registers. See test vectors below.
    c_cntrs = [c14,c13,c12,c11,!c10,!c9,!c8,!c7,c6,c5,c4,c3,c2,c1,c0];
    c,x,p   = .c., .x., .p.;

    equations
    spreset = preset;
    c_cntr := (c_cntr + 1);
    tci    := (c_cntr == 2346);

    " Example of using preset with simulation
    test_vectors ([clk,clk1,preset,c_cntrs] -> [c_cntr,tci])
    [0, 0, x, x   ] -> [x   , x];
    [0, c, 1, x   ] -> [x   , x];
    [c, 0, x, x   ] -> [0   , 0];
    [0, c, 0, x   ] -> [0   , 0];
    [c, 0, x, x   ] -> [1   , 0];
    [c, 0, x, x   ] -> [2   , 0];
    [c, 0, x, x   ] -> [3   , 0];
    [c, 0, x, x   ] -> [4   , 0];
    [c, 0, x, x   ] -> [5   , 0];
    [p, 0, x, 62  ] -> [x   , 0];
    [0, 0, x, x   ] -> [62  , 0];
    [c, 0, x, x   ] -> [63  , 0];
    [c, 0, x, x   ] -> [64  , 0];
    [c, 0, x, x   ] -> [65  , 0];
    [c, 0, x, x   ] -> [66  , 0];
    [c, 0, x, x   ] -> [67  , 0];
    [c, 0, x, x   ] -> [68  , 0];
    [p, 0, x, 2345] -> [x   , 0];
    [0, 0, x, x   ] -> [2345, 0];
    [c, 0, x, x   ] -> [2346, 0];
    [c, 0, x, x   ] -> [2347, 1];
    [c, 0, x, x   ] -> [2348, 0];
    [c, 0, x, x   ] -> [2349, 0];
    end

Appendix C. ABEL State Machine Source Code for Modulo-11 Counter

    module statem
    title 'Application Note State Machine Example - Cypress Semiconductor 5-12-89'

    statem device 'P330';
    clk1,clk2,clk3  pin 1,2,3;
    c1,c2           pin 15,16;
    res             pin 4;
    reset           node 30;
    !c3,!c4         node 31,32;
    count           = [c4,c3,c2,c1];
    c4,c3,c2,c1     istype 'feed_reg';
    c,x,z,h,l       = .c.,.x.,.z.,1,0;

    " This is an example of implementing a modulo counter using state machine syntax.
    " This example also shows how to use the hidden registers.

    " counter states
    s0  = ^b0000;   s1 = ^b0001;   s2 = ^b0010;
    s3  = ^b0011;   s4 = ^b0100;   s5 = ^b0101;
    s6  = ^b0110;   s7 = ^b0111;   s8 = ^b1000;
    s9  = ^b1001;   s10 = ^b1010;

    equations
    c4.pr = res;

    state_diagram [c4,c3,c2,c1]
    state s0:  goto s1;
    state s1:  goto s2;
    state s2:  goto s3;
    state s3:  goto s4;
    state s4:  goto s5;
    state s5:  goto s6;
    state s6:  goto s7;
    state s7:  goto s8;
    state s8:  goto s9;
    state s9:  goto s10;
    state s10: goto s0;

    test_vectors ([clk1,clk2,res] -> [count])
    [0, c, 1] -> [15];
    [c, 0, 0] -> [0];
    [0, c, 0] -> [0];
    [c, 0, 0] -> [1];
    [c, 0, 0] -> [2];
    [c, 0, 0] -> [3];
    [c, 0, 0] -> [4];
    [c, 0, 0] -> [5];
    [c, 0, 0] -> [6];
    [c, 0, 0] -> [7];
    [c, 0, 0] -> [8];
    [c, 0, 0] -> [9];
    [c, 0, 0] -> [10];
    [c, 0, 0] -> [0];
    end

Using ABEL to Program the Cypress CY7C331

This application note describes how to program the CY7C331 using Data I/O's ABEL. Each section of the application note describes a configuration and presents the relevant ABEL source code. (You can obtain all the examples presented in this application note from the Cypress Bulletin Board at (408) 943-2954. Retrieve the file 331APNT.EXE; it unarchives itself automatically.)

The information presented here can simplify the jobs of circuit designers, who are under a lot of pressure to shorten design cycles and fit numerous functions into a small footprint. The latest programmable logic devices (PLDs) give you the ability to increase circuit density with a reduced design cycle. When you combine multiple types of PLDs from multiple vendors on the same board, using a general programmable logic compiler such as ABEL makes a lot of sense. Unfortunately, as PLDs get more complex, the concept and implementation of a universal compiler become non-trivial.
A compiler vendor such as Data I/O must define a syntax that is both easy to use and powerful enough to accommodate hundreds of different PLDs. The ABEL PLD compiler succeeds with a vast array of features. It does an admirable job of supporting over 300 different types of PLD source equations with a multitude of different architectures.

The architecture covered in this application note is that of the Cypress CY7C331. This device belongs to a family of high-speed, high-density, 28-pin PLDs. Features such as individual set, reset, and clock product terms for each of the 24 registers make the device one of the most versatile PLDs on the market today.

Controlling the Output Enable

The CY7C331 has two different methods of controlling the output enable on each of the twelve outputs (see the CY7C331 diagram in Figure 1 of "Using the CY7C331 as a Waveform Generator"). Either pin 14 or a product term can control each output enable. Controlling the output enable by a product term means using any combination of inputs and outputs ANDed together. Because only one term is available, OR terms are not allowed in the equation. The advantage to using pin 14 rather than a product term is that the pin enables or disables the output buffers 5 ns faster. This is because the output enable signal does not travel through the array.

Any I/O pin (pins 15 - 28) used on the left side of an equation, by default, has its output enable programmed as asserted. For example:

    I1, OUT15 PIN 1, 15;
    EQUATIONS
    OUT15 = I1;

is the same as

    I1, OUT15 PIN 1, 15;
    EQUATIONS
    OUT15 = I1;
    OUT15.OE = 1;

If you use the direct connection to pin 14, the signal must be configured as active Low. The way ABEL configures the output enable mux depends on the equations. If the right-hand side of an ".OE" equation has just an inverted pin 14 on it, ABEL assumes you want to use the direct connection to pin 14. For example, the following equations use the direct connection to pin 14 (C0 = 1):

    I14, I15     PIN 14, 15;
    OUT15, OUT16 PIN 15, 16;
    EQUATIONS
    [OUT15,OUT16].OE = !I14;

The same example uses the product term array if you change the equation to:

    [OUT15,OUT16].OE = I15;

or

    [OUT15,OUT16].OE = I14 & I15;

or even:

    [OUT15,OUT16].OE = I14;

In some cases, you might want to use pin 14 to control the output enable, but for timing reasons use the product term array instead of the direct connection. ABEL allows you to do this by using an ISTYPE statement. In the following example, the output enable for pin 16 goes through the product term array, and pin 17 uses a direct connection:

    I2, I14      PIN 2, 14;
    OUT16, OUT17 PIN 16, 17;
    OUT16.OE     ISTYPE 'EQN';
    EQUATIONS
    [OUT16,OUT17].OE = !I14;
    OUT16 = I2;
    OUT17 = I2;
    TEST_VECTORS ([I2,I14] -> [OUT16,OUT17])
    [X, 0] -> [Z, Z];
    [0, 1] -> [0, 0];
    [1, 1] -> [1, 1];

Note that in most cases when an output register is buried and the I/O pin serves as an input, ABEL does not automatically disable the output enable. In fact, you cannot disable the output enable unless you define it with an ISTYPE 'EQN' statement. In the following example, the OUT15.OE = 0 statement does not disable the output enable unless the statement is preceded with OUT15.OE ISTYPE 'EQN':

    " The following code is for testing
    " polarity on the CY7C331.
    "input pins
    I1, CLK       PIN 1, 2;
    RES, PRE, OE  PIN 4, 5, 6;
    "output pins
    OUT15, OUT16  PIN 15, 16;
    OUT17, OUT18  PIN 17, 18;
    "constants
    C, X, Z = .C., .X., .Z.;
    OUT15.OE ISTYPE 'EQN';
    EQUATIONS
    " the example below shows using the
    " feedback from register 15 to
    " control the preset and set of
    " register 16.
    OUT15 := I1;
    OUT15.C  = CLK;
    OUT15.RE = RES;
    OUT15.PR = PRE;
    " The following statement is ignored
    " without previous istype 'eqn'.
    OUT15.OE = 0;
    OUT16.RE = OUT15;
    OUT16.PR = !OUT15;
    OUT16.OE = 1;
    TEST_VECTORS ([I1,CLK,RES,PRE] -> [OUT15,OUT16])
    [0, 0, 0, 0] -> [Z, 1];
    [0, C, 0, 0] -> [Z, 0];
    [1, C, 0, 0] -> [Z, 1];
    " This tests what happens to the
    " polarity of the register feedback
    " when you go from register to
    " transparent.
    [0, 0, 1, 1] -> [Z, 0];
    [1, 0, 1, 1] -> [Z, 1];

In general, it is advisable to use the ISTYPE 'EQN' for all I/O pins that use a product term to control the output enable, especially when trying to disable an output buffer.

Registered Output Only

You can use the CY7C331 macrocell as a registered output, without using the input register, as illustrated in the following example:

    "input pins
    D_INP, CLK PIN 1, 2;
    "output pins
    OUT15 PIN 15;
    "constants
    C, X, Z = .C., .X., .Z.;
    EQUATIONS
    OUT15 := D_INP;
    OUT15.C = CLK;
    TEST_VECTORS ([D_INP,CLK] -> OUT15)
    [X, 0] -> 1;
    [0, C] -> 0;
    [1, C] -> 1;

As shown in this example, the minimum requirement to configure an output into a register is the OUTPUT := INPUT equation and an equation describing where the clock is coming from. The latter is necessary because the CY7C331 has no dedicated clock pin. Because the following equations are ABEL defaults, you do not need to explicitly define them:

    OUT15.RE = 0;   "disable reset
    OUT15.PR = 0;   "disable preset
    " permanently enable output buffer
    OUT15.OE = 1;

The next example uses all the output register's features. For example, you can dynamically switch from registered mode to combinatorial and back to registered. Although the ABEL simulation always shows the register returning to the same state when switching from combinatorial to registered mode, the actual state varies from device to device. Also note that this example adds OUT17 to show that even when the pin 15 output buffer is disabled, the register's state still feeds back to the product term array via the feedback mux. The ABEL default for the feedback mux in the registered mode is to take information from the register (C1 = 0).

    "input pins
    D_INP, CLK    PIN 1, 2;
    RES, PRE, OE  PIN 3, 4, 5;
    "output pins
    OUT15, OUT16  PIN 15, 16;
    OUT17, OUT18  PIN 17, 18;
    "constants
    C, X, Z = .C., .X., .Z.;
    EQUATIONS
    " OUT15 is using the output register in both registered
    " and combinatorial mode by manipulating the
    " set and reset terms.
    OUT15 := D_INP;
    OUT15.C  = CLK;
    OUT15.RE = RES;
    OUT15.PR = PRE;
    OUT15.OE = OE;
    OUT17 = OUT15;
    TEST_VECTORS ([D_INP,CLK,RES,PRE,OE] -> [OUT15,OUT17])
    [0, 0, 0, 0, 0] -> [Z, 1];
    " with no external help, the registers initialize to the
    " reset state, which means the outputs are high,
    " because of the non-bypassable inverter in
    " the output path.
    [0, 0, 0, 0, 1] -> [1, 1];
    [0, 0, 0, 1, 1] -> [0, 0];
    [0, 0, 1, 0, 1] -> [1, 1];
    [0, C, 0, 0, 1] -> [0, 0];
    [1, C, 0, 0, 1] -> [1, 1];
    " The register becomes combinatorial
    " when the reset and preset are both asserted
    [0, 0, 1, 1, 1] -> [0, 0];
    [1, 0, 1, 1, 1] -> [1, 1];
    " this is the state the register returned to
    " when going from combinatorial to registered mode.
    [0, 0, 0, 0, 1] -> [0, 0];

Remember that the ABEL default for the feedback mux in the registered mode is to take information from the register (C1 = 0). This is not the case when you configure the output register as transparent, however, as shown in the next example. When you configure the output register as transparent, the input register path data is automatically fed to the product term array (C1 = 1). Because ABEL also defaults to transparent input registers, the data fed to the product term array is not the same as the registered output data. You can feed data back to the product term array from before the output buffer, even when the output register is configured as transparent, by using an ISTYPE 'FEED_REG' statement:

    "input pins
    I6, I7        PIN 6, 7;
    "output pins
    OUT16, OUT18  PIN 16, 18;
    "constants
    C, X, Z = .C., .X., .Z.;
    OUT16 ISTYPE 'FEED_REG';
    EQUATIONS
    OUT16 = I6;
    OUT16.OE = I7;
    OUT18 = OUT16;
    TEST_VECTORS ([I6,I7] -> [OUT16,OUT18])
    [0, 0] -> [Z, 0];
    [1, 0] -> [Z, 1];
    [0, 1] -> [0, 0];
    [1, 1] -> [1, 1];

If you omit the FEED_REG statement, an error occurs in the simulation. The FEED_REG statement changes the feedback-mux configuration bit from one to zero (C1 = 0).
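The feedback-mux defaults described above can be summarized as a small decision function. This is a Python sketch rather than ABEL, offered only as an illustration: the function names `feedback_c1` and `array_feedback` are hypothetical, and the model covers only the three cases the text describes (registered output, transparent output, and transparent output with ISTYPE 'FEED_REG').

```python
def feedback_c1(output_registered: bool, feed_reg: bool = False) -> int:
    """Return the feedback-mux bit C1 that the text says ABEL selects.

    0: feed back from the output-register side.
    1: feed back the input-register path (data arriving from the pin).
    """
    if output_registered:
        return 0  # registered output: default is C1 = 0
    if feed_reg:
        return 0  # transparent output with ISTYPE 'FEED_REG': C1 forced to 0
    return 1      # transparent output, no override: default is C1 = 1


def array_feedback(c1: int, register_side: int, input_path: int) -> int:
    """Select the value that reaches the product-term array for a given C1."""
    return register_side if c1 == 0 else input_path


# Transparent output whose register side holds 1 while the pin path carries 0.
default_c1 = feedback_c1(output_registered=False)
override_c1 = feedback_c1(output_registered=False, feed_reg=True)
seen_default = array_feedback(default_c1, register_side=1, input_path=0)
seen_override = array_feedback(override_c1, register_side=1, input_path=0)
print(default_c1, seen_default, override_c1, seen_override)
```

With the default setting the array sees the pin-side value, and with the override it sees the register-side value, which is the difference the FEED_REG example relies on.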
Transparent Input Only

The ABEL 3.2 default is to make the input register transparent. Thus, to specify an I/O macrocell as a combinatorial input, place the specification on the right side of an equation:

    "INPUTS
    INP16, OUT18 PIN 16, 18;
    EQUATIONS
    OUT18 = INP16;
    TEST_VECTORS (INP16 -> OUT18)
    0 -> 0;
    1 -> 1;

In this example, only one operator (=) serves to configure both registers as transparent. This method works because the equals sign controls only the output register configuration (OUT18), which is possible because the default configuration for an input register is transparent. Changing the "=" to ":=" changes the pin-18 output register from transparent to registered, but does not affect the pin-16 input register.

Combinatorial Output Only

ABEL allows you to configure the output register as transparent by using the "=" symbol instead of ":=" in the equations, as this example shows:

    "input pins
    I1 PIN 1;
    "output pins
    OUT15 PIN 15;
    "constants
    C, X, Z = .C., .X., .Z.;
    EQUATIONS
    OUT15 = I1;
    TEST_VECTORS (I1 -> OUT15)
    0 -> 0;
    1 -> 1;

In this example, the following equations are ABEL defaults, and you do not have to write them. Including these equations does not cause an error.

    OUT15.PR = 1;   "set and reset
    OUT15.RE = 1;   "high = transparent.
    OUT15.OE = 1;   "enable on.

The Macrocell as a Registered Input Only

To change an input register from transparent to registered, you configure the register using its node number. Table 1 lists the node assignment for each register. To use an input register as a register, place the signal on the right side of the equation and add the rest of the terms needed. In the following example, INP17 is a registered input pin. The register itself is called INP17REG.
OUT19 is a transparent or combinatorial output.

    "pin definitions
    INP17, RESET PIN 17, 3;
    SET, CLK     PIN 4, 5;
    OUT19        PIN 19;
    INP17REG     NODE 145;
    EQUATIONS
    OUT19 = INP17;
    INP17REG.C  = CLK;
    INP17REG.PR = SET;
    INP17REG.RE = RESET;
    TEST_VECTORS ([INP17,CLK,SET,RESET] -> OUT19)
    [X, X, 0, 1] -> 0;
    [X, X, 1, 0] -> 1;
    [0, C, 0, 0] -> 0;
    [1, C, 0, 0] -> 1;
    [0, X, 1, 1] -> 0;
    [1, X, 1, 1] -> 1;

To access the data stored in an input register, use the pin name. Access the set, reset, and clock using the input register node name.

Burying the Output Register/Registered Input

The CY7C331 allows you to bury an output register and still use the pin as a registered input by using the shared-input mux. The CY7C331 provides a shared-input mux between pins 15 and 16, 17 and 18, 19 and 20, etc. Thus, there are three paths into the product term array for every pair of macrocells. You therefore cannot bury both of a pair's output registers and still use the pin as an input. If you bury the output register at pin 15 and use the pin for an input, for example, you cannot bury the output register at pin 16 and also use the pin for an input.

Use the pin name to access the information fed back to the product term array from the output register. Use the node number of the shared-input mux to access the input data coming from the pin and passing through the input register. The shared-input mux node number assignments appear in Table 2. The shared-input mux can take information from one of the two macrocells. ABEL defaults to selecting the macrocell of the even pin number. However, macros are available that select the odd pin's macrocell. You can access these macros by using the following syntax:

    LIBRARY 'P331';
    FEEDPIN_27;

Table 1. CY7C331 Input Register Node Assignments

    Pin Number   Register Node
    15           143
    16           144
    17           145
    18           146
    19           147
    20           148
    23           149
    24           150
    25           151
    26           152

The LIBRARY statement inserts a copy of all the possible CY7C331 macros into the source during compilation.
You can observe the result by looking at the listing file (.LST). The FEEDPIN_27 statement selects pin 27 to pass through the shared-input mux, overriding the default, which is pin 28. The following code is the complete listing of a test program that shows how to bury a register and employ the pin as an input, using macros to change the shared-input mux:

    module test3
    title 'CY7C331 test programs for applications note Cypress Semiconductor Inc. 3/16/90'
    TEST3 DEVICE 'P331';
    "This is an example of burying the output register
    "of a CY7C331 and using the I/O pin as an input.
    "input pins
    I1, CLK1, CLK2  PIN 1, 2, 3;
    RES, PRE, OE    PIN 4, 5, 6;
    CLK3            PIN 7;
    "output pins
    OUT15, OUT16    PIN 15, 16;
    OUT17, OUT18    PIN 17, 18;
    "constants
    C, X, Z = .C., .X., .Z.;
    "LIBRARY statement is used to access the macros
    "needed to change the shared-input mux selection.
    LIBRARY 'P331';
    "Data from pin 1 gets clocked through the buried
    "register on pin 15, and output on pin 16.
    "Output register 15 is configured as a register and the
    "pin 16 output register is transparent.
    "Data also gets input on pin 15 and output on pin 17.
    "Both are configured as registers.
    INP15REG  NODE 143;
    INP15MUX  NODE 29;
    OUT15.OE  ISTYPE 'EQN';
    OUTPUTS = [OUT16,OUT17];
    FEEDPIN_15;
    EQUATIONS
    OUT17 := INP15MUX;
    OUT17.C = CLK3;
    INP15REG.C = CLK2;
    OUT15 := I1;
    OUT15.C = CLK1;
    OUT15.RE = 0;   " disable reset,
    OUT15.PR = 0;   " preset, and oe
    OUT15.OE = 0;
    OUT16 = OUT15;
    TEST_VECTORS ([I1,CLK1,OUT15,CLK2,CLK3] -> [OUTPUTS])
    [X, 0, 0, 0, 0] -> [1, 1];
    [0, C, X, 0, 0] -> [0, 1];
    [1, C, X, 0, 0] -> [1, 1];
    [X, 0, 0, C, 0] -> [1, 1];
    [X, 0, X, 0, C] -> [1, 0];
    [X, 0, 1, C, 0] -> [1, 0];
    [X, 0, X, 0, C] -> [1, 1];
    END

The ABEL 3.2 compiler contains a bug that relates to this example. If you remove the line OUT15.OE ISTYPE 'EQN';, the code compiles and simulates correctly.
However, if you look at the resulting JEDEC map for the equations, the output buffer for pin 15 is enabled, which should cause the simulation to fail. Contact Data I/O for more information.

When you use macros, be cautious about several aspects of ABEL. In equations, for instance, the ABEL parser allows spaces between the end of the equation and the semicolon. However, you must place a semicolon immediately after a library statement and a macro. The parser does not allow a space between a semicolon and a library statement or a macro. Additionally, because the key words of the macros that are accessed using the library statement are in upper case, you must put all references to the macros (e.g., FEEDPIN_27) in upper case. This is the only place where ABEL is case sensitive. Finally, although you can put the library statement anywhere in the source code's declaration section, you must put macros last in the declaration section, before the equations section.

Table 2. CY7C331 Shared Input Mux Node Assignment

    Pin Numbers   Shared Input Mux Node
    15/16         143
    17/18         144
    19/20         145
    23/24         146
    25/26         147

Transparent Output with Registered Input

This example shows how to configure a buried transparent output register with a registered input. As described in the earlier section on transparent output registers, when you configure the output as transparent, the feedback to the product term array passes through the input register, unless programmed otherwise. The following code shows how to override the default using the ISTYPE 'FEED_REG' statement. (Note that in the input section of the simulation, OUT15 represents the data being input on pin 15. This representation is somewhat confusing because in the equations OUT15 refers to the information coming from the pin-15 output register. See the simulation section of this application note for an explanation of this apparent discrepancy.)

    "input pins
    I1, CLK2      PIN 1, 2;
    CLK3          PIN 3;
    "output pins
    OUT15, OUT16  PIN 15, 16;
    OUT17, OUT18  PIN 17, 18;
    "constants
    C, X, Z = .C., .X., .Z.;
    LIBRARY 'P331';
    "Input data from pin 1 goes through the buried
    "register on pin 15, and is output on pin 16.
    "Output registers 15, 16 are configured as transparent.
    "Data is also input on pin 15 and output on pin 17.
    "Pin 15 input, pin 17 output are registered.
    INP15REG  NODE 143;
    INP15MUX  NODE 29;
    OUT15     ISTYPE 'FEED_REG';
    FEEDPIN_15;
    EQUATIONS
    OUT17 := INP15MUX;
    OUT17.C = CLK3;
    INP15REG.C = CLK2;
    OUT15 = I1;
    OUT15.OE = 0;
    OUT16 = OUT15;
    TEST_VECTORS ([I1,OUT15,CLK2,CLK3] -> [OUT16,OUT17])
    [0, X, 0, 0] -> [0, 1];
    [1, X, 0, 0] -> [1, 1];
    [1, 0, C, 0] -> [1, 1];
    [1, X, 0, C] -> [1, 0];
    [1, 1, C, 0] -> [1, 0];
    [1, X, 0, C] -> [1, 1];
    "end

Using the CY7C331 for Counting

You can use the CY7C331 to create a synchronous counter. The only limitation to using the device in a synchronous mode is that all feedback must be internal to the part, because the input-data hold time is not compatible with the output-data hold time. ABEL provides many ways to implement a counter, including describing it explicitly in D or T flip-flop form. The following example shows how to use the "count = count + 1" capability with the CY7C331 to implement a basic counter. The ABEL compiler uses the CY7C331's XOR gate to implement T flip-flops without any external instructions such as ISTYPE 'REG_T'.

    "input pins
    I1, CLK2, CLK3  PIN 1, 2, 3;
    RES, PRE, OE    PIN 4, 5, 6;
    "output pins
    OUT15, OUT16    PIN 15, 16;
    OUT17, OUT18    PIN 17, 18;
    "constants
    LIBRARY 'CONSTANT';
    COUNT = [OUT18, OUT17, OUT16, OUT15];
    EQUATIONS
    " Example of 4-bit counter
    " that starts and wraps around at 15.
    COUNT.C = CLK2;
    COUNT := COUNT + 1;
    " Example of how to use set and reset with this form
    COUNT.RE = RES;
    COUNT.PR = PRE;
    TEST_VECTORS (CLK2 -> COUNT)
    0 -> 15;
    C -> 0;
    C -> 1;
    C -> 2;
    C -> 3;
    TEST_VECTORS ([CLK2,RES,PRE] -> COUNT)
    [0, 0, 0] -> 3;
    [0, 0, 1] -> 0;
    [0, 1, 0] -> 15;
    [0, 0, 1] -> 0;
    [C, 0, 0] -> 1;
    [C, 0, 0] -> 2;
    [C, 0, 0] -> 3;
    "end

Polarity Issues

The CY7C331's outputs do not have programmable polarity control in the same sense as the 22V10. The CY7C331 has a hard-wired inverter between the output register and the output pin that results in an active-Low output. You generally control the device's polarity using the XOR gate located in front of the output register.

ABEL makes polarity control transparent by allowing you to write equations with both positive- and negative-polarity outputs. Most of the examples in the previous sections, for instance, had active-High outputs. But hard-wired polarity becomes an issue when using set and reset. Keep in mind that a reset causes the output to go High. ABEL takes care of the necessary inversions in the device to get the correct output polarity. This operation can be tricky when the internal feedback from a register controls another register's set or reset. Because both polarities are available in the product term array, it is not obvious which polarity should be used. Refer to the last example in the "Controlling the Output Enable" section of this application note for an example of indirect set and reset control.

Although the CY7C331 has active-Low outputs, defining the outputs active High (using OUT15 ISTYPE 'POS') sometimes causes ABEL's Reduce module to create equations that suit the CY7C331 better. This effect is especially true when you use the XOR gate. Refer to pages 3 - 4 in the ABEL 3.2 User Notes for more information.

Simulation

Simulation is very important with a part as versatile as the CY7C331. All the examples in this application note have been simulated to verify their function. The ABEL simulator is powerful enough to simulate most of the configurations possible with the CY7C331. For example, the simulator supports multiple clock inputs controlling different registers. An application that illustrates this capability is a ripple counter. This counter has the clock input driven from the previous stage's output, with the least-significant bit driven by an external clock. The following is an example of a 4-bit decrementing ripple counter implemented in the CY7C331.

    "input pins
    CLK2, RESET   PIN 2, 3;
    "output pins
    OUT15, OUT16  PIN 15, 16;
    OUT17, OUT18  PIN 17, 18;
    "constants
    LIBRARY 'CONSTANT';
    COUNT = [OUT18, OUT17, OUT16, OUT15];
    EQUATIONS
    "example of a 4-bit ripple counter that starts at 15
    "and wraps around at 0.
    COUNT.RE = RESET;
    OUT15.C = CLK2;
    OUT15 := !OUT15;
    OUT16.C = OUT15;
    OUT16 := !OUT16;
    OUT17.C = OUT16;
    OUT17 := !OUT17;
    OUT18.C = OUT17;
    OUT18 := !OUT18;
    TEST_VECTORS ([CLK2,RESET] -> COUNT)
    [0, 0] -> X;
    [0, 1] -> 15;
    [C, 0] -> 14;
    [C, 0] -> 13;
    [C, 0] -> 12;
    [C, 0] -> 11;
    [C, 0] -> 10;
    [C, 0] -> 9;

The CY7C331 powers up with all registers in the reset state. The simulator, in most cases, mimics the device power-up characteristics. However, in certain applications, including the previous one, the simulation consistently initializes to a non-reset state.

Another interesting problem with simulating the CY7C331 is naming the input data when you bury the output register and use an I/O pin as an input. Although the input-register data is accessed in the equations using a node name, the ABEL simulator only works with pin names.
In this application note's "Transparent Output with Registered Input" section, the example's equations section uses the node name (INP15MUX) to access the data being input on pin 15; the pin name (OUT15) is used to represent the data from the output register, which is fed back to the product-term array and then to pin 16. In the simulation section, however, OUT15 now represents the data being input on pin 15. The ABEL simulator is smart enough to know which data you are referring to.

Remember that simulation preload does not work with registered asynchronous parts such as the CY7C331 or 20RA10. However, if your design has an extra input, you can preset to a specific value by using the reset and preset product terms individually. For example:

    "input pins
    CLK, PRE      PIN 2, 3;
    "output pins
    OUT15, OUT16  PIN 15, 16;
    OUT17, OUT18  PIN 17, 18;
    OUT19, OUT20  PIN 19, 20;
    "constants
    LIBRARY 'CONSTANT';
    COUNT  = [OUT20, OUT19, OUT18, OUT17, OUT16, OUT15];
    PRESET = [OUT20, OUT19, OUT18, OUT17, OUT16, OUT15].RE;
    RESET  = [OUT20, OUT19, OUT18, OUT17, OUT16, OUT15].PR;
    EQUATIONS
    COUNT.C = CLK;
    COUNT := COUNT + 1;
    WHEN (PRE == 1) THEN PRESET = [1,0,1,0,1,1];
    WHEN (PRE == 1) THEN RESET  = [0,1,0,1,0,0];
    TEST_VECTORS (CLK -> COUNT)
    0 -> 63;
    C -> 0;
    C -> 1;
    C -> 2;
    C -> 3;
    TEST_VECTORS "preload simulation test
    ([CLK,PRE] -> COUNT)
    [0, 0] -> 3;    "remembers from previous sim.
    [C, 0] -> 4;
    [0, 1] -> 43;
    [C, 0] -> 44;
    [C, 0] -> 45;
    "end

Using LOG/iC to Program the CY7C330

This application note provides you with a running start towards using the LOG/iC design synthesis tool for designs using the Cypress CY7C330 programmable logic device. Of the steps required for implementing designs using PLDs, generating JEDEC files from high-level descriptions is probably the most time consuming.
Unfortunately, the documentation that comes with many high-level synthesis packages does not provide enough detailed information to use advanced PLDs without a significant learning curve. Although the LOG/iC documentation is quite good, this application note should help flatten the LOG/iC learning curve further.

Isdata's LOG/iC is an advanced universal logic synthesis program that generates designs targeted for PROMs, PLDs, and gate arrays. The LOG/iC package's basic algorithms were developed in the Electrical Engineering Department of the University of Karlsruhe, West Germany. Although a relative newcomer to the PLD software market in the U.S., LOG/iC has become very popular in Europe. LOG/iC is available for a variety of operating environments, including PC DOS and SPARC-based SunOS platforms.

The software is available as four different packages with two options. The first (PLC) package supports PAL designs. It offers input in either equations or tables, with syntax constructs that include address ranges and functional blocks. Also available are hexadecimal, decimal, octal, and binary representations. The second (PLUS) package extends the first and supports the design of sequential controllers via inclusion of the FSM (finite state machine) syntax. This package includes an automatic test vector generation feature. Package three (PERFECT) extends support to include design partitioning across multiple devices. Package four (GATES) supports the design of multi-level-structure gate arrays by producing netlists from the various input formats. The two option packages offered are the Functional Verifier and the PLD Database. This application note deals with the functions available in package two (PLUS).

LOG/iC Language Overview

LOG/iC offers three different entry methods: Boolean equations, truth tables, and FSM. Declarations partition an input file into sections. Additionally, designs can be logically partitioned into functional blocks within a LOG/iC design file.
These options are described briefly before proceeding to CY7C330-specific information.

Declarations

Declarations are directives to the LOG/iC compiler that identify the design, indicate the inputs and outputs, specify compiler options, assign pin numbers to variables, and specify the type of input format. These declarations separate the input file into discrete sections that describe the design's various aspects. LOG/iC declarations consist of a key word preceded by an asterisk (*).

The first section is the *Identification section, where you enter comments regarding the function of the design, etc. Variable declarations follow this information. LOG/iC supports input, output, local, and state variables for both Mealy and Moore machines. You can specify variables in ranges for compactness of expression, such as Address[0..31]. Variables can also have special function extensions that control the function of the device, such as RAS.OE.

Following the variable declarations is the design description. It is denoted by the declaration *Boolean-Equations, *Function-Table, or *Flow-Table for Boolean, truth table, and FSM entry methods, respectively. In the design description, you specify the circuit's function. Drawing an analogy to programming a computer in a high-level language, you could say that most of the other declarations describe the circuit's variables, while the design description implements the algorithm that performs the function you wish to create.

Next are the *PLD, *PINS, *Run-Control, and *END sections. The *PLD declaration describes the device type targeted for use in this design. *PINS controls the assignment of the external variables to device pins. Finally, *Run-Control provides compiler directives, and *END signifies the end of the design file.

Boolean Design Entry

The simplest design entry method is by Boolean equations. Table 1 shows the operators supported by LOG/iC in order of precedence. The labels X, Y, and Z can represent either a single variable or a range of variables.

Table 1. LOG/iC Operators

Operation             Symbol   Example
Unregistered Output   =        Z = X;
Registered Output     :=       Z := X;
Negation              /        Z = /X;
AND                   &        Z = X & Y;
OR                    +        Z = X + Y;
XOR                   #        Z = X # Y;
Constant 1            VCC      Z = VCC;
Constant 0            GND      Z = GND;

Logic polarity often creates an amazing amount of confusion for a methodology that has only two values. LOG/iC removes the burden of considering whether a given signal is active Low or High, because Boolean equations always have a positive polarity. Thus, if a given input variable is specified without a '/', that variable is deemed to be true independently of the active level of the signal on the pin. LOG/iC deals with signals that are active Low via the *Level declaration. You therefore write equations for an active-Low signal exactly the same as those for an active-High signal. The *Level section identifies the polarity of given input signals and manages negative/positive polarity issues for you.

Another useful aspect of Boolean entry is the use of ranges, which provide a compact way of referring to many variables at once. Typical examples include references to address or data buses. Figure 1 shows an example of Boolean entry that utilizes variable ranges. This example features an 8-bit data bus whose values are captured in a register when a load command is issued.

*Identification
   Parallel Load Register with acknowledge
   MMA - Cypress Semiconductor
*X-Names
   Load, Data[0..7];
*Y-Names
   Qout[0..7], ACK;
*Boolean-Equations
   Qout[0..7] := Load & Data[0..7];
   ACK = Load;

Figure 1. Boolean Entry Example

Truth Table Entry

Truth table entry represents one of the most compact entry methods to describe a combinatorial system. With this entry format, you map the outputs as a function of the input variables. The basic format of truth tables appears in Figure 2.

This example contains several noteworthy characteristics. The first is the ordering of the inputs and outputs. Note that the labels after the key word "*Function-Table" are comments, indicated by the leading semicolon (;). Thus, the ordering of the X and Y variables in the *X-Names and *Y-Names declarations specifies their ordering in the function table. If you want some other ordering, you can specify it with a header. A header is a logical line preceded by the dollar sign symbol ($). When using a header, you separate the variables into fields delimited by commas.

*Identification
   Truth Table Example
   MMA - Cypress Semiconductor
*X-Names
   X[6..1];
*Y-Names
   Y[1..4];
*Function-Table
   ; Input Side (X6 X5 X4 X3 X2 X1)    Output Side (Y1 Y2 Y3 Y4)
   [function-table rows illegible in the source scan; the table ends with a REST line]

Figure 2. Truth Table Example

Figure 3 shows an example in which a header changes the variable ordering. This example uses two important constructs that can assist in reducing the logic design to the minimum number of product terms. The first construct is the Don't Care entry, designated by a hyphen (-), which appears on both the input and output sides of the table. The use of the Don't Care input is unique to the function table entry method and can significantly improve the compiler's ability to produce minimized logic. Note that Don't Cares are only available when using bit fields and that the table ends with the word "REST" on the input side.

The use of the REST statement stems from the fact that, to uniquely identify all possible input patterns with N input variables, you would require 2^N table entries. A single Don't Care in any given line represents two entry lines rather than one. The REST statement provides a brief way to specify all remaining possible input values and the output that those values should produce.

*Identification
   Truth Table Example with header
   MMA - Cypress Semiconductor
*X-Names
   X[6..1];
*Y-Names
   Y[1..4];
*Function-Table
   ; Input Side                  Output Side
   $ X6, X5, X4, X3, X1, X2      Y4, Y3, Y2, Y1
   [function-table rows illegible in the source scan; entries use 0, 1, and - (Don't Care), and the table ends with a REST line]

Figure 3. Truth Table with Header

The header line has an additional benefit beyond merely changing the order of bit data. You can also use the header line to indicate logical groupings of data as fields. Data that is not entered in groups must be entered as binary data. Grouped variables, however, can represent input data in binary, octal, decimal, or hexadecimal representations. Suffixes that indicate the data format appear in Table 2. It is important to note that a field is always totally occupied by a number; if necessary, leading zeros are added to completely fill the field.

In addition to fields, function tables allow the use of ranges. This feature permits efficient implementation of address decoders (Figure 4). The function table for this decoder specifies the address as ordered from 15..0. This order is significant because it is the same order as that of the hexadecimal numbers entered in the ranges, when you view the hexadecimal numbers as individual bits. Also note the double parentheses surrounding the outputs in the header line, which label this field as a bit field, eliminating the need for separating commas.

Finite State Machine Entry

FSM entry is probably the design methodology that correlates best with the CY7C330's target application as a high-speed state machine. LOG/iC's documentation defines an FSM as a circuit that has combinatorial logic and state registers of arbitrary type that feed back to a combinatorial array. Add to this definition multi-clocked input registers that minimize set-up and hold time requirements and you have a high-level description of the CY7C330.
More generally described, state machines have memory elements that describe the present condition and inputs that influence both the transition to the next state and the outputs. FSMs are typically classified in two general categories: Moore and Mealy machines. LOG/iC differentiates between these types by stating that machines whose outputs might change arbitrarily within a state, even without a clock pulse, exhibit "Mealy behavior." Moore machines, on the other hand, have outputs that change only with the state clock and are free of glitches. This output is typified as "Moore behavior" and is characteristic of the CY7C330. These outputs are tied to the state clock and are referred to in LOG/iC as Z-variables.

Table 2. Numeric Base Indicator Suffixes

Suffix   Base
B        Binary (default - can be omitted)
O        Octal
Q        Octal (alternate - to eliminate confusion between O and 0)
D        Decimal
H        Hexadecimal

Four variables describe an FSM's behavior: the input variables' values, the present state, the output variables' values, and the next state. An FSM's variable declarations section has options for all these parameters. As in the previous entry methods, *X-Names describe the circuit's inputs. *Y-Names are values that exhibit Mealy behavior. *Z-Names are outputs that change relative to the state clock, as do the CY7C330's.

State information can assume one of two forms. The most common (and easiest) way to store the machine's state is to determine the total number of states required and dedicate N register bits (where 2^N is at least the number of states) to maintain state information. This method is reliable and produces discrete, non-overlapping state assignments. The disadvantage is that you must dedicate register resources (i.e., macrocells) that might have served better in another capacity. The second method available for state assignment is assignment of states based purely on the output values. This method requires more thought, as it is critical that all output patterns be unique.
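The two state-storage options just described each imply a simple check, sketched here in Python. This is an illustration only, not part of LOG/iC: the function names and the example state names are hypothetical. The first function gives the smallest N with 2^N at least the number of states; the second reports any output patterns shared by more than one state, which would rule out output-based state assignment.

```python
def min_state_bits(num_states):
    """Smallest N such that 2**N >= num_states (dedicated state registers)."""
    if num_states < 1:
        raise ValueError("a state machine needs at least one state")
    return (num_states - 1).bit_length()


def duplicated_output_patterns(outputs_by_state):
    """Return {pattern: [states...]} for every output pattern used more than once.

    An empty result means the outputs alone identify each state.
    """
    first_owner = {}
    clashes = {}
    for state, pattern in outputs_by_state.items():
        pattern = tuple(pattern)
        if pattern in first_owner:
            clashes.setdefault(pattern, [first_owner[pattern]]).append(state)
        else:
            first_owner[pattern] = state
    return clashes


# A counter whose outputs are the count value: every pattern is unique.
counter = {s: tuple(int(b) for b in format(s - 1, "04b")) for s in range(1, 12)}
print(min_state_bits(11), duplicated_output_patterns(counter))

# A hypothetical controller in which two states drive identical outputs.
controller = {"IDLE": (0, 0), "RUN": (0, 1), "WAIT": (0, 0)}
print(duplicated_output_patterns(controller))
```

In the second case the clash between IDLE and WAIT means the design would need at least one extra register bit, or a changed output pattern, before the outputs could double as the state.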
A design that meets this criterion on the first pass might not be realizable if you add features, or remove them in the case of undesirable "features."

LOG/iC can implement designs using either type of state assignment. The *State-Assignment directive provides the options of binary, number, gray, 1-out-of-N, and Z-variables. The binary option dedicates registers to state values and encodes the state values in binary. LOG/iC can do this encoding automatically, or you can specify the encoding explicitly. Using the number option ensures that the binary code for each state is the same as the state numbers used in the high-level description, i.e., state 1 = 001, etc. The gray option assigns the states using gray coding to minimize transitions. 1-out-of-N assignment again uses registers but does not binary-encode states; instead, each discrete register represents a single state. This approach is especially demanding on macrocell resources but minimizes the number of state bits switching at a single clock edge. Finally, the Z-variable option allows the output values themselves to represent the states.

*Identification
   Address Decoder Example
   MMA - Cypress Semiconductor
*X-Names
   Enable, Adr[0..15];
*Y-Names
   ROM[1..3], Port[1,2];
*Function-Table
   ; Input Side               : Output Side
   $ Enable, (Adr[15..0])     : ((ROM[1..3], Port[1..2]));
     1,                       : 111 -- ; Disabled
     0, 00000H..007FFH        : 011 11 ; ROM1 Selected
     0, 00800H..00FFFH        : 101 11 ; ROM2 Selected
     0, 01000H..017FFH        : 110 11 ; ROM3 Selected
     0, 08000H..08007H        : 111 01 ; I/O Port 1 Selected
     0, 08008H..0800FH        : 111 10 ; I/O Port 2 Selected
     0, 0F800H..0FFFFH        : 011 11 ; ROM1 (Shadow)
     REST                     : 111 11 ; Disabled

Figure 4. Address Decoder Function Table

You enter the FSM design as a table after the directive *Flow-Table. Each line in the flow table has as many as four fields separated by commas. These fields represent the present state, inputs, outputs, and next state. Not all designs require all four fields. Counters are good examples of applications that require only three fields to describe the machine, because the count value is the same as the state value. The order in which the fields appear is not significant, because a letter indicating the field type precedes each field. The letters S, X, Y, and F indicate the state-number, input, output, and next-state fields, respectively. A line in an FSM that describes part of a machine might look like this:

*Flow-Table
   S1, X 0 1, Y - 1, F2;

When in state 1, with inputs at 0 and 1, this machine causes the second output to go True and transitions to state 2. In this case, the first output is not relevant to the design.

In a large machine, many of the inputs and outputs might not be relevant to a subset of the machine's sequence of operations. Rather than force you to specify the status of all variables, LOG/iC has a directive that lets you specify which variables are significant. This statement is called Relevant and stays in effect until the next Relevant statement or until the end of the design. As an example, you can describe the simple machine as:

*Flow-Table
   Relevant = X1, X2 : Y2;
   S1, X 0 1, Y 1, F2;

Omitting Y1 from the Relevant statement indicates that Y1 is a Don't Care. If, instead, you want Y1 always to be off for the subsequent lines, you can state Y1 = 0.

Another powerful statement is Xrest. Similar to the REST statement in function tables, Xrest provides a brief way to assign all remaining non-specified input patterns and these conditions' desired output and next state.

You can also use ranges in flow tables for compact machine descriptions. In only three lines, the counter definition in Figure 5 completely specifies a state machine with 247 states through the use of ranges. The only limitation is the number of states that LOG/iC allows in a machine. The table-driven LOG/iC optimizer allows a maximum of 1024 states. For most true state machine applications, you would be hard pressed to fit 1024 states into a single PLD. But this syntax's attractiveness for use in counters as large as 16 bits (64K states) in the CY7C330 can lead you to run up against the 1024-state limitation in short order. Fortunately, LOG/iC can partition designs into blocks. This capability allows you to partition the design into smaller chunks that are optimized individually and merged after compilation. Blocks also tend to mimic optimal approaches to finding solutions by segmenting designs into smaller functional units (more on this later).

*Identification
   Counter with 247 states and overflow signal
   MMA - Cypress Semiconductor
*X-Names
   Reset;
*Y-Names
   Overflow;
*Z-Names
   Q[1..8];
*Flow-Table
   S[1..247], X 1, Y 0, F1         ; Reset condition
   S[1..246], X 0, Y 0, F[2..247]  ; Count
   S[247],    X 0, Y 1, F1         ; Overflow

Figure 5. FSM Counter

LOG/iC also includes a simple statement that determines the type of flip-flop for implementing the state registers via the *Flip-Flops directive. The default is D-FlipFlops, but the T-FlipFlops statement can also be used. The LOG/iC reduction algorithm automatically generates optimized equations for the flip-flop type specified. This capability is especially significant for the CY7C330, because LOG/iC understands how to use the XOR product term for both polarity control and T flip-flop creation. The CY7C330 can implement large counters extremely efficiently using T flip-flops automatically generated by LOG/iC.

Optimization Levels

You control LOG/iC's optimizer via the Compute and Nocompute statements, which you can place in the design file's *Run-Control section. Optimization levels are essentially binary. Nocompute allows you to indicate outputs for which you desire no reduction. Compute is complementary and allows you to explicitly specify the outputs you want reduced. Another directive, CPUTime = nn, allows you to specify the maximum amount of time the compiler can take to attempt an exact solution. After this time, the compiler computes approximated solutions.

CY7C330 Characteristics

Cypress's CY7C330 is a high-performance PLD optimized for state machine applications. It features a pipelined architecture that achieves a 66-MHz state transition speed. The device's 11 dedicated registered inputs offer small set-up and hold times. These versatile input registers can be clocked with either of two input clocks. You select the input clock by programming a configuration fuse unique to each input register. The CY7C330 has a total of three clock pins: two for the input registers and one for the output/state registers. This feature allows you to synchronize input data without using an external register. You can tie the clock pins together if you need only a single clock source.

The CY7C330 provides 12 I/O macrocells and four buried macrocells. The 12 I/O macrocells have an input register structure identical to that of the dedicated inputs. The outputs from the CY7C330 logic array feature variable product-term distribution with nine to 19 product terms per output. These product terms are XORed with an additional product term, which you can use for equations that require an XOR, polarity control, or T flip-flop implementation. A fuse-configurable feedback mux allows you to program the CY7C330 macrocell for feedback from the input register or the output register (buried).

The device's output enable is configurable for control via a product term or pin 14. This pin allows you to enable the output buffers asynchronously. Product term OE (output enable) is synchronous to the input register values that comprise the OE equation. You can also program this equation to permanently enable or disable the output buffer.

When the feedback is programmed for state-register (rather than input-register) buried feedback, you have an additional feedback connection between pairs of I/O macrocells. This connection provides an input path for the pin that would otherwise be lost. You thus have the flexibility of burying six of the 12 I/O macrocells and using the associated pins as dedicated inputs. The four hidden macrocells have the same product-term structure as the I/O macrocells, with fixed state-register feedback to the logic array. The CY7C330 also furnishes two product terms that permit you to set or reset all the state registers synchronously.

Selecting the CY7C330's Input Clock

The CY7C330's input registers are clocked with either pin 2 or 3. LOG/iC refers to these pins as CLK1 and CLK2, respectively. The default clock used is CLK1. To specify CLK2 instead, use the *Special Functions directive along with the .IC2 pin name suffix. Thus, to select CLK2 for input Fred, use the following syntax:

*Special Functions
   Fred.IC2 = YES;

Controlling Output Enable

The default for OE in LOG/iC is asynchronous, pin-14 control of the output buffer. If you use the macrocell for input only (pure input), the OE-select fuse is left intact, which selects OE from the product term. Because none of the product term fuses are blown, selecting OE from the product term results in the output driver being turned off. Finally, if you use the macrocell for both input and output, the OE again defaults to asynchronous, pin-14 control.

You have several options for changing this default behavior. First, you can use the OE special function. If the macrocell is called A0, then:

   A0.OE = 0;    ; Sets OE to synchronous product term control and permanently turns OFF the driver
   A0.OE = 1;    ; Sets OE to synchronous product term control and permanently turns ON the driver
   A0.OE = EQN;  ; Sets OE to synchronous product term control; output driver is controlled by the specified equation (EQN)

These constructs should allow you to create any desired OE configuration while maintaining readability. You can also use the FUSES statement to control the OE mux, as follows:

   ; BLOWN  Selects synchronous product term output buffer control
   ; INTACT Selects asynchronous pin 14 output buffer control
*Fuses;                  Pin #
   $17067 = INTACT;      15
   $17063 = INTACT;      16
   $17060 = INTACT;      17
   $17056 = INTACT;      18
   $17053 = INTACT;      19
   $17049 = INTACT;      20
   $17046 = INTACT;      23
   $17042 = INTACT;      24
   $17039 = INTACT;      25
   $17035 = INTACT;      26
   $17032 = INTACT;      27
   $17028 = INTACT;      28

Use of the XOR Product Term

LOG/iC supports use of the XOR product term to implement polarity control and T flip-flops. Polarity control is automatic for all entry formats and is controlled via the *Level directive. LOG/iC uses the XOR to create T flip-flops when you use the *Flip-Flops directive and specify T-FlipFlops. The LOG/iC optimizer then automatically produces reduced equations targeted at T flip-flops.

Controlling Synchronous Reset and Preset

The CY7C330 has a single product term that controls the synchronous resets of all of the state/output registers. Similarly, a single product term controls all the state/output registers' synchronous presets. These two product terms are controlled via the $PS and $RS statements in the *Boolean-Equations section. Avoid a potential pitfall by remembering that resetting the register to Zero causes a value of One to appear on the output pin because of the inverting output buffer. The following code shows the usage of the preset and reset statements, where variable Paul presets the register and variable Ray resets the register.

*Boolean-Equations
   $PS = Paul;
   $RS = Ray;

Macrocell Feedback

LOG/iC defaults to selecting feedback from the state register. If you use the macrocell as a pure input, feedback is automatically routed from the input pin register. Designs that use the macrocell state register and the input pin register can specify feedback via the .FBK function or FUSES statements.

As an example, say you use the state register as an adder, and the associated macrocell input-pin register holds a base value. In this case, you want to drive the result onto the output pins during normal operation, while the macrocell input register uses the feedback path to provide the base value to the adder equations. During base-value updates, you three-state the output buffers and clock a new value into the macrocell input registers. Because LOG/iC defaults to state-register feedback, you must select input-register feedback explicitly. The following statements configure the desired feedback:

   SUM3.FBK = PIN;

or

   ; BLOWN  Selects feedback from macrocell input register
   ; INTACT Selects feedback from macrocell output register
*Fuses;                  Pin #
   $17068 = BLOWN;       15
   $17064 = BLOWN;       16
   $17061 = BLOWN;       17
   $17057 = BLOWN;       18
   $17054 = BLOWN;       19
   $17050 = BLOWN;       20
   $17047 = BLOWN;       23
   $17043 = BLOWN;       24
   $17040 = BLOWN;       25
   $17036 = BLOWN;       26
   $17033 = BLOWN;       27
   $17029 = BLOWN;       28

Using the Shared-Input Feedback Mux

As mentioned previously, the CY7C330 has a shared-input feedback mux, which allows you to use a given macrocell for both input and output. This feature is useful for several configurations, such as when the state register is buried as an internal state bit that is fed back to the array, and the pin serves as a dedicated input. In this case, the OE product term is typically configured to disable the output buffer.

Another good application for the shared-input feedback mux occurs when you use the input register to hold a seldom-changed value used by the machine. For example, a counter might have an upper limit that is loadable. During normal operation, the output buffer OE is enabled and the count appears on the output pins. When a new limit is desired, the output is three-stated, and the limit value is clocked into the input register. The machine can then access this value via the shared-input feedback mux.

LOG/iC deals with these situations by referring to the state register as a buried node. LOG/iC provides a list of the node numbers and the pins they correspond to. The input to the macrocell is assigned to the pin number. Using this notation, LOG/iC automatically uses the shared-input feedback mux for the input. The following statements correctly configure and use the shared-input feedback mux for a buried macrocell that has a variable assigned to the state register named S1 and an input named X29:

*X-names
   X29;
*Y-names
   S1;
; Design entry here
*Pins
   X29 = 27;
*Nodes
   S1 = 15;

Remember that the shared-input feedback mux is available for only one of every pair of macrocells. Node numbers, the corresponding pin numbers, and their available product terms are as follows (h1 - h4 are the hidden macrocells):

Node:   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
Pin:   h1  h2  h3  h4  15  16  17  18  19  20  23  24  25  26  27  28
PTs:   19  11  17  13  09  19  11  17  13  15  15  13  17  11  19  09

Design Examples

An old adage about designing asserts that good engineers borrow and great engineers steal! Most often used to describe the practice of re-using existing software in a new design, this proverb applies equally well to doing new PLD designs. Also, examples tend to be the best way to flatten the learning curve for a new language. The examples that follow highlight features of LOG/iC and the CY7C330.

Example 1: Modulo-11 Counter

The ability of the CY7C330's XOR product term to implement T flip-flops proves ideal for building complex counters. This first example is a small modulo-11 counter: it counts through 11 states (values 0 through 10) and then wraps around to 0.
The design also features a clear, a hold, and count-up/down controls. Appendix A shows the LOG/iC source code for this design. This counter is an excellent example of the expression compactness with which LOG/iC describes designs.

The counter's four inputs are CLR, UP, HOLD, and OE. CLR resets the counter to zero when asserted. UP determines the direction in which the counter operates. HOLD causes the counter to stop at its current count value until HOLD is released. OE is tied to pin 14 and serves as an asynchronous output enable. The outputs are count bits Q0 - Q3. Note that the *Level statement has been used to indicate that OE and CLR are active Low. As noted earlier, the design needs no polarity conversion; LOG/iC automatically creates the proper reduced equations for these active-Low inputs.

Because each output value is unique, this design uses Z-Values state assignment. Thus, the states are the counter values. Examining the flow table, you can see that whenever CLR is active, the counter goes to state 1, which has a value of zero. Entered this way, the flow table values cause LOG/iC to use the macrocell product terms to implement the CLR function. You could use the CY7C330's preset and reset product terms to achieve the same result. This design falls well within the limit of nine to 19 product terms per output, however, and the design is very readable in the current format.

The next line in the flow table shows the counting-down state: if in state 1, wrap around to state 11; if in any other state, move to the next lower state. The third line does not have to restate the state section of the flow table, because no change occurs from the second line. The third line specifies the design's up counter: if in state 1 - 10, go to the next higher state; if in state 11, wrap around to state 1. The flow table's last line shows the hold state. Notice that the file contains statements for both T and D flip-flops.
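The flow-table behavior described above for the modulo-11 counter can be restated as a next-state function. The following Python sketch is an illustration of that behavior only, not LOG/iC output. It assumes logical input levels (1 means asserted, whatever the pin polarity), that states are numbered 1 through 11 with count values 0 through 10, and that hold takes effect whenever CLR is inactive. The function names are hypothetical.

```python
NUM_STATES = 11  # states 1..11 carry the count values 0..10


def next_state(state, clr, up, hold):
    """Return the next state for one state-clock edge.

    Mirrors the four flow-table cases: clear, count down, count up, hold.
    """
    if not 1 <= state <= NUM_STATES:
        raise ValueError("state must be in 1..11")
    if clr:
        return 1                                       # clear: back to state 1 (value 0)
    if hold:
        return state                                   # hold: stay put
    if up:
        return 1 if state == NUM_STATES else state + 1  # count up, wrap 11 -> 1
    return NUM_STATES if state == 1 else state - 1      # count down, wrap 1 -> 11


def count_value(state):
    """Output value presented on Q3..Q0 for a given state."""
    return state - 1


# Count up through one full cycle, starting from the cleared state.
state = 1
values = [count_value(state)]
for _ in range(NUM_STATES):
    state = next_state(state, clr=0, up=1, hold=0)
    values.append(count_value(state))
print(values)
```

Because the count value is a fixed function of the state, the outputs themselves can serve as the state encoding, which is the reason the design can use Z-Values state assignment.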
Keeping both statements in the file lets you comment one of the two out easily and see the number of product terms necessary for each type of design implementation. As expected, the T flip-flop design generates a more efficient counter implementation.

Example 2: 15-Bit Counter with Carry Out

The previous example generated a counter with a very compact design expression. If you want a larger counter, you might wish to borrow that example and edit the numbers to provide more count bits. Doing so quickly runs you into the wall of the 1024-state maximum, however. The solution is to use LOG/iC's block structure to partition the task into multiple smaller counters that cascade to form a large counter. An example of this technique appears in Appendix B.

This design consists of three smaller design blocks named CTR1, CTR2, and CTR3, all identical. The design has global inputs called RESET, HOLD, and UP that perform obvious functions. Global outputs include 15 bits of counter value and a carry out. Between the design blocks are two local variables, INT1 and INT2, which provide carry out internally between counter blocks. The *Link statement reconciles all the global variables and the local variables that each block declares. Although this design looks rather large, bear in mind that when the optimization is complete, the internal variables completely disappear, and only two product terms are required per output. Finally, note the block titled HW RESET. This block uses the CY7C330 preset product term to reset all the output pins to zero.

Example 3: T-Bird Tail Lights via Truth Table

The T-Bird tail lights example is a simple design that emulates the function of the early 1960s Ford Thunderbird tail lights. The original design used a motorized assembly that caused the left or right cluster of three lights to turn on sequentially from the inside to the outside when the driver activated the directional signal.
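To make the sequencing concrete, here is a minimal Python sketch of one side's three-light cluster. It is a guess at the pattern, not taken from the design source: it assumes the lights accumulate (inside, then inside plus middle, then all three), that an all-off step follows, and that the cycle repeats while the turn signal stays active. The names and the tuple encoding are hypothetical.

```python
# Assumed encoding: (inside, middle, outside), 1 = lit.
TURN_PATTERNS = [
    (0, 0, 0),  # all off
    (1, 0, 0),  # inside only
    (1, 1, 0),  # inside and middle
    (1, 1, 1),  # all three
]


def next_turn_pattern(pattern, turn_active):
    """Advance one side's lights by one step of the sequence."""
    if not turn_active:
        return TURN_PATTERNS[0]  # signal released: lights go dark
    index = TURN_PATTERNS.index(pattern)
    return TURN_PATTERNS[(index + 1) % len(TURN_PATTERNS)]


# Step through one full cycle with the turn signal held on.
pattern = TURN_PATTERNS[0]
cycle = []
for _ in range(4):
    pattern = next_turn_pattern(pattern, turn_active=True)
    cycle.append(pattern)
print(cycle)
```

Each pattern in this sketch is distinct, so the light outputs alone are enough to tell the steps of the sequence apart.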
The design presented here has five inputs: left turn (LT), right turn (RT), ignition (IGN), brake, and flash. For this design, the six output lights are also listed as inputs, because the truth table uses them to determine present state, similar to Z-values in an FSM. The six output lights are designated the right and left inside, middle, and outside. The brakes and emergency flashers operate regardless of whether or not the ignition is active. The turn signals, however, operate only with the ignition on. The brake and turn inputs activate all the lights on the side that is not sequencing through a turn indication. This design introduces the concept of a bus through the constructs LEFT = LO,LM,LI and RIGHT = RI,RM,RO. Also note that the design uses string substitution to describe the output states. Appendix C shows this example.

Example 4: T-Bird Tail Lights via Flow Table

FSM syntax can also implement the T-Bird tail lights. For this approach, state bits are assigned to guarantee that all states are unique and non-overlapping. The CY7C330's hidden macrocells are ideal for this use. Refer to Appendix D for this design. Although this FSM implementation is safer than the truth-table version from the aspect of uniquely assigned states, the FSM approach is not without cost. Specifically, the truth table implementation was able to incorporate additional functions for invalid conditions such as LT and RT active simultaneously.

Example 5: 8-Bit Adder for Servo Control

This servo example is covered in detail in the application note, "Using the CY7C330 as a Closed-Loop Servo Controller." The basic idea is that you can use the CY7C330 to calculate the difference between the desired position and the actual position to provide feedback to the servo loop. In the servo application, the target position is loaded into the I/O macrocell's input register during a special update cycle. During this cycle, a microprocessor provides data to the dedicated inputs as a delta from the current position. The CY7C330 adds the position value to the current position and makes the result available at the output pins in three clocks. Then the second input clock is toggled once to load the new desired position into the I/O-macrocell input registers. This operation is possible because the outputs are driving the macrocell input registers. This design uses nearly all of the registers in the CY7C330. To provide a difference between desired and current position during normal operation, the input values are furnished in two's complement form and added to the target position stored in the I/O macrocell's input registers. Appendix E shows the source code for this example.

One difference between this example and the earlier ones is that both the X and Y input sections contain the variables A[0..7]. This arrangement is due to the fact that the same macrocells provide the desired position (input) and the difference value (output). The *Local attribute identifies intermediate values that are not needed for output but are used to generate the correct results via substitution into other equations. Because the basic equation for an adder uses an XOR to calculate the sum, this example specifies the .XRB attribute to use the CY7C330's XOR product term as an XOR, a technique that reduces the number of other product terms required. The adder completes an 8-bit add in three clock cycles, producing two intermediate carry bits, which are generated and stored in two of the four internal hidden registers. The special functions attributes .IC2 and .FBK configure the output macrocell appropriately.

Summary

The examples presented here frequently optimized to levels exceeding previously produced results. The ability to specify Don't Cares for output cases, along with LOG/iC's table-driven optimizer, produced results much more quickly than has previously been typical.
Documentation, an Achilles heel for many PLD tools, proved quite readable in this case and minimized the dreaded learning curve. LOG/iC's finite state machine syntax allows compact descriptions of complex designs that produced correct results, quite a contrast to previous experiences. Clearly, LOG/iC can implement designs that use all the features available in the CY7C330. LOG/iC has quickly become an essential tool for Cypress PLD designs.

Appendix A. LOG/iC Source Code for Modulo-11 Counter

*IDENTIFICATION
  Bit - modulo 11 counter using LOG/iC FSM entry
  Z-Value state assignment used to specify absolute output
  value associated with each of the 11 states
  MMA - Cypress Semiconductor
*X-NAMES
  CLR, UP, HOLD, OE;
*Z-NAMES
  Q[3..0];
*LEVEL
  LOW = CLR,OE;    ;Active low level for these pins
*Z-VALUES
  S[1..11] = [0..10];
*FLOW-TABLE
  RELEVANT = CLR, UP, HOLD;
  S[1..11], X 1 -,   F1            ;Clear counter to zero
  S[1..11], X 0 0 0, F[11,1..10]   ;Count Down
            X 0 1 0, F[2..11,1]    ;Count Up
  S[1..11], X 0 1,   F[1..11]      ;Hold Counter value
  ;Spacing between X variables above added only to improve clarity
*STATE-ASSIGNMENT
  Z-Values;
*FLIP-FLOPS
  D-FlipFlops;     D-F/F uses total of 22 Product Terms
  T-FlipFlops;     T-F/F uses total of 16 Product Terms
*PLD
  TYPE = PLD7C330;
*PINS
  Q[3..0] = [28..25],
  CLR     = 3,
  UP      = 4,
  HOLD    = 5,
  OE      = 14;
*RUN-CONTROL
  PROG = JEDEC;
  LIST = PLOT, EQUATIONS, PINOUT, FUSEPLOT;
*END

Appendix B. 15-Bit Counter with Carry Out

*Identification
  15 bit counter - Using 7C330 hardware Reset
  Using Block Syntax to implement large counter w/FSM input Syntax
  (bypasses problem with exceeding maximum number of states when
  building large counters - block structure adds NO extra product
  terms to compiled design. INT1 & 2 are completely eliminated.)
  MMA Cypress Semiconductor

[Block diagram: HOLD drives the CNT input of CTR1; CTR1, CTR2, and CTR3 are cascaded from CY to CNT; CTR1 drives Q1-Q5, CTR2 drives Q6-Q10, and CTR3 drives Q11-Q15 and CARRY; RESET is common to all three blocks]

*X-Names
  RESET, HOLD, UP;
*Y-Names
  CARRY,Q[1..15];
*Local
  INT[1,2];
*Link
  RESET     = CTR1:R,CTR2:R,CTR3:R;
  RESET     = HW_RESET:R;
  HOLD      = CTR1:CNT;
  UP        = CTR1:UP,CTR2:UP,CTR3:UP;
  CARRY     = CTR3:CY;
  INT1      = CTR1:CY,CTR2:CNT;
  INT2      = CTR2:CY,CTR3:CNT;
  Q[1..5]   = CTR1:QQ[1..5];
  Q[6..10]  = CTR2:QQ[1..5];
  Q[11..15] = CTR3:QQ[1..5];

;*** First 5-bit counter stage here ************
@BLOCK = CTR1;
*X-Names
  CNT,R,UP;
*Y-Names
  CY;
*Q-Names
  QQ[5..1];
*Flow-Table                        ;Using '330s Internal Reset
  Relevant = CNT,UP:CY;
  S[1..32], X 0 -, Y 0, F[1..32]   ;Hold Condition
  S[1..31], X 1 1, Y 0, F[2..32]   ;Counting
  S[32],    X 1 1, Y 1, F1         ;Maximum Count Reached
  S[32..2], X 1 0, Y 0, F[31..1]   ;Counting
  S[1],     X 1 0, Y 1, F32        ;Minimum Count Reached
*Flip-Flops
  T-FLIPFLOPS;
*State-Assignment
  Binary;
@ENDBLOCK = CTR1;

;*** Second 5-bit counter stage here ************
@BLOCK = CTR2;
*X-Names
  CNT,R,UP;
*Y-Names
  CY;
*Q-Names
  QQ[5..1];
*Flow-Table                        ;Using '330s Internal Reset
  Relevant = CNT,UP:CY;
  S[1..32], X 0 -, Y 0, F[1..32]   ;Hold Condition
  S[1..31], X 1 1, Y 0, F[2..32]   ;Counting
  S[32],    X 1 1, Y 1, F1         ;Maximum Count Reached
  S[32..2], X 1 0, Y 0, F[31..1]   ;Counting
  S[1],     X 1 0, Y 1, F32        ;Minimum Count Reached
*Flip-Flops
  T-FLIPFLOPS;
*State-Assignment
  Binary;
@ENDBLOCK = CTR2;

;*** Third 5-bit counter stage here ************
@BLOCK = CTR3;
*X-Names
  CNT,R,UP;
*Y-Names
  CY;
*Q-Names
  QQ[5..1];
*Flow-Table                        ;Using '330s Internal Reset
  Relevant = CNT,UP:CY;
  S[1..32], X 0 -, Y 0, F[1..32]   ;Hold Condition
  S[1..31], X 1 1, Y 0, F[2..32]   ;Counting
  S[32],    X 1 1, Y 1, F1         ;Maximum Count Reached
  S[32..2], X 1 0, Y 0, F[31..1]   ;Counting
  S[1],     X 1 0, Y 1, F32        ;Minimum Count Reached
*Flip-Flops
  T-FLIPFLOPS;
*State-Assignment
  Binary;
@ENDBLOCK = CTR3;

;******* End of Counter Blocks ***********
@BLOCK = HW_RESET;
*X-Names
  R;
*Boolean Equations
  $PS = R;
@ENDBLOCK

*PLD
  Type = PLD7C330;
*Pins
  REGCLK    = 1,
  INPCLK    = 2,           ! needed for creating testvectors
  Q[5..10]  = [15..20],
  Q[11..15] = [23..27],
  CARRY     = 28,
  RESET     = 4,
  HOLD      = 5;
  UP        = 6;
*Nodes
  Q[1..4]   = [1..4];
*Run-control
  Listing    = Pinout, Plot;
  Progformat = Jedec;
*END

Appendix C. T-Bird Tail Lights Example

*IDENTIFICATION
  Thunderbird sequencing Taillights example for 7C330
  using ISDATA LOG/IC Truth Table Implementation
  MMA Cypress Semiconductor
*X-NAMES
  LT, RT, BRAKE, FLASH, IGN, RI, RM, RO, LI, LM, LO;
*Y-NAMES
  RI, RM, RO, LI, LM, LO;
*BUS
  LEFT  = LO,LM,LI;
  RIGHT = RI,RM,RO;
*LEVEL
  LOW = FLASH;
;Macros for All desired output combinations:
*STRING
  ON     = 1, 1, 1;
  OFF    = 0, 0, 0;
  LEFT1  = 0, 0, 1;
  LEFT2  = 0, 1, 1;
  RIGHT1 = 1, 0, 0;
  RIGHT2 = 1, 1, 0;
  ONE    = 1, 0, 0;
  TWO    = 0, 1, 0;
  THREE  = 0, 0, 1;
  TRI    = -, -, -;
*FUNCTION-TABLE $ IGN,FLASH,LT,RT,BRAKE,LEFT ,RIGHT ;Quiescent 1, 0, 0, 0, 0, 'TRI' , 'TRI' 0, 0, , -, 0, 'TRI' , 'TRI' - ;Flash 1, 1, -, -, -, -, -, -, -, -, ;Brake 0, 0, 0, 0, -, -, ;Left Turn 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, : LEFT ,RIGHT 'OFF' ,'OFF'; 'OFF','OFF'; ON', 'OFF', 'ON' 'OFF' 'OFF','OFF'; 'ON','ON'; -, 0, 1, 1, 'TRI', 'TRI', 'TRI' 'TRI' 'ON','ON'; 'ON','ON'; 0, 0, 0, 0, 0, 0, 0, 0, 'OFF' 'LEFT1', 'LEFT2', 'ON', ,'TRI' 'TRI' 'TRI' 'TRI' 'LEFT1', 'OFF'; 'LEFT2' ,'OFF'; 'ON','OFF'; 'OFF','OFF'; Appendix C.
T-Bird Tail Lights Example (continued) ;Right Turn 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 'TRI' , 'TRI', 'TRI', 'TRI', 'OFF' 'RIGHTl' 'RIGHT2' 'ON' 'OFF','RIGHT1'; 'OFF' ,'RIGHT2'; 'OFF' ,'ON'; 'OFF' ,'OFF'; 1, 1, 1, 1, 'OFF', 'LEFT1', 'LEFT2', 'ON', 'TRI' 'TRI' 'TRI' 'TRI' 'LEFT1','ON'; 'LEFT2' ,'ON'; 'ON','ON'; 'OFF','ON' ; 'TRI', 'TRI', 'TRI', 'TRI', 'OFF' 'RIGHTl' 'RIGHT2' 'ON' 'ON' ,'RIGHT1'; 'ON' ,'RIGHT2'; 'ON','ON'; 'ON' ,'OFF'; ;Left Turn + Brake 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, ;Right Turn + Brake 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, ;Both Turn 1, 0, 1, 0, 1, 0, 1, 0, ;ll1egal 1, 1, 1, 1, - lights flash 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, in reverse sequence 'OFF', 'ON', 'LEFT2', 'LEFT1', condition, All ON 0, 1, 1, 1, 'OFF', 0, 1, 1, 1, 'ONE', 0, 1, 1, 1, 'TWO', 0, 1, 1, 1, 'THREE', 'OFF' 'ON' 'RIGHT2' 'RIGHT1' 'ON','ON'; 'LEFT2' ,'RIGHT2'; 'LEFT 1' ,'RIGHT1'; 'OFF','OFF'; 'OFF' 'THREE' 'TWO' 'ONE' 'ONE','THREE'; 'TWO','TWO'; 'THREE' ,'ONE'; 'OFF','OFF'; *FLIP-FLOPS D-FLIPFLOPS; T-FLIPFLOPS; *PLD TYPE *PINS LT RT BRAKE FLASH IGN RI RM RO LI LM LO PLD7C330; 4, 5, 6, 7, 9, 23, 24, 25, 20, 19, 18; *RUN-CONTROL PROG = JEDEC; LIST = PLOT, EQUATIONS, PINOUT, FUSEPLOT; *END 6-167 Appendix D. 
T-Bird Tail Lights via FlowTable *IDENTIFICATION Thunderbird sequencing Taillights example for 7C330 using ISDATA LOG/IC State Machine Implementation MMA Cypress Semiconductor *X-NAMES LT,RT,BRAKE,FLASH,IGN; *Z-NAMES LO,LM,LI,ru,RM,RO; *LEVEL LOW = FLASH; *Q-NAMES Q[I ..4]; *Z-VALUES SI OOOOOO;AIllights off or Flash Off S2 001000; Left Tum 1 S3 011000; Left Tum 2 S4 111000; Left Tum 3 S5 000100; Right Tum 1 S6 000110; Right Tum 2 S7 000111; Right Tum 3 S8 001111; Brake + Left Tum 1 S9 011111; Brake + Left Tum 2 SlO = 111111; Brake + Left Tum 3 S11 = 000111; Brake + Left Tum 4 S12 = 111100; Brake + Right Tum 1 S13 = 111110; Brake + Right Tum 2 S14 = 111111; Brake + Right Tum 3 SIS = 111000; Brake + Right Tum 4 S16 = 111111; Brake or Flash On *FLOW-TABLE Sn, LT RT Brake Flash SI, X 0 0 0 0 X 1 X 0 0 1 0 X 1 0 X 1 0 0 0 X 0 1 0 0 X 1 0 1 0 X 0 1 1 0 XREST, IGN, Fn -, Fl; -, FI6; 1, FI6; 0, FI6; 1, F2; 1, F5; 1, F8; 1, F12; FI; S2, X 1 XREST, 1, F3; FI; S3, X 1 XREST, 1, F4; Fl; S4, XREST, All Lights Off Left Tum Sequence Fl; 6-168 Appendix D. T -Bird Tail Lights via Flow Table (continued) RI RM RO LI LM LO *NODES Q[1..4] 23, 24, 25, 20, 19, 18; [1..4]; *RUN-CONTROL PROG = JEDEC; LIST = PLOT, EQUATIONS, PINOUT, FUSEPLOT; *END 6-170 Appendix E. 8·Bit Adder Example *Identification 8-Bit multi-stage adder - as detailed in 7C330 Servo control Application Note Mark Aaldering Cypress Semiconductor *X-Names CIN,C2,C5,A[0 .. 7],B[0..7]; *Y-Names A[O.. 7],C2,C5,CARRY; *Local C[0.. 1,3 ..4,6..7] ; *Boolean-Equations A[0.. 7].XRB = A[0.. 7]; AO = BO # CIN; CO = (AO & BO) + (AO & CIN) + (BO & CIN); A[1..7] = B[1..7] # C[0 .. 6]; C[1..6] = (A[1..6] & B[1..6]) + (A[1..6] & C[0..5]) + (B[1..6] & C[0 .. 5]); CARRY = (A7&B7) + (A7&C6) + (B7&C6); *Flip-Flops D-FLIPFLOPS; *PLD Type *Nodes C2 = C5 = = PLD7C330; 1; 3; *Pins OUTCLK INCLK ACLK CIN B[0..7] AO Al A2 A3 A4 A5 A6 A7 CARRY 1, 2, 3, 4, [5 .. 7,9 .. 
13], 28, 15, 20, 17, 26, 23, 19, 24, 18,
*Special-Functions
  A0.IC2 = Yes;
  A1.IC2 = Yes;
  A2.IC2 = Yes;
  A3.IC2 = Yes;
  A4.IC2 = Yes;
  A5.IC2 = Yes;
  A6.IC2 = Yes;
  A7.IC2 = Yes;
  A0.FBK = Pin;
  A1.FBK = Pin;
  A2.FBK = Pin;
  A3.FBK = Pin;
  A4.FBK = Pin;
  A5.FBK = Pin;
  A6.FBK = Pin;
  A7.FBK = Pin;
*RUN-CONTROL
  PROG = JEDEC;
  LIST = PLOT, EQUATIONS, PINOUT, FUSEPLOT;
*END

State Machine Design Considerations and Methodologies

The use of state machines provides a systematic way to design complex sequential logic circuits, an increasingly popular approach since the advent of PLD (Programmable Logic Device) circuitry. This application note describes the many options encountered during the state machine design cycle. By exhaustively walking through the PLD-based design example presented here, you can weigh the merits of several design approaches.

Definitions of Commonly Used Terms

1. External input vector: External signals (stimulus) applied to the state machine.
2. System outputs: Signals generated by the state machine that are explicitly designed for availability to the external system (hardware outside of the state machine). Registered system outputs can also be fed back into the state machine as part of the state vector, which is then used in the decode of the state machine's next state.
3. State registers: Registers used exclusively for determining the next state of the machine (feedback).
4. State outputs: Outputs of the state registers that are available to the external system. (They are typically available to the external machine for debug or due to the lack of buried registers.)
5. State vector or machine state: The registered feedback information defining the present state of the machine and required to determine the next state of the machine.
6. State path: The transitional condition that must be met for the state machine to progress from one state to another. The state path typically consists of one or more product terms generated from external inputs, although other state paths are possible.
7. Total input vector: The combination of the external input vector and the state vector. The total input vector is decoded to generate the next state of the machine.

State Machine Entry Methods

There are many ways of describing a state machine, each with distinct advantages and disadvantages. Three popular description methods are state diagrams, state tables, and high-level languages (HLLs).

The state diagram provides an easily observable flow description of the state machine. Because the ability to view the flow of states provides distinct documentation advantages, state diagrams will be used throughout this application note to describe the example state machine. Upon completing a state diagram, you can easily convert the diagram's visual information into the other types of state machine description or directly into Boolean equations. Several available software programs accept their own forms of state table, HLL, and/or Boolean entry. You can enter all these formats easily via your favorite text editor. The software then translates the inputs into suitable forms (usually a JEDEC map) for hardware implementation.

Another method of describing a state machine, the state table, offers perhaps the most concise description. Its major advantage over the other entry methods is the availability of state table reduction methods (see Reference 1). When applied to your state table definition, a reduction program generates a minimal model for the function. The software used for state machine synthesis throughout this application note uses the state table method of entry. The program is called LOG/iC from ISDATA Corporation.

Finally, high-level language (HLL) state machine entry is probably the most popular form of state machine design. HLLs typically offer C-language-like instructions (e.g., case, if-then-else) to describe the machine.

An Example State Machine

The example state machine is a clock generator for a pipelined (three system execution stages), bit-slice-based central processing unit (CPU). Each of the three system execution stages contains two clocks for a total of six system clocks for every instruction execution. With pipelining enabled, each instruction takes an average of two clock periods. Further, external hardware unaffected by CPU wait and stop states (e.g., cache memory) needs both polarities of an additional free-running clock. To minimize clock edge skew, the state machine provides both versions of the clock.

To put the timing of this application into perspective, executing each pipeline stage in an 80-ns period (or 12.5 MHz) requires the state machine to run at 25 MHz, that is, two 40-ns state clock periods per stage. This speed is well within the range of the available PALs, EPLDs, and PROMs that can be used to implement the state machine.

Each of the pipeline's three execution stages has a specific function. Briefly, the first stage of the pipeline accesses the Writable Control Store (WCS) RAM. The Arithmetic Logic Unit (ALU) execution occurs during the second stage of the pipeline. Finally, the third pipeline stage clocks status and memory address registers. The functions performed during each of the three stages are described in greater detail in the "State Machine Output Definition" section of this application note.

If this design only generates a simple set of pipelined clocks, why not use shift registers and miscellaneous glue logic instead of a state machine? There are two reasons to consider a state machine.
First, it is usually desirable to minimize the number of chips required; the state machine in PLD form might need external glue logic, but significantly less than the shift register solution. The second reason for considering a state machine is that this application requires more than just a simple set of pipeline clocks. The function of the clock signals is to provide control of the CPU in multiple modes of operation. The desired modes of operation are as follows:

PIPELINED RUN Mode
In this mode, the CPU simultaneously performs the instructions in all three stages of the pipeline. For example, while instruction n does an ALU operation, instruction n+1 accesses WCS, and instruction n-1 clocks ALU status.

NONPIPELINED RUN Mode
NONPIPELINED RUN mode performs all three stages of instruction execution without overlap. The time to complete one nonpipelined instruction equals the average of three pipelined instructions.

CPU STOP
The system must have a way to perform an orderly stop of CPU execution from both of the above run modes. This stop might be the result of several possible conditions, including a utility stop from a system control unit, a single step, a breakpoint, or a response to external hardware (e.g., a logic analyzer). The free-running clocks continue to run during the CPU STOP mode and remain running at all times, except during a reset condition.

CPU WAIT
In CPU WAIT mode, an external condition causes a delay in an instruction's execution. The instruction pauses until the external condition is removed. One application for the CPU WAIT mode is to handle a cache miss. When a cache miss occurs, the CPU remains in the CPU WAIT mode until the cache completes its memory transfer.

SINGLE STEP
The ability to execute one instruction at a time is needed to debug the CPU. You can easily implement SINGLE STEP external to the clock state machine by pulsing the RUN signal. SINGLE STEP mode is described further in the State Machine Input Definition section of this application note.

INTERRUPT
A variety of system conditions can interrupt the CPU out of its normal execution sequence and immediately start the execution of the interrupt handler. The influence of the INTERRUPT mode on the system clocks will be discussed in greater detail later in this application note.

REPEAT INSTRUCTION
The REPEAT INSTRUCTION mode is a CPU debug feature. It is a good idea to implement this mode external to the clock state machine. By disabling the clock to the instruction register and the interrupt line to the clock state machine, the CPU continually executes the instruction in the instruction register.

Synchronous vs. Asynchronous Machine

At this point in the state machine design, an appropriate type of state machine must be chosen to match the application. Two major types are the asynchronous and the synchronous implementations. The asynchronous machine changes state when one or more of its inputs changes from a previously stable input state. After a state change, the outputs of the state machine settle, while the machine stabilizes once again. A basic example of an asynchronous state machine would be a simple SR latch built from two NAND gates (Figure 1). For the clocking application considered in this application note, the asynchronous state machine implementation would be a poor choice, due to the instability of the system outputs.

[Figure 1. SR Latch, Asynchronous State Machine Example]

The synchronous state machine offers a better choice. A synchronous state machine block diagram appears in Figure 2. Generally, a synchronous state machine samples the total input vector at specific periods to determine the machine's next state. When designing synchronous state machines, it is important to avoid state register metastability. External inputs to the machine must be synchronized to guarantee stable state register inputs, and the feedback time plus data setup time to the state register clock must be less than or equal to the state clock period.

The modern theory of synchronous state machines was pioneered by Mealy and Moore (see Reference 1). Mealy and Moore machines differ slightly from each other in the way they control the system outputs. During a specific machine state, a Mealy machine allows the input conditions to alter the system outputs (the outputs depend on the "total" input state). In contrast, a Moore machine's system outputs depend only on the present machine state. Thus, the system outputs remain stable until the next time period, when the Moore machine samples the total input vector to determine the next state. If all design conditions are met (external inputs are stable prior to the next state clock), the Moore machine provides glitch-free system outputs, a desirable characteristic for the CPU system clock. The design described here is therefore implemented as a Moore machine.

Clock Generator Output Definition

As explained earlier, each of the three system execution stages contains two clocks for a total of six system clocks for every instruction execution. The naming convention for these clocks is CLK_xy, where x = 1, 2, or 3, representing the first, second, or third stage of the instruction execution, and y = A or B, representing the first or second half of the execution stage. Following this convention, the state machine's two free-running clocks are named CLK_A and CLK_B.
These clocks run at half the state clock frequency and 180 degrees out of phase. The free-running clocks occur at the same time as their respective CLK_xA and CLK_xB clocks. The major clock functions for this application are:

CLK_1B: The leading edge of this clock updates the instruction register.

CLK_2A: This clock's leading edge marks the start of ALU execution. The information on the ALU input bus clocks into the appropriate input registers at this time. The instruction cycle is considered recoverable up through and including CLK_2A (i.e., the status of the machine from the previous instruction has not been altered).

CLK_2B: Used to control the second half of the ALU execution stage, this clock initiates a write to RAM, triggers counters, gates ALU output into its latch, and clocks the ALU output information into any of the distributed destination registers.

CLK_3A: On this clock the memory address register can be updated. The ALU output bus status and ALU status are also clocked into the CPU status register.

[Figure 2. Synchronous State Machine Block Diagram: synchronous and asynchronous external inputs form the external input vector, which combines with the state vector into the total input vector; the state register, clocked by the state clock, holds the machine state and feeds the optional state outputs and system output decode; Mealy and Moore system outputs and system output feedback are shown]

Clock Generator Inputs

A set of inputs (external stimulus to the state machine) controls the state machine. The clock state machine described here has eight external inputs, including the state machine clock. These inputs are:

STATECLK: The state machine clock.

RESET: An asynchronous or synchronous reset input that can be connected directly to the state registers' preset or clear or to all clocked register inputs (D or T input).
If connected to the preset or clear, RESET need not be synchronized. In this case, RESET forces the state machine into the machine's initial state, regardless of the present state. RESET can result from any combination of the following sources:
1. Power-up circuit (system reset)
2. System controller software decodes system reset
3. System controller software decodes module reset
4. CPU software decodes module reset

RUN: This signal controls the start and stop sequence of the CPU clocks. In PIPELINED RUN mode, the start sequence generates the proper clock progression to fill up the pipeline registers, and the stop sequence empties the pipeline. RUN is externally manipulated to implement the single step and breakpoint functions.

NPL: Used to select NONPIPELINED RUN vs. PIPELINED RUN modes, this signal must be set to the selected mode prior to activating the RUN signal. Setting NPL = 1 selects NONPIPELINED RUN mode, and NPL = 0 selects PIPELINED RUN mode. The single step function operates properly in NONPIPELINED RUN mode only.

INTR: This signal indicates an external interrupt. When INTR is received, and IEN (interrupt enable, described below) is active, the CPU executes its interrupt handler. An interrupt inhibits the instruction register update clock (CLK_1B) and the ALU update clock (CLK_2B). CLK_1A for the interrupt instruction executes on the next cycle. The interrupt condition has priority over a wait condition and therefore starts generating clocks to permit execution of the interrupt instructions.

IEN: This interrupt enable signal qualifies INTR. IEN is likely to be a bit in the instruction word, allowing the user to define sections of uninterruptible code.

WAIT: The wait condition is initiated when both WAIT and WEN (wait enable, described below) are active. The CPU remains in the wait condition until WAIT goes inactive.

WEN: This wait enable signal qualifies WAIT for entrance into the wait condition.
Like IEN, WEN is usually a bit in the instruction word, allowing the user to define sections of wait-sensitive code.

State Machine Partitioning

When architecting a state machine, it is generally a good practice to break up large machines into workable blocks, with each of the smaller machines containing states that require common inputs and generate common outputs. The example clock state machine is small enough to be designed as a single state machine, although it would be trivial to design logic to generate the free-running clocks as a separate machine from the rest of the clock state machine. Equations for the free-running clocks are:

CLK_A := /RESET * /CLK_A
CLK_B := /RESET * CLK_A

where ":=" indicates a registered output. By examining these output equations, you can see that the free-running clocks have only two dependencies in common with the remaining portion of the clock state machine, i.e., RESET and STATECLK. The free-running clocks are required as inputs to the other state machine to synchronize the additional system outputs, however.

The example presented here implements the free-running clocks and the other system outputs within the same state definition. The resulting output equations can be verified against the equations for the free-running clocks alone.

The Initial Machine State

Regardless of the preferred state machine entry method, attacking the problem starts with defining the initial state of the machine. This initial state (INIT in the example) must be consistent with the power-on condition and/or an external input used to initialize the machine (RESET). The state of the machine can be decoded from the present values of the system outputs, state registers, or a combination of the two. (The advantages and disadvantages of the state definition options will be discussed in greater detail later in this application note.) The initial machine state is generally, but not always, a decode of all 0s or all 1s.
In the example design, INIT is the decode of all 0s.

Naming the States

With the exception of INIT, each state in the example design is named to indicate the active system clocks occurring during that state. For example, during state A, only CLK_A is active. Similarly, state 123B has only CLK_1B, CLK_2B, CLK_3B, and CLK_B active. Additionally, an "N" suffix designates a nonpipelined state and a "W" suffix designates a wait condition state; this convention differentiates between states with identical active system outputs.

CPU Inactive States

The RESET input causes the state machine to enter the INIT state from any state in the machine. From the INIT state, the machine unconditionally starts to generate the free-running clocks. As shown in Figure 3, a line pointing from the INIT state to the A state, with a path equation equal to 1, indicates an unconditional branch. The state machine progression continues from the A state unconditionally into the B state. In the B state a multi-branch condition exists. If the RUN input remains inactive, then the A and B states continue to toggle, generating only the free-running clocks. Hence the INIT, A, and B states are referred to as "CPU inactive states".

Nonpipelined States

If the NPL input is active while the RUN input becomes active, the state machine operates in NONPIPELINED RUN mode and follows the model portrayed in Figure 4.

Pipelined States

If the NPL input is inactive when the RUN input goes active, thus indicating PIPELINED RUN mode, the state machine operates as depicted in Figure 5.

Unique States

When the RUN input goes active, the next state executed is either the 1A or the 1AN state, depending upon the value of the NPL input (refer to Figures 4 and 5).
Notice that the active system outputs in these two states are identical. Why generate two identical states when an additional state register might be required to differentiate between the states? (This assumes you use the system outputs to decode the machine's states.) The redundant states are not a problem because the additional state register needed to differentiate between the states is not an issue. There are two reasons for this. First, if you eliminate the redundant states, the state machine would require at least one additional state register anyway to differentiate between the B and the BW or BWN states, which would be needed without 1A and 1AN. (Separation of states BW and BWN from state B is required for correct functionality.) Second, adding another state only increases the number of state registers if the new total number of states exceeds an additional binary boundary (2, 4, 8, 16, ...). This is not a problem here.

[Figure 3. CPU Inactive States]

[Figure 4. Non-Pipelined States]

You might also choose to widen your state machine (increase the number of state registers) to reduce the number of product terms to the state or system output registers. This decision should take into account the desired circuit implementation (PLDs, PROMs, discrete hardware, etc.) and is often an iterative process. In general, you can initially architect the state machine in the manner that is the easiest for you to understand, then make additional changes or small adjustments later if they become necessary.

State Description Verification

Now that all the pieces of the state machine are functionally defined (refer to Figure 6 for the completed state diagram), consider methods for verifying the validity of the design. Some software you can use to describe and implement state machines would already offer verification at this point in a design. For other methods, read on!
One way to verify a state machine design is to recognize a rule of thumb: Out of every state, there should be a state path to another state for every possible combination of relevant external inputs. For example, there are two paths out of state 123B, with INTR and IEN as the relevant external inputs:

Path 1 = INTR * IEN
Path 2 = /INTR + INTR * /IEN

[Figure 5. Pipelined States]

If there are no known restrictions on the external inputs, a simple method of verifying the above rule of thumb is to generate an equation where all of the paths out of a state are ORed together as follows:

OUT_STATE_123B = Path 1 + Path 2;
OUT_STATE_123B = (INTR * IEN) + /INTR + (INTR * /IEN);
OUT_STATE_123B = 1

If the equation's terms equal 1 after Boolean reduction, then every state path out of the state is accounted for. The main advantage to this verification method is that you can easily do it using readily available Boolean reduction software.

If there are known restrictions to the external inputs, you can use this information to reduce the complexity of the machine. If it is impossible for the INTR * /IEN condition to occur externally, for example, then you can leave this condition out of the Path 2 equation. In that case, the reduction of the OUT_STATE_123B equation yields a non-1 result.

Because the method of verification just described does not detect redundant path equations, it is useful to revise the original rule of thumb to: Out of every state, there should be one and only one state path to another state for every possible combination of relevant external inputs. This revised condition is not as easily verified as the original statement.
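When a state has only a few relevant inputs, both forms of the rule of thumb can also be checked by brute-force enumeration. The Python sketch below is an illustration only and is not part of the LOG/iC flow described in this note; the names path1, path2, and check_paths are hypothetical. It evaluates the two path equations for state 123B over every combination of INTR and IEN and reports any combination that is covered by no path or by more than one path.

```python
from itertools import product

# Path equations out of state 123B, transcribed from the text.
def path1(intr, ien):
    return intr and ien                        # INTR * IEN

def path2(intr, ien):
    return (not intr) or (intr and not ien)    # /INTR + INTR * /IEN

def check_paths(paths, n_inputs):
    """Return (uncovered, overlapping) lists of input combinations.

    uncovered:   combinations for which no path equation is true
    overlapping: combinations for which more than one path equation is true
    """
    uncovered, overlapping = [], []
    for combo in product((False, True), repeat=n_inputs):
        hits = sum(1 for p in paths if p(*combo))
        if hits == 0:
            uncovered.append(combo)
        elif hits > 1:
            overlapping.append(combo)
    return uncovered, overlapping

if __name__ == "__main__":
    uncovered, overlapping = check_paths([path1, path2], 2)
    print("uncovered:", uncovered)
    print("overlapping:", overlapping)
```

Two empty lists mean that exactly one path is taken for every input combination. Dropping the INTR * /IEN term from Path 2 makes that combination show up in the uncovered list, which corresponds to the non-1 reduction result mentioned above.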
The easiest way to verify the more restrictive case is to simulate the state machine. To do this, you must generate a test vector for every possible external input that is relevant to each state simulated. Automatic test vector generation programs are available that produce every possible combination. After running the vectors against the design, you must visually inspect the output to verify that the machine never enters an illegal state.

System and State Register Output Generation

The model defining the clock state machine is complete, but there are still quite a few important decisions to be made regarding the final circuit implementation. Some of the major alternatives for final implementation are:

System output vs. exclusive state register state decode
D flip-flop vs. T flip-flop implementation
PLD vs. PROM implementation

Figure 6. CPU Clock State Machine

To gain some insight into these choices, consider how the output or feedback equations are assembled. Take, for example, the generation of CLK_3A using a D flip-flop (FF) implementation. By referring to Figure 6, you can find all the states in which CLK_3A is active. These are 123A, 3A, and 3AN. The CLK_3A output is generated by ORing the state decodes that, when ANDed with their respective state paths, advance the state machine into the three states listed above. Specifically:

CLK_3A := (Decode of 12B) * (/INTR + INTR * /IEN)   ;-123A
        + (Decode of BW)  * (/WAIT)                 ;-123A
        + (Decode of 23B) * (1)                     ;-3A
        + (Decode of 2BN) * (/INTR + INTR * /IEN)   ;-3AN

When you define the state decodes, the CLK_3A equations are completely specified in terms of the state machine inputs (state path), state registers, and/or system outputs (state decode). Typically, you then multiply the equation out to form a sum of products.
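The assembly of a D-input equation and its expansion into a sum of products can be sketched in a few lines of Python. The sketch below uses the four CLK_3A terms listed above. The names S_12B, S_BW, S_23B, and S_2BN are hypothetical placeholders for the state decodes (each of which is really a product of several register literals once a state assignment is chosen), and no Boolean reduction is performed.

```python
# Each entry: (present state, state path as a list of product terms, next state).
# A product term is a tuple of literals; "/X" denotes the complement of X.
# An empty tuple represents a path equal to 1 (unconditional transition).
CLK_3A_TERMS = [
    ("12B", [("/INTR",), ("INTR", "/IEN")], "123A"),
    ("BW",  [("/WAIT",)],                   "123A"),
    ("23B", [()],                           "3A"),
    ("2BN", [("/INTR",), ("INTR", "/IEN")], "3AN"),
]


def sum_of_products(terms):
    """AND each state decode with every product term of its state path."""
    products = []
    for state, path, _next_state in terms:
        for term in path:
            # S_<state> is a placeholder literal for the decode of that state.
            products.append((f"S_{state}",) + term)
    return products


def format_equation(name, products):
    """Render the product list in the '*' / '+' notation used in this note."""
    return f"{name} := " + " + ".join(" * ".join(p) for p in products)


products = sum_of_products(CLK_3A_TERMS)
equation = format_equation("CLK_3A", products)
print(equation)
print("unreduced product terms:", len(products))
```

Multiplying out the four terms gives six unreduced product terms. The actual count in a device depends on the state assignment and on the Boolean reduction applied afterward.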
This format provides for easy implementation in a PLD, which has a sum-of-products architecture, and also provides a useful foundation for further equation reduction.

State Decode

As discussed earlier, the next state of the machine can be decoded from the present values of the system outputs, the state registers, or a combination of the two. The choice typically comes down to weighing the maximum number of product terms versus the maximum number of flip-flops available in an implementation.

For a Moore machine with registered system outputs, using the system outputs to uniquely define the states uses the smallest number of flip-flops to define the state machine. However, it is often necessary to add one or more state registers to uniquely define the states. State assignment for this state decoding method is quite simple, but also rigidly defined, allowing limited flexibility when assigning the additional state registers. After reduction, the feedback and output equations of this "narrow" state machine might contain too many product terms to be implemented in a specific PLD, although product term complexity is never a problem with a PROM implementation.

Exclusive State Registers

Another consideration in state machine design is that you might be able to distribute the number of product terms more evenly among the equations implementing the state machine by using state registers exclusively to decode the states. Because the state decodes in the state registers can be selected to assist in Boolean reduction, proper state assignment enables the more complex equations to fit into a specific implementation. This type of decode is useful in a PLD implementation where there is a shortage of product terms for a specific state flip-flop, but extra flip-flops are available. Adding an extra state register can simplify the decode logic enough to fit the design in a single PLD.

The total number of exclusive state registers required to implement a state machine varies from a minimum of log2(X) (rounded up to the nearest integer) to a maximum of X, where X is the total number of states in the machine. You can iteratively change this number, along with the state assignment, to obtain a suitable solution.

The state assignment itself is a non-trivial issue, with almost limitless possibilities and no known method of obtaining the optimal solution. There are, however, some guidelines that can be used to obtain workable solutions:

1. Two or more states that potentially enter the same state with identical path equations should be adjacent (their binary codes differ in exactly one position). As an example, refer to Figure 5. States 12B and 123B both proceed into state 1A if the path condition INTR * IEN is true. When generating the CLK_1A equation, two of the terms of the equation look like this:

CLK_1A := (Decode of 12B)  * (INTR * IEN)   ;-1A
        + (Decode of 123B) * (INTR * IEN)   ;-1A

If the decodes of 12B and 123B differ in exactly one position, then Boolean reduction (which uses the A*B + /A*B = B relationship) converts the two product terms into one smaller product term.

2. Two or more states that might proceed into different states with identical path equations, and an identical active output, should be adjacent. This situation occurs in the previous CLK_3A equation, shown again here:

CLK_3A := (Decode of 12B) * (/INTR + INTR * /IEN)   ;-123A
        + (Decode of BW)  * (/WAIT)                 ;-123A
        + (Decode of 23B) * (1)                     ;-3A
        + (Decode of 2BN) * (/INTR + INTR * /IEN)   ;-3AN

Note that if states 12B and 2BN are adjacent, then you can reduce the CLK_3A equation to three product terms.

Clock Generator Implementation

As mentioned earlier, there are many ways to implement state machines. The following sections discuss some of the pros and cons associated with some of the more common state machine implementations.

D Flip-Flop Implementation

There are more products available that support a D flip-flop solution than any other implementation.
Therefore, it is usually the most cost-effective solution for a state machine. Table 2 lists the number of product terms per output obtained by compiling the clock generator state machine definition with the LOG/iC software, using D flip-flops. The compiler input file appears in Appendix A. Optimizing the design (Table 3) significantly reduces the number of product terms needed.

T Flip-Flop Implementation

Even though D flip-flop solutions are more widely available, there are times when the logic needed for this implementation is prohibitively complex. Under these circumstances, a T flip-flop implementation might be more cost effective, because using T flip-flops reduces the logic significantly. The best example of this situation is a simple synchronous binary counter. While the most significant bit (MSB) of an N-bit counter in a D flip-flop implementation requires N product terms, the T flip-flop solution requires only one product term. Note that the Cypress family of CY7C33x devices offers you a configurable T or D type implementation if you place an XOR gate prior to the D flip-flop; route the AND/OR array to one of the XOR's inputs and the flip-flop's Q output (via an additional product term) to the other XOR input.

It isn't clear from simple observation, however, whether the T flip-flop implementation is beneficial for the clock generator state machine. One way to clarify this question is to change three command lines in the state machine description shown in Appendix A and recompile to produce a T flip-flop implementation. Table 1 contains the product term results using T flip-flops. A quick study of the results reveals that the optimized version using D flip-flops (Table 3) requires fewer product terms than the T flip-flop version.

Table 1. Optimized Results for Clock Generator: T Flip-Flop Implementation

LOG/iC OPTIMIZATION SUMMARY (FACT)
CPU TIME QUOTA PER FUNCTION: 100 SEC

FUNCTION    INV   PTERMS   CPU TIME   FLAGS
CLK_1A.T    NO    6        <1
            YES   7        1
CLK_1B.T    NO    4        1
            YES   3        1
CLK_2A.T    NO    5        1
            YES   4        <1
CLK_2B.T    NO    4        1
            YES   3        <1
CLK_3A.T    NO    5        <1
            YES   6        2
CLK_3B.T    NO    4        <1
            YES   2        <1
CLK_A.T     NO                        C
            YES                       C
CLK_B.T     NO    2        1
            YES   1        <1
QQ1.T       NO    3        <1
            YES   5        1
QQ2.T       NO    6        <1
            YES   11       2

C: Constant Function
FACT MINIMIZATION: 11 SEC

Table 2. Non-optimized Results for Clock Generator: D Flip-Flop Implementation

LOG/iC OPTIMIZATION SUMMARY (FACT)
CPU TIME QUOTA PER FUNCTION: 100 SEC

FUNCTION    INV   PTERMS   CPU TIME   FLAGS
CLK_1A.D    NO    12       <1         N
            YES   27       <1         N
CLK_1B.D    NO    5        <1         N
            YES   34       1          N
CLK_2A.D    NO    8        <1         N
            YES   31       <1         N
CLK_2B.D    NO    7        <1         N
            YES   32       <1         N
CLK_3A.D    NO    8        <1         N
            YES   31       <1         N
CLK_3B.D    NO    6        <1         N
            YES   33       <1         N
CLK_A.D     NO                        N T
            YES                       N T
QQ1.D       NO    6        <1         N
            YES   5        <1         N
QQ2.D       NO    10       <1         N
            YES   9        <1         N

N: No Optimization
T: Trivial Function
FACT MINIMIZATION: 2 SEC

Table 3. Optimized Results for Clock Generator: D Flip-Flop Implementation

LOG/iC OPTIMIZATION SUMMARY (FACT)
CPU TIME QUOTA PER FUNCTION: 100 SEC

FUNCTION    INV   PTERMS   CPU TIME   FLAGS
CLK_1A.D    NO    6        1
            YES   11       2
CLK_1B.D    NO    3        1
            YES   4        <1
CLK_2A.D    NO    4        1
            YES   7        <1
CLK_2B.D    NO    3        1
            YES   4        <1
CLK_3A.D    NO    4        1
            YES   9        1
CLK_3B.D    NO    3        <1
            YES   3        1
CLK_A.D     NO    1        <1
            YES   2        <1
CLK_B.D     NO    1        1
            YES   2        <1
QQ1.D       NO    3        <1
            YES   3        1
QQ2.D       NO    6        16
            YES   6        2

FACT MINIMIZATION: 29 SEC

PLD Implementation

With the LOG/iC PLD Database option, the software assists in selecting a PLD, and it shows that the non-optimized version of the clock state machine fits in a PALC22V10 without further reduction. If the equations are reduced using Boolean reduction, however, a lower-cost solution is available. The results shown in Table 3 indicate that the less expensive PALC20G10 would work. Appendix A shows the listing for the 20G10 LOG/iC implementation. Waveforms for the completed design appear in Appendix B. You can verify the CLK_A and CLK_B equation results against the equations generated in the State Machine Partitioning section of this application note.

PROM Implementation

You can obtain very high speed solutions by implementing state machines using PROMs. A PROM uses a look-up table to decode the machine's next state, as opposed to the AND/OR array in a PLD. The main advantage of using a look-up table to decode the next state is that every combination of the inputs can be decoded. Thus, you can create an extremely complex machine without equation reductions. The look-up table's drawback is that the PROM's depth grows exponentially (2^N, where N = number of inputs to the look-up table) with every additional input to the look-up table.

To determine the depth required, notice that the present total input vector provides the inputs to the look-up table. The clock generator state machine has seven external inputs, six system outputs, and two state outputs, for a total of 15 look-up table inputs, which indicates a feasible implementation using the CY7C277 (32K x 8) registered PROM. Using a registered PROM such as the CY7C277 to implement the machine also helps to reduce the parts count, because the PROM implements both the state and system output registers.

LOG/iC offers support for implementing state machines in PROMs, and only a few minor changes to the state machine description shown in Appendix A are required: *PROM replaces the *PAL command, some simple statements indicating the CY7C277 architecture (INPUTS = 15 and OUTPUTS = 8) replace the TYPE = statement, and PROGFORMAT is set to INTEL-HEX.

CY7C361 Implementation

A new way to obtain high-speed operation of state machines became available with Cypress Semiconductor's development of a revolutionary architecture that enables a CMOS PLD state machine part to operate at speeds in the 125-MHz range. The first part in this family is the CY7C361. The architectural innovations used to obtain 125-MHz operation require that you approach state machine design slightly differently than you would when designing with traditional PLD architectures. To fully understand the information in this section, consult the Cypress Semiconductor application note, "Understanding the CY7C361."

Using the clock generator state machine example, this section shows how you can generate a state diagram for the CY7C361 by following some simple rules. This diagram allows you to determine whether the design can fit in a CY7C361. The rule of thumb is that a state diagram with 32 or fewer state nodes will probably fit. (The likelihood of the implementation at that point depends totally upon split-input-array fitting issues.) You can convert the state diagram directly into Boolean equations (with no Boolean reduction required) and compile the equations into JEDEC code for the final implementation.

The CY7C361's condition-decode array has been optimized for use in state machine applications. As shown in Figure 7, the CY7C361 condition decoder contains the necessary logic to generate two kinds of state machine operations.

Figure 7. The Condition Decoder - Optimized for Two State Machine Operations
(Leaving a state: (a+b+c)*S0. Entering a state: (SA+SB+SC)*(a*/b).)

The Entering a State operation should look familiar. The process used to generate the system output and state register equations (in the System and State Register Output Generation section of this application note) utilizes a similar equation form. There are two small differences, though.

First, the Entering a State equation shown in Figure 7 assumes the present state conditions are available as single entities on the input array. That is, one state register uniquely defines each state, and therefore the present state is not encoded using multiple flip-flops, as is typical in traditional state machines. There is a special case, however, that allows you to encode the states [...] state register is required in the decode of the next state. An example of this is a simple synchronous 32-bit binary counter using the TOGGLE (or T flip-flop) configuration.

The second design difference with the CY7C361 is that the Entering a State equation also shows all states (SA, SB, SC) that have an identical state path to S0. This is not necessarily the case when designing with traditional PLDs (as shown in Figure 6). The CY7C361, however, requires a machine definition in which all state paths into any given state are identical. You can easily convert an existing state diagram to satisfy the new condition by adding states for those states that do not meet it. To remain consistent with the naming conventions already defined for the clock generator example, two additional suffixes, "X" and "Y", indicate the additional states.

For example, state BW in Figure 6 has two state paths entering it: WAIT * WEN from state 123A, and a path from state AW. To meet the design conditions for the CY7C361, you add an additional state, BWX, such that state AW enters BWX with a state path of 1, and state 123A enters state BW with a state path of WAIT * WEN. The CY7C361 implementation of the clock generator state machine appears in Figure 8. Note that both of the new states (BW and BWX) have exits with the same state path equation. Thus, the number of states in the state machine does not grow geometrically due to this new methodology.

In addition to the normal Entering a State equation, the CY7C361 supports operations in which multiple state paths go from one state to another, and each state path term contains only one input. Figure 9 shows a diagram of this condition.

Another operation, called Leaving a State, proves especially useful in conjunction with the (Wait Until) TERMINATE state macrocell configuration in the CY7C361. The flip-flop in this macrocell configuration is unconditionally set by the active previous state macrocell. The flip-flop remains set until the condition decoder equation (a Leaving a State equation) for the TERMINATE macrocell goes active. Figure 10 shows how the TERMINATE configuration looks within a state diagram.

When implementing the clock generator state machine in the CY7C361 using the conversion techniques discussed above, the number of states slightly exceeds 32. But by allowing the machine's pipelined and non-pipelined portions to share common states (1A, 1B, and 3B), the total number of states reduces to less than 32. Note that you can use this same kind of state reduction for the original implementation (refer to the Unique States section). Figure 8 shows the resulting state diagram.

It is a simple matter to convert information from the state diagram to PLD ToolKit equations (refer to Appendix C for the PLD ToolKit source file). You must generate an Entering a State equation for every state node in the diagram. (The TERMINATE configuration was not used in this example, but it can be useful for implementing wait states.) You generate the equations in Appendix C using the [...] and [...] connectives for the AND and NAND terms, respectively. Then generate the system outputs by ORing the appropriate states in the OR-based output array. For example, the CLK_1B output is active during the 1B, 12B, 123B, or 123BX states. The PLD ToolKit connective for the OR array is [...]. The CY7C361 implementation of the clock generator state machine was simulated using the PLD ToolKit (see Appendix D).

Reference

1. Donald D. Givone, Introduction to Switching Circuit Theory (New York: McGraw-Hill, Inc., 1970).

Figure 8.
CY7C361 Implementation, CPU Clock State Machine adjacent .>state / macrocells ENTERING A STATE E Q. I 2 (SA) (a+/b+c) Figure 9. Entering a State Along Multiple Paths Figure 10. Leaving a State (TERMINATE Configuration) 6-184 ~ ~~ State Machine Design Considerations and Methodologies' ~;r~~OID~OR ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix A. LOG/iC PLD Source Code: Clock State Machine LOG/iC-PAL ReI 3.212-2328-1721100034 # 32-5955 90/03/15 23:49:45 LOG/iC - COPYRIGHT (C) 1985,1988 BY ISDATA GMBH, 7500 KARLSRUHE WEST-GERMANY Cypress Semiconductor LICENCE FOR IBM-PC/XT/AT Data Set: OD20G 10.DCB 1 2 3 4 5 6 7 8 9 10 I 11 12 I 13 I 14 I 15 I 16 I 17 I 18 I 19 I 20 I 21 I 221 23 24 I 25 26 I 27 I 28 I 29 I 30 I 31 I 32 I 33 I 34 I 35 I 36 I 37 I 38 I 39 I 40 41 I 42 43 I 1: *IDENTIFICATION 2: PIPELINED CLOCKING SYSTEM OD2OO 10 ·317/90 3: ERIC B. ROSS 4: CYPRESS SEMICONDUCTOR 5: NAMING CONVENTION 6: OD = SYSTEM OUTPUTS ARE DFLOPS AND ARE USED FOR STATE DEF 7: 20010 = PALC2OOI0 IMPLEMENTATION 8: *PAL 9: TYPE= PALC2OOI0 10: 11: *X-NAMES 12: ;---------------------------------------------------------------------13: ;INPUT DEFINITIONS: 14:; RUN = START & STOP EXECUTION OF OUTPUT CLOCKS (NORMAL, SINGLE 15:; STEP, & BREAK PT. EXECUTION 16:; NPL = PIPELINED VS NON-PIPELINED MODE OF EXECUTION 17:; INTR = EXTERNAL INTERRUPT CONDITION (TLB MISS, PARITY ERROR, ... 
) 18:; lEN = INTERRUPT ENABLE 19: ; WAIT = WAIT ENABLE (CACHE MISS) 20: ; WEN = WAIT ENABLE 21: ;---------------------------------------------------------------------22:; 23: RUN, NPL, INTR, lEN, WAIT, WEN, RESET; 24: 25: *Z-NAMES 26: ;---------------------------------------------------------------------27: ;OUTPUT DEFINITIONS: 28:; 29: ; 3 CLOCK STAGES 1, 2, 3 30:; 2 CLOCKS PER STATE A, B 31:; CLK XX WHERE XX = lA,lB,2A,2B,3A,3B 32: ; 33:; 2 FREE RUNNING CLOCKS 34:; CLK A, CLK B 35:; 36:; ADDITIONAL REGISTERS FOR STATE DEFINITION 37:; QQl, QQ2 38: ;---------------------------------------------------------------------39:; 40: CLK lA, CLK lB, CLK 2A, CLK 2B, CLK 3A, CLK 3B, CLK A, CLK B, QQ 1, QQ2; 41: 42: *Z-VALUES 43: 6-185 ~ State Machine Design Considerations and Methodologies ~~~OR~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) 44 I 45 I 461 47 I 48 I 49 I 501 51 I 521 53 I 54 55 56 57 I 58 59 60 61 62 63 64 65 66 67 68 69 I 70 71 72 73 74 75 76 77 78 I 79 80 81 82 83 I 84 85 86 87 88 89 90 91 92 93 94 95 I 44: ; ADDITIONAL OUTPUTS 45: ; SYSTEM OUTPUTS FOR STATE DEFINITION 46:; 47:; 48: ; CCCCCCCC QQ LLLLLLLL QQ 49: ; KKKKKKKK 12 50:; 51:; 112233AB ABABAB 52:; 53: ; INIT COMMON STATES 54: SI = 0 0 0 0 0 0 0 0 ; SA - INACTIVE 55: S2 = 0 0 0 0 0 0 1 0 0MODE STATES ; SB 56: S3 = 0 0 0 0 0 0 0 1 057: ;SIA PIPELINE STATES 58: S4 = 1 0 0 0 0 0 1 0 - 0 ; SIB 59: S5 = 0 1 0 0 0 0 0 1 - 0 ; S12A 60: S6 = 10100010 ;SI2B 61: S7 = 010-10001 62: S8 = 10101010 ; S123A ; S123B 63: S9 = 0 1 0 1 0 1 0 1 64: SIO = 000 1 0 1 0 1 ; S23B ;S3A 65: Sl1 = 0 0 0 0 1 0 1 0 - 0 66: S12 = 0 0 0 0 0 1 0 1 - 0 ; S3B ; SAW 67: S13 = 000000 1 0 1 0 ;SBW 68: S14 = 00000001 10 69: 70: S15 = 10000010 -1 ; SIAN NON-PIPLINE ;SIBN 71: S16 = 01000001 -1 ;S2AN 72: S17 = 00100010 ;S2BN 73: S18 = 00010001 ;S3AN 74: S19 = 00001010 -1 ;S3BN 75: S20 = 00000101 -1 76: S21 = 00000010 11 ; SAWN ;SBWN 77: S22 = 0000000111 78: 79: *STRING 80: INIT = 1 COMMON 
STATES -INACTIVE MODE 81: SA = 2 STATES 82: SB = 3 83: ; PIPELINE STATES 84: SIA = 4 85: SIB = 5 86: S12A = 6 87: S12B = 7 88: S123A = 8 89: S123B = 9 90: S23B = 10 91: S3A = 11 92: S3B = 12 93: SAW 13 94: SBW = 14 95: 6-186 ~ =:rCYPRESS aas, State Machine Design Considerations and Methodologies ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) 96 96: SIAN 15 ; NON-PIPLINE 97 97: SlBN = 16 98 98: S2AN = 17 99 99: S2BN = 18 100 100: S3AN = 19 101 101: S3BN = 20 102 102: SAWN = 21 103 103: SBWN = 22 104 104: LASTSTATE = 22; 105 1105: 106 106: *FLOW-TABLE 107 1107: ; 108 I 108: ;----------------------------------------------------------------------109 I 109: ;RESET STATE 110 I 110: ;ALL STATES MUST RESET TO TIm INITIAL STATE (ALL OUTPUTS REGISTERS 0) UPON 111 I 111: ;AN ACTIVE RESET INPUT. SINCE TIm 20G1O HAS NO GLOBAL OR INDIVIDUAL 112 I 112: ;RESETS TO THE OUTPUT REGISTERS, RESET TO INITIAL STATE MUST BE EMBEDDED 113 I 113: ;INTO THE STATE MACHINE 114 1114: ; 115 115: RELEVENT = RESET ; 116 116: S[1 .. 'LASTSTATE'], X l , F 'INIT' ;ALL STATE INIT UPON RESET 117 138: RELEVENT = RESET = 0 118 1139: ; 119 I 140: ;----------------------------------------------------------------------120 I 141: ;INACTIVE MODE STATES 121 142: RELEVANT = RUN, NPL , 122 143: S'INIT' ,X - ,F 'SA' ;INITIAL STATE AFTER RESET 123 1144: ,F'SB' ;INACTIVE MODE STATE, ONLY 124 145: S 'SA' , X - 125 1146: 126 147: S 'SB' , X 0 , F 'SA' ;FREE RUN CLKS A & B ARE ACTIVE 127 148: Xl 0 , F 'SlA' ; PIPELINE VS. 128 149: XII , F 'SIAN' ; NON-PIPELINE DECISION 129 1150: 130 I 151: ;----------------------------------------------------------------------131 I 152: ;PIPELINE MODE STATES 132 1153: 133 154: RELEVANT = INTR, lEN ;*PRIMING THE PIPELINE * 134 155: S 'SlA' ,X - ,F 'SIB' 135 1156: ,F'S12A' ; 136 157: S 'SlB' , X - 137 1158: ,F'S12B' ; 138 159: S 'S12A' ,X - 139 1160: , F 'SlA' ; INTERRUPT CONDITION? 
YES 140 161: S 'S12B' ,X 11 , F 'S123A' ; NO 141 162: Xl 0 142 163: X 0, F 'S123A' ; NO 143 1164: 144 165: RELEVANT = RUN, INTR, lEN, WAIT, WEN; *FULL PIPELINE * 145 166: S 'S123A' ,X - - - 1 1 , F 'SBW' ; WAIT CONDITION 146 167: X 0 - - 0 - , F 'S23B' ; IRUN COND., EMPTY PIPELINE 147 168: X 0 - - 10, F 'S23B' ; IRUN COND., EMPTY PIPELINE 148 169: X 1- - 0 -, F 'S123B' ; RUN CONDITION 149 170: X 1 - - 1 0, F 'S123B' ; RUN CONDITION 150 1171: 6-187 State ,Machine Design Considerations and Methodologies Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) 151 172: S 'S123B' ,X - 1 1 - - , F 'SlA' ; INTERUPT CONDITION 152 173: X - 0 - - - , F 'S123A' ; RUN CONDITION 153 174: X - 10- - , F 'S123A' ; RUN CONDITION 154 1175: 155 176: RELEVANT = RUN ; *EMPTY PIPELINE * 156 177: S 'S23B' ,X ,F 'S3A' 157 1178: 158 179: S 'S3A' ,X ,F 'S3B' 159 1180: 160 181: S'S3B' ,X ,F 'SA' ; BACK TO INACTIVE STATE 161 I 182: 162 183: RELEVANT = WAIT ; *PIPELINE WAIT STATES* 163 184: S 'SBW' ,X 1 ,F 'SAW' ; WAIT 164 185: X0 , F 'S123A'; /WAIT 165 1186: 166 187: S 'SAW' , X ,F 'SBW' ; 167 1188: 168 I 189: ;----------------------------------------------------------------------169 I 190: ;NON-PIPELINE MODE STATES 170 1191: 171 192: S 'SIAN' ,X , F 'SlBN' ; 172 1193: 173 194: S 'SlBN' ,X ,F 'S2AN' ; 174 1195: 175 196: RELEVANT = WAIT, WEN ; 176 197: S 'S2AN' ,X 11 ,F 'SBWN' ; WAIT CONDITION 177 198: X0,F 'S2BN' ; /WAIT CONDITION 178 199: Xl 0 ,F 'S2BN' ; /WAIT CONDITION 179 1200: 180 201: RELEVANT = INTR, lEN , 181 202: S 'S2BN' ,X 11 ,F 'SIAN' ; INTERRUPT CONDITION 182 203: X0, F 'S3AN' ; /INTERRUPT CONDITION 183 204: X 10 ,F 'S3AN' ; /INTERRUPT CONDITION 184 1205: 185 206: RELEVANT = RUN 186 207: S 'S3AN' ,X , F 'S3BN' ; 187 1208: 188 209: S 'S3BN' ,X 1 ' ,F 'SIAN' ; 189 210: X0 ,F 'SA' ; BACK TO INACTIVE STATE 190 I 211: 191 212: RELEVANT = WAIT ;*NON-PIPELINED WAIT STATES* 192 213: S 'SBWN' ,X 1 , F 'SAWN' ; REMAIN IN WAIT 193 214: X0 , F 'S2AN' ; END OF WAIT CONDITION 194 
1215: 195 216: S 'SAWN' ,X , F 'SBWN' ; REMAIN IN WAIT 196 I 217: 197 218: *STATE-ASSIGNMENT 198 219: Z-VALUES 199 I 220: 2001221: 201 222: *PIN 202 223: STATECLK = 1, RUN = 2, NPL = 3, INTR = 4, lEN = .5, WAIT = 6, WEN = 7, 203 223: RESET = 8, CLK 1A = 14, CLK IB = 15, CLK 2A = 16, CLK 2B = 17, 204 223: CLK 3A = 18, CLK 3B = 19, CLK A = 20, CLK B = 21, QQ 1-== 22, QQ2 = 23; 205 1224: - 6-188 State Machine Design Considerations and Methodologies Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) 206 207 208 209 210 225: 226: 227: 228: 229: *RUN-CONTROL LISTING= LONG,SYMBOL-TABLE,EQUATIONS,PINOUT; PROGFORMAT= L-EQUATIONS OPTIMIAZATION= P-TERMS; *END LOG/IC SYMBOL TABLE SYMBOL TYPE REG LEVEL PIN/NODE LOCAL - HIGH GND VCC LOCAL - HIGH X-VARIABLE 2 RUN - HIGH NPL X-VARIABLE - HIGH 3 INTR X-VARIABLE - HIGH 4 IEN X-VARIABLE - HIGH 5 WAIT X-VARIABLE - HIGH 6 WEN X-VARIABLE - HIGH 7 RESET X-VARIABLE - HIGH 8 14 CLK 1A X-VARIABLE - HIGH CLK-lB X-VARIABLE 15 - HIGH CLK-2A X-VARIABLE 16 - HIGH CLK-2B X-VARIABLE - HIGH 17 CLK-3A X-VARIABLE - HIGH 18 CLK-3B X-VARIABLE - HIGH 19 CLK-A X-VARIABLE - HIGH 20 CLK-B X-VARIABLE - HIGH 21 QQ1 X-VARIABLE 22 - HIGH QQ2 X-VARIABLE 23 - HIGH CLK 1A.D Z-VARIABLE DFF HIGH 14 CLK-lB.D Z-VARIABLE DFF HIGH 15 CLK-2A.D Z-VARIABLE DFF HIGH 16 CLK-2B.D Z-VARIABLE DFF HIGH 17 CLK-3A.D Z-VARIABLE DFF HIGH 18 CLK-3B.D Z-VARIABLE DFF HIGH 19 CLK-A.D Z-VARIABLE DFF HIGH 20 CLK-B.D Z-VARIABLE DFF HIGH 21 Z-VARIABLE DFF HIGH 22 QQCD QQ2.D Z-VARIABLE DFF HIGH 23 EXPANDED FUNCTION TABLE (INCLUDING LOCAL VARIABLES): : CCCCCC : LLLLLLCC CCC CCC : KKKK KKLL RLLL LLLC C : KK QQ I W EKKK KKKL L :ll2233 QQ GVRN NIA W S K KQQ : ABAB ABAB 12 NCUP TEIE E 112233 QQ : ......... . DCNL RNTN TABA B"ABA B12 : DDDD DDDD DD 6-189 State Machine Design Considerations and Methodologies Appendix A. 
LOG/iC PLD Source Code: Clock State Machine (Continued) ---- ---- 1000 0000 0-- : 0000 0000 --; ---- ---- 0000 0000 0-- : 0000 0010 0-; ---- ---- 1000 000100- : 0000 0000 --; ---- ---- 0000 0001 00- : 0000 0001 0-; ---- ---- 1000 0000 10- : 0000 0000 --; --0- ---- 0000 0000 10- : 0000 0010 0-; --10 ---- 0000 0000 10- : 1000 0010 -0; --11 ---- 0000 0000 10- : 1000 0010 -1; ---- ---- 1100 0001 0-0 : 0000 0000 --; ---- ---- 0100 0001 0-0 : 0100 0001 -0; ---- ---- 1010 0000 1-0 : 0000 0000 --; ---- ---- 0010 0000 1-0 : 1010 0010 --; ---- ---- 1101 0001 0-- : 0000 0000 --; ---- ---- 0101 0001 0-- : 0101 0001 --; ---- ---- 1010 1000 1-- : 0000 0000 --; ---- 11-- 00101000 1-- : 1000 0010 -0; ---- 10-- 0010 1000 1-- : 1010 1010 --; ---- 0--- 0010 1000 1-- : 1010 1010 --; ---- ---- 1101 0101 0-- : 0000 0000 --; ---- --11 0101 0101 0-- : 0000 0001 10; --0- --0- 0101 0101 0-- : 0001 0101 --; --0- --10 0101 0101 0-- : 00010101 --; --1- --0- 0101 0101 0-- : 01010101 --; --1- --10 010101010-- : 01010101 --; ---- ---- 1010 1010 1-- : 0000 0000 --; ---- 11-- 0010 1010 1-- : 1000 0010 -0; ---- 0--- 0010 1010 1-- : 1010 1010 --; ---- 10-- 0010 1010 1-- : 1010 1010 --; ---- ---- 1000 1010 1-- : 0000 0000 --; ---- ---- 0000 1010 1-- : 0000 1010 -0; ---- ---- 1000 01010-0 : 0000 0000 --; ---- ---- 0000 0101 0-0 : 0000 0101 -0; ---- ---- 1000 0010 1-0 : 0000 0000 --; ---- ---- 0000 0010 1-0 : 0000 0010 0-; ---- ---- 1000 0001 010 : 0000 0000 --; ---- ---- 0000 0001 010 : 0000 0001 10; ---- ---- 1000 0000 110 : 0000 0000 --; ---- --1- 0000 0000 110 : 0000 001010; ---- --0- 0000 0000 110 : 1010 1010 --; ---- ---- 1100 0001 0-1 : 0000 0000 --; ---- ---- 0100 0001 0-1 : 0100 0001 -1; ---- ---- 101000001-1 : 0000 0000 --; ---- ---- 00100000 1-1 : 0010 0010 --; ---- ---- 10010001 0-- : 0000 0000 --; ---- --11 0001 0001 0-- : 0000 0001 11; ---- --0- 0001 0001 0-- : 0001 0001 --; ---- --10 0001 0001 0-- : 0001 0001 --; ---- ---- 1000 1000 1-- : 0000 0000 --; ---- 11-- 0000 1000 1-- : 1000 0010 
-1; ---- 0--- 0000 1000 1-- : 0000 1010 -1; ---- 10-- 0000 1000 1-- : 0000 1010 -1; ---- ---- 1000 0101 0-1 : 0000 0000 --; ---- ---- 0000 01010-1 : 0000 0101 -1; ---- ---- 1000 0010 1-1 : 0000 0000 --; --1- ---- 0000 0010 1-1 : 1000 0010 -1; 11 116 2/ 143 31 117 41 145 51 118 6/ 147 71 148 81 149 91 119 101 155 111 120 12/ 157 131 121 141 159 151 122 16/ 161 171 162 181 163 191 123 201 166 211 167 22/ 168 231 169 241 170 251 124 26/ 172 271 173 281 174 291 125 301 177 311 126 32/ 179 331 127 341 181 351 128 36/ 187 371 129 38/ 184 39/ 185 401 130 41/ 192 42/ 131 43/ 194 441 132 45/ 197 46/ 198 471 199 48/ 133 49/ 202 501 203 51/ 204 5V 134 53/ 207 54/ 135 55/ 209 6-190 State Machine Design Considerations and Methodologies Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) --0- ---- 0000 0010 1-1 : 0000 0610 0-; ---- ---- 1000 0001 011 : 0000 0000 --; ---- ---- 0000 0001 011 : 0000 000111; ---- ---- 1000 0000 111 : 0000 0000 --; ---- --1- 0000 0000 111 : 0000 001011; ---- --0- 0000 0000 111 : 0010 0010 --; REST : ---- ---- --; 62 1234 5678 9012 3456 789 56/ 210 571 136 581 216 591 137 601 213 61/ 214 1234 5678 90 STATE ASSIGNMENT: CCCC CC LLLLLLCC KKKKKKLL KKQQ 112233 QQ ABAB ABAB 12 0000 0000 --; 1 0000 0010 0-; 2 0000 00010-; 3 1000 0010 -0; 4 0100 0001 -0; 5 1010 0010 --; 6 0101 0001 --; 7 1010 1010 --; 8 0101 0101 --; 9 0001 0101 --; 10 0000 1010 -0; 11 0000 0101 -0; 12 0000 0010 10; 13 0000 0001 10; 14 1000 0010 -1; 15 0100 0001 -1; 16 0010 0010 --; 17 0001 0001 --; 18 0000 1010 -1; 19 0000 0101 -1; 20 0000 0010 11; 21 0000 0001 11; 22 EXPANDED FUNCTION TABLE (LOCAL VARIABLES REMOVED): : CCCCCC : LLLLLLCC C CCCC C : KKKK KKLL RL LLLL LCC KK QQ I W EK KKKK KLL :-IT22 33 QQ RNNI AWS KKQ Q : ABAB ABAB 12 UPTE lEE 1-1223 3- Q Q : ......... . NLRN TNTA BABA BAB1 2 : DDDD DDDD DD 6-191 State Machine Design Considerations and Methodologies Appendix A. 
LOG/iC PLD Source Code: Clock State Machine (Continued)

---- --10 0000 000- - : 0000 0000 -- ;  1/ 116
---- --00 0000 000- - : 0000 0010 0- ;  2/ 143
---- --10 0000 0100 - : 0000 0000 -- ;  3/ 117
---- --00 0000 0100 - : 0000 0001 0- ;  4/ 145
---- --10 0000 0010 - : 0000 0000 -- ;  5/ 118
0--- --00 0000 0010 - : 0000 0010 0- ;  6/ 147
10-- --00 0000 0010 - : 1000 0010 -0 ;  7/ 148
11-- --00 0000 0010 - : 1000 0010 -1 ;  8/ 149
---- --11 0000 010- 0 : 0000 0000 -- ;  9/ 119
---- --01 0000 010- 0 : 0100 0001 -0 ; 10/ 155
---- --10 1000 001- 0 : 0000 0000 -- ; 11/ 120
---- --00 1000 001- 0 : 1010 0010 -- ; 12/ 157
---- --11 0100 010- - : 0000 0000 -- ; 13/ 121
---- --01 0100 010- - : 0101 0001 -- ; 14/ 159
---- --10 1010 001- - : 0000 0000 -- ; 15/ 122
--11 --00 1010 001- - : 1000 0010 -0 ; 16/ 161
--10 --00 1010 001- - : 1010 1010 -- ; 17/ 162
--0- --00 1010 001- - : 1010 1010 -- ; 18/ 163
---- --11 0101 010- - : 0000 0000 -- ; 19/ 123
---- 1101 0101 010- - : 0000 0001 10 ; 20/ 166
0--- 0-01 0101 010- - : 0001 0101 -- ; 21/ 167
0--- 1001 0101 010- - : 0001 0101 -- ; 22/ 168
1--- 0-01 0101 010- - : 0101 0101 -- ; 23/ 169
1--- 1001 0101 010- - : 0101 0101 -- ; 24/ 170
---- --10 1010 101- - : 0000 0000 -- ; 25/ 124
--11 --00 1010 101- - : 1000 0010 -0 ; 26/ 172
--0- --00 1010 101- - : 1010 1010 -- ; 27/ 173
--10 --00 1010 101- - : 1010 1010 -- ; 28/ 174
---- --10 0010 101- - : 0000 0000 -- ; 29/ 125
---- --00 0010 101- - : 0000 1010 -0 ; 30/ 177
---- --10 0001 010- 0 : 0000 0000 -- ; 31/ 126
---- --00 0001 010- 0 : 0000 0101 -0 ; 32/ 179
---- --10 0000 101- 0 : 0000 0000 -- ; 33/ 127
---- --00 0000 101- 0 : 0000 0010 0- ; 34/ 181
---- --10 0000 0101 0 : 0000 0000 -- ; 35/ 128
---- --00 0000 0101 0 : 0000 0001 10 ; 36/ 187
---- --10 0000 0011 0 : 0000 0000 -- ; 37/ 129
---- 1-00 0000 0011 0 : 0000 0010 10 ; 38/ 184
---- 0-00 0000 0011 0 : 1010 1010 -- ; 39/ 185
---- --11 0000 010- 1 : 0000 0000 -- ; 40/ 130

EXPANDED FUNCTION TABLE (LOCAL VARIABLES REMOVED) - continued:

---- --01 0000 010- 1 : 0100 0001 -1 ; 41/ 192
---- --10 1000 001- 1 : 0000 0000 -- ; 42/ 131
---- --00 1000 001- 1 : 0010 0010 -- ; 43/ 194
---- --10 0100 010- - : 0000 0000 -- ; 44/ 132
---- 1100 0100 010- - : 0000 0001 11 ; 45/ 197
---- 0-00 0100 010- - : 0001 0001 -- ; 46/ 198
---- 1000 0100 010- - : 0001 0001 -- ; 47/ 199
---- --10 0010 001- - : 0000 0000 -- ; 48/ 133
--11 --00 0010 001- - : 1000 0010 -1 ; 49/ 202
--0- --00 0010 001- - : 0000 1010 -1 ; 50/ 203
--10 --00 0010 001- - : 0000 1010 -1 ; 51/ 204
---- --10 0001 010- 1 : 0000 0000 -- ; 52/ 134
---- --00 0001 010- 1 : 0000 0101 -1 ; 53/ 207
---- --10 0000 101- 1 : 0000 0000 -- ; 54/ 135
1--- --00 0000 101- 1 : 1000 0010 -1 ; 55/ 209
0--- --00 0000 101- 1 : 0000 0010 0- ; 56/ 210
---- --10 0000 0101 1 : 0000 0000 -- ; 57/ 136
---- --00 0000 0101 1 : 0000 0001 11 ; 58/ 216
---- --10 0000 0011 1 : 0000 0000 -- ; 59/ 137
---- 1-00 0000 0011 1 : 0000 0010 11 ; 60/ 213
---- 0-00 0000 0011 1 : 0010 0010 -- ; 61/ 214
REST                  : ---- ---- -- ; 62
1234 5678 9012 3456 7 : 1234 5678 90

PIPELINED CLOCKING SYSTEM OD2OG10 3/7/90
ERIC B. ROSS CYPRESS SEMICONDUCTOR
90/03/15 23:49:45

****************************************************
***  NET DESCRIPTION TABLE FOR AND/OR STRUCTURE  ***
****************************************************

Input columns 1-17:  RUN NPL INTR IEN WAIT WEN RESET CLK_1A CLK_1B CLK_2A CLK_2B CLK_3A CLK_3B CLK_A CLK_B QQ1 QQ2
Output columns 1-10: CLK_1A CLK_1B CLK_2A CLK_2B CLK_3A CLK_3B CLK_A CLK_B QQ1 QQ2

INV                   : .... .... ..
REG                   : DDDD DDDD DD
---- 0-0- --0- 0-11 0 : A... .... .. ; 1
1--- --0- --0- 1--- 1 : A... .... .. ; 2
---- --0- 1--- ---- 0 : A... .... .. ; 3
---- --0- 1-1- ---- - : A... .... .. ; 4
--11 --0- --1- 0--- - : A... .... .. ; 5
1--- --0- 0-0- 0-10 - : A... .... .. ; 6
---- --01 ---0 ---- - : .A.. .... .. ; 7
1--- -001 ---- ---- - : .A.. .... .. ; 8
1--- 0-01 ---- ---- - : .A.. .... .. ; 9
---0 --0- 1--- ---- - : ..A. .... .. ; 10
--0- --0- 1--- ---- - : ..A. .... ..
; 11 ---- 0-0- --0- 0-11 - : ..A ....... ; 12 ---- --0- 1-0- ---- - : ..A. ...... ; 13 ---- 0-0- -1-- ---- - : ... A ...... ; 14 ---- -00- -1-- ---- - : ...A ...... ; 15 ---- --01 -1-0 ---- - : ...A ...... ; 16 ---0 --0- --1- ---- - : .... A ..... ; 17 --0- --0- --1- ---- - : .... A ..... ; 18 ---- --0- 0-1- 1--- - : .... A ..... ; 19 6-193 ~ State Machine Design Considerations and Methodologies ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) ---- 0-0- 0-0- 0-11 0 : .... A ..... ; 20 ---- 0-0- ---1 ---- - : .....A .... ; 21 ---- -00- ---1 ---- - : .....A .... ; 22 ---- --0- -0-1 ---- - : .....A .... ; 23 ---- --0- ---- -0-- - : ......A. ..; 24 ---- --0- ---- -1-- - : .......A .. ; 25 ---- ---- ---- -1-1 - : ........ A.; 26 ---- ---- ---- 0-11 - : ........ A.; 27 ---- ---- -1-- ---- - : ........ A.; 28 ---- ---- --0- 1--- - : .........A; 29 ---- ---- 0-1- 0--- - : .........A; 30 ---- ---0 -1-- ---- - : .........A; 31 ---- ---- -0-- -1-- 1 : .........A; 32 -1-- ---00--00--0 - : .........A; 33 ---- ---- 00-- 0--1 1 : .... .... .A ; 34 1234 5678 9012 3456 7 : 1234 5678 90 PIPELINED CLOCKING SYSTEM OD2oo 10 3/7/90 ERIC B.ROSS CYPRESS SEMICONDUCTOR 90103/15 23:49:45 **************************************************** *** BOOLEAN EQU A TIONS *** **************************************************** CLK lA.D '- IWAIT & /RESET & ICLK 2B & ICLK_3B & CLK B & QQl &/QQ2 + RUN & IRESET & ICLK 2B & CLK_3B & QQ2 + /RESET & CLK lB & iQQ2 + /RESET & CLK-IB & CLK 2B + INTR & lEN -& IRESET & eLK 2B & ICLK 3B + RUN & /RESET & ICLK lB &-/CLK 2B &-/CLK 3B & CLK_B &/QQl ; CLK lB.D '- IRESET & CLK lA & ICLK 3A + RUN & lWEN & IRESET &. 
CLK lA + RUN & IWAIT & IRESET & CLK')A CLK 2A.D '- lIEN & /RESET & CLK lB + IINTR & /RESET & cLk lB + IWAIT & IRESET & ICLK-2B & ICLK_3B & QQl + /RESET & CLK_IB & ICLK_2B CLK 2B.D '- IWAIT & IRE SET & CLK 2A + lWEN & /RESET & CLK 2A + /RESET & CLK_IA & CLK_2A 6-194 S;~ State Machine Design Considerations and Methodologies ~~ ~~OID~OR~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) CLK 3A.D := & IRESET & CLK 2B + IINTR & IRESET & CLK 2B + IRESET & ICLK lB & -CLK 2B & CLK 3B + IWAIT & IRESE-T & ICLK lB - & ICLK 2B - & ICLK_3B & CLK_B & QQ1 & !QQ2 - - lIEN CLK 3B.D .- /WAIT & IRESET & CLK 3A + lWEN & IRESET & CLK 3A + /RESET & ICLK_2A & CLK_3A CLK A.D '- IRESET & ICLK_A CLK B.D '- IRESET & CLK_A ; QQl.D := CLK A & QQ1 + ICLK 3B& CLK B & QQ1 + CLK=2A & CLK 3B QQ2.D := ICLK 2B + ICLK lB& CLK 2B& ICLK 3B + ICLK-1A & CLK-2A + ICLK-2A & CLK-A & QQ2 + NPL - & ICLK 1A - & ICLK lB & ICLK_3A & ICLK 3B & IQQ 1 + ICLK_lB - & ICLK_2A & ICLK_3B & QQ1 PIPELINED CLOCKING SYSTEM OD2OGlO 3/7/90 ERIC B. ROSS CYPRESS SEMICONDUCTOR 90/03/15 23:49:45 PALC2OG10 STATECLK 24 @VCC RUN 2 23 QQ2 NPL 3 22 QQ1 INTR 4 21 CLK_B lEN 5 20 CLK_A WAIT 6 19 CLK_3B WEN 7 18 CLK_3A RESET 8 17 CLK_2B 6-195 & QQ2 ~ ~~ . State Machine Design Considerations and Methodologies ~;r~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) @09 9 16 CLK_2A @10 10 15 CLK_lB @11 11 14 CLK_IA @GNO 12 13 @OE PIPE LINED CLOCKING SYSTEM 00200 10 3/7/90 ERIC B. ROSS CYPRESS SEMICONDUCTOR 90/03/15 23:49:45 S T A T I E @ NNRCVQQ TPULCQQ RLNKC21 432 1282726 5 lEN 6 WAIT 7 23 CLK 3B PALC2oo10 8 22 CLK_3A LCC WEN 9 21 CLK_2B RESET 10 11 20 CLK_2A 19 12 13 14 15 16 17 18 @@@@@CC 011GOLL 901NEKK D 1 1 AB PIPELINED CLOCKING SYSTEM 00200 10 3/7/90 ERIC B. ROSS CYPRESS SEMICONDUCTOR 90/03/15 23:49:45 6-196 ....::=-.... 
%;~RFSS -=:::!!!!I!!f" State Machine Design Considerations and Methodologies =============================;;;;;; SEMlCONDOCTOR Appendix A. LOG/iC PLD Source Code: Clock State Machine (Continued) S T A T E NRC P U L L N K @ V Q C Q C 2 432 1282726 INTR 5 25 QQ1 lEN 6 WAIT 7 23 CLK A PALC2OGlO WEN 8 22 CLK 3B PLCC RESET 9 21 CLK_3A @09 10 20 CLK_2B 19 CLK 2A 11 12 13 14 15 16 17 18 @@@@CC 11GOLL 01NEKK D LOG/iC - PAL CPU TIME USED: 45 SEC 6-197 State Machine Design Considerations and Methodologies Appendix B. LOG/iC Simulation: Clock State Machine PIPELINED CLOCKING SYSTEM OD200 10 317190 CCCC ES R E vt W eaR N N I A W S n t U PTE l E E teN L R N # # CC LLLL LLCC KKKK KKLL TNT K K C C 2- 2- 3- 3A B A B A -B -A B 0-10-10-10-1 0-10-10-1: 0-1 0-1 0-1 0-1 0-10-10-10-1 0 Top of trace buffer 1 lIU:: 1 1 IC : 1 lIU: 2 lIU: 2 1 IC: 2 lIU: 3 lIU: 3 lIC: 3 2IU: 4 2IU: 4 2IC: 4 3IU: 5 3IU: 5 3 IC: 5 2IU: 6 2IU: 6 2IC: 6 3IU: 7 3IU: 7 3 IC: 7 4IU: 8 4IU: 8 4IC: 8 5IU: 9 5IU: 9 5IC: 9 6IU: 10 6IU: 10 6IC: 10 7IU: 11 7IU: 11 7IC: 11 8IU: 12 8IU: 12 8IC: 12 9IU: 13 9IU: 13 9IC: 13 8IU: 14 8IU: 14 8IC: 14 9IU: 6-198 State Machine Design Considerations and Methodologies Appendix B. 
LOG/iC Simulation: Clock State Machine (Continued) PIPE LINED CLOCKING SYSTEM OD2oo 10 3/7/90 CCCC CC ES R LLLL LLCC vt W ea RNNI n t U PTE teN L R N # 15 15 15 16 16 16 17 17 17 18 18 18 19 19 19 20 20 20 21 21 21 22 22 22 23 23 23 9IU: 9IC: 8IU: 8IU: 8IC: 10 IU : 10 IU : 10IC: 11 IU : 11 IU : 11 IC : 12IU: 12IU: 12IC: 2IU: 2IU: 2IC: 3IU: 3IU: 3 IC : 2IU: 2IU: 2IC: 3IU: 3IU: 3IC: 15IU: 15IU: 15 IC: 16IU : 16IU : 16IC: 17 IU : 17IU : 17IC: 18IU : 18IU : 18 IC: 19IU: 19IU: 19IC: 20IU: 20IU: 20IC: 15IU : 24 25 25 25 26 26 26 27 27 27 28 28 28 29 29 29 KKKK KKLL KK C C 2- 2- 3- 3A B A B A -B -A B 0-10-10-10-1 0-10-10-1: 0-1 0-1 0-1 0-1 0-10-10-10-1 0 # 24 24 E AWS lEE TNT 6-199 fir:~OR ======s;;;;;t;;;;;a;;;;;te=M=aC;;;;;h;;;;;i;;;;;D;;;;;e;;;;;D;;;;;es;;;;;·;;;;;ign=C=OD;;;;;s;;;;;i;;;;;d;;;;;er;;;;;a;;;;;t;;;;;io;;;;;D;;;;;S;;;;;a;;;;;D;;;;;d=M;;;;;e;;;;;t;;;;;h;;;;;Od;;;;;O;;;;;I;;;;;O;;;;;gI;;;;;·e;;;;;s=;;;;;;;;; Appendix B. LOG/iC Simulation: Clock State Machine (Continued) PIPELINED CLOCKING SYSTEM OD2OG 10 317/90 CCCC CC ES R LLLL LLCC vt W E KKKK KKLL eaR N N I A W S t U PTE teN L R N n # # lEE TNT CC A 2- 2- 3- 3- B A B K K A -B -A B 0-10-10-10-1 0-10-10-1: 0-1 0-1 0-1 0-1 0-10-10-10-1 0 30 15IU : 30 15IC: 30 16 IV : 31 16 IV : 31 16IC: 31 17 IV : 32 17 IV : 32 17 IC: 32 18 IV : 33 18 IV : 33 18IC: 33 19 IV : 34 19 IV : 34 19 IC: 34 20 IV : 35 20 IV: 35 20IC: 35 15 IV : 36 15IU : 36 15IC: 36 16 IV : 37 16 IV : 37 16IC: 37 17 IV : 38 17 IV : 3817IC: 38 18 IV : 39 18 IV : 39 18IC: 39 19IU: 40 19 IV : 40 19IC: 40 20 IV : 41 20IU : 41 20IC: 41 2 IV : 42 2IU: 42 2IC: 42 3IU: 43 3 IV: 43 3IC: 43 2 IV: 6-200 .-.. £. :;~RESS ~, State Machine Design Considerations and Methodologies ~COID~OR~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix B. 
LOG/iC Simulation: Clock State Machine (Continued) PIPELINED CLOCKING SYSTEM OD2OG 10 317/90 CCCC CC ES R LLLL LLCC vt W E KKKK KKLL eaR N N I A W S n t V PTE teN L R N # # lEE TNT C C 2- 2- 3- 3A B A B K K A -B -A B 0-10-10-10-1 0-10-10-1: 0-1 0-1 0-1 0-1 0-10-10-10-1 0 44 2IV: 44 2IC: 44 3IU: 45 3IU: 45 3IC: 45 4IU: 46 4IU: 46 4IC: 46 5IU: 47 5IU: 47 5IC: 47 6IU: 48 6IU: 48 6IC: 48 7IU: 49 7 IV : 49 7IC: 49 8IU: 50 8IU: 50 8IC: 50 9IU: 51 9IU: 51 9IC: 51 8 IV : 52 8 IU : 52 8IC: 52 9IU: 53 9IU: 53 9IC: 53 4IU: 544IU: 54 4IC: 54 5IU: 55 5IU: 55 5IC: 55 6IU: 56 6IV: 56 6IC: 56 7 IU : 57 7IU: 57 7IC: 57 8IU: 58 8IU: 58 8 IC: 58 9IU: 59 9IU: 59 9IC: 6-201 State Machine Design Considerations and Methodologies Appendix B. LOG/iC Simulation: Clock State Machine (Continued) PIPELINED CLOCKING SYSTEM OD2OG 10 3/7/90 CCCC CC E S R LLLL LLCC vt W E KKKK KKLL ea RNNI AWS KK n t U PTE l E E C C 2- 2- 3- 3t e N L R N TNT A B A B A -B -A B # # 0-1 0-1 0-1 0-1 0-1 0-1 0-1 : 0-1 0-1 0-1 0-1 0-1 0-1 0-1 0-1 0 59 8IU: 60 8IU: 60 8 IC : 60 9IU: 61 9IU: 61 9IC: 61 8IU: 62 8IU: 62 8IC: 62 14 I : 63 14 I : 63 14IC: 63 13 I : 64 13 I : 64 13IC: 64 14 I : 65 14 I : 65 14IC: 65 13 I : 66 13 I : 66 13IC: 66 14 I : 67 14 I : 67 14IC: 67 8IU: 68 8IU: 68 8IC: 68 9IU: 699IU: 69 9IC: 69 8IU: 70 8IU: 70 8 IC: 70 9IU: 71 9IU: 71 9IC: 71 1IU: 72 1 IU : 72 1 IC: 72 1IU: 73 lIU: 73 1 IC : 6-202 State Machine Design Considerations and Methodologies Appendix B. 
LOG/iC Simulation: Clock State Machine (Continued) PIPELINED CLOCKING SYSTEM OD2OG 10 317/90 CCCC CC E S R LLLL LLCC vt W E KKKK KKLL eaR N N I A W S U PTE teN L R N n t # # lEE TNT K K C C 2- 2- 3- 3A B A B A -B -A B 0-10-10-10-1 0-10-10-1: 0-1 0-1 0-1 0-1 0-10-10-10-1 0 73 1IU: 74 1 IU : 74 1 IC: 74 2IU: 75 2IU: 75 2IC: 75 3IU: 76 3IU: 76 3IC: 76 4IU: 77 4IU: 77 4 Ie: 77 5IU: 78 5IU: 78 5IC: 78 6IU: 79 6 IV: 79 6IC: 79 7IU: 80 7IU: 80 7 IC: 80 4IU: 81 4IU: 81 4IC: 81 5IU: 82 5IU: 82 5IC: 82 6IU: 83 6IU: 83 6IC: 83 7IU: 84 7IU: 84 7IC: 848IU: 85 8 IV : 85 8 IC : 85 9IU: 86 9IU: 86 9IC: 86 8IU: 87 8IU: 87 8IC: 87 9IU: 88 9IU: 88 9IC: 88 8IU: 89 8IU: 6-203 ~ State Machine Design Considerations and Methodologies ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix B. LOG/iC Simulation: Clock State Machine (Continued) PIPELINED CLOCKING SYSTEM OD2OG10 317/90 eccc CC ES R LLLL LLCC vt W E KKKK KKLL eaR N N I A W S V PTE teN L R N n t # # lEE TNT K K C C 2- 2- 3- 3A B A B A -B -A B 0-10-10-10-1 0-10-10-1: 0-1 0-1 0-1 0-1 0-10-10-10-1 0 89 8IC: 891IV: 901IV: 90 1 IC: 902IV: 91 2 IV : 91 2IC: 91 3 IV : 923IV: 92 3 IC: 92 15 IV : 93 15 IV : 93 15IC: 93 16 IV : 94 16 IV : 94 16IC: 94 17 IV : 95 17 IV: 95 17 IC: 95 18 IV : 96 18 IV : 96 18IC: 96 15 IV : 97 15 IV : 97 15 IC: 97 16 IV : 98 16 IV : 98 16IC: 98 17 IV : 99 17 IV: 99 17 IC: 99 221 : 100 22 I : 100 22IC : 100 21 I : 101 21 I : 101 21 IC : 101 22 I : 102 22 I : 102 22IC: 102 21 I : 6-204 State Machine Design Considerations and Methodologies Appendix B. 
LOG/iC Simulation: Clock State Machine (Continued) PIPELINED CLOCKING SYSTEM OD2OG 10 317/90 CCCC CC E S R LLLL LLCC vt W E KKKK KKLL ea RNNI AWS KK n t U PTE l E E C C 2- 2- 3- 3t e N L R N TNT A B A B A -B -A B # # 0-10-10-10-1 0-1 C-1 0-1 : 0-1 0-1 0-1 0-1 0-10-10-10-1 0 103 21 I : 103 21IC: 103 22 I : 104 22 I : 104 221C : 104 17IU : 105 171U: 105 171C: 105 18IU: 106 181U: 106 181C: 106 19 IU : 107 19IU: 107 191C: 107 20 IU : 108 20lU : 108 201C: 108 151U: 109 151U: 109 151C: 109 161U: 110 161U : 110 161C: 110 171U: 111 17 IU : 11117IC: 111 181U: 112 18IU: 112 18IC: 112 19IU : 113 191U: 113 191C: 113 20IU : 114 20IU : 114 20IC: 114 21U: 115 21U: 115 21C: 115 31U: 116 31U: 116 31C: 116 21U: 6-205 t;F;~ ~ --;;;;;=====s;;;;;;ta;;;;;;t;;;;;;e;;;;;;M=a;;;;;;C;;;;;;h;;;;;;iD;;;;;;e;;;;;;D;;;;;;es=ign=;;;;;;C;;;;;;O;;;;;;D;;;;;;Si;;;;;;d;;;;;;er;;;;;;a;;;;;;t;;;;;;iO;;;;;;D;;;;;;S;;;;;;a;;;;;;D;;;;;;d;;;;;;M=e;;;;;;th;;;;;;o;;;;;;d;;;;;;O;;;;;;IO;;;;;;gI;;;;;;·;;;;;;es== SEMICQIDUCTOR_ Appendix C. Cypress PLD ToolKit: CY7C361 Implementation CY7C361; {PIPELINED CLOCKING SYSTEM AN1 361 4/27/90 ERIC B. ROSS CYPRESS SEMICONDUCTOR} CONFIGURE; { --------------------------------------------------------------------------------------------------- ---------------------------- ;---------------_ ... _--------------------------------------------------------------------------------- ---------------------------- ; INPUT DEFINITIONS: ; RUN = START & STOP EXECUTION OF OUTPUT CLOCKS (NORMAL, SINGLE STEP, , & BREAK PT. EXECUTION ; NPL = PIPELINED VS NON-PIPELINED MODE OF EXECUTION ; INTR = EXTERNAL INTERUPT CONDITION (TLB MISS, PARITY ERROR, ...) 
; lEN = INTERRUPT ENABLE ; WAIT = WAIT ENABLE (CACHE MISS) ; WEN = WAIT ENABLE ; RPT_EO = USED TO DUB CLK_1B, CLK USED TO UPDATE THE EO REG ;OUTPUT DEFINITIONS: , ; 3 CLOCK STAGES 1,2,3 ; 2 CLOCKS PER STATE A, B CLK_XX WHERE XX = 1A,lB,2A,2B,3A,3B , ; 2 FREE RUNNING CLOCKS CLK_A, CLK_ B ;--------------------------------------------------------------------------------------... ------------ ......_----------------------} RUN(node= 3), STATECLK, NPL, INTR, IEN(node= 9), WAIT, WEN, RESET, IRPT_EO, {*INPUTS*} ICLK A(node= 16), ICLK B, ICLK lA, ICLK-2B(node= 24), ICLK IB(and~ ICLK 2A, /CLK 3A, ICLK)B, -- {LOCAL 8 FEEDBACK LOCAL 8 HALF 16 GLOBAL 32 FEEDBACK FEEDBACK FEEDBACK {*OUTPUTS*} *STATE MACROCELLS* START = DEFAULT} AX(node= 32), lA, A, lAX, B, 12A, BW, BWX, AW, 12B, AWN, BWN, 2AN, BWNX(node= 53),2ANX, 23B, 3A, 23BX, 3AN, 2BNX, 3ANX, 1B, 123A, {LOCAL 8 = 1, HALF = 1} 123AX, {LOCAL 8 = 2, HALF = 1} 123AY(node= 47), = 123B, 123BX, {LOCAL 8 = 1, HALF 2BN, 3BN, {LOCAL 8 = 2, HALF = 2} 6-206 2} State Machine Design Considerations and Methodologies Appendix C. Cypress PLD ToolKit: CY7C361 Implementation (Continued) {*MISC*} IENA(node= 29),IENB, GLBRST(node= 64), GND(NODE= 73), CLKDB(NODE= 74) EQUATIONS; {*MISC*} GLBRST = < prod> RESET; IENA = < INV_SUM> IGND; IENB = < INV_SUM> IGND; AX = < prod> IRESET; A {*STATE MACROCELLS} < prod> IRUN < invyrod> IB * 13BN; B < prod> dnvyrod> lAX * IlAX lB < prod> < invyrod> IlA * IlAX; lA < prod> INTR * IEN < invyrod> Il23B * Il23BX * Il2B * 12BN; lAX < prod> RUN < invyrod> IB * 13BN; l2A < prod> INPL < invyrod> 11B; l23A < prod> IINTR < invyrod> Il2B * Il23B * Il23BX; BW < prod> WAIT * WEN < invyrod> 1123A * 1123AX * Il23A Y; AW < prod> WAIT < invyrod> IBW * IBWX; l2B = l23AX= < prod> 12A; < prod> INTR * lIEN < invyrod> 112B * 1123B * Il23BX; BWX = < prod> AW; l23A Y= < prod> IWAIT < invyrod> IBW * IBWX; AWN = < prod> WAIT < invyrod> IBWN * IBWNX; 6-207 State Machine Design Considerations and Methodologies AppendixC. 
Cypress PLD ToolKit: CY7C361 Implementation (Continued) BWN < prod> WAIT * WEN < inv""prod> 12AN; 2AN < prod> NPL < inv""prod> IlB; 123B = < prod> RUN * IWAIT < inv""prod> 1123A * 1123AX * 1123AY; BWNX = < prod> AWN; 2ANX = < prod> IWAIT < inv""prod> IBWN * IBWNX; 123BX = < prod> RUN * WAIT * lWEN < inv""prod> 1123A * 1123AX * 1123AY; < prod> IR UN * WAIT * lWEN < inv""prod> 1123A * 1123AX * 1123AY; 23B 23BX = < prod> IR UN * IWAIT < inv""prod> 1123A * 1123AX 2BNX = < prod> WAIT * lWEN < inv""prod> 12AN * 12ANX; 2BN < prod> IWAIT < invyrod> 12AN * 12ANX; 3A < prod> < invyrod> 123B * 123BX; 3AN = < prod> IINTR < invyrod> 12BN 3ANX < prod> INTR * lIEN < inv""prod> 12BN * 12BNX; 3BN = * 1123AY; < prod> < invyrod> 13A * 12BNX; * 13AN * 13ANX; 6-208 State Machine Design Considerations and Methodologies Appendix C. Cypress PLD ToolKit: CY7C361 Implementation (Continued) CLK_A = < iny sum> IA * lAX * 11A, * 11AX * {*OUTPUTS*} - 112A * 1123A * 1123AX * 1123AY * lAW * 13A * 12AN * 12ANX * lAWN * 13AN * 13ANX; < iny sum> IB * IlB * 112B * 1123B * 1123BX * IBW * IBWX * 123B * 123BX * 12BN * 12BNX * IBWN * IBWNX * 13BN; CLK_IA = < iny sum> 11A * 11AX * 112A * 1123A * - 1123AX * 1123AY; CLK_IB = < inY_sum> liB * 112B * 1123B * 1123BX; CLK_2A = < iny sum> 112A * 1123A * 1123AX * 1123AY * 12AN * 12ANX; < inY_sum> 112B * 1123B * 1123BX * 123B * 123BX * /2BN * /2BNX; CLK_2B = CLK_3A = < inY_sum> 1123A * 1123AX * 1123AY * /3A CLK_3B * 13AN * 13ANX; = < inY_sum> 1123B * 1123BX * 123B * 123BX * 13BN; 6-209 ~ State Machine Design Considerations and Methodologies ~', ~~amucroR =============================;;; Appendix D. Cypress PLD TooIKit:.CY7C361 Simulation 6-210 State Machine Design Considerations and Methodologies Appendix D. Cypress PLD ToolKit: CY7C361 Simulation (Cont.) 6-211 ~ £:: ~RESS ~, State Machine Design Considerations and Methodologies ~~~OR~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix D. CypressPLD ToolKit: CY7C361 Simulation (Coot.) 
Understanding the CY7C330 Synchronous EPLD

This application note provides basic information on the CY7C330 and presents four design examples: a high-speed up/down counter with limits, a 16x16 crossbar switch, a pipelined buffer, and a simple toggle counter. Also included is an internal product term numbering chart. All example source code is in Cypress PLD ToolKit syntax.

The Cypress CY7C330 is the first in a family of high-speed, application-optimized CMOS EPLDs. This fully synchronous part is designed to implement state machines and other clocked systems. The CY7C330 offers new solutions for systems designers, with a truly usable high clock rate, 39 total registers, and 17,000 programmable bits providing up to 1200-gate complexity. Other devices in the family are the CY7C331 and the CY7C332. All family members are packaged in 28-pin, 300-mil dual in-line and LCC/PLCC packages. The technology is low-power CMOS and UV erasable.

The application-specific family from Cypress provides the CY7C330 for sequential state machine applications, the CY7C331 for general-purpose asynchronous designs, and the CY7C332 for decoders and combinational logic applications. This family of high-speed devices provides the optimal solution for each system design using Cypress's 0.8-micron, dual-level-metal, CMOS technology. Systems using other types of programmable logic devices for synchronous state machine applications can use the CY7C330 as a higher-density, lower-power solution at speeds up to 66 MHz.

The Cypress PALC22V10, PLDC20G10, and PAL20 devices proved the popularity of high-speed, low-power, erasable CMOS logic. The CY7C330 builds on that base. One CY7C330 can easily replace four PALC22V10s because the CY7C330 extends the number of state registers to 16, extends the number of product terms per output to 19 maximum, adds an XOR logic function, and provides the ability to use pins as bidirectional I/O.

The CY7C330 increases the speed of synchronous systems to 66 MHz. This is the actual usable speed, as determined by the total 15-ns feedback time from the Q output of a flip-flop to the D input of any flip-flop in the device. To ensure the 66-MHz operation, all 23 inputs to the device have registers. This structure permits pipelined operations, which allow external data to be synchronized or CPU bus-oriented data to be latched. Input registers can be clocked from either of two input clock sources on either pin 2 or 3.

The CY7C330 offers 258 variable product terms for 16 state registers. This allows you to design very complex sequential machines with virtually no limitation of product terms. These designs can easily exceed the size you want to manage with Karnaugh mapping. However, the new generation of advanced EPLD compilers can manage very complex state machine designs on workstations such as the IBM PC/XT.

Overview of the CY7C330

An easy way to picture the CY7C330 is with the block diagrams in Figure 1. On the input side of the CY7C330 (pins 1 - 7 and 9 - 14) are 11 input registers and three clocks. Pin 1 is the state clock. Each of the 11 input registers is edge triggered, and each can use either device pin 2 (clock 1) or pin 3 (clock 2) (shown in Figure 2) as a clock. An architecture bit for each input register controls the selection of the input clock. This approach allows input data to be synchronized to a clock edge or loaded into the device from a CPU data bus, with the clocks being decoded I/O-write signals. The registers' setup and hold times are very short, allowing high system throughput. Note that the outputs of the registers feed the device's AND-OR-XOR array.
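The per-register clock selection described above can be illustrated with a small behavioral sketch. It is not a timing model and is not taken from the Cypress tools: the class name `InputRegister`, the `clock_select` field (standing in for the architecture bit), and the choice of the rising edge as the active edge are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class InputRegister:
    """Behavioral sketch of one CY7C330 input register (not a timing model).

    clock_select stands in for the per-register architecture bit:
    2 means "clocked by pin 2", 3 means "clocked by pin 3".
    """
    clock_select: int = 2
    q: int = 0          # value currently presented to the array
    prev_clk: int = 0   # last sampled level of the selected clock

    def __post_init__(self) -> None:
        if self.clock_select not in (2, 3):
            raise ValueError("clock_select must be 2 or 3")

    def step(self, data: int, pin2: int, pin3: int) -> int:
        """Sample the pins once; capture data on a rising edge (assumed)."""
        clk = pin2 if self.clock_select == 2 else pin3
        if self.prev_clk == 0 and clk == 1:
            self.q = data
        self.prev_clk = clk
        return self.q


# Two registers watching the same data value but different clock pins.
reg_a = InputRegister(clock_select=2)
reg_b = InputRegister(clock_select=3)

# Pin 2 rises while pin 3 stays low: only reg_a captures the data.
out_a = reg_a.step(data=1, pin2=1, pin3=0)
out_b = reg_b.step(data=1, pin2=1, pin3=0)
```

In this sketch a register ignores activity on the clock pin it has not selected, which is the property that lets one group of inputs be synchronized to a system clock while another group is loaded by a decoded write strobe.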
Pin 14 has an additional function that affects the input register: You can use the pin as a fast, asynchronous output enable to the device, allowing a CPU to move data in the state machine registers onto a bus, for example.

Figure 1. The CY7C330 Block Diagram

On the I/O side of the device (pins 15 - 20 and 23 - 28) are 12 macrocells. Each I/O macrocell (see Figure 1 in "Using ABEL to Program the CY7C330") contains a D-type register, an input register with clock controls, and output-enable resources. Architecture bits for feedback selection, output-enable configuration, and input-register clock selection allow you to configure each macrocell independently. Each pair of adjacent I/O macrocells shares an input multiplexer (Figure 3). This allows either macrocell register to be hidden, while the I/O pin is used as an input. In addition, four hidden register macrocells (see Figure 3 in "Using ABEL to Program the CY7C330") provide additional state registers without direct output connections.

The AND-OR-XOR array in Figure 1 has 66 inputs and 244 product terms driving 16 OR-XOR gates. The 16 OR gates have from nine to 19 inputs (variable product terms), which allow complex designs to fit into each stage. An XOR product term for each OR output permits equations to be solved either with D or T flip-flops in the output stage, or for active-High or active-Low equations. Twelve product terms provide the output-enable function. A global reset and preset is also generated out of the array. Each product term forms an AND function with up to 66 inputs. The 66 inputs are the true and complement signals of 33 internal nodes in the CY7C330.

Figure 2. The CY7C330 Input Macrocell

Figure 3. The CY7C330 Shared Input Multiplexer

Figure 4. Four CY7C330 I/O Macrocell Configurations

Macrocell State Registers

The CY7C330's OR-XOR gates feed into 16 state registers. These registers are edge-triggered D flip-flops with pin 1 serving as clock. The outputs from these state registers feed back into the array, allowing you to construct high-speed state machines. The total feedback time period from Q to D and the array delay from input register to state register is 15 ns, allowing a full, usable clock rate of 66 MHz.

Four of the CY7C330's state registers are always hidden inside the device. A hidden register lets you build intermediate states or other functions without loading an I/O pin. Of the 12 remaining registers, up to six can be hidden. This gives a total of 10 maximum usable hidden registers, while allowing the 28-pin device to have 17 dedicated input pins, six I/O pins, and many other combinations. Valid I/O macrocell configurations appear in Figure 4.

Each I/O macrocell (pins 15 - 20 and 23 - 28) also has an edge-triggered input register with either pin 2 or pin 3 serving as clock. The total register count is 39: 16 state registers and 23 input registers.

To keep the device speed as high as possible, the number of inputs to the array is limited to 33 (x2); six of the array inputs from the I/O macrocells are multiplexed (shared). Thus, three feedbacks are provided for the two output and two input registers for each set of two I/O pins. The easiest way to understand the net result is that the maximum number of hidden registers in the 12 I/O macrocells is six. Output registers that have no feedback to the array are useful for data outputs or single-clock-delayed Mealy outputs from the state machine. The 12 macrocells have 24 registers total and 18 feedbacks. When you assign functions in your application to physical pins in the device, consider the number of feedbacks available and the number of product terms required.

Center Pinning

All Cypress CY7C330 family products use center pins for Vcc and Vss connections. In addition, the Vss for the internal logic and the Vss for the output drivers are on different pins. Center power pins eliminate noise generated by both TTL and CMOS devices. This noise is inductive noise proportional to the package lead inductance. Moving the power pins to the center lowers pin inductance and noise by a factor of 3 compared with corner-pin power connections.

Splitting ground lines, with the ground for input and logic on pin 8 and the ground for output drivers on pin 21, has additional noise benefits. Ground-bounce noise is caused when outputs switch from High to Low. The more pins switching at the same time, the more noise generated. Several hundred millivolts can be induced on the chip's internal ground from this effect. Although the level is low enough to meet output VOL specs, the noise voltage must be considered when designing the input buffers on a chip because the noise influences the VIL spec of 0.8V. For example, 400 mV of ground-bounce noise shifts the AC effective VIL to 1.2V. By separating the input reference ground from the output ground where the noise is generated, ground noise compensation is lowered or eliminated. This permits Cypress to offer a faster input buffer. Externally, the two grounds are connected together. Also, by placing the Vcc pin close to the GND pin, external 0.1 µF capacitors (as usual, one per chip) can be very close to the actual device power pins.

All Cypress EPLDs permit the registers to be preloaded into any configuration.
This capability can vastly reduce the test time and allows all patterns programmed into an EPLD to be completely tested. Without preload, for example, testing a multibit counter that has no reset product term could be very slow or impossible.

CY7C33X Family Technology

The CY7C330 and most other new Cypress products are built in the Cypress 0.8-micron, N-well CMOS, high-speed technology. New Cypress EPLDs use a dual-metal-layer connection method to further increase speed. This technology allows Cypress to build static RAMs with 7-ns access times, 35-MHz FIFOs, a 33-MHz RISC processor, and many other high-performance products.

Cypress uses an EPROM technology (as distinct from fuse-link or EEPROM technology) for all its EPLDs and (E)PROMs because of the tremendous increase in manufacturing yields and 100-percent testability offered by EPROM technology. This UV-erasable EPROM technology provides proven data retention, testability, and manufacturability.

In addition, the Cypress 2T (2 transistor) cell design allows very high speed circuits to be built. Cypress uses this 2T cell design for performance. One transistor is used only for programming and the other for reading, with each optimized for only one function. The program transistor can be larger and slower. It is designed to withstand 15V source to drain, which is the maximum program charge on the floating gate. The read transistor can be very small and fast. Because the read bit line is only switching between 0 and 5V, the sense amp is smaller and faster, and no high-current 15V driver MOSFETs are present. The result is very fast (sub 10 ns) array times.

All Cypress devices offer protection against static discharge (ESD). This means the devices are no more sensitive than bipolar devices. By using a unique -3V substrate bias generator (Vbb), Cypress devices are protected from latchup caused by transient voltages below ground, which are commonly seen in TTL systems.
This internally generated Vbb also allows the device to maintain high speed over a wide temperature range by controlling switching thresholds. No current flows in an input even under extreme undershoot situations, and the input transistor requires no recovery time after an undershoot.

In addition to substrate bias for latchup elimination, Cypress uses a stacked TTL output driver. This feature removes the pin-to-P-channel-transistor connection, a major source of latchup. Reducing the energy in High-to-Low transitions also improves overshoot and noise generation. Virtually all high-performance systems using TTL or CMOS adhere to the TTL standard voltage specification: 2.0V for a TTL High and 0.8V for a TTL Low. Thus, a P-channel output transistor that pulls the output to Vcc causes more problems than it solves because it overdrives the output. The lower voltage output from a stacked N-channel output drive of 3.5V vs. 5.0V causes less noise on the High-to-Low transition because less energy needs to be switched. Cypress uses stacked N-channel transistors on the outputs of all devices, eliminating latchup and fast transition to an overly high output 1 level. The devices are more compatible with the TTL devices Cypress replaces.

Resource Planning

Planning the assignment of functions to pins in the CY7C330 is an important step in a CY7C330 design. The resource planning sheet presented in Table 1 should be helpful for this procedure. Examples of its use are included with each application presented here. The decision on which pin to use is based on:

1. Asynchronous output enable, set to pin 14 or synchronous enable with a product term
2. State clock is pin 1
3. Input clock is pin 2
4. Second input clock is pin 3, or use pin 3 as a normal input if pin 2 will be the only input clock
5. Input only on pins 4 - 7 and 9 - 13
6. Device outputs: Assign pins keeping in mind that they have different product term widths.
   The widths are: 9, 11, 13, 15, 17, 19 for pins 28/15, 26/17, 24/19, 23/20, 25/18, 27/16, respectively
7. Use of hidden registers:
   a. Four registers - H1 to H4 - are always hidden
   b. Up to six additional hidden registers can be defined; Cypress suggests this sequence: 25, 18, 27, 16, 23, 20
   c. Assign input names to these six registers that are defined. Cypress suggests this sequence: 25, 18, 27, 16, 23, 20
   d. Assign input names to these six registers that are different from the physical device pin names
   e. The optionally hidden registers can be viewed if their output enable is made active and the external logic driving the pin is in a high-impedance state; otherwise the OE (output enable) product term of the hidden register must be set to Zero (NAME.ENA = 0)

Table 1. A CY7C330 Resource Planning Sheet

CY7C330 Resources Planning Sheet
Project: Your project name

Pin   Function      Input Register Clock   Register Function   Output Enable   # of PTerms
1     State Clk
2     Clk 1
3     Input/Clk 2   1 if Input
4     Input         1/2
5     Input         1/2
6     Input         1/2
7     Input         1/2
8     VSS
9     Input         1/2
10    Input         1/2
11    Input         1/2
12    Input         1/2
13    Input         1/2
14    Input/OE      1/2 if Input
15    Input         1/2 if Input           Output              Pin 14/Pterm    9
16    Input         1/2 if Input           Output              Pin 14/Pterm    19
17    Input         1/2 if Input           Output              Pin 14/Pterm    11
18    Input         1/2 if Input           Output              Pin 14/Pterm    17
19    Input         1/2 if Input           Output              Pin 14/Pterm    13
20    Input         1/2 if Input           Output              Pin 14/Pterm    15
21    VSS
22    VCC
23    Input         1/2 if Input           Output              Pin 14/Pterm    15
24    Input         1/2 if Input           Output              Pin 14/Pterm    13
25    Input         1/2 if Input           Output              Pin 14/Pterm    17
26    Input         1/2 if Input           Output              Pin 14/Pterm    11
27    Input         1/2 if Input           Output              Pin 14/Pterm    19
28    Input         1/2 if Input           Output              Pin 14/Pterm    9
H1    None                                                     None            19
H2    None                                                     None            11
H3    None                                                     None            17
H4    None                                                     None            13

Notes: Input Register Clock #1 is pin 2; #2 is pin 3. See the Application Note for the meaning of the pin names. Output Enable = 14 means the asynchronous pin 14 direct enable.
Z means the pin is never active.

8. The remaining visible registers can still be used in applications where both inputs of a macrocell pair are used. However, one of the output registers of each adjacent pair cannot have a feedback; it is used only as an output synchronized by the state clock on pin 1.

If, after this assignment, the compiler or assembler complains that not enough product terms are available, some pins might have to be re-assigned.

Software Design Tools

You can compile logic for the CY7C330 with a number of packages available from independent software vendors. These packages include ABEL V3.0 from DATA I/O and LOG/iC V3.0 from ISDATA. Cypress has developed the PLD ToolKit (CY7C3101), which you can use to design any PLD that Cypress makes. All these packages are logic compilers capable of converting state machine or binary logic descriptions into a JEDEC file that can program the device. The JEDEC file is the standard interface from a software development tool to a logic programmer. See the examples section for more detail on the software tools.

Logic Programmers

The CY7C330 can be programmed today on the QuickPro plug-in board for IBM and compatible personal computers. Soon you will also be able to use the DATA I/O, STAG, and other programmers. Some software tools require you to set fuses or bits in the device to enable certain functions, whereas others set the architecture bits automatically. Note that bit 17070 requires special attention: it must be set to 1 if any input register uses a clock from pin 3. This requirement will disappear in future releases of the software packages, and the bits will be set automatically.

Figure 5. Pipelined Buffer Block Diagram

Pipelined Buffer

The Pipe330 example is a two-stage pipeline that shifts parallel data from the inputs to the outputs (Figure 5). This example demonstrates the overall Cypress PLD ToolKit source syntax and shows how macrocells are configured.

In the Pipe330 example, the output enable for specific macrocells is under control of either pin 14 or the associated product term. The latter case is the default. To control the output enable of a macrocell with pin 14, add NENBPT to the list of attributes following the node assignment in the configuration section. If NENBPT does not appear in the attribute list for a node, the expression that follows the <OE> construct in the equations controls the output enable. If <OE> is not part of the equation, the output is permanently disabled. If <OE> is present, but no expression follows it, the output is permanently enabled.

The pin 1 signal always clocks the output registers in the CY7C330. Either the pin 2 or the pin 3 signal can clock the input registers. Because pin 2 is the default clock, no special attributes are required for this configuration. If you wish to clock an input register with pin 3, the attribute list for that node must contain ICLK=3. The resource planning sheet for the pipelined buffer appears in Table 2, and the source code appears in Appendix A.

Test patterns for the Pipe330 example are relatively simple, but keep in mind a few guidelines. At first, for example, the state of the registers in the device is unknown, so all registers must be put in a known state before any outputs are checked (that is, before any output is given a non-X value). Another aspect of CY7C330 simulation is the need to consider multiple clocks. The input and output clocks should be treated separately, because the simultaneity of clock assertion is not guaranteed in programmers, or in any real system, for that matter.

Up/Down Toggle Counter with Preloads
The Tog330 example shows how you can use the CY7C330's XOR product terms to emulate a T-type flip-flop. The statement:

Q = <XSUM> Q <SUM> T;

programs the XOR product term with the feedback of the register output, making the register into a T type. The T-type register configuration is active Low because, by architecture, all the outputs are active Low. You can emulate a JK-type flip-flop by using the configuration above with the following relation:

T = J!Q + KQ

Table 3 presents the resource planning sheet for the toggle counter example, and the source code appears in Appendix B. Figure 6 shows the block diagram for the design.

Up/Down Counter with Limits

The up/down counter example shows how you can assign the pins for maximum use in the CY7C330. This counter operates at 66 MHz, counting up until reaching the value stored in the 8-bit upper-limit register, then down until reaching the lower limit. Also included is a device reset and a method to preload the counter to either the upper or lower limit.

Consider an application in which the two 8-bit limit registers are loaded from a CPU. The lower limit is on pins 4 to 12, with a 9th bit for preload on pin 13. The clock for this lower limit is on pin 2. The upper limit is loaded via pins 15 to 28, with pin 27 providing the 9th (preload) bit. These pins are also used for reading out the counter value, and pin 14 is the output enable for the up/down counter.

Table 2.
Resource Planning Sheet for Pipelined Buffer CY7C330 Resources Planning Sheet Project: Pipelined Buffer Input Register Pin Function 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 HI H2 H3 H4 Input Register Clock State Clk Clk 1 (LHS) Clk 2 (RHS) 14 15 16 17 VSS 19 110 1 2 III 2 112 113 OE 2 Register Function Output Enable # of Z Z Z Z PTerms 2 Q19 Q20 Pterm (Eqn) Pterm (Eqn) 9 19 11 17 13 15 Q23 Q24 Q25 Q26 Q27 Q28 Pterm (Eqn) Pterm (Eqn) Pin 14 Pin 14 Pin 14 Pin 14 None None None None 15 13 17 11 19 9 19 11 17 13 VSS vce None None None None Notes: Input Register Clock #lispin2 #2 is pin 3 See the Application Note for the meaning of the pin names. Output Enable = 14 means the asynchronous pin 14 direct enable. Z means the pin is never active 6-219 '7 L....·..··.. ·in~ r. . . · Q0 ............8fIo. IT! m iT:? Q?' ~-~;T3 03 i--~~1111L .... i~ .?' t.. . . . . . . . . . . . . . . . . . . . . . . . . . . j . . . . . CLR Figure 6. Toggle Counter Block Diagram Figure 8. 16X16 Crossbar Switch Block Diagram Four buried registers detect equality of the counter with the limits to maintain up/down direction and to detect the preload request as an edge-triggered signal. By using the XOR product terms, the counter needs only nine total products even on the most significant bit. Without XOR, the 8th bit would need 18 product terms because of the two preload sources. Due to the large number of product terms per output in the CY7C330, this counter can operate at 66 MHz. The counter's contents can be read out when pin 14 (direct output enable) is Low. In a bus-oriented system, a microprocessor can read the register if a decoded I/O read signal is applied to pin 14. Note that the other method of output enable, via the array, requires a clock edge to load the enable input condition into the input registers. When pin 14 is High, the upper-limit register can be loaded-from a processor bus, for example. The lower-limit register can be loaded at any time. 
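The counting behavior described above can be illustrated with a small behavioral model. The following Python sketch is only an approximation under stated assumptions: the function names `next_count` and `bounce` are hypothetical (they are not part of any Cypress tool), the limit compare is treated as immediate rather than registered through the buried registers, and preload and reset are left out. It computes each new count by toggling bits through an XOR, in the spirit of the T-type emulation used for the counter bits.

```python
def next_count(count, up, width=8):
    """Advance a binary counter one step using per-bit toggle (T flip-flop) logic."""
    toggles = 0
    for n in range(width):
        mask = (1 << n) - 1          # selects all bits below bit n
        lower = count & mask
        # Counting up, bit n toggles when every lower bit is 1;
        # counting down, it toggles when every lower bit is 0.
        if (up and lower == mask) or (not up and lower == 0):
            toggles |= 1 << n
    return count ^ toggles           # the XOR applies the toggles to the old count


def bounce(lower_limit, upper_limit, start, steps):
    """Return `steps` counter values, reversing direction at each limit."""
    count, up, values = start, True, []
    for _ in range(steps):
        values.append(count)
        if up and count == upper_limit:
            up = False               # upper limit reached: start counting down
        elif not up and count == lower_limit:
            up = True                # lower limit reached: start counting up
        count = next_count(count, up)
    return values


print(bounce(2, 5, 3, 10))
```

Calling `bounce(2, 5, 3, 10)` walks the count up to the upper limit and back down to the lower limit. Each bit needs only a toggle condition, not a full next-state expression, which is the reason the XOR form keeps the product-term count low.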
Figure 7 shows the block diagram for this design. The resource planning sheet appears in Table 4, and the code is in Appendix C.

In operation, the up/down counter counts between the limits stored in two registers. Lower-limit (LL) data is loaded on the positive edge of the pin 2 clock. There are 8 data bits plus 2 control bits, LPL and Reset. If LPL is Low, only the limit compare register is changed. If LPL is High, the LL data is loaded into the counter on the next clock edge, and the counter counts up. The LL data is one count higher than the actual lower limit. If RESET is active, all internal registers are reset to 0, so long as the reset bit is set in the LL register.

Upper-limit (UL) data is loaded on the positive edge of the pin 3 clock. This part of the counter uses 8 data bits plus a preload control bit, UPL. If UPL is Low, only the limit-compare register is changed. If UPL is High, the UL data is loaded into the counter on the next clock edge, and the counter counts down. UL data is multiplexed with the counter output data. The UL data is one count lower than the actual upper limit.

Pin 16 is the RESET input. Pin 14 is the active-Low output enable for the counter; the counter can be read at any time. Pin 1 is the clock for the counter. Pins 18 and 20 are connected together for data bit 6. Pins 23 and 25 are connected together for data bit 7.

The buried (hidden) registers are used as follows:

H1 is loaded with the result of the comparison between the counter and UL.
H2 is UPL or LPL, delayed by one clock edge; H2 serves as an edge detect.
H3 is loaded with the result of the comparison between the counter and LL.
H4, when High, forces the counter to count up.

16 x 16 Crossbar Switch

A data switch capable of multiplexing 16 inputs into four outputs can be built with one CY7C330.
The 66-MHz clock rate allows even asynchronous input signals of up to 33 MHz to be switched through the device. The compact 300-mil package saves PCB space, in contrast to the space such a multiplexer would otherwise need. At least 40 pins would normally be required, partitioned as follows:

16 input pins
4 output pins
4 x 4 = 16 selection inputs
4 pins for power and clock connections

No other PLD today can perform this function using a single device, due to the logic requirement (the number of product terms required per output) as well as the timing requirement.

Figure 7. Up/Down Counter Block Diagram

The crossbar switch uses 12 state registers plus four input registers to act as the 4 x 4-bit selection registers. Each output channel needs a 4-bit register to select one of 16 input channels. A 4-stage, 4-bit-wide shift register implemented in the device holds the select status. This allows the 4 x 4 selection bits to be loaded via only four pins, without needing any address pins.

When the PL (PRELOAD) signal on pin 3 is Low, input data bits 0 to 3 become the selector data lines; five clock pulses shift the select data through the device into selectors 1, 2, and 3, as well as the output pins. Setting pin 3 High after the fifth pulse loads the signals on the output data pins into select register 0. This last load operation utilizes the function of pin 3 as a data pin as well as a clock. Setting the signal on pin 3 Low switches the internal logic from a selector into a shift register; the clock edge created by applying a High to pin 3 loads the data outputs into the input registers associated with output pins 16, 18, 25, and 27.

Table 3. Resource Planning Sheet for Toggle Counter

CY7C330 Resources Planning Sheet
Project: 4 Bit Toggle Counter

Pin   Function    Input Register Clock   Register Function   Output Enable   # of PTerms
1     State Clk
2     Clk 1
3     Clear
4
5
6
7
8     VSS
9
10
11
12
13
14
15                                       !Q0                 Pterm           9
16                                       !Q1                 Pterm           19
17                                       !Q2                 Pterm           11
18                                       !Q3                 Pterm           17
19                                                           Z               13
20                                                           Z               15
21    VSS
22    VCC
23                                                           Z               15
24                                                           Z               13
25                                                           Z               17
26                                                           Z               11
27                                                           Z               19
28                                                           Z               9
H1                                       None                None            19
H2                                       None                None            11
H3                                       None                None            17
H4                                       None                None            13

Notes:
Input Register Clock #1 is pin 2; #2 is pin 3.
See the Application Note for the meaning of the pin names.
Output Enable = 14 means the asynchronous pin 14 direct enable.
Z means the pin is never active.

Table 4. Resource Planning Sheet for Up/Down Counter

CY7C330 Resources Planning Sheet
Project: Up/Down Counter with Limits

Pin   Function       Input Register Clock   Register Function   Output Enable   # of PTerms
1     State Clk
2     Clk 1
3     Clk 2
4     LL0            1
5     LL1            1
6     LL2            1
7     LL3            1
8     VSS
9     LL4            1
10    LL5            1
11    LL6            1
12    LL7            1
13    PRELOAD LOW    1
14    COUNTER OE
15    UL1            2                      CNT1                Pin 14          9
16    Reset          1                      -                   Z               19
17    UL3            2                      CNT3                Pin 14          11
18    UL6            2                      -                   Z               17
19    UL4            2                      CNT4                Pin 14          13
20    -              -                      CNT6                Pin 14          15
21    VSS
22    VCC
23    -              -                      CNT7                Pin 14          15
24    UL5            2                      CNT5                Pin 14          13
25    UL7            2                      -                   Z               17
26    UL2            2                      CNT2                Pin 14          11
27    PRELOAD HIGH   2                      -                   Z               19
28    UL0            2                      CNT0                Pin 14          9
H1    None           -                      Up Equals           None            19
H2    None           -                      L/H Prel. Done      None            11
H3    None           -                      Down Equals         None            17
H4    None           -                      Up Count            None            13

Notes:
Input Register Clock #1 is pin 2; #2 is pin 3.
See the Application Note for the meaning of the pin names.

This design buries the output registers of several I/O macrocells and uses the pin as an input by utilizing a shared-input mux. The source file's configuration section specifies this arrangement by first assigning the name of the output register to the macrocell node number. Because the default configuration is for the output register's Q output to feed back into the array, no other configuration attributes are needed here. Next, the input's name is assigned to the node number of the shared input mux adjacent to the pin. The default for the shared input muxes is to pass the data on the even-numbered pin into the array. If the input should come from an odd-numbered pin, you must add the attribute SRC=N (where N is the pin number) to the list of attributes in parentheses following the node name.
For an example of this syntax, refer to d10 and sa2 in the source file.

The space advantage of the CY7C330 in this crossbar switch application becomes especially important as the size of the matrix increases. A 32 x 32 matrix requires only 16 devices vs. 64 PALC22V10s or 96 TTL parts.

You can easily load the internal data selection registers with a Cypress 24-pin EPLD, the PLDC20G10, and a FIFO. A CPU can load the 16 x 4-bit selector information into the FIFO, and the PLDC20G10 can move the data from the FIFO into the device. One PLDC20G10 and one 16 x 4 (or larger) FIFO are required. The Cypress CY7C403 is an ideal FIFO for this application.

Table 5 shows the resource planning sheet for the 16 x 16 crossbar switch, and a block diagram of the design appears in Figure 8. The source code can be found in Appendix D.

Table 5. Resource Planning Sheet for Crossbar Switch

CY7C330 Resources Planning Sheet
Project: 16 X 16 Crossbar Switch

Input Register Pin Function 1 2 3 4 5 6 7 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 H1 H2 H3 H4 State Clk Clk 1 Sel PRELOAD Data 0 Data 1 Data 2 Data 3 VSS Data 4 Data 5 Data 6 Data 7 Data 8 Data 9 Data 10 Select D0 Data 11 Select C0 Data 12 Input Register Clock 1 2 2 Register Function Output Enable # of PTerms Select A2 Output 3 Select A1 Output 2 Select C1 Select D1 Z 9 19 11 17 13 15 Select B2 Select A2 Output 1 Select C2 Output 0 Select D2 Select A3 Select B3 Select C3 Select D3 Z Z Pterm Z Pterm Z Z VSS VCC Data 13 Select B0 Data 14 Select A0 Data 15 None None None None 1 2 1 2 Pterm Z Pterm Pterm None None None None 15 13 17 11 19 9 19 11 17 13

Notes:
Input Register Clock #1 is pin 2; #2 is pin 3.
See the Application Note for the meaning of the pin names.
Output Enable = 14 means the asynchronous pin 14 direct enable.
Z means the pin is never active.

Reading the CY7C330 JEDEC Map

Table 6 should help you read the JEDEC map of a CY7C330. The pin or node reference number is on the left. These numbers correspond to the pin and node numbers on the block diagram in Figure 1. The column labeled Input True gives the sequential number (left to right) of the column corresponding to the non-inverted input to the array. If the number is even, then the false input is the next-higher integer; if the number is odd, then the false input is the next-lower integer. The table lists the number of product terms in each output stage, along with the JEDEC offset (sequential fuse position) for each.

Table 6. The CY7C330 Internal Array Reference List

Pin or Node Function 1 2 3 4 5 6 State Clock Input Clock1 Input Clock2 Input Register Input Register Input Register Input Register VSS Input Register Input Register Input Register Input Register Input Register Input Register I/O Regs, mux mux input(node) I/O Regs, mux I/O Regs, mux mux input(node) I/O Regs, mux I/O Regs, mux mux input(node) I/O Regs, mux VSS 7 9 10 11 12 13 14 15 N-35 16 17 N-36 18 19 N-37 20 21 22 23 N-38 24 25 N-39 26 27 N-40 28 N-29 N-30 N-31 N-32 N-33 N-34 Input True # of Pterms OE XOR 1st OR 9 L16236 L16302 L16368 19 11 L14850 L13992 L14916 L14058 L14982 L14124 17 13 L12738 L9636 L12804 L9702 L12870 L9768 15 L8514 L8580 L8646 39 36 35 33 30 15 L5280 L5346 L5412 13 17 L4290 L3036 L4356 L3102 L4422 L3168 29 27 24 23 11 19 L2178 L792 L2244 L858 L2310 L914 9 L66 L132 L198 0 2 4 6 8 10 12 14 16 18 20 65 62 61 59 56 55 49 46 45 VCC I/O Regs, mux mux input(node) I/O Regs, mux I/O Regs, mux mux input(node) I/O Regs, mux I/O Regs, mux mux input(node) I/O Regs, mux Sync. Reset Sync. Preset Buried Register Buried Register Buried Register Buried Register L0 L16962 40 42 13 17 L11814 L10626 L11870 L10692 50 52 11 19 L7722 L6402 L7788 L6468

Appendix A.
PLD ToolKit Source Code for Pipelined Buffer

CY7C330; {Pipe330}

CONFIGURE;
CkS (node=1),          {Output register clock}
Ck1,                   {Input register clock 1}
Ck2,                   {Input register clock 2}
I0 (iclk=3),           {Input 0, clocked by Ck2 (pin 3)}
I1 (iclk=3),           {Input 1, clocked by Ck2 (pin 3)}
I2 (iclk=3),           {Input 2, clocked by Ck2 (pin 3)}
I3 (iclk=3),           {Input 3, clocked by Ck2 (pin 3)}
I4 (node=9),           {Input 4, clocked by Ck1 (pin 2)}
I5,                    {Input 5, clocked by Ck1 (pin 2)}
I6,                    {Input 6, clocked by Ck1 (pin 2)}
I7,                    {Input 7, clocked by Ck1 (pin 2)}
OE1,                   {output enable for Q<7:4>}
!OE2(node=14),         {direct output enable for Q<7:0>}
Q7,                    {Output 7, clocked by CkS, enabled by OE1&!OE2}
Q6,                    {Output 6, clocked by CkS, enabled by OE1&!OE2}
Q5,                    {Output 5, clocked by CkS, enabled by OE1&!OE2}
Q4,                    {Output 4, clocked by CkS, enabled by OE1&!OE2}
Q3(nenbpt),            {Output 3, clocked by CkS, enabled: pin 14}
Q2(nenbpt),            {Output 2, clocked by CkS, enabled: pin 14}
Q1(node=23,nenbpt),    {Output 1, clk: CkS, OE: pin 14}
Q0(nenbpt),            {Output 0, clocked by CkS, enabled: pin 14}
!RST(iop),             {low asserted reset, I/O macrocell as input}
reset(node=29),        {internal reset node}

EQUATIONS;
reset = RST;
!Q0 = !I0;
!Q1 = !I1;
!Q2 = !I2;
!Q3 = !I3;
!Q4 = <OE> OE1 & OE2 <sum> !I4;
!Q5 = <OE> OE1 & OE2 <sum> !I5;
!Q6 = <OE> OE1 & OE2 <sum> !I6;
!Q7 = <OE> OE1 & OE2 <sum> !I7;
{end of file}

Appendix B.
PLD ToolKit Source Code for a Toggle Counter

CY7C330; {Tog330}

CONFIGURE;
CkS,               {Count clock. This is pin 1 since it is first in the list.}
Ck1,               {Input clock. This is pin 2 since it is next.}
!Clr,              {Low true clear. Pin 3 is next in sequential order.}
!OE(node=14),      {Low asserted output enable pin, pin 14}
!Q0(nenbpt),       {Q0-Q3 are the counter outputs - pins 15-18.}
!Q1(nenbpt),
!Q2(nenbpt),
!Q3(nenbpt),
reset(node=29),    {The reset product term is node 29.}

EQUATIONS;
reset = Clr;
{In each equation, feeding the register output back into the XOR emulates a T flop.}
{The expression after <sum> is the T input. No expression after the connective <sum> means always asserted.}
Q0 = <xsum> Q0 <sum> ;
Q1 = <xsum> Q1 <sum> Q0;
Q2 = <xsum> Q2 <sum> Q1 & Q0;
Q3 = <xsum> Q3 <sum> Q2 & Q1 & Q0;
{end of file}

Appendix C. PLD ToolKit Source Code for Up/Down Counter

{File: COUNTER.CYP Date: 11/9/1988 }
CY7C330;

CONFIGURE;
CLK(node=1), LLC(node=2), ULC(node=3),    {Count clock, Lower Limit Clock, Upper Limit Clock}
LL0(node=4, iclk=2), LL1, LL2, LL3,       {The Lower Limit register is clocked by pin 2 - LLC - by default.}
LL4(node=9), LL5, LL6, LL7,               {The register is located at pins 4-7, 9-12 - pin 8 is Vss.}
LPL(node=13),                             {Lower limit PreLoad}
!CNTOE (node=14),                         {Counter output enable on pin 14}
CNT0 (node=28, nenbpt, oclk=1, iclk=3),   {The counter itself is in the output register of various I/O macrocells}
CNT1 (node=15, nenbpt, iclk=3),           {as noted in the node numbers after the names. Pin 1 always clocks the}
CNT2 (node=26, nenbpt, iclk=3),           {output registers - oclk=1 was included once for documentation.}
CNT3 (node=17, nenbpt, iclk=3),           {'nenbpt' specifies that the output enable is controlled by pin 14}
CNT4 (node=19, nenbpt, iclk=3),           {rather than the output enable product terms in each macrocell.}
CNT5 (node=24, nenbpt, iclk=3),           {Most of these macrocells will be bidirectional, with the Upper Limit}
CNT6 (node=20, nenbpt),                   {register residing in the input registers. 'iclk=3' specifies that pin 3}
CNT7 (node=23, nenbpt),                   {clocks the input registers. This overrides the default, pin 2.}
                                          {The output register is fed back into array by default.}
UL0 (node=40, src=28),                    {UL0 is the input reg of pin 28, routed thru shared input mux - node 40}
UL1 (node=35, src=15),                    {UL1 is the input reg of pin 15, routed thru shared input mux - node 35}
UL2 (node=39, src=26),                    {UL2 is the input reg of pin 26, routed thru shared input mux - node 39}
UL3 (node=36, src=17),                    {UL3 is the input reg of pin 17, routed thru shared input mux - node 36}
UL4 (node=37, src=19),                    {UL4 is the input reg of pin 19, routed thru shared input mux - node 37}
UL5 (node=38, src=24),                    {UL5 is the input reg of pin 24, routed thru shared input mux - node 38}
UL6 (node=18, iop, iclk=3),               {UL6 is the input reg of pin 18, 'iop' selects array input from input reg}
UL7 (node=25, iop, iclk=3),               {UL7 is the input reg of pin 25, 'iop' selects array input from input reg}
UPL (node=27, iop, iclk=3),               {Upper limit PreLoad, array input from input reg, clocked by pin 3}
!reset (node=16, iop),                    {Low asserted clear, array input from input reg, clocked by pin 2}
node29 (node=29),                         {The reset product term is node 29}
UP (node=31),                             {buried node 31 selects the counter direction, clocked by pin 1}
LEQUAL (node=32),                         {buried node 32 compares counter with lower limit, clocked by pin 1}
PLDONE (node=33),                         {buried node 33 is the preload done flag, clocked by pin 1}
UEQUAL (node=34),                         {buried node 34 compares counter with upper limit, clocked by pin 1}
EQUATIONS; ICNTO= < XSUM> ICNTO < SUM> ILPL & IUPL < SUM> IPLDONE < SUM> ILLO & LPL & CNTO < SUM> ICNTO & ULO & UPL < SUM> LLO & LPL & ICNTO CNTO & /ULO & UPL; ICNT1= < XSUM> ICNTI < SUM> ILPL & CNTO & IUPL & IUP < SUM> ILPL & ICNTO & IUPL & UP < SUM> ILLI & LPL & PLDONE & CNTI < SUM> LLI & LPL & PLDONE & ICNTI < SUM> UPL & PLDONE & lULl & CNTI < SUM> UPL & PLDONE & ULI & ICNTI < SUM> CNTO & IPLDONE & IUP ICNTO & IPLDONE & UP; 6-227 ~C'tPRE$ ~ CY7C330 Synchronons EPLD SEMlcnIDUCI'QR ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;!;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;=;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Appendix C. Source Code for Up/Down Counter (continued) leNT2=: < XSUM> ICNT2 < SUM> ILPL & CNTO & /uPL & /uP & CNTl < SUM> ILPL & ICNTO & /uPL & UP & ICNTl < SUM> ILL2 & LPL & CNT2 & PLDONE < SUM> LL2 & LPL & ICNT2 & PLDONE < SUM> UPL & CNT2 & /uL2 & PLDONE < SUM> UPL & ICNT2 & UL2 & PLDONE < SUM> CNTO & IPLDONE & IUP & CNTl ICNTO & IPLDONE & UP & ICNTl; ICNT3= < XSUM> ICNT3 ILPL&CNTO&/UPL&CNT2&/UP&CNTl ILPL&/CNTO&IUPL&/CNT2&UP&/CNTl < SUM> ILL3 & LPL & PLDONE & CNT3 < SUM> LL3 & LPL & PLDONE & ICNT3 < SUM> UPL & PLDONE & /uL3 & CNT3 < SUM> UPL & PLDONE & UL3 & ICNT3 CNTO&CNT2&/PLDONE&IUP&CNTl /CNTO&/CNT2&IPLDONE&UP&/CNTl; ICNT4= ICNT4 < SUM> ILL4 & LPL & PLDONE & CNT4 < SUM> LL4 & LPL & PLDONE & ICNT4 < SUM> UPL & PLDONE & /UL4 & CNT4 < SUM> UPL & PLDONE & UL4 & ICNT4 ILPL' & CNTO & IUPL & CNT2 & IUP & CNT3' & CNTl /LPL & ICNTO & IUPL & ICNT2 & UP & ICNT3 & ICNT < SUM> CNTO & CNT2 & IPLDONE & /uP & CNT3 & CNTl ICNTO & ICNT2 & IPLDONE & UP & ICNT3 & ICNTl; ICNT5= ICNT5 < SUM> ILL5 & LPL & CNT5 & PLDONE < SUM> LL5 & LPL & ICNT5 & PLDONE UPL & CNT5 & /UL5 & PLDONE UPL & ICNT5 & UL5 & PLDONE < SUM> ILPL & CNTO & IUPL & CNT2 & CNT4 & IUP & CNT3 & CNTl < SUM> 
ILPL & ICNTO & /uPL & ICNT2 & ICNT4 & UP & ICNT3 & ICNTl < SUM> CNTO & CNT2 & IPLDONE & CNT4 & /uP & CNT3 & CNTl < SUM> ICNTO & ICNT2 & IPLDONE & ICNT4 & UP & ICNT3 & ICNTl; ICNT6= ICNT6 < SUM> ILL6 & LPL & PLDONE & CNT6 < SUM> LL6 & LPL & PLDONE & ICNT6 < SUM> UPL & PLDONE & CNT6 & /uL6 < SUM> UPL & PLDONE & ICNT6 & UL6 < SUM> ILPL&CNTO&/UPL&CNT2&CNT5&CNT4 & IUP & CNT3 & CNTl < SUM> ILPL & ICNTO & IUPL & ICNT2 & /CNT5 & ICNT4 & UP & ICNT3 & ICNTl < SUM> CNTO&CNT2&CNT5&/PLDONE&CNT4 & IUP & CNT3 & CNTl < SUM> ICNTO & /CNT2 & ICNT5 & IPLDONE & ICNT4 & UP & ICNT3 & ICNTl; 6-228 Appendix C. Source Code for UplDown Counter (continued) ICNT7 = ICNTI < SUM> ILL7 & LPL & CNT7 & PLDONE < SUM> LL7 & LPL & ICNT7 & PLDONE UPL & !UL7 & CNTI & PLDONE UPL & UL7 & ICNTI & PLDONE < SUM> ILPL & CNTO & IUPL & CNT2 & CNTS & CNT6 & CNT4 & IUP & CNT3 & CNTl ILPL & ICNTO & /UPL & ICNT2 & ICNTS & ICNT6 & ICNT4 & UP & ICNT3 & ICNTl < SUM> CNTO & CNT2 & CNT5 & IPLDONE & CNT6 & CNT4 & IUP & CNT3 &CNTl < SUM> ICNTO & ICNT2 & ICNT5 & IPLDONE & ICNT6 & ICNT4 & UP & ICNT3 & ICNTl; node29 = reset; UP= < XSUM> UP lUEQUAL & IUP lLEQUAL & UP < SUM> UPL & PLDONE & IUP < SUM> LPL & PLDONE & UP; PLDONE= < SUM> ILPL & IUPL; LEQU AL= < SUM> LL6 & ICNT6 < SUM> ILL7 & CNT7 < SUM> LL7 & ICNT7 < SUM> LL3 & ICNT3 < SUM> ILLS & CNTS < SUM> LL5 & ICNT5 < SUM> ILLl & CNTl < SUM> LLO & ICNTO < SUM> ILL2 & CNT2 < SUM> ILL4 & CNT4 < SUM> LL4 & ICNT4 < SUM> ILLO & CNTO < SUM> LLl & ICNTl < SUM> ILL6 & CNT6 < SUM> ILL3 & CNT3 < SUM> LL2 & ICNT2; UEQU AL= < < < < < < < < < < < < < < < < SUM> ICNT6 & UL6 SUM> IUL7 & CNT7 SUM> UL7 & ICNT7 SUM> UL3 & ICNT3 SUM> CNT5 & IUL5 SUM> ICNTS & ULS SUM> lULl & CNTl SUM> ICNTO & ULO SUM> CNT2 & IUL2 SUM> IUL4 & CNT4 SUM> UL4 & ICNT4 SUM> CNTO & IULO SUM> ULl & ICNTl SUM> CNT6 & IUL6 SUM> IUL3 & CNT3 SUM> ICNT2 & UL2; 6-229 Appendix D. 
Source Code for Crossbar Switch CY7C330; configure; clk (node=l), iclk, pI, dO, dl, d.2, d3, d4 (node =9), d5, d6, d7, d8,d9, dlO (node=35,src=15), dll (node=36, src=17), d12 (node=37,src=19), d13 (node=38, src=24), d14 (node=39, src=26), d15 (node=40, src=28), sal (node=17), sa2 (node=15), sa3 (node=34), sbl (node=24), sb2 (node=23), sb3 (node=33), scI (node=19), sc2 (node=26), sc3 (node=32), sdl (node=20), sd.2 (node=28), sd3 (node=3l), yO(node=27 ,iop,iclk= 3), yl(node=25,iop,iclk=3), y2(node=18,iop,iclk=3), y3(node= l6,iop,iclk=3), {Input reg {Input reg {Input reg {Input reg EQUATIONS; Isal = Ipi & Isa2 pI & Isal; /sa2 = /pi & sa3 pI & /sa2; sa3 = Ipi & dO pI & sa3; Isbl= /pi & /sb2 pI & Isbl; Ish2= < SUM> /pi & sb3 pI & /sh2; sb3 = /pi & dl pI & sb3; /sel= < SUM> /pi & /sc2 pI & /scl; Isc2 < SUM> Ipi & sc3 pI & Isc2; sc3 < SUM> /pi & d2 pI & sc3; sdl = < SUM> /pi & /sd2 pI & Isdl; Isd.2 < SUM> /pi & sd3 pI & Isd.2; sd3 < SUM> Ipi & d3 < SUM> pI & sd3; 6-230 is is is is saO} sbO} scO} sdO} Appendix D. 
Source Code for Crossbar Switch (continued) ly3 = < OE> IpI pI & Ida & Isa3 & Isb3 & Isc3 & Isd3 pI & Idl & sa3 & Isb3 & Isc3 & Isd3 pI & 1d2 & Isa3 & sb3 & Isc3 & Isd3 pI & Id3 & sa3 & sb3 & Isc3 & Isd3 pI & Id4 & Isa3 & Isb3 & sc3 & Isd3 pI & Id5 & sa3 & Isb3 & sc3 & Isd3 pI & Id6 & Isa3 & sb3 & sc3 & Isd3 pI & Id7 & sa3 & sb3 & sc3 & Isd3 pI & Id8 & Isa3 & Isb3 & Isc3 & sd3 pI & Id9 & sa3 & Isb3 & Isc3 & sd3 pI & Isa3 & sb3 & Isc3 & sd3 & IdlO pI & sa3 & sb3 & Isc3 & sd3 & Idll < SUM> pI & Isa3 & Isb3 & Idl2 & sc3 & sd3 pI & Idl3 & sa3 & Isb3 & sc3 & sd3 pI & Idl4 & Isa3 & sb3 & sc3 & sd3 pI & Idl5 & sa3 & sb3 & sc3 & sd3 IpI & sdl; 1y2 = < OE> IpI < SUM> pI & Ida & sd2 & sc2 & sb2 & sa2 < SUM> pI & Idl & sd2 & sc2 & sb2 & Isa2 < SUM> pI & Id2 & sd2 & sc2 & Isb2 & sa2 < SUM> pI & Id3 & sd2 & sc2 & Isb2 & Isa2 < SUM> pI & Id4 & sd2 & Isc2 & sb2 & sa2 < SUM> pI & Id5 & sd2 & Isc2 & sb2 & Isa2 < SUM> pI & Id6 & sd2 & Isc2 & Isb2 & sa2 < SUM> pI & Id7 & sd2 & Isc2 & Isb2 & Isa2 < SUM> pI & Id8 & Isd2 & sc2 & sb2 & sa2 < SUM> pI & Id9 & Isd2 & sc2 & sb2 & Isa2 < SUM> pI & Isd2 & sc2 & Isb2.& IdlO & sa2 < SUM> pI & Isd2 & sc2 & Isb2 & Idll & Isa2 < SUM> pI & Isd2 & Isc2 & sb2 & Idl2 & sa2 < SUM> pI & Isd2 & Isc2 & Idl3 &sb2 & Isa2 < SUM> pI & Isd2 & Isc2 & Idl4 & Isb2 & sa2 < SUM> pI & Isd2 & Idl5 & Isc2 & Isb2 & Isa2 IpI & scI; Iyl = < OE> IpI < SUM> pI & IdO & sbl & sdl & scI & sal < SUM> pI & Idl & sbl & sdl & scI & Isal < SUM> pI & Id2 & Isbl & sdl & scI & sal < SUM> pI & Id3 & Isbl & sdl & scI & Isal < SUM> pI & Id4 & sbl & sdl & Isel & sal < SUM> pI & Id5 & sbl & sdl & Isel & Isal < SUM> pI & Id6 & Isbl & sdl & Isel & sal < SUM> pI & Id7 & Isbl & sdl & Isel & Isal < SUM> pI & Id8 & sbl & Isdl & sel & sal < SUM> pI & Id9 & sbl & Isdl & scI & Isal < Sl!M> pI & Isbl & Isdl & scI & sal & IdlO < SUM> pI & Isbl & Isdl & scI & Idll & Isal < SUM> pI & sbl & Isdl & Idl2 & Iscl & sal < SUM> pI & sbl & Id13 & Isdl & Isel & Isal < SUM> pI & Idl4 & Isbl & Isdl & Isel & sal < 
SUM> pI & Idl5 & Isbl & Isdl & Iscl & Isal IpI & sbl;

lyO = < OE> Ipl < SUM> pI & IdO & lyO & Iyl & lil & ly3 < SUM> pI & Idl & yO & Iyl & lil & ly3 < SUM> pI & Id2 & lyO & yl & lil & ly3 < SUM> pI & Id3 & yO & yl & lil & ly3 < SUM> pI & /d4 & lyO & Iyl & il & ly3 < SUM> pI & IdS & yO & Iyl & il & ly3 < SUM> pI & Id6 & lyO & yl & il & /y3 < SUM> pI & Id7 & yO & yl & il & 1y3 < SUM> pI & IdS & lyO & Iyl & lil & y3 < SUM> pI & Id9 & yO & Iyl & lil & y3 < SUM> pI & lyO & yl & lil & y3 & IdlO < SUM> pI & yO & yl & lil & IdB & y3 < SUM> pI & lyO & Iyl & Idl2 & il & y3 < SUM> pI & yO & Iyl & Idl3 & il & y3 < SUM> pI & lyO & Idl4 & yl & il & y3 < SUM> pI & IdlS & yO & yl & il & y3 IpI & sal;

Using the Cypress CY7C330 in Closed-Loop Servo Control

This application note examines a common facet of engineering design, control systems, and offers an alternative to common implementations. Along with an overview of the subject, this application note explores the tradeoffs among several implementation strategies. Also included here is a description of a PLD-based method that offloads the processing bandwidth requirements of a controlling CPU. Implemented in a Cypress CY7C330 PLD, this method has been successfully employed in a high-speed customer application: a laser mirror-positioning servo.

Control System Concepts

Control system theory is applied to areas as diverse as pneumatic controls and economic models. Analyzing control system behavior mathematically relies heavily on an understanding of Laplace and Z-transforms (see the References). However, this application note deals with the subject on a more practical level.

Control systems fall into two major categories: open loop and closed loop. An open-loop system generates outputs based on input conditions, but has no feedback from the output to verify or correct the output condition. Examples of open-loop systems include light switches (although you could reasonably argue that the human is the feedback loop) and self-timed, free-running traffic-control signals.

Closed-loop systems, on the other hand, provide information on system status to the controller. Examples of closed-loop systems include the eye-brain system you are using to read this line of text, the engine thermostat in most automobiles, and the print head of a dot-matrix printer. The closed-loop application described later in this application note consists of a motor-driven mirror that can rotate 360 degrees in either direction.

Closed-loop systems use information from the environment under control to influence the output. Block diagrams such as the one in Figure 1 typically represent such a control system.

Figure 1. Closed-Loop Servo System (diagram labels: input reference point, output, feedback, disturbances)

Control System Influences

In a closed-loop design, numerous factors influence the system behavior. Among them are:

Input, I(t): The system input is the signal from an external source that references the desired steady-state behavior. In the mirror servo system, the steady-state output is the absolute position at a given location within a given accuracy. The input is also known as the reference or set point.

Summing function: This is the section of the control system that determines the amount of error, E(t), currently in the system. It is the difference between the reference point and the controlled environment's present state. In a motor servo system, E(t) is the difference between the target reference position and the motor's present position. In an analog circuit, an operational amplifier usually implements the summing function.

Controller: Most control systems incorporate a controller that receives the error signal as an input and generates an output that attempts to reduce this error to within a specific tolerance (ideally 0). The controller has a control mode that determines how the controller should manipulate the error signal to produce a control signal. Common control modes include proportional, integral, derivative, and the combination of these three, PID. Approximately 80 to 90 percent of industrial control implementations use variations of the PID method.

Controlled device: The object of the control system is to have a controlled device perform satisfactorily. This is the motor, in the case of the mirror servo.

Output, O(t): This is the physical characteristic to be controlled. In an automobile thermostat system, O(t) is the engine's temperature. In the mirror servo, O(t) is the mirror's position.

Disturbance, D(t): Any influence on the system that negatively affects the desired output is called a disturbance. In an automobile, operation in bumper-to-bumper traffic that reduces airflow through the radiator is a disturbance to the thermostat.

This is only a partial list of the influences in a closed-loop control system, but the factors mentioned are the most significant for the mirror servo. (For more complete information, consult any of the References.)

Control System Parameters

Some of the parameters used to quantify control system behavior are:

Accuracy: the difference between ideal and actual steady-state system behavior.

Settling time: the time required to reach steady state after the reference point is changed or set.

Percentage overshoot: the difference between the reference point and the maximum excursion after passing through the reference point.

Jitter: a condition that occurs when the controlling element improperly overcompensates for an overshoot of the reference point. The overcompensation results in an undershoot that is again overcompensated for and produces overshoot.
Jitter can increase the system's settling time or result in unstable oscillations that never attain the reference point.

Rise time: the time required for the system's output to increase from 10 to 90 percent of the final value.

Control System Implementations

Control system implementations vary from purely analog to completely digital. Many popular implementations use a hybrid of digital and analog techniques. The approach described here uses a digital element to perform the summing, the control, and part of the feedback function. This approach and the pure-analog method are possibly the most often used.

Each approach has its own tradeoffs. Because analog systems continuously perform the summing function (usually with an op-amp), they are immune to the problems associated with data quantization. Thus analog systems usually offer excellent stability. Digital hybrids offer good sensitivity, immunity to noise, resolution, and flexibility, along with minimized drift. These systems are usually easier to design and cost less than the alternatives.

Microprocessors make it relatively easy to implement the system's controller and summing function on one chip. When you use a microprocessor, you can take advantage of several algorithms for generating the control signal. The simplest is proportional control, in which the correction made is proportional to the error signal. The value by which the error is scaled is the system's proportionality constant or gain. Proportional control offers an intuitively reasonable solution: the larger the error, the larger the corrective signal.

Another control algorithm is integral control, where the corrective signal is based on the error's time integral multiplied by a weighting factor. You typically calculate this value using a numeric approximation. Integral control is usually combined with proportional control to increase accuracy or reduce steady-state error.
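The proportional and integral algorithms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the design described in this note: the names `make_pi_controller`, `kp`, `ki`, and `dt` are hypothetical, and the rectangular (running-sum) integration is just one possible numeric approximation of the error's time integral.

```python
def make_pi_controller(kp, ki, dt):
    """Return a step function that applies proportional + integral control.

    kp: proportionality constant (gain)      -- hypothetical name
    ki: weighting factor for the integral    -- hypothetical name
    dt: sample period in seconds             -- hypothetical name
    """
    integral = 0.0  # running approximation of the error's time integral

    def step(reference, measured):
        nonlocal integral
        error = reference - measured       # summing function: E = I - O
        integral += error * dt             # rectangular numeric approximation
        return kp * error + ki * integral  # corrective (control) signal

    return step


if __name__ == "__main__":
    controller = make_pi_controller(kp=2.0, ki=0.5, dt=0.1)
    print(controller(10.0, 8.0))  # first sample, error = 2
    print(controller(10.0, 9.0))  # second sample, error = 1
```

With a constant error, the integral term keeps growing on every sample, which is how integral action removes a steady-state offset that proportional control alone would leave.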
One other control algorithm, derivative control, employs a corrective signal composed of the error signal's derivative over time multiplied by a weighting factor. Again, a numeric approximation is used to calculate the derivative. Combining this method with proportional control contributes a stabilizing influence to the system. However, noisy systems often omit the derivative function because it amplifies high-frequency disturbances.

When all three control algorithms are combined, they constitute proportional + integral + derivative, or PID, control. You can verify the influences of the integral and derivative methods on PID with analysis based on Laplace transforms. A PID tradeoff is that it reduces the processor bandwidth available to perform other tasks. PID systems also require a finite amount of time to calculate the output value.

Another factor to consider in a hybrid control system is the system's sampling/processing rate. Several reference books indicate that the sampling rate for a closed-loop control system should be significantly above the minimum dictated by Shannon's sampling theorem. Thus, rather than operating at the Nyquist rate (twice the highest frequency sampled), the sampling rate would be eight to ten times the highest sampled frequency. The reasons for this practice include an uncertainty associated with determining the sampled signal's highest frequency component, the possibility of aliasing, and the decrease in system stability that can result from a too-low sampling rate. Unfortunately, increasing the sampling rate quickly consumes the available bandwidth of a microprocessor-based implementation.

Using the CY7C330 in Servo Control

The Cypress CY7C330 can help offload the microprocessor in a high-speed servo control system. The application described here positions a mirror to form images with a laser beam. A previous implementation of this system used a 68000 microprocessor in the servo loop.
But as the number of tasks on the 68000 increased, the processor's ability to maintain a stable servo system became marginal. The CY7C330-based version maintains servo loop stability and frees processor throughput with a minimum of additional cost and complexity.

Several features of the CY7C330 are fundamental to understanding this design (see the CY7C330 block diagram in Figure 1 of "Understanding the CY7C330 Synchronous EPLD"). The dedicated input registers (Figure 2), for example, allow data to be loaded into the chip with either of two data input clocks: CLK1 (pin 2) or CLK2 (pin 3). You choose the input clock at program time via an EPROM configuration fuse.

Figure 2. CY7C330 Dedicated Input Register

The macrocells (Figure 3) also feature input registers, again with two clocks for data entry. The ability to three-state the macrocell output drivers and load data into the macrocell input register allows you to use these macrocell input registers to hold reference values. This is handy in applications such as up/down counters, where the input registers can hold the counter upper/lower limit. In the mirror-servo design, the macrocell input registers store the mirror's calculated target position and are clocked by CLK2.

While actively controlling the servo, this design uses the dedicated input registers for loading the present mirror position from the servo loop. In command mode, though, the dedicated input registers hold data from the microprocessor that is used to calculate a new target position. In either case, CLK1 loads the dedicated input registers.

Figure 3. CY7C330 I/O Macrocell

Mirror Servo Fundamentals

As Figure 1 shows, the basic mechanism of control loops is proportional feedback of the error signal. If this loop acts as a self-contained coprocessor to the main CPU, the CPU is only required to input the reference point to which the mirror should be moved. The CPU then no longer needs to perform the control algorithm at a pace equal to the sampling rate. Essentially, the processor can "set and forget" the servo coprocessor.

One way to implement this servo coprocessor is to add another microprocessor. This would add software and hardware (CPU, RAM, ROM, clock, I/O, interrupt control, etc.), and possibly require an in-circuit emulator for development if a low-cost microcontroller is used. Another possibility is to use an analog servo controller, but the accuracy requirements preclude this when drift is considered.

Another approach is to use several simple PLDs in a hybrid control-loop implementation. The system block diagram in Figure 4 shows the general approach used. The design employs three CY7C330s that each generate an 8-bit accumulate for 24-bit precision. The microprocessor provides the CY7C330s with a 24-bit position reference target for the mirror. The CY7C330s latch this 24-bit value into their on-board registers.

The CY7C330s perform the control loop's summing and proportional feedback functions. The PLDs compare the 24-bit desired position to the present position, which is maintained in an external 24-bit present-position counter. The result is the error multiplied by a fixed unity gain. This proportional control signal is then converted to an analog signal, which is converted to a current level to control the positioning mirror's motor.

The motor's shaft has an optical encoder that creates a sin-cos analog signal. When converted to digital form, this signal indicates the direction of rotation and provides a pulse that increments or decrements the external 24-bit present-position counter. This allows the loop to operate as fast as the slowest of the following elements: the CY7C330s configured as a multistage accumulator/subtractor, the D/A converter, or the A/D converter. The host microprocessor is completely decoupled from the servo loop. Should the microprocessor halt, the servo circuitry continues to maintain the desired reference position without intervention.

Figure 4. CY7C330 Servo Control Loop

Details of the Mirror Servo

Inside the mirror servo loop, the CY7C330 macrocell output registers act essentially as an accumulator. Depending on the mode of operation, the accumulator generates a value that is either a new servo-motor target position or the proportional error feedback value to the servo.

When the system starts, the macrocell input registers wake up with an initial value of 0. These registers are dedicated to holding the motor's present target position. At the same time, the external position counter is set to zero. Then the microprocessor steps the target position until the laser targets an alignment sensor.

The following steps accomplish this sequence: First, the outputs of the external 24-bit position counter are placed in a three-state condition. These outputs and the microprocessor's outputs act as inputs to the CY7C330's dedicated input registers. The processor drives a step value onto the inputs, and CLK1 clocks the value into the CY7C330's dedicated input registers. On CLK1's rising edge, this value is added to the present value in the macrocell input registers. The result of this addition moves to the macrocell output registers, and CLK2 clocks it into the same macrocell input registers that were a source value for the add. Thus, in this mode, the CY7C330s use the present value on the dedicated input pins to adjust the target position in the macrocell input registers with an accumulate cycle.

This target-position update cycle is pictured in Figure 5. The microprocessor always provides data as a delta or step from the present position. The accumulate can be either an add or subtract. Subtracts are accomplished by providing the step data from the microprocessor in 2's-complement form.

Figure 5. Target Update Mode Operation Sequence (diagram labels: microprocessor position data, dedicated input register clocked by CLK1, logic array programmed with accumulator equations, macrocell output register clocked by CLK, macrocell input register holding the target position and clocked by CLK2, Cin = 0)
(1) With external position counter's output three-state, host microprocessor drives position step data.
(2) Step data (provided in 2's complement form if a subtract is desired) is loaded into the 330 with CLK1.
(3) Step data is added or subtracted from present target position with logic equations to create new target position.
(4) New target position is clocked into macrocell output registers with CLK.
(5) On CLK2, the new target position is clocked into the macrocell input register.

After alignment, the position and accumulator values are reset to zero, and the system is ready for operation. In operation, the outputs from the microprocessor are three-stated, and the value from the 24-bit position counter is loaded into the dedicated input registers. This value is always provided in a 2's-complement form by inverting the position counter's outputs (1's complement) and setting the carry in (Cin) input to one. The position-counter value is thus subtracted from the present-target-position value stored in the macrocell input registers; this forms the proportional error feedback value used to control the servo motor. Figure 6 illustrates this servo control mode.

Note that the D/A converter does not need a 24-bit digital value for control. In practice, the circuit uses an 8-bit D/A value biased such that the eighth bit provides direction control (clockwise vs. counterclockwise). In the actual design, the upper 16 bits from the two most-significant CY7C330s are tested for rail High and Low conditions and generate two offscale bits each for these conditions.
The seven low-order bits, along with the four offscale bits, are passed to a second PLD (22V10), which drives the output to the D/A in the correct direction (eighth bit) and with the correct magnitude. If the four offscale bits indicate that the upper bits are all close to 0, the seven bits to the D/A are masked to 0. Likewise, if the upper bits are mostly 1, the D/A bits are set to 1. The offscale bits are generated to minimize the number of inputs required for the subsequent PLD that feeds the D/A converter. The determination of how to use the offscale bits for compensation in the second PLD is specific to a given application.

The Accumulator Design

The backbone of the logic in this design is the CY7C330-based accumulator. The logic that implements this synchronous full adder is described by an equation for the sum and an equation for the carry of a given bit. The equation for the sum (S) at bit position n, with inputs A, B, and carry in (Cin) is:

Sn = (An XOR Bn XOR Cin)

The equation for the carry out is:

COUTn = (An * Bn) + (An * Cin) + (Bn * Cin)

Figure 7 shows the equations for a 4-bit synchronous adder, whose sequence completes in four clocks. Because the objective is to calculate a complete 24-bit sum as quickly as possible, the equation for carry out (C0) from the adder's first bit can be substituted into the equation for the adder's second bit. This arrangement allows the first two bits to be added in a single clock cycle. Similarly, the equation for the carry out from the second bit can be substituted into the equation for the third sum, and so on. The resulting equations for three bits of substitution appear in Figure 8. The CY7C330's XOR product term is useful for reducing the number of product terms required for a given sum bit.
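The sum and carry equations above can be checked with a short simulation. The following Python sketch is illustrative only and is not part of the original design files; the function names (`full_adder`, `ripple_add`, `three_bit_substituted`, `servo_error`, `to_signed24`) are hypothetical. It evaluates the per-bit equations, confirms that folding each carry-out into the next sum bit gives the same 3-bit result as a bit-by-bit ripple add, and forms the 24-bit error by adding the inverted position with a carry-in of 1, as described for control mode.

```python
from itertools import product

MASK24 = (1 << 24) - 1


def full_adder(a, b, cin):
    """One bit: Sn = An XOR Bn XOR Cin; COUTn = An*Bn + An*Cin + Bn*Cin."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_add(a, b, cin, width):
    """Add two width-bit integers by chaining full adders, LSB first."""
    total, carry = 0, cin
    for n in range(width):
        s, carry = full_adder((a >> n) & 1, (b >> n) & 1, carry)
        total |= s << n
    return total, carry


def three_bit_substituted(a0, a1, a2, b0, b1, b2, cin):
    """Three sum bits and C2 as one combinational function of the inputs.

    c0 and c1 are never registered here; they are plain expressions of the
    inputs, which is what substituting them into later equations amounts to.
    """
    c0 = (a0 & b0) | (a0 & cin) | (b0 & cin)
    c1 = (a1 & b1) | (a1 & c0) | (b1 & c0)
    s0 = a0 ^ b0 ^ cin
    s1 = a1 ^ b1 ^ c0
    s2 = a2 ^ b2 ^ c1
    c2 = (a2 & b2) | (a2 & c1) | (b2 & c1)
    return s0 | (s1 << 1) | (s2 << 2), c2


def servo_error(target, position):
    """Target minus position, formed as target + NOT(position) + 1."""
    inverted = ~position & MASK24                   # 1's complement of counter
    error, _ = ripple_add(target, inverted, 1, 24)  # Cin = 1
    return error


def to_signed24(value):
    """Interpret a 24-bit pattern as a signed 2's-complement number."""
    return value - (1 << 24) if value & (1 << 23) else value


if __name__ == "__main__":
    # Exhaustively compare the substituted form with the ripple form.
    for a0, a1, a2, b0, b1, b2, cin in product((0, 1), repeat=7):
        a = a0 | (a1 << 1) | (a2 << 2)
        b = b0 | (b1 << 1) | (b2 << 2)
        assert three_bit_substituted(a0, a1, a2, b0, b1, b2, cin) == ripple_add(a, b, cin, 3)
    print(to_signed24(servo_error(1000, 400)))   # target ahead of position
    print(to_signed24(servo_error(400, 1000)))   # target behind position
```

The sign of the result corresponds to the direction information that the eighth D/A bit carries in the actual circuit; how the magnitude is scaled for the D/A is left out of this sketch.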
Even after Boolean reduction and use of the XOR product term, however, the fourth bit of the adder requires 30 product terms for the sum bit and 31 product terms for the carry out bit to generate a 4-bit result in a single clock cycle. Because a given CY7C330 macrocell provides a maximum of 19 product terms, the device must run the accumulate process over multiple 3-bit stages. The addition of the first three bits finishes after one clock cycle, the second three bits after two cycles, and so on. Implemented in three CY7C330s, the complete 24-bit accumulate therefore requires nine clock cycles. With 66-MHz devices, nine clock cycles translates to a complete calculation cycle of 120 ns.

Figure 6. Control Mode Operation Sequence (diagram labels: counter position data, dedicated input register clocked by CLK1, logic array programmed with accumulator equations, macrocell output register clocked by CLK providing the proportional error feedback, macrocell input register holding the target position, Cin = 1)
(1) CLK1 loads external 24-bit position data (in 1's complement form) into CY7C330's dedicated input register.
(2) With carry in set to 1, logic equations subtract current position from target position to form error amount.
(3) Error result is clocked into macrocell output register with CLK and is available to servo motor interface.

Appendix A lists the minimized equations for one of the three 8-bit adder stages. The syntax used in this example is that of the Cypress PLD ToolKit. Variables B0 - B7 are the eight dedicated inputs sourced from either the microprocessor or the 24-bit position counter. INCLK is the CLK1 pin on the CY7C330 used to clock in the B0 - B7 variables. Cin is the carry in from external logic (set to one for subtraction when in control mode on the first 8-bit adder stage) or from the previous stage of the adder. A0 - A7 are the sum outputs for either target update or control mode.

If the processor is updating the target position by a step increment, A0 - A7 are loaded into the macrocell input registers with CLK2 (named ACLK). When this new position update is being loaded, the output drivers of the macrocells are not three-stated with the OE pin or a product term equation. This allows ACLK to load the macrocell output registers (which have the newly calculated target position) into the macrocell input registers (which are used to hold the target position). C2 and C5 are internal carry-out bits generated from the first and second 3-bit adder stages, respectively. Finally, COUT is the carry out generated as either the final carry out or as the input to the next 8-bit adder stage's carry in.

Appendix B shows the implementation of the two upper CY7C330 stages. The equations for the accumulator function are the same as in the previous equations. The additions here are the equations for detecting rail conditions and generating the offscale bits.

Note that the intent here has been to focus on a different approach to implementing a closed-loop servo controller, with the CY7C330 as the central element, and to disclose the details unique to the CY7C330. Many hardware implementation details are left to the designer, including the D/A design, feedback design, and the lead/lag compensation.

/* Four Bit Adder - General Case */
/* Inputs:  An, Bn ; Inputs to be added at Bit n
            CIN    ; Carry in to Adder
   Outputs: Sn     ; Sum out for Bit n
            Cn     ; Carry out from adder stage n */
/* Equations to be reduced */
S0 = A0 XOR B0 XOR CIN
C0 = (A0 * B0) + (A0 * CIN) + (B0 * CIN)
S1 = A1 XOR B1 XOR C0
C1 = (A1 * B1) + (A1 * C0) + (B1 * C0)
S2 = A2 XOR B2 XOR C1
C2 = (A2 * B2) + (A2 * C1) + (B2 * C1)
S3 = A3 XOR B3 XOR C2
C3 = (A3 * B3) + (A3 * C2) + (B3 * C2)
/* C3 == Carry Out of Four Bit Adder */

Figure 7. Equations for Four-Bit Adder

/* Synchronous 3 bit adder - derivative of General Case */
/* Uses substitution of Carry Out in first 3 bits to generate 3 bit result in one clock cycle */
S0 = A0 XOR B0 XOR CIN
S1 = A1 XOR B1 XOR [(A0 * B0) + (A0 * CIN) + (B0 * CIN)]
S2 = A2 XOR B2 XOR {(A1 * B1)
       + (A1 * [(A0 * B0) + (A0 * CIN) + (B0 * CIN)])
       + (B1 * [(A0 * B0) + (A0 * CIN) + (B0 * CIN)])}
C2 = (A2 * B2)
     + (A2 * {(A1 * B1)
         + (A1 * [(A0 * B0) + (A0 * CIN) + (B0 * CIN)])
         + (B1 * [(A0 * B0) + (A0 * CIN) + (B0 * CIN)])})
     + (B2 * {(A1 * B1)
         + (A1 * [(A0 * B0) + (A0 * CIN) + (B0 * CIN)])
         + (B1 * [(A0 * B0) + (A0 * CIN) + (B0 * CIN)])})

Figure 8. Equations for a Synchronous 3-Bit Adder

References

Houpis & Lamont, Digital Control Systems - Theory, Hardware, Software (New York: McGraw-Hill, 1985)
Ball & Pratt, Engineering Applications of Microcomputers (Prentice Hall Int'l (UK) Ltd, 1986)
Kuo, Digital Control Systems (New York: Holt, Rinehart, & Winston, Inc., 1980)
Gayakwad & Sokoloff, Analog and Digital Control Systems (Prentice Hall, 1988)
Bollinger & Duffie, Computer Control of Machines & Processes (New York: Addison-Wesley, 1988)

For more information on implementing the CY7C330-based, 24-bit up/down position counter mentioned in this application note, consult the application note, "66-MHz CY7C330 Synchronous State Machine."

Appendix A. PLD ToolKit Code for an 8-Bit Accumulator

{Mark Aaldering - Cypress Semiconductor - 8-bit accumulator - June 14, 1989}
CY7C330;
CONFIGURE;
{ Dedicated input registers.
Default configuration is use of pin 2 for clock} Outclk(node=I), Inclk(node=2), Aclk(node=3 ), CIN(node=4), BO(node=5), Bl(node=6), B2(node= 7), B3(node=9), B4(node=10), B5(node=11), B6(node= 12), B7(node=13), oe(node=14), {Output nodes assigned to maximize available product term utilization. In the following declarations, the 7C330's macrocell outputs are configured as follows: ireg--This sets the macrocell feedback MUX for feedback from the macrocell input register instead of the (default) macrocell output register (rgd) iclk=3--This selects the clock on pin 3 instead of the default (used for the inputs above) of clock on pin 2 for the macrocell input register IOP--Same as ireg. nenbpt--Selects OE control from pin 14 instead of a product term} AO(node=28,iop,iclk=3,ireg,nenbpt), Al (node= 15,iop,iclk=3,ireg,nenbpt), A2(node=20,iop,iclk=3,ireg,nenbpt), A3(node=17,iop,iclk=3,ireg,nenbpt), A4(node=26,IOP,iclk=3,ireg,nenbpt), A5(node=23,IOP,iclk=3,ireg,nenbpt), A6(node= 19 ,IOP,iclk=3,ireg,nenbpt), A7 (node=24,IOP,iclk=3,ireg,nenbpt), COUT(node= 18,nenbpt), C2(node= 32), C5(node= 34), { { { { { { { { Sum 0 / Accum. Feedback Register Sum 1 / Accum. Feedback Register Sum 2 / Accum. Feedback Register Sum 3 / Accum. Feedback Register Sum 4 / Accum. Feedback Register Sum 5 / Accum~ Feedback Register Sum 6 / Accum. Feedback Register Sum 7 / Accum. Feedback Register {Carry out } { Carry 2 - Hidden } { Carry 5 - Hidden} { Available nodes # P.T.'s} { I/O macrocell - 16 - 19 } { I/O macrocell - 25 - 17 } { I/O macrocell - 27 - 19 } { hidden macrocell - 31 - 13 } { hidden macrocell - 33 - 11 } {End of configuration section} 6-239 0 1 2 3 4 5 6 7 } } } } } } } } Appendix A PLD ToolKit Code for an 8-Bit Accumulator (continued) {Logic equation section} EQUATIONS; {AO: 2 product terms, pin 28: 9 P.T. Available} lAO < XSUM> CIN < SUM> lAO * IBO + AO* BO; {AI: 6 product terms, pin 15: 9 P.T. 
Available} IAI < XSUM> IAt < SUM> Bl * IBO * ICIN + IBI * BO * CIN + IB 1 * AO * CIN + IBI * AO * BO + B 1 * lAO * ICIN + B 1 * lAO * BO; {A2: 14 product terms, pin 20: 15 P.T. Available} IA2 < XSUM> IA2 < SUM> B2*/AI */BI IB2 * B 1 * BO * CIN + IB2 * Al * BO * CIN + fB2* Bl* AO* CIN + IB2 * Al * AO * CIN + IB2 * B 1 * AO * BO + IB2 * At * AO * BO + B2 * IBI * IBO * ICIN + B2 * IAt * IBO * ICIN + IB2* Al* Bl + B2 * IBI * lAO * fCIN + B2 * fAI * lAO * fCIN + B2 * fB 1 * lAO * IBO + B2 * fAI * fAO * fBO; + {C2: 15 product terms, virtual pin 32: 17 P.T. Available} C2 < SUM> + + + + + + + + + + + + + + B2 * B 1 * BO * CIN A2 * Bl * BO * CIN B2 * Al * BO * CIN A2 * Al * BO * CIN B2 * B 1 * AO * CIN A2 * B 1 * AO * CIN B2 * Al * AO * CIN A2 * Al * AO * CIN B2 * B 1 * AO * BO A2 * B 1 * AO * BO B2'" Al * AO * BO A2 * Al * AO * BO B2 * Al * Bl A2 * Al * Bl A2 * B2; 6-240 Appendix A PLD ToolKit Code for an 8-Bit Accumulator (continued) {A3: 2 product terms, pin 17: 11 P.T. Available} IA3 = < XSUM> C2 < SUM> IA3 * IB3 A3 * B3; + {A4: 6 product terms, pin 26: 11 P.T. Available} IA4 = < XSUM> IA4 < SUM> B4 * IB3 * IC2 + IB4 * B3 * C2 + IB4 * A3 * C2 + IB4 * A3 * B3 + B4 * I A3 * IC2 + B4 * I A3 * B3; {A5: 14 product terms, pin 23: 15 P.T. Available} IA5 = < XSUM> IA5 < SUM> B5 * IA4 * IB4 + IB5 * B4 * B3 * C2 + IB5 * A4 * B3 * C2 + IB5 * B4 * A3 * C2 + IB5 * A4 * A3 * C2 + IB5 * B4 * A3 * B3 + IB5 * A4 * A3 * B3 + B5 * IB4 * IB3 * IC2 + B5 * IA4 * IB3 * IC2 + IB5 * A4 * B4 + B5 * IB4 * I A3 * IC2 + B5 * IA4 * IA3 * IC2 + B5 * IB4 * IA3 * IB3 + B5 * I A4 * I A3 * IB3; {C5: 15 product terms, virtual pin 34: 19 P.T. Available} C5= < SUM> B5 * B4 * B3 * C2 + A5 * B4 * B3 * C2 + B5 * A4 * B3 * C2 + A5 * A4 * B3 * C2 + B5 * B4 * A3 * C2 + A5 * B4 * A3 * C2 + B5 * A4 * A3 * C2 + A5 * A4 * A3 * C2 + B5 * B4 * A3 * B3 + A5 * B4 * A3 * B3 + B5 * A4 * A3 * B3 + A5 * A4 * A3 * B3 + B5 * A4 * B4 A5 * A4 * B4 + A5 * B5; + 6-241 Appendix A. 
PLD ToolKit Code for an 8·Bit Accumulator (continued) {A6: 2 product terms, pin 19: 13 P.T. Available} IA6 = < XSUM> CS < SUM> IA6 * IB6 + A6* B6; {A7: 6 product terms, pin 24: 13 P.T. Available} IA7 = < XSUM> IA7 < SUM> B7 * IB6 * ICS + IB7 * B6 * C5 + IB7 * A6 * CS + IB7 * A6 * B6 + B7 * I A6 * ICS + B7 * I A6 * B6; {COUT: 7 product terms, pin 18: 17 P.T. Available} ICOUT = < SUM> IB7 * IB6 * ICS I A7 * IB6 * ICS IB7 * I A6 * ICS I A 7 * I A6 * IC5 + IB7 * I A6 * IB6 + I A7 * I A6 * IB6 + IA7 * IB7; + + + {End of file.} 6-242 Appendix B. PLD ToolKit Code for an Accumulator with Rail Condition {Mark Aaldering - Cypress Semiconductor - 8-bit accumulator with rail condition outputs - June 14, 1989} CY7C330; CONFIGURE; { Dedicated input registers. Default configuration is use of pin 2 for clock } Outclk(node=I), Inclk(node=2), Aclk(node=3), Cin(node=4), BO(node=5), Bl(node=6), B2(node= 7), B3(node=9), B4(node= 10), B5(node=11), B6(node=12), B7 (node= 13), oe(node=14), {Output nodes assigned to maximize available product term utilization. In the following declarations, the 330's macrocell outputs are configured as follows: ireg--This sets the macrocell feedback MUX for feedback from the macrocell input register instead of the (default) macrocell output register (rgd) iclk=3--This selects the clock on pin 3 instead of the default (used for the inputs above) of clock on pin 2 for the macrocell input register IOP--Same as ireg. nenbpt--Selects OE control from pin 14 instead of a product term } AO(node=28,iop,iclk=3,ireg,nenbpt), Al (node= 15,iop,iclk=3,ireg,nenbpt), A2(node=20,iop,iclk=3,ireg,nenbpt), A3(node=17 ,iop,iclk=3,ireg,nenbpt), A4(node=26,iop,iclk=3,ireg,nenbpt), A5(node=23,iop,iclk=3,ireg,nenbpt), A6(node=19,iop,iclk=3,ireg,nenbpt), A7(node=24,iop,iclk=3,ireg,nenbpt), COUT(node= 18,nenbpt), C2(node= 32), C5(node= 34), RO(node= 16,nenbpt), R l(node= 25,nenbpt), { Sum 0 I Accum. Feedback Register 0 { Sum 1 I ACCUID. Feedback Register 1 { Sum 2 I ACCUID. 
Feedback Register 2 { Sum 3 I ACCUID. Feedback Register 3 { Sum 4 I Accum. Feedback Register 4 { Sum 5 I ACCUID. Feedback Register 5 { Sum 6 I ACCUID. Feedback Register 6 { Sum 7 I ACCUID. Feedback Register 7 {Carry Out} { Carry 2 - Hidden } { Carry 5 - Hidden} {Rail Bit O} { Rail bit 1 } # P.T.'s} { Available nodes { I/O macrocell - 27 - 19 } } { Hidden macrocell- 31 - 13 { Hidden macrocell - 33 - 11 } {End of configuration section} 6-243 } } } } } } } } ~~ -==l!Ir -;~~~~~~~~~~C~Y7~C~3~3~O~:~C~lo~s~ed~.~L~o~o~p~S~e~rv~o~C~on~t~r=ol SEMICOIDUCTOR_ Appendix B. PLD ToolKit Code for an Accumulator with Rail Condition (continued) {Logic equation section} EQUATIONS; {AO: 2 product terms, pin 28: 9 P.T. Available} I AD = < XSUM> CIN < SUM> lAO * IBO + AO * BO; {AI: 6 product terms, pin 15: 9 P.T. Available} IAI = < XSUM> IAI < SUM> Bl * IBO * ICIN + IB 1 * BO * CIN + IBl* AO* CIN + IBI * AO * BO + B 1 * lAO * ICIN + B 1 * lAO * BO; {A2: 14 product terms, pin 20: 15 P.T. Available} IA2 = < XSUM> IA2 < SUM> B2 * IAI * IBI + IB2 * B 1 * BO * CIN + IB2 * Al * BO * CIN + IB2 * Bl * AO * CIN + IB2 * Al * AO * CIN + IB2 * Bl * AO * BO + IB2 * Al * AO * BO + B2 * IB 1 * IBO * ICIN + B2 * IAI * IBO * ICIN + IB2* Al* Bl + B2*/BI * lAO */CIN + B2 * IAI * lAO * ICIN + B2*/Bl*/AO*/BO + B2*/Al*/AO*/BO; {C2: 15 product terms, virtual pin 32: 17 P.T. Available} C2= < SUM> + + + + + + + + + + + + + + B2 * B 1 * BO * CIN A2 * B 1 * BO * CIN B2 * Al * BO * CIN A2 * Al * BO * CIN B2 * B 1 * AO * CIN A2 * Bl * AO * CIN B2 * Al * AO * CIN A2 * Al * AO * CIN B2 * B 1 * AO * BO A2 * Bl * AO * BO B2 * Al * AO * BO A2 * Al * AO * BO B2 * Al * Bl A2 * Al * Bl A2 * B2; 6-244 Appendix B. PLD ToolKit Code for an Accumulator with Rail Condition (continued) {A3: 2 product terms, pin 17: 11 P.T. Available} IA3 = < XSUM> C2 < SUM> IA3 * IB3 + A3 * B3; {A4: 6 product terms, pin 26: 11 P.T. 
Available} I A4 I A4 < SUM> B4 * IB3 * IC2 + IB4 * B3 * C2 + IB4 * A3 * C2 + IB4 * A3 * B3 + B4 * I A3 * IC2 + B4 * I A3 * B3; {AS: 14 product terms, pin 23: lS P.T. Available} IA5 = < XSUM> IA5 < SUM> BS*IA4*/B4 IB5 * B4 * B3 * C2 + IB5 * A4 * B3 * C2 + IB5 * B4 * A3 * C2 + IB5 * A4 * A3 * C2 + IBS * B4 * A3 * B3 + IB5 * A4 * A3 * B3 + B5 * IB4 * IB3 * IC2 + B5 * I A4 * IB3 * IC2 + IBS * A4 * B4 + BS * IB4 * IA3 * IC2 + BS * IA4 * IA3 * IC2 + B5 * IB4 * I A3 * IB3 + B5 * IA4 * IA3 * IB3; + {CS: lS product terms, virtual pin 34: 19 P.T. Available} C5 = < SUM> BS * B4 * B3 * C2 A5 * B4 * B3 * C2 + B5 * A4 * B3 * C2 + + AS * A4 * B3 * C2 + BS * B4 * A3 * C2 + AS * B4 * A3 * C2 + B5 * A4 * A3 * C2 + A5 * A4 * A3 * C2 + BS * B4 * A3 * B3 + AS * B4 * A3 * B3 + B5 * A4 * A3 * B3 + AS * A4 * A3 * B3 + B5 * A4 * B4 + A5 * A4 * B4 + A5 * B5; 6-245 5y.:~ .;;;;;;;;;;;=========;;;;;C;;;;;Y7=C;;;;;3;;;;;3;;;;;O;;;;;:=C;;;;;lo;;;;;;s;;;;;;ed;;;;;;-;;;;;;L;;;;;;o;;;;;o!;;;p;;;;;;S;;;;;;e;;;;;;rv;;;;;o;;;;;;C=oD;;;;;;t;;;;;;;r=ol Appendix B. PLD ToolKit Code for an Accumulator with Rail Condition (continued) {A6: 2 product terms, pin 19: 13 P.T. Available} IA6 = < XSUM> C5 < SUM> IA6 * IB6 + A6 * B6; {A7: 6 product terms, pin 24: 13 P.T. Available} IA7 = < XSUM> IA7 < SUM> B7 * IB6 * IC5 + IB7 * B6 * C5 + IB7 * A6 * C5 + IB7 * A6 * B6 + B7 * I A6 * IC5 + B7 * I A6 * B6; {COUT: 7 product terms, pin 18: 17 P.T. 
Available} ICOUT =< SUM> IB7 * IB6 * IC5 + I A7 * IB6 * IC5 + IB7 * I A6 * IC5 + I A 7 * I A6 * IC5 + IB7 * I A6 * IB6 + I A7 * I A6 * IB6 + IA7 * IB7; {RO: rail bit 0; Arbitrarily equation chosen to detect when upper 5 bits are all 1 - this decision is a matter of preference output active low} IRO = < SUM> A7 * A6 * A5 * A4 * A3; { R1: rail bit 1; Again, arbitrarily chosen to reflect value of carry out, therefore this is a redundant output - active 10\ output} IRI = < SUM> COUT; {End of me} 6-246 ~4 ;;;;;; .= CYPRESS , SEMICONDUCTOR FDDI Physical Connection Management Using the CY7C330 This application note shows how you can use the Cypress CY7C330 programmable logic device (PLD) to implement the Physical Connection Management (PCM) state machine specified in the Station Management (SMT) of the Fiber Distributed Data Interface (FDDI) standard. Along with a brief overview of the FDDI standard, this application note explains the CY7C330's features, the design methodology used in this design, and an example of how you can synthesize a complex function into this device. Note, however, that this is not meant to be an in-depth tutorial of the FDDI standard and its various layers. 1988 update of the SMT specification. The final FDDI specification might differ slightly, but the design methodology remains the same. The PMD layer is the lowest and specifies the network's connectors, transceivers, and bypass switches. The PRY layer specifies the type of encoding used on the data (4B/5B) and specifies a set of line states. These line states implement a handshake mechanism between PRYs of adjacent nodes. The MAC layer performs higher-level, peer-to-peer communications. It also provides for system timer support, packet framing, and responses to various types of errors in the network. The SMT layer controls the activities of the MAC, PHY, and PMD. SMT includes functions such as connection management (CMT), fault detection, and ring reconfiguration. 
The CMT is the portion of Station Management that controls the insertion, removal and logical connection of the PRY entities. Within the CMT is an area known as the Physical Connection Management (PCM). A chart showing a hierarchical view of the location of FDDIOverview FDDI is a lOO-Mbits/s dual token ring network that can connect as many as 500 nodes with a maximum linkto-link distance of 2 km and a total network circumference of about 100 km. The network employs a primary and a secondary ring. The primary ring handles data transmission, and the secondary ring mainly provides fault tolerance, but can be used for data transmission as well. FDDI is a token ring network, in which rotating a token grants network access. The node with the token can transmit data. This arrangement ensures a deterministic, collision-free network, independent of the number of stations in the network. Because of the dual-ring topology, FDDI defines a fault-recovery mechanism. If a fault is detected, such as a broken fiber-optic cable, the network can be restored by routing around the break with the second ring. This function is largely controlled by the state machine shown later, which is implemented with the CY7C330. The ANSI X3T9.5 standards committee controls the FDDI standard, which was developed using the Open Systems Interconnection (OSI) model; FDDI implements the model's physical and data-link layers. The four FDDI layers are Physical Media Dependent (PMD), Physical (PRY), Media Access Control (MAC), and Station Management (SMT). The state machine example described later in this application note was developed with the December 2, I SMT I I MAC I I I CMTII I PHY I PMD I Figure 1. FDDI Hierarchy 6-247 PCM ~ ==-~~~~~~~~~~~~~~~~~~~~F~D~D~I~U~S~in~g~th~e~C~Y7~·~C~3~30 the PCM appears in Figure 1. 
The PCM provides the signals to perform the following functions:

    Initialize a connection
    Reject a marginal connection
    Support maintenance

Figure 2 shows the synthesized state machine that performs these activities. This state machine is based on version 9.1 of the PCM state machine described in the SMT specification. To keep within the CY7C330's 25 I/O constraint, a small amount of logic is implemented outside the CY7C330. For instance, the PCM uses two timers. The CY7C330 does not include these timers, but two decoded signals (timer1 and timer2) indicate that the timers have reached specific values. The timer1 and timer2 signals are inputs to the CY7C330. The chart in Figure 3 shows all the macrofunctions, how they are decoded, and their functions.

Introduction to the CY7C330

The CY7C330 is a synchronous, 28-pin PLD. It is packaged in a 300-mil DIP as well as several types of surface mount packages, including a leadless ceramic chip carrier (LCC) and a plastic leaded chip carrier (PLCC). The device is fabricated with the Cypress 0.8-micron CMOS process and is available in speeds of 33, 50, and 66 MHz. The CY7C330 is also available as a military device in speeds of 33, 40, and 50 MHz. The device is optimized to implement high-speed state machine designs. The CY7C330's features can be generalized into four groups:

1. Dedicated input cell
2. Product term array
3. I/O macrocell
4. Hidden state-register macrocell

The CY7C330 contains 11 of the dedicated input macrocells. This cell (Figure 4) contains a D flip-flop and a programmable multiplexer (mux) that allows a choice of two input clocks. The two input clocks are CK1 and CK2, which come directly from pins 2 and 3 of the device, respectively. Note that you cannot bypass any of the CY7C330's registers. The device is purely synchronous in nature.

As with any PLD, the CY7C330's product term array (see the CY7C330 block diagram in Figure 1 of "Understanding the CY7C330") synthesizes the logical connections of the design.
The product terms control a global reset, a global preset, an Exclusive-OR gate, the output enables, and the product terms that go to the D input of the flip-flops in the output macrocells. (Most of these features are covered later in the explanation of the macrocell.) The device offers product term distribution that varies between nine and 19, depending on which output macrocell is being addressed. The 19 product terms become the limiting factor in the complexity of the design.

[Figure 2. PCM State Machine (state diagram)]

MACRO NAME   SYNTHESIZED SIGNAL   FUNCTION
MLS          !MLS                 Master Line State
ILS          !ILS                 Idle Line State
HLS          !HLS                 Halt Line State
QLS          !QLS                 Quiet Line State
pc_start     !pc0 & !pc1          Start PCM state machine
pc_reject    !pc0 & pc1           Enter Reject state
sc_join      pc0 & !pc1           Incorporate connection into token path
pcstop       !pc_stop             PCM state machine to enter OFF state
pcmaint      !pc_maint            Enter maintenance state
time1        !timer1              See timer explanations below.
time2        !timer2              See timer explanations below.
n_neq_10     !n0 & !n1            Counter indicating 10 bits of data have not been received or transmitted
n_eq_7       !n0 & n1             Counter indicating 7 bits have been transmitted or received
n_eq_9       n0 & !n1             Counter indicating 9 bits have been transmitted or received
n_eq_10      n0 & n1              Counter indicating 10 bits have been transmitted or received
noise        !noise_count         Noise counter threshold
val_n        Val_n                Transmitted value n
val_8        !Val_8               Transmitted or received value = 8
val_9        !Val_9               Transmitted or received value = 9

TIMER VALUES
Timer 1:   0 ms     TB_Min      Minimum break time for link
           0.2 ms   A_Max       Maximum time required to achieve signal acquisition
           480 ns   LS_Min      Length of time reception of ILS
           15 us    LS_Max      Max time required for line state recognition
           25 ms    I_Max       Max optical bypass insertion/deinsertion time
           200 ms   T_next(9)   Default time for MAC loopback
Timer 2:   100 ms   T_Out       Signalling timeout

Figure 3. Macro Definitions

The I/O macrocell (see Figure 1 in "Using ABEL to program the CY7C330") contains two D flip-flops. One of the D flip-flops clocks data from the array to either the output pin or back to the array and is intended to be a state register. The I/O macrocell has a different clock than the input registers, called CLK, which comes directly from pin 1. The other D flip-flop is an input register, which can clock data from the I/O pin into the array. This flip-flop can be clocked from CK1 or CK2, as with the dedicated input cell.

[Figure 4. Input Macrocell (input pin, input register clocked from CLK1 on pin 2 or CLK2 on pin 3, and input buffer to the array)]

As mentioned earlier, the product term array feeds an XOR gate, which in turn feeds the D input of the state register. This gives you quite a bit of design flexibility. For example, you can use the XOR as an inverter by setting the XOR product term to a one. You can use the XOR to make the flip-flop a D, T, or JK type. Wrapping the Q output back to the XOR input changes the flip-flop from D to T, for instance. The design example described later uses this feature.

The output macrocell also allows you to choose the output-enable control for the pin. The output enable can come from a product term or directly from pin 14. The CY7C330 provides 12 I/O macrocells.

The hidden-state macrocell (Figure 5) contains a state register with no output pin associated with it. The CY7C330 contains four hidden-state macrocells. You can use these macrocells to synthesize a small 4-bit internal state machine or perform any function that is required only internally to the device itself.

The timing required for this design is 12.5 MHz, which allows use of the slowest CY7C330 version (33 MHz). The design requires one clock, although two pins are dedicated for clocks in the CY7C330. In this design, pins 1 and 2 are tied together externally, connecting the input-register and state-register clocks together.
In the ABEL source code described below, the labels for the two clocks are CKS and CK1.

Design Methodology

The PCM design is implemented using the state machine syntax in ABEL version 3.0. The first-pass ABEL source code appears in Appendix A. Note that the state machine requires 31 states. This means that the state machine is implemented with 5 bits, which gives 32 total states and leaves one illegal state. When the design is run at reduction level 4 (the maximum reduction in ABEL), the software responds that the design requires more than 30 product terms per output. This is far more than the 19 product terms that are possible on any one output.

[Figure 5. The CY7C330 Buried Register]

At first glance, you might assume that the design is far too complex for the CY7C330. But further procedures make this implementation possible. To understand these procedures, it is necessary to understand some facts about ABEL.

ABEL reduces a design to a sum of products and does not make use of the XOR gate in the macrocell. To use the XOR gate, you must specify it in Boolean equation form and run the reduction at level 0. Specifying T flip-flops in version 3.0 also causes ABEL to reduce to a sum of products and not create T flip-flops using the XOR gate. ABEL 3.1 accepts T flip-flops, however, and corrects this situation.

Product Term Squeezing

The first method for reducing the number of product terms is to increase the number of bits in the state machine from 5 to 6 bits. Although the state machine only requires 31 states, a much broader range of choice results from having 64 possibilities for placing the states.

          Decimal    Binary
Case 1.   6          000110
          9          001001    (4 bits toggle)
Case 2.   6          000110
          7          000111    (1 bit toggles)

Figure 6. State Change Comparison

The next procedure involves changing from D flip-flops to T flip-flops.
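The D-to-T conversion and the bit counts in Figure 6 can be illustrated with a short Python sketch. It is only a behavioral illustration, not part of the original design files: the function names are hypothetical, and the T flip-flop is modeled as a D flip-flop whose D input is Q XOR T, which is the arrangement the macrocell's XOR gate makes possible.

```python
from typing import List


def t_flip_flop_next(q: int, t: int) -> int:
    """Next value of a D flip-flop whose D input is Q XOR T.

    With T = 1 the stored bit toggles; with T = 0 it holds.
    """
    return q ^ t


def toggled_bits(current: int, nxt: int, width: int = 6) -> List[int]:
    """Return the bit positions (0 = Q0) that differ between two state codes."""
    diff = current ^ nxt
    return [bit for bit in range(width) if (diff >> bit) & 1]


if __name__ == "__main__":
    # The two cases from Figure 6, using 6-bit state codes.
    for a, b in [(6, 9), (6, 7)]:
        bits = toggled_bits(a, b)
        print(f"{a:06b} -> {b:06b}: {len(bits)} bit(s) toggle, positions {bits}")
```

Each toggling bit needs its own product term in that bit's T equation, so a state assignment that keeps this count low on the most frequently used transitions keeps the per-output product term count low as well.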
T flip-flops are more efficient because when the T input is High, the flip-flop toggles. Otherwise, the flip-flop retains its previous state. Because a T flip-flop needs only one product term for a transition to occur, the state machine can be optimized by choosing state transitions that change a minimum number of bits. For example, a transition between states 6 and 9 requires more bits to change than a transition between states 6 and 7 (Figure 6). The 6-to-9 transition requires four product terms (one for each bit that toggles), while the 6-to-7 transition requires only one product term. Because the number of total states has been increased from 32 to 64 by adding one more bit to the state machine, you gain much more flexibility in choosing states. Carefully choosing the states in a state machine is the easiest way to reduce the number of product terms required.

Another way to make the design implementation more efficient is to use the CY7C330's synchronous global reset and preset to deal with illegal states. (Initially, the state machine is in state 0 because the CY7C330 has a power-on reset.) It is good design practice to make provisions for illegal states. Although an illegal state should never occur, the state machine should be able to recover from such a state. Often the recovery mechanism is built into the state machine itself, which requires more product terms. If an illegal state is detected in this design, the state machine re-initializes itself and goes to state 0. Instead of building this requirement into the design, you can use a hidden register to detect the occurrence of illegal states. The signal from that register controls the CY7C330's synchronous reset, which returns the state machine to state 0. The CY7C330's synchronous nature causes the state machine to go to state 0 two clocks after the illegal state is encountered. One clock is required to detect the illegal state, and one clock is required to reset the device.
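The two-clock recovery latency can be seen in a simplified Python model. This is a sketch under stated assumptions rather than the design's actual logic: the set of legal states below is hypothetical (the full list of legal codes is not given here), the state machine simply holds its state when no transition applies, and the hidden register is modeled as a flag that samples "current state is illegal" on each clock edge.

```python
from typing import Callable, List, Set, Tuple


def clock_edge(state: int, illegal_flag: bool, legal_states: Set[int],
               next_state: Callable[[int], int]) -> Tuple[int, bool]:
    """Advance the model by one rising clock edge."""
    # The hidden register samples whether the current state is illegal.
    new_flag = state not in legal_states
    # The synchronous reset acts on the flag registered on the previous edge.
    new_state = 0 if illegal_flag else next_state(state)
    return new_state, new_flag


def simulate(start: int, edges: int, legal_states: Set[int],
             next_state: Callable[[int], int]) -> List[int]:
    """Return the sequence of states, beginning with the start state."""
    trace = [start]
    state, flag = start, False
    for _ in range(edges):
        state, flag = clock_edge(state, flag, legal_states, next_state)
        trace.append(state)
    return trace


if __name__ == "__main__":
    LEGAL = {0, 1, 2, 3, 63}      # hypothetical legal state codes
    hold = lambda s: s            # hypothetical: no transition condition is true
    print(simulate(5, 3, LEGAL, hold))
```

Starting in the (hypothetical) illegal state 5, the first edge only sets the detect flag, and the second edge forces state 0, matching the detect-then-reset sequence described above.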
This requirement is acceptable for this application.

In this design, it was noticed that the condition pcmaint was encountered in every state; the state machine was unconditionally required to go to this state. To reduce the state machine further, the state assigned to this condition is 63 (111111 binary). The synchronous preset is used to detect this signal. The assertion of pcmaint forces the state machine to state 63, thus avoiding the use of any product terms in the main body of the design.

This design requires several synchronous resets: an external pin (RST), the illegal state detect, and the signal pc_stop. Because only one product term is allowed for the device's synchronous reset, the other two resets must be developed by ANDing the reset signal with every product term associated with the outputs that are to be reset. This performs the same function as having multiple product terms for the synchronous reset but does not use any additional resources in the CY7C330.

Keep in mind that the CY7C330 has varied product term distribution. The state registers associated with pins 16 and 27 have 19 product terms. Assign the state outputs that require the most product terms to these pins. In this example, Q0 requires 18 product terms, and Q5 requires 17. These outputs are assigned to pins 27 and 16. The remaining outputs are placed in the same manner.

Converting the state machine to Boolean equations is a straightforward procedure. By examining the state transitions, you can extract the Boolean equations. The reduced design is shown in Figure 7.

State !S48:
    if (HLS) then !S52
    else if (QLS # time2) then !S32
    else !S48;

    48 = 110000 (binary)
    52 = 110100
    Q2 is the only bit that transitions

Therefore, a product term of:

    Q5 & Q4 & !Q3 & !Q2 & !Q1 & !Q0 & HLS
    (the first six literals decode state 48)

would be added to the equation for Q2. To continue the example:

    48 = 110000
    32 = 100000
    Q4 is the only bit that transitions

Therefore, the product terms of:

    Q5 & Q4 & !Q3 & !Q2 & !Q1 & !Q0 & QLS #
    Q5 & Q4 & !Q3 & !Q2 & !Q1 & !Q0 & time2
    (the first six literals decode state 48)

would be added to the equation for Q4.

Figure 7. Boolean Equation Extraction Example

The Cypress PLD ToolKit is used as the development platform for the reduction process. The PLD ToolKit is a low-cost software development system for all Cypress PLDs. Although the reduced equations could have been obtained using ABEL, in many ways the PLD ToolKit is easier to use and more tailored to the Cypress devices. The PLD ToolKit source file appears in Appendix B. The PLD ToolKit also features a mouse-driven, interactive simulator/waveform editor that makes design verification easy.

Appendix A. Original ABEL Source Code

module pcm
flag '-r3'
title 'Physical Connection Management (PCM) State Machine version 9.1
Steve Traum  Cypress Semiconductor  March 27, 1989'

    U1 device 'P330';

"Inputs
    CKS,CK1,rst          pin 1,2,3;
    pc0,pc1              pin 4,5;
    timer1               pin 6;
    timer2               pin 7;
    mls,ils,hls,qls      pin 9,10,11,12;
    Val_n                pin 13;
    n0,n1                pin 14,15;
    Val_8                pin 16;
    Val_9                pin 17;
    noise_count          pin 18;
    pc_stop              pin 19;
    pc_maint             pin 20;

    n1                   istype 'feed_pin';
    Val_8                istype 'feed_pin';
    Val_9                istype 'feed_pin';
    noise_count          istype 'feed_pin';

"Outputs
    Reset                node 29;
    Q5,Q4,Q3,Q2,Q1,Q0    pin 28,27,26,25,24,23;
    Q5,Q4,Q3,Q2,Q1,Q0    istype 'pos,reg';

    Qstate = [Q5,Q4,Q3,Q2,Q1,Q0];

"declarations
    High,Low = 1,0;
    H,L,C,X,Z = 1,0,.C.,.X.,.Z.;

"Qstate
    S0  = ^b000000;  S1  = ^b000001;  S2  = ^b000010;  S3  = ^b000011;  S4  = ^b000100;
    S5  = ^b000101;  S6  = ^b000110;  S7  = ^b000111;  S8  = ^b001000;  S9  = ^b001001;
    S10 = ^b001010;  S11 = ^b001011;  S12 = ^b001100;  S13 = ^b001101;  S14 = ^b001110;
    S15 = ^b001111;  S16 = ^b010000;  S17 = ^b010001;  S18 = ^b010010;  S19 = ^b010011;
    S20 = ^b010100;  S21 = ^b010101;  S22 = ^b010110;  S23 = ^b010111;  S24 = ^b011000;
    S25 = ^b011001;  S26 = ^b011010;  S27 = ^b011011;  S28 = ^b011100;  S29 = ^b011101;
    S30 = ^b011110;  S31 = ^b011111;  S32 = ^b100000;  S33 = ^b100001;  S34 = ^b100010;
    S35 = ^b100011;  S36 = ^b100100;  S37 = ^b100101;  S38 = ^b100110;  S39 = ^b100111;
    S40 = ^b101000;  S41 = ^b101001;  S42 = ^b101010;  S43 = ^b101011;  S44 = ^b101100;
    S45 = ^b101101;  S46 = ^b101110;  S47 = ^b101111;  S48 = ^b110000;  S49 = ^b110001;
    S50 = ^b110010;  S51 = ^b110011;  S52 = ^b110100;  S53 = ^b110101;  S54 = ^b110110;
    S55 = ^b110111;  S56 = ^b111000;  S57 = ^b111001;  S58 = ^b111010;  S59 = ^b111011;
    S60 = ^b111100;  S61 = ^b111101;  S62 = ^b111110;  S63 = ^b111111;

    MLS       MACRO {(!mls)};
    ILS       MACRO {(!ils)};
    HLS       MACRO {(!hls)};
    QLS       MACRO {(!qls)};
    pc_start  MACRO {(!pc0 & !pc1)};
    pc_reject MACRO {(!pc0 & pc1)};
    sc_join   MACRO {(pc0 & !pc1)};
    pcstop    MACRO {(!pc_stop)};
    pcmaint   MACRO {(!pc_maint)};
    time1     MACRO {(!timer1)};
    time2     MACRO {(!timer2)};
    n_neq_10  MACRO {(!n0 & !n1)};
    n_eq_7    MACRO {(!n0 & n1)};
    n_eq_9    MACRO {(n0 & !n1)};
    n_e

Q0 & !ILSTATE & pc_stop #
!Q5 & !Q4 & !Q3 & !Q2 & !Q1 & Q0 & !HLS & !ILSTATE & pc_stop #
!Q5 & !Q4 & !Q3 & !Q2 & Q1 & !Q0 & !timer1
& !ILSTATE & Pc_stop # !Q5 & !Q4 & Q3 & !Q2 & !QI & QO & pcO & !pcl & !timer! & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & !QI & QO & !ILSTATE & pc stop # !Q5 & Q4 & !Q3 & !Q2 & QI & QO & nO & nl & !ILSTATE & pc stop # Q5 & Q4 & Q3 & !Q2 & Ql & QO & !Val 8 & !ILSTATE & pc stop # Q5 & Q4 & !Q3 & Q2 & QI & !QO & !ReS & !lLSTATE & pc -stop # Q5 & Q4 & !Q3 & Q2 & QI & !QO & !MLS & !lLSTATE & pc-stop # Q5 & Q4 & !Q3 & Q2 & Ql & !QO & !timerl & !ILSTATE & pc_stop # !Q5 & Q4 & Q3 & !Q2 & Ql & QO & Val_n & !lLSTATE & pc_stop # Q5 & !Q4 & !Q3 & !Q2 & !Ql & !QO & !QLS & !timerl & !lLSTATE& pc stop # Q5 & !Q4 & !Q3 & !Q2 & !Q 1 & !QO & !HLS & !timerl & !lLSTATE & pc-stop # Q5 & !Q4 & !Q3 & !Q2 & !Ql & !QO & !MLS & !timerl & !lLSTATE & pc-='stop # Q5 & !Q4 & !Q3 & !Q2 & !Ql & QO & !ILS & !lLSTATE & pc stop # Q5 & !Q4 & !Q3 & !Q2 & Ql & QO & !timerl & !lLSTATE & pc stop # Q5 & !Q4 & Q3 & !Q2 & !QI & !QO & !ILS & !lLSTATE & pc_stop # Q5 & Q4 & !Q3 & !Q2 & Ql & QO & !Val_n & !lLSTATE & pc_stop # Q5 & Q4 & Q3 & Q2 & Ql & QO & !pcO & !pcl & !lLSTATE & pc_stop; 6-256 Appendix B. 
Cypress PLD ToolKit Source File (continued) QI Q2 Q3 0- 0- 0- < oe> QI & !ILSTATE & pc stop # !Q5 & !Q4 & !Q3 & !Q2 & Ql & QO & !pcO & pcl & !ILSTATE & pc_stop # Q5 & Q4 & Q3 & Q2 & QI & QO & !pcO & !pcl & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & !Ql & QO & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & QI & !QO & !QLS & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & QI & !QO & !timer2 & !ILSTATE & pc stop # !Q5 & Q4 & !Q3 & !Q2 & QI & QO & nO & nl & !ILSTATE & pc-stop # Q5 & !Q4 & !Q3 & !Q2 & !QI & QO &"!HLS & !ILSTATE & pcji"op # Q5 & !Q4 & !Q3 & !Q2 & QI & !QO & !QLS & !ILSTATE & pc stop # Q5 & !Q4 & !Q3 & !Q2 & QI & !QO & !timer2 & !MLS & !ILSTATE & pc stop # Q5 & Q4 & !Q3 & !Q2 & QI & QO & Vatn & !ILSTATE & pc_stop; < oe> Q2 & !ILSTATE & pc_stop # Q5 & Q4 & Q3 & Q2 & QI & QO & !pcO & !pcl & !ILSTATE & pc_stop #!Q5 & Q4 & !Q3 & !Q2 & QI & !QO & lHLS & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & QI & !QO & !MLS & !ILSTATE & pc_stop # Q5 & Q4 & Q3 & !Q2 & QI & QO & !Val_8 & !ILSTATE & pc_stop # !Q5 & Q4 & Q3 & !Q2 & QI & QO & !ILSTATE & pc_stop # Q5 & !Q4 & !Q3 & Q2 & !QI & !QO & !QLS & !ILSTATE & pc stop # Q5 & !Q4 & !Q3 & Q2 & !QI & !QO & !timer2 & !ILSTATE & Pc stop # Q5 & !Q4 & !Q3 & Q2 & QI & !QO & !timerl & !ILSTATE & pc stop # Q5 & !Q4 & Q3 & Q2 & !QI & !QO & !timerl & !ILSTATE & pc-stop # Q5 & Q4 & !Q3 & !Q2 & !Ql & !QO & !HLS & !ILSTATE & pc3top # Q5 & Q4 & !Q3 & Q2 & Ql & QO & !ILSTATE & pc_stop; < oe> Q3 & !ILSTATE & pc stop # Q5 & Q4 & Q3 & Q2 & Ql &-QO & !pcO & !pcl & !ILSTATE & pc stop # !Q5 & !Q4 & Q3 & !Q2 & !QI & !QO & !QLS & !ILSTATE & pc stop # !Q5 & !Q4 & Q3 & !Q2 & !QI & !QO & !HLS & !ILSTATE & pc-stop # !Q5 & !Q4 & Q3 & !Q2 & !Ql & !QO & !noise count & !ILSTATE & pc stop # !Q5 & !Q4 & Q3 & !Q2 & !Ql & QO & !pcO &-pcl & !ILSTATE & pc_stop # !Q5 & !Q4 & Q3 & !Q2 & !Ql & QO & !MLS & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & Ql & QO & !nO & nl & !ILSTATE & pc stop # !Q5 & Q4 & !Q3 & !Q2 & Ql & QO & nO & !nl & !ILSTATE & pc-stop # Q5 & Q4 & Q3 & 
!Q2 & QI & QO & !ILSTATE & pc stop # Q5 & !Q4 & !Q3 & Q2 & !Ql & !QO & !MLS & !ILST-ATE & pc_stop # Q5 & !Q4 & Q3 & !Q2 & !Ql & !QO & !QLS & !ILSTATE & pc_stop # Q5 & !Q4 & Q3 & !Q2 & !Ql & !QO & !HLS & !ILSTATE & pc_stop # Q5 & !Q4 & Q3 & !Q2 & !Ql & !QO & !timer2 & !ILSTATE & pc_stop # Q5 & !Q4 & Q3 & !Q2 & !Ql & !QO & !noise_count & !ILSTATE & pc_stop; 6-257 Appendix B. Cypress PLD ToolKit Source File (continued) Q4 .- < oe> Q4 & m.,sTATE & pc_stop # !Q5 & !Q4 & !Q3 & !Q2 & QI & QO & !timerl & !lLSTATE & pc_stop # Q5 & Q4 & Q3 & Q2 & QI & QO & !pcO & !pcl & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & !QI & !QO &Vat9 & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & QI & !QO & !QLS & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & QI & !QO & !timer2 & !lLSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & QI & !QO & tMLS & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & Q2 & QI & !QO & !ILSTATE & pc stop # !Q5 & Q4 & Q3 & !Q2 & QI & QO & !Val n & !ILSTATE & pc stop # Q5 & !Q4 & !Q3 & Q2 & QI & QO & !HLS & !ILSTATE & pcj'top # Q5 & !Q4 & !Q3 & Q2 & QI & QO & !MLS & !ILSTATE & pc_stop # Q5 & !Q4 & !Q3 & Q2 & QI & QO & !timerl & !ILSTATE & pc_stop # Q5 & Q4 & !Q3 & !Q2 & !QI & !QO & !QLS & !ILSTATE & pc stop # Q5 & Q4 & !Q3 & !Q2 & !QI & !QO & !timer2 & !lLSTATE & pc_stop # Q5 & Q4 & !Q3 & Q2 & !QI & !QO & !timerl & !ILSTATE & pc_stop; Q5 < oe> Q5 & !ILSTATE & pc_stop # !Q5 & !Q4 & !Q3 & !Q2 & !Ql & !QO & !pcO & !pcl & !lLSTATE & pc_stop # !Q5 & !Q4 & !Q3 & !Q2 & !QI & QO & !HLS & !ILSTATE & pc_stop # !Q5 & !Q4 & !Q3 & Q2 & QI & !QO & !ILSTATE & pc_stop # !Q5 & !Q4 & Q3 & !Q2 & !Ql & !QO & !QLS & !ILSTATE & pc_stop # !Q5 & !Q4 & Q3 & !Q2 & !QI & !QO & !HLS & !ILSTATE & pc_stop # !Q5 & !Q4 & Q3 & !Q2 & !QI & !QO & !noise count & !ILSTATE & pc stop # !Q5 & Q4 & !Q3 & !Q2 & !Ql & !QO & !ILSTATE & pc stop # !Q5 & Q4 & IQ3 & IQ2 & Ql & IQO & IQLS & !ILSTATE & pc_stop # !Q5 & Q4 & IQ3 & IQ2 & Ql & !QO & !timer2 & !ILSTATE & pc_stop # !Q5 & Q4 & !Q3 & !Q2 & QI & QO & InO & !nl & !ILSTATE & pc_stop # 
!Q5 & Q4 & !Q3 & !Q2 & Q1 & Q0 & n0 & !n1 & !ILSTATE & pc_stop #
!Q5 & Q4 & !Q3 & Q2 & Q1 & !Q0 & !ILSTATE & pc_stop #
!Q5 & Q4 & Q3 & !Q2 & Q1 & Q0 & !ILSTATE & pc_stop #
Q5 & !Q4 & !Q3 & !Q2 & Q1 & !Q0 & !ILS & !ILSTATE & pc_stop #
Q5 & !Q4 & Q3 & !Q2 & !Q1 & Q0 & !timer1 & !ILSTATE & pc_stop #
Q5 & Q4 & !Q3 & !Q2 & Q1 & !Q0 & !ILSTATE & pc_stop #
Q5 & Q4 & !Q3 & !Q2 & Q1 & Q0 & Val_n & !ILSTATE & pc_stop;

{end of file}

Bus-Oriented Maskable Interrupt Controller

This application note illustrates the design flexibility of Cypress's CY7C331 PLD by describing a single-chip interrupt controller based on the PLD.

Virtually all microprocessor designs require some type of interrupt support. Complex applications can take advantage of a dedicated interrupt controller chip from the microprocessor family. But for simple applications or where special requirements exist, a standard interrupt controller can prove inadequate or represent overkill for the design. In such cases, you generally implement a custom-designed controller using some combination of MSI logic and PLDs.

The single-chip design described here is implemented in two stages: The first stage comprises a simple 4-channel controller, which includes the major functional blocks. In the second stage, another controller is cascaded from the stage-1 design to provide support for up to eight interrupt channels.

The interrupt controller's design features include:

1. Programmable-polarity, level-sensitive inputs
2. Interlocked REQ/ACK handshake
3. Simple MPU bus attachment for read and write
4. Masking of individual channels
5. Prioritized interrupt vector
6. Fully asynchronous operation

CY7C331 Description

The device used to implement the interrupt controller is the CY7C331, an asynchronous PLD packaged in a 28-pin, 300-mil DIP. The device features 12 I/O macrocells and 13 dedicated inputs.
The I/O macrocell has a separate input and output flip-flop, which is highly useful in bus-oriented applications. Each flip-flop has a separate product term for the clock, set, and reset. The output flip-flop's D input incorporates an XOR with the sum-of-products array. This allows you to select polarity or implement a toggle or JK flip-flop. The macrocell flip-flops also offer a unique transparency feature: When the set and reset inputs are both asserted, the flip-flop's Q output follows its D input. Thus, you can use the flip-flop as a clocked register with independent clock, set, and reset inputs or as a combinational path.

Additionally, the CY7C331 includes six shared input multiplexers, which allow you to bury up to six output flip-flops without giving up any pins. Figure 3 shows a block diagram of the CY7C331. A diagram of the device's I/O macrocell appears in Figure 1 of the application note "Using the CY7C331 as a Waveform Generator."

Design Description

The interrupt controller attaches to the MPU data bus and is controlled by the system processor through read and write ports on the data bus. The read port provides interrupt status and a prioritized vector for the processor, and the write port allows the processor to selectively mask individual interrupt channels. The controller provides a separate interrupt request line to the processor to signal a pending interrupt. Figure 1 shows the bit assignments for the read and write ports. In Figure 2, you can see the interrupt controller's major functional blocks.

[Figure 1. Data Bus Bit Assignments (mask word, written: one mask bit per channel, 0 = enabled; interrupt vector, read: a status bit, 0 = no interrupt, followed by the priority vector)]

[Figure 2. Interrupt Controller Block Diagram]

Four-Channel Interrupt Controller

The interrupt controller's operation is quite simple.
On reset, all interrupt channels are masked off, and no interrupts are permitted. The processor then loads the mask register with the desired interrupt-channel mask bits cleared. If the channel is not masked when an interrupt request occurs, the request is prioritized, and the controller asserts the Interrupt Request (IRQ) to the processor. The processor responds to the IRQ by reading the interrupt vector port. When the interrupt controller detects the read, the controller latches the current interrupt priority and places the priority vector on the data bus. Latching the current priority while the vector is being read prevents the vector from being altered during the read cycle. In addition, the controller decodes the vector and asserts the corresponding channel's acknowledge line. The acknowledge remains asserted until detected by the interrupting element, which responds by deasserting its interrupt request. This interlocking handshake ensures that a pending interrupt is not lost or responded to more than once. The controller also uses the acknowledge internally to disable the interrupt request into the priority encoder; this is done in the time between the interrupt acknowledge and the interrupt request being deasserted. A simple example of the timing sequence for a single interrupting channel appears in Figure 4. Figure 5 defines the pin assignments for the first-stage interrupt controller.

[Figure 3. CY7C331 Block Diagram]

[Figure 4. Timing Sequence for Single Interrupt Channel (waveforms for the interrupt request, IRQ, CS, WE, data bus, ACK, RST, and MASK)]

[Figure 5. Interrupt Controller Pin Assignments (pinout showing CS, WE, RST, REQ3-REQ0, IRQ, DTB3-DTB0, and ACK3-ACK0)]

Data Bus Interface

The data bus interface requires bidirectional operation. When CS and WE are asserted Low, the controller writes data into the mask register. When CS is asserted Low and WE remains High, the controller holds the current priority vector and interrupt status and places them on the data bus. The CY7C331's I/O macrocell is readily adapted to this requirement, as illustrated in Figure 6.

The interrupt status generation requires a different implementation. If any interrupt requests are pending when the controller detects a read cycle (CS Low, WE High), the interrupt status bit must be asserted High. Moreover, new interrupt requests are held off until the end of the read cycle. This requires a clocked implementation of the interrupt status bit on the data bus, as shown in Figure 7.

[Figure 6. Mask/Priority Vector Function]

[Figure 7. Interrupt Status Generation]

Acknowledge Generation

Acknowledge generation requires that the controller decode the priority vector placed on the data bus and assert the corresponding acknowledge line until the interrupt request line is deasserted. The controller must handle the timing carefully for correct operation. Specifically, because a valid priority vector is not available until after CS is asserted Low, the controller cannot decode the correct channel until the priority vector register has settled. Thus, a delay is required before the controller can generate an acknowledge.

The controller can generate a delay by taking advantage of the following sequence: If there is a pending interrupt request, the interrupt status bit is always asserted one propagation delay after CS is asserted on a read cycle. The interrupt status signal is then passed through an internal strobe stage, which causes an additional propagation delay. The internal strobe then initiates the acknowledge-generation sequence. The delayed strobe assures that the priority vector value has settled and the setup requirements for decoding have been met.

An SR flip-flop implements the acknowledge-generation function for each channel. The flip-flop is set when a read cycle occurs, the priority vector corresponds to the channel, and the delayed internal strobe occurs. The flip-flop is reset when the interrupt request for the channel is deasserted. A logic diagram for the internal strobe generation and a single acknowledge-generation block appears in Figure 8. The timing diagram in Figure 9 illustrates a typical operation.

[Figure 8. Internal Strobe/Acknowledge Generation]

Logic Equations

The Cypress PLD ToolKit assembles the Boolean equations for the interrupt controller (Appendix A). The equations are heavily commented for clarity. Because the PLD ToolKit does not currently support "DeMorganization," and because the CY7C331 contains inverting output buffers, the Boolean equations for output flip-flops are written for negative logic (i.e., solving for zero). In addition, the inversion requires swapping of the SET and RESET functions on the output flip-flops. Thus, the logical Boolean equation required to set the flip-flop must be implemented on the flip-flop's reset input. Similarly, the equation required to reset the flip-flop must be implemented on the flip-flop's set input.

Adding Cascade Capability

You can readily extend the interrupt controller design to accommodate four additional channels by incorporating a cascade mechanism. You can then attach a second interrupt controller to the first (Figure 10). The additional channels require an extension to the formats of the mask register and the interrupt vector (Figure 11).

[Figure 10. Cascading Interrupt Controllers (upper controller serving REQ(4..7), lower controller serving REQ(0..3) and driving IRQ)]
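The single-controller behavior described in the preceding sections (masking, prioritizing, and the interlocked REQ/ACK handshake) can be summarized in a small behavioral model. The Python sketch below is illustrative only and makes several assumptions that are not taken from the PLD source: the class and method names are hypothetical, channel 3 is treated as the highest priority, and all propagation delays and the internal strobe are ignored.

```python
from typing import List, Optional


class InterruptController:
    """Behavioral sketch of a 4-channel maskable interrupt controller."""

    def __init__(self, channels: int = 4) -> None:
        self.channels = channels
        self.mask = [True] * channels    # reset: every channel masked (1 = masked)
        self.ack = [False] * channels    # acknowledge lines, all deasserted

    def write_mask(self, bits: List[bool]) -> None:
        """Processor write cycle: load the mask register (False = enabled)."""
        self.mask = list(bits)

    def pending(self, req: List[bool]) -> List[bool]:
        """A request counts only if it is unmasked and not already acknowledged."""
        return [r and not a and not m
                for r, a, m in zip(req, self.ack, self.mask)]

    def irq(self, req: List[bool]) -> bool:
        """IRQ to the processor: asserted while any request is pending."""
        return any(self.pending(req))

    def read_vector(self, req: List[bool]) -> Optional[int]:
        """Processor read cycle: return the winning channel and assert its ACK."""
        active = self.pending(req)
        for channel in reversed(range(self.channels)):   # highest channel first
            if active[channel]:
                self.ack[channel] = True
                return channel
        return None                                      # status bit = no interrupt

    def release(self, req: List[bool]) -> None:
        """Drop each ACK once its request line has been deasserted."""
        self.ack = [a and r for a, r in zip(self.ack, req)]


if __name__ == "__main__":
    ic = InterruptController()
    requests = [True, False, True, False]        # channels 0 and 2 requesting
    ic.write_mask([False, False, False, False])  # enable all channels
    print(ic.irq(requests), ic.read_vector(requests), ic.ack)
```

Because an asserted acknowledge removes its channel from the pending set, a second read in this model returns the next-highest pending channel rather than the same one, which mirrors the way the handshake prevents an interrupt from being serviced twice.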
The lower interrupt controller supports the lower-priority interrupt channels, generates the IRQ to the processor, and places the interrupt status and priority vector on the data bus during a read cycle. The upper interrupt controller supports the higher-priority channels and passes its current status and priority vector down to the lower interrupt controller. The interrupt status line is asserted High when the upper interrupt controller has a non-masked interrupt request pending. To permit the host processor to write into the upper interrupt controller's mask register, the controller monitors the data bus's upper four bits. Because the upper interrupt controller passes its priority vector directly to the lower interrupt controller, however, the upper interrupt controller does not need to output any data on the bus during a read cycle.

In operation, the lower interrupt controller must monitor the status interrupt line from the upper controller. The lower controller incorporates the interrupt into the IRQ to the host processor and into the interrupt vector placed on the data bus during a read cycle. Modifying the interrupt vector is straightforward. Because the upper interrupt channels have higher priority, when the interrupt status from the upper controller is asserted, the interrupt vector's lower two bits are the two vector bits from the upper controller. When the status is not asserted, the interrupt vector's lower two bits are the lower-priority interrupt vector encoded from the lower interrupt controller. The interrupt vector's third bit is simply the state of the interrupt status signal from the upper controller. The modified interrupt controller equations for the lower element appear in Appendix B, and the upper element equations in Appendix C.

[Figure 9. Timing Diagram (waveforms for the request, CS, status, vector MSB and LSB, internal strobe, and acknowledge signals)]

[Figure 11. Extended Interrupt Vector (mask word, written: bits 7-0, 0 = enabled, 1 = masked; interrupt vector, read: status bit, 0 = no interrupts, 1 = vector is valid, followed by vector bits V2-V0)]

Summary

The interrupt controller described in the application note can serve as the basis for flexible low-to-moderate-complexity interrupt controllers. You can extend the design as required for different request polarity levels, edge-sensitive inputs, or additional channels. Simulations of the interrupt controller show that the design works as expected. You can obtain the PLD source files for the design from your local Cypress sales office.

Appendix A. PLD ToolKit Source Code Stand Alone Interrupt Controller

{Stand Alone Interrupt Controller}

CY7C331; {declare device type}

CONFIGURE;

CS(node = 4),                {pin 4, chip select}
WE(node = 5),                {pin 5, write enable}
RST(node = 6),               {pin 6, reset}
REQ3(node = 9),              {pin 9, interrupt request channel 3}
REQ2(node = 10),             {pin 10, interrupt request channel 2}
REQ1(node = 11),             {pin 11, interrupt request channel 1}
REQ0(node = 12),             {pin 12, interrupt request channel 0}

!IRQ(node = 27),             {pin 27, interrupt to processor}
ISTAT(node = 28),            {pin 28, data bus 3 - interrupt status}
PVEC2(node = 26),            {pin 26, data bus 2 - priority vector bit 2}
PVEC1(node = 24),            {pin 24, data bus 1 - priority vector bit 1}
PVEC0(node = 20),            {pin 20, data bus 0 - priority vector bit 0}
ACK3(node = 18),             {pin 18, acknowledge channel 3}
ACK2(node = 17),             {pin 17, acknowledge channel 2}
ACK1(node = 16),             {pin 16, acknowledge channel 1}
ACK0(node = 15),             {pin 15, acknowledge channel 0}
MSK3(node = 34,SRC = 28),    {shared input mux for pin 28}
MSK2(node = 33,SRC = 26),    {shared input mux for pin 26}
MSK1(node = 32,SRC = 24),
{shared input mux for pin 24} MSKO(node = 31,SRC = 20), {shared input mux for pin 20} ISTB(node = 25), {pin 25, internal strobe} EQUATIONS; IRQ = < oe> < set_out> {make FF transparent} < clr_out> {make FF transparent} < xsum> {force invert} < sum> REQ3 & IACK3 & IMSK3 # REQ2 & IACK2 & IMSK2 # REQ1 & lACK 1 & IMSKl # REQO & IACKO & IMSKO; !ISTAT = < oe> ICS & WE < xsum> {force invert} < set_out> CS & ISTAT {FF output is reset} < ck out> ICS & WE < seCin> IRST {interrupt is masked on reset} < ck in> lWE & ICS < sum> REQ3 & IACK3 & IMSK3 # REQ2 & IACK2 & IMSK2 # REQ1 & !ACK1 & !MSK1 # REQO & !ACKO & !MSKO; IPVEC2 = < oe> ICS & WE < set out> {always zero} < set-in> !RST {interrupt is masked on reset} < ck,=-in> lWE & !CS; . 6-264 ~= Bus-Oriented Maskable Interrupt Controller ~ .~~OR~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Appendix A. PLD ToolKit Source Code Stand Alone Interrupt Controller (continued) IPVECl = < oe> ICS & WE < xsum> {force invert} < ek out> ICS & WE < suiID. IACK3 & REQ3 & IMSK3 # IACK2 & REQ2 & IMSK2 < set in> IRST {interrupt is masked on reset} < ek]n> lWE & ICS; IPVECO = < oe> ICS & WE < xsum> {force invert} < ek out> ICS & WE < sum> !ACK3 & REQ3 & !MSK3 # !ACKl & REQ 1 & IMSKl & MSK2 # IMSKl & IACKl & REQ 1 & IREQ2 < set_in> !RST {interrupt is masked on reset} < ek_in> lWE & ICS; IACK3 = < oe> < elr_out> !CS & WE & PVECl & PVECO & ISTB & IACK3 {FF output is set} < set_out> CS & ACK3 & !REQ3; {FF output is reset} IACK2 = < oe> < elr_out> !CS & WE & PVECl & IPVECO & ISTB & IACK2 {FF output is set} < set_out> CS & ACK2 & IREQ2; {FF output is reset} IACKl = < oe> < elr_out> ICS & WE & IPVECl & PVECO & ISTB & IACKl {FF output is set} < set_out> CS & ACKl & IREQl; {FF output is reset} IACKO= < oe> < elr_out> ICS & WE & IPVECl & IPVECO & ISTB & IACKO {FF output is set} < set_out> CS & ACKO & IREQO; {FF output is reset} !ISTB = < oe> < elr_out> 1ST A T & !ISTB {FF output is set} < set_out> CS & ISTB; {FF output is reset} 6-265 1i1:CYPRISS Bus-Oriented Maskable 
Appendix B. PLD ToolKit Source Code Cascadable Interrupt Controller-Lower Element

{Cascaded Interrupt Controller - Lower Element}
CY7C331;                        {declare device type}

CONFIGURE;
USTAT(node = 1),                {pin 1, upper element interrupt status}
RVEC1(node = 2),                {pin 2, ripple vector bit 1 from upper element}
RVEC0(node = 3),                {pin 3, ripple vector bit 0 from upper element}
CS(node = 4),                   {pin 4, chip select}
WE(node = 5),                   {pin 5, write enable}
RST(node = 6),                  {pin 6, reset}
REQ3(node = 9),                 {pin 9, interrupt request channel 3}
REQ2(node = 10),                {pin 10, interrupt request channel 2}
REQ1(node = 11),                {pin 11, interrupt request channel 1}
REQ0(node = 12),                {pin 12, interrupt request channel 0}
!IRQ(node = 27),                {pin 27, interrupt to processor}
ISTAT(node = 28),               {pin 28, data bus 3 - interrupt status}
PVEC2(node = 26),               {pin 26, data bus 2 - priority vector bit 2}
PVEC1(node = 24),               {pin 24, data bus 1 - priority vector bit 1}
PVEC0(node = 20),               {pin 20, data bus 0 - priority vector bit 0}
ACK3(node = 18),                {pin 18, acknowledge channel 3}
ACK2(node = 17),                {pin 17, acknowledge channel 2}
ACK1(node = 16),                {pin 16, acknowledge channel 1}
ACK0(node = 15),                {pin 15, acknowledge channel 0}
MSK3(node = 34,SRC = 28),       {shared input mux for pin 28}
MSK2(node = 33,SRC = 26),       {shared input mux for pin 26}
MSK1(node = 32,SRC = 24),       {shared input mux for pin 24}
MSK0(node = 31,SRC = 20),       {shared input mux for pin 20}
ISTB(node = 25),                {pin 25, internal strobe}

EQUATIONS;

IRQ =
    <oe>
    <set_out>                   {make FF transparent}
    <clr_out>                   {make FF transparent}
    <xsum>                      {force invert}
    <sum>      REQ3 & !ACK3 & !MSK3
             # REQ2 & !ACK2 & !MSK2
             # REQ1 & !ACK1 & !MSK1
             # REQ0 & !ACK0 & !MSK0
             # USTAT;

!ISTAT =
    <oe>       !CS & WE
    <xsum>                      {force invert}
    <set_out>  CS & ISTAT       {FF output is reset}
    <ck_out>   !CS & WE
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS
    <sum>      REQ3 & !ACK3 & !MSK3
             # REQ2 & !ACK2 & !MSK2
             # REQ1 & !ACK1 & !MSK1
             # REQ0 & !ACK0 & !MSK0
             # USTAT;
!PVEC2 =
    <oe>       !CS & WE
    <xsum>                      {force invert}
    <ck_out>   !CS & WE
    <sum>      USTAT
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS;

!PVEC1 =
    <oe>       !CS & WE
    <xsum>                      {force invert}
    <ck_out>   !CS & WE
    <sum>      !ACK3 & REQ3 & !MSK3 & !USTAT
             # !ACK2 & REQ2 & !MSK2 & !USTAT
             # RVEC1 & USTAT
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS;

!PVEC0 =
    <oe>       !CS & WE
    <xsum>                      {force invert}
    <ck_out>   !CS & WE
    <sum>      !ACK3 & REQ3 & !MSK3 & !USTAT
             # !ACK1 & REQ1 & !MSK1 & MSK2 & !USTAT
             # !MSK1 & !ACK1 & REQ1 & !REQ2 & !USTAT
             # RVEC0 & USTAT
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS;

!ACK3 =
    <oe>
    <clr_out>  !CS & WE & !PVEC2 & PVEC1 & PVEC0 & ISTB & !ACK3     {FF output is set}
    <set_out>  CS & ACK3 & !REQ3;                                   {FF output is reset}

!ACK2 =
    <oe>
    <clr_out>  !CS & WE & !PVEC2 & PVEC1 & !PVEC0 & ISTB & !ACK2    {FF output is set}
    <set_out>  CS & ACK2 & !REQ2;                                   {FF output is reset}

!ACK1 =
    <oe>
    <clr_out>  !CS & WE & !PVEC2 & !PVEC1 & PVEC0 & ISTB & !ACK1    {FF output is set}
    <set_out>  CS & ACK1 & !REQ1;                                   {FF output is reset}

!ACK0 =
    <oe>
    <clr_out>  !CS & WE & !PVEC2 & !PVEC1 & !PVEC0 & ISTB & !ACK0   {FF output is set}
    <set_out>  CS & ACK0 & !REQ0;                                   {FF output is reset}

!ISTB =
    <oe>
    <clr_out>  ISTAT & !ISTB    {FF output is set}
    <set_out>  CS & ISTB;       {FF output is reset}

Appendix C. PLD ToolKit Source Code Cascadable Interrupt Controller-Upper Element

{Cascaded Interrupt Controller - Upper Element}
CY7C331;                        {declare device type}

CONFIGURE;
CS(node = 4),                   {pin 4, chip select}
WE(node = 5),                   {pin 5, write enable}
RST(node = 6),                  {pin 6, reset}
REQ3(node = 9),                 {pin 9, interrupt request channel 3}
REQ2(node = 10),                {pin 10, interrupt request channel 2}
REQ1(node = 11),                {pin 11, interrupt request channel 1}
REQ0(node = 12),                {pin 12, interrupt request channel 0}
PVEC3(node = 28),               {pin 28, data bus 3 - always zero}
PVEC2(node = 26),               {pin 26, data bus 2 - always zero}
PVEC1(node = 24),               {pin 24, data bus 1 - always zero}
PVEC0(node = 20),               {pin 20, data bus 0 - always zero}
ACK3(node = 25),                {pin 25, acknowledge channel 3}
ACK2(node = 23),                {pin 23, acknowledge channel 2}
ACK1(node = 19),                {pin 19, acknowledge channel 1}
ACK0(node = 17),                {pin 17, acknowledge channel 0}
MSK3(node = 34,SRC = 28),       {shared input mux for pin 28}
MSK2(node = 33,SRC = 26),       {shared input mux for pin 26}
MSK1(node = 32,SRC = 24),       {shared input mux for pin 24}
MSK0(node = 31,SRC = 20),       {shared input mux for pin 20}
ISTB(node = 27),                {pin 27, internal strobe}
USTAT(node = 18),               {pin 18, interrupt status output}
ISENSE(node = 30,SRC = 18),     {shared input mux for pin 18}
                                {internal interrupt sense to generate input for ISTB}
RVEC1(node = 16),               {pin 16, ripple vector bit 1 output}
RVEC0(node = 15),               {pin 15, ripple vector bit 0 output}

EQUATIONS;

!PVEC3 =
    <set_out>                   {always zero}
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS;

!PVEC2 =
    <set_out>                   {always zero}
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS;

!PVEC1 =
    <xsum>                      {force invert}
    <ck_out>   !CS & WE
    <sum>      !ACK3 & REQ3 & !MSK3
             # !ACK2 & REQ2 & !MSK2
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS;

!PVEC0 =
    <xsum>                      {force invert}
    <ck_out>   !CS & WE
    <sum>      !ACK3 & REQ3 & !MSK3
             # !ACK1 & REQ1 & !MSK1 & MSK2
             # !MSK1 & !ACK1 & REQ1 & !REQ2
    <set_in>   !RST             {interrupt is masked on reset}
    <ck_in>    !WE & !CS;

!ACK3 =
    <oe>
    <clr_out>  !CS & WE & PVEC1 & PVEC0 & ISTB & !ACK3      {FF output is set}
    <set_out>  CS & ACK3 & !REQ3;                           {FF output is reset}

!ACK2 =
    <oe>
    <clr_out>  !CS & WE & PVEC1 & !PVEC0 & ISTB & !ACK2     {FF output is set}
    <set_out>  CS & ACK2 & !REQ2;                           {FF output is reset}

!ACK1 =
    <oe>
    <clr_out>  !CS & WE & !PVEC1 & PVEC0 & ISTB & !ACK1     {FF output is set}
    <set_out>  CS & ACK1 & !REQ1;                           {FF output is reset}

!ACK0 =
    <oe>
    <clr_out>  !CS & WE & !PVEC1 & !PVEC0 & ISTB & !ACK0    {FF output is set}
    <set_out>  CS & ACK0 & !REQ0;                           {FF output is reset}

!USTAT =
    <oe>
    <xsum>                      {force invert}
    <set_out>                   {make FF transparent}
    <clr_out>                   {make FF transparent}
    <sum>      REQ3 & !ACK3 & !MSK3
             # REQ2 & !ACK2 & !MSK2
             # REQ1 & !ACK1 & !MSK1
             # REQ0 & !ACK0 & !MSK0
    <ck_in>    !CS & WE
    <clr_in>   CS & ISENSE;

!RVEC1 =
    <oe>
    <xsum>                      {force invert}
    <set_out>                   {make FF transparent}
    <clr_out>                   {make FF transparent}
    <sum>      !ACK3 & REQ3 & !MSK3
             # !ACK2 & REQ2 & !MSK2;

!RVEC0 =
    <xsum>                      {force invert}
    <set_out>                   {make FF transparent}
    <clr_out>                   {make FF transparent}
    <ck_out>   !CS & WE
    <sum>      !ACK3 & REQ3 & !MSK3
             # !ACK1 & REQ1 & !MSK1 & MSK2
             # !ACK1 & REQ1 & !MSK1 & !REQ2;

!ISTB =
    <oe>
    <clr_out>  ISENSE & !ISTB   {FF output is set}
    <set_out>  CS & ISTB;       {FF output is reset}

Using the CY7C330 as a Multi-channel Mbus Arbiter

This application note discusses the use of the CY7C330 as a bus arbiter for an Mbus system based on the Cypress SPARC CY7C600 RISC processor. The CY7C330 is a high-speed synchronous erasable programmable logic device (EPLD) optimized for finite state machine (FSM) applications.
The Cypress SPARC system uses a CY7C601 RISC processor, a CY7C602 floating point unit (FPU), four CY7C604 cache controller and memory management units (CMU), and eight CY7C157 16K x 16 cache RAMs for a 256-Kbyte cache. The arbiter uses a combination of techniques to resolve Mbus access contention for a system with four CMU bus masters. Figure 1 shows a block diagram of the Mbus system.

Figure 1. Mbus System Block Diagram

CY7C330 Brief Description

The CY7C330 is a 66-MHz, high-performance PLD with 11 input latches, 17,000 programmable bits, four buried state registers, and 12 user-configurable output macrocells. It is manufactured in a 0.8-micron, double-metal CMOS technology and is UV erasable. The CY7C330 comes in 28-pin, 300-mil dual in-line and LCC/PLCC packages. You can partition it into multiple functional blocks, as shown in this application. (See Figure 1 in "Understanding the CY7C330 Synchronous EPLD" for a block diagram of the CY7C330.)

Mbus Description

The Mbus is a system bus defined as a SPARC standard main memory interface for the Cypress CY7C604 SPARC cache/memory management unit. The M in Mbus stands for module and emphasizes the multi-processor module support that SPARC offers. The Mbus is a high-speed, synchronous, 64-bit, multiplexed address/data bus that operates at the CY7C601's clock rate. Mbus accesses are initiated by a master and responded to by a slave. Generally, a bus transaction takes place between a master and main memory, but in the case of direct data intervention, transactions can occur between masters.

The handshake between the CY7C604 CMU and the arbiter uses a request line (MRQ0-3) and a grant line (MGT0-3) for each master. A busy line (MBB) is common to all masters and indicates that the bus is in use. Figure 2 shows the multiple Mbus request sequence. By design, bus mastership and resolution of multiple requests are performed outside the realm of Mbus and SPARC.
This allows you to implement the arbitration scheme that best fits your system requirements. The application example presented here describes only one such implementation.

Mbus transfers are synchronous with respect to the system clock. The data transactions across the bus consist of a single-clock-period address phase and a multiple-clock-period data phase. The bus transfers data in word (64-bit), multi-word burst, or atomic load-store formats. All signals are valid and sampled on the system clock's rising edge. The address phase is validated by the memory address strobe (/MAS) signal, which denotes the start of the actual data transfer. Three status lines indicate the bus state and convey the current bus operation as well as error status. Figure 3 shows Mbus data transfer waveforms.

Timing Considerations

To meet the Mbus timing specifications, the arbiter must be able to accept a request, resolve any access contention, and grant bus rights to a master, all in a single Mbus clock cycle. In this application, a 66-MHz CY7C330 implements the arbiter, whose input registers run at the same 33-MHz clock rate as the CY7C601 and CY7C604s. This speed allows the arbiter inputs to meet the Mbus masters' timing requirements. The output registers (including the state machine) are clocked at twice the rate of the bus masters (66 MHz), so two output clock periods of about 15 ns fit in each 30-ns Mbus cycle. This enables the arbiter to sample requests with the input latches on one Mbus clock cycle's rising edge, transfer from one state to another, and grant access before the Mbus clock's next rising edge. Figure 4 illustrates the timing relationship between Master 0 (CY7C604 at 33 MHz) and the 66-MHz CY7C330 arbiter.

Arbitration Scheme

You can employ several resolution techniques for the arbitration function.
Fixed priority, rotating priority, least recently used (LRU), and random priority have all proven workable, although each has its own drawbacks. A fixed priority, for instance, favors one requester over the others. Rotating priority provides a simple but not always fair approach to arbitration. An LRU arbitration scheme represents the fairest form of contention resolution but requires a highly complex implementation. The random technique does not allow predictable arbitration results and could result in performance problems.

A combination of methods minimizes the associated problems. The circuit presented here, for example, employs both a random and a fixed priority scheme. The random scheme uses a 2-bit counter that increments every clock cycle and varies the priority accordingly. You can also set the priority function so that the processor specifies which master has the highest priority; the processor does this by loading a value into the CY7C330 via a store instruction. To support the processor in this function, the interface to the processor must provide a latched and decoded chip select, along with a latched write enable connected directly to the arbiter. The priority function can be of value if the preset highest-priority Mbus master is fetching a program's critical data from main memory. The remaining channels follow a preset priority defined in Table 1. The Random Priority Counter employs the same priority scheme used for preset priority and operates only when the latched priority is disabled by the priority selection block via the EN signal.

Figure 3. Mbus Data Transfer Waveforms

Design Partitioning

The arbiter design is partitioned into four functional blocks that are designed separately (Figure 5).
The first block is the priority latch, a synchronous register that uses the decoded and latched chip select (/CS) and write enable (/WE) signals from the CY7C601 to generate an enable signal. The priority latch accepts three data lines from the processor bus (one for the priority enable and two for the high-priority bus master's value). The latch loads the values into dedicated registers.

The random counter, a minor portion of the design, is a free-running counter that supplies a 2-bit binary value to the priority-select block. The count changes every output clock (CLK1) cycle and provides a "seed" for the random priority function.

The priority-select block chooses between the priority latch outputs (LP0 - 1) and the random counter value (CT0 - 1), using the EN signal as the selection criterion. The two outputs (PRI0 - 1) feed the handshake state machine and arbitrate between bus masters when more than one request occurs simultaneously.

The handshake state machine monitors the request (MRQ0 - 3) and busy (MBB) inputs and generates the grant (MGT0 - 3) signals that give an Mbus master ownership of the bus.

Figure 2. Mbus Multiple Request Sequence

Figure 4. CY7C604 & CY7C330 Timing for Master 0

Figure 5. Arbiter Block Diagram

Priority Latch, Select and Random Counter

As described previously, the priority latch is a synchronous register loaded by the processor. When the active-Low write enable (/WE) and chip select (/CS) signals are both Low, the latch loads three data bits from the bus into the three macrocells dedicated to the priority latch. When either /WE or /CS is inactive (High), each register's output value is reloaded every clock cycle, thus retaining the proper value. The equations for the priority latch are:

EN  = /CS*/WE*D2 + EN*WE  + EN*CS;
LP1 = /CS*/WE*D1 + LP1*WE + LP1*CS;
LP0 = /CS*/WE*D0 + LP0*WE + LP0*CS;

The random counter is simply a 2-bit counter that changes state every output clock (CLK1) transition. The counter clears when /RESET is Low and counts in a 0-1-2-3 sequence. The equations for the random counter are:

CT1 = CT1*/CT0 + /CT1*CT0;
CT0 = /CT0;

The priority selection block selects between the priority latch and the random counter. This block is a registered multiplexer that loads its register outputs with the priority latch value if EN = 1, or the counter's current state if EN = 0. The outputs are updated every clock and fed to the handshake state machine.

Table 1. Mbus Channel Priorities

Latched    Priority
Value      First      2nd        3rd        Lowest
11         master3    master2    master1    master0
10         master2    master1    master0    master3
01         master1    master0    master3    master2
00         master0    master3    master2    master1

Handshake State Machine

The handshake state machine controls Mbus handshake and arbitration. The machine cycles through 13 discrete states in performing its function. On power-up or reset, the state machine enters the idle state, waiting for a bus request. Upon receiving a request (/MRQ0, for instance), the machine enters a wait mode (state GT0_0). In wait mode, the arbiter looks for busy (/MBB) to go inactive, while driving the /MGT0 output active. When /MBB goes inactive, the machine goes to state GT0_1 and holds /MGT0 active, while waiting for the granted master to assert /MBB. When /MBB is detected, the machine goes to state GT0_WAIT and looks for another request. The MGT0 grant line is held active during and after the sequence, allowing the master to maintain bus ownership until another master requests ownership. Figure 6 shows the bus master 0 state diagram and the request/grant handshake. The operation is identical for each of the four bus masters.

Figure 6. Bus Master 0 State Diagram

The equations for the handshake state machine can be produced from a state transition table that also includes the arbiter's priority encoding. The table can be reduced to a manageable number of minterms using a public-domain optimizer called McBOOLE (see the Reference). Appendix A shows the state transition table. The sum-of-products format equations are then merged into the Cypress PLD ToolKit design file with the priority-latch, random-counter, and priority-selection equations. The PLD ToolKit design file appears in Appendix B.

Design Verification

The CY7C330 four-channel Mbus arbiter design was entered and verified using the PLD ToolKit. Design verification was performed using the PLD ToolKit's interactive simulator. A mouse was used with pop-down menus to create the circuit stimuli by drawing the waveform on the graphics screen for each CY7C330 node or pin. The SIMULATE command was then selected, and the response waveforms were visually inspected, giving a high degree of confidence in the design's function before programming a part.
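
As a software cross-check on the behavior described above, the priority selection of Table 1 and the per-master handshake can be sketched in Python. This is a minimal behavioral model under stated assumptions, not the PLD implementation: the names `priority_order`, `select_priority`, `pick_master`, and `Arbiter` are hypothetical, bus signals are modeled as booleans and sets rather than active-Low pins, and one call to `step` stands for one state-machine clock.

```python
def priority_order(pri):
    """Table 1: master `pri` is first, then pri-1, pri-2, pri-3 (modulo 4)."""
    return [(pri - k) % 4 for k in range(4)]


def select_priority(en, latched, counter):
    """Priority-select mux: latched value when EN = 1, else the random counter."""
    return latched if en else counter


def pick_master(requests, pri):
    """Return the highest-priority master in the set `requests`, or None."""
    for master in priority_order(pri):
        if master in requests:
            return master
    return None


class Arbiter:
    """Handshake phases for the granted master x:
    IDLE -> GT_0 (GTx_0) -> GT_1 (GTx_1) -> WAIT (GTx_WAIT)."""

    def __init__(self):
        self.phase = "IDLE"
        self.owner = None  # master whose grant line is currently active

    def step(self, requests, bus_busy, pri):
        """Advance one clock; return the master holding the grant (or None)."""
        if self.phase in ("IDLE", "WAIT"):
            # A new request is accepted only from the idle or wait states.
            winner = pick_master(requests, pri)
            if winner is not None:
                self.owner, self.phase = winner, "GT_0"
        elif self.phase == "GT_0":
            # Grant is driven; wait for the bus to become free.
            if not bus_busy:
                self.phase = "GT_1"
        elif self.phase == "GT_1":
            # Wait for the granted master to assert busy.
            if bus_busy:
                self.phase = "WAIT"
        return self.owner
```

In this sketch the grant stays with `owner` through the wait phase, which mirrors the text's statement that the grant line is held until another master requests ownership.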
Reference

"McBOOLE: A New Procedure For Exact Logic Minimization," M.R. Dagenais, V.K. Agarwal, N.C. Rumin, IEEE Transactions on CAD of Circuits and Systems, vol. CAD-5, no. 1, January 1986, p. 229.

Appendix A. Mbus Handshake/Arbiter State Transition Table

/* STATE TABLE FOR MBUS ARBITER HANDSHAKE STATE MACHINE
-names: MBB,MRQ3,MRQ2,MRQ1,MRQ0,PRI1,PRI0,ST3,ST2,ST1,ST0,MGT3,MGT2,MGT1,MGT0;  input
        ST3,ST2,ST1,ST0,MGT3,MGT2,MGT1,MGT0;                                      output */

/* PRESENT STATE      NEXT STATE
   (INPUTS)           (OUTPUTS)
    MMMMPP    MMMM        MMMM
   MRRRRRRSSSSGGGG    SSSSGGGG
   BQQQQIITTTTTTTT    TTTTTTTT
   B32101032103210    32103210 */

   X1111XX00000000    00000000    /* WAIT FOR MRQx */
   X1110XX00X0XXXX    01000001    /* GOTO GT0 */
   X1101XX00X0XXXX    01000010    /* GOTO GT1 */
   X1011XX00X0XXXX    10000100    /* GOTO GT2 */
   X0111XX00X0XXXX    10001000    /* GOTO GT3 */
   X00000000X0XXXX    01000001    /* GOTO GT0 */
   X00010000X0XXXX    10001000    /* GOTO GT3 */
   X00100000X0XXXX    01000001    /* GOTO GT0 */
   X00110000X0XXXX    10001000    /* GOTO GT3 */
   X01000000X0XXXX    01000001    /* GOTO GT0 */
   X01010000X0XXXX    10001000    /* GOTO GT3 */
   X01100000X0XXXX    01000001    /* GOTO GT0 */
   X10000000X0XXXX    01000001    /* GOTO GT0 */
   X10010000X0XXXX    10000100    /* GOTO GT2 */
   X10100000X0XXXX    01000001    /* GOTO GT0 */
   X11000000X0XXXX    01000001    /* GOTO GT0 */
   X00000100X0XXXX    01000010    /* GOTO GT1 */
   X00010100X0XXXX    01000010    /* GOTO GT1 */
   X00100100X0XXXX    01000001    /* GOTO GT0 */
   X00110100X0XXXX    10001000    /* GOTO GT3 */
   X01000100X0XXXX    01000010    /* GOTO GT1 */
   X01010100X0XXXX    01000010    /* GOTO GT1 */
   X01100100X0XXXX    01000001    /* GOTO GT0 */
   X10000100X0XXXX    01000010    /* GOTO GT1 */
   X10010100X0XXXX    01000010    /* GOTO GT1 */
   X10100100X0XXXX    01000001    /* GOTO GT0 */
   X11000100X0XXXX    01000010    /* GOTO GT1 */
   X00001000X0XXXX    10000100    /* GOTO GT2 */
   X00011000X0XXXX    10000100    /* GOTO GT2 */
   X00101000X0XXXX    10000100    /* GOTO GT2 */
   X00111000X0XXXX    10000100    /* GOTO GT2 */
   X01001000X0XXXX    01000010    /* GOTO GT1 */
   X01011000X0XXXX    01000010    /* GOTO GT1 */
   X01101000X0XXXX    01000001    /* GOTO GT0 */
   X10001000X0XXXX    10000100    /* GOTO GT2 */
   X10011000X0XXXX    10000100    /* GOTO GT2 */
   X10101000X0XXXX    10000100    /* GOTO GT2 */
   X11001000X0XXXX    01000010    /* GOTO GT1 */
   X00001100X0XXXX    10001000    /* GOTO GT3 */
   X00011100X0XXXX    10001000    /* GOTO GT3 */
   X00101100X0XXXX    10001000    /* GOTO GT3 */
   X00111100X0XXXX    10001000    /* GOTO GT3 */
   X01001100X0XXXX    10001000    /* GOTO GT3 */
   X01011100X0XXXX    10001000    /* GOTO GT3 */
   X01101100X0XXXX    10001000    /* GOTO GT3 */
   X10001100X0XXXX    10000100    /* GOTO GT2 */
   X10011100X0XXXX    10000100    /* GOTO GT2 */
   X10101100X0XXXX    10000100    /* GOTO GT2 */
   X11001100X0XXXX    01000010    /* GOTO GT1 */

/* CH 0 STATES */
   0XXXXXX01000001    01000001    /* GT0_0, WAIT ON MBB=1 IN GT0_0 */
   1XXXXXX01000001    00010001    /* GT0_0, GOTO GT0_1 */
   1XXXXXX00010001    00010001    /* GT0_1, WAIT ON MBB=0 */
   0XXXXXX00010001    00100001    /* GT0_1, GOTO GT0_WAIT */
   X1111XX00100001    00100001    /* GT0_WAIT */

/* CH 1 STATES */
   0XXXXXX01000010    01000010    /* GT1_0, WAIT ON MBB=1 IN GT1_0 */
   1XXXXXX01000010    00010010    /* GT1_0, GOTO GT1_1 */
   1XXXXXX00010010    00010010    /* GT1_1, WAIT ON MBB=0 */
   0XXXXXX00010010    00100010    /* GT1_1, GOTO GT1_WAIT */
   X1111XX00100010    00100010    /* GT1_WAIT */

/* CH 2 STATES */
   0XXXXXX10000100    10000100    /* GT2_0, WAIT ON MBB=1 IN GT2_0 */
   1XXXXXX10000100    00010100    /* GT2_0, GOTO GT2_1 */
   1XXXXXX00010100    00010100    /* GT2_1, WAIT ON MBB=0 */
   0XXXXXX00010100    00100100    /* GT2_1, GOTO GT2_WAIT */
   X1111XX00100100    00100100    /* GT2_WAIT */

/* CH 3 STATES */
   0XXXXXX10001000    10001000    /* GT3_0, WAIT ON MBB=1 IN GT3_0 */
   1XXXXXX10001000    00011000    /* GT3_0, GOTO GT3_1 */
   1XXXXXX00011000    00011000    /* GT3_1, WAIT ON MBB=0 */
   0XXXXXX00011000    00101000    /* GT3_1, GOTO GT3_WAIT */
   X1111XX00101000    00101000    /* GT3_WAIT */
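
The request-selection rows of the state transition table follow directly from the priority rotation of Table 1, so they can be regenerated and spot-checked in software. The Python sketch below is illustrative only: the function name `next_code` and the `GRANT_CODE` dictionary are hypothetical, and the 8-bit codes (ST3..ST0 followed by MGT3..MGT0) are the GTx_0 entry codes read from the table. It models only transitions out of the idle state.

```python
# Next-state codes (ST3..ST0 + MGT3..MGT0) for entering GTx_0, as listed in the table.
GRANT_CODE = {0: "01000001", 1: "01000010", 2: "10000100", 3: "10001000"}


def next_code(mrq, pri):
    """Return the next-state code for a request pattern seen in the idle state.

    mrq: four characters for MRQ3, MRQ2, MRQ1, MRQ0; '0' means requesting
         (the request lines are active Low).
    pri: the value of PRI1,PRI0 as an integer from 0 to 3.
    """
    # Character i of mrq corresponds to master 3 - i.
    requesting = {3 - i for i, bit in enumerate(mrq) if bit == "0"}
    for k in range(4):
        master = (pri - k) % 4  # Table 1 order: pri, pri-1, pri-2, pri-3
        if master in requesting:
            return GRANT_CODE[master]
    return "00000000"  # no request: remain idle (first row of the table)
```

For example, `next_code("0001", 0)` corresponds to the row X00010000X0XXXX, where masters 3, 2, and 1 request with priority value 00.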
Appendix B. PLD ToolKit Source File for Mbus Arbiter

CY7C330;
{DESIGN FILE: FOUR CHANNEL MBUS ARBITRATION UNIT WITH RANDOM PRIORITY COUNTERS AND SYNCHRONOUS PRIORITY ENABLE}

CONFIGURE;
{INPUTS}
CLK1,                   {Output Clock 2x CLK2}
CLK2,                   {Input Clock = MBUS System Clock}
!RESET,                 {Reset, Active Low}
MBB,                    {MBUS Busy, Active Low}
MRQ0,                   {MBUS Channel 0 Request, Active Low}
MRQ1,                   {MBUS Channel 1 Request, Active Low}
MRQ2,                   {MBUS Channel 2 Request, Active Low}
MRQ3(node=9),           {MBUS Channel 3 Request, Active Low}
CS,                     {Decoded Processor Chip Select}
WE,                     {Processor Write Enable}
D0,                     {Data Bus Bit 0, Latched Priority Bit 0}
D1,                     {Data Bus Bit 1, Latched Priority Bit 1}
D2,                     {Data Bus Bit 2, Latched Priority Enable Bit}
{OUTPUTS}
!MGT0(node=15),         {MBUS Channel 0 Grant, Active Low}
!MGT1,                  {MBUS Channel 1 Grant, Active Low}
!MGT2,                  {MBUS Channel 2 Grant, Active Low}
!MGT3,                  {MBUS Channel 3 Grant, Active Low}
!EN,                    {Settable Priority Enable Bit}
!PRI0(node=23),         {Priority Selection Bit 0}
!PRI1,                  {Priority Selection Bit 1}
!CT0,                   {Random Counter Bit 0}
!CT1,                   {Random Counter Bit 1}
!LP0,                   {Latched Priority Bit 0}
!LP1,                   {Latched Priority Bit 1}
INT_RST(node=29),       {Sync Reset Node}
ST0(node=31),           {State Variable Bit 0}
ST1,                    {State Variable Bit 1}
ST2,                    {State Variable Bit 2}
ST3,                    {State Variable Bit 3}
{End of configuration section}

EQUATIONS;

INT_RST = RESET;

{MBUS Request/Grant Handshake State Machine Equations}

ST3 = /MRQ3*MRQ1*MRQ0*/PRI1*/ST3*/ST2*/ST0
    + /MRQ3*PRI1*PRI0*/ST3*/ST2*/ST0
    + /MRQ3*MRQ0*/PRI1*/PRI0*/ST3*/ST2*/ST0
    + /MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*/ST0
    + /MRQ2*PRI1*/PRI0*/ST3*/ST2*/ST0
    + MRQ3*/MRQ2*MRQ1*MRQ0*/ST3*/ST2*/ST0
    + MRQ3*/MRQ2*MRQ0*/PRI0*/ST3*/ST2*/ST0
    + MRQ3*/MRQ2*PRI1*/ST3*/ST2*/ST0
    + /MBB*ST3*/ST2*/ST1*/ST0*/MGT3*MGT2*/MGT1*/MGT0
    + /MBB*ST3*/ST2*/ST1*/ST0*MGT3*/MGT2*/MGT1*/MGT0;

ST2 = MRQ2*/MRQ1*PRI1*/PRI0*/ST3*/ST2*/ST0
    + MRQ2*MRQ1*/MRQ0*/PRI0*/ST3*/ST2*/ST0
    + /MRQ1*/PRI1*PRI0*/ST3*/ST2*/ST0
    + /MRQ0*/PRI1*/PRI0*/ST3*/ST2*/ST0
    + MRQ1*/MRQ0*/PRI1*/ST3*/ST2*/ST0
    + MRQ3*MRQ2*/MRQ1*MRQ0*/ST3*/ST2*/ST0
    + MRQ3*MRQ2*/MRQ1*PRI1*/ST3*/ST2*/ST0
    + MRQ3*MRQ2*MRQ1*/MRQ0*/ST3*/ST2*/ST0
    + /MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*/MGT1*MGT0
    + /MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*MGT1*/MGT0;

ST1 = /MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*MGT1*/MGT0
    + /MBB*/ST3*/ST2*/ST1*ST0*/MGT3*MGT2*/MGT1*/MGT0
    + /MBB*/ST3*/ST2*/ST1*ST0*MGT3*/MGT2*/MGT1*/MGT0
    + /MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*/MGT1*MGT0
    + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*/MGT3*/MGT2*MGT1*/MGT0
    + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*/MGT3*MGT2*/MGT1*/MGT0
    + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*/MGT3*/MGT2*/MGT1*MGT0
    + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*MGT3*/MGT2*/MGT1*/MGT0;

ST0 = MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*/MGT1*MGT0
    + MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*MGT1*/MGT0
    + MBB*/ST3*/ST2*/ST1*ST0*/MGT3*MGT2*/MGT1*/MGT0
    + MBB*/ST3*/ST2*/ST1*ST0*MGT3*/MGT2*/MGT1*/MGT0
    + MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*MGT1*/MGT0
    + MBB*ST3*/ST2*/ST1*/ST0*/MGT3*MGT2*/MGT1*/MGT0
    + MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*/MGT1*MGT0
    + MBB*ST3*/ST2*/ST1*/ST0*MGT3*/MGT2*/MGT1*/MGT0;

MGT3 = /MRQ3*MRQ1*MRQ0*/PRI1*/ST3*/ST2*/ST0
     + /MRQ3*PRI1*PRI0*/ST3*/ST2*/ST0
     + /MRQ3*MRQ0*/PRI1*/PRI0*/ST3*/ST2*/ST0
     + /MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*/ST0
     + MBB*/ST3*/ST2*/ST1*ST0*MGT3*/MGT2*/MGT1*/MGT0
     + /MBB*/ST3*/ST2*/ST1*ST0*MGT3*/MGT2*/MGT1*/MGT0
     + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*MGT3*/MGT2*/MGT1*/MGT0
     + MBB*ST3*/ST2*/ST1*/ST0*MGT3*/MGT2*/MGT1*/MGT0
     + /MBB*ST3*/ST2*/ST1*/ST0*MGT3*/MGT2*/MGT1*/MGT0;

MGT2 = MBB*/ST3*/ST2*/ST1*ST0*/MGT3*MGT2*/MGT1*/MGT0
     + /MBB*/ST3*/ST2*/ST1*ST0*/MGT3*MGT2*/MGT1*/MGT0
     + /MRQ2*PRI1*/PRI0*/ST3*/ST2*/ST0
     + MRQ3*/MRQ2*MRQ1*MRQ0*/ST3*/ST2*/ST0
     + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*/MGT3*MGT2*/MGT1*/MGT0
     + MRQ3*/MRQ2*MRQ0*/PRI0*/ST3*/ST2*/ST0
     + MRQ3*/MRQ2*PRI1*/ST3*/ST2*/ST0
     + MBB*ST3*/ST2*/ST1*/ST0*/MGT3*MGT2*/MGT1*/MGT0
     + /MBB*ST3*/ST2*/ST1*/ST0*/MGT3*MGT2*/MGT1*/MGT0;

MGT1 = MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*MGT1*/MGT0
     + /MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*MGT1*/MGT0
     + MRQ2*/MRQ1*PRI1*/PRI0*/ST3*/ST2*/ST0
     + /MRQ1*/PRI1*PRI0*/ST3*/ST2*/ST0
     + MRQ3*MRQ2*/MRQ1*MRQ0*/ST3*/ST2*/ST0
     + MRQ3*MRQ2*/MRQ1*PRI1*/ST3*/ST2*/ST0
     + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*/MGT3*/MGT2*MGT1*/MGT0
     + /MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*MGT1*/MGT0
     + MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*MGT1*/MGT0;
MGT0 = MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*/MGT1*MGT0
     + /MBB*/ST3*/ST2*/ST1*ST0*/MGT3*/MGT2*/MGT1*MGT0
     + MRQ2*MRQ1*/MRQ0*/PRI0*/ST3*/ST2*/ST0
     + /MRQ0*/PRI1*/PRI0*/ST3*/ST2*/ST0
     + MRQ1*/MRQ0*/PRI1*/ST3*/ST2*/ST0
     + MRQ3*MRQ2*MRQ1*/MRQ0*/ST3*/ST2*/ST0
     + MRQ3*MRQ2*MRQ1*MRQ0*/ST3*/ST2*ST1*/ST0*/MGT3*/MGT2*/MGT1*MGT0
     + /MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*/MGT1*MGT0
     + MBB*/ST3*ST2*/ST1*/ST0*/MGT3*/MGT2*/MGT1*MGT0;

{Random Counter Equations}

CT1 = CT1*/CT0 + /CT1*CT0;
CT0 = /CT0;

{Latched Priority Equations}

EN  = /CS*/WE*D2 + EN*WE  + EN*CS;
LP1 = /CS*/WE*D1 + LP1*WE + LP1*CS;
LP0 = /CS*/WE*D0 + LP0*WE + LP0*CS;

{Priority Selection Latch}

PRI1 = /EN*CT1 + EN*LP1;
PRI0 = /EN*CT0 + EN*LP0;

{End of file}

Using the CY7C331 as a Waveform Generator

This application note demonstrates the ability of the Cypress CY7C331 CMOS Erasable Programmable Logic Device (EPLD) to implement a design requiring multiple clocks, input registers, buried registers, and independent control of individual registers' set and reset inputs. Combined with this design flexibility, the CY7C331 provides high-speed performance, an unprecedented combination. The application example described in this application note shows how to use the CY7C331 as a programmable waveform generator.

CY7C331 Background

The CY7C331 is a member of the Cypress slimline 28-pin family of high-performance CMOS EPLDs, which are characterized by high speed, increased I/O, and high integration. The CY7C331 has a highly flexible architecture that supports asynchronous and general-purpose glue-logic integration applications.

The CY7C331 has a 192-product-term array and 12 I/O-logic macrocells. Each macrocell has two D-type flip-flops with asynchronous set, reset, and bypass capability. You can individually program the flip-flops' clock, set, and reset inputs, as well as each macrocell's logic polarity and output enable control.

The CY7C331 easily supports combinatorial and registered inputs, along with buried states. The ability to bury registers and associated gates is highly desirable because it helps increase the number of usable gates in an EPLD. Typically, if you use an I/O pin as an input, you waste the output register and its supporting product term structure. This loss occurs because conventional devices provide only one macrocell feedback path. Using this path as an input makes it impossible to feed the contents of the register back into the array. The CY7C331's dual-muxing structure eliminates this limitation by allowing you to use the shared input mux (Figure 1) as an I/O path into the array, while simultaneously feeding back the register contents using the separate macrocell feedback mux. Because you can make the CY7C331's output register transparent by asserting both the register's set and clear nodes, you can also achieve simultaneous combinatorial feedback. Using this feature, you can implement bidirectional I/O in both registered and combinatorial configurations.

Figure 1. The CY7C331 I/O Macrocell and Shared Input Mux

Configuring the CY7C331

Figure 2 lists PLD ToolKit source code that configures a CY7C331 I/O macrocell as bidirectional, with feedback from the output. The I/O pin corresponding to the macrocell is labeled IO_PIN, and the path from the I/O pin to the macrocell is IN_PATH. The code includes explanatory comments. Note that the source code assigns IO_PIN to node 28 and IN_PATH to node 34, with pin 28 as a source. In the PLD ToolKit simulator, you must add the input waveform on the trace corresponding to node 28, even though that trace is named IO_PIN. IN_PATH's node 34 is a read-only node.
This is true even if you configure IO_PIN as a buried register, and IN_PATH is always an input. The reason is that node 34 is just a mux, and the register associated with the input belongs to node (pin) 28. If you want to see the output register's value when the pin is an input, you can create a view node for the mux node. This arrangement allows you to probe several different places inside a macrocell (see the Reference for more information on view nodes).

{*****************************************************************************************}
CY7C331;          {The first line of code selects the device}

CONFIGURE;        {In this section pin and node names are specified, along with configuration information}

INCLK, OUTCLK, /INCLR, /INSET, OE1, /OE2, INPUT,
/OUTCLR(NODE=9), /OUTSET,
{The input names are listed above. Pin 1 will be the input clock, pin 2 will be the
 output clock. Pins 3 and 4 will be the input register's clear and set signals
 respectively. Pins 5 and 6 will be output enables, OE1 is high asserted, /OE2 is low
 asserted. Pin 7 is a straight input. We skip pin 8 because it is Vss. Pins 9 and 10
 will be the output register's clear and set signals.}

IO_PIN(NODE=28, IREG), IN_PATH(NODE=34, SRC=28), OUT(NODE=27),
{Pin 28 is the actual bidirectional pin. The IREG attribute specifies that the input
 to the array comes from the output register, rather than the pin. Node 34 is the
 shared input mux for nodes 27 and 28. IN_PATH is the input path to the array from
 pin 28. Pin 27 is a simple output.}

EQUATIONS;        {This is where the array is specified.}

    INPUT         {When IO_PIN is an output, it follows Pin 7.}
    OUTSET
    OUTCLR
    OUTCLK
    OE1 * OE2     {Outputs are enabled when OE_1 is high, and /OE_2 is low.}
    INCLK
    INCLR
    INSET;

OUT =
    {Listing the connective alone sets the product term to "1", always asserted.}
    {When both the set and reset product terms are asserted, the register}
    {becomes transparent. Thus, this is a combinatorial output.}
    IN_PATH;
    {This output always shows the value of the input register at pin 28.}
    {If the register is in combinatorial mode, the value on pin 28 will be shown.}

Figure 2. PLD ToolKit Source Code for a Bidirectional Pin With Feedback

The CY7C331 as a Function Generator

Waveform generators are useful in a variety of applications, primarily in the test and diagnostic areas. Any time you need to create high-speed digital waveforms, a programmable waveform generator is the ideal solution. The CY7C331 design described here allows you to generate waveforms of frequencies greater than 30 MHz.

This waveform generator builds waveforms with respect to a system clock called SYS_CLK. To use the generator, you load into LOW_REG(2:0) the number of SYS_CLK cycles that you want the output waveform (OUT_WAVE) to remain Low. HI_REG(2:0) contains the number of SYS_CLK cycles that you want OUT_WAVE to be High. For this implementation, the values must be between 2 and 7.

When the START signal is asserted, OUT_WAVE goes Low, and LOW_REG(2:0) is loaded into a counter. When the count is almost 0 (that is, when it equals 1), the signal TERM_CNT is deasserted, then reasserted when the count reaches 0. This toggles OUT_WAVE and loads a second counter with the value in HI_REG(2:0). The cycle repeats, alternating between HI_REG(2:0) and LOW_REG(2:0), until SYS_CLK is withheld, or new values are loaded into HI_REG(2:0) and LOW_REG(2:0) and START is reissued. Figure 3 depicts the waveforms for this design.

HI_REG(2:0) and LOW_REG(2:0) are loaded using /DS and ADDR(7:0). You can specify any address for these registers. In this example, HI_REG(2:0) is at ADDR(7:0) = 00 Hex, and LOW_REG(2:0) is at ADDR(7:0) = 01 Hex. LOW_CLK_IN is the clock input for LOW_REG(2:0). The clock results from decoding the active-Low /DS (data strobe) and ADDR(7:0) = 01 Hex. HI_CLK_IN is similarly decoded from /DS and ADDR(7:0) = 00 Hex.

LOW_CNT(2:0) and HI_CNT(2:0) form two 3-bit counters.
These counters are loaded with the contents of the LOW_REG(2:0) and HI_REG(2:0) registers, respectively, via each flip-flop's individual set and reset. LOW_CNT(2:0) is loaded when /TERM_CNT is Low and OUT_WAVE is High. Similarly, HI_CNT(2:0) is loaded when /TERM_CNT is Low and OUT_WAVE is Low. SYS_CLK clocks both counters. /TERM_CNT is also clocked by SYS_CLK and detects when either of the counters equals 1. When this occurs, /TERM_CNT goes Low for one clock, then goes High again. /TERM_CNT's rising edge clocks OUT_WAVE, which toggles on every clock.

Figure 3. Waveform Generator Internal and External Timing

Implementing this design requires two separate 3-bit input registers, decoding logic for the input-register clocks, two separate 3-bit counters, logic, and two miscellaneous registers. All the counter flip-flops must be individually settable or resettable. In addition, there are four separate clocking functions. Figure 4 shows an implementation of this design using small-scale integration.

This type of design is usually difficult to implement in a PLD. The flip-flops in most PLDs permit neither the use of the individual set and reset inputs nor separate clocking. Because the CY7C331 has these features, however, it implements the design effortlessly.

Figure 4. Schematic of the Waveform Generator

PLD ToolKit Implementation

Appendix A contains the Cypress PLD ToolKit source code for the waveform generator. Two aspects of the code require some clarification: the pin assignments and polarity.

The pin assignments for nodes (pins) 1 through 14 are straightforward. Pin 8 has been skipped because it is a Vss pin. Otherwise, these pins are the CY7C331's combinatorial inputs and thus require no configuration information.

OUT_WAVE is assigned to pin 16. "IOP" following the node assignment indicates that the feedback mux is programmed to feed back the OUT_WAVE register's Q output. Because this is the default, it does not need to be specified, but it is included here for documentation purposes. The same is true for TERM_CNT, /HI_CNT_0, and /LOW_CNT_1.

Notice that HI_IN_1 and LOW_IN_0 have the attribute "IREG" listed after the node assignment. This attribute specifies that these pins are dedicated inputs; the feedback mux selects the Q output of the input register associated with the pin, as opposed to the output register's Q output. This is an override of the default discussed above.

The rest of the assignments are of the same form as /HI_CNT_2 and HI_IN_2. /HI_CNT_2 is assigned to node 18, with an attribute of IOP. As mentioned earlier, this configures the feedback mux to select feedback from /HI_CNT_2 as the array input.
HI_IN_2 is assigned to node 30, which is an additional mux that serves as an input path from the input register on either pin 18 or 17. The notation "SRC = 18" specifies that HI_IN_2 is assigned to the input register on pin 18. The default is that the even pin is always selected, and thus "SRC = 18" is included primarily for documentation purposes. This method for utilizing both a pin's input and output registers is used four times in this design. In each case, the output register is buried (not accessible to the pin). Figure 5 shows the CY7C331 footprint with all external pin signals labeled.

Figure 5. Footprint of the CY7C331 Waveform Generator

A close look at the file in Appendix A might also raise questions concerning polarity conventions in the PLD ToolKit. Polarity on inputs is fairly straightforward. Note that the "/" in /START denotes a Low-asserted signal. When START appears in the EQUATIONS section (refer to the /OUT_WAVE and /TERM_CNT equations) without the "/", the signal is interpreted as /START being asserted. Thus, when /START = 0, the OUT_WAVE register is set.

The output feedback polarity can cause more confusion. Polarity on the CY7C331 is programmed using the XOR in the array. Thus, when TERM_CNT is specified in the CONFIGURE section, the output register is actually /TERM_CNT, because an inverter lies between the register output and the pin. Further, when you set TERM_CNT, the pin is Low. How, then, do you specify that TERM_CNT is asserted when it appears on the right of an equation? You refer to the polarity present on the pin. Thus, in the clock portion of the /OUT_WAVE equation, TERM_CNT is specified. This means that /OUT_WAVE is clocked when pin 17 (TERM_CNT) exhibits a rising edge.

Reference

PLD ToolKit Manual, Chapter 4.3. Available from Cypress Semiconductor.
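The Low/High timing described in this note can be checked with a short behavioral model. The following Python sketch is not part of the PLD ToolKit flow: the function names out_wave and out_frequency_mhz are hypothetical, and the model idealizes the device by ignoring the one-clock TERM_CNT pulse and any start-up latency. It assumes only what the text states: OUT_WAVE goes Low on START, stays Low for LOW_REG cycles, stays High for HI_REG cycles, and both values lie between 2 and 7.

```python
def out_wave(low_count, high_count, total_cycles):
    """Return the idealized OUT_WAVE level (0 or 1) for each SYS_CLK cycle after START."""
    for name, value in (("low_count", low_count), ("high_count", high_count)):
        if not 2 <= value <= 7:
            raise ValueError(f"{name} must be between 2 and 7, got {value}")

    levels = []
    level = 0                  # START drives OUT_WAVE Low first
    remaining = low_count      # cycles left in the current phase
    for _ in range(total_cycles):
        levels.append(level)
        remaining -= 1
        if remaining == 0:     # phase finished: toggle and reload the other count
            level ^= 1
            remaining = high_count if level else low_count
    return levels


def out_frequency_mhz(sys_clk_mhz, low_count, high_count):
    """One output period spans low_count + high_count SYS_CLK cycles."""
    return sys_clk_mhz / (low_count + high_count)
```

Under these assumptions, loading 2 into both registers gives the shortest period, four SYS_CLK cycles, so the output frequency is one quarter of the system clock frequency.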
Appendix A. PLD ToolKit Code for the Waveform Generator

CY7C331;

CONFIGURE;

/DS,                                        {Low asserted data strobe}
ADDR0, ADDR1, ADDR2, ADDR3, ADDR4, ADDR5,   {address bits 0,1,2,3,4,5}
ADDR6(NODE=9), ADDR7,                       {address bits 6 and 7}
/START,                                     {start sequence}
SYS_CLK,                                    {counter clock}
SYS_CLEAR(NODE=14),                         {initialize OUT_WAVE, TERM_CNT to a quiescent state}
OUT_WAVE(NODE=16,IOP),                      {output waveform}
TERM_CNT(NODE=17,IOP),                      {terminal count decode register}
/HI_CNT_2(NODE=18,IOP),                     {high counter bit 2, a buried register}
HI_IN_2(NODE=30,SRC=18),                    {high register input bit 2}
HI_IN_1(NODE=19,IREG),                      {high counter input bit 1}
/HI_CNT_1(NODE=20,IOP),                     {high counter bit 1, a buried register}
HI_IN_0(NODE=31,SRC=20),                    {pin 20 acts as high register input bit 0}
/HI_CNT_0(NODE=23,IOP),                     {high counter bit 0}
/LOW_CNT_2(NODE=24,IOP),                    {low counter bit 2, a buried register}
LOW_IN_2(NODE=32,SRC=24),                   {pin 24 is low register input bit 2}
/LOW_CNT_1(NODE=25,IOP),                    {low counter bit 1}
LOW_IN_1(NODE=33,SRC=26),                   {pin 26 acts as low register input bit 1}
/LOW_CNT_0(NODE=26,IOP),                    {low counter bit 0, a buried register}
LOW_IN_0(NODE=27,IREG),                     {low register input bit 0}

EQUATIONS;

LOW_CNT_0 := /LOW_CNT_0
    SYS_CLK
    DS*ADDR0*/ADDR1*/ADDR2*/ADDR3*/ADDR4*/ADDR5*/ADDR6*/ADDR7
    /LOW_IN_0 * /OUT_WAVE * /TERM_CNT
    LOW_IN_0 * /OUT_WAVE * /TERM_CNT;

    DS*ADDR0*/ADDR1*/ADDR2*/ADDR3*/ADDR4*/ADDR5*/ADDR6*/ADDR7;

LOW_CNT_1 := LOW_CNT_1
    LOW_CNT_0
    /LOW_IN_1 * /OUT_WAVE * /TERM_CNT
    LOW_IN_1 * /OUT_WAVE * /TERM_CNT
    SYS_CLK;

LOW_CNT_2 := LOW_CNT_2
    LOW_CNT_0 * LOW_CNT_1
    /LOW_IN_2 * /OUT_WAVE * /TERM_CNT
    LOW_IN_2 * /OUT_WAVE * /TERM_CNT
    SYS_CLK
    DS*ADDR0*/ADDR1*/ADDR2*/ADDR3*/ADDR4*/ADDR5*/ADDR6*/ADDR7;

OUT_WAVE
    TERM_CNT
    START
    SYS_CLEAR

/TERM_CNT := /LOW_CNT_0 * LOW_CNT_1 * LOW_CNT_2
    /HI_CNT_0 * HI_CNT_1 * HI_CNT_2
    START
    SYS_CLEAR;

/HI_CNT_0
    SYS_CLK
    HI_IN_0 * OUT_WAVE * /TERM_CNT
    /HI_IN_0 * OUT_WAVE * /TERM_CNT;

HI_CNT_1
    HI_CNT_0
    /HI_IN_1 * OUT_WAVE * /TERM_CNT
    HI_IN_1 * OUT_WAVE * /TERM_CNT
    SYS_CLK
    DS*/ADDR0*/ADDR1*/ADDR2*/ADDR3*/ADDR4*/ADDR5*/ADDR6*/ADDR7;

/HI_IN_1 =
    DS*/ADDR0*/ADDR1*/ADDR2*/ADDR3*/ADDR4*/ADDR5*/ADDR6*/ADDR7;

HI_CNT_2 :=
    HI_CNT_2
    HI_CNT_1 * HI_CNT_0
    /HI_IN_2 * OUT_WAVE * /TERM_CNT
    HI_IN_2 * OUT_WAVE * /TERM_CNT
    SYS_CLK
    DS*/ADDR0*/ADDR1*/ADDR2*/ADDR3*/ADDR4*/ADDR5*/ADDR6*/ADDR7;

CY7C331 Application Example: Asynchronous, Self-Timed VMEbus Requester

This application note describes how to use the Cypress CY7C331 CMOS erasable programmable logic device (EPLD) to support asynchronous, self-timed designs.
The CY7C331 is ideal for implementing asynchronous, self-timed, and general-purpose logic-integration applications. The application example described here is an asynchronous, self-timed VMEbus requester.

The CY7C331 is a member of the Cypress slim-line, 28-pin family of high-performance CMOS EPLDs. Family members are characterized by high speed, increased I/O, and high integration. The CY7C331 has a highly flexible architecture with a 192-product-term logic array and 12 I/O-logic macrocells. Each macrocell provides two D flip-flops with asynchronous set, reset, and bypass capability. The flip-flops' clock, set, and reset inputs are individually programmable, as are each macrocell's logic polarity and output-enable control. The CY7C331 easily supports combinatorial and registered inputs and outputs and buried states.

Additionally, the CY7C331 has the uncommon ability to self-time asynchronous, sequential applications. A self-timed design performs a sequential task without the presence of a clock to synchronize each step in the sequence. This design approach usually results in higher performance compared to synchronous designs. The main application for self-timing is in high-performance I/O interfaces. The CY7C331 supports self-timed designs because its clock inputs are programmable, internal timing relationships are well-controlled, and metastable resolution is ultra-fast.

The VMEbus is a common, high-performance asynchronous bus. The VMEbus request function is asynchronously initiated and sequential. In addition to showing the CY7C331's ability to handle asynchronous, self-timed tasks, this application example demonstrates the use of many unique CY7C331 features.

CY7C331 Brief Description

The CY7C331 is available in a 28-pin slim-line (300 mil wide) plastic or windowed DIP and in 28-pin PLCC and LCC packages. The windowed version is UV erasable and reprogrammable, and the plastic DIP, PLCC, and LCC versions are one-time programmable.
The CY7C331 is available with tPD and tCO specified at 20 ns max and with register set-up times of 12 or 2 ns, depending on whether the register connects to an input pin or to the device's logic array. Other commercial and military speed grades are also available.

The CY7C331 is based on a programmable sum-of-products (AND-OR) logic-array architecture. The logic array consists of 192 programmable product terms, each having as input the true and complement versions of 31 logic inputs. The product terms connect to one of twelve I/O logic macrocells, and each of these macrocells connects to a device pin. The product terms are allocated with a variable distribution to the macrocells.

The CY7C331 provides 13 combinatorial inputs to the array from dedicated input pins, one of which (pin 14) can also be used as an output-enable control. The macrocells and six shared input muxes each provide an input to the array. A shared input mux selects the input from one of two adjacent macrocells (Figure 1).

The CY7C331's I/O-logic macrocell sums array product terms, selectively inverts the sum, and provides the result to the D input of a D flip-flop. The flip-flop's output (Q) connects through an inverting three-state buffer to a device pin and can be fed back to the array. The I/O macrocell also provides a second D flip-flop that latches data from the same device pin. This flip-flop's Q output connects to the macrocell input-select mux and to the shared-input mux (see Figure 1 in "Using the CY7C331 as a Waveform Generator"). Both flip-flops have asynchronous set (S) and reset (R) inputs, as well as bypass capability. A flip-flop bypasses the D input to Q when S and R are both High. Separate product terms drive both flip-flops' clock, S, and R inputs.

A multi-input OR gate sums the product terms. The number of product terms input to the OR gate depends on the macrocell (Figure 1). A dual-input XOR gate selectively inverts the sum.
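The macrocell data path described above can be summarized in a small behavioral model. This Python sketch is an illustration only, not Cypress code: the names DFlipFlop, macrocell_d_input, and pin_level are hypothetical, signals are modeled as 0/1 integers, and all timing is ignored.

```python
from dataclasses import dataclass


@dataclass
class DFlipFlop:
    """D flip-flop with asynchronous set (s), reset (r), and bypass."""
    q: int = 0

    def update(self, d, clock_edge, s=0, r=0):
        if s and r:
            self.q = d      # bypass: S and R both High pass D straight to Q
        elif s:
            self.q = 1      # asynchronous set
        elif r:
            self.q = 0      # asynchronous reset
        elif clock_edge:
            self.q = d      # normal clocked capture
        return self.q


def macrocell_d_input(product_terms, invert):
    """OR the product terms, then selectively invert the sum (XOR stage)."""
    return int(any(product_terms)) ^ int(bool(invert))


def pin_level(q, output_enabled):
    """Inverting three-state buffer: None models a high-impedance pin."""
    return (1 - q) if output_enabled else None
```

Driving s and r High together turns the register into a combinatorial path, which is the bypass mode that later sections rely on for clock-delay elements.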
The XOR gate's second input is a product term that controls selective inversion. You can control a macrocell's output enable (OE) by using pin 14 or a product term. The OE mux selects one of these two options. Another mux, the FB mux, selects the macrocell array input. Each OE, FB, and shared-input feedback mux has an associated programmable configuration bit that controls mux selection.

Figure 1. Cypress CY7C331 Block Diagram

CY7C331 Self-Timed Capability

The main application for self-timed functions is in high-performance I/O interfaces, where clocking restrictions prevent performance requirements from being satisfied. These applications might not have an available clock, the clock might be too slow, or synchronization time might have to be minimized.

A self-timed design implements a state machine without the presence of a clock to synchronize each state transition. The implementation of a self-timed design must meet two basic requirements:

1. It must time and perform state transitions.
2. It must synchronize asynchronous inputs.

As in any state machine, a self-timed design must meet minimum state flip-flop set-up times before performing a state transition. Without the benefit of a clock, the design must generate self-timing clocks based on the state data change due to a state transition itself. Thus, clock initiation and data changes are coincident, and the design must delay a clock to allow data to settle and meet minimum set-up time requirements.

The simplest example of self-timing appears in Figure 2. This circuit clocks a logic 1 into a D flip-flop on the input's rising edge. The design works if the clock delay time is long enough to allow the data input to be set up. This simple circuit illustrates how the CY7C331 supports self-timed designs; the CY7C331 allows you to program the timing relationship between the flip-flop's D-input logic and clock input logic to guarantee satisfaction of minimum set-up time requirements. The CY7C331 synchronizes asynchronous inputs in the same manner, except that the set-up time is longer to allow for metastable resolution. The CY7C331 can also perform self-timed synchronization because metastable resolution is ultra-fast.

The approach used in the CY7C331 to self-time state transitions is to delay a clock signal by passing it through the logic array one additional time; this arrangement allows data to meet set-up time requirements. To guarantee that this approach works, the extra delay in the clock path must be programmed to delay the clock as long as possible (Figure 3). In general, a self-timed design should set up data as fast as possible and delay the clock long enough to guarantee that data is set up. But delay time in the CY7C331 is sensitive to the logic function programmed. Guaranteeing that data is set up as fast as possible restricts the logic functions the device can perform. You can avoid this limitation by placing restrictions on the clock path. You can program any logic function if the clock delay path is slow enough.

To perform self-timed synchronization, the clock is delayed by two extra passes to provide the extra delay required for metastable resolution (Figure 4). Program both clock delay elements to be as slow as possible so you can configure any logic function. With these restrictions, the mean time to failure (MTF) due to a metastable condition is greater than 10 years.

Clock Delay Programming

In the CY7C331, a product term generates an output transition from Low to High faster than from High to Low. A transition caused by a single input and a single product term is faster than those caused by multiple inputs and/or product terms.
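Because path delays differ in this way, it can help to write the self-timing budget down explicitly. The Python sketch below is only a worked-arithmetic aid: the function name setup_margin_ns and all numbers used with it are illustrative assumptions, not CY7C331 data sheet values. It encodes the requirement stated earlier, that the delayed clock must arrive after the data has settled and the set-up time has elapsed, with an optional extra allowance for metastable resolution when an asynchronous input is being synchronized.

```python
def setup_margin_ns(clock_delay_ns, data_delay_ns, setup_ns, resolution_ns=0.0):
    """Margin by which the delayed clock edge trails the settled data.

    Clock and data are assumed to start changing at the same instant.
    A negative result means the set-up requirement is violated.
    """
    return clock_delay_ns - (data_delay_ns + setup_ns + resolution_ns)


# Illustrative numbers only: a 40 ns clock path against a 20 ns data path
# with a 2 ns set-up time leaves 18 ns of margin.
margin = setup_margin_ns(clock_delay_ns=40, data_delay_ns=20, setup_ns=2)
```

A clock path no slower than the data path always yields a negative margin, which is why the clock-delay logic is deliberately programmed to be the slowest path.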
The shortest delay time through a CY7C331 occurs when a single input triggers a single product term to transition from Low to High. The slowest clock path results from placing restrictions on how the extra level of clock delay is programmed. These restrictions are:

• The clock delay should use a logic path through multiple product terms, OR gates, and XOR gates to a bypassed flip-flop.
• Clock delay logic should make product term outputs transition from High to Low.
• All product terms to the OR gate should be programmed identically to implement clock logic.
• The OR gate should have the same or more inputs than associated data-path OR gates.
• The programmable XOR input should be set Low.

The clock delay element shown in Figure 3 illustrates each of these programming restrictions.

Self-Timed VMEbus Requester

Bus requesters are used in common bus systems that support multiple processors controlling bus transfers. A processor that controls bus transfers is typically referred to as a bus master. The bus requester requests permission for a master to control the data bus and indicates to the master when data bus control has been granted.

The VMEbus supports multiple bus masters. A self-timed design approach for a VMEbus requester is appropriate because the VMEbus is asynchronous and offers high performance. The bus-request function is asynchronously initiated and is sequential. A self-timed design self-synchronizes to initiate the request and self-times the rest of the request sequence at CY7C331 device speed. A synchronous approach requires an external clock to synchronize and time the sequence, for which the VMEbus provides a 16-MHz system clock. However, a CY7C331 self-timed design provides much higher performance than a synchronous design using the system clock.

VME Background

The VMEbus is defined to support multiple bus masters, although only one master can control the bus at a time.
The VMEbus provides an arbitration subsystem in which a central bus arbiter determines which master is granted the data bus. Each master contains a bus requester to request control of the bus from the arbiter.

The arbitration subsystem is supported on the VMEbus with six bused lines and four daisy-chained lines. All these lines are active Low, which is indicated by a "-" suffix on a line name. The bused lines are Bus Busy (BBSY-), Bus Clear (BCLR-), and Bus Request 3-0 (BR3- through BR0-). When the daisy-chained lines enter a board, they are designated Bus Grant 3-0 In (BG3IN- through BG0IN-), and when leaving are designated Bus Grant 3-0 Out (BG3OUT- through BG0OUT-). (The terms BRx-, BGxIN-, and BGxOUT- are used when references are not to a specific line or lines; x is any value from 0 to 3.) The highest priority is allocated to number 3 lines and lowest to number 0 lines. The BGxOUT- lines that leave a board in slot n enter the board in slot n+1 as BGxIN- lines. The bus arbiter must always reside in the first slot of a VMEbus-based system to initiate BGxOUT- generation.

All masters in the system drive BBSY- when they have control of the bus. Within each bus-grant daisy chain, all masters drive the same BRx- line. Multiple masters on a bus grant daisy chain can request the data bus at the same time by simultaneously driving their associated BRx- lines. When this occurs, the requester furthest up in the daisy chain gets the bus grant. The remaining master(s) on the daisy chain can continue to assert BRx- until they receive a bus grant.

A simple VMEbus requester initiates a request after detecting an on-board request (OBR). (A simplified bus-request state diagram and timing diagram appear in Figures 5 and 6.) The requester then drives the BRx- line active and waits for the associated BGxIN- line to become active. Once the requester detects BGxIN- active, BBSY- and the appropriate DMA Grant line (DMAGRx-) are driven active, while BRx- is released to inactive.
The active DMAGRx- line indicates to an on-board master that it has the bus and can perform a data transfer. While data is being transferred, the bus master asserts the Data Transfer (DTR-) input to the CY7C331 bus requester. When the master has finished using the bus, the DTR- input is deasserted. The requester then releases the bus by deasserting BBSY- and OBG. Even if one of the other on-board masters wants the bus, the requester deasserts BBSY- and waits for a new BGxIN- before granting the bus to this master. This extra overhead allows other requesters that might be further up the daisy chain to obtain the bus between on-board bus requests.

If the bus grant input (BGxIN-) becomes active while none of the on-board request lines are active, the requester must pass the grant down the daisy chain. This is accomplished by asserting the bus grant out (BGxOUT-) signal.

The VMEbus specification includes a few timing and requester design restrictions. A VMEbus requester must satisfy the two timing requirements displayed in Figure 6. BBSY- must be driven for a minimum of 90 ns, and the release of BRx- must occur at least 30 ns before BBSY- is released. The primary design requirements are that BBSY- and BRx- must use open-collector drivers, and BGxOUT- must never glitch during operation. The restriction on BGxOUT- ensures avoidance of inadvertent bus grants.

Figure 2. A Self-Timed Element

Figure 3. CY7C331 Self-Timed Element

Requester Design

The requester supports overlapped bus requests; it also releases the data bus every transfer cycle to allow the central arbiter to grant the bus to a higher-priority requester, if one exists. The CY7C331 VMEbus requester supports three on-board DMA request lines (DMARQ2- through DMARQ0-). All the DMARQx- lines can generate a bus request on the BRx- line. The requester supports three on-board grant lines (DMAGR2- through DMAGR0-), one for each request line. When a bus grant is received on BGxIN-, the requester must determine which DMAGRx- line to activate. The requester prioritizes the DMARQx- lines and grants the bus to the highest priority request; DMARQ0- has the highest priority and DMARQ2- the lowest. The selected DMAGRx- line is not activated until the previous data transfer is complete.

If any of the DMARQx- lines are active when a bus grant is received, the requester drives BBSY- active. For overlapped operation, BBSY- is released as soon as possible to facilitate the next bus arbitration. BBSY- is not released, however, until the following criteria are met: BBSY- is driven for at least 90 ns, BGxIN- is inactive, and the previous data transfer is complete (DTR- is deasserted). If none of the DMARQx- lines is requesting the bus when a grant is received, the requester passes the grant onto BGxOUT- for the next requester on the daisy chain. The requester also recognizes a system reset (SYSRESET-) and initializes the device appropriately.

A logic diagram of a self-timed VMEbus requester using the CY7C331 appears in Figure 7. BRx- is the OR of the DMARQx- lines.

Figure 4. CY7C331 Self-Synchronizing Element

Requester Operation

If any DMARQx- line becomes active, BRx- becomes active, signifying to the arbiter that one of the masters on this board wants the data bus. An external open-collector driver drives BRx-.

Self-timed operation begins when the incoming BGxIN- line becomes active. The three on-board DMA request lines (DMARQ2- through DMARQ0-) are self-synchronized to the BGxIN- line. The falling edge of BGxIN- serves as a clock to register the DMARQx- lines and toggle a flip-flop from High to Low to initiate an internal, self-timed clock signal (STCP). The DMARQx- lines must be synchronized, because BGxIN- can be activated when any BRx- line becomes active or when BBSY- is released. For example, if DMARQ0- causes the associated BRx- to initiate bus arbitration, and DMARQ2- attempts to become active at the same time BGxIN- becomes active, the resulting state of DMARQ2- could be an indeterminate metastable condition that needs time for resolution. The pair of internal clock delays provides this time before the DMAGR2- output register samples the state of DMARQ2-.

Two CY7C331 delay elements delay the internal, self-timed clock signal to provide enough time to self-synchronize the requests. The requests are prioritized during the clock delay time. The resulting delayed clock (STCP2) then asserts BBSY- if any of the DMARQx- lines are active. If none are active, the BGxOUT- line is asserted to send the grant to the next requester in the daisy chain. Using the delayed clock to generate BBSY- and BGxOUT- guarantees that both lines are synchronized and cannot glitch. BBSY- is driven onto the bus with an external open-collector driver.

The prioritized requests are clocked into registers to create the DMAGRx- signals on the delayed STCP's rising edge, if the previous data transfer has completed, or on the rising edge of DTR- when the data transfer completes. An internal flip-flop toggles at the same time. The flip-flop output indicates transfer completion (TC).

Figure 6. VME Arbitration Timing

The registered BBSY- line feeds into an external 90-ns delay line to guarantee that BBSY- is active for the minimum required time. The delay mechanism should be designed such that the delay circuit has no effect if the data transfer requires more than 90 ns to complete. One way to implement this feature is to use a one-shot triggered by the falling edge of the CY7C331's BBSY- signal.
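The effect of such a pulse stretcher on the bus-busy timing can be sketched numerically. The Python fragment below is an idealized illustration, not part of the design files: the names VME_MIN_BBSY_NS and bus_bbsy_width_ns are hypothetical, propagation delays are ignored, and the model assumes that the bus-side signal stays asserted for as long as either the requester's own BBSY- or the one-shot pulse is asserted, with both starting at the same instant.

```python
VME_MIN_BBSY_NS = 90  # minimum BBSY- assertion time quoted in the text


def bus_bbsy_width_ns(requester_width_ns, one_shot_ns=VME_MIN_BBSY_NS):
    """Width of BBSY- seen on the bus when a one-shot stretches short pulses.

    Both pulses begin together, so the combined assertion lasts as long as
    the longer of the two.
    """
    return max(requester_width_ns, one_shot_ns)
```

With these assumptions, a 40 ns assertion from the requester is stretched to 90 ns on the bus, while a 150 ns assertion passes through unchanged, which matches the stated goal that the delay circuit has no effect on long transfers.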
The one-shot's output is ORed with the BBSY- signal from the CY7C331 to generate the BBSY- signal to the VMEbus. The VME BBSY- signal is inactivated when the 90-ns delay has elapsed, provided that TC is True and DTR- and BGxIN- are inactive. The requester is initialized for another self-timed operation at the same time. The requester also initializes when the SYSRESET input is asserted.

Figure 5. VME Bus Requester State Diagram

This design uses the 90-ns delay circuit because an absolute delay is required to meet the VME specification. A self-timed delay can yield only relative results because there is no way to determine how many delay levels are required to obtain a 90-ns delay. Any one delay is usually much faster than the worst-case specification, but the delay might be that slow. You can emulate the delay on-chip by creating a digital delay, but accuracy would be poor because you would have to synchronize BBSY- to an absolute time base, such as the 16-MHz system clock.

The CY7C331 can emulate the external open-collector drivers, but the emulation would not meet the VMEbus specification's drive requirements. To emulate an open-collector driver, use the signal output to the external driver to drive the output enable of an on-board inverting, three-state driver (with the input tied High).

CY7C331 Implementation

The bus requester can be implemented and simulated using the source code in Appendix A, generated via the Cypress PLD ToolKit software package. A close examination of the code reveals how many of the CY7C331's features are utilized.

The DMARQx- lines use two CY7C331 pins for each line, one combinatorial and one registered.
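The two uses of each DMARQx- line can be sketched behaviorally. In the Python fragment below, which is an illustration and not a translation of the Appendix A equations, the function names bus_request and grant_decision are hypothetical and each request is modeled as a boolean that is True when the active-Low line is asserted. The combinatorial copy drives the bus request, and the registered copy, sampled when the grant arrives, feeds the priority decision in which DMARQ0- ranks highest.

```python
def bus_request(dmarq):
    """Combinatorial path: BRx- is asserted when any DMARQx- is asserted."""
    return any(dmarq)


def grant_decision(registered_dmarq):
    """Registered path: decide what to do with an incoming bus grant.

    registered_dmarq[i] is the sampled state of DMARQi-; index 0 has the
    highest priority. With no request pending, the grant is passed on.
    """
    for line, requested in enumerate(registered_dmarq):
        if requested:
            return {"bbsy": True, "bgxout": False, "dmagr": line}
    return {"bbsy": False, "bgxout": True, "dmagr": None}
```

For example, if DMARQ1- and DMARQ2- are both sampled active, the sketch selects DMAGR1- and asserts BBSY-, while an all-inactive sample forwards the grant on BGxOUT-.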
The registered input pins are used to conserve output logic for other functions. The three macrocells associated with the registered inputs also perform the internal self-timed clock generation and delay functions; most other PLDs require six outputs to implement these functions. In addition, the CY7C331's individually programmable clocks allow the input register flip-flops to be clocked on the falling edge of BGxIN-.

BBSY is assumed to be the input to the external delay line, and the CY7C331 input BBSY90 is assumed to connect to the delay line output. The source code defines the self-timed clock generation and delay logic needed to meet the requirements of CY7C331 self-synchronization.

Figure 7. Self-Timed VMEbus Requester

Appendix A. PLD ToolKit Source Code for VMEbus Requester

CY7C331;
{ Norman Taffe
  Cypress Semiconductor
  6/20/1990
  Cypress PLD Toolkit
  VME Bus Requester }

CONFIGURE;

DMARQ2(node=1), DMARQ1(node=2), DMARQ0(node=3),   { On-board Request Lines }
BGxIN(node=4),                                    { VME Bus Grant Input }
SYSRESET(node=6),
BBSY90(node=7),                                   { Externally delayed BBSY signal }
DTR(node=9),                                      { Signifies a Data Transfer in progress }
node14(node=14),
/INIT(node=15),                                   { Requester initialize signal }
/OBG(node=16,ireg),                               { Signals board that it has the bus }
/STCP(node=17),                                   { Self timed CLK input register }
/BBSY(node=18),                                   { Assert Bus Busy when taking the bus }
/BGxOUT(node=19,ireg),                            { Send Bus Grant down the daisy chain if not wanted }
/BRx(node=20,ireg),                               { Signal arbiter that this board wants the bus }
/DMAGR0(node=23), /DMAGR1(node=24),               { On-board grant lines }
/DMAGR2(node=25),
/RDMARQ1(node=26), /RDMARQ2(node=27),             { Registered On-Board Request lines }
/RDMARQ0(node=28,ireg),
STCP2(node=33),                                   { Second delay stage of self timed clock }
STCP1(node=34,src=27),                            { First delay stage of self timed clock }
TC(node=30,SRC=17),                               { Resets the INIT signal }

EQUATIONS;

INIT = <OE> <SET_OUT> <CLR_OUT>
    BGxIN*BBSY90*TC*DTR
    /SYSRESET;

STCP <CK_OUT> RDMARQ1 & DTR