User Manual: 1973-06_#42
Page Count: 936
AFIPS CONFERENCE PROCEEDINGS
VOLUME 42

1973 NATIONAL COMPUTER CONFERENCE AND EXPOSITION
June 4-8, 1973, New York, New York

AFIPS PRESS
210 Summit Avenue
Montvale, New Jersey 07645

The ideas and opinions expressed herein are solely those of the authors and are not necessarily representative of or endorsed by the 1973 National Computer Conference or the American Federation of Information Processing Societies, Inc.

Library of Congress Catalog Card Number 55-44701

©1973 by the American Federation of Information Processing Societies, Inc., Montvale, New Jersey 07645. All rights reserved. This book, or parts thereof, may not be reproduced in any form without permission of the publisher.

Printed in the United States of America

PART I - SCIENCE AND TECHNOLOGY

CONTENTS

PART I - SCIENCE AND TECHNOLOGY PROGRAM

DELEGATE SOCIETY SESSION

The Association for Computational Linguistics
Linguistics and the future of computation ..... 1 ..... D. G. Hays
An abstract-Speech understanding ..... 8 ..... D. E. Walker
An abstract-Syntax and computation ..... 8 ..... J. J. Robinson
An abstract-Literary text processing ..... 8 ..... S. Y. Sedelow

Society for Information Display
The augmented knowledge workshop ..... 9 ..... D. C. Engelbart
Graphics, problem-solving and virtual systems ..... 23 ..... R. Dunn

Association for Computing Machinery
Performance determination-The selection of tools, if any ..... 31 ..... T. E. Bell
An abstract-Computing societies-Resource or hobby? ..... 38 ..... A. Ralston

Special Libraries Association
An abstract-Special libraries association today ..... 39 ..... E. A. Strahle
An abstract-Copyright problems in information processing ..... 39 ..... B. H. Weil
An abstract-Standards for library information processing ..... 39 ..... L. C. Cowgill, D. Weisbrod

Association for Educational Data Systems
An abstract-A network for computer users ..... 40 ..... B. K. Alcorn
An abstract-Use of computers in large school systems ..... 40 ..... T. McConnell
An abstract-Training of teachers in computer usage ..... 41 ..... D. Richardson
An abstract-How schools can use consultants ..... 41 ..... D. R. Thomas

Society for Industrial and Applied Mathematics
NAPSS-like systems-Problems and prospects ..... 43 ..... J. R. Rice
An abstract-The correctness of programs for numerical computation ..... 48 ..... T. E. Hull

The Society for Computer Simulation
An abstract-The changing role of simulation and simulation councils ..... 49 ..... J. McLeod
An abstract-Methodology and measurement ..... 50 ..... P. W. House
An abstract-Policy models-Concepts and rules-of-thumb ..... 50 ..... T. Naylor
An abstract-On validation of simulation models ..... 51 ..... G. S. Fishman

IEEE Computer Society
An abstract-In the beginning ..... 52 ..... H. Campaigne
An abstract-Factors affecting commercial computers system design in the seventies ..... 52 ..... W. F. Simon
An abstract-Factors impacting on the evolution of military computers ..... 52 ..... G. M. Sokol

Instrument Society of America
Modeling and simulation in the process industries ..... 53 ..... C. L. Smith
Needs for industrial computer standards-As satisfied by ISA's programs in this area ..................................... . 57 T. J. Williams K. A. Whitman Quantitative evaluation of file management performance improvements ................................................... . 63 T. F. McFadden J. C. Strauss A method of evaluating mass storage effects on system performance .................................................... . 69 M. A. Diethelm PERFORMANCE EVALUATION The memory bus monitor-A new device for developing real-time systems Design and evaluation system for computer architecture 75 81 An analysis of multi programmed time-sharing computer systems 87 00000000000000000000000000000000000000000000000000 0 0 0 0 0 0 000 0 0 Use of the SPASM software monitor to evaluate the performance of the Burroughs B6700 93 Evaluation of performance of parallel processors in a real-time environment 101 A structural approach to computer performance analysis 109 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00000000000000000000000000000000000000000000000000 0 0 0 0 0 0 0 0 0 Ro E. Fryer Ko Hakozaki M. Yamamoto T.Ono N.Ohno Mo Umemura M.A. Sencer C. L. Sheng J. Mo Schwartz D.S. Wyner Go Ro Lloyd RoE. Merwin P. H. Hughes GoMoe NETWORK COMPUTERS-ECONOMIC CONSIDERATIONSPROBLEMS AND SOLUTIONS Simulation-A tool for performance evaluation in network computers 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00000000000000000000000000000000 ACCNET-A corporate computer network A system of APL functions to study computer networks A high level language for use with computer networks On the design of a resource sharing executive for the ARPANET Avoiding simulation in simulating computer communications networks 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 121 133 141 149 155 E. K. Bowdon, Sr. SoA.Mamrak F. R. Salz MoL. Coleman To D. Friedman H. Z. Krilloff R. Ho Thomas 165 R. M. Van Slyke W.Chow Ho Frank An implementation of a data base management system on an associative processor Aircraft conflict detection in an associative processor A data management system utilizing an associative memory. 171 177 181 Associative processing applications to real-time data management 187 R.Moulder H. R. Downs Co R. DeFiore P. B. Berra Ro R. Linde L. O. Gates T.FoPeng 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0000000000000 ASSOCIATIVE PROCESSORS 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 •••• 0 • 0 0 • 0 0 • 0 •• 0 0 0 0 0 • 0 0 •• 0 • 0 ••• 0 0 0 0 0 0 • 0 AUTOMATED PROJECT MANAGEMENT SYSTEMS A computer graphics assisted system for management ... 0 •••• 0 •• 0 •• 0 0 • 197 R. Chauhan 203 211 WoGorman M. H. Halstead 215 Jo E. Brown TUTORIAL ON RESOURCE UTILIZATION IN THE COMPUTING PROCESS On the use of generalized executive system software Language selection for applications 0 0 ••• 0 •• 0 0 0 ••• 0 0 ••••••••••••• •• 0 0 •••• INFORMATION NETWORKS-INTERNATIONAL COMMUNICATION SYSTEMS An abstract-A national science and technology information system in Canada ... An abstract-Global networks for information, communications and computers .... 0 0 ••• ••• , , 0 •• 0 0 • 0 •• 0 0 0 •• 0 0 0 0 0 •• 0 0 • 0 ••••••• 0 • INTELLIGENT TERMINALS A position paper-Panel Session on Intelligent terminalsChairman's Introduction .................................. . A position paper-Electronic point-of-sale terminals ............ . Design considerations for knowledge workshop terminals ........ . 
Microprogrammed intelligent satellites for interactive graphics .. . 217 219 221 229 I. W. Cotton Z. Thornton D. C. Engelhart A. van Dam G. Stabler Fourth generation data management systems .................. . Representation of sets on mass storage devices for information retrieval systems ......................................... . 239 V.K.M. Whitney 245 Design of tree networks for distributed data .................... . Specifications for the development of a generalized data base planning system . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 S. T. Byrom W. T. Hardgrave R. G. Casey TRENDS IN DATA BASE MANAGEMENT 259 J. F. Nunamaker, Jr. D. E. Swenson A~B:-~wnifistoh Database sharing-An efficient mechanism for supporting concurrent processes ............................................ . 271 Optimal file allocation in multi-level storage systems ............ . Interaction statistics from a database management system ...... . 277 283 P. F. King A. J. Collmeyer P. P. S. Chen J. D. Krinos 290 W. E. Hanna, Jr. The evolution of virtual machine architecture .................. . 291 An efficient virtual machine implementation .................. . 301 Architecture of virtual machines ............................. . 309 J. P. Buzen U. O. Gagliardi R. J. Srodawa L. A. Bates R. P. Goldberg CONVERSION PROBLEMS An abstract-Utilization oflarge-scale systems ................. . VIRTUAL MACHINES COMPUTER-BASED INTEGRATED DESIGN SYSTEMS The computer aided design environment project COMRADE-An overview ................................................ . Use of COMRADE in engineering design ...................... . The COMRADE executive system ............................ . 319 325 331 The COMRADE data management system .................... . 339 PLEX data structure for integrated ship design ................ . COMRADE data management system storage and retrieval techniques .................................................. . 347 353 The COMRADE design administrative system ................. . 359 A. Bandurski M. Wallace M. Chernick 365 D. A. Davidson 367 H. J. Highland 371 J. Maniotes T.Rhodes J. Brainin R. Tinker L. Avrunin S. Willner A. Bandurski W.Gorman M. Wallace B. Thomson ACADEMIC COMPUTING AT THE JUNIOR/COMMUNITY COLLEGE-PROGRAMS AND PROBLEMS A business data processing curriculum for community colleges .... Computing at the Junior/Community College-Programs and problems ............ " ............ , ...... " ........ " ... . The two year and four year computer technology programs at Purdue University ........................................ . Computing studies at Farmingdale ........................... . Computer education at Orange Coast College-Problems and programs in the fourth phase .................................. . An abstract-Computing at Central Texas College .............. . 379 C. B. Thompson 381 385 R. G. Bise A. W. Ashworth, Jr. 387 395 A. L. Scherr T. F. Wheeler, Jr. 401 407 W. A. Schwomeyer E. Yodokawa 413 H.Chang T. C. Chen C. Tung W.C. Hohn P. D. Jones STORAGE SYSTEMS The design of IBM OS/VS2 release 2 .......................... . IBM OS/VS1-An evolutionary growth system ............... ,. Verification of a virtual storage architecture on a microprogrammed computer ........................................... . On a mathematical model of magnetic bubble logic ............. . The realization of symmetric switching functions using magnetic bubble technology ........................................ . The Control Data STAR-100 paging station .................... . 
421 NATURAL LANGUAGE PROCESSING The linguistic string parser .................................. . 427 A multiprocessing approach to natural language ................ . Progress in natural language understanding-An application to 1unar geology ............................................ . An abstract-Experiments in sophisticated content analysis ..... . An abstract-Modelling English conversations ................. . 435 R. Grishman N. Sager C. Raze B. Bookchin R.Kaplan 441 451 451 W.A. Woods G. R. Martin R. F. Simmons An abstract-The efficiency of algorithms and machines-A survey of the complexity theoretic approach ........................ . An abstract- Hypergeometric group testing algorithms ......... . 452 452 abstract-The file transmission problems .................. . abstract--Analysis of sorting algorithms ................... . abstract-Min-max relations and combinatorial algorithms ... . abstract-The search for fast algorithms ................... . 453 453 453 453 J. Savage S. Lin F.K. Hwang P. Weiner C. L. Liu W. R. Pulleyblank I. Munro DISCRETE ALGORITHMS-APPLICATIONS AND MEASUREMENT An An An An APPLICATIONS OF AUTOMATIC PATTERN RECOGNITION Introduction to the theory of medical consulting and diagnosis .... Pattern recognition with interactive computing for a half dozen clinical applications of health care delivery .................. . Interactive pattern recognition-A designer's tool .............. . Auto-scan-Technique for scanning masses of data to determine potential areas for detailed analysis ......................... . 455 463 479 E. A. Patrick L. Y. L. Shen F. P. Stelmack R. S. Ledley E. J. Simmons, Jr. 485 D. L. Shipman C. R. Fulmer 489 R. S. Ledley H.K. Huang T. J. Golab Y. Kulkarni G. Pence L. S. Rotolo INGREDIENTS OF PATTERN RECOGNITION SPIDAC-Specimen input to digital automatic computer ....... . 497 503 509 R. B. Banerji M.F. Dacey L.Uhr 518 R. Brody 518 518 J. Davis G. Huckell Computer architecture and instruction set design ............... . 519 P. Anagnostopoulos M.J. Michel G. H. Sockut G. M. Stabler A. van Dam A new m-i-R-icompYte-r-Jmulti-pro-G€-SSQ-I"f-w.-theARPA network ..... . 529 A method for the easy storage of discriminant polynomials ....... . A non-associative arithmetic for shapes of channel networks ..... . The description of scenes over time and space .................. . ADVANCED HARDWARE An abstract-Tuning the hardware via a high level language (ALGOL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . An abstract-1O- 5 -10- 7 cent/bit storage media, what does it mean? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . An abstract-Computer on a chip and a network of chips ........ . THE GROWING POTENTIAL OF MINI; SMALL SYSTEMS Data integrity in small real-time computer systems ............. . The design and implementation of a small scale stack processor system .................................................. . Operating system design considerations for microprogrammed mini-computer satellite systems ............................ . 539 F ..E.-HeartS. M. Ornstein W. R. Crowther W. B. Barker T. Harrison T. J. Pierce 545 M. J. Lutz 555 J. E. Stockenberg P. Anagnostopoulos R. E. Johnson R.G.Munck G. M. Stabler A. van Dam 563 563 M. A. Melkanoff B. H. Barnes G. L. Engel. M. A. Melkanoff 565 569 581 589 603 F. A. Stahl G. E. Mellen 1. S. Reed R. Turn C. H. Meyer More effective computer packages for applications ............. . 607 EASYSTAT -An easy-to-use statistics package ................ . 
ACID-A user-oriented system of statistical programs .......... . 615 621 W. B. Nelson M. Phillips L. Thumhart A. B. Tucker R.A. Baker T. A. Jones A GRADUATE PROGRAM IN COMPUTER SCIENCE An abstract-Another attempt to define computer science ....... . An abstract-The master's degree program in computer science .. . CRYPTOLOGY IN THE AGE OF AUTOMATION A homophonic cipher for computational cryptography .......... . Cryptology, computers and common sense ..................... . Information theory and privacy in data banks ................. . Privacy transformations for data banks ....................... . Design considerations for cryptography ....................... . DESIGN AND DEVELOPMENT OF APPLICATION PACKAGES FOR USERS A DAY WITH GRAPHICS Graphics Applications I Graphics and Engineering-Computer generated color-sound movies .................................................. . 625 L. Baker 629 R. Notestine 635 639 C. M. Williams C. Newton 643 R. Resch 651 W.W.Newman 657 R. C. Gammill 663 677 685 N. Negroponte J. Franklin I. Sutherland Packet switching with satellites .............................. . Packet switching in a slotted satellite channel .................. . 695 703 N.Abramson L. Kleinrock S.S.Lam Dynamic allocation of satellite capacity through packet reservation ..... , ............................................... . 711 L. G. Roberts Chairman's introduction-Opposing views .................... . The future of computer and communications services ........... . Social impacts of the multinational computer .................. . 717 723 735 A new NSF thrust-Computer impact on society ............... . 747 M. Turoff L. H. Day B. Nanus L. M. Wooten H. Borko P. G. Lykos Graphics computer-aided design in aerospace .................. . Graphics and digitizing-Automatic transduction of drawings into data bases ............................................... . Graphics in medicine and biology ............................ . Graphic Applications II Graphics and art-The topological design of sculptural and architectural systems .......................................... . Graphics and education-An informal graphics system based on the LOGO language ....................................... . Graphics and interactive systems-Design considerations of a . software system .......................................... . Graphics and architecture-Recent developments in sketch recognition .............................................. . Graphics and electronic circuit analysis ....................... . Graphics in 3D-Sorting and the hidden surface problem ....... . SATELLITE PACKET COMMUNICATIONS VIEWS OF THE FUTURE-I VIEW OF THE FUTURE-II The impact of technology on the future state of information technology enterprise ..................................... . The home reckoner- --A scenario on the home use of computer::; ... . 751 759 What's in the cards for data entry? ........................... . 765 L. A. Friedman C. A. R. Kagan L. G. Schear G. Bernstein 773 J. R. Norsworthy 781 K. D. Leeper What is different about tactical military operational programs .... What is different about the hardware in tactical military systems .. 787 797 What is different about tactical military languages and compilers. What is different about tactical executive systems .............. . 807 811 G. G. Chapin E. C. Svendsen D.L.Ream R.J. Rubey W. C. Phillips ENVIRONMENTAL QUALITY AND THE COMPUTER Assessing the regional impact of pollution control-A simulation approach ....................... , ...... , ................... 
An automated system for the appraisal of hydrocarbon producing properties

WHAT'S DIFFERENT ABOUT TACTICAL MILITARY COMPUTER SYSTEMS

Linguistics and the future of computation

by DAVID G. HAYS
State University of New York
Buffalo, New York

My subject is the art of computation: computer architecture, computer programming, and computer application. Linguistics provides the ideas, but the use I make of them is not the linguist's use, which would be an attempt at understanding the nature of man and of human communication, but the computer scientist's use. In ancient India, the study of language held the place in science that mathematics has always held in the West. Knowledge was organized according to the best known linguistic principles. If we had taken that path, we would have arrived today at a different science. Our scholarship draws its principles from sources close to linguistics, to be sure, but our science has rather limited itself to a basis in Newtonian calculus. And so a chasm separates two cultures.

The scientific reliance on calculus has been productive. Often understood as a demand for precision and rigor, it has simultaneously made theoreticians answerable to experimental observation and facilitated the internal organization of knowledge on a scale not imagined elsewhere in human history. Very likely, a reliance on linguistic laws for control of science during the same long period would have been less successful, because the principles of linguistic structure are more difficult to discover and manipulate than the principles of mathematical structure; or so it seems after two thousand years of attention to one and neglect of the other. How it will seem to our descendants a thousand years hence is uncertain; they may deem the long era of Western study of mathematical science somewhat pathological, and wonder why the easy, natural organization of knowledge on linguistic lines was rejected for so many centuries. However that may be, the prospect for the near term is that important opportunities will be missed if linguistic principles continue to be neglected. Linguistics is enjoying a period of rapid growth, so a plethora of ideas await new uses; the computer makes it possible to manipulate even difficult principles. Traditional mathematics seems not to say how computers much beyond the actual state of the art can be organized, nor how programs can be made much more suitable to their applications and human users, nor how many desirable fields of application can be conquered. I think that linguistics has something to say.

THREE LINGUISTIC PRINCIPLES

Since I cannot treat the entire field of linguistics, I have chosen to sketch three principles that seem most basic and far-reaching. Two of them are known to every linguist and applied automatically to every problem that arises. The third is slightly less familiar; I have begun a campaign to give it due recognition. As everyone knows, the capacity for language is innate in every human specimen, but the details of a language are acquired by traditional transmission, from senior to younger. As everyone knows, language is a symbolic system, using arbitrary signs to refer to external things, properties, and events. And, as everyone certainly knows by now, language is productive or creative, capable of describing new events by composition of sentences never before uttered. Hockett9 and Chomsky3 explain these things. But of course these are not principles; they are problems for which explanatory principles are needed.

My first principle is stratification.12 This principle is often called duality of patterning, although in recent years the number of levels of patterning has grown. The original observation is that language can be regarded as a system of sounds or a system of meaningful units; both points of view are essential. One complements the other without supplanting it. Phonology studies language as sound. It discovers that each of the world's languages uses a small alphabet of sounds, from a dozen to four times that many, to construct all its utterances. The definition of these unit sounds is not physical but functional. In one language, two sounds with physically distinct manifestations are counted as functionally the same; speakers of this language do not acquire the ability to distinguish between the sounds, and can live out their lives without knowing that the sounds are unlike. English has no use for the difference between [p] and [p'], the latter having a little puff of air at the end, yet both occur: [p] in spin, [p'] in pin. Since other languages, notably Thai, use this difference to distinguish utterances, it is humanly possible not only to make the two forms of /p/ but also to hear it. Thus languages arbitrarily map out their alphabets of sounds.10

Languages also differ in the sequences of sounds that they permit. In Russian, the word vzbalmoshnyj 'extravagant' is reasonable, but the English speaker feels that the initial sequence /vzb/ is extravagant, because initial /v/ in English is not followed by another consonant, and furthermore initial /z/ is not. The Russian word violates English rules, which is perfectly satisfactory to Russian speakers, because they are unacquainted with English restrictions.
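To make the phonotactic point concrete, a minimal sketch follows. The segment inventory and the two onset rules are invented stand-ins, not a description of English phonology; the sketch only shows how such 'rules of sequence' over an arbitrary alphabet of functional sounds can be checked mechanically.

```python
# A minimal sketch of phonotactic checking: a "language" here is an arbitrary
# alphabet of functional sounds plus rules about permissible initial clusters.
# The inventory and rules are toy examples, not real phonologies.

ENGLISH_ONSET_RULES = [
    # (description, predicate over the initial consonant cluster)
    ("initial /v/ may not be followed by another consonant",
     lambda cluster: not (cluster.startswith("v") and len(cluster) > 1)),
    ("initial /z/ may not be followed by another consonant",
     lambda cluster: not (cluster.startswith("z") and len(cluster) > 1)),
]

VOWELS = set("aeiou")

def initial_cluster(word: str) -> str:
    """Return the run of consonants before the first vowel."""
    cluster = []
    for segment in word:
        if segment in VOWELS:
            break
        cluster.append(segment)
    return "".join(cluster)

def violations(word: str, rules) -> list:
    cluster = initial_cluster(word)
    return [desc for desc, ok in rules if not ok(cluster)]

if __name__ == "__main__":
    for word in ["spin", "pin", "vzbalmoshnyj"]:
        problems = violations(word, ENGLISH_ONSET_RULES)
        verdict = "acceptable" if not problems else "; ".join(problems)
        print(f"{word}: {verdict}")
```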
As Robert Southey put it, speaking of a Russian,

And last of all an Admiral came,
A terrible man with a terrible name,
A name which you all know by sight very well
But which no one can speak, and no one can spell.
(Robert Southey, 'The March to Moscow.')

Phonology, with its units and rules of combinations, is one level of patterning in language. That languages are patterned on a second level is so well known as to require little discussion. English puts its subject first, verb second, object last-in simple sentences. Malagasy, a language of Madagascar, puts the verb first, then the object, last the subject.11 Other orders are found elsewhere. English has no agreement in gender between nouns and adjectives, but other languages such as French and Russian, Navaho and Swahili, do; nor is gender controlled by semantics, since many gender classes are without known semantic correlation. Gender is as arbitrary as the English rejection of initial /vzb/. The units that enter into grammatical patterns are morphemes; each language has its own stock, a vocabulary that can be listed and found once more to be arbitrary. It seems true that some color names are universal-needed in all languages to symbolize genetic capacities-but other color names are also coined, such as the English scarlet and crimson, on arbitrary lines.1,8

The existence of a third level of symbolic patterning is best shown by psychological experiments. Memory for a story is good, but not verbatim.
Only the shortest stretches of speech can be remembered word for word; but the ideas in quite a long stretch can be recited after only one hearing if the hearer is allowed to use his own words and grammatical structures.5 The comparison of pictures with sentences has been investigated by several researchers; they use models in which below is coded as not above, forget is coded as not remember, and so on, because they need such models to account for their subjects' latencies (times to respond, measured in milliseconds). Using such models, they can account for the differences between times of response to a single picture, described with different sentences, to an impressive degree of precision.2

Each level of symbolic patterning should have both units and rules of construction. On the phonological level the units are functional sounds; the rules are rules of sequence, for the most part. On the grammatical level the units are morphemes and the rules are the familiar rules of sequence, agreement, and so on. On the third level, which can be called semological or cognitive, the units are often called sememes; the morpheme 'forget' corresponds to the sememes 'not' and 'remember'. The rules of organization of this level have not been investigated adequately. Many studies of paradigmatic organization have been reported, sometimes presenting hierarchical classifications of items (a canary is a bird, a dog is a quadruped, etc.), but this is only one of several kinds of organization that must exist. Classification patterns are not sentences, and there must be sentences of some kind on the semological level. Chomsky's deep structures might be suitable, but Fillmore6 and McCawley13 have proposed different views. What is needed is rapidly becoming clearer, through both linguistic and psychological investigations. The relations that help explain grammar, such as subject and object, which control sequence and inflection, are not the relations that would help most in explaining the interpretation of pictures or memory for stories; for such purposes, notions of agent, instrument, and inert material are more suitable. But the organization of these and other relations into a workable grammar of cognition is unfinished.
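A small sketch of such semological decomposition follows. The tiny lexicon pairing morphemes like forget and below with sememes such as 'not' and 'remember' is invented for the illustration; it is not any of the cited experimental models, only a demonstration that paraphrases can be compared at the sememe level.

```python
# Toy semological lexicon: morphemes decomposed into sememes, in the spirit
# of 'forget' -> 'not' + 'remember'. The entries are illustrative inventions.

SEMEME_LEXICON = {
    "forget":   ("not", "remember"),
    "remember": ("remember",),
    "below":    ("not", "above"),
    "above":    ("above",),
    "not":      ("not",),
}

def decompose(morphemes):
    """Flatten a sequence of morphemes into its sememes."""
    sememes = []
    for m in morphemes:
        sememes.extend(SEMEME_LEXICON.get(m, (m,)))
    return tuple(sememes)

def same_content(a, b):
    """Two wordings count as equivalent if their sememe strings match
    after cancelling doubled negations."""
    def cancel(seq):
        out = []
        for s in seq:
            if out and out[-1] == "not" and s == "not":
                out.pop()          # 'not not X' collapses to 'X'
            else:
                out.append(s)
        return tuple(out)
    return cancel(decompose(a)) == cancel(decompose(b))

if __name__ == "__main__":
    print(same_content(["forget"], ["not", "remember"]))   # True
    print(same_content(["below"], ["not", "above"]))        # True
    print(same_content(["not", "forget"], ["remember"]))    # True
    print(same_content(["forget"], ["remember"]))           # False
```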
Up to this point I have been arguing only that language is stratified, requiring not one but several correlated descriptions. Now I turn to my second principle, that language is internalized. Internalization is a mode of storage in the brain, intermediate between innateness and learning. Concerning the neurology of these distinctions I have nothing to say. Their functional significance is easy enough to identify, however. What is innate is universal in mankind, little subject to cultural variation. Everyone sees colors in about the same way, unless pathologically color blind. Everyone on earth has three or more levels of linguistic patterning. The distinctions among things (nouns), properties (adjectives), and events (verbs) are so nearly universal as to suggest that this threefold organization of experience is innate. To have grammar is universal, however much the grammars of particular languages vary. The innate aspects of thought are swift, sure, and strong. What is internalized is not the same in every culture or every person. But whatever a person internalizes is reasonably swift, sure, and strong; less than what is innate, more than what is learned.

Besides the mechanisms of linguistic processing, various persons internalize the skills of their arts and crafts; some internalize the strategies of games; and all internalize the content of at least some social roles. The contrast between learning and internalization is apparent in knowledge of a foreign language. A person who has learned something of a foreign language without internalization can formulate sentences and manage to express himself, and can understand what is said to him, although slowly and with difficulty. A person who has internalized a second language is able to speak and understand with ease and fluency. Similarly in games, the difference between a master and a novice at chess, bridge, or go is apparent. But something more of the difference between learning and internalization is also to be seen here. The novice has a more detailed awareness of how he is playing; he examines the board or the cards step by step, applying the methods he has learned, and can report how he arrives at his decision. Awareness goes with learned skills, not with internalized abilities. Internalized abilities are the basis of more highly organized behavior. The high-school learner of French is not able to think in French, nor is the novice in chess able to construct a workable strategy for a long sequence of moves. When a language has been internalized, it becomes a tool of thought; when a chess player has internalized enough configurations of pieces and small sequences of play, he can put them together into strategies. The musician internalizes chords, melodies, and ultimately passages and whole scores; he can then give his attention to overall strategies, making his performance lyrical, romantic, martial, or whatever.

What makes learning possible is the internalization of a system for the storage and manipulation of symbolic matter. If a person learns a story well enough to tell it, he uses the facilities of symbolic organization-his linguistic skills-to hold the substance of the story. Much of the content of social roles is first learned in this way; the conditions of behavior and the forms of behavior are learned symbolically, then come to be, as social psychologists put it, part of the self-that is, internalized. In fact, the conversion of symbolic learned material into internalized capacities is a widespread and unique fact of human life. It must be unique, since the symbol processing capacity required is limited to man. This ability gives man a great capacity for change, for adaptation to different cultures, for science: by internalizing the methods of science, he becomes a scientist. The amount that a person internalizes in a lifetime is easy to underestimate. A language has thousands of morphemes and its users know them. Certainly their semantic and grammatical organization requires tens-more plausibly hundreds-of thousands of linkages. A high skill such as chess takes internalized knowledge of the same order of magnitude, according to Simon and Barenfeld.15

I think that internalized units are more accurately conceived as activities than as inert objects. All tissue is metabolically active, including the tissue that supports memory. Memory search implies an activity, searching, in an inactive medium, perhaps a network of nodes and arcs. More fruitfully we can imagine memory as a network of active nodes with arcs that convey their activity from one to another. A morpheme, then, is an activity seeking at all times the conditions of its application.
My third principle in linguistics is the principle of metalinguistic organization. Language is generally recognized as able to refer to itself; one can mention a word in order to define it, or quote a sentence in order to refute it. A very common occurrence in grammar is the embedding3 of one sentence within another. A sentence can modify a word in another sentence, as a relative clause: The boy who stole the pig ran away. A sentence can serve as the object of a verb of perception, thought, or communication: I saw him leave. I know that he left. You told me that she had left. And two sentences can be embedded in temporal, spatial, or causal relation: He ran away because he stole the pig. He stole the pig and then ran away. He is hiding far from the spot where he stole the pig. An embedded sentence is sometimes taken in the same form as if it were independent, perhaps introduced by a word like the English that, and sometimes greatly altered in form as in his running away.

The definition of abstract terms can be understood by metalingual linkages in cognitive networks. The definition is a structure, similar to the representation of any sentence or story in a cognitive network. The structure is linked to the term it defines, and the use of the term governed by the content of the structure. Science and technology are replete with terms that cannot be defined with any ease in observation sentences; they are defined, I think, through metalingual linkages. What kind of program is a compiler? What kind of device can correctly be called heuristic? These questions can be answered, but useful answers are complicated stories about the art of programming, not simple statements of perceptual conditions, and certainly not classificatory statements using elementary features such as human, male, or concrete.

I can indicate how vast a difference there is between metalingual operations and others by proposing that all other operations in cognitive networks are performed by path tracing using finite-state automata, whereas metalingual operations are performed by pattern matching using pushdown automata. These two systems differ in power; a finite-state machine defines a regular language and a pushdown automaton defines a context-free language. A path in a cognitive network is defined as a sequence of nodes and arcs; to specify a path requires only a list of node and arc types, perhaps with mention that some are optional, some can be repeated. A more complex form of path definition could be described, but I doubt that it would enhance the effectiveness of path tracing procedures. In Quillian's work, for example, one needs only simple path specifications to find the relation between lawyer and client (a client employs a lawyer). To know that a canary has wings requires a simple form of path involving paradigmatic (a canary is a kind of bird) and syntagmatic relations (a bird has wings). The limiting factor is not the complexity of the path that can be defined from one node to another, but the very notion that node-to-node paths are required.4,14
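A minimal sketch of path tracing of this kind follows. The miniature network and the path specifications are invented in the spirit of the Quillian examples above; the point is only that a short list of arc types, some of them repeatable, suffices to answer such questions.

```python
# Path tracing in a toy cognitive network: nodes joined by labelled arcs, and
# a path specification given as a list of arc labels, some repeatable. The
# network is an invented miniature (a canary is a kind of bird; a bird has
# wings; a client employs a lawyer).

NETWORK = {
    ("canary", "is-a"): ["bird"],
    ("bird",   "is-a"): ["animal"],
    ("bird",   "has"):  ["wings"],
    ("client", "employs"): ["lawyer"],
}

def trace(start, path_spec):
    """Follow a path specification from a start node.

    path_spec is a list of (arc_label, repeatable) pairs; a repeatable arc may
    be followed zero or more times before moving on. Returns the reachable
    end nodes."""
    frontier = {start}
    for label, repeatable in path_spec:
        reached = set(frontier) if repeatable else set()
        queue = list(frontier)
        while queue:
            node = queue.pop()
            for nxt in NETWORK.get((node, label), []):
                if nxt not in reached:
                    reached.add(nxt)
                    if repeatable:
                        queue.append(nxt)   # keep following the same arc type
        frontier = reached
    return frontier

if __name__ == "__main__":
    # Does a canary have wings?  Any number of is-a arcs, then one has arc.
    print("wings" in trace("canary", [("is-a", True), ("has", False)]))   # True
    # What relates a client to a lawyer?  A single 'employs' arc.
    print(trace("client", [("employs", False)]))                          # {'lawyer'}
```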
Pattern matching means fitting a template. The patterns I have in mind are abstract, as three examples will show. The first, familiar to linguists, is the determination of the applicability of a grammatical transformation. The method, due to Chomsky, is to write a template called a structure description. The grammatical structure of any sentence is described by some tree; if the template fits the tree of a certain sentence, then the transformation applies to it, yielding a different tree. These templates contain symbols that can apply to nodes in grammatical trees, and relations that can connect the nodes. Chomsky was, I think, the first to recognize that some rules of grammar can be applied only where structure is known; many phenomena in language are now seen to be of this kind. Thus linguists today ask for tree-processing languages and cannot do with string processors.

My second example is the testing of a proof for the applicability of a rule of inference. A proof has a tree structure like the structure of a sentence; whether a rule of inference can be applied has to be tested by reference to the structure. 'If p and q then p' is a valid inference, provided that in its application p is one of the arguments of the conjunction; one cannot assert that p is true just because p, q, and a conjunction symbol all occur in the same string.

Finally, I come to metalingual definition. The definition is itself a template. The term it defines is correctly used in contexts where the template fits. As in the first two examples, the template is abstract. A structure description defines a class of trees; the transformation it goes with applies to any tree in the class. A rule of inference defines a class of proofs; it applies to each of them. And a metalingual definition defines a class of contexts, in each of which the corresponding term is usable. Charity has many guises; the story-template that defines charity must specify all of the relevant features of charitable activity, leaving the rest to vary freely.

When the difference in power between finite-state and context-free systems was discovered, it seemed that this difference was a fundamental reason for preferring context-free grammars in the study of natural language. Later it became evident that the need to associate a structural description with each string was more important, since context-free grammars could do so in a natural way and finite-state automata could not. Today linguists and programmers generally prefer the form of context-free rules even for languages known to be finite state, just because their need for structure is so urgent. It may prove the same with pattern matching. In proofs, in transformations, and in definitions it is necessary to mark certain elements: the conclusions of inferences, the elements moved or deleted by transformation, and the key participating elements in definition. (The benefactor is charitable, not the recipient.) Until I see evidence to the contrary, however, I will hold the view that pattern matching is more powerful than path tracing. Pattern matching is, surely, a reflective activity in comparison with path tracing. To trace a path through a maze, one can move between the hedges, possibly marking the paths already tried with Ariadne's thread. To see the pattern requires rising above the hedges, looking down on the whole from a point of view not customarily adopted by the designers of cognitive networks. That is, they often take such a point of view themselves, but they do not include in their systems a component capable of taking such a view.
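The template idea can likewise be sketched. The tree encoding and the sample structure description below are an invented mini-notation, not Chomsky's; they demonstrate only that the template is checked against structure rather than against the string.

```python
# A toy structure-description matcher: grammatical structure as nested
# (label, children...) tuples, and a template of the same shape in which "*"
# matches any single subtree. This is an invented mini-notation, meant only
# to show pattern matching against trees rather than strings.

def matches(template, tree):
    if template == "*":                      # wildcard: any subtree
        return True
    if isinstance(template, str) or isinstance(tree, str):
        return template == tree              # leaf labels must agree
    if len(template) != len(tree) or template[0] != tree[0]:
        return False                         # same node label, same arity
    return all(matches(t, s) for t, s in zip(template[1:], tree[1:]))

# "The boy who stole the pig ran away": a relative clause embedded in an NP.
SENTENCE = ("S",
            ("NP", ("DET", "the"), ("N", "boy"),
                   ("S-REL", ("NP", "who"),
                             ("VP", ("V", "stole"),
                                    ("NP", ("DET", "the"), ("N", "pig"))))),
            ("VP", ("V", "ran"), ("PRT", "away")))

# Template: an NP of any content that carries an embedded relative clause.
HAS_RELATIVE_CLAUSE = ("NP", "*", "*", ("S-REL", "*", "*"))

def any_subtree_matches(template, tree):
    if matches(template, tree):
        return True
    if isinstance(tree, str):
        return False
    return any(any_subtree_matches(template, child) for child in tree[1:])

if __name__ == "__main__":
    print(any_subtree_matches(HAS_RELATIVE_CLAUSE, SENTENCE))   # True
```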
COMPUTER ARCHITECTURE

I turn now to the art of computation, and ask what kind of computer might be constructed which followed the principles of stratification, internalization, and metalingual operation. Such a computer will, I freely admit, appear to be a special-purpose device in comparison with the general-purpose machines we know today. The human brain, on close inspection, also begins to look like a special-purpose machine. Its creativity is of a limited kind, yet interesting nevertheless. The prejudice in favor of mathematics and against linguistics prefers the present structure of the computer; but a special-purpose machine built to linguistic principles might prove useful for many problems that have heretofore been recalcitrant.

Stratification is not unknown in computation, but it deserves further attention. The difficulties of code translation and data-structure conversion that apparently still exist in networks of different kinds of computers and in large software systems that should be written in a combination of programming languages are hard to take. The level of morphemics in language is relatively independent of both cognition and phonology. In computer architecture, it should be possible to work with notational schemes independent of both the problem and the input-output system. Whether this level of encoding both data and their organization should be the medium of transmission, or specific to the processor, I do not know. But it is clear that translators should be standard hardware items, their existence unknown in high-level languages. Sophistication in design may be needed, but the problems seem not insurmountable, at least for numerical, alphabetic, and pictorial data. The design of translators for data structures is trickier, and may even prove not to be possible on the highest level.

The lesson to be learned from the separation of grammar and cognition is more profound. Language provides a medium of exchange among persons with different interests and different backgrounds; how they will understand the same sentence depends on their purposes as well as their knowledge. Much difficulty in computer programming apparently can be traced to the impossibility of separating these two levels in programming languages. Programs do not mean different things in different contexts; they mean the same thing always. They are therefore called unambiguous, but a jaundiced eye might see a loss of flexibility along with the elimination of doubt. Many simple problems of this class have been solved; in high-level languages, addition is generally not conditioned by data types, even if the compiler has to bring the data into a common type before adding. More difficult problems remain. At Buffalo, Teiji Furugori is working on a system to expand driving instructions in the context of the road and traffic. He uses principles of safe driving to find tests and precautions that may be needed, arriving at a program for carrying out the instruction safely. Current computer architecture is resistant to this kind of work; it is not easy to think of a program on two levels, one of them providing a facility for expanding the other during execution. An interpreter can do something of the sort; but interpretive execution is a high price to pay. If computer hardware provided for two simultaneous monitors of the data stream, one executing a compiled program while the other watched for situations in which the compiled version would be inadequate, the separation of morphemics and cognition might better be realized.
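One way to picture the two simultaneous monitors is sketched below. The compiled routine, the watcher's conditions, and the expanded fallback are all invented stand-ins, loosely echoing the driving-instruction example; the sketch shows the division of labour, not a hardware design.

```python
# Sketch of the two-monitor idea: a compiled routine handles the ordinary case
# at full speed, while a second monitor watches the same data stream for
# situations the compiled version does not cover and diverts them to a slower,
# more deliberate expansion. Routines and conditions are invented examples.

def compiled_drive_instruction(step):
    # the fast, fixed interpretation of "proceed through the intersection"
    return f"proceed through {step['intersection']}"

def watcher_flags(step):
    # conditions under which the compiled version would be inadequate
    flags = []
    if step.get("visibility") == "poor":
        flags.append("slow down and check cross-traffic")
    if step.get("pedestrians"):
        flags.append("yield to pedestrians")
    return flags

def expanded_drive_instruction(step, flags):
    # the slower path: expand the instruction with the needed precautions
    precautions = "; ".join(flags)
    return f"{precautions}; then proceed through {step['intersection']}"

def execute(step):
    flags = watcher_flags(step)          # second monitor on the same input
    if not flags:
        return compiled_drive_instruction(step)
    return expanded_drive_instruction(step, flags)

if __name__ == "__main__":
    print(execute({"intersection": "Main St"}))
    print(execute({"intersection": "Elm St", "visibility": "poor",
                   "pedestrians": True}))
```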
In teaching internalization to students who are mainly interested in linguistics, I use microprogramming as an analogy. What can be seen in the opposite direction is the fantastic extent to which microprogramming might be carried, with corresponding improvement in performance. If the meaning of every word in a language (or a large fraction of the words) is internalized by its users, then one may hope that microprogramming of a similar repertory of commands would carry possibilities for the computer somewhat resembling what the speaker gains, to wit, speed. A computer could easily be built with a repertory of 10,000 commands. Its manual would be the size of a desk dictionary; the programmer would often find that his program consisted of one word, naming an operation, followed by the necessary description of a data structure. Execution would be faster because of the intrinsically higher speed of the circuitry used in microprogramming. Even if some microprograms were mainly executive, making numerous calls to other microprograms, overall speed should be increased. At one time it would have been argued that the art could not supply 10,000 widely used commands, but I think that time is past. If someone were inclined, I think he could study the literature in the field and arrive at a list of thousands of frequently used operations.

Parallel processing adds further hope. If a computer contains thousands of subcomputers, many of them should be operating at each moment. Even the little we know about the organization of linguistic and cognitive processing in the brain suggests how parallel processing might be used with profit in systems for new applications. A morphemic unit in the brain seems to be an activity, which when successful links a phonological string with one or more points in a cognitive network. If these units had to be tested sequentially, or even by binary search, the time to process a sentence would be great. Instead all of them seem to be available at all times, watching the input and switching from latency to arousal when the appropriate phonological string appears. If each were a microprogram, each could have access to all input. Conflicts inevitably arise, with several units aroused at the same time. Grammar serves to limit these conflicts; a grammatical unit is one with a combination of inputs from morphemic units. When a morphemic unit is aroused, it signals its activation to one or several grammatical units. When a proper combination of morphemic units is aroused, the grammatical unit is in turn aroused and returns feedback to maintain the arousal of the morphemic unit, which is thereupon enabled to transmit also to the cognitive level. Thus the condition for linkage between phonology and cognition is a combination of grammatical elements that amounts to the representation of a sentence structure. This is Lamb's model of stratal organization, and shows how grammar reduces lexical ambiguity. The problem it poses for computer architecture is that of interconnection; unless the morphemic units (like words) and the grammatical units (like phrase-structure rules) are interconnected according to the grammar of a language, nothing works. The computer designer would prefer to make his interconnections on the basis of more general principles; but English is used so widely that a special-purpose computer built on the lines of its grammar would be acceptable to a majority of the educated persons in the world-at least, if no other were on the market. A similar architecture could be used for other purposes, following the linguistic principle but not the grammar of a natural language.
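A toy simulation in the spirit of this arousal-and-feedback account follows. The vocabulary, the single grammatical pattern, and the bookkeeping are invented; the sketch shows only the flow: morphemic units watching the input in parallel, a grammatical unit aroused by a combination of them, and feedback licensing transmission to the cognitive level.

```python
# Toy version of the stratal model sketched above: every morphemic unit
# watches the whole phonological input at once; a grammatical unit is aroused
# by a combination of aroused morphemic units and feeds arousal back, which
# licenses transmission to the cognitive stratum. Vocabulary and the single
# grammatical pattern are invented for illustration.

MORPHEMIC_UNITS = {
    "the":  {"category": "DET",  "sense": "definite"},
    "pig":  {"category": "NOUN", "sense": "PIG"},
    "ran":  {"category": "VERB", "sense": "RUN(past)"},
    "pen":  {"category": "NOUN", "sense": "PEN"},
}

GRAMMATICAL_UNITS = [
    {"name": "clause", "needs": ["DET", "NOUN", "VERB"]},   # crude S -> DET NOUN VERB
]

def process(phonological_string):
    # every morphemic unit sees the input simultaneously
    aroused = {form: unit for form, unit in MORPHEMIC_UNITS.items()
               if form in phonological_string}

    transmitted = []
    for g in GRAMMATICAL_UNITS:
        present = {u["category"] for u in aroused.values()}
        if set(g["needs"]).issubset(present):
            # feedback: maintain arousal of the contributing morphemic units,
            # which are thereupon enabled to transmit to the cognitive level
            for form, unit in aroused.items():
                if unit["category"] in g["needs"]:
                    transmitted.append(unit["sense"])
    return transmitted

if __name__ == "__main__":
    print(process(["the", "pig", "ran"]))    # senses reach the cognitive level
    print(process(["pig", "pen"]))           # no clause pattern, nothing transmitted
```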
Ware16 mentions picture processing and other multidimensional systems as most urgently needing increased computing speed. Models of physiology, of social and political systems, and of the atmosphere and hydrosphere are among these. Now, it is in the nature of the world as science knows it that local and remote interactions in these systems are on different time scales. A quantum of water near the surface of a sea is influenced by the temperature and motion of other quanta of water and air in its vicinity; ultimately, but in a series of steps, it can be influenced by changes at remote places. Each individual in a society is influenced by the persons and institutions close to him in the social structure. Each element of a picture represents a portion of a physical object, and must be of a kind to suit its neighbors. To be sure, certain factors change simultaneously on a wide scale. If a person in a picture is wearing a striped or polka-dotted garment, the recognition of the pattern can be applied to the improvement of the elements throughout the area of the garment in the picture. A new law or change in the economy can influence every person simultaneously. Endocrine hormones sweep through tissue rapidly, influencing every point almost simultaneously. When a cloud evaporates, a vast area is suddenly exposed to a higher level of radiation from the sun.

These situations are of the kind to make stratification a helpful mode of architecture. Each point in the grid of picture, physiological organism, society, or planet is connected with its neighbors on its own stratum and with units of wide influence on other strata; it need not be connected with remote points in the same stratum. Depending on the system, different patterns of interaction have to be admitted. Clouds are formed, transported, and evaporated. Endocrine glands, although they vary in their activity, are permanent, as are governments. Both glands and governments do suffer revolutionary changes within the time spans of useful simulations. In a motion picture, objects enter and depart. How the elements of the first stratum are to be connected with those of the next is a difficult problem. It is known that the cat's brain recognizes lines by parallel processing; each possible line is represented by a cell or cells with fixed connections to certain retinal cells. But this does not say how the cat recognizes an object composed of several lines that can be seen from varying orientations. Switching seems unavoidable in any presently conceivable system to connect the level of picture elements with the level of objects, to connect the level of persons with the level of institutions, to connect the elements of oceans with the level of clouds, or to connect the elements of the morphemic stratum with the level of cognition in linguistic processing. In computation, it seems that path tracing should be implicit, pattern matching explicit. The transmission of activity from a unit to its neighbors, leading to feedback that maintains or terminates the activity of the original unit, can be understood as the formation of paths. Something else, I think, happens when patterns are matched.

A typical application of a linguistically powerful computer would be the discovery of patterns in the user's situation.
The user might be a person in need of psychiatric or medical help; an experimenter needing theoretical help to analyze his results and formulate further experiments; a lawyer seeking precedents to aid his clients; or a policy officer trying to understand the activities of an adversary. In such cases the user submits a description of his situation and the computer applies a battery of patterns to it. The battery would surely have to be composed of thousands of possibilities to be of use; with a smaller battery, the user or a professional would be more helpful than the computer. If the input is in natural language, I assume that it is converted into a morphemic notation, in which grammatical relations are made explicit, as a first step. On the next level are thousands of patterns, each linked metalingually to a term; the computer has symbolic patterns definitive of charity, ego strength, heuristics, hostility, and so on. Each such pattern has manifold representations on the morphemic stratum; these representations may differ in their morphemes and in the grammatical linkages among them. Some of these patterns, in fact, cannot be connected to the morphemic stratum directly with any profit whatsoever, but must instead be linked to other metalingual patterns and thence ultimately to morphemic representations. In this way the cognitive patterns resemble objects in perception that must be recognized in different perspectives.

Grammatical theory suggests an architecture for the connection of the strata that may be applicable to other multistratal systems. The two strata are related through a bus; the object on the lower stratum is a tree which reads onto the bus in one of the natural linearizations. All of the elements of all of the patterns on the upper stratum are connected simultaneously to the bus and go from latent to aroused when an element of their class appears; these elements include both node and arc labels. When the last item has passed, each pattern checks itself for completeness; all patterns above a threshold transmit their arousal over their metalingual links to the terms they define. Second-order patterns may come to arousal in this way, and so on.
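The battery-of-patterns mechanism can be sketched as follows. The situation elements, the two first-order patterns, the threshold, and the metalingual link from pattern to term are all invented; the sketch shows only completeness checking, arousal above threshold, and second-order patterns stated over aroused terms.

```python
# Toy pattern battery over a "bus" of elements: a situation description is
# streamed past all patterns at once; each pattern that is sufficiently
# complete arouses, over its metalingual link, the term it defines, and
# second-order patterns can then match on those terms. All patterns, terms,
# and thresholds here are invented for illustration.

FIRST_ORDER = {
    # term defined        elements of the defining template
    "charity":   {"gives", "benefactor", "recipient", "no-return-expected"},
    "hostility": {"harms", "agent", "target", "intent"},
}

SECOND_ORDER = {
    # a pattern stated over terms aroused at the first level
    "benevolence": {"charity"},
}

THRESHOLD = 0.75   # fraction of a template that must be present on the bus

def aroused_terms(bus_elements, patterns, threshold=THRESHOLD):
    aroused = set()
    for term, template in patterns.items():
        completeness = len(template & bus_elements) / len(template)
        if completeness >= threshold:
            aroused.add(term)          # metalingual link: pattern to term
    return aroused

def analyse(situation):
    bus = set(situation)
    first = aroused_terms(bus, FIRST_ORDER)
    second = aroused_terms(first, SECOND_ORDER, threshold=1.0)
    return first | second

if __name__ == "__main__":
    situation = ["benefactor", "gives", "recipient",
                 "no-return-expected", "anonymously"]
    print(analyse(situation))    # {'charity', 'benevolence'}
```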
If this model has any validity for human processing, it brings us close to the stage at which awareness takes over. In awareness, conflicts are dealt with that cannot be reduced by internalized mechanisms. The chess player goes through a few sequences of moves to see what he can accomplish on each of them; the listener checks out those occasional ambiguities that he notices, and considers the speaker's purposes, the relevance of what he has heard to himself, and so on. The scientist compares his overall theoretical views with the interpretations of his data as they come to mind and tries a few analytic tricks. In short, this is the level at which even a powerful computer might open a dialogue with the user.

Would a sensible person build a computer with architecture oriented to a class of problems? I think so, in a few situations. Ware listed some problems for which the payoff function varies over a multibillion-dollar range: foreign policy and arms control, weather and the environment, social policy, and medicine. With such payoffs, an investment of even a large amount in a more powerful computer might be shown to carry a sufficient likelihood of profit to warrant a gamble. In the case of language-oriented architecture, it is not hard to develop a composite market in which the users control billions of dollars and millions of lives with only their own brains as tools to link conceptualization with data. A president arrives at the moment of decision, after all the computer simulations and briefings, with a yellow pad and a pencil; to give him a computer which could help his brain through multistratal and metalingual linkages of data and theories would be worth a substantial investment.

Can these applications be achieved at optimal levels without specialized architecture? I doubt it. Parallel processing with general-purpose computers linked through generalized busses will surely bring an improvement over serial processing, but raises problems of delay while results are switched from one computer to another and does nothing to solve software problems. Specialized architecture is a lesson to be learned from linguistics with consequences for ease of programming, time spent in compilation or interpretation, and efficiency of parallel processing.

COMPUTATIONAL LINGUISTICS

I have delayed until the end a definition of my own field, which I have presented before.7 It should be more significant against the background of the foregoing discussion. The definition is built upon a twofold distinction. One is the distinction, familiar enough, between the infinitesimal calculus and linguistics. The calculus occupies a major place in science, giving a means of deduction in systems of continuous change. It has developed in two ways: mathematical analysis, which gives a time-independent characterization of systems, including those in which time itself is a variable (time does not appear in the metasystem of description); and numerical analysis, in which time is a variable of the metasystem and which deals in algorithms. Linguistics, also, has developed in two ways. The time-independent characterizations that Chomsky speaks of as statements of competence are the subject of what is called linguistics, with no modifier. This field corresponds to the calculus, or to its applications to physical systems. Time-dependent characterizations of linguistic processes are the subject matter of computational linguistics, which also has two parts. Its abstract branch is purely formal, dealing with linguistic systems whether realized, or realizable, in nature; its applied branch deals with algorithms for the processing of naturally occurring languages. I have undertaken to show that the concepts of abstract computational linguistics provide a foundation for nonnumerical computation comparable to that provided by the calculus for numerical computation. The work is still in progress, and many who are doing it would not be comfortable to think of themselves as computational linguists. I hope that the stature of the field is growing so that more pride can attach to the label now and hereafter than in earlier days.

REFERENCES

1. Berlin, Brent, Kay, Paul, Basic Color Terms: Their Universality and Evolution, Berkeley, University of California Press, 1969.
2. Chase, William G., Clark, Herbert H., "Mental Operations in the Comparison of Sentences and Pictures," in Cognition in Learning and Memory, edited by Lee W. Gregg, New York, Wiley, 1972, pp. 205-232.
3. Chomsky, Noam, Language and Mind, New York, Harcourt Brace Jovanovich, enlarged edition, 1972.
4. Collins, Allan M., Quillian, M. Ross, "Experiments on Semantic Memory and Language Comprehension," in Gregg, op. cit., pp. 117-137.
5. Fillenbaum, S., "Memory for Gist: Some Relevant Variables," Language and Speech, 1966, Vol. 9, pp. 217-227.
6. Fillmore, Charles J., "The Case for Case," in Universals in Linguistic Theory, edited by Emmon Bach and Robert T. Harms, New York, Holt, Rinehart and Winston, 1968, pp. 1-88.
7. Hays, David G., The Field and Scope of Computational Linguistics, 1971 International Meeting on Computational Linguistics, Debrecen.
8. Hays, David G., Margolis, Enid, Naroll, Raoul, Perkins, Revere Dale, "Color Term Salience," American Anthropologist, 1972, Vol. 74, pp. 1107-1121.
9. Hockett, Charles F., "The Origin of Speech," Scientific American, 1960, Vol. 203, pp. 88-96.
10. Ladefoged, Peter, Preliminaries to Linguistic Phonetics, Chicago, University of Chicago Press, 1971.
11. Keenan, Edward L., "Relative Clause Formation in Malagasy," in The Chicago Which Hunt, edited by Paul M. Peranteau, Judith N. Levi, and Gloria C. Phares, Chicago Linguistic Society, 1972.
12. Lamb, Sydney M., Outline of Stratificational Grammar, Washington, Georgetown University Press, revised edition, 1966.
13. McCawley, James D., "The Role of Semantics in a Grammar," in Bach and Harms, op. cit., pp. 125-169.
14. Quillian, M. Ross, "The Teachable Language Comprehender," Communications of the ACM, 1969, Vol. 12, pp. 459-476.
15. Simon, Herbert A., Barenfeld, M., "Information-processing Analysis of Perceptual Processes in Problem Solving," Psychological Review, 1969, Vol. 76, pp. 473-483.
16. Ware, Willis H., "The Ultimate Computer," IEEE Spectrum, March 1972.

Speech understanding

by DONALD E. WALKER
Stanford Research Institute
Menlo Park, California

ABSTRACT

Research on speech understanding is adding new dimensions to the analysis of speech and to the understanding of language. The acoustic, phonetic, and phonological processing of speech recognition efforts are being blended with the syntax, semantics, and pragmatics of question-answering systems. The goal is the development of capabilities that will allow a person to have a conversation with a computer in the performance of a shared task. Achievement of this goal will both require and contribute to a more comprehensive and powerful model of language-with significant consequences for linguistics, for computer science, and especially for computational linguistics.

Literary text processing

by SALLY YEATES SEDELOW
University of Kansas
Lawrence, Kansas

ABSTRACT

To date, computer-based literary text processing bears much greater similarity to techniques used for information retrieval and, to some degree, for question-answering, than it does to techniques used in, for example, machine translation or 'classical' artificial intelligence. A literary text is treated not as 'output' in a process to be emulated nor as a string to be transformed into an equivalent verbal representation, but, rather, as an artifact to be analyzed and described. The absence of process as an integrating concept in computer-based literary text processing leads to very different definitions of linguistic domains (such as semantics and syntactics) than is the case with, for example, artificial intelligence. This presentation explores some of these distinctions, as well as some of the implications of more process-oriented techniques for literary text processing.

Syntax and computation
by JANE J. ROBINSON
The University of Michigan, Ann Arbor, Michigan

ABSTRACT

Algorithms have been developed for generating and parsing with context-sensitive grammars. In principle, the contexts to which a grammar is sensitive can be syntactic, semantic, pragmatic, or phonetic. This development points up the need to develop a new kind of lexicon, whose entries contain large amounts of several kinds of contextual information about each word or morpheme, provided in computable form. Ways in which both the form and content of the entries differ from those of traditional dictionaries are indicated.

The augmented knowledge workshop
by DOUGLAS C. ENGELBART, RICHARD W. WATSON, and JAMES C. NORTON
Stanford Research Institute, Menlo Park, California

Workshop improvement involves systematic change not only in the tools that help handle and transform the materials, but in the customs, conventions, skills, procedures, working methods, organizational roles, training, etc., by which the workers and their organizations harness their tools, their skills, and their knowledge. Over the past ten years, the explicit focus in the Augmentation Research Center (ARC) has been upon the effects and possibilities of new knowledge workshop tools based on the technology of computer timesharing and modern communications.18-41 Since we consider automating many human operations, what we are after could perhaps be termed "workshop automation." But the very great importance of aspects other than the new tools (i.e., conventions, methods, roles) makes us prefer the "augmentation" term that hopefully can remain "wholescope." We want to keep tools in proper perspective within the total system that augments native human capacities toward effective action.1-3,10,16,18,24 Development of more effective knowledge workshop technology will require talents and experience from many backgrounds: computer hardware and software, psychology, management science, information science, and operations research, to name a few. These must come together within the framework of a new discipline, focused on the systematic study of knowledge work and its workshop environments.

CONCEPT OF THE KNOWLEDGE WORKSHOP

This paper discusses the theme of augmenting a knowledge workshop. The first part of the paper describes the concept and framework of the knowledge workshop. The second part describes aspects of a prototype knowledge workshop being developed within this framework. The importance and implications of the idea of knowledge work have been described by Drucker.3,4 Considering knowledge to be the systematic organization of information and concepts, he defines the knowledge worker as the person who creates and applies knowledge to productive ends, in contrast to an "intellectual" for whom information and concepts may only have importance because they interest him, or to the manual worker who applies manual skills or brawn. In those two books Drucker brings out many significant facts and considerations highly relevant to the theme here, one among them (paraphrased below) being the accelerating rate at which knowledge and knowledge work are coming to dominate the working activity of our society: In 1900 the majority and largest single group of Americans obtained their livelihood from the farm. By 1940 the largest single group was industrial workers, especially semiskilled machine operators. By 1960, the largest single group was professional, managerial, and technical-that is, knowledge workers.
By 1975-80 this group will embrace the majority of Americans. The productivity of knowledge has already become the key to national productivity, competitive strength, and economic achievement, according to Drucker. It is knowledge, not land, raw materials, or capital, that has become the central factor in production.

In his provocative discussions, Drucker makes extensive use of such terms as "knowledge organizations," "knowledge technologies," and "knowledge societies." It seemed a highly appropriate extension for us to coin "knowledge workshop" for re-naming the area of our special interest: the place in which knowledge workers do their work. Knowledge workshops have existed for centuries, but our special concern is their systematic improvement, toward increased effectiveness of this new breed of craftsmen.

TWO WAYS IN WHICH AUGMENTED KNOWLEDGE WORKSHOPS ARE EVOLVING

Introduction

First, one can see a definite evolution of new workshop architecture in the trends of computer application systems. An "augmented workshop domain" will probably emerge because many special-purpose application systems are evolving by adding useful features outside their immediate special application area. As a result, many will tend to overlap in their general knowledge work supporting features. Second, research and development is being directed toward augmenting a "Core" Knowledge Workshop domain. This application system development is aimed expressly at supporting basic functions of knowledge work. An important characteristic of such systems is to interface usefully with specialized systems. This paper is oriented toward this second approach.

NATURAL EVOLUTION BY SCATTERED NUCLEI EXPANDING TOWARD A COMMON "KNOWLEDGE WORKSHOP" DOMAIN

Anderson and Coover15 point out that a decade or more of application-system evolution is bringing about the beginning of relatively rational user-oriented languages for the control interfaces of advanced applications software systems. What is interesting to note is that the functions provided by the "interface control" for the more advanced systems are coming to include editors and generalized file-management facilities, to make easier the preparation, execution, and management of the special-purpose tools of such systems. It seems probable that special application-oriented systems (languages) will evolve steadily toward helping the user with such associated work as formulating models, documenting them, specifying the different trial runs, keeping track of intermediate results, annotating them and linking them back to the users' model(s), etc. When the results are produced by what were initially the core application programs (e.g., the statistical programs), he will want ways to integrate them into his working notes, illustrating, labeling, captioning, explaining and interpreting them. Eventually these notes will be shaped into memoranda and formal publications, to undergo dialogue and detailed study with and by others.15

Once a significant user-oriented system becomes established, with a steady growth of user clientele, there will be natural forces steadily increasing the effectiveness of the system services and steadily decreasing the cost per unit of service. And it will also be natural that the functional domain of an application system will steadily grow outward: "as long as the information must be in computer form anyway for an adjacent, computerized process, let's consider applying computer aid to Activity X also."
Because the boundary of the Application System has grown out to be "next to" Activity X, it has become relatively easy to consider extending the computerized-information domain a bit so that a new application process can support Activity X. After all, the equipment is already there, the users who perform Activity X are already oriented to use integrated computer aid, and generally the computer facilitation of Activity X will prove to have a beneficial effect on the productivity of the rest of the applications system. This domain-spreading characteristic is less dependent upon the substantive work area a particular application system supports than it is upon the health and vitality of its development and application (the authors of Reference 15 have important things to say on these issues): however. it appears that continuing growth is bound to occur in many special application domains, inevitably bringing about overlap in common application "sub-domains" (as seen from the center of any of these nuclei). These special subdomains include formulating, studying, keeping track of ideas, carrying on dialogue, publishing, negotiating, planning, coordinating, learning, coaching, looking up in the yellow pages to find someone who can do a special service, etc. CONSIDERING THE CORE KNOWLEDGE WORKSHOP AS A SYSTEM DOMAIN IN ITS OWN RIGHT A second approach to the evolution of a knowledge workshop is to recognize from the beginning the amount and importance of human activity constantly involved in the "core" domain of knowledge work-activity within which more specialized functions are embedded. If you asked a particular knowledge worker (e.g., scientist, engineer, manager, or marketing specialist) what were the foundations of his livelihood, he would probably point to particular skills such as those involved in designing an electric circuit, forecasting a market based on various data, or managing work flow in a project. If you asked him what tools he needed to improve his effectiveness he would point to requirements for aids in designing circuits, analyzing his data, or scheduling the flow of work. But, a record of how this person used his time, even if his work was highly specialized, would show that specialized work such as mentioned above, while vital to his effectiveness, probably occupied a small fraction of his time and effort. The bulk of his time, for example, would probably be occupied by more general knowledge work: writing and planning or design document; carrying on dialogue with others in writing, in person, or on the telephone; studying documents; filing ideas or other material; formulating problem-solving approaches; coordinating work with others; and reporting results. There would seem to be a promise of considerable payoff in establishing a healthy, applications oriented systems development activity within this common, "core" domain, meeting the special-application systems "coming the other way" and providing them with well-designed services at a natural system-to-system interface. It will be much more efficient to develop this domain explicitly, by people oriented toward it, and hopefully with resources shared in a coordinated fashion. The alternative of semi-random growth promises problems such as: (1) Repetitive solutions for the same functional prob- lems, each within the skewed perspective of a particular special-applications area for which these problems are peripheral issues, (2) Incompatibility between diferent application software systems in terms of their inputs and outputs. 
(3) Languages and other control conventions inconsistent or based on different principles from one system to another, creating unnecessary learning barriers or other discouragements to cross usage.

In summary, the two trends in the evolution of knowledge workshops described above are each valuable and are complementary. Experience and specific tools and techniques can and will be transferred between them. There is a very extensive range of "core" workshop functions, common to a wide variety of knowledge work, and they factor into many levels and dimensions. In the sections to follow, we describe our developments, activities, and commitments from the expectation that there soon will be increased activity in this core knowledge workshop domain, and that it will be evolving "outward" to meet the other application systems "heading inward."

BASIC ASSUMPTIONS ABOUT AUGMENTED KNOWLEDGE WORKSHOPS EMBEDDED IN A COMPUTER NETWORK

The computer-based "tools" of a knowledge workshop will be provided in the environment of a computer network such as the ARPANET.7,8,14 For instance, the core functions will consist of a network of cooperating processors performing special functions such as editing, publishing, communication of documents and messages, data management, and so forth. Less commonly used but important functions might exist on a single machine. The total computer assisted workshop will be based on many geographically separate systems. Once there is a "digital-packet transportation system," it becomes possible for the individual user to reach out through his interfacing processor(s) to access other people and other services scattered throughout a "community," and the "labor marketplace" where he transacts his knowledge work literally will not have to be affected by geographical location.27 Specialty application systems will exist in the way that specialty shops and services now do-and for the same reasons. When it is easy to transport the material and negotiate the service transactions, one group of people will find that specialization can improve their cost/effectiveness, and that there is a large enough market within reach to support them. And in the network-coupled computer-resource marketplace, the specialty shops will grow-e.g., application systems specially tailored for particular types of analyses, or for checking through text for spelling errors, or for doing the text-graphic document typography in a special area of technical portrayal, and so on. There will be brokers, wholesalers, middle men, and retailers.

Ease of communication between, and addition of, workshop domains

One cannot predict ahead of time which domains or application systems within the workshop will want to communicate in various sequences with which others, or what operations will be needed in the future. Thus, results must be easily communicated from one set of operations to another, and it should be easy to add or interface new domains to the workshop.

Coordinated set of user interface principles

There will be a common set of principles, over the many application areas, shaping user interface features such as the language, control conventions, and methods for obtaining help and computer-aided training. This characteristic has two main implications. One, it means that while each domain within the core workshop area or within a specialized application system may have a vocabulary unique to its area, this vocabulary will be used within language and control structures common throughout the workshop system. A user will learn to use additional functions by increasing vocabulary, not by having to learn separate "foreign" languages. Two, when in trouble, he will invoke help or tutorial functions in a standard way.

Grades of user proficiency

Even a once-in-a-while user with a minimum of learning will want to be able to get at least a few straightforward things done. In fact, even an expert user in one domain will be a novice in others that he uses infrequently. Attention to novice-oriented features is required. But users also want and deserve the reward of increased proficiency and capability from improvements in their skills and knowledge, and in their conceptual orientation to the problem domain and to their workshop's system of tools, methods, conventions, etc. "Advanced vocabularies" in every special domain will be important and unavoidable. A corollary feature is that workers in the rapidly evolving augmented workshops should continuously be involved with testing and training in order that their skills and knowledge may harness available tools and methodology most effectively.

User programming capability

There will never be enough professional programmers and system developers to develop or interface all the tools that users may need for their work. Therefore, it must be possible, with various levels of ease, for users to add or interface new tools, and extend the language to meet their needs. They should be able to do this in a variety of programming languages with which they may have training, or in the basic user-level language of the workshop itself.

Availability of people support services

An augmented workshop will have more support services available than those provided by computer tools. There will be many people support services as well: besides clerical support, there will be extensive and highly specialized professional services, e.g., document design and typography, data base design and administration, training, cataloging, retrieval formulation, etc. In fact, the marketplace for human services will become much more diverse and active.27

Cost decreasing, capabilities increasing

The power and range of available capabilities will increase and costs will decrease. Modular software designs, where only the software tools needed at any given moment are linked into a person's run-time computer space, will cut system overhead for parts of the system not in use. Modularity in hardware will provide local configurations of terminals and miniprocessors tailored for economically fitting needs. It is obvious that cost of raw hardware components is plummeting; and the assumed large market for knowledge workshop support systems implies further help in bringing prices down. The argument given earlier for the steady expansion of vital application systems to other domains remains valid for explaining why the capabilities of the workshop will increase. Further, increasing experience with the workshop will lead to improvements, as will the general trend in technology evolution.

Range of workstations and symbol representations

The range of workstations available to the user will increase in scope and capability. These workstations will support text with large, open-ended character sets, pictures, voice, mathematical notation, tables, numbers and other forms of knowledge representation. Even small portable hand-held consoles will be available.
13 Careful development of methodology As much care and attention will be given to the development, analysis, and evaluation of procedures and methodology for use of computer and people support services as to the development of the technological support services. Changed roles and organizational structure The widespread availability of workshop services will create the need for new organizational structures and roles. SELECTED DESCRIPTION OF AUGMENTED WORKSHOP CAPABILITIES Introduction \Vithin the framework described above, ARC is developing a prototype workshop system. Our system does not meet all the requirements outlined previously, but it does have a powerful set of core capabilities and experience that leads us to believe that such goals can be achieved. Within ARC we do as much work as possible using the range of online capabilities offered. We serve not only as researchers, but also as the subjects for the analysis and evaluation of the augmentation system that we have been developing. Consequently, an important aspect of the augmentation work done within ARC is that the techniques being explored are implemented, studied, and evaluated with the advantage of intensive everyday usage. We call this research and development strategy "bootstrapping." In our experience, complex man-machine systems can evolve only in a pragmatic mode, within real-work environments where there is an appropriate commitment to conscious, controlled, exploratory evolution within the general framework outlined earlier. The plans and commitments described later are a consistent extension of this pragmatic bootstrapping strategy. To give the reader more of a flavor of some of the many dimensions and levels of the ARC workshop, four example areas are discussed below in more detail, following a quick description of our physical environment. The first area consists of mechanisms for studying and browsing through NLS files as an example of one functional dimension that has been explored in some depth. The second area consists of mechanisms for collaboration support-a subsystem domain important to many application areas. The third and fourth areas, support for software engineers and the ARPANET Network Information Center (NIC), show example application domains based on functions in our workshop. General physical environment Our computer-based tools run on a Digital Equipment Corporation PDP-10 computer, operating with the Bolt, Beranek, and Newman TENEX timesharing system. 9 The computer is connected via an Interface Message Processor (IMP) to the ARPANET.7 s There is a good deal of interaction with Network researchers, and with Network technology, since we operate the ARPA Network Information Center (see below).39 There is a range of terminals: twelve old, but serviceable, display consoles of our own design,26 an IMLAC display, a dozen or so 30 ch/sec portable upper/lower case typewriter terminals, five magnetic tape-cassette storage units that can be used either online or offline, and a 96character line printer. There are 125 million characters of online disk storage. o The display consoles are equipped with a typewriterlike keyboard, a five-finger keyset for one-handed character input, and a "mouse"-a device for controlling the position of a cursor (or pointer) on the display screen and for input of certain control commands. 
Test results on the mouse as a screen-selection device have been reported in Reference 25, and good photographs and descriptions of the physical systems have appeared in References 20 and 21. The core workshop software system and language, called NLS, provides many basic tools, of which a number will be mentioned below. It is our "core-workshop application system." During the initial years of workshop development, application and analysis, the basic knowledge-work functions have centered around the composition, modification, and study of structured textual material.26 Some of the capabilities in this area are described in detail in Reference 26, and are graphically shown in a movie available on loan.41 The structured-text manipulation has been developed extensively because of its high payoff in the area of applications-system development to which we have applied our augmented workshop. We have delayed addition of graphic-manipulation capabilities because there were important areas associated with the text domain needing exploration and because of limitations in the display system and hardcopy printout. To build the picture of what our Core Knowledge Workshop is like, we first give several in-depth examples, and then list in the section on workshop utility service some "workshop subsystems" that we consider to be of considerable importance to general knowledge work.

STUDYING ONLINE DOCUMENTS

Introduction

The functions to be described form a set of controls for easily moving one around in an information space and allowing one to adjust the scope, format, and content of the information seen.26,41 Given the addition of graphical, numerical, and vocal information, which are planned for addition to the workshop, one can visualize many additions to the concepts below. Even for strictly textual material there are yet many useful ideas to be explored.

View specifications

One may want an overview of a document in a table-of-contents like form on the screen. To facilitate this and other needs, NLS text files are hierarchically structured in a tree form with subordinate material at lower levels in the hierarchy.26 The basic conceptual unit in NLS, at each node of the hierarchical file, is called a "statement" and is usually a paragraph, sentence, equation, or other unit that one wants to manipulate as a whole. A statement can contain many characters-presently, up to 2000. Therefore, a statement can contain many lines of text. Two of the "view-specification" parameters-depth in the hierarchy, and lines per statement-can be controlled during study of a document to give various overviews of it. View specifications are given with highly abbreviated control codes, because they are used very frequently and their quick specification and execution make a great deal of difference in the facility with which one studies the material and keeps track of where he is. Examples of other view specifications are those that control spacing between statements, and indentation for levels in the hierarchy, and determine whether the identifications associated with statements are to be displayed, which branches in the tree are to be displayed, whether special filters are to be invoked to show only statements meeting specified content requirements, or whether statements are to be transformed according to special rules programmed by the user.
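For a present-day reader, the file model just described can be made concrete with a minimal sketch in Python. It is illustrative only: the Statement class and render_view function are invented names for this example, not actual NLS identifiers; the sketch simply shows a hierarchical file of statements and an overview controlled by the two view-specification parameters named above, depth in the hierarchy and lines per statement.

# Illustrative sketch only: a hierarchical file of "statements" and a view
# controlled by two view-specification parameters (depth, lines per statement).
# Class and function names are hypothetical, not actual NLS identifiers.

class Statement:
    def __init__(self, text, children=None):
        self.text = text              # an NLS statement may hold up to about 2000 characters
        self.children = children or []

def render_view(stmt, depth_limit, lines_per_stmt, width=72, level=0):
    """Return the display lines for this subtree, truncated per the view spec."""
    if level >= depth_limit:
        return []
    # Break the statement text into display lines, then keep only the first few.
    words, lines, current = stmt.text.split(), [], ""
    for w in words:
        if len(current) + len(w) + 1 > width:
            lines.append(current)
            current = w
        else:
            current = (current + " " + w).strip()
    lines.append(current)
    shown = ["  " * level + l for l in lines[:lines_per_stmt]]
    for child in stmt.children:
        shown.extend(render_view(child, depth_limit, lines_per_stmt, width, level + 1))
    return shown

doc = Statement("Augmented knowledge workshop: concept and framework.", [
    Statement("Knowledge work dominates the working activity of our society."),
    Statement("Core workshop functions are common to many application systems.", [
        Statement("Editing, studying, filing, dialogue, planning, and coordination."),
    ]),
])

# A table-of-contents-like overview: one line per statement, two levels deep.
print("\n".join(render_view(doc, depth_limit=2, lines_per_stmt=1)))

Changing depth_limit and lines_per_stmt in this toy rendering corresponds to the quick view-specification codes described above, trading detail for overview without altering the stored file.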
Moving in information space

A related viewing problem is designating the particular location (node in a file hierarchy) to be at the top of the screen. The computer then creates a display of the information from that point according to the view specifications currently in effect. The system contains a variety of appropriate commands to do this; they are called jump commands because they have the effect of "jumping" or moving one from place to place in the network of files available as a user's information space.26,33-39 One can point at a particular statement on the screen and command the system to move on to various positions relative to the selected one, such as up or down in the hierarchical structure, to the next or preceding statement at the same hierarchical level, to the first or last statement at a given level, etc. One can tell the system to move to a specifically named point or go to the next occurrence of a statement with a specific content. Each time a jump or move is made, the option is offered of including any of the abbreviated view specifications-a very general, single operation is "jump to that location and display with this view." As one moves about in a file one may want to quickly and easily return to a previous view. This is aided by saving a record of the path as one traverses through the file, and the specific view at each point, and then allowing return movement to the most recent points saved. Another important feature in studying or browsing in a document is being able to quickly move to other documents cited. There is a convention (called a "link") for citing documents that allows the user to specify a particular file, a statement within the file, and a view specification for initial display when arriving in the cited file. A single, quickly executed command (Jump to Link) allows one to point at such a citation, or anywhere in the statement preceding the citation, and the system will go to the specific file and statement cited and show the associated material with the specified view parameters. This allows systems of interlinked documents and highly specific citations to be created. A piece of the path through the chain of documents is saved so that one can return easily a limited distance back along his "trail" to previously referenced documents. Such a concept was originally suggested by Bush1 in a fertile paper that has influenced our thinking in many ways.
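The navigation model just described-jump commands, link citations naming a file, a statement, and a view specification, and a saved trail for backing up-can also be pictured with a small present-day sketch. The following Python fragment is purely illustrative; the Link type, the jump_to_link and jump_return functions, and the citation syntax used here are assumptions made for the example, not the actual NLS command language or link notation.

# Illustrative sketch of jump/link navigation with a saved "trail".
# Names and the citation syntax are hypothetical, not actual NLS conventions.

import re
from collections import namedtuple

# A citation names a file, a statement within it, and a view specification.
Link = namedtuple("Link", "file statement viewspec")

LINK_PATTERN = re.compile(r"\((\w+),\s*(\w+),\s*([:\w]+)\)")  # e.g. "(plans, 3a2, :gbw)"

def parse_link(citation_text):
    """Extract a Link from a citation embedded in a statement's text."""
    m = LINK_PATTERN.search(citation_text)
    if not m:
        raise ValueError("no link citation found")
    return Link(*m.groups())

class Workspace:
    def __init__(self):
        self.trail = []            # saved points: where the user has been
        self.current = Link("home", "0", ":default")

    def jump_to_link(self, citation_text):
        """Jump to the file/statement named by a citation, saving the trail."""
        self.trail.append(self.current)
        self.current = parse_link(citation_text)
        return self.current

    def jump_return(self):
        """Back up a limited distance along the trail."""
        if self.trail:
            self.current = self.trail.pop()
        return self.current

ws = Workspace()
ws.jump_to_link("see the design notes (designfile, 4b, :ebtw) for details")
print(ws.current)        # Link(file='designfile', statement='4b', viewspec=':ebtw')
print(ws.jump_return())  # back to the previous point on the trail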
Multiple windows

Another very useful feature is the ability to "split" the viewing screen horizontally and/or vertically in up to eight rectangular display windows of arbitrary size. Generally two to four windows are all that are used. Each window can contain a different view of the same or different locations, within the same or different files.39

COLLABORATIVE DIALOGUE AND TELECONFERENCING

Introduction

The approach to collaboration support taken at ARC to date has two main thrusts:
(1) Support for real-time dialogue (teleconferencing) for two or more people at two terminals who want to see and work on a common set of material. The collaborating parties may be further augmented with a voice telephone connection as well.
(2) Support for written, recorded dialogue, distributed over time.
These two thrusts give a range of capabilities for support of dialogue distributed over time and space.

Teleconferencing support

Consider two people or groups of people who are geographically separated and who want to collaborate on a document, study a computer program, learn to use a new aspect of a system, or perform planning tasks, etc. The workshop supports this type of collaboration by allowing them to link their terminals so that each sees the same information and either can control the system. This function is available for both display and typewriter terminal users over the ARPANET. The technique is particularly effective between displays because of the high speed of information output and the flexibility of being able to split the screen into several windows, allowing more than one document or view of a document to be displayed for discussion. When a telephone link is also established for voice communication between the participants, the technique comes as close as any we know to eliminating the need for collaborating persons or small groups to be physically together for sophisticated interaction. A number of other healthy approaches to teleconferencing are being explored elsewhere.11,12,16,17 It would be interesting to interface to such systems to gain experience in their use within workshops such as described here.

RECORDED DIALOGUE SUPPORT

Introduction

As ARC has become more and more involved in the augmentation of teams, serious consideration has been given to improving intra- and inter-team communication with whatever mixture of tools, conventions, and procedures will help.27,36,39 If a team is solving a problem that extends over a considerable time, the members will begin to need help in remembering some of the important communications, i.e., some recording and recalling processes must be invoked, and these processes become candidates for augmentation. If the complexity of the team's problem relative to human working capacity requires partitioning of the problem into many parts-where each part is independently attacked, but where there is considerable interdependence among the parts-the communication between various people may well be too complex for their own accurate recall and coordination without special aids. Collaborating teams at ARC have been augmented by development of a "Dialogue Support System (DSS)," containing current and thoroughly used working records of the group's plans, designs, notes, etc. The central feature of this system is the ARC Journal, a specially managed and serviced repository for files and messages. The DSS involves a number of techniques for use by distributed parties to collaborate effectively, both using general functions in the workshop and special functions briefly described below and more fully in Reference 39. Further aspects are described in the section on Workshop Utility Service.

Document or message submission

The user can submit an NLS file, a part of a file, a file prepared on another system in the ARPANET (document), or text typed at submission time (message) to the Journal system. When submitted, a copy of the document or message is transferred to a read-only file whose permanent safekeeping is guaranteed by the Journal system. It is assigned a unique catalog number, and automatically cataloged. Later, catalog indices based on number, author, and "titleword out of context" are created by another computer process. Nonrecorded dialogue for quick messages or material not likely to be referenced in the future is also permitted. One can obtain catalog numbers ahead of time to interlink document citations for related documents that are being prepared simultaneously. Issuing and controlling of catalog numbers is performed by a Number System (an automatic, crash-protected computer process).
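For a present-day reader, the Journal mechanism just described-submission of a document or message, transfer to a read-only copy, assignment of a unique catalog number, and automatic generation of number, author, and titleword indices-can be pictured with the following minimal Python sketch. It is illustrative only; the JournalStore class and its method names are invented for this example and do not reflect the actual Journal or Number System implementation.

# Illustrative sketch of Journal-style submission: each submitted item is
# frozen as a read-only record, given a unique catalog number, and indexed
# by number, author, and title words. Names here are hypothetical.

import itertools
from dataclasses import dataclass

@dataclass(frozen=True)          # frozen: the stored copy is read-only
class JournalItem:
    number: int
    author: str
    title: str
    text: str

class JournalStore:
    def __init__(self, first_number=10000):
        self._numbers = itertools.count(first_number)   # stand-in for the Number System
        self._by_number = {}
        self._by_author = {}
        self._by_titleword = {}

    def submit(self, author, title, text):
        """Record a document or message and return its catalog number."""
        item = JournalItem(next(self._numbers), author, title, text)
        self._by_number[item.number] = item
        self._by_author.setdefault(author, []).append(item.number)
        for word in title.lower().split():
            self._by_titleword.setdefault(word, []).append(item.number)
        return item.number

    def lookup_title_word(self, word):
        return [self._by_number[n] for n in self._by_titleword.get(word.lower(), [])]

journal = JournalStore()
n = journal.submit("Engelbart", "Workshop utility service plans", "draft text")
print(n, [i.title for i in journal.lookup_title_word("workshop")])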
At the time of submission, the user can contribute such information as: title, distribution list, comments, keywords, catalog numbers of documents this new one supersedes (updates), and other information. The distribution is specified as a list of unique identification terms (abbreviated) for individuals or groups. The latter option allows users to establish dialogue groups. The system automatically "expands" the group identification to generate the distribution list of the individuals and groups that are its members. Special indices of items belonging to subcollections (dialogue groups) can be prepared to aid their members in keeping track of their dialogue. An extension of the mechanisms available for group distribution could give a capability similar to one described by Turoff.17 Entry of identification information initially into the system, group expansion, querying to find a person's or group's identification, and other functions are performed by an Identification System.

Document distribution

Documents are distributed to a person in one, two, or all three of the following ways, depending on information kept by the Identification System:
(1) In hardcopy through the U.S. or corporation mail to those not having online access or to those desiring this mode.
(2) Online as citations (for documents) or actual text (for messages) in a special file assigned to each user.
(3) Through the ARPANET for printing or online delivery at remote sites. This delivery is performed using a standard Network-wide protocol.
Document distribution is automated, with online delivery performed by a background computer process that runs automatically at specified times. Printing and mailing are performed by operator and clerical support. With each such printed document, an address cover sheet is automatically printed, so that the associated printout pages only need to be folded in half, stapled, and stamped before being dropped in the mail.

Document access

An effort has been made to make convenient both online and offline access to Journal documents. The master catalog number is the key to accessing documents. Several strategically placed hardcopy master and access collections (libraries) are maintained, containing all Journal documents. Automatic catalog-generation processes generate author, number, and titleword indices, both online and in hardcopy.38 The online versions of the indices can be searched conveniently with standard NLS retrieval capabilities.37,39,41 Online access to the full text of a document is accomplished by using the catalog number as a file name and loading the file, or moving to it by pointing at a citation and asking the system to "jump" as described earlier.

SOFTWARE ENGINEERING AUGMENTATION SYSTEM

Introduction

One of the important application areas in ARC's work is software engineering. The economics of large computer systems, such as NLS, indicate that software development and maintenance costs exceed hardware costs, and that software costs are rising while hardware costs are rapidly decreasing. The expected lifetime of most large software systems exceeds that of any piece of computer hardware. Large software systems are becoming increasingly complex and difficult to continue evolving and maintain. Costs of additional enhancements made after initial implementation generally exceed the initial cost over the lifetime of the system. It is for these reasons that it is important to develop a powerful application area to aid software engineering.
Areas of software engineering in which the ARC workshop offers aids are described below.

Design and review collaboration

During design and review, the document creation, editing, and studying capabilities are used, as well as the collaboration described above.

Use of higher level system programming languages

Programming of NLS is performed in a higher level ALGOL-like system programming language called L-10 developed at ARC. The L-10 language compiler takes its input directly from standard NLS structured files. The PDP-10 assembler also can obtain input from NLS files. It is planned to extend this capability to other languages, for example, by providing an interface to the BASIC system available in our machine for knowledge workers wishing to perform more complex numerical tasks.

We are involved with developing a modular runtime-linkable programming system (MPS), and with planning a redesign of NLS to utilize MPS capabilities, both in cooperation with the Xerox Palo Alto Research Center. MPS will:
(1) Allow a workshop system organization that will make it easier for many people to work on and develop parts of the same complex system semi-independently.
(2) Make it easier to allow pieces of the system to exist on several processors.
(3) Allow individual users or groups of users to tailor versions of the system to their special needs.
(4) Make it easier to move NLS to other computers, since MPS is written in itself.
(5) Speed system development because of MPS's improved system building language facilities, integrated source-level debugging, measurement facilities, the ability to construct new modules by combining old ones, and to easily modify the system by changing module interconnection.

System documentation and source-code creation

Source-code creation uses the standard NLS hierarchical file structures and allows documentation and other programming conventions to be established that simplify studying of source-code files.

Debugging

A form of source-level debugging is allowed through development of several tools, of which the following are key examples:
(1) A user program compilation and link loading facility that allows new or replacement programs to be linked into the running system to create revised versions for testing or other purposes.
(2) NLS-DDT, a DDT-like debugging facility with a command language more consistent with the rest of NLS, which simplifies display of system variables and data structures, and allows replacement of system procedures by user-supplied procedures.
(3) Use of several display windows so as to allow source code in some windows and control of DDT in others for the setting of breakpoints and display of variables and data structures.

Measurement and analysis

A range of measurement tools has been developed for analyzing system operation. These include the following:
(1) Capabilities for gathering and reporting statistics on many operating system parameters such as utilization of system components in various modes, queue lengths, memory utilization, etc.
(2) The ability to sample the program counter for intervals of a selectable area of the operating system or any particular user subsystem to measure time spent in the sampled areas.
(3) Trace and timing facilities to follow all procedure calls during execution of a specified function.
(4) The ability to study page-faulting characteristics of a subsystem to check on its memory use characteristics.
(5) The ability to gather NLS command usage and timing information.
(6) The ability to study user interaction on a task basis from the point of view of the operating-system scheduler.
(7) The ability to collect sample user sessions for later playback to the system for simulated load, or for analysis.

Maintenance

Maintenance programmers use the various functions mentioned above. The Journal is used for reporting bugs; NLS structured source code files simplify the study of problem areas, and the debugging tools permit easy modification and testing of the modifications.

THE ARPA NETWORK INFORMATION CENTER (NIC)

Introduction

The NIC is presently a project embedded within ARC.39 Workshop support for the NIC is based on the capabilities within the total ARC workshop system. As useful as is the bootstrapping strategy mentioned earlier, there are limits to the type of feedback it can yield with only ARC as the user population. The NIC is the first of what we expect will be many activities set up to offer services to outside users. The goal is to provide a useful service and to obtain feedback on the needs of a wider class of knowledge workers. Exercised within the NIC are also prototypes of information services expected to be normal parts of the workshop. The NIC is more than a classical information center, as that term has come to be used, in that it provides a wider range of services than just bibliographic and "library" type services. The NIC is an experiment in setting up and running a general purpose information service for the ARPANET community with both online and offline services. The services offered and under development by the NIC have as their initial basic objectives:
(1) To help people with problems find the resources (people, systems, and information) available within the network community that meet their needs.
(7) Training in use of NIC services and facilities. Conclusion The Network Information Center is an example prototype of a new type of information service that has significant future potential. Even though it is presently in an experimental and developmental phase, it is providing useful online and offline services to the ARPANET community. so that we can begin to transfer the fruits of our past work to them and with their assistance, to others, and so that we can obtain feedback needed for further evolution from wider appiication than is possibie in our project alone. 28 We want to find and support selected groups who are willing to take extra trouble to be exploratory, but who: (1) Are not necessarily oriented to being core-workshop developers (they have their own work to do). (2) Can see enough benefit from the system to be tried and from the experience of trying it so that they can justify the extra risk and expense of being "early birds." (3) Can accept assurance that system reliability and stability, and technical! application help will be available to meet their conditions for risk and cost. ARC is establishing a Workshop Utility Service, and promoting the type of workshop service described above as part of its long-term commitment to pursue the continued development of augmented knowledge workshops in a pragmatic, evolutionary manner. It is important to note that the last few years of work have concentrated on the means for delivering support to a distributed community, for providing teleconferencing and other basic processes of collaborative dialogue, etc. ARC has aimed consciously toward developing experience and capabilities especially applicable to support remote and distributed groups of exploratory users for this next stage of wider-application bootstrapping. One aspect of the service is that it will be an experiment in harnessing the new environment of a modern computer network to increase the feasibility of a wider community of participants cooperating in the evolution of an application system. Characteristics of the planned service The planned service offered will include: (1) Availability of Workshop Utility computer service to the user community from a PDP-IO TEKEX system operated by a commercial supplier. (2) Providing training as appropriate in the use of Display NLS (DNLS), Typewriter NLS (TNLS), and Deferred Execution (DEX) software subsystems, (3) Providing technical assistance to a user organization "workshop architect" in the formulation, development, and implementation of augmented knowledge work procedures within selected offices at the user organization. 6 PLANS FOR A WORKSHOP UTILITY SERVICE Motivation It is now time for a next stage of application to be established. We want to involve a wider group of people 17 This assistance will include help in the development of NLS use strategies suitable to the user environments, procedures within the user organization for implementing these strategies, and possible special-application NLS extensions (or 18 National Computer Conference, 1973 simplifications) to handle the mechanics of particular user needs and methodologies. (4) Providing "workshop architect" assistance to help set up and assist selected geographically distributed user groups who share a special discipline or mission orientation to utilize the workshop utility services and to develop procedures, documentation, and methodology for their purposes. 
GENERAL DESCRIPTION OF SOME WORKSHOP UTILITY SUBSYSTEMS Introduction Within a particular professional task area (mission- or discipline-oriented) there are often groups who could be benefited by using special workshop subsystems. These subsystems may be specialized for their specific application or research domain or for support of their more general knowledge work. Our goal is to offer a workshop utility service that contains a range of subsystems and associated methodology particularly aimed at aiding general knowledge work, and that also ~;upports in a coordinated way special application subsystems either by interfacing to subsystems already existing, or by developing new subsystems in selected areas. In the descriptions to follow are a number of workshop subsystem domains that are fundamental to a wide range of knowledge work in which ARC already has extensive developments or is committed to work. For each subsystem we include some general comments as well as a brief statement of current ARC capabilities in the area. Document development, production, and control Here a system is considered involving authors, editors, supervisors, typists, distribution-control personnel, and technical specialists. Their job is to develop documents, through successive drafts, reviews, and revisions. Control is needed along the way of bibliography, who has checked what point, etc. Final drafts need checkoff, then production. Finally distribution needs some sort of control. If it is what we call a "functional document" such as a user guide, then it needs to be kept up to date. 39 There is a further responsibility to keep track of who needs the documents, who has what version, etc. Within the ARC workshop, documents ranging from initial drafts to final high-quality printed publications can be quickly produced with a rich set of creation and editing functions. All of ARC's proposals, reports, designs, letters, thinkpieces, user documentation, and other such information are composed and produced using the workshop. Documents in a proof or finished form can be produced with a limited character set and control on a line printer or typewriter, or publication quality documents can be produced on a photocomposer microfilm unit. Presently there are on the order of two hundred special directives that can be inserted in text to control printing. These directives control such features as typefont, pagination, margins, headers, footers, statement spacing, typefont size and spacing, indenting, numbering of various hierarchical levels, and many other parameters useful for publication quality work. Methodology to perform the creation, production, and controlling functions described above has been developed, although much work at this level is still needed. In terms of future goals, one would like to have display terminals with a capability for the range of fonts available on the photocomposer so that one could study page layout and design interactively, showing the font to be used, margins, justification, columnization, etc. on the screen rather than having to rely on hardcopy proofsheets. To prepare for such a capability, plans are being made to move toward an integrated portrayal mechanism for both online and hardcopy viewing. Collaborative dialogue and teleconferencing Effective capabilities have already been developed and are in application, as discussed above. There is much yet to do. 
The Dialogue Support System will grow to provide the following additional general online aids: link-setup automation; back-link annunciators and jumping; aids for the formation, manipulation, and study of sets of arbitrary passages from among the dialogue entries; and integration of cross-reference information into hardcopy printouts. Interfaces will probably be made to other teleconferencing capabilities that come into existence on the ARPANET. It also will include people-system developments: conventions and working procedures for using these aids effectively in conducting collaborative dialogue among various kinds of people, at various kinds of terminals, and under various conditions; working methodology for teams doing planning, design, implementation coordination; and so on.

Meetings and conferences

Assemblies of people are not likely for a long time, if ever, to be supplanted in total by technological aids. Online conferences are held at ARC for local group meetings and for meetings where some of the participants are located across the country. Use is made of a large-screen projection TV system to provide a display image that many people in a conference room can easily see. This is controlled locally or remotely by participants in the meeting, giving access to the entire recorded dialogue data base as needed during the meeting and also providing the capability of recording real-time meeting notes and other data. The technique also allows mixing of other video signals.

Management and organization

The capabilities offered in the workshop described in this paper are used in project management and administration.39 Numerical calculations can also be performed for budget and other purposes, obtaining operands and returning results to NLS files for further manipulation. Where an organization has conventional project management operations, their workshop can include computer aids for techniques such as PERT and CPM. We want to support the interfacing that our Core Workshop can provide to special application systems for management processes. We are especially interested at this stage in management of project teams-particularly, of application-systems development teams.

Handbook development

Capabilities described above are being extended toward the coordinated handling of a very large and complex body of documentation and its associated external references. The goal is that a project or discipline of ever-increasing size and complexity can be provided with a service that enables the users to keep a single, coordinated "superdocument" in their computer; that keeps up to date and records the state of their affairs; and provides a description of the state of the art in their special area. Example contents would be glossaries, basic concept structure, special analytic techniques, design principles, actual design, and implementation records of all developments.

Research intelligence

The provisions within the Dialogue Support System for cataloging and indexing internally generated items also support the management of externally generated items, bibliographies, contact reports, clippings, notes, etc. Here the goal is to give a human organization (distributed or local) an ever greater capability for integrating the many input data concerning its external environment; processing (filtering, transforming, integrating, etc.) the data so that it can be handled on a par with internally generated information in the organization's establishing of plans and goals; and adapting to external opportunities or dangers.38

Computer-based instruction

This is an important area to facilitate increasing the skills of knowledge workers. ARC has as yet performed little direct work in this area. We hope in the future to work closely with those in the computer-based instruction area to apply their techniques and systems in the workshop domain. In training new and developing users in the use of the system, we have begun using the system itself as a teaching environment. This is done locally and with remote users over the ARPANET.

Software engineering augmentation

A major special application area described above, that has had considerable effort devoted to it, is support of software engineers. The software-based tools of the workshop are designed and built using the tools previously constructed. It has long been felt24,29 that the greatest "bootstrapping" leverage would be obtained by intensively developing the augmented workshop for software engineers, and we hope to stimulate and support more activity in this area.

Knowledge workshop analysis

Systematic analysis has begun of the workshop environment at internal system levels, at user usage levels, and at information-handling procedure and methodology levels. The development of new analytic methodology and tools is a part of this process. The analysis of application systems, and especially of core-workshop systems, is a very important capability to be developed. To provide a special workshop subsystem that augments this sort of analytic work is a natural strategic goal.

CONCLUSION-THE NEED FOR LONG-TERM COMMITMENT

As work progresses day-to-day toward the long-term goal of helping to make the truly augmented knowledge workshop, and as communities of workshop users become a reality, we at ARC frequently reflect on the magnitude of the endeavor and its long-term nature.22 Progress is made in steps, with hundreds of short-term tasks directed to strategically selected subgoals, together forming a vector toward our higher-level goals. To continue on the vector has required a strong commitment to the longer-range goals by the staff of ARC. In addition, we see that many of the people and organizations we hope to enlist in cooperative efforts will need a similar commitment if they are to effectively aid the process. One of ARC's tasks is to make the long-term objectives of the workshop's evolutionary development, the potential value of such a system, and the strategy for realizing that value clear enough to the collaborators we seek, so that they will have a strong commitment to invest resources with understanding and patience. One key for meeting this need will be to involve them in serious use of the workshop as it develops. The plans for the Workshop Utility are partly motivated by this objective. Although the present ARC workshop is far from complete, it does have core capabilities that we feel will greatly aid the next communities of users in their perception of the value of the improved workshops of the future.

ACKNOWLEDGMENTS

During the 10-year life of ARC many people have contributed to the development of the workshop described here.
There are presently some 35 people-clerical, hardware, software, information specialists, operations researchers, writers, and others-all contributing significantly toward the goals described here. The work reported here is currently supported primarily by the Advanced Research Projects Agency of the Department of Defense, and also by the Rome Air Development Center of the Air Force and by the Office of Naval Research. REFERENCES 1. Bush, V., "As We May Think," Atlantic Monthly, pp. 101-108, July 1945 (SRI-ARC Catalog Item 3973). 2. Licklider, J. C. R., "Man-Computer Symbiosis," IEEE Tmnsactions on Human Factors in Electronics, Vol. HFE-1, pp. 4-11, March, 1960 (SRI-ARC Catalog Item 6342). 3. Drucker, P. F., The Effective Executive, Harper and Row, New York, 1967 (SRI-ARC Catalog Item 3074). 4. Drucker, P. F., The Age of Discontinuity-Guidelines to our Changing Society, Harper and Row, New York, 1968 (SRI-ARC Catalog Item 4247). 5. Dalkey, N., The Delphi Method-An Experimental Study of Group Opinion, Rand Corporation Memorandum RM-5888-PR, 1969 (SRI-ARC Catalog Item 3896). 6. Allen, T. J., Piepmeier, J. M., Cooney, S., "Technology Transfer to Developing Countries-The International Technological Gatekeeper," Proceedings of the ASIS, Vol. 7, pp. 205-210, 1970 (SRIARC Catalog Item 13959). 7. Roberts, L. G., Wessler, B. D., "Computer Network Development to Achieve Resource Sharing," AFIPS Proceedings, Spring Joint Computer Conference, Vol. 36, pp. 543-549, 1970 (SRI-ARC Catalog Item 4564). 8. Roberts, L. G., Wessler, B. D., The ARPA Network, Advanced Research Projects Agency, Information Processing Techniques, Washington, D.C. May 1971 (SRI-ARC Catalog Item 7750). 9. Bobrow, D. G., Burchfiel, J. D., Murphy, D. L., Tomlinson, R. S., "TENEX-A Paged Time Sharing System for the PDP-10," presented at ACM Symposium on Operating Systems Principles, October 18-20, 1971. Bolt Beranek and Newman Inc., August 15, 1971 (SRI-ARC Catalog Item 7736). 10. Weinberg, G. M., The Psychology of Computer Programming, Van Nostrand Reinhold Company, New York, 1971 (SRI-ARC Catalog Item 9036). 11. Hall, T. W., "Implementation of an Interactive Conference System," AFIPS Proceedings, Spring Joint Computer Conference, Vol. 38, pp. 217-229,1971 (SRI-ARC Catalog Item 13962). 12. Turoff, M., "Delphi and its Potential Impact on Information Systems," AFIPS Proceedings, Fall Joint Computer Conference. Vol. 39. pp. 317-~2n. Jfl71 (SRI-ARC' CAtalog Ttpm ,flnnl 13. Roberts, L. G., Extensions of Packet Communication Technology to a Hand Held Personal Terminal, Advanced Research Projects Agency, Information Processing Techniques, January 24, 1972 (SRI-ARC Catalog Item 9120). 14. Kahn, R. E., "Resource-Sharing Computer Communication Networks," Proceedings of the IEEE, Vol. 147, pp. 147, September 1972 (SRI-ARC Catalog Item 13958). 15. Anderson, R. E., Coover, E. R., "Wrapping Up the Package-Critical Thoughts on Applications Software for Social Data Analysis," Computers and Humanities, Vol. 7, No.2, pp. 81-95, November 1972 (SRI-ARC Catalog Item 13956). 16. Lipinski, A. J., Lipinski, H. M., Randolph, R. H., "Computer Assisted Expert Interrogation-A Report on Current Methods Development," Proceedings of First International Conference on Computer Communication, Winkler, Stanley (ed), October 24-26, 1972, Washington, D.C., pp. 147-154 (SRI-ARC Catalog Item 11980). 17. 
Turoff, M., "'Party-Line' and 'Discussion' Computerized Conference Systems," Proceedings of First International Conference on Computer Communication, Winkler, Stanley (ed), October 24-26, 1972, Washington, D. C., pp. 161-171, (SRI-ARC Catalog Item 11983). BY OTHER PEOPLE, WITH SUBSTANTIVE DESCRIPTION OF ARC DEVELOPMENTS 18. Licklider, J. C. R., Taylor, R. W., Herbert, E., "The Computer as a Communication Device," International Science and Technology, No. 76, pp. 21-31, April 1968 (SRI-ARC Catalog Item 3888). 19. Engelbart, D. C., "Augmenting your Intellect," (Interview with D. C. Engelbart), Research/Development, pp. 22-27, August 1968 (SRI-ARC Catalog Item 9698). 20. Haavind, R., "Man-Computer 'Partnerships' Explored," Electronic Design, Vol. 17, No.3, pp. 25-32, February 1, 1969 (SRIARC Catalog Item 13961). 21. Field, R. K., "Here Comes the Tuned-In, Wired-Up, Plugged-In, Hyperarticulate Speed-of-Light Society-An Electronics Special Report: No More Pencils, No More Books-Write and Read Electronically," Electronics, pp. 73-104, November 24, 1969 (SRI-ARC Catalog Item 9705). 22. Lindgren, N., "Toward the Decentralized Intellectural Workshop," Innovation, No. 24, pp. 50-60, September 1971 (SRI-ARC Catalog Item 10480). OPEN-LITERATURE ITEMS BY ARC STAFF 23. Engelbart, D. C., "Special Considerations of the Individual as a User, Generator, and Retriever of Information," American Documentation, Vol. 12, No.2, pp. 121-125, April 1961 (SRI-ARC Catalog Item 585). 24. Engelbart, D. C., "A Conceptual Framework for the Augmentation of Man's Intellect," Vistas in Information Handling, Howerton and Weeks (eds), Spartan Books, Washington, D.C., 1963, pp. 1-29 (SRI-ARC Catalog Item 9375). 25. English, W. K., Engelbart, D. C., Berman, M. A., "Display-Selection Techniques for Text Manipulation," IEEE Tmnsactions on Human Factors in Electronics, Vol. HFE-8, No.1, pp. 5-15, March 1967 (SRI-ARC Catalog Item 9694). 26. Engelbart, D. C., English, W. K., "A Research Center for Augmenting Human Intellect," AFIPS Proceedings, Fall Joint Computer Conference, Vol. 33, pp. 395-410, 1968 (SRI-ARC Catalog Item 3954). 27. Engelbart, D. C., "Intellectual Implications of Multi-Acces!' Computer Networks," paper presented at Interdisciplinary Conference on Multi-Access Computer Networks, Austin, Texa~, April Hl':'(), prf'print (SRI ARC .JQurllJI FilcS~Sjl. The Augmented Knowledge Workshop 28. Engelbart, D. C., Coordinated Information Services for a Discipline- or Mission-Oriented Community, Stanford Research Institute Augmentation Research Center, December 12, 1972 (SRIARC Journal File 12445). Also published in "Time SharingPast, Present and Future," Proceedings of the Second Annual Computer Communications Conference at California State University, San Jose, California, January 24-25, 1973, pp. 2.1-2.4, 1973. Catalog Item 5139). RELEVANT ARC REPORTS 29. Engelbart, D. C., Augmenting Human Intellect-A Conceptual Framework, Stanford Research Institute Augmentation Research Center, AFOSR-3223, AD-289 565, October 1962 (SRI-ARC Catalog Item 3906). 30. Engelbart, D. C., Huddart, B., Research on Computer-Augmented Information Management (Final Report), Stanford Research Institute Augmentation Research Center, ESD-TDR-65-168, AD 622 5Z0,-Marili1905\SRr~ARCCatalog Item 9(90). 31. Engelbart, D. C., Augmenting Human Intellect-Experiments, Concepts, and Possibilities-Summary Report, Stanford Research Institute Augmentation Research Center, March 1965 (SRI -ARC Catalog Item 9691). 32. English, W. K, Engelbart, D. 
C., Huddart, B., Computer Aided Display Control-Final Report, Stanford Research Institute Augmentation Research Center, July 1965 (SRI-ARC Catalog Item 9692). 33. Engelbart, D. C., English, W. K., Rulifson, J. F., Development of a Multidisplay, Time-Shared Computer Facility and Computer-Augmented Management-System Research, Stanford Research Institute Augmentation Research Center, AD 843 577, April 1968 (SRI-ARC Catalog Item 9697). 34. Engelbart, D. C., Human Intellect Augmentation Techniques-Final Report, Stanford Research Institute Augmentation Research Center, CR-1270 N69-16140, July 1968 (SRI-ARC Catalog Item 3562). 35. Engelbart, D. C., English, W. K., Evans, D. C., Study for the Development of Computer Augmented Management Techniques-Interim Technical Report, Stanford Research Institute Augmentation Research Center, RADC-TR-69-98, AD 855 579, March 8, 1969 (SRI-ARC Catalog Item 9703). 36. Engelbart, D. C., SRI-ARC Staff, Computer-Augmented Management-System Research and Development of Augmentation Facility-Final Report, Stanford Research Institute Augmentation Research Center, RADC-TR-70-82, April 1970 (SRI-ARC Catalog Item 5139). 38. Engelbart, D. C., Experimental Development of a Small Computer-Augmented Information System-Annual Report, Stanford Research Institute Augmentation Research Center, April 1972 (SRI-ARC Catalog Item 10045). 39. Online Team Environment-Network Information Center and Computer Augmented Team Interaction, Stanford Research Institute Augmentation Research Center, RADC-TR-72-232, June 8, 1972 (SRI-ARC Journal File 13041).

RELEVANT ARTICLES IN ARC/NIC JOURNAL

40. Engelbart, D. C., SRI-ARC Summary for IPT Contractor Meeting, San Diego, January 8-10, 1973, Stanford Research Institute Augmentation Research Center, January 7, 1973 (SRI-ARC Journal File 13537).

MOVIE AVAILABLE FROM ARC FOR LOAN

41. Augmentation of the Human Intellect-A Film of the SRI-ARC Presentation at the 1969 ASIS Conference, San Francisco (A 3-Reel Movie), Stanford Research Institute Augmentation Research Center, October 1969 (SRI-ARC Catalog Item 9733).

Graphics, problem solving and virtual systems

by R. M. DUNN

U.S. Army Electronics Command
Fort Monmouth, New Jersey

INTRODUCTION

Man naturally uses many languages when he thinks creatively. Interactive computing mechanisms intended to augment man's higher faculties must provide for appropriate man-machine dialogues. The mechanisms must not constrain man's linguistic expressiveness for communication in the dialogues. To do so is to limit or retard the creative activity.

Interactive computing mechanisms in problem solving situations should extend and amplify man's basic abilities for creative thinking and discovery. These mechanisms should improve his ability to perceive previously unrecognized characteristics. They should permit and support man's definition of new and meaningful symbols by which he designates his perceptions. They should aid in making any specification of values he chooses to assign to his symbolized perceptions. And, interactive computing mechanisms should aid in specifying and retaining any combination of evaluated, symbolized perceptions. Of particular interest are the combinations that man's creative faculty perceives as being related so as to form a higher order entity.

This paper explores some basic concepts for problem solving through interactive computing. Characteristics of the interactive access process and over-all system concepts are discussed. The evolution of a recognition automaton is proposed, based on current work toward a multiconsole, interactive graphics Design Terminal.

BASIC CONCEPTS

Certain notions about problem solving, virtual systems, use-directed specification and interactive graphics are central to the concluding proposal of this discussion. These notions do not all reflect the current state of the art or even that of the very near future. However, they do characterize the objectives and capabilities that should be the goals of interactive computing mechanism research, development and use.

Virtual systems

A virtual system is considered to be an organized, temporary collection of resources that is created for a specific transient purpose. Computer-based virtual systems combine processes, processors, and data storage mechanisms. Interactive, computer-based virtual systems are considered to include people as another type of resource. The specific purposes that generate virtual systems are considered to be functionally classifiable. As a result, one can associate a specific purpose with a type of virtual system in terms of the function of the virtual system, or the process that carried out its formation, or the structure of its physical or functional organization, or any combination of these attributes.

The resources that are available for use in a computer-based virtual system may be centralized or widely distributed. Today's trend points to the general case for the future as being distributed resources interconnected by communications networks. Network-oriented, computer-based virtual systems are extensible by the simple expedient of interconnecting more and/or different resources. The problems associated with the design and control of extensible distributed computing systems were investigated as early as 1965 by Cave and Dunn,2 and since then by many others.3-10

Problem solving

"Problem solving" is considered to be a process that involves creative thinking and discovery. Problem solving in a computer-based system is considered to be the activity of a human exploring a concept or system for which a computer-based description has been or is being devised. The human tries to perceive, alter and/or assess the description, behavior, performance or other quality of the concept or system. Very often the system or concept has time-dependent characteristics which add to its complexity.

The essence of the problem solving process is variation and observation of the system under study. Alexander1 pointed out that human cognition functions to discover and identify the "misfit variable" that is the cause of the problem. To do so, the human needs to "toy" with the system so that he has a "feel" for its characteristics in terms of personal judgments that may be quite subjective.

For problem solving processes that incorporate interactive computing mechanisms, a particular type of computer-based, network-oriented virtual system is of specific interest. This system type exhibits two hierarchical characteristics. First, it allows and supports a hierarchy of functional uses to which it may be put. And second, it also embodies the capacity to interconnect and support access to resources on a hierarchical basis.

Use-directed specification

Use-directed specification is considered to be a process within a time-ordered activity. The subjects and history of activity determine the semantics and pragmatics of the specification.
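Before unpacking those two terms, the virtual-system notion introduced above can be made concrete. The following is a minimal sketch, in modern Python, of a network-oriented virtual system as a temporary, extensible collection of distributed resources; the class and field names are illustrative assumptions, not taken from the paper.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Resource:
        """A process, processor, data store, or person reachable on the network."""
        name: str
        kind: str       # e.g. "process", "processor", "storage", "person"
        location: str   # network node where the resource resides

    @dataclass
    class VirtualSystem:
        """A temporary, organized collection of resources created for one transient purpose."""
        purpose: str                  # the specific purpose that generated the system
        function: str                 # functional classification of the system
        resources: List[Resource] = field(default_factory=list)

        def extend(self, extra: Resource) -> None:
            # Network-oriented virtual systems are extensible by the simple
            # expedient of interconnecting more and/or different resources.
            self.resources.append(extra)

    # Example: a small analysis system assembled from resources at two sites.
    vs = VirtualSystem(purpose="evaluate bracket design B-7", function="structural analysis")
    vs.extend(Resource("finite-element solver", "process", "site-A"))
    vs.extend(Resource("drawing file store", "storage", "site-B"))
    vs.extend(Resource("designer at console 3", "person", "terminal-node"))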
In the context of use-directed specification, semantics is taken to be the process of identifying relations between elements of a specification and the intents that are to be signified by those elements. Pragmatics is taken to be the process of identifying the extent and manner by which the signified intents can be of value to the time-ordered activity. For activities that are not deterministic, the semantic and pragmatic processes establish the operational context and effect of use-directed specifications on a probabilistic basis.

Use-directed specification presumes an identified system of pragmatic values based upon the goals of the activity. For many time-ordered activities, the goals are unclear. Therefore, the associated system of pragmatic values is poorly defined or may not exist at all at the start of the activity. Such activities require rules of thumb, strategies, methods or tricks that are used as guides until the goals and pragmatic value system are established. This heuristic12 approach requires a feedback mechanism as part of the means by which the activity is conducted. Feedback is used to provide information which may lead to adjustments in the heuristics and clarification of the activity's goals and pragmatic value system. Adaptive, use-directed specification will be used to characterize activities that operate in the manner just described. Adaptive, use-directed specifications are of particular interest for problem solving activities that incorporate interactive mechanisms in the environment of network-oriented, computer-based virtual systems with hierarchical characteristics.

Interactive graphics

Interactive graphics is considered to be a computer-based process with the human "in-the-loop." "Interactive" describes the relation between the human and the computer-based process. The interactive relation is characterized by a rate of response to human service requests that is both useful and satisfying to the human. If the rate is too fast, the substance of the response may not be useful to the human. If the rate is too slow, the human may not be satisfied. Dunn,13 Boehm, et al.,14 and many others have explored detailed characteristics of interaction in a graphics environment.

Interactive graphics is considered to have three principal purposes.15 One purpose is to improve the quality and rate of the input/output relation between people and machines. Another purpose is to provide assistance to people during detailed specification of some particular abstract representation. The remaining purpose is to provide assistance to people in visualizing and evaluating some attribute, behavior or performance of a specified abstract representation. All three purposes, but especially the latter two, are of particular interest for problem solving activities that incorporate interactive computing mechanisms.

VIRTUAL SYSTEM ACCESS AND INTERACTIVE GRAPHICS

Most interactive computing systems contain an inherent assumption about certain knowledge required of the users. In some systems, the assumption is open and stated. In others, a less obvious, more troublesome situation may exist. Users of interactive computing systems rarely can consider the system as a "black box" into which parameter identification and values are entered and from which problem solving results are received.
Most often the user is minimally required to explicitly know: the "black box" function to be used; the identification of the shared main computer that supports the "black box" function; the way in which the function must be requested; the way in which service on the supporting computer must be requested; and the type of information that must be provided to the function and the supporting computer. Some interactive systems require even more of the user. The user of most of today's interactive systems can reasonably be required to have knowledge of the kind referred to above. However, when one considers the types of interactive systems that are likely to exist tomorrow, these requirements are not merely unreasonable, they may be impossible to be satisfied by typical users. Consider the use of interactive computing mechanisms by problem solving activities via an extensible, networkoriented, distributed resource computing system. Over time, such a system will undergo significant changes in the number, type and pattern of dispersion of resources that are inter-connected. For an individual user, as his efforts progress or change, the combinations of resources appropriate to his purposes will also change. Any economic implementation of such a system will not be free of periods or instances when "busy signals" are encountered in response to requests for service. Therefore, it is likely that most resources will have some level of redundancy in the system. The following conclusion must be drawn. If the human user of interactive computing systems must continue to satisfy today's requirements in the environment of tomorrow's svstemf-;, then the enormous potential of these svs- Graphics, Problem Solving and Virtual Systems tems will be lost to the user. It appears that this conclusion can be obviated. If one analyzes interactive access characteristics along with system functions and relations in a certain way, it appears feasible to reduce the burden of system knowledge upon the user to a manageable level in the environment of sophisticated interactive networks of the future. Interactive access performance characteristics The basic motivation for interactive access is to allow people to function as on-line controllers and participants in the computing process. Consequently, we must consider characteristics of interactive access mechanisms from the view of both human and system performance. Further, if we consider performance characteristics in the context of a complex process such as "problem solving" then, in a very loose sense, we have taken a "worst· case" approach. The first thing to consider is that interaction is carried on by means of a dialogue. This implies the existence of a language known to both parties. The question is-what should be the scope of reference of this language? Should it be the mechanisms of computing? Or the functioning of the interactive device? Or the topics which give rise to the body of information pertinent to the problem to be solved? Ideally, one should not need to be concerned with computing mechanisms or interactive devices, but only with information relevant to the problem. Practically, one may want or need at least initial and, perhaps, refresher information on mechanisms and devices. One can then conclude that the principal concern of the language should be the topics which relate to the problem. The discourse should permit tutorial modes or inquiry dialogues on other issues only at the specific request of the user. 
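A minimal sketch of the dialogue discipline just described: discourse defaults to the topics of the problem, and tutorial or inquiry side-dialogues are entered only at the user's explicit request. The routine and the keywords are hypothetical, not an interface defined in the paper.

    def route_request(entry: str):
        """Decide which dialogue handles one user entry.

        Problem-topic discourse is the default; tutorial and inquiry modes are
        opened only when the user explicitly asks for them.
        """
        stripped = entry.strip()
        lowered = stripped.lower()
        if lowered.startswith("tutorial"):
            return ("tutorial", stripped[len("tutorial"):].strip())
        if lowered.startswith("inquire"):
            return ("inquiry", stripped[len("inquire"):].strip())
        return ("problem-topic", stripped)

    # The loop never volunteers a tutorial; it only enters one when asked.
    for entry in ["plot stress versus load", "tutorial light-pen usage", "inquire what solvers exist"]:
        mode, body = route_request(entry)
        print(mode, "->", body)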
Raphael's16 work and that of others have established a solid foundation for the inquiry capability. But, what of the problem solving topics? Should a separate language exist for each one? Could that be feasible as the domain of interactive problem solving expands? Clearly, it is not even feasible with today's primitive use. Tomorrow's uses will make this matter worse. It may be equally unreasonable to expect that machine systems can be provided with a human's linguistic faculty for some time. However, there are at least two feasible approximations. The first is exemplified by MATHLAB.17 In this effort, the machine is being programmed with the rules of analytical mathematics. Then the user interactively writes a mathematical equation on a machine sensible surface, the equation is solved analytically by the machine and the graph of the solution is displayed on an interactive graphics device. The process also requires that the machine is programmed to recognize handwritten entries. It does this task imperfectly and has to be corrected through re-entry of the symbols. The sensible input surface and the visible output surface together form the interactive mechanism of feedback until man and machine have reached agreement. A related 25 example is the "turtle" language of Papert'slS Mathland. This first type of approximation provides a linguistic mechanism for a broad topic of discourse and in addition, provides an interactive feedback mechanism that allows man and machine to work out misunderstandings in the dialogue. The second approximation is exemplified by the work of Pendergraft. lu9 In this effort, the main concern became the evolution of computer-based, linguistic systems of a certain kind-a semiotic system. These systems are based on semiotics, the science of linguistic and other signs and how they are used (first identified by Charles Sanders Pierce and later elaborated upon by Morris 21). "A semoitic system can be precisely specified as a system of acts rather than of things. Such a specification describes what the system does, not what it is in a physical sense. The specification of acts consists of two basic parts: "(a) Potential acts. Individually, these may be thought of as mechanical analogues of habits. Collectively, they constitute a body of knowledge delimiting what the system can do, what is within its competence. "(b) Actual acts. These are individually the analogues of behaviors realizing the potential acts or habits. They relate to one another within a single taxonomic structure that centers on a history of the success or failure of elementary senso-motor behaviors, or inferred projections of that history. Together and in their relations, these actual acts constitute a pattern of experience delimiting what the system observes in the present, remembers of the past, or anticipates for the future. "Among the potential acts, there must be certain acts which determine how the current pattern of experience is to be "deduced" from the current body of knowledge. The very realization of habits as behaviors depends upon this logical part of the specification. Deductive behaviors realizing these logical habits themselves, may appear in the experimental pattern being constructed; usually the system will not be aware of its logical behaviors. 19 An automatic classification system was defined and constructed 11 that provided the mechanism for the unifying taxonomic structure. It was demonstrated to be capable of assessing probability of class membership for an occurrence. 
It also was demonstrated to be capable of detecting the need and carrying out the effort to reclassify the data base upon the occurrence of a "misfit."

This second type of approximation provides a mechanism for man and machine to interactively teach each other what is relevant in their dialogue. It also provides a capability for both partners to learn useful lessons for the problem solving activity based on their actual history of success and failure. This latter point is particularly relevant for the situation when another user wishes to do problem solving in an area in which the system has already had some "experience."

In summary, the interactive access mechanisms for problem solving ought to have the following performance characteristics. The mechanisms should be oriented to discourse on the subject of the problem. Recourse to subsidiary dialogues, e.g., tutorial, inquiry, etc., at the request of the human, should be provided to facilitate the operation of the mechanism by the user. The mechanisms should bear the burden of trying to deduce the system implications of a human's service request, rather than the human needing to direct or set up the implementation of service in response to the request. Use of the wide bandwidth channel provided by interactive graphics for man-machine communication at a rate comfortable to the human is the concluding feature of this characterization.

Interactive graphics system functions and relations

Figure 1 illustrates the relations that exist among the five essential functions in any interactive graphics system.

[Figure 1. Graphics system functions and relations: the application, human input, human output, storage, and graphics functions, connected by data flows.]

The human output function presents images in a form compatible with human viewing and cognition. The human input function mechanizes requests for attention and provides a means for entry of position, notation or transition data that will affect the graphics and other processes. The storage function retains images or their coded abstractions for subsequent processing. The application function is the set of higher order, "black box" processes that will utilize information inherent in images as input data. Finally, the graphics function performs the three types of sub-processes that are the heart of the graphics process. One type of sub-process provides for composition, construction or formation of images. The second type of sub-process provides for manipulation or transformation of images. The third type of sub-process (attention handling) links the composition and/or manipulation sub-processes to interaction, intervention or use by higher order processes being performed either by the application function or by the user.

Notice that the relations between the graphics system functions are ones of data flow within an overall system. We observe the following. The data that flows from one function of the system to another can always be envisioned as having at least two distinct types of components. One component type contains information about variables and their values. The other component type contains information about the identity and parameters of the function that is to utilize the variable data. Considered this way, inter-function data are messages between elements of the graphics system. They convey service requests for specified functions. Message structured service requests of a limited kind for computer graphics have been considered in the environment of distributed resource, network-oriented systems.20 In order to successfully support interactive graphics access in a network environment, careful distribution of the graphics system functions must be accomplished within the network facilities. And equally important, the relationships between graphics system functions must be preserved.

DESIGN TERMINAL

One approach for network-oriented, interactive graphics is illustrated by a Design Terminal configuration13 under development at the author's installation. Some limited experience with it in conjunction with network access has been gained.22 The Design Terminal is basically a multi-console terminal with the ability to independently and concurrently interconnect graphics consoles, graphics functions and network-based application and storage functions.

Objectives

Two issues have motivated the development of the Design Terminal. First, in certain types of installations, there is considered to be a need to insure that interactive facilities for problem solving and design specification are functionally matched and economically operated with a multi-source capability from hardware suppliers. This issue involves concern for the relation between types of display mechanisms (e.g. refresh CRT's, DVST's, printer/plotters, etc.), the types of graphics use and the probability of occurrence and volume need for each type of use. Second, for an installation that requires many intelligent terminals, there is the concern for the total system implementation and support costs.

The solution that is being pursued is a little farther around the "wheel of reincarnation"23 than other related configurations. A general purpose mini-processor and a special purpose display processor form the heart of a terminal with remote access to many shared computers. The general purpose mini-processor is being multi-programmed in a certain way to support concurrent, independent graphics activities emanating from the terminal. The main thrust is to expand the number of concurrently active graphics consoles at the terminal so as to achieve a satisfactory distribution of the total cost of the terminal over each concurrently active console. Figure 2 illustrates the test bed on which this effort is being conducted.

[Figure 2. MEDEA Project experimental test bed.]

The Design Terminal configuration is concerned with providing an interactive graphics terminal with the following capabilities: (a) in its simplest form, it is a single-console intelligent terminal; (b) the local mini-processor and special purpose processor facilities for providing the graphics function are shared by many interactive graphics consoles; (c) the graphics consoles currently active may employ widely different display mechanisms; (d) a majority of all the connected consoles can be concurrently active; (e) the current use of each active console can involve display of different images than those being generated for all other active consoles at the terminal; (f) the terminal can concurrently obtain support for the graphics system application and storage functions from more than one shared main computer system; and (g) the current use of each active console can involve a different application on a different shared main computer than is involved for all other active consoles at the terminal.

The distribution of graphics system functions for the Design Terminal configuration is illustrated in Figure 3.

[Figure 3. Design Terminal configuration: graphics functions shared at the terminal, with application and storage functions on shared main computers.]
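The observation above, that inter-function data always carry a variable-data component and a component identifying the function that is to use it, amounts to a message format for service requests among distributed graphics-system functions. A minimal sketch follows; the field names and the example request are assumptions for illustration only.

    from dataclasses import dataclass
    from typing import Any, Dict

    @dataclass
    class ServiceRequest:
        """One inter-function message within the distributed graphics system."""
        target_function: str         # identity of the function that will use the data
        parameters: Dict[str, Any]   # parameters governing that function's invocation
        variables: Dict[str, Any]    # the variable data itself (names and values)

    # Example: the graphics function asks a network-based application function
    # to fit a curve, passing the sampled points as the variable component.
    request = ServiceRequest(
        target_function="curve-fit",
        parameters={"degree": 3, "reply_to": "console-2"},
        variables={"x": [0.0, 1.0, 2.0, 3.0], "y": [0.1, 0.9, 4.2, 8.8]},
    )
    print(request.target_function, request.parameters, request.variables)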
Experience

Although the development of the Design Terminal is still incomplete, our experience so far has provided insight into the difficulties of problem solving on network-based, virtual systems through interactive graphics.

The first point is not new. It is yet another confirmation of something well known to the computing profession and industry. Most users of computing, particularly in an interactive environment, cannot afford to be bogged down in the mechanics of the computer system. They certainly don't care about the subtle intricacies or enduring truths and beauties of the system that turn on its builders and masters. Therefore, the intricate knowledge about access to and use of distributed resources must somehow be built in to the system.

The second point is also not completely unknown. The telecommunications people have been considering alternatives for a long time. The efforts of Hambrock, et al.,24 and Baran25 are two of the most significant to our current situation. In a large, dynamic and expanding network, one cannot maintain deterministic directories of every possible combination of resources and associated interconnection schemes that are being used or can be expected to be used. The transmission facilities would be jammed with up-date traffic and the resource processors would be heavily burdened with directory maintenance.

For the user and the system builder, this point appears to raise a paradox. The user doesn't want to and can't manage to know everything about the network. And the maintenance of directories within the network would impose a severe utilization of scarce resources. The approach of the ARPANET represents an interim compromise for this problem, based upon Baran's work on distributed communication systems. However, the user is still required to know a great deal about the use of each resource in the network even though the communications problem is taken care of for him. For each resource of interest: (a) he must know that the resource exists; (b) he must know where within the net it is located; and (c) he must know the usage procedure required by the processor of that resource. He may be required to know much more. For users of interactive mechanisms with extreme accessibility provided for by the Design Terminal type configuration, this approach to locating and specifying essential resources is especially troublesome.

The conclusion we draw toward the problem we pose is that the resource directory function cannot be built into either the resource processors or the interconnection facilities of the network. We also conclude that attempts to moderate search traffic loading in random techniques24 and relieve switching bottlenecks can be successful6 provided the search criteria and routing mechanism are carefully defined.

There is one more point to be considered. It is raised by the cost of computer software development and the growing diversity of available computing resources. We cannot afford and shall not be able to afford explicit re-design of resource linkages each time a new, useful combination is devised that provides additional higher level capability. The interactive access mechanisms to network-based virtual systems must provide or be able to call upon a generalized, dynamic linking function.
This function will link together distributed resource modules it has determined to be available and to form the appropriate basis for satisfying an interactive service request. HIERARCHICAL ACCESS TO VIRTUAL SYSTEMS VIA INTERACTIVE GRAPHICS Cherry27 has observed "that recognition, interpreted as response according to habits, depends upon the past experience from which an individual acquires his particular habits." Although his interest was human communication, one should recall the earlier discussion on basic concepts. In particular, consider Cherry's observation in the context of adaptive, use-directed specification and the extensions and amplification of man's basic abilities as a result of problem solving through interactive computing mechanisms. In this context, this observation provides a significant suggestion toward possible answers to many of the difficulties cited above. Semiotic coupler function In the linguistic environment, Pendergraft 11 characterized a self-regulating system with goal-directed behavior in terms useful to our purpose. A hierarchy of processes was described. The perception process tries to recognize input data in terms of familiar attributes. The symbolization process assigns identifiers to the recognized data. The valuation process associates the assigned symbols to processes that effect the response of the system to input data. It does so in terms of the system's current knowledge of pragmatic values that will satisfy its goal directed performance. For problem solving through interactive graphics access to distributed resource networks, the goal would be for the system to correctly determine and cause the interconnection of a virtual system necessary to satisfy each interactive service request. Correctness would be a probablistic measure that would improve with experience for a given problem solving area. The input data to the perception process would be the image data and/ or symbol string that specifies the interactive service request. The output of the perception process would be the syntactic parsing of the service request over the language of service requests. The perception process also operates on a probablistic basis derived from experience. The input data to the symbolization process would be the identification of processing functions that are required to satisfy the service request. The output of the symbolization process would be the identification of the network's known distributed resources that must be assembled into a virtual system to carry out the processing functions. Again, .the performance of the symboiiza- tion process will improve as experience increases with both the problem solving topic and the network resources. In situations where processing functions are specified for which network resources are unknown or unavailable, two options exist. Either the symbolization process appro xi mates the function in terms of known resources or the network is searched. The input data to the valuation process would be the identification of resource modules that will be called upon to satisfy the service request. The output of the valuation process would be the identification of the processing sequence and data flow relationships that must exist amongst the activated resource modules. This valuation process is extremely dependent on experience for improved performance in a given problem solving area. Adaptive classifier function Each of the preceding processes depends upon feedback from experience to improve performance. 
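A minimal sketch of the three coupler processes just described: perception parsing a service request into required processing functions, symbolization naming the known network resources for those functions, and valuation fixing the processing sequence. The toy lookup tables and names are assumptions for illustration; they are not drawn from Pendergraft's system or from the paper, and a real coupler would weight its choices probabilistically from experience.

    # Toy knowledge tables standing in for experience-derived, probabilistic knowledge.
    FUNCTIONS_FOR_REQUEST = {"plot stress curve": ["curve-fit", "display"]}
    RESOURCES_FOR_FUNCTION = {"curve-fit": "numerics-host", "display": "local-graphics"}

    def perceive(request: str) -> list:
        """Perception: parse the service request into required processing functions."""
        return FUNCTIONS_FOR_REQUEST.get(request, [])

    def symbolize(functions: list) -> list:
        """Symbolization: identify known network resources for each required function."""
        return [(f, RESOURCES_FOR_FUNCTION[f]) for f in functions if f in RESOURCES_FOR_FUNCTION]

    def valuate(bindings: list) -> list:
        """Valuation: fix the processing sequence and data-flow order among the resources."""
        return [{"step": i + 1, "function": f, "resource": r} for i, (f, r) in enumerate(bindings)]

    plan = valuate(symbolize(perceive("plot stress curve")))
    print(plan)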
Pendergraft suggests that the processes applied to stimulus information should also be applied to response information. 11 For us, this suggests that the user should interactively "grade" the system's performance. The system would then apply the three preceding processes to the "grading" information in order to adjust the probability assignments to the estimates of relationship and to the classification structure for service requests vs. processing functions. When "misfits" were identified and/ or when the probabilities computed for relationships dropped below some threshold, the classification structure would be recomputed. Pendergraft and Dale28 originally demonstrated the feasibility of this approach in the linguistic environment using a technique based on Needham'!=;29 theory of clumps. As new resources are added to the network, the processing functions that they provide are entered into the classification structure with some initial (perhaps standard) probability assignment for relations to all known types of service requests. These probabilities are then revised based upon the feedback from the user's grading of the system's performance. Use-directed specification function The user is of pivotal importance toward specifying service requests and generating performance grades for the system. Yet, as was indicated earlier, he must be capable of the actions without being required to have elaborate knowledge of the system's internal content, structure or mechanisms. It is in this area that interactive graphics plays a vital role in expediting the problem solving dialogue between man and machine. Consider the simple concept of a "menu," that is, a set of alternatives displayed to the user for his selection. In a complex problem solving area, the result of a user selection from one meHU cail lead It) the display of a :;uburdi Graphics, Problem Solving and Virtual Systems nate menu for further specification of the task to be performed. In effect, this process leads to a concept of the dialogue as a selection process of the alternative paths in trees of menus. We claim that generalized sub-trees can be devised for areas of problem solving methodology that can be parametrically instantiated to a given topic at run-time only guided by previous menu selections during the current session at the terminal. Furthermore, we claim that this sub-tree concept can be devised so as to allow a given subtree to be invoked from a variety of parent nodes in the specification process. Work on the Design Terminal includes an effort to implement this concept. The specification process cannot be fully accommodated by the mechanism of the parametric dialogue tree. Procedures illustrated by the MATHLAB techniques, the methods of Coons/o the Space Form 3l system and more conventional interactive graphics layout, drafting, and curve plotting techniques will all be required in addition to alphanumeric data entry in order to complete the specification. The point is that the semantics of these specifications, in terms of the problem solving processing functions that are required, will have been directed by the current use of the dialogue mechanism. One set of choices that is always displayed or selectable represents the user's evaluation alternatives of the system's performance. Another optional set is one that places the system in the role of a tutor, either for use of the system or for the use of a processing function to which the system provides access. Another set of options should also be callable. 
In this case, the user may want to access a specific processing function. He may not know its name or the location of the resources in the distributed system that support it. If he does have specific identification, he may use it. If he lacks an identifier, the user will generally know of some attribute of the process. Therefore, he should be able to enter a search mode in which he can construct the search criteria for the known attribute in whatever terms that the system supports. The set of use-directed specifications that are achieved in the above manner form the set of interactive service requests that are input to the semiotic coupler function. The selection of evaluation alternatives forms the feedback input to the adaptive classifier function. The Recognaton performs four principal functions. Three of them have already been described: semiotic coupling; adaptive classification; and use-directed specification. The fourth function generates messages into the distributed resource network. It uses the output data of the valuation process of the semiotic coupling function. These messages request assignment and activation of network resources according to the processing sequence and inter-process data flow requirements that were determined from the current status of the pragmatic value system. The messages are the immediate cause for the formation of the virtual system. The functions of the Recognaton represent a significant computational load at a rather sophisticated level. It is unlikely that the cost to implement this computation could be afforded for a single interactive graphics console. Therefore, we conclude that multiple inte-ractiv.e graphi-es consoles must be serviced by a given Recognaton. Figure 4 illustrates an access node with the Recognaton functions based on the configuration of the Design Terminal that was discussed earlier. SUMMARY This discussion began with a concern for an interactive mechanism to facilitate man-machine dialogues oriented to man's creative thought processes. The intent was to consider problem solving as a test-case, creative process with practical implications. The activity of the mechanism was taken to be to serve as a means for specification of and hierarchical access to virtual systems formed in a distributed resource computing network. A recognition automaton, the Recognaton, has been proposed which appears to serve the purpose and does not impose system level conflicts. Implementation of the Recognaton appears feasible as an extension of the Design Terminal multi-console, interactive graphics configuration. ~;:-~-------------------....... '~'---- - - - --- ~WJ.----i!:!!i:l~,:i" / \ ~ECOC;\:ITIO'\ A RECOGNITION AUTOMATON (RECOGNATON) We suggest that distributed resource computing networks should contain nodes of at least two distinct types. The first type is a service node at which the computing resources of the network are connected. The second type is the access node. It is to the access node that the user is connected through his interactive graphics console. "\Ve further suggest that the access node is the point in the network which implements the functions necessary to provide interactive hierarchical access to virtual systems in the network. We call the implementation vehicle at the access node a recognition automaton or Recognaton. 29 FU\CT I or-.. l \ /' CY----- / GRAPb.ICS FUNCT I 0\5 'ECvG'.A T0,\ CQ,\F I GURA T I or. Figure 4 J 30 National Computer Conference, 1973 REFERENCES 1. 
REFERENCES

1. Alexander, C., Notes on the Synthesis of Form, Harvard University Press, 1964.
2. Cave, W. C., Dunn, R. M., Saturation Processing: An Optimized Approach to a Modular Computing System, ECOM Technical Report 2636, US Army Electronics Command, November 1965.
3. Dunn, R. M., Gallagher, J. C., Hadden, D. R., Jr., Distributed Executive Control in a Class of Modular Multi-processor Computing Systems via a Priority Memory, ECOM Technical Report 2663, U.S. Army Electronics Command, January 1966.
4. Dunn, R. M., Modular Organization for Nodes in an Adaptive, Homeostatic Process-Oriented Communications System, ECOM Technical Report 2722, US Army Electronics Command, June 1966.
5. Dunn, R. M., "Homeostatic Organizations for Adaptive Parallel Processing Systems," Proceedings of the 1967 Army Numerical Analysis Conference, ARO-D Report 67-3, pp. 99-110.
6. Dunn, R. M., Threshold Functions as Decision Criteria in Saturation Signalling, ECOM Technical Report 2961, US Army Electronics Command, April 1968.
7. Roberts, L. G., Wessler, B. D., "Computer Network Development to Achieve Resource Sharing," Proceedings of the Spring Joint Computer Conference, 1970, pp. 543-549.
8. Walden, D. C., "A System for Interprocess Communications in a Resource Sharing Computer Network," Communications of the ACM, Vol. 15, No. 4, April 1972, pp. 221-230.
9. Mink, A., "A Possible Distributed Computing Network," Proceedings of the Computer Science Conference, Columbus, Ohio, February 1973.
10. Lay, M., Mills, D., Zelkowitz, M., "A Distributed Operating System for a Virtual Segment-Network," Proceedings of the AIAA Conference, Huntsville, Alabama, April 1973.
11. Pendergraft, E. D., Linguistic Information Processing and Dynamic Adaptive Data Base Management Studies, Final Report, Linguistic Research Center, University of Texas, Austin, November 1968, chap. 3.
12. Slagle, J. R., Artificial Intelligence: The Heuristic Programming Approach, McGraw-Hill Book Company, 1971.
13. Dunn, R. M., "Computer Graphics: Capabilities, Costs and Usefulness," Proceedings of SHARE XL, Denver, March 1973.
14. Boehm, B. W., Seven, M. J., Watson, R. A., "Interactive Problem Solving-An Experimental Study of 'lock-out' Effects," Proceedings of the SJCC, 1971, pp. 205-216.
15. Dunn, R. M., "Interactive Graphics: Stand-Alone, Terminal, Or?," Proceedings of IEEE INTERCON, March 1973.
16. Raphael, B., SIR: A Computer Program for Semantic Information Retrieval, Doctoral Dissertation, M.I.T., June 1964.
17. MATHLAB, Project MAC, M.I.T., Cambridge, Mass.
18. Papert, S., "The Computer as Supertoy," Spring Joint Computer Conference, 1972.
19. Pendergraft, E. D., Specialization of a Semiotic Theory, ECOM Technical Report 0559-1, Contract No. DAAB07-67-C-0569, TRACOR, Inc., Austin, Texas, February 1968.
20. Cotton, I., Level 0 Graphic Input Protocol, ARPANET Network Graphics Working Group, NIC #9929, May 1972.
21. Morris, C., Signs, Language and Behavior, Prentice-Hall, Inc., 1946.
22. Gray, B. H., "An Interactive Graphics Design Terminal Concept, Implementation, and Application to Structural Analysis," Proceedings of Fourth Annual Pittsburgh Modeling and Simulation Conference, April 1973.
23. Myer, T. H., Sutherland, I. E., "On the Design of Display Processors," Communications of the ACM, Vol. 11, No. 6, June 1968, pp. 410-414.
24. Hambrock, H., Svala, G. G., Prince, L. J., "Saturation Signaling Toward an Optimum Alternate Routing," 9th National Communication Symposium, Utica, N.Y., October 1963.
25. Baran, Paul, On Distributed Communications, Vols. I-X, Memorandum RM-3420-PR, The Rand Corporation, Santa Monica, August 1964.
26. Nehl, W., "Adaptive Routing in Military Communications Systems," Proceedings of the Military Electronics Conference, 1965.
27. Cherry, C., On Human Communication, Second Edition, The M.I.T. Press, 1966, p. 275.
28. Pendergraft, E., Dale, N., Automatic Linguistic Classification, Linguistics Research Center, University of Texas, Austin, 1965.
29. Needham, R., The Theory of Clumps II, Cambridge Language Research Unit, Cambridge, England, 1961.
30. Coons, S. A., Surfaces for Computer-Aided Design of Space Forms, Project MAC Technical Report MAC-TR-41, M.I.T., Cambridge, June 1967.
31. Carr, S. C., Geometric Modeling, RADC Technical Report TR-69-248, University of Utah, Salt Lake City, June 1969.

Performance determination-The selection of tools, if any

by THOMAS E. BELL

The Rand Corporation
Santa Monica, California

listing conditioned data, generating tentative reports, trying to employ them, and then revising the reports. The following summaries indicate some of the important analysis characteristics of simulation, accounting systems, and other available tools. Most of the technical details are omitted because they are only marginally relevant to this discussion.**

As interest in computer performance analysis has grown, the dedication of some analysts to a single tool and analysis approach appears to have become stronger. In most instances this affection probably comes from increased familiarity and success with an approach combined with a resulting lack of familiarity and success with other approaches. Other equally experienced analysts use a variety of approaches and tools, and may give the appearance that any tool can be used in any situation. Only a little experience is necessary, however, to conclude that personal inspection, accounting data, hardware monitors, software monitors, benchmarks, simulation models, and analytical models are not equally cost effective for performing general operations control, generating hypotheses for performance improvement, testing performance improvement hypotheses, changing equipment, sizing future systems, and designing hardware and/or software systems. The analyst new to performance analysis may become confused, discouraged, and eventually disinterested in the field as he attempts to start an effective effort. This paper attempts to aid him by presenting an overview. Tools are described; applications are listed; and important considerations are reviewed for selecting a tool for a specific application.

Personal inspection

Personal inspection can imply an uninspired glance at the machine room. This sort of activity often leads to beliefs about an installation based more on preconceived notions than on reality. This "tool" usually is employed in an "analysis" involving occasional glances at a machine room when the observer sees precisely what he expected to see (whether it's true or not, and often even in the face of significant, contrary evidence). Since the observer may only glance at the machine room for a few minutes two or three times per day, his sample of the day's operation is very incomplete. This type of performance analysis, although common, is without redeeming social value, and will not be considered further.

Other types of personal inspection are more valuable for performance analysis. Each time a piece of unit record equipment processes a record, it emits a sound.
The performance analyst can use this sound to roughly estimate activity and judge the occurrence of certain system-wide problems. For example, a multiprogrammed system may be experiencing severe disk contention in attempting to print spooled records. Quite often, this problem manifests itself in strongly synchronized printing from the several printers on a large system. As the disk head moves from track to track, first one then another printer operates. When one printer completes output for its job, the other printer(s) begins operating at a sharply increased rate. ALTERNATIVE TOOLS AND APPROACHES An analyst's approach to analyzing a system's performance can, to a large extent, be described by the tools he uses. For example, an analyst using simulation as his tool performs an analysis based on abstracting important characteristics, representing them correctly in a simulation, checking the results of his abstractions, and performing simulation experiments. * If he were analyzing accounting data, his procedure would probably involve ** See References 2 and 3 for some details. Further material can be found in References 4-7. A review of monitors available in 1970 appears in Reference 8. * More complete procedural suggestions for pursuing a simulation analysis can be found in Reference 1. 31 32 National Computer Conference, 1973 Multiple, rapidly spinning tapes and extremely active disk heads can, in some environments, indicate severe trouble. In other environments (where loads should be causing this kind of behavior), they may indicate a smoothly running system. Unfortunately, most installations fall somewhere between these two extremes, leaving analysts-and managers-with an amorphous feeling of unease. The clues from personal inspection can be valuable, but an experienced eye, accompanied with an equally experienced ear, is often necessary to make sense from the raw environment. Fortunately, alternatives are available. Accounting systems Accounting systems aggregate computer usage by task, job, or other unit of user-directed work. The primary objective of the accounting system's designer is cost allocation, which sometimes compromises the usefulness of accounting system data, particularly where overhead is involved.* Although accounting data can be deceptive, analysts can determine the actual data collection methods used and perform analyses based on a good understanding of potential errors.** Accounting data also have some distinct advantages for analyses. They are usually quite complete because they are retained for historical purposes, and changes in collection methods are well documented so that users can examine them for correctness. The data are collected about the system's work and organized in precisely the correct way to facilitate workload control-by requests for computer work (by job). In addition to serving as input for reports about computer component usage, accounting data (sometimes combined with operations logs) can be used to determine the use to which this activity was devoted. For example, a user would seldom be using a simulation language if he were involved in writing and running payroll programs, and simulation execution could, prima facie, be considered of negligible value to the organization in this circumstance. For most analysts, accounting data have the advantage of immediate availability, so analysis can begin without delays for acquisition of a tool. However, immediate data availability does not necessarily imply immediate usability. 
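To make the usability point concrete, a minimal sketch of the sort of conditioning step involved: reducing raw, per-task accounting records to per-job resource totals before any analysis begins. The record layout is a hypothetical simplification; real accounting files are far larger and messier.

    from collections import defaultdict

    # Hypothetical, already-parsed accounting records: (job name, CPU seconds, I/O operations).
    records = [
        ("PAYROLL01", 12.4, 3100),
        ("SIM_RUN_7", 88.0, 450),
        ("PAYROLL01", 3.1, 900),
    ]

    totals = defaultdict(lambda: {"cpu": 0.0, "io": 0})
    for job, cpu, io in records:
        totals[job]["cpu"] += cpu   # aggregate by the unit of user-directed work: the job
        totals[job]["io"] += io

    for job, t in sorted(totals.items()):
        print(f"{job:12s} cpu={t['cpu']:7.1f}s io={t['io']:6d}")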
Accounting systems are commonly very extensive, so analysts are often overwhelmed with the quantity of items collected and the number of incidences of each item. All these data are usually placed in poorly formatted records on a file along with irrelevant or redundant data. The data conditioning problem may therefore be a major hurdle for successful analysis. Inadequate documentation of the details of data collec* Determining system overhead is not trivial, and one of the least trivial problems is defining precisely what the term means. The appendix suggests some of its constituents. ** See Reference 9 for some useful techniques to employ a.ccounting rlHtIl. tion by manufacturers and inadequacies in the data collection (leading to variability in addition to significant bias) can confuse any analysis results unless the analyst is very careful. Monitors Performance monitors (whether implemented in hardware or software) are designed to produce data revealing the achieved performance of the system. These tools produce data, not understanding, so the analyst does not buy his way out of the need for thoughtful analysis when he purchases one. A hardware monitor obtains signals from a computer system under study through high-impedance probes attached directly to the computer's circuitry. The signals can usually be passed through logic patchboards to do logical ANDs, ORs, and so on, enabling the analyst to obtain signals when certain arbitrary, complex relationships exist. The signals are then fed to counters or timers. For example, an analyst with a hardware monitor could determine (1) the portion of CPt} time spent performing supervisory functions while only one channel is active, or (2) the number of times a channel becomes active during a certain period. Because hardware monitors can sense nearly any binary signal (within reason), they can be used with a variety of operating systems, and ~ven with machines built by different manufacturers. This capability to monitor any type of computer is usually not a critically important characteristic, because the analyst is usually concerned with only one family of computers. Some hardware monitors are discussed in References 2-5 and 10-14. The hardware monitor's primary disadvantage for analysis is its great flexibility. Analysts with extensive experience have learned the most important performance possibilities to investigate, but even the notes distributed by vendors of these monitors often prove inadequate for aiding the novice. Cases of wasted monitoring sessions and of monitors sitting in back rooms are seldom documented, but their validity is unquestionable. In some cases even hardware monitor vendors, while attempting to introduce clients to their monitors, have held a session on an unfamiliar machine and failed miserably. (In most cases, they have proved how valuable it is to have their expertise on hand to aid in performing a complex analysis with a minimum of fuss and bother.) Software monitors consist of code residing in the memory of the computer being monitored. This means that they can have access to the tables that operating systems maintain, and thereby collect data that are more familiar to the typical performance analyst. Since he usually was a programmer before he became an analyst, descriptions of data collection are often more meaningful to him than the descriptions of hardware monitor data collection points. 
In addition, most software monitors are designed to produce specific reports that the designers found to be particularly meaningful for the hardware! snft-wRrp ('omhinntinn hping monitorprl This recu('Ps thi" Performance Determination difficulty of analysis, particularly where the design of application jobs is under consideration. Hardware monitors, in systems where a program may reside in a variety of places, do not typically produce reports on individual problem performance that can be easily interpreted, but software monitors typically can. For more material on some software monitors see References 2, 3, 7, and 16-20. The answer to every analyst's problem is not a software monitor. Software monitors require a non-negligible amount of memory, often both central memory and rotating memory. In addition, some amount of I/O and CPU resources are necessary for operating the monitor. This all amounts to a degradation in system performance, and at the precise time when people are concerned with performance. As a result, the analyst needs to choose carefully how much data he will collect and over how long a period. This necessity adds to the analyst's problems~ and is usually resolved in favor of short runs. This, in turn, leads to data of questionable representativeness. Since computer system loads usually change radically from hour to hour, the analyst may be led to conclude that one of his changes has resulted in significant changes in performance, when the change actually resulted from different loads over the short periods of monitoring. The choice between hardware and software monitors (and between the subtypes of monitors in each groupsampling vs full time monitoring, separable vs integrated, recording vs concurrent data reduction, and so on) is largely dependent on situation-specific characteristics. Application in the specific situation usually involves monitoring the normal environment of the existing computer. An alternative exists: a controlled, or semi-controlled, environment can be created. This analysis approach is closely related to the use of batch benchmarks and artificial stimulation. Benchmarks A batch benchmark consists of a job, or series of jobs, that are run to establish a "benchmark" of the system performance. The benchmark run is usually assumed to be typical of the normal environment but to have the advantage of requiring a short time for execution. The most common use of benchmarks is for equipment selection, but analysts often use benchmarks for determining whether a change to their systems has improved the performance of the benchmark job stream. The conclusion about this performance (usually measured primarily by the elapsed time for execution) is then assumed to be directly related to the normal environment; an improvement of 20 percent in the benchmark's performance is assumed to presage an improvement of 20 percent in the real job stream's performance. Benchmark work is described in References 21 and 22. For an on-line system this technique would not be applicable because on-line jobs exist as loads on terminals rather than as code submitted by programmers. The analog to the benchmark job in a batch system is artifi- 33 cial stimulation in the on-line environment. Through either hardware or software techniques, the computer system is made to respond to pseudo-inputs and the response is measured. A stimulator, implemented in software, is described in Reference 23. 
The obvious difficulty in using batch or on-line benchmarking is relating the results to the real job stream. The temptation is to assume that jobs presented by users are "typical" and that the results will therefore be applicable to reality, or that the on-line work described by the users is actually what they do. Neither assumption is generally true. Running benchmarks or artificially stimulating a system implies some kind of measurement during a period of disrupting the system's operation; then the results must be related to reality. Performance modeling has the same difficulty in relating its results to reality, but it does not disrupt the system's operation.

Performance modeling

Simulation modeling of computer system performance has seemed an attractive technique to analysts for years, and it has been used in response to this feeling. An analyst may design his own simulation using one of the general or special purpose languages, or employ one of the packaged simulators on the market.* In either case, he can investigate a variety of alternative system configurations without disrupting the real system, and then examine the results of the simulated operation in great detail. Virtually all such simulations model the operation of the system through time, so time-related interactions can be thoroughly investigated. Some simulation experiences are described in References 29-35. Problems and objectives in simulating computers are described in Reference 36.

Analytical models are usually steady-state oriented, and therefore preclude time-related analysis. However, they usually do provide mean and variance statistics for analyses, so those analyses requiring steady-state solutions (e.g., most equipment selections) could employ the results of analytical modeling. Simulations, on the other hand, must be run for extensive periods to determine the same statistics, and analysts need to worry about problems like the degree to which an answer depends on a stream of random numbers. Examples of analytical modeling are given in References 37-42.

The problem that often proves overwhelming in using either type of modeling is ensuring that the model (the abstraction from reality) includes the most important performance-determining characteristics and interactions of the real system. Without this assurance, the model is usually without value. Unfortunately for the analyst, indication that a particular model was correct for another installation is no real assurance that it is correct for his installation. Unique performance determinants are usually found in operating system options, configuration details, and workload characteristics. Therefore, a validation exercise of a simulative or analytical model is usually a necessity if specific values of output parameters are to be used in an analysis.

* For information about such languages and packages see References 24-28.

APPLICATIONS

The applications of computer performance analysis can be categorized in a variety of ways, depending on the objective of the categorization. In this paper the objective is to aid in selecting an appropriate tool and analysis approach. The following categorization will therefore be adopted:

General control: Many installations are run by intuition. Freely translated, this means that they are not managed, but instead allowed to run without control. All attempts at other applications of performance analysis will be of marginal utility without control based on adequate operating information.
Hypothesis generation: Computer system performance improvement involves generating hypotheses, testing hypotheses, implementing appropriate changes, and testing the changes.* Useful information for hypothesis generation often appears so difficult to specify and obtain that random changes are attempted to improve the system. The failure rate for performance improvement efforts without explicit hypothesis generation is extremely high.

Hypothesis testing: Given an interesting hypothesis, an analyst's first impulse is to assume its correctness and begin changing the system. This usually results in lots of changes and little improvement. Hypothesis testing is imperative for consistently successful computer system performance improvement.

Equipment change: The friendly vendor salesman says his new super-belchfire system will solve all your problems. The change is too large to be classified as a performance improvement change. Should you take his word for it and make him rich, or do your own analysis? If you choose to do your own analysis, you're in this category when you're upgrading or downgrading your system.

Sizing: Sizing a new system is a step more difficult than equipment change because it often involves estimating workload and capacity in areas where extrapolation of existing characteristics is impossible or unreliable. This situation does not occur so often as equipment change, but usually involves much higher costs of incorrect decisions. Typical situations are bringing in a new computer for a conceptualized (but unrealized) workload, centralization of diverse workloads previously run on special purpose hardware/software systems, and decentralization of workload from a previously large system to a series of smaller ones. Vendor selection is included in this category since the performance-related part of this problem can be described as sizing and verifying (or merely verifying, in the case of some procurements) the performance of a certain size system.

System design: Whether dealing with hardware or software, designers today usually are concerned with performance. If the designers are in the application area, the concern for performance often comes too late for doing much about the mess. Early consideration, however, can be expensive and unfruitful if carried on without the proper approach.

* This is a summary of the steps suggested in Reference 43.

The easiest situation would be for each of these categories to have exactly one tool appropriate for application in analyses, but performance analysis has more dimensions than the single one of analysis objective. Two of the most important ones are analyst experience and type of system under consideration.

ANALYST EXPERIENCE

Some groups of analysts have considered single systems (or a single model of system processing essentially the same load at several sites) over a period of years. These groups have often developed simulation and analytical tools for one type of analysis, and then, with the tool developed and preliminary analysis already performed, applied them in other situations. Similarly, they may have used accounting data for a situation where it is particularly applicable, and then have applied it in an analysis in which accounting data's applicability is not obvious. The ability of group members to apply a variety of familiar tools freely in diverse situations is one of the reasons for maintaining such groups.
Some other groups have developed analysis techniques using a single tool to the extent that their members can apply it to a much wider variety of situations than expected because they have become particularly familiar with its characteristics and the behavior of the systems they are analyzing. As a result, such groups have proved able to enter strange installations and produce valuable results by immediately executing rather stylized analyses to check for the presence of certain hypothesized problems.

The analyst with less than one or two years of performance analysis experience, however, cannot expect to achieve the same results with these approaches. The remainder of this paper will consider the situation of the more typical analyst who is not yet extensively experienced.

TYPE OF SYSTEM

Software monitors are obviously commercially available for IBM System 360 and 370 computers, but their existence for other systems is often unrecognized. This sometimes leads analysts to believe that personal inspection is the only alternative for any other system. In fact, virtually every major computer system on the market currently possesses an accounting system; hardware monitors will work on any system (with the exception of certain very high speed circuits); batch benchmarks can be run on any system; and models can be constructed for any system (and have been for most). In addition, software monitors have been implemented for most computer systems in the course of government-sponsored research. The analyst's problem is to discover any required, obscure tools and to be able to use them without undue emphasis on learning the tools' characteristics.

The world of performance analysis tools is not so smooth as may be implied above. First, benchmarks for on-line systems are nearly impossible to obtain for any system without assistance of the computer system vendor. Second, many tools are simply no good; their implementers did a poor job, or they are poorly documented, or they don't do the thing needed for the problem at hand. Third, searching out the appropriate tool may require more time than the analyst can spend on the entire performance analysis. Fourth, the analyst seldom has prior knowledge of whether one of the first three problems will arise, so he doesn't know where to concentrate his search. Fortunately, any of several different types of tools can be used in most analyses, so the analyst can pick from several possibilities rather than search for some single possibility. The choice is largely dependent on the category of analysis being performed.

CATEGORY OF ANALYSIS

Having presented some important analysis characteristics of various tools, limited the discussion to apply only to analysts without extensive experience, and begged the question of tool availability, the important step remains of matching analysis objective with type of tool and analysis. Suggestions about tool selection for each analysis objective are given below; they should be interpreted in the context of the discussion above.

General control

Accounting data generally have proven most appropriate for general control. They are organized correctly for generating exception reports of system misuse by programmers (incorrect specification of job options, violating resource limitations, and running jobs inappropriate for their assigned tasks).
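As an illustration of the kind of exception reporting meant here, the following Python sketch scans accounting records for jobs that exceeded their specified resource limits. The record fields are invented for the example; real accounting files differ by manufacturer and installation.

# Each accounting record is assumed (for illustration only) to carry the
# job name, the programmer's charge number, CPU seconds used, and the
# CPU-second limit specified on the job card.
RECORDS = [
    {"job": "PAY123", "charge": "A17", "cpu_used": 310.0, "cpu_limit": 300.0},
    {"job": "INV042", "charge": "B02", "cpu_used": 12.5,  "cpu_limit": 600.0},
    {"job": "RPT007", "charge": "A17", "cpu_used": 95.0,  "cpu_limit": 60.0},
]

def exception_report(records, tolerance=1.0):
    """List jobs whose resource use exceeded the limits their submitters
    specified, one of the misuses an installation would want flagged."""
    return [r for r in records if r["cpu_used"] > tolerance * r["cpu_limit"]]

for r in exception_report(RECORDS):
    print(f'{r["job"]} (charge {r["charge"]}): '
          f'{r["cpu_used"]:.0f}s used against a {r["cpu_limit"]:.0f}s limit')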
Accounting data also usually provide valuable information about operations (number of reloads of the operating system, number of reruns, incidence of the system waiting for tape mounts, etc.). Further, they provide data on the level of chargeable resource utilization so that financial management can be performed. Accounting data's primary disadvantage is the difficulty of generating meaningful reports from them. They also require operators' adherence to appropriate standards of operation for maintaining reliable data. Further, they usually can provide reports no sooner than the following day. One alternative is to use a very inexpensive hardware monitor with dynamic output for on-line operational control and to use accounting data for normal reporting. (Regular use of monitors, perhaps one use per month, can also be adopted to supplement the accounting data.)

The most commonly used general control technique is one of the least useful. Personal inspection is inadequate for anything except the case of a manager continually on the floor, and he needs an adequate system of reporting to detect trends that are obscured by day-to-day problems. The techniques in this section may appear too obvious to be important, but we find that ignoring them is one of the most common causes of poor computer system performance.

Hypothesis generation

Hypothesis generation for system performance improvement is based on the free run of imagination over partly structured data, combined with the application of preliminary data analysis techniques. In general, the data leading to the most obvious relationships prove best, so personal inspection and partly reduced accounting data often are most useful. Quick scans of system activity, organized by job, often lead to hypotheses about user activity. An analyst can often hypothesize operational problems by visiting several other installations and trying to explain the differences he observes.

Some installations have found that regular use of a hardware or software monitor can lead to generating hypotheses reliably. The technique is to plot data over time and then attempt to explain all deviations from historical trends. This approach may have the advantage of hypothesis formulation based on the same data collection device that is used for hypothesis testing.

Hypothesis testing

Nearly any tool can be used for testing performance improvement hypotheses. The particular one chosen is usually based on the type of hypothesis and the tool used in generating the hypothesis. For example, hypotheses about internal processing inefficiencies in jobs can usually be best tested with software monitors designed to collect data on application code. Hypotheses about the allocation of resource use among programmers can usually be tested most readily through the use of accounting data. Simulative and analytical models can often be used to perform tests about machine scheduling and the trade-off of cost and performance, particularly when hypothesis generation employed modeling.

After a hypothesis is tested, implementation of a change is usually the next step. Following implementation, the resulting performance change requires examination to ensure that the expected change, and only that change, has occurred.
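A minimal sketch of that examination step appears below. The elapsed-time figures are invented; the point is the comparison of means against the day-to-day spread, which is exactly the load-variation caution raised earlier in connection with short monitoring runs.

import statistics

def summarize(label, samples):
    """Report mean and spread of a set of elapsed-time observations."""
    mean = statistics.mean(samples)
    spread = statistics.stdev(samples)
    print(f"{label}: mean {mean:.1f}s, std dev {spread:.1f}s, n={len(samples)}")
    return mean, spread

# Elapsed times (seconds) for the same benchmark job stream, measured
# before and after a tuning change; the figures are invented.
before = [612, 598, 640, 655, 601, 629]
after  = [540, 575, 530, 566, 549, 558]

m_before, s_before = summarize("before change", before)
m_after,  s_after  = summarize("after change", after)

improvement = 100.0 * (m_before - m_after) / m_before
print(f"apparent improvement: {improvement:.1f} percent")
# If the day-to-day spread is comparable to the apparent improvement,
# the "improvement" may only reflect a different load during the runs.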
Although the same tool may be used for both parts of hypothesis testing, employing a different tool provides the advantage of forcing the analyst to view the system from a slightly different point of view, and therefore reduces the chance of ignoring important clues in seemingly familiar data. This advantage must be traded off against the advantage of easing detection of perturbations in the familiar data caused by implemented changes.

A special note is in order for the use of benchmarks in hypothesis testing. If a hypothesis involves a characteristic of the basic system, a completely controlled test often can test the hypothesis far more thoroughly than other types of tests. For example, an analyst might hypothesize that his operating system was unable to initiate a high-priority job when an intermediate-priority job had control of the CPU. While he could monitor the normal system until the condition naturally occurred, a simple test with the appropriate benchmark jobs could readily test the hypothesis. We have found that artificial stimulation of on-line systems can similarly test hypotheses rapidly in both controlled tests and monitoring normal operation. The temptation to examine only "the normal system" should be resisted unless it proves to be the most appropriate testing technique.

Equipment change

Equipment change might involve upgrading the system's CPU, changing from slow disks to faster ones, or adding a terminal system. All these changes might be considered merely major tuning changes, but they involve enough financial risk that more analysis is devoted to them than normal system performance improvement efforts. In addition, the analyst has a very stylized type of hypothesis: how much performance change results from the hardware change? These special characteristics of equipment change lead to increased use of benchmarks and simulation.

When the alternative configuration exists at another installation (usually a vendor's facility), analysts can generate a series of benchmarks to determine how well the alternative performs in comparison with the existing system. Recently, synthetic benchmarks have come to be used more extensively in this process, particularly in test designs which examine particular system characteristics, or which include carefully monitoring normal system utilization to improve the meaningfulness of the benchmarks. In other cases there is no system available for running the benchmarks. Simulation is often employed in this environment. The most important problem in this type of analysis is ensuring the validity of the workload description on the alternative system and the validity of the alternative's processing characteristics. Unvalidated simulations may be the only reasonable alternative, but the risk of employing them is usually high.

Sizing

The technique used most commonly today in sizing computers is listening to vendor representatives and then deciding how much to discount their claims. This situation is partly the result of the difficulties involved in using the alternatives, benchmarking and modeling. Although analytical modeling is conceptually useful, its use in sizing operations has been minimal because its absolute accuracy is suspect. Simulative modeling appears less suspect because the models are closer to commonly-used descriptions of computer systems. The sensitivity of simulation models to changes in parameters can often be verified, at least qualitatively, so analysts can gain some degree of faith in their correctness.
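The kind of qualitative sensitivity check just mentioned can be illustrated with the simplest textbook analytical model, the single-server M/M/1 queue. This is an illustration only, not one of the models of References 37-42, and its absolute numbers should be treated with the same suspicion the text describes; its qualitative behavior is the point.

def mm1_mean_response(arrival_rate, service_rate):
    """Steady-state mean response time of the textbook M/M/1 queue:
    W = 1 / (mu - lambda), valid only while lambda < mu."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)

service_rate = 10.0  # jobs per second the server can complete (assumed)
for arrival_rate in (5.0, 7.0, 8.0, 9.0, 9.5):
    w = mm1_mean_response(arrival_rate, service_rate)
    print(f"utilization {arrival_rate / service_rate:.0%}: mean response {w:.2f}s")
# Response time grows sharply as utilization approaches 100 percent,
# the sort of qualitative sensitivity an analyst can check against
# intuition even when absolute accuracy is suspect.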
All the problems of using benchmarking in equipment change analyses are present when benchmarking is used in sizing analyses. In addition, the relationship of benchmarks to workloads that will appear on a future system is especially difficult to determine. A synthetic benchmark job might be quite adequate for representing workload meaningfully on a modification of the existing system, but its characteristics might be very wrong on a completely different system. (This same problem may be true for simulations, but a validated simulation should facilitate correct workload descriptions.)

Design

Tool selection in design must be divided into two parts: selection in the early design phase and selection in the implementation phase. In the earlier phase, performance analysis must be based on modeling because, without any implemented system, real data cannot be collected. The later phase might, therefore, seem particularly suited to the data collection approaches. In fact, modeling appears to be a good technique to employ in concert with monitored data in order to compare projections with realized performance. Collecting data without using modeling may decrease management control over development and decrease the ease of data interpretation. Design efforts can begin by using modeling exclusively, and then integrate monitoring into the collection of tools as their use becomes feasible.

FINAL COMMENT

Computer performance analysis tools and approaches are in a period of rapid development, so the appropriateness of their application in various situations can be expected to change. In addition, individual analysts often find that an unusual application of tools proves the best match to their particular abilities and problems. The suggestions above should therefore not be interpreted as proclamations of the best way to do performance analysis, but as general indications of potentially useful directions.

Inadequate understanding of computer system performance currently precludes quantifying problems across large numbers of systems. Each analyst must feel his way to a solution for each problem with only helpful hints for guidance. If improved understanding is developed, the artistic procedure discussed in this paper may evolve into a discipline in which analysts have the assurance they are using the correct approach to arrive at the correct answer.

REFERENCES

1. Morris, M. F., "Simulation as a Process," Simuletter, Vol. 4, No. 1, October 1972, pp. 10-21.
2. Bell, T. E., Computer Performance Analysis: Measurement Objectives and Tools, The Rand Corporation, R-584-NASA/PR, February 1971.
3. Canning, R. G. (ed.), "Savings from Performance Monitoring," EDP Analyzer, Vol. 10, No. 9, September 1972.
4. Murphy, R. W., "The System Logic and Usage Recorder," Proc. AFIPS 1969 Fall Joint Computer Conference, pp. 219-229.
5. Rock, D. J., Emerson, W. C., "A Hardware Instrumentation Approach to Evaluation of a Large Scale System," Proc. of The 24th National Conference (ACM), 1969, pp. 351-367.
6. Johnston, T. Y., "Hardware Versus Software Monitors," Proc. of SHARE XXXIV, Vol. I, March 1970, pp. 523-547.
7. Kolence, K. W., "A Software View of Measurement Tools," Datamation, Vol. 17, No. 1, January 1, 1971, pp. 32-38.
8. Hart, L. E., "The User's Guide to Evaluation Products," Datamation, Vol. 16, No. 17, December 15, 1970, pp. 32-35.
9. Watson, R. A., Computer Performance Analysis: Applications of Accounting Data, The Rand Corporation, R-573-NASA/PR, May 1971.
10. Bell, T. E., Computer Performance Analysis: Minicomputer-Based Hardware Monitoring, The Rand Corporation, R-696-PR, June 1972.
11. Estrin, G., Hopkins, D., Coggan, B., Crocker, S. D., "SNUPER Computer: A Computer in Instrumentation Automation," Proc. AFIPS 1967 Spring Joint Computer Conference, pp. 645-656.
12. Carlson, G., "A User's View of Hardware Performance Monitors," Proc. IFIP Congress 1971, Ljubljana, Yugoslavia.
13. Cockrum, J. S., Crockett, E. D., "Interpreting the Results of a Hardware Systems Monitor," Proc. 1971 Spring Joint Computer Conference, pp. 23-38.
14. Miller, E. F. Jr., Hughes, D. E., Bardens, J. A., An Experiment in Hardware Monitoring, General Research Corporation, RM-1517, July 1971.
15. Stevens, D. F., "On Overcoming High-Priority Paralysis in Multiprogramming Systems: A Case History," Comm. of The ACM, Vol. 11, No. 8, August 1968, pp. 539-541.
16. Bemer, R. W., Ellison, A. L., "Software Instrumentation Systems for Optimum Performance," Proc. IFIP Congress 1968, pp. 39-42.
17. Cantrell, H. N., Ellison, A. L., "Multiprogramming System Performance Measurement and Analysis," Proc. 1968 Spring Joint Computer Conference, pp. 213-221.
18. A Guide to Program Improvement with LEAP, Lambda LEAP Office, Arlington, Virginia (Brochure).
19. Systems Measurement Software (SMS/360) Problem Program Efficiency (PPE) Product Description, Boole and Babbage, Inc., Cupertino, California (Brochure).
20. Bookman, P. G., Brotman, G. A., Schmitt, K. L., "Use Measurement Engineering for Better System Performance," Computer Decisions, Vol. 4, No. 4, April 1972, pp. 28-32.
21. Hughes, J. H., The Construction and Use of a Tuned Benchmark for UNIVAC 1108 Performance Evaluation - An Interim Report, The Engineering Research Foundation at the Technical University of Norway, Trondheim, Norway, Project No. 140342, June 1971.
22. Ferrari, D., "Workload Characterization and Selection in Computer Performance Measurement," Computer, July/August 1972, pp. 18-24.
23. Load Generator System General Information Manual, Tesdata Systems Corporation, Chevy Chase, Maryland (Brochure).
24. Cohen, Leo J., "S3, The System and Software Simulator," Digest of The Second Conference on Applications of Simulation, New York, December 2-4, 1968, ACM et al., pp. 282-285.
25. Nielsen, Norman R., "ECSS: An Extendable Computer System Simulator," Proceedings, Third Conference on Applications of Simulation, Los Angeles, December 8-10, 1969, ACM et al., pp. 114-129.
26. Bairstow, J. N., "A Review of System Evaluation Packages," Computer Decisions, Vol. 2, No. 6, July 1970, p. 20.
27. Thompson, William C., "The Application of Simulation in Computer System Design and Optimization," Digest of The Second Conference on Applications of Simulation, New York, December 2-4, 1968, ACM et al., pp. 286-290.
28. Hutchinson, George K., Maguire, J. N., "Computer Systems Design and Analysis Through Simulation," Proceedings AFIPS 1965 Fall Joint Computer Conference, Part 1, pp. 161-167.
29. Bell, T. E., Modeling the Video Graphics System: Procedure and Model Description, The Rand Corporation, R-519-PR, December 1970.
30. Downs, H. R., Nielsen, N. R., and Watanabe, E. T., "Simulation of the ILLIAC IV-B6500 Real-Time Computing System," Proceedings, Fourth Conference on Applications of Simulation, December 9-11, 1970, ACM et al., pp. 207-212.
31. McCredie, John W. and Schlesinger, Steven J., "A Modular Simulation of TSS/360," Proceedings, Fourth Conference on Applications of Simulation, New York, December 9-11, 1970, ACM et al., pp. 201-206.
32. Anderson, H. A., "Simulation of the Time-Varying Load on Future Remote-Access Immediate-Response Computer Systems," Proceedings, Third Conference on Applications of Simulation, Los Angeles, December 8-10, 1969, ACM et al., pp. 142-164.
33. Frank, A. L., "The Use of Simulation in the Design of Information Systems," Digest of The Second Conference on Applications of Simulation, New York, December 2-4, 1968, ACM et al., pp. 87-88.
34. Dumas, Richard K., "The Effects of Program Segmentation on Job Completion Times in a Multiprocessor Computing System," Digest of The Second Conference on Applications of Simulation, New York, December 2-4, 1968, ACM et al., pp. 77-78.
35. Nielsen, Norman R., "An Analysis of Some Time-Sharing Techniques," Comm. of The ACM, Vol. 14, No. 2, February 1971, pp. 79-90.
36. Bell, T. E., Computer Performance Analysis: Objectives and Problems in Simulating Computers, The Rand Corporation, R-1051-PR, July 1972. (Also in Proc. 1972 Fall Joint Computer Conference, pp. 287-297.)
37. Chang, W., "Single-Server Queuing Processes in Computer Systems," IBM Systems Journal, Vol. 9, No. 1, 1970, pp. 37-71.
38. Coffman, E. G. Jr., Ryan, Thomas A. Jr., "A Study of Storage Partitioning Using a Mathematical Model of Locality," Comm. of The ACM, Vol. 15, No. 3, March 1972, pp. 185-190.
39. Abate, J., Dubner, H., and Weinberg, S. B., "Queuing Analysis of the IBM 2314 Disk Storage Facility," Journal of The ACM, Vol. 15, No. 4, October 1968, pp. 577-589.
40. Ramamoorthy, C. V., Chandy, K. M., "Optimization of Memory Hierarchies in Multiprogrammed Systems," Journal of The ACM, Vol. 17, No. 3, July 1970, pp. 426-445.
41. Kimbleton, S. R., "Core Complement Policies for Memory Allocation and Analysis," Proc. 1972 Fall Joint Computer Conference, pp. 1155-1162.
42. DeCegama, A., "A Methodology for Computer Model Building," Proc. 1972 Fall Joint Computer Conference, pp. 299-310.
43. Bell, T. E., Boehm, B. W., Watson, R. A., Computer Performance Analysis: Framework and Initial Phases for a Performance Improvement Effort, The Rand Corporation, R-549-1-PR, November 1972. (Much of this document appeared as "Framework and Initial Phases for Computer Performance Improvement," Proc. 1972 Fall Joint Computer Conference, pp. 1141-1154.)

APPENDIX-COMPUTER OVERHEAD

Accountancy requires that computer overhead costs be borne by users who are charged directly for their demands on the system. Data collection systems tend to include this requirement as a basic assumption underlying their structures. The resulting aggregation obscures the type of overhead most prominent in a system, the resources heavily used by overhead activities, and the portion of total system capability devoted to overhead activities. System analysis requires these data; they need definition and should be available for performance analysis.

From the viewpoint of performance analysis, at least five components of overhead can be identified in most multiprogramming systems. These are:

1. I/O handling
2. User resource request handling
3. System handling of spooled I/O
4. Job or sub-job (e.g., job step or activity) initiation/termination
5. System operation (including task switching, swapping, maintaining system files, etc.)

I/O handling may require large amounts of time, but this is largely controllable by the individual user. Item one, therefore, may not be a candidate for inclusion in a definition of overhead in many analyses.
User resource request handling (at least at the time of job or sub-job initiation) is similarly controllable by the users except for system-required resources (such as system files). Item two might be included in definitions more often than item one, particularly since item two is often influenced strongly by installation-specified practices (such as setting the number of required files).

System handling of spooled I/O is under the control of users to the extent that they do initial and final I/O, but the alternatives open to installation managements for influencing its efficiency are often very great. For example, changing blocking sizes or using an efficient spooling system (such as HASP) can have gross effects on the amount of resources consumed in the process. Installation management's control over this is so high that item three is often included in a definition of overhead.

Initiation and termination appear to consume far more resources than usually assumed. User-specified options influence the amount of resource usage, but installation-chosen options and installation-written code can impact usage to a large degree. The choice of specific operating system, order of searching files for stored programs, layout of system files, and options in the operating system can change the resources used to such an extent that item four should be included in overhead in nearly all cases.

System operation is always included as a part of overhead. Separating this element of overhead from all the rest is very difficult, so analyses usually assume that it is included as part of one of the other elements. One technique for quantifying its magnitude is to decide on the parts of code whose execution represent it and then to measure the size of these elements. The same parts of code can be monitored with a hardware monitor to determine the amount of processor time and I/O requests that arise from execution of the code. The sizes of system files are usually not difficult to obtain for determining the amount of rotating memory used by this type of overhead. This technique, however, will nearly always underestimate the amount of overhead since pieces of overhead are so scattered through the system.

Ideally, each of the types of overhead would be identified and measured so that installations could control the amount of each resource that is lost to it. If the resource loss to overhead were known for typical systems, each of the applications of performance analysis would be eased.

Computing societies-Resource or hobby?

by ANTHONY RALSTON
State University of New York
Buffalo, New York

ABSTRACT

The fodder for a technical society is people, but people can nevertheless use as well as be used by the society. Such use can be passive (e.g., publishing articles in the society's journals) or active through direct participation in the professional activities or administration of the society. As in the use of all computing resources, there is a potential for both profit and loss; these will be examined, in part at least, seriously.

Special Libraries Association Session

The special libraries association today

by E. A.
STRABLE
Special Libraries Association
New York, New York

ABSTRACT

Special librarians are part of the larger library community but can be differentiated from other groups of librarians (school, public, academic) by where they practice their profession, by the groups with whom they work, and most importantly, by their goals and objectives. The major objective, the utilization of knowledge for practical ends, brings special librarianship thoroughly into information processing in some unusual and unique ways. The Special Libraries Association is the largest society to which special librarians belong. The Association, like its members, is also involved in a number of activities which impinge directly upon, and affect, the role of information processing in the U.S.

Copyright problems in information processing

by B. H. WEIL
Esso Research and Engineering Company
Linden, New Jersey

ABSTRACT

Present copyright laws were developed largely to protect "authors" against large-scale piracy of books, articles, motion pictures, plays, music, and the like. These laws and related judicial decisions have in recent years raised serious questions as to the legality of such modern information processing as the photocopying, facsimile transmission, microfilming, and computer input and manipulation of copyrighted texts and data. Congress has so far failed to clarify these matters, except for sound recordings. It has proposed to have them studied by a National Commission, but it has repeatedly refused to establish this without also passing revisions chiefly dealing with cable-TV. Emphasis will be placed in this talk on consequences for libraries, library networks, and other information processors, and on recent legislative developments.

Standards for library information processing

by LOGAN C. COWGILL
Water Resources Scientific Information Center
Washington, D.C.
and DAVID L. WEISBROD
Yale University Library
New Haven, Connecticut

ABSTRACT

Technical standards will be described in terms of their intent, their variety (national, international, etc.), their enumeration, and their development process. Their importance will be evaluated in terms of their present and future usefulness and impact.

A network for computer users

by BRUCE K. ALCORN
Western Institute for Science and Technology
Durham, North Carolina

ABSTRACT

Computer networks are an accepted fact in the world of computing, and have been for some time. Not so well accepted, however, is the definition of a computer network. Some claim that to be a network the communications system must connect a group of computers as opposed to a network of terminals communicating with one computer. Still others hold that both are examples of computer networks; the first being a ring network and the latter a star network. Within education, computer networks of many descriptions exist. Most such activities have dealt with the institutions of higher education, but there are some notable exceptions. These networks are operated by universities, independent non-profit corporations, branches of state governments, and private industry. Some are time-sharing systems, some operate in the remote batch mode, and others offer both types of service. Most of the computing done through these networks has been for instructional purposes; however, a great many research problems are processed, with administrative applications last in amount of activity, although increasing.

During 1968 the National Science Foundation initiated a number of projects which gave a great impetus to computer networks, mainly among colleges and universities. This effort continues today in a different form through the Expanded Research Program Relative to a National Science Computer Network of the NSF. Currently the National Institute of Education is supporting the development of the Nationwide Educational Computer Service, a network designed to help colleges and school systems meet their computing needs at a minimum of cost. This network will consist of a large scale computer serving a series of intelligent terminals in institutions in various parts of the United States. The system is configured in such a way as to assist the student, faculty, and administrator at a cost effective rate. The factors involved in producing this saving include the particular hardware and software at the central site and at the terminal location, the mode of operation and the effective use of existing telecommunication facilities.

Uses of the computer in large school districts

by THOMAS J. McCONNELL, JR.
Director, Atlanta Public Schools
Atlanta, Georgia

ABSTRACT

In this age of accountability in education it is apparent that the most economical and efficient systems conceivable must be made available to the administrator. This fact is true at all levels of management from the classroom to the superintendent. Most large school districts could not perform all of the tasks required of them if they had to operate in a manual mode. This fact is certainly not unique to school districts but is a common problem of our dynamic society. The administrative use of the computer in most school districts came about as a result of a need for more efficient and faster methods of performing accounting functions. After their first introduction they generally just "growed" as Topsy would say. Most large school districts today will have a rather sophisticated set of hardware and software supported by a very fine staff of professionals. With the advent of tighter budget control, and with most educators today clamoring for some form of "program budgeting," the computer is an even more vital ingredient that is required if we are to provide for quality education.

Additionally, it is no longer sufficient to provide automation to the administrative functions in a school district. The computer is fast becoming an essential part of our instructional program. This instructional role of the computer is coming into being in the form of Computer Managed Instruction (CMI) as well as Computer Assisted Instruction (CAI). Although development of uses for the computer for instructional purposes has only been under way for a few short years, we have witnessed some very dramatic results. Most educators are in agreement as to the effectiveness of the computer for instructional purposes; the fact that it has not expanded as many had hoped and assumed is a function of finances rather than a shortcoming of the implementation.

Education can expect to have some very rewarding experiences in its relationship with the computer and the computer professional in the seventies. This fact will come about as a result of developments in computer technology, both in hardware and in software. Also, the reduction in the cost factor should be of such magnitude that computer services will be available to more school districts and at a cost that they can afford. With proper organization and cooperation the computer can begin to realize its full potential in bringing about efficient, effective education in its many aspects.
Association for Educational Data Systems Session

Training of teachers in computer usage

by DUANE E. RICHARDSON
ARIES Corporation
Minneapolis, Minnesota

ABSTRACT

I plan to discuss the need in teacher education for training and experience in the selection of instructional materials for use on computers and the teacher's role in helping to identify criteria for developing additional instructional materials. Specific discussion will be directed at describing a course which will guide teachers through the development of a set of criteria by which to judge the value of such instructional applications and will demonstrate how the criteria can be applied. The course will allow the teacher to practice application of the criteria to sample instructional uses from his particular interest.

How schools can use consultants

by DONALD R. THOMAS
Northwest Regional Educational Laboratory
Portland, Oregon

ABSTRACT

Data processing consulting firms today offer a variety of professional services to schools. Users of these services, however, often differ in their opinions of the value of these services. The point of this presentation is simply that unsatisfactory consultant relationships can have their source not only in the consultant himself, but also in the school's use of the consultant's services. In other words, use of consultive services implies a two-way relationship which is subject to misuse and abuse by either party. The experience throughout the educational computer area demonstrates that time and effort devoted to sound use of consultants will pay substantial dividends. That factor should be a major one in the planned use of a consultant. An experienced consultant will bring expertise to a study based upon his experiences with other clients. This should result in client confidence and in assuring that the unique needs of the clients will be identified and addressed.

NAPSS-like systems-Problems and prospects

by JOHN R. RICE
Purdue University
West Lafayette, Indiana

NAPSS-NUMERICAL ANALYSIS PROBLEM SOLVING SYSTEM

This paper arises from the development of NAPSS and discusses the problems solved and still to be solved in this area. The original paper contains two phrases which define the objectives of NAPSS (and NAPSS-like systems) in general, yet reasonably precise, terms:

"Our aim is to make the computer behave as if it had some of the knowledge, ability and insight of a professional numerical analyst."

"describe relatively complex problems in a simple mathematical language-including integration, differentiation, summation, matrix operations, algebraic and differential equations, polynomial and other approximations as part of the basic language."

A pilot system has been completed at Purdue and runs on a CDC 6500 with an Imlac graphics console. It does not contain all the features implied by these objectives, but it has (a) shown that such a system is feasible and (b) identified the difficult problem areas and provided insight for the design of a successful production system. The purpose of this paper is to identify the eight principal problem areas, discuss four of them very briefly and the other four in somewhat more detail. Several of these problem areas are specific to NAPSS-like systems, but others (including the two most difficult) are shared with a wide variety of software systems of a similar level and nature. The presentation here does not depend on a detailed knowledge of NAPSS, but specific details are given in the papers listed in References 1-7.

PROBLEM AREAS

The eight principal problem areas are listed below:

1. Language Design and Processing
2. Simulating Human Analysis (Artificial Intelligence)
3. Internal System Organization
4. User Interface
5. Numerical Analysis Polyalgorithms
6. Symbolic Problem Analysis
7. Operating System Interface
8. Portability

The first of these is the best understood and least difficult at the present time. The next four are very substantial problem areas, but the pilot NAPSS system shows that one can obtain acceptable performance and results. Symbolic problem analysis (except for things like symbolic differentiation) is not made in the pilot NAPSS system and, in a numerical analysis context, this is an undeveloped area. The interface with the operating system is very complex in the pilot system and is an area of unsolved problems. Basically the pilot NAPSS system needs more resources than the operating system (which is indifferent to NAPSS) provides for one user. The final problem area, portability, is as difficult for NAPSS as for other complex software systems.

All of these problem areas except 5 and 6 are present in any problem solving system with a level of performance and scope similar to NAPSS. Examples include statistical systems (BMD, SPSS, OSIRIS); linear programming and optimization packages (LP90, OPTIMA); engineering problem solving systems (COGO, NASTRAN, ICES) and so forth. There is considerable variation in the present characteristics of these systems, but they have as an ultimate goal to provide a very high level system involving many built-in problem solving procedures of a substantial nature.

Only a brief discussion of the first four problem areas is presented here because they are either less difficult or already widely discussed in the literature. The next two problem areas are specific to NAPSS-like systems and several pertinent points are discussed which have arisen from an analysis of the pilot NAPSS system. The final two are widely discussed in the literature but still very difficult for a NAPSS-like system. Some alternatives are presented, but the best (or even a good) one is still to be determined.

LANGUAGE DESIGN AND PROCESSING

Ordinary mathematics is the language of NAPSS modulo the requirement of a linear notation. While this linear notation does lead to some unfamiliar expressions, it is not an important constraint. NAPSS has also included a number of conventions from mathematics that are not normally found in programming languages (e.g., I = 2, 4, ..., N). Incremental compilation is used to obtain an internal text which is then executed incrementally. This approach is, of course, due to the interactive nature of NAPSS. The creation of this language processor was a very substantial task. The primary difficulties are associated with the fact that all variables are dynamic in type and structure, function variables are data structures (not program structures) and that many operators are quite large and complex due in part to the wide variety of operand types. For example, several integrals may appear in one assignment statement and each of these may lead to interaction with the user. The current NAPSS language processor is relatively slow, partly because of the nature of the language, partly because of the incremental and interactive approach and partly because it is the first one. However, it performs well enough to show that it is not a major barrier to obtaining an acceptable production system.
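As one small illustration of such a mathematical convention, the progression I = 2, 4, ..., N can be expanded mechanically from its first two terms and its final bound. The Python sketch below shows the rule a language processor might apply; it is an illustration only, not NAPSS code or the pilot system's actual algorithm.

def sequence(first, second, last):
    """Expand the mathematical convention 'first, second, ..., last'
    into an explicit arithmetic progression, e.g. 2, 4, ..., N."""
    step = second - first
    if step == 0:
        raise ValueError("step must be nonzero")
    values, v = [], first
    while (step > 0 and v <= last) or (step < 0 and v >= last):
        values.append(v)
        v += step
    return values

print(sequence(2, 4, 12))  # [2, 4, 6, 8, 10, 12], i.e. I = 2, 4, ..., N with N = 12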
SIMULATING HUMAN ANALYSIS (ARTIFICIAL INTELLIGENCE)

The original objectives include a large component of automatic problem solving in the NAPSS system. This component lies primarily in the polyalgorithms and manifests itself in two ways. First, there are facilities to analyze the problem at hand and to select an appropriate numerical analysis technique. This analysis continues during the computation and specific techniques may be changed several times during the execution of a polyalgorithm. The second manifestation is in incorporating common sense into the polyalgorithms. This is both difficult and time consuming as it requires a large number of logical decisions and the collection and retention of a large amount of information about the history of the polyalgorithm's execution.

The automation of problem solving in NAPSS-like systems leads inevitably to large codes for the numerical analysis procedures. A routine using the secant method may take a few dozen Fortran statements, but a robust, flexible and convenient nonlinear equation polyalgorithm requires many hundreds of statements.

INTERNAL SYSTEM ORGANIZATION

NAPSS and NAPSS-like systems are inherently large. The compiler, interpreter, command processor and supervisor are all substantial programs. A polyalgorithm for one operator like integration or solution of nonlinear equations can easily run to 1000 lines of Fortran code. Data structures created during executions may also be quite large (e.g., matrices and arrays of functions). The organization of this system naturally depends on the hardware and operating system environment. The current pilot system is organized with three levels of overlays with a paging system and runs in a memory of about 16,000 words (CDC 6500 words have ten characters or 60 bits or multiple instructions). Many other configurations are feasible and this area does not, in itself, pose a major barrier to an acceptable production system.

Currently NAPSS performs quite well provided one ignores operating system influences, i.e., when NAPSS is the only program running. However, it is unrealistic to assume that NAPSS-like systems have large computers dedicated to them or that the operating system gives them preferential treatment (compared to other interactive systems in a multiprogramming environment). Thus the internal system organization is determined by outside factors, primarily by the requirements of the interface with the operating system.

USER INTERFACE

A key objective of NAPSS-like systems is to provide natural and convenient operation. This means that substantial investment must be made in good diagnostic messages, editing facilities, console, and library and file storage facilities. These requirements for a NAPSS-like system are similar to those of a variety of systems. The pilot NAPSS system does not have all these facilities adequately developed; the primary effort was on editing, and an operating system might provide many of these facilities in some cases.

A more novel requirement here is the need for access to a lower level language like Fortran. Note that applications of NAPSS-like systems can easily lead to very substantial computations. The intent is for these computations to be done by the polyalgorithms where considerable attention is paid to achieving efficiency. Inevitably there will be problems where these polyalgorithms are either inapplicable or ineffective.
Detailed numerical analysis procedures (e.g., Gauss elimination) are very inefficient if directly programmed in NAPSS and thus some outlet to a language with more efficient execution is needed. In such a situation, NAPSS is a problem definition and management system for a user provided numerical analysis procedure. There are several limits on this possibility due to differences in data structures and other internal features. An analysis of the pilot NAPSS system indicates, however, that a useful form of this facility can be provided with a reasonable effort.

NUMERICAL ANALYSIS

A NAPSS-like system requires at least ten substantial numerical analysis procedures:

1. Integration
2. Differentiation
3. Summation of infinite series
4. Solution of linear systems (and related items like matrix inverses and determinants)
5. Matrix eigenvalues
6. Interpolation
7. Least squares approximation (of various types)
8. Solution of nonlinear equations
9. Polynomial zeros
10. Solution of ordinary differential equations

The objective is to automate these numerical analysis procedures so that a user can have statements like:

ANS ← ∫F(X), (X←A TO B)
EQ2: X↑2*COS(X) - F(X)/(1+X) = A*B/2
G(X) = F(X-ANS)/(1+X)
SOLVE EQ2 FOR X
EQ3: Y''(T) - COS(T)Y'(T) + TY(T) = G(T-X) - ANS
SOLVE EQ3 FOR Y(T) ON (0,2) WITH Y(0)←0, Y(2)←3

The user is to have confidence that either these procedures are carried out accurately or that an error message is produced. These procedures grow rapidly in size as one perfects the polyalgorithms. One polyalgorithm developed for the current NAPSS system is about 2500 Fortran statements (including comments). This large size does not come from the numerical analysis, which constitutes perhaps 20 percent of the program. It comes from simulation of common sense (which requires numerous logical decisions and associated pieces of information), the extensive communication facilities for interaction with the user and various procedures for searching, checking accuracy and so forth, aimed at providing robustness and reliability.

A further source of complexity is the fact that all of these polyalgorithms must automatically interface. Thus we must be able to interpolate or integrate a function created by the differential equation solver as a tabulated function (or table valued function), one of the function types provided in NAPSS. Since this output is not the most convenient (or even reasonable) input to an integration polyalgorithm, one must make a special provision for this interface. For example, in this case NAPSS could have a completely separate polyalgorithm for integrating table valued functions or it could use a local interpolation scheme to obtain values for the usual polyalgorithm. The latter approach is taken by the pilot NAPSS system.
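That interfacing idea, a table-valued function made to look like an ordinary function through local interpolation so that a standard integration routine can consume it, can be sketched as follows. This is an illustration in Python, not the pilot system's Fortran; the linear interpolation and trapezoidal rule are stand-ins for whatever schemes a production polyalgorithm would actually use.

from bisect import bisect_left

def tabulated(xs, ys):
    """Wrap a table-valued function (e.g. ODE-solver output) as a callable
    by local (piecewise linear) interpolation between tabulated points."""
    def f(x):
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x outside the tabulated range")
        i = max(1, bisect_left(xs, x))
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return f

def integrate(f, a, b, n=200):
    """Composite trapezoidal rule, standing in for the usual integration
    polyalgorithm that expects an ordinary callable function."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h

# A coarse table of y(x) = x**2, as an ODE solver might have produced it.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [x * x for x in xs]
print(integrate(tabulated(xs, ys), 0.0, 1.0))
# prints about 0.344; the exact integral is 1/3, and the difference
# comes from the coarseness of the table, not the quadrature rule.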
In addition to numerical analysis procedures, NAPSS currently has a symbolic differentiation procedure, and numerical differentiation is only used as a back-up for those functions which cannot be differentiated symbolically (e.g., the gamma function). One may use Leibniz rules for differentiating integrals, and piecewise symbolically differentiable functions may be handled symbolically, so the numerical back-up procedure is infrequently used. It is noted below that future versions of NAPSS should have more function types and that there should be considerably more symbolic analysis of the program. If these features are added, then a number of additional symbolic procedures must also be added, including at least a reasonable symbolic integration procedure.

NAPSS currently has two basic types of functions, the symbolic function and the tabulated (or discrete) function. (There are also the standard built-in functions.) Both of these may internally generate substantial data structures. Consider, for example, the result of a very common process, Gram-Schmidt orthogonalization. The program in NAPSS may well appear as:

/* DEFINE QUADRATIC SPLINE */
Q(X) ← X↑2 FOR X >= 0
S(X) ← .5(Q(X) - 3Q(X-1) + 3Q(X-2) - Q(X-3))
/* FIRST THREE LEGENDRE POLYNOMIALS */
B(X)[1] ← 1, B(X)[2] ← X, B(X)[3] ← 1.5X↑2 - .5
/* GRAM-SCHMIDT FOR ORTHOGONAL BASIS */
FOR K ← 3, 4, ..., 21 DO
  T ← (K-2)/10 - 1,
  TEMP(X) ← S((X-T)/10)
  TEMP(X) ← TEMP(X) - SUM((∫TEMP(Y)B(Y)[J], (Y←-1 TO 1)), J←1, ..., K-1)
  B(X)[K] ← TEMP(X)/(∫TEMP(Y)↑2, (Y←-1 TO 1))↑.5;

The result is an array B(X)[K] of 22 quadratic spline functions orthogonal on the interval [-1,1]. These statements currently cause NAPSS difficulty because all of the functions are maintained internally in a form of symbolic text. By the time the 22nd function is defined the total amount of the text is quite large (particularly since S(X) is defined piecewise) and the evaluation time is also large because of the recursive nature of the definition. The integrations are, of course, carried out and constant values obtained. This difficulty may be circumvented in the pilot NAPSS by changing the code, and a non-recursive text representation scheme has been found (but not implemented) which materially reduces the evaluation time in such situations. These remedies, however, do not face up to the crux of the problem, namely that many computations involve manipulations in finite dimensional function spaces. NAPSS should have a facility for such functions and incorporate appropriate procedures for these manipulations. This, of course, adds to the complexity of the language processor, but it allows significant (sometimes by orders of magnitude) reductions in the size of the data structures generated for new functions and in the amount of time required for execution in such problems. Once this type of function is introduced, then it is natural to simultaneously identify the case of polynomials as a separate function type. Again NAPSS would need appropriate manipulation procedures, but then even a simple symbolic integration procedure would be valuable and allow the program presented above to be executed in a very small fraction of the time now required.

SYMBOLIC ANALYSIS

We have already pointed out the usefulness of symbolic algorithms (e.g., differentiation, integration), but there is also a significant payoff possible from a symbolic analysis of the program. This may be interpreted as source language optimization and, as usual, the goal is to save on execution time by increasing the language processing time. There are three factors that contribute to this situation. First, NAPSS is a significantly higher level (more powerful) language than, say, Fortran and it is much easier to inadvertently specify a very substantial computation. Second, NAPSS allows one to directly transcribe ordinary mathematical formulas into a program. Many mathematical conventions ignore computations and hence, if carried out exactly as specified, lead to gross inefficiency.
Finally, the ultimate class of users are people who are even less aware of the mechanics and procedures of computation than the average Fortran programmer. Indeed one of the goals of a NAPSS-like system is to make computing power easily available to a wider class of users.

The second factor is most apparent in matrix expressions, where everyone is taught to solve linear equations by (in NAPSS)

X ← A↑(-1)B

and matrix expressions like (D+U↑(-1)L)↑(-1)L↑(-1)U are routinely applied to vectors. The inefficiencies of computing inverse matrices are well known and algorithms have been developed for processing each expression without unnecessarily computing matrix inverses. Another simple example comes from integration, where the statement

D ← ∫(∫F(X)G(Y), (X←0 TO 1)), (Y←0 TO 1)

is the direct analog of the usual mathematical notation. These two examples may be handled by optimization of single NAPSS statements. This presents no extraordinary difficulty in the current system, but optimization involving several statements presents severe difficulties for the current system design because it is an incremental language processor and all variables are dynamic. Symbolic analysis of groups of statements is worthwhile and many of these situations are fairly obvious or correspond to optimizations made in common compilers. The following group of statements illustrates a situation unique to NAPSS-like languages (or any language where functions are true variables, i.e., data structures).

H(X) ← GA'(X-A)/GB'(A-X)
G(X) ← ∫F(T), (T←0 TO X)
PLOT G(X) FOR X←0 TO 10
SOLVE Y'(T) + G(T)Y(T) = H(T/10)/(1 + G(T)) FOR Y(T) WITH Y(0)←2 ON T←0 TO 10
SOLVE G(W)/W - H(W-A) = TAN(W/A) FOR W

The first two statements define functions in terms of operators implemented by polyalgorithms (assume that GA(X) or GB(X) cannot be differentiated symbolically) and the last three statements require numerous evaluations of these functions. The straightforward approach now used simply makes these evaluations as needed by the PLOT or SOLVE processors. However, it is obvious that very significant economies are made by realizing that these functions are to be evaluated many times and thus introducing the following two statements,

APPROXIMATE H(X) AS HH(X) ON 0 TO 10
APPROXIMATE G(X) AS GG(X) ON 0 TO 10

and then by using HH(X) and GG(X) in the last three statements. A good symbolic analysis of the program would recognize this situation and automatically replace the symbolic definition of H(X) and G(X) by the approximations obtained from the approximation algorithm. It is clear that a symbolic analysis would not be infallible in these situations, but it appears that the savings made in the straightforward situations would be significant.

OPERATING SYSTEM INTERFACE

The most likely environment (at this time) for a NAPSS-like system is a medium or large scale computer with a fairly general purpose multiprogramming mode of operation. From the point of view of NAPSS the key characteristics of this environment are:

(a) The operating system is indifferent to NAPSS, i.e., NAPSS does not receive special priority or resources relative to jobs with similar characteristics.
(b) Central memory is too small.
(c) Heavy, or even moderate, use of NAPSS in an interactive mode makes a significant adverse impact on the overall operation of the computer.

One may summarize the situation as follows: NAPSS is too big to fit comfortably in central memory for semi-continuous interactive use. Thus it must make extensive use of secondary memory.
The result is that in saving on one scarce resource, central memory space, one expends large amounts of another equally scarce resource, access to secondary memory. One may consider five general approaches to the organization of a NAPSS-like system in an effort to obtain acceptable performance at an acceptable cost and with an acceptably small impact on the operating system. The first is to operate in a small central memory area and to be as clever as possible in the structuring of programs and the access to secondary storage. In particular, paging would be heavily if not entirely controlled by the NAPSS system in order to optimize transfers to secondary storage. This is the approach used in the current pilot NAPSS system. The second approach is to use the virtual memory facilities of the operating and hardware system and then treat NAPSS as though it were in central memory at all times. The third approach is to obtain enough real memory to hold all, or nearly all, of NAPSS. This approach includes the case of running a NAPSS-like system on a dedicated computer. The fourth approach is to limit NAPSS to batch processing use. The final approach is to use distributed computing involving two processors. One processor is for language processing. A substantial memory is required because quite large data structures may be generated by NAPSS. A minicomputer with a disk might be suitable to handle a number of consoles running NAPSS. The other processor is that of a medium or large scale computer and its function is to execute polyalgorithms. These programs would reside in this central computer's secondary storage rather than in the minicomputer's memory. The necessary language and data structures would be transferred to the main computer when a polyalgorithm is to be executed. The batch processing approach fundamentally changes the nature of the system and is hard to compare with the others. The other approaches have one or more of the following disadvantages:

1. The performance (response time) may be slow, especially when the computer is heavily loaded.
2. A very substantial investment in hardware is required.
3. The system is difficult to move to a new environment.

The performance of the pilot NAPSS system suggests that each of these approaches can lead to a useful production system. Those that invest in special hardware would no doubt perform better, but it is still unclear which approach gives the best performance for a given total investment (in hardware, software development, execution time and user time).

PORTABILITY

The development of the pilot NAPSS system was a significant investment in software, perhaps 8 to 12 man years of effort. The numerical analysis polyalgorithms are reasonably portable as they are Fortran programs with only a few special characteristics. Indeed one can locate some suitable, if not ideal, already existing programs for some of the numerical analysis. The language processor is very specific to the operating system interface and the hardware configuration. It is about 90 percent in Fortran, but even so changing environments requires perhaps 8 to 12 man months of effort by very knowledgeable people. NAPSS-like systems must be portable in order to get a reasonable return from the development effort, as few organizations can justify such a system on the basis of internal usage. A number of approaches to (nearly) machine independent software do exist (e.g., bootstrapping, macros, higher level languages) which are very useful.
However, I believe that a survey of widely distributed systems similar to NAPSS in complexity would show that the key is an organization which is responsible for the portability. This organization does whatever is necessary to make the system run on an IBM 360/75 or 370/155, a UNIVAC 1108, a CDC 6600, and so forth. No one has yet been able to move such a system from, say, an IBM 370 to a CDC 6600 with a week or two of effort. Another approach is to make the system run on medium or large IBM 360's and 370's (within standard configurations) and ignore the rest of the computers. The emergence of computer networks opens up yet another possibility for portability, but it is too early to make a definite assessment of the performance and cost of using a NAPSS-like system through a large computer network. Networks also open up the possibility of a really large NAPSS machine being made available to a wide community of users.

REFERENCES

1. Rice, J. R., Rosen, S., "NAPSS-A Numerical Analysis Problem Solving System," Proc. ACM National Conference, 1966, pp. 51-56.
2. Rice, J. R., "On the Construction of Polyalgorithms for Automatic Numerical Analysis," in Interactive Systems for Experimental Applied Mathematics (M. Klerer and J. Reinfelds, eds.), Academic Press, New York, 1968, pp. 301-313.
3. Rice, J. R., "A Polyalgorithm for the Automatic Solution of Nonlinear Equations," Proc. ACM National Conference, 1969, pp. 179-183.
4. Roman, R. V., Symes, L. R., "Implementation Considerations in a Numerical Analysis Problem Solving System," in Interactive Systems for Experimental Applied Mathematics (M. Klerer and J. Reinfelds, eds.), Academic Press, New York, 1968, pp. 400-410.
5. Symes, L. R., Roman, R. V., "Structure of a Language for a Numerical Analysis Problem Solving System," in Interactive Systems for Experimental Applied Mathematics (M. Klerer and J. Reinfelds, eds.), Academic Press, New York, 1968, pp. 67-78.
6. Symes, L. R., "Manipulation of Data Structures in a Numerical Analysis Problem Solving System," Proc. Spring Joint Computer Conference, AFIPS, Vol. 36, 1970, p. 157.
7. Symes, L. R., "Evaluation of NAPSS Expressions Involving Polyalgorithms, Functions, Recursion and Untyped Variables," in Mathematical Software (J. R. Rice, ed.), Academic Press, New York, 1971, pp. 261-274.

The correctness of programs for numerical computation

by T. E. HULL
University of Toronto
Toronto, Canada

ABSTRACT

Increased attention is being paid to techniques for proving the correctness of computer programs, and the problem is being approached from several different points of view. For example, those interested in systems programming have placed particular emphasis on the importance of language design and the creation of well-structured programs. Others have been interested in more formal approaches, including the use of assertions and automatic theorem proving techniques. Numerical analysts must cope with special difficulties caused by round-off and truncation error, and it is the purpose of this talk to show how various techniques can be brought together to help prove the correctness of programs for numerical computation.

The Society for Computer Simulation Session

The changing role of simulation and the simulation councils

by JOHN McLEOD
Simulation Councils, Inc.
La Jolla, California

ABSTRACT

Simulation in the broadest sense is as old as man. Everyone has a mental model of his world.
Furthermore, he will use it to investigate, mentally, the possible results of alternative courses of action. Simulation as we know it, the use of electronic circuits to model real or imaginary things, began about 35 years ago. Since that time we have seen such vast changes in both the tools and the techniques of simulation that only the underlying philosophy remains unchanged. And the uses and abuses of simulation have changed radically, too. Seldom has a technology, developed primarily to serve one industry (in the case of simulation, the aerospace industry), so permeated seemingly unrelated fields as has simulation. Today simulation is used as an investigative tool in every branch of science, and in many ways that by no stretch of the term can be called science. These changes have had their impact on our society, too. The first Simulation Council was founded in 1952 after we had tried in vain to find a forum for discussion of simulation among the established technical societies. As interest grew other Simulation Councils were organized, and in 1957 they were incorporated and became known as Simulation Councils, Inc. Because the nine regional Simulation Councils now comprise the only technical society devoted exclusively to advancing the state-of-the-art of simulation and serving those people concerned with simulation, we are now known as SCS, the Society for Computer Simulation. In 1952 the analog computer was the best tool for simulation, and not one of the technical societies concerned with the up-and-coming digital computers was interested in the analog variety. So circumstances, not purpose, decreed that the Simulation Councils should become thought of as the analog computer society. We are not, and never have been; the Society for Computer Simulation is concerned with the development and application of the technology, not the tool! That being the case, and realizing the applicability of the simulation technology to the study of complex systems in other fields, the society fostered the necessary technology transfer by soliciting and publishing articles describing applications first in medicine and biology, and for the last several years, in the social sciences. To foster the change in role of simulation from that of a tool for the aerospace industry to that of a means for studying and gaining an understanding of the problems of our society required that the society also change. This change was first reflected in the technical content of our journal Simulation. It has always been our policy to publish articles describing unusual applications of simulation, but until a few years ago that was the only reason material describing a socially relevant use of simulation appeared in Simulation. Now it is our policy to solicit such articles, and publish as many as are approved by our editorial review board. Therefore much of the material in our journal is now concerned with socially relevant issues. The Society for Computer Simulation also publishes a Proceedings series. Of the three released to date, all are relevant to societal problems. The changing role of the society is also evidenced by changes in official policy and in the organization itself. The change in policy was elucidated by our President in an article published in the April 1970 issue of Simulation, which stated in part "...
the Executive Committee feels that [our society's] primary mission today should be to assist people who want to use simulation in their own fields and particularly to assist people who are dealing with the world's most urgent and difficult [societal] problems ... " The principal organizational change is the establishment of the World Simulation Organization to stimulate work towards the development of simulation technology applicable to the study of problems of our society from a global point of view. Concomitant with the spread of simulation to all disciplines has been the increase in interest within technical societies which are only peripherally concerned with simulation. Although these societies are primarily dedicated to other fields, several have formed committees or special interest groups with aims and objectives similar to those of the Society for Computer Simulation. However, the Society for Computer Simulation remains the only technical society dedicated solely to the service of those concerned with the art and science of simuiation, and to the improvement of the technology on which they must rely. That others follow is a tribute to our leadership. 50 National Computer Conference, 1973 Up, up and away Policy models-Concepts and rules-ofthumb by THOMAS NAYLOR Duke University Durham, North Carolina by PETER w. HOUSE Environmental Protection Agency Washington, D.C. ABSTRACT ABSTRACT In 1961, Jay Forrester introduced economists, management scientists and other social scientists to a new methodology for studying the behavior of dynamic systems, a methodology which he called Industrial Dynamics. Following closely on the heels of Industrial Dynamics was Urban Dynamics, which purported to analyze the nature of urban problems, their cases, and possible solution to these problems in terms of interactions among components of urban systems. More recently, Forrester has come forth with World Dynamics. We and the inhabitants of the other planets in our universe are now anxiously awaiting the publication of Universe Dynamics, a volume which is to be sponsored by the Club of Olympus, God, the Pope, Buddha, Mohammed, and the spiritual leaders of several other major religions of this world and the universe. Not unlike World Dynamics and other books by Jay Forrester, Universe Dynamics will be characterized by a number of distinct features. These features will be summarized in this paper. In this presentation we shall comment on the methodology used by Forrester in World Dynamics as well as the methodology which is being set forth by his disciples who publish The Limits of Growth and the other people involved in the Club of Rome project. We shall address ourselves to the whole question of the feasibility of constructing models of the entire world and to model structures alternative to the one set forth by Forrester, et al. It is first necessary to consider what possible objectives one might have in trying to prove programs correct, since different correctness criteria can be relevant to any particular program, especially when the program is to be used for numerical computation. Then it will be shown that careful structuring, along with the judicious use of assertions, can help one to organize proofs of correctness. Good language facilities are needed for the structuring, while assertions help make specific the details of the proof. Examples from linear algebra, differential equations and other areas will be used to illustrate these ideas. 
The importance of language facilities will be emphasized, and implications for Computer Science curricula will be pointed out. A useful analogy with proofs of theorems in mathematics and the relevance of this analogy to certification procedures for computer programs will be discussed.

The desire to build policy models or models for policy makers is based on two foundations. First, the need to solicit funds to pay for the construction of models means that those who want to construct models have to promise a "useful" product. Since a large portion of the models built are to support some level of policy, public or private, there is a deliberate attempt to promise output which will be useful to the decision process. Secondly, it is clear from history that the advisor to the throne is a coveted position and one dreamed of by many scientists. It is also clear that the day is coming when models will play a large role in making such policy. The advisory role then shifts to the model builder. Unfortunately, the reality of model development for the policy level does not appear to agree with the rhetoric. This presentation will review the concept of policy models and suggest some rules-of-thumb for building them.

The Society for Computer Simulation Session

On validation of simulation models

by GEORGE S. FISHMAN
Yale University
New Haven, Connecticut

ABSTRACT

Before an investigator can claim that his simulation model is a useful tool for studying behavior under new hypothetical conditions, he is well advised to check its consistency with the true system, as it exists before any change is made. The success of this validation establishes a basis for confidence in results that the model generates under new conditions. After all, if a model cannot reproduce system behavior without change, then we hardly expect it to produce truly representative results with change. The problem of how to validate a simulation model arises in every simulation study in which some semblance of a system exists. The space devoted to validation in Naylor's book Computer Simulation Experiments with Models of Economic Systems indicates both the relative importance of the topic and the difficulty of establishing universally applicable criteria for accepting a simulation model as a valid representation. One way to approach the validation of a simulation model is through its three essential components: input, structural representation and output. For example, the input consists of exogenous stimuli that drive the model during a run. Consequently one would like to assure himself that the probability distributions and time series representations used to characterize input variables are consistent with available data. With regard to structural representation one would like to test whether or not the mathematical and logical representations conflict with the true system's behavior. With regard to output one could feel comfortable with a simulation model if it behaved similarly to the true system when exposed to the same input. Interestingly enough, the greatest effort in model validation of large econometric models has concentrated on structural representation. No doubt this is due to the fact that regression methods, whether it be the simple least-squares method or a more comprehensive simultaneous equations technique, in addition to providing procedures for parameter estimation, facilitate hypothesis testing regarding structural representation.
Because of the availability of these regression methods, it seems hard to believe that at least some part of a model's structural representation cannot be validated. Lamentably, some researchers choose to discount and avoid the use of available test procedures. With regard to input analysis, techniques exist for determining the temporal and probabilistic characteristics of exogenous variables. For example, the autoregressive-moving average schemes described in Box and Jenkins' book, Time Series Analysis: Forecasting and Control, are available today in canned statistical computer programs. Maximum likelihood estimation procedures are available for most common probability distributions, and tables based on sufficient statistics have begun to appear in the literature. Regardless of how little data is available, a model's use would benefit from a conscientious effort to characterize the mechanism that produced those data. As mentioned earlier, a check of consistency between model and system output in response to the same input would be an appropriate step in validation. A natural question that arises is: What form should the consistency check take? One approach might go as follows: Let X1, . . ., Xn be the model's output in n consecutive time intervals and let Y1, . . ., Yn be the system's output for n consecutive time intervals in response to the same stimuli. Test the hypothesis that the joint probability distribution of X1, . . ., Xn is identical with that of Y1, . . ., Yn. My own feeling is that the above test is too stringent and creates a misplaced emphasis on statistical exactness. I would prefer to frame output validation in more of a decision-making context. In particular, one question that seems useful to answer is: In response to the same input, does the model's output lead decision makers to take the same action that they would take in response to the true system's output? While less stringent than the test first described, its implementation requires access to decision makers. This seems to me to be a desirable requirement, for only through continual interaction with decision makers can an investigator hope to gauge the sensitive issues to which his model should be responsive and the degree of accuracy that these sensitivities require.

In the beginning

by HOWARD CAMPAIGNE
Slippery Rock State Teachers College
Slippery Rock, Pennsylvania

ABSTRACT

The history of computers has been the history of two components: memories and software. These two depend heavily on each other, and all else depends on them. The early computers had none of either, it almost seems in retrospect. The Harvard Mark I had 132 words of 23 decimal digits, usable only for data. ENIAC had ten registers of ten decimals, each capable of doing arithmetic. It was von Neumann who pointed out that putting the program into the memory of ENIAC (instead of reading it from cards) would increase the throughput. Thereafter computers were designed to have data and instructions share the memory. The need for larger storage was apparent to all, but especially to programmers. EDVAC, the successor to ENIAC, had recirculating sounds in mercury-filled pipes to get a thousand words of storage. The Manchester machine had a TV tube to store a thousand bits. Then the reliable magnetic core displaced these expedients and stayed a whole generation. It was only in recent times when larger memories became available that the programmer had a chance.
And of course it is his sophisticated software which makes the modern computer system responsive and effective.

Factors affecting commercial computer system design in the seventies

by WILLIAM F. SIMON
Sperry UNIVAC
Blue Bell, Pennsylvania

ABSTRACT

The design of a digital computer for the commercial market today must, of course, face up to the pervasive influence of IBM. But technological maturity in some areas is slowing the rate of change so that designs seem to converge on certain features. Microprogramming of the native instruction set (or sets?) with emulation of a range of older systems is such a feature. Virtual memory addressing may be another. Characteristics of main storage, random access mass storage devices, and data exchange media seem to converge while terminals and communications conventions proliferate and diverge. Some reasons for these phenomena are evident; others will be suggested. Whatever happened to hybrid packaging, thin films, large scale integration, and tunnel diodes? The more general question is: why do some technologies flourish only in restricted environments, or never quite fulfill the promise of their "youth?" Or is their development just slower than we expected? While these answers cannot be absolute, some factors affecting the acceptance of new technologies can be identified.

Factors impacting on the evolution of military computers

by GEORGE M. SOKOL
US Army Computer Systems Command
Fort Belvoir, Virginia

ABSTRACT

This paper will trace Army experience in ADP for the combat environment, with emphasis on the role of software as a factor in influencing computer organization and design. Early Army activity on militarized computers resulted in the Fieldata family of computers, a modular hierarchy of ADP equipment. Subsequently, software considerations and the evolution of functional requirements resulted in extended use of commercially available computers mounted in vehicles. The balance between central electronic logic and peripheral capability is central to the design of militarized computers, but constraints of size, weight and ruggedness have greatly limited the processing capability of fieldable peripheral equipment. The systems acquisition process also impacts on the available characteristics of militarized computers.

Modeling and simulation in the process industries

by CECIL L. SMITH and ARMANDO B. CORRIPIO
Louisiana State University
Baton Rouge, Louisiana
and
RAYMOND GOLDSTEIN
Picatinny Arsenal
Dover, New Jersey

INTRODUCTION

The objective of this paper is to present what is at least the authors' general assessment of the state-of-the-art of modeling and simulation in the process industries, which in this context is taken to include the chemical, petrochemical, pulp and paper, metals, waste and water treatment industries but excluding the manufacturing industries such as the automobile industry. Since a number of texts1,2,3 are available on this topic for those readers interested in a more technical treatment, this discussion will tend to be more general, emphasizing such aspects as economic justification, importance of experimental and/or plant data, etc.

EXAMPLES

Paper machine

In the process customarily used for the manufacture of paper, an aqueous stream consisting of about 0.25 percent by weight of suspended fiber is jetted by a head box onto a moving wire. As the water drains through the wire, a certain fraction of the fiber is retained, forming a mat that subsequently becomes a sheet of paper. The wire with the mat on top passes over suction boxes to remove additional water, thereby giving the mat sufficient strength so that it can be lifted and passed between press rolls. It then enters the dryer section, which consists of several steam-heated, rotating cylinders that provide a source of heat to vaporize the water in the sheet. The final sheet generally contains from 5 to 10 percent water by weight.
The paper machine is a good example of a model consisting almost entirely of relationships to describe physical processes. The formation of the mat over the wire is a very complex physical process.4,5 Initially, the wire has no mat on top, and the drainage rate is high but the retention (fraction of fiber retained on wire) is low. As the mat builds up, the drainage rate decreases and the retention increases. This process continues until all of the free liquid has drained through the wire. In general, these processes are not well-understood (especially from a quantitative standpoint), and as a result, the equations used to describe them have been primarily empirical. The action of the suction boxes and press rolls is also a physical process, and again is not well-understood. Similarly, the drying of sheets is also a complex physical process. Initially, the sheet contains a high percentage of water, and it is easily driven off. But as the sheet becomes drier, the remaining water molecules are more tightly bound (both chemically and physically) to the fiber, and the drying rate decreases. Quantitatively, the relationships are not well-developed, and again empiricism is relied upon quite heavily. A model of the paper machine should be capable of relating the final sheet density (lbs/ft2), sheet moisture, and other similar properties to inputs such as stock flow, machine speed, dryer steam pressure, etc.

TNT process

Whereas the model for the paper machine consists almost entirely of relationships describing physical processes, the description of chemical processes forms the heart of many models. For example, the manufacture of trinitrotoluene (TNT) entails the successive nitration of toluene in the presence of strong concentrations of nitric and sulphuric acids. In current processes, this reaction is carried out in a two phase medium, one phase being largely organic and the other phase being largely acid.6 According to the currently accepted theory, the organic species diffuse from the organic phase to the acid phase, where all reactions occur. The products of the reaction then diffuse back into the organic phase. In this process, the primary reactions leading to the production of TNT are well-known, at least from a stoichiometric standpoint. However, many side reactions occur, including oxidation of the benzene ring to produce gaseous products (oxides of carbon and nitrogen). These reactions are not well-understood, but nevertheless must be included in a process model. Similarly, the relationships describing the diffusion mechanism are complex and include constants whose quantitative values are not available. In this particular process, the solubility of the organic species in the acid phase is not quantitatively known. From a model describing the TNT process, one should be able to compute the amount of product and its composition from such inputs as feed flows and compositions, nitrator temperatures, etc.
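To make the preceding description concrete, the fragment below sketches a toy steady-state input-output skeleton of such a paper machine model in Python. It is a sketch only, not taken from the paper: the retention and drying relations, the function name and every constant are hypothetical placeholders for the empirical equations discussed above. Only the fiber mass balance over the wire has a direct physical basis; the drying relation merely stands in for whatever correlation the plant data would support.

# A toy steady-state skeleton (not from the paper) relating paper machine inputs
# to sheet properties. The retention and drying relations and all constants are
# hypothetical placeholders for fitted empirical equations.
import math

def paper_machine_steady_state(stock_flow_lb_min, machine_speed_ft_min,
                               steam_pressure_psi, consistency=0.0025,
                               width_ft=20.0, retention=0.85,
                               drying_coeff=0.3, dryer_length_ft=200.0):
    # Fiber mass balance over the wire: fiber retained per unit of sheet area.
    fiber_flow = stock_flow_lb_min * consistency                 # lb fiber / min
    sheet_area = machine_speed_ft_min * width_ft                 # ft^2 / min
    basis_weight = 1000.0 * retention * fiber_flow / sheet_area  # lb / 1000 ft^2

    # Hypothetical drying relation: moisture falls off with steam pressure and
    # with residence time in the dryer section.
    residence_min = dryer_length_ft / machine_speed_ft_min
    moisture = 0.60 * math.exp(-drying_coeff * steam_pressure_psi * residence_min)
    return basis_weight, moisture

# Arbitrary illustrative operating point.
bw, m = paper_machine_steady_state(stock_flow_lb_min=790000.0,
                                   machine_speed_ft_min=2000.0,
                                   steam_pressure_psi=80.0)
print("basis weight (lb/1000 ft^2): %.1f, sheet moisture: %.1f%%" % (bw, 100 * m))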
STEADY-STATE VS. DYNAMIC MODELS

A steady-state model is capable of yielding only the equilibrium values of the process variables, whereas a dynamic process model will give the time dependence of the process variables. Using the paper machine as an example, the design of the plant would require a model that gives the final sheet moisture, density, and other properties obtained when the inputs are held at constant values for long periods of time. This would be a steady-state model. On the other hand, one of the most difficult control problems in the paper industry occurs at grade change.7 For example, suppose the machine has been producing paper with a sheet density of 60 lbs/1000 ft2. Now the necessary changes must be implemented so that the machine produces paper with a sheet density of 42 lbs/1000 ft2. Since virtually all of the paper produced in the interim must be recycled, the time required to implement the grade change should be minimized. Analysis of this problem requires a model that gives the variation of the paper characteristics with time. This would be a dynamic model.

ECONOMIC JUSTIFICATION

Due to the complexity of most industrial processes, development of an adequate process model frequently requires several man-years of effort, significant outlays for gathering data, and several hours of computer time. Therefore, some thought must be given to the anticipated returns prior to the start of the project. In essence, there is frequently no return from just developing a model; the return comes from model exploitation. Virtually every project begins with a feasibility study, which should identify the possible ways via which a model can be used to improve process performance, estimate the returns from each of these, develop specific goals for the modeling effort (specify the sections of the process to be modeled; specify if the model is to be steady-state or dynamic, etc.), and estimate the cost of the modeling efforts. Unfortunately, estimating returns from model exploitation is very difficult. Furthermore, returns can be divided into tangible returns for which dollar values are assigned and intangible returns for which dollar values cannot readily be assigned. For example, just the additional insight into the process gained as a result of the modeling effort is valuable, but its dollar value is not easily assigned. Perhaps the day will come when the value of process modeling has been established to the point where models are developed for all processes; however, we are not there yet. For many processes, the decision as to whether or not to undertake a modeling project is coupled with the decision as to whether or not to install a control computer, either supervisory or DDC. In this context, perhaps the most likely subjects are plants with large throughputs, where even a small improvement in process operation yields a large return due to the large production over which it is spread.8 Many highly complex processes offer the opportunity to make great improvements in process operation, but these frequently necessitate the greatest effort in model development. Typical projects for which a modeling effort can be justified include the following:

1. Determination of the process operating conditions that produce the maximum economic return.
2. Development of an improved control system so that the process does not produce as much off-specification product or does not produce a product far above specifications, thereby entailing a form of "product give-away." For example, running a paper machine to produce a sheet with 5 percent moisture when the specification is 8 percent or less leads to product give-away in that the machine must be run slower in order to produce the lower moisture. Also, paper is in effect sold by the pound, and water is far cheaper than wood pulp.
3. Design of a new process or modifications to the current process.

Although many modeling efforts have been in support of computer control installations, this is certainly not the only justification. In fact, in many of these, hindsight has shown that the greatest return was from improvements in process operation gained through exploitation of the model. In many, the computer was not necessary in order to realize these improvements.

MODEL DEVELOPMENT

In the development of a model for a process, two distinct approaches can be identified:

1. Use of purely empirical relationships obtained by correlating the values of the dependent process variables with values of the independent process variables.
2. Development of detailed heat balances, material balances, and rate expressions, which are then combined to form the overall model of the process.
For example, running a paper machine to produce a sheet with 5 percent moisture when the specification is 8 percent or less leads to product give-away in that the machine must be run slower in order to produce the lower moisture. Also, paper is in effect sold by the pound, and water is far cheaper than wood pulp. 3. Design of a new process or modifications to the current process. Although many modeling efforts have been in support of computer control installations, this is certainly not the only justification. In fact, in many of these, hindsight has shown that the greatest return was from improvements in process operation gained through exploitation of the model. In many, the computer was not necessary in order to realize these improvements. MODEL DEVELOPMENT In the development of a model for a process, two distinct approaches can be identified: 1. Use of purely empirical relationships obtained by correlating the values of the dependent process variables with values of the independent process variables. 2. Development of detailed heat balances, material balances, and rate expressions, which are then romhinf'n to form the OVPTl'Ill monf'l of the process. Modeling and Simulation in the Process Industries The first method is purely empirical, whereas the second relies more on the theories regarding the basic mechanisms that proceed within the process. 'Nhile it may not be obvious at first, both of these approaches are ultimately based on experimental data. Since regression is used outright to obtain the empirical model, it is clearly based on experimental data. For any realistic process, the detailed model encompassing the basic mechanisms will contain parameters for which no values are available in the literature. In these cases, one approach is to take several "snapshots" of the plant, where the value of as many process variables as possible are obtained. In general, the normal process instrumentation is not sufficient to obtain all of the needed data. Additional recording points are often temporarily added, and samples are frequently taken for subsequent laboratory analysis. With this data available, a multivari:,1ble search technique such as Pattern 13 ,14 can be used to determine the model parameters that produce the best fit of the experimental data. In efforts of this type, the availability of a digital computer for data logging can be valuable. The proper approach is to determine what data is needed in the modeling effort, and then program the computer to obtain this data from carefully controlled tests on the process. The use of a digital computer to record all possible values during the normal operation of the process simply does not yield satisfactory data from which a model can be developed. Another point of contrast between the empirical model and the basic model involves the amount of developmental effort necessary. The empirical model can be developed with much less effort, but on the other hand, it cannot be reliably used to predict performance outside the range within which the data was obtained. Since the detailed model incorporates relationships describing the basic mechanisms, it should hold over a wider range than the empirical model, especially if the data upon which it is based was taken over a wide range of process operating conditions. NUMERICAL METHODS In the development and exploitation of process models, numerical techniques are needed for the following operations: 1. 
NUMERICAL METHODS

In the development and exploitation of process models, numerical techniques are needed for the following operations:

1. Solution of large sets of nonlinear algebraic equations (frequently encountered in the solution of steady-state models).
2. Solution of large sets of nonlinear, first-order differential equations (frequently encountered in the solution of unsteady-state models).
3. Solution of partial differential equations (usually encountered in the solution of an unsteady-state model for a distributed-parameter system).
4. Determination of the maximum or minimum of a high-order, nonlinear function (usually encountered in either the determination of the model parameters that best fit the experimental data or in the determination of the process operating conditions that produce the greatest economic return).

Only digital techniques are discussed in this section; analog and hybrid techniques will be described subsequently. In general, the numerical techniques utilized for process models tend to be the simpler ones. The characteristic that generally presents the most difficulties is the size of the problems. For example, the model of the TNT plant described in Reference 9 contains 322 nonlinear equations plus supporting relationships such as mole fraction calculations, solubility relationships, density equations, etc. In the solution of sets of nonlinear algebraic equations, the tendency is to use direct substitution methods in an iterative approach to solving the equations. In general, a process may contain several recycle loops, each of which requires an iterative approach to solve the equations involved. The existence of nested recycle loops causes the number of iterations to increase significantly. For example, the steady-state model for the TNT process involves seven nested recycle loops. Although the number of iterations required to obtain a solution is staggering, the problem is solved in less than a minute on a CDC 6500. A few types of equations occur so frequently in process systems that special methods have been developed for them. An example of such a system is a countercurrent, stagewise contact system, which is epitomized by a distillation column. For this particular system, the Theta-method has been developed and used extensively.10 In regard to solving the ordinary differential equations usually encountered in dynamic models, the simple Euler method has enjoyed far more use than any other method. The advantages stemming from the simplicity of the method far outweigh any increase in computational efficiency gained by using higher-order methods. Furthermore, extreme accuracy is not required in many process simulations. In effect, the model is only approximate, so why demand extreme accuracies in the solution? Although once avoided in process models, partial differential equations are appearing more regularly. Again, simple finite difference methods are used most frequently in solving problems of this type. Maximization and minimization problems are encountered very frequently in the development and exploitation of process models. One very necessary criterion of any technique used is that it must be able to handle constraints both on the search variable and on the dependent variables computed during each functional evaluation. Although linear programming handles such constraints very well, process problems are invariably nonlinear. Sectional linear programming is quite popular, although the conventional multivariable search techniques coupled with a penalty function are also used.
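The two workhorse techniques noted above, direct substitution for recycle loops and the Euler method for dynamic balances, can be illustrated with the small sketch below. It is a sketch only, not taken from the paper: the single recycle balance, the first-order grade-change response, and every number in it are hypothetical; only the techniques themselves correspond to the discussion above.

# A minimal sketch (not from the paper) of direct substitution and Euler
# integration. The process relations and all numbers are hypothetical.

# (a) Direct substitution (fixed-point iteration) on a toy recycle loop:
# a separator returns 40 percent of the unit's output to its feed.
def solve_recycle(fresh_feed=100.0, recycle_fraction=0.4, tol=1e-8):
    recycle = 0.0
    while True:
        unit_output = fresh_feed + recycle          # toy unit balance
        new_recycle = recycle_fraction * unit_output
        if abs(new_recycle - recycle) < tol:
            return unit_output
        recycle = new_recycle

# (b) Euler integration of a hypothetical first-order grade-change response:
# d(density)/dt = (target - density) / tau, stepped forward with a fixed step.
def grade_change(density=60.0, target=42.0, tau=15.0, dt=0.5, t_end=90.0):
    history = [(0.0, density)]
    t = 0.0
    while t < t_end:
        density += dt * (target - density) / tau    # Euler step
        t += dt
        history.append((t, density))
    return history

print("recycle-loop output:", solve_recycle())
print("sheet density after grade change:", grade_change()[-1])

A production model simply repeats step (a) once per recycle loop, nested as the flowsheet requires, and replaces the single first-order equation in (b) with the full set of differential balances.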
Over the years, a number of simulation languages such as CSMP and MIMIC have been used in simulation.11 On the steady-state side, a number of process simulation packages such as PACER, FLOTRAN, and others have appeared.12 An alternative to these is to write the program directly in a language such as Fortran. One of the problems in steady-state simulation is the need for extensive physical property data. Many of the steady-state simulation packages have a built-in or readily available physical properties package that is a big plus in their favor. However, many prefer to use subroutines for physical properties, subroutines for the common unit operations, and subroutines to control the iteration procedures, but nevertheless write in Fortran their own master or calling program and any special subroutine for operations unique to their process. For dynamic process models with any complexity, Fortran is almost universally preferred over one of the simulation languages.

COMPUTATIONAL REQUIREMENTS

With the introduction of computing machines of the capacity of the CDC 6500, Univac 1108, IBM 360/65, and similar machines produced by other manufacturers, the computational capacity is available to solve all but the largest process simulations. Similarly, currently available numerical techniques seem to be adequate for all but the very exotic processes. This is not to imply that improved price/performance ratios for computing machines would be of no benefit. Since the modeling effort is subject to economic justification, a significant reduction in computational costs would lead to the undertaking of some modeling projects currently considered unattractive. As for the role of analog and hybrid computers in process simulation, no significant change in the current situation is forecast. Only for those models whose solution must be obtained a large number of times can the added expense of analog programming be justified. However, for such undertakings as operator training, the analog computer is still quite attractive.

SUMMARY

At this stage of process modeling and simulation, the generally poor understanding of the basic mechanisms occurring in industrial processes is probably the major obstacle in a modeling effort. Quantitative values for diffusions, reaction rate constants, solubilities, and similar coefficients occurring in the relationships comprising a process model are simply not available for most processes of interest. This paper has attempted to present the state-of-the-art of process modeling as seen by the authors. This discussion has necessarily been of a general nature, and exceptions to general statements are to be expected. In any case, these should always be taken as one man's opinion for whatever it is worth.

ACKNOWLEDGMENT

Portions of the work described in this paper were supported by the Air Force Office of Scientific Research under Contract F-44620-68-C-0021.

REFERENCES

1. Smith, C. L., Pike, P. W., Murrill, P. W., Formulation and Optimization of Mathematical Models, International Textbook Company, Scranton, Pennsylvania, 1970.
2. Himmelblau, D. M., Bischoff, K. B., Process Analysis and Simulation-Deterministic Systems, Wiley, New York, 1968.
3. Franks, R. G. E., Modeling and Simulation in Chemical Engineering, Wiley-Interscience, New York, 1972.
4. Schoeffler, J. D., Sullivan, P. R., "A Model of Sheet Formation and Drainage on a Fourdrinier," Tappi, Vol. 49, No. 6, June 1966, pp. 248-254.
5. Sullivan, P. R., Schoeffler, J. D., "Dynamic Simulation and Control of the Fourdrinier Paper-Making Process," IFAC Congress, London, June 20-26, 1966.
6. Slemrod, S., "Producing TNT by Continuous Nitration," Ordnance, March-April 1970, pp. 525-527.
7. Brewster, D. R., The Importance of Paper Machine Dynamics in Grade Change Control, Preprint No. 17.2-3-64, 19th Annual ISA Conference and Exhibit, New York, October 12-15, 1964.
8. Stout, T. M., "Computer Control Economics," Control Engineering, Vol. 13, No. 9, September 1966, pp. 87-90.
9. Williams, J. K., Michlink, F. P., Goldstein, R., Smith, C. L., Corripio, A. B., "Control Problem in the Continuous TNT Process," 1972 International Conference on Cybernetics and Society, Washington, D.C., October 9-12, 1972.
10. Holland, C. D., Multicomponent Distillation, Prentice Hall, Englewood Cliffs, New Jersey, 1963.
11. Smith, C. L., "All-Digital Simulation for the Process Industries," ISA Journal, Vol. 13, No. 7, July 1966, pp. 53-60.
12. Evans, L. R., Steward, D. G., Sprague, C. R., "Computer-Aided Chemical Process Design," Chemical Engineering Progress, Vol. 64, No. 4, April 1968, pp. 39-46.
13. Hooke, R., Jeeves, T. A., "Direct Search Solution of Numerical and Statistical Problems," The Journal of the Association for Computing Machinery, Vol. 8, No. 2, April 1961, pp. 212-229.
14. Moore, C. F., Smith, C. L., Murrill, P. W., IBM SHARE Library, LSU, PATE, SDA No. 3552, 1969.

Needs for industrial computer standards-As satisfied by ISA's programs in this area

by THEODORE J. WILLIAMS
Purdue University
West Lafayette, Indiana
and
KIRWIN A. WHITMAN
Allied Chemical Corporation
Morristown, New Jersey

INTRODUCTION

Never before has the relevancy of institutions been questioned as critically as today. Many of us have now learned what should have always been evident: that the key to relevance is the satisfaction of needs. The computer industry, and those technical societies that support it, should view this new emphasis on service as an opportunity to stimulate its creative talents. The implication of the "future shock" concept requires that we must anticipate problems if we are ever to have enough time to solve them. But in what way can a technical society serve; should it be a responder or a leader? Ironically, to be effective, it must be both. It must respond to the requests of individuals in the technical community to use the society's apparatus for the development, review and promulgation of needed standards. The development and review stages can be done by groups of individuals, but it remains for the technical society to exert a leadership role to make these standards known and available to all who might benefit from them. Thus, our purpose here is to bring to your attention two new, and we feel exciting, industrial computer standards developments that have been undertaken by the Instrument Society of America, as well as a discussion of further actions contemplated in this field. The first is RP55, "Hardware Testing of Digital Process Computers." The second is the work cosponsored with Purdue University on software and hardware standards for industrial computer languages and interfaces. This latter work is exemplified by the ISA series of standards entitled S61, "Industrial Computer FORTRAN Procedures," among others.
Both standards development projects have succeeded in furthering ISA's commitment "to provide standards that are competent, timely, unbiased, widely applicable, and authoritative."

ACCEPTANCE TESTING OF DIGITAL PROCESS COMPUTERS

Needs

A Hardware Testing Committee was formed in October 1968 because a group of professionals recognized certain specific needs of the process computer industry. The user needed a standard in order to accurately evaluate the performance of a digital computer and also to avoid the costly duplication of effort when each user individually writes his own test procedures. Conversely, the vendor needed a standard to avoid the costly setting up of different tests for different users and also to better understand what tests are vital to the user.

Purpose

The purpose of the committee has been to create a document that can serve as a guide for technical personnel whose duties include specifying, checking, testing, or demonstrating hardware performance of digital process computers at either vendor or user facilities. By basing engineering and hardware specifications, technical advertising, and reference literature on this recommended practice, there will be provided a clearer understanding of the digital process computer's performance capabilities and of the methods used for evaluating and documenting proof of performance. Adhering to the terminology, definitions, and test recommendations should result in clearer specifications which should further the understanding between vendor and user.

Scope

The committee made policy decisions which defined the scope of this recommended practice to:

(1) Concern digital process computer hardware testing rather than software testing. However, certain software will be necessary to perform the hardware tests.
(2) Concern hardware test performance at either the vendor's factory or at the user's site. This takes into account that it would be costly for a vendor to change his normal test location.
(3) Concern hardware performance testing rather than reliability or availability testing. These other characteristics could be the subject for a different series of long term tests at the user's site.
(4) Concern hardware testing of vendor supplied equipment rather than also including user supplied devices. Generally, the vendor supplied system includes only that equipment from the input terminations to the output terminations of the computer system.
(5) Consider that specific limits for the hardware tests will not exceed the vendor's stated subsystem specifications.
(6) Consider that before the system contract is signed, the vendor and user will agree upon which hardware testing specifications are applicable. It was not the intent of the standard to finalize rigid specifications or set specific rather than general acceptance criteria. This recognizes that there are many differences both in vendor product design and in user requirements.
(7) Consider that the document is a basic nucleus of tests, but other tests may be substituted based on cost, established vendor procedures, and changing state of the art. Although requirements to deviate from a vendor's normal pattern of test sequence, duration or location could alter the effectiveness of the testing, it could also create extra costs.
(8) Consider that the document addresses a set of tests which apply to basic or typical digital process computers in today's marketplace.
Where equipment configurations and features differ from those outlined in this standard, the test procedures must be modified to account for the individual equipment's specifications. (9) Consider that the document does not necessarily assume witness tests (i.e., the collecting of tests for a user to witness). This collection mayor may not conform to the vendor's normal manufacturing approach. There are three cost factors which should be considered if a witness test is negotiated: a. Added vendor and user manhours and expenses. b. Impact on vendor's production cycle and normal test sequence. c. Impact on user if tests are not performed correctly in his absence. Recommended test procedures It will not be attempted in this paper to detail reasons for particular procedures in the areas of peripherals. environmental, subsystem and interacting system tests. Time will only permit naming the test procedure sections and what they cover. In this way you may judge the magnitude of this undertaking and the probable significance of this recommended practice. (1) Central Processing Unit-including instruction complement, arithmetic and control logic, input/ output adapters, I/O direct memory access channel, interrupts, timers and core storage. (2) Data Processing Input/ Output Subsystems including the attachment circuitry which furnishes logic controls along with data links to the input/ output bus; the controller which provides the buffer between the computer and the input/ output device itself; and finally the input/ output devices themselves. (3) Digital Input/ Output-including operation, signal level, delay, noise rejection, counting accuracy, timing accuracy and interrupt operation. (4) Analog Inputs-including address, speed, accuracy/linearity, noise, common mode and normal mode rejection, input resistance, input over-voltage recover, DC crosstalk, common mode crosstalk and gain changing crosstalk. (5) Analog Outputs-incl uding addressing, accuracy, output capability, capacitive loading, noise, settling time, crosstalk and droop rate for sample and hold outputs. (6) Interacting Systems-including operation in a simulated real time environment in order to check the level of interaction or crosstalk resulting from simultaneous demands on the several subsystems which make up the system. (7) Environmental-including temperature and humidity, AC power and vibration. Costs The committee constantly had to evaluate the costs of recommended tests versus their value. Typical factors affecting the costs of testing are: (1) (2) (3) (4) (5) (6) (7) (8) Number of separate test configurations required Methods of compliance Sequence, duration, and location of tests Quantity of hardware tested Special programming requirements Special testing equipment Effort required to prepare and perform tests Documentation requirements The additional testing costs may be justified through factors such as reduced installation costs, more timely installation, and early identification of application problems. Needs for Industrial Computer Standards Documentation Another unique feature of this recommended practice is that it has given special attention to documentation of evidence of tests performed on the hardware. Three types of documentation are proposed in order that the user may choose what is most appropriate cost-wise for his situation. Type 1 would include any statement or evidence provided by the manufacturer that the hardware has successfully passed the agreed-upon tests. 
Type 2 would be an itemized check list indicating contractually agreed-upon tests with a certification for each test that had been successfully performed. Type 3 would be an individual numerical data printout; histograms, etc., compiled during the performance of the tests. It is, therefore, the aim to provide maximum flexibility in documentation related to the testing. Board of review The committee was composed of eight vendors, eight users, and two consultants. In addition to the considerable experience and varied backgrounds of the committee, an extensive evaluation by a Board of Review was also required. Serious effort was given to insuring that a wide cross-section of the industry was represented on the Review Board. Invitations were sent to the various ISA committees, to attendees at the Computer Users Conference, and various computer workshops. Announcements also appeared in Instrumentation Technology and Control Engineering. Interest was expressed by approximately 250 people, and these received the document drafts. A very comprehensive questionnaire was also sent to each reviewer in order that a more meaningful interpretation of the review could be made. Subsequently, 117 responses were received. In addition to the questionnaire response, other comments from the Review Board were also considered by the appropriate subcommittee and then each comment and its disposition were reviewed by the SP55 Committee. The magnitude of this effort can be judged from the fact that the comments and their disposition were finally resolved on 44 typewritten pages. The returned questionnaires indicated an overwhelming acceptance and approval of the proposed documents. The respondents, who came from a wide variety of industrial and scientific backgrounds, felt that it would be useful for both vendors and users alike. They gave the document generally high ratings on technical grounds, and also as to editorial layout. Some reservation was expressed about economic aspects of the proposed testing techniques, which is natural considering that more testing is caiied for than was previously done. However, ninetyone percent of the respondents recommended that RP55.1 be published as an ISA Recommended Practice." Only three percent questioned the need for the document. The 59 responses were also analyzed for any generalized vendoruser polarity. Fortunately, the percentage of those recommending acceptance of the document were in approximate proportion to their percentages as vendors, users, or consultants. In other words, there was no polarization into vendor, user or consultant classes. The recommended practice was subsequently submitted to the American National Standards Institute and is presently being evaluated for acceptability as an ANSI standard. ISA's Standards and Practices Board has met with ANSI in order to adopt procedures permitting concurrent review by both ISA and ANSI for all future standards. PRESENT EFFORTS AT PROCESS CONTROL SYSTEM LANGUAGE STANDARDIZATION As mentioned earlier, standardization has long been recognized as one means by which the planning, development, programming, installation, and operation of our plant control computer installations as well as the training of the personnel involved in all these phases can be organized and simplified. The development of APT and its variant languages by the machine tool industry is a very important example of this. 
The Instrument Society of America has been engaged in such activities for the past ten years, most recently in conjunction with the Purdue Laboratory for Applied Industrial Control of Purdue University, West Lafayette, Indiana.4.6 Through nine semiannual meetings the Purdue Workshop on Standardization of Industrial Computer Languages has proposed the following possible solutions to the programming problems raised above, and it has achieved the results listed below: (1) The popularity of FORTRAN indicates its use as at least one -of the procedural languages to be used as the basis for a standardized set of process control languages. It has been the decision of the Workshop to extend the language to supply the missing functions necessary for process control use by a set of CALL statements. These proposed CALLS, after approval by the Workshop, are being formally standardized through the mechanisms of the Instrument Society of America. One Standard has already been issued by ISA,7 another is being reviewed at this writing, 8 and a third and last one is under final development. 9 (2) A so-called Long Term Procedural Language or L TPL is also being pursued. A set of Functional Requirements for this Language has been approved. Since the PL/ 1 language is in process of standardization by ANSI (the American National Standards Institute), an extended subset of it (in the manner of the extended FORTRAN) will be tested against these requirements. 12 Should it fail, other languages will be tried or a completely new one will be developed. 60 National Computer Conference, 1973 (3) The recognized need for a set of problem-oriented languages is being handled by the proposed development of a set of macro-compiler routines which will, when completed, allow the user to develop his own special language while still preserving the transportability capability which is so important for the ultimate success of the standardization effort. This latter will be accomplished by translating the former language into one or the other of the standardized procedural languages before compilation. (4) To establish the tasks to be satisfied by the above languages, an overall set of Functional Requirements has been developed. 10 (5) In order that all Committees of the Workshop should have a common usage of the special terms of computer programming, the Glossary Committee of the Workshop has developed a Dictionary for Industrial Computer Programming which has been published by the Instrument Society of America 11 in book form. The Workshop on Standardization of Industrial Computer Languages is composed entirely of representatives of user and vendor companies active in the on-line industrial digital computer applications field. Delegates act on their own in all Workshop technical discussions, but vote in the name of their companies on all substantive matters brought up for approval. It enjoys active representation from Japan and from seven European countries in addition to Canada and the United States itself. Procedures used in meetings and standards development are the same as those previously outlined for the Hardware Testing Committee. As mentioned several times before, it is the aim and desire of those involved in this effort that the Standards developed will have as universal an application as possible. Every possible precaution is being taken to assure this. 
The nearly total attention in these and similar efforts toward the use of higher level languages means that the vendor must be responsible for producing a combination of computer hardware and of operating system programs which will accept the user's programs written in the higher level languages in the most efficient manner. A relatively simple computer requiring a much higher use of softw~re accomplished functions would thus be equivalent, except for speed of operation, with a much more sophisticated and efficient computer with a correspondingly smaller operating system. The present desire on the part of both users and vendors for a simplification and clarification of the present morass of programming problems indicates that some standardization effort, the Purdue cosponsored program, or another, must succeed in the relatively near future. Future possibilities and associated time scales The standardized FORTRAN extensions as described can be available in final form within the next one to two years. Some of those previously made have been implemented already in nearly a dozen different types of computers. The actual standardization process requires a relatively long period of time because of the formality involved. Thus, the 1974-75 period appears to be the key time for this effort. The work of the other language committees of the Workshop are less formally developed than that of the FORTRAN Committee as mentioned just above. Successful completion of their plans could result, however, in significant developments in the Long Term Procedural Language and in the Problem Oriented Languages areas within the same time period as above. In addition to its Instrument Society of America sponsorship' this effort recently received recognition from the International Federation for Information Processing (IFIP) when the Workshop was designated as a Working Group of its Committee on Computer Applications in Technology. The Workshop is also being considered for similar recognition by the International Federation of Automatic Control (IFAC). As mentioned, this effort is achieving a very wide acceptance to date. Unfortunately, partly because of its Instrument Society of America origins and the personnel involved in its Committees, the effort is largely based on the needs of the continuous process industries. The input of interested personnel from many other areas of activity is very badly needed to assure its applicability across all industries. To provide the necessary input from other industries, it is hoped that one or more of the technical societies (United States or international) active in the discrete manufacturing field will pick up cosponsorship of the standardization effort presently spearheaded by the Instrument Society of America and, in cooperation with it, make certain that a truly general set of languages is developed for the industrial data collection and automatic control field. RECOMMENDED PRACTICES AND STANDARDIZATION IN SENSOR-BASED COMPUTER SYSTEM HARDWARE In addition to the work just described in programming language standardization, there is an equally vital need for the development of standards or recommended practices in the design of the equipment used for the sensorbased tasks of plant data collection, process monitoring, and automatic control. Fortunately, there is major work under way throughout the world to help correct these deficiencies as well. 
As early as 1963 the Chemical and Petroleum Industries Division of ISA set up an annual workshop entitled The User's Workshop on Direct Digital Control, which developed an extensive set of "Guidelines on Users' Requirements for Direct Digital Control Systems." This was supplemented by an equally extensive set of "Questions and Answers on Direct Digital Control" to define and explain what was then a new concept for the application of digital computers to industrial control tasks. A recently revised version of these original documents is available.6 The Workshop has continued through the years, picking up cosponsorship by the Data Handling and Computation Division and by the Automatic Control Division in 1968, when it was renamed the ISA Computer Control Workshop. The last two meetings have been held at Purdue University, West Lafayette, Indiana, as has the Workshop on Standardization of Industrial Computer Languages described above.

The ESONE Committee (European Standards of Nuclear Electronics) was formed by EURATOM in the early 1960's to encourage compatibility and interchangeability of electronic equipment in all the nuclear laboratories of the member countries of EURATOM. In cooperation with the NIM Committee (Nuclear Instrumentation Modules) of the United States Atomic Energy Commission, they have recently developed a completely compatible set of interface equipment for sensor-based computer systems known by the title of CAMAC.1-3,13 These proposals merit serious consideration by groups in other industries and are under active study by the ISA Computer Control Workshop.

Japanese groups have also been quite active in the study of potential areas of standardization. They have recently developed a standard for a process control operator's console (non-CRT based)14 which appears to have considerable merit. It will also be given careful consideration by the Instrument Society of America group.

It is important that the development of these standards and recommended practices be a worldwide cooperative endeavor of engineers and scientists from many countries. Only in this way can all of us feel that we have had a part in the development of the final system and thus assure its overall acceptance by industry in all countries. Thus, both the ISA Computer Control Workshop and the Language Standardization Workshop are taking advantage of the work of their compatriots throughout the world in developing a set of standards and recommended practices to guide our young but possibly overly-vigorous field. While we must be careful not to develop proposals which will have the effect of stifling a young and vigorously developing industry, there seems to be no doubt that enough is now known of our data and control system requirements to specify compatible data transmission facilities, code and signal standards, interconnection compatibility, and other items to assure a continued strong growth without a self-imposed obsolescence of otherwise perfectly functioning equipment.

SUMMARY

This short description has attempted to show some of the extensive standards work now being carried out by the Instrument Society of America in the field of the applications of digital computers to plant data collection, monitoring, and other automatic control tasks.
The continued success of this work will depend upon the cooperation with and acceptance of the overall results of these developments by the vendor and user company managements and the help of their personnel on the various committees involved.

REFERENCES

1. CAMAC-A Modular Instrumentation System for Data Handling, AEC Committee on Nuclear Instrument Modules, Report TID-25875, United States Atomic Energy Commission, Washington, D.C., July 1972.
2. CAMAC Organization of Multi-Crate System, AEC Committee on Nuclear Instrument Modules, Report TID-25816, United States Atomic Energy Commission, Washington, D.C., March 1972.
3. Supplementary Information on CAMAC Instrumentation System, AEC Committee on Nuclear Instrument Modules, Report TID-25877, United States Atomic Energy Commission, Washington, D.C., December 1972.
4. Anon., Minutes, Workshop on Standardization of Industrial Computer Languages, Purdue Laboratory for Applied Industrial Control, Purdue University, West Lafayette, Indiana; February 17-21 and September 29-October 2, 1969; March 2-6 and November 9-12, 1970; May 3-6 and October 26-29, 1971; April 24-27 and October 2-5, 1972; May 7-10, 1973.
5. Anon., "Hardware Testing of Digital Process Computers," ISA RP55.1, Instrument Society of America, Pittsburgh, Pennsylvania, October 1971.
6. Anon., Minutes, ISA Computer Control Workshop, Purdue Laboratory for Applied Industrial Control, Purdue University, West Lafayette, Indiana; May 22-24 and November 13-16, 1972.
7. Anon., "Industrial Computer System FORTRAN Procedures for Executive Functions and Process Input-Output," Standard ISA-S61.1, Instrument Society of America, Pittsburgh, Pennsylvania, 1972.
8. Anon., "Industrial Computer System FORTRAN Procedures for Handling Random Unformatted Files, Bit Manipulation, and Data and Time Information," Proposed Standard ISA-S61.2, Instrument Society of America, Pittsburgh, Pennsylvania, 1972.
9. Anon., "Working Paper, Industrial Computer FORTRAN Procedures for Task Management," Proposed Standard ISA-S61.3, Purdue Laboratory for Applied Industrial Control, Purdue University, West Lafayette, Indiana, 1972.
10. Curtis, R. L., "Functional Requirements for Industrial Computer Systems," Instrumentation Technology, 18, No. 11, pp. 47-50, November 1971.
11. Glossary Committee, Purdue Workshop on Standardization of Industrial Computer Languages, Dictionary of Industrial Digital Computer Terminology, Instrument Society of America, Pittsburgh, Pennsylvania, 1972.
12. Pike, H. E., "Procedural Language Development at the Purdue Workshop on Standardization of Industrial Computer Languages," paper presented at the Fifth World Congress, International Federation of Automatic Control, Paris, France, June 1972.
13. Shea, R. F., Editor, CAMAC Tutorial Issue, IEEE Transactions on Nuclear Science, Vol. NS-18, No. 2, April 1971.
14. Standard Operator's Console Guidebook, JEIDA-17-1972, Technical Committee on Interface Standards, Japan Electronics Industry Development Association, Tokyo, Japan, July 1972.

Quantitative evaluation of file management performance improvements*

by T. F. McFADDEN
McDonnell Douglas Automation Company, Saint Louis, Missouri

and J. C. STRAUSS
Washington University, Saint Louis, Missouri

INTRODUCTION

Operating systems generally provide file management service routines that are employed by user tasks to access secondary storage. This paper is concerned with quantitative evaluation of several suggested performance improvements to the file management system of the Xerox Data Systems (XDS) operating systems. The file management system of the new XDS Universal Time-Sharing System (UTS) operating system includes the same service routines employed by the older operating system, the Batch Time-Sharing Monitor (BTM). Models for both UTS1 and BTM2 have been developed to facilitate performance investigation of CPU and core allocation strategies. These models do not, however, provide the capability to investigate performance of the file management strategies.
A wealth of literature is available on file management systems. A report by Wilbur3 details a new file management design for the Sigma Systems. Other articles have been published to define basic file management concepts,4,5 to discuss various organization techniques,4,5,6 and to improve understanding of the current Sigma file management system.6,7 However, there is little published work on the performance of file management systems. The task undertaken here is to develop and test a simple quantitative method to evaluate the performance of proposed modifications to the file management system. Models are developed that reproduce current performance levels, and these models are employed to predict the performance improvements that will result from the implementation of specific improvement proposals. The models are validated against measured performance of the McDonnell Douglas Automation Company XDS Sigma 7 running under the BTM operating system.

The models developed here are extremely simple, deterministic representations of important aspects of file management. This use of simple models to represent very complex systems is finding increasing application in computer system performance work. The justification for working with these simple models on this application is twofold:

1. File management system behavior is not well understood, and simple models develop understanding of the important processes.
2. When applied properly, simple models can quantify difficult design decisions.

The underlying hypothesis of this and other work with simple models of computer systems is that system behavior must be understood at each successive level of difficulty before proceeding to the next. The success demonstrated here in developing simple models and applying them in the design process indicates that this present work is an appropriate first level in the complexity hierarchy of file management system models.

The work reported here has been abstracted from a recent thesis.8 Additional models and a more detailed discussion of system measurement and model verification are presented in Reference 8.

The paper is organized as follows. The next section describes the XDS file management system; current capabilities and operation are discussed, and models are developed and validated for opening and reading a file. Several improvement proposals are then modeled and evaluated in the third section.

* Abstracted from an M. S. Thesis of the same title submitted to the Sever Institute of Technology of Washington University by T. F. McFadden in partial fulfillment of the requirements for the degree of Master of Science, May 1972. This work was partially supported by National Science Foundation Grant GJ-33764X.

CURRENT XDS FILE MANAGEMENT SYSTEM

This section is divided into three parts: description of the file management capabilities, description of the file management system structure, and development and validation of models for the open and read operations.
Description of capabilities

A file, as defined by XDS,1 is an organized collection of space on the secondary storage devices that may be created, retrieved, modified, or deleted only through a call on a file management routine. Each file is a collection of records. A record is a discrete subset of the information in a file that is accessed by the user independent of other records in the file. When the file is created the organization of the records must be specified. Records may be organized in a consecutive, keyed, or random format. The space used by a consecutive or keyed file is dynamically controlled by the monitor; the space used by a random file must be requested by the user when the file is created, and it never changes until the file is released.

Open-When a file is going to be used it must be made available to the user via a file management routine called OPEN. When the file is opened, the user may specify one of the following modes: IN, that is, read only; OUT, write only; INOUT, update; OUTIN, scratch. When a file is opened OUT or OUTIN it is being created; when it is opened IN or INOUT it must already exist. The open routine will set some in-core pointers to the first record of the file if it has been opened IN or INOUT, and to some free space that has been allocated for the file if it has been opened OUT or OUTIN. These pointers are never used by the user directly. When the user reads or writes a record the in-core pointers are used by file management to trace the appropriate record.

Close-When the user has completed all operations on the file he must call another routine named CLOSE. A file can be closed with RELEASE or with SAVE. If RELEASE is specified then all space currently allocated to the file is placed back in the monitor's free space pool and all pointers are deleted. If SAVE is specified then the in-core pointers are written to file directories maintained by the monitor so that the file can be found when it is next opened.

Read and Write-There are a number of operations that may be performed on records. The two that are of primary interest are read and write. In a consecutive file the records must be accessed in the order in which they are written. Records in a keyed file can be accessed directly, with the associated key, or sequentially. Random files are accessed by specifying the record number relative to the beginning of the file. Random file records are fixed length (2048 characters). When reading or writing, the user specifies the number of characters he wants, a buffer to hold them, and a key or record number.

File management system structure

The structures used to keep track of files and records are described below.

File Structure-The account number specified by a calling program is used as a key to search a table of accounts called an account directory (AD). There is an entry in the AD for each account on the system that has a file. The result of the AD search is a pointer to the file directory. There is only one AD on the system and it is maintained in the 'linked-sector' format (a double linked list of sector-size [256 word] blocks). Each entry in the AD contains the account number and the disc address of the file directory (FD). An FD is a 'linked-sector' list of file names in the corresponding account. With each file name is a pointer to the File Information Table (FIT) for that file. The FIT is a 1024 character block of information on this file. It contains security, file organization, and allocation information. The FIT points to the table of keys belonging to this file. Figure 1 presents a schematic of the file structure.

Figure 1-File structure (account directory, file directories, file information tables, security and record pointers)
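To make the directory chain concrete, the following minimal sketch (not from the original paper) walks the open path just described: the account number selects an entry in the account directory (AD), the file name selects an entry in the file directory (FD), and the resulting pointer locates the File Information Table (FIT). The dictionary-based structures and all names are illustrative assumptions, not the actual XDS monitor tables.

    # Illustrative model of the open-time directory search described above.
    # The in-memory dictionaries stand in for the linked-sector AD and FD lists;
    # every name and address here is hypothetical.

    account_directory = {            # AD: account number -> disc address of FD
        "A100": "fd@sector17",
    }
    file_directories = {             # FD: one per account, file name -> FIT address
        "fd@sector17": {"PAYROLL": "fit@sector42"},
    }
    file_information_tables = {      # FIT: security, organization, allocation info
        "fit@sector42": {"organization": "keyed", "master_index": "mix@sector90"},
    }

    def open_file(account, file_name):
        """Trace the AD -> FD -> FIT chain and return the FIT contents."""
        fd_address = account_directory[account]                 # search the AD
        fit_address = file_directories[fd_address][file_name]   # search the FD
        return file_information_tables[fit_address]             # read the FIT

    fit = open_file("A100", "PAYROLL")
    print(fit["organization"], fit["master_index"])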
Sequential Access and Keyed Structure-In describing record access, attention is restricted to sequential accesses. The structure of consecutive and keyed files is identical. Both file organizations allow sequential accesses of records. Because the structures are the same and both permit sequential accesses, there is no true consecutive organization. All of the code for this organization is imbedded in the keyed file logic. The only difference between the two structures is that records in a consecutive file may not be accessed directly. The result of this implementation is that most processors have been written using keyed files rather than consecutive files, because there is an additional capability offered by keyed files and there is no difference in speed in sequentially accessing records on either structure. Measurements have established that only 16 percent of the reads on the system are done on consecutive files, while 94 percent of the reads on the system are sequential accesses. For these reasons, emphasis is placed on improving sequential accesses.

Once a file is opened there is a pointer in the monitor Current File User (CFU) table to a list of keys called a Master Index (MIX). Each MIX entry points to one of the data granules associated with the file. Data granules are 2048 character blocks that contain no allocation or organization information. Figure 2 is a schematic of the record structure.

Figure 2-Record structure (CFU entry pointing to the Master Index, whose entries point to data granules)

Each entry in the MIX contains the following fields:

(a) KEY-In a consecutive file the keys are three bytes in length. The first key is always zero and all others follow and are incremented by one.
(b) DA-The disc address of the data buffer that contains the next segment of the record that is associated with this key. Data buffers are always 2048 characters long. The disc address field is 4 characters.
(c) DISP-The byte displacement into the granule of the first character of the record segment.
(d) SIZE-Number of characters in this record segment.
(e) C-A continuation bit to indicate whether there is another record segment in another data granule.
(f) FAK-First Appearance of Key. When FAK = 1 this entry is the first with this key.
(g) EOF-When set, this field indicates that this is the last key in the MIX for this file (End Of File).
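As a rough illustration of how the Master Index fields above drive a sequential read, the sketch below assembles one logical record from its MIX entries, following the continuation bit across data granules. The record layout, granule contents and helper names are assumptions for illustration only; they are not the monitor's actual code.

    from dataclasses import dataclass

    @dataclass
    class MixEntry:
        key: int      # three-byte key, starting at zero for consecutive files
        da: str       # disc address of the 2048-character data granule
        disp: int     # byte displacement of this record segment in the granule
        size: int     # number of characters in this record segment
        cont: bool    # continuation bit: another segment follows in another granule
        fak: bool     # first appearance of this key
        eof: bool     # last key in the MIX for this file

    # Hypothetical 2048-character granules keyed by disc address (illustrative only).
    granules = {"da1": "X" * 2000 + "HELLO, " + "Z" * 41,
                "da2": "WORLD" + "Y" * 2043}

    def read_record(entries, key):
        """Collect every segment carrying `key`, honouring the continuation bit."""
        record = ""
        for e in entries:
            if e.key == key:
                record += granules[e.da][e.disp:e.disp + e.size]
                if not e.cont:
                    break
        return record

    mix = [MixEntry(0, "da1", 2000, 7, True, True, False),
           MixEntry(0, "da2", 0, 5, False, False, True)]
    print(read_record(mix, 0))   # -> "HELLO, WORLD"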
The account directories, file directories, filr information tables and master indices arp always allocated in units of a sector and are always allocated on a RAD if there is room. Data granules are allocated in singh> granulp units and are placed on a disc if there is room. The S\VAP RAD is never used for file allocation. Table I presents the characteristics for the secondary storage devices referenced here. If s is the size in pages of an access from device d, the average time for that access (TAd(s» is: TAd(s) = Ld + Sd + 2*s*T~fd where: Ld is thf' aVf'rage latf'ncy time of drvice d Sd is the average seek time of device d T~ld is the average multiple sector transfer time of device d. When accessing AD, FD, ::\lIX or FIT sectors, the average transfer time for a single sector from device d is: CORE CFU FILEI MASTER INDEX KEYl KEY2 - ACCTZKEYn ~ Figure 2- Record structure DATA GRANULE 65 TABLE I-Speeds of Sigma Secondary Storage Devices DEVICE Variable Kame Capacity (granules) Latency (average ms) Seek Time-average -range Transfer Time (one sector) (ms per sector) Transfer Time (multiple sectors) (ms per sector) 7212 RAD 7232 RAD Symbol SRAD RAD 2624 17 0 0 T .34 3072 17 0 0 2.67 TM .41 2.81 L S 7242 DISC PACK DP 12000 12.5 75 ms 25-135 ms 3.28 4.08 ,,,here: Td IS the average single sector transfer time for device d. illodels of current operation :\fodrls are devf'loped and validated for the open file and read rpcord opprations. The model algorithms are closely patternf'd after a simple description of system operation such as found in Referencf's 3 or 8. Reff'rence 8 also develops a model for the write operation and describes in detail the measurrmf'nts for specifying and validating the modf'ls. Open 111 odel-A simple model for the time it takes to opf'n a file is: TO = TPC + TOV + TFA + TFF + TFFIT ,vhere: TPC TOV TFA time to process the request for monitor services. = time to read in monitor overlay. = timp to search AD and find the entry that matches the requested account. TFF = time to search FD and find FIT pointer for the requested file. TFFIT = time to read FIT and transfer the allocation and organization information to the CFU and user DCB. = The functions for TOV, TF A, TFF and TFFIT can be refined and expressed using the following paramf'tf'rs: PADR = probability that an AD is on a RAD instead of a disc pack. PFDR = probability that an FD is on a RAD instead of a disc pack. PFITR = probability that an FIT is on a RAD. XAD =aVf'rage number of AD sectors. XFD = average number of FD sectors per account. TADl = time it takes to discover that the in-core AD sector does not contain a match. TFDl = same as TADl for an FD sector. TAD2 = timf' it takes to find the correct f'ntry in the incorf' AD sector given that the entry is either in this Sf'ctor or thf're is no such account. 66 National Computer Conference, 1973 TABLE II -Observable Open Parameters PADR PFDR PFITR NAD NFD SO 186.17 ms .1 ms 1.5 ms .1 ms 1.5 ms .333 1.7 ms TO TAD1 TAD2 TFD1 TFD2 PON TPC 1.00 .872 .906 6 sectors 2.6 sectors 2.5 pages TFD2 = same as TAD2 for an FD sector. PON = the probability that, when a request is made for the open overlay, it is not in core and therefore the RAD must be accessed. SO = number of granules occupied by the open overlay on the SWAP RAD. The time for the open overlay can be expressed as: TOV = PON*TAsRAD(SO) and the time to find the file directory pointer is: NAD TFA=PADR* --*(TS RAD +TAD1) being spent on opening a file. 
Models of current operation

Models are developed and validated for the open file and read record operations. The model algorithms are closely patterned after a simple description of system operation such as found in References 3 or 8. Reference 8 also develops a model for the write operation and describes in detail the measurements for specifying and validating the models.

Open Model-A simple model for the time it takes to open a file is:

TO = TPC + TOV + TFA + TFF + TFFIT

where:
TPC = time to process the request for monitor services.
TOV = time to read in the monitor overlay.
TFA = time to search the AD and find the entry that matches the requested account.
TFF = time to search the FD and find the FIT pointer for the requested file.
TFFIT = time to read the FIT and transfer the allocation and organization information to the CFU and user DCB.

The functions for TOV, TFA, TFF and TFFIT can be refined and expressed using the following parameters:

PADR = probability that an AD is on a RAD instead of a disc pack.
PFDR = probability that an FD is on a RAD instead of a disc pack.
PFITR = probability that an FIT is on a RAD.
NAD = average number of AD sectors.
NFD = average number of FD sectors per account.
TAD1 = time it takes to discover that the in-core AD sector does not contain a match.
TFD1 = same as TAD1 for an FD sector.
TAD2 = time it takes to find the correct entry in the in-core AD sector, given that the entry is either in this sector or there is no such account.
TFD2 = same as TAD2 for an FD sector.
PON = the probability that, when a request is made for the open overlay, it is not in core and therefore the RAD must be accessed.
SO = number of granules occupied by the open overlay on the SWAP RAD.

The time for the open overlay can be expressed as:

TOV = PON*TA_SRAD(SO)

and the time to find the file directory pointer is:

TFA = PADR*(NAD/2)*(TS_RAD + TAD1) + (1-PADR)*(NAD/2)*(TS_DP + TAD1) + TAD2 - TAD1

and the time to find the file information table pointer is:

TFF = PFDR*(NFD/2)*(TS_RAD + TFD1) + (1-PFDR)*(NFD/2)*(TS_DP + TFD1) + TFD2 - TFD1

and the time to read the file information table is:

TFFIT = PFITR*TS_RAD + (1-PFITR)*TS_DP

Table II contains the values of observable parameters measured in the period January through March, 1972. Table III contains the values of computable parameters discussed in the open model.

TABLE II-Observable Open Parameters

TO      186.17 ms        PADR     1.00
TAD1    .1 ms            PFDR     .872
TAD2    1.5 ms           PFITR    .906
TFD1    .1 ms            NAD      6 sectors
TFD2    1.5 ms           NFD      2.6 sectors
PON     .333             SO       2.5 pages
TPC     1.7 ms

TABLE III-Computed Open Parameters

TA_SRAD(SO)    19.0 ms
TOV            6.3 ms
TFA            60.7 ms
TFF            30.3 ms
TO             125.4 ms

The difference between the two figures, the observed and computed values of TO, is 33 percent. There are a number of ways this figure can be improved:

(a) When the TO of 186 ms was observed, it was not possible to avoid counting cycles that were not actually being spent on opening a file. The system has symbiont activity going on concurrently with all other operations. The symbionts buffer input and output between the card reader, on-line terminals, the RADs and the line printer. So the symbionts steal cycles which are being measured as part of open, and they also produce channel conflicts. Neither of these is considered by the model.

(b) The figure NFD is supposed to reflect the number of file directories in an account. The measured value is 2.6. Unfortunately there is a large percentage (40 percent) of accounts that are very small, perhaps less than one file directory sector (30 files). These accounts are not being used. The accounts that are being used 90 percent of the time have more than three file directory sectors. Therefore, if the average number of FD sectors searched to open a file had been observed rather than computed, the computed value for TO would have been closer to the observed TO.
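The open-time expressions lend themselves to a small calculator. The sketch below is an illustration, not the authors' program: it plugs the Table II parameters into the formulas above, using single-sector times derived from Table I. TOV and TFA come out essentially as in Table III, while TFF, TFFIT and TO depend on intermediate access times the paper does not list explicitly, so those figures differ somewhat from the published computed values.

    # Observable open parameters from Table II (times in ms).
    PADR, PFDR, PFITR = 1.00, 0.872, 0.906
    NAD, NFD, SO      = 6.0, 2.6, 2.5
    TAD1, TAD2        = 0.1, 1.5
    TFD1, TFD2        = 0.1, 1.5
    PON, TPC          = 0.333, 1.7

    # Single-sector and overlay access times computed from Table I; these
    # intermediate values are assumptions rather than figures quoted in the paper.
    TS_RAD = 17.0 + 0.0 + 2.67          # 19.67 ms
    TS_DP  = 12.5 + 75.0 + 3.28         # 90.78 ms
    TA_SRAD_SO = 17.0 + 2 * SO * 0.41   # 19.05 ms for the 2.5-page overlay

    TOV   = PON * TA_SRAD_SO                                    # ~6.3 ms (Table III: 6.3)
    TFA   = (PADR * (NAD / 2) * (TS_RAD + TAD1)
             + (1 - PADR) * (NAD / 2) * (TS_DP + TAD1)
             + TAD2 - TAD1)                                     # ~60.7 ms (Table III: 60.7)
    TFF   = (PFDR * (NFD / 2) * (TS_RAD + TFD1)
             + (1 - PFDR) * (NFD / 2) * (TS_DP + TFD1)
             + TFD2 - TFD1)
    TFFIT = PFITR * TS_RAD + (1 - PFITR) * TS_DP
    TO    = TPC + TOV + TFA + TFF + TFFIT

    print(round(TOV, 1), round(TFA, 1), round(TFF, 1), round(TFFIT, 1), round(TO, 1))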
Read Model-To simplify the read model these assumptions are made:

(a) This read is not the first read. The effect of this assumption is that all buffers can be assumed to be full. Since there are an average of 193 records per consecutive file, the assumption is reasonable.
(b) The record being read exists. This assumption implies the file is not positioned at the end of file. Again, only 1 percent of the time will a read be on the first or last record of the file.
(c) The record size is less than 2048 characters. This assumption is made so that the monitor blocks the record. The average record size is 101.6 characters.

These assumptions not only simplify the model, but they reflect the vast majority of reads. The time to read a record can therefore be written as:

TR = TPC + TGE + TTR + PEC*TGE

where:
TPC = time to process the request to determine that it is a read request. This parameter also includes validity checks on the user's DCB and calling parameters.
TGE = time to get the next key entry (even if the next entry is in the next MIX) and make sure that the corresponding data granule is in core.
TTR = time to transfer the entire record from the monitor blocking buffer to the user's buffer.
PEC = the probability that a record has two entries (entry continued).

The probability, PEM, that the next MIX entry is in the resident MIX can be expressed as a function of the average number of entries, NEM, in a MIX sector:

PEM = (NEM - 1)/NEM

The probability, PDG, that the correct data granule is in core is a function of the number of times the data granule addresses change when reading through the MIX, relative to the number of entries in the MIX. Table IV presents the observed values of the parameters used in the sequential read model. The computed results for the read model are found in Table V.

TABLE IV-Observed Read Parameters

TPC     0.65 ms          TMS      2 ms
TTR     1.00 ms          PMIXR    0.585
TR      20.35 ms         NEM      47.7 entries

TABLE V-Computed Read Parameters

TR      11.8 ms          PEC      0.042
TGE     9.7 ms           PDG      0.93
PEM     0.979            TRMIX    49.18 ms

The difference between the computed and observed values for TR is 42 percent. The error can be improved by refining the observed values to correct the following:

(a) The symbionts were stealing cycles from the read record routines and producing channel conflicts that the timing program did not detect.
(b) In addition, the observed value of TR includes time for reads that were direct accesses on a keyed file. These direct accesses violate some of the read model assumptions because they frequently cause large scale searches of all the Master Index sectors associated with a file.
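For the read model, the relationships above can likewise be checked numerically. The sketch below is illustrative only; it takes TGE itself from Table V, since the internal expression for TGE is not reproduced in this excerpt, and confirms that PEM and TR agree with the computed values reported there.

    # Observed read parameters (Table IV) and the computed TGE from Table V (ms).
    TPC, TTR, PEC = 0.65, 1.00, 0.042
    NEM = 47.7            # average number of entries in a Master Index sector
    TGE = 9.7             # time to get the next key entry (Table V)

    # Probability that the next MIX entry is already in the resident MIX sector.
    PEM = (NEM - 1) / NEM
    print(round(PEM, 3))  # 0.979, as in Table V

    # Time to read one sequential record: TR = TPC + TGE + TTR + PEC*TGE.
    TR = TPC + TGE + TTR + PEC * TGE
    print(round(TR, 1))   # 11.8 ms, the computed value in Table V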
MODELS OF THE PROPOSALS

In this section, two of the performance improvement proposals presented in Reference 8 are selected for quantitative evaluation. One proposal impacts the open model and the other proposal impacts the read model. The models developed previously are modified to predict the performance of the file management system after implementation of the proposals.

A proposal that impacts the open/close routines

Implementation Description-To preclude the necessity of searching the AD on every open, it is proposed that the first time a file is opened to an account the disc address of the FD will be kept in one word in the context area. In addition, as an installation option, the system will have a list of library accounts that receive heavy use. When the system is initialized for time-sharing it will search the AD looking for the disc address of the FD for each account in its heavy use list. The FD pointers for each heavy use account will be kept in a parallel table in the monitor's data area. The result is that better than 91 percent of all opens and closes will be in accounts whose FD pointers are in core. For these opens and closes the AD search is unnecessary.

Effect on Open Model-For the majority of opens the above proposal gives the open model as:

TO' = TPC + TOV + TFF + TFFIT = TO - TFA = 125.4 - 60.7 = 64.7

This represents a percentage improvement of 51.5 percent. (The figures used for TO and TFA are in Table III.)

A proposal that impacts the read/write routines

Implementation Description-The significant parameter in the read model is the time to get the next entry, TGE. There are two expressions in TGE which must be considered: the first is the average time to get the next Master Index entry; the second is the average time to make sure the correct data granule is in core. The levels of these expressions are 1.03 and 6.7 ms. It is apparent that the number of times that data granules are read is contributing a large number of cycles to both the read and write models. One of the reasons for this, of course, is the disproportionately large access time on a disc pack compared to a RAD. Nevertheless it is the largest single parameter, so it makes sense to attack it.

A reasonable proposal to decrease the number of data granule accesses is to double the buffer size. The model is developed so that the size of the blocking buffer parameter can be varied to compare the effect of various sizes on the read and write models.

Effect of Proposal on the Read Model-The proposal outlined above will impact only one parameter in the read model, PDG. PDG represents the probability that the correct data granule is in core. Its current value is 0.93. Now, there are three reasons that a Master Index entry will point to a different data granule than the one pointed to by the entry that preceded it:

1. The record being written is greater than 2048 characters and therefore needs one entry, each pointing to a different data granule, for every 2048 characters.
2. The original record has already been overwritten by a larger record, requiring a second Master Index entry for the characters that would not fit in the space reserved for the original record. The second entry may point to the same data granule, but the odds are ten to one against this because there are, on the average, 10.6 data granules per file.
3. There was not enough room in the data granule currently being buffered when the record was written. When this occurs the first Master Index entry points to those characters that would fit into the current data granule, and the second entry points to the remaining characters that are positioned at the beginning of the next data granule allocated to this file.

The first two reasons violate the assumptions for the read model and are not considered further here. The third reason is the only one that will be affected by changing the data granule allocation. It follows that if there are 10.6 data granules per file, by doubling the allocation size there will be 5.3 data 'granules' per file. This effectively divides by two the probability that a record had to be continued because of the third item above. Tripling the size of the blocking buffer and data 'granule' would divide the probability by three.

The question to be resolved at this point is: what share of the 7 percent probability that the data granule address will change can be attributed to the third reason above? A reasonable approximation can be developed by the following argument:

(a) There are 2048*n characters in a data 'granule'.
(b) There are 101.6 characters per record.
(c) Therefore there are 20.15*n records per data 'granule'.
(d) The 20*n-th record will have two Master Index entries.
(e) Then, on the average, one out of every 20*n entries will have a data granule address that is different from the address of the preceding entry. This probability is 1/(20.15*n), which is .0496/n.

Then for n = 1, as in the original read and write models, 5 of the 7 percent figure for 1 - PDG is attributable to records overlapping data granule boundaries. The actual equation for PDG is:

PDG = 1 - (.02 + .05/n)

where n is the number of granules in the monitor's blocking buffer. The impact of various values of n on the read model is listed in Table VI.

TABLE VI-Impact of Blocking Buffer of Size N Pages on Read Model

N (pages)    PDG      TGE (ms)    TR (ms)
1            .930     9.729       11.78
2            .955     7.337       9.20
5            .970     5.903       7.80
10           .975     5.424       7.30

It is obvious that by increasing the blocking buffer size, the performance of the read model can be improved. However, the amount of improvement decreases as n increases.
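The effect of the blocking-buffer size on PDG can be reproduced directly from the expression above. The sketch below (illustrative only) evaluates PDG = 1 - (.02 + .05/n) for the buffer sizes of Table VI, then uses the read times listed there to show the diminishing return that the following paragraph discusses.

    # PDG as a function of the blocking-buffer size n (granules), per the text:
    # two percentage points of granule-address changes come from oversized or
    # overwritten records, and 5/n points from records overlapping granule boundaries.
    def pdg(n):
        return 1 - (0.02 + 0.05 / n)

    for n in (1, 2, 5, 10):
        print(n, round(pdg(n), 3))       # 0.93, 0.955, 0.97, 0.975

    # Table VI's read times show that going from one to two granules buys most
    # of the gain.
    TR_table = {1: 11.78, 2: 9.20, 5: 7.80, 10: 7.30}
    improvement = (TR_table[1] - TR_table[2]) / TR_table[1]
    print(round(100 * improvement, 1))   # 21.9, roughly the 21 percent quoted below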
If there are no other considerations, a blocking buffer size of two promises an improvement of 21 percent in the read routine. Perhaps a user should be allowed to set his own blocking buffer size. A heavy sort user that has very large files, for example, can be told that making his blocking buffers three or four pages long will improve the running time of his job. A final decision is not made here because the profile of jobs actually run at any installation must be considered. In addition, there are problems like: Is there enough core available? Is the swap channel likely to become a bottleneck due to swapping larger users? These problems are considered in Reference 8 and found not to cause difficulty for the operational range considered here.

CONCLUSIONS

This paper does not pretend to develop a complete file management system model. Such a model would necessarily contain model functions for each file management activity and some means of combining these functions with the relative frequency of each activity. The model result would then be related to system performance parameters such as the number of users, the expected interactive response time and the turn-around time for compute bound jobs.

The described research represents an effort to model a file management system. The model is detailed enough to allow certain parameters to be changed and thus show the impact of proposals to improve the system. The development of the model is straightforward, based on a relatively detailed knowledge of the system. This type of model is sensitive to changes in the basic algorithms. The model is developed both to further understanding of the system and to accurately predict the impact on the file management system of performance improvements.

Work remains to be done to integrate the model into overall system performance measures. However, comparisons can be made with this model of different file management strategies. Development of similar models for other systems will facilitate the search for good file management strategies.

REFERENCES

1. Bryan, G. E., Shemer, J. E., "The UTS Time-Sharing System-Performance Analysis and Instrumentation," Second Symposium on Operating Systems Principles, October 1969, pp. 147-158.
2. Shemer, J. E., Heying, D. W., "Performance Modeling and Empirical Measurements in a System Designed for Batch and Time-Sharing Users," Fall Joint Computer Conference Proceedings, 1969, pp. 17-26.
3. Wilbur, L. E., File Management System, Technical Design Manual, Carleton University, 1971.
4. Chapin, N., "Common File Organization Techniques Compared," Fall Joint Computer Conference Proceedings, 1969, pp. 413-422.
5. Collmeyer, A. J., "File Organization Techniques," IEEE Computer Group News, March 1970, pp. 3-11.
6. Atkins, D. E., Mitchell, A. L., Nielson, F. J., "Towards Understanding Data Structures and Database Design," Proceedings of the 17th International Meeting of the XDS Users Group, Vol. 2, November 1971, pp. 128-192.
7. Xerox Batch Processing Monitor (BPM), Xerox Data Systems, Publication No. 901528A, July 1971.
8. McFadden, T. F., Quantitative Evaluation of File Management Performance Improvements, M. S. Thesis, Washington University, St. Louis, Missouri, May 1972.

A method of evaluating mass storage effects on system performance

by M. A. DIETHELM
Honeywell Information Systems, Phoenix, Arizona

INTRODUCTION
In -tIie case of adding a fast device to the configuration, this objective is strongly correlated, within reasonable limits, with an allocation policy which maximizes the resulting 110 activity to the fastest device in the mass storage configuration. This allocation policy can be mathematically modelled as an integer linear programming problem which includes the constraint of a specified amount of fast device storage capacity. Having the file allocations and resulting I/O activity profiles for a range of fast access device capacities, the expected system performance change can be estimated by use of an analytical or simulation model which includes the parameters of proportionate distribution of I/O activity to device types and device physical parameters as well as CPU and main memory requirements of the job stream. The results of application of an analytic model are described and discussed in the latter paragraphs as a prelude to inferring any conclusions. The analytic model is briefly described in the Appendix. A significant proportion of the cost and usefulness of a computing system lies in its configuration of direct access mass storage. A frequent problem for computing installation management is evaluating the desirability of a change in the mass storage configuration. This problem often manifests itself in the need for quantitative decision criteria for adding a fast (er) direct access device such as a drum, disk or bulk core to a configuration which already includes direct access disk devices. The decision criteria are hopefully some reasonably accurate cost versus performance functions. This paper discusses a technique for quantifying the system performance gains which could be reasonably expected due to the addition of a proposed fast access device to the system configuration. It should be noted that the measurement and analysis techniques are not restricted to the specific question of an addition to the configuration. That particular question has been chosen in the hope that it will serve as an understandable illustration for the reader and in the knowledge that it has been a useful application for the author. The system performance is obviously dependent upon the usage of the mass storage configuration, not just on the physical parameters of the specific device types configured. Therefore, the usage characteristics must be measured and modelled before the system performance can be estimated. This characterization of mass storage usage can be accomplished by considering the mass storage space as a collection of files of which some are permanent and some are temporary or dynamic (or scratch). A measurement on the operational system will then provide data on the amount of activity of each of the defined files. The mass storage space is thereby modelled as a set of files, each with a known amount of I/O activity. The measurement technique of file definition and quantification of 110 activity for each is described first in this paper along with the results of an illustrative application of the technique. The next step in predicting the system performance with an improved (hopefully) mass storage configuration is to decide which files will be allocated where in the revised mass storage configuration. The objective is to MASS STORAGE FILE ACTIVITY MEASUREMENT The first requirement in determining the files to be allocated to the faster device is to collect data on the frequency of access to files during normal system operation. 
Thus the requirement is for measurements of the activity on the existing configuration's disk subsystem. Such measurements can be obtained using either hardware monitoring facilities or software monitoring techniques. Hardware monitoring has the advantage of being non-interfering; that is, it adds no perturbation to normal system operation during the measurement period. A severe disadvantage to the application of hardware monitoring is the elaborate, and expensive, equipment required to obtain the required information on the frequency of reference to addressable, specified portions of the mass storage. The preferred form of the file activity data is a histogram which depicts frequency of reference as a function of mass storage address. Such histograms can be garnered by use of more recent hardware monitors 69 70 National Computer Conference, 1973 which include an address distribution capability, subject to, of course, the disadvantages of cost, set up complexity, and monitor hardware constrained histogram granularity. A more flexible method of gathering the required information is a software monitor. This method does impose a perturbation to system operation but this can be made small by design and code of the monitor program. It has the strong advantage of capturing data which can be analyzed ::lfter the fact to produce any desired reports with any desired granularity. A software monitor designed to work efficiently within GCOS* was utilized to obtain file activity data for the analysis described for illustration. This 'privileged' software obtains control at the time of initiation of any I/O command by the processor and gathers into a buffer information describing the I/O about to be started and the current system state. This information, as gathered by the measurement program used for this study includes the following: Job Characteristics Job and activity identification File identification of the file being referenced Central Processing Unit time used by the job Physical I/O Characteristics Subsystem, channel and device identification I/O command(s) being issued Seek address Data transfer size The information gathered into the data buffer is written to tape and subsequently analyzed by a free standing data reduction program which produced histograms of device and file space accesses, seek movement distances, device utilization and a cross reference listing of files accessed by job activities. Of primary concern to the task of selecting files to be allocated to a proposed new device are the histograms of device and file space accesses. A histogram showing the accesses to a device is shown in Figure 1. This histogram is one of 18 device histograms resulting from application of the previously described measurement techniques for a period of 2 1/4 hours of operation of an H6070 system which included an 18 device DSS181 disk storage subsystem. The method of deriving file access profiles will be illustrated using Figure 1. The physical definition of permanently allocated files is known to the file system and to the analyst. Therefore, each area of activity on the histograms can be related to a permanently allocated file if it is one. If it is not a permanently allocated area, then it represents the collective activity over the measurement period of a group of temporary files which were allocated dynamically to that physical device area. 
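The reduction step just described, turning the monitored I/O records into access histograms and then into per-file activity proportions, can be pictured with a short sketch. The record format, cylinder bands and file-boundary table below are hypothetical stand-ins for the GCOS trace data and the installation's file map, offered only to illustrate the bookkeeping.

    from collections import Counter

    # Hypothetical trace records captured at I/O initiation: (device, seek cylinder).
    # The real monitor also logs job, file, command and transfer-size information.
    trace = [("DEV1", 3), ("DEV1", 4), ("DEV1", 120), ("DEV1", 121), ("DEV1", 122)]

    # Accesses per 5-cylinder band, as in the published histograms.
    histogram = Counter((dev, cyl // 5) for dev, cyl in trace)
    print(dict(histogram))

    # Known permanent-file boundaries on the device (illustrative cylinder ranges);
    # anything else is treated as a dynamically allocated temporary-file area.
    permanent_files = {"GCOS-LO-USE": range(0, 10), "LUMP": range(118, 125)}

    def file_for(dev, cyl):
        for name, cyls in permanent_files.items():
            if cyl in cyls:
                return name
        return dev + "-MISC"     # collective temporary activity, e.g. STI-MISC

    activity = Counter(file_for(dev, cyl) for dev, cyl in trace)
    total = sum(activity.values())
    for name, count in activity.items():
        print(name, round(100 * count / total, 1), "percent of accesses")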
Figure 1 depicts the activity of some permanent files (GCOS catalogs, GCOS-LO-USE, SW-LO-USE, LUMP) and an area in which some temporary files were allocated and used by jobs run during the measurement period (STI-MISC).

Figure 1-Histogram of accesses to device 1 space (permanent files GCOS catalogs, GCOS-LO-USE, SW-LO-USE and LUMP, and the temporary area STI-MISC)

The leftmost column of the histogram gives the number of accesses within each 5 cylinder area of the disk pack and consequently is used to calculate the number of accesses to each file, permanent or temporary, defined from the activity histograms for each device. Often the accessing pattern on a device is not concentrated in readily discernible files as shown in Figure 1, but is rather randomly spread over a whole device. This is the case with large, randomly accessed data base files as well as for large collections of small user files, such as the collection of programs saved by the system's time sharing users. In these cases the activity histograms take the general form of the one shown in Figure 2. In this case no small file definition and related activity is feasible, and the whole pack is defined as a file to the fast device allocation algorithm.

Figure 2-Histogram of accesses to device 2 space

* GCOS is the acronym for the General Comprehensive Operating Supervisor software for H6000 systems.

The resulting files defined and access proportions for the mentioned monitoring period are summarized in Table I. The unit of file size used is the "link," a GCOS convenient unit defined as 3840 words, or 15360 bytes. Table I provides the inputs to the file allocation algorithm, which is described in the following section.
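The selection formalized in the next section, choosing the subset of files that captures the most I/O activity without exceeding the fast device's capacity, is a 0/1 knapsack problem. The study poses it as an integer linear program; the sketch below instead solves the same selection with a small dynamic program, using made-up file sizes and activities in the spirit of Table I, purely as an illustration of the idea.

    # Files as (name, size_in_links, activity_percent); all values are hypothetical.
    files = [("CATALOG", 80, 12.0), ("LIB-HI-USE", 60, 9.0),
             ("SCRATCH-A", 300, 6.5), ("USER-PACK", 700, 5.0), ("SCRATCH-B", 250, 3.0)]
    capacity = 600   # assumed fast-device capacity, in links

    # 0/1 knapsack by dynamic programming over capacity (sizes are integral links).
    best = [(0.0, [])] * (capacity + 1)
    for name, size, activity in files:
        new_best = best[:]
        for c in range(size, capacity + 1):
            value, chosen = best[c - size]
            if value + activity > new_best[c][0]:
                new_best[c] = (value + activity, chosen + [name])
        best = new_best

    value, chosen = best[capacity]
    print(chosen, round(value, 1), "percent of I/O captured")
    # -> ['CATALOG', 'LIB-HI-USE', 'SCRATCH-A'] 27.5 percent of I/O captured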
For purposes of illustration it will be assumed that this allocation problem may be characterized as the selection of that subset of files which will fit on a constrained capacity device. The selected files will then result in the maximum proportion of I/O activity for the new device being added to the mass storage configuration. This problem of selecting a subset of files may be formulated as an integer linear programming application as follows. GIVEN A set of n pairs (8i' Ii) , defined as: 8i= Size (links) of the ith file, Select from the given set of files that subset which maximizes the sum of the reference frequencies to the selected subset while keeping the sum of the selected subset file sizes less than or equal to the given fast device capacity limitation. lVIATHE::.\1ATICAL FORMULATION Define Oi = o- ith file is not selected {1-ith file is selected for allocation to fast device. Then the problem is to find, :MAX Z= [ :E /i·Oi] i=l,n Subject to, L....J " 8··0·d and the comptrxity of the time-sharing ' - 3 computer systems, TSCS, necessitate a good deal of effort to be spent on the analysis of the resource allocation problems which are obviously tied to the cost and the congestion properties of the system configuration. In this paper we study the congestion problems in the multiprogramming ' - 3 TSCS's. The activity in the past sixteen years in the analysis of the congestion properties of the TSCS's by purely analytical means has been concentrated on the study of the central processing unit (or units), CPU ('s), and the related queues. A good survey of the subjf'ct has bf'f'n provided by }IcKinney.4 In the meantime more contributions to this area have appeared in the literature. 5- 11 There has been a separatf' but not so active interest in the study of the congestion problems in the multiprogramming systems. Heller '2 and }lanacher'3 employed Gantt charts (cf. 12) in the scheduling of the sequential "job lists" over different processors. Ramamoorthy et a1.'4 considered the same problem with a different approach. A single server priority queueing model was applied by Chang et al. ls to analyze multiprogramming. Kleinrock '6 used a 2-stage (each stage with a single exponential server) "cyclic qUf'ueing"'7 model to study the multistage sequential srrvers. Later Tanaka 'S extended thr 2-stage cyclic queuring model to include the Erlang distribution in one of the stagf's. Gavf'r19 extendrd the 2-stage cyclic queueing model of multiprogramming computer systems furthf'r by including an arbitrary number of identical procrssors with rxponrntial service timrs in thf' input-output, 10, stagr in providing the busy period analysis for the CPU undrr various procrssing distributions. Herr wr prrsent a multiprogramming modrl for a genrral purpose TSCS, whrrr thr arrival and thf' drparture processes of thr programs along with thr main systrm resourcrs, i.e., the CPl~'s, thp diffprrnt kind of lOP's and thp finitr main memory sizf', are included and their relationships are examined. :\IODEL 2:i:l The configuration of the TSCS wr want to study consists of ko~ 1 identical CPU's with m groups of lOP's whf're the ith 87 88 National Computer Conference, 1973 the system rquation, following Fell~r,21 by relating the state of the system at time t+h to that at time t, as follows: IOP's AI m kl·/-LI cpu's ql m P(no, n1, ... , nm; t+h) = {l- [C L hi+ L Iljaj(nj)]h} i=l ;=0 XP(no, ... , nm; t) m + L h;E ihP(nO, ... , n;-I, ... , nm; t) ko./-Lo ;=1 m Am + L ao(no+ 1) llorOj E jhP(no+l, ... , l1j-l, .. . 
, 11m; t) m. k ;=1 qm m + L ai(n;+l)lliTiOE ohP(no-l, ... , n;+I, ... , n m ; t) i=1 1- POl m +C L Figure I-System configuration ai(ni+l)ll;(I-T;o)h i=1 preemptive-resume type of interrupt policy where the interrupted program sequence continues from the point \vhere it was stopped when its turn comes. All the 10 and the CPU queues are conceptual and assumed to be formed in the main memory. In this model the processing times of all the CPU's and the lOP's are assumed to be independent and distributed negative exponentially. The processors in the -ith group all have the same mean service time 1/ Ili, where i = 0, 1, ... , m. The transition probabilities between the lOP groups and the CPU group are assumed to be stationary. These assumptions have already been used successfully by other authors investigating the TSCS in one form or other (cf. 4 and other references on TSCS). XP(no, ... , ni+ 1, ... , nm; t) +O(h). m where i=O m Lni~N, i=O ai(n) =min{n, Ei=min{ni' kd, -i=O,I, ... ,m II, i=O, 1, ... , m. By the usual procrss of forming the derivative on the left hand side and letting it equal to zero for steady-state and also replacing P(no, 711, ••• ,11m; t) by P(no, 1l1, . . . ,11m), we have In ;=1 m +L hiE iP(nO, ... ,71,.-1, ... , n m ). i=1 m + L ao(no+l)llot]iE ;P(no+l, ... ,n;-I, ... ,nm) ;=1 m + L ai(ni+ 1)Il;PiE iP (no-l, ... ,ni+ 1, ... , nm) i=l m i=O, 1, ... , m (1) ;=0 and if =0 ;=1 where Lni..W," Opns. Res., Vol. 9, May-June 1961, pp. 383-387. APPENDIX The general "network of waiting lines" (network of queues) was described in Reference 20. Here we add one more condition to that stated in Reference 20 by allowing no more than N customers in the system at any time. If a new customer arrives to find N customers already in the system, he departs and never returns. Then following the previously defined notation in earlier sections, we can write the system equation for general network of waiting lines as Multiprogrammed Time-Sharing Computer Systems 91 where m = m m {l-[C L Ai+ L JLjO!j(nj) Jh}P(nl' "" n m ; t) i=1 m j=1 m (10) i=1 j=1 m i=1 Tij, m +CL O!i(ni+1)JLi(1-Ti)hP(nl, ' , " ni+ 1, ' , " n m ; t) +O(h), L i=1 + L AiE ihP(nl, ' , " ni- 1, ' , " nm ; t) + L L i=1 Ti= It can be shown that the solution to (10) in steady state is given by the theorem stated earlier, In this case it is difficult to obtain a closed form solution for 1\ /s, Use of the SPAS M soft ware monitor to eval uate the performance of the Burroughs B6700 by JACK M. SCHWARTZ and DONALD S. WYNER Federal Reserve Bank of New York New York, ~ew York features of this system which distinguish it from many other systerhs are: .. . INTRODUCTION The need for system performance measurement and evaluation • Each task has assigned to it a non-overlayable area of memory called a stack. This area provides storage for program code and data references* associated with the task as well as temporary storage for some data, history and accounting information. • Multiple users can share common program code via a reentrant programming feature. • The compilers automatically divide source language programs into variable sized program code and data segments rather than fixed sized pages. • Core storage is a virtual resource which is allocated as needed during program execution. (This feature is discussed in more detail below.) • Secondary storage including magnetic tape and head-per-track disk is also allocated dynamically by the MCP. 
• Channel assignments are made dynamically; that is they are assigned when requested for each physical 110 operation. • I 10 units are also assigned dynamically. • Extensive interrupt facilities initiate specific MCP routines to handle the cause of the interrupt. • The maximum possible B6700 configuration includes 3 processors, 3 multiplexors, 256 peripheral devices, 1 million words of memory (six 8-bit characters per word or 48 information bits per word), and 12 data communications processors. The benefit to be derived from a large multi-purpose system, such as the B6700, is that many jobs of very diverse characteristics can (or should) be processed concurrently in a reasonable period of time. Recognizing that certain inefficiencies may result from improper or uncontrolled use, it is necessary to evaluate the computer system carefully to assure satisfactory performance. To this end, the objective of our work in the area of performance evaluation is to: 1. determine the location(s) and cause(s) of inefficiencies and bottlenecks which degrade system performance to recommend steps to minimize their effects, 2. establish a profile of the demand(s) placed upon system resources by programs at our facility to help predict the course of system expansion, 3. determine which user program routines are using inordinately large portions of system resources to recommend optimization of those routines 4. establish control over the use of system r~sources. Among the techniques which have been applied to date in meeting these objectives are in-house developed software monitors, benchmarking, and in-house developed simulations. This paper discusses the software monitor SPASM (System Performance and Activity Softwar~ Monitor), developed at the Federal Reserve Bank of New York to evaluate the performance and utilization of its Burroughs B6700 system. The current B6700 system at the Federal Reserve Bank of New York shown in Figure 1 includes one processor, one 110 multiplexor with 6 data channels, one data communications processor and a number of peripheral devices. In addition, the system includes a virtual memory consisting of 230,000 words of 1.2 micro-second memory, and 85 million words of head per track disk storage. The management of this virtual memory serves to illustrate the involvement of the MCP in dynamic resource THE B6700 SYSTEM The B6700 is a large-scale multiprogramming computer system capable of operating in a multiprocessing mode which is supervised by a comprehensive software system called the Master Control Program (MCP).1,2 Some of the * These references are called descriptors and act as pointers to the actual location of the code or data 93 94 National Computer Conference, 1973 ,"-" m I I f_m__ l"m-J- January 1973 Schwartz/Wyner SPASM l m"_-"_} * 14 memory modules, each having 98.3 KB; total 1.4 MH. Figure I-Configuration of the Federal Reserve Bank of New York B6700 computer system allocation. This process is diagrammed in Figure 2. Main memory is allocated by the MCP as a resource to current processes. When a program requires additional memory for a segment of code or data, an unused area of sufficient size is sought by the MCP. If it fails to locate a large enough unused area, it looks for an already allocated area which may be overlaid. If necessary, it links together adjacent available and in-use areas in an attempt to create an area large enough for the current demand. 
When the area is found, the desired segment is read in from disk and the segments currently occupying this area are either relocated elsewhere in core (if space is available), swapped out to disk or simply marked not present. In any case, the appropriate descriptor must be modified to keep track of the address in memory or on disk of all segments involved in the swap. All of these operations are carried out by the MCP; monitoring allows us to understand them better. For additional information on the operation and structure of the B6700 see Reference 3.

B6700 PERFORMANCE STATISTICS

The complexity of the B6700 system provides both the necessity to monitor and the ability to monitor. The pervasive nature of the MCP in controlling the jobs in the system and in allocating system resources made it necessary for the system designers to reserve areas of core memory and specific cells in the program stacks to keep data on system and program status. This design enables us to access and collect data on the following system parameters:

• system core memory utilization
• I/O unit utilization
• I/O queue lengths
• processor utilization
• multiplexor utilization
• multiplexor queue length
• peripheral controller utilization
• system overlay activity
• program overlay activity
• program core memory utilization
• program processor utilization
• program I/O utilization
• program status
• scheduler queue length
• response time to non-trivial requests

These data are vital to the evaluation of our computer system. Table I presents examples of the possible uses for some of these statistics.

Figure 2-B6700 memory allocation procedure (areas of unused core and of program code and data segments before and after an overlay; displaced segments are relocated in core, swapped to disk, or marked not present)

1. Space is needed for a 300 word segment for one of the current tasks.
2. A large enough unused area is not located.
3. The MCP finds a contiguous location made up of areas 1 through 4 which is 300 words long.
4. Area 1 contains a 50 word data segment. The MCP relocates this segment into area 5, makes note of its new core address and removes area 5 from the unused linkage.
5. Area 2 is unused. It is removed from the unused linkage.
6. Area 3 contains a 100 word code segment. There are no unused areas large enough to contain it. Therefore, it is simply marked not present. Since code cannot be modified during execution, there is no reason to write it out to disk; it is already there.
7. Area 4 contains a 100 word data segment. It is written out to disk and its new location is recorded.
8. The 300 word segment is read into core in the area formerly occupied by areas 1 through 4 and its location is recorded.

DESCRIPTION OF THE SPASM SYSTEM

The B6700 System Performance and Activity Software Monitor, SPASM, is designed to monitor the performance of the system as a whole as well as that of individual user programs. It consists of two separate programs, a monitor and an analyzer, both of which are described below.
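The allocation walk-through of Figure 2 can be condensed into a small sketch. The Python code below is only an illustration of the search order described above (a single unused area first, then a run of adjacent areas to be linked together and overlaid); the data layout, names and example sizes are hypothetical and are not taken from the actual MCP tables.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Area:
    size: int                      # words
    kind: str                      # "unused", "code" or "data"
    segment: Optional[str] = None  # e.g. "Program 2 data"

def find_region(areas: List[Area], need: int) -> Optional[Tuple[int, int]]:
    """Return the index range of areas that can hold `need` words."""
    # 1. A single unused area of sufficient size?
    for i, a in enumerate(areas):
        if a.kind == "unused" and a.size >= need:
            return i, i
    # 2. Otherwise, the first run of adjacent areas totalling at least `need` words.
    for start in range(len(areas)):
        total = 0
        for end in range(start, len(areas)):
            total += areas[end].size
            if total >= need:
                return start, end
    return None                    # no region can be formed

def overlay(areas: List[Area], lo: int, hi: int, incoming: str, need: int) -> List[Area]:
    """Displace the occupants of areas[lo..hi], as in steps 4-7 of Figure 2."""
    for a in areas[lo:hi + 1]:
        if a.kind == "data":
            pass   # relocate elsewhere in core if possible, otherwise write to disk
        elif a.kind == "code":
            pass   # mark not present only; a clean copy already exists on disk
    new_area = Area(size=need, kind="data", segment=incoming)
    leftover = sum(a.size for a in areas[lo:hi + 1]) - need
    region = [new_area] + ([Area(leftover, "unused")] if leftover > 0 else [])
    return areas[:lo] + region + areas[hi + 1:]

# The situation of Figure 2: a 300 word segment is needed.
areas = [Area(50, "data", "Program 2 data"), Area(50, "unused"),
         Area(100, "code", "Program 2 code"), Area(100, "data", "Program 5 data"),
         Area(50, "unused")]
lo, hi = find_region(areas, 300)                 # -> areas 0..3, 300 words in all
areas = overlay(areas, lo, hi, "incoming segment", 300)
```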
The principal criteria governing its design are: (a) to make a software monitor capable of gathering all the pertinent data discussed in the previous section, (b) to minimize the additional load placed upon the system by the monitor itself, and (c) to provide an easily used means of summarizing and presenting the data gathered by the monitor in a form suitable for evaluation by technical personnel and management.

Ability to gather pertinent data

The Master Control Program concept of the B6700 helps in many ways to simplify the acquisition of the data listed in Table I.

TABLE I-Examples of Collected Statistics and Their Possible Uses

System core memory utilization: determine need for additional memory.
I/O unit utilization, I/O unit queue lengths: determine need for Disk File Optimizer and/or additional disk storage electronic units, printers or disk file controllers.
Processor queue length and composition: determine need for additional processor; evaluate effect of job priority on execution; determine processor boundedness of mix; determine effect of processor utilization on demand for I/O (in conjunction with I/O unit data).
System overlay activity: determine need for additional memory; determine need for better task scheduling; determine when thrashing* occurs.
Job overlay activity: evaluate program efficiency; evaluate system effect on job execution.
Job core memory utilization: evaluate program efficiency; change job core estimates.
Scheduler queue length: determine excess demand for use of system; evaluate MCP scheduling algorithm.

* Thrashing is the drastic increase in overhead I/O time caused by the frequent and repeated swapping of program code and data segments. It is caused by having insufficient memory to meet the current memory demand.

Such information as a program's core usage, processor and I/O time, and usage of overlay areas on disk is automatically maintained in that program's stack by the MCP. A relatively simple modification to the MCP permits a count of overlays performed for a program to be maintained in its stack. Data describing the status of programs are maintained by the MCP in arrays. Information on system-wide performance and activity is similarly maintained in reserved cells of the MCP's stack. Pointers to the head of the processor queue, I/O queues and scheduler queue permit the monitor to link through the queues to count entries and determine facts about their nature. Other cells contain data on the system-wide core usage, overlay activity, and the utilization of the I/O multiplexor. An array is used to store the status of all peripheral devices (exclusive of remote terminals) and may be interrogated to determine this information.

All of the above data are gathered by an independently running monitor program. The program, developed with the use of a specially modified version of the Burroughs ALGOL compiler, is able to access all information maintained by the MCP. The program samples this information periodically and stores the sampled data on a disk file for later reduction and analysis.

Minimization of load upon the system

To minimize the additional load on the B6700, the monitor program is relatively simple, and very efficient. A somewhat more sophisticated analyzer program is used to read back the raw data gathered by the monitor and massage it into presentable form. This analysis is generally carried out at a time when its additional load upon the system will be negligible.
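In outline, the monitor's whole duty cycle is no more than a periodic read-and-record loop. The sketch below is a minimal illustration only; every structure, field name and the ten-second interval are hypothetical (the actual monitor is a Burroughs ALGOL program reading the reserved MCP cells and stack locations directly).

```python
import time

SAMPLE_INTERVAL = 10.0        # seconds between samples (assumed value)

def read_mcp_cells():
    """Stand-in for reading the reserved MCP cells and program stacks."""
    return {
        "core_in_use":     0,    # system core memory utilization
        "overlay_count":   0,    # system overlay activity
        "processor_queue": [],   # entries linked through the CPU queue head
        "io_queues":       [],   # one list per I/O queue
        "scheduler_queue": [],
    }

def monitor(outfile):
    """Periodically sample system state and append one record to a disk file."""
    while True:
        sample = read_mcp_cells()
        record = (time.time(),
                  sample["core_in_use"],
                  sample["overlay_count"],
                  len(sample["processor_queue"]),
                  sum(len(q) for q in sample["io_queues"]),
                  len(sample["scheduler_queue"]))
        outfile.write(",".join(str(x) for x in record) + "\n")
        time.sleep(SAMPLE_INTERVAL)   # keep the monitor's own load negligible
```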
The system log has indicated that the monitor does indeed present a minimal load, requiring about 1/4 of 1 percent processor utilization and 2 1/4 percent utilization of one disk I/O channel.

Easy means of analysis and presentation

The raw data gathered by the monitor can be used directly in some cases; however, to serve best the purpose for which SPASM was designed (i.e., as a management reporting system) several useful presentations have been engineered. The analyzer program, which may be run interactively or in batch mode, will produce any of the following:

• Graphs of data versus time
• Frequency distribution histograms of data
• Correlation and regression analyses among data
• Scanning for peak periods

The options are selected via an input language consisting of mnemonics.

(1) Graphs of data versus time are produced to show the minute-by-minute variations in parameters of interest. The graphs are "drawn" on the line printer using symbols to represent each curve. Any number of parameters may be plotted on one graph, and a key is printed at the end identifying all symbols used in the graph and listing the mean values of each. To aid in tailoring the most desirable presentation, rescaling of the ordinate and time axes is permitted. In addition, the user may select a specific time interval of interest and plot that interval only.

(Line-printer graph of SYSAVL, SYSOLY and SYSSVE plotted against time for the interval 09:27 to 11:00, with scale key and mean values.)
(Regression and correlation analysis output relating processor utilization, system "SAVE" core, system overlay and overlay rate parameters, with coefficients, standard errors, beta weights, partial correlations and significance levels.)

Figure 5-Results of regression and correlation analysis

These results are seen to show that, for example, the overlay statistics are highly correlated to the amount of "SAVE" core in the system. This is understandable since the larger the "SAVE" core, the greater the chance of needing to swap segments.

(4) Scanning for peak periods is a necessity in most computer systems, especially those operated in a data communication environment where the load fluctuates widely. The analyzer can scan the entire day's data and flag time intervals (of length greater than or equal to some specified minimum) during which the mean value of a parameter exceeded a desired threshold. For example, a period of five minutes or longer in which processor utilization exceeded 75 percent can be easily isolated (see Figure 6). Using this technique a peak period can be automatically determined and then further analyzed in more detail.

(Analyzer listing of the intervals during which PROCUT exceeded the specified threshold, giving the ending time and duration in seconds of each interval.)

Figure 6-Periods of peak processor utilization

The design criteria discussed above have been met and a software monitoring system has been developed which is comprehensive and easily used, and yet presents a negligible load upon the B6700 computer.

CONCLUSION AND OUTLOOK

The SPASM system has proven to be very instrumental in the performance evaluation of the B6700 system at the Bank. Several areas in which it has been and is currently being used are as follows:

• The statistics on processor queue length, multiplexor utilization, and disk controller utilization were used to aid in the analysis of the need for a second processor*, second multiplexor and additional controllers.
• The job core utilization data have been used to evaluate the effect of alternate programming techniques on memory use.
• Disk utilization data have been examined to identify any apparent imbalance of disk accesses among the disk electronics units.
• Processor queue data are being used to determine the effect of task priority on access to the processor.
• System overlay data are being used to determine the adequacy of automatic and manual job selection and scheduling.
• Processor utilization figures, as determined from the processor queue data, were used to determine the effect of core memory expansion on processor utilization.

Some future possible uses planned for SPASM include:

• Use of the scheduler queue statistics to evaluate the efficiency of the current MCP scheduling algorithm and to evaluate the effect changes to that algorithm have on the system performance.
• Use of the response time data to evaluate system efficiency throughout the day with different program mixes. • Evaluation of resource needs of user programs. • Evaluation of the effect that the Burroughs Data Management System has on system efficiency. • Building of a B6700 simulation model using the collected statistics as input. • Building an empiricai modei of the B6700 system by using the collected regression data. * See Appendix A for a discussion of how the processor queue data was used to determine processor utilization. 99 The SPASM system has enabled us to collect a greatdeal of data on system efficiency and, consequently, a great deal of knowledge on how well the system performs its functions. This knowledge is currently being used to identify system problems and to aid in evaluating our current configuration and possible future configurations. Mere conjecture on system problems or system configurations in the absence of supporting data is not the basis for a logical decision on how to increase system efficiency. Performance measurement and evaluation are essential to efficient use of the system. REFERE~CES 1. B6700 Information Processing Systems Reference Manual, Bur- roughs Corp. May, 1972. 2. B6700 Master Control Program Reference Manual, Burroughs Corp. November, 1970 3. Organick, E. I., Cleary, J. G., "A Data Structure Model of the B6700 Computer System," Sigplan Notices, Vol. 6, No.2, February 1971, "Proceedings of a Symposium on Data Structures in Programming Languages." APPENDIX A The use of processor queue data to determine proCEssor utilization The SPASM system records the length of the processor queue periodically. The processor utilization will be based upon these examinations, taking into account that the monitor itself is processing at this instant of time. If the processor queue is not empty, the monitor is preventing some other job from processing. Consequently, if the monitor were not in the system the processor would be busy with some other task at that instant of time. This is considered to be a processor "busy" sample. On the other hand, if the processor queue is empty at the sample time there is no demand for the processor other than the monitoring program itself. Therefore, if the monitor were not in the system at that instant of time the processor would be idle. This is considered a processor "idle" sample. Processor utilization can therefore be estimated as: ... No. "busy" samples processor utilIzatIOn = total N o. sampI es This sampling approach to determining processor utilization was validated by executing controlled mixes of programs and then comparing the results of the sampling calculation of processor utilization to the job processor utilization given by: processor utilization = processor time logged against jobs in the mix elapsed time of test interval Table II compares the results of these two calculations. 100 National Computer Conference, 1973 TABLE II-Comparison of Processor Utilization Statistics By Sampling Technique and By Processor Time Quotient Technique Average Processor lJtilization (%) Test Series 1 Test Series 2 Sampling Technique Processor Time Quotient 99.1 57.6 96.5 53.5 In a second type of test, processor idle time was monitored (by means of a set of timing statements around the idling procedure) to gain a close measure of utilization. The total idle time was subtracted from the total elapsed time of the test to obtain the processor busy time and hence the utilization. 
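Both the sampling estimate defined above and the direct quotient used to validate it amount to only a few lines; the sketch below uses invented sample values purely for illustration.

```python
# Each observation of the processor queue is classified "busy" if the queue is
# non-empty (the monitor itself would otherwise be holding the processor) and
# "idle" if it is empty.

def processor_utilization(queue_length_samples):
    """Estimate utilization as (number of busy samples) / (total samples)."""
    busy = sum(1 for q in queue_length_samples if q > 0)
    return busy / len(queue_length_samples)

def utilization_from_log(job_processor_seconds, elapsed_seconds):
    """The direct quotient used to validate the sampling estimate."""
    return job_processor_seconds / elapsed_seconds

if __name__ == "__main__":
    samples = [3, 2, 0, 1, 0, 0, 4, 2, 1, 0]          # illustrative queue lengths
    print(round(processor_utilization(samples), 2))   # -> 0.6
```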
Over a period of five hours the respective processor utilization calculations were: Sampling Technique Idle Timing 46.3% 48.0% These results make us confident of the validity of using the processor queue sampling technique to accumulate processor utilization statistics during any given time interval. Evaluation of performance of parallel processors in a real-time environment by GREGORY R. LLOYD and RICHARD E. MERWI~ SAFEGUARD System Office Washington, D.C. number of separate data sets (array processing). A more eo-mplex ease involves different ealeulationson separate data sets (multiprocessing) and finally, the greatest challenge to the parallel processing approach occurs \\'hen a single calculation on a single data set must be decomposed to identify parallel computational paths within a single computational unit. A number of mathematical calculations are susceptible to this type of analysis, e.g., operations on matrices and linear arrays of data. The computational support required for a phased array radar is represf'ntative of problems exhibiting a high degree of parallelism. These systems can transmit a radar beam in any direction within its field of view in a matter of microseconds and can provide information on up to hundreds of observed objects for a single transmission (often called a "look"). The amount of information represented in digital form which can be generated by this type of radar can exceed millions of bits per second and the analysis of this data provides a severe challenge to even the largest data processors. Applications of this radar system frequently call for periodic up dates of position for objects in view which are being tracked. This cyclic behavior implies that a computation for all objects must be completed between observations. Since many objects may be in view at one time, these computations can be carried out for each object in parallel. The above situation led quite naturally to the application of associative paral1el processors to provide part of the computational requirements for phased array radars. A number of studies9 ,IO,II,I2 have been made of this approach including use of various degrees of parallelism going from one bit wide processing arrays to word oriented processors. As a point of reference this problem has also been analyzed for implementation on sequential pipelined machines. 1o One of the main computational loads of a phased array radar involves the filtering and smoothing of object position data to both eliminate uninteresting objt>cts and providt> more accurat.e tracking information for objt>cts of interest. A technique for elimination of unintert>sting objects is referred to as bulk filtering and the smoothing of data on interesting objects is carried out with a Kalman filter. The following presf'nts an analysis of the results of the above studies of the application of associative parallel processors to both the bulk and Kalman filter problems. The two IXTRODUCTIOK The use of parallelism to achieve greater processing thruput for computational problems exceeding the capability of present day large scale sequential pipelined data processing systems has been proposed and in some instances hardware employing these concepts has been built. 
Several approaches to hardware parallelism have been taken including multiprocessors l ,2,3 which share common storage and input-output facilities but carry out calculations with separate instruction and data streams; array processors4 used to augment a host sequential type machine 'which executes a common instruction stream on many processors; and associative processors which again require a host machine and vary from biP to 'word oriented6 processors which alternatively select and compute results for many data streams under control of correlation and arithmetic instruction streams. In addition, the concept of pipelining is used both in arithmetic processors7 and entire systems, i.e., vector machines8 to achieve parallelism by overlap of instruction interpretation and arithmetic processing. Inherent in this approach to achieving greater data processing capability is the requirement that the data and algorithms to be processed must exhibit enough parallelism to be efficiently executed on multiple hardware ensembles. Algorithms which must bf' executed in a purely sequential fashion achieve no benefit from having two or more data processors available. Fortunately, a number of the problems requiring large amounts of computational resources do exhibit high degrees of parallelism and the proponents of the parallel hardware approach to satisfying this computational requirement have shown considerable ingenuity in fitting these problems into their proposed machines. The advocates of sequential pipelined machines can look forward to another order of magnitude increase in basic computational capability bf'fore physical factors will provide barriers to further enhancement of machine speed. \Vhf>n this limit is reached and ever bigger computational problems remain to be solved, it seems likely that the parallel processing approach will be one of the main techniques used to satisfy the demand for greater processing capability. Computational parallelism can occur in several forms. In the simplest case the identical calculation is carried out on a 101 102 National Computer Conference, 1973 criteria used to evaluate the application of parallel hardware to these problems are the degree of hardware utilization achieved and the increase in computational thruput achieved by introducing parallelism. The latter measure is simply the ratio of computational thruput achieved by the array of processing elements to the thruput possible with one element of the array. The Parallel Element Processing Ensemble (PEPE) considered as one of the four hardware configurations is the early IC model and is not the improved ~iSI PEPE currently under development by the Advanced Ballistic ~1issile Defense Agency. Finally, a comparison of hardware in terms of number of logical gates is presented to provide a measure of computational thruput derived as a function of hardware complexity. The paper concludes with a number of observations relative to the application of the various associative parallel hardware approaches to this computational requirement. FILTER CO~IPUTATIONS The bulk and Kalman filters play complementar~T roles in support of a phased array radar. The task assigned to the radar is to detect objects and identify those with certain characteristics e.g. objects which will impact a specified location on the earth, and for those objects so identified, to provide an accurate track of the expected flight path. 
The bulk filter supports the selection process by eliminating from consideration all detected objects not impacting a specified area while the Kalman filter provides smoothed track data for all impacting objects. Both filters operate upon a predictive basis with respect to the physical laws of motion of objects moving in space near the carth. Starting with an observed position, i.e., detection by a radar search look, the bulk filter projects the position of the object forward in time, giving a maximum and minimum range at which an impacting object could be found in the next verification transmission. Based upon this prediction the radar is instructed to transmit additional verification looks to determine that this object continues to meet the selection criteria by appearing at the predicted spot in space following the specified time interval. Those objects which pass the bulk filter selection criteria are candidates for pr<.>cision tracking by the radar and in this case the Kalman filter provides data smoothing and more precise estimates of the object's flight path. Again a prediction is made of the object's position in space at some future time based upon previously measured positions. The radar is instructed to look for the object at its predicted position and determines an updated object position measurement. The difference between the measured and predicted position is weighted and added to the predicted position to obtain a smoothed position estimate. Both the bulk and Kalman filter are recursive in the sense that measurement data from one radar transmission is used to request future measurements based upon a prediction of a future spatial position of objects. The prediction step involves evaluation of several terms of a Taylor expansion of the equations of motion of spatial objects. DrtaiJed discussion of thr mathC'matical basis for thc'se filters can be found ill the: literature UIl jJhas(~d array radars. !~"o The computations required to support the bulk filter are shown in Figure 1. The radar transmissions are designated as either search or verify and it is assumed that every other trans~ission is assigned to the search function. When an object is detected, the search function schedules a subsequent verification look typically after fifty milliseconds. If the verification look confirms the presence of an object at the predicted position another verification look is scheduled again after fifty milliseconds. When no object is detected on a verification look, another attempt can be made by predicting the object's position ahead two time intervals i.e., one hundred milliseconds, and scheduling another verification look. This procedure is continued until at least M verifications have been made of an object's position out of N attempts. If N - M attempts at verification of an object's position result is no detection then the object is rejected. This type of filter is termed an M out of N look bulk filter. Turning now to the Kalman filter the computational problem is much more complex. In this case a six or seven element state vector containing the three spatial coordinates, corresponding velocities, and optionally an atmospheric drag coefficient is maintained and updated periodically for each tracked object. A block diagram of this computation is shown in Figure 2. The radar measurements are input to state vector and weighting matrix update procedures. 
The weighting matrix update loop involves an internal update of a covariance matrix which along with the radar measurements is used to update a ,veighting matrix. The state vector update calculation generates a weighted estimate from the predicted and measured state vectors. The Kalman filter computation is susceptible to decomposition into parallel calculations and advantage can be taken of this in implementations for a parallel processor. CO~IPUTATIOXAL ~IODELS Bulk filter The bulk filter is designed to eliminate with a minimum expenditurr of computational resources a large number of unintrresting objects which may appear in the field of view of a phased array radar. A model for this situation requires CORRELATE RETURN WITH PREDICTED ASSIGN DllJECT TO PROCESSIIIG ELEMENT REOUEST vtRl'ICATION TRANSMISSION DllJECT POSITIOII IIITIATE PRECISION TRACK Missed IM,NI M Detections REJECT OBJECT Detections I----~ REOUEST NEXT VERIFICATION TRANSMISSION Figure 1 -Bulk filter flow diagram Performance of Parallel Processors assumptions for the number and type of objects to be handled, efficiency of the filter in eliminating uninteresting objects, and radar operational parameters. These assumptions must produce a realistic load for the filter "\vhich would be characteristic of a phased array radar in a cluttered environment. The assumptions, which are based upon the Advanced Ballistic ::\'iissile Agency's Preliminary Hardsite Defense study, are: to to + 25 ms to + 50ms 103 to + 75ms 1-----50ms----~1 Search Return 1st Verify Return Figure 3-Correlation and arithmetic phases 1. The radar transmits 3000 pulses, i.e. looks, per second and every other one of these is assigned to search. 2. New objects enter the system at a rate of 100 per 10 milliseconds (Yrs) all of "'\-",hich are assumed to be detected on one search look. 3. Fifteen objects are classified as being of interest, i.e. impacting a designated area (must be precision tracked), and 85 of no interest (should be eliminated from track). 4. Following detection an attempt must be made to locate each object not rejected by the filter every 50 NIs. 5. The filter selection criteria is 5 (::.vI) detections out of 7 (N) attempts. Failure to detect the object three timt>.s in the sequence of 7 looks results in rejection. 6. The filter is assumed to reduce the original 100 objects to 70 at the end of the third; 45 at the end of the fourth; 30 at the end of the fifth; 25 at the end of the sixth; and 20 at the end of the seventh look; thus failing to eliminate 5 uninteresting objects. being processed to determine if it represents new positional information for an object being tracked by that processor. For all objects in process, new data must be received every 50 2.\fs or it is considered to have not been redetected and hence subject to rejection by the filter. The associative processors studied were unable to carry out the required calculations·within the pulse repetition rate of the radar (330 J.l.sec). To achieve timely response, the processing was restructured into correlation and arithmetic cycles as shown in Figure 3. During the first 25 msec interval, the processors correlate returns from the radar with internal data (predicted positions). During the subsequent 25 msec interval, processors carry out the filter calculations and predict new object positions. This approach allmved all required processing to be completed in a 50 msec interval. 
Objects which fail the selection criteria more than two times are rejected and their processor resources are freed for reallocation. Ka-Zman filter Based upon the above assumptions the bulk filter accepts 500 new objects every 50 ::.vIs. When operational steady state is reached, the processing load becomes 100 search and 290 verify calculations every 10 ::\1s. Each object remains in the filter for a maximum of 350 ::\fs and for a 50 ::.vfs interval 1950 filter calculations are required corresponding to 10,000 new objects being detected by the radar per second. The above process can be divided into two basic steps. The first involves analysis of all radar returns. For search returns the new data is assigned to an available processor. For verify returns each processor must correlate the data with that Track Return 1 L Track Request RADAR \. RADAR TO STATE VECTOR COORDINATE TRANSFORMATION EVALUATION APPROACH I ! ! UPDATE STATE VECTOR UPDATE WEIGHTING MATRIX MEASUIIEIIEIfT PREDICT NEXT STATE VECTOR STATE VECTOR TO RADAR COORDIIATE TRANSFORIIA llOfI Figure 2-Kalman filter flow diagram The Kalman filter computation requires many more arithmetic operations than the bulk filter. The radar becomes the limiting factor in this case since only one object is assumed for each look. Assuming a radar capable of 3000 transmissions per second and a 50 ::'\ls update requirement for each precision track, a typical steady state assumption ·would be 600 search looks and 2400 tracking looks per second (corresponding to 120 objects in precision track). At this tracking load it must again be assumed that the 50 ::\ls update interval is divided into 25 :\ls correlation and compute cycles as was done for the bulk filter and shown in Figure 3. This implies that 60 tracks are updated every 25 .:'\'1s along with the same number of verify looks being received and correlated. UPDATE COVARIANCE MATRIX - The three quantities of interest in determining the relation between a parallel processor organization and a given problem are: resources required by the problem, resources available from the processor configuration, and time constraints (if any). A more precise definition of these quantities follows, but the general concept is that the processor capabilities and problem requirements should be as closely balanced as possible. Quantitative resource measures and balance criteria are derived from Chen's14 analysis of parallel and pipelined computer architectures. Chen describes the parallelism inherent in a job by a graph with dimensions of parallelism 104 National Computer Conference, 1973 M execution time with only one such processor (speedup over the job). Expressing 11' in terms of job width W gives for any job step HARDWARE SPACE = MTA 11'= JOB SPACE = ~WITI~T sequential processor execution time • • parallel processor executIOn tIme = WeT) ' (3) Similarly, averaging this quantity over an entire job during the available time gives: N L W(Ti)IlT i i=O ir= - - - - - - (4) ir=iiM (5) Ta Figure 4-Hardware and job space diagram width (number of identical operations which may be performed in parallel) and execution time. The ratio p is defined for a job as the area under the step(s) showing parallelism (width W> 1) divided by the total area swept out by the job. 
Thf' hardware pfficiency factor, '1], is the total job work space (defined as the product of execution time and the corresponding job width lV summed over all computations) over the total hardware work space (defined as the product of total execution time and the number, M, of available parallel processors). This provides a measure of utilization of a particular hardware ensemble for each segment of a computation. A modification of Chen's 'I] allows consideration of time constraints. Hardware space will now be defined as a product of the total time available to carry out the required computation times 1t.f, the number of processors available. Call this ratio ii. The work space is as defined above except that periods of no processor activity may be included, i.e., job width W =0. Figure 4 illustrates these concepts showing a computation involving several instruction widths carried out in an available computation time Ta. The stepwise value of 'I] varies during job execution and the average value for the whole job becomes: (Ta is divided into N equal time intervals =IlT, W(T i ) >0 forK steps). i i = - - - - - where MTa T/= - - - - - MKIlT (1,2) Note that under this interpretation, ii measures the fit between this particular problem and a given configuration. If ii = 1.0 the configuration has precisely the resources required to solve the problem within time constraints, assuming that the load is completely uniform (with non-integral ,,'idth in most cases). Although ii will be much less than 1.0 in most cases, it is interesting to compare the values obtained for processors of different organizations and execution speeds, executing the same job (identical at least on a macroscopic scale). Implicit in the stepwise summation of the instruction time-processor width product are factors such as the suitability of the particular instruction repertoire to the problem (number of steps), hardware technology (execution time), and organizational approach (treated in the following section) . A criterion 11' is expressed as the inverse ratio of time of execution of a given job with parallel processors to the or simply: which states that the speed of execution of a computation on parallel hardware as contrasted to a single processing element of that hardware is proportional to the efficiency of hardware utilization times the number of available processing elements. Again, ir measures the equivalent number of parallel processors required assuming a uniform load (width = if, duration = Ta). PROCESSOR ORGANIZATIONS General observations In the analysis which follows, job parallelism is calculated on an instruction by instruction step basis. For the purposes of this discussion, consider a more macroscopic model of job parallelism. Sets of instructions with varying parallelism widths will be treated as phases (lJI i ) , with phase width defined as the maximum instruction width within the phase. (see Figure 5, for a three phase job, with instructions indicated by dashed lines). Given this model, it is possible to treat the parallel component in at least three distinct ways (see Figure 6). The simplest approach is to treat a parallel step of width N as N steps of width one which are executed serially. This would correspond to three loops (with conditional branches) of iteration counts Wl, W 2, W a, or one loop with iteration count max[Wl , W 2 , W a]. The worst case execution time (macroscopic model) for width N would be T=Nts. 
Parallel processing, in its purest sense, devotes one processing element (PE) to each slice of width one, and executes the total phase in T = ta, where ta is the total execution time for any element. Variations on this basic theme are possible. For example, STARAN, the Goodyear Aerospace associative PARALLELISM WIDTH I I I r-I r--'I 1 ¢3 WI I .' W I ¢2 : ---"'1 W3 I W2 I I -.., I I r-I l - t l - -t2 - - -t3 Figure 5-Three phase jtlb space diagram TIME Performance of Parallel Processors processor,5,9,l2 is actually an ensemble of bit-slice processors lS arranged in arrays of 256 each having access to 256 bits of storage. The ensemble is capable of bitwise operations on selected fields of storage. Since the bulk filter algorithm requires 768 bits of storage for the information associated with one filter calculation, i.e. track, a "black box" model devotes three PE's to each object in track (generally one of the three is active at any instruction step). The converse of the STARA~ case is exemplified by the Parallel Element Processing Ensemble (PEPE) ,6 which devotes M PE's to N tracks, Jf ~l:ill}ffi!1~tJ~~.h~n~l~db.y.sgparateJ)roce,ss.Qrs. A third approach is analogous to the pipelining of instruction execution. Assuming that each phase has execution time tp, one could use one sequential processor to handle execution of each phase, buffering the input and output of contiguous DATA RESULT ts ~ 02 U T = Nts N N Sequential Processor ta DATA - ..... 01 -- D2 RESULT T = ta f----...... f----DN N f----- 105 phases to achieve a total execution time of T= (N -l+m)tp for an 1l.f stage process. The Signal Processing Element (SPE) designed by the US Naval Research Laboratory'5 can utilize this strategy of functional decomposition, linking fast microprogrammed arithmetic units under the control of a master control unit to achieve tp~M-'ts for "sequential" machines of the CDC 7600, IB1I 370/195 class (T~ [~"11-'(N -1) +l]ts). One other factor of considerable importance is the number of control streams active in each processor array. The simplest arrangement is a single control stream, broadcast to all elements from a central sequencing unit. Individual PE's may be deactivated for part of a program sequence by central direction, or dependent upon some condition determined by each PE. Dual control units mean that arithmetic and correlative operation can proceed simultaneously, allowing the t\'"Q .ph~_s~ ~trl1tegy.Qutli:u.e_d.__.e_a.dier t.o.IDrrk. efficiently (one control stream would require an "interruptible arithmetic" strategy, or well defined, non-overlapping, search/ verify and arithmetic intervals). These two control streams can act on different sets of PE's (e.g. each PE has a mode register which determines the central stream accepted by that PE), or both control streams can share the same PE on a cycle stealing basis (PEPE IC model). Configurations considered Table I presents the basic data on the four hard,vare configurations considered for the bulk filter problem. Sizing estimates are based upon the assumptions described previously, i.e. 1950 tracks in processing at any given time (steady state). Over any 25 11s interval, half of the tracks are being correlated, half are being processed arithmetically. The Kalman filter results compare the performance of STARAN, PEPE (IC model), and the CDC 7600 in sustaining a precise track of 120 objects (1 observation each 50 ::\1s) using a basic model of the Kalman filter algorithm. 
The STARAN solution attempts to take advantage of the parallelism within the algorithm (matrix-vector operations). Twenty-one PE are devoted to each object being tracked . PEPE would handle one Kalman filter sequence in each of its PE's, performing the computations serially within the PE. N CO :VIPARATIVE RESULTS Parallel Processor Bulk filter RESULT DATA §~--DI§ N~ T = IN+lltp = IN-l+Mltp For M Stages L---.lN Functional Pipeline Figure 6- Decomposition of parallelism in three processor organizations Table II presents the values for 'YJ, 71, ir, and execution time for each of the 4 processor configurations. As has been explained earlier 'YJ and ij differ only in the definition of hardware space used in the denominator of the 'YJ expression. It is interesting to note that although the efficiency 71 over the constrained interval is not large for any of the parallel processors, all three do utilize their hardware efficiency over the actual arithmetic computation ('YJ). The implication is that some other task could be handled in the idle interval, a 106 National Computer Conference, 1973 TABLE I-Bulk Filter Processor Configuration Comparison STARAN 3 PE/TRACK HONEYWELL PE/TRACK PEPE (IC model) PE/20 TRACK 1950 Number of PE's (1950 track load) 30 Arrays =7680 PE's 32 bit fixed point Add time/PE 18.0,usee Control Streams Single-(Standard opDouble-Each PE may tion) all PE's correlate be in correlation or arithmetic mode or perform arithmetic functions Approximate gate count/PE (not including storage) 82 (21,000/256 PEarray) Gate Count for configuration (PE's only) 100 .75,usee 630,000 CDC 7600 SEQUENTIAL .25,usec 27.5-55 n.s. (60 bit) Double-EACH PE may Single pipelined perform correlation and arithmetic functions simultaneously 2,400 9,000 170,000 4.68X106 900,000 170,000 Adds/secX1()6 (",MIPS) 427 2600 400 18 Gates/track 320* 2400 450 87 * Based on a 30 Array configuration-246 are required for the algorithm. more sophisticated filter algorithm could be used, or the PE's could be built from slower, less expensive logic. It should be stressed that the gate counts given are strictly processing element gates, not including memory, unit control, or other functions. Kalman filter As noted above, 60 precision tracks must be updated in 25 ~'ls while 60 track looks are being correlated. Benchmark data for the CDC 7600 indicates that a Kalman filter calculation consisting of 371 multiplies, 313 add/subtracts, 2 divides, 6 square roots, and 1 exponentiation will require approximately 0.3 ~ls (18 1\1s for 60 tracks). This leaves a reserve of 7 ~1s out of a 25 Ms interval for correlation. An analysis of the STARAN processor applied to the Kalman filter l2 indicates that with 21 processing elements assigned to each precision track the calculation can be carried out in slightly less than 25 Ms. This performance is achieved by decomposing the filter calculation into 56 multiplies, 61 add/subtracts, 3 divides, 4 square roots and is achieved at the cost of 322 move operations. Figure 7 shows the processor activity for the first 15 instructions of the Kalman filter sequence. One bank of STARAN processing elements (5256 element arrays) containing 1280 processors is required to update tracks for 60 objects in one 25 ::Vls interval and correlate returns during the other. 
The PEPE configuration would require 60 processing elements (two track files per element) taking advantage of this hardware's ability to do arithmetic calculations and correlations simultaneously, achieving a 45 percent loading (11.3 ~ls execution time per Kalman filter sequence) of each PEPE processing element. Table III summarizes the Kalman filter results. tions described in the studies. 9 ,IO,1l,12 Each of the proposed configurations is more than capable of handling the required calculations in the time available. System cost is really outside the scope of this paper. In particular, gate count is not a good indicator of system cost. The circuit technology (speed, level of integration) and chip partitioning (yield, number of unique chips) trade-offs possible "ithin the current state of the art in LSI fabrication relegate gate count to at most an order of magnitude indicator of cost. Each of the three parallel processor organizations represents a single point on a trade-off curve in several dimensions (i.e. processor execution speed, loading, and cost, control stream philosophy, etc.). Given an initial operating point, determined by the functional requirements of the problem, the system designer must define a set of algorithms in sufficient detail to TABLE II-Results of Bulk Filter Analysis STARAN 3 PEl TRACK Correlation time 7J~1 7I"¢1 1i~1 ir~l Arithmetic time 7J~2 7I"~2 1i~2 7I"~2 Total time ." 71" OBSERVATIONS AXD COXCLUSIOKS It should be emphasized that this study was not an attempt to perform a qualitatiyc cyaluation of the procC'~SQr organiza- ij ir .".MIPS ----- 1.8 msec .036 139 .0026 9.9 14.5 msec .66 2450 .38 1470 16.3 msec .30 2270 .19 1480 128 CDC HONEYPEPE (IC 7600 SEWELL model) PE/20 QUENPE/TRACK TRACKS TIAL 15.9 msec .035 34 .022 22 5.1 msec .80 780 .16 159 21 msec .11 216 .093 181 286 2.1 msee .68 68 .057 5.7 3.85 msec .79 79 .122 12.2 5.95 msec .75 75 .18 18 300 22 msec 1.0 1.0 .88 .88 18 Performance of Parallel Processors convince himself that he can operate 'within any given constraints. Fine tuning of the system is accomplished by restructuring the algorithms, redefining the operating point, or both. In the two cases treated in this paper, elapsed time is the crucial measure of system performance (in a binary sense-it does or does not meet the requirement). The purpose of the 7] and 71 calculations, as well as the step by step processor activity diagrams is to provide some insight-beyond the elapsed time criteria which might be helpful in restructuring algorithms, or modifying some aspect of the system's architecture such as control stream philosophy. The properties of the processor activity diagrams are of significant interest in determining the number of PE's that are required to handle the given load (uniform load implies fewer PE's and higher 7]). The measures used in this paper are of some interest because of the fact that they are functions of problem width and instruction execution time) allowing factors such as the selection of a particular instruction set to enter into the values of the resultant tuning parameters. Several more specific observations are in order. First, for the particular bulk filter case considered, the CDC 7600 can easily handle the computational load. Proponents of the parallel processor approach ,,,"ould claim that quantity production of PE's, utilizing LSI technology, would enable them to produce equivalent ensembles at less than a CDC 7600's cost. 
In addition, computation time for the parallel ensembles is only a weak function of the number of obj ects in the correlation phase, and essentially independent of object load in the arithmetic phase. Therefore, it would be simple to scale up the capabilities of the parallel processors to handle loads well beyond the capability of a single, fast sequential processor. The functional pipelining approach advocated by the Xaval Research Laboratory would appear to be the strongest challenger to the parallel approach in terms of capabilities and cost (and to a somewhat lesser extent, flexibility). Very rough estimates indicate that the bulk filter case presented here could be handled by no more than two arithmetic units (each with ,....,.,10,000 gates) and a single microprogrammed control unit (,....,.,5,000 gates). Tasks which stress the correlative capabilities of parallel arrays rather than # ACTIVE ElEMENTS ELASPED TIME IMICROSECOfl)S} Figure 7-STARAN Kalman filter loading (one track, first 15 instructions) 107 TABLE III-Results of Kalman Filter Analysis Time 7f 11" 'ii ir 7f}LMIPS gates/track STARAN PEPE (IC model) CDC 7600 25 Ms "".5 640 "".5 640 36 875 11.3 Ms >.9 >54 .45 27 108 4500 18 Ms 1 1 .72 .72 13 1400 NOTE: Correlation time for the Kalman filter is not significant ("" 100 }Ls) since each track is assigned a unique track number number (120 total). Accordingly, only total time figures are presented. the parallel arithmetic capabilities should show the parallel array architecture to its greatest advantage. ACKNO'WLEDG:\;IEKTS The authors wish to express their gratitude to ~fessrs. Brubaker and Gilmore, Goodyear Aerospace Corporation for providing STARAK logic gate counts and Kalman filter computation timing estimates; to }Ir. W. Alexander, Honeywell Inc. for providing PEPE and Honeywell Associative Processor Logic gate counts; and to Jlr. J. Burford, Control Data Corporation for providing the CDC 7600 logic gate counts. The study reports9 ,lo,1l were prepared under the direction of the Advanced Ballistic ~,1issile Defense Agency, whose cooperation is greatly appreciated." REFERENCES 1. Conway, M. E., "A Multi-Processor System Design," Proceedings AFIPS, FJCC, Vol. 24, pp. 139-146, 1963. 2. Stanga, D. C., "UNIVAC l108 Multi-Processor System," Proceedings AFIPS, SJCC, Vol. 31, pp. 67-74,1967. 3. Blakeney, G. R., et aI, "IBM 9020 Multiprocessing System," IBM Systems Journal, Vol. 6, No.2, pp. 80-94,1967. 4. Slotnik, D. L., et aI, "The ILLIAC IV Computer, IEEE Transaction on Computers, Vol. C-17, pp. 746-757, August 1968. 5. Rudolph, J. A., "STARAN, A Production Implementation of an Associative Array Processor," Proceedings AFIPS, FJCC, Vol. 41, Part I, pp. 229-241, 1972. 6. Githens, J. A., "A Fully Parallel Computer for Radar Data Processing," NAECON 1970 Record, pp. 302-306, May 1970. 7. Anderson, D. W., Sparacio, F. J., Tomasula, R. M., "Systemj360 ~Y10D 91 Machine Philosophy and Instruction Handling," IBM Journal of Research and Development, Vol. II, No.1, pp. 8-24, January 1967. 8. Hintz, R. G., "Control Data Star 100 Processor Design," Proceedings COMPCON '72, pp. 1-10, 1972. 9. Rohrbacher, Terminal Discrimination Development, Goodyear Aerospace Corporation, Contract DAHC60-72-C-0051, Final Report, March 31, 1972. 10. Schmitz, H. G., et aI, ABMDA Prototype Bulk Filter Development, Honeywell, Inc., Contract DAHC60-72-C-0050, Final Report, April 1972, HWDoc #12335-FR. 11. 
Phase I-Concept Definition Terminal Discrimination Development Program, Hughes Aircraft Company, Contract DAHC60-72C-0052, March 13, 1972.
12. Gilmore, P. A., An Associative Processing Approach to Kalman Filtering, Goodyear Aerospace Corporation, Report GER 15588, March 1972.
13. Kalman, R. E., "A New Approach to Linear Filtering and Prediction Problems," Journal of Basic Engineering, Vol. 82D, pp. 35-45, 1960.
14. Chen, T. C., "Unconventional Superspeed Computer Systems," Proceedings AFIPS, SJCC, Vol. 39, pp. 365-371, 1971.
15. Ihnat, J., et al., Signal Processing Element Functional Description, Part I, NRL Report 7490, September 1972.
16. Shore, J. E., Second Thoughts on Parallel Processing, Naval Research Laboratory Report #7364, December 1971.

A structural approach to computer performance analysis

by P. H. HUGHES and G. MOE
University of Trondheim, Norway

INTRODUCTION

The performance analysis of computer systems is as yet a rather unstructured field in which particular aspects of systems or items of software are studied with the aid of various forms of models and empirical measurements. Kimbleton1 develops a more general approach to this problem using three primary measures of system performance. The approach to be described here represents a similar philosophy but deals only with throughput as the measure of system performance. This restriction results in a model which is convenient and practical for many purposes, particularly in batch-processing environments. It is hoped that this approach will contribute to the development of a common perspective relating the performance of different aspects of the system to the performance of the whole.

The present work arises out of a continuing program of performance evaluation begun in 1970 on the UNIVAC 1108 operated by SINTEF (a non-profit engineering research foundation) for the Technical University of Norway (NTH) at Trondheim. The status of the Computing Centre has since been revised, such that it now serves the newly formed University of Trondheim, which includes NTH. Early attention focused on trying to understand EXEC 8 and to identify possible problem areas. One of the early fruits of the work is described in Reference 2. However, the main emphasis for most of 1971 was on the development of benchmark techniques supported by a software monitoring package supplied by the University of Wisconsin as a set of modifications to the EXEC 8 operating system.3,4 It was in an effort to select and interpret from the mass of data provided by software monitoring that the present approach emerged.

The operation of a computer system may be considered at any number of logical levels, but between the complexities of hardware and software lies the relatively simple functional level of machine code and I/O functions, at which processing takes physical effect. The basis of this paper is a general model of the processes at this physical level, to which all other factors must be related in order to discover their net effect on system performance.

THE GENERAL NATURE OF THE WORKLOAD

At the level we shall consider, a computer is a network of devices for transferring and processing information. A program is a series of requests for action by one or more devices, usually in a simple, repetitive sequence. The way in which instructions are interpreted means that the central processor is involved every time I/O action is to be initiated, so that every program can be reduced to a cycle of requests involving the CPU and usually one or more other devices. In a single program system, devices not involved in the current program remain idle, and the overlap between CPU and I/O activity is limited by the amount of buffer space available in primary store. Performance analysis in this situation is largely a matter of device speeds and the design of individual programs.

Multiprogramming overcomes the sequential nature of the CPU-I/O cycle by having the CPU switch between several such programs so as to enable all devices to be driven in parallel. The effectiveness of this technique depends upon several factors:

(i) the buffering on secondary storage of the information flow to and from slow peripheral devices such as readers, printers and punches ('spooling' or 'symbiont activity').
(ii) the provision of the optimum mix of programs from those waiting to be run so that the most devices may be utilised (coarse scheduling).
(iii) a method of switching between programs to achieve the maximum processing rate (dynamic scheduling).

These scheduling strategies involve decisions about allocating physical resources to programs and data. The additional complexity of such a system creates its own administrative overhead which can add significantly to the total workload. The success of multiprogramming is highly sensitive to the match between physical resources and the requirements of the workload. In particular there must be:

(iv) provision of sufficient primary store and storage management to enable a sufficient number of programs to be active simultaneously.
(v) a reasonable match between the load requirements on each device and the device speed and capacity so as to minimise 'bottleneck' effects whereby a single overloaded device can cancel out the benefits of multiprogramming.

Such a system, if properly configured and tuned, can achieve a throughput rate several times greater than a single program system. Performance analysis becomes much more complex, but correspondingly more important.

A MODEL OF THE PHYSICAL WORKLOAD

We wish to model a number of independent processes, each consisting of a cycle of CPU-I/O requests. In real life, the nature of these processes changes dynamically in an extremely complex way and we introduce several simplifying assumptions:

(i) the processes are statistically identical and are always resident in core store.
(ii) requests by a process are distributed among devices according to a stationary, discrete probability distribution function f.
(iii) the time taken to service a request on a particular device i is drawn from a stationary distribution function s.

Figure 1 illustrates the network we have been using. The names in parentheses refer to the particular equipment in use at our installation. FH432, FH880, and FASTRAND constitute a hierarchy of fast, intermediate, and large slow drums respectively.

Figure 1-Configuration used by the model: a CPU and I/O devices serving p processes. Names in parentheses denote corresponding equipment (FH432, FH880, FASTRAND, magnetic tape)

We restrict the model to those devices which satisfy the constraint that any process receiving or waiting to receive service from the device must be resident in core store. If a process requires service from a device which does not satisfy this constraint (e.g., a user terminal) it is no longer 'active' in the terms of the model. Normally it will be replaced in core by some other process which is ready to proceed. This restriction rules out an important class of performance variables such as system response time, but has corresponding advantages in dealing with throughput.

At any instant the number of active, in-core processes p is discrete, but the average over a period of time may be non-integer, as the mix of programs that will fit into core store changes. In addition p includes intermittent processes such as spooling which will contribute fractionally to its average value. We will refer to p as the multiprogramming factor. This is to be distinguished from the total number of open programs (including those not in core store) which is sometimes referred to as the degree of multiprogramming.

In the first instance, all distribution functions have been assumed to be negative exponential with appropriate mean values for each device. First-in first-out queuing disciplines are also assumed. This is not strictly valid under EXEC 8 in the case of the CPU, but the assumption does not upset the general behaviour of the model.

Throughput and load

We may define the throughput of such a network to be the number of request cycles completed per second for a given load. The load is defined by two properties:

(i) the distribution of requests between the various devices
(ii) the work required of each device.

This 'work' cannot be easily specified independently of the characteristics of the device, although conceptually the two must be distinguished. The most convenient measure is in terms of the time taken to service a request, and the distribution of such service times with respect to a particular device.

For a given load, the distribution of requests among N devices is described by a fixed set of probabilities f_i (i = 1, ..., N). The proportion of requests going to each device over a period of time will therefore be fixed, regardless of the rate at which requests are processed. This may be stated as an invariance rule for the model, expressible in two forms with slightly different conditions. For a given load distribution f:

(i) the ratio of request service rates on respective devices is invariant over throughput changes, and
(ii) if the service times of devices do not change, the ratio of utilizations on respective devices is invariant over throughput changes.

Behaviour of the model

Figures 2(a), 2(b) show how the request service rate and the utilization of each device vary with the number of active processes p, for a simulation based on the UNIVAC 1108 configuration and workload at Regnesentret. These two figures illustrate the respective invariance rules just mentioned. Furthermore the two sets of curves are directly related by the respective service times of each device j. In passing we note that the busiest device is not in this case the one with the highest request service rate, which is always the CPU.

Figure 2-Behaviour of model as a function of the multiprogramming factor p: (a) request service rate per device (ratio between curves determined by the load on devices); (b) device utilization (ratio between curves determined by the product of load and service time for each device); (c) the multiprogramming gain function F(p)

Since we have assumed that service times are independent of p (not entirely true on allocatable devices), it is clear that the shape of every curve is determined by the same function F(p), which we will call the multiprogramming gain function, shown in Figure 2(c). This function has been investigated analytically by Berners-Lee.5 Borrowing his notation, we may express the relationship of Figure 2(b) as

X_j(p) = F(p) x_j    for j = 1, 2, ..., N    (1)

where X_j(p) is the utilization of device j when p processes are active and x_j is identical with X_j(1), the corresponding utilization when only one process is active. Now when only one process is active, one and only one device will be busy at a time, so that

Σ_{j=1..N} x_j = 1    (2)

Summing equations (1) for all devices, we have

Σ_{j=1..N} X_j(p) = F(p)    (3)

That is, the multiprogramming gain for some degree of multiprogramming p is equal to the sum of the device utilizations.

It is instructive to consider the meaning of the function F(p). Multiprogramming gain comes about only by the simultaneous operation of different devices. If there are p parallel processes on N devices, then some number of processes q satisfying q ≤ min(p, N) are actually being serviced at any instant, while (p-q) processes will be queuing for service. It follows that the time-averaged value of q must be F(p), so that one may regard F(p) as the true number of processes actually receiving service simultaneously. Since q is limited by the number of devices N, it follows that the maximum value of F(p) must be N, which we would expect to be the case in a perfectly balanced system, at infinitely large p.
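The behaviour summarised in Figure 2 can be reproduced with a few lines of discrete-event simulation. The sketch below is a reader's illustration rather than the authors' simulator: it assumes the exponential service times and first-in first-out queues stated above, and the request distribution f and mean service times s are invented values, not the measured 1108 workload.

import heapq
import random

def simulate(p, f, s, horizon=200_000.0, seed=1):
    """Closed cyclic-request model: p identical processes cycle among N devices.
    f[i] = probability that a request goes to device i; s[i] = mean service time
    on device i (milliseconds).  Exponential service, FIFO queues, no think time.
    Returns long-run estimates of the utilizations X_j(p), the gain
    F(p) = sum of utilizations, and the request throughput."""
    rng = random.Random(seed)
    N = len(f)
    waiting = [0] * N            # queued processes per device (identity irrelevant)
    busy = [False] * N
    busy_time = [0.0] * N
    events = []                  # (completion_time, device)
    completed = 0

    def start(i, now):
        t = rng.expovariate(1.0 / s[i])
        busy[i] = True
        busy_time[i] += t
        heapq.heappush(events, (now + t, i))

    def issue_request(now):
        i = rng.choices(range(N), weights=f)[0]
        if busy[i]:
            waiting[i] += 1
        else:
            start(i, now)

    for _ in range(p):           # all p processes issue their first request at t = 0
        issue_request(0.0)

    now = 0.0
    while events and now < horizon:
        now, i = heapq.heappop(events)
        completed += 1
        busy[i] = False
        if waiting[i]:           # next process in the FIFO queue for device i
            waiting[i] -= 1
            start(i, now)
        issue_request(now)       # the finished process begins its next cycle

    util = [b / now for b in busy_time]
    return util, sum(util), completed / now

# Illustrative parameters only: device 0 plays the role of the CPU,
# the rest stand in for drums and tape.
f = [0.50, 0.20, 0.15, 0.10, 0.05]
s = [10.0, 6.0, 12.0, 30.0, 80.0]        # mean service times, ms
for p in (1, 2, 4, 8):
    util, gain, rate = simulate(p, f, s)
    print(f"p={p}: F(p)={gain:.2f}, throughput={1000*rate:.0f} req/s, "
          f"utilizations={[round(u, 2) for u in util]}")

Running this for increasing p shows F(p) climbing towards its limiting value while the ratios between the device utilizations remain (approximately) fixed, which is the invariance rule of the previous section.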
Improving system throughput

We shall consider in detail two ways of improving the throughput of such a system:

(i) by increasing the multiprogramming factor
(ii) by improving the match between system and workload.

Later we shall encounter two further ways by which throughput could be improved:

(iii) by reducing the variance of request service times
(iv) by improving the efficiency of the software so that the same payload is achieved for a smaller total workload.

Increasing the multiprogramming factor

We may increase the number of active processes by either acquiring more core store or making better use of what is available by improved core store management, or smaller programs. The latter two alternatives may, of course, involve a trade-off with the number of accesses required to backing store. The maximum gain obtainable in this way is limited by the utilization of the most heavily used device-the current "bottleneck." If the present utilization of this device is X_m, and the maximum utilization is unity, then the potential relative throughput gain is 1/X_m (Figure 3). The practical limit will of course be somewhat less than this because of diminishing returns as p becomes large. A simple test of whether anything is to be gained from increasing p is the existence of any device whose utilization approaches unity. If such a device does not exist then p is a "bottleneck." However, the converse does not necessarily apply if the limiting device contains the swapping file.

Figure 3(a)-The effect of increasing multiprogramming on the utilization of the limiting device m
Figure 3(b)-The effect of increasing multiprogramming on the multiprogramming gain function F(p)

Matching system and workload

This method of changing performance involves two effects which are coupled in such a way that they sometimes conflict with each other. They are:

(a) improvement in monoprogrammed system performance
(b) improvement in multiprogramming gain.

These two components of system performance are seen in

R = F(p) r    (4)

where r is the total number of requests processed per second on all devices when p = 1, and R is the corresponding total at the operational value of p (Fig. 2(a)). Clearly we may improve the total performance R by improving either r or F(p), but we shall see that any action we take to improve one has some effect on the other.

At p = 1 the mean time to perform any request is Σ f_i s_i, hence

r = 1 / Σ f_i s_i    (5)

In considering the potential multiprogramming gain F(p), it is useful to examine the limiting value F_l as p becomes large. Applying equation (1) to the limiting device m we have

F(p) = X_m(p) / x_m

and, as p becomes large,

F_l = 1 / x_m    (6)

The only way of improving F_l is to reduce the monoprogrammed utilization x_m. But since m is the limiting device and, from (2), the x_j sum to unity, it follows that x_m must have a lower bound of 1/N, at which point all x_j must be equal to 1/N and F_l = N. This is the condition for a balanced system.

The limiting throughput corresponding to F_l is obtained by substituting in (4) using (5) and (6):

R_l = (1 / Σ f_i s_i) · (1 / x_m)

By definition

x_m = f_m s_m / Σ f_i s_i

so that

R_l = 1 / (f_m s_m)    (7)

For a balanced system, x_m = 1/N, and from (4)

R_l(balance) = N r    (8)

It is important to note that while a balanced system is a necessary and sufficient condition for a maximum potential multiprogramming gain F_l, it is not necessarily an appropriate condition for a maximum throughput R at some finite value of p. This is because of the coupling between F(p) and r.

We shall now use equations (5) to (8) to examine informally the effect, at high and low values of p, of two alternative ways of improving the match between system and workload. They are:

(i) to reduce device service times
(ii) to redistribute the load on I/O devices.

Effect of reducing service time

Let us consider reducing the service time s_j on device j. From (5), r will always be improved, but it will be most improved (for a given percentage improvement in s_j) when f_j s_j is largest with respect to Σ f_i s_i, i.e. when j = m, the limiting device. From (7), if j ≠ m, R_l is not affected by the change, but if j = m then R_l is inversely proportional to the service time. Thus we may expect that for the limiting device an improvement in service time will always result in an improvement in throughput, but this will be most marked at high values of p. Speeding up other devices will have some effect at low values of p, diminishing to zero as p increases (Fig. 4(a)). If a limiting device j is speeded up sufficiently, there will appear a new limiting device k. We may deal with this as two successive transitions with j = m and j ≠ m, separated by the boundary condition f_j s_j = f_k s_k.

Figure 4(a)-The effect of reducing service time on device j (system throughput against multiprogramming factor p, for j = m and j ≠ m)
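These conclusions can be checked numerically from equations (5) to (7) without any simulation. The sketch below uses the same invented f and s as the earlier sketch; it is an illustration of the algebra, not a tool from the paper.

def monoprogrammed_rate(f, s):
    """Equation (5): r = 1 / sum of f_i * s_i (requests per unit time at p = 1)."""
    return 1.0 / sum(fi * si for fi, si in zip(f, s))

def limiting_analysis(f, s):
    """Equations (5) to (7): monoprogrammed utilizations x_j, the limiting device m,
    the limiting gain F_l = 1/x_m, and the limiting throughput R_l = 1/(f_m * s_m)."""
    r = monoprogrammed_rate(f, s)
    x = [fi * si * r for fi, si in zip(f, s)]     # x_j = f_j s_j / sum of f_i s_i
    m = max(range(len(x)), key=x.__getitem__)
    return r, x, m, 1.0 / x[m], 1.0 / (f[m] * s[m])

# Same illustrative figures as before (not measured data); times in ms.
f = [0.50, 0.20, 0.15, 0.10, 0.05]
s = [10.0, 6.0, 12.0, 30.0, 80.0]

r, x, m, F_l, R_l = limiting_analysis(f, s)
print(f"limiting device m={m}, x_m={x[m]:.2f}, F_l={F_l:.2f}, "
      f"r={1000*r:.1f} req/s, R_l={1000*R_l:.1f} req/s")

# Speeding up the limiting device raises R_l (until a new limiting device appears);
# speeding up a non-limiting device improves r slightly but leaves R_l unchanged.
for j in (m, 1):                 # device 1 is not the limiting device here
    s2 = list(s)
    s2[j] = s2[j] / 2
    r2, _, m2, F2, R2 = limiting_analysis(f, s2)
    print(f"halve s[{j}]: r={1000*r2:.1f} req/s, new limiting device={m2}, "
          f"R_l={1000*R2:.1f} req/s")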
Redistributing the load

This alternative depends on changing the residence of files so that there is a shift of activity from one device to another. Here we must bear in mind two further constraints. Firstly, this is not possible in the case of certain devices, e.g. the CPU in a single-CPU system. Secondly, a conservation rule usually applies such that work removed from one device i must be added to some other device j. That is,

δf_j = -δf_i    (9)

Consider the effect of switching requests from any device i to a faster device j. Redistribution depends both on the multiprogramming factor and on how the load on the limiting device is affected by the change. Redistribution is most effective if it reduces the load on the limiting device. Successive applications of this criterion will eventually lead to one of two situations: either (1) the CPU becomes the limiting device (CPU-bound situation), or ...

Figure 4(b)-The effect of redistributing the load from device i to a faster device j (system throughput against multiprogramming factor p, original and modified curves)

Figure 8-Structural view of performance analysis

At this level the user's processing requirements are given explicit shape in some programming language, or by the selection of a particular application package. Processing is defined in terms of records, files and algorithms. The net result from this level of decision making is the identifiable workload of the computer. This is the 'payload', which the user wishes to pay for and which the installation seeks to charge for. Unfortunately, accounting information provided by the system is often only available on a device-oriented rather than file-oriented basis, and compounded with effects from lower levels in the system.

This is the level at which benchmarks are prepared for purposes of computer selection or testing of alternative configurations. In the case of computer selection, the aim must be to withhold as many decisions as possible about how the user's requirements will be implemented, since this is so dependent upon available facilities. The alternative is to follow up the ramifications of each system. Thus in general, a quantitative approach to computer selection is either inaccurate or costly, or both. Benchmarks for a given range of computers are a more feasible proposition since they can be constructed on the output of level 5, leaving open as many resource allocation decisions as possible, so that the most can be made of alternative configurations.

Level 4 translation

By this is intended the whole range of software, including input/output routines and compilers, by which the punched instructions of the programmer are translated into specific machine orders and the results translated back again into printed output. Factors involved at this level will include optimization of code, efficiency of compilers, and buffer sizes determined by standard I/O routines. It is clear that these will in turn affect the core size of programs, the length of CPU requests and the number and length of accesses to different files. The output from this stage is the generated workload, and its relation to the payload might be thought of as the 'efficiency' of the system software. Improving this efficiency is the fourth way of improving system performance, referred to in a previous section.

Level 3 static resource allocation

At this level, decisions are concerned with matching processing requirements to the capacity of the configuration. Decisions are made about the allocation of files to specific devices, taking account of configuration information about physical device capacity, number of units, etc. We also include here the decision to 'open' a job, implying the assignment of temporary resources required for its execution. This is done by the coarse scheduling routines, which also decide on the number and type of jobs to be simultaneously open and hence the job mix which is obtained. Both user and operator may override such decisions, subordinating machine efficiency to human convenience. The decisions taken at this level may influence the maximum number of parallel processes and the relative activity on different I/O devices.
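As a toy illustration of the kind of decision made at this level, the sketch below assigns files to devices so as to keep each device's projected contribution to f_i·s_i in balance. The file activities, device access times and the greedy rule itself are assumptions made for illustration; the paper prescribes no such algorithm.

def assign_files(file_activity, device_speed):
    """Greedy sketch: place the most active file next, always on the device whose
    projected load (activity * mean access time, summed) grows the least.
    file_activity: accesses per second per file; device_speed: mean access time
    per device, playing the role of the model's s_i."""
    loads = {dev: 0.0 for dev in device_speed}
    placement = {}
    for name, rate in sorted(file_activity.items(), key=lambda kv: -kv[1]):
        dev = min(loads, key=lambda d: loads[d] + rate * device_speed[d])
        loads[dev] += rate * device_speed[dev]
        placement[name] = dev
    return placement, loads

# Hypothetical figures for illustration only.
files = {"scratch": 40.0, "payroll": 25.0, "source": 10.0, "archive": 2.0}
devices = {"FH432": 0.005, "FH880": 0.015, "FASTRAND": 0.090}   # seconds per access
placement, loads = assign_files(files, devices)
print(placement)
print({d: round(l, 2) for d, l in loads.items()})   # projected busy-seconds per second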
Level 2 dynamic resource allocation

By 'dynamic' we mean decisions taken in real time about time-shared equipment, namely the CPU and primary store. The number of active parallel processes is that number of processes which are simultaneously requesting or using devices in the system. To be active a process must first have primary store, and the dynamic allocation of primary store governs the number of processes active at any instant. Given the processes with primary store, the system must schedule their service by the CPU, which in turn gives rise to requests for I/O devices. The rules for selection among processes and the time-slice that they are allowed will influence the instantaneous load on devices. In terms of the model, this will influence the shape of the distributions of load f and service time s, which in turn influence the shape of the gain function F(p).

Level 1 execution

At this level, the final load has been determined, so that the remaining effects on performance are due to the physical characteristics of the devices. Unfortunately, it is difficult to express the load in terms which are independent of the specific device. The function f gives the distribution of requests, but the service time s is a compound of load and device speed, as we have discussed. However, it is at least a quantity which can be directly monitored for a given workload and configuration, and one may estimate how it is affected by changes in device characteristics. In principle, the important parameters of our model can be monitored directly at this level of the system, although we have not yet succeeded in obtaining an empirical value for p. However, the model depends for its simplicity and power on the correct use of the distributions of these parameters, and investigations continue in this area.
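By way of illustration, reducing a monitor trace to the model's parameters is little more than a tabulation. The record format below is invented for the sketch; a real monitor, such as the Wisconsin package referred to earlier, has its own formats.

from collections import defaultdict

def estimate_load(records):
    """records: iterable of (device_name, service_time) pairs from a monitor trace.
    Returns the request distribution f and the mean service time s per device,
    i.e. empirical counterparts of the model's f_i and s_i."""
    count = defaultdict(int)
    total_time = defaultdict(float)
    n = 0
    for device, t in records:
        count[device] += 1
        total_time[device] += t
        n += 1
    f = {d: count[d] / n for d in count}
    s = {d: total_time[d] / count[d] for d in count}
    return f, s

# A few made-up trace records, purely to show the shape of the calculation.
trace = [("CPU", 0.004), ("FH432", 0.006), ("CPU", 0.012),
         ("FASTRAND", 0.095), ("CPU", 0.003), ("FH880", 0.017)]
f, s = estimate_load(trace)
print(f)   # fraction of requests per device         -> f_i
print(s)   # mean service time per device (seconds)  -> s_i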
CONCLUSION

We have presented a set of concepts which have been developed in an effort to master the performance characteristics of a complex computer system. These concepts, together with the simple queuing model which enables us to handle them, have proven their usefulness in a variety of practical situations, some of which have been described. The application of these concepts depends upon having the necessary information provided by monitoring techniques, and conversely provides insight in the selection and interpretation of monitor output. While such abstractions should whenever possible be reinforced by practical tests, such as benchmarks, they in turn provide insight in the interpretation of benchmark results.

In its present form the model is strictly concerned with throughput and is not capable of distinguishing other performance variables such as response time. This severely restricts its usefulness in a timesharing environment, but is very convenient in situations where throughput is of prime concern. Consideration of the distinct types of decisions made within the computer complex suggests that it may be possible to assess the effect of different system components on overall performance in terms of their effect on the basic parameters of the model. It is thought that the approach described may be particularly useful to individual computer installations seeking an effective strategy for performance analysis.

REFERENCES

1. Kimbleton, S. R., "Performance Evaluation-A Structured Approach," Proceedings AFIPS Spring Joint Computer Conference, 1972, pp. 411-416.
2. Strauss, J. C., "A Simple Thruput and Response Model of EXEC 8 under Swapping Saturation," Proceedings AFIPS Fall Joint Computer Conference, 1971, pp. 39-49.
3. Draper, M. D., Milton, R. C., UNIVAC 1108 Evaluation Plan, University of Wisconsin Computer Center, Technical Report No. 13, March 1970.
4. Hughes, P. H., "Developing a Reliable Benchmark for Performance Evaluation," NordDATA 72 Conference, Helsinki, 1972, Vol. II, pp. 1259-1284.
5. Berners-Lee, C. M., "Three Analytical Models of Batch Processing Systems," British Computer Society Conference on Computer Performance, University of Surrey, Sept. 1972, pp. 43-52.
6. Florkowski, J. H., "Evaluating Advanced Timesharing Systems," IBM Technical Disclosure Bulletin, Vol. 14, No. 5, October 1971.

Simulation-A tool for performance evaluation in network computers

by EDWARD K. BOWDON, SR., SANDRA A. MAMRAK and FRED R. SALZ
University of Illinois at Urbana-Champaign
Urbana, Illinois

Since the model was developed for a hypothetical network, we needed to ensure that the results were valid and that no gross errors existed in the model. Our approach was to design a general n-node network simulator and then to particularize the input parameters to describe ILLINET (the computer communications network at the University of Illinois). For a given period, system accounting records provided exact details of the resources used by each task in the system, including CPU usage, input/output resources used, core region size requested, and total real time in the system. Using the first three of these parameters as input data, we could simulate the fourth. Comparison of the actual real time in the system to the simulated real time in the system authenticated the accuracy of the model. Extrapolating from these results, we could then consider the more general network with reasonable assurance of accurate results.

INTRODUCTION

The success or failure of network computers in today's highly competitive market will be determined by system performance. Consequently, existing network computer configurations are constantly being modified, extended, and, hopefully, improved. The key question pertaining to the implementation of proposed changes is "Does the proposed change improve the existing system performance?" Unless techniques are developed for measuring system performance, network computers will remain expensive toys for researchers, instead of becoming cost effective tools for progress.

In order to analyze and evaluate the effects of proposed changes on system performance, we could employ a number of different techniques. One approach would be to modify an existing network by implementing the proposed changes and then run tests.
Unfortunately, for complex changes this approach becomes extremely costly both in terms of the designer's time and the programmer's time. In addition, there may be considerable unproductive machine time. Alternatively, we could construct a mathematical model of the envisioned network using either analytical or simulation techniques. Queueing theory or scheduling theory could be employed to facilitate formulation of the model, but even for simple networks the resulting models tend to become quite complex, and rather stringent simplifying assumptions must be made in order to find solutions. On the other hand, simulation techniques are limited only by the capacity of the computer on which the simulation is performed and the ingenuity of the programmer. Furthermore, the results of the simulation tend to be in a form that is easier to interpret than those of the analytical models. To be of value, however, a simulation model must be accurate both statistically and functionally. In order to ensure that the analysis of proposed changes based on the simulation results are realistic, the model's performance must be measured against a known quantity: the existing network. In this paper we present a simulation model for a hypothetical geographically distributed network computer. MODEL DEVELOPMENT We begin the development of our network model by focusing our attention on ILLINET. This system contains a powerful central computer with copious backup memory which responds to the sporadic demands of varying priorities of decentralized complexes. The satellite complexes illustrated in Figure 1 include: (1) Simple remote consoles. (2) Slow I/O. (3) Faster I/O with an optional small general purpose computer for local housekeeping. (4) Small general purpose computers for servicing visual display consoles. (5) Control computers for monitoring and controlling experiments. (6) Geographically remote satellite computers. This network was selected for study because it represents many of the philosophies and ideas which enter into the design of any network computer. The problems of interest here include the relative capabilities of the network, identification of specific limitations of the network, and the 121 122 National Computer Conference, 1973 LOW SPEED "REMOTES" t ~ computer conceptually performs three major functions: queue handling and priority assignment; processor allocation; and resource allocation other than the CPU (such as main storage, input/ output devices, etc.). The goal of the single node optimization was to develop a priority scheme that would minimize the mean flow time of a set of jobs, while maintaining a given level of CPU and memory utilization. The IBM 360/75 was taken as the model node, the present scheduling scheme of the 360;75 under HASP (Houston Automatic Spooling Priority System)! was evaluated, and as a result a new priority scheme was devised and analyzed using the simulation model. r::::7 NODE DESCRIPTION OUCHTONE TELEPHONES LOW SPEED "LOCALS" HIGH SPEED "LOCALS" ~ \J r:::::7 ~ 1"""I N L:J \ MEDIUM SPEED "REMOTES· Figure l-ILLINET-University of Illinois computer communications network interrelationship between communication and computing. From a long range viewpoint, one of the more interesting problems is the effect on system performance of centralized vs. distributed control in the operating system. From a postulation of the essential characteristics of our computer network, we have formulated a GPSS model for a three node network, illustrated in Figure 2. 
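The model itself is written in GPSS. Purely to show its overall shape, three independent job streams each feeding its own node, here is a much-reduced sketch in Python: each node is collapsed to a single server, and the arrival and service rates are placeholders rather than ILLINET figures.

import heapq
import random

rng = random.Random(7)
ARRIVAL_RATE = {1: 1 / 40.0, 2: 1 / 25.0, 3: 1 / 15.0}   # jobs per second (assumed)
MEAN_SERVICE = {1: 22.0, 2: 22.0, 3: 22.0}               # seconds (assumed)

def run(horizon=3600.0):
    """Three independent Poisson job streams, one single-server node each."""
    events = []                                # (time, kind, node)
    for node, lam in ARRIVAL_RATE.items():
        heapq.heappush(events, (rng.expovariate(lam), "arrival", node))
    queue = {n: 0 for n in ARRIVAL_RATE}       # jobs waiting, per node
    busy = {n: False for n in ARRIVAL_RATE}
    done = {n: 0 for n in ARRIVAL_RATE}

    def start(node, now):
        busy[node] = True
        heapq.heappush(events, (now + rng.expovariate(1 / MEAN_SERVICE[node]),
                                "departure", node))

    while events:
        now, kind, node = heapq.heappop(events)
        if now > horizon:
            break
        if kind == "arrival":
            heapq.heappush(events, (now + rng.expovariate(ARRIVAL_RATE[node]),
                                    "arrival", node))
            if busy[node]:
                queue[node] += 1
            else:
                start(node, now)
        else:                                  # departure
            done[node] += 1
            busy[node] = False
            if queue[node]:
                queue[node] -= 1
                start(node, now)
    return done, queue

print(run())   # jobs completed and jobs still queued, per node

The actual node model is far richer than a single server, as the following description of HASP and O.S. makes clear.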
Jobs entering the nodes of the network come from three independent job streams, each with its own arrival rate. A single node was isolated so that performance could be tested and optimized for the individual nodes before proceeding to the entire network. A node in a network The logical structure of the HASP and OS /360 systems currently in use on ILLINET is illustrated in Figure 3 and briefly described in the following paragraphs. Job initiation Under the present HASP and O.S. system jobs are read simultaneously from terminals, tapes, readers, disks, and other devices. As a job arrives, it is placed onto the HASP spool (which has a limit of 400 jobs). If the spool is full, either the input unit is detached, or the job is recycled back out to tape to be reread later at a controlled rate. Upon entering the system, jobs are assigned a "magic number," Y, where the value of Y is determined as follows: Y=SEC+.l*IOREQ+.03*LINES. SEC represents seconds of CPU usage, LINES represents printed output, and IOREQ represents the transfer to or ILl PLACE JOB ON O.S.QUEUE JOB ACCOUNTING Figurr ?-- Hypf)thptir~l nrtwnrk romplltpr (1) FiglJ1"P ::I~- T,ngif'fll <;t1"llf'tll1"P "f HARP Simulation 123 The O.S. supervisor is then called to allocate core space. The first block of contiguous core large enough to contain the step request is allocated to the job. If no such space is available, the initiator must wait, and is therefore tying up both the O.S. and HASP initiators. No procedures in O.S. exist for compacting core to avoid fragmentation. Once core is allocated, the program is loaded, and the job is placed on a ready queue with the highest nonsystem priority. PREPARE TO RUN NEXT STEP o.s. scheduler Jobs are selectively given control of the CPU by the O.S. scheduler. The job with the highest dispatching priority is given control until an interrupt occurs-either user initiated or system initiated. CALL SUPERVISOR TO ALLOCATE CORE SPACE HASP dispatcher SCHEDULER CHECK LIMITS (Yes) (No) Ret'Jrn Initiator Every two seconds, a signal is sent by the dispatcher to interrupt the CPU, if busy. All of the jobs on the ready queue are then reordered by the assignment of new dispatching priorities based on resources used in the previous 2 second interval. The job that has the lowest ratio of CPU time to 110 requests will get the highest dispatching priority. (For example, the jobs that used the least CPU time will tend to get the CPU first on return from the interrupt.) During this period, HASP updates elapsed statistics and checks them against job estimates, terminating the job if any have been exceeded. Figure 3b-Logical structure of 0.8. Job termination from core storage of blocks of data. Based on this magic number, a "class" assignment is given to each job. Anyone of seven initiators can be set to recognize up to five different classes of jobs, in a specific order. It is in this order that a free initiator will take a job off the spool and feed it to O.S. For example, if an initiator is set CBA, it will first search the spool for a class C job; if not found, it will look for a class B. If there is no B job, and no A job either, the initiator will be put in a wait state. Once the job is selected, it is put on the O.S. queue to be serviced by the operating system. o.s. initiation After a job is placed on the O.S. queue, there is no longer any class distinction. Another set of initiators selects jobs on a first-come, first-served basis and removes them from the O.S. queue. 
It is the function of these initiators to take the job through the various stages of execution. The control cards for the first (or next) step is scanned for errors, and if everything is satisfactory, data management is called to allocate the devices requested. The initiator waits for completion. When execution of the job is completed, control is returned to the HASP initiator to proceed with job termination. Accounting is updated, the progression list is set to mark completion, and Print or Punch service is called to produce the actual output. Purge servi<;e is then called to physically remove the job from the system. The initiator is then returned to a free state to select a new job from the spool. The main goal of the HASP and O.S. system is to minimize the mean flow time and hence the mean waiting time for all jobs in the system, provided that certain checks and balances are taken into account. These include prohibiting long jobs from capturing the CPU during time periods when smaller jobs are vying for CPU time, prohibiting shorter jobs from completely monopolizing the CPU, and keeping a balance of CPU bound and II 0 bound jobs in core at any given time. At this point the question was asked: "Could these goals be achieved in a more efficient way?" PROPOSED PRIORITY SCHEME In a single server queueing system assuming Poisson arrivals, the shortest-processing-time discipline is optimal 124 National Computer Conference, 1973 with respect to mimmizmg mean flow-time (given that arrival and processing times of jobs are not known in advance of their arrivals). 2 This result is also bound to the assumption that jobs are served singly and totally by the server and then released to make room for the next job. Processing time With respect to OS /360 there are several levels and points of view from which to define processing or service time. From the user's point of view processing time is, for all practical purposes, the time from which his program is read into HASP to the time when his output has been physically produced-punched, filed, plotted and/ or printed. Within this process there are actually three levels of service: (1) The initial HASP queuing of the job, readying it for O.S.; a single server process in the precise sense of the word. (2) The O.S. processing of the job; a quasi single server process where the single-server is in fact hopping around among (usually) four different jobs. (3) The final HASP queueing and outputting of the job; again a true single-server process. job's priority). A summary of the dynamics of the proposed priority scheme is depicted in Figure 4. Dynamic priority assignment Once the initial static priority assignment has been determined for each job, a dynamic priority assignment algorithm is used to ensure that the checks and balances listed previously are achieved. The restraints which are enforced by the dynamic priority assignment are needed JOBS PRT (Static) WAITING QUEUE DPA and SBM (Dynamic) The second level of service was used as a reference point and processing time was defined as the total time a job is under O.S. control, whether it is using the CPU or not. The total time a job is under control of O.S. consists of four time elements: " READY QUEUE (1) Waiting for core-this quantity is directly related to the region of core requested by a job and can be represented by a . R where a is a statistical measure of the relationship of core region requests to seconds waiting and R is the region size requested. 
(2) Direct CPU usage-this quantity can be measured in seconds by a control clock and is denoted by CPUSEC. (3) Executing I/O-this quantity includes the time needed for both waiting on an 110 queue and for actually executing 1/0. It is directly related to the number of I/O requests a job issues and can be represented by {3 . 10 where {3 is a statistical measure of the relationship of the number of 1I 0 req uests to seconds waiting for and executing I/O, and 10 is the number if I/O requests issued. (4) Waiting on the ready queue-this quantity is heavily dependent on the current job configuration. Since the O.S. queue configuration a job encounters is unknown when the job enters HASP, this waiting time is not accounted for in the initial assignment. HASP DISPATCHER (Dynamic) CPU (Static) OUTPUT QUEUE The total job processing time, PRT, may be expressed as follows: PRT=a· R+CPUSEC+{3· 10 ~, (2) This number, calculated for each job, becomes an initial priority assignment (the lower the number the higher the JOBS Figurel Prupuseu priulity .:l,;siglll11cnt scheme Simulation before a job enters O.S., so the dynamic priority assignment is made while jobs are waiting on the HASP queue. The dynamic priority assignment, DPA, was implemented by measuring a job's waiting time at its current priority. The job's priority is increased if the time spent at the current level exceeds an upper limit established for that level. System balance At this point jobs are on the HASP queue, ordered by a dynamically adjusted PRT priority assignment, ready to be picked up by an initiator that is free. As soon as an initiator chooses a job, that job leaves HASP control and enters O.S. control. Jobs then move between the O.S. run and ready queues under the influence of the HASP dispatcher discussed earlier. The HASP dispatcher guarantees the highest CPU use level possible, given ihaf the set of jobs has been initiated according to its DPA values. However, CPU and memory utilization may fall below some predetermined level because a particular set of initiated jobs simply does not maintain system balance. 3 Therefore, one more dynamic assignment, SBM-System Balance Measure, was introduced. An SBM assignment enables jobs to take positions in the job queue independently of their DPA. If the utilization of CPU and memory is below a predetermined level, the system is said to be out of balance, and the next job initiated for processing should be the one that best restores balance (where the one with the lowest DPA is chosen in case of ties). NODE VALIDATION tion of how much processing time that job required. The prediction was to be made from user estimates of the resources required by a job. In our previous discussion a formula was devised to make this initial priority assignment, based on user predictions of CPU seconds. kilobytes of core and number of I/O requests and on t~o statistical measures-a and {3. Recall a is a measure of the relationship of core region request to seconds waiting for this request, and {3 is a measure of the relationship of the number of I/O requests to seconds waiting for and executing this I/O so that PRT=a . COREQ+CPUSEC +{3' [0 The first step in the proposed priority scheme was to assign an initial static priority to a job, based on a predic- + (3 . [0 (4) where {3 was the only unknown. 
Using a light job stream (average arrival rate of one job every 40 seconds and a CPU utilization of 8 percent with 3 percent of the jobs waiting for core) an exponential distribution of wait times distributed around a mean of 54 msec gave the closest match between the simulated and real O.S. time distributions. {3 was assigned the value 0.038 seconds per I/O request, since in an exponential distribution 50 percent of the values assigned will be less than 69 percent of the mean. The simulation was then used with a heavier job stream (one job every 15 seconds) for the determination of a. Statistics were produced correlating the size of a step's core request and the number of milliseconds it had to wait to have the request filled. A least squares fit of the data yielded the relationship: WC=maxIO, . 7K2-100K! (5) where Wc is milliseconds of wait time and K is the number of kilobytes core requested. The PRT, in its final form thus became: PRT=CPUSEC+38iO Parameter determinations (3) Neither the a nor {3 measure was immediately available from the present monitoring data of the 360. The simulation was used in two different ways to determine these measures. The particular GPSS simulation being used, while allocating cote in the same way as the 360, does not set up all the I/O mechanisms actually used by the 360 when a job issues a request. The simulator assigns some time factor for the request and links the job to a waitingfor-I/O chain for the duration of the assigned time. The approach used in obtaining the {3 factor was to create a job stream for which the total time in O.S. was known and for which all the components contributing to time in O.S. were known except for the 10 factor. Of the four factors that contribute to a job's time in O.S. only actual CPU time could be positively known. A job stream in which jobs did not have to wait for core and in which jobs essentially had the CPU to themselves when they were in a ready state was required. Thus the equation was reduced to: Time in O.S. = CPUSEC The proposed priority assignment was then implemented on the simulator and statistics for jobs run on the ILLINET system were collected and used to determine the frequency distributions for the variables needed to create the simulation of the IBM 360;75. The model thus created was used for three distinct purposes. The first of these uses was as a tool to collect data not directly or easily accessible from the actual ILLINET system. It was in this capacity that the simulator yielded a and {3 factors needed for the PRT formula. The second use of the model was as a tuning instrument for finding the best adjustments of values for the DPA and SBM. The third use of the model was for evaluating the advantages (or disadvantages) of the new priority scheme by comparing various system measures for identical job streams run under both the old and new schemes. (A fourth use of the model might also be included here. The convincing results that it provided became the deciding factor in obtaining from the system designers the money and personnel needed to implement and test the proposed priority scheme under real-time conditions.) 125 + max iO, . 7K2-100KI (6) where CPUSEC is the number of CPU milliseconds required, 10 is the number of I/O requests, and K is kilobytes of core. 126 National Computer Conference, 1973 Dynamic tuning The values for the maximum waiting times for jobs with given PRTs were determined by putting the PRTs into a loose correspondence with the existing class divisions. 
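Gathering the pieces of the preceding derivation together, the complete static assignment of equation (6), and the dynamic promotion rule described earlier, amount to only a few lines. The jobs and limits below are hypothetical, and the code is a reader's paraphrase rather than anything taken from the simulator.

def prt(cpu_ms, io_requests, core_kb):
    """Initial static priority, equation (6):
    PRT = CPUSEC + 38*IO + max(0, 0.7*K**2 - 100*K),
    with CPUSEC in milliseconds, IO the number of I/O requests (38 ms each,
    i.e. the fitted beta of 0.038 s) and K the core request in kilobytes
    (the fitted core-wait relation of equation (5)).  Lower PRT = higher priority."""
    core_wait = max(0.0, 0.7 * core_kb ** 2 - 100.0 * core_kb)
    return cpu_ms + 38.0 * io_requests + core_wait

def age(priority_level, waited_ms, limit_ms):
    """Dynamic adjustment (DPA): promote a job one level once it has waited
    longer than the limit set for its current level (mapping of PRT values to
    levels and limits is assumed, not quoted from the paper)."""
    return priority_level - 1 if waited_ms > limit_ms else priority_level

# Hypothetical jobs: (CPU ms, I/O requests, core KB)
for job in [(2_000, 150, 60), (45_000, 400, 200), (500, 20, 40)]:
    print(job, "PRT =", prt(*job))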
A small job is guaranteed thirty minute turnaround time and a slightly larger job is guaranteed a turnaround of ninety minutes. Since the simulations generally were not set to simulate more than ninety minutes of real time, the guaranteed turnaround for very large jobs was set to an arbitrarily high value. Since test runs using the PRT were showing very satisfactory values for CPU and core utilization, about 96 percent and 73 percent respectively, a simple system balance measure was adopted. The SBM, System Balance Measure, adjustment checks CPU and memory use individually every two seconds and signals the initiator to choose as its next job the one that would best restore the utilization of the respective resource. The criterion for resource underuse is less than 30 percent utilization. The criterion for choosing a job to restore CPU use is the highest CPU /10 ratio. The criterion for choosing a job to restore memory use is the largest core request that fits into core at the time. NODE PERFORMANCE After the proposed priority scheme was developed, the simulation model was used to evaluate the node performance under each of three conditions: (1) Using the existing magic number technique. (2) Using the static PRT technique. (3) Using the dynamically adjusted PRT (including the DPA and SBM measures). For each of these tests, an hour of real time was simulated, with identical job streams entering the system. Table I illustrates the results of these tests including O.S. and turnaround times for the job streams, as well as CPU and core utilization values. In this evaluation we are particularly interested in three measures of system performance: (1) Turnaround time-system performance from the user's point of view. (2) System throughput-system performance from the system manager's point of view. (3) System balance-system performance from the device utilization point of view. In particular, we note the striking decrease in overall turnaround time for jobs processed under the proposed PRT scheduling algorithms. When the resource utilization is kept above some critical level and a maximum waiting time is specified, we observe that the turnaround time for the entire system can, in fact, increase. TABLE I -System Performance Measures for Three Priority Schemes PRESENT SCHEME Mean HASP Time· Class A B C D A-D PRT (Static) PRT (Dynamic) 100 100 100 8 98 28 9 17 100 12 18 11 Mean O.S. Time· Class A B C D A-D 100 100 100 74 54 47 94 69 91 100 70 87 Mean Turnaround Time· A-D 100 22 29 94 75 482 96 73 560 98 73 515 % CPU Utilization % CORE Utilization Total Jobs Processed • Relative time units (times are normalized to 100 units for each priority class). NETWORK MODELING Having developed a simulation model for a single node, we now turn to the problem of constructing a model for a network of three such nodes as illustrated in Figure 2. Jobs entering the system come from three independent job streams with different arrival rates. At selected intervals, the relative "busyness" of each center is examined. Based on this information, load-leveling is performed between centers. The three node network model was written in IBM's GPSS (General Purpose Simulation System),4 and run on an IBM 360;75. Once the simulation language and computer were selected, the next step was to formulate a design philosophy. DESIGN PHILOSOPHY Simulated time unit A major decision regarding any simulation model is the length of the simulated time unit. A small time unit would be ideal for a computer system simulation. 
However, other ramifications of this unit must be considered. It is desirable to simulate a relatively long real-time period in order to study the effect of any system modifications. This would be extremely lengthy if too small a time unit were chosen, requiring an excessive amount of computer time. Also, depending on the level of simulation, the accuracy could actually deteriorate as a result of the fine division of time. These and other considerations led to the selection of 1 millisecond as the clock unit. Using a time unit of 1 millisecond immediately results in the problem of accounting for times less than 1 ms. Of Simulation course, these times could not be ignored, but at the same time, could not be counted as a full clock unit. A compromise approach was used-that of accumulating all of these small pieces into a sum total of "system overhead," to be run during initiation/termination.* System time chargeable to a job therefore, is executed during initiation/termination. Additionally, nonchargeable overhead is accounted for at each interrupt of a job in the CPU, and at the reordering of dispatching priorities of jobs on the ready queue. Entity representation Before proceeding to write a simulation program, careful consideration had to be given to the way in which actual system entities were to be represented by the simutator~---'ilre-propertiesof a given systemfe-ature--to-lJe-simu--lated had to be defined and the GPSS function most closely matching the requirements selected. For example, representation of the HASP Spools was of primary importance, and GPSS offers a number of possibilitiesqueues, chains, etc. The requirement that transactions be re-ordered at any time ruled out the queue representation, and the optional automatic priority ordering possible with a user chain led to its selection. Chains also offered the best method of communication between nodes of the network since it is possible to scan the chains and remove any specific job. This is essential for the implementation of any load-leveling or system balancing algorithm. The structure of GPSS played another important part in determining the representation of jobs. A job could have been represented as a table (in the literal programming and not GPSS sense) that would contain the information about this job, and be referenced by all other transactions in the simulation. This would have led to a simulation within the simulation, an undesirable effect. Therefore, jobs are represented as transactions which keep all pertinent information in parameters. Unfortunately, this led to some rather complex timing and communication considerations which had to be resolved before the simulator could run. Timing and communication There is no direct method of communication between two transactions in GPSS, so whenever such contact was 127 necessary, alternate procedures were devised. For example, at some points an initiator must know the size of core requested by a given job. The receiving transaction must put itself in a blocked state, while freeing the transaction from which the information is required. The information is then put in a savevalue or other temporary location by the sending transaction. After signalling the receiving transaction that the information is present, this transaction puts itself in a blocked state, and thus allows the receiving transaction to regain control of the simulator in order to pick up the contents of the savevalue. 
This procedure is non-trivial, since in an event-driven simulation, there may be any number of transactions ready to run when the sending transaction is blocked. The priorities of the ready transactions, and knowledge of the scheduling algorithms of the simulation language itself, must be anaIY:l~d to en,!;!!l~_GQ!rectr~~_\llts. During the simulation, the jobs waiting to be executed are not the only transactions waiting to use the simulated CPU. Transactions representing the scheduler and dispatcher also require this facility. Therefore, we must ensure that only one transaction enters the CPU at any given time since this is not a multiprocessing environment. Logic switches are set and facility usage tested by every transaction requesting the CPU. GPSS IMPLEMENTATION With this design philosophy, we proceed to outline the representation of the various HASP and O.S. entities at each node. The logical structure of the simulation process occurring at each node, shown in the flow charts of Figures 5 through 8, is summarized in the following paragraphs. 5 Jobs Each job is represented by one GPSS transaction with parameters containing information such as creation time, number of milliseconds that will be executed, size of core requested, etc. The parameters are referenced throughout the simulation to keep a record of what was done, and indicate what the next step will be. In this way, by moving the transaction from one section of the model to another, different stages of execution can be indicated. HASP and O.S. initiators * It is impossible to measure accurately (within 10 percent) the amount of system time required for a given job. In a period of simulated time T, an error Es = TsNse will be accumulated, where Ns is the number of relatively short (~1 ms) amounts of system time needed, Ts is the total time spent on each of these short intervals, and e is the percentage error in the simulated time given to this operation. Similarly, the error for long intervals E, can be shown to be Tl~1I.[,e where T, and Ni are as above for some longer periods (~1000 ms). The simulation shows the ratio TsN,j T,N, is approximately 2, resulting in a greater error with the smaller time unit. There are two sets of initiators, one for HASP, and another for O.S., each requiring the same information about the jobs they are servicing. The HASP initiator for a specific job must be dormant while the O.S. initiator is running. Therefore, seven transactions are created, each of which represents either a HASP or O.S. initiator. Each transaction is created as a HASP initiator and put in an inactive state awaiting the arrival of jobs. After the initia- 128 National Computer Conference, 1973 tive state, it must be taken off the current events chain and placed on a chain specifically representing that wait condition. Queues All HASP and O.S. queues are represented by user chains as discussed earlier. In addition to facilitating ordering of objects on the queues, chains gather the proper waiting time and size statistics automatically. CPU-scheduler ASSIGN NUMBER, CLASS The scheduler has the responsibility of determining which task will next get control of the CPU. The scheduler is represented by one high priority transaction that unlinks jobs from the ready queue and lets them seize the ASSIGN NUMBER MAKE CLASS SETTINGS ASSIGN REQUESTS FOR EACH STEP (TIME, I la, CORE) UNLINK JOB FROM SPOOL ON SPOOL INITIALIZE JOB Figure 5-Job creation flow chart tor completes initiation of a job and places it on the O.S. 
queue, the HASP initiator becomes the O.S. initiator. This O.S. initiator flows through the core allocation and other resource allocation routines to request core space, and finally places the job on the ready queue to run. This initiator then becomes dormant waiting for the job (or job step) to complete. At each step completion, the initiator is awakened to request resource~ for the succeeding step. When the entire job completes, the initiator is returned to an inactive state where it again performs its HASP function. \Vhenever an initiator or job is to be put in an inac- A r-------i~ INITIALIZE STEP REQUESTS (Y•• ) ASSIGN CORE LINK TO READY QUEUE Figure 6-HASP and O,S. initiator flow chart Simulation 129 RETURN TO READY QUEUE (No) (Yes) UNLINK FROM READY QUEUE ASSIGN SLICE CHECK LIMITS DE -ALLOCATE CORE UPDATE STATS RECORD TIMES RESET CORE MAP FREE JOBS AWAITING CORE SET JOB TO OUT OF CORE OVERHEAD (Yes) (No) DISPATCHER INTERRUPT RUN JOB FOR THIS SLICE EVERY 2 SEC. ENTER COMPLETION TIME FREE INITIATOR ISSUE 1/0 REQUEST UPDATE ELAPSED STATS SET JOB 10 IN CORE OVERHEAD RETLRN 10 READY QUEUE LINK TO COMPLETED CHAIN Figure 7b-CPU-Scheduler flow chart (cont.) facility corresponding to the CPU. While this job is advancing the clock in the facility, no other transactions are permitted to enter. Although GPSS automatically permits only one job in each facility, this is not sufficient protection against more than one transaction entering the CPU. Therefore, this condition is explicitly tested by all transactions requesting the CPU. Multiple transactions are allowed to enter blocks representing II 0 requests, and other system processes, since these functions in the real system are actually carried out in parallel. When the CPU is released, control is returned to the scheduler, which allocates the facility to the next job on the ready queue. Dispatching priority assignment Figure 7a-CPC-Scheduler flow chart The HASP dispatching priority assignment is carried out by one final transaction. Every 2 seconds this transac- 130 National Computer Conference, 1973 Q DISPATCHER --~ ~, ..... WAIT TWO SECONDS ~~ INTERRUPT CPU NETWORK PERFORMANCE After having tested the simulation model, a hypothetical network consisting of three nodes with independent job streams and arrival rates, was investigated. Network balance was maintained using a load-leveling algorithm on the network that periodically (about five times the arrival rate of jobs at the busiest center) examined the queues at each center. The number of jobs in the queue of the busiest center was compared to the number of jobs in the queue of the least busy center, and this ratio used as the percentage of jobs to be sent from the busiest center to the lightest center. This distributed the number of jobs in the network so that each center would be utilized to the maximum possible degree. Naturally, the users submitting jobs to the highest center experienced an increase in turnaround time, but this is outweighed by the increased throughput for the network. To demonstrate how the simulator could be used to evaluate loadleveling or other network balancing algorithms, two simulation runs were made: the first having no communication between centers, and the second with the load-leveling algorithms implemented as described above. Performance data for the two networks, evaluated according to the criteria outlined earlier, is illustrated in Table II. 
TABLE II-System Perfonnance Measures for a Hypothetical Network Without Load Leveling " RE-ASSIGN DISPATCHING PRIORITY " RE-LINK JOBS TO READY QUEUE Figure 8-Dispatcher flow chart tion is given control of the simulator, and proceeds to reassign the dispatching priority of the jobs on the ready queue and those jobs currently issuing I/O requests. The job in control of the CPU (if any) is interrupted, and placed back on the ready queue according to its new priority. When all of the re-ordering is complete, the scheduler is freed, and the dispatcher is made dormant for another two seconds. Turnaround Time* Center 1 Center 2 Center 3 Network Average Queue Length Center 1 Center 2 Center 3 Network 99 100 44 With Load Leveling 90 89 80 95 87 125 4 "'0 43 42 25 23 30 CPU Utilization Center 1 Center 2 Center 3 Network .985 .902 .518 .802 .982 .952 .931 .955 Core Utilization Center 1 Center 2 Center 3 Network .685 .698 .326 .569 .713 .684 .668 .688 System Throughput** Center 1 Center 2 Center 3 Network * Relative time units. ** Jobs per hour. 330 196 98 624 256 256 234 746 Simulation CONCLUSIONS NETWORK SIMULATION RESULTS Validation Our simulation model for a network center is a valid tool for measuring and evaluating network performance only if we accurately simulate the intercommunication among the network centers and the control of jobs within each center. Therefore, an essential goal of our simulation effort was to verify the accuracy of representing the interaction among the simulated entities of ILLINET. Frequently, spot checks were made and tests were designed to ensure that the proper correspondence existed between the real and simulated environments. Hence, the evaluation measurements taken, effectively predict the expected system performance of future networks. System evaluation The GPSS simulation of the IBM 360;75 was used to develop and test a new priority scheme suggested as an alternative to the present system used on the 360. An initial static priority assignment was determined which uses user estimates of CPU, I/O requests and core required by a job. Subsequent priority adjustments are made to reward a job for its long wait in the system or to restore some minimum level of utilization of the CPU and core. Test runs of the new priority scheme on the simulator suggest very substantial improvements in terms of minimizing turnaround time and utilizing system resources. The emphasis is on evaluating scheduling disciplines since the only degree of freedom open to a network manager to affect system congestion is a choice of scheduling algorithms with priority assignments. (A network manager can seldom affect the arrival rate, service rate or the network configuration on a short term basis.) The simulation was then extended to a three node network to study the effect of implementing load-leveling and other network balancing algorithms. Simulation runs show an improved turnaround time for heavily loaded centers and at the same time a larger increase in total throughput and utilization of network resources. THE ROAD AHEAD Until recently, efforts to measure computer system performance have centered on the measurement of resource (including processor) idle time. A major problem with this philosophy is that it assumes that all tasks are of roughly equal value to the user and, hence, to the operation of the system. 131 As an alternative to the methods used in the past, we have proposed a priority assignment technique designed to represent the worth of tasks in the system. 
6 We present the hypothesis that tasks requiring equivalent use of resources are not necessarily of equivalent worth to the user with respect to time. We would allow the option for the user to specify a "deadline" after which the value of his task would decrease, at a rate which he can specify, to a system determined minimum. Additionally, the user can exercise control over the processing of his task by specifying its reward/ cost ratio which, in turn, determines the importance the installation attaches to his requests. The increased flexibility to the user in specifying rewards for meeting deadlines yields increased reward to the center. The most important innovation in this approach is that it allows a computing installation to maximize reward for the use of resources while allowing the user to specify deadlines for his results. The demand by users upon the resources of a computing installation is translated into rewards for the center. Thus, the computing installation becomes cost effective, since, for a given interval of time, the installation can process those tasks which return the maximum reward. Using our network simulator to demonstrate the efficacy of this technique is the next step in the long road to achieving economic viability in network computers. ACKNOWLEDGMENTS We are particularly grateful to Mr. William J. Barr of Bell Laboratories for inviting us to write this paper. His constant encouragement and his previous endeavors in the Department of Computer Science at the University of Illinois at Urbana-Champaign have made this work possible. This research was supported in part by the National Science Foundation under Grant No. NSF GJ 28289. REFERENCES 1. Simpson, T. H., Houston Automatic Spooling Priority System-II (Version 2), International Business Machines Corporation, 1969. 2. Schrage, L., "A Proof of the Optimality of the Shortest Remaining Processing Time Discipline," Operations Research, Vol. 16, 1968. 3. Denning, P., "The Working Set Model for Program Behavior," Communications of the ACM, Vol. 11, No.5, 1968. 4. - - - , General Purpose Simulation System/360 Introductory User's Manual GH20-0304-4, International Business Machines Corporation, 1968. 5. Salz, F., A GPSS Simulation of the 360/75 Under HASP and O.s. 360, University of Illinois, Report ~o. UIUCDCS-R72-528, 1972. 6. Bowdon, E. K., Sr., and Barr, W. J., "Cost Effective Priority Assignment in :--.Ietwork Computers," Proceedings of the Fall Joint Computer Conference, 1972. ACCNET-A corporate computer network by MICHAEL L. COLEMAN Aluminum Company of America Pittsburgh, Pennsylvania Corporation machines;38 the Distributed Computer System (DCS) at the University of California at Irvine; IS the Michigan Educational Research Information Triad, Inc. (MERIT), a joint venture' between Michigan State U niversity, Wayne State University, and the University of Michigan;2.12.30 the OCTOPUS System at the Lawrence Berkeley Laboratory;41 the Triangle Universities Computation Center (TUCC) Network, a joint undertaking of the Duke, North Carolina State, and North Carolina Universities;,'4 ad the TSS Network, consisting of interconnected IBM 360/67s.39.47.53 But perhaps the most sophisticated network in existence today is the one created by the Advanced Research Projects Agency (ARPA), referred to as the ARPA network. 
INTRODUCTION

The installation of a Digital Equipment Corporation DEC 10, in close proximity to an existing IBM 370/165, initiated an investigation into the techniques of supporting communication between the two machines. The method chosen, using a mini-computer as an interface, suggested the possibility of broadening the investigation into a study of computer networks-the linking of several large computer systems by means of interconnected mini-computers. This paper explains the concept of a network and gives examples of existing networks. It discusses the justifications for a corporate computer network, outlines a proposed stage-by-stage development, and analyzes and proposes solutions for several of the problems inherent in such a network. These include: software and hardware interfaces, movement of files between dissimilar machines, and file security.

WHAT IS A NETWORK?

A computer network is defined to be "an interconnected set of dependent or independent computer systems which communicate with each other in order to share certain resources such as programs or data-and/or for load sharing and reliability reasons."19 In a university or a research environment, the network might consist of interconnected time-sharing computers with a design goal of providing efficient access to large CPUs by a user at a terminal. In a commercial environment a network would consist primarily of interconnected batch processing machines with a goal of efficiently processing a large number of programs on a production basis. One example of the use of a network in a commercial environment would be preparing a program deck on one computer, transmitting it to another computer for processing, and transmitting the results back to the first computer for output on a printer.
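As a toy illustration of that workflow, the following self-contained sketch in modern Python walks a job deck through the three steps just described. The in-process queues merely stand in for real transmission links; none of the names below are part of ACCNET or of any existing system.

    # Toy sketch: prepare a deck on one machine, ship it to a second machine for
    # processing, and ship the results back for printing. Real transport, job
    # numbering, and spooling are replaced by in-process queues.
    import queue

    to_remote = queue.Queue()   # stands in for the communication link A -> B
    to_local = queue.Queue()    # stands in for the link B -> A

    def submit_deck(deck):
        to_remote.put(deck)                    # "transmit" the program deck

    def remote_batch_processor():
        deck = to_remote.get()                 # remote machine receives the deck
        output = "OUTPUT FOR: " + deck         # pretend to compile and run it
        to_local.put(output)                   # transmit results back

    def print_results():
        print(to_local.get())                  # local printer output

    submit_deck("JOB CARD + FORTRAN SOURCE + DATA CARDS")
    remote_batch_processor()
    print_results()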
OTHER NETWORKS

Functioning networks have been in existence for several years.4,19,36 These include: CYBERNET, a large commercial network consisting of interconnected Control Data Corporation machines;38 the Distributed Computer System (DCS) at the University of California at Irvine;18 the Michigan Educational Research Information Triad, Inc. (MERIT), a joint venture between Michigan State University, Wayne State University, and the University of Michigan;2,12,30 the OCTOPUS System at the Lawrence Berkeley Laboratory;41 the Triangle Universities Computation Center (TUCC) Network, a joint undertaking of the Duke, North Carolina State, and North Carolina Universities;54 and the TSS Network, consisting of interconnected IBM 360/67s.39,47,53 But perhaps the most sophisticated network in existence today is the one created by the Advanced Research Projects Agency (ARPA), referred to as the ARPA network.9,20,22,28,33,34,40,42,44,46

The ARPA network is designed to interconnect a number of various large time-shared computers (called Hosts) so that a user can access and run a program on a distant computer through a terminal connected to his local computer. It is set up as a message service where any computer can submit a message destined for another computer and be sure it will be delivered promptly and correctly. A conversation between two computers has messages going back and forth similar to the types of messages between a user console and a computer on a time-shared system. Each Host is connected to the network by a mini-computer called an Interface Message Processor (IMP). A message is passed from a Host to its IMP, then from IMP to IMP until it arrives at the IMP serving the distant Host, who passes it to its Host. Reliability has been achieved by efficient error checking of each message, and each message can be routed along two physically separate paths to protect against total line failures. The ARPA network was designed to give an end-to-end transmission delay of less than half a second. Design estimates were that the average traffic between each pair of Hosts on the network would be .5 to 2 kilobits per second with a variation between 0 and 10 kilobits per second, and the total traffic on the network would be between 200 and 800 kilobits per second for a 20-IMP network.20 To handle this load, the IMPs were interconnected by leased 50KB lines. For the initial configuration of the ARPA network, communication circuits cost $49,000 per node per year and the network supports an average traffic of 17 kilobits per node. Each IMP costs about $45,000 and the cost of the interface hardware is an additional $10,000 to $15,000.23 The IMPs are ruggedized and are expected to have a mean time between failures of at least 10,000 hours-less than one failure per year. They have no mass storage devices and thus provide no long term message storage or message accounting. This results in lower cost, less down time, and greater throughput performance.46

TYPES OF NETWORKS

There are three major types of networks: Centralized, Distributed, and Mixed.19 A Centralized network is often called a "Star" network because the various machines are interconnected through a central unit. A network of this type either requires that the capabilities of the central unit far surpass those of the peripheral units or it requires that the central unit does little more than switch the various messages between the other units. The major disadvantage of a network of this type is the sensitivity of the network to failures in the central unit, i.e., whenever the central unit fails, no communication can occur. The most common example of this type of network is one consisting of a single CPU linked to several remote batch terminals.

A Distributed network has no "master" unit. Rather, the responsibility for communication is shared among the members; a message may pass through several members of the network before reaching its final destination. For reliability each unit in the network may be connected to at least two other units so that communication may continue on alternate paths if a line between two units is out. Even if an entire unit is disabled, unaffected members can continue to operate and, as long as an operable link remains, some communication can still occur. The ARPA network is an example of a Distributed network.

A Mixed network is basically a distributed network with attached remote processors (in most cases, batch terminals) providing network access to certain locations not needing the capability of an entire locally operated computer system. These remote locations are then dependent on the availability of various central CPUs in order to communicate with other locations.

Within a network, two types of message switching may occur: circuit switching and packet switching. Circuit switching is defined as a technique of establishing a complete path between two parties for as long as they wish to communicate and is comparable to the telephone network. Packet switching is breaking the communication into small messages or packets, attaching to each packet of information its source, destination, and identification, and sending each of these packets off independently to find its way to the destination. In circuit switching, all conflict and allocation of resources must be resolved before the circuit can be established, thereby permitting the traffic to flow with no conflict. In packet switching, there is no dedication of resources and conflict resolution occurs during the actual flow. This may result in somewhat uneven delays being encountered by the traffic.

WHY A NETWORK?

By examining the general characteristics of a network in the light of a corporate environment, specific capabilities which provide justification for the establishment of a corporate computer network can be itemized.25
These are:

    load balancing
    avoidance of data duplication
    avoidance of software duplication
    increased flexibility
    simplification of file backup
    reduction of communication costs
    ability to combine facilities
    simplification of conversion to a remote batch terminal
    enhancement of file security

Load balancing

If a network has several similar machines among its members, load balancing may be achieved by running a particular program on the machine with the lightest load. This is especially useful for program testing, e.g., a COBOL compilation could be done on any IBM machine in the network and achieve identical results. Additionally, if duplicate copies of production software were maintained, programs could be run on various machines of the network depending on observed loads.

Avoidance of data duplication

In a network, it is possible to access data stored on one machine from a program executing on another machine. This avoids costly duplication of various files that would be used at various locations within the corporation.

Avoidance of software duplication

Executing programs on a remote CPU with data supplied from a local CPU may, in many cases, avoid costly software duplication on dissimilar machines. For example, a sophisticated mathematical programming system is in existence for the IBM 370. With a network, a user could conversationally create the input data on a DEC 10 and cause it to be executed on the 370. Without a network, the user would either have to use a more limited program, travel to the 370 site, or modify the system to run on his own computer.

Flexibility

Without a network each computer center in the corporation is forced to re-create all the software and data files it wishes to utilize. In many cases, this involves complete reprogramming of software or reformatting of the data files. This duplication is extremely costly and has led to considerable pressure for the use of identical hardware and software systems within the corporation. With a successful network, this problem is drastically reduced by allowing more flexibility in the choice of components for the system.

Simplification of file backup

In a network, file backup can be achieved automatically by causing the programs which update the file to create a duplicate record to be transmitted to a remote machine where they could be applied to a copy of the data base or stacked on a tape for batch update. This would eliminate the tedious procedure of manually transporting data from one machine to another; the resulting inherent delay in the updates would be eliminated.11

Reduction of communication costs

The substitution of a high bandwidth channel between two separate locations for several low bandwidth channels can, in certain cases, reduce communication costs significantly.

Ability to combine facilities

With a network, it is possible to combine the facilities found on different machines and achieve a system with more capability than the separate components have individually. For example, we could have efficient human interaction on one machine combined with a computational ability of a second machine combined with the capability of a third machine to handle massive data bases.

Simplification of conversion

Converting a site from its own computer to a remote batch terminal could be simplified by linking the computer at the site into the network during the conversion.
Enhancement of file security

By causing all references to files which are accessible from the network to go through a standard procedure, advanced file security at a higher level than is currently provided by existing operating systems may be achieved. This will allow controlled access to records at the element level rather than at the file level.

EXISTING SITUATION

The existing configuration of the DEC 10 installation provides a 300 (to be extended to 1200) baud link to the 370 via a COMTEN/60, a mini-computer based system which provides store-and-forward message switching capability for the corporate teletype network. This link is adequate to support the immediate needs of a Sales Order Entry System but is totally inadequate for the general capability of making the computational power and the massive file storage of the 370 available to a user on the DEC 10. Five DATA 100 terminals provide remote batch service into the 370 for users at various locations including three plants and a research center. Most of the other plants have medium scale computer systems to support their local data processing needs. All make extensive use of process control mini-computers and two have UNIVAC 494 systems which can handle both real-time control and batch data processing. Approximately 25 interactive CRTs scattered throughout various sales offices across the country have recently been installed to upgrade our Sales Order Entry System. Each terminal is connected to the DEC 10 on a dial-up 300 baud line.

PROPOSED SOLUTION

The most obvious solution to the problem of 370-DEC 10 communication would be to connect the DEC 10 to the 370 in a "back-to-back" fashion. To provide an upward flexibility, however, it is proposed that rather than connecting the machines in that way, they will be connected using a mini-computer as an interface. By designing the system which controls their interaction with a network approach, additional communication links may be obtained with a relatively small software investment. For example, if in the future, our research center obtains a large computer that they wish to incorporate into the communications process of the other two, an additional mini-computer would be placed there and connected via a communication line to the other. This approach has several advantages. First, by going through a mini-computer, each of the two interfaces can be very carefully debugged in isolation and thus not affect the other machine. Second, once an IBM interface to the mini-computer is designed, one can connect any IBM machine into the network without rewriting any of the other interfaces. We would not have to write an IBM to UNIVAC interface, an IBM to CDC interface, an IBM to Honeywell interface, etc. Third, the only change necessary in the existing portion of the network, as the network expands, would be to inform the mini-computers of the presence of the other machines.

System description

In order to effectively describe a system as potentially complex as this one, we shall make use of techniques being developed under the classification of "Structured Programming."17,37,48,55,56 The system will be broken down into various "levels of abstraction," each level being unaware of the existence of those above it, but being able to use the functions of lower levels to perform tasks and supply information.
When a system is specified in terms of levels, a clear idea of the operation of the system may be obtained by examining each level, starting from the top, and continuing down until further detail becomes unimportant for the purposes of the specification. Let us now examine the first few levels of a portion of the proposed system. The top-most level is level 6, under that is level 5, and so on. We shall look at what occurs in the case of a user at a terminal on the DEC 10 submitting a program to a distant IBM 370 under HASP.

• Level 6

On level 6 are found user and program processes. All interaction with the user or with a program written by the user occurs on this level. In fact, after this level is completely specified, the User Manual for the system can be written. In our example, an examination of what is happening would show the following steps: User creates the input file and a file for the output; User logs onto the network specifying his ID number; User types "SUBMIT" command specifying the input file, the output file, and the Host on which the program is to be run. This submit command calls on the HASP Submit-Receive function on level 5; User waits a brief period until he gets an "OK" from the terminal signifying that the program has been submitted. He is then free to either perform other actions or to sign off of the network; At some later time the user receives an "output ready" message on his terminal; User can now examine his output file.

• Level 5

On level 5 are found the HASP Submit-Receive function, HSR, and functions to perform network access control, file access control, and remote program control. Let us examine the actions of the HSR function applied to our example: The HSR function obtains the name of the HASP-READER process of the specified Host. It then calls on a level 4 function to pass the input file to that process. When the level 4 function which controls process-to-process communication is completed, it will return a value corresponding to the job number that HASP has assigned; The HSR function sends an "OK" to the user. It then obtains the name of the HASP-WRITER process on the specified Host and calls on a level 4 function to pass the job number and to specify the output file to the HASP-WRITER. Control returns when the output file is complete; The HSR function then sends an "OUTPUT READY" message to the user.

• Level 4

On level 4 are found the functions which control the file descriptors, file access, and process-to-process communication. Examining the actions of the process-to-process communication function, PPC, applied to our example, we find: The PPC function converts the name of the process into a "well-known port" number and then establishes a logical link to the desired process; It then formulates a message containing the information to be passed and uses a level 3 function to transmit the message; It then receives a message in reply (which contains the job number in one case, and the output, in another). It passes this up to level 5 after destroying the links.

• Level 3

Level 3 contains, among others, the function which transfers a message from one Host to another. To do this it: Takes the message, breaks it into pages, and calls a level 2 function to transmit each page; When the last page has been transmitted, it waits for an acknowledgment; If the acknowledgment indicates that a reply is being sent, it receives each page of the reply and passes it up to level 4.
• Level 2

On level 2 is handled the passing of pages. The steps are: The page is transferred from the Host to its IMP; The page is then translated into the standard network representation and broken into packets; A level 1 function is called to transmit each packet.

• Level 1

At level 1 is handled the details of transmitting a packet from IMP to IMP. This includes retransmission in case of errors.

Stages of development

In order to allow the concept of a corporate computer network to be evaluated at minimum expense, it is desirable to break the development into discrete stages, each stage building on the hardware and software of the previous stage to add additional capability.

• Stage 1

This first stage would connect the DEC 10 to the local IBM 370/165 by using a single mini-computer. It would allow a user on the DEC 10 to conversationally build a program on a terminal and submit it to the 370 to be run under HASP. His output would be printed either at the 370, at the DEC 10, or at his terminal. This stage would also support the transfer of files consisting solely of character data from one machine to the other. The mini-computer hardware required for this stage would include: one CPU with 16-24K of memory, power monitor and restart, autoload, and teletype; two interfaces, one to the 370 and one to the DEC 10; a real time clock; and a cabinet. The approximate purchase price would be $25,000 to $35,000 with a monthly maintenance cost of approximately $300. In addition, a disk and controller should be rented for program development. This cost is approximately $500 per month and would be carried for the remaining stages.

• Stage 2

The second stage would remove the restriction on file transfer and allow files consisting of any type of data to be accessed from the other machine. At this stage, strict security controls would be integrated into the system. The additional hardware required for this stage would include an additional CPU with 8K of memory and adaptors to interconnect the two CPUs. The approximate purchase cost would be $9,000-$12,000, with a monthly maintenance cost of approximately $75.

• Stage 3

This stage would expand the network to include computers at other locations. Additional hardware at the original site would include one synchronous communication controller for each outgoing line at a cost of $2,000-$2,500 with a maintenance cost of $25, and appropriate modems. Total cost for the original site, assuming two outgoing lines, would be between $36,000 and $49,500, excluding disk rental, modems, and communication lines.

• Stage 4

This stage could be developed in parallel with stage 3. It would add the capability for a user on a terminal attached to one machine to submit and interact with a program executing on the other machine. No additional hardware would be required.

• Stage 5

This stage consists of the design and implementation of automatic back-up procedures. Most of the preliminary analysis can be done in parallel with stages 2-4. These procedures would automatically create duplicate transactions of updates to critical files and have them routed to an alternate site to be applied to the back-up data base. No additional hardware is required.

HANDLING OF FILES IN A NETWORK

The handling of files in a non-homogeneous, distributed network poses several complex problems. These include control of access and transfer of information between dissimilar machines.
Control of access

That any system supporting multiple, simultaneous use of shared resources requires some sort of flexible, easy to use method of controlling access to those resources seems obvious to everyone (with the possible exception of the designers of IBM's OS/360), the main problem being how to provide the control at a reasonable cost. Restricting ourselves just to file access control, we see many potential methods with varying degrees of security and varying costs.10,13,14,31,43 All provide control at the file level, some at the record level, and others at the element level. By designing our system with a Structured Programming approach, it should be possible to modify the method we choose, upgrading or downgrading the protection until a cost-benefit balance is reached.

Most designers of file access control systems have mentioned encryption of the data; we shall be no different. Apparently finding the cost prohibitive, they have failed to include this capability in their final product. In the proposed network, however, translation between the data representations of dissimilar machines will be performed (see below), so the added cost of transforming from a "scrambled" to an "unscrambled" form will be small.

Each file access control system is based on a method which associates with each user-file pair a set of descriptors listing the rights or privileges granted to that user for that file (e.g., Read Access, Write Access, Transfer of Read Access to another user). Conceptualized as entries in a matrix, these descriptors are almost never stored as such due to its sparseness. Rather, they are stored as lists, either attached to each element of a list of users or attached to each element of a list of files.

Assuming that we have a system for controlling file access, one design question for a distributed network is where to store the file access descriptors. For example, let us look at a network with three machines: A, B, and C, and a file, F, located at A but created by a user at B. To be accessible from the other machines, the file must be known by them and therefore, each machine must have a file descriptor stating that file F is located at A. If we also distribute the file access descriptors, an unauthorized user at C could gain access to the file by obtaining control of his machine and modifying the file access descriptors. Hence, each file access descriptor should be stored at the same location as the file it protects.

Transfer of information

The complexity of transferring information between two machines is increased by an order of magnitude when dissimilar machines are involved.1,7,8 Using ASCII as the standard network code allows the interchange of files containing character data but does not address the problem of different representations of numerical data, e.g., packed decimal, short floating point, long floating point, etc. Two alternatives present themselves: either allow each machine to translate from the representation of every other machine to its own or use a standard network representation and have each machine translate between its own and the network's. The first is attractive when only a few different types of machines will be allowed on the network (if there are N different types of machines, then N(N-1) translation routines might have to be written).
The second alternative requires more effort in developing the standard network representation, but is really the only choice when the number of different types is larger than three or four. Another problem is the large amount of translation that must take place. It may not be desirable to place this CPU-laden task on a time-sharing machine for fear of degrading response time, so the solution seems to lie in executing the translation within the IMPs. If performing translation interferes with the ability of the IMP to perform communication, an additional CPU can be attached to each in order to perform this task. With hardware costs decreasing 50 percent every two or three years, this seems an attractive solution.

INTERFACES

IMP-Host interface

The ARPA network is optimized toward supporting terminal interaction.28 A commercial network must be optimized toward maximizing throughput of lengthy data files, which produces large peak loads requiring high bandwidth channels between each Host and its IMP. In order to allow an IMP to communicate with its Host with a minimum of CPU intervention by either party, data must be transferred directly between the memory of the IMP and the memory of the Host. This can be achieved by connecting to an equivalent of the memory bus of the DEC 10 or multiplexor channel of the 370. With this type of interconnection, it will be necessary to configure the software so that each member of the communicating partnership appears to the other member as if it were a peripheral device of some sort, presumably a high-speed tape drive. Communication, therefore, would take place by one member issuing a READ while the other member simultaneously issues a WRITE.24

IMP-IMP interface

The IMPs will be linked by standard synchronous communication interfaces. Initial plans call for 40.8KB full duplex leased lines, but 19.2KB lines could also be used. A Cyclical Redundancy Check will provide detection of errors and cause the offending packet to be retransmitted.

Software interfaces

One of the main reasons for using mini-computers between the Hosts is to insure that the number of interface programs which must be written only grows linearly with the number of different types of Hosts. The effort in writing subsequent versions of the IMP-Host interface can be minimized by at least two methods:
1. Put as much of the system software as possible into the IMPs. Make use of sophisticated architecture3 (e.g., multi-processor mini-computers, read-only memory) to obtain the power required.
2. For that portion of the system which resides in the Host, write the software using a standard, high-level language (e.g., FORTRAN) for as much of the code as possible.

REFERENCES

1. Anderson, Robert, et al., Status Report on Proposed Data Reconfiguration Services, ARPA Network Information Center Document No. 6715, April 28, 1971.
2. Aupperle, Eric, "MERIT Computer Network: Hardware Considerations," Computer Networks, R. Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 49-63.
3. Bell, G., Cady, R., McFarland, H., Delagi, B., O'Laughlin, J., Noonan, R., "A New Architecture for Mini-Computers-The DEC PDP-11," Proc. AFIPS 1970 SJCC, Vol. 36, AFIPS Press, Montvale, N.J., pp. 657-675.
4. Bell, G., Habermann, A. N., McCredie, J., Rutledge, R., Wulf, W., "Computer Networks," Computer, IEEE Computer Group, September/October, 1970, pp. 13-23.
5. Bjorner, Dines, "Finite State Automation-Definition of Data Communication Line Control Procedures," Proc. AFIPS 1970 FJCC, Vol.
37, AFIPS Press, Montvale, N.J., pp. 477-49l. 6. Bowdon. Edward, K., Sr., "~etwork Computer Modeling" Proc. ACM Annual Conference, 1972, pp. 254-264. ACCNET-A Corporate Computer Network 139 7. Bhushan, Abhay, The File Transfer Protocol, ARPA Network Information Center Document No. 10596, July 8, 1972. 8. Bhushan, Abhay, Comments on the File Transfer Protocol, ARPA Network Information Center Document No. 11357, August 18, 1972. 9. Carr, C. Stephen, Crocker, Stephen D., Cerf, Vinton G., "HostHost Communication Protocol in the ARPA Network" Proc. AFIPS 1970 &/CC, Vol. 36, AFIPS Press, Montvale, N.J., pp. 589597. 10. Carroll, John M., Martin, Robert, McHardy, Lorine, Moravec, Hans, "Multi-Dimensional Security Program for a Generalized Information Retrieval System," Proc. AFIPS 1971 FJCC, Vol. 39, AFIPS Press, Montvale, N.J., pp. 571-577. 11. Casey, R G, "Allocation of Copies of a File in an Information Network," Proc. AFIPS 1972 &/CC, Vol. 40, AFIPS Press, Montvale, N.J., pp. 617-625. 12. Cocanower, Alfred, "MERIT Computer Network: Software Considerations," Computer Networks, R Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 65-77. 13. Conway, R W., Maxwell, W. L., Morgan, H. L., "On the Imple-mentation-of-S-ecurity Measures in Information Systems" Comm: of the ACM, Vol. 15, April, 1972, pp. 211-220. 14. Conway, Richard, Maxwell, William, Morgan, Howard, "Selective Security Capabilities in ASAP-A File Management System," Proc. AFIPS 1972 &/CC, Vol. 40, AFIPS Press, Montvale, N.J., pp. 1181-1185. 15. Crocker, Stephen D., Heafner, John F., Metcalfe, Robert M., Postel, Jonathan B., "Function-Oriented Protocols for the ARPA Computer Network," Proc. AFIPS 1972 &/CC, Vol. 40, AFIPS Press, Montvale, ~.J., pp. 271-279. 16. deMercado, John, "Minimum Cost-Reliable Computer Communication Networks," Proc. AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 553-559. 17. Dijkstra, E. W., "The Structure of the 'THE' Multiprogramming System," Comm. of the ACM, Vol. 11, May, 1968. 18. Farber, David, "Data Ring Oriented Computer Networks" Computer Networks, R Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 79-93. 19. Farben, David J., "Networks: An Introduction," Datamation, April, 1972, pp. 36-39. 20. Frank, Howard, Kahn, Robert E., Kleinrock, Leonard, "Computer Communication Network Design-Experience with Theory and Practice," Proc. AFIPS 1972 &/CC, Vol. 40, AFIPS Press, Montvale, N.J., pp. 255-270. 21. Frank, Howard, "Optimal Design of Computer Networks," Computer Networks, R Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 167-183. 22. Frank, H., Frisch, LT., Chou, W., "Topological Considerations in the Design of the ARPA Computer Network," Proc. AFIPS 1970 &/CC, Vol. 36, AFIPS Press, Montvale, N.J., pp. 581-587. 23. Frank, Ronald A., "Commercial ARPA Concept Faces Many Roadblocks," Computerworld, November 1,1972. 24. Fraser, A. G., "On the Interface Between Computers and Data Communications Systems," Comm. of the ACM, Vol. 15, July, 1972, pp. 566-573. 25. Grobstein, David L., Uhlig, Ronald P., "A Wholesale Retail Concept for Computer Network Management," Proc. AFIP.'; 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 889-898. 26. Hansen, Morris H., "Insuring Confidentiality of Individual Records in Data Storage and Retrieval for Statistical Purposes," Proc. AFIPS 1971 FJCC, Vol. 39, AFIPS Press, Montvale, N.J., pp. 579-585. 27. Hansler, E., McAuliffe, G. K., Wilkov, R S., "Exact Calculation of Computer Network Reliability," Proc. AFIP.'; 1972 FJCC, Vol. 
41, AFIPS Press, Montvale, N.J., pp. 49-54. 28. Heart, F. E., Kahn, R. E., Ornstein, S. M., Crowther, W. R., Walden, D. C., "The Interface Message Processor for the ARPA Computer ~etwork," Proc. AFIP.'; 1970 &/CC, Vol. 37, AFIPS Press, Montvale, N.J., pp. 551-1567. 29. Hench, R R, Foster, D. F., "Toward an Inclusive Information Network," Proc. AFIP.'; 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 1235-1241. 30. Herzog, Bert, "MERIT Computer Network," Computer Networks, R. Rustin (Ed.), Prentice-Hall, Englewood CHiis, ~.J., 1972, pp. 45-48. 31. Hoffman, Lance J., "The Formulary Model for Flexible Privacy and Access Controls," Proc. AFIPS 1971 FJCC, Vol. 39, AFIPS Press, Montvale, N.J., pp. 587-601. 32. Hootman, Joseph T., "The Computer Network as a Marketplace," Datamation, April, 1972, pp. 43-46. 33. Kahn, Robert, "Terminal Access to the ARPA Computer Network" Computer Networks, R Rustin (Ed.), Prentice-Hall, Englewood Cliffs, X.J., 972, pp. 147-166. 34. Kleinrock, Leonard, "Analytic and Simulation Methods in Computer Network Design," Proc. AFIPS 1970 &/CC, Vol. 36, AFIPS Press, Montvale, N.J., pp. 569-579. 35. Kleinrock, Leonard, "Survey of Analytical Methods in Queueing Networks," Computer Networks, R Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 185-205. 36. Lichtenberger, W.,· (Ed), Tentative SpecT[ications for a Network of Time-Shared Computers, Document No. M-7, ARPA, September 9,1966. 37. Liskov, B. H., "A Design Methodology for Reliable Software Systems," Proc. AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 191-199. 38. Luther, W. J., "Conceptual Bases of CYBERNET," Computer Networks, R Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 111-146. 39. McKay, Douglas B., Karp, Donald P., "IBM Computer Network/ 440," Computer Networks, R. Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 27-43. 40. McQuillan, J. M., Cro\\iher, W. R, Cosell, B. P., Walden, D. C., Heart, F. E., "Improvements in the Design and Performance of the ARPA Network," Proc. AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 741-754. 41. Mendicino, Samuel F., "OCTOPUS: The Lawrence Radiation Laboratory Network," Computer Networks, R Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 95-110. 42. Metcalfe, Robert M., "Strategies for Operating Systems in Computer Networks," Proc. ACM Annual Conference, 1972, pp. 278281. 43. Needham, R M., "Protection Systems and Protection Implementations," Proc. AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 571-578. 44. Ornstein, S. M., Heart, F. E., CroMher, W. R, Rising, H. K., Russell, S. B., Michel, A., "The Terminal IMP for the ARPA Computer Network," Proc. AFIPS 1972 &/CC, Vol. 40, AFIPS Press, Montvale, N.J., pp. 243-254. 45. Roberts, Lawrence G., "Extensions of Packet Communication Technology to a Hand Held Personal Terminal," Proc. AFIP.'; 1972 &/CC, Vol. 40, AFIPS Press, Montvale, N.J., pp. 295-298. 46. Roberts, Lawrence G., Wessler, Barry D., "Computer Network Development to Achieve Resource Sharing," Proc. AFIPS 1970 &/CC, Vol. 36, AFIPS Press, Montvale, N.J., pp. 543-549. 47. Rutledge, Ronald M., Vareha, Albin L., Varian, Lee C., Weis, Allan R., Seroussi, Salomon F., Meyer, James W., Jaffe, Joan F., Angell, Mary Anne K., "An Interactive Network of Time-Sharing Computers," Proc. ACM Annual Conference, 1969. pp. 431-441. 48. Sevcik, K. C., Atwood, J. W., Grushcow, M. S., Holt, R C., Horning, J. J., Tsichritzis, D., "Project SUE as a Learning Experience," Proc. 
AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 331-338.
49. Stefferud, Einar, "Management's Role in Networking," Datamation, April, 1972, pp. 40-42.
50. Thomas, Robert H., Henderson, D. Austin, "McROSS-A Multi-Computer Programming System," Proc. AFIPS 1972 SJCC, Vol. 40, AFIPS Press, Montvale, N.J., pp. 281-293.
51. Tobias, M. J., Booth, Grayce M., "The Future of Remote Information Processing Systems," Proc. AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 1025-1035.
52. Walden, David C., "A System for Interprocess Communication in a Resource Sharing Computer Network," Comm. of the ACM, Vol. 15, April, 1972, pp. 221-230.
53. Weis, Allan H., "Distributed Network Activity at IBM," Computer Networks, R. Rustin (Ed.), Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 1-25.
54. Williams, Leland H., "A Functioning Computer Network for Higher Education in North Carolina," Proc. AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 899-904.
55. Wulf, William A., "Systems for Systems Implementors-Some Experience From BLISS," Proc. AFIPS 1972 FJCC, Vol. 41, AFIPS Press, Montvale, N.J., pp. 943-948.
56. Wulf, W. A., Russell, D. B., Habermann, A. N., "BLISS: A Language for Systems Programming," Comm. of the ACM, Vol. 14, December, 1971, pp. 780-790.

A system of APL functions to study computer networks

by T. D. FRIEDMAN
IBM Research Laboratory
San Jose, California

A collection of programs and procedural conventions will be described which were developed as part of a larger study of modeling and design of computer networks. This work was conducted under Navy Contract N00014-72-C-0299, and was based on approaches developed by Dr. R. G. Casey, to whom I am indebted for his helpful suggestions and encouragement. The programs are written on the APL terminal system. For proper understanding of the programming language used, it is desirable for the reader to refer to Reference 1. These programs make it possible to create, modify and evaluate graph theoretic representations of computer networks in minutes while working at the terminal.

CONCEPTUAL FRAMEWORK

The overall conceptual framework for representing networks will first be discussed. We assume a set of fixed nodes located at geographically dispersed locations, some of which contain copies of a given data file. Certain nodes are interconnected by transmission links. Together, the nodes and links constitute a particular network configuration. Each node is assigned a unique identification number, and the links are likewise identified by link numbers. An arbitrary configuration of an n-node network is represented by a link list, which consists of an m-by-3 array of the m link numbers, each of which is followed by the identification of the two nodes it connects. For example, a six-node network with all possible links provided would be represented by the following link list:

     1  1  2
     2  1  3
     3  1  4
     4  1  5
     5  1  6
     6  2  3
     7  2  4
     8  2  5
     9  2  6
    10  3  4
    11  3  5
    12  3  6
    13  4  5
    14  4  6
    15  5  6

This network is depicted schematically in Figure 1. (Note that each link is represented in the figure by the least significant digit of its identification number.) Any other configuration of a six-node network is necessarily a subset of the preceding network. One such subset is the following configuration, representing the network shown in Figure 2.
     1  1  2
     4  1  5
    10  3  4
    14  4  6
    15  5  6

For certain operations, it is useful to represent a network by its connection matrix, in which the ith element of the jth row identifies the link connecting node i to j. If no link is present, or if i = j, then the element is zero. Thus, the fully connected six-node network described above would be characterized by the connection matrix:

    0  1  2  3  4  5
    1  0  6  7  8  9
    2  6  0 10 11 12
    3  7 10  0 13 14
    4  8 11 13  0 15
    5  9 12 14 15  0

Likewise, the configuration represented in Figure 2 would be characterized by the connection matrix:

    0  1  0  0  4  0
    1  0  0  0  0  0
    0  0  0 10  0  0
    0  0 10  0  0 14
    4  0  0  0  0 15
    0  0  0 14 15  0

BASIC NETWORK FUNCTIONS

In this section, functions for modeling and evaluating the networks are described.

Figure 1-Network topology

An APL function, FORMCM, creates the connection matrix as output when given the link list and the node count. The function OKCM determines whether a network configuration connects all nodes together. It is possible to determine whether all nodes are connected by calculating the inner product of a Boolean form of the connection matrix, repeated n-1 times for an n-node network. However, the OKCM function carries out this determination with somewhat less processing, by performing n-1 successive calculations of

    CMS ← CMS ∨ ∨/(CMS ∨.∧BCM)/BCM

where BCM is a Boolean form of the connection matrix, CM, i.e., BCM[I;J] = 1 if CM[I;J]≠0 or if I=J; BCM[I;J] = 0 otherwise; and CMS is initially defined as BCM.
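The connectivity test just described can be sketched as follows in modern Python rather than APL. This is an illustrative, straightforward Boolean-closure version using the same names BCM and CMS; the shortcut expression above is the paper's, and the implementation details below are assumptions.

    # Sketch of the connectivity check: BCM is the Boolean form of the connection
    # matrix with 1s on the diagonal, CMS starts as BCM, and a Boolean "or.and"
    # matrix product is applied n-1 times; the network is connected if CMS ends
    # up all 1s.

    def boolean_product(a, b):
        n = len(a)
        return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    def all_connected(cm):
        n = len(cm)
        bcm = [[1 if (cm[i][j] != 0 or i == j) else 0 for j in range(n)]
               for i in range(n)]
        cms = bcm
        for _ in range(n - 1):                  # n-1 successive calculations
            cms = boolean_product(cms, bcm)
        return all(all(row) for row in cms)     # connected iff CMS is all 1s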
OKCM determines that all nodes are connected if and only if at the conclusion CMS consists only of 1's.

The function UNION finds the union of two link lists, and presents the link list of the result in ascending order. For example, if we define LINKS1 as

     2  1  3
     4  1  5
     6  2  3
     7  2  4
    10  3  4
    14  4  6

and LINKS2 as

     1  1  2
     4  1  5
    10  3  4
    11  3  5
    15  5  6

then LINKS1 UNION LINKS2 results in:

     1  1  2
     2  1  3
     4  1  5
     6  2  3
     7  2  4
    10  3  4
    11  3  5
    14  4  6
    15  5  6

The CSP program searches for the shortest path spanning two nodes in a given network. It operates by first seeking a direct link between the nodes if one exists. If one does not, it follows each link in turn which emanates from the first node, and calls itself recursively to see if the second node can eventually be reached. A control parameter C is needed to limit the depth of recursion so as to avoid infinite regress. SPAN is a function to call CSP without requiring the user to specify the recursion control parameter, C. SPAN operates by calling CSP with the control parameter set to the node count minus 1, since at most n-1 links are required for any path in an n-node network. CSP and SPAN return only one set of links spanning the two given nodes, and this set consists of the fewest number of links possible. However, more than one such set of that number of links may be possible. The function MCSP operates exactly like CSP, except it finds all sets of the minimum number of links which span the two nodes. MSPAN corresponds to SPAN by allowing the user to omit the control terms, but calls MCSP rather than CSP.

Network traffic

We assume that the average amount of query and update activity emanating from each node to each file is known. This information is provided in a table called FILEACT, i.e., file activity.

Figure 2

On the basis of these data, the activity on each link of the network can be calculated when the location of the files is specified. The function LUACT calculates the link update activity, using CSP to find the sets of links between each node and each copy of the file. It then adds the appropriate update activity from the FILEACT table to the total transmission activity of each link. The function LQACT derives the query activity on each link using a similar approach. However, a query from a node need only be transmitted to the nearest copy of the file. A function, PATH, finds the set of links connecting a node to the nearest copy of a file located at one or more nodes. LQACT uses PATH to select the links affected, then accumulates the total query activity on all links. LACT finds the total activity on all links as the sum of LUACT and LQACT.

In the experiments run to date, five separate transmission capacities were available for each link, namely, 100, 200, 400, 1000 and 2000 kilobits per second. It is most economic to employ the lowest capacity line needed to handle the activity of each link. The function MINCAP accomplishes this goal by examining the activity of all links as determined by LUACT and LQACT, and it then calculates the minimal capacity needed in each link. If the activity of any link exceeds the maximal capacity possible, this is noted in an alarm.

Cost calculations

It is assumed that the costs are known for maintaining each possible link at each possible capacity in the network. This information is kept in TARTABLE, i.e., the tariff table. In addition, the costs to maintain any file at any node are given in the FMAINT table. By reference to the TARTABLE, the function FORMLTAR determines the monthly cost of all links, called LTAR. Using LTAR and the FMAINT table, the function TARRIF calculates the monthly cost of a configuration when given the nodes at which the files are located. A function, GETTARRIF, derives this same cost data starting only with the link list and the locations of the files. In turn, it calls FORMCM to develop the connection matrix, then it calls LQACT and LUACT to determine activity on all links, then MINCAP to determine minimal link capacities, then FORMLTAR to derive LTAR, and finally it calls TARRIF to determine the total cost. An abbreviated version of the latter function, called GETLTF, derives the link costs but does not find the total file maintenance cost.

NETWORK MODIFICATION FUNCTIONS

The cost of maintaining a network may on occasion be reduced by deleting links, by adding links, or by replacing certain links with others. The function SNIP removes links from a network whenever this lowers cost or leaves the cost unchanged.
SNIP must have the nodes specified at which files are located, and the link list. It proceeds by calculating the link activity, then it calls MINCAP to determine the capacity of each link. It deletes all links that carry no traffic. Then it attempts to delete each of the remaining links from the network, starting with the least active link. At each step, it calls OKCM to check that all nodes remain connected in the network-if not, that case is skipped. After it determines the cost effects of all possible deletions of single links, the one link (if any) is deleted which lowers cost most (or at least leaves it unchanged). Afterwards, it repeats the entire process on the altered network and terminates only when it finds that no additional links can be removed without increasing cost or disconnecting the network.

The function JOIN will add the one link, if any, to a specified node which most reduces cost. After one link is added, it repeats the process to determine if any others may also be added to that node. It follows a scheme similar to SNIP, except that a link will not be added if the cost is unchanged as a result. JOINALL accomplishes the same process on a network-wide basis, i.e., a link will be added anywhere in the network if it reduces the cost.

REPLACE substitutes new links for old ones whenever this reduces cost. It begins by calling SNIP to delete all links whose removal will not raise costs, then it selectively attempts to delete each of the remaining links and attach a new link instead to one of the nodes that the old link had connected. After trying all such possible replacements and calculating the resulting costs, it chooses the one replacement, if any, which lowers the cost the most. Then, the whole process is repeated on the modified network, and terminates only when no further cost reduction is possible.

MESSAGE-COST-ORIENTED FUNCTIONS

The preceding functions assume an "ARPA-type" design2,3 in which total costs are based on fixed rental rates for transmission lines for the links according to transmission capacities, plus the cost of maintaining files at specific nodes. However, networks may also be considered in which the costs are calculated according to message traffic across links. A family of functions have been written similar to the functions described above, but for which costs are calculated according to the moment-to-moment transmission activity. Those functions are described in this section.

A table, TC, specifies the transmission cost for sending messages over each link. It is assumed that the link capacities are fixed, and are given by the parameter LCAP. A function, FORMLCOST, uses TC and LCAP to create LCOST, the list of message transmission cost rates for the links. Rather than simply finding a shortest set of links connecting two nodes, in this case it becomes necessary to compare the different total transmission costs for each possible set of links. The function ECSP is provided to do this. ECSP is similar to MCSP, but instead of returning all sets of the minimal number of links connecting two given nodes, it returns only the most economical single set of links.
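As a rough illustration of the kind of depth-limited recursive search that CSP and ECSP perform, the following is a small sketch in modern Python rather than APL, in the ECSP flavour (cheapest total cost rather than fewest links). The link-list and cost representations, and the treatment of ties, are assumptions chosen for brevity, not the paper's implementation.

    # `links` maps a link number to the pair of nodes it joins, and `lcost` maps
    # a link number to its message cost rate; both are illustrative stand-ins.

    def ecsp(links, lcost, a, b, c):
        """Cheapest set of link numbers joining nodes a and b, using at most
        c links (the recursion-control parameter); None if no path is found."""
        if c <= 0:
            return None
        best = None
        for ln, (x, y) in links.items():
            if a not in (x, y):
                continue                      # only follow links touching node a
            nxt = y if x == a else x
            if nxt == b:
                cand = [ln]                   # direct link to the target node
            else:
                rest = ecsp({k: v for k, v in links.items() if k != ln},
                            lcost, nxt, b, c - 1)
                cand = None if rest is None else [ln] + rest
            if cand is not None:
                if best is None or (sum(lcost[k] for k in cand)
                                    < sum(lcost[k] for k in best)):
                    best = cand
        return best

    # ESP would simply call ecsp with c set to the node count minus 1.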
Like MCSP, ECSP requires a recursion control parameter. ESP (which corresponds to MSPAN) calls ECSP, and provides the recursion control parameter implicitly as the node count minus 1. LINKUACT and LINKQACT correspond to LUACT and LQACT, but the links between nodes are chosen by ECSP rather than CSP so as to obtain lowest cost. LQACT calls ROUTE, which operates like PATH except that it chooses the most economic set of links connecting a node to one of several copies of a file. LINKACT corresponds to LACT in the same way. The function COST corresponds to TARRIF, but it calculates the cost according to the total activity on each link times the component of LCOST corresponding to that link. The function TOTCOST calculates the sum of the costs over all links calculated by COST, and the function ALLCOSTS derives a table, COSTMATRIX, giving the total costs for each file when located at each possible node.

FICO creates a table of costs for all possible combinations of a file at different nodes, shown in Figure 3. The first column specifies the nodes at which the file is located and column 2 supplies the cost. The configuration used to derive this table was a fully connected six-node network.

Figure 3

FORMQM uses LINKQACT to create a table of query costs for each file when located at each node. FORMUM forms a similar table for update costs. TRIM corresponds to SNIP, and deletes links from a network until the cost no longer drops. SUPP (for supplement) corresponds to JOINALL, and adds links to a network until the cost no longer drops.

DIAGRAMMING FUNCTIONS

Functions are provided to diagram network configurations developed by the preceding functions. The diagram is prepared as a matrix of blank characters, with nodes indicated by parenthesized numbers, and links indicated by a line composed of a number or other character. Figures 1 and 2 are examples of such diagrams. DEPICT creates such a matrix when given the dimensions and the positions in the matrix at which the nodes are to be located. BLAZE draws a line between any two elements in the matrix, when given the location of the elements and the special character with which the line is to be drawn. CONNECT draws a line, using BLAZE, between two nodes, looking up the locations of each node in the picture matrix to do so. CONNECTALL diagrams an entire network configuration when given its link list as an argument. Each link is drawn using the least significant digit of the link number. Figures 1 and 2 were produced by CONNECTALL. A function called SETUP is used to initialize the link list, connection matrix and node count parameter for various preassigned network configurations. The statement "SETUP 5" for example initializes the prespecified network configuration number five.

MEMO-PAD FUNCTIONS

Provisions were made for a memorandum-keeping system which may be of interest to other APL users, inasmuch as it is not restricted just to the network study. The "niladic" function NOTE causes the terminal to accept a line of character input in which the user may write a comment or note for future references.
The note is then marked with the current calendar date and filed on the top of a stack of notes (called NOTES). The function MEMO displays all previous notes with their dates of entry, and indexes each with an identification number. Any note may be deleted by the SCRATCH function. The user simply follows the word SCRATCH by the indices of notes to be deleted. These functions have proven useful for keeping records of current activities, ideas, and problems from one work session to the next.

Figure 4a   Figure 4b   Figure 4c   Figure 4d   Figure 4e

Figure 5a   Figure 5b

EXAMPLES OF EXPERIMENTS WITH THE SYSTEM

In this section, three sample experiments will be discussed. The records of the terminal session are included in an Appendix to show how the experiments were conducted. In the first run, we start with configuration five, by issuing the command SETUP 5. The system responds by typing "CASE 5 IS NOW SET UP." This establishes the link list and connection matrix of the configuration shown in Figure 4a. We then issue the statement 0 REPLACE 3, which signifies that a file is located at the 3rd node, and that alternative links are to be tested in the network to find configurations which are less costly. In the listing of the terminal session, a partial trace is made of REPLACE and of SNIP, called by REPLACE, to show the cost of each configuration as it is calculated. Each time a configuration is discovered which has a lower cost than any of the preceding cases, its link list is displayed. Thus, we start with the configuration shown in Figure 4a. SNIP finds it cannot delete any links. The cost of this configuration is 2700. REPLACE then attempts to change a link, but the resulting cost is 2750, and because it is more expensive than the original case, it is discarded.
REPLACE then tries a second alteration, the cost of which is found to be 2550, which is less than the original cost. The link list of this configuration is displayed, and the revised network is shown in Figure 4b. The reader can note that link 1 has been replaced by link 7. Following this, three more modifications are examined and the first two are found more expensive than the last acceptable change. The third, however, costs 1550 and thus is noted, and is shown in Figure 4c. Here we see that link 4 in the original configuration has been replaced by link 2. A large number of additional modifications are examined, but only when it arrives at the configuration shown in Figure 4d is a more economic case found. Finally, REPLACE discovers that the configuration in Figure 4e is lower in cost than any others preceding it, costing 1300. Several additional cases are examined following this, but none is lower than this last configuration, which is then displayed at the conclusion of execution.

Next, in a second run, we attempt to add links to the final configuration derived in the first run to see if a still cheaper case could be discovered. JOINALL was called with the file specified at node 3, but each new link it attempted to add failed to decrease the cost below 1300. Thus, no change was made.

In a third run, a fully connected network was specified, as shown in Figure 5a. REPLACE was called, again with the file at node 3. REPLACE called SNIP, which removed ten links to produce the configuration shown in Figure 5b, having a cost of 1500. REPLACE then modified the network to produce the structure in Figure 5c, having a cost of 1350. Thus, we see that sometimes an arbitrary configuration such as that in the first run may be a better starting point for REPLACE than a fully connected network, since the first run resulted in a more economic configuration than did the third.

Figure 5c

REFERENCES

1. APL/360-OS and APL/360-DOS, User's Manual, IBM Corporation, White Plains, New York, Form Number SH20-0906.
2. Roberts, L. G., Wessler, B. D., "Computer Network Development to Achieve Resource Sharing," Proceedings of AFIPS SJCC 1970, pp. 543-549.
3. Frank, H., Frisch, I. T., and Chou, W., "Topological Considerations in the Design of the ARPA Computer Network," ibid., pp. 581.
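For readers following the terminal-session traces in the appendix that follows, here is a compact sketch, in modern Python rather than APL, of the greedy improvement loop that SNIP performs and that REPLACE and JOINALL build upon. The cost and connectivity tests stand in for the paper's TARRIF/TOTCOST and OKCM machinery and are assumed interfaces, not the actual functions.

    # Sketch of a SNIP-like search: repeatedly make the single link deletion that
    # lowers the configuration cost most (or leaves it unchanged) while keeping
    # the network connected, and stop when no deletion helps.

    def snip_like(links, cost, connected):
        current = set(links)
        while True:
            best, best_cost = None, cost(current)
            for ln in sorted(current):
                trial = current - {ln}
                if connected(trial) and cost(trial) <= best_cost:
                    best, best_cost = trial, cost(trial)
            if best is None:
                return current       # no deletion lowers (or holds) the cost
            current = best           # accept the best single deletion and repeat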
A System of APL Functions to Study Computer Networks Fi rst Run A P PEN 0 I X ------- SETUP 5 CASE 5 IS 110rl SET UP o REPLACE 3 SNIP[2] 1 1 2 415 10 3 4 14 4 6 15 5 6 SNIP[4] 2700 REPLACE[3] 1700 1800 1900 2000 1500 1550 2000 2100 1650 RE'PLACE[ 18J 1550 112 415 10 3 4 14 4 6 15 5 6 REPLACE[18] 2750 REPLACE[18] 2550 REPLACE[22] 415 724 10 3 4 14 4 6 15 5 6 REPLACE[18] 2750 REP LA Cl' [ 1 8 ] 2 8 5 0 REPLACE[18] 1550 REPLAC8[22] 112 213 10 3 4 14 4 6 15 5 6 REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18J REPLACE[18] 1800 2800 2700 2400 3250 2400 2400 2600 2950 2100 2300 2000 2000 3150 3250 1700 1750 2050 REPLACE[18J REPLACE[18J REPLACE[18J RBPLACE[18J REPLACE[18J REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18J REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18J REPLACE[18J REPLACE[18J REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] 112 213 10 3 4 11 3 5 3 6 1'2 1500 1450 1700 1800 1500 1850 2550 1650 1450 1550 1550 1650 1550 1650 1350 1400 2150 2250 1500 1400 REPLACE[18] REPLACE[18] R8PLACE[18] REPLACE[18J REPLACE[18J REPLACE[18] REPLACE[18] REPLACE[22] 1 2 12 14 '15 1 1 3 4 5 2050 1800 2700 2800 1850 1600 1500 2 3 6 6 6 REPLACE[18] 2300 REPLACE[18J 2650 REPLACE[18] 1300 REPLACE[22] 1 1 2 1 2 3 10 13 15 3 4 4 5 5 6 REPLACE[18] REPLACE[lB] REPLACE[18] REPLACE[18J REPLACE[18] REPLACE[18J REPLACE[18J REPLACE[18J REPLACE[18J REPLACE[18] REPLACE[18] REPLACE[18J REPLACE[18] REPLACE[18] REPLACE[18J REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] 112 213 10 3 4 13 4 5 15 5 6 2400 2500 1400 1700 1800 1500 1350 1450 1500 1700 1900 1550 2100 2650 1600 1400 1500 2050 2400 1550 2200 2300 1400 2150 2250 1350 1350 147 148 National Computer Conference, 1973 APPENDIX Second Run Third Run lIL+O JOINALL 3 JOINALL[3] 1 2 10 13 15 2 3 4 5 6 JOINALL[5] 1300 JOINALL[10] 1300 JOINALL[10J 2200 JOINALL[10] 2150 1 1 3 4 5 JOINALL[18] JOINALL[10] JOINALL[10] JOINALL[10] JOINALL[10] JOINALL[18] JOINALL[10] JOINALL[10] JOIiVALL[18] JOINALL[10] JOINALL[18] JOINALL[18] 7 1450 1300 1300 2250 7 1400 1350 7 1350 7 HL 1 2 10 13 15 1 1 3 4 5 2 3 4 5 6 o REPLACE 3 SNIP[2] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 1 1 1 1 1 2 2 2 2 3 3 3 4 4 5 2 3 4 5 6 3 4 5 6 4 5 6 5 6 6 SNIP[4] 1500 REPLACE[ 3] 2 6 10 11 12 1 2 3 3 3 3 3 4 5 6 REPLACE[ 18] REPLACE[ 18] REPLACE[18] REPLACE[18] REPLA CE[ 18] REPLACE[22] 1 2 10 11 12 1 1 3 3 3 REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] REPLACE[18] 1650 1650 1900 2000 1350 2 3 4 5 6 1450 1700 1800 1800 190 a A high-level language for use with multi-computer networks by HARVEY z. KRILOFF University of Illinois Chicago, Illinois INTRODUCTION NETWORK LANGUAGE REQUIREMENTS Two basic trends can be observed in the modern evolution of computer systems. They are the development of computer systems dedicated to a single task or user (minicomputers) where the sophistication of large computer systems is being applied to smaller units, and the trend of very large systems that locate the user remotely from the computer and share resources between more and more locations. It is to the latter case that this paper is directed. 
This trend reaches its culmination in the design of distributed computer systems, where many individual computer components are located remotely from each other, and they are used to jointly perform computer operations in the solution of a single problem. Systems such as these are being developed in increasing numbers, although they are yet only a small fraction of the total number of computer systems. Examples of such systems range from those that are national and international in scope (the United States' ARPANET,1 Canada's CANUNET,2 the Soviet Union's ASUS system3 and the European Computer Network Project4), to statewide systems (the North Carolina Educational Computer System5 and the MERIT Computer Network6), to single site systems (the Lawrence Radiation Laboratory Network7 and Data Ring Oriented Computer Networks8). These systems and others have been designed to solve problems in the areas of research, education, governmental planning, airline reservations and commercial time-sharing. Taken together they demonstrate a capability for computer utilization that places more usable computer power in the user's control than he has ever had before. The challenge is to make effective use of this new tool.

Development of this new mode of computer usage has followed the same set of priorities that has prevented effective utilization of previous systems. A large body of information has been collected on the hardware technology of network systems,9 but little effort has been expended on the development of software systems that allow the average user to make effective use of the network. A systematic examination of the requirements and a design for a language that uses the full facilities of a number of computer networks is needed.

After the language design had begun, it developed that the concept of what a computer network was, and how it was to be used, was not well understood. An effort was therefore begun to classify computer networks and their operations. This classification scheme indicated that a language for computer networks would have to be constantly changing because networks evolve from one form to another. It is this dynamic behavior that makes the design of a single language for different types of networks a high priority requirement.

Types of computer networks

A classification scheme was developed based on the resource availability within the computer network and the network's ability to recover from component failure. The scheme consists of six (6) different modes of network operation with decreasing dependability and cost for the later modes.

1. Multiple Job Threads: This consists of a system where all components are duplicated and calculations are compared at critical points in the job. If a component should fail, no time would be lost. Example: NASA Space Flight Monitoring.
2. Multiple Logic Threads: This is identical with the previous mode except peripherals on the separate systems can be shared.
3. Multiple Status Threads: In this mode only one set of components performs each task. However, other systems maintain records of system status at various points in the job. If a component should fail all work since the last status check must be re-executed. Example: Remote Checkpoint-Restart.
4. Single Job Thread: In this mode one computer system controls the sequencing of operations that may be performed on other systems. If a system failure occurs in the Master computer, the job is aborted.
5. Load Sharing: In this mode each job is performed using only a single computer system. Jobs are transferred to the system with the lightest load, if it has all necessary resources. If the system fails you may lose all record of the job. Example: HASP to HASP job transfer.
6. Star: In this mode only one computer system is available to the user. It is the normal time-sharing mode with a single computer. If the system is down you can't select another computer.

All presently operating networks fall within one of the above modes of operation, and additional modes can be developed for future networks. Operation within the first two and last two modes requires no additional language features that are not needed on a single (non-network) computer system. Thus, the language to be described will be directed toward utilizing the capabilities of networks operating in modes 3 and 4. In order to evaluate its usefulness, implementation will be performed on both the ARPANET and MERIT Networks. Both networks can operate in modes 3 and 4, and possess unique features that make implementation of a single network language a valid test of language transferability.

Computer network operations

In designing the language we recognized three (3) types of operations that the network user would require.

1. Data Base Access: Four operations are necessary for this type of usage. The user can copy the entire data base from one computer system to another, or he could access specific elements in the data base, or he could update the data base, or he could inquire about the status of the data base. Decisions on whether to copy the data base or access it element by element depend on data transfer speeds and the amount of data needed, therefore they must be determined for each job (a sketch of one such decision rule follows this list).
2. Subtask Execution: Tasks may be started in one of four different modes. In the Stimulus-Response mode a task is started in another machine while the Master computer waits for a task completion message. In the Status Check mode the subtask executes to completion while the Master task performs other work. The subtask will wait for a request from the Master to transmit the result. The third mode is an interactive version of status checking where the status check may reveal a request for more data. The final mode allows the subtask to interrupt the master task when execution is completed. Most networks do not have the facilities to execute in mode four; however, all other modes are possible. The novice will find mode one easiest, but greater efficiency is possible with modes two and three.
3. Network Configuration: This type of usage is for the experienced network user. It provides access to the network at a lower level, so that special features of a specific network may be used. Programs written using these types of commands may not be transferable to other networks. These commands allow direct connection to a computer system, submission of an RJE job, reservation of a system resource, network status checking, selection of network options, access to system documentation, and establishment of default options.
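The copy-versus-element-by-element choice in the Data Base Access operations can be pictured as a comparison of rough transfer-time estimates. The Python fragment below is one such rule; every quantity in it is an assumed, illustrative parameter, since the paper states only that the decision depends on transfer speeds and the amount of data needed.

    def choose_access_mode(db_size_bytes, fraction_needed, bulk_rate_bps,
                           element_size_bytes, element_rate_bps,
                           per_element_overhead_s):
        # Compare a rough time estimate for copying the whole data base with
        # an estimate for fetching only the needed elements.  All parameters
        # are assumed quantities used for illustration.
        copy_time = 8.0 * db_size_bytes / bulk_rate_bps
        n_elements = max(1, int(db_size_bytes * fraction_needed / element_size_bytes))
        element_time = n_elements * (per_element_overhead_s
                                     + 8.0 * element_size_bytes / element_rate_bps)
        return "copy" if copy_time < element_time else "element-by-element"

    # Example: a 1,000,000-byte data base of 100-byte records over a
    # 50-kilobit line, when only 2 percent of the records are needed.
    print(choose_access_mode(1000000, 0.02, 50000, 100, 50000, 0.5))

Under these assumed figures the element-by-element estimate is smaller, so the data base would not be copied; with a larger fraction needed or a higher per-element overhead the comparison tips the other way.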
The network language must allow the user to perform all these operations by the easiest method. Since the average user will know very little about the network, the system must be supplied with default parameters that will make decisions that the user does not direct. These defaults can be fitted to the network configuration, the individual user or a class of problems. User operations must be expressible in a compressed form. Operations that the user performs very often should be expressed by a single command. This will prevent programming errors and it will allow for optimization of command protocol. As new operations are required they should be able to be added to the language without affecting already defined commands.

Additional language constraints

Three of the six constraints used to design the network language have already been described. The list of features that were considered necessary for a usable language are:

1. All purely systems requirements should be invisible to the user.
2. The language should be easily modified to adapt to changes in the network configuration.
3. The language should provide easy access to all features of the network at a number of degrees of sophistication.
4. The language should provide a method for obtaining on-line documentation about the available resources.
5. The fact that the system is very flexible should not greatly increase the system's overhead.
6. The language syntax is easy to use, is available for use in a non-network configuration, and it will not require extensive modification to transfer the language to another network.

These requirements are primarily dictated by user needs, rather than those required to operate the hardware/software system. It is the hope that the end result would be a language that the user would use without knowing that he was using a network.

The demand for on-line documentation is particularly important. Most software systems are most effectively used at the location where they were developed. As you get farther from this location, fewer of the special features and options are used because of lack of access to documentation. Since most of the systems to be used will reside on remote computers, it is important that the user be able to obtain current documentation while he uses the system. That documentation should reside on the same computer as the code used to execute the system.

Options used to determine which computer system you are communicating with should not have to exist in interpretable code. This generality often leads to heavy overhead that defeats the advantages of a computer network. An example of this was the Data Reconfiguration Service created by Rand10 as a general data converter that could be used when transferring data from one computer to another on the ARPANET. Because it was written to execute interpretively, data conversion can only be performed at low speed. While high speed and low overhead were not conditions of their implementation, an operational language should not produce such restricted conditions of usage.

The final condition, that the language be easily used and operate on a single computer, led to the investigation of available extendable languages. Using an already developed language has certain advantages: people will not need to learn a new language to use the network, program development can continue even when the network is not operating, and network transparency is heavily dictated by the already designed language syntax. It was a belief that a network language should not be different in structure than any other language that led to the investigation of a high-level language implementation.

THE NETWORK LANGUAGE

An available language was found that met most of the requirements of our network language.
The language is called SPEAKEASY11 and it consists of a statement interpreter and a series of attachable libraries. One set of these libraries consists of linkable modules called "linkules" that are blocks of executable code that can be read off the disk into core and executed under control of the interpreter. This code, which may be written in a high-level language such as FORTRAN, can be used to perform the necessary protocols and other system operations required by the network. Thus, the user would not be required to know anything other than the word that activates the operation of the code. Each user-required operation could be given a different word, where additional data could be provided as arguments of the activating word. Since there are almost no limitations on the number of library members, and each user could be provided with his own attachable library, the language can be easily extended to accommodate new features created by the network.

Linkules in SPEAKEASY could be created that communicate with individual computer systems or that perform similar operations in more than one system. Where the latter is implemented, an automatic result would be the creation of a super control language. Since one of the factors that prevent the average user from using more than one computer system is the non-uniformity of the operating systems, the development of a network language will eliminate this problem. Using data stored in other libraries, the network language could supply the needed control syntax to execute a specified task. This operation is not very much different from what the user does when he is supplied control cards by a consultant, which he uses until they are outdated by systems changes.

The SPEAKEASY language presently provides the facility to provide on-line documentation about itself based on data in attachable libraries. This would be extended to allow the SPEAKEASY interpreter to read from libraries resident on a remote computer. Therefore, the documentation could be kept up-to-date by the same people responsible for developing the executing code. Since SPEAKEASY linkules are compiled code, and there may exist separate modules that are only loaded into core when needed, minimal overhead is introduced by adding new operations to the system. This is the same technique used to add operations to the present system, therefore no difference should be detectable between resident and remote linkules.

NETWORK MODULE PROTOCOLS

Where a single computer version of SPEAKEASY has an easy task to determine how a command should be executed, a multi-computer version makes more complex decisions. Therefore it is necessary that there be established a well defined pattern of operations to be performed by each linkule.

Linkule order of operations

Each linkule that establishes communication with a remote (slave) computer system should execute each of the following ten operations, so as to maintain synchronization between the Master task and all remote subtasks. No module will be allowed to leave core until the tenth step is performed (a skeleton of this sequence is sketched after the list).

1. Determine System Resources Needed.
2. Establish the Network Connections.
   a) Select the Computer Systems that will be used.
   b) Establish the System Availability.
   c) Perform the Necessary Connect & Logon Procedures.
3. Allocate the Needed Computer System Resources to the Job.
4. Provide for System Recovery Procedures in the case of System Failure.
5. Determine what Data Translation Features are to be used.
6. Determine whether Data Bases should be moved.
7. Start the main task in the remote computer executing.
8. Initiate and Synchronize any subtasks.
9. Terminate any subtasks.
10. Terminate the remote task and close all related system connections.
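A present-day skeleton of the ten-step linkule discipline is sketched below in Python. The class, method bodies, and names are hypothetical stand-ins (actual linkules are compiled modules loaded and invoked by the SPEAKEASY interpreter); the sketch shows only the required ordering of the steps and the option of leaving a connection open for the next linkule.

    class RemoteLinkule:
        # Hypothetical skeleton of a linkule that communicates with one
        # remote (slave) system.
        def __init__(self, host, keep_connection_open=True):
            self.host = host
            self.keep_connection_open = keep_connection_open
            self.connection = None

        def run(self, request):
            resources = {"host": self.host,                          # 1. resources needed
                         "files": request.get("files", [])}
            self.connection = self.connection or "open:" + self.host  # 2. connect and log on
            allocation = dict(resources)                              # 3. allocate via control commands
            recovery_point = "checkpoint-0"                           # 4. provide for recovery
            translator = ("identity" if request.get("same_host_type")
                          else "convert")                             # 5. choose data translation
            move_data_base = bool(request.get("move_data_base"))      # 6. move data bases or not
            result = "ran %s on %s" % (request["task"], self.host)    # 7. start the main remote task
            subtasks = list(request.get("subtasks", []))              # 8. initiate and synchronize subtasks
            subtasks = []                                             # 9. terminate subtasks
            if not self.keep_connection_open:                         # 10. terminate task; close or keep
                self.connection = None                                #     the connection for the next linkule
            return result

    # Example use of the hypothetical skeleton on an assumed remote host.
    print(RemoteLinkule("remote-370").run({"task": "SORT", "files": ["A"]}))

The keep_connection_open flag corresponds to the observation below that step 10 need not close a connection when reopening it would cost more than leaving it open.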
In the present single computer version of SPEAKEASY only tasks 3 and 7 are executed, and a more limited form of task 4 is also performed. Where the overhead in opening and closing network connections is greater than the cost of leaving the connections open, the 10th task will not terminate any connection once made, allowing the next linkule to check and find it open. Information on the system's availability can be obtained by inquiry from the network communications software. Procedures and resource information can be obtained from data stored in a local or remote data set, by inquiry from the user, or by inquiry from the system. All allocation procedures will be performed by submitting the appropriate control commands to the standard operating system on the relevant computer. Since the SPEAKEASY system will have the necessary data about the origination and destination computers for any unit of data being transferred, it can link to a linkule that is designed for that translation only. This linkule could take advantage of any special features or tricks that would speed the translation process, thus reducing overhead during translation.

Requirements for network speakeasy

The minimal requirement to execute the Network version of SPEAKEASY is a computer that will support the interpreter. At the present time that means either an IBM 360 or 370 computer operating under either the O/S or MTS operating systems, or a FACOM 60 computer system. The network protocols for only two operations are required. The Master computer must be able to establish a connection to a remote computer and it must be able to initiate execution of a job in the remote computer system. The remote site should provide either a systems routine or a user-created module that will perform the operations requested by the Master computer. This program, called the Network Response Program, is activated whenever a request is made to the remote computer. There may be one very general Network Response Program, or many different ones designed for specific requests. Figure 1 shows the modular structure of the Network Language in both the Master and Slave (Subtask) computing systems. Because the network features of the language are required to be totally transparent to the user, no examples of network programming are shown. The reader should refer to previously published papers on Speakeasy for examples of programming syntax.

Figure 1-The modular structure of the Network Speakeasy System in the Master and Slave computers. Dotted lines indicate dynamic linkages, dashed lines are telecommunications links, and solid lines are standard linkages.

SUMMARY

A network version of the SPEAKEASY system is described that consists of a series of dynamically linked modules that perform the necessary language and system protocols. These protocols are transparent to the user, producing a language that has additional power without additional complexity. The language uses attached libraries to provide the necessary information that will tailor the system to a specific network, and to supply on-line documentation. Since the modules are compiled code, the generality of the system does not produce a large overhead.

ACKNOWLEDGMENT

The author is indebted to Stanley Cohen, developer of the SPEAKEASY system.
His assistance and support made this extension possible. REFERENCES 1. Roberts, L. G., "Resource Sharing Networks," Proceedings of IEEE International Conference, 1969. 2. Demercado, J., Guindon, R., Dasilva, J., Kadoch, M., "The Canadian Universities Computer Network Topological Considerations," The First International Conference on Computer Communication, 1972. 3. Titus, J., "Soviet Computing-A Giant Awake," Datamation, Vol. 17, No. 24, 1971. 4. Barber, D. L. A., "The European Computer Network Project," The First International Conference on Computer Communication, 1972. 5. Denk, J. R., "Curriculm Development of Computer Usage in North Carolina," Proceedings of a Conference on Computers in the Undergraduate Curricula. 6. Herzog, B., "Computer Networks," International Computing Symposium, ACM, 1972. 7. Mendicino, S. F., "OCTOPUS-The Lawrence Radiation Laboratory Network," Computer Networks, Prentice-Hall, Englewood Clitfs. ~ .•J., Hr;2. A High-Level Language for Use with Multi-Computer Networks 8. Farber, D., "Data Ring Oriented Computer Networks," Computer Networks, Prentice-Hall, Englewood Cliffs, N.J., 1972. 9. Frank, H., Kahn, R. E., Kleinrock, L., "Computer Communication Network Design-Experience with Theory and Practice," Networks, Vol. 2, 1972. 153 10. Harslem, E. F., Heafner, J., Wisniewski, T. D., Data Reconfiguration Service Compiler-Communication Among Heterogeneous Computer Center Using Remote Resource Sharing, Rand Research Report R-887 -ARPA, 1972. 11. Cohen, S., "Speakeasy-RAKUGO," First USA Japan Computer Conference Proceedings, 1972. A resource sharing executive for the ARP ANET* by ROBERT H. THOMAS Bolt, Beranek and Newman, Inc. Cambridge, Mass. with paging hardware. In comparison to the TIPs, the TENEX Hosts are large. TENEX implements a virtual processor with a large (256K word), paged virt_~_~L_I!l~_ITI~ ory for -each user process. In addition, it provides a multiprocess job structure with software program interrupt capabilities, an interactive and carefully engineered command language (implemented by the TENEX EXEC) and advanced file handling capabilities. Development of the RSEXEC was motivated initially by the desire to pool the computing and storage resources of the individual TENEX Hosts on the ARPA~ET. We observed that the TENEX virtual machine was becoming a popular network resource. Further, we observed that for many users, in particular those whose access to the network is through TIPs or other non-TENEX Hosts, it shouldn't really matter which Host provides the TENEX virtual machine as long as the user is able to do his computing in the manner he has become accustomed*. A number of advantages result from such resource sharing. The user would see TENEX as a much more accessible and reliable resource. Because he would no longer be dependent upon a single Host for his computing he would be able to access a TENEX virtual machine even when one or more of the TENEX Hosts were down. Of course, for him to be able to do so in a useful way, the TENEX file system would have to span across Host boundaries. The individual TENEX Hosts would see advantages also. At present, due to local storage limitations, some sites do not provide all of the TENEX subsystems to their users. For example, one site doesn't support FORTRAN for this reason. Because the subsystems available would, in effect, be the "union" of the subsystems available on all TENEX Hosts, such Hosts would be able to provide access to all TENEX subsystems. 
The RSEXEC was conceived of as an experiment to investigate the feasibility of the multi-Host TENEX concept. Our experimentation with an initial version of the RSEXEC was encouraging and, as a result, we planned to develop and maintain the RSEXEC as a TENEX subsystem. The RSEXEC is, by design, an evo- INTRODUCTION The Resource Sharing Executive (RSEXEC) is a distributed, executive-like system that runs on TENEX Host computers in the ARPA computer network. The RSEXEC creates an environment which facilitates the sharing of resources among Hosts on the ARPANET. The large Hosts, by making a small amount of their resources available to small Hosts, can help the smaller Hosts provide services which would otherwise exceed their limited capacity. By sharing resources among themselves the large Hosts can provide a level of service better than any one of them could provide individually. Within the environment provided by the RSEXEC a user need not concern himself directly with network details such as communication protocols nor even be aware that he is dealing with a network. A few facts about the ARPANET and the TENEX operating system should provide sufficient background for the remainder of this paper. Readers interested in learning more about the network or TENEX are referred to the literature; for the ARPANET References 1,2,3,4; for TENEX. References 5,6,7. The ARPANET is a nationwide heterogeneous collection of Host computers at geographically separated locations. The Hosts differ from one another in manufacture, size, speed, word length and operating system. Communication between the Host computers is provided by a subnetwork of small, general purpose computers called Interface Message Processors or IMPs which are interconnected by 50 kilobit common carrier lines. The IMPs are programmed to implement a store and forward communication network. As of January 1973 there were 45 Hosts on the ARPANET and 33 IMPs in the subnet. In terms of numbers, the two most common Hosts in the ARPANET are Terminal IMPs called TIP s 12 and TENEXs.9 TIP ss.9 are mini-Hosts designed to provide inexpensive terminal access to other network Hosts. The TIP is implemented as a hardware and software augmentation of the IMP. TENEX is a time-shared operating system developed by BBN to run on a DEC PDP-10 processor augmented * This, of course, ignores the problem of differences in the accounting and billing practices of the various TENEX Hosts. Because all of the TENEX Hosts (with the exception of the two at BBN) belong to ARPA we felt that the administrative problems could be overcome if the technical problems preventing resource sharing were solved. * This work was supported by the Advanced Projects Research Agency of the Department of Defense under Contract No. DAHC15-71-C-0088. 155 156 National Computer Conference, 1973 lutionary system; we planned first to implement a system with limited capabilities and then to let it evolve, expanding its capabilities, as we gained experience and came to understand the problems involved. During the early design and implementation stages it became clear that certain of the capabilities planned for the RSEXEC would be useful to all network users, as well as users of a multi-Host TENEX. The ability of a user to inquire where in the network another user is and then to "link" his own terminal to that of the other user in order to engage in an on-line dialogue is an example of such a capability. 
A large class of users with a particular need for such capabilities are those whose access to the network is through mini-Hosts such as the TIP. At present TIP users account for a significant amount of network traffic, approximately 35 percent on an average day. IO A frequent source of complaints by TIP users is the absence of a sophisticated command language interpreter for TIPs and, as a result, their inability to obtain information about network status, the status of various Hosts, the whereabouts of other users, etc., without first logging into some Host. Furthermore, even after they log into a Host, the information readily available is generally limited to the Host they log into. A command language interpreter of the type desired would require more (core memory) resources than are available in a TIP alone. We felt that with a little help from one or more of the larger Hosts it would be feasible to provide TIP users with a good command language interpreter. (The TIPs were already using the storage resources of one TENEX Host to provide their users with a network news service. IO . 11 Further, since a subset of the features already planned for the RSEXEC matched the needs of the TIP users, it was clear that with little additional effort the RSEXEC system could provide TIP users with the command language interpreter they needed. The service TIP users can obtain through the RSEXEC by the use of a small portion of the resources of several network Hosts is superior to that they could obtain either from the TIP itself or from any single Host. An initial release of the RSEXEC as a TENEX subsystem has been distributed to the ARPANET TENEX Hosts. In addition, the RSEXEC is available to TIP users (as well as other network users) for use as a network command language interpreter, preparatory to logging into a particular Host (of course, if the user chooses to log into TENEX he may continue using the RSEXEC after login). Several non-TENEX Hosts have expressed interest in the RSEXEC system, particularly in the capabilities it supports for inter-Host user-user interaction, and these Hosts are now participating in the RSEXEC experiment. The current interest in computer networks and their potential for resource sharing suggests that other systems similar to the RSEXEC will be developed. At present there is relatively little in the literature describing such distributing computing systems. This paper is presented to record our experience with one such system: we hope it will be useful to others considering the implementation of such systems. The remainder of this paper describes the RSEXEC system in more detail: first, in terms of what the RSEXEC user sees, and then, in terms of the implementation. THE USER'S VIEW OF THE RSEXEC The RSEXEC enlarges the range of storage and computing resources accessible to a user to include those beyond the boundaries of his local system. It does that by making resources, local and remote, available as part of a single, uniformly accessible pool. The RSEXEC system includes a command language interpreter which extends the effect of user commands to include all TENEX Hosts in the ARPANET (and for certain commands some nonTENEX Hosts), and a monitor call interpreter which, in a similar way, extends the effect of program initiated "system" calls. To a large degree the RSEXEC relieves the user and his programs of the need to deal directly with (or even be aware that they are dealing with) the ARPANET or remote Hosts. 
By acting as an intermediary between its user and non-local Hosts the RSEXEC removes the logical distinction between resources that are local and those that are remote. In many contexts references to files and devices* may be made in a site independent manner. For example, although his files may be distributed among several Hosts in the network, a user need not specify where a particular file is stored in order to delete it; rather, he need only supply the file's name to the delete command. To a first approximation, the user interacts with the RSEXEC in much the same way as he would normally interact with the standard (single Host) TENEX executive program. The RSEXEC command language is syntactically similar to that of the EXEC. The significant difference, of course, is a semantic one; the effect of commands are no longer limited to just a single Host. Some RSEXEC commands make direct reference to the multi-Host environment. The facilities for inter-Host user-user interaction are representative of these commands. For example, the WHERE and LINK commands can be used to initiate an on-line dialogue with another user: <-WHERE (IS USER) JONES** JOB 17 TTY6 USC JOB 5 TTY14 CASE <-LINK (TO TTY) 14 (AT SITE) CASE * Within TENEX, peripheral devices are accessible to users via the file system; the terms "file" and "device" are frequently used interchangeably in the following. ** "+--" is the RSEXEC "ready" character. The words enclosed in parentheses are "noise" words which serve to make the commands more understandable to the user and may be omitted. A novice user can use the character ESC to cause the RSEXEC to prompt him by printing the noise words. A Resource Sharing Executive for the ARPA~ET Facilities such as these play an important role in removing the distinction between "local" and "remote" by allowing users of geographically. separated Hosts to interact with one another as if they were members of a single user community. The RSEXEC commands directly available to TIP users in a "pre-login state" include those for inter-Host user-user interaction together with ones that provide Host and network status information and network news. Certain RSEXEC commands are used to define the "configuration" of the multi-Host environment seen by the user. These "meta" commands enable the user to specify the "scope" of his subsequent commands. For example, one such command (described in more detail below) allows him to enlarge or reduce the range of Hosts encompassed by file system commands that follow. Another "meta" command enableshirn to spe.cify _a set of peripheral devices which he may reference in a site independent manner in subsequent commands. The usefulness of multi-Host systems such as the RSEXEC is, to a large extent, determined by the ease with which a user can manipulate his files. Because the Host used one day may be different from the one used the next, it is necessary that a user be able to reference any given file from all Hosts. Furthermore, it is desirable that he be able to reference the file in the same manner from all Hosts. The file handling facilities of the RSEXEC were designated to: 157 designed to allow use of partial pathnames for frequently referenced file, for their users. * It is straightforward to extend the tree structured model for file access within a single Host to file access within the entire network. 
A new root node is created with branches to each of the root nodes of the access trees for the individual Hosts, and the complete pathname is enlarged to include the Host name. A file access tree for a single Host is shown in Figure 1; Figure 2 shows the file access tree for the network as a collection of single Host trees. The RSEXEC supports use of complete pathnames that include a Host component thereby making it possible (albeit somewhat tedious) for users to reference a file on any Host. For example, the effect of the command ~AI>PENP (FJL~) [CASEIDSK: DATA. NEW (TO FILE) [BBN]DSK: DATA.OLD** is to modify the file designated CD in Figure 2 by appending to it the file designated (2) . To make it convenient to reference files, the RSEXEC allows a user to establish contexts for partial pathname interpretation. Since these contexts may span across several Hosts, the user has the ability to configure his own "virtual" TENEX which may in reality be realized by the resources of several TENEXs. Two mechanisms are available to do this. The first of these mechanisms is the user profile which is a collection of user specific information and parameters 1. Make it possible to reference any file on any Host by implementing a file name space which spans across Host boundaries. 2. Make it convenient to reference frequently used files by supporting "short hand" file naming conventions, such as the ability to specify certain files without site qualification. The file system capabilities of the RSEXEC are designed to be available to the user at the command language level and to his programs at the monitor call level. An important design criterion was that existing programs be able to run under the RSEXEC without reprogramming. File access within the RSEXEC system can be best described in terms of the commonly used model which views the files accessible from within a Host as being located at terminal nodes of a tree. Any file can be specified by a pathname which describes a path through the tree to the file. The complete pathname for a file includes every branch on the path leading from the root node to the file. While, in general, it is necessary to specify a complete pathname to uniquely identify a file, in many situations it is possible to establish contexts within which a partial pathname is sufficient to uniquely identify a file. Most operating systems provide such contexts, Figure I-File access tree for a single Host. The circles at the terminal nodes of the tree represent files * For example, TENEX does it by: 1. Assuming default values for certain components left unspecified in partial pathnames; 2. Providing a reference point for the user within the tree (working directory) and thereafter interpreting partial pathnames as being relative to that point. TENEX sets the reference point for each user at login time and, subject to access control restrictions, allows the user to change it (by "connecting" to another directory). ** The syntax for (single Host) TENEX path names includes device. directory, name and extension components. The RSEXEC extends that syntax to include a Host component. The pathname for@specifies: the CASE Host; the disk ("DSK") device; the directory THOMAS; the name DATA; and the extension NEW. 158 National Computer Conference, 1973 Figure 2-File access tree for a network. The single Host access tree from Figure 1 is part of this tree maintained by the RSEXEC for each user. 
Among other things, a user's profile specifies a group of file directories which taken together define a composite directory for the user. The "contents" of the composite directory are the union of the "contents" of the file directories specified in the profile. When a pathname without site and directory qualification is used, h is interpreted relative to the user's composite directory. The composite directory serves to define a reference point within the file access tree that is used by the RSEXEC to interpret partial pathnames. That reference point is somewhat unusual in that it spans several Hosts. One of the ways a user can reconfigure his "virtual" TENEX is by editing his profile. With one of the "meta" commands noted earlier he can add or remove components of his composite directory to control how partial pathnames are interpreted. An example may help clarify the role of the user profile, the composite directory and profile editing. Assume that the profile for user Thomas contains directories BOBT at BBN, THOMAS at CASE and BTHOMAS at USC (see Figure 2). His composite directory, the reference point for pathname interpretation, spans three Hosts. The command <-APPEND (FILE) DATA.NEW (TO FILE) DATA.OLD achieves the same effect as the APPEND command in a previous example. To respond the RSEXEC first consults the composite directory to discover the locations of the files, and then acts to append the first file to the second; how it does so is discussed in the next section. If he wanted to change the scope of partial pathnames he uses, user Thomas could delete directory BOBT at BBN from his profile and add directory RHT at AMES to it. The other mechanism for controlling the interpretation of partial pathnames is device binding. A user can instruct the HSEXEr tn interpret subspquent use of ::l particular device name as referring to a device at the Host he specifies. After a device name has been bound to a Host in this manner, a partial pathname without site qualification that includes it is interpreted as meaning the named device at the specified Host. Information in the user profile specifies a set of default device bindings for the user. The binding of devices can be changed dynamically during an RSEXEC session. In the context of the previous example the sequence of commands: +-BIND (DEVICE) LPT (TO SITE) BBN +-LIST DATA. NEW <-BIND (DEVICE) LPT (TO SITE) USC <-LIST DATA.NEW produces two listings of the file DATA.NEW: one on the line printer (device "LPT") at BBN, the other on the printer at USC. As with other RSEXEC features, device binding is available at the program level. For example, a program that reads from magnetic tape will function properly under the RSEXEC when it runs on a Host without a local mag-tape unit, provided the mag-tape device has been bound properly. The user can take advantage of the distributed nature of the file system to increase the "accessibility" of certain files he considers important by instructing the RSEXEC to maintain images of them at several different Hosts. With the exception of certain special purpose files (e.g., the user's "message" file), the RSEXEC treats files with the same pathname relative to a user's composite directory as images of the same multi-image file. The user profile is implemented as a multi-image file with an image maintained at every component directory of the composite directory. * '" The profile is somewhat special in that it is accessible to the user only through the profile erliting rommands. and is otherwise transparent. 
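Taken together, the composite directory and device binding define a small name-resolution order: an explicit Host component wins, a bound device name directs the reference to the bound Host, and an unqualified name is looked up in the union of the profile's component directories. The Python sketch below illustrates that order; the dictionary structures and the "[HOST]DEV:NAME" parsing are assumptions made for illustration, not the RSEXEC's internal representation.

    def resolve(pathname, composite_directory, device_bindings, local_host):
        # Return (host, pathname) for a file reference, following the
        # RSEXEC-style order of interpretation described in the text.

        # 1. Complete pathname with an explicit Host component,
        #    e.g. "[CASE]DSK:DATA.NEW".
        if pathname.startswith("["):
            host, rest = pathname[1:].split("]", 1)
            return host, rest

        # 2. Pathname whose device name has been bound to a Host,
        #    e.g. "LPT:DATA.NEW" after BIND (DEVICE) LPT (TO SITE) BBN.
        if ":" in pathname:
            device = pathname.split(":", 1)[0]
            if device in device_bindings:
                return device_bindings[device], pathname

        # 3. Partial pathname: interpret it relative to the composite
        #    directory, which spans the directories named in the profile.
        if pathname in composite_directory:
            return composite_directory[pathname], pathname

        # Otherwise treat the reference as local.
        return local_host, pathname

    # Example mirroring the text: DATA.NEW is stored at CASE, DATA.OLD at
    # BBN, the line printer LPT has been bound to BBN, and the user is at USC.
    composite = {"DATA.NEW": "CASE", "DATA.OLD": "BBN"}
    bindings = {"LPT": "BBN"}
    print(resolve("DATA.NEW", composite, bindings, "USC"))
    print(resolve("[BBN]DSK:DATA.OLD", composite, bindings, "USC"))
    print(resolve("LPT:DATA.NEW", composite, bindings, "USC"))

The sketch makes the site independence of partial pathnames concrete: the user names DATA.NEW without qualification, and the composite directory supplies the Host.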
A Resource Sharing Executive for the ARPANET Implementation of the RSEXEC The RSEXEC implementation is discussed in this section with the focus on approach rather than detail. The result is a simplified but nonetheless accurate sketch of the implementation. The RSEXEC system is implemented by a collection of programs which run with no special privileges on TENEX Hosts. The advantage of a "user-code" (rather than "monitor-code") implementation is that ordinary user access is all that is required at the various Hosts to develop, debug and use the system. Thus experimentation with the RSEXEC can be conducted with minimal disruption to the TENEX Hosts. The ability of the RSEXEC to respond properly to users' requests often requires cooperation from one or more remote Hosts. When such cooperation is necessary, tile RSEXEC program interacts with RSEXEC "service" programs at the remote Hosts according to a pre-agreed upon set of conventions or protocol. Observing the protocol, the RSEXEC can instruct a service program to perform actions on its behalf to satisfy its user's requests. Each Host in the RSEXEC system runs the service program as a "demon" process which is prepared to provide service to any remote process that observes protocol. The relation between RSEXEC programs and these demons is shown schematically in Figure 3. The RSEXEC protocol The RSEXEC protocol is a set of conventions designed to support the interprocess communication requirements of the RSEXEC system. The needs of the system required that the protocol: Figure 3-Schematic showing several RSEXEC programs interacting, on behalf of their users, with remote server programs 159 1. be extensible: As noted earlier, the RSEXEC is, by design, an evol utionary system. 2. support many-party as well as two-party interactions: Some situations are better handled by single multiparty interactions than by several two-party interactions. Response to an APPEND command when the files and the RSEXEC are all at different Hosts is an example (see below). 3. be convenient for interaction between processes running on dissimilar Hosts while supporting efficient interaction between processes on similar Hosts: Many capabilities of the RSEXEC are useful to users of non -TENEX as well as TENEX Hosts. It is important that the protocol not favor TENEX at the expense of other Hosts. The RSEXEC protocol has two parts: 1. a protocol for initial connection specifies how programs desiring service (users) can connect to programs providing service (servers); 2. a command protocol specifies how the user program talks to the server program to get service after it is connected. The protocol used for initial connection is the standard ARPANET initial connection protocol (ICP).12 The communication paths that result from the ICP exchange are used to carry commands and responses between user and server. The protocol supports many-party interaction by providing for the use of auxiliary communication paths, in addition to the command paths. Auxiliary paths can be established at the user's request between server and user or between server and a third party. Communication between processes on dissimilar Hosts usually requires varying degrees of attention to message formatting, code conversion, byte manipulation, etc. The protocol addresses the issue of convenience in the way other standard ARPANET protocols have. 
13 ,14.15 It specifies a default message format designed to be "fair" in the sense that it doesn't favor one type of Host over another by requiring all reformatting be done by one type of Host. It addresses the issue of efficiency by providing a mechanism with which processes on similar Hosts can negotiate a change in format from the default to one better suited for efficient use by their Hosts. The protocol can perhaps best be explained further by examples that illustrate how the RSEXEC .uses it. The following discusses its use in the WHERE, APPEND and LINK commands: -WHERE (IS USER) JONES The RSEXEC querie~ each non-local server program about user Jones. To query a server, it establishes connections with the server; transmits a "request for information about Jones" as specified by the protocol; 160 National Computer Conference, 1973 and reads the response which indicates whether or not Jones is a known user, and if he is, the status of his active jobs (if any). --APPEND (FILE) DATA.NEW (TO FILE) DATA.OLD Recall that the files DATA.NEW and DATA.OLD are at CASE and BBN, respectively; assume that the APPEND request is made to an RSEXEC running at USC. The RSEXEC connects to the servers at CASE and BBN. Next, using the appropriate protocol commands, it instructs each to establish an auxiliary path to the other (see Figure 4). Finally, it instructs the server at CASE to transmit the file DATA. NEW over the auxiliary connection and the server at BBN to append the data it reads from the auxiliary connection to the file DATA.OLD. +-LINK (TO TTY) 14 (AT SITE) CASE Assume that the user making the request is at USC. After connecting to the CASE server, the RSEXEC uses appropriate protocol commands to establish two auxiliary connections (one "send" and one "receive") with the server. It next instructs the server to "link" its (the server's) end of the auxiliary connections to Terminal 14 at its (the server's) site. Finally, to complete the LINK command the RSEXEC "links" its end of the auxiliary connections to its user's terminal. The RSEXEC program A large part of what the RSEXEC program does is to locate the resources necessary to satisfy user requests. It can satisfy some requests directly whereas others may require interaction with one or more remote server programs. For example, an APPEND command may involve AUXILIARY / CONNECTION Figure 4-configuration of RSEXEC and two server programs required to satisfy and APPEND command when the two files and the RSEXEC are all on different Hosts. The auxiliary connection is used to transmit the file to be appended from one server to the other interaction with none, one or two server programs depending upon where the two files are stored. An issue basic to the RSEXEC implementation concerns handling information necessary to access files: in particular, how much information about non-local files should be maintained locally by the RSEXEC? The advantage of maintaining the information locally is that requests requiring it can be satisfied without incurring the overhead involved in first locating the information and then accessing it through the network. Certain highly interactive activity would be precluded if it required significant interaction with remote server programs. For example, recognition and completion of file names* would be ususable if it required direct interaction with several remote server programs. Of course, it would be impractical to maintain information locally about all files at all TENEX Hosts. 
The approach taken by the RSEXEC is to maintain information about the non-local files a user is most likely to reference and to acquire information about others from remote server programs as necessary. It implements this strategy by distinguishing internally four file types: 1. files in the Composite Directory; 2. files resident at the local Host which are not in the Composite Directory; 3. files accessible via a bound device, and; 4. all other files. Information about files of type 1 and 3 is maintained locally by the RSEXEC. It can acquire information about type 2 files directly from the local TEi'TEX monitor, as necessary. No information about type '* files is maintained locally; whenever such information is needed it is acquired from the appropriate remote server. File name recognition and completion and the use of partial pathnames is rest ricted to file types 1, 2 and 3. The composite directory contains an entry for each file in each of the component directories specified in the user's profile. At the start of each session the RSEXEC constructs the user's composite directory by gathering information from the server programs at the Hosts specified in the user profile. Throughout the session the RSEXEC modifie!' the composite directory. adding and deleting entries, as necessary. The composite directory contains frequently accessed information (e.g., Host location, size, date of last access, etc.) about the user's files. It represents a source of information that can be accessed without incurring the overhead of going to the remote Host each time it is needed. * File name recognition and completion is a TENEX feature which allows a user to abbreviate fields of a file pathname. Appearance of ESC in the name causes the portion of the field before the ESC to be looked up, and, if the portion is unambiguous, the system will recognize it and supply the omitted characters and/ or fields to complete the file name. If the portion is ambiguous, the system will prompt the user for more characters by ringing the terminal bell. Because of its popularity we felt it important that the RSEXEC support this feature. A Resource Sharing Executive for the ARPANET The RSEXEC regards the composite directory as an approximation (which is usually accurate) to the state of the user's files. The state of a given file is understood to be maintained by the TENEX monitor at the site where the file resides. The RSEXEC is aware that the outcome of any action it initiates involving a remote file depends upon the file's state as determined by the appropriate remote TEKEX monitor, and that the state information in the composite directory may be "out of phase" with the actual state. It is prepared to handle the occasional failure of actions it initiates based on inaccurate information in the composite directory by giving the user an appropriate error message and updating the composite directory. Depending upon the severity of the situation it may choose to change a single entry in the composite directory, reacquire all the information for a component directory, ox rebuild the entire composite directory. The service program for the RSEXEC Each RSEXEC service program has two primary responsibilities: 1. to act on behalf of non-local users (typically RSEXEC programs), and; 2. to maintain information on the status of the other server programs. The status information it maintains has an entry for each Host indicating whether the server program at the Host is up and running, the current system load at the Host, etc. 
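The four internal file types amount to a dispatch rule for where file information is obtained: types 1 and 3 from tables the RSEXEC keeps locally, type 2 from the local TENEX monitor, and type 4 from the appropriate remote server. The Python sketch below illustrates that dispatch; the argument structures, the file name used in the example, and the remote-query callback are placeholders, not the actual RSEXEC interfaces.

    def file_info(name, composite_directory, local_files,
                  device_bindings, query_remote_server):
        # Illustrative dispatch on the RSEXEC's four internal file types.
        if name in composite_directory:             # type 1: composite directory entry
            return composite_directory[name]
        if name in local_files:                     # type 2: local file, ask the local monitor
            return local_files[name]
        device = name.split(":", 1)[0] if ":" in name else None
        if device in device_bindings:               # type 3: reachable via a bound device
            return {"host": device_bindings[device], "device": device}
        return query_remote_server(name)            # type 4: ask the appropriate remote server

    # Example with a hypothetical file name and a stubbed remote query.
    print(file_info("REPORT.TXT", {}, {}, {"LPT": "BBN"},
                    lambda n: {"host": "remote", "name": n}))

Only the last branch incurs a trip through the network, which is why interactive features such as file name recognition and completion are restricted to the first three types.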
Whenever an RSEXEC program needs service from some remote server program it checks the status information maintained by the local server. If the remote server is indicated as up it goes ahead and requests the service; otherwise it does not bother. A major requirement of the server program implementation is that it be resilient to failure. The server should be able to recover gracefully from common error situations and, more important, it should be able to "localize" the effects of those from which it can't. At any given time, the server may simultaneously be acting on behalf of a number of user programs at different Hosts. A malfunctioning or malicious user program should not be able to force termination of the entire service program. Further, it should not be able to adversely effect the quality of service received by the other users. To achieve such resiliency the RSEXEC server program is implemented as a hierarchy of loosely connected, cooperating processes (see Figure 5): 1. The RSSER process is at the root of the hierarchy. Its primary duty is to create and maintain the other processes; 2. REQSER processes are created in response to requests for service. There is one for each non-local user being served. / / / / 161 / / / / / / Figure 5-Hierarchical structure of the RSEXEC service program 3. A ST ASER process maintains status information about the server programs at other sites. Partitioning the server in this way makes it easy to localize the effect of error situations. For example, occurrence of an unrecoverable error in a REQSER process results in service interruption only to the user being serviced by that process: all other REQSER processes can continue to provide service uninterrupted. When service is requested by a non-local program, the RSSER process creates a REQSER process to provide it. The REQSER process responds to requests by the nonlocal program as governed by the protocol. When the nonlocal program signals that it needs no further service, the REQSER process halts and is terminated by RSSER. The STASER process maintains an up-to-date record of the status of the server programs at other Hosts by exchanging status information with the STASER processes at the other Hosts. The most straightforward way to keep up-to-date information would be to have each STASER process periodically "broadcast" its own status to the others. Unfortunately, the current, connectionbased Host-Host protocol of the ARPANET16 forces use of a less elegant mechanism. Each STASER process performs its task by: 1. periodically requesting a status report from each of the other processes, and; 2. sending status information to the other processes as requested. To request a status report from another STASER process, STASER attempts to establish a connection to a "well-known" port maintained in a "listening" state by the other process. If the other process is up and running, the connection attempt succeeds and status information is sent to the requesting process. The reporting process then returns the well-known port to the listening state so that it can respond to requests from other proc- 162 National Computer Conference, 1973 esses. The requesting process uses the status report to update an appropriate status table entry. If the connection attempt does not succeed within a specified time pe.riod, the requesting process records the event as a mIssed report in an appropriate status table entry. 
When the server program at a Host first comes up, the status table is initialized by marking the server programs at the other Hosts as down. After a particular server is marked as down, STASER must collect a number of status reports from it before it can mark the program as up and useful. If, on its way up, the program misses several consecutive reports, its "report count" is zeroed. By requiring a number of status reports from a remote server before marking it as up, STASER is requiring that the remote program has functioned "properly" for a while. As a result, the likelihood that it is in a stable state capable of servicing local RSEXEC programs is increased. STASER is willing to attribute occasionally missed reports as being due to "random" fluctuations in network or Host responses. However, consistent failure of a remote server to report is taken to mean that the program is unusable and results in it being marked as down. Because up-to-date status information is crucial to the operation of the RSEXEC system it is important that failure of the STASER process be infrequent, and that when a failure does occur it is detected and corrected quickly. STASER itself is programmed to cope with common errors. However error situations can arise from which STASER is incapable of recovering. These situations are usually the result of infrequent and unexpected "network" events such as Host-Host protocol violations and lost or garbled messages. (Error detection and control is performed on messages passed between IMPS to insure that messages are not lost or garbled within the IMP subnet: however, there is currently no error control for messages passing over the Host to IMP interface.) For all practical purposes such situations are irreproducible, making their pathology difficult to understand let alone program for. The approach we have taken is to ~ckn?wl edge that we don't know how to prevent such sIt~atI?ns and to try to minimize their effect. When functIOnmg properly the STASER process "reports in" periodicall~. If it fails to report as expected, RSSER assumes that It has malfunctioned and restarts it. Providing the RSEXEC to TIP users The RSEXEC is available as a network executive program to users whose access to the network is by way of a TIP (or other non-TENEX Host) through a standard service program (TIPSER) that runs on TENEX Hosts.* To use the RSEXEC from a TIP a user instructs the TIP to initiate an initial connection protocol exchange with one of the TIPSER programs. TIPSER responds to the * At present TIPSER is run on a regular basis at only one of the TENEX Hosts; we expect several other Hosts will start running it on a ff'gnl:H hH~i~ "hortly. ICP by creating a new process which runs the RSEXEC for the TIP user. CONCLUDING REMARKS Experience with the RSEXEC has shown that it is capable of supporting significant resource sharing among the TENEX Hosts in the ARPANET. It does so in a way that provides users access to resources beyond the boundaries of their local system with a convenience not previously experienced within the ARPANET. As the RSEXEC system evolves, the TENEX Hosts will become more tightly coupled and will approach the goal of a multi-Host TENEX. Part of the process of evolution will be to provide direct support for many RSEXEC features at the level of the TENEX monitor. At present the RSEXEC system is markedly deficient in supporting significant resource sharing among dissimi1ar Hosts. 
True, it provides mini-Hosts, such as TIPs, with a mechanism for accessing a small portion of the resources of the TENEX (and some non-TENEX) Hosts, enabling them to provide their users with an executive program that is well beyond their own limited capacity. Beyond that, however, the system does little more than to support inter-Host user-user interaction between Hosts that choose to implement the appropriate subset of the RSEXEC protocol. There are, of course, limitations to how tightly Hosts with fundamentally different operating systems can be coupled. However, it is clear that the RSEXEC has not yet approached those limitations and that there is room for improvement in this area. The RSEXEC is designed to provide access to the resources within a computer network in a manner that makes the network itself transparent by removing the logical distinction between local and remote. As a result, the user can deal with the network as a single entity rather than a collection of autonomous Hosts. We feel that it will be through systems such as the RSEXEC that users will be able to most effectively exploit the resources of computer networks. ACKNOWLEDGMENTS Appreciation is due to W. R. Sutherland whose leadership and enthusiasm made the RSEXEC project possible. P. R. Johnson actively contributed in the implementation of the RSEXEC. The TENEX Group at BBN deserves recognition for constructing an operating system that made the task of implementing the RSEXEC a pleasant one. REFERENCES 1. Roberts, L. G., Wessler, B. D., "Computer Network Development to Achieve Resource Sharing," Proc. of AFIPS SJCC, 1970, Vol. 36, pp. 543-549. 2. Heart, F. E., Kahn, R. E., Ornstein, S. M., Crowther, W. R., Walden, D. C., "The Interface Message Processor for the ARPA Computer :\'etwork," Proc. of J.FJPS SJCC, 1970, Vol. 36. A Resource Sharing Executive for the ARPANET 3. McQuillan, J. M., Crowther, W. R., Cosell, B. P., Walden, D. C., Heart, F. E., "Improvements in the Design and Performance of the ARPA ~etwork," Proc. of AFIPS FJCC, 1972, Vol. 41, pp. 741-754. 4. Roberts, L. G., "A Forward Look," Signal, Vol. XXV, No. 12, pp. 77 -81, August, 1971. 5. Bobrow, D. G., Burchfiel, J. D., Murphy, D. L., Tomlinson, R. S., "TENEX, a Paged Time Sharing System for the PDP-lO," Communications of the ACM, Vol. 15, No.3, pp. 135-143, March, 1972. 6. TENEX JSYS Manual-A Manual of TENEX Monitor Calls, BBN Computer Science Division, BBN, Cambridge, Mass., November 1971. 7. Murphy, D. L., "Storage Organization and Management in TENEX," Proc. of AFIPS FJCC, 1972, Vol. 41, pp. 23-32. 8. Ornstein, S. M., Heart, F. E., Crowther, W. R., Rising, H. K., Russell, S. B., Michel, A., "The Terminal IMP for the ARPA Computer Network," Proc. of AFIPS SJCC, 1972, Vol. 40, pp. 243254. ~~ Kahn, R. E., "Terminal Access to the ARPA Computer Network," Courant Computer Symposium 3-Computer Networks, Courant Institute, New York, ~ov. 1970. 163 10. Mimno, N. W., Cosell, B. P., Walden, D. C., Butterfield, S. C., Levin, J. B., "Terminal Access to the ARPA ~etwork-Experience and Improvement," Proc. COMPCON '73, Seventh Annual IEEE Computer Society International Conference. 11. Walden, D.C., TIP User's Guide, BBN Report No. 2183, Sept. 1972. Also available from the Network Information Center at Stanford Research Institute, Menlo Park, California, as Document NIC #10916. 12. Postel, J. B., Official Initial Connection Protocol, Available from Network Information Center as Document NIC #7101. 13. Postel, J. 
B., TELNET Protocol, ARPA Network Working Group Request for Comments #358. Available from Network Information Center as Document NIC #9348. 14. Bhushan, A. K., File Transfer Protocol, ARPA Network Working Group Request for Comments #358. Available from Network Information Center as Document NIC #10596. 15. Crocker, S. D., Heafner, J. F., Metcalfe, R. M., Postel, J. B., "Function Oriented Protocols for the ARPA Computer Network," Proc. of AFIPS SJCC, 1972, Vol. 40, pp. 271-279. 16. McKenzie, A., Host/Host Protocol for the ARPA Network. Available from the Network Information Center As Document NIC #8246. Avoiding simulation in simulating computer communication networks* by R. VAN SL YKE, W. CHOU, and H. FRANK Network Analysis Corporation Glen Cove, ~ew York to analysis. In the next three sections we give many illustrations drawn from practice of these ideas. INTRODUCTION Computer communication networks are complex systems often involving human operators, terminals, modems, multiplexers, concentrators, communication channels as well as the computers. Events occur at an extraordinary range of rates. Computer operations take place in microseconds, modem and channel operations take place on the order of milliseconds, while human interaction is on a scale of seconds or minutes, and the mean time between failures of the various equipments may be on the order of days, weeks, or months. Moreover, many systems are being designed and built today which involve thousands of terminals, communication links, and other devices working simultaneously and often asynchronously. Thus, in addition to the ordinary difficulties of simulation, the particular nature of computer communication networks gives rise to special problems. Blind brute force simulations of systems of this degree of complexity usually result in large, unwieldy, and unverifiable simulation programs only understood by the creator (if by him) and/ or statistically insignificant estimates due to unacceptable computational time per sample. In order to obtain useable results careful attention must be paid to determining the key features of the system to be considered before the simulation is designed and approximating, ignoring or otherwise disposing of the unimportant aspects. In this paper, we discuss three approaches we have found useful in doing this. While these ideas may seem obvious especially in retrospect, our experience has been that they have often been overlooked. The first technique takes advantage of situations where the significant events occur infrequently. A common example is in studies involving infrequent data errors or equipment failures. The second idea is the converse of the first and arises from simulations in which the significant events occur most of the time and the rare events are of less importance. The final idea is to make as much use as possible of analytic techniques by hybrid simulations reserving simulation for only those aspects not amenable IMPORTANT RARE EVENTS In a computer communication environment there coexist various events occurring at widely varying rates. Often the activities of interest are events which occur rarely or are the result of composite events consisting of a relatively infrequent sequence of events. This suggests the possibility of ignoring or grossly simplifying the representation of interim insignificant events. For this the simulator has one big advantage over the observer of a real system in that he can "predict" the future. 
In a real operational environment there is no way to predict exactly the next occurrence of a randomly occurring event; in a simulated environment often such an event can be predicted since the appropriate random numbers can all be generated in advance of any calculation (if their distribution is not affected by generating the next random number). Once the future occurrence time is known, the simulated clock can be "advanced" to this time. The intervening activities can either be ignored or handled analytically. Extending this idea further the clock in the usual sense can be dispensed with and time can be measured by the occurrence of the interesting events much as in renewal theory. Example J-Response time simulation of terminalconcentrator complexes In performing response time simulations for terminals connected by a multidrop polled line connected to a concentrator or central computer one can often save large amounts of computer time by avoiding detailed simulation of the polling cycle in periods of no traffic. Unless the traffic load is very heavy, large time intervals (relative to the time required for polling) between two successive messages will occur rather frequently. In a real transmission system, the concentrator just polls terminals during this period; if the simulation does the same, large * This work was supported bv the Advanced Research Projects Agency of the Department of Defense under Contract No. DAHC 15-73-C-0135. 166 166 National Computer Conference, 1973 amounts of computer time may be wasted. In a simulation program, after one message is transmitted, the time the next message is ready to be transmitted can be predicted. Therefore, the program may advance the clock to the time when the next message is ready and determine which terminal should be polled at that time (ignoring the polling that goes on in the interim). For each terminal v. hich is not waiting for a reply message from the concentrator, the program can predict the time when it has an inbound message to send to the concentrator by generating as the inter-message time at the terminal, a random number based on input information. For the concentrator, the program can predict in the same way when it has a reply message to deliver. Among these messages the time the first message is ready to send is determined. The time elements involved in polling one terminal is usually deterministic and known. Therefore, if Tl is the time at the end of the current complete message transaction, and T2 is the time when the next message begins to wait for transmission, the terminal being polled at T2 can be determined. Let this terminal be ID. The concentrator did not start to poll ID at exactly T2 but started at some earlier time Ta. In the simulation program, the clock can be advanced to Ta and no activities between Tl and T2 need be simulated. ID is polled at Ta and the simulation is resumed and continues until the transaction for the next message is completed. INSIGNIFICANT RARE EVENTS In contrast to the situation in the previous section where rare events were of critical importance, often activities resulting from rare events are of little interest in the simulation. These often numerous insignificant rare events can lead to difficulties both in the coding of the simulation and its execution. Firstly, an inordinate amount of coding and debugging time can be spent in representing all the various types of insignificant events. 
On the other hand, in the operation of the system, the rare events often can occur only when more significant events of much higher frequency also occur. Thus, each time such frequently occurring events are realized, one must check for the occurrence of the rare events which can grossly increase the computer time of the simulation. To reduce these undesirable effects, one may either eliminate the insignificant rare events from the simulation completely, or at least simplify the procedures relevant to the insignificant events so as to reduce the coding efforts and shorten running time. Example 2-Detecting and recovering from transmission errors a In using a simulation program to predict terminal response time on a polled multidrop line, the transmission of all messages and their associated control sequence is simulated. For each type of transmission, errors may appear in a variety of different forms, such as no ac- knowledgment to a message transmission, unrecognized message, the loss of one or more crucial characters, and parity errors. For each of them, there may be a different way to react and a different procedure for recovery. In a typical system, there are over forty different combinations of errors and procedures for their treatment. Therefore, it is impractical and undesirable to incorporate every one of the procedures into the program. Even if all the error handling procedures were simulated in the program, there is only limited effect on the response time under normal conditions (i.e., the message transmission error rate and the terminal error rate should be at least better than 10- a) but the program complexity and the computer running time would both be increased by a factor of 10 to 100. With respect to response time, the errors can be easily handled in the program. Whatever the form of error, it must stem from one of the following three sources: a line error during the transmission of the original message; an error caused by a malfunctioning terminal; or a line error during the transmission of an acknowledgment. The error prolongs the response time by introducing the extra delay caused by either the retransmission of the message and its associated supervisory sequences or by a time-out. If the timer is expired and the expected reply or character has not been detected by the concentrator, action is taken to recover from the error. This action may be a retransmission, a request for retransmission, or a termination sequence. The time-out is always longer than the total transmission time of the message and the associated supervisory sequences. Rather than considering the exact form of an error and its associated action, whenever the simulated transmission system detects an error, a delay time equal to the time-out is added to the response time. With these simplifications introduced into the simulation program, the error handling procedures still require a large percentage of overall computer time. When a leased line is initially installed, the errors are very high mainly due to the maladjustment of the hardware. However, when all the bugs are out, the error rate has a very limited effect on the response time. Therefore, for a slight sacrifice in accuracy the program may be run neglecting errors to save computer time. Example 3-Network reliability in the design and expansion of the ARPA computer network (SJCC; 1970), (SJCC; 1972) It was desired to examine the effect on network performance of communication processor and link outages. 10 . 
11 The initial design was based on the deterministic reliability requirement that there exist two node disjoint communication paths between every pair of communication nodes (called Interface Message Processors, IMPs) in the net. A consequence of this is that in order for the network to become disconnected at least two of its elements, communication links or IMPs, must fail. Since the IMPs and communication lines are relatively reliable (availability on the order of .98) disconnections are quite rare, so a simulation model appropriate for the analysis of the communication functions of the network would have to run an enormously long time before a single failure could be expected, let alone two failures. To illustrate the ideas involved, let us consider the simplified problem of determining the fraction of the time, h(p), the network is disconnected given that links are inoperable a fraction p of the time. (We assume for purposes of illustration that IMPs don't fail. Depicted in Figure 1 is a 23 IMP, 28 link version of the ARPA network.) An obvious method of determining h(p) would be to generate a random number for each link i of the network; if the number r_i is less than p the link is removed, otherwise it is left in. After this is done for each link the connectivity of the remaining network is determined. This is done many times and the fraction of the times that the resulting network is disconnected provides an estimate of h(p). Unfortunately, a good part of the time r_i > p for each i, occasionally r_i ≤ p would hold for one i, and only rarely would r_i ≤ p for two or more i; thus, many random numbers would be generated to very little purpose. An effective way to sort out the significant samples is by using a Moore-Shannon expansion of h(p). We have

h(p) = \sum_{k=0}^{m} C(k)\, p^k (1-p)^{m-k}

where m is the number of links and C(k) is the number of distinct disconnected subnetworks obtained from the original network by deleting exactly k links. Clearly, 0 ≤ C(k) ≤ \binom{m}{k}. Thus we have partitioned the set of possible events into those in which 0 links failed, 1 link failed, and so on. In practice, it turns out that only a few of these classes of events are significant; the values of C(k) for the remaining classes are trivially obtained. Since it takes at least n-1 links to connect a network with n nodes, C(k) = \binom{m}{k} for k = m-n+2, ..., m. Similarly, C(0) = C(1) = 0 for the ARPA network because at least 2 links must be removed before the network becomes disconnected. For the network depicted in Figure 1, where m = 28 and n = 23, the only remaining terms which are not immediately available are C(2), C(3), C(4), C(5), and C(6) (see Table I).
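A short sketch may make the use of the expansion concrete. The Python fragment below evaluates h(p) from a table of C(k) values and fills in the classes that the argument above shows to be trivially known; it illustrates only the bookkeeping, is not the authors' program, and all helper names are invented here. The two undetermined counts, C(4) and C(5) for the 23-node, 28-link net, would be supplied by the sampling discussed next.

```python
from math import comb

def h(p, failed_counts, m):
    """Moore-Shannon form: h(p) = sum over k of C(k) * p**k * (1-p)**(m-k),
    where failed_counts[k] = C(k), the number of disconnected subnetworks
    obtained by deleting exactly k of the m links."""
    return sum(c * p**k * (1 - p)**(m - k) for k, c in failed_counts.items())

def trivially_known_counts(m, n, min_cutset=2, enumerated=None, tree_count=None):
    """Fill in the classes of the expansion that need no sampling (a sketch only).

    min_cutset -- fewer failed links than this cannot disconnect the net, so C(k) = 0
    enumerated -- counts found by direct enumeration, e.g. {2: 30, 3: 827} from Table I
    tree_count -- number of spanning trees; deleting m-n+1 links leaves exactly n-1
                  links, and every non-tree among those subnetworks is disconnected
    """
    counts = {k: 0 for k in range(min_cutset)}
    counts.update(enumerated or {})
    if tree_count is not None:
        counts[m - n + 1] = comb(m, m - n + 1) - tree_count
    for k in range(m - n + 2, m + 1):        # too few links left to connect n nodes
        counts[k] = comb(m, k)
    return counts

# For the net of Figure 1, C(4) and C(5) would still have to be added to this
# table from the stratified sampling described in the text that follows.
```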
C(2) and C(3) can be obtained rather quickly by enumeration, and the number of trees is obtainable by formula, giving C(6), thus leaving only C(4) and C(5) undetermined. These can be obtained by sampling; in general, by stratified sampling. Thus we have not only been able to dispose of the frequently occurring but unimportant events corresponding to C(0) and C(1), but also of the rare and unimportant events corresponding to C(7) through C(28).

Figure 1-Network for reliability analysis

TABLE I-Exactly Known C(k) for 23 Node 28 Link ARPA Net

Links failed (k)   Links operative   Number of nets   Number of failed nets   Method
 0                 28                        1                  0             d
 1                 27                       28                  0             d
 2                 26                      378                 30             c
 3                 25                     3276                827             c
 4                 24                    20475                  ?             ?
 5                 23                    98280                  ?             ?
 6                 22                   376740             349618             b
 7                 21                  1184040            1184040             a
 8                 20                  3108105            3108105             a
 9                 19                  6906900            6906900             a
10                 18                 13123110           13123110             a
11                 17                 21474180           21474180             a
12                 16                 30421755           30421755             a
13                 15                 37442160           37442160             a
14                 14                 40116600           40116600             a
15                 13                 37442160           37442160             a
16                 12                 30421755           30421755             a
17                 11                 21474180           21474180             a
18                 10                 13123110           13123110             a
19                  9                  6906900            6906900             a
20                  8                  3108105            3108105             a
21                  7                  1184040            1184040             a
22                  6                   376740             376740             a
23                  5                    98280              98280             a
24                  4                    20475              20475             a
25                  3                     3276               3276             a
26                  2                      378                378             a
27                  1                       28                 28             a
28                  0                        1                  1             a

Notes: a-not enough links to connect 23 nodes; b-number of trees calculated by formula; c-enumerated; d-fewer failed links than minimum cutset.

HYBRID SIMULATIONS

Hybrid simulations refer to models which involve both analytic techniques and simulation techniques. Since analytic techniques are often more accurate and faster than simulation, it is usually worth the effort to model as much of the system as possible analytically. A complete computer communication system is in general a composite system of many complicated subsystems, involving the interaction of a variety of computer subsystems, terminal subsystems and functionally independent transmission subsystems. Many of these systems are complicated and not well defined. For example, there is no simple macroscopic characterization of the hardware and software in a computer system, or of how the hardware and software interact with each other while dealing with message traffic. It is therefore, practically speaking, impossible to simulate a whole system without using some analytic representations or employing some approximations. In place of simulation, the functioning of a subsystem can often be represented by an empirical model. In deriving analytic models, however, simplification is always necessary to make the formulation manageable. For example, message flow is often assumed to follow a Poisson pattern, and the message length distribution is sometimes approximated by an exponential distribution. Analytic approaches can give quite acceptable results if the problem is properly modeled.

Example 4-Throughput analysis of distributed networks

In the topological design of distributed computer networks such as the ARPA network a rapid method for analyzing the throughput capacity of a design under consideration is essential. Analyzing the traffic capacity in detail is a formidable task, involving modeling the traffic statistics, queuing at the IMPs, the routing doctrine, error control procedures, and the like for 30 or 40 IMPs and a comparable number of communication links. In order to examine routing methods, Teitelman and Kahn9 developed a detailed simulation model with realistic traffic routing and metering strategy capable of simulating small versions of the ARPA Network.
Since, in the design of the topology, traffic routing is performed for hundreds of possible configurations, such a simulation is impractical so that a deterministic method of traffic analysis was developed! requiring orders of magnitude less in computing times and thus allowing its repeated use. The two models were developed independently. Based on a 10 IMP version of the ARPA Network shown in Figure 2, average node-to-node delay time is plotted versus network traffic volume in Figure 3. The curve is from the deterministic analytic model while the x's are the results of simulation. The analytic results are more conservative than the simulation results due to the simplification introduced in the analytic model but the results are strikingly close. 6 The topological design is chosen with respect to some predicted traffic distribution from the computers in the network. Since such predictions are highly arbitrary, it is Figure 2--~etwork for throughput analysis Figure 3-Throughput for network shown in Figure 2 necessary to determine how robust the design is with respect to errors in the prediction of traffic. This was done by choosing the average traffic levels at each computer randomly4 and using the analytic throughput analyzer to evaluate the network for varying input traffic patterns. Thus, the availability of a quick throughput analyzer allowed a more macroscopic simulation than would have been computationally feasible if the throughput would have had to be calculated by simulation. Example 5- Time spent at a concentrator 3 In the study of a general centralized computer communication network, one of the most difficult tasks is to estimate accurately the time span between an inbound message's arrival at the concentrator and the time the reply is ready for transmission back to the terminal which generated the inbound message. The system we model for this example has several concentrators and one central computer. Several multidrop lines, called regional lines, are connected to a concentrator, called a regional concentrator terminal (RCT). Each RCT is connected to the central computer system (CPS) via a high speed trunk line. After an inbound message reaches the RCT it undergoes the following processes before the reply is transmitted back to the terminal: (1) waiting for processing by the RCT, (2) processing by the RCT, (3) waiting for access to the inbound trunk (from RCT to CPS), (4) transmitting on the inbound trunk, (5) waiting for processing by the CPS, (6) CPS processing the inbound message to obtain the reply, (7) reply waiting for access to the outbound trunk, (8) transmitting on the outbound trunk, (9) waiting for RCT to process, (10) processing by the RCT, (11) waiting for access to the regional line. Avoiding Simulation in Simulating Computer Communication Networks In our application, the RCT is under-utilized while the CPS is highly utilized. Items (1), (2), (9) and (10) are times relevant to the RCT and are negligibly small. Items (4) and (8) are transmission times and can be obtained by dividing message length by the line speed. The network control procedure in our model does not allow a second message to be delivered if the reply to the first has not been returned. Therefore, there is no waiting for the reply to access the regional line, and item (11) is zero. A combination of an analytic model and an analytic function is used to obtain item (3), as shown below. An empirical distribution is used for the combination of items (5) and (6). 
An analytic function is used for determining item (7). When an inbound message arrives at the RCT, it is processed and queued at the output buffer for transmission to the CPS. The waiting time for access to the inbound trunk line from the RCT to the CPS depends on the traffic load of the other regional lines connected to the same RCT. To determine how long the message must wait, the number of regional lines having a message waiting at the RCT for transmission to the CPS must be determined. We assume that the average transaction rate (which is the sum of the transaction rates of all terminals connected to the same RCT) on the trunk connecting the RCT to the CPS is known, and that the message arrivals to the trunk have a Poisson distribution and their lengths have an exponential distribution. Then the probability of N or more messages is ρ^N, where ρ is the trunk utilization factor, i.e., the ratio of average total traffic on the inbound trunk to the trunk speed. A random number with this distribution can be generated from one uniformly distributed random number by inverting the cumulative distribution function. These messages include those waiting at terminals and those at the RCT. A random number for each of these inbound messages is generated to determine which regional line the message is from. The number of regional lines having inbound messages in waiting is thus determined. The network flow control procedure in our model allows no more than one inbound message from each of these regional lines at the RCT. Therefore, the number of non-empty buffers is no more than the number of lines having inbound messages in waiting. Conservatively, the former number is set equal to the latter. The waiting for access to the trunk is then equal to the number so obtained multiplied by the time required to transmit one average inbound message from the RCT to the CPS. The time interval between the end of the transmission of an inbound message to the CPS and the time the reply has returned to the RCT depends on the CPS occupancy and the trunk utilization. The CPS occupancy is a direct consequence of the transaction rate of the whole system. The trunk utilization is a direct result of the transaction rate input from all the regional lines connected to the RCT. In our model the time waiting for processing at the CPS and the processing time for each message are given as an empirical distribution. With the help of a random number generator, the time spent at the CPS can be determined. The time a reply or an outbound message spends waiting for the trunk is conservatively estimated by using the following analytic formula:

P(\text{waiting time} > t) = ρ\, e^{-(1-ρ)t/\mathrm{AVS}}

where ρ is the trunk utilization factor and AVS is the average time required to transmit an outbound message.8 CONCLUSION Computer communication networks can be quite large and immensely complex, with events occurring on a vast time scale. Rarely can all aspects of such a system be simulated at once for any but the most trivially small and simple systems. Therefore, simulations must be designed for relatively restricted purposes and careful attention must be paid to ways and means of simplifying and ignoring factors not directly pertinent to the purposes.
Here we have suggested three ways to avoid unnecessary simulation: (1) by not simulating in detail insignificant events which occur at a much higher rate than the significant ones, (2) by ignoring rare events of little practical significance and (3) by using hybrid simulations with extensive use of analytic modeling where applicable. A number of examples drawn from practice illustrate the application of these ideas to computer communication systems. REFERENCES 1. Chou, W., Frank, H., "Routing Strategies for Computer Network Design," Proc. of the Symposium on Computer-Communications Networks and Teletraffic, Polytechnic Institute of Brooklyn, 1972. 2. Frank, H., Chou, W., "Routing in Computer Networks," Networks, 1. No.2 pp. 99-112, 1971. 3. Frank H., Chou W., "Response Time/Capacity Analysis of a Computer-Communications Network," Infotech State of the Art Reports: Network Systems and Software, Infotech, Maidenhead, Berkshire, England, 1973. 4. Frank, H., Chou, W., "Properties of the ARPA Computer Network," to be published, 1973. 5. Frank, H., Chou, W., Frisch, I. T., "Topological Considerations in the Design of the ARPA Computer Network," &lCC Conf Rec. 36, pp. 581-587 (1970). 6. Frank H., Kahn, R., Kleinrock, L., "Computer Communication Network Design-Experience with Theory and Practice," Proceedings of the Spring Joint Computer Cont., AFIPS Press, 1972. 7. Kershenbaum, A., Van S!yke, R. M., "Recursive Analysis of Network Reliability," Networks, Jan. 1973. 8. Saaty, T. L., Elements of Queueing Theory, McGraw-Hill, New York 1961. 9. Teitelman, W., Kahn, R. E., "A Network Simulation and Display Program," Third Princeton Conference on Information Sciences and Systems, 1961. lO. Van Slyke, R. M., Frank, H., "Network Reliability Analysis-I" Networks, 1, No.3, 1972. 11. Van Slyke, R M., Frank, R., "Reliability in Computer-Communications Networks," Proc. of 1971 ACM Winter Simulation Conference. An implementation of a data management system on an associative processor by RICHARD MOULDER Goodyear Aerospace Corporation Akron, Ohio der of this paper will describe the hardware configuration of the- author's facility, the data stor-age-scheme-,--the search technique, the user oriented data definition and manipulation languages, and some of the benefits and problems encountered in utilizing an Associative Processor for DBMS. INTRODUCTION Recent years have witnessed a widespread and intensive effort to develop systems to store, maintain, and rapidly access data bases of remarkably varied size and type. Such systems are variously referred to as Data Base Management Systems (DBMS), Information Retrieval Systems, Management Information Systems and other similar titles. To a large extent the burden of developing such systems has fallen on the computer industry. The problem of providing devices on which data bases can be stored has been reasonably well solved by disc and drum systems developed for the purpose and commercially available at the present time. The dual problem of providing both rapid query and easy update procedures has proved to be more vexing. In what might be called the conventional approach to DBMS development, sequential processors are employed and large, complex software systems developed to implement requisite data processing functions. The results are often disappointing. A promising new approach has been provided by the appearance of the Associative Processor (AP). 
This new computer resource provides a true hardware realization of content addressability, unprecedented I/O capability, and seems ideally suited to data processing operations encountered in data management systems. Many papers have been written about the use of associative processors in information retrieval and in particular about their ability to handle data management problems. l To the best of the author's knowledge no actual system has been previously implemented on an associative processor. This paper will detail the author's experience to date in implementing a data management system on an associative processor. It should be noted that the data management system to be described in the following pages is not intended as a marketable software package. It is research oriented, its design and development being motivated by the desire to demonstrate the applicability of associative processing in data management and to develop a versatile, economical data base management system concept and methodology exploiting the potential of an associative processor and a special head per track disc. The remain- HARDWARE CONFIGURATION This section will describe the computer facility employed by the author and available to Goodyear Aerospace Corporation customers. The STARAN* Evaluation and Training Facility (SETF) is a spacious, modern, well equipped facility organized around a four-array STARAN computer. Of particular significance to DBMS efforts is the mating of STARAN to a parallel head per track disc (PHD). The disc is connected for parallel I/O through STARAN's Custom Input/Output Unit (ClOU). Of the 72 parallel channels available on the disc, 64 are tied to STARAN. The switching capability of the CIOU allows time sharing of the 64 disc channels within and between the STARAN arrays. (The SETF STARAN is a 4 array machine, each array containing 256 words of 256 bits each.) The switching provision allows simulations of AP /PHD systems in which the number of PHD channels are selectable in multiples of 64 up to 1024 total parallel channels. In order to provide for rather general information handling applications and demonstrations, the STARAN is integrated, via hardware and software, with an XDS Sigma 5 general purpose computer, which in turn is integrated with an EAI 7800 analog computer. Both the STARAN and the Sigma 5 are connected to a full complement of peripherals. The Sigma 5 peripherals include a sophisticated, high-speed graphics display unit well suited for real time, hands-on exercising of the AP / PHD DBMS. A sketch of the SETF is given in Figure l. The Custom Input/Output Unit (CIOU) is composed of two basic sections. One section provides the communication between the Sigma 5 and the AP. This communication is accomplished by the Direct Memory Access capa* TM. Goodyear Aerospace Corporation, Akron, Ohio 44315 171 172 National Computer Conference, 1973 STAr-AN 5-1000 LINE[ PRINTER i Figure l-STARAN evaluation and training facility bility of the Sigma 5 computer. The second section of the CIOU is composed of the Parallel Input/ Output Unit (PIOU). This unit interfaces the AP with the PHD. As previously mentioned the PHD is composed of 72 tracks of which 64 tracks are tied to STARAN. The disc is composed of one surface which is sub-divided into 384 sectors consisting of 64 tracks, each track having a bit capacity of 256 bits per sector. The time per revolution for the disc is approximately 39 msec. 
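For later reference, the per-sector timing that drives the search technique described below follows directly from these figures, assuming the 384 sectors pass under the heads uniformly during one revolution:

39 msec per revolution / 384 sectors ≈ 0.102 msec ≈ 100 μsec per sector.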
It should be noted that Goodyear Aerospace Corporation did not manufacture this disc and that there are several manufacturers providing parallel head per track devices. Given this hardware configuration, a data storage scheme was developed. DATA STORAGE SCHEME A hierarchical data structure was chosen for our initial effort since it is probably the structure most widely used for defining a data base. In order to utilize the parallel search capabilities of the AP and the parallel communication ability of the AP /PHD system, it was decided to reduce the hierarchical structure to a single level data base. The method used was similar to the one suggested by DeFiore, Stillman, and Berra. I In this method each level of the hierarchy is considered a unique record type. Associated with each record type are level codes indicating the parentage of the particular record. The association of level numbers with record type is purely logical and does not imply or require a correspondingly structured data storage scheme. It should be noted that each record contains a level code for each of the preceding levels in the hierarchical structure. The different records are stored on the PHD in a random fashion. No tables or inverted files are introduced. Only the basic data file is stored on the disc. Since we have an unordered and unstructured data base, it is necessary to search the entire data base in order to respond to a query. The searching of the entire data base presents no significant time delays because of the parallel nature of both the AP and the PHD. This data storage scheme was selected because it provides an easy and efficient means of updating the dHtH hHse dl1P to the lHck of multiplicity of the data. It should be emphasized that this scheme is not necessarily the best approach for all data bases. It appears to be well suited for small to moderately sized data bases having hierarchical structure with few levels. Data bases which are to be queried across a wide spectrum of data elements are especially well suited for this type of organization since all data elements can participate in a search with no increase in storage for inverted lists or other indexing schemes. Since there is no ordering of the data base and no multiplicity of data values, updating can be accomplished very rapidly with a minimum amount of software. SAMPLE DATA BASE The data base selected for the AP jPHD DBMS development program is a subset of a larger data base used by the SACCS/DMS and contains approximately 400.000 characters. It is a four-level hierarchical structure composed of command, unit, sortie, and option records. A tree diagram of the selected data base is shown in Figure 2. Figure 3 gives an example of an option record. Definitions for the option record are as follows: (1) Record Type-indicates the type of record by means of the level number; also indicates the word number of this record. If a logical record requires more than one array word, these words will be stored consecutively with each word being given a word number. (2) Level Codes-these codes refer to the ancestry of the particular record. They consist of an ordinal number. (3) Data Item Names-These are elements or fields of one particular record type. Each field has a value. There may not be any two or more fields with the same name. (4) Working Area-This is that portion of an array word not being used by the data record. 
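To make the flattening concrete, here is a minimal sketch (in Python, for illustration) of how a four-level command/unit/sortie/option hierarchy reduces to single-level records carrying ordinal level codes for their ancestry. The field layout and helper names are assumptions of the sketch and do not reproduce the actual STARAN word format.

```python
from dataclasses import dataclass, field

@dataclass
class FlatRecord:
    """One fixed-format record of the single-level data base.

    level_codes holds one ordinal per ancestor level, so an option record
    carries (command, unit, sortie, option) codes while a command record
    carries only its own.
    """
    record_type: str                   # "command", "unit", "sortie" or "option"
    level_codes: tuple                 # ancestry, e.g. (1, 3, 12, 6)
    values: dict = field(default_factory=dict)   # data item name -> value

def flatten(command_tree):
    """Walk a nested command/unit/sortie/option structure and emit flat records."""
    records = []
    for c, command in enumerate(command_tree, 1):
        records.append(FlatRecord("command", (c,), command["values"]))
        for u, unit in enumerate(command.get("units", []), 1):
            records.append(FlatRecord("unit", (c, u), unit["values"]))
            for s, sortie in enumerate(unit.get("sorties", []), 1):
                records.append(FlatRecord("sortie", (c, u, s), sortie["values"]))
                for o, option in enumerate(sortie.get("options", []), 1):
                    records.append(FlatRecord("option", (c, u, s, o), option["values"]))
    return records
```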
Figure 2-Tree diagram of selected data base

Figure 3-Option record layout

SEARCH TECHNIQUE

In order to search the entire data base, the data will be arranged in sectors consisting of 64 tracks. Each track will be composed of 256 bits. The STARAN Associative Processor employs 256 bit words, each associative array being composed of 256 such words. A PHD sector will therefore contain one fourth of an array load. In our current system we are utilizing the first 64 words of the first array for data storage. With the addition of more tracks on the PHD we could utilize a greater portion of an array with no additional execution time due to the parallel nature of our input. (The AP can of course time share the 64 tracks currently available in order to simulate larger PHD's.) With this technique it is possible to read in a sector, perform the required search, then read in another sector, perform a search, continuing in this fashion until the entire data base is searched. For our current PHD, the time for a sector to pass under the read/write heads is approximately 100 μsec. Preliminary studies have shown that complex or multiple searches can be performed in this time span. Utilizing this fact it should be possible to read every other sector on one pass of our disc. With two passes the entire data base will have been searched. This assumes that the entire data base is resident on one surface. Additional surfaces would increase this time by a factor equal to the number of surfaces. Figure 4 shows an associative processor array and the every-other-sector scheme employed when searching the PHD. It should be emphasized that, due to the hierarchical structure of the data base, more than one data base search will be required to satisfy a user's request, and that the time to search and output from the associative array is variable. In order to minimize the time to search the disc when 100 μsec is not sufficient time to process an array, a bit map of searched sectors is kept. This bit map will allow minimization of rotational delay.

DATA BASE MANAGEMENT SYSTEM

Data definition module

This module defines a file definition table. This table relates the data structure as the user views it to a data storage scheme needed by the software routines. Every data base will have only one file definition table. This table will contain the following information.
(1) File name-the name of the particular data base (2) Record name-the name of each logical record type (3) Data items-the name of each attribute in a given logical record type (4) Synonym name-an abbreviated record or attribute name (5) Level number-a number indicating the level of the record (6) Continuation number-this number is associated with each attribute and indicates which AP word of a multi-word record contains this attribute (7) Starting bit position-the bit position in an AP word where a particular attribute starts (8) Number of bits-the number of bits occupied by an attribute in an AP word (9) Conversion code-a code indicating the conversion type: EBCDIC or binary The language used to define the data definition table is called the Data Description Language (DDL). The DDL used in our system has been modeled after IBM's Generalized Information System DDL. 255 Bits Per Sector P::.rallel Rl'a::!'Write ElectrOnics lr~ ui~ ASSOCIA rp.:r: FnCCESSCR The DBMS software is composed of four modules: the Data Definition, File Create, Interrogation and Update modules. Since our DBMS is intended for research and not a software product package, we have not incorporated all the features that would be found in a generalized DBMS. Our system will provide the means to create, interrogate and update a data base. The following discussion will briefly describe each module. File create module Once the file definition table has been created, the file create module is run. This module populates a storage medium with the data values of a particular data base. Interrogation module D~SC 100 ,usee to read a sector Figure 4-AP! PHD tie-in This module is the primary means for interrogating the data base. Since our DBMS is intended to demonstrate the applicability of Associative Processors in DBMS, we 174 National Computer Conference, 1973 have developed a user oriented language. This Data Manipulation Language (DML) is modeled after SDC's TDMS system. It is conversational in nature and requires no programming experience for its utilization. At present there is only one option in the interrogation module implemented in our DBMS. This option is the print option. The print option is used to query the data base. The user is able to query the data base using any data item(s) as his selection criteria. The form of a query follows: Print Name(s) Where Conditional Expression(s) • Name(s)-Any data item name or synonym • Conditional Expression (CE)-Selection criteria for searches The form of the CE is: Data Item Name Relational Operator Value Relational Operators EQ-Equal to NE-Not equal to LT - Less than LE-Less than or equal to GT -Greater than GE-Greater than or equal to Logical Operators (LO )-link conditional expressions together: CE Logical Operator CE Logical operators • AND-Logical And • OR-Inclusive Or At present the output from a query will be in a fixed format. It is anticipated that a formatting module will be added to the system. An example of the print option and its output are shown below. Query Print Unit Name, Sortie Number where Option Number EQ 6 and Weapon Type NE Mark82 and Country Code EQ USA * Output Unit Name-303BW Sortie Number-1876 Update module This module performs all the updating of the data base. There are four options available to the user. These are change, delete, add, and move. These options are described helow. Change option The change option performs all edits to the data base. All records satisfying the change selection criteria are updated. 
The form of the change option follows: Change Name to Value Where Conditional Expression(s) • N ame-Any data item name or synonym • Value-Any valid data value associated with the above name; this is the new value • Conditional Expression-same as the print option . An example of the change option follows: Change Unit Name to 308BW Where Unit Number EQ 1878 And Option Number GE 6* It should be noted that all records that satisfy the unit number and option number criteria will have their unit names changed to 308BW. Delete option This option will delete all records that satisfy the conditional expression. All subordinate records associated with the deleted records will also be deleted. The form and an example of the delete option follows: Delete Record Name Where Conditional Expression(s) Record Name-any valid record name (e.g. command, unit, sortie, option) Conditional Expression(s)-same as print option Example: Delete Sortie Where Sortie Number EQ 7781 * In this example all the sortie records with sortie number equal 7781 will be deleted. In addition all option records having a deleted sortie record as its parent will also be deleted. Add option This option adds new records to the data base. Before a new record -is added to the data base, a search will be initiated to determine if this record already exists. If the record exists a message will be printed on the graphic display unit and the add operation will not occur. The form of the add option follows. Add Record Name to Descriptor Where Description of Data • Record Name-any valid record name (i.e. command, unit, sortie, option) • Descriptor-special form of the conditional expres!';ion in whirh 'EQ' is the only allowed relational Implementation of a Data Management System operator; the descriptor describes the parentage of the record being added to the data base. • Description of Data-This item is used to define the data base; the form of this item is the same as a conditional expression with the requirement that 'EQ' is the only allowed relational operator. An example of the add option follows: Add Unit To Name EQ 8AF Where Unit Name EQ 302BW And Unit Number EQ 1682 And Aircraft Possessed EQ 85* In this example the command record for the 8AF must be in the data base before this unit record can be added to the data base. In the case of a command record being added to the data base, the descriptor field is omitted from the above form. Move option This option allows the user to restructure the data base. The move option will change the level codes of the affected records. The affected records are those satisfying the conditional expression, in the move command, as well as all subordinate records. The move option has the following form: Move Record Name to Descriptor Where Conditional Expression • Record Name-any record name • Descriptor-this item describes the new parentage of the record that is to be moved; the form is the same as the add option • Conditional Expression-same as the print option An example of the move option follows: Move Sortie to Name EQ 12AF and Unit Number EQ 2201 Where Sortie Number EQ 41 and Option Number GT 12* In this example all sortie records which have a sortie number equal to 41 and an option record with option number greater than 12 will have its parentage (i.e., level codes) changed to the 12AF and 2201 unit. Also included in this restructuring will be all option records which have the changed sortie record as a parent. 
Status Currently the Data Definition and the Create Module are operational and our sample data base has been created and stored on the PHD. A simplified version of the Interrogation Module is running and performing queries using the AP and PHD. In this version the translator for the query is executed on the Sigma 5 and a task list is 175 constructed. This task list is transmitted to the AP where the desired searches are performed and the output of the searches are transmitted to the Sigma 5 via Direct Memory Access. In the simplified version of the Interrogation Module no optimation of the software has been attempted. At the moment the number of disc sectors skipped between reads during a search of the data base is governed by the size of the task list and is not modified by the actual time available to process the task list. No sector bit map has been implemented. It is anticipated that future versions of the Interrogation Module will be optimized and a bit map will be introduced into the system. The change and delete options of the Update Module are also operational. The search routines currently employed in the Update Module are the same routines found in the Interrogation Module. With a continuing effort, the attainment of our design goals and the completion of our research data base management system should be realized. Further progress reports will be issued as our development efforts are continued. CONCLUSION Preliminary results indicate that an AP working in conjunction with a sequential computer affords the best configuration. With this marriage comes the best of two computer worlds, each performing what it is best capable of doing. With the ability to rapidly search the entire data base we have provided the user extreme flexibility in constructuring his search criteria. We have provided this with no additional storage for inverted files and no sacrifice in update time. Associative processing brings to data base management designers and users the ability to query and update data bases in a fast and efficient manner with a minimum amount of software. With proper programming of AP's, multi-queries can be processed during a single array load, thus greatly increasing the throughput of the system. The future holds great promise for associative processors and we are striving to lead the way. Many questions must be answered such as: (1) how to use conventional storage devices in combination with PHD's, (2) what hardware requirements must be provided for a minimum configured AP; (3) the role of the sequential and AP computers in a hybrid configuration, and (4) how much software is required to provide a data management capability on AP's. In the near future these problems and others will have answers. Subsequent reports will elaborate on the performance of the system. Our work to date has sh~wn that associative processors truly provide an attractive alternative to conventional approaches to data retrieval and manipulation. 176 National Computer Conference, 1973 ACKNOWLEDGMENTS The help of P. A. Gilmore, E. Lacy, C. Bruno, J. Ernst and others at Goodyear Aerospace is gratefully acknowledged. REFERENCES 1. DeFiore, C. R., Stillman, N. J., Berra, P. B., "Associative Tech- niques in the Solution of Data Management Problems" Proceedings of ACM pp. 28-36, 1971. Aircraft conflict detection in an associative processor by H. R. DOWNS Systems Control, Inc. Palo Alto, California PROBLEM If MD is less than a specified criterion the aircraft are assumed to be in danger of conflict. 
More complicated algorithms are also under study. These involve altitude;turning rates, etc. DESCRIPTIO~ A major problem in Air Traffic Control (ATC) is detecting when two aircraft are on a potential collision course soon enough to take some corrective action. Many algorithms are being developed which may lead to automating the process of conflict detection. However, these algorithms typically require large amounts of computing resource if they are to be performed in real-time. This paper describes some techniques which may be used by an associative processor to perform the conflict detection operation. Associative processors An associative processor is a processor which may access its memory by 'content' rather than by address. That is, a 'key' register containing some specific set of bits is compared with a field in each word of memory and when a match occurs, the memory word is 'accessed' (or flagged for later use). This type of memory is typically implemented by having some logic in each memory word which performs a bit-serial comparison of the 'key' with the selected field. 1 Many associative processors have enough logic associated with each memory word to perform inequality compares (greater or less than) and some arithmetic operations. In others, memory words may be turned on or off; that is, some words may not be active during a particular comparison. This associative processor may be viewed as a parallel-array computer where each word of memory is a processing element with its own memory and the 'key register' is contained in a control unit which decodes instructions and passes them to the processing elements. Some examples of this type of computer are the Sander's OMEN,2 the Goodyear Aerospace STARAN,3 and the Texas Instruments SIMDA. Some additional capabilities which these computers may have are: (1) the ability to compare operands with a different key in each processing element (both compare operands are stored in one 'word') and (2) some form of direct communication between processing elements. Typically, each processing element can simultaneously pass data to its nearest neighbor. More complex permutations of data between processing elements are also possible. These processors are often referred to as associative array processors, though parallel-array is perhaps more descriptive. Conflict detection In a terminal ATC environment, each aircraft in the area will be tracked by a computer which contains a state estimate (position and velocity) for each aircraft within the 'field of view' of the radar or beacon. The information available on each aircraft may vary (e.g., elevation may be available for beacon-equipped aircraft) but some form of estimate will be kept for each aircraft under consideration. The conflict detection algorithm typically considers each pair of aircraft and uses their state estimates to determine whether a conflict may occur. The algorithm may perform one or more coarse screening processes to restrict the set of pairs of aircraft considered to those pairs where the aircraft are 'near' each other and then perform a more precise computation to determine whether a conflict is likely. The algorithm is typically concerned with some rather short interval of time in the future (about one minute) and it must allow for various uncertainties such as the state estimate uncertainty and the possible aircraft maneuvers during this time interval. As an example of the kind of algorithms being considered for detection of conflicts between a pair of aircraft. 
a simple algorithm will be described. The relative distance at the initial time, DR, is computed. Then, the scalar quantity relative speed, SR, is computed for the initial time. If T is the iook-ahead time, then the estimated miss distance MD is SOLUTIO~ APPROACHES This section presents several ways to solve the conflict detection problem with the use of an associative processor 177 178 National Computer Conference, 1973 and provides some estimate of the order of each approach. (The order is a measure of how the computation time increases as the number of aircraft increases.) Straightforward associative processing approach The data for each aircraft under consideration can be placed in a separate processing element of the associative processor and each aircraft can be simultaneously compared with the remaining aircraft. This technique requires each aircraft to be compared with all other aircraft. If there are n aircraft under consideration there must be at least n processing elements (or associative memory words). The data on a single aircraft is placed in the key register and compared with the appropriate field in each processing element. Each compare operation in this algorithm is actually fairly complex, involving some arithmetic computations before the comparison can be made. This approach compares every aircraft with every other aircraft (actually twice) and appears to use the computing resources of the associative array rather inefficiently. In fact, the entire algorithm must be executed n times. Since n comparisons are made each time the algorithm is executed, a total of n 2 comparisons are made. (There are n(n -1) / 2 unique paris of aircraft to be considered.) It is more efficient to consider the associative processor for performing some sort of coarse screening procedure, and for all pairs which the AP determines may conflict to be passed to a sequential processor for a more detailed check. This assumes that the sequential processor can execute the complicated algorithm more quickly than a single processing element of the AP and that the AP can significantly reduce the number of pairs under consideration. An efficient algorithm for a serial processor One approach which has been suggested for solving this problem is to divide the area under consideration into boxes and to determine whether two aircraft are in the same or neighboring boxes. If they are then that pair of aircraft is checked more carefully for a possible conflict. This process is an efficient method for decreasing the number of pairs of aircraft to be checked in detail for a possible conflict. These boxes help to 'sort out' the aircraft according to their positions in the sky. The boxes can be described in a horizontal plane and may have dimensions of 1 or 2 miles on a side. An aircraft can be assigned to a box by determining a median position estimate for some time and choosing the box which contains this estimate. If the aircraft turns or changes velocity, then it may not arrive at the expected point at the time indicated but it cannot be very far from this point. For terminal area speeds and turning rates, an aircraft will typically not be more than about one box width away from the expected location. To check for conflicts, it is necessary to check possible conflicts in the same box and in boxes which are adjacent or near to the given box. 
If all aircraft in each box are compared with all other aircraft in this box and with all aircraft in boxes to the north, east and northeast, then all possible pairs are checked. By symmetry, it is not necessary to check boxes to the west, south, etc. The exact size of the boxes and the number of boxes checked for possible conflicts are determined by the aircraft speeds and measurement uncertainties. It is desirable to make them large enough so that only neighbors one or two boxes away need to be checked and yet small enough so that the number of aircraft per box is usually one or zero. Note that this scheme does not check the altitude. If aircraft densities increased sufficiently and the altitude were generally available, then it might be desirable to form three-dimensional boxes, accounting for altitude.

Sort boxes on an associative processor

A similar operation can be performed on an associative processor by assigning each box to a single processing element. If the boxes are numbered consecutively in rows and consecutive boxes are assigned to consecutive processing elements, then adjacent boxes can be compared by translating the contents of all processing elements by an appropriate amount. For a square area containing 32×32 (=1024) boxes, the numbering scheme shown in Figure 1 will suffice. In this numbering scheme, all aircraft in Box 1 are compared with the aircraft in Box 2, Box 33 and Box 34. Similarly, all aircraft in Box 2 are compared with the aircraft in Box 3, Box 34, and Box 35. When these comparisons have been completed, all possible pairs within a box and in neighboring boxes will be detected. If only one aircraft were in each box, then each aircraft data set would be stored in the appropriate processing element and comparisons with neighboring boxes would be carried out by transferring a copy of each data set along the arrows shown in Figure 2 and comparing the original data set with the copy. When these neighboring boxes have been checked, then all possible conflicting pairs have been discovered.

Figure 1-Boxes in 32×32 surveillance area

When there is more than one aircraft in a box, then additional processing is required. The first aircraft which belongs in a box is stored in the proper processing element and a flag (bit) is set indicating this is its proper box. Any additional aircraft are placed in any available processing element. A pointer is set up linking all aircraft belonging in a specific box. In an associative processor, all that is necessary is to store the I.D. of the next aircraft in the chain in a particular processing element. The contents of various fields of a word and the links to the next word are shown in Figure 3. When the data is stored as indicated above, then every aircraft in box i can be retrieved by starting with processing element i. If the I.D. in Field B is not zero, then the next aircraft data set is located by performing an equality test using Field B from processing element i in the key register and comparing with Field A in all processing elements.
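A conventional-machine sketch may make the numbering concrete. The grid size and the +1, +32, +33 neighbor offsets follow the Figure 1 scheme above; the box side of 2 miles, the dictionary-of-boxes representation and the function names are illustrative assumptions standing in for the processing-element translations of the associative array:

from collections import defaultdict

GRID = 32                                  # 32 x 32 = 1024 boxes, numbered row by row
BOX_SIZE = 2.0                             # box side in miles (the text allows 1 or 2)
NEIGHBOR_OFFSETS = (0, 1, GRID, GRID + 1)  # same box, east, next row, diagonal

def box_number(x, y):
    # Map a median position estimate onto the consecutive numbering of Figure 1.
    # Positions are assumed to lie inside the surveillance square; edge boxes
    # are not special-cased in this sketch.
    return int(y // BOX_SIZE) * GRID + int(x // BOX_SIZE) + 1

def candidate_pairs(tracks):
    # tracks: {aircraft_id: (x, y)}.  Returns pairs sharing a box or sitting in
    # a neighboring box, using the one-sided check described above.
    boxes = defaultdict(list)
    for ident, (x, y) in tracks.items():
        boxes[box_number(x, y)].append(ident)
    pairs = []
    for b, members in boxes.items():
        for off in NEIGHBOR_OFFSETS:
            others = members if off == 0 else boxes.get(b + off, [])
            for i, a in enumerate(members):
                start = i + 1 if off == 0 else 0
                for c in others[start:]:
                    pairs.append((a, c))
    return pairs

# e.g. two aircraft one box apart to the east are paired exactly once
print(candidate_pairs({"AA1": (1.0, 1.0), "AA2": (3.5, 1.0)}))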
When comparing the aircraft in one box with those in another box, every aircraft in the first box must be compared with every aircraft in the other box.

Figure 2-Data transfer for comparison with neighboring boxes

This is done by making a copy of the state estimates for each aircraft pair and placing each pair of state estimates in a single processing element. When the conflict detection algorithm is executed, all the pairs are checked simultaneously. That is, each processing element operates on a pair of state estimates and determines, by a fairly complicated algorithm, whether the aircraft pair represented in this processing element may be in danger of colliding. If the memory available in each processing element is sufficient, then one pair can be stored in each element. In this case, the array is used very efficiently since many elements can execute the detailed conflict detection algorithm simultaneously. If there are more processing elements than pairs, then the detailed conflict detection algorithm need only be executed once in order to check all pairs for possible conflict. If there are no processing elements available for storing a potentially conflicting pair, then the algorithm is executed on the pairs obtained so far, non-conflicting pairs are deleted and conflicting aircraft are flagged. The process then continues looking for more conflicting pairs.

Figure 3-Storage of A/C state estimates for A/C in Box #i

The algorithm just described is similar to the one described for sequential computers but takes advantage of the parallel and associative processing capabilities of an associative array computer to speed up the execution of the algorithm. If the associative processor has more elements than there are boxes and than there are pairs to check, then the algorithm requires essentially one pass.

A sliding correlation algorithm

Another approach to restricting the number of pairs of aircraft is possible on an associative array processor. This approach requires that the aircraft be arranged in order of increasing (or decreasing) range in the processing elements. That is, the range from the radar to the aircraft in processing element i is less than the range to the aircraft in processing element i+1. A number of techniques are available for performing this sort. They are discussed in the next section. The technique consists of copying the state estimates and I.D.'s of each of the aircraft and simultaneously passing them to the next adjacent processing element. Each element checks the two aircraft stored there for a possible conflict and flags each of them if a conflict occurs. The copied state estimates are then passed to the next processing element and another conflict check occurs. The process continues until all aircraft pairs in a given processing element are more than r nautical miles apart in range (where r is the maximum separation of two
aircraft state estimates when a conflict might conceivably occur during the specified time interval). Since the aircraft are range ordered, there is no point in testing any additional pairs. Also, due to the symmetry of the algorithm, it is not necessary to check the other way.

Figure 4-Storage during the kth conflict comparison

Ordering the state estimates

In the algorithms described earlier, a certain ordering of the aircraft state estimates in the processing elements was required. The ordering can be performed by checking the arrangement of the data each time a track update is performed. Each aircraft which has moved to a new box has its corresponding state estimate moved to the appropriate processing element at this time. It is assumed that the track update rate is sufficiently high so that state estimates need only move to neighboring boxes. This assumption improves the efficiency with which data is moved, since many state estimates can be moved simultaneously. For instance, all objects which moved one box to the east can be simultaneously transferred to their new box. Similarly, for objects moving in other directions, a single transfer is necessary. The second, third, etc., aircraft in a box must be handled separately. Also, when a box is already occupied, then special handling is required to determine where the new state estimates are stored. Since most boxes contain only one aircraft or none, the amount of special handling will be small. The sorted list described in the previous section may also be maintained by keeping the list stored and performing minor iterations each time the state estimates are updated. This process is quite straightforward. A more interesting method is to perform a total sort of the state estimates. This can be done fairly efficiently by using a 'perfect shuffle' permutation of the processing element data. The algorithm is described by Stone4 and requires log2 n steps (where n is the number of aircraft).

SUMMARY

Comparison of techniques

The algorithm in the section "Straightforward Associative Processing Approach" is quite inefficient and requires n passes through the pairwise conflict detection algorithm. This can be quite expensive if n is large and the algorithm is complex. The algorithm in the section "Sort Boxes on an Associative Processor" greatly reduces the number of executions of the pairwise conflict detection algorithm at the expense of some data management overhead. If the box sizes can be chosen appropriately this is quite efficient. Some studies have shown that only a few percent of the pairs need to be checked. That is, the number of potentially conflicting pairs after checking for box matching is p·n(n-1)/2 where p is about .05. If a total of four boxes must be checked for conflicts (nearest neighbor boxes only), then the number of times the pairwise conflict detection algorithm must be executed is quite small. There are about n²/40 potentially conflicting pairs and, if the number of processing elements is large enough, these can all be checked at one time. The algorithm described in the preceding section has been estimated to check about 10 percent of the potentially conflicting pairs. Compared with the section "Straightforward Associative Processing Approach," this requires one-tenth the time, or about n/10 pairwise conflict detection executions (plus the sort).
This is more than the sliding correlation algorithm previously discussed, but the data management is less and fewer processing elements are required. Depending on the execution time of the pairwise conflict detection algorithm one of these approaches could be chosen, the first one if the pairwise algorithm is expensive and the second if the pairwise algorithm is cheap. Conclusions This paper has presented some approaches to organizing the conflict detection algorithm for Air Traffic Control on an associative array processor. The relative efficiencies of these approaches were described and some of the implementation problems were explored. Other techniques of real-time data processing on a parallel-array computer have been described in Reference 5. The problem described here illustrates the fact that developing efficient algorithms for novel computer architectures is difficult and that straightforward techniques are often not efficient, especially when compared with the sophisticated techniques which have been developed for serial computers. The techniques described here may also be applicable to other problems for associative arrays. REFERENCES 1. Stone, H., "A Logic-in-Memory Computer," IEEE Transactions on Computers, January 1970. 2. Higbie, L., "The OMEN Computers: Associative Array Processors," Proceedings of COMPCON, 1972, page 287. 3. Rudolph, J. A., "A Production Implementation of an Associative Array Processor-STARAN," 1972, FJCC Proceedings, page 229. 4. Stone, H., "Parallel Processing with the Perfect Shuffle," IEEE Transactions on Computers, February 1971. 5. Downs, H. R., "Real-Time Algorithms and Data Management on ILLIAC IV," Proceedings of COMPCON, 1972. A data management system utilizing an associative memory* by CASPER R. DEFIORE** Rome Air Development Center (ISIS) Rome, New York and P. BRUCE BERRA Syracuse University Syracuse, New York knowledge of its content. The information stored at unknown locations can be retrieved on the basis of some knowledge of its content by supplying the contents of any portion of the word. An associative memory contains a response store associated with every word which is at least one bit wide. Its purpose is to hold the state of events in the memory. The response store, provides an easy way of performing boolean operations such as logical A~D and OR between searches. In the case of logical AND for example, in a subsequent search only those words whose response store were set would take part in the search. In addition boolean operations between fields of the same word can be performed by a single search. The applicability of boolean operations is a most important requirement for a data management system since the conditional search for information in a data base requires their use. The instruction capabilities of associative memories are usually grouped into two categories: search instructions and arithmetic functions. The search instructions allow simultaneous comparison of any number of words in the memory and upon any field within a word. A partial list of search instructions include the following: equality, inequality, maximum, minimum, greater than, greater than or equaL less than, less than or equal. between limits, next higher, and next lower. All of these instructions are extremely useful for data management applications. For example, the extreme determination (maximum/ minimum) is useful in the ordered retrieval for report generation. 
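The search repertoire and response store described above can be modeled in a few lines of Python. The class below is a toy software stand-in, not the Goodyear hardware: the field names, the predicate-based search interface and the example records are assumptions made for illustration only.

class AssociativeMemory:
    # Toy word-parallel search memory: each 'word' is a dict of fields and the
    # response store is one boolean per word, as described in the text above.

    def __init__(self, words):
        self.words = list(words)
        self.response = [True] * len(self.words)     # all words respond initially

    def search(self, field, predicate, combine="new"):
        hits = [predicate(w.get(field)) for w in self.words]
        if combine == "and":          # logical AND with the previous search
            self.response = [r and h for r, h in zip(self.response, hits)]
        elif combine == "or":         # logical OR with the previous search
            self.response = [r or h for r, h in zip(self.response, hits)]
        else:
            self.response = hits
        return self

    def maximum(self, field):
        # Extreme determination: keep only responders holding the maximum value.
        vals = [w[field] for w, r in zip(self.words, self.response) if r]
        m = max(vals) if vals else None
        self.response = [r and w[field] == m for w, r in zip(self.words, self.response)]
        return self

    def responders(self):
        return [w for w, r in zip(self.words, self.response) if r]

# e.g. grade between limits, ANDed with an equality search, then extreme determination
am = AssociativeMemory([{"grade": 11, "cat": "ENG"}, {"grade": 13, "cat": "SCI"},
                        {"grade": 12, "cat": "ENG"}])
am.search("grade", lambda v: 10 <= v <= 13).search("cat", lambda v: v == "ENG", combine="and")
print(am.maximum("grade").responders())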
An associative memory can perform mass arithmetic operations, such as adding a constant to specific fields in a file. The type of operations are as follows: add, subtract, multiply, divide, increment field and decrement field. I~TRODUCTION There are a wide variety of data management systems in existence.I-3.6.7.10-V; These systems vary from those that are fairly general to those that are very specific in their performance characteristics. The former systems tend to have a longer life cycle, while sacrificing some efficiency, whereas the latter are more efficient but tend to become obsolete when requirements are modified. 9 In addition, current data management systems have generally been implemented on computers with random access memories, that is, the data is stored at specified locations and the processing is address oriented. In contrast, using associative or content addressable memories, information stored at unknown locations is processed on the basis of some knowledge of its content. Since much of the processing of data management problems involve the manipulation of data by content rather than physical location, it appears that associative memories may be useful for the solution of these problems. In order to demonstrate the feasibility of utilizing a hardware associative memory, a data management system called Information Systems For Associative Memories (IF AM) has been developed and implemented. After presenting a brief description of associative memories, the implementation and capabilities of IF AM are described. DESCRIPTIO~ OF ASSOCIATIVE MEMORIES Associative memories differ considerably from random access memories. 16 Random access memories are address oriented and information is stored at known memory addresses. In associative memories, information stored at unknown locations is retrieved on the basis of some IFAM IMPLEMENTATION * This research partially supported by RADC contract AF 30(602)70-C-0190, Large Scale Information Systems. ** Present Address, Headquarters, DCA. Code 950, NSB, Washington. D.C. IF AM is an on-line data management system allowing users to perform data file establishment, maintenance, 181 182 National Computer Conference, 1973 retrieval and presentation operations. Using this system it is easy for users to define a meaningful data base and rapidly achieve operational capability. Discussed below are the hardware and software aspects of IF AM. Hardware IF AM is implemented on an experimental model associative memory (AM) at Rome Air Development Center (RADCl. The AM, developed by Goodyear, is a content addressable or parallel search memory with no arithmetic capability. It contains 2048 words where each word is 48 bits in length. The search capability consists of the 11 basic searches described previously, all performed word parallel bit-serial. As an example of the timing, an exact match search on 2048-48 bit words takes about 70 microseconds. The AM operates in conjunction with the CDC 1604 Computer via the direct memory access channel as shown in Figure 1. Information transfers between the AM and the 1604 are performed one word at a time at about 12 microseconds per word. The CDC 1604 computer is a second generation computer with 32000-48 bit words and a cycle time of 6.4 microseconds. It has various peripheral and input/ output devices. one of which is a Bunker Ramo (BR-85) display console. The display unit is capable of visually presenting a full range of alphabetic, numerical or graphical data. 
The console allows direct communication between the operator and the computer. The display unit contains a program keyboard, an alphanumeric keyboard, a control keyboard, cursor control, and a light gun. The IFAM user performs his tasks on-line via the BR-85 display console.

Software

Both the operational programs and the data descriptions used in IFAM are written in JOVIAL, the Air Force standard language for command and control. The operational programs manipulate the data in the AM and are utilized in the performance of retrieval and update operations. These programs are written as JOVIAL procedures with generalized input/output parameters which operate on data in the AM. By changing the parameters, the same procedures can be used for many different operations, resulting in a more general and flexible data management system. The data descriptions are used to specify the format of the data, such as the name of a domain, its size, type, value, the relation to which it belongs, etc. They are used in conjunction with the operational programs to process information within IFAM. The data descriptions specify the data format to the operational program, which then performs the desired task. Providing an independence between these two means the information and the information format can change without affecting the programs that operate on the information, and conversely. In addition, the data descriptions are maintained separately from the data. This separation means that multiple descriptions of the same data are permitted and so the same data can be referred to in different ways. This can be useful when referencing data for different applications. The next section describes the capabilities of IFAM using a personnel data base as a test model.

IFAM CAPABILITIES

Test Model Description

As a test model for IFAM a data base has been implemented containing information about personnel. The data structure for the test model, shown in Figure 2, is as follows:

PERSONNEL (NAME, SOCIAL SECURITY NUMBER, CATEGORY, GRADE, SERVICE DATE, DEGREES)
DEGREES (DEGREE, DATE).

Figure 1-RADC associative memory computer system
Figure 2-Hierarchical structure for data base

For this structure PERSONNEL is a relation that contains six domains and DEGREES is a relation that contains two domains. The occurrence of the domain degrees in the PERSONNEL relation is a non-simple domain since it contains two other domains, namely degree and date. All other domains are called simple domains since they are single valued. A method and motivation for eliminating the non-simple domain is given in Reference 4. Essentially the procedure eliminates each non-simple domain by inserting a simple domain in its place and inserting this same simple domain in all subordinate levels of the structure. Through the application of this procedure, the data structures now contain only simple domains and therefore are amenable to manipulation on an associative memory. When this procedure is applied to the above data structure, the result is as follows:

PERSONNEL (NAME, SOCIAL SECURITY NUMBER, CATEGORY, GRADE, SERVICE DATE, α1)
DEGREES (DEGREE, DATE, α1),

where α1 is the added simple domain.
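The elimination procedure just described can be sketched in Python as follows. The field names follow the test model above; the ALPHA1 link values, the record layout and the sample data are illustrative assumptions rather than the actual IFAM representation.

import itertools

_alpha = itertools.count(1)          # generator of alpha-1 link values

def normalize(personnel_records):
    # Replace the non-simple DEGREES domain with a simple linking domain,
    # yielding PERSONNEL and DEGREES relations that contain only simple domains.
    personnel, degrees = [], []
    for rec in personnel_records:
        link = next(_alpha)                          # the added alpha-1 value
        flat = {k: v for k, v in rec.items() if k != "DEGREES"}
        flat["ALPHA1"] = link
        personnel.append(flat)
        for deg in rec.get("DEGREES", []):           # one tuple per degree held
            degrees.append({"DEGREE": deg["DEGREE"], "DATE": deg["DATE"], "ALPHA1": link})
    return personnel, degrees

# hypothetical sample record in the shape of the test model
people = [{"NAME": "FEATE", "SSN": "425626582", "CATEGORY": "A", "GRADE": 11,
           "SERVICE_DATE": "510820",
           "DEGREES": [{"DEGREE": "BS", "DATE": 55}, {"DEGREE": "MS", "DATE": 58}]}]
personnel, degrees = normalize(people)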
The storage structure showing one n-tuple of the personnel relation and one n-tuple of the degrees relation appears in Figure 3. Five associative memory (AM) words are needed to contain each n-tuple of the personnel relation and two AM words are needed to contain each n-tuple of the degrees relation. Since the capacity of the Goodyear AM is 2000 words, any combination of n-tuples for the two relations not exceeding 2000 words can be in the AM at any one time. That is, at any one time the AM could contain a combination of n-tuples such as the following: 400 personnel and no degrees, 1000 degrees and no personnel, 200 personnel and 500 degrees, etc. Whenever the number of n-tuples exceeds the capacity of the AM, additional loading is required.

Figure 3-Data base storage structure

The first three bits of each word are the word identification number and are used to determine which group of AM words participate in a search. For example, consider a case where both the personnel and degrees relations are in the AM, and suppose it is required to perform an exact match on the first two digits of social security number (see Figure 3). In this case only word one of each personnel n-tuple participates in the search. In order to insure that this is so, a 001 is placed in the first three bits of the comparand register. This corresponds to words in the personnel relation that contain the first two digits of social security number. The remainder of the comparand contains the search argument, which in this case is the desired social security number. This method assures that only the proper AM words participate in a search. Also, as shown in Figure 3, word 4 character 4 of the PERSONNEL relation and word 1 character 7 of the DEGREES relation contain the α1 domains.

Query and update method

A dialog technique exists within IFAM in which an inquirer interacts with a sequence of displays from the BR-85 to accomplish the desired task. In this way the user does not have to learn a specialized query language, thereby making it easier for him to achieve operational capability with a data base. The first display appears as shown in Figure 4. Assuming Button 1 is selected, the display appears as shown in Figure 5. Observe that although the degrees relation is embedded in the personnel relation, a retrieval can be performed on either relation directly (i.e., one does not have to access the personnel relation in order to get to the degrees relation). Assuming button 2 (NAME) is selected, the next display appears as in Figure 6, in which the 11 basic searches available in the system are shown.

Figure 4-First display
Figure 5-Second display

Assuming button 2 (EQUAL) is chosen, the next display appears as shown in Figure 7. This display is used to specify the search arguments and the AND or OR operation for conditional searches.
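A small sketch of the comparand/mask discipline described above, offered as an illustrative model rather than the actual AM logic. The three-bit word identification prefix and the 001 code for word one of a personnel n-tuple follow the text; the six-bit character packing and the helper names are assumptions. The dialog walkthrough continues below.

def build_key(word_id_bits, field_offset, field_value, word_len=48):
    # Return (comparand, mask) bit strings for an exact-match search.
    # word_id_bits: e.g. '001' selects word one of each PERSONNEL n-tuple.
    # field_offset: first bit of the searched field, after the 3 ID bits.
    comparand = ["0"] * word_len
    mask = ["0"] * word_len
    for i, bit in enumerate(word_id_bits):           # always compare the ID bits
        comparand[i], mask[i] = bit, "1"
    for i, bit in enumerate(field_value):
        comparand[field_offset + i], mask[field_offset + i] = bit, "1"
    return "".join(comparand), "".join(mask)

def exact_match(memory, comparand, mask):
    # Word-parallel exact match: a word responds when it agrees with the
    # comparand in every masked bit position.
    return [all(m == "0" or w[i] == comparand[i] for i, m in enumerate(mask))
            for w in memory]

# search word one of personnel tuples for an SSN beginning '42' (six-bit characters assumed)
ssn_prefix = format(ord("4"), "06b") + format(ord("2"), "06b")
comparand, mask = build_key("001", 3, ssn_prefix)
word = comparand                      # a word agreeing in all masked bits responds
print(exact_match([word, "0" * 48], comparand, mask))   # [True, False]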
Assuming that an "E" is typed in the second position, then all personnel with names which have E as the second letter will be retrieved from the system and the next display appears as shown in Figure 8. This display gives the inquirer the option of terminating his request, having been given the number of responders, or having the results displayed. Assuming it is desired to display the results, button 1 is selected and the next display appears as shown in Figure 9. In Figure 9 the n-tuples of the personnel and degrees relations satisfying the request are displayed. Observe that each name has an E as the second letter and that all the degrees a person has are displayed (e.g., Mr. Feate has 3 degrees). At the completion of this display the system recycles and begins again with the first display as shown in Figure 4.

Figure 6-Third display
Figure 7-Fourth display

Using IFAM it is easy to perform operations involving the union or intersection of data items. For example, consider ANDing together SSN, Name and Degree such that the 2nd, 4th, 5th, and 9th digits of the SSN are equal to 3, 9, 0, 5 respectively, the second letter of the name is equal to K, and the degree begins with B, as shown in Figure 10. This is performed as a routine request in IFAM in the following way. In one access those records satisfying the specified SSN are found by placing the desired SSN in the comparand register and the fields to be searched in the mask register. Of those records, only the ones in which the second letter of the name is equal to K are found in one memory access using the AND connector between searches. For those records the α1 values are used to set the corresponding values in the degree file, ANDed together with the records in which the degree begins with B. The records in this final set satisfy this complex request. Other systems either have to anticipate such a request and provide the necessary programming and indexing to accomplish it, or would have to perform a sequential search of the data base.

On-line updating is provided in IFAM and operates in conjunction with retrieval. Retrieval is used to select the portion of the data to be altered. Once selected, the items are displayed on the BR-85 and any of the domains of a relation, including the α1 domain, can be changed and put back into the data base. Using IFAM, updates are performed in a straightforward manner with no directories, pointers, addresses or indices to change.

Figure 9-Sixth display

CONCLUSION

This paper has described the utilization of an associative schema for the solution of data management problems.
There are advantages and disadvantages in utilizing associative memories for data management. Among the advantages shown by this implementation are that this method superimposes little additional structure for machine representation and eliminates the need for indexing. From an informational standpoint, an index is a redundant component of data representation. In the associative method, all of the advantages of indexing are present with little data redundancy. A comparison between IFAM and an inverted list data management system given in Reference 4 shows that the inverted list technique requires 3 to 15 times more storage. Also provided in Reference 4 is a comparison between IFAM and inverted lists in the area of query response time. It is shown that queries using inverted lists take as much as 10 times longer than IFAM. In the inverted list technique, as more items are indexed, response time to queries tends to decrease, whereas update time normally increases. This is so because directories and lists must be updated in addition to the actual information. Since IFAM does not require such directories, the overhead is kept at a minimum and updating can be accomplished as rapidly as queries. In addition, the associative approach provides a great deal of flexibility to data management, allowing classes of queries and updates to adapt easily to changing requirements. Thus, this system is not likely to become obsolete. One of the disadvantages of associative memories is the cost of the hardware. The increased cost is a result of more complex logic compared to conventional memories. This cost is justified in instances such as those described above. On the other hand, if factors such as speed and flexibility are not important, then the less costly current systems are preferable.

Figure 10-Sample retrieval request

REFERENCES

1. Bleier, R., Vorhaus, R., "File Organization in the SDC Time-Shared Data Management System (TDMS)", IFIP Congress, Edinburgh, Scotland, August, 1968.
2. CODASYL System Committee Technical Report, "Feature Analysis of Generalized Data Base Management Systems", ACM Publication, May, 1971.
3. Dodd, G., "Elements of Data Management Systems", Computing Surveys, June, 1969.
4. DeFiore, C., "An Associative Approach to Data Management", Ph.D. Dissertation, Syracuse University, Syracuse, New York, May, 1972.
5. DeFiore, C., Stillman, N., Berra, P. Bruce, "Associative Techniques in the Solution of Data Management Problems", Proc. of the 1971 ACM National Conference, 1971.
6. "GE 625/635 Integrated Data Store", GE Reference Manual, August, 1965.
7. "Generalized Information Management (GIM), Users Manual", TRW Doc. #3181-C, 15 August, 1969.
8. Minker, J., "An Overview of Associative or Content Addressable Memory Systems and a KWIC Index to the Literature: 1956-1970", Computing Reviews, Vol. 12, No. 10, October, 1971.
9. Minker, J., "Generalized Data Management Systems - Some Perspectives", University of Maryland, TR 69-101, December, 1969.
10. Prywes, N., Gray, R., "The Multi-List System for Real Time Storage and Retrieval", Information Processing, 1962.
11. Prywes, N., Lanauer, W., et al., "The Multi-List System - Part I, The Associative Memory", Technical Report I, AD 270-573, November, 1961.
12. Sable, J., et al., "Design of Reliability Central Data Management System", RADC TR 65-189, July, 1965.
13. Sable, J., et al., "Reliability Central Automatic Data Processing System", RADC TR 66-474, August, 1966.
14.
SDC TM-RF-104/000/02, "Management Data Processing Systems", (Users Manual), 20 June, 1969. SDC TM-RF-lO;OOO/O], "Management Data Processing System", (On-Line File Manipulation Techniques), 20 June, 1969. Wolinsky, A., "Principles and Applications of Associative Memories", TRW Report 5322.01-27, 31 January, 1969. Associative processor applications to real-time data management by RICHARD R. LINDE, ROY GATES, and TE-FU PENG System Development Corporation Santa Monica, California for data management applications, we have described a machine that has-many more processing elements than the classical array processors and whose processing elements are smaller (256 bits) and, hence, provide more distributed logic. For cost reasons, our machine cannot be fully parallel, having logic at every bit; for efficiency reasons, however, we selected a byte-serial, external-bytelogic machine design as opposed to the bit-serial processors that are widely described in the literature. For example, three cycles are required for this type of machine to perform an 8-bit add, whereas 24 cycles are required for the bit-serial, external-logic machines, yet the byte-serial logic cost is very little more than that of the equivalent bit-serial logic. Past studies have shown that data management systems implemented on an AP can become I/O-bound quite quickly due to the AP's microsecond search operations relative to slow, conventional, I/O-channel transfer rates. 4 Hence, for our associative memory (AM), we have hypothesized a large, random-access data memory for swapping files to and from the AMs at a rate of 1.6 billion bytes/ sec. Other studies have considered the use of associative logic within high-capacity storage devices (logicper-track systems) such as fixed-head-per-track disc devices. 2 •5 Thus, for reasons of efficiency, at a reasonable cost, we have described a byte-serial, word-parallel machine, called the Associative Processor Computer System (APCS), that consists of two associative processing units, each having 2048 256-bit parallel-processing elements. This machine is linked to a conventional processor that is comparable to an IBM 370/145 computer. In order to compare this machine with a sequential machine, we have programmed several real-time problems for both machines and compared the resultant performance data. The set of problems for the study were derived in part from an analysis of the TACC testbed developed at Hanscom Field, Mass. The TACC is the focal point of all air activity within the Tactical Air Control System (TACS). The TACS is a lightweight, mobile surveillance and detection system, assigned to a combat area at the disposal of Air Force Commanders for making real-time air support and air defense oriented decisions. INTRODUCTION This paper describes a research study concerning the potential of associative processing as a solution to the problem of real-time data management. 1 The desired outcome of the research was an evaluation of the comparative advantages of associative processing over conventional sequential processing as applied to general real-time Data Management System (DMS) problems. The specific DMS application framework within which the study was carried out was that of the data management functions of the U.S. Air Force Tactical Air Control Center (TACC). The primary feature used to select an associative processor (AP) configuration was "processing efficiency," by which is meant the number of successful computations that can be performed in a given time for a certain cost. 
In DMS applications, processing efficiency is influenced by large-capacity storage at low cost per bit, the record format, search speed, and the ability to search under logical conditions and to combine search results in Boolean fashion. The primary technique used for comparing an AP's performance with that of a sequential processor was to arrive at a set of actual execution rates for the class of real-time data management problems. These execution rates, coupled with the assumption that parallel computers cost four or five times more than their equivalent sequential computer counterparts, will yield an estimate of an AP's cost-effectiveness for the DMS problem. Obviously, the more data that can be processed in parallel, the greater the processing efficiency; and DMS applications, by their nature, have a high degree of parallelism and, hence, they dictated the type of parallel processor used for the comparisons. As a result, we dealt with machines that fit into the general category of the associative processors. 2 These are machines that execute single instruction streams on multiple data paths; or, viewed another way, they are machines that can execute a program on a (large) set of data in parallel. The classical array processors, such as the ILLIAC-IV and PEPE machines, are characterized by a relatively small number of sophisticated processing elements (e.g., a network of mini-computers), each containing several thousand bits of storage. 3 In order to obtain a high processing efficiency 187 188 National Computer Conference, 1973 The TACC is divided into two divisions: Current Plans and Current Operations. Current Plans is responsible for developing a 24-hour fragmentary (FRAG) order that defines the air activities within the combat area during the next 24-hour period, and Current Operations monitors those activities in real time. Some of the Current Operations functions were programmed for the IBM 1800 computer at the testbed, and an analysis· was made of a scenario, relating to a testbed demonstration, for determining those functions critical to Current Operations and, therefore, important candidates for our study. Also, a study was made of several data management systems (SDC's DS/2, CDMS, and CONVERSE, and MITRE'S AESOP-B) to arrive at a set of associative processor DMS primitives. 6 . 7 . 8 ,9 Those primitives that appeared to be most important to DMS functions were evaluated. We used the primitives required by update and retrieval applications for writing APCS programs, and compared the APCS performance result::> against conventional programs, written for data bases stored physically in random-access fashion. A relational data-structure model provided us with a mathematical means of describing and algorithmically manipulating data for an associative processor. Because relational data structures are not encumbered with the physical properties found in network or graph-oriented models, they provide independence between programs and data. The T ACC testbed data structure provided us with examples for describing and logically manipulating relational data on the APCS in a DMS context. The study results enabled us to make re~ommendations for further studies directed toward arriving at cost-effective parallel-machine solutions for real-time DMS problems. APCS-like machine descriptions will continue to evolve for the next several years. 
We do not recommend that the APCS or the various future versions (which, like the APCS, will have to be fixed for comparison purposes even though their structures may not be optimally costeffective) be considered a cost-effective solution to the DMS problem. Instead, we hope that they will serve as starting points for the machine evolution studies that will have to precede any attempt to describe the "ultimate" solution for data management. The central processor is an integration of five units: two associative processing units (APUs), a sequential processing unit (SPU), an input/output channel (IOC), and a microprogrammed control memory. The APU s provide a powerful parallel processing resource that makes the system significantly different from a conventional one. From a conventional system-architectural point of view, the APU s can be viewed as a processing resource that has a large block of long-word local registers (in this case, 4K X 256 bits) with data manipulation logic attached to each word; an operation is performed on all registers simultaneously, through content addressing. Actually, the central processor has three different processors-II 0, sequential, and parallel-and all three can operate simultaneously. The three processors fetch instructions from the program memory, interpret the microprograms from control memory, and manipulate the data from data memory. The memory system consists of three units: primary memory, control memory, and secondary storage. The data management functions and the data base of the data management system are independent of each other; for this reason, the primary memory is subdivided into program memory and data memory for faster information movement and for easy control and protection. This separation is another special feature of the system. The program memory is assumed to be large enough to store executive routines, data management functions, and user programs for the three processing units; consequently, program swap operations are minimized. The data memory can store the data base directory and several current active data files. Finally, two additional resources, terminals and displays and other peripherals, are added to make up a complete real-time system. The other peripherals include tape units and unit-record devices such as line printers, card readers, and punches. The reading and writing of AP words are controlled by tag vectors. The word logic is external to each AM word for actual data operations and control. The word logic Secondary Storage ASSOCIATIVE PROCESSOR COMPUTER SYSTEM A complete associative processor computer system (APCS) for a real-time data management application is shown in Figure 1. It is not a so-called "hybrid" system, with an interconnection of serial and associative processors; rather, it is a totally integrated associative computer. The central processor functions as an active resource that controls the overall system and performs information transformation. The memory system acts as a passive resource for the storage of information. Numerous data paths are provided between the central processor and various memory and external units for efficient information movement. Sequential Processing Un! t Program !1emory Data l'1emory Control Memory Central Processor Figure I-Associative processor computer system (APCSi Associative Processor Applications to Real-Time Data Management consists of two main portions: byte operation logic and tag vectors. 
The byte operation logic includes byte adders, comparison logic, and data path control logic. There are three types of tag vectors: hardware tag vectors, tag vector buffers, and tag vector slices. The hardware tag vectors consist of four hardware control tag columns: word tag vector (TVW), control tag vector (TVC), step tag vector (TVS), and response tag vector (TVR). The TVW indicates whether a corresponding AM word is occupied; it is set as the AM is loading and reset when it is unloading. The TVC controls the actual loading and unloading of each AM word in a way analogous to that in which mask bits control the byte-slice selection, except that it operates horizontally instead of vertically. The setting of TVS indicates the first AP word of a long record (one containing more than one AP word). The use of the TVR is the same as the familiar one, where a c9rr~,sponding tag bit. is set if the search field of an AP word matches the key. In addition, most tag-manipulation logic (such as set, reset, count, load, store, and Boolean) is built around the TVR. The status and bit count of the TVR vector can be displayed in a 16-bit tag vector display register (TVDR) to indicate the status of various tag vectors for program interpretation. These four hardware tag vectors are mainly controlled by hardware, which sets and resets them during the actual instruction execution. In addition, they can also be controlled by software for tag vector initiation or result interpretation. The contents of various hardware tag vectors can be stored temporarily in the eight tag vector buffers if necessary. This is particularly useful for simultaneous operations on multiple related data files or for complex search operations that are used for multiple Boolean operations on the results of previous tag-vector settings. The use of the five tag vector slices is similar to that of the tag vector buffers except that they can be stored in the DM along with the corresponding AP words and loaded back again later. A RELATIONAL DATA MODEL FOR THE TACC TESTBED Let us now consider a data structure for the APCS in terms of the TACC testbed DMS. Set relations provide an analytical means for viewing logical data structures on an associative processor.lO.lL12 Given sets (8,82 , • • • , Sn). If R is a relation on these sets, then it is a subset of the Cartesian Produce 8 l X82 • • • 8 n , and an n-ary relation on these sets. The values of elements in this relation can be expressed in matrix form where each jill column of the matrix is called the jill domain of R and each row is an n-tuple of R. Domains may be simple or non-simple. A simple domain is one in which the values are atomic (single valued), whereas nonsimple domains are multivalued (i.e., they contain other relations). As an example of set relations applied to associative processing, consider the example of the TACC testbed. F:~E A.'<) ?R.OPERTY DESCR!?70RS 1:-r7TTTT '..:o:::~s '!-t:::: 189 v ?RC?~R:-:' I In ;; S V \' V \' Reo 1 2 ) T 7 \' I. " "6 "7 ) I : I : ; i i I !I i I I i I : I I PROPERTY DESCR.:;:PTORS :-;-A.."1E" s~~~~ ! (;'1I i i I ~'~IT I i i SY;ES I i i ~~~~~R I 0 I I il i C> ! i I Cl' I I (' ..•.... i ..... " ...... , i l Ie : I / I ,,"'. I i I I I [ ! I 1I i ! , Figure 2-T ACC tested- storage structure on the APCS .. .. .. .. The testbed defines a "property" as one item of in formation (e.g., one aircraft type for a designated base). An "object" is a collection of properties all relating to the same base, unit, etc. 
Objects are the basic collections of data within a file (e.g., an air base file would have one object for each designated base, and each object would contain all properties for one base). A complex property is a collection of properties (such as data and time) that are closely related and that can be accessed either individu·· ally or as a group (date/time). The data base files are structured to show property number, property name, complex property indicator, property type, EBCDIC field size, range/values, and property descriptions. They are self-described and contain both the data (the property values) and the descriptions of the data. The descriptive information constitutes the control area of the file, and the data values constitute the object area of the file. The control area contains an index of all objects in the file (the object roll), and an index and description of all properties in the file (the property roll). The object roll contains the name of every object in the file, along with a relative physical pointer to the fixed-length-record number containing its data; the property roll defines the order of property-value data for each object and points to the location of the data within the object record. Figure 2 illustrates how AM #1 might be used to logically represent a testbed file control area. It contains two relations: and which represent an index of all the files in the testbed (R l ) and a list of all the property descriptors associated with the file (R 2 ). The first domain (file name) of relation R I is the primary key; that is, it defines a unique set of elements (or n-tuples) for each row of R I • 190 National Computer Conference, 1973 AM H2 DATA T V S FTRASGN W NA.'1E A;lEF i I PCAS! NAME I ADEF I PCAsl I UNIT I ICAS UNIT I ICAS Alc TYPE BASE I INTD I CAIR I CAIR r Em I TOT: \ CAPE BASE AtC TYPE I TOT CAPE I INTO I I) I ICASFRAG NA.'IE 1 UXIT !TYPEI TXJ1T lAIC TYPE ISORT r I I FIRST-MSN I ! : I ir: 1------------- SEARCH AND RETRIEVAL COMPARISONS 1------------- f------------j ~A.'IE 1 cNIT ITYPEI T:01T lAIC TYPE ISORT I F1RST-MSN flight characteristics (range, speed, etc.) occurring within the primary relation. To normalize this relation, the primary key or a unique marker is inserted in the nonsimple domain, the non-simple domain is removed from the primary relation, and the sequence is repeated on the secondary relation, tertiary relation, etc., until all nonsimple domains have been removed. DeFiore lo has described and illustrated this technique on an AP and calls the data structure containing only simple domains the Associative Normal Form (ANF). 1.\ I I Figure 3-TACC testbed property values on the APCS The domain labeled "Vector" contains the address of the tag vector that delineates the property values for the file; they are contained in AM #2. The domains labeled "AM #" and "Words/Record" contain the AM number for the property values and the number of physical AM words per record. The domain labeled "Property ID" contains a marking value that identifies the property descriptors in relation R2 for each file name of primary key in relation R I • Tag vectors TVO and TV1 mark those words in AM #1 containing relations RI and R 2 , respectively; this is necessary because two domains of the respective relations may be of the same size and content, thus simultaneously satisfying search criteria. The programmer must AND the appropriate vector, either TVO or TV1, with TVR after such a search to obtain the relevant set of N-tuples. 
Since the APCS uses a zero-address instruction set, parameter lists must be defined in control memory for each APCS instruction to be executed. Obviously, controlmemory parameter lists are needed to define search criteria for relations R1 and R2 and table definitions (not shown) for creating these lists before the descriptors contained iIi AM #1 can be used to define another set of lists for accessing the property values in AM #2 (see Figure 3). Figure 3 illustrates the relations that describe the Fighter Assignment File (FTRASGN) and the Immediate Close Air Support Frag Order File (lCASFRAG) property values. They are represented by n-ary relations containing a set of simple domains, 1 through n, which contain the actual data values for the elements: first mission number, last mission number, aircraft/type, etc. The domain labeled UNIT is the primary key for the relation and corresponds to the unit number contained in the testbed object roll. H the domain representing aircraft type contained more than one aircraft and a set of simple domains describing each aircraft, then a non-simple domain would occur in a primary relation. That is, there might be F104 and F105 fighter aircraft and parameters relating to their A search and retrieval comparison was performed using the APCS and a sequential computer, the IBM 370/145, as the comparative processors. Two problems involving fact and conditional search and retrieval were coded. A random-access storage structure was chosen for the conventional case; the conventional dictionaries were unordered. The data structure for the APCS can be shown by the relation It is an ll-tuple requiring two words per record. Branching or updating more than one word per record presents no particular problem.1o.13.14 Table I contains data (normalized to the 370/145) from these measures. The APCS was between 32 and 110 times faster than the sequential processor, depending upon the algorithm and the number of keys searched. However, system overhead time to process the 1;0 request and to transfer the data from disc to data memory (no overlap assumed) was charged to the sequential and associative processors. Therefore, after 4000 keys, the APCS times tended to approximate those of the sequential processor since an IBM 2314 device was required for swapping data into the DM. UPDATE COMPARISONS In the conventional case, the measurements involved ordered and unordered dictionaries; in the associative case, no dictionary was used. For both cases, we used fixed-length records, 64 bytes/ record, and assumed that update consisted of adding records to the file, deleting records from it, and changing fields within each record. As in retrieval, the personnel data base was used for the measurements. It consists of a set of simple domains shown by the general relation R1 (an, a 12, . . " a 1n ). Updating the data base involved a series of searches on the domain a 1n such that a subset of al!t was produced: the list of length I r 1n I. 
The set of n-tuples in the relation R can be changed using the APCS Store Data Register command in Nir 1n I insertions, where N in the APCS measurements Associative Processor Applications to Real-Time Data Management 191 TABLE I-Retrieval Measurements (Data Normalized to 370/145) APCS Tl 1000 2000 4000 8000 FACT RETRIEVAL 370/145 T2 .341 ms .401 .613 913.1 CONDITIONAL SEARCH AND RETRIEVAL APCS 370/145 RATIO T2/Tl RATIO T2/Tl 10.888 ms 18.345 36.617 1,011.853 31.8 45.6 60.0 1.11 .695 ms 1.098 1.926 915.35 53.135 ms 105.928 211.810 1,335.624 76.5 96.4 110.0 1.45 • APCS EASIER TO PROGRAM, ALTHOuGH IT TOOK MORE CODE • APCS REQ"GIRED 15% LESS STORAGE consists of three domains: department number, division, and employee ID. In each case, a batch update technique was used, and one-half the data records were modified: 20 percent changed, 15 percent added, 15 percent deleted. Since no dictionary was needed in the associative case, there was a 15 percent savings for data storage for the APCS. The system calls and tables used in the Search and Retrieval calculation were used in the Update measures, as well. Assuming that the list r 1n and the Update list are both in sorted order, data values could be inserted into the list r1n using the parallel transfer capabilities of the APCS in m times the basic cycle time, where m is the number of bytes in the set of n-tuples to be changed. Table II shows the normalized (to the 370/145) timings for the APCS and a conventional computer (370/145) for the Update measurement. The measurements for the conventional case involved ordered and unordered dictionaries. The search algorithm for the conventional case was applied to an ordered dictionary, where a binary search technique was used involving (rIog2 v(x)) -1 average passes; v(x) denotes the length of the dictionary. For the unordered case, it was assumed that a linear search technique found the appropriate dictionary key after a search of one-half the dictionary entries. Since the Search and Retrieval measurements were severely limited by conventional I/O techniques, the Update measures were projected for mass storage devices (semi-conductor memory) holding four million bytes of data. Hypothetically, such devices could also be bubble and LSI memories or fixed-head-per-track discs; in either case, the added memories are characterized by the parallel-by-word-to-AM transfer capability. Since N/2 records were updated over an N-record data base, the conventional times increased as a function of N2/2, whereas the APCS times increased as a function of N. Even though a four-megabyte semiconductor memory would probably not be cost effective for conventional processing, it was hypothesized as the storage media for the conventional processor so as to normalize the influence of conventional, channel-type I/O in the calculations. Even so, the APCS showed a performance improvement of 3.4 orders of magnitude over conventional processing for unordered files consisting of 64,000 records. SEARCH AND RETRIEVAL WITH RESPECT TO HIERARCHICAL STRUCTURES A comparison was made between the associative and sequential technologies using hierarchically structured data. Consider a personnel data base containing employee information, wherein each employee may have repeating groups associated with his job and salary histories. 
The non-normalized relations for this data base are: employee (man #, name, birthdate, social security number, degree, title, job history) job history (jobdate, company, title, salary history) salary history (salary date, salary, percentile) The normalized form is: employee (man #, name, birthdate, social security number, degree, title) TABLE II-Execution Times-Update-APCS vs. Conventional Times (Normalized to 370/145) CONVENTIONAL (T2) DICTIONARIES NUMBER OF RECORDS APCS (Tl) UNORDERED 1000 2000 4000 8000 16000 64000 .046 sec. .108 .216 .433 .865 2.916 2.874 11.386 4.5.326 180.861 722.422 11 ,548.198 ORDERED .719 sec. 2.912 10.26.'3 38.091 146.440 2,271.915 RATIOS (T2/Tl) DICTIONARIES UNORDERED ORDERED 62.4 105.0 210.0 416.0 825.0 3,875.0 15.6 26.9 43.5 87.9 169.0 770.0 192 National Computer Conference, 1973 job history (man #, jobdate, company, title) salary history (man #, jobdate, salary date, salary, percentile) the length of RI (il), the number (C) of comparison criteria, and the length ( IPII ) of the list formed after a search of RI times the average number of items, a, in the simple domain a 21 • This is represented by: and the Associative Normal Form is: RI(a ll , a 12 , ala, a 14 , a 15 , a 16 , R 2(a 21 , a22, a2S, O'll 0'2) Rs(as h as 2 , aaa, 0'1, 0'2) 0'1) The total number of searches, nl, required for the relation RI is the sum of the number of possible search arguments for each domain in R I • Similarly, the number of searches for R2 and Ra can be represented by n 2 and na. The number of words, or processing elements (PEs), satisfying a search on RI is I kll, and for R2J I k21. Since the 0' I values from the PEs satisfying the search conditions for R I are used as arguments for searching R 2 , the total number of ~earchei" po~sible for R2 is equal to n 2 + IkJ I: analogously, the total number for Ra is equal to n a+ Ik21. Therefore, the total number of searches for one APCS data load required for the three relations hi: T=n +n +n s +IkII+lk 1· l 2 2 The first comparison dealt with a search and retrieval problem for a personnel data base consisting of the three relations R I , R 2 , and Ra. These relations required 50,000 records, each contained within two PEs. The problem dealt with the query: PRINT OUT THE MAN #, SALARY DATE, AND SALARY FOR THOSE EMPLOYEES LQ 40, WITH MS DEGREES, WHO ARE ENGINEERS, AND WHO HAVE WORKED AT HUGHES. In the associative case, the total number of searches required was: where N = number of APCS data loads. (Note: the multi-conditional search on RI can be made with one search instruction.) For the total data base, it was assumed that varying numbers of records satisfied the RI search, i.e., Holding the term C II constant in T2 and examining the two equations TI and T2 it can be seen that, for this particular query, as the number of responding PEs increases, the associative performance improvement ratio will decrease; that is, as the number of 0'1 and 0'2 searches increases, as a result of an increase in the number of PEs responding to primary and secondary relation multi-conditional searches, the performance improvement ratio for the APCS decreases. This is illustrated by Table III. However, as the number of records increases (C 11-> 'Xl), the performance ratio increases for the APCS when holding the number of responding PEs constant over the measures. 
The first comparison dealt with a search and retrieval problem for a personnel data base consisting of the three relations R1, R2, and R3. These relations required 50,000 records, each contained within two PEs. The problem dealt with the query: PRINT OUT THE MAN #, SALARY DATE, AND SALARY FOR THOSE EMPLOYEES LQ 40, WITH MS DEGREES, WHO ARE ENGINEERS, AND WHO HAVE WORKED AT HUGHES.

In the associative case, the total number of searches required, T1, was the per-load total given above taken over the N APCS data loads, where N = number of APCS data loads. (Note: the multi-conditional search on R1 can be made with one search instruction.) For the total data base, it was assumed that varying numbers of records satisfied the R1 search, i.e., (EMPLOYEES LQ 40, DEGREE EQ MS, TITLE EQ ENGINEER), and that a variable number of the records marked after the σ1 search satisfied the R2 search for 'HUGHES'.

For the conventional case, it was assumed that a random-access storage structure existed and that an unordered dictionary was used for calculating record addresses if needed. An index in R1 was used to access the records belonging to R2, and an index in R3 was used to obtain the values to be printed. The total number of Search Compare instructions required was a function of the length of R1 (l1), the number (C) of comparison criteria, and the length (|P1|) of the list formed after a search of R1 times the average number of items, a, in the simple domain a21; this count is represented by T2.

Holding the term C·l1 constant in T2 and examining the two equations T1 and T2, it can be seen that, for this particular query, as the number of responding PEs increases, the associative performance improvement ratio will decrease; that is, as the number of σ1 and σ2 searches increases, as a result of an increase in the number of PEs responding to primary and secondary relation multi-conditional searches, the performance improvement ratio for the APCS decreases. This is illustrated by Table III. However, as the number of records increases (C·l1 → ∞), the performance ratio increases for the APCS when holding the number of responding PEs constant over the measures.

TABLE III-Hierarchical Search and Retrieval (50,000 Records) (Normalized to 370/145)

σ1 AND σ2 SEARCHES    APCS TIME (T1)    IBM 370/145 TIME (T2)    RATIO T2/T1
        11                6.167 ms            901.951 ms              150
       110               14.785               910.51                   62
       600               50.87                952.6                    18
      1100               81.87                989.8                    12
      2300              150.07              1,077.3                     7

Based on the search and retrieval problems that have been investigated, it can be concluded that, with respect to conventional random-access storage structures, the performance improvement for the APCS is affected by the amount of parallelism inherent within a search, the number of multi-field searches spread over a set of domains, and the degree of hierarchical intra-structure manipulation, with respect to σ1 and σ2 searches, invoked by the query. A set of queries that tends to produce the following effects will improve the APCS performance ratio:

• Queries invoking an increase in the number of PEs participating in a series of searches.
• Queries requiring an increasing number of multi-field searches.
• Queries requiring increasing volumes of data (assuming that high-speed parallel I/O devices are available for loading the APCS).

The APCS performance ratio decreases as the queries tend to produce:

• A decreasing number of PEs invoked by a series of searches.
• An increasing number of σ1 and σ2 searches with respect to hierarchy.
• An increase in the use of slow, conventional channel I/O techniques.

TACC APPLICATION SYSTEM AND ASSOCIATIVE PROCESSING ANALYSIS

After having looked at specific DMS subfunctions on an AP, it is of interest to see their importance applied to the processing activities of a real-time system and to evaluate an AP's impact from a total system viewpoint. The desired outcome of this study is to have a capability to evaluate the impact of associative processing on the future data automation efforts of the Tactical Air Control System (TACS). A portion of the data management system processing operations were defined, for the purpose of this study, in the context of the on-going Advanced Tactical Command and Control Capabilities (ATCCC) Studies. The results presented below were derived from an analysis of a scenario for the TACC Current Operations Division that was under investigation on a testbed facility at Hanscom Field, Massachusetts. The testbed consisted of functional software for the Current Operations and a data management system operating on an IBM 1800/PDP-8 distributed system.15

A real-time associative processor data management system (AP/DMS) study model was developed with equivalence to the TACC testbed system in mind. A practical comparison was made between the two systems in order to determine which system better meets TACC data management requirements. The AP/DMS design also serves as a basis for projections of the future effects of sophisticated application of associative processing technology on hardware, data structures, and future data management systems.
The following describes the AP/DMS and the comparison methodology and results. The comparison measurements are normalized to bring the hardware, data structures, and data management systems to equivalent levels. The testbed system (running on the IBM 1800 computer) and the AP/DMS (running on the APCS) were compared first with both using IBM 2311 discs as peripheral storage, then with both using IBM 2305 fixed-head-per-track discs. In the first case, the number of instruction executions for both systems was multiplied by the average instruction time for the IBM 1800; in the second, the multiplier was the average instruction time for the APCS.

AP/DMS

The AP/DMS was required to provide a data management environment for the comparison of associative processing techniques with the sequential techniques used in conventional data management systems. It was aimed at the development of cost-effectiveness and performance ratios and as a background for the analysis of advanced associative processors. This general statement of requirements was used to derive a set of general capability and performance requirements for the system. The equivalence between the testbed system and the AP/DMS was the overriding requirement. The testbed data base and scenario were kept in the same form to achieve the desired results. The user language was tailored to match both the testbed system and the AP/DMS. The notion of normalizing computer instruction times implies a similarity in the software and the data flow. The I/O requirements were accorded separate but equal status to provide an ability to make both simple and complex changes to the system configuration.

System organization

The concern for system equivalence pervades all phases of system design, starting with the system's logical organization and physical configuration. An AP/DMS system organization aimed at meeting the requirements discussed above was established. The basic organizational concept was that the data processing capability should be concentrated in the associative processor. Such a system organization offers better and faster response to user queries and provides a capability for system growth and change and favorable conditions for system design and implementation. The APCS computer performs the following functions:

1. Generation, storage, and maintenance of the system data base.
2. Storage of general-purpose and application programs.
3. Dynamic definition of data base organization.
4. Execution of message processing programs and, in support of those programs, retrieval from and manipulation of data base files.
5. Message reformatting, specifically, conversion from data base format to display format.
6. Immediate response to message and data errors, reducing the processing time for inconsistent user queries.

Measurements

The AP/DMS was coded in JOVIAL-like procedures using the APCS instruction set. An instruction path derived from the testbed scenario was given to a counting program, and a comprehensive set of statistics was generated for the AP/DMS. Similar measurements were made from several SDC testbed documents to derive a set of testbed statistics. Table IV presents the test-run results. The total time for the TACC testbed was 29.9 seconds; the TACC AP/DMS time was 11.8 seconds. These times were normalized to the IBM 1800/2311 operational times.

TABLE IV-Comparison Normalized to IBM 1800-2311 Base

                          TACC TESTBED                   AP/DMS
                            SP ONLY          AP+SP            AP            SP
AVG. TIME PER OPERATION        4810           4810          4810          4810
NO. OF OPERATIONS         2,161,700        994,300       217,000       777,300
OPERATION TIME           10,397,777      4,782,583     1,043,770     3,738,813
I/O OPERATION TIME       19,589,770      7,063,731         8,731     7,055,000
TOTAL TIME               29,987,547     11,846,314     1,052,501    10,793,813
RATIO TESTBED:AP/DMS                       2.5:1
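As a check on how Table IV is built up, the following sketch (illustrative only; it assumes the tabulated times are in microseconds, with the 4810 entry read as 4.810 microseconds per IBM 1800 instruction) recomputes the operation times, totals, and the testbed-to-AP/DMS ratio from the counts in the table.

# Assumed units: microseconds, with "4810" read as 4.810 us per instruction.
AVG_US = 4.810

columns = {                       # (instruction executions, I/O time in us)
    "TESTBED SP ONLY": (2_161_700, 19_589_770),
    "AP/DMS AP+SP":    (994_300,    7_063_731),
    "AP/DMS AP":       (217_000,        8_731),
    "AP/DMS SP":       (777_300,    7_055_000),
}

totals = {}
for name, (ops, io_time) in columns.items():
    op_time = ops * AVG_US            # normalized CPU (operation) time
    totals[name] = op_time + io_time  # total time, CPU plus I/O
    print(f"{name:16s} op={op_time:12,.0f}  total={totals[name]:12,.0f} us")

ratio = totals["TESTBED SP ONLY"] / totals["AP/DMS AP+SP"]
print(f"testbed : AP/DMS = {ratio:.1f}:1")   # about 2.5:1, i.e. ~30 s vs ~12 s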
Table V was normalized to APCS/2305 operational times. In the latter case, the performance ratio was approximately 3:1; that is, the AP performed the equivalent task three times faster than the conventional processor.

TABLE V-Comparison Normalized to APCS-2305 Base

                          TACC TESTBED                   AP/DMS
                            SP ONLY          AP+SP            AP            SP
AVG. TIME PER OPERATION         291            291           291           291
NO. OF OPERATIONS         2,161,700        994,300       217,000       777,300
OPERATION TIME              829,055        289,341        63,147       226,194
I/O OPERATION TIME        1,756,270        587,351           351       587,000
TOTAL TIME                2,585,325        876,692        63,498       813,194
RATIO TESTBED:AP/DMS                       3.0:1

Statistics showed that the APCS spent 80 percent of its time in parameter definition and passing functions, whereas the testbed computer spent 3.8 percent of its time with such functions. The APCS defined its instruction operands from data definition parameters located in an AM-resident dictionary, and sequences of code passed these parameters from JOVIAL procedure to procedure. The associative processor does use more parameters than the sequential processor, but not 20 times as many; statistics showed that 73.7 percent of the AP/DMS time was spent in dictionary operation. It can be concluded that there were too many calls to the dictionary, causing too many parameters to be used and passed. Another factor in the AP/DMS performance is the use of the JOVIAL procedure call technique. This technique uses elaborate register and address saving and restoring code. This code is unnecessary, particularly for one AP procedure calling another AP procedure. When we reduce the parameter passing and dictionary calls by 90 percent and disregard I/O times, the performance ratio is approximately 30:1; however, we can improve this ratio even more.

The TACC data base used by the AP/DMS was biased in favor of sequential computing. The size of the data base was 400KB, consisting of 50 files containing 1850 domains which involved an average of 68 processing elements. Cost-effective parallel computation techniques favor the reverse: 68 domains spread over 1850 processing elements. The TACC data base can be reduced to 500 domains, which would have the net effect of involving more parallel processing elements and reducing the time spent in parameter setup, producing a 60:1 CPU gain over sequential processing for the TACC problems.

CONCLUSIONS AND RECOMMENDATIONS

In order to investigate real-time DMS functions on an associative processor, a hypothetical Associative Processor Computer System (APCS) was described. The description was carried to the instruction level in order to have an associative machine on which to code Air Force Tactical Air Control Center (TACC) and update and retrieval problems for comparison with corresponding conventional (sequential) codings. A TACC testbed scenario describing an operational demonstration of the TACC Current Operation functions was analyzed, and from this analysis it was concluded that of the DMS functions most able to use an associative processor, the great majority of tasks in an operational situation fell in the three data management areas of retrieval, update, and search, with 60 percent of the total processing tasks being associated with data search and retrieval. For this reason, it was decided to investigate Search and Retrieval and Update subfunctions on the APCS. This investigation involved coding Search and Retrieval and Update problems for the APCS and for a conventional computer, an IBM 370/145.
From an analysis of various conventional physical data organizations, the random-access storage structure was chosen for comparison with the APCS because of its similarity to testbed data organizations, because of its widespread use in other systems,16 and because of other studies relating to list organizations.10 Hierarchical and non-hierarchical record structures were investigated. APCS performance improvements, normalized to the IBM 370/145 computer, varied from 32 to 110 times faster for Search and Retrieval and from 15 to 210 times faster for Update; this assumes that a half-million-byte mass storage device (semiconductor memory) with a parallel I/O bandwidth of 1.6 billion bytes/second exists for loading the associative memory. By increasing the size of this device, more cost-effective performance ratios were achieved for the two DMS measures; other mass storage devices which may be more cost-effective for associative memories are bubble memories, fixed-head-per-track discs,17 and LSI memories.

From the testbed scenario analysis, three critical real-time functions were selected to provide us with a real Air Force problem to investigate. We obtained program listings for these functions and converted them to analogous APCS code. The JOVIAL language, augmented with APCS instructions, was used for coding the functions. Due to the data structuring and coding techniques used, a 3:1 improvement, normalized to the testbed IBM 1800 computer, was initially shown for the APCS. This improvement ratio was due to the fact that almost a literal translation was made for converting testbed data structures to the APCS. Also, the JOVIAL techniques used for coding the APCS required twenty-five times as much overhead for passing parameters between subroutines as did the testbed code. By restructuring the testbed data in order to gain more parallelism and by reducing the parameter passing overhead to that of the testbed, we concluded that a 60:1 improvement could be gained for the APCS, and we would expect to obtain that ratio upon repeating the measures. This assumes that a sufficient quantity of mass storage exists for swapping data into the AMs with a bandwidth of 1.6 billion bytes/sec.

Based on our study, recommendations are made for future associative processor research studies relating to high-order languages, firmware implementation techniques, generalized tag operations, LSI design and AP cell implementation, and memory organization and data movement.

ACKNOWLEDGMENT

This work was sponsored by Air Force Systems Command, Rome Air Development Center, Griffiss AFB, New York, under Contract No. F30602-72-C-0112.

REFERENCES

1. Linde, R. R., Gates, L. R., Peng, T., Application of Associative Processing to Real-Time Data Management Functions, Air Force Systems Command, Rome Air Development Center, Griffiss AFB, New York, 1972.
2. Minker, J., A Bibliography of Associative or Content-Addressable Memory Systems-1956-1971, Auerbach Corporation, Philadelphia, 1971.
3. Session I-Parallel Processing Systems, 1972 Wescon Technical Papers, Los Angeles Council, IEEE.
4. Dugan, J. A., Green, R. S., Minker, J., Shindle, W. E., "A Study of the Utility of Associative Memory Processors," Proc. ACM National Conference, 1966.
5. Parhami, B., "A Highly Parallel Computing System for Information Retrieval," AFIPS Conf. Proc., Vol. 41, Part II, pp. 681-690, 1972.
6. DS/2 Data Management System Technical Description, System Development Corporation, Santa Monica, California.
7. CDMS Data Management System Technical Description, System Development Corporation, Santa Monica, California.
8. Kellogg, C. H., "A Natural-Language Compiler for On-Line Data Management," AFIPS Conf. Proc., Vol. 33, Part 1, 1968.
9. Hazle, M., AESOP-B Data Management System, MITRE Corporation, MTR-851, 1970.
10. DeFiore, C. R., An Associative Approach to Data Management, PhD Thesis, Syracuse University, 1972.
11. Childs, D. L., "Description of Set-Theoretic Data Storage," IFIP Cong. Proc., 1968.
12. Codd, E. F., "A Relational Model of Data for Large Shared Data Banks," Comm. ACM, June 1970.
13. DeFiore, C. R., Stillman, N. J., Berra, P. B., "Associative Techniques in the Solution of Data Management Problems," Proc. ACM National Conf., 1971.
14. Rudolph, J. A., "A Production Implementation of an Associative Array Processor-STARAN," AFIPS Conf. Proc., Vol. 41, Part I, 1972, pp. 229-242.
15. Definition of General Purpose and Tactical Air Control Center Functional Software, System Development Corporation, TM-LX-346/200/00, Lexington, Mass., 1971.
16. Dodd, G., "Elements of Data Management Systems," Computing Surveys, June 1969.
17. Barnes, G. H., et al., "The ILLIAC IV Computer," IEEE Trans. Computers, Vol. C-17, pp. 746-751, August 1968.

A computer graphics assisted system for management

by ROHI CHAUHAN

Tektronix, Incorporated
Beaverton, Oregon

INTRODUCTION

Everyone agrees that a "picture is worth a thousand words". However, it is not yet obvious that it would make good business sense for managers to make liberal use of Computer Graphing in almost all phases of business decision-making. This paper asserts that it is convenient and practical where something worthwhile can be gained by study of the variation of such important parameters as demand, inventory level, or sales, with time or with respect to another independent variable. This assertion is predicated on the assurance that:

A. Graphs can be obtained easily as soon as they are needed and in the form they are needed.
B. Graphs are very easy to obtain and modify, and
C. Graphing is cost effective.
Some examples of activity areas where business data graphing is known to be definitely profitable are corporate planning, purchasing, resource allocation, production scheduling, and investment portfolio analysis. However, today, only in a very few management decision-making processes can it be said that computer data graphing is being used as a daily routine. The reasons are primarily that the three conditions mentioned above have not so far been met to users' satisfaction.

The need for easy graphing with desired flexibility dictates use of computer assisted graphing on display terminals providing both graphic output as well as graphic input and hard copy capabilities. Management systems involving such Computer Display Terminals have been, until recently, quite expensive,1,2 and difficult to justify for common business applications. Also, the choice of computer display terminal suppliers and their product lines were very limited. The software packages that would really make the job of a "business programmer" easy were practically non-existent. As such, the application of Display Terminal Graphics in business decision-making has remained limited to a very few places, like IBM,3 and there too apparently on somewhat of an experimental basis. A significant number of industries have been using plotter type graphics,4 but only sparingly because of inconvenience, lack of flexibility, and expense.

However, after introduction of a less than $4,000 Computer Display Terminal by Tektronix, Inc., in October 1971, the use of high speed interactive graphing terminals in several phases of business planning and control activities has now become an economically practical reality. Several companies are known to have implemented, rather quickly, some simple but elegant and profitable graphics assisted systems. Two companies that have publicly talked about their applications are Uniroyal5 and International Utilities.6 This paper will suggest how it is possible to configure simple low cost Decision Support Systems, via description of a system called GAMA-1, for Graphics Assisted Management Applications. This system is being used at Tektronix Inc., Beaverton, Oregon, for corporate planning and production scheduling purposes. The discussion is focused on characteristics of such business systems, software architecture, simplicity of design, and ease of its usage, all of which, of course, is with reference to the GAMA-1.

SYSTEMS CONFIGURATION

The GAMA-1 system is an imaginative attempt at exploiting capabilities furnished by modern computers in a timesharing environment. Tektronix 4010 type computer display terminals, Tektronix 4610 type hardcopy units for the 4010's, knowledge of quantitative methods that are frequently used in the decision-making process, simple data organization concepts, and FORTRAN are utilized. Figure 1 shows a multi-terminal GAMA-1 system's configuration. The computer system presently being used is a PDP-10.

Figure 1-GAMA-1 system's configuration (4010 computer display terminals and 4610 hard copy units attached to a timesharing computer system)

FUNCTIONAL REPRESENTATION

As depicted in Figure 2, the GAMA-1 system can be functionally used in three ways, i.e.,

1. Graphing of user data files (complete with specified annotation), movement of the produced pictures as desired to fit suitably on the screen, and finally, making hard copies for a report.
2. Manipulation of user data, e.g., adding and subtracting of two series, updating, smoothing, etc., and then performing the task described above, in step 1.
3. Statistical analysis of the user data, systematic making of forecasts, and generation of reports as in step 1.

For data manipulation, analysis, and forecasting, several useful techniques such as adaptive exponential smoothing, census method X-11, pairing, and regression analysis are furnished, together with the simple but commonly used methods of moving averages and forecasting by graphing and inspection. Also, hooks have been designed into GAMA-1, so that programs of other desirable techniques can be integrated into the system.

Figure 2-GAMA-1 functional representation (data files feed graphing and picture manipulation, data manipulation, and data analysis and forecasting, producing 8-1/2 x 11 reports)

SOFTWARE ATTRIBUTES

The GAMA-1 software has been designed to provide the following most desired features into the system.

1. An ease of use as reflected by the self-guiding, conversational nature of the system.
A conversational program is normally thought of as one where the user responds to the queries from the system, one at a time. However, this standard can be surpassed by making all possible choices open to the user known to him at all times. This has been accomplished in the GAMA-1 software by extensive display of menus of the available choices and use of the graphic cross hair for selection of the chosen menu item.

2. Ability to combine usage of several data manipulation, analysis, and forecasting techniques as desired, i.e., adaptability to varying user requirements.
3. Liberal use of graphing throughout the system for easy comprehension of results of various GAMA-1 operations.
4. Easy interfaceability with the user's own programs, furnishing capability of further growth of the system.
5. No need for the GAMA-1 users to do any applications programming.
6. Mechanism for saving, if so desired, the results that are generated during a GAMA-1 session in the format in which they can be directly fed back as data into subsequent GAMA-1 sessions.
7. Extensive report generation capabilities allowing the user easily to compose his report pages consisting of graphs of original data as well as the results of the GAMA-1 analysis programs saved before. Complete control over the size and positioning of the graphs, form of the axes, grid super-imposition, alphanumeric annotations (both vertical and horizontal), and movement of the report elements on a page is provided. Also, any page of a saved report may be retrieved, modified, and saved again. This feature can be exploited for storing the frequently used preformatted report pages, retrieving them later as needed, and filling in the blanks (with graphs of new data or annotations) for making up a new report.

SYSTEMS ARCHITECTURE

The GAMA-1 is a file oriented system designed to function in one of the five GAMA modes at any one time, i.e.,

1. Option Select Mode (OSM),
2. Data Manipulation Mode (DMM),
3. Analysis and Forecasting Mode (AFM),
4. Report Generation Mode (RGM), or
5. Help Mode

As illustrated in Figure 3, OSM is the central interface mode which must be entered before any other mode can be invoked. A typical GAMA-1 session will mostly consist of operations in one or more of the three work modes, i.e., the DMM, AFM, and RGM. The system's activity in any of these three work modes is performed by a set of applications programs whose purpose is related to the nomenclature of the modes. During a normal conversational session the system uses three types of disk files, namely, GAMA FILE, DATA FILE, and REPORT FILE, whose simplified functional relationships are shown in Figure 3.

Figure 3-GAMA-1 file relationships (DATA FILE, GAMA FILE segments, and REPORT FILE)

Gama file

The GAMA FILE is the central work file used for conducting information flow between various modules of the GAMA-1 software. It is read by the programs in DMM, AFM, and RGM, and it can be modified by the programs in DMM and AFM. It can have several segments in it, each of them being a binary representation of the original data or of the data resulting from operations on a GAMA FILE segment by one of the applications programs.
Use of this technique saves information storage costs and adds to run time efficiency. Figure 4 describes the general design philosophy of the GAMA FILE. It is simple. A complete display normally consists of more than one display element, which can be added to, removed from, or moved on the display screen, but may not be subdivided. The display elements may be either a "Graph" or an "Annotation", as defined below.

Figure 4-GAMA FILE layout (segments chained by next-segment pointers; within a segment, display elements chained by next-display-element pointers, each carrying element-type-dependent information; a file may hold several segments or report pages, ending with an end-of-file mark)

Graph-Binary representation of data, both numeric (to be graphed) and alphanumeric, that must be treated as one unit. For example, a GAMA FILE segment generated from a DATA FILE (Figure 5) would contain only one element, which would be of type "GRAPH".

Annotation-A display element created by the GAMA-1 programs during DMM, AFM, or RGM under user control. Annotations do not have names. Annotations cannot exist outside of RGM unless they are associated with a graph. Annotations may be either graphic or alphanumeric.

Alphanumeric Annotation-A string of alphanumeric characters, alternately referred to as labels. Multiple lines are permitted, in both horizontal and vertical configurations.

Graphic Annotation-Lines or points that have been added to a "Graph" by graphic cursor input.

Data file

The DATA FILES, to be created from raw data by using the computer system's text editor, contain data to be analyzed by GAMA-1 in character strings. The data is classified under four categories, which must follow after declaration of the headers by the same name. Figure 5 shows an example of the DATA FILE.

/TITLE "XYZ COMPANY"; "ABC DIVISION"; "CONSOLIDATED SALES"
/TYPE* MONTHLY FROM 6706 TO 6805
/YUNITS "THOUSANDS $"
/XUNITS "YEARS"
/DATA
255 76 179 87 98 140 82 29 80 53 31 16 100

Figure 5-A DATA FILE using one year of data

It is to be noted that:

• A character string following a "/", until a blank space is encountered, constitutes the HEADER. Once a header has been declared, it is associated with the data following it until a new header is declared or an end of file is encountered.
• All information is entered in free format separated by blanks. The entry of each line is terminated by a carriage return.
• The actual numbers representing data must start on the next line following the header /DATA. Also, the data must be the last entry.
• Length and order of /TITLE, /TYPE, /XUNITS, and /YUNITS type information is arbitrary inasmuch as it occurs before /DATA. Any of these types may also be omitted.
• A ";" in the TITLE type information indicates the beginning of a new line.

The DATA FILES can be generated either by using a text editor, from raw data, or from a GAMA FILE segment by using the INTERPRT command in DMM. A DATA FILE is read only by using the CREATE command in DMM for generation of a GAMA FILE segment; it is the GAMA FILE segment which is read or written by all other applications programs. After a GAMA FILE segment has been made, the corresponding DATA FILE can be deleted because it can always be re-created by selection of the INTERPRT command in the DMM when required.

* Possible types are DAILY, WEEKLY, MONTHLY, PERIODICAL, QUARTERLY, YEARLY, and PAIRS. In the case of PAIRS the format will be /TYPE PAIR nnn, where nnn is the number of pairs; then, in the /DATA, all the X components of the pairs are listed first, followed by the Y components.
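The header conventions above lend themselves to a very small parser. The sketch below is purely illustrative (GAMA-1 itself was written in FORTRAN, and the function name is invented); it reads a DATA FILE such as the one in Figure 5 into a dictionary keyed by header.

def read_data_file(path):
    """Parse a GAMA-1 style DATA FILE: free-format '/HEADER ...' entries,
    with the numeric series following the /DATA header last."""
    fields, current = {}, None
    with open(path) as f:
        for line in f:
            for token in line.split():
                if token.startswith("/"):             # new header, e.g. /TITLE
                    current = token[1:].rstrip("*").upper()
                    fields.setdefault(current, [])
                elif current is not None:
                    fields[current].append(token)
    if "DATA" in fields:                              # the series to be graphed
        fields["DATA"] = [float(v) for v in fields["DATA"]]
    return fields

# e.g. read_data_file("sales.dat")["DATA"] -> [255.0, 76.0, 179.0, ...]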
Report file

In the Report Generation Mode (RGM), a report page is first composed in a one-dimensional real array, called the Display Data Array (DDA), and later saved as a page in the REPORT FILE. A report page, i.e., the DDA of the RGM, consists of all information that is on display on the screen, with the exception of the GAMA-1 system prompts. While in the RGM the DDA always contains all information that is necessary to reproduce the current display. When a display is saved, the DDA is written into the REPORT FILE on disk with the specified page number. The DDA may be filled up gradually, via usage of RGM commands, from information in the GAMA FILE segments, or it can be loaded from an existing page in the REPORT FILE.

It is to be noted that the information in the DDA is organized in the form of one or more linked display elements. Each of these elements can be manipulated (i.e., deleted, moved, etc.) as one unit by the RGM commands. Also, the other types of display elements have been linked via pointers to allow one refresh* routine to scan through and redraw the complete display in DMM, AFM, or RGM. This feature has been made possible by the choice of identical structures for the display elements in both GAMA FILE segments and the display data array.

GAMA-1 USAGE CONSIDERATIONS

Because of the system's self-guiding, responsive, and conversational nature, a GAMA-1 session is naturally very creative and interesting. Unlike most systems, a user of GAMA-1 is not required to guess and key in the answers at every stage. All alternatives available to a user are displayed before him as a menu on the right hand margin of the display terminal screen, as shown in Figure 6. A selection of any menu item can be readily made by positioning just the horizontal line of the graphic crosshair cursor over the desired item and touching a key. Quite often, selection of one menu item results in the display of a subsidiary menu comprising the alternatives that are available in relationship to the previously selected activity. For example, with reference to Figure 6, once the RGM's DISPLAY command in menu I is chosen, menu II appears.

Figure 6-Some of the GAMA-1 menus (the OSM, AFM, and RGM main menus, the DISPLAY menu in RGM, and the graphic crosshair cursor)

A convenient mechanism for transfer of control from one set of activities to another is built into the system. For example, input of a "$" character in response to any input request, anywhere, results in cancellation of the ongoing activity, and the main menu of that mode is activated.

After a while in a GAMA-1 session, a display often gets cluttered because of system prompts and menus. For such situations, a clean, clutter-free, and current display (i.e., showing the effect of MOVE and DELETE commands and the addition of any new display elements) can be obtained by either selection of the REFRESH command or typing an "*" followed by a carriage return.

* Selection of the REFRESH command clears the entire screen and displays a fresh page. It is often used to redisplay the same information after erasing the screen, free of the unnecessary clutter.
Default options

Inasmuch as an extensive usage of the menu mechanism furnishes complete control over selection of data manipulation, analysis, and forecasting techniques, and picture (i.e., graph and annotations) manipulation (Figure 7), certain default options are also available to the users. For example, referring again to Figure 6, when menu II is active, selection of END draws the picture with the alternatives that have been chosen in menu II and returns control to the main RGM menu, i.e., menu I. If no alternatives in menu II are chosen before END is activated, a graph using the default options is produced. As is evident from Figure 6, a GRAPH command for a quick look at graphic representations of data, without the bother of choosing any display options whatsoever, is also provided in the DMM and AFM so that application of the right techniques can be facilitated.

Figure 7-Some picture positioning options: a true hard copy of the display

Control file option

It may be desirable to speed up a GAMA-1 session by cutting down on the bulk of conversation with the system. Such a facility is implemented via a control file option. If this option is selected, completion of a normal GAMA-1 session results in the saving of a control file which can later be executed any number of times to reproduce the same GAMA-1 session with different sets of data. The system, when run as governed by a control file, asks only a minimum number of necessary questions, such as identification of the new data, etc. This option will be particularly appreciated by managers and those users who do not have a need to get into the "nitty gritty" of the GAMA-1 system. They can have a control file for a session tailored to their requirements by an analyst and execute it, with their data, for results with just about no questions asked.

HELP TO USERS

While in the Option Select Mode (OSM), users have a choice of selecting the HELP Mode for a reasonably detailed description of the GAMA-1 system. In this mode, an attempt is made to advise users of the system's various capabilities and to furnish adequate guidance for their usage in a conversational style. A limited (one page only) amount of information pertaining to a particular work mode (e.g., DMM, AFM, or RGM) is also available if the HELP command is selected out of the main menu in the respective mode.

THE GAMA-1 RESULTS

Results generated during activities in DMM and AFM can be saved into the GAMA FILE as new segments or as replacements of the existing segments. Each of the final report pages can be saved, with respective page numbers, into the REPORT FILE. Figures 8 and 9 are reproductions of true hard copies of three pages, for example, of a report generated in the RGM. Tabular outputs are produced separately by each applications program.

Figure 8-Method of inspection in AFM

Figure 9-Two report pages

SUMMARY
At last, the long awaited usage of Computer Graphics in business decision-making is here. It is now also cost-effective. Looking ahead, it appears certain that after five years, if not sooner, we will be able to say that the largest numbers of applications of computer graphing terminals are in business, not in computer aided design as today. Systems such as GAMA-1 are on the threshold of emergence all over. They are very versatile in being potentially applicable to a variety of business environments, separately as well as a functional module of complex Corporate Information Systems. Future developments will be in the area of taking advantage of distributed computing in business decision support systems, exploiting minis or intelligent terminals for processing of pictures and small applications functions locally, leaving the large number crunching jobs to the big computers in remote locations. The possibility of simple turnkey systems using a large mini is also very promising.

ACKNOWLEDGMENTS

This paper would be incomplete without due acknowledgments to Roz Wasilk, my colleague, for her help by participation in discussions, in working out the details in various file designs, and in programming several modules of the GAMA-1 software. The author also wishes to express his sincere gratitude for the foresight demonstrated by Peter G. Cook and Larry Mayhew in accepting my ideas and furnishing necessary encouragement and support without which the project would never have gotten off the ground. Finally, my thanks to Judy Rule who has been a great help by enthusiastically making sure that the typing and artwork were completed in time, at very short notice, and to a number of other individuals who have contributed to the success of GAMA-1.

REFERENCES

1. Machover, C., "Computer Graphics Terminals-A Backward Look," AFIPS Conference Proceedings, Vol. 40, 1972, AFIPS Press, Montvale, N.J., pp. 439-446.
2. Hagan, T. G., Stotz, R. H., "The Future of Computer Graphics," AFIPS Conference Proceedings, Vol. 40, 1972, AFIPS Press, Montvale, N.J., pp. 447-452.
3. Miller, I. M., "Economic Art Speeds Business Decision-making," Computer Decisions, Vol. 4, No. 7, July 1972, Hayden Publishing Co., Rochelle Park, N.J., pp. 18-21.
4. Shostack, K., Eddy, C., "Management by Computer Graphics," Harvard Business Review, November-December 1971, pp. 52-63.
5. Friedman, H., Scheduling with Time-Shared Graphics, Joint ORSA/TIMS/AIIE Meeting, Atlantic City, N.J., 1972. Unpublished.
6. Curto, C. J., Analytic Systems for Corporate Planning-The State-of-the-Art, Joint ORSA/TIMS/AIIE Meeting, Atlantic City, N.J., 1972. Unpublished.

On the use of generalized executive system software

by WILLIAM GORMAN

Computer Sciences Corporation
Silver Spring, Maryland

INTRODUCTION

The characteristic of third generation computing systems that most distinguishes them from previous ones is that they are designed to perform multiprogramming. The purpose of multiprogramming is cost-effective utilization of computer hardware, which is achieved by reducing the CPU time otherwise lost waiting for completion of I/O or operator action. An operating system is necessary to achieve multiprogramming: to schedule jobs, allocate resources, and perform services such as I/O. Since these systems must be very generalized in order to accommodate the vast spectrum of potential application program requirements, they require some specific information from the user if they are to perform effectively. To supply the needed information and intelligently (efficiently) use the system, then, the user must have some understanding of the operating system's function as related to his particular needs.

A third generation computing system has so much generality that to use or understand it one must wade through stacks of manuals that seem neither clear nor convenient. Problems plague users, who get caught in a juggling act with executive system control language. The unwary become hopelessly involved, generating endless control card changes, with attendant debugging problems and loss of valuable personnel time. Other users have gotten almost mystic about their job control language (JCL), taking an "it works, don't touch it" attitude. With indirection such as this, even competent organizations can become very inefficient, using far more hardware, software, and human resources than are actually needed for the work at hand.

Before we surrender and send out an appeal for the "save money" salesmen, let's examine the purposes of executive system software and determine if the application of a little horse-sense doesn't go a long way toward solving our dilemma. Randell1 notes in his excellent paper on operating systems that the quite spectacular improvements that are almost always made by tuning services "are more an indication of the lamentable state of the original system, and the lack of understanding of the installation staff, than of any great conceptual sophistication in the tools and techniques that these companies use." Clearly, the key to effective use is understanding of the operating system and of its interface between a job and the hardware available to perform that job.

It is the purpose of this paper to suggest ways to effectively use a third-generation operating system. Most of the examples used will be from the most generalized and complicated of them all, IBM OS/360. Our examination of operating systems will begin with the typical hardware resources of a computing plant and the OS response to those resources. A brief overview of the user tools supplied by the operating system will then be presented, followed by discussions on bugs and debugging and other problems of performance. Our conclusion will cover the most valuable and cantankerous resource of all, the human one. Lack of space prevents a complete tutorial, but it is the author's hope that many questions and ideas will be raised in the reader's mind. Perhaps a thoughtful user may see ways to regain effective use of his computing facility, or, as Herb Bright says, "Learn to beat OS to its knees."
HARDWARE RESOURCES

CPU time

The first and most valuable resource we shall examine is that of computer time itself. Emerson has said, "Economy is not in saving lumps of coal but in using the time whilst it burns." So it is with computer time. Most large computer shops run their computers 24 hours a day, yet typically their central processing units are doing useful work for far too small a percentage of that time. Cantrell and Ellison3 note that "The second by second performance of a multiprogrammed system is always limited by the speed of the processor or an I/O channel or by a path through several of these devices used in series... If some limiting resource is not saturated, there must be a performance limiting critical path through some series of resources whose total utilization adds up to 100%." To achieve the theoretical potential of a computing system, we must manipulate it so as to increase the percentage of resource use. Analysis of the bottlenecks that cause idle time generally reveals that the resource needs of companion runs are in conflict; the job mix should be arranged so that those demands complement one another in such a manner as to gain greater use of the CPU.

There are three states of CPU time: wait, system, and active. System wait time is time when the CPU is idle. System time is that time spent in supervisory routines, I/O and other interrupt processing, and error handling, most of which is considered overhead. Active time is the time spent executing problem programs. Any reduction of system or wait time makes more time available for problems, thus contributing to greater efficiency.

I/O channel time

The channel handles the transfer of information between main storage and the I/O devices and provides for concurrency of I/O and CPU operation with only a minimum of interference to the CPU. Whenever I/O activity overloads a CPU, idle time can result because the CPU might be forced to wait for completion of I/O activity in order to have data to process. Such cases might be an indication of poor job mix. Problems also result from the frequency and duration of I/O activity. When data is moved in many small bursts, competition for channels and devices can markedly slow the progress of the operating system.

Main storage

Main storage, or memory, is probably the most expensive and limiting resource in a computing system, besides the CPU itself. Many programs use huge amounts of memory, often more than is available. Since the consumption of memory by programmers seems, like Parkinson's Law, to rise with availability, it is doubtful that expansion of memory will alone solve the average main storage problem. Expansion of memory without a corresponding increase in the average number of jobs residing in memory at execution time is a dubious proposition. Certain portions of the operating system must reside permanently in main memory in order to execute; but the basic system is too large, with many portions too infrequently used, to make it all resident.
Memory not used by the system then serves as a pool of storage from which the system assigns a partition or region to each job step as it is initiated. One memory scheme used by the 360 breaks memory up into fixed-length parts or partitions, and the user program is allocated the smallest available partition that will accommodate it. Another 360 scheme has the system allocate (de-allocate) memory at execution time in the specific amounts requested. This method is more complicated, with more overhead, but it permits a greater variation in the number and size of jobs being executed. The most common abuse of memory that I have observed is over-allocation, or more simply the request for greater amounts of memory than are used.

Fragmentation, a particularly frustrating problem, results from the requirement that memory be allocated to user jobs in single continuous chunks. As jobs of varying size are given memory, the memory assignments are at first contiguous to one another. When a job finishes, the space it occupied is freed and can be assigned to another job or jobs. However, if subsequent jobs require less than the full amount vacated, small pieces or fragments of unused memory occur and must wait until jobs contiguous to them are ended and can be combined back into usable size. As a result, when we wish to execute programs with large storage needs, the operator often must intervene and delay the initiation of other jobs until enough jobs terminate to create the necessary space. Thus, our CPU can become partially idle by virtue of our need to assemble memory into a single contiguous piece large enough to start our job.

Direct-access storage4

Direct-access storage is that medium (drum, disk, or data cell) where data can be stored and retrieved without human intervention. Modern computing demands could not be met without direct-access storage, and operating systems could never reach their full potential without it. The operating system uses direct-access to store system load modules and routines for use upon demand. Control information about jobs waiting to be processed, jobs in process, and job output waiting to be printed or punched is stored on direct-access devices by the operating system. The system also provides facilities whereby user programs have access to temporary storage to hold intermediate data.

Magnetic tapes

Data can be recorded on magnetic tape in so many different forms that we frequently sacrifice efficiency through lack of understanding. We often encounter difficulty with I/O errors not because of bad tapes, but rather due to incorrect identification to the operating system of recording format and such trivial things as record size. Further errors can develop from contradictions between our program's description and the JCL description of the same data. We generally inform the operating system of the recording format, etc., through JCL parameters. The system provides many services in the handling of tapes, one of the more important ones being the ability to identify data sets on tape by comparing JCL parameters with labels written as separate files in front of the data being identified. In my diagnostic work, I have identified more I/O errors as due to bad JCL and wrong tape mounts than as legitimate I/O errors. Due to the perishable nature of tape, provision for backup must also be made.

Unit-record devices

Printers, card readers, and punches all fit into this category. The operating system rarely reads or writes user data directly from user programs to these units.
Normally, data input from a card reader or output to a punch or printer is stored as an intermediate file on direct-access devices, so that the system can schedule the use of these relatively slow devices independently of the programs using them. High volume and slow speed can occasionally cause system degradation.

SOFTWARE TOOLS

Many of the tools of the operating system are independently developed segments or modules collected into libraries for use by the system and the user. Additional libraries are created to contain installation-developed routines, programs, and utilities.

Input/output control systems

The IOCS portion of the system automatically synchronizes I/O operations with the programs requesting them, provides built-in automatic error handling, and is further extended by system schemes to handle queues of I/O requests from many totally unrelated programs. The system also permits the user to change his output medium with only a simple change to JCL. Users need to write I/O code at the device level only when introducing unique, special-purpose hardware to a system.

Utilities

Supplied with our operating system are numerous service programs or utilities for performing frequently used operations such as sorting, copying, editing, or manipulating programs and data. Among the services supplied are programs to update and list source files, print or punch all or selected parts of data sets, and compare sets of data. Generally these programs are easy to use once learned, are control card driven, and have "... the priceless ingredient of really good software, abrasion against challenging users."2 They are generally stable from operating system release to release.

User-written utilities

This brings us to the subject of user-written utilities and programs. A search that I personally made at one installation uncovered over 20 user-written utilities, all occupying valuable disk space and all performing the same function: updating program source files. The only reason for this that I could discover was that the user was unable or unwilling to understand the utility already available. Despite what I termed waste, the writers, to a man, thought their approach sensible. Many useful utilities have been user-written, and often copies can be secured from the writers at no charge or low charge.

Programming languages and subroutine libraries

Programming languages have been developed to reduce the time, training, expense, and manpower required to design and code efficient problem programs. Essentially they translate human-readable code into machine-readable instructions, thus speeding up the programming process. With these language translators come subroutine libraries of pretested code to handle functions such as deriving the square root of a number or editing sterling currency values. It is beyond the scope of this paper to discuss language selection, but one note seems in order. When only one or
The linkage editor also permits us to create a program too large for available hardware by breaking it into segments that can be executed and then overlaid by other segments yet to be executed. The loader handles minor linkage tasks and physically loads into main storage the programs we wish to execute. JCLASA GLUE Operating system job control languages (JCL) have been established to enable us to bypass the operator and define precisely to the system the work we wish to perform. JCL reminds me of glue: used properly, it's effective; used poorly, it's a mess. I look upon the differences between the major operating systems as trade-offs between simplicity and flexibility. UNIVAC and most of the others have opted for simplicity, while IBM has stressed flexibility. For example, UNIVAC uses a very simple control language with singleletter keys that identify the limited range of options permitted via control card. IBM, on the other hand, allows extreme flexibility with literally dozens of changes permitted at the control card level-a not very simple situation. I consider 360 JCL to be another language-quite flexible, but unfortunately a little too complicated for the average user. To execute a job on the 360 we need three basic JCL cards: a job card to identify our job and mark its beginning, an execute card to identify the specific program we wish to execute, and data definition (DD) cards to define our data sets and the 1;0 facilities needed to handle them. When we supply information about a data set via JCL rather than program code, it becomes easier to change 206 National Computer Conference, 1973 parameters such as block size, type of 1/0 device used, etc., than it would be with other control languages, simply because· no recompilation is required as is frequently so with the other approaches. However, due to the complexity of the process we can unknowingly make mistakes. For example, to create printed output under 360 OS we need only code SYSOUT=A on the DD card describing the data set. Since printed output is usually stored on intermediate disk files, a block size is needed; but unless block size is specified, the output may end up unblocked. Also we might not be able to estimate the volume of printed output that would be generated before our job fails for lack of space allocated to handle the printed output. Numerous and extremely troublesome problems are generated when our use of JCL is uninformed or haphazard. The large number of JCL parameters required to properly execute a job introduces error possibilities due to sheer volume and an inability to remember every detail required by a large process. Even proficient JCL users may require several trial runs to iron out bugs, while uninformed users frequently give up and instead borrow JCL that allegedly works, even though that JCL may not really match their needs. It therefore becomes imperative that we devise ways to help users assume their proper responsibilities and get away from JCL as much as possible. IBM assists by making provisions for cataloged libraries of JCL called procedures. To invoke a p.i'Ocedure, a user need supply only a job card and an execute card for each procedure we wish to execute. Within a procedure, necessary details can be coded in symbolic form, with the procedure equating our symbols to proper JCL values. Any value so defined can be changed merely by indicating the symbol and its new value on the execute card invoking the procedure. 
We can also add or override DD cards and DD card parameters by supplying additional cards containing values to be changed or added. BUGS AND DEBUGGING Diagnosing bugs Diagnosing bugs in user programs requires a clear understanding of the relationship between system services and problem programs. Bugs call to mind complicated dumps and endless traces, 110 errors that aren't 110 errors at all, and other frustrating experiences. Certain higher-level languages include debugging facilities, trace facilities, and other diagnostic capabilities that can further complicate the diagnostic process whenever they are unable to properly field and identify an error. Our problem, then, in debugging is the rapid reduction of bugs to their simplest terms, so that proper corrections can be easily and simply made. Here we see the need for informed diagnosticians. Worthwhile procedures for debugging Often there is more than one path to a problem solution. We should avoid the trial-and-error, pick-andchoose methods because they are expensive and generally unproductive. Here is a quick overview of my formula for diagnosing abnormal terminations ("ABEND"s). First, examine the operating system's reported cause for the ABEND. Try to get a clear understanding of why the operating system thought an error occurred. If any point is not clear, consult the appropriate reference manuals. Research the ABEND description until it is understood. Before progressing, ask these questions: Can I in general identify the instructions subject to this error? Can I recognize invalid address values that would cause this error? If either answer is yes, proceed; if no, dig some more. Next, examine the instruction address register portion of the program status word (PSW) which reveals the address of the next instruction to be executed. Check the preceding instruction to see if it was the one that failed. If this process does not locate the failing instruction, perhaps the PSW address was set as the result of a branch. Check each register at entry to ABEND. Do they look valid or do they look like data or instructions? Are a reasonable percentage of them addresses within our region of memory? If register conventions are observed, tracing backwards from the error point might reveal where a program went awrv. The beautv of higher-level languages is that they con;istently follo~ some sort of register use convention. Once these are learned, debugging becomes simpler. The process just described continues point by point backwards from the failure to the last properly executed code, attempting to relate the progress of the machine instructions back to the original language statements. If this process fails, attempt to start your search at the last known good instruction executed and work forward. The same kind of process is followed with 1/0 errors and errors in general: first identifying the exact nature of the error which the system believes to have occurred; next identifying via register conventions pertinent items such as the data set in error-going as far back into the machine level code as necessary to isolate the error type. I have a whole string of questions that I ask myself when debugging and it seems that I'm forced to dream up new ones constantly. Let me sum up my approach with four statements: • Get a clear understanding of the nature of the error. • Ask yourself questions that bring you backwards from failure point to the execution of valid code. 
• If this yields nothing, try to approach the error forward, working from the last known valid execution of code. On The Use Of Generalized Executive System Software • If none of these approaches gets results, try recreating the error in a small case. Perhaps you'll find that the first step-undertanding the error-was not really completed. PERFORMANCE PROBLEMS The improvement of system performance and the elimination of bottlenecks has attracted wide attention of late perhaps because economy and good business practic~ dictate that it be so. Unfortunately, no cookbook approach yet exists, and it remains up to us to discover one for ourselves. The tools are legion,5.6 and are sometimes quite expensive and difficult to interpret. The tools include accounting data, failure statistics, operator shift reports, various types of specially developed system interrogation reports, simulations, and hardware and software monitors. The availability of tools and the level of sophistication of systems personnel may dictate whether these tools are handled in-house or contracted out on a consulting basis. Our first step is to outline each system resource to determine its fit into the overall system scheme and how our use may affect that fit. A manager might begin this process with a sort of time and motion study, eliminating as many handling problems associated with introduction of jobs to the computer as possible and smoothing the work flow between users, schedulers, messengers, operators, etc., and the computing system itself. The worst bottleneck might, in fact, be the one that prevents a tape or a card deck from being where needed when needed. Assuming that these things have been accomplished, we then poll the operators and the users for their impressions of how well the system serves their needs, at which time we might be forced to reexamine our work procedures. Our next step properly belongs to systems programmers. Their study should concentrate on those aspects of the system that consume the greatest asset of all-time. There are many techniques and packages available that measure system activity, and these should be put to work. Since the system contains the most heavily used code, our systems programmer places the most active system modules in main memory, intermediate-activity modules in the fastest available secondary storage hardware such as drums, and lower-activity modules in larger-volume, lower-speed (and lower traffic-density) hardware, perhaps large disks or tape. The direct-access addresses of the most frequently fetched routines ought to be resident to eliminate searches for them, and where possible these data sets should reside on the devices with the fastest access speeds. System data sets on direct-access devices with movable arms should have the most active of these routines stored closest to their directories, so that seek arm travel is kept to a minimum. Educated trial and error is necessary before reasonable balance in these areas can be achieved. 207 N ext we study each of the online storage facilities and their use. Questions as to adequacy, reliability, method of backup, and recovery ought to be asked. Criteria for the allocation of direct-access space should be established based upon criticality of use, volume and frequency of use, and cost-effectiveness when compared with available alternatives. Follow this with the determination and elimination of unused facilities which should be identifiable through the tools previously mentioned. 
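The placement rule described above, with the most active modules resident in main memory, intermediate-activity modules on drum, and the rest on slower devices, can be expressed as a simple greedy assignment. The sketch below is a minimal illustration under assumed module names, activity counts, and tier capacities; it is not drawn from any particular measurement package.

```python
# Greedy placement of system modules into storage tiers by observed activity.
# Module names, fetch counts, and tier capacities are assumed for illustration.

modules = {                      # module -> fetches observed per hour
    "SVC-dispatch": 90000, "open-close": 22000, "catalog-search": 8000,
    "abend-dump": 300, "tape-label": 1200, "print-writer": 15000,
}
tiers = [("main memory", 2), ("drum", 2), ("large disk", 99)]  # (tier, slots)

def place(modules, tiers):
    """Assign the most active modules to the fastest tier with free slots."""
    placement = {}
    ranked = iter(sorted(modules, key=modules.get, reverse=True))
    for tier, slots in tiers:
        for _ in range(slots):
            name = next(ranked, None)
            if name is None:
                return placement
            placement[name] = tier
    return placement

for name, tier in place(modules, tiers).items():
    print(f"{name:15s} ({modules[name]:6d} fetches/hr) -> {tier}")
```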
After our system examination comes an examination of user programs and processes. In this part of our improvement cycle we first look at production programs, starting with the heaviest users of computer time and resources. Many times we find them in an unfinished state with improvements possible through the elimination of ~nnec essary steps. An important item to observe is the use of unblocked records or the operation of production programs from source rather than from object or executable load modules. Production programs that require the movement of data from card to disk or tape to disk preparatory to use should be avoided when a simple change to JCL makes this same data available directly to the program without the intervening steps. What is required is that an experienced, knowledgeable, and objective eye be brought to bear upon production programs and other user processes. Production programs should be examined to determine where the most CPU time is spent in executing and, consequently, what code could be improved to yield the best results. With OS 360, users can reduce wait time, system time, channel time, and device time by assembling records into blocks set as close as possible to addressable size. This reduces the number of times the I/O routines are invoked, as well as the number of channel and device requests and the seek time expended. With blocking, fewer I/O operations are needed and our programs spend less time in a nonexecutable state waiting on I/O completion. We gain additionally because blocking permits greater density on the storage devices. Many users are unaware of the fact that gaps exist between every record written on a track of a direct-access device. For example, the track capacity on an IBM 2314 disk is 7294 characters. If 80-byte card images are written as one block of 7280 characters, only one write is required to store 91 records on a track; yet if these records are written unblocked, only 40 records will fit on a track because of the inter-record gaps, and 40 writes are invoked to fill that track. Using large block sizes and multiple buffers introduces additional costs in terms of increased memory required for program execution. Obviously, we should balance these somewhat conflicting demands. A frustrating problem encountered by users of some systems is that of proper allocation of direct-access space. Since printed or punched output is temporarily stored on 208 National Computer Conference, 1973 direct-access, such a job's output volume needs to be known so, that the user can request the necessary space through his JCL. Jobs can abnormally terminate if insufficient space is allocated to data sets being created or updated. Over-allocation is wasteful and reduces space available for other jobs, as well as permitting excessive output to be created without detection. The space required should if possible be obtained in one contiguous chunk, as less CPU time and access time are used than if data is recorded in several pieces scattered across a unit. Another problem is locating files or even individual records within files. The system provides catalogs to point to the unit upon which a specific data set resides, but improper use or nonuse of these catalogs or of suitable substitutes can prevent a job from executing, due to an inability to identify where the data set resides. 
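The 2314 example above reduces to a small calculation. The sketch below uses a deliberately simplified track model, with an assumed overhead of about 101 bytes ahead of every block after the first, chosen only because it reproduces the figures quoted in the text (91 card images in one 7280-byte block versus 40 unblocked records per track); the manufacturer's track-capacity formulas are more involved.

```python
# Simplified model of a direct-access track: assumed capacity 7294 bytes, with
# an assumed ~101-byte inter-record overhead ahead of every block after the
# first. (Constants chosen to reproduce the figures quoted in the text.)

TRACK_BYTES = 7294
GAP_BYTES = 101

def blocks_per_track(block_size):
    """How many physical blocks of a given size fit on one track."""
    if block_size > TRACK_BYTES:
        return 0
    # first block costs block_size, each later block costs block_size + gap
    return 1 + (TRACK_BYTES - block_size) // (block_size + GAP_BYTES)

def compare(record_len, records_per_block):
    blocks = blocks_per_track(record_len * records_per_block)
    records = blocks * records_per_block
    writes = blocks                      # one I/O operation per physical block
    print(f"blocking factor {records_per_block:3d}: "
          f"{records:3d} records/track, {writes:3d} writes to fill the track")

compare(80, 1)    # unblocked card images  -> 40 records, 40 writes
compare(80, 91)   # one 7280-byte block    -> 91 records,  1 write
```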
The use of a catalog introduces the problem of searching the catalogs for the individual data set's identity; and if these catalogs are excessively long, useful time (both CPU and II 0) can be lost, since every request for a data set not specifically identified as to unit results in a search of the system catalogs. Because of the changing nature of user requirements, a data set occupying permanently allocated space might occupy that space long after it is no longer used, simply because we are unaware of the fact. Techniques exist to monitor space, but users can easily cheat them. Proper estimation and allocation of direct-access space needs is a must, as is the release of unused or temporary space as soon as its usefulness has ceased. At nearly any installation one can find unused data sets needlessly tying up valuable space and sometimes forcing the system to fragment space requests due to the volume of space so wasted. Proper tape handling is mainly a user problem. Blocking should be employed for efficiency's sake. Data should be blocked to the largest sizes possible, consistent with memory availability, to reduce the amount of tape required to contain a data set and the average I/O transfer time consumed per record. Use of the highest densities provides for faster data transfer and surprisingly greater accuracy because of the built-in error recovery available with high density techniques. To protect tapes from inadvertent destruction the use of standard-label tapes is encouraged as a site-imposed standard. This permits the operating system to verify that the tape mounted by the operators is the one requested by the user's program. When processing multiple volume files, two tape drives should be allocated, if available, to permit a program to continue processing rather than wait for the mounting of subsequent tapes when earlier tapes are completed. Further, users should free tape drives not being used in subsequent steps. Systems programmers usually provide for blocking of card input and certain output types through default values and control card procedure libraries. Users ought not to unblock this 1/ O. Careful reduction of the volume of printed data to that actually needed by the ultimate recipient serves both the user and the system by reducing output volume. A typical high speed printer can consume some 35 tons of paper a year, and I can't even estimate the average consumption of cards for a punch unit. Perhaps, like me, you flinch when you observe the waste of paper at a typical computer site. To further reduce the waste of paper, users are well advised to go to microfilm where these facilities exist, particularly for dictionary type output. It is amazing how many users are still dependent on punched cards in this generation. Processing large volumes of cards requires many extra 110 operations and machine cycles that could be avoided by having these data sets on tape or disk. True, to update a card deck one only needs to physically change the cards involved; but the use of proper update procedures with tape or disk is a far more efficient and accurate use of computer and human time. This brings us to the subject of software tool usage. The most frequent complaints associated with vendor-supplied utilities are that they are difficult to use and that the documentation is unclear. This is probably due to their generality. Another difficulty is finding out what is available. 
To answer the difficulty-of-use problem, we might point out that it is simpler and more productive to try to understand the utilities available than to write and debug new ones. A small installation cannot afford the luxury of user-developed utilities when existing ones will do the job. Often it is better to search for what one is sure must exist than to create it. Still another avenue to investigate would be whether other installations might have the required service routine developed by their users. Since available utilities are usually more or less unknown to installation users, consideration might be given to assigning a programmer the responsibility of determining the scope of the utilities available and how to use them. This information could then be passed on to fellow programmers. In fact, a good training course would justify its costs by eliminating unnecessary programming and enabling installations programmers to select and use utilities that perform trivial tasks quickly. With vendorsupplied utilities, the first use or two might appear difficult, but with use comes facility. The use of programming languages ought not be an ego trip. Programmers should be forced to practice "egoless programming"7 and to follow recognized practices. Efficiency dictates that programs be written in modular form and that they be straightforward, well documented, and without cute programming gimmicks or tricks, lest the next release of the operating system render the program nonexecutable. Programmers themselves should realize that they cannot escape later responsibility for the problems of "tricky" programs, as long as they work for the same employer. Our eval uation process then continues with research into how new programs and problem program systems are On The Use Of Generalized Executive System Software tested and developed. It is the writer's experience that only rarely has much consideration been given to the efficiency of program development and testing. Frequently the heaviest consumers of computer resources are programmers with trial-and-error methods of debugging and complex or "clever" coding. Again, understanding the system can yield us an insight into relative inefficiencies of program development and how they might be overcome. Once we have attempted everything within our means, we might then consider outside performance improvement or consulting services. A final suggestion on resource consumption is a point regarding cost allocation. If a resource does not cost the user, he or she is not likely to try hard to conserve it. Proper allocation of cost in relation to resource use is both a form of control and an attempt to influence users to adjust their demands to the level most beneficial to the system. In short, users ought not to be given a free ride. A CASE FOR SPECIALISTS Merging the many parts of a third generation computing system into an effective problem-solving instrument requires that we inform the system of a myriad of details. Efficiency and economy in their most basic forms dictate that we simplify our work as much as possible. Some way must be devised to supply to the system with the detail it needs without typing up some 40 percent (as has actually happened-I cannot bear to cite a reference) of programmer time with JCL and associated trivia. What we are striving for is a lessening of the applications programmer's need to know operating system details. 
As already noted, the first step is to have knowledgeable system programmers supply as many efficient procedures and other aids as possible and to have them generate a responsive system. Also, those same systems people can generate libraries of control language to cover all the regular production runs (and as many of the development and other auxiliary processes as are practical after sufficient study). I propose, however, that we go one step further in this area. Chief programmer team One exciting concept to emerge recently has been that of the Chief Programmer Team. s Significantly increased programmer productivity and decreased system integration difficulties have been demonstrated by the creation of a functional team of specialists, led by a chief programmer applying known techniques into a unified methodolog-y. Managers of programming teams would do well to study this concept. Properly managed, this process has the programmer developing programs full-time, instead of programming part-time and debugging JCL part-time. 209 Programmer assistance9 Our examination of the use of third generation systems has continually pointed out the need for including sufficiently knowledgeable people in the process of system use. A focal point for users needing assistance should be created. This is usually done anyway informally, as users search out fellow programmers and systems people who might have answers to their problems. I propose that experienced, knowledgeable, and systems oriented people be organized into a team to answer user questions and to provide diagnostic assistance. This same group could aid in developing standards, optimizing program code, and teaching courses tailored to user needs. My own experience in a programmer assistance center has shown that such services greatly increases the productivity of a DP installation's personnel. Diagnostic services A cadre of good diagnostic programmers should be used to assist programmers who are unable to isolate their program bugs or who need assistance with utilities, JCL, or any other aspect of the operating system. Such a group should keep a catalog of the problems encountered for handy future reference and as a way for determining personnel training needs or system enhancements. The group could aid in locating and correcting software errors by creating small kernels or test programs designed solely to re-create the error. Through the use of such kernels, various error solutions could be tested without disturbing the users main program or process. Program optimization service These same personnel might also be charged with research into and development of simple and efficient programming techniques. These techniques could then be implemented in the optimization of the most heavily used programs or systems. Once we identify our largest program consumers of computer time and their most heavily used routines, we can find it cost-effective to thoroughly go over such code, replacing the routines identified with more efficient ones. Known programming inefficiencies and their correction might be identified by an occasional thorough review of the computer-generated output. For example, the entire output of a weekend might be reviewed in search of poor use of I/O processing. Runs with poor system usage might then be earmarked, and the programmers responsible, notified and given suggestions for possible improvements. The most frequently encountered poor practices might then be subjects of a tutorial bulletin or training session for programmers. 
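As a concrete illustration of how such a group might locate the routines worth hand-tuning, the sketch below attributes sampled instruction addresses to routines through a load map and ranks the routines by their share of samples, in the spirit of the sampling monitors cited earlier. The addresses, routine names, and sample data are invented for the example.

```python
# Attribute sampled instruction-counter addresses to routines via a load map,
# then rank routines by the share of samples received. Map and samples are
# invented for illustration.
from bisect import bisect_right
from collections import Counter

load_map = [                 # (start address, routine name), sorted by address
    (0x2000, "READMSTR"), (0x2C00, "EDITRTN"), (0x3800, "SORTKEY"),
    (0x4A00, "PRINTRPT"), (0x5200, "end-of-program"),
]
samples = [0x3804, 0x2C10, 0x3940, 0x3A20, 0x2010, 0x3810, 0x4A08, 0x3900]

starts = [addr for addr, _ in load_map]

def routine_for(address):
    i = bisect_right(starts, address) - 1
    return load_map[i][1] if i >= 0 else "unknown"

counts = Counter(routine_for(a) for a in samples)
for name, hits in counts.most_common():
    print(f"{name:15s} {hits:3d} samples ({100.0 * hits / len(samples):4.1f}%)")
```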
Conventions and standards "Standards" is a dirty word to many people, but when large numbers of programming personnel are found to be 210 National Computer Conference, 1973 employing poor practices, our systems group would be charged with developing optimum alternatives. Effective streamlining of frequently used facilities could be accomplished through the publication of tested techniques and standards or conventions. With standards for guidance, the user has a yardstick to determine if his utilization of system resources meets a minimum acceptable level. Prior to the design of new problem program systems, these standards would play an important role in ensuring that optimum use of available facilities was achieved. "Structured Programming"IO,ll and other developing techniques for making the programming practice manageable should be researched and used as the basis for developing usable installation standards. Instructors The accumulation of expertise within a single group would be almost wasteful if this knowledge were not disseminated among the users of the system. A logical extension to our assistance group, then, would be the assignment of instructors who would conduct tutorial seminars and develop tailored training courses and user manuals. SUMMARY I have attempted in this paper to give you my views on coping with a highly generalized operating system. I have found that complexity is the main problem facing users of third generation systems. My formula suggests that we insulate the general user I programmer from OS detail to the maximum extent possible, and that we provide the user community with a technically competent consultative group to assist in fielding problems with which the user need not be concerned. ACKNOWLEDGMENTS I wish to extend my sincere thanks and appreciation to Messrs. Evmenios Damon and Chesley Looney of the National Aeronautics and Space Administration, Goddard Space Flight Center; Carl Pfeiffer and Ian Bennett of Computer Sciences Corporation; Dave Morgan of Boole and Babbage, Inc.; and John Coffey of the National Bureau of Standards. In particular, I wish to thank Herb Bright of Computation Planning, Inc., for his many hours of discussion and words of encouragement. REFERENCES 1. Randall, B., "Operating Systems: The Problems of Performance and Reliability," IFlPS 71, vol. 1,281-290. 2. Bright, H. S., "A Philco multiprocessing system," Proc, FJCC'64 Parl II, AFIPS Vol. 26 pp. 97-141, sec. 14 par. 1, Spartan Books, Washington 1965. 3. Cantrell, H. N., Ellison, A. L., "Multiprogramming System Performance Measurement and Analysis," AFIPS Conference Proceedings SJCC, 1968, pp. 213-221. 4. Gorman, W., Survey and Recommendations for the Use of DirectAccess Storage on the M&DO IBM System/360 Model 95 Computer, Computer Sciences Corporation Document No. 9101-08800OlTM, (Jan. 1973), NASA Contract No. NAS 5-11790. 5. Lucas, H. C., Jr., "Performance Evaluation and Monitoring," ACM Computing Surveys, 3,3 (Sept. 71), pp. 79-91. 6. Sedgewick, R., Stone, R., McDonald, J. W., "SPY: A Program to Monitor OS/360," AFIPS Conference Proceedings FlCC, 1970, pp. 119-128. 7. Weinberg, G. M., The Psychology of Computer Programming. New York: Van Nostrand Reinhold, 1971. 8. Baker, F. T., "Chief Programmer Team Management of Production Programming," IBM Systems Journal, 11,1 (1972), pp. 56-73. 
This topic was also discussed by Harlan Mills of IBM in his paper, "Managing and motivating programming personnel," given in this morning's session on Resource Utilization in the Computing Community.
9. Damon, E. P., "Computer Performance Measurements at Goddard Space Flight Center," CPEUG-FIPS Task Group 10, National Bureau of Standards, Oct. 1972.
10. Dijkstra, E. W., "The Structure of the 'THE' Multiprogramming System," CACM, 11, May 1968, pp. 341-346.
11. McCracken, D., Weinberg, G., "How To Write a Readable FORTRAN Program," Datamation, 18, 10, Oct. 1972, pp. 73-77.

Language selection for applications

by M. H. HALSTEAD
Purdue University
Lafayette, Indiana

It may strike you as a truism to note that if the solution to a problem depends upon too many variables, we are apt to reduce the multiplicity of variables to one, base an important decision upon that one, and thereafter proceed as though there had never been, and would never be, any reasonable alternative to our choice. This approach, when applied to the problem of choosing a programming language with which to implement a given class of problems, results in advice of the following sort: For scientific and engineering problems, use FORTRAN; for problems with large data bases, use COBOL; for command-and-control, use JOVIAL; and for systems implementation, use PL/S. Except for academic work avoid ALGOL; and with respect to those opposite ends of the spectrum, APL and PL/I, merely report that you are waiting to see how they work out.

Now, obviously, advice of this type might be correct, but just as clearly, it will sometimes be erroneous, primarily because it ignores most of the variables involved. It would seem only prudent, therefore, to examine some of those dimensions along which languages might be compared, with a view toward increasing our understanding of their relative importance in the decision process. However, there are four items which we should discuss in some detail first. These are:

1. How a programmer spends his time.
2. The difference between local and global inefficiency.
3. The role of the expansion ratio.
4. Variance in programmer productivity.

For the first item, I will use the best data of which I am aware, from an unpublished study of Fletcher Donaldson,1 done some years ago. Without presenting the details of his study, we note that his measurements of programmers in two different installations, when reduced to hours of programming activity per 40 hour week, yielded figures for Group 1 and Group 2 across the following activities:

A. Understanding Objective
B. Devising Methods
   B1. Finding Approach
   B2. Flow Charting
C. Implementation
   C1. Writing Program (Coding)
   C2. Preparing Test Data
D. Testing
   D1a. Finding Minor (Syntactic) Errors
   D1b. Correcting Minor Errors
   D2. Eliminating Test Data Errors
   D3a. Finding Major (Logical) Errors
   D3b. Correcting Major Errors
E. Documenting

(Group 1 totaled 40 hours per week; Group 2, 40.00 hours.)

From these data, let us combine those activities which are almost certainly independent of any language which might be chosen: Understanding the Objective, and Finding an Approach. For installation 1 this gives 7 hours, and for installation 2 it gives 7.14 hours, or roughly one day per week. From this it is apparent that, no matter how much improvement might be expected from a given language, it can only operate upon the remaining four days per week.

With respect to the problem of global versus local inefficiencies in programs, there are even fewer data, but the broad outlines are clear, and of great importance. Let us look first at local inefficiency. This is the inefficiency in object code produced by a compiler which results from the inevitable lack of perfection of its optimizing pass. According to a beautiful study reported by Knuth,2 the difference to be expected between a "finely tuned FORTRAN-H compiler" and the "best conceivable" code for the same algorithm and data structure averaged 40 percent in execution time. This is not to say that the difference between well written machine code and code compiled from FORTRAN will show an average difference of the full 40 percent, for even short programs will seldom be "the best conceivable" code for the machine. With the present state of measurements in this area it is too soon to expect a high degree of accuracy, but let us accept for the moment a figure of 20 percent for the average inefficiency introduced by even the best optimizing compilers.

Now this inefficiency is the only one we are in the habit of examining, since it applies linearly to small as well as to large programs, and we are usually forced to compare only small ones. The other form of inefficiency is global, and much more difficult to measure. This is the inefficiency introduced by the programmer himself, or by a programming team, due to the size and complexity of the problem being solved. It must be related to the amount of material which one person can deal with, both strategically and tactically, at a given time. Consequently, this form of inefficiency does not even appear in the small sample programs most frequently studied. But because programs in higher level languages are more succinct than their assembly language counterparts, it is responsible for the apparent paradox which has been showing up in larger tests for more than a decade.

This paradox probably first appeared to a non-trivial group during the AFADA Tests3 in 1962, in which a command-and-control type program was done in six or seven languages and the results of each compared for efficiency against a "standard" done in assembly language. When the initial results demonstrated that many of the higher level language versions, despite their obvious local inefficiencies, had produced object code programs which were measurably more efficient than the "standard," the paradox was attributed to programmer differences, and the entire test was redone. In the second version, the assembly language version was improved to reflect the best algorithm used in a higher level language, and the paradox disappeared. As recently as 1972, the paradox was still showing up, this time in the Henriksen and Merwin study.4 In that study, several operating programs which had been written in assembly language were redone in FORTRAN. If we direct our attention to their comparison of FORTRAN V and SLEUTH, the assembly language for the 1108, we find that two of their three programs gave the same execution times, while for the third they report:

"In actual performance, the FORTRAN version ran significantly better than the SLEUTH version. If the SLEUTH version were to be recoded using the FORTRAN version as a guide, it is probable that it could be made to perform better than the FORTRAN version."

While the dimensions of global inefficiency have not yet been well established, it tends to become more important as the size of a job increases. Since it is a non-linear factor, it overtakes, and eventually overwhelms, the local inefficiency in precisely those jobs which concern all of us most, the large ones. This brings us, then, to a consideration of the Expansion Ratio, the "one-to-many" amplification from statements to instructions, provided by computer languages and their compilers. Since the global inefficiency varies more than linearly with program size, it follows that anything which will make an equivalent program "smaller" in its total impact upon the minds of the programmers will reduce its global inefficiency even more rapidly than the reduction in size itself. With an expansion ratio of about four, it appears that the effects of local and global inefficiencies balance for programs of about 50 statements or 2000 instructions, but these figures are extremely rough.
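One way to see how such a balance point can arise is to put the two effects into a toy model: a fixed local penalty for compiled code, and a global penalty that grows faster than linearly with the number of statements a programmer must manage. The functional form and constants below are assumptions chosen only to echo the rough figures in the text (20 percent local inefficiency, an expansion ratio of four, balance near 50 statements); they are not measured quantities.

```python
# Toy model of local vs. global inefficiency. All constants are assumptions
# chosen to echo the rough figures in the text, not measurements.

LOCAL = 0.20          # assumed local inefficiency of compiled code
EXPANSION = 4         # assumed statements-to-instructions expansion ratio
ALPHA = 1.5           # assumed growth exponent of global inefficiency
C = 8.3e-5            # assumed scale constant for global inefficiency

def global_penalty(units_managed):
    """Global inefficiency grows faster than linearly with program 'size'."""
    return C * units_managed ** ALPHA

def relative_run_time(statements, compiled=True):
    if compiled:      # programmer manages 'statements'; compiler adds LOCAL
        return (1 + LOCAL) * (1 + global_penalty(statements))
    else:             # assembly: programmer manages EXPANSION times as many units
        return 1.0 * (1 + global_penalty(statements * EXPANSION))

for s in (10, 25, 50, 100, 400, 2000):
    hll, asm = relative_run_time(s, True), relative_run_time(s, False)
    better = "high-level" if hll < asm else "assembly"
    print(f"{s:5d} statements: compiled {hll:6.2f}  assembly {asm:6.2f}  -> {better}")
```

Under these assumed constants the two curves cross near 50 statements; above that size the non-linear global penalty dominates and the higher-level version wins by a widening margin.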
The fourth item listed for discussion, the variance in programmer productivity, enters the language picture in several ways. First, of course, it is because of the large size of this variance that many of the measurements needed in language comparisons have been so inconclusive. But this need not blind us to another facet. Since the time, perhaps 15 years ago, when it was first noted that chess players made good programmers, it has been recognized that programming involved a nice balance between the ability to devise good over-all strategies, and the ability to devote painstaking attention to detail. While this is probably still true in a general way, the increasing power of computer languages tends to give greater emphasis to the first, and lesser to the second. Consequently, one should not expect that the introduction of a more powerful language would have a uniform effect upon the productivity of all of the programmers in a given installation.

On the basis of the preceding discussion of some of the general considerations which must be borne in mind, and leaning heavily upon a recent study by Sammet,5 let us now consider some of the bulk properties which may vary from language to language. While it is true that for many purposes the difference between a language and a particular compiler which implements it upon a given machine is an important distinction, it is the combined effect of both components of the system which must be considered in the selection process. For that purpose, the costs of some nine items directly related to the system must somehow be estimated. These will be discussed in order, not of importance, but of occurrence.

1. Cost of Learning. If one were hiring a truck driver to drive a truck powered by a Cummins engine, it would be irrelevant as to whether or not an applicant's previous experience included driving trucks with Cummins engines. It would be nice if prior familiarity with a given language were equally unimportant to a programmer, but such is not quite the case. The average programmer will still be learning useful things about a language and its compiler for six months after its introduction, and his production will be near zero for the first two weeks. According to a paper by Garrett,6 measurements indicate that the total cost of introducing a new language, considering both lost programmer time and lost computer runs, was such that the new language must show a cost advantage of 20 percent over the old in order to overcome it. Since those data were taken quite a long while ago, it is probably safe to estimate that increasing sophistication in this field has reduced the figure to 10 percent by the present time.

2. Cost of Programming.
Obviously, this is one of the most important elements of cost, but difficult to estimate in advance. If we restrict this item strictly to the activity classified as C1 in Donaldson's study, then it would appear to be directly related to the average expansion ratio obtained by that language for the average application making up the programming load. Here the average job size also becomes important, and for a given installation this appears to increase. In fact, some years ago Amaya7 noted that through several generations of computers it had been true that "The average job execution time is independent of the speed of the computer."

3. Cost of Compiling. While it has been customary to calculate this item from measurements of either source statements or object instructions generated per unit machine time, some painstaking work by Mary Shaw8 has recently shown that this approach yields erroneous results. She took a single, powerful language, and reduced it one step at a time by removing important capabilities. She then rewrote a number of benchmark programs as appropriate for each of the different compilers. She found that when compilation time was measured for each of the separate, equivalent programs, then the more powerful compiler was slower only if it contained features not utilizable in a benchmark. For those cases in which a more powerful statement was applicable, the lessened speed per statement of the more powerful compiler was dramatically more than compensated for by the smaller number of statements in the equivalent benchmark program.

4. Cost of Debugging. Here again, since debugging varies directly with the size and complexity of a program, the more succinctly a language can handle a given application, the greater will be the reduction in debugging costs. While the debugging aids in a given compiler are important, their help is almost always limited to the minor errors rather than the major ones.

5. Cost of Optimizing. Since a compiler may offer the option of optimizing or not optimizing during a given compilation, it is possible to calculate this cost separately, and compare it to the improvements in object code which result; a small break-even calculation based on these figures is sketched following this list. For example, FORTRAN H9 has been reported to spend 30 percent more time when in optimization mode, but to produce code which is more than twice as efficient. On the other hand, Frailey10 has demonstrated that, under some conditions, optimization can be done at negative cost. This condition prevails whenever the avoidance of nonessential instructions can be accomplished with sufficient efficiency to overcompensate for the cost of generating them.

6. Cost of Execution. If, but only if, a good optimizing compiler exists for a given language, then it can be expected that the local inefficiencies of different high level languages will be roughly comparable. The inevitable global inefficiencies will still exist, however. It is here, as well as in programming cost, that a language well suited to the application can yield the greatest savings. Again, the data are inadequate both in quantity and quality, but cost savings of a factor of two are well within my own experience. This does not apply to those languages which are primarily executed in interpretive mode, such as SNOBOL, where large costs of execution must be recovered from even larger savings in programming costs.

7. Cost of Documentation. In general, the more powerful, or terse, or succinct a language is for a given application, the smaller the amount of additional documentation that will be required.
While this is true enough as a general statement, it can be pushed too far. It seems to break down for languages which use more symbolic operators than some rough upper limit, perhaps the number of letters in natural language alphabets. Allowing for this exception in the case of languages of the APL class, it follows that the documentation cost of a language will vary inversely with the expansion ratio obtainable in the given applications area.

8. Cost of Modification. Since it is well known that any useful program will be modified, this item is quite important. Here any features of a language which contribute to modularity will be of advantage. Block structures, memory allocation, and compile time features should be evaluated in this area as well as for their effect on initial programming improvement.

9. Cost of Conversion. Since hardware costs per operation have shown continual improvements as new computers have been introduced, it is only reasonable to expect that most applications with a half-life of even a few years may be carried to a new computer, hence this element of potential cost should not be overlooked. If one is using an archaic language, or one that is proprietary, then the cost of implementing a compiler on a new machine may well be involved. While this cost is much less than it was a decade ago, it can be substantial. Even with the most machine-independent languages, the problems are seldom trivial. Languages which allow for the use of assembly language inserts combine the advantage of permitting more efficient code with the disadvantage of increasing the cost of conversion. As noted by Herb Bright,11 a proper management solution to this problem has existed since the invention of the subroutine, and consists of properly identifying all such usage, and requiring that it conform to the subroutine linkage employed by the language.
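The figures quoted for item 5 lend themselves to a simple break-even calculation: paying roughly 30 percent more compile time is worthwhile once the run-time saved by code that executes about twice as fast exceeds that extra compilation cost. The baseline compile and execution times below are assumed values used only to exercise the arithmetic.

```python
# Break-even point for invoking an optimizing compilation, using the ratios
# quoted in the text (about 30% extra compile time, object code roughly twice
# as fast). Baseline times are assumed values for illustration.

COMPILE_MIN = 2.0            # assumed unoptimized compile time, minutes
RUN_MIN = 3.0                # assumed execution time per production run, minutes
EXTRA_COMPILE = 0.30 * COMPILE_MIN
SAVED_PER_RUN = RUN_MIN * (1 - 1 / 2.0)   # halved execution time

def total_minutes(runs, optimize):
    if optimize:
        return COMPILE_MIN + EXTRA_COMPILE + runs * (RUN_MIN - SAVED_PER_RUN)
    return COMPILE_MIN + runs * RUN_MIN

breakeven = EXTRA_COMPILE / SAVED_PER_RUN
print(f"optimization pays for itself after {breakeven:.2f} production runs")
for runs in (0, 1, 5, 20):
    print(f"{runs:3d} runs: plain {total_minutes(runs, False):6.1f} min, "
          f"optimized {total_minutes(runs, True):6.1f} min")
```

With these assumed baselines the extra compilation is recovered before the first production run is complete; the point of the sketch is only that the comparison is a matter of run frequency, not of compile speed alone.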
In examining the preceding nine elements of cost, it is apparent that many of them depend upon the expansion ratio of a language in a given application area. In deciding upon a new language, or between two candidates, it might be useful to attempt to plot them upon a globe, with longitude representing possible application areas, and latitude representing any convenient function of the expansion ratio. The plot should look something like Figure 1, where it can be seen that languages range from machine language, in which any application may be handled, girdling the equator, through general purpose, procedure oriented languages in the tropics, to highly specialized, problem oriented languages in the arctic. The pole, which implies an infinite expansion ratio for all applications areas, must be reserved for programming via mental telepathy.

Figure 1-Languages plotted by application area (longitude) and expansion ratio (latitude)

While there is some current research under way12 which may yield more basic insight into the problems in this area, it is only a year or so old, and the most that can be said is that it is not yet sufficiently developed to be of present assistance. A very interesting technique which has moved part of the way from research to practice, however, should be mentioned in conclusion. This involves a practical approach to the problem of language extension. Unlike the extensible-language approach, which seemed to open the door to a dangerous, undisciplined proliferation of overlapping and even incompatible dialects within a single installation, this alternate approach to language extension is based upon the older concept of precompilers.

As suggested by Garrett6 and demonstrated by Ghan,13 dramatic savings in programming costs can be achieved in shops having sufficient work in any narrow application area. This has been done by designing a higher level, more specialized language, and implementing a translator to convert it to a standard procedure oriented language. In this process, the higher-level language may readily permit inclusion of statements in the standard procedure oriented language, and merely pass them along without translation to the second translator or compiler. This process, by itself, has the obvious inefficiency that much of the work done by the first translator must be repeated by the second. While the process has proved economical even with this inefficiency, Nylin14 has recently demonstrated the ability to reorganize, and thereby remove the redundant elements of, such a preprocessor-compiler system automatically, provided that both are written in the same language.

In summary, let us first note that we have not offered a simple table with line items of preassigned weights, nor a convenient algorithm for producing a yes-no answer to the question "Should I introduce a language specifically to handle a given class of programming jobs." Instead, we realize that, with the current state of the art, it has only been feasible to enumerate and discuss those areas which must be considered in any sound management decision. From those discussions, however, we may distill at least four guidelines.

First, it is abundantly clear that great economies may be realized in those cases in which the following two conditions prevail simultaneously: (1) There exists a language which is of considerably higher level with respect to a given class of applications programs than the language currently in use, and (2) The given class of applications programs represents a non-trivial programming work load.

Secondly, there is important evidence which suggests that a higher level language which is a true superset of a high level language already in use in an installation may merit immediate consideration.

Thirdly, it must be remembered that data based upon comparisons between small programs will tend to underestimate the advantage of the higher level language for large programs.

Finally, the potential costs of converting programs written in any language to newer computers should neither be ignored, nor allowed to dominate the decision process.

REFERENCES

1. Donaldson, F. W., Programmer Evaluation, 1967, unpublished.
2. Knuth, Donald E., "An Empirical Study of Fortran Programs," Software-Practice and Experience, Vol. 1, No. 2, April/June 1971, pp. 105-133.
3. The AFADA Tests, conducted by Jordan, were unpublished, but see Datamation, Oct. 1962, pp. 17-19.
4. Henriksen, J. O., Merwin, R. E., "Programming Language Efficiency in Real-Time Software Systems," SJCC 1972, pp. 155-161.
5. Sammet, Jean, "Problems in, and a Pragmatic Approach to, Programming Language Measurement," FJCC, Vol. 39, 1971, pp. 243-252.
6. Garrett, G. A., "Management Problems of an Aerospace Computer Center," FJCC 1965.
7. Amaya, Lee, Computer Center Operation, unpublished lecture.
8. Shaw, Mary, Language Structures for Contractible Compilers, Ph.D. Thesis, Carnegie-Mellon University, Dec. 1971.
9. Lowrey, Edward S., Medlock, C. W., "Object Code Optimization," CACM, Vol. 12, Jan. 1969, pp. 13-22.
10. Frailey, Dennis, A Study of Code Optimization Using a General Purpose Optimizer, Ph.D. Thesis, Purdue University, Jan. 1971.
11. Bright, Herb, private communications.
12. Halstead, M. H., "Natural Laws Controlling Algorithm Structure," SIGPLAN Notices, Vol. 7, No. 2, Feb. 1972, pp. 22-26.
13. Ghan, Laverne, "Better Techniques for Developing Large Scale Fortran Programs," Proc. 1971 Annual ACM Conf., pp. 520-537.
14. Nylin, W. C., Jr., Structural Reorganization of Multipass Computer Programs, Ph.D. Thesis, Purdue University, June 1972.

Information Networks-International Communication Systems Session

A national scientific and technical information system for Canada

by JACK E. BROWN
National Science Librarian
Ottawa, Canada

ABSTRACT

Canada is in the process of developing a national scientific and technical information (STI) system. It is designed to ensure that scientists, researchers, engineers, industrialists and managers have ready access to any scientific or technical information or publication required in their day-to-day work. In 1970 impetus was given the program when the National Research Council (NRC) was assigned formal responsibility for planning and development, with the National Science Library (NSL) serving as the focal point or coordinating agency for local STI services. During the last two years, emphasis has been placed on the strengthening of two existing networks-a network of 230 libraries linked to the NSL by the "Union List of Scientific Serials in Canadian Libraries," and the CAN/SDI network, a national current awareness service at present using 12 data bases and linked to the NSL by 350 Search Editors located in all parts of Canada. This service is being expanded to provide remote access to the CAN/SDI data bases by an interactive on-line system. In recent months, steps have been taken to establish regional referral centres and link into the system little used pockets of subject expertise and specialized STI resources.

Global networks for information, communications and computers

by KJELL SAMUELSON
Stockholm University and Royal Institute of Technology
Stockholm, Sweden

ABSTRACT

When working with the concept of worldwide or global networks a clear distinction should be made between at least three different aspects. First of all, information networks based on globally distributed knowledge have a long-time bearing on accumulated information and data. Secondly, computer networks that are gradually coming into existence provide various means of processing new or already existing information. For some years to come, computer networks will only to a limited extent provide adequate information and knowledge support. Thirdly, communication networks have existed for decades and are gradually improved by advancements in technology. The combined blend of all three kinds of international networks will have a considerable impact on global socio-economic and geo-cultural trends. If bidirectional broadband telesatellites and universal, multipoint person-to-person communications are promoted, there is hope for "free flow" of information. It appears recommendable that resources should be allocated to this trend rather than to an over-emphasis on "massaged" and filtered data in computer networks.

A position paper-Panel session on intelligent terminals-Chairman's introduction

by IRA W. COTTON
National Bureau of Standards
Washington, D.C.

Intelligent terminals are those which, by means of stored logic, are able to perform some processing on data which passes through them to or from the computer systems to which they are connected. Such terminals may vary widely in the complexity of the processing which they are capable of performing.
The spectrum ranges from limited-capability point-of-sale terminals through moderately intelligent text-oriented terminals up to powerful interactive graphics terminals. The common thread that ties all these types of terminals together is their processing power and the questions relating to it. by some 3000 miles, and the U. S. Postal Service would not have sufficed to meet publication deadlines. Van Dam and Stabler of Brown University discuss the opportunities presented by a super intelligent terminal, or "intelligent satellite" in their terms. Such systems offer the most power, but also require the most caution, lest this power be misused or dissipated through poor system design. It is, of course, impossible to report in advance on the panel discussion which is part of the session. The position papers raise most of the issues that I expect will be discussed. Perhaps some means can be found to report on any new points or insights gleaned from the discussion. In addition, all of the work is ongoing, and all of the authors (and the chairman) welcome further discussion beyond the confineS of this conferenCe. What, for example, is the proper or most efficient division of labor between the terminals and the central computer? What are the limits, if any, to the power which can be provided in such terminals? Need we worry about the "wheel of reincarnation" syndrome < ME68 > in which additional processing power is continually added to a terminal until it becomes free-standing ... and then terminals are connected to it? The standards program at NBS requires participation by manufacturers and users. Englebart specifically invites inquiries regarding his system in general and the mouse in particular. The academic community has always been the most open for critical discussion and the exchange of ideas. This session was planned to at least expose to critical discussion some of these questions, if not answer them. Position papers were solicited to cover each of the three points on the spectrum identified above. Thornton of the Bureau of Standards points out the need to take a total systems approach and to develop relevant standards-specifically for point-of-sale terminals, but the argument applies to the more sophisticated terminals as well. Engelbart at Stanford Research Institute discusses his experiences with "knowledge workshop" terminals, and predicts the widespread acceptance of such terminals by knowledge workers of all kinds. That day may be nearer than some might think: two of the three papers for this session were transmitted to the chairman via remote terminal, and one was actually reviewed online in a collaborative mode. In the latter case, author and reviewer were separated In short, we recognize that a session such as this may well raise as many questions as it answers, but we hope that it may serve as a stimulus to further discussion. A Short Bibiiography on Inteliigent Terminais BA173 Bairstow, J. N., "The terminal that thinks for itself," Computer Decisions, January 1973. pp. 10-13. BA268 Bartlett, W. S., et aI., "SIGHT, a satellite interactive graphic terminaL" 1968 A CM National Conference, pp. 499-509. BE71 Bennett, William C. "Buffered terminals for data communications," TELECOMM, June 1971, p. 46. CH67 Christensen, C., Pinson, E. N., "Multi-function graphics for a large computer system," FJCC 1967, pp. 697 -712. 217 218 National Computer Conference, 1973 C068 Cotton, Ira W., Greatorex, Frank S., "Data structures and techniques for remote computer graphics," FJCC 1968, pp. 
533-544. CP71 "CPU chip turns terminal into stand-alone machine," Electronics, 44:12, June 7, 1971, pp. 36-37. CR72 Crompton, J. C., Soane, A. J., "Interactive terminals for design and management reporting in engineering consultancy," Online 72, International Conference on Online Interactive Computing, Sept. 4-7, 1972, BruneI University, Uxbridge, England, pp. 187-208. D072 Donato, A. J., "Determining terminal requirements," Telecommunications, Sept. 1972, pp. 35-36. EN73 Engelbart, Douglas C., "Design considerations for knowledge workshop terminals," NCCE 1973. GI73 Gildenberg, Robert F., "Word processing," Modem Data, January 1973, pp. 56-62. GR72 Gray, Charles M., "Buffered terminals ... more bits for the buck," Communication.>; Report, December 1972, p. 39. H071 Hobbs, L. C., "The rational for smart terminals," Computer, Nov.-Dec. 1971, pp. 33-35. H072 Hobbs, L. C., "Terminals," Proceedings of the IEEE, Vol. 60, No. 11, Nov. 1972, pp. 1273-1284. IB71 "IBM enters terminal in credit checking race," Electronics, March 15, 1971, pp. 34/36. IN72 "Intelligent terminals," EDP Analyzer, 10:4, April 1972, pp. 1-13. KE72 Kenney, Edward L., "Minicomputer-based remote job entry terminals," Data Management, 10:9, Sept. 1971, pp. 62-63. KN70 Knudson, D., Vezza, A., "Remote computer display terminals." Computer Handling of Graphical Information, Seminar, Society of Photographic Scientists and Engineers, Washington, D. C., 1970, pp. 249-268. MA69 Machover, Carl, "The intelligent terminal." Pertinent Concepts in Computer Graphics, Proc. of the Second University of Illinois Conference on Computer Graphics, M. Fairman and J. Nievergelt (eds.), University of Illinois Press. Urbana, 1969, pp. 179-199. MA172 Machover, C., "Computer graphics terminalsA backward look," SJCC 1972, pp. 439-446. MA270 Marvin, Cloyd E., "Smart vs. 'dumb' terminals: cost considerations," Modem Data, 3:8, August 1970, pp. 76-78. MC71 McGovern, P. J. (ed.), "Intelligent terminals start popping up all over," EDP Industry Report, June 30, 1971, pp. 1-7. ME68 Meyer, T. H., Sutherland, I. E., "On the design of display processors," CACM, 11:6 June 1968, pp. 410414. NI68 Ninke, W. H., "A satellite display console system for a multi-access central computer," Proc. IFIP Congress, 1968, pp. E65-E71. OB72 O'Brien, B. V., "Why data terminals," Automation, 19:5, May 1972, p. 46-51. PR71 "Programmable terminals," Data Processing Mag, 13:2, February 1971, pp. 27-39. RA68 Rapkin, M. D., Abu-Gheida, O. M., "Standalone/remote graphic system," FJCC 1968, pp. 731-746. R0172 Roberts, Lawrence G., "Extension of packet communication technology to a hand-held personal terminal," SJCC 1972, pp. 295-298. R0272 Rosenthal, C. W., "Increasing capabilities in interactive computer graphics terminals," Computer, Nov.-Dec. 1972, pp. 48-53. R0367 Ross, Douglas T., et al., "The design and programming of a display interface system integrating multiaccess and satellite computers," Proc. A CM / SHARE 4th Annual Design Autumatiun Wurkshup, Los Angeles, June 1967. RY72 Ryden, K. H., Newton, C. M., "Graphics software for remote terminals and their use in radiation treatment planning," SJCC 1972, pp. 1145-1156. SA71 Sandek, Lawrence, "Once upon a terminal," Think, 37:9, Oct. 1971, pp. 39-41. SC171 Schiller, W. L., et aI., "A microprogrammed intelligent graphics terminal," IEEE Trans. Computers, C-20, July 1971, pp. 775-782. SC272 Schneiderman, Ron., "Point-of-salesmanship," Electronic News, January 17, 1972, pp. 4-5. 
SMI71 "Smart remote batch terminals share computer processing loads," Data Processing Mag., 13:3, March 1971, pp. 32-36. SM270 Smith, M. G., "The terminal in the terminaloriented system," Australian Compo J., 2:4, Nov. 1970, pp. 160-165. TH73 Thornton, M. Zane, "Electronic point-of-sale terminals," NCCE 1973. VA73 Van Dam, Andries, Stabler, George M., "Intelligent satellites for interactive graphics." NCC&E 1973. WE73 Wessler, John J., "POS for the supermarket," Modern Data, January 1973, pp. 52-54. A position paper-Electronic point-of-sale terminals by M. ZAl\;E THORNTO~ National Bureau of Standards Washington, D.C. The electronic point-of-sale terminal is the newest form of computer technology being introduced into the retail industry. Industry interest in the terminal is focused on its potentially great advantages for retailers in improving their productivity and performance in merchandise control and credit customer control. The electronic point-ofsale terminal's appeal over the standard cash register lies in its potential for impacting the total merchandise system through increasing the speed and accuracy of transactions and providing a method of capturing greater quantities of data essential to the effective management of the merchandise system. At the check-out counter, the terminal equipped with an automatic reading device and credit verification equipment will permit the rapid completion of the sales transaction and, at the same time, capture and enter into the central system all the data necessary for closer, more effective control of the merchandise system. The full potential of the electronic point-of-sale terminal cannot be realized by simply trying to insert it into the retail environment as a replacement for the electromechanical cash register. The terminal must be effectively integrated into an overall systems approach to the entire merchandising system. It must be equipped with an effective capability to automatically read merchandise tickets and labels; this, in turn, requires the adoption by the retail industry of merchandise identification standards and either a single technology or compatible technol- ogies for marking merchandise and automatically reading the tickets and labels. Further, the terminal must be effectively integrated with supporting computer systems, which raises still other needs related to data communications interconnections, network design and optimization, data standards, and software performance standards and interchangeability criteria. Without a thorough systems approach encompassing the entire merchandising system, the great promise of the electronic point-of-safe terminal may never be realized; indeed, the terminal could become the costly instrument of chaos and widespread disruption in the retail industry. The ~ational Retail Merchants Association is taking steps to insure that the proper preparations are made to smooth the introduction of the electronic point-of-sale terminal on a broad scale. The Association's first major objective is to develop merchandise identification standards by the end of 1973. At the request of the NRMA, the National Bureau of Standards is providing technical assistance to this effort. Equipment manufacturers, other retailers, merchandise manufacturers, tag and label makers, and other interested groups are also involved. 
Given the merchandise identification standards, the emphasis will shift to the implementation of the standards in operational systems where primary effort will be focused on network design, data communications and interfacing terminals with computers, and software development. 219 Design considerations for knowledge workshop terminals by DOUGLAS C. ENGELBART Stanford Research Institute Menlo Park, California tion is included, not only to provide a guide for selective follow up, but also to supplement the substance to the body of the paper by the nature of the commentary. IN"TRODUCTION The theme of this paper ties directly to that developed in a concurrent paper "The Augmented Knowledge Workshop," 1 and assumes that: "intelligent terminals" will come to be used very, very extensively by knowledge workers of all kinds; terminals will be their constant working companions; service transactions through their terminals will cover a surprisingly pervasive range of work activity, including communication with people who are widely distributed geographically; the many "computer-aid tools" and human services thus accessible will represent a thoroughly coordinated "knowledge workshop"; most of these users will absorb a great deal of special training aimed at effectively harnessing their respective workshop systems-in special working methods, conventions, concepts, and procedural and operating skills. Within the Augmentation Research Center (ARC), we have ten years of concentrated experience in developing and using terminal systems whose evolution has been explicitly oriented toward such a future environment; from this background, two special topics are developed in this paper: CONTROL MEANS Introduction Our particular system of devices, conventions, and command-protocol evolved with particular requirements: we assumed, for instance, that we were aiming for a VI:orkshop in which these very basic operations of designating and executing commands would be used constantly, over and over and over again, during hour-after-hour involvement, within a shifting succession of operations supporting a wide range of tasks, and with eventual command vocabularies that would become very large. THE MOUSE FOR DISPLAY SELECTION During 1964-65 we experimented with various approaches to the screen selection problem for interactive display work within the foregoing framework. The tests 6 •7 involved a number of devices, including the best light pen we could buy, a joy stick, and even a knee control that we lashed together. To complete the range of devices, we implemented an older idea, which became known as our "mouse," that came through the experiments ahead of all of its competitors and has been our standard device for eight years now. The tests were computerized, and measured speed and accuracy of selection under several conditions. We included measurement of the "transfer time" involved when a user transferred his mode of action from screen selection with one hand to keyboard typing with both hands; surprisingly, this proved to be one of the more important aspects in choosing one device over another. The nature of the working environment diminished the relative attractiveness of a light pen, for instance, because of fatigue factors and the frustrating difficulty in constantly picking up and putting down the pen as the user intermixed display selections with other operations. 
ONE-HANDED, CHORDING KEYSET AS UNIVERSAL "FUNCTION" KEYBOARD

For our application purposes, one-handed function keyboards providing individual buttons for special commands were considered to be too limited in the range of command signals they provided. The one-handed "function keyboard" we chose was one having five piano-like keys upon which the user strikes chords; of the thirty-one possible chords, twenty-six represent the letters of the alphabet.
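Read as a five-bit pattern, the five keys yield the thirty-one non-zero chords just mentioned. The sketch below shows only the decoding idea; the actual letter-to-chord assignment is the one given in Appendix A (thumb-only = "a"), and the simple 1-to-26 mapping used here is an assumption for illustration.

```c
/* Illustrative sketch of five-key chord decoding.  The five keys are
 * read as a 5-bit pattern (1-31).  Appendix A gives ARC's real
 * assignment; here we simply assume chord 1 -> 'a', 2 -> 'b', ...,
 * 26 -> 'z', with 27-31 left for punctuation or control.
 */
#include <stdio.h>

char decode_chord(unsigned chord)          /* chord in 1..31 */
{
    if (chord >= 1 && chord <= 26)
        return (char)('a' + chord - 1);
    return '?';                            /* 27..31: not letters here */
}

int main(void)
{
    unsigned strokes[] = { 3, 1, 20 };     /* would spell "cat" under
                                              this assumed mapping     */
    for (int i = 0; i < 3; i++)
        putchar(decode_chord(strokes[i]));
    putchar('\n');
    return 0;
}
```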
One is free to design any sort of alphabetic-sequence command language he wishes, and the user is free to enter commands through either his standard (typewriter-like) keyboard or his keyset. The range of keyset-entry options is extended by cooperative use of three control buttons on the mouse. Their operation by the mouse-moving hand is relatively independent of the simultaneous pointing action going on. We have come to use all seven of the "chording" combinations, and for several of these, the effect is different if characters are entered while they are depressed; e.g. (buttons are numbered 1 to 3, right to left), Button 2 Down-Up effects a command abort, while "Button 2 Down, keyset entry, Button 2 Up" does not abort the command but causes the computer to interpret the interim entry chords as upper-case letters. These different "chord-interpretation cases" are shown in the table of Appendix A; Buttons 2 and 3 are used effectively to add two bits to the chording codes, and we use three of these "shift cases" to represent the characters available on our typewriter keyboard, and the fourth for special, view-specification control. ("View specification" is described in Reference 1.)

Learning of Cases 1 and 2 is remarkably easy, and a user with but a few hours' practice gains direct operational value from keyset use; as his skill steadily (and naturally) grows, he will come to do much of his work with one hand on the mouse and the other on the keyset, entering short literal strings as well as command mnemonics with the keyset, and shifting to the typewriter keyboard only for the entry of longer literals. The keyset is not as fast as the keyboard for continuous text entry; its unique value stems from the two features of (a) being a one-handed device, and (b) never requiring the user's eyes to leave the screen in order to access and use it.

The matter of using control devices that require minimum shift of eye attention from the screen during their use (including transferring hands from one device to another) is an important factor in designing display consoles where true proficiency is sought. This has proven to be an important feature of the mouse, too.

It might be mentioned that systematic study of the micro-procedures involved in controlling a computer at a terminal needs to be given more attention. Its results could give much support to the designer. Simple analyses, for instance, have shown us that for any of the screen selection devices, a single selection operation "costs" about as much in entry-information terms as the equivalent of from three to six character strokes on the keyset. In many cases, much less information than that would be sufficient to designate a given displayed entity. Such considerations long ago led us to turn away completely from "light button" schemes, where selection actions are used to designate control or information entry. It is rare that more than 26 choices are displayed, so that if an alphabetic "key" character were displayed next to each such "button," it would require but one stroke on the keyset to provide input designation equivalent to a screen-selection action. Toward such tradeoffs, it even seems possible to me that a keyboard-oriented scheme could be designed for selection of text entities from the display screen, in which a skilled typist would keep his hands on the keyboard and his eyes on the screen at all times, where speed and accuracy might be better than for the mouse-keyset combination.
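The "entry-information" comparison made above can be checked with a back-of-the-envelope figure. The short program below is purely illustrative; the screen resolution is an assumed number, not one given in the paper.

```c
/* Rough version of the entry-information comparison: how many keyset
 * strokes carry about as much information as one screen-selection
 * action?  The addressable-position count is an assumption.
 */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double positions    = 1024.0 * 780.0;   /* assumed addressable points */
    double pick_bits    = log2(positions);  /* info in one screen pick    */
    double stroke_bits  = 5.0;              /* five keys per chord        */
    double shifted_bits = 7.0;              /* plus two mouse-button bits */

    printf("one pick ~ %.1f bits\n", pick_bits);
    printf("strokes needed: %.1f (plain), %.1f (with button shifts)\n",
           pick_bits / stroke_bits, pick_bits / shifted_bits);
    return 0;
}
```

With coarser or finer selection resolution the answer moves around, which is consistent with the paper's stated range of three to six strokes.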
NOTE: For those who would like to obtain some of these devices for their own use, a direct request to us is invited. William English, who did the key engineering on successive versions leading to our current models of mouse and key set is now experimenting with more advanced designs at the Palo Alto Research Center (PARC) of Xerox, and has agreed to communicate with especially interested parties. LANGUAGE, SKILLS AND TRAINING I believe that concern with the "easy-to-learn" aspect of user-oriented application systems has often been wrongly emphasized. For control of functions that are done-very frequently ,- payoff in higher efficiency warrants the extra training costs associated with using a sophisticated command vocabulary, including highly abbreviated (therefore non-mnemonic) command terms, and requiring mastery of challenging operating skills. There won't be any easy way to harness as much power as is offered, for closely supporting one's constant, daily knowledge work, without using sophisticated special languages. Special computer-interaction languages will be consciously developed, for all types of serious knowledge workers, whose mastery will represent a significant investment, like years of special training. I invite interested skeptics to view a movie that we have available for loan,13 for a visual demonstration of flexibility and speed that could not be achieved with primitive vocabularies and operating skills that required but a few minutes (or hours even) to learn. No one seriously expects a person to be able to learn how to operate an automobile, master all of the rules of the road, familiarize himself with navigation techniques and safe-driving tactics, with little or no investment in learning and training. SERVICE NETWORK One's terminal will provide him with many services. Essential among these will be those involving communication with remote resources, including people. His terminal therefore must be part of a communication network. Advances in communication technology will provide very efficient transportation of digital packets, routed and transhipped in ways enabling very high interaction rates between any two points. At various nodes of such a network will be located different components of the network's processing and storage functions. The best distribution of these functions among the nodes will depend upon a balance between factors of usage, relative technological progress, sharability, privacy, etc. Each of these is bound to begin evolving at a high rate, so that it seems pointless to argue about it now; that there will be value in having a certain amount of local processor capability at the terminal seems obvious, 223 as for instance to handle the special communication interface mentioned above. EXTENDED FEATURES I have developed some concepts and models in the past that are relevant here, see especially Reference 5. A model of computer-aided communication has particular interest for me; I described a "Computer-Aided HumanCommunication Subsystem," with a schematic showing symmetrical sets of processes, human and equipment, that serve in the two paths of a feedback loop between the central computer-communication processes and the human's central processes, from which control and information want to flow and to which understanding and feedback need to flow. 
There are the human processes of encoding, decoding, output transducing -(motor actions), aiid--input transducing (sensory actions), and a complementary set of processes for the technological interface: physical transducers that match input and output signal forms to suit the human, and coding/ decoding processes to translate between these signal forms in providing I! 0 to the main communication and computer processes. In Reference 5, different modes of currently used human communication were discussed in the framework of this model. It derived some immediate possibilities (e.g., chord keysets), and predicted that there will ultimately be a good deal of profitable research in this area. It is very likely that there exist different signal forms that people can better harness than they do today's hand motions or vocal productions, and that a matching technology will enable new ways for the humans to encode their signals, to result in significant improvements in the speed, precision, flexibility, etc. with which an augmented human can control service processes and communicate with his world. It is only an accident that the particular physical signals we use have evolved as they have-the evolutionary environment strongly affected the outcome; but the computer's interface-matching capability opens a much wider domain and provides a much different evolutionary environment within which the modes of human communication will evolve in the future. As these new modes evolve. it is likely that the transducers and the encoding/ decoding processes will be built into the local terminal. This is one support requirement that is likely to be met by the terminal rather than by remote nodes. To me there is value in considering what I call "The User-System, Service-System Dichotomy" (also discussed in 5). The terminal is at the interface between these two "systems," and unfortunately, the technologists who develop the service system on the mechanical side of the terminal have had much too limited a view of the user system on the human side of the interface. 224 National Computer Conference, 1973 That system (of concepts, terms, conventions, skills, customs, organizational roles, working methods, etc.) is to receive a fantastic stimulus and opportunity for evolutionary change as a consequence of the service the computer can offer. The user system has been evolving so placidly in the past (by comparison with the forthcoming era), that there hasn't been the stimulus toward producing an effective, coherent system discipline. But this will change; and the attitudes and help toward this user-system discipline shown by the technologists will make a very large difference. Technologists can't cover both sides of the interface, and there is critical need for the human side (in this context, the "user system") to receive a lot of attention. What sorts of extensions in capability and application are reasonable-looking candidates for tomorrow's "intelligent terminal" environment? One aspect in which I am particularly interested concerns the possibilities for digitized strings of speech to be one of the data forms handled by the terminal. Apparently, by treating human speechproduction apparatus as a dynamic system having a limited number of dynamic variables and controllable parameters, analysis over a short period of the recentpast speech signal enables rough prediction of the forthcoming signal. and a relatively low rate of associated data transmission can serve adequately to correct the errors in that prediction. 
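Before turning to the two-ended arrangement described next, here is a minimal sketch of the predict-and-correct idea in the preceding paragraph. The first-order predictor, the quantizer step, and the sample values are illustrative assumptions, not the particular speech model the author has in mind.

```c
/* Minimal sketch of predict-and-correct coding: both ends run the
 * same predictor over the recent past of the signal, and only a
 * coarsely quantized prediction error travels over the link.  The
 * trivial "next ~ previous" predictor and 6-bit-ish residual are
 * purely illustrative choices.
 */
#include <stdio.h>

#define QSTEP 64                       /* residual quantizer step (assumed) */

static int predict(int prev)           /* predictor shared by both ends */
{
    return prev;
}

/* Encoder: returns the quantized residual to be transmitted. */
int encode(int sample, int *state)
{
    int residual = sample - predict(*state);
    int q = residual / QSTEP;                  /* few bits per sample   */
    *state = predict(*state) + q * QSTEP;      /* track decoder's view  */
    return q;
}

/* Decoder: reconstructs an approximation from the received residual. */
int decode(int q, int *state)
{
    *state = predict(*state) + q * QSTEP;
    return *state;
}

int main(void)
{
    int input[] = { 0, 120, 260, 380, 350, 200 };
    int enc_state = 0, dec_state = 0;
    for (int i = 0; i < 6; i++) {
        int q = encode(input[i], &enc_state);
        int r = decode(q, &dec_state);
        printf("sent %2d -> reconstructed %4d (true %4d)\n", q, r, input[i]);
    }
    return 0;
}
```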
If processors at each end of a speechtransmission path both dealt with the same form of model, then there seems to be the potential of transmitting good quality speech with only a few thousand bits per second transmitted between them. The digital-packet communication system to which the "computer terminal" is attached can then become a very novel telephone system. But consider also that then storage and delivery of "speech" messages are possible, too, and from there grows quite a spread of storage and manipulation services for speech strings, to supplement those for text, graphics, video pictures, etc. in the filling out of a "complete knowledge workshop." If we had such analog-to-digital transducers at the display terminals of the NLS system in ARC, we could easily extend the software to provide for tying the recorded speech strings into our on-line files, and for associating them directly with any text (notes, annotations, or transcriptions). This would allow us, for instance, to use crossreference links in our text in a manner that now lets us by pointing to them be almost instantly shown the full text of the cited passage. With the speech-string facility, such an act could let us instantly hear the "playback" of a cited speech passage. Records of meetings and messages could usefully be stored and cited to great advantage. With advances in speech-processing capability, we would expect for instance to let the user ask to "step along with each press of my control key by a ten-word segment" (of the speech he would hear through his speaker), or "jump to the next occurrence of this word". Associated with the existing "Dialogue Support System" as discussed in Reference 1, this speech-string extension would be very exciting. There is every reason to expect a rapid mushrooming in the range of media, processes, and human activity with which our computer terminals are associated. ACKNOWLEDGMENTS During the 10 year life of ARC many people have contributed to the development of the workshop using the terminal features described here. There are presently some 35 people-clerical, hardware, software, information specialists, operations researchers, writers, and others-all contributing significantly toward our goals. ARC research and development work is currently supported primarily by the Advanced Research Projects Agency of the Department of Defense, and also by the Rome Air Development Center of the Air Force and by the Office of Naval Research. Earlier sponsorship has included the Air Force Office of Scientific Research, and the National Aeronautics and Space Administration. Most of the specific work mentioned in this paper was supported by ARPA, NASA, and AFOSR. REFERENCES 1. Engelbart, D. C., Watson, R. W., Norton, J. C., The Augmented Knowledge Workshop, AFIPS Proceedings National Computer Conference, June 1973, (SRI-ARC Journal File 14724) 2. Engelbart, D. C., Augmenting Human Intellect: A Conceptual Framework, Stanford Research Institute Augmentation Research Center, AFOSR-3223, AD-289 565, October 1962, (SRI-ARC Cata· log Item 3906) The framework developed a basic strategy that ARC is still following-"bootstrapping" the evolution of augmentation systems by concentrating on development~ and applications that best facilitate the evolution and application of augmentation systems. See the companion paper' for a picture of today's representation of that philosophy; the old report still makes for valuable reading, to my mind-there is much there that I can't say any better today. 
In a "science-fiction" section of the report, I describe a console with features that are clear precedents to the things we are using and doing today-and some that we haven't yet gotten to. 3. Engelbart, D. C., A Conceptual Framework for the Augmentation of Man's Intellect Vistas," in Information Handling, Howerton and Weeks (Editors), Spartan Books, Washington, D. C., 1963, pp. 129, (SRI-ARC Catalog Item 9375) This chapter contains the bulk of the report2; with the main exclusion being a fairly lengthy section written in story-telling, sciencefiction prose about what a visit to the augmented workshop of the future would be like. That is the part that I thought tied it all together-but today's reader probably doesn't need the help the reader of ten years ago did. I think that the framework developed here is still very relevant to the topic of an augmented workshop and the terminal services that support it. 4. Engelbart, D. C., Sorenson, P. H., Explorations in the Automation of Sensorimotor Skill Training, Stanford Research Institute. NAVTRADEVCEN 1517-1, AD 619 046, .January 1965, (SRI-ARC Catalog Item 11736). Here the objective was to explore the potential of using computeraided instruction in the domain of physical skills rather than of conceptual skills. It happened that the physical skill we chose, to make for a manageable instrumentation problem, was operating Design Considerations For Knowledge Workshop Terminals S. 6. 7. 8. 9. 10. 11. the five-key chording key set. Consequently, here is more data on keyset-skill learnability; it diminished the significance of the experiment on computerized skill training because the skill turned out to be so easy to learn however the subject went about it. Engelbart, D. C., Augmenting Human Intellect: Experiments, Concepts, and Possibilities-Summary Report Stanford Research Institute, Augmentation Research Center, March 1965, (SRI-ARC Catalog Item 9691). This includes a seven-page Appendix that describes our first keyset codes and usage conventions-which have since changed. Two main sections of about twelve pages, each of which is very relevant to the general topic of "intelligent terminal" design, are discussed above under "Extended Features." English, W. K, Engelbart, D. C., Huddart, B., Computer Aided Display Control-Final Report Stanford Research Institute, Augmentation Research Center, July 1965, (SRI-ARC Catalog Item 9692). About twenty percent of this report dealt explicitly with the screenselection tests (that were published later in [7]; most of the rest provides environmental description (computer, command language,hierarchical file-structuring conventions,etc.) that is interesting only if you happen to like comparing earlier and later stages of evolution, in what has since become a very sophisticated system through continuous, constant-purpose evolution. English, W. K, Engelbart, D. C., Berman, M. A., " Display-Selection Techniques for Text Manipulation," IEEE Transactions on Human Factors in Electronics, Vol. HFE-8, No.1, pp. 5-15, March 1967, (SRI-ARC Catalog Item 9694). This is essentially the portion of [6] above that dealt with the screen-selection tests and analyses. Ten pages, showing photographs of the different devices tested (even the knee-controlled setup), and describing with photographs the computerized selection experiments and displays of response-time patterns. Some nine different bar charts show comparative, analytic results. Licklider, J. C. R., Taylor, R. 
W., Herbert, E., "The Computer as a Communication Device," International Science and Technology, No. 76, pp. 21-31, April 1968, (SRI-ARC Catalog Item 3888). The first two authors have very directly and significantly affected the course of evolution in time-sharing, interactive-computing, and computer networks, and the third author is a skilled and experienced writer; the result shows important foresight in general, with respect to the mix of computers and communications in which technologists of both breeds must learn to anticipate the mutual impact in order to be working on the right problems and possibilities. Included is a page or so describing our augmented conferencing experiment, in which Taylor had been a participant. Engelbart, D. C., Human Intellect Augmentation Techniques, Final Report Stanford Research Institute, Augmentation Research Center, CR-1270, N69-16140, July 1968, (SRI-ARC Catalog Item 3562). A report especially aimed at a more general audience, this one rather gently lays out a clean picture of research strategy and environment' developments in our user-system features, developments in our system-design techniques, and (especially relevant here) some twenty pages discussing "results," i.e. how the tools affect us, how we go about some things differently, what our documentation and record-keeping practices are, etc. And there is a good description of our on-iine conferencing setup and experiences. Engelbart, D. C., "Augmenting Your Intellect," (Interview With D. C. Engelbart), Research Development, pp. 22-27, August 1968, (SRI-ARC Catalog Item 9698). The text is in a dialog mode-me being interviewed. I thought that it provided a very effective way for eliciting from me some things that I otherwise found hard to express; a number of the points being very relevant to the attitudes and assertions expressed in the text above. There are two good photographs: one of the basic work station (as described above), and one of an on-going augmented group meeting. Engelbart, D. C., English, W. K, "A Research Center for Augmenting Human Intellect," AFIPS Proceedings-Fall Joint Com- 225 puter Conference, Vol. 33, pp. 395-410, 1968, (SRI-ARC Catalog Item 3954). Our most comprehensive piece, in the open literature, describing our activities and developments. Devotes one page (out of twelve) to the work-station design; also includes photographs of screen usage, one of an augmented group meeting in action, and one showing the facility for a video-based display system to mix cameragenerated video (in this case, the face of Bill English) with computer-generated graphics about which he is communicating to a remote viewer. 12. Haavind, R., "Man Computer 'Partnerships' Explored," Electronic Design, Vol. 17, No.3, pp. 25-32, 1 February, 1969, (SRI-ARC Catalog Item 13961). A very well-done piece, effectively using photographs and diagrams to support description of our consoles, environment, working practices, and experiences to a general, technically oriented reader. 13. Augmentation of the Human Intellect-A Film of the SRI-ARC, Presentation at the 1969 ASIS Conference, San Francisco, (A 3Reel Movie). Stanford Research Institute, Augmentation Research Center, October 1969, (SRI-ARC Catalog Item 9733). 14. Field R. K., "Here Comes the Tuned-In, Wired-Up, Plugged-In, Hyperarticulate Speed-of-Light Society-An Electronics Special Report: No More Pencils, No More Books-Write and Read Electronically," Electronics, pp. 73-104, 24 Kovember, 1969, (SRIARC Catalog Item 9705). 
A special-feature staff report on communications, covering comments and attitudes from a number of interviewed "sages." Some very good photographs of our terminals in action provide one aspect of relevance here, but the rest of the article does very well in supporting the realization that a very complex set of opportunities and changes are due to arise, over many facets of communication. 15. Engelbart, D. C., "Intellectual Implications of Multi-Access Computer Networks," paper presented at Interdisciplinary Conference on Multi-Access Computer Networks, Austin, Texas, April 1970, preprint, (SRI-ARC Journal File 5255). This develops a picture of the sort of knowledge-worker marketplace that will evolve, and gives examples of the variety and flexibility in human-service exchanges that can (will) be supported. It compares human institutions to biological organisms, and pictures the computer-people networks as a new evolutionary step in the form of "institutional nervous systems" that can enable our human institutions to become much more "intelligent, adaptable, etc." This represents a solid statement of my assumptions about the environment, utilization and significance of our computer terminals. 16. Engelbart, D. C., SRI-ARC Staff, Advanced Intellect-Augmentation Techniques-Final Report, Stanford Research Institute, Augmentation Research Center, CR-1827, July 1970, (SRI-ARC Catalog Item 5140). Our most comprehensive report in the area of usage experience and practices. Explicit sections on: The Augmented Software Engineer, The Augmented Manager, The Augmented Report-Writing Team, and The Augmented Presentation. This has some fifty-seven screen photographs to support the detailed descriptions; and there are photographs of three stages of display-console arrangement (including the one designed and fabricated experimentally by Herman Miller, Inc, where the keyboard, keyset and mouse are built into a swinging control frame attached to the swivel chair). 17. Roberts, L. C., Extensions of Packet Communication Technology to a Hand Held Personal Terminal, Advanced Research Projects Agency. Information Processing Techniques, 24 January, 1972. (SRI-ARC Catalog Item 9120). Technology of digital-packet communication can soon support mobile terminals; other technologies can soon provide hand-held display terminals suitable for interactive text manipulation. 18. Savoie, R., Summary of Results of Five-Finger Keyset Training Experiment, Project 8457 -21, Stanford Research Institute, Bioengineering Group, 4, p. 29, March 1972, (SRI-ARC Catalog Item 11101). 226 National Computer Conference, 1973 Summarizes tests made on six subjects, with an automated testing setup, to gain an objective gauge on the learnability of the chording keyset code and operating skill. Results were actually hard to interpret because skills grew rapidly in a matter of hours. General conclusion: it is an easy skill to acquire. 19. DNLS Environment Stanford Research Institute, Augmentation Research Center, 8, p. 19, June 1972, (SRI-ARC Journal File 10704). Current User's Guide for ARC's Display Online :::System (DNLS). Gives explicit description on use of the keyset, mouse, and the basic interaction processes. APPENDIX B: PHOTOGRAPHS APPENDIX A: MOUSE AND KEYSET, CODES AND CASES Note: We generally use the keyset with the left hand; therefore, "a" is a "thumb-only" stroke. Of the three buttons on the mouse, the leftmost two are used during keyset input effectively to extend its input code by two bits. 
Instead of causing character entry, the "fourth case" alters the view specification; any number of them can be concatenated, usually terminated by the "P" chord to effect a re-creation of the display according to the altered view specification.

Mouse buttons and cases: Case 0 = 000, Case 1 = 010, Case 2 = 100, Case 3 = 110.

[Appendix A table: the thirty-one five-key chord codes, listing for each chord the character or action produced in the four button-selected cases - lower-case letters, upper-case letters, digits and special characters (including SP, ALT, TAB, and CR), and view-specification commands such as "show one level less/deeper," "show all levels," "show top level only," "re-create display," "show/hide statement numbers," "frozen statement windows on/off," "show one line more/less," "show all lines," "first lines only," "inhibit/normal refresh display," "all lines, all levels," "one line, one level," and "blank lines on/off."]

Figure 1-Our current standard work station setup: Mouse in right hand controls cursor on screen; keyset under left hand supplements keyboard for special, two-handed command execution operation. Separation of control and viewing hardware is purposeful, and considered by us to be an advantage enabled by computerized work stations.

Figure 2-Closeup of keyset. Finger pressure and key travel are quite critical. It took many successive models to achieve a really satisfactory design.

Figure 3-Closeup of mouse. There are characteristics of the "feel," depending upon the edging of the wheels, the kind of bearings, etc., that can make considerable difference. We happened to hit on a good combination early, but there have been subsequent trials (aimed at improvements, or where others more or less copied our design) that didn't work out well. The play in the buttons, the pressure and actuating travel, are also quite important.

Figure 4-Closeup of underside of mouse (old model), showing orthogonal disk wheels. We now bring the flexible cable out the "front." Size and shape haven't changed in later models. Bill English (the operator in Fig. 1, and mentioned in the text above) is now experimenting with new mouse sizes and shapes.

Intelligent satellites for interactive graphics*

by ANDRIES VAN DAM and GEORGE M. STABLER
Brown University
Providence, Rhode Island

INTRODUCTION

Semantics

In the last four or five years it has become increasingly fashionable to speak of "intelligent," "smart," or "programmable" terminals and systems. Very few mainframe or peripheral manufacturers omit such a device from their standard product line. Although "intelligence," like beauty or pornography, is in the eye of the beholder, the adjective generally connotes that the device has a degree of autonomy or processing ability which allows it to perform certain (classes of) tasks without assistance from the mainframe to which it is connected. Many such devices are programmable by virtue of including a mini, microprogrammable or micro computer.**

While operational definitions are pretty hazy and nonstandard, we call a device a terminal if a user interacts with a mainframe computer (host) through it (e.g., a teletype or an alphanumeric display console). Hobbs15 lists 6 classes of terminals:***

(1) keyboard/printer terminals;
(2) CRT terminals;
(3) remote-batch terminals;
(4) real-time data-acquisition and control terminals;
(5) transaction and point-of-sale terminals;
(6) smart terminals.

We consider the terminal to be intelligent if it contains hard, firm, and/or software which allows it to perform alphanumeric or graphic message entry, display, buffering, verifying, editing, and block transmissions, either on mainframe or human command. Note that if the terminal contains a mini, micro or microprogrammable computer which runs a standard program to service the terminal, and not arbitrary, user-loaded programs, the terminal has a fixed function and is still just an intelligent terminal by our definition† (e.g., a VIATRON or SYCOR alphanumeric display). Only when the device contains a general purpose computer which is easily accessible to the ordinary user for any purpose and program of his choice do we promote the terminal to an intelligent satellite (computer). Note that this notation is in conflict with that of Hobbs, who calls this last category smart or intelligent terminals, and does not discuss our class of intelligent terminals. Machover's19 definition tallies more with ours, since his intelligent terminal could be constructed purely with nonprogrammable hardware (e.g., the Evans and Sutherland LDS-1 display).

Our view of the idealized intelligent satellite is one which is powerful enough to run a vast number of jobs (e.g., student programs) completely in stand-alone mode. In satellite mode it uses the host less than 50 percent of the time, as a fancy "peripheral" which supplies an archival (possibly shared) database, massive computational power (e.g., floating point matrix inversion for the analysis part of the application), and input/output devices such as high speed printers, plotters, microfilm recorders, magnetic tape, etc.

* The research described in this paper is supported by the National Science Foundation, Grant GJ-28401X, the Office of Naval Research, Contract N00014-67-A-0191-0023, and the Brown University Division of Applied Mathematics.
** A synonym is the "Computer on a Chip," e.g., the INTEL 8008.
*** Rather than repeating or paraphrasing the several excellent surveys on terminals here, the reader is referred directly to them. Suggested are References 4, 6, 15 and 19, as well as commercial reports such as those put out by Auerbach and DataPro.
† This is true even if the program can be modified slightly (or replaced, in the case of Read Only Memory) to allow such things as variable field definitions on a CRT displayed form, etc.
‡ Note that the development of ever more sophisticated channels and device controllers started the trend of distributing intelligence away from the CPU. (Programmable) data concentrators and front end processors are further extensions of the principle.

Distributed computing

The term "distributed computing" refers both to devices at remote locations, and logic up to the point of a programmable computer, which has been used to enhance the intelligence of the devices.‡ Such distributed or decentralized computing with remote intelligent terminals and especially satellites is a fact of life today.
This is so despite the attempts by most computer center administrators in the middle and late 1960's to have individual and department funds pooled to acquire a large, central, omni-purpose computer satisfying everyone's needs. Indeed, the hardware architects12 (ACM 72 National Conference panel participants) are predicting even more decentralization, with complete mini or micro computers in homes, offices, and schools, both to provide programmable intelligence to terminals and peripherals, and to serve as local general purpose computers for modest processing tasks. Some of the reasons for this phenomenon are psychological, others technical:

(1) The hardware technology has made it possible; the price of mass produced logic and memory (even of small disks and drums) has decreased dramatically, and more so for minis than for mainframe hardware; even Foster's "computer on a chip"12 is a reality before his predicted 1975 date. Consequently, minis (and even midis) have become widely affordable; OEM minis may be part of scientific equipment or of terminals costing under $7,000! This is partially due to the pricing structure of the mini manufacturers, who do not tack on mainframe-manufacturer-type overhead for such frills as universal software and customer services.

(2) The advantages of distributed (remote) logic and computing are indisputable:

(a) the convenience and psychological advantage of having a terminal or remote job entry station in or near your office, especially if it can do simple things like accumulate messages and print or edit them locally.

(b) in the ideal case, the even greater convenience and sense of power rightfully restored to someone who controls the destiny of his very own little, but complete, computer system. No more fighting with the computing center, or contending with other users for scarce resources; presumably less lost data and fewer system crashes since there's no interuser interference within a fragile operating system. (To be fair, if the local machine is then hooked into the central host in satellite mode, the problems of reliability may be worse, due to communications errors for example.)

(c) The advantage of avoiding extraneous user effort by being able to build a message or a picture* locally, and immediately verify and edit it, before entering information into the mainframe; this convenience is in contrast to cycles of enter/verify/correct steps which are separated in time.

(d) The corresponding conservation of resources, of both mainframe and communications link, due to fewer and more compact interactions and transmissions; the vastly superior user response time (especially given the saturated multi-programmed or time-shared operating systems typical today); and the real or "funny" money savings of not unnecessarily using the host.

(e) The ability to do (significant) work, minimally message composition using simple cassettes, while the host is down or a core partition for the applications program is not available.

(f) Returning to the transmission link, the advantage of being able to live without its constant availability; incurring fewer errors due to lesser use; being able to be satisfied with a lower speed and therefore less delicate, more widely available and lower cost link.

(g) The ability, with sufficient intelligence, to emulate existing older devices, to the point of providing with one device "plug for plug compatible" replacements for several existing ones.

(h) And finally, the enormous advantage of having locally, hopefully general purpose, processing power for arbitrary customization.

* Let alone scaling and rotating it, doing lightpen tracking to build it, etc.

An outline of potential problem areas

While distributed logic and computing offer genuinely enhanced capability and cost effectiveness, a number of problems which the user ought to be aware of do crop up.

Hardware problems

(a) Either you choose among the many off-the-shelf, software supported, but unchangeable devices,** or (b) you build yourself, in hardware, or preferably in firmware, a customized terminal just right for your application. You then have to implement your device and the supporting software, both probably incompatible with anybody else's gear.

** Note that if the device does contain a user accessible general purpose computer, inflexibility may be tempered with appropriate software.

Interfacing problems

(a) If you buy a standard device, connected over a standard (channel or front end) interface, you might be lucky and have no operating system support problems. The non-standard device might need a special purpose interface and might not be recognized (the "foreign attachment" gambit). Even standard interfaces are notorious for crashing operating systems. In any case, "mixed systems" containing multiple vendor hardware are coming of age, but lead to many "our system works, it must be their system" traumas. As an aside, maintenance on the device may not be cheap, so many installations use "on call" arrangements.

(b) To make interfacing easy, obtain flexible remoteness, and avoid the foreign attachment maintenance support problem, telephone communications links are very popular. Modems and lines are expensive for more than 1200 baud transmission, however, and even low speed lines are notoriously noisy (especially in rainy weather)!

Host operating system support

Terminal systems today generally are supported with a host operating system; minicomputers rarely are. Homebrew customized systems, by definition, are not. Providing your own host support at the I/O level for such systems is usually a reasonable task for experienced systems programmers; higher level, truly integrated support, however, may be a real research problem.

Local software support

This type of support ranges from minimal (say a local assembler) to reasonably complete; for a mini it is often a full disk operating system, with FORTRAN and, if you're very lucky, a cross compiler for a higher level procedural language, which compiles on the host and produces code for the satellite. Even if a standard language like FORTRAN is available on both host and satellite, thereby obviating having to learn to program in two languages, the versions will be incompatible, due to differences in architecture and instruction set of the machines, and in compiler implementation.

Reserving host resources

Making sure that real (or virtual) core and disk space are available for the new device is an often overlooked or underestimated problem.

THE EVOLUTION FROM TERMINAL TO SATELLITE

Despite all the facilities intelligent terminals provide, they still contribute only a relatively small (but possibly quite sufficient) amount of processing power to the total needs of the program or system.
For example, display regeneration and housekeeping, communications handling, and simple local interrupt fielding are typical tasks which can be allocated to an intelligent terminal for interactive graphics. Such functions can be lumped together into an area we will call "hardware enhancement." This term is meant to indicate that the basic goal to which the intelligence of the terminal is being directed is the sim ulation of a more sophisticated piece of hardware. Raising the level of the interface which the terminal presents to 231 Figure 1 the mainframe effectively lightens the load placed on the mainframe by simplifying the requirements of the low ("access method" or "symbiont") level support for the terminal. The operational distinction we have made between intelligent terminals and satellites is the use to which the intelligence of the remote terminal system is put. For intelligent terminals this intelligence is primarily directed into areas such as hardware enhancement and low level support. On the other hand, the intelligence of a satellite is sufficiently high to allow it to be applied directly to the processing requirements of the application program. (Some hardware implications of this requirement are discussed below.) The transformation of an intelligent terminal into a full satellite is a long and, as Myer and Sutherland have pointed oue o somewhat addictive process (" ... for just a little more money, we could ... "). For example, in the graphics case, one first adds a data channel to mainframe core to service the display. Then special display registers and channel commands are incrementally added. A local memory is inserted to free mainframe core. The display is given its own port (i.e., primitive data channel) into the local memory, and the wheel of reincarnation is rolling. The end result of this process is a system of the general form shown in Figure 1. THE SATELLITE CONFIGURATIO~ The diagram in Figure 1 says nothing about the power or complexity of the various components. As Foley 11 has pointed out, there are many ways in which the pieces of such a satellite system can be chosen. and the implementation of an optimal (highest cost/ performance ratio) configuration for a particular application may entail the examination of hundreds or even thousands of different combinations of subsystems. In the following, we are not as concerned with optimality as with defining a lower bound on the total processing power of the satellite below which it becomes infeasible to view the satellite as a general processor (at least for the purpose of satellite graphics). The argument for having a satellite system as opposed to an intelligent terminal is to do nontrivial local processing (real-time transformations and clipping, local attention handling, prompting. providing feedback, data-hase editing, etc.), while leaving large-scale computation and data-base management to the mainframe. In order to perform satisfactorily for a given (class of) job(s), the satellite must possess "critical intelligence." Analogous to 232 National Computer Conference, 1973 the "critical mass" concept, critical intelligence defines a threshold of local power which allows the majority of tasks to be executed locally without recourse to the mainframe; it is a complex and application-dependent figure of merit describing such parameters as the architecture and instruction set of the processor, and the primary and secondary storage capacity and access time. 
It is unfortunately not readily expressed quantitatively, although Foley does try to quantize trade-offs for several classes of applications. Below critical intelligence, i.e., if the satellite does not have (reasonably fast) secondary storage, sufficient local memory, and a powerful instruction set, it simply may not be able to do enough processing fast enough to make the division of labor worthwhile. Many minicomputers used as satellites have too few general-purpose registers and core, inadequate bit and byte manipulation instructions, and minimal operating systems. On such machines, it is seldom possible to handle non-trivial applications programs in stand-alone mode satisfactorily (i.e., not just drawing, but handling data structure editing as well), manufacturers' claims notwithstanding.*

In some cases, the shortcomings of a standard minicomputer can be overcome by the use of microprogramming. Microprogrammed processors have a very real advantage over those which are hardwired in that it is frequently possible to redefine a weak architecture or instruction set. Hardware deficiencies such as minimal core may be offset by, for example, an instruction set which has been designed to accommodate often used algorithms. Entire subroutines or their most critical parts may be put in the firmware. An example of the considerable savings in core and execution time which can be achieved with thoughtful firmware instruction set design is described in Reference 2.

* Yet it is astonishing how many programmers who would not dream of writing a program in 32 kilobytes of 360 user core, with the powerful 360 instruction set and good disks behind it, have the chutzpah (typically justly rewarded) to write the same program for a 32-kilobyte mini with a slower, more primitive architecture and instruction set and a painfully slow disk.

Given a satellite system sufficiently powerful to run stand-alone applications, the question arises, "Why go on?" If the satellite has gone once around the wheel and has become self-sufficient, we no longer have a satellite, but another mainframe, and the need for satellite/mainframe interaction disappears. Indeed, this approach was taken, for example, by Applicon, Inc., whose IBM 1130/storage tube system has been carefully and cleverly tailored to a modest application (IC layout), providing one of the few money-making instances of computer graphics.3 ADAGE's13 more powerful midi systems also were enhanced, for example with a fast, high-capacity disk, to a point where they support a wide variety of graphic applications without recourse to an even larger computer. If stand-alone mode is no longer sufficient, the local system may again be enhanced, but in most cases a duplication of facilities with an existing large mainframe is not cost effective, and a crossover point is reached at which communication with the mainframe becomes cheaper than satellite enhancement. In a sense, the satellite is a saddlepoint (an optimal strategy) in the range of configurations bounded at one end by simple display terminals and at the other by full stand-alone graphics processors.
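The crossover argument just made can be illustrated with a rough, purely hypothetical calculation; the 1200-baud line and the 1000-2000 vector picture are figures that appear elsewhere in this paper, while the byte counts and local drawing rate are assumptions.

```c
/* Rough, illustrative estimate of why local ("critical") intelligence
 * pays: compare shipping a whole picture over a slow link with
 * regenerating it locally.  The 4 bytes per vector and the satellite
 * drawing rate are assumed figures.
 */
#include <stdio.h>

int main(void)
{
    double vectors       = 2000.0;
    double bytes_per_vec = 4.0;             /* assumed packed endpoints   */
    double line_baud     = 1200.0;          /* ~120 characters per second */
    double link_seconds  = vectors * bytes_per_vec / (line_baud / 10.0);

    double local_vec_per_s = 50000.0;       /* assumed satellite rate     */
    double local_seconds   = vectors / local_vec_per_s;

    printf("full retransmission over the link: ~%.0f s\n", link_seconds);
    printf("local regeneration:                ~%.2f s\n", local_seconds);
    return 0;
}
```

Such numbers are only suggestive, but they show why purely terminal-like use of a remote display quickly becomes unattractive for interactive work.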
SOFTWARE STRATEGIES AND APPLICATIONS

Given the typical hardware configuration outlined above, it is interesting to examine some of the ways in which such systems have been utilized. These will be presented roughly in order of increasing utilization of the satellite's intelligence.

Hardware enhancement

At the low end of the spectrum, the facilities of the satellite system are used solely to complement the capabilities of the display processor. In such systems, the display processor can typically do little more than display graphic data out of a contiguous data list in core. All other display functions-subroutining, transformations, windowing, clipping, etc.-are performed by the satellite processor. Little if any processing power is left over for more application-oriented processing. In fact, there is little difference between a satellite used in this manner and an intelligent terminal; the approach is noted here only because some intelligent terminals may have the hardware structure described in Figure 1.

Device simulation/emulation

Another "non-intelligent" use of a satellite is to emulate and/or simulate another display system. The rationale for this is almost always the presence of a large package of existing software for the display being emulated. Rather than discard many man-years of software development, the choice is made to under-utilize the facilities of the satellite system in order to support the existing applications. Another reason for using the satellite in simulation/emulator mode is that it may be possible to provide higher performance and access to new facilities for existing programs (e.g., control dial and joystick input for a 2250 program). In addition, satellite simulation may allow remote access graphics (over telephone lines or a dedicated link) where not previously possible.**

At Brown, we have implemented three such systems to provide support for IBM 2250 Mod 1 and Mod 3 programs. Two of these were simulators, using an IBM 1130/2250 Mod 4,27 and an Interdata Mod 3/ARDS storage tube;26 and one now being completed is an emulator using a Digital Scientific Meta 4 and a Vector General display system. Our experience has indicated that it is indeed useful to be able to continue to run old programs while developing more suitable support for the new system. In addition, we have found that device emulation is a good "benchmark" application for gaining experience with and assessing the capabilities of the new system. On the other hand, in no instance have we found that device emulation made full use of the satellite facilities.

** As a commercial example, ADAGE is now offering a display system with simulator software incorporating many of these ideas [ADAGE, 1972].

Black box approaches

We are now to the point of considering programs which require the full services of the mainframe/satellite configuration. Such programs are characterized by a need for the graphics capability of the satellite processor and a set of other requirements (computing power, core, bulk secondary storage, etc.) which cannot be completely satisfied by the satellite. In addition, we will assume that the satellite possesses the critical intelligence referred to above. It is at this point that the "division of labor" problem arises, i.e., determining the optimal way in which the various subtasks of a large application should be allocated to the two processors. This question has as yet received no adequate treatment (a possible approach is outlined below), and it is indicative of the difficulties inherent in the problem that no current system for satellite graphics provides a truly general solution. The most common treatment of the satellite processor is as a black box.
GIN031 and SPINDLE16 typify this approach in which all hardware and operating system details of the satellite are hidden from the applications program. The satellite is provided with a relatively fixed run-time environment which performs tasks such as display file management, attention queueing, light-pen tracking, and data-structure management (in conjunction with mainframe routines). Access to these facilities from the application program is usually in the form of a subroutine library callable from a high -level language (e.g., FORTRAN). This approach has the attractive feature of "protecting" the applications programmer from getting involved in multiple languages, operating system details, and hardware peculiarities. In addition, a well-designed system of this type can be easily reconfigured to support a new satellite system without impacting the application program. On the other hand, defining a fixed satellite! mainframe task allocation may incur unnecessary systems overhead if the allocation should prove to be inappropriate for a particular application. Particularly in the case of high-powered satellites, it may be difficult to provide access to all the facilities of the satellite without requiring the applications programmer to "get his hands dirty" by fiddling with various pieces of the system. Worst is poor use (inadequate use) of local facilities, wasting power and incurring charges on the mainframe. 233 Systems for interconnected processing While the black-box approach hides the satellite from the programmer, interconnected processor (lCP) systems (connecting one or more small satellites to a large host) allow (and sometimes require) cognizance of both the mainframe and satellite processors. At the lowest level, such a "system" consists of no more than a communications package or access method. At a slightly higher level are packages such as IBM's processor-to-processor (PTOP) routines for 360! 1130 communications. 22 These routines provide a high-level communication interface together with data conversion capabilities. More sophisticated systems are exemplified by Bell Labs' GRIN _27 and UNIVAC's Interactive Control Table (ICT) approach. B In these systems, a special-purpose language is provided with which the application programmer specifies the detailed data structure manipulation and! or attention handling which is to take place during an interactive session. Once this specification has been made, it becomes part of the system environment of both processors. The Univac system allows this environment to be changed at runtime by providing for the dynamic loading of new satellite programs for full attention handling and data structure manipulation. Thus the programmer has at least some control over the activities of the satellite processor. A very general system of this type has been outlined by Ross et al./ 4 in which a Display Interface System (DIS) is described. The DIS consists of "minimal executives" in both the satellite and mainframe processors. These executives act in conjunction to provide attention and programhandling mechanisms in both machines. Additional features, such as satellite storage management and display file handling, are available in the form of system-provided routines which can be allocated to either processor. Effectively, the DIS provides a "meta-system" which can be used by a systems programmer to tailor the appearance of the mainframe! satellite interface to provide optimal utilization of the satellite configuration. 
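As a concrete, and entirely hypothetical, illustration of the "logical level" of exchange these ICP systems aim to provide, the sketch below passes typed messages between the two processors without exposing the physical link. None of the names, opcodes, or formats belong to PTOP, ICT, GRIN-2, or DIS.

```c
/* Illustrative sketch of a logical host-satellite message interface:
 * the application exchanges typed messages and never sees the channel
 * or the communications line.  All identifiers here are assumptions;
 * the "link" is simulated by an in-memory buffer for demonstration.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum msg_op { MSG_CALL = 1, MSG_REPLY = 2, MSG_ATTENTION = 3 };

struct msg {
    uint8_t  op;        /* one of msg_op                      */
    uint16_t len;       /* number of bytes used in data[]     */
    uint8_t  data[256]; /* payload in a machine-neutral form  */
};

/* In a real system these would drive a channel or a low-speed line. */
static struct msg link_buffer;

void icp_send(const struct msg *m)   { link_buffer = *m; }
void icp_receive(struct msg *m)      { *m = link_buffer; }

int main(void)
{
    struct msg out = { MSG_CALL, 0, {0} }, in;
    const char *request = "redraw";
    out.len = (uint16_t)strlen(request);
    memcpy(out.data, request, out.len);

    icp_send(&out);                  /* satellite -> mainframe          */
    icp_receive(&in);                /* mainframe side picks it up      */
    printf("op=%d payload=%.*s\n", in.op, in.len, (const char *)in.data);
    return 0;
}
```

The point of such an interface is exactly the one made above: the application sees only logical requests and replies, so the allocation of work to either processor can change without the program being rewritten around the link.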
While a full DIS system has not yet been implemented, the basic design principles were used with apparently good success in the design of the GINO package. 31 What objections to the preceding systems can be raised? The Univac system requires bi-linguality and the overhead of a local interpreter, a deficiency recognized by the implementers.9 This particular system also failed to achieve critical intelligence since the hardware on which it was implemented lacked mass storage, general purpose registers, and a decent instruction set. The Grin-2 experiment is somewhat more ambiguous, with in-house users apparently satisfied and some outsiders dissatisfied, with such features as a fixed (ring) data structure, an unwieldy programming language, not enough local core, etc. The GINO satellite system, though used very successfully, has had only relatively minor housekeeping and transformation functions executed locally, thereby not saving very much on host resources in this intelligent 234 National Computer Conference, 1973 terminal mode. Thus, a truly general purpose, flexible, easy to use, and cost effective system for host-satellite communication is yet to be achieved. tions, windowing, and clipping. SIMALE is a high -speed parallel processor with writeable control store! Webber, 1973!. The 360 Interface-The communications link to the mainframe (an IBM 360;67 under CP67;CMS) is a multiplexer channel interface. The interface, which is driven from firmware, is non-specific, that is, it can appear as any IBM -supported device. Eventually, this link will be downgraded to a medium to low speed (e.g., 1200 BAUD) communications line. The local operating system-The operating system which runs on the local processor was built using the "extended machine" or "structured" approach typified by Dijkstra's THE System. 10 With this approach, the operating system is developed as a series of distinct levels, each level providing a more intelligent "host" machine to the next level. The design of the system facilities the vertical movement of various facilities between levels as experience dictates. As facilities become stabilized on the lowest level, they can be moved into the firmware with no impact on user programs. A SYSTEM FOR STUDYING SATELLITE GRAPHICS An environment for interconnected processing BROWN UNIVERSITY GRAPHICS SYSTEM S I MAL E (TRANSFORM.~TION PROCESSOR) Figure 2 The Brown University graphics system (BUGS) For the past eighteen months the Graphics Project at Brown University has been implementing a laboratory system for the investigation of a variety of questions relating to satellite graphics. The salient features of the system (shown in Figure 2)* are as follows.** The local processor: The general-purpose processor of the system is a microprogrammable Digital Scientific Meta 4. It has been provided with a 360'-like firmware defined instruction set with additions and modifications to enhance the ability of the processor to meet the needs of the operating system and graphics applications. The display processor: A second Meta 4 was chosen to serve as a programmable display processor to drive the Vector General display. While the Vector General itself is a relatively powerful processor, this "level of indirectness" was added to allow complete freedom in designing (and altering) the display instruction set seen by the user. 
The SIMALE—It was determined that even with the high speed of the Meta 4, it would not be possible to provide full three-dimensional transformations (with windowing and clipping) at a speed sufficient to display 1000-2000 vectors. For this reason, we have designed the SIMALE (Super Integral Microprogrammed Arithmetic Logic Expediter) to perform homogeneous transformations, windowing, and clipping. SIMALE is a high-speed parallel processor with writeable control store (Webber, 1973).

* A PDP 11/45 with a Vector General display is a cheaper commercial system sharing many of the characteristics of the BUGS system. We are providing a FORTRAN-based graphics subroutine package for this system, both for standalone and for 360/370 satellite mode.

** For more complete information about the system, see References 2 and 28.

The system outlined above, which admittedly has gone around the wheel of reincarnation several times, has been designed with the goal of keeping each subsystem as open-ended as possible, thus allowing maximum flexibility in subsystem/task allocation. This approach is also being taken in the design of system software to support graphics applications requiring both satellite and mainframe facilities.

With currently extant systems, the ramifications of splitting an application between the mainframe and the satellite are many and frequently ugly. Minimally, the programmer must become acquainted with a new instruction set or an unfamiliar implementation of a higher-level language. In addition, the vagaries of the satellite operating system must be painfully gleaned from Those-in-the-Know. With luck, there will be some sort of support for I/O to the mainframe, but probably no good guidelines on how best it should be used. Most importantly, the programmer will have little or no knowledge about how to split his application between the two processors in such a way as to make optimal use of each. In other words, mistakes are bound to occur, mistakes of the kind which frequently require a significant amount of recoding and/or redesign.

The basic goal of the ICP system outlined below is to alleviate these problems while placing minimal constraints on the programmer's use of the two processors. The aim is to provide not merely a high-level access method or system environment through which the satellite can be referenced, but rather a set of tools which will allow the programmer to subdivide an applications program or system between the satellite and the mainframe without constant reference to the fact that he is working with two dissimilar processors. These tools include the following:*

• A completely transparent I/O interface between the mainframe and the satellite. I/O between the two processors should take place at a purely logical level with no consideration of communications protocol, interface characteristics, or timing dependencies (a sketch of such an interface follows this list).

• A run-time environment to support inter-process communication as described below. In the limit, this environment should be sufficiently powerful to allow dynamic (run time) redefinition of the task/processor allocation.

• A high-level language with translators capable of generating code for both machines. Minimally, this language should let the programmer disregard as far as possible differences in the hardware and operating system defined characteristics of the two processors.** Optimally, the language should provide constructs to maximize the ease with which the various tasks of a large application can be moved from one processor to the other.
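The sketch promised for the first tool is given below; LogicalLink and LoopbackTransport are hypothetical names, and the in-process transport merely stands in for a channel interface or communications line.

# Application code exchanges typed messages at a purely logical level; only the
# transport class would change if the physical link changed.

import json
import queue

class LoopbackTransport:
    def __init__(self):
        self._q = queue.Queue()
    def send(self, raw_bytes):
        self._q.put(raw_bytes)
    def receive(self):
        return self._q.get()

class LogicalLink:
    def __init__(self, transport):
        self.transport = transport
    def send(self, kind, payload):
        # framing and conversion are hidden here, not in the application
        self.transport.send(json.dumps({"kind": kind, "payload": payload}).encode())
    def receive(self):
        message = json.loads(self.transport.receive().decode())
        return message["kind"], message["payload"]

link = LogicalLink(LoopbackTransport())
link.send("display_update", {"segment": 3, "vectors": 120})
print(link.receive())        # ('display_update', {'segment': 3, 'vectors': 120})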
235 • The ability to completely redefine the compile-time code generators. This allows implementation of a compiler which will generate code for either the satellite or the mainframe. • Extensibility mechanisms. In particular, semantic extensibility allows the definition of new data types, storage classes, and scopes for variables. • The ON ACCESS mechanism. This facility, which is similar to PL/l's CHECK statement, allows compiletime definition of routines which are to be invoked whenever a particular variable is accessed at run time. • Operating system independence. The constructs which are generated by the compiler have been designed to be as independent as possible of the operating system under which they are to run. In addition, the run-time environment required by an LSD program has been kept as small as possible. • Run-time symbol table access. Complete information about the scope, type and storage class of all variables is available at run time. The language for systems development LSD extended for the ICP system Of the tools mentioned above, the most important is the high-level language in which the application is going to be written. Indeed, the language is the heart of our approach to ICPing since it provides the uniform environment in which the programmer can work with little or no regard for the final task/processor subdivision of the application. If he wishes, the language will also let him hand-tool each routine for the processor on which it will run. The language which we are using for both the Iep system and application programs is the Language for Systems Development (LSD).5 LSD is a general-purpose procedure-oriented language with many of the features and much of the syntax of PL/I. In contrast to PL/I, however, the language enables the programmer to get as close as he wants to the machine for which he is programming rather than hiding that machine from him. Thus, while the ordinary applications programmer can simply use it as a FORTRAN replacement, a systems programmer can explicitly perform operations on main memory locations and registers; he can intersperse LSD code with assembly language or machine language (through the CODE/ENCODE construct). LSD also will provide a variety of extension mechanisms to permit the user to tailor the language to specific problems or programming styles. Some of the features of LSD which make it an ideal vehicle for the ICP system are the following: The fundamental extension to be made to LSD for ICP'ing is the addition of a new scope for variables and procedures. Currently, an LSD variable may have one of three scope attributes-LOCAL, GLOBAL, or EXTERNAL. A LOCAL variable is accessible only within the procedure in which it is allocated. A GLOBAL variable may be known to all procedures in the system. EXTERNAL indicates that the variable has been defined as a GLOBAL in some other procedure which is external to the current procedure. A procedure can be either external or internal. An internal procedure is defined within another procedure. An external procedure is the basic unit of programming within the system; that is, it can be compiled separately from all other procedures with no loss of information. For the use of ICP applications, a new scope will be defined which will be referred to here as ICPABLE. A declaration of ICPABLE defines the named variable or procedure as one which may reside in the other processor. t This declaration· will force the compiler to take the following actions: * The approach taken is similar to that of J. D. 
Foley in his current National Science Foundation-sponsored research in computer graphics; the envisioned run-time environments and facilities, however, are quite dissimilar, as will be described below.

** Note that microprogramming is a very handy implementation tool for making the satellite architecture and instruction set somewhat similar to that of the host, thereby reducing the load on the compiler design team.

• On variable access or assignment, a run-time routine must be called which has the task of returning the value (or address) of the variable, possibly accessing the other processor to obtain the current value of the variable.

• On a call of a procedure which has been declared ICPABLE, a similar check must be made as to the current whereabouts of the procedure. If the procedure is in the other processor, an appropriate mechanism for passing of control and parameters must be invoked.

t This definition is similar to Foley's GLOBAL; however, assignments between processors are not run-time alterable in Foley's system, a significant and far-reaching difference.

The end effect of this extension will be that the programmer need have only very vague ideas about the eventual disposition of his programs and data. During program development, any unclear areas can be resolved by declaring all affected variables and routines as ICPABLE. If the referenced object is in the same processor as the "referencor," overhead will be minimal; if it is not, overhead beyond the necessary communications delay will hopefully still be minimal.* Once the application has been shaken down, this minimal overhead can be removed by suitable redeclarations of the ICPABLE variables and procedures.

The run-time environment

As currently envisioned, an application requiring satellite graphics will run in an environment consisting of five levels.

• At the lowest level will be a set of routines (in each processor) which handle the lowest level physical I/O. A standard interface will be defined between these routines and higher levels to ensure flexibility with respect to the variety of possible satellite/mainframe links.

• Above the low-level I/O package will be an (LSD-callable) access method for explicit use by the LSD programmer as well as the routines supporting implicit inter-process communication. Also at this level will be any system-supplied routines which are found necessary to interface with the lowest-level facilities on the two processors (e.g., a routine to actually start display regeneration).

• The access method will be used for the most part by a third level of routines in charge of performing all necessary data transmission and conversion.

• Between the data conversion routines and the actual LSD program will be a supervisory package which keeps track of the current procedure/variable/processor assignment. When dynamic movement of variables and procedures between processors becomes feasible, it also will be undertaken at this level (a sketch of such a package follows this list).

• The highest level of the run-time environment will consist of a "meta-system" which is used for system resource utilization, response measurements, and dynamic reconfiguring of the application program.
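As a rough illustration of the supervisory package referred to in the fourth level above, the following sketch (with invented names; it is not the authors' implementation) resolves a call on an ICPABLE procedure either locally or across the link, and shows how the assignment could be changed at run time.

# FakeLink stands in for the inter-processor link; a real link would cross the channel.

class FakeLink:
    def __init__(self, remote_procedures):
        self.remote_procedures = remote_procedures
    def remote_call(self, name, args):
        return self.remote_procedures[name](*args)

class Supervisor:
    def __init__(self, link, local_name):
        self.link = link
        self.local_name = local_name          # "mainframe" or "satellite"
        self.assignment = {}                  # procedure name -> processor name
        self.local_code = {}                  # copies compiled for this processor

    def declare_icpable(self, name, procedure, processor):
        self.assignment[name] = processor
        self.local_code[name] = procedure     # compiled for both machines

    def call(self, name, *args):
        if self.assignment[name] == self.local_name:
            return self.local_code[name](*args)            # cheap local path
        return self.link.remote_call(name, args)           # remote path

def buffgen(structure):                        # hypothetical ICPABLE procedure
    return f"display file built from {structure}"

mainframe = Supervisor(FakeLink({"BUFFGEN": buffgen}), "mainframe")
mainframe.declare_icpable("BUFFGEN", buffgen, "satellite")
print(mainframe.call("BUFFGEN", "ring structure"))   # resolved across the link
mainframe.assignment["BUFFGEN"] = "mainframe"        # run-time reallocation
print(mainframe.call("BUFFGEN", "ring structure"))   # now resolved locally

The "meta-system" at the highest level, which would drive such reallocations from measurements, is taken up next.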
The idea here is to provide a logical "joystick" with which the programmer (or user) can make real-time decisions as to the optimal deployment of the various * The overhead inflicted by various flavors of special purpose run-time environments is notoriously unpredictable: the "minimal" overhead for ICPABLE variables and procedures could prove to be entirely unacceptable. pieces of the application. By moving the stick in the "360 direction" he causes some of the modules to be loaded and executed in that machine; by moving in "the other direction," he causes modules to be shifted to the satellite. Observing response or some graphically displayed resource utilization and cost data, he can manipulate the stick, trying for a (temporal) local optimum. A hypothetical example In order to illustrate a use of the system described above, we offer for consideration a piece of a larger application consisting of four procedures-DISKIO, DSUPDATE, BUFFGEN, and ATTNWAIT-together with a MAINLINE representing the rest of the application. DISKIO is a routine which handles I/O to bulk storage on the mainframe; DSUPDATE is in charge of modifying and updating the graphic data structure; BUFFGEN generates a display file from the data structure; and ATTNWAIT processes attentions from the graphic input devices. While the programmer is writing these routines, he disregards the eventual destinations of these routines, and programs as if the system were to appear as:*'* MAINFRAME SATELLITE MAINLINE + ... BUPFGEN DISKIO t DSUPDATE t ATTNWAIT Figure 3 However this disregard while implementing does not mean that the programmer is unaware of the fact that he is indeed working with two processors, and that at some point certain processor/task assignments are going to be made. While he is reasonably certain that DISKIO is going to run in the mainframe and ATTNWAIT in the satellite, he is less sure about BUFFGEN and DSUPDATE and therefore declares these routines ICPABLE. He then (in effect) tells the compiler, "Compile DISKIO for the mainframe, ATTNW AIT for the satellite, and DSUPDATE and BUFFGEN for both processors." When the system is ready for testing, he invokes the highest level of the run-time environment and requests •• Arrows represent call;;; system routilles have beeIl omitt.ed for clarity. Intelligent Satellites for Interactive Graphics his program be run with a trial allocation of tasks as follows: MAINFRAME SATELLITE 237 make the best of a bad situation, the programmer moves the DSUPDATE routine into the satellite: MAINFRMtI-E SATELLITE MAINLINE • BUFFGEN BUFFGEN ATTNWAIT ~ DISKIO i DISKIO t DSUPDATE ATTNWAIT i- DSUPDATE Figure 4 After running with this allocation for a while, the programmer reenters the "meta-system" level to see what kind of statistics have been gathered during the session. The statistics indicate that during those times when some demand was being made on the total system (i.e., periods during which the total application was not quiescent while waiting for a user action), the satellite processor was relatively idle. The programmer therefore decides to try a reallocation of tasks by moving the BUFFGEN routine into that satellite, resulting in: MAINFRAME I, SATELLITE MAINLINE BUFFGEN i DISKIO i ATTNWAIT Figure 6 While the satellite may now be overloaded (e.g., overlays may be needed), response time is still improved since less work is being done in the currently "crunched" mainframe. Hopefully this example gives some idea of the power we envision for the full Iep system. 
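A minimal sketch of the reallocation decision just illustrated is given below; the utilization figures and the one-module-at-a-time policy are invented for illustration and are not part of the proposed system.

# Given rough utilization figures gathered by the meta-system, nudge one
# ICPABLE module toward the less busy processor ("one joystick nudge").

def rebalance(assignment, movable_modules, util_mainframe, util_satellite):
    """Return a new module-to-processor assignment after one nudge."""
    new_assignment = dict(assignment)
    if util_mainframe > util_satellite:
        source, target = "mainframe", "satellite"
    else:
        source, target = "satellite", "mainframe"
    for module in movable_modules:             # move the first movable module found
        if new_assignment[module] == source:
            new_assignment[module] = target
            break
    return new_assignment

allocation = {"DISKIO": "mainframe", "ATTNWAIT": "satellite",
              "BUFFGEN": "mainframe", "DSUPDATE": "mainframe"}
allocation = rebalance(allocation, ["BUFFGEN", "DSUPDATE"], 0.9, 0.2)
print(allocation)    # BUFFGEN has moved to the satellite, as in the example above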
Obviously this description has glossed over many of the difficulties and implementation probiems which we will encounter. In particular, the problem of data base allocation and the transporting of pieces of a data structure between the two processors with different machine defined word sizes presents a formidable problem. Since we do not want to "inflict" a built-in data structure facility on the user it may become necessary to require of him more than minimal cognizance of the two processors for problems involving data structure segmentation. CONCLUSIONS Figure 6 This configuration improves the utilization of the satellite, and the programmer returns to the application environment for further experimentation. After a while, response time degrades drastically. On investigation it is found that two systems and four PL/ I compiles are currently running in the mainframe, To Decentralization and distribution of computing power is coming of age, and is expected to be the standard of the future, with perhaps several levels and layers of hosts and satellites, embedded in a network. Special-purpose-application dedicated intelligent terminals are already proliferating because they are cost effective and their hardware / firmware / software design is straightforward. Intelligent satellite computing, on the other hand, is still in its infancy, especially in its full generality where there is a genuine and changeable division of labor between host and satellite. Few design rules of thumb exist beyond the "buy-Iow/sell-high" variety (for example, interactions on the satellite, number-crunching and data base manage- 238 National Computer Conference, 1973 ment on the host). Even fewer tools exist for studying varying solutions, and their implementation is a far from trivial task. We hope to be able to report some concrete results of our investigation into this important problem area in the near future. REFERENCES AND SELECTED BIBLIOGRAPHY 1. ADAGE Corp., Technical Description of ADAGE Interactive Graphics System/370, News release, March 1972. 2. Anagostopoulos, P., Sockut, G., Stabler, G. M., van Dam, A., Michel, M., "Computer Architecture and Instruction Set Design, NCCE, June 4,1973. 3. The Design Assistant, Applicon Corp., News Release, 1972. 4. Bairstow, J. N., "The Terminal that Thinks for Itself," Computer Decisions, January 1973, 10-13. 5. Bergeron, D., Gannon, J., Shecter, D., Tompa, F., van Dam, A., "Systems Programming Languages," Advances in Computers, 12, Academic Press, October 1972. 6. Bryden, J. E., "Visual Displays for Computers," Computer Design, October 1971, pp. 55-79. 7. Christensen, C., Pinson, E. N., "Multi-Function Graphics for a Large Computer System," FJCC 31,1967, pp. 697-712. 8. Cotton, I. W., Greatorex, F. S., Jr., "Data Structures and Techniques for Remote Computer Graphics," FJCC 33, 1968, pp. 533544. 9. Cotton, I. W., "Languages for Graphic Attention-Handling," International Symposium on Computer Graphics, Brunei, April 1970. 10. Dijkstra, E. W., "The Structure of THE Multiprogramming System," CACM, 11, 1968, pp. 341-346. 11. Foley, J. D., "An Approach to the Optimum Design of Computer Architecture," CACM, 14, 1971, pp. 380-390. 12. Foster, C., "A View of Computer Architecture," CACM, 15, 1972, pp. 557 -565. 13. Hagan, T. G., Nixon, R. J., Schaefer, L. J., "The Adage Graphics Terminal," FJCC, 33, 1968, pp. 747-755. 14. Hobbs, L. C., "The Rationale for Smart Terminals," Computer, November-December 1971, pp. 33·35. 15. Hobbs. L. C., "Terminals," Proc. IEEE, 60, 1972. pp. 
1273-1284. 16. Kilgour, A. C., "The Evolution of a Graphics System for Linked Computers," Software-Practice and Experience, 1, 1971, pp. 259268. 17. Knudson, D., and Vezza, A., "Remote Computer Display Terminals," Computer Handling of Graphical Information, Seminar, Society of Photographic Scientists and Engineers, Washington, D.C., 1970, pp. 249-268. 18. Machover, C., "Computer Graphics Terminals-A Backward Look," SJCC, 1972, pp. 439-446. 19. Machover C., "The Intelligent Terminal, Pertinent Concepts in Computer Graphics," Proc. of the Second University of Illinois Conference on Computer Graphics, M. Faiman and J. Nievergelt (eds.), University of Illinois Press, Urbana, 1969, pp. 179-199. 20. Myer, T. H., Sutherland, I. E., "On the Design of Display Processors, CACM 11, 1968, p. 410. 21. Ninke, W. H., "A Satellite Display Console System for a MultiAccess Central Computer," Proc. IFIPS, 1968, E65-E71. 22. Rapkin, M. D., Abu-Gheida, O. M., "Stand-Alone/ Remote Graphic System," FJCC, 33, 1968, pp. 731-746. 23. Rosenthal, C. W., "Increasing Capabilities in Interactive Computer Graphics Terminals," Computer, November-December 1972, pp.48-53. 24. Ross, D. T., Stotz, R. H., Thornhill, D. E., Lang, C. A., "The Design and Programming of a Display Interface System Integrating Multi-Access and Satellite Computers," Proc. ACM;SHARE 4th Annual Design Workshop, June 1967, Los Angeles. 25. Ryden, K. H., Newton, C. M., "Graphics Software for Remote Terminals and Their Use in Radiation Treatment Planning," SJCC, 1972, pp. 1145-1156. 26. Schiller, W. L., Abraham, R. L., Fox, R. M., van Dam., A., "A Microprogrammed Intelligent Graphics Terminal," IEEE Trans. Computers C-20, July 1971, pp. 775-782. 27. Stabler, G. M., A 2250 Modell Simulation Support Package, IBM PID No. 1130-03.4.014, 1969. 28. Stabler, G. M., The Brown University Graphics System, Brown University, Providence, R. I., February, 1973. 29. van Dam, A., "Microprogramming for Computer Graphics," Annual SEAS Conference, Pisa, Italy, September 1971. 30. Webber, H., "The Super Integral Microprogrammed Arithmetic Logic Expediter (SIMALE)," SIGMICRO, 6, 4, 1972. 31. Woodsford, P. A., "The Design and Implementation of the GINO 3D Graphics Software Package." Software-Practice and Experience, 1. 1972,pp.335-365. Fourth generation data management systems by KEVIN M. WHITNEY General Motors Research Laboratories Warren, Michigan I~TRODUCTION AND HISTORY second generation of data management facilities became available to speed the job of generating reports and writing new application packages. These report generators, file maihtenance packages, and inquiry systems provided easy-to-use facilities for selecting, summarizing, sorting, editing, and reporting data from files with a variety of formats. A user language simple enough for non-programmers to use was sometimes provided, although using more advanced system facilities often required some programming aptitude. This second generation of data management systems brought non-procedural user languages and a greater degree of program independence from data formats and types. Much more general capabilities were incorporated into third generation systems such as IDS, IMS, and the CODASYL report specifications. l All of these systems emphasize the effective management of large amounts of data rather than the manipulation or retrieval of data items. All have facilities to manage data organized in much more complicated data structures than the sequential structures used by earlier systems. 
Relationship's among data items in many different files are expressible and manipulable in the data management systems. Provisions for the recovery from many different types of system errors, head crashes, erroneous updates, etc. are integral parts of these systems. Audit trails of modifications of the data base are often automatically maintained, and new file descriptions are handled by standard system facilities. In general, many of the functions which previous generations of data management systems left to the operating system or to user application programs are automatically incorporated in these systems. Many of the third generation systems provide a convenient user inquiry language, a well-defined interface for user application programs, and new language facilities to aid the data administrator in the description and maintenance of the data base. A fourth generation data management system will continue the development of all these trends toward generality, flexibility, and modularity. Other improvements will result from theories and concepts now being tried in experimental systems. Much greater degrees of data independence (from user program changes, from data description and storage changes, from new relationships among the data) will be common; user languages will Many hundreds of programming systems have been developed in recent years to aid programmers in the management of large amounts of data. Some trends in the development of these data management systems are followed in this paper and combined with ideas now being studied to predict the architecture of the next generation of data management systems. The evol ution of data management facilities can be grouped into several generations with fuzzy boundaries. Generation zero was the era when each programmer wrote all his own input, output, and data manipulation facilities. A new generation of facilities occurred with the use of standard access methods and standard input/ output conversion routines for all programs at an installation or on a particular computer system. The second generation of data management was the development of file manipulation and report writing systems such as RPG, EASYTRIEVE, and MARK IV. Much more comprehensive facilities for the creation, updating, and accessing of large structures of files are included in the third generation of generalized data management systems such as IMS/2, IDS, and the CODASYL specifications. Each of these generations of data management systems marked great increases in system flexibility, generality, modularity, and usability. Before speculating on the future of data management, let us survey this history in more detail. Standard access methods, such as ISAM, BDAM and SAM which formed the first generation data management facilities were mainly incorporated in programming languages and merely gathered some of the commonly performed data manipulation facilities into standard packages for more convenient use, The main benefit of these standard facilities was to relieve each application programmer of the burden of recoding common tasks. This standardization also reduced program dependency on actual stored data structures and formats. Although these access methods and input/ output conversion routines were often convenient to use, they could only accommodate a limited number of different data structures and data types. 
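The contrast the survey draws between these generations can be illustrated with a small sketch; the widget records and the toy select helper below are invented, and stand only for the difference between record-at-a-time coding and a descriptive request.

records = [{"PART": "A-10", "COST": 80}, {"PART": "B-20", "COST": 120}]

# first generation: the application program itself walks the file
cheap = []
for record in records:
    if record["COST"] < 100:
        cheap.append(record)

# later generations: the programmer states the condition; the system does the walking
def select(table, field, predicate):
    """A toy stand-in for a non-procedural selection facility."""
    return [row for row in table if predicate(row[field])]

assert cheap == select(records, "COST", lambda cost: cost < 100)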
As computer systems became more common, more widely used, more highly depended on, and eventually more essential to the functioning of many businesses, a 239 240 National Computer Conference, 1973 APPLI CATl ON PR(XJRAM USER SERVICES APPLI CATION QUERY TELEPROCESS I f'XJ PRCNRAM LANGUAGE MJNITOR DATA DESCRIPTION AND MANIPULATION DBA FACILITIES SERVICES DATA ACCESS AND CONTROL FACILITIES Figure I-Information management system structure. become much less procedural; and data manipulation facilities for use in writing application programs will become simpler and more powerful. Concepts from set theory and relation theory will become more widely used as the advantages of a sound theoretical basis for information systems become more widely appreciated. Increasingly, information management systems will make more of the optimization decisions relating to file organization and compromises between different user requirements. The trend to bending the computer toward user requirements rather than bending the user to the requirements of the computer will continue resulting in progressively easier to use systems. One organization of the facilities of an information management system may be the division into modules shown in Figure 1. This modularity may represent software or hardware (or both) boundaries and interfaces. A data access and control module is needed to manage data flow to and from the storage media. This data management module is used by the data description and manipulation module in providing data description and manipulation at a level less dependent on the storage structure of this data. The user can access data through application processors or through a system provided query language processor. A variety of system services are grouped into the user services module and the data base administrator services module. User services include such conveniences as on-line manuals, help and explain commands, and command audit trails. Data base administrator services include facilities to load and dump data bases, to perform restructuring of data and storage organizations, to monitoring performance, and to control checkpointing and recovery from errors. Impacting the fourth generation information management systems are the theories and methodologies of data description and manipulation, the relational view of information, the establishment of sound theoretical foundations for information systems, and the development of networks of cooperating processors. Each of these topics will be discussed in one of the following sections. Following those sections is a description of our experiences with RDMS, a relational data management system which illustrates some of these new theories and methodologies. DATA DESCRIPTION AND MANIPULATION Certainly the most complex and difficult problem facing the designers, implementers, and users of an information management system is the selection of language facilities for the description and manipulation of data. Although many attempts have been made to separate data description from data manipulation, it must be noted that data description and data manipulation are inextricably intertwined. While the declarative statements which describe a data base may indeed be kept separate from the statements in application programs which manipulate the data in the data base, nevertheless the data description facilities available determine and are determined by the data manipulation facilities available. 
Descriptive statements for vectors aren't very useful without vector manipulation statements, and vice versa. The description of data may be done at a wide variety of level~ of generality ranging from general statements about the relationships between large sets of data items to explicit details about the actual storage of the data items. Information management systems of the next generation will have data description facilities at a variety of levels to serve different classes of users. At least three main levels of data description can be distinguished, the information structure for the users, the data structure for the data base administrators, and the storage structure for the system implementers and the system. The information structure which determines the user's view of the data is (ideally) quite abstract, indicating the relationships among various types of data items, but omitting details such as data item types, precisions, field lengths, encodings, or storage locations. The user should also be free of system performance considerations such as indexing schemes or file organizations for efficient retrieval of the data items, particularly since his access requirements may conflict with those of other users. In any event the system or the data administrator is in a better position than anyone user to make those arbitrations. An example of the information structure for some data described by a relational model is shown in Figure 2. A second level of data description is necessary to specify the structure (logical) of the data which will facilitate the efficient retrieval of the data representing the information in the data base. This second level of description is the data structure and represents additional informa- PEOPLE (NAME. AGE. HEIGHT. WEIGHT) S H I RT S (S H I RT #, COL 0 R, S I Z E, COL L AR ) S LAC K S (S LAC K S #, COL 0 R, S I Z E, F AB RIC) OWN S - S H I RT S (N AME, S H I RT # ) oVI NS - S LAC KS (N A['1 E, S LAC KS # ) Figure 2-An information structure specified by a relational model Fourth Generation Data Management Systems tion about the patterns of retrieval expected for information in the data base. The set occurrence selection facility of the CODASYL proposal is an example of this level of data description. CODASYL schema data description statements for the information structure of Figure 2 are shown in Figure 3. A still more detailed level of data description is the storage structure of the data which represents the actual organization of the data as stored in the system's storage media. At the storage structure level, items of concern include field types, lengths, encodings, relationship pointers and index organizations. Figure 4 shows a diagram of pointers and storage blocks specifying a storage structure for the data described in Figure 3. Data manipulation facilities must also exist at a variety of levels corresponding to the data description levels. As data description becom€s more general, the corresponding data manipulation becomes less procedural and more descriptive. E. F. Codd's relational model of datal described in the next section provides an example of a highly non-procedural data manipulation language. PEOPLE INFORMATIO~ Information management systems deal fundamentally with things such as people, ages, weights, colors, and sizes which are represented in a computer by integers, floating point numbers, and character strings. A collection of items of similar type is called a domain. 
Domains may overlap, as for example with the domains of people and of children. SET IS PEOPLE; M~ER IS SYSID1. M:MBER IS PERSON DUPLICAlES ooT AUJ)h'8) FOR NJVv1E. RECORD IS PERSON. 1YPE CHARAClER 12. NA'1E lYPE FIXill; CHECK R.AnJE 5, SO. P6E HEIGHT 1YPE FlXEJJ. WEIGHT 1m: FlXill. #SHIRTS lYPE FlXill. #SLACKS 1YPE FlXill. OCCURRS #SHI RTS. SHIRTS 2 COLOR TYPE BIT 4: 8~CODI~ crABLE. 2 SIZE PICTURE 99. 2 COLLAR PICTURE 93. SLACKS OCOJRRS #SLACKS. 2 COLOR 1YPE BIT 4: ENCODIf\G crABLE. 2 SIZE PICTURE 93. 2 FABRIC PICIURE 99. Figure 3-A data structure specified in the CODASYL data description language .... .... CHAIN NAME AGE I B EBCDIC HEIGHT B ... COLOR ci SIZE I I I B B COLLAR B BI COLLAR BI #SLACKS 0 I COLOR C SIZE I I B • • • COLOR C SIZE ~r WEIGHT #SHIRTS • • • COLOR C SIZE A RELATIONAL MODEL FOR 241 B ,!FABRIC . B B BIFABRIC BI Figure 4-A storage structure specified by a pointer diagram A relationship is an association among one or more not necessarily distinct domains. "Is married to" is a relationship associating the domains of men and of women. An occurrence of a relationship is an ordered collection of items, one from each domain of the relationship, which satisfy the relationship. Each occurrence of a relationship associating N domains is an ordered collection (John and Mary is an occurrence of the relationship "is married to" if John is married to Mary). A relation is a set of some tuples satisfying a relationship. The number of domains of a relation is called its degree and the number of tuples of a relation is called its cardinality. This simple structure is adequate to represent a great variety of information structures. Certain data manipulation facilities arise naturally from the relational description of information. The basic operations are the traditional operations of set theory (union, intersection, difference, etc.) and some new operations on sets of relations (projection, join, selection, etc.). Projection reduces a relation to a subset of its domains, while join creates a relation by combining two component relations which have one or more common domains. Selection extracts the subset of the tuples of a relation which satisfy some restriction on the values of items in a particular domain. In combination with the usual control and editing commands, these operations provide a convenient and nonprocedural data manipulation language. l\.iote that nothing need be known about the data or storage structures of the information represented by a relational model. The user may concern himself entirely with the domains of interesting objects and their relationships. 242 National Computer Conference, 1973 THEORETICAL FOUNDATIONS FOR INFORMATION SYSTEMS As information systems become increasingly complicated, it will be more important to base their designs on sound theoretical foundations. Although it has always been customary to test a system as comprehensively as possible, larger systems can never completely be tested. Thus it is important to find other methods of verifying that a system will function as specified. Current work in proving program correctness has this same aim with regard to programs. In this section three contributions to the theoretical foundations of information system structure will illustrate some possible effects of new theory on system design. D. L. Childs 3 has devised an ordering for set structures useful in the implementation of set operations. E. F. Codd 4 has developed the relational calculus as a sound basis for user languages, and W. T. 
Hardgrave 6 has proposed a method for eliminating ambiguous responses to queries of hierarchical data structures. Childs proposed a general set theoretic data structure based on a recursive definition of sets and relations. His theory guarantees that it is always possible to assign unique key values to any set element in such a structure. These key values may be used to create an ordered representation of the data structure. Set operations are much more efficient on ordered sets than on unordered sets. Thus the theory leads to efficient implementations of complex structures. In a series of papers, E. F. Codd has developed the relational model of data which was explained in the previous section. One important theoretical result of this theory is a proof that any relational information in a data base of relations can be retrieved by a sequence of the basic relational and set operations defined in the previous section of this paper. Furthermore, it is possible to estimate the system resources necessary to answer the query without answering it. Thus a user can be warned if he asks a particularly difficult query. Although not all queries on a data base of relations can be answered as relations, the inclusion of functions on relations (counts, sums, averages, etc.) guarantee a very wide range of legal queries. This theory assures a system design that a general purpose system will not suddenly fail when someone asks an unexpected new query. Research into storage structures underlying the relational data model by Date & HopewelP show that a variety of possible anomalies in the storage, updating, and retrieval of information do not arise when the information is stored in Codd's canonical third normal form. Thus by using a relational storage structure for data, certain types of consistency are automatically assured. These studies show also a variety of efficient methods of implementing a relational data model. A third investigation into the fundamentals of information systems design is W. T. Hardgrave's study of information retrieval from tree structured data bases with Boolean logical query languages. Hardgrave analyzed anomalies resulting from the use of the "not" qualifier in boolean queries. Finding unavoidable problems with the usual set theoretic retrieval methods, he formulated additional tree operations which separate the selection of data items from the qualification of items for presentation. These capabilities may be considered a generalization of the HAS clause of the TDMS system. This study not only focuses attention on possible multiple interpretations for some Boolean requests applied to items at different levels of a tree hierarchy, but also presents more severe warnings about possible problems with the interpretation of network structured data such as in the CODASYL proposal. NETWORKS OF COOPERATING PROCESSORS Continuing decreases in the cost of data processing and storage equipment and increases in the cost of data manipulation software will bring further changes in the architecture of information management systems. The example of Figure 5 shows the natural trend from soft- SOFTWARE MODULARITY APPLICATION PROGRAM T CAi1 I SAM SSP TERMINALS HARDWARE MODULARITY APPLICATION PROCESSOR Ca-t\UNICATION PROCESSOR DATA BASE PROCESSOR ARRAY PROCESSOR J\. T E Ri1 I NAL S Figure 5-The trend from software modularity toward hardware modularity Fourth Generation Data Management Systems ware modularity of third generation systems to hardware modularity of fourth generation systems. 
This trend toward hardware modularity has been under way for many years. Control Data Corporation has advocated the use of independent peripheral processors to handle some of the more mundane functions of computing systems such as input/ output spooling, paging, disk and drum interfaces, etc. The use of front end processors such as the IBM 3705 to handle communications functions independently of the central processor is already common. IBM uses the Integrated Storage Controller, a small independent processor, to control 3330 disk drives. Special purpose computer systems for Fourier transforms and for matrix manipulation are being used as peripheral or attached processors in current computing systems. Two specific examples of independent processors in data management systems are intelligent remote inquiry terminals and the data base computer. While intelligent remote terminals are already common, independent data base computers are still in research laboratories. These data base machines consist of an independent control processor and a large storage media. The processor not only manages the control commands necessary to handle the storage media, but also can perform logical extraction or selection of data records to reduce the amount of data transmitted to the host processor. As more experience with independent data base computers is accumulated, they will assume additional tasks such as selecting data compaction and compression methods for optimal data storage and selecting indexing methods for optimal data access. RDMS, A RELATIONAL DATA MANAGEMENT SYSTEM To gain some experience with the advantages and limitations of data management using a relational model of information, the RDMS system was designed and implemented in PL/ I on a large virtual memory computer system. The system consists of three main sections, a command (query and data manipulation) language interpreter, set and relational data manipulation facilities, and an interface to the operating system. This modular design resulted in a readily portable group of set and relation manipulation facilities with rigidly defined interfaces to the user programs and to the operating system. Sets are used internally by the system in self-describing data structure which catalogs each user's sets and their characteristics. Because one of the main benefits of a relational model of information is its simplicity, care was taken to keep the data description and manipulation languages simple. All data managed by RDMS is organized in relation sets viewed by the user only as named collections of named domains. The user need not concern himself with the type of a domain, nor its precision, nor its storage representations. All data manipulation facilities store their output in sets which may be manipulated by other RDMS com- 243 mands. The command format is simple and consistent containing an output set name (if the command produces an output set), a keyword identifying the command, and the parameters of that command. These parameters may be set names, domain names, domain values, and character strings to label output displays. For example, a command which extracts a subset of a set is: "CHEAP_ WIDGET _ SET = SUBSET OF WIDGET _ SET WHERE COST LT 100.00". Commands are typed on a keyboard, and the system responds with a full screen graphic display. 
RDMS commands may be grouped in four main classes: Set control commands which manipulate entire sets (create, destroy, save, catalog, uncatalog, universe, etc.); Display commands which display or print the contents of sets (list, graph, histogram, print, etc.); Set manipulation commands which specify, modify, analyze, select, and combine the contents of sets (union, intersection, subset, join, statistics, summary, set, domains, etc.); and Special purpose commands which provide miscellaneous facilities for system maintenance, bulk input and output, and assorted user conveniences (explain, command list and trace, read from, describe, etc.). Several small data analysis and manipulation applications were tried using RDMS to test its adequacy for flexible data manipulation. A medical records analysis was particularly informative because the problems to be solved were specified by persons with neither programming nor data base experience. We were given thirty data items of eight different data types for each of several hundred mother-child pairs and asked to find any effects of a particular medication. The large amount of information displayed on the graphic console and the power of individual commands were demonstrated by answering an initial set of 41 questions with only 35 commands. Some of the more pleasing features of the system are the following. Combining individual commands into very complex requests is greatly facilitated by maintaining all data in sets used both for inputs and outputs of commands. Histograms and graphs of data from sets may either be displayed on the graphic terminal or printed on the printer. A permanent record of all commands, any screen display, and the contents of any set can be printed. Mistakes can often be undone by the REMEMBER and FORGET commands which provide an explicit checkpointing facility for each set. The main complaints from users were the paucity of data types available, the difficulty of editing erroneous or missing data items, and the inability to distinguish missing data by a special null code. We feel the RDMS system was a successful and useful experiment from the viewpoints of both the system user and the system implementer. Sets of relations manipulated by powerful descriptive commands are usable in real information systelns. Notl-programmer~ can readily adapt to the relational model of data and the corresponding types of data manipulation commands. For the system implementer, the relational data structure provides a 244 National Computer Conference, 1973 simple and consistent collection of data description and manipulation facilities with which to build a variety of information systems. SUMMARY Data management has evolved through at least three distinct stages and is entering a fourth, from access methods, to file management systems, to data management systems, and toward information systems. In this evolution, the user's facilities for dealing with large amounts of information have become more general, flexible, extensible, and modular. Gradually he has been unburdened of various tedious levels of detail allowing him to focus his attention more directly on the relationships among various types of information and their manipulation. These trends will continue, spurred on by advances in information system design theory and in computing system hardware and software. ACKNOWLEDGMENTS The author is indebted to E. F. Codd, C. P. Date, G. G. Dodd, and A. Metaxides for many long hours of discussion on the subjects in this paper. REFERENCES 1. 
CODASYL Data Base Task Group April 1971 Report, Available from ACM. 2. Codd, E. F., "A Relational Model of Data for Large Shared Data Bases," Communications of the ACM, Vol. 13, No.6, June 1970. 3. Childs, D. L., "Feasibility of a Set-Theoretic Data Structure," Proceedings of IFIP, 1968. 4. Codd, E. F., "Relational Completeness of Data Base Sublanguage," Proceedings of Courant Institute Symposium on Data Base Systems, 1971. 5. Dayl, C. J., Hopewell, P., "Storage Structure and Physical Data Independence," Proceedings of ACM SIGFIDET Workshop, 1971. 6. Hardgrave, W. T., "BOLTS - A Retrieval Language for Tree Structured Data Base Systems," Proceedings of the Fourth International Conference on Information Systems (COINS-72), December 1972. 7. Whitney, V. K. M., "A Relational Data Management System (RDMSl," Proceedings of the Fourth International Conference on Information Systems (COIN8-7~), December 1972. Representation of sets on mass storage devices for information retrieval systems by STUART T. BYROM University of Tennessee Knoxville, Tennessee and by WALTER T. HARDGRAVE CERN-European Organization for Nuclear Research Switzerland Boolean operations on these representations along with several examples. IXTRODUCTION Information retrieval systems utilizing Boolean operators to manipulate subsets of the data base require an efficient method of set representation. Commands in the user language permit the selection of subsets by retrieving occurrences in the data base that obey user specified conditions. Compound conditions may then be obtained by performing the traditional Boolean operators AXD, OR, and XOT to selected subsets. Entries in the data base may be assigned identification numbers in order that the representations of subsets mav be in the form of a set of positive identification numbers. Th~s, the problem of manipulating large sets of occurrences reduces to one of efficiently representing subsets of positive integers. For example, information stored in nodes structured as data trees may be retrieved via a qualification clause which selects nodes satisfying an attribute-relation-value (a-r-v) triple. The a-r-v triple represents a subset of the universe of all nodes in the tree. If the nodes in the data tree are assigned cOILSecutive positive integer numbers, then a set of nodes may be represented by a set of identification numbers. The number of nodes in the universe \\ill be assumed to be no more than 231 , or approximately two billion, although the assumption will also be made that anyone subset ,vill be small with respect to the universe. Ho'wever, this assumption will not hold for the complement of a set, that is, the result of a Boolean KOT. Because sets may be quite large, it is necessary to store them as efficiently as possible. One method of storing sets of positive integers in order to conserve storage space is the Bit Stream Set Representation (BSSR) which assigns a single binary digit (bit) in storage to each node in the universe. Definition of a bit stream set representation Let P be the set of positive integers. Let S be a subset of the set of contiguous integers 1 through k, and let R be a Bit Stream Set Representation (BSSR) of S, defined as follO\vs: such that biER, iEP for l::;i::;k. (1) and bt=O forall iEES. if R is in complement form (to be defined later) , then bo= 1, other'\\>ise bo=O. (2) (3) Thus the BSSR is a one-to-one mapping from the set of k+ 1 bits. The inclusion of the integer i in the set S is represented by the ith bit having the value "1". 
Likewise, the integer i not in the set S is represented by its corresponding ith bit having the value "0". Each subset of P has a unique bit stream representation. Furthermore, every subset of the integers 1 through k may be represented by a bit stream of length k + 1. For example, the set {2, 5,100000, 2000000000} is a subset of the integers 1 through 231-1 (i.e., 2,137,483,647) and may be represented by the BSSR integ~rs 1 through k to the ordered (0010010 ... 010 ... 010 ... 0) such that except where BIT STREA:'vI SET REPRESENTATION ~= bf> = bl()()oOO = b:!ooooooooo = 1. In this manner, any subset of the integers 1 through 231_1 may be represented by a BSSR of length 231. The definitions involving the Bit Stream Set Representation will be followed by the algorithms for performing the 245 246 National Computer Conference, 1973 Bit stream set representation w£th directories where When a set is represented in the bit stream form, a substream of zero bits may be eliminated in order to save space with no loss of information. By dividing the BSSR into equal length substreams called blocks, it is possible to omit those which contain all zero bits. However, in order to maintain the contiguous numbering scheme of the bits, it is necessary to indicate any omissions with a directory. otherwise dl,o = d l ,48 = d1,976562 = 1 d1,j=0 for 0~j~220_1. By prefixing the directory to the three non-zero blocks of the BSSR of S, the following unambiguous representation results: (lD ... OlD ... OlD ... 0 00100lD ... 0 0 ... 010 ... 0 0 ... 0lD ... 0) Definition of a block Let R be the BSSR of the set S as previously defined. Let R be divided into kim distinct substreams B o,; of length m. Then each B o,; is a block of R and is defined as follows: such that Bo,j~R for The recursive property of directorip,s O~j~klm-1. In addition, the definition implies the following: Bo,on BO,ln ... n B k / m - 1 = ( ) (1) Ro,o UBO,I U ... UB k / m- 1 = R. (2) Finally, to indicate a block of all zero bits, the notation Bo,j=O implies for all biE B j that bi =0 for mj~i~m( j+1)-1. In the previous example, the BSSR of the set S may be divided into 220 blocks of length m = 211 bits as follows: Bo,o= BO,I= (0010010 ... 0) (0... 0) B O,4S= (0 ... 010 ... 0) BO.9766rtl = The length of the resulting BSSR is 220 +3(2 11 ) =1,054,720 which is substantially less than the BSSR without a directory 231. (0 ... 010 ... 0) As described above, the directory Dl indicates which blocks of the BSSR are omitted in order to conserve storage space. However, the length of DI (possibly very large itself) may also contain relatively long sub streams of zero bits. Thus, by dividing this directory into blocks of length n, a higher level directory D2 may be constructed to indicate omissions in D I • In fact, the idea of a "directory to a directory" may be applied recursively as many times as desired to eliminate all blocks containing all zero bits. Definition of a directory block A directory block is defined in the same manner as is a block of R (i.e., Bo,j). That is, first let the level L directory DL be divided into substreams of length nL; then let BL,i be a directory block of D L defined as follows: 0) B O,I048576= (0 ... such that such that BL,jr;;;.D L otherwise for L~ 1 and Definition of a directory Let R be a BSSR of the set S and let Bo,o, BO,I, .. , , BO,k/m-1 be the blocks of R as defined above. 
Then the bit stream directory DI may be defined as fo11O\\'s: DI = (dl,o, dI,I, ' , , , dI,k/m-I) such that Defim'tion of higher level directories The directory DI is defined to be the lowest level directory (L = 1) to R in a BSSR. Using the above definition of directory blocks, a higher level directory DL may be defined as follows: if Bo,j~O, if Bo,j=O, then then d1 ,j= 1 dl,j=O for O~j~klm-1. In the example, the directory for the BSSR of S is D 1 = (10 ... 010 ... 010 ... U) such that if BL_l,j~O, if B L -1,J=O. then dL,i = 1 then dL.j=O for O~j~maxL' Representation of Sets on Mass Storage Devices 247 The level 1 directory Dl of the example BSSR may be divided into blocks of length nl = 211 as follows: (from the universe 1 through 231 _1), each pair of bits in the BSSR's Bl,o= (10 ... 010 ... 0) Bl,l = ( o. . . 0) BSSRI = (0010010 ... 010 ... 010 ... 0) BSSR2 = (0110000 ... 010 . . . 0) B l ,428 = ( o ... 010 ... 0) B l ,511= o ... ( 0) where all blocks except Bl,O and B l ,428 contain all zero bits. Thus, the level 2 directory for the BSSR is D 2 = (10 ... 010 ... 0) such that and othenvise 1 then subtract "1" from L and go to Step 2. Step 5: If neither BSSR is in complement form, go to Step 6; othenvise, replace the level zero blocks of the BSSR in complement form with their complement (Boolean NOT). Then, for each dl,j* = 1 in D1 * such that dl,j=O in D1 of that BSSR, insert the block B o,j=l (which is a stream of all ones) in the proper relative position in Do. Step 6: The level zero blocks of BSSR* are equal to the Boolean AND of the level zero blocks of the two BSSR's. Algorithm for the Boolean OR If both BSSR's are in complement form, then use the AND algorithm to obtain the BSSR* in complement form since by theorem ,.......,R1 OR ,.......,R2 is equivalent to ,.......,(R1 AND R2)' (Again note that both of these BSSR's should be treated in the AND algorithm as if each were not in complement form.) Step 1: Let L be equal to the number of the highest directory level of the two BSSR's. Step 2: The level L blocks of BSSR * are equal to the Boolean OR of each corresponding pair of bits of the level L blocks of the two BSSR's. Step 3: For each dL,;*= 1 in DL* such that dL.j=O in a BSSR, insert the block B L-l,j = 0 in the proper relative position in D L-1 in that BSSR. Step 4: If L> 1, then subtract" 1" from L and go to Step 2. Step 5: If neither BSSR is in complement form, go to Step 6; otherwise, replace the level zero blocks of the BSSR not in complement form with their complement (Boolean NOT). Finally, perform the Boolean AND on the level zero blocks of the two BSSR's to obtain the resultant level zero blocks of BSSR * in complement form. (Terminate procedure). Step 6: The level zero blocks of BSSR * are equal to the Boolean OR of the level zero blocks of the two BSSR's. 0010010 ... 0 0 ... 010 ... 0 0 i .. 010 ... 0) '------v----" "----y-----I '----y----.I Bo,o B O,48 B O,976562 BSSR2 : (10 ... 0 10 ... 010 ... 0 0110 ... 0 O ... 010 ... 0) '-y-----I '----y----.I '--v--" ' - - - - - y - - ' Bo,o Step 1: L=2 Step 2: Find Boolean AND of level 2 directories: D2 of BSSR1 = (10 ... 010 ... 0) D2 of BSSR2 = (10 . . . 0) D2* of BSSR* = (10 . . . 0) Step 3: Since lh,428* =0 in D2* and lh,428 = 1 in D2 of BSSR1, omit block B 1 ,428 from BSSR1• Next omit all blocks within the range of B 1 ,428 in BSSR1 ; i.e., block B O,976562. Thus, BSSR1 = (10 ... 010 ... 010 ... 00010010 ... 0 0 ... 010 ... 
0) '---v--'" ~ Bo,o B O,48 '---y-I ~ Step 4: L=2-1 =1 Step 2: Find Boolean AND of level 1 directories: D1 of BSSR1 = (10 ... 010 ... 0) D1 of BSSR2 = (10 ... 010 ... 0) D1* of BSSR*= (10 ... 010 ... 0) Step 3: (No occurrences) Step 4: L=l Step 5: (Neither is in complement form so continue to Step 6). Step 6: Find Boolean AND of level 0 blocks: R of BSSR1 = (0010010 ... 0 O ... 010 ... 0) R of BSSR 2 = (0110000 ... 0 O ... 010 ... 0) R of BSSR* = (0010000 ... 0 0 ... 010 ... 0) Since lh = blOOOOO = 1, the resulting BSSR * represents the set {2, 100000}. Example: 8 1 AND ,.......,82 BSSR1 is shown in the previous example; BSSR2 is as follows in complement form: BSSR2 = (10 ... 010 ... 010 ... 01110 ... 0 0 ... 010 ... 0) '-v--" Example Boolean operations ~ '---v--' '---v----' Bo,o The sets 8 1 and 8 2 and their complements will be used in the examples to follow ,,-here 8 1 ={2, 5, 100000, 2000000000} 8 2 = {l, 2, 100000}. Example: 8 1 AND 8 2 BSSR 1 : (10 ... 010 ... 0 10 ... 010 ... 0 O ... 010 ... 0 "----v---" " - - - - y - - - - - ' '-----y-----" BI.428 B O,48 B O•48 Step 1: L=2 Step 2: Since BSSR 2 is in complement form, then D2* of BSSR*=D2 of BSSR 1 = (10 ... 010 ... 0) Step 3: C~\o occurrences) Step 4: L=2-1=1 Step 2: Since BSSR 2 is in complement form, then D1 * of BSSH. * = D1 of BSSR 1 = (10 ... 010 ... 0 O ... 010 ... 0) '--v------" '----y----.I Representation of Sets on Mass Storage Devices Step 3: (K0 occurrences) Step 4: L=1 Step 5: Since BSSR2 is in complement form, then replace its level zero blocks ,yith their complement (reverse each binary value). R of BSSR2 = (0001 ... 1 1 ... 101 ... 1) Bo.o ~ B o.o 1) '---y-----' B O•48 ~~'---y-----' B O•48 BO.976562 Since bs = b2000000000 = 1, the resulting BSSR* represents the set {5, 2000000000} . Exa.mple: SI OR S2 The BSSR's for SI and S2 are shown in a previous example. Step 1: L=2 Step 2: Find Boolean OR of level 2 directories: BO.976562 The BSSR's for SI and rovS2 are shown in a previous example. of BSSR* = (10 ... 010 ... 0) Step 3: Since ~.428*= 1 and d2 •428 =0 in BSSR2, insert the block B 1 •428 into BSSR2• Dl of BSSR 2 = (10 ... 010 ... 0 o ... 0) '----y------' ~ Bl,O Step 4: L=2-1=1 Step 2: Find Boolean OR of level 1 directories: D2 of BSSR1 = (10 ... 010 ... 0) D2 of BSSR2 = (10. . . 0) D2* of BSSR*= (10 ... 010 ... 0) Step 3: Since d 2 •428*= 1 and d 2 •428 =0 in BSSR2, insert the block B 1 •428 =O into BSSR2• Dl of BSSR2 = (10 ... 010 ... 0 o. . . 0) '-v------' '-v----" B 1 •428 Step 4: L=2-1 = 1 Step 2: Find Boolean OR of level 1 directories: Dl of BSSR1 = (10 ... 010 ... 0 Dl of BSSR2 = (10 ... 010 ... 0 Dl* of BSSR*= (10 ... 010 ... 0 '-v------' B 1 •O D2 of BSSR1 = (10 ... 010 ... 0) D2 of BSSR2 = (10. . . 0) o ... 010 ... 0) o. . . 0) o ... 010 ... 0) "----y---/ B 1 •428 Step 3: Since dl.976562*=1 and dl.976562=0 in BSSR2, insert the block BO.976562 = 0 into BSSR2• R of BSSR2 = (1110 ... 0 0 ... 010 ... 0 0 ... 0) B 1 •428 Bo.o Dl of BSSR 1 = (10 ... 010 ... 0 0 ... 010 ... 0) Dl of BSSR2 = (10 ... 010 ... 0 o. . . 0) Dl* of BSSR*= (10 ... 010 ... 0 o ... 010 ... 0) '----y------' '------v--" Step 3: Since dl.976562 * = 1 and dl.976M2 = 0 in BSSR2, insert the block BO.976562 = 0 into BSSR2• R of BSSR2 = (0110 ... 0 0 ... 010 ... 0 0... 0) ~ '---y----/ Bo.o BO.976562 BO.48 Step 4: L= 1 Step 5: Both BSSR's are not in complement form so continue to Step 6. BO.48 BO.976562 Step 4: L=1 Step 5: Replace the level zero blocks of the BSSR not in complement form with their complement (reverse each binary value). 
R of BSSR1 = (1101101. .. 11. .. 101. .. 11. .. 101. .. 1) '--y-----I B 1 •428 '---y---' '---v----I BO.48 Step 1: L=2 Step 2: Find Boolean OR of level 2 directories: R of BSSR1 = (0010010 ... 00 ... 0lD ... 00 ... 010 ... 0) R of BSSR2 = (0001111. .. 11. .. 101. .. 11. . . 1) RofBSSR*= (0000010 ... 00... 00 ... 010 ... 0) D2* '---y----/ ~ BO.976562 Step 6: Find Boolean AND of level zero blocks: Bo.o RofBSSR*= (0110010 ... 00 ... 010 ... 00 ... 010 ... 0) 1. BO.976562 = R of BSSR2 = (0001. .. 11. .. 101. .. 11. . . '---y---' R of BSSR1 = (0010010 ... 00 ... 010 ... 00 ... 010 ... 0) R of BSSR2 = (0110000 ... 00 ... 010 ... 00... 0) Since b1 = b2 = bs = blOOOOO = b2000000000 = 1, the resulting BSSR* represents the set {I, 2, 5, 100000, 2000000000} . BO.48 BSSR2, then insert the block Step 6: Find Boolean OR of level zero blocks: Bo.o ~~ 249 B o.o '--v-------' '---v---" BO•48 BO.976562 X ext, find the Boolean AXD of the level zero blocks: R of BSSR 1 = (1101101. .. 11. .. 101. .. 11. .. 101. .. 1) 0) RofBSSR*= (1100000 ... 00... 00... 0) R of BSSR2 = (1110000 ... 00 ... 010 ... 00... Since b1 = 1 and bo = 1, then the resulting BSSR * is in the complement form representing the set "'" {I }. 250 National Computer Conference, 1973 Algorithm for the Boolean NOT In order to perform the Boolean ~OT on a BSSR in complement form (as previously defined), simply reverse the value of bo= 1 to bo= O. In general, the resultant BSSR *' of a BSSR not in complement form will be large for small subsets, possibly approaching the size of the universe itself. Therefore, the use of a BSSR in complement form as opposed to BSSR*, the result of a ~OT algorithm, should be significantly more efficient. Step 1: Let H be equal to the number of the highest level directory of the BSSR, and let L=O. (Note that R=Do) Step 2: Perform the Boolean NOT on each bit in the level L blocks by reversing its binary value (with the exception of bo in Bo.o). Step 3: For each BL.j'~O, change its corresponding L+1 directory bit to "0". Step 4: Insert all omitted blocks from DL with blocks of "all ones." Step 5: Omit all level L blocks where BL.j=O. Step 6: Add "I" to L. If L , where N is an abstract set and A~NXN. Each element in N is called a node and each pair (a, b) in A is termed an arc. The arc (a, b) is said to be directed from node b. A path is a sequence of two or more nodes (11}, 112,' • " n m ) with each connected to the next by an arc. That is, for each ni, 1 ~ i ~ m-1, (n i, ni+l) EA. A path is a loop if its first and last nodes are the same. From th(' discussion in th(' pr('vious s('ction, it should be clear that th(' stat(' of all a('('('ss('s with f('Sp<'ct to a giv('n databas(' ('an b(' d('nn('d by d('seribing: (1) the allocatable data elempnts (P.g., records) m the databas(', (2) th(' activ(' processps (WIUTERS), and (3) th(' lock list associat('d with each process. Thpr('for('; thp state of all accpss('s of a databasp can bp dpfinpd by a database access state graph, . The s('t of nodes within paeh of t hps(' graphs consists of th(' union of thp s('t of activp proe('ss('s, P, and tl1<' 8('t of allocatablr data ('l('m('nts, E. Each lock list is r('prespntPd by a path b('ginning at an activ(' proc('ss nod(' and conn<'cting it ,,"ith ('ach data d('mpnt allocated to that process. Thus, the set of arcs within the lock lists comprises the srt L. :\Iore formally, a database acc('ss state graph is a direct('d graph
,,-here P = (Pi I Pi is the i-th oldest processl , E = (e I e is an allocatable data dement I , and L = A UB. The set of lock lists, L, is compos('d of the s('t of allocatrd elements, A = (a, b) I a = Pi and b is the oldest data ('lement allocated to pi, or A = eij, the j-th oldest data (,lement allocated to Pi and (eij_l, eij) E A l , and the set of blocked allocation rcquE'sts, B = {(a, b) I a=pi or a=eij_l and (eij-2, eii-l) EA with process Pi being blocked when f('questing alloration of data element b = eij. That is, b = ekl for some k=i and I such that either (Pk, b)EA or (ekl-l, b) EA} . Since each access state of a database is represented by an access state graph, operation of the LOCK-UXLOCK Mechanism can be modeled by state transition functions that map access state graphs to access state graphs. The four required functions are: LOCK Function. If database access state 8=
<P ∪ E, L>, then LOCK(s) = s' = <P' ∪ E, L>. If P = {p_i | p_i is a process, 1 ≤ i ≤ n} then P' = P ∪ {p_n+1}. That is, the LOCK Function adds a process node to the graph. (2) The UNLOCK Function. This function is the inverse of the LOCK Function, and its application deletes an isolated process node from a graph. (3) The ALLOCATE Function. If database access state s = <P ∪ E, L>, then ALLOCATE(s, p_i, e_ij) = s'. If L = A ∪ B then s' = <P ∪ E, L'>
and L'=A UB U Up.;. e"i) or (ei;-1. eii) 1. This funl'tion flOris fin arr> to (1) The Database Sharing 273 the graph and thereby models the allocation of a data element to a process. (4) The DEALLOCATE Function. This function is the inverse of the ALLOCATE Function. Its application deletes an arc from the graph and thus represents the release of a data element from a lock list. Figure 1 illustrates the application of the LOCK and UXLOCK Functions to a simple database access state. In the first graph sho·wn, P = {PI, P2, P3}, E = {d l, d2,···, d9 }, and L = the set of paths, {(PI, d7, dg, dg), (P2, d 4, d2), (P3, d4)}. The arc (Pa, d4) EB indicates P3 is blocked, ,vhile all other arcs are elements of A. The figure shows the LOCK Function adding process node P4, and the UXLOCK Function is shmvn deleting it. Figure 2 gives an example of the application of the ALLOCATE and DEALLOCATE Functions. In terms .of the model, the normal a.ccess sequence for a given process consists first of an application of the LOCK ---LOCK P PROCESSES I DA TA ELEMENTS --UNLOCK I PROCESSES DATA ELEMENTS Figure I-The LOCK and LOCK functions Function. This is followed by some number of applications of the ALLOCATE and DEALLOCATE Functions. At some point, the number of applications of ALLOCATE is equaled by applications of DEALLOCATE, and the access sequence ends with an U~LOCK. DEADLOCK DETECTIO~ From the discussion above, it should be obvious that the ALLOCATE Function is the only function defined that can precipitate a deadlock. This is clearly the case, for ALLOCATE is the only function capable of blocking a process. It is now possible to describe a simple and efficient deadlock detection algorithm in terms of the model just presented. The follmving theorem provides the theoretical basis for the detection procedure. P3 d7 dB dg P d7 dg dB ~I~I I I DEALLOCATE (P2' d3) I I PROCESSES DATA ELEMENTS PROCESSES DATA ELEMENTS Figure 2-The ALLOCATE and DEALLOCATE functions (2) the database access state graph representing tains a loop. 8' PROOF: To establish these conditions as necessary for 8' to be a deadlocked state, notice first that if Pi is not blocked then, by definition, 8' is not a deadlocked state. No",-, let
<P ∪ E, L> be the database access state graph representing s, with P = {p_i | 1 ≤ i ≤ n and n ≥ 2}. Assume ALLOCATE(s, p_i, e) = s' is a deadlocked state and that s' = <P ∪ E, L'>
does not contain a loop. Since 8' is a deadlocked state, in 8' there is a set of P d of m deadlocked processes, 'v here m ~ 2, and P d= IPdl, Pd2· •• , Pdm} r;;;.P. By definition, each Pdi E P d is blocked. Furthermore, each Pdi is blocked by a Pdj E P d, with i ~ j. If Pdi were not blocked by some pEP d, then Pdi would have to be blocked by a nondeadlocked process; therefore, Pdi ,vould not be deadlocked. Thus, if there are m processes in P d, then Pdi is blocked by one of the (m-l) processes {Pd2, pda,···, Pdm}. Assume for convenience that Pd2 blocks Pdl. ~ow, Pd2 must in turn be blocked by one of the (m - 2) processes {Pd3, Pd4, ••• , Pdm}. If not, then Pd2 ,vould have to be blocked by Pdl. If Pd2 ,vere blocked by Pdl, then for some data element e' the path (Pd2,· •• , e) E A and the arc (e, e') E B while the path (Pdl, ••• , e') EA. But since Pdl is blocked by Pd2, this implies that for some data element b' the path (Pdl, ••• e', ... , b) EA and the arc (b, b') EB while the path (Pd2, ••• , b', ••. , e) EA. This, however, violates the assumption that no loop is contained in < P UE, L' >, since the path (b' .•• e e' ... b, b') is a loop (see Figure 3). Hence, Pd2 mus~ be bl~ck~d b; one of the processes {PdS, Pd4, ••• , Pdm}. o b' o Pd 2 THEORE~I: If a valid database access state 8 is not a deadlocked state, then ALLOCATE (8, pi, e) =3' is a deadlocked state if and only if (1) process Pi is blocked on attempting to allocate data element e, and con- o o Figure 3-A deadlocked state involving processed Pdl and Pd2 0 274 National Computer Conference, 1973 :More generally, note that if a sequence of k processes PI, P2, "', Pk exists such that each Pi-l is blocked by Pi for all 2 ~ i ~ k, then a path exists from PI through elements of each p/s lock list to the last element in the lock list of Pk. The lock list of each Pi-l is connected to the lock list of Pi by the arc representing a blocked request, which is always the last arc in a lock list. Now consider the j-th process of the m deadlocked processes of st.ate s'. Assume for convenience that for 2~i~j, Pdi-l is blocked by Pdi. Then, Pdj must be blocked by one of the (m-j) processes {Pdh Pdj+l, •• " Pdm}. For if Pdj were blocked on allocating b' by some Pdi with 1 ~ i
. To establish the sufficiency of the conditions in the theorem, suppose that the blocking of Pi creates a loop in the access state graph. Since a single lock list cannot contain a loop, elements of j lock lists, where j ~ 2, must participate in the loop. Since the elements of one lock list can be connected to those of another lock list only by an arc representing a blocked allocation request, the existence of a loop implies that a sequence of processes PI, P2, "', Pj exists with each Pi-I, ') ~ i ~j, being blocked by Pi and P j being blocked by PI In this case a set of processes exist such that each process is blocked by a member of that same set; thus, no chance remains for them to become not blocked. Therefore, the state s' is a deadlocked state, and the theorem is established. Straightforward applicat.ion of this t.heorem results in a deadlock detection procedure that is both simple and efficient. Since a deadlock can occur only when an allocation request results in a process being blocked, \vhich is assumed to be an infrequent event in a transaction processing environment ,11 only infrequently will an examination of the database access state be necessary. In those instances when a process is blocked and it becomes necessary to test for a loop in the access state graph, the computation required for the test is nearly trivial since (1) the data element requested by the blocked process must be in the loop, and (2) the out-degree of every node in a database access state graph is one. Thus, deadlocked states of arbitrary complexity are easily and efficiently detected. This detection method has yet another useful characteristic -it directly identifies those processes that are responsible for the deadlock. The processes that are responsible are, of course, those which are blocking each other. In general, however, it is possible to encounter a deadlocked access state in which an arbitrary number of processes participate, but only a small fraction of these are responsible for that state. This condition can exist since any number of processes can themselves be blocked while either not blocking other processes or not blocking others in a deadlocked manner (i.e. processes participating in the deadlock whose lock lists can be removed from :he database access state graph without removing the deadlock condition). However, it is obviously those processes whose lock lists participate in the loop that cause a deadlock condition to exist. By detecting the existence of a loop, the algorithm has also isolated the processes responsible for the deadlock; and, thus, the method has also accomplished the first essential step in the recovery process. RECOVERY In the context of shared databases, Recovery is the procedure by which the effects of a (necessarily) aborted process on the object database* are reversed so that the process can be restarted. On the detection of a deadlock a Recovery Procedure must be invoked. The first element of the Recovery Procedure is the determination of which process to abort. This decision might be based on any of several criteria. For example, it might be advisible to abort the most recent of the offending processes, or the process with the fewest allocated data elements. In any event, the information required for this decision is readily available (see above). This decision is, however, a small part of the recovery problem. To recover efficiently, the LOCK-UNLOCK Mecha.nism requires features beyond an efficient detection algorIthm. 
One such feature is a checkpoint facility-a facility that records the state of a process and thus enables it to be restarted. Clearly, a checkpoint must be performed at each LOCK. Furthermore, to enable efficient restoration of the database, utilization of a process map is appropriate. A process map is basically a map from a virtual addressing space t.o a real addressing space, maintained for each process in the database access state graph. Database management systems are typically characterized by three levels of addressing: content addressing, logical addressing, and physical addressing. Associative references are processed in two steps: a ~ontent-to-Iogical address transformation, followed by a logical-to-physical address transformation. This "virtual" secondary storage characterized by a logical-to-physical storage map provides device independence and facilitates the e!fi~ient use ~f storage hierarchies. The process map is SImIlarly a logical-to-physical storage map. A process map is created and associated with process at the execution of a LOCK Function. With each execution of an ALLOCATE Function, a (physical) copy of the allocated data element is created and an entry is made in the associated process map. Subsequent references by the process to the database are routed through its process map; hence, incremental updates are performed on the copies of t.he data elements. The DEALLOCATE Function effects a modification** of the database storage map and the deletion of the associated entry in the process map. ALLOCATE therefore has the effect of creating a physical copy of t.he object data element accessible only to the allocator, and DEALLOCATE has the effect of making the physical copy available to all processes and the * The term database as used here includes all data files affected by the process, auxiliary files as well as the principal data files. ** The database storage map is modified so that references to the object data element are mapped to the physical copy created at the execution of ALLOCATE and subsequently modified by the allocator. Database Sharing original available to the database manager as allocatable space. A process map makes the recovery from a deadlocked access state a relatively simple matter. Once a decision is reached as to which process to abort, that process is merely restarted at the checkpoint performed by the LOCK Function. Implicit in this action is, of course, the restoration of the lock list of that process to an empty state. That is, each data element that was allocated to the process is released, and the copies of these elements are discarded. Clearly, priorities must be appropriately arranged to insure that a process blocked by the aborted process is allocated the released data element for which it was previously blocked. No further action by the Recovery Procedure is required, for due to the process maps, the actual database was unaltered by the aborted process. Note further that the utilization of process maps significantly reduces the probability of WRITERS interfering with READERS, since references to data elements by READERS are always directed to the actual database. SUMMARY The above discussion of the LOCK-UNLOCK Mechanism is intended to serve as a functional description of the elements of a database management system that are essential in order to provide an efficient facility for database sharing. 
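The detection procedure described above lends itself to a very compact realization. As a rough illustration only, the following sketch in Python performs the loop test on an access state graph; the dictionary name next_arc and the function name are hypothetical, and the sketch assumes, as in the theorem, that the access state contained no loop before the blocked request was added.

    def request_creates_deadlock(next_arc, requested_element):
        # next_arc maps each node (process or data element) to the single
        # node its outgoing arc points to; nodes with no outgoing arc are
        # simply absent from the dictionary.  The arc for the newly blocked
        # request is assumed to have been added already, so the chase below
        # returns to requested_element if and only if a loop has been formed.
        node = requested_element
        while node in next_arc:          # out-degree of every node is at most one
            node = next_arc[node]
            if node == requested_element:
                return True              # loop found: the new state is deadlocked
        return False                     # chase ended at an unblocked lock list

Because any loop must contain the requested data element, a single chase of this kind suffices, and the arcs traversed before the chase returns identify the lock lists, and hence the processes, responsible for the deadlock.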
In an actual database management system, the LOCK-U~LOCK Mechanism could be manifested in the form of LOCK and UNLOCK commands used by the programmer. Alternatively, the LOCK Function could be assumed implicit in the commonly used OPEN command. Under these schemes, ALLOCATE could be accomplished via a FIKD command, and DEALLOCATE could be implicitly invoked by an UNLOCK or CLOSE. The occurrence of a deadlock can be translated directly into a degradation in system throughput. The ,,,"ork done by a process to the point where it is aborted plus the overhead required for recovery represent the computational cost of a deadlock. Thus the justification of a LOCK-UNLOCK 275 mechanism of the type described here is predicated on an acceptably low frequency of occurrence of deadlocked access states. Of course, as tasks become small in terms of computational and other resource requirements, the throughput cost of deadlocks as well as the probability of their occurrences diminishes. Any database sharing mechanism can significantly contribute to the satisfaction of the requirement for efficient, responsive multiprogramming in the transaction environment. The LOCK-UKLOCK :11echanism not only provides the potential for efficient database sharing, but it also eliminates the requirement for special consideration for sharing from the application program. Moreover, this is accomplished while the integrity of the database is guaranteed. REFERENCES 1. Bernstein, A. J., Shoshani, A., "Synchronization in a Parallel Accessed Data Base," CACM, Vol. 12, No. 11, pp. 604-607, ~o vember 1969, 2. Coffman, E. G., Elphick, M. J., Shoshani, A., "System Deadlocks," Computing Surveys, Vol. 3, No.2, pp. 67-77, June 1971. 3. Collmeyer, A. J., "Database Management in a Multi-Access Environment" Computer, Vol. 4, No.6, pp. 36-46, November/December 1971. 4. Dennis, J. B., Van Hom, E. C., "Programming Semantics for Multiprogrammed Computations," CACM, Vol. 9, No.3, pp. 143-155, March 1966. 5. Dijkstra, E. W., "The Structure of THE Multiprogramming System," CACM, Vol. 11, No.5, pp. 341-346, March 1968. 6. Habermann, A. N., "Prevention of System Deadlocks," CACM, Vol. 12, No.7, pp. 373-385, July 1969. 7. Havender, J. W., "Avoiding Deadlock in Multitasking Systems," IBM Systems Journal, No.2, 1968. 8. Murphy, J. E., "Resource Allocation with Interlock Detection in a Multi-Task System," Proceedings AFIPS Fall Joint Computer Conference, pp. 1169-1176, 1968. 9. Holt, R. C., "Some Deadlock Properties of Computer Systems," Computing Surveys, Vol. 4, No.3, pp. 179-196, September 1972. 10. CODASYL Data Base Task Group Report, April 1971. 11. Shemer, J. E., Coli meyer, A. J., "Database Sharing-A Study of Interference, Roadblock, and Deadlock, Proceedings of 1972 ACMSIGFIDET Workshop. Optimal file allocation in multi-level storage systems* by PETER P. S. CHEN** Harvard University Cambridge, Massachusetts allocation strategy is to allocate the more frequently used files to faster devices to the extent possible. However, waiting time in the request queues is ignored. Thus, this file allocation strategy may induce undesirably long reqm'St queues befDre some devices. By considering queueing delay, a more realistic analysis may be performed. We analyze three types of file allocation problem. The first one is to allocate files minimizing the mean overall system response time without considering the storage cost. The second one is to allocate files minimizing the total storage cost and satisfying one mean overall system response time requirement. 
The last onp is to allocate files minimizing the total storage cost and satisfying an individual response time requirement for each file. We propose algorithms for the solutions of the first two problems; the third problem is considered elsewhere. 6 IXTRODUCTION Storage is an important and expensive component of a computer system. ::\Iany types of storage such as semiconductor, magnetic core, bulk core, disk, drum, tape, etc. are available today, each having different cost and physical attributes (e.g., access time). To be economical, the storage system of a modern computer generally consists of several different types of storage devices. Such an assemblage is called a multi-level storage system (or a storage hierarchy system). Since each type of storage has a different cost/performance ratio, a series of important problems arise in the design and use of multi-level storage systems-for example, ho\v to allocate files within a multi-level storage system in order to achieve the best performance without considering cost, and also ·when the cost is considered. The purpose of this paper is to study these problems. For simplicity in designing models, the following assumptions are made: ANALYSIS To design models, it is important to identify the significant parameters of the physical systems and to describe the interrelationships among these parameters. In the following, we shall describe the essential characteristics of file allocation problems. The storage device types concerned in this paper are auxilary devices. It is assumed that the block size is fixed for each device type, exactly one block of information is transfered per input-output request, and a storage cost per block is associated with each device type. The service time for a request generally consists of two components. One is the data transfer time, roughly a constant for each device, and the other is the electromechanical delay time, which may vary from request to request. Thus, the device service time is considered to be a random variable. Let M denote the total number of devices in the storage hierarchy. For device j (j = 1, 2, ... , M), the cost per block is Cj, and request service time is assumed to be exponentially distributed \"ith mean l/.uj (Ul~U2~··· ~UM>O): (a) Statistics of file usage are assumed to be knO\vn either by hardware/software measurements in previous runs or by analysis of frequency and type of access to the information structure. (b) Allocation is done statically (before execution) and not dynamically (during execution). Although these assumptions are introduced primarily to make analysis more tractable, many practical situations fit into this restricted case. One example is periodical reorganization of the data base for airline reservation systems. Another is the allocation of user files and non-resident system programs io auxilary devices. These file allocation problems have usually been treated intuitively or by trial and error. Only recently have some analyses been performed in this area.1.2·3 The work done by Ramamoorthy and Chandy4 and by Arora and Gall0 5 is particularly interesting; it concludes that the optimal file Prob [service time * This work was sponsored in part by the Electronic Systems Division, ::::;tJ= l-exp( -ujl), t~O A file is a set of information to be aiiocated in the storage hierarchy. The length of a file may vary from a few words to some bound set by the system designer. The file reference frequency is assumed to be known by some means. For U.S. 
Air Force, Hanscom Field, Bedford, Massachusetts under Contract No. F-19628-70-C-0217. ** The work reported here will be included in the author's Ph.D. Thesis, "Optimal File Allocation," to be presented to Harvard University. 277 278 National Computer Conference, 1973 simplicity, it is assumed that each block of a file has the same request frequency (for otherv.ise, we may redefine the files to satisfy this condition). Let L denote the total number of files. The length of file i (i = 1, 2, ... , L) is denoted by N i and the per block request frequency by Ii (11 ?/2? ... ?IL) . We assume that each block of a file can be assigned to any storage device, and there is sufficient storage on each device to allocate files in any manner desired. L files (with N i blocks each) must be allocated among M storage devices. We are interested in the effect of the allocation pattern on the total storage cost and response time. Let nij denote the number of blocks of file i allocated to storage devicej. The nij are integers; however, for simplicity we shall assume the nij are continuous variables. (A nearoptimal integer solution can be obtained by rounding the optimal solution to the nearest integer.) Note that the nij are nonnegative: The mean system response time is the weighted sum of the mean response time for requests forwarded to each device: R(Al, ... , AM) = M M /=1 /=1 L: (Aj/A)R j = L: [AJ!A(Uj-Aj)] or, The probability that the response time for a request forwarded to device j exceeds T can be expressed b y7 Pj(t>T) =exp[ - (I-AJ!uj)u j T] =exp i=I, ... ,L, j=I, ... ,M Note also that M L: nij=Ni , i=I, ... , L (2) /=1 • TJ (9) Since nij/ N i is the proportion of requests for file i forwarded to device j, the probability that the response time for ·a request for file i exceeds T can be expressed as a weighted sum of the individual probabilities for each device: Since Cj is the cost per block of storage using device J. and ~L [(t, nijli-Ui) (1) ' ~i=l nij IS the total number of blocks of storage using device M Pli/(t> T) J, the total storage cost is: = L: (nij/Ni)Pj(t> T) /=1 M L /=1 i=l c= L: Cj L: nij (3) Since the reference frequency for blocks in file i (i = 1 2, ... ,L) is ii, and there arenij blocks of file i allocated t~ storage device j, the total input request rate for device j is: ::.vIINLvIIZING THE :vIEAN SYSTE::vr RESPONSE TliVIE L Aj= L: nijii i=l (4) To prevent the queue lengths from growing v.ithout bound it is required that ' L Aj= L: nijli A~A(k) (k=I, . .. , M), set, otherwise A/=O, (13) where = k k i=1 i=1 L: Uj- U kl/2 L: U/,2, = ~1=80 nll=40 n12=l0 ~1=60 ~=O ~=20 n31=30 n32=70 (1) n31=loo n32=O (2) Allocation strategy A. Order files according to their relative frequency of access, and allocate files to devices in order, starting by allocating the file \vith the highest reference frequency to the faster device, etc. (for example, the first set of solution in Example 1). k=I, ... ,M The second solution stated in the above example provides a counterexample to the conjecture that allocation strategy A is a necessary condition for the optimal solution of the type 1 problem. M A(M +1) nll=50 n12=O A commonly used file allocation strategy is: j=I, ... , k A(k) 279 L: Uj j=1 Theorem 1. The set of input rates obtained by (13) holds for every optimal solution to the type 1 problem. Proof of this theorem follows immediately from the manner in which the load partition problem was obtained from the statement of the type 1 problem. 
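Equation (13) is partly garbled in this copy; read together with the surrounding definitions it appears to give, for the largest k with λ ≥ A(k), the optimal rates λ_j* = u_j − √u_j (u_1 + ... + u_k − λ)/(√u_1 + ... + √u_k) for j = 1, ..., k and λ_j* = 0 for j > k, where A(k) = (u_1 + ... + u_k) − √u_k (√u_1 + ... + √u_k). The sketch below, in Python with hypothetical function and variable names, computes the partition under that reading; it is illustrative only and is not the author's program.

    from math import sqrt

    def optimal_load_partition(u, lam):
        # u: device service rates with u[0] >= u[1] >= ... >= u[M-1];
        # lam: total mean request rate, assumed less than sum(u).
        # Returns the rates lam_j* minimizing the mean system response time
        # sum_j (lam_j / lam) / (u_j - lam_j), under the reading of (13)
        # described above.
        M = len(u)
        assert lam < sum(u), "total request rate must be below total service capacity"
        k = M
        while k > 1:                               # find the largest k with lam >= A(k)
            A_k = sum(u[:k]) - sqrt(u[k - 1]) * sum(sqrt(x) for x in u[:k])
            if lam >= A_k:
                break
            k -= 1
        surplus = (sum(u[:k]) - lam) / sum(sqrt(x) for x in u[:k])
        return [u[j] - sqrt(u[j]) * surplus if j < k else 0.0 for j in range(M)]

    # For example, optimal_load_partition([10.0, 5.0, 2.0], 8.0) splits the load
    # roughly as [5.8, 2.0, 0.1], loading the fastest device most heavily.

Any assignment of file blocks whose per-device input rates equal these values then satisfies Step 3 of Algorithm 1 below.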
Utilizing (13), we propose the follo\\ing algorithm for solution of the type 1 problem. Algorithm 1. Step 1. Calculate the total mean input rate A by (6). Step 2. Use (13) to obtain the optimal solution AI*, ... , XM* for the corresponding load partition problem. Step 3. Allocate file blocks in any manner desired, but ensure the resulting mean input rate to devicej is equal to Xl. :YIINE\IIZING THE STORAGE COST-ONE 1IEAN SYSTE11 RESPONSE TIME CONSTRAINT Problem statement When storage cost is also a factor, the following problem arises: allocate the files such that the storage cost is minimized and the mean system response time is bounded. That is, 1Iinimize M L L: L: Cj i=1 (14a) nij i=1 subject to The optimality of Algorithm 1 follows directly from Theorem (14b) 1. In the following, we illustrate the use of Algorithm 1 by a simple example. Example 1. Suppose that there are three files (L=3) to be allocated to two devices (M" = 2) in order to minimize the mean system response time. Given: 13=1 Na=l00 L L: nij Ii < Uj, j=l, ... , Jf (14c) i = 1, . , , ,L; j = I, . , , ,J.lf (14d) i=l .1{ L: nij = lVi, i=l, .. . , L (14e) pI (14a) denotes the total storage cost. (14b) is the mean system response time constraint where V is the given bound. The above ",ill be referred to as the type 2 problem. 280 National Computer Conference, 1973 Optimal solution The following theorems state the properties of the (unique) optimal solution for the type 2 problem. Proofs are supplied in the Appendix. Theorem 2.1. If Cj > Cj+l ( j = 1, ... , M -1) , then the optimal solution for the type 2 problem follows allocation strategy A. In the follo\\ing theorem, we consider the case in which there are only two devices (M =2). Let AI" A2' denote the mean input rates at the optimum for the type 2 problem with M =2, and Al*, A2* denote the mean input rates at the optimum for a corresponding load partition problem. Let MINE\UZING THE STORAGE COST-INDIVIDUAL RESPONSE TIME REQUIREMENTS Files may have individual response time requirements. This means that inequality (14b) is replaced by L individual response time requirements (one for each file). In addition, there are many situations in which the probability of the response time for an individual request exceeding some given bound must be limited. Thus, the cumulative response time probability distribution must enter the analysis. With inequality (14b) replaced by L individual inequalities (10), we formulate another important type of file allocation problem: allocate files minimizing the storage cost and limiting the probability that the response time for file i exceeds some given bound T to Qi. That is, Minimize M where L (15a) LCj Lnii i=1 i=1 subject to It is easy to see that a solution, !nij}, satisfies (14b)-(14e) is equivalent to that corresponding input rates (AI, A2) satisfy Al E S. Theorem 2.2. For a type 2 problem with M = 2, the mean input rates (AI" A2') at the optimum have the following property: P{i)(t>T) ~Qi, i=1, ... , L (15b) L L nij/i 1', terminate. (Xo feasible solution exists.) Step 3. Use AI*, ... , A](* and allocation strategy A to calculate Inij}, the initial feasible solution. Step 4. r se this initial feasible solution and the Sequential Unconstrained .:\Iinimization Technique (Sl'.~IT) 9 to find As use of distributed computer networks becomes more widesprrad, it will become economical to store some files in the optimal ~obltjon. f'hpap ~toragp at remote ~itp<:; mthpr th::m ~torp them )n(,ftlly. 
Remote storage Optimal File Allocation Since retrieval time for files at a remote site may not be acceptable, determining the tradeoff between storage cost and response time will be a problem. Our models can be easily adjusted to apply to these problems. A simple application of the models is to consider each remote storage as part of the storage hierarchy. The mean service time for retrieving a file block from that remote site is assumed to be exponentially distributed 'with an appropriate mean. Note that the situation considered here is conceptually different from the models developed by Chull and Casey.12 The latter models are optimized from the point of view of a computer network designer. Our model takes the point of view of a computer center manager at one site in the network who stores files at other sites. SUM~IARY We have analyzed three types of file allocation problem in multi-level storage systems, and proposed algorithms for solving t,vo of them. Since effects of queueing delay are given proper consideration, this analysis is more precise than previous analyses. Considering files with individual response time distribution requirements, we have presented a model (the type 3 problem) which is suitable for real-time environments. One might expect that the optimal strategy al,,-ays allocates more frequently used files to faster devices (allocation strategy A). This is true in some situations, such as in the type 2 problem. However, when each file has an individual response time requirement (the type 3 prob1em), this strategy may not be optimal. ::\loreover, in the case where storage cost is not a factor (the type 1 problem), use of this strategy is not essential. Finally, \ve have briefly discussed extension to general service time distributions, allowing the resulting models to fit the practical situation better, and application of the models to use of remote storage in a distributed computer network. 281 5. Arora, S. R., Gallo, A., "Optimal Sizing, Loading and Re-Ioading in a Multi-Level Memory Hierarchy System," AFIPS Proceedings, SJCC, pp. 337-344,1971. 6. Chen, P. P. S., Mealy, G. H., "Optimal Allocation of Files with Individual Response Time Requirements," Proceedings of the Seventh Annual Princeton Conference on Information Sciences and Systems, Princeton University, March 1973. 7. Satty, T. L., Elements of Queueing Theory, McGraw-Hill Book Company, 1961. 8. Chen, P. P. S., "Optimal Partition of I-put Load to Parallel Exponential Servers," Proceedings of the Fifth Southeastern Symposium on System Theory, North Carolina State University, March 1973. 9. Fiacco, A. V., McCormick, G. P., Nonlinear Programming Sequential Unconstrained Minimization Techniques, Wiley, 1968. 10. Chen, P. P. S., Buzen, J. P., "On Optimal Load Partition and Bottlenecks" (abstract), Computer Science Conference, Columbus, Ohio, February 1973. 11. Chu, W. W., "Optimal File Allocation in a Multiple Computer System," IEEE Tran. on Computers, Vol. C-18, No. 10, October 1969. 12. Casey, R. G., "Allocation of Copies of a File in an Information Network," AFIPS Proceedings, SJCC, pp. 617-625,1972. APPENDIX Proof of Theorem 2.1: Assume that the optimal solution consists of ni'j~O and nij' ~O, where fi >Ii' and Uj >Uj'. That is, some portion of a less frequently referenced file i' is allocated to a faster device j, and some portion of a more frequently used file i is allocated to a slower device j'. Let a = ::\Iin[ni'i fi', nij'li]. 
Exchanging a/fi' blocks of file i' in device j with a/fi blocks of file i in device j' will not change the mean request input rate to these t,vo devices, nor will it change the mean response time. The inequality (14b) is still satisfiable, and so are (14c) - (14e) . But this exchange reduces the total storage cost by the amount: [(a/fi') Cl + (a/fi)C2]- [(a/Ii)cl + (a/fi') C2J ACKNOWLEDG}fENT The author is indebted to G. H. ::\lealy, J. P. Buzen and S. P. Bradley for their comments. REFEREXCES 1. Lowe, T. C., "The Influence of Data Base Characteristics and Usage on Direct Access File Organization," JACM, Vol. 15, pp. 535-548, October 1968. 2. Baskett, F., Browne, J. C., Raike, M., "The Management of a Multi-Level Non-Paged Memory System," AFIPS Proceedings, SJCC, pp. 459-465, 1970. 3. Coiimeyer, A. J., Shemer, J. E., "Anaiysis of Retrievai Performance for Selected File Organization Techniques," AFIPS Proceedings, FJCC, pp. 201-210, 1970. 4. Ramamoorthy, C. V., Chandy, K, M., "Optimization of Memory Hierarchies in Multiprogrammed Systems," JACM, July 1970. =a(fi-fi') (CI-C2)/(fi'fd >0 This contradicts the assumption. Thus, the optimal solution of the type 2 problem obeys allocation strategy A. Proof of Theorem 2.2: Assume that at optimum Al = Ala ~ ::\Iin I Al I Al E S I. \Ve can find AlbE S such that Alb ~ • X QUERY + OTHER OMS USERS :::I Z if UJ < w a: J 15 t- :2 ., t- I w ~ 20 ~ :::I (OMS) I I 30 ~ a: 15 TOOL SELECTION PROGRAMS ~GENERAL :2 6 permit a terminal user to interrogate DMS-structured files using the DMS query language. After opening a file with its protection key, questions can be asked with any search conditions and their logical combinations by means of Boolean"and", "or" operations. Search criteria associate any field name with desired val ues using relational operators "equal", "not equal", "greater than", "less than", "between". Searches can also be made for data that start with certain characters or that contain the given string within the field. All searches take advantage of the indexing available within a file whenever po~sible. However, it is also possible to request sequential searches or other lengthy processing. Thus, system response times can occasionally be quite long. A special DMS-calling application for tool selection enables production engineering designers to choose drills 35 Interaction Statistics from a Database Management System and other manufacturing tools. The program first asks for certain inputs from the user and then goes through an iterative algorithm to compare the user's needs with available tools and come up with an optimum selection. During this process the program formulates and sends several queries to DMS and analyzes the answers. This results in a mean system response time to the terminal user of 32.8 seconds, the longest of any program. However, very few interactive cycles are required to solve a particular design problem. This can be seen in Figure 4 by the small number of cycles per session and consequent short terminal session time. The tabulated results show the decisive importance of application type in predicting the system performance. The relation between mean system response and mean CPU time for various AP's is also shown graphically in Figure 5; for illustration it is compared to simulation results given by Scherr.15 The average user turnaround time versus the mean system response time for the AP's is plotted in Figure 6. No obvious correlation seems to exist in this case. 
Both Figures 5 and 6 include average results for individual application programs belonging to the program classes discussed previously. No frequency distributions of the timings at the manmachine interface are available. The longest system response times were recorded and these reached between 10 and 20 minutes on a month-by-month basis. It is believed that the distributions would follow the hyperexponential pattern, which commonly applies to time-sharing statistics. This pattern was also found empirically at the AP-DMS interface, as discussed later in this paper. OPERATING CHARACTERISTICS The man-machine statistics were generated during a six-month period in which the workload was light and the operating conditions remained relatively stable. The UAIMS utility was "up" four hours per working day, two 13 12 MEAN RESPONSE TIME (SECI 11 10 ~/ ~ I I \ :1 5 \ A 9 \ \"\ ,{ \/' I 4 ~ PERIOD OF MEASUREMENT I :r " lNO OF TERMINAL SESSIONS PER HOUR OF UPTIME FOR MAN-MACHINE STATISTICS 287 in the morning and two in the afternoon. Out of the total available uptime the system was in use 76 percent of the time on the average. "In use" is defined as having one or more programs signed on for processing, and may include periods of no system activity due to user thinking, etc. The mean system response time and terminal loading are plotted on a month-by-month basis in Figure 7. As expected, there is a correlation between loading and response time. But, due to the light workload, the effects of secondary storage I/O contention, investigated by Atwood,19 were not pronounced and did not cause appreciable delays in response. Average loading and traffic figures per hour of UAIMS availability (uptime) for the six month period of measurement are summarized below: Occupancy - of CPU - of computer system - of terminals Number of programs signed on (terminal sessions) Messages received form all terminals Number of DMS transactions DMS utilization - Execution time - Time in CPU 3.9 minutes 23.3 minutes 1.4 hours 7 182 73 7.4 minutes 2.2 minutes As explained earlier, the numbers for computer system occupancy and DMS execution time come from elapsed time measurements within programs, i.e., by summing all response times. They include, besides the computing (CPU) time shown, all secondary storage input/ output I/O) and waiting periods when another program may be in control. The concept corresponds to what Stevens 20 and Y ourdon 21 call "user time." The general-purpose database management system, DMS, is by far the most important element in UAIMS from a load and performance standpoint. It accounts for 60 percent of total CPU utilization, the rest being distributed among all application programs. The DMS utilization per hour of UAIMS "in use" (as defined previously) is plotted in Figure 8 on a month-by-month basis. Also shown are two reliability measures. One is the percentage of calls to DMS that were not completed, for any reason at all. The non-completion ratio diminished as improvements and corrections were made to DMS during the time period. The average tends to be about 0.2 percent of transactions, and includes the effects of application program debugging, system faiiures, and canceilations made by the computer operator. System crashes (for all reasons including hardware failures) are also plotted and presently average around 5 per month. 
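The per-hour figures tabulated above reduce directly to utilization fractions. The short calculation below is a sketch using the rounded values quoted in the text, so its results are only indicative; the roughly 56 percent DMS share of CPU time it yields is consistent with the 60 percent figure cited earlier, given the rounding.

    # Approximate utilizations from the per-hour-of-uptime figures quoted above
    # (rounded inputs, so the results are indicative only).
    cpu_min_per_hour = 3.9            # CPU occupancy
    system_min_per_hour = 23.3        # computer system occupancy (elapsed time)
    dms_cpu_min_per_hour = 2.2        # DMS time in CPU
    dms_exec_min_per_hour = 7.4       # DMS execution (elapsed) time

    cpu_utilization = cpu_min_per_hour / 60.0                            # ~0.065
    system_occupancy = system_min_per_hour / 60.0                        # ~0.39
    dms_share_of_cpu = dms_cpu_min_per_hour / cpu_min_per_hour           # ~0.56
    dms_share_of_elapsed = dms_exec_min_per_hour / system_min_per_hour   # ~0.32

    print(cpu_utilization, system_occupancy, dms_share_of_cpu, dms_share_of_elapsed)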
DATA MANAGEMENT INTERFACE STATISTICS 1971 7 8 9 10 11 1972 12 1 2 4 5 6 7 8 9 TIME (MONTHS) Figure 7 - U AIMS workload characteristics 10 11 12 The discussion so far has centered on the man-machine interface and its traffic pattern. It was just noted, however, that within this type of real-time information sys- 288 National Computer Conference, 1973 ::fL_____________D_M_s_C_A_LL_s_P_E_R_H_O_~~ ____ US_E_)______ ~ ________ MEDIAN ~. ~ 1.5 OMS NON·COMPLETION PERCENTAGE 1.0 MEAN 0.5 t ::t ~., ,.,~"'" '" .o,~ 0 6 7 B 9 10 1971 11 0 2 12 1 2 3 4 5 6 7 B 9 10 11 ;"'T~~CTION~r-------' VARIOUS OPEN FILE I ----EXECUTION TIME (SEC) I CPU I I 13 i 3.6 -- ~I--l--' QUERY NEW ~:T~Ay I NUMBER OF DISK I/O'S l ____________.. 0.7 0.3 I 3.3 L~~T~E-~O~D.IN~-_=r_~4% 6.5 1 2 77 % _ ! 2.5 3.0 t ~ ALL TRANS- I ACTIONS MODIFICATION h 0 0.5 MEAN 1.0 1.5 • 2.0 C. P. U. TIME (SEC) Figure lO-Observed frequency distribution for DMS responses which facilitate the query process. At the other extreme we have control functions to open and close files, to view the data definition tables. etc. These make up a third of all transactions and have the fastest response. A frequency distribution for all transactions, regardless of function, was derived and plotted from measurements covering the eight-month period June 1971 - January 1972 (45,000 transactions). The density and cumulative probability are shown in Figures 10 and 11 respectively. The distribution of responses follows the familiar shape which fits interarrival and service time distributions in time-sharing systems.22.23.16.24 These distributions are close to exponential but contain too few data near the origin and too many in the tail, producing a characteristic skewness. This is seen more clearly in Figure 12 which compares the observed data with an exponential distribution by plotting the complement of the cumulative probability on a semi-log scale. The empirical curve could be mathematically approximated by fitting a hyperexponential distribution to the data, that is a linear combination of two or more ordinary exponential distributions. 5.9 6.1 13.2 , 5.6 ; ~---~ OCCUPANCY (SECI 12 KUPDATlNG) I I 10 MEDIAN tern another important interface exists. It is the one shown in Figure 3 between application programs and the DMS software. An analysis of 70,000 DMS transactions originating from all application programs during the one-year period June 1971 through May 1972 yielded the mean response statistics and the observed relative loadings tabulated in Figure 9. Overall, the average time between call and return was 5.6 seconds. Of this time 1.8 seconds were devoted to computer processing. Data transfer between DMS and the disk files required 23.7 I/O's which amount to about 2.6 seconds. * The remaining 1.2 second represents interruptions from OS and the TPE and incl udes time-slicing waits (see Figure 2). When broken down by function performed, the average response statistics vary by a factor of ten or more. In data management technology there is a performance trade-off between query and updating which poses a design choice. Data structures providing multiple access for associative searching and fast retrieval respond well to unanticipated queries but are time-consuming to build and maintain. This is illustrated in Figure 9. For the system described, the design choice proved correct since queries represent the greatest load on the system. The penalty is paid for data entry and modification transactions which take much more time. 
This is largely due to the many IIO's required to update the file, its indexes and directories TYPE OF TRANSACTION 8 12 1972 Figure 8-Utilization and reliability ; 6 EXECUTION TIME (SEC) ~ TIME (MONTHS) r------ ---- 4 2.3 10 3.8 1. 8 CUMULATIVE PROBABILITY i .-+ 27.8 55.3% 5% 38.3 62.9 l~~uI~.:9;,= i 10% 20% 50% 80,," 90% 95% 10 18 99% 99.9% 23.7 100% Figure 9- DMS transaction responses * Each I/O to the 2314 disk storage requires 110 ms on the average. for seek, rotational delay, and one track of data transfer. EXEC UTION TIME (SEC) 0.6 1.0 1.3 2.0 5.0 CPU OCCUPANCY (SEC) 0.18 0.30 0.45 0.6 12 NO OF DISK I/O 5 0.9 1.3 3.2 5.4 17 40 80 45 220 20 90 260 600 Figure II-Cumulative frequency distribution of all n:-VlS transactions Interaction Statistics from a Database Management System 0 100 CPU OCCUPANCY (SEC) 2 4 6 8 10 I '50 50 g 0 ~ ~ ~ ~ 20 80 ~ III « III 0 III 10 co: Q.. > I 5 ....... ~ ~ 90 0 co: Q.. J 95 ....... :::> :::> u more like exponential. We would suggest that if the hyperexponential pattern continues to be empirically confirmed for all interactive environments, then it should be accepted at its face value and further investigated by the theoreticians so that it may be better explained and understood. III « ~ 289 2 I ~i w > ;::: ~ 98 I 1 ~ :::> u 99 I 0.5 0 10 20 30 40 50 60 REFERE~CES :::> 99.5 EXECUTION TIME (SEC) Figure 12-Comparison to the exponential distribution CO~CLUSIO~S The man-machine interactive characteristics of a database management system can be substantially different from those of general purpose time-shared systems. In this paper the timings were compared by separating UAIMS application programs into groups of non-DMS and DMS users. Data management applications on the average required more time per interactive cycle for both user turnaround to think, etc., (17 vs. 14 sec) and for system response (7 vs. 3 sec). They were also far more computer-bound (1.6 vs. 0.2 sec CPU time per cycle). In order to predict the behavior of a new application in a real-time environment it is important to know the type of program and its expected transaction mix. This was highlighted by one particular DMS application, for tool selection, which had to be considered separately because of its totally distinct man-machine performance. Numerically, the results obtained here are comparable to those reported by Scherr!" Bryant23, and others.!7.24 Depending on the application, mean "think" times range between 11 and 32 seconds. Response times, which depend on both the hardware and software, average between 2 and 33 seconds at the terminal, and between 1 and 13 seconds at the AP-DMS interface, depending on the function requested. At the latter interface, the shape of the frequency distribution conforms to the "hyperexponential" pattern described by Coffman & Wood,n and found by all investigators. We may infer that the same pattern holds for the man-machine parameters of system response and user turnaround, making the median values considerably less than the means (around halO, Some researchers, including Parupudi & Winograd/ 4 have doubted the validity of such results and attempted to "normalize" the observed data by discarding the largest 10 percent. This practice has the effect of artificially reducing the mean values and making the distributions 1. Miller, E. F., "Bibliography on Techniques of Computer Performance Analysis," Computer (IEEE), Vol. 5, No.5, pp. 39-47. September/October 1972. 2. Johnson, R. R., "~eeded - A Measure for Measure," Datamation, Vol. 16, No. 17. pp. 
20-30, December 15,1970. 3. Lueas, H. C., Jr., "Performance Evaluation and Monitoring," ACM Computing Surveys, Vol. 3, ~o. 3, pp. 79-90, September 1971. 4. Kronos, J. D., United Aircraft Information Management Systems (UAIMS) User's Guide - General Information, United Aircraft Research Laboratories Report K-032131-21, July 1971. 5. Hobbs, W. F., Levy, A. H. McBride, J., "The Baylor Medical School Teleprocessing System," AFIPS Conference Proceedings, Vol. 32, SJCC, pp. 31-36, 1968. 6. A Survey of Generalized Data Base Management Systems, CODASYL Systems Committee Technical Report, ACM, ~ew York, May 1969. 7. Angell, T., Randell, T. M., "Generalized Data Management Systems," IEEE Computer Group News, Vol. 2, ::\0. 12, pp. 5-12, November 1969. 8. Prendergast, S. L., "Selecting a Data Management System," Computer Decisions, Vol. 4, ::\0. 8, pp. 12-15, August 1972. 9. Fry, J. P., "Managing Data is the Key to MIS," Computer Decisions, Vol. 3, ~o. 1, pp. 6-10, January 1971. 10. Olle, T.W., "MIS Data Bases," Datamation, Vol. 16, No. 15, pp. 47 -50, November 15, 1970. 11. CODASYL Data Base Task Group Report, ACM Kew York, April 1971. 12. Feature Analysis of Generalized Data Base Management System~, CODASYL Systems Committee Technical Report, ACM, New York, May 1971. 13. Shemer, J. E., Robertson, J. B., "Instrumentation of Time Shared Systems", Computer (IEEE), Vol. 5, No.4, pp. 39-48, July/ August 1972. 14. Stimler, S., "Some Criteria for Time Sharing System Performance," Communications ACM, Vol. 12, No.1, pp. 47-52, January 1969. 15. Scherr, A. L., An Analysis of Time Shared Computer Systems, The MIT Press, Cambridge, Massachusetts, 1967. 16. Schrage, L., The Modelin{? of Man Machine Interactive System~, Department of Economics and Graduate School of Business Report 6942, University of Chicago, September 1969. 17. Schwetman, H. D., Deline, J. R., "An Operational Analysis of Remote Console System," AFIPS Proceedings, Vol. 34, SJCC, pp. 257-264, 1969. 18. Sharpe, W. F., The Economics of Computers, Columbia University Press, ~ew York, 1969. 19. Atwood, R. C., "Effects of Secondary Storage I,' 0 Contention on the Performance of an Interactive Information Management System" Proceedings ACM Annual Conference, pp. 670-679, August 1972. 20. Stevens, M. E., Problems of Network Accountin{? i\1onitorinf{ and Performance Measurement, National Bureau of Standards Report PB 198048, U. S. Department of Commerce, September 1970. 290 National Computer Conference, 1973 21. Yourdon, E., "An Approach to Measuring a Time Sharing System," Datamation, Vol. 15, No.4, pp. 124-126, April 1969. 22. Coffman, E. G., Jr., Wood, R. C., "Interarrival Statistics for Time Sharing Systems," Communications ACM, Vol. 9, No.7, pp. 500503, July 1966. 23. Bryan, G. E., "JOSS - 20,000 Hours at a Console - A Statistical Summary," AFIPS Proceedings, Vol. 33, SJCC, pp. 1019-1032, 1968. 24. Parupudi, M., Winograd, J., "Interactive Task Behavior in a Time Sharing Environment," Proceedings ACM Annual Conference, pp. 680-692, August 1972. EDP conversion consideration by WILLIAM E. HANNA, JR. Social Security Administration Baltimore, Maryland ABSTRACT Conversion from one manufacturer to another is a simple phrase that embodies a myriad of changes. There are large changes in the obvious. That is, changes in hardware and software. This, however, is only the beginning. There are sweeping changes to be made in concept, DP management, machine operation, systems programming, forms and forms control, methods and procedures to mention a few. 
The changes in this case are not analogous at all to a change from one automobile manufacturer to another. Rather, the change is analogous to a change from an automobile to a helicopter. The conversion, if it is successfully done, then has a sweeping effect on all operations. Special purpose leased or written software packages will not work. Extensive systems software that allows the unity of processing on multiple machines will not work. Systems and systems programmer groups will no longer be backed up to each other nor can their efforts be coordinated and shared. This will create multiple problems on systems instead of duplicate problems for the same equipment. One for one conversion will not be a satisfactory method of program change. A complete redesign of application program systems would be necessary to best utilize the hardware and software capabilities of a new system.

The evolution of virtual machine architecture*

by J. P. BUZEN and U. O. GAGLIARDI
Honeywell Information Systems, Inc., Billerica, Massachusetts
and
Harvard University, Cambridge, Massachusetts

* This work was sponsored in part by the Electronic Systems Division, U.S. Air Force, Hanscom Field, Bedford, Massachusetts, under Contract Number F19628-70-C-0217.

INTRODUCTION

In the early 1960's two major evolutionary steps were taken with regard to computing systems architecture. These were the emergence of I/O processors and the use of multiprogramming to improve resource utilization and overall performance. As a consequence of the first step computing systems became multiprocessor configurations where nonidentical processors could have access to the common main memory of the system. The second step resulted in several computational processes sharing a single processor on a time-multiplexed basis while vying for a common pool of resources. Both these developments introduced very serious potential problems for system integrity. An I/O processor executing an "incorrect" channel program could alter areas of main memory that belonged to other computations or to the nucleus of the software system. A computational process executing an "incorrect" procedure could cause similar problems to arise. Since abundant experience had demonstrated that it was not possible to rely on the "correctness" of all software, the multi-processing/multiprogramming architectures of the third generation had to rely on a completely new approach.

DUAL STATE ARCHITECTURE

The approach chosen was to separate the software into two classes: the first containing a relatively small amount of code which was presumed to be logically correct, the second containing all the rest. At the same time the system architecture was defined so that all functionality which could cause undesirable interference between processes was strictly denied to the second class of software. Essentially, third generation architectures created two distinct modes of system operation (privileged/non-privileged, master/slave, system/user, etc.) and permitted certain critical operations to be performed only in the more privileged state. The critical operations restricted to privileged state typically include such functions as channel program initiation, modification of address mapping mechanisms, direct monitoring of external interrupts, etc. Experience has shown that this solution can be quite effective if the privileged software is limited in quantity, is stable in the sense that few changes are made over long periods of time, and is written by skilled professional programmers.

While this architectural principle has proven its value by fostering the development of computing systems with true simultaneity of I/O operations and high overall resource utilization, it has generated a whole host of problems of its own. These problems arise from the fact that the only software which has complete access to and control of all the functional capabilities of the hardware is the privileged software nucleus. Probably the most serious difficulty arises in the area of program transportability since non-privileged programs are actually written for the extended machine formed by the privileged software nucleus plus the non-privileged functions of the hardware. These extended machines are more difficult to standardize than hardware machines since it is relatively easy to modify or extend a system whose primitives are in part implemented in software. This has frequently resulted in a multiplicity of extended machines running on what would otherwise be compatible hardware machines. A user who wishes to run programs from another installation which were written for a different extended machine is faced with either scheduling his installation to run the "foreign" software nucleus for some period of time or converting the programs to his installation's extended machine. Neither of these alternatives is particularly attractive in the majority of cases.

Another problem is that it is impossible to run two versions of the privileged software nucleus at the same time. This makes continued development and modification of the nucleus difficult since system programmers often have to work odd hours in order to have a dedicated machine at their disposal. In addition to the inconvenience this may cause, such procedures do not result in very efficient utilization of resources since a single programmer who is modifying or debugging a system from a console does not normally generate a very heavy load.

A final problem is that test and diagnostic software has to have access to and control of all the functional capabilities of the hardware and thus cannot be run simultaneously with the privileged software nucleus. This in turn severely curtails the amount of testing and diagnosis that can be performed without interfering with normal production schedules. The ever increasing emphasis on computer system reliability will tend to make this an even more serious problem in the future.

Figure 1-Conventional extended machine organization (bare machine, basic machine interface, privileged software nucleus, user programs)

THE VIRTUAL MACHINE CONCEPT

Figure 1 illustrates the conventional dual state extended machine architecture which is responsible for all the difficulties that were cited in the preceding section. As can be seen in the Figure, the crux of the problem is that conventional systems contain only one basic machine interface* and thus are only capable of running one privileged software nucleus at any given time. Note, however, that conventional systems are capable of running a number of user programs at the same time since the privileged software nucleus can support several extended machine interfaces. If it were possible to construct a privileged software nucleus which supported several copies of the basic machine interface rather than the extended machine interface, then a different privileged software nucleus could be run on each of the additional basic machine interfaces and the problems mentioned in the preceding section could be eliminated.

* A basic machine interface is the set of all software visible objects and instructions that are directly supported by the hardware and firmware of a particular system.

A basic machine interface which is not supported directly on a bare machine but is instead supported in a manner similar to an extended machine interface is known as a virtual machine. As illustrated in Figure 2, the program which supports the additional basic machine interfaces is known as a virtual machine monitor or VMM. Since a basic machine interface supported by a VMM is functionally identical to the basic machine interface of the corresponding real machine, any privileged software nucleus which runs on the bare machine will run on the virtual machine as well. Furthermore, a privileged software nucleus will have no way of determining whether it is running on a bare machine or on a virtual machine. Thus a virtual machine is, in a very fundamental sense, equivalent to and functionally indistinguishable from its real machine counterpart.

Figure 2-Virtual machine organization (bare machine, virtual machine monitor, two basic machine interfaces, privileged software nuclei, extended machines)

In practice no virtual machine is completely equivalent to its real machine counterpart. For example, when several virtual machines share a single processor on a time-multiplexed basis, the time dependent characteristics of the virtual and real machine are likely to differ significantly. The overhead created by the VMM is also apt to cause timing differences. A more significant factor is that virtual machines sometimes lack certain minor functional capabilities of their real machine counterparts such as the ability to execute self-modifying channel programs. Thus the characterization of virtual machines presented in the preceding paragraph must be slightly modified in many cases to encompass all entities which are conventionally referred to as virtual machines.

Perhaps the most significant aspect of virtual machine monitors is the manner in which programs running on a virtual machine are executed. The VMM does not perform instruction-by-instruction interpretation of these programs but rather allows them to run directly on the bare machine for much of the time. However, the VMM will occasionally trap certain instructions and execute them interpretively in order to insure the integrity of the system as a whole. Control is returned to the executing program after the interpretive phase is completed. Thus program execution on a virtual machine is quite similar to program execution on an extended machine: the majority of the instructions execute directly without software intervention, but occasionally the controlling software will seize control in order to perform a necessary interpretive operation.

VIRTUAL MACHINES AND EMULATORS

Figure 2 is not intended to imply that the basic machine interface supported by the VMM must be identical to the interface of the bare machine that the VMM runs on. However, these interfaces often are identical in practice.
When they are not, they are usually members of the same computer family as in the case of the original version of CP-67,1 a VMM which runs on an IBM 360 Model 67 (with paging) and supports a virtual IBM 360 Model 65 (without paging) beneath it. When the two interfaces are distinctly different the program which supports the virtual interface is usually called an emulator rather than a virtual machine monitor. Aside from this comparatively minor difference, virtual machines and emulators are quite similar in both structure and function. However, because they are not implemented with the same objectives in mind, the two concepts often give the appearance of being markedly different. Virtual machine monitors are usually implemented without adding special order code translation firmware to the bare machine. Thus, most VMM's project either the same basic machine interface or a restricted subset of the basic machines interface that they themselves run on. In addition, VMM's are usually capable of supporting several independent virtual machines beneath them since many of the most important VMM applications involve concurrent processing of more than one privileged software nucleus. Finally, VMM's which do project the same interface as the one they run on must deal with the problem of recursion (i.e., running a virtual machine monitor under itself). In fact, proper handling of exception conditions under recursion is one of the more challenging problems of virtual machine design. Emulators, by contrast, map the basic machine interface of one machine onto the basic machine interface of another and thus never need be concerned with the problem of recursion. Another point of difference is that an emulator normally supports only one copy of a basic machine interface and thus does not have to deal with the ~cheduling and resource aliocation problems which arise when multiple independent copies are supported. Still another implementation difference is that emulators must 293 frequently deal with more complex I/O problems than virtual machine monitors do since the emulated system and the system that the emulator is running on may have very different I/O devices and channel architecture. Modern integrated emulators3 exhibit another difference from the virtual machine monitor illustrated in Figure 2 in that an integrated emulator runs on an extended machine rather than running directly on a bare machine. However, it is possible to create virtual machine monitors which also run on extended machines as indicated in Figure 3. Goldberg 4 refers to such systems as Type II virtual machines. Systems of the type depicted in Figure 2 are referred to as Type I virtual machines. It should be apparent from this discussion that virtual machines and emulators have a great deal in common and that significant interchange of ideas is possible. For a further discussion of this point, see Mallach. 5 ADDITIONAL APPLICATIONS It has already been indicated that virtual machine systems can be used to resolve a number of problems in program portability, software development, and "test and diagnostic" scheduling. These are not the only situations in which virtual machines are of interest, and in fact virtual machine systems can be applied to a number of equally significant problems in the areas of security, reliability and measurement. 
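The trap-and-interpret style of execution described above lends itself to a compact illustration. The following Python sketch is hypothetical; the instruction names and the representation of a "program" are invented for illustration and are not drawn from CP-67 or any other system cited here. It simply shows most operations proceeding without software intervention while a small set of sensitive operations is intercepted and carried out interpretively by the monitor.

# Hypothetical sketch of trap-and-interpret execution.
SENSITIVE = {"START_IO", "SET_MODE", "LOAD_MAP"}     # must never reach the bare machine directly

def execute_directly(op, arg):
    # Stand-in for native execution on the bare machine (no VMM involvement).
    return f"direct:{op}"

class VMM:
    def __init__(self):
        self.interpreted = []                        # record of operations done interpretively

    def interpret(self, op, arg):
        # Perform the operation on the virtual machine's behalf, preserving system integrity.
        self.interpreted.append((op, arg))
        return f"interpreted:{op}"

    def run(self, program):
        results = []
        for op, arg in program:
            if op in SENSITIVE:                      # trap, interpret, then resume the program
                results.append(self.interpret(op, arg))
            else:                                    # the common case: no software intervention
                results.append(execute_directly(op, arg))
        return results

vmm = VMM()
print(vmm.run([("ADD", 1), ("START_IO", "unit 191"), ("ADD", 2)]))
# ['direct:ADD', 'interpreted:START_IO', 'direct:ADD']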
From the standpoint of reliability one of the most important aspects of virtual machine systems is the high degree of isolation that a virtual machine monitor provides for each basic machine interface operating under its control. In particular, a programming error in one privileged software nucleus will not affect the operation of BARE MACHINE ~~~~ --~~~~INE PRIVILEGED SOFTWARE NUCLEUS INTERFACE *'1 ...-::\.:~~~-EXTENDED MACHINE EXTENDED p;;;;;;.-.~.r,;;;;.;;;; -- ~~~~~~E #1 TYPE l! VI RTUAL MACH I NE AA~R~tD EXTENDED MACHINE ~ I USER I I PROGRAM I --INTERFACE #2 Figure 3-Type II virtual machine organization 294 National Computer Conference, 1973 another privileged software nucleus running on an independent virtual machine controlled by the same monitor. Thus virtual machine monitors can localize and control the impact of operating system errors in much the same way that conventional systems localize and control the impact of user program errors. In multiprogramming applications where both high availability and graceful degradation in the midst of failures are required, virtual machine systems can, for a large class of utility functions, be shown to have a quantifiable advantage over conventionally organized systems. 6 The high degree of isolation that exists between independent virtual machines also makes these systems important in certain privacy and security applications. 7 •s Since a privileged software nucleus has, in principle, no way of determining whether it is running on a virtual or a real machine, it has no way of spying on or altering any other virtual machine that may be coexisting with it in the same system. Thus the isolation of independent virtual machines is important for privacy and security as well as system reliability. Another consideration of interest in this context is that virtual machine monitors typically do not require a large amount of code or a high degree of logical complexity. This makes it feasible to carry out comprehensive checkout procedures and thus insure high overall reliability as well as the integrity of any special privacy and security features that may be present. The applications of virtual machines to the measurement of system behavior are somewhat different in nature. It has already been noted that existing virtual machine monitors intercept certain instructions for interpretive execution rather than allowing them to execute directly on the bare machine. These intercepted instructions typically include I/O requests and most other supervisory calls. Hence, if it is desired to measure the frequency of Ii 0 operations or the amount of supervisory overhead in a system, it is possible to modify the virtual machine monitor to collect these statistics and then run the system under that modified monitor. In this way no changes have to be made to the system itself. A large body of experimental data has been collected by using virtual machine monitors in this fashion. 9 . lo . ll EARLY VIRTUAL MACHINES Virtual machine monitors for computers with dual state architecture first appeared in the mid 1960's. Early VMM'SI2.13 were most noteworthy for the manner in which they controlled the processor state, main memory and 110 operations of the virtual machines which ran under their control. This section presents a brief description and analysis of the special mapping techniques that were employed in these early systems. Processor state mapping The mapping of processor state was probably the most unusual feature of early virtual machine monitors. 
If a VMM did not maintain proper control over the actual state of the processor, a privileged software nucleus executing on a virtual machine could conceivably enter privileged mode and gain unrestricted access to the entire system. It would then be able to interfere at will with the VMM itself or with any other virtual machine present in the system. Since this is obviously an unacceptable situation, some mapping of virtual processor state to actual processor state was required. The solution that was adopted involved running all virtual machine processes in the non-privileged state and having the virtual machine monitor maintain a virtual state indicator which was set to either privileged or nonprivileged mode, depending on the state the process would be in if it were executing directly on the bare machine. Instructions which were insensitive to the actual state of the machine were then allowed to execute directly on the bare machine with no intervention on the part of the VMM. All other instructions were trapped by the VMM and executed interpretively, using the virtual system state indicator to determine the appropriate action in each case. The particular instructions which have to be trapped for interpretive execution vary from machine to machine, but general guidelines for determining the types of instructions which require trapping can be identified. 14 First and most obvious is any instruction which can change the state of the machine. Such instructions must be trapped to allow the virtual state indicator to be properly maintained. A second type is any instruction which directly queries the state of the machine, or any instruction which is executed differently in privileged and nonprivileged state. These instructions have to be executed interpretively since the virtual and actual states of the system are not always the same. Memory mapping Early virtual machine monitors also mapped the main memory addresses generated by processes running on virtual machines. This was necessary because each virtual machine running under a VMM normally has an address space consisting of a single linear sequence that begins at zero. Since physical memory contains only one true zero and one linear addressing sequence, some form of address mapping is required in order to run several virtual machines at the same time. Another reason for address mapping is that certain locations in main memory are normally used by the hardware to determine where to transfer control when an interrupt is received. Since most processors automatically enter privileged mode following an interrupt generated transfer of control, it is necessary to prevent a process executing on a virtual machine from obtaining access to these locations. By mapping these special locations in virtual address space into ordinary locations in real memory, the VMM can retain complete control over the The Evolution of Virtual Machine Architecture 295 actual locations used by the hardware and thus safeguard the integrity of the entire system. Early VMM's relied on conventional paging techniques to solve their memory mapping problems. Faults generated by references to pages that were not in memory were handled entirely by the VMM's and were totally invisible to processes running on the virtual machines. 
VMM's also gained control after faults caused by references to addresses that exceeded the limits of a virtual machine's memory, but in this case all the VMM had to do was set the virtual state indicator to privileged mode and transfer control to the section of the virtual machine's privileged software nucleus which normally handles out-of-bounds memory exceptions. These traps were thus completely visible to the software running on the virtual machine, and in a sense they should not have been directed to the VMM at all. More advanced virtual machine architectures permit these traps to be handled directly by the appropriate level of control. 15.16 It should be noted that the virtual machines supported by early VMM's did not include paging mechanisms within their basic machine interfaces. In other words, only privileged software nuclei which were designed to run on non-paged machines could be run under these early virtual machine monitors. Thus these VMM's could not be run recursively. One of the drawbacks of copying channel programs into private work areas and executing the absolutized copies is that channel programs which dynamically modify themselves during execution sometimes do not operate correctly. Hence it was not possible to execute certain selfmodifying channel programs in early VMM's. However, since the majority of commonly used channel programs are not self-modifying, this lack of functionality could frequently be tolerated without serious inconvenience. Channel program absolutization is not the only reason for VMM intervention in I/O operations. Intervention is also needed to maintain system integrity since an improperly written channel program can interfere with other virtual machines or with the VMM itself. The need for intervention also arises in the case of communication with the operator's console. This communication must clearly be mapped to some other device since there is normally only one real operator's console in a system. A final point is that VMM intervention in I! 0 operations makes it possible to transform requests for one device into requests for another (e.g., tape requests to disk requests) and to provide a virtual machine with devices which have no real counterpart (e.g., a disk with only five cylinders). These features are not essential to VMM operation, but they have proven to be extremely valuable by-products in certain applications. I/O mapping Summary The final problem which early VMM's had to resolve was the mapping of I/O operations. As in the case of main memory addresses, there are a number of reasons why I/O operations have to be mapped. The primarj reason is that the only addresses which appear in programs running on virtual machines are virtual (mapped) addresses. However, existing I/O channels require absolute (real) addresses for proper operation since timing considerations make it extremely difficult for channels to dynamically look up addresses in page tables as central processors do. Thus all channel programs created within a particular virtual machine must have their addresses "absolutized" before they can be executed. The VMM performs this mapping function by trapping the instruction which initiates channel program execution, copying the channel program into a private work area, absolutizing the addresses in the copied program, and then initiating the absolutized copy. \Vhen the channel program terminates, the VMM again gains control since all special memory locations which govern interrupt generated transfers are maintained by the VMM. 
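The copy-and-absolutize procedure just described can be pictured with a short sketch. The page size, the channel-program representation, and the page table below are invented for illustration; real channel command words are considerably more involved.

# Hypothetical sketch of channel program "absolutization": the VMM copies the virtual
# machine's channel program and replaces each virtual address with the corresponding
# real address before initiating the channel.
PAGE = 4096

def absolutize(channel_program, page_table):
    """channel_program: list of (command, virtual_address, count) tuples.
    page_table: dict mapping virtual page number -> real page frame number."""
    work_area = []                              # private copy; the original is left untouched
    for command, vaddr, count in channel_program:
        frame = page_table[vaddr // PAGE]       # a missing entry would force a page-in first
        raddr = frame * PAGE + vaddr % PAGE
        work_area.append((command, raddr, count))
    # Note: a program that modifies itself during execution changes the virtual-address
    # original, not this copy, which is one reason such programs were troublesome.
    return work_area

page_table = {0: 7, 1: 3}
print(absolutize([("READ", 100, 80), ("WRITE", 4200, 16)], page_table))
# [('READ', 28772, 80), ('WRITE', 12392, 16)]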
After receiving the interrupt, the VMM transfers control to the address which appears in the correspo~ding interrupt dispatching location of the appropriate virtual machine's memory. Thus I/O completion interrupts are "reflected back" to the virtual machine in the same manner that out-of-bounds memory exceptions are. In summary, early VMM's ran all programs in non-privileged mode, mapped main memory through paging techniques, and performed all I/O operations interpretively. Thus they could only be implemented on paged computer systems which had the ability to trap all instructions that could change or query processor state, initiate I/O operations, or in some manner be "sensitive" to the state of the processor. 14 Note that paging per se is not really necessary for virtual machine implementation, and in fact any memory relocation mechanism which can be made invisible to non-privileged processes will suffice. However, the trapping of all sensitive instructions in non-privileged mode is an absolute requirement for this type of virtual machine architecture. Since very few systems provide all the necessary traps, only a limited number of these VMM's have actually been constructed.12.13.17.19 PAGED VIRTUAL MACHI]\;ES It has already been noted that early VMM's did not support paged virtual machines and thus could not be run on the virtual machines they created. This lack of a recursive capability implied that VMM testing and development had to be carried out on a dedicated processor. In order to overcome this difficulty and to achieve a more satisfying degree of logical completeness, CP-67 was modified so that it could be run recursively. 18 296 National Computer Conference, 1973 The major problem which had to be overcome was the efficient handling of the additional paging operation that took place within the VMM itself. 18 . 20 To put the problem in perspective, note that early VMM's used their page tables to map addresses in the virtual machine's memory into addresses in the real machine's memory. For example, virtual memory address A' might be mapped into real memory address A". However, processes running on paged virtual machines do not deal with addresses which refer directly to the virtual machine's memory the way address A does. Rather, an address A used by such a process must be mapped into an address such as A' by the page table of the virtual machine. Thus, in order to run a process on a paged virtual machine, a process generated address A must first be mapped into a virtual machine memory address A' by the virtual machine's page table, and then A' must be mapped into a real address A" by the VMM's page table. In order to carry out this double mapping efficiently, the VMM constructs a composed page table (in which virtual process address A is mapped into real address A") and executes with this map controlling the address translation hardware. When the VMM transfers a page out of memory, it must first change its own page table and then recompute the composed map. Similarly, if the privileged software nucleus changes the virtual machine's page table, the VMM must be notified so that the composed map can be recomputed. This second consideration poses some difficulties. Since the virtual machine's page tables are stored in ordinary (virtual) memory locations, instructions which reference the tables are not necessarily trapped by the VMM. Thus changes could theoretically go undetected by the VMM. 
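The double mapping just described, in which a process address A is taken to a virtual-machine address A' by the virtual machine's page table and A' is taken to a real address A'' by the VMM's page table, amounts to a simple table composition. The dictionary representation below is an illustrative assumption, not the CP-67 data structure.

# Hypothetical sketch of composing the virtual machine's page table (A -> A') with the
# VMM's page table (A' -> A'') into a single table the address translation hardware can use.
def compose(vm_page_table, vmm_page_table):
    composed = {}
    for vpage, vm_frame in vm_page_table.items():     # A -> A'
        if vm_frame in vmm_page_table:                # A' -> A''; absent pages fault to the VMM
            composed[vpage] = vmm_page_table[vm_frame]
    return composed

vm_pt  = {0: 5, 1: 2}        # maintained by the privileged nucleus inside the virtual machine
vmm_pt = {5: 40, 2: 13}      # maintained by the VMM
print(compose(vm_pt, vmm_pt))    # {0: 40, 1: 13}

The composed table must be recomputed whenever either constituent table changes, which is exactly the detection problem taken up next.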
However, any change to a page table must in practice be followed by an instruction to clear the associative memory since the processor might otherwise use an out of date associative memory entry in a subsequent reference. Fortunately, the instruction which clears the associative memory will cause a trap when executed in non-privileged mode and thus allow the VMM to recompute the composed page table. Therefore, as long as the privileged software nucleus is correctly written, the operation of a virtual machine will be identical to the operation of the corresponding real machine. If the privileged software nucleus fails to clear the associative memory after changing a page table entry, proper operation cannot be guaranteed in either case. I TYPE II VIRTUAL MACHINES VMM's which run on an extended machine interface are generally easier to construct than VMM's which run directly on a bare machine. This is because Type II VMM's can utilize the extended machine's instruction repertoire when carrying out complex operations such as I I O. In addition, the VMM can take advantage of the extended machine's memory management facilitie~ (which may include paging) and its file system. Thus Type II virtual machines offer a number of implementation advantages. Processor state mapping Type II virtual machines have been constructed for the extended machine interface projected by the UMMPS operating system. 21 UMMPS runs on an IBM 360 Model 67, and thus the VMM which runs under UMMPS is able to utilize the same processor state mapping that CP-67 does. However, the instruction in the VMM which initiates operation of a virtual machine must inform UMMPS that subsequent privileged instruction traps generated by the virtual machine should not be acted on directly but should instead be referred to the VMM for appropriate interpretation. Memory mapping The instruction which initiates operation of a virtual machine also instructs UMMPS to alter its page tables to reflect the fact that a new address space has been activated. The memory of the virtual machine created by the VMM is required to occupy a contiguous region beginning at a known address in the VMM's address space. Thus UMMPS creates the page table for the virtual machine simply by deleting certain entries from the page table used for the VMM and then subtracting a constant from the remaining virtual addresses so the new address space begins at zero. If the virtual machine being created is paged, it is then necessary to compose the resulting table with the page table that appears in the memory of the virtual machine. This latter operation is completely analogous to the creation of paged virtual machines under CP67. I/O mapping I/O operations in the original UMMPS Type II virtual machine were handled by having UMMPS transfer control to the VMM after trapping the instruction which initiated channel program execution. The VMM translated the channel program into its address space by applying the virtual machine's page map if necessary and then adding a constant relocation factor to each address. After performing this translation the VMM called upon UMMPS to execute the channel program. UMMPS then absolutized the channel program and initiated its execution. In addition to the overhead it entailed, this mapping procedure made it impossible for the virtual machine to execute a self-modifying channel program. A recent modification to the UMMPS virtual machine monitor has been able to alleviate this situation. 
22 This modification involves positioning the virtual machine's memory in real The Evol ution of Virtual Machine Architecture memory so that the virtual and real address of each location is identical. This eliminates the need for channel program absolutization and thus improves efficiency while at the same time making self-modification of channel programs possible. One of the difficulties that had to be overcome when making this change to the VMM was that the real counterparts of certain virtual machine memory locations were already being used by UMMPS. The solution that was adopted was to simply re-write the virtual machine's privileged software nucleus so that most of these locations were never used. A more detailed discussion of this point is provided by Srodawa and Bates. 22 Parmelee ll describes a similar modification that has been made to CP -67. SINGLE STATE ARCHITECTURE One of the more unusual approaches to the problem of creating virtual machine architectures is based on the idea of eliminating privileged state entirely.lo.n The proponents of this approach argue that the primary-and in fact only essential-function of privileged state is to protect the processor's address mapping mechanism. If the address mapping mechanism were removed from the basic machine interface and thereby made totally invisible to software, there would be no need to protect the mechanism and therefore no need for privileged state. In these single state architectures all software visible addresses are relative addresses and the mechanism for translating these relative addresses to absolute addresses always concealed. That is, each software level operates in an address space of some given size and structure but has no way of determining whether its addresses correspond literally to real memory addresses or whether they are mapped in some fashion. Since all addressing including 110 is done in this relative context, there is really no need for software to know absolute address and thus no generality is lost. The central feature of this architecture is the manner in which software level N creates the address space of software level N + 1. Basically, level N allocates a portion of its own address space for use by level N + 1. The location of the address space of level N +1 is thus specified in terms of its relative address within level N. After defining the new address space, the level N software executes a special transfer of control instruction which changes the address mapping mechanism so that addresses will be translated relative to the new address space. At the same time, control passes to some location within that new space. Note that this special instruction need not be privileged since by its nature it may only allocate a subset of the resources it already has access to. Thus it cannot cause interference with superior levels. Level N can protect itself from level N + 1 by defining the address space of level N + 1 so that it does not encompass any information which level N wishes to keep secure. In particular, the 297 address map that level N sets up for level N + 1 is excluded from level N+1's address space. When an addressing fault occurs, the architecture traps back to the next lower level and adjusts the address map accordingly. Thus the system must retain a complete catalog of all active maps and must be able to compose and decompose them when necessary. This is relatively easy to do when only relocation/bounds maps are permitted 15 but more difficult when segmentation is involved. 
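The nested address spaces of the single state architecture compose naturally when only relocation/bounds maps are involved. The sketch below is illustrative only: the class, its methods, and the fault behavior are assumptions made for this example, not the published proposal.

# Hypothetical sketch of single-state nesting: level N carves the address space of level
# N+1 out of its own, so an absolute address is obtained by composing relocation/bounds
# maps from the innermost level outward toward the real machine.
class Level:
    def __init__(self, base, size, parent=None):
        self.base, self.size, self.parent = base, size, parent

    def spawn(self, base, size):
        """Allocate a sub-space of this level's own space for the next level."""
        assert base + size <= self.size
        return Level(base, size, parent=self)

    def to_absolute(self, addr):
        if addr >= self.size:
            raise MemoryError("addressing fault: trapped back to the next lower level")
        if self.parent is None:
            return addr + self.base
        return self.parent.to_absolute(addr + self.base)

real  = Level(0, 1 << 20)            # the bare machine
vmm   = real.spawn(0, 512 * 1024)
guest = vmm.spawn(64 * 1024, 128 * 1024)
print(guest.to_absolute(100))        # 65636: a relative address composed through both maps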
23 Since each level sees the same bare machine interface except for a smaller address space, each level corresponds to a new virtual machine. Mapping of processor state is unnecessary, mapping of memory is defined by the level N VMM relative to its own address space and is completely invisible to level N + 1, and mapping of 110 is treated as a special case of mapping of memory. The two published reports on this architecture are essentially preliminary documents. More details have to be worked out before a complete system can be defined. THE VIRTUAL MACHINE FAULT The single state architecture discussed in the preceding section provides a highly efficient environment for the creation of recursive virtual machine systems. However, the basic machine interface associated with this architecture lacks a number of features which are useful when writing a privileged software nucleus. These features, which are present to varying degrees in several late third generation computer systems, include descriptor based memory addressing, multi-layered rings of protection and process synchronization primitives. A recent analysis24 of virtual machine architectures for these more complex systems is based on an important distinction between two different types of faults. The first type is associated with software visible features of a basic machine interface such as privilegedlnonprivileged status, address mapping tables, etc. These faults are handled by the privileged software nucleus which runs that interface. The second type of fault appears only in virtual machine systems and is generated when a process attempts to alter a resource map that the VMM is maintaining or attempts to reference a resource which is available on a virtual machine but not the real system (e.g., a virtual machine memory location that is not in real memory). These faults are handled solely by the VMM and are completely invisible to the virtual machine itself. * Since conventional architectures support only the former type of fault, conventional VMM's are forced to map both fault types onto a single mechanism. As already noted, this is done by running all virtual machine proc* Faults caused by references to unavailable real resources were not clearly identified in this paper. The distinctions being drawn here are based on a later analysis by Goldberg. 16 298 National Computer Conference, 1973 esses in non-privileged mode, directing all faults to the VMM, and having the VMM "reflect" all faults of the first type back to the privileged software nucleus of the virtual machine. An obvious improvement to this situation can be realized by creating an architecture which recognizes and supports both types of faults. A preliminary VMM design for a machine with this type of architecture has been proposed.24 The design relies on static composition of all resource maps and thus requires a trap to the VMM each time a privileged process attempts to alter a software visible map. However, the privileged/non-privileged distinction within a virtual machine is supported directly by the bare machine and a privileged process is allowed to read all software visible constructs (e.g., processor state) without generating any type of fault. The major value of this design is that it can be implemented on an existing system by making only a relatively small number of hardware,! firmware modifications. 
DYNAMIC MAP COMPOSITION-THE HARDWARE VIRTUALIZER The clear distinction between virtual machine faults (handled by the VMM) and process exceptions (handled by the privileged software nucleus of the virtual machine) first appeared in a Ph.D. thesis by Goldberg. 16 One of the essential ideas of the thesis is that the various resource maps which have to be invoked in order to run a process on a virtual machine should be automatically composed by the hardware and firmware of the system. Since map composition takes place dynamically, this proposal eliminates the need to generate a virtual machine fault each time a privileged process running on a virtual machine alters a software visible map. Thus the only cause of a virtual machine fault is a reference to a resource that is not present in a higher level virtual or real machine. The thesis contains a detailed description of a "hardware virtualizer" which performs the map composition function. It includes a description of the virtualizer itself, the supporting control mechanisms, the instructions used for recursive virtual machine creation, and the various fault handling mechanisms. These details will not be considered here since they are treated in a companion paper.25 It is interesting to note that the work on single state architecture 15.23 can be regarded as a special case of the preceding analysis in which process exceptions caused by privileged state are completely eliminated and only virtual machine faults remain. Similarly, the earlier work of Gagliardi and Goldberg24 represents another special case in which map composition is carried out statically by the VMM and where additional virtual machine faults are generated each time a component of the composite map is modified. By carefully identifying the appropriate functionality and visibility of all the maps involved in virtual machine operation. Goldberg's later analysis provides a highly valuable model for the design of virtual machine architectures and for the analysis of additional problems in this area. CONCLUSION A number of issues related to the architecture and implementation of virtual machine systems remain to be resolved. These include the design of efficient I/O control mechanisms, the development of techniques for sharing resources among independent virtual machines, and the formulation of resource allocation policies that provide efficient virtual machine operation. Many of these issues were addressed at the ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems held recently at Harvard University's Center for Research in Computing Technology. * In view of the major commitment of at least one large computer manufacturer to the support of virtual machine systems,27 the emergence of powerful new theoretical insights, and the rapidly expanding list of applications, one can confidently predict a continuing succession of virtual machine implementations and theoretical advances in the future. ACKNOWLEDGMENT ·We would like to express our appreciation to Dr. R. P. Goldberg for generously providing us with much of the source material that was used in the preparation of this paper. REFERENCES 1. Control Program-67 Cambridge Monitor System, IBM Corporation, IBM Type III Release No. 360D-05.2.005, IBM Program Information Department, Hawthorne, New York. 2. Mallach, E. G., "Emulation-A Survey," Honeywell Computer Journal, Vol. 6, No.4, 1973. 3. Allred, G., "System/370 Integrated Emulation under OS and DOS," Proceedings AFIPS &lCC, 1971. 4. 
Goldberg, R P., "Virtual Machines-Semantics and Examples," Proceedings IEEE International Computer Society Conference, Boston, Massachusetts, 1971. 5. Mallach, E. G., "On the Relationship between Emulators and Virtual Machines," Proceedings ACM SIGOPS-SIGARCH Workshop on Virtual Computer Systems, Boston, Massachusetts, 1971. 6. Buzen, J. P., Chen, P. P., Goldberg, R. P., "Virtual Machine Techniques for Improving Software Reliability," Proceedings IEEE Symposium on Computer Software Reliability, New York, 1973. 7. Attansio, C. R, "Virtual Machines and Data Security," Proceedings ACM SIGOPS-SIGARCH Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973. 8. Madnick, S. E., Donovan, J. J., "Virtual Machine Approach to Information System Security and Isolation," Proceedings ACM SIGOPS-SIGARCH Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973. '" Proceedings26 may be ordered from ACM Headquarters in :\'ew York City. The Evolution of Virtual Machine Architecture 9. Casarosa, V., "VHM-A Virtual Hardware Monitor," Proceedings ACM SIGOPS-SIGARCH Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973. 10. Bard, Y., "Performance Criteria and Measurement for a TimeSharing System," IBM Systems Journal, Vol. 10, No.3, 1971. 11. Parmelee, R. P., Preferred Virtual Machines for CP-67, IBM Cambridge Scientific Center Report No. G320-2068. 12. Adair, R., Bayles, R. U., Comeau, L. W., Creasy, R. J., A Virtual Machine System for the 360/40, IBM Cambridge Scientific Center Report No. G320-2007, 1966. 13. Meyer, R. A., Seawright, L. H., "A Virtual Machine Time-Sharing System," IBM Systems Journal, Vol. 9, No.3, 1970. 14. Goldberg, R. P., "Hardware Requirements for Virtual Computer Systems," Proceedings Hawaii International Conference on System Sciences, Honolulu, Hawaii, 1971. 15. Lauer, H. C., Snow, C. R., "Is Supervisor-State Necessary?," Proceedings ACM AICA International Computing Symposium, Venice, Italy, 1972. 16. Goldberg, R. P., Architectural Principles for Virtual Computer Systems, Ph.D. Thesis, Division of Engineering and Applied Physics, Harvard University, Cambridge, Massachusetts, 1972. 17. Sayre, D., On Virtual Systems, IBM T. J. Watson Research Laboratory, Yorktown Heights, 1966. 18. Parmelee, R. P., Peterson T. I., Tillman, C. C., Hatfield, D. J., "Virtual Storage and Virtual Machine Concepts," IBM Systems Journal, Vol. 11, No.2, 1972. 299 19. Auroux, A., Hans, C., "Le Concept de Machines VirtueUes," Revue Francaise d'Informatique et de Recherche Operationelle, Vol. 15, No. B3, 1968. 20. Goldberg, R. P., Virtual Machine Systems, MIT Lincoln Laboratory Report No. MS-2687 (also 28L-0036), Lexington, Massachusetts, 1969. 21. Hogg, J., Madderom, P., The Virtual Machine Facility-How to Fake a 360, University of British Columbia, University of Michigan Computer Center Internal :i\ote. 22. Srodawa, R. J., Bates, L. A., "An Efficient Virtual Machine Implementation" Proceedings AFIPS National Computer Conference, 1973. 23. -Lauer, H. C., Wyeth, D., "A Recursive Virtual Machine Architecture," Proceedings ACM SIGOPS-SIGARCH Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973. 24. Gagliardi, U. 0., Goldberg, R. P., "Virtualizeable Architectures," Proceedings ACM AICA International Computing Symposium, Venice, Italy, 1972. 25. Goldberg, R. P., "Architecture of Virtual Machines," Proceedings AFIPS National Computer Conference, 1973. 26. Goldberg, R. P. 
(ed.), Proceedings ACM SIGOPS-SIGARCH Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973. 27. IBM Virtual Machine Facility/370-Planning Guide, IBM Corporation, Publication No. GC20-1801-0, 1972. An efficient virtual machine implernentation* by RONALD J. SRODAWA and LEE A. BATES Wayne State University Detroit, Michigan sor while a full-duplex system possesses two central processors.) One consideration in this decision was availability -even if several hardware components fail simultaneously a filiI-duplex system can g-enerally be configured into a usable subsystem. The other consideration was utilization of the central processors-MTS was approaching saturation of its single central processor while OS /360 generally utilized very little of its central processor. As an interim measure the hardware was configured as two disjoint subsystems with one software system assigned to each subsystem. The singular advantage to this scheme was that the consolidation could be achieved with no changes to software. The goal of additional hardware availability was achieved immediately. The second goal of enhanced central processor utilization, of course, could not be attained until the two software systems could be integrated into a single system. The security of the administrative data base was still assured by the configuration of the hardware as two disjoint subsystems. The final goal was to run MTS and OS /360 within a single software system. This was not an easy task to accomplish because of the large amount of code contained in the ADS-TP system and its heavy reliance on many of the features of OS /360. Much of the code in ADS-TP interfaced at a low level with the Supervisor and Data Management services of OS/360. The terminal access method was an original package written to interface with OS/360 at the EXCP (execute channel program) level,2 The indexed sequential access method 3 (ISAM), partitioned datasets, the ability to catalog magnetic tape datasets, and conditional jobstep control were other features of OS/360 which were utilized by the administrative data base applications. Three alternatives were proposed for supporting ADSTP and MTS within a single operating system. These were: I:\,TRODUCTION Wayne State University has traditionally combined all computational facilities, for administrative as well as research and educational uses, in one central center. At times all services have been provided under a single hardware and software system. At other times administrative services have been provided on a hardware and software system distinct from that used for research and educational services. In recent past, these services were provided by two similar, but distinct hardware systems and two distinct operating systems. The administrative services were provided by an on-line teleprocessing system developed by Wayne State University running under the IBM OS/360 using MVT.l This system (called the Administrative Data Systems Teleprocessing System-ADS-TP) was run on an IBM System/360 Model 50. On the other hand, the research and educational services Were provided by the WRAP system running under IBM OS /360 using MFT. (WRAP was an antecedent to the IBM TSC for OS/360 and was developed at Wayne State University.) WRAP was run on a System/360 Model 65. Two independent hardware systems were used to assure the security of the administrative data base which was on-line to the ADSTP system. The above configuration did not provide sufficient services for research and education. 
This situation was alleviated by exchanging the System/360 Model 65 running WRAP for a System/360 Model 67 half-duplex running the Michigan Terminal System (MTS). (MTS is a timesharing system developed at the University of Michigan for the IBM System/360 Model 67. It utilizes the address translation and multi-processor features of that hardware system.) It was decided to consolidate the above hardware configuration (a Model 50 and a Model 67 half-duplex) into a single hardware system-a System/360 Model 67 fullduplex. (A half-duplex system has a single central proces- (1) MTS and OS/360 as co-equal systems. (2) Required OS/360 features installed into MTS. (3) Virtual Machine support in MTS. (A virtual ma- chine is a simulation of a hardware system upon a similar hardware system. A virtuai machine does not have the poor performance typical of simulation because most of the instruction set is interpreted by the host hardware system. The most * A preliminary version of this paper \vas presented at the limited attendance Workshop on Virtual Computer Systems, sponsored by ACM SIGARCH-SIGOPS and held at Center for Research in Computing Technology, Harvard University, Cambridge, Massachusetts, March 26-27, 1973. 301 302 National Computer Conference, 1973 well-known virtual machine implementation is CP67. 4 The third alternative was chosen for several reasons: translate the addresses contained in channel programs presented to it by tasks. (3) The Model 67 does not incorporate memory protection into the segment and page tables, but rather uses the standard System/360 scheme. (1) Least coding effort. OS/360 would be left unper- turbed and MTS changes would be minimal. (2) Software Reliability. OS/360 code (considered less reliable than MTS code) would not be incorporated into MTS. Most new code and all of as /360 would operate within a single MTS task. (3) Demonstrated feasibility. A virtual machine existed in MTS which supported OS/ 360 with some restrictions. (4) Isolation. as /360 would be isolated within a single MTS task. In addition to reliability considerations, this assures the security of the administrative data base, since input/ output devices cannot be shared between tasks in MTS. Certain performance goals were required from the resulting system. The ADS-TP system was to perform overall as if it were running on an independent System/ 360 Model 50. It would be quite easy to measure the central processor degradation against this goal. However, the measure of adequate teleprocessing response is much more subjective. Here a degradation of 30 percent as compared to response on the System/360 Model 67 halfduplex subsystem was considered the maximum acceptable degradation. Standard teleprocessing scripts were developed for the measurement of this degradation. These scripts originate from an MTS task running on one Model 67 subsystem while the system under test (either OS/360 under a virtual machine or as /360 on the real machine) is run on the other subsystem. This is accomplished by connecting teleprocessing line adapters between the two subsystems. Degradation is measured in terms of the total elapsed time to complete the scripts. IBM SYSTEM/360 MODEL 67 FEATURES I t is necessary to understand the special features of the IBM System/360 Model 67 before describing the implementation of MTS and the virtual machine. It is assumed that the reader is familiar with the basic architecture of the IBM System/360. 
5 The Model 67 features are described in detail in the IBM Functional Characteristics Manual. 6 The pertinent hardware features are: MTS ARCHITECTURE This section describes those elements of the MTS architecture which must be understood in order to read the remainder of the paper. Alexander7 . 8 contains more information on this topic. UMMPS The heart of the MTS system is UMMPS-the supervisor. Every active MTS user, terminal or batch, is serviced by a single task independent of any other task. There are several additional tasks which provide basic system services, such as spooling. The concept of a task in MTS is similar to a task in as /360 or a process in M ultics. Tasks are always executed in problem state. That is, they cannot execute the System/360 privileged instructions. Tasks are usually executed with a non-zero protection key. This allows a storage key of zero to be used to protect memory regions from change by tasks. The resident system is that portion of the MTS system which remains in real memory at all times-it is never paged. The major component of the resident system is UMMPS and its various tables. The resident system is assigned to the beginning of real memory, starting with the Prefix Storage Area (PSA). Task addres~ space A task possesses an address space consisting of nine segments. Table I describes the contents of these segments. Shared segments are protected from task programs by a storage key of zero. Private segments are generally assigned a storage key of one. Inter-task protection of private storage is achieved by the address translation mappings. Task input/output An input/ output operation is started on a device by means of an SVC instruction similar to Execute Channel TABLE I-MTS Segment Usage (1) The Model 67 possesses two level address translation-segmentation and pagination. The segment is the natural unit of memory to share, since two segment table entries may point to the same page table. (2) Channel programs must contain real memory addresses, not virtual addresses. A supervisor must Segment Attributes Contents 0-1 2 3 4-8 not paged, shared paged, shared paged, private paged, private Resident System Initial Virtual Memory Virtual Machine Memory rser program and data An Efficient Vinual :Machine Implementation Program (EXCP)in OS/360. The identification of the device (a task may own more than one) and a channel program are passed as arguments to the SVC instruction. A task may either wait for the end of the operation (synchronous) or enable a task interrupt for the end of the operation on the device (asynchronous). In either case the request is made by means of a second SVC instruction. The channel program presented to UMMPS is written in terms of virtual memory addresses. UMMPS then generates an equivalent channel program which references the data areas by their real memory addresses. Channel commands referencing data areas which straddle page boundaries may require translation into two or more chained channel commands. Task interrupts Ul\1l\1PS provides a facility through which tasks may enable interrupts which are taken when certain asynchronous conditions are sensed by UMMPS. A task may enable end of operation, attention, and PCI (ProgramControlled Interrupts) for specific input/ output devices. Tasks may also enable interrupts for abnormal events such as timer interrupts and program interrupts. 
The general processing at the time of the interrupt consists of pushing the current state of the task onto a stack and changing the state of the task so that it continues with the first instruction of the interrupt routine. The interrupt routine returns by means of an SVC instruction which restores the previous state of the task. A task may be interrupted by a different condition while still processing a previous interrupt, to any practical level.

THE MADDEROM VIRTUAL MACHINE

The first virtual machine for MTS was developed by Peter Madderom at the University of British Columbia. The particular implementation was unsuitable for the support of a production operating system. However, the basic architecture of all succeeding virtual machines has remained the same. This virtual machine was used to run restricted versions of OS/360, stand-alone direct access device initialization and restoration programs (DASDI and Dump/Restore), and test versions of MTS.

The SWAPTRA SVC instruction

... segments disappear. UMMPS then sets the general purpose registers of the task to the contents of the vector argument. The right-hand half of the task's PSW is set to the contents specified as an argument. The task is then scheduled for use of a processor in the normal fashion. These changes are depicted in Figure 1. Processing continues in this manner until the task program either issues an SVC instruction or causes a program interrupt. At that time the address space of the task reverts back to normal and the task is interrupted. The utility of this mechanism should be obvious. Segment three of a task's address space is used as an image of the virtual machine's address space. The SWAPTRA SVC instruction is executed by a program to enter the mode in which the virtual machine's program is run. An interrupt to the original program will be generated at just precisely that point where some function of the virtual machine must be simulated (e.g., the execution of a privileged instruction by the virtual machine program). Thus, the problem state instructions of the virtual machine's program will be executed by the Model 67 processor while privileged instructions will cause an interrupt to the program which invoked the virtual machine mode.

Figure 1-Task address space in NORMAL and "VM" modes (segments 2, 3, 4-8; virtual machine monitor; memory), before and after the SWAPTRA SVC

The virtual machine monitor

The virtual machine is initiated by loading and executing a program called the Virtual Machine Monitor. This program is a typical MTS program, except that it issues

The model as currently developed represents only the mapping of resources in a computer system. This machinery is sufficient to discuss virtualization of certain mini-computers, e.g., DEC PDP-8, which do not exhibit any local mapping structure. However, most current (third generation) general purpose systems have additional software-visible hardware maps. This additional structure may be as simple as supervisor/problem states (IBM System/360) and relocation-bounds registers (DEC PDP-10 and Honeywell 6000), or as complex as segmentation-paging-rings21 (Multics-Honeywell 6180). In future fourth generation systems, the maps will likely be even more complex and might feature a formal implementation of the process model22,23 in hardware-firmware. The crucial point about each of these hardware (supported) maps is that they are software visible. In certain systems, the visibility extends to non-privileged software.15 However, in all cases the maps are visible to privileged software.
Typically, an operating system on one of these machines will alter the map information before dispatching a user process. The map modification might be as simple as setting the processor mode to problem state or might be as complex as changing the process's address space by switching its segment table. In either case, however, the subsequent execution of the process and access to resources by it will be affected by the current local map. Therefore, in order to faithfully model the running of processes on a virtual machine, we must introduce the local mapping structure into the model.

We develop a model of the software-visible hardware map by defining the set of process names P = {p0, p1, ..., pj} to be the set of names addressable by a process executing on the computer system. [Process spaces are always represented as circles in the figures.] Let R = {r0, r1, ..., rn} be the set of (real) resource names, as before. Then, for the active process, we provide a way of associating process names with resource names during process execution. To this end, via all of the software visible hardware mapping structure, e.g., supervisor/problem state, segment table, etc., we define, for each moment of time, a function φ: P → R ∪ {e} such that, if x ∈ P and y ∈ R, then
φ(x) = y if y is the resource name for process name x,
φ(x) = e if x does not have a corresponding resource.
The value φ(x) = e causes an exception to occur to some exception handling procedure, presumably to a privileged procedure of the operating system on this machine. To avoid confusion with VM-faults (see above), process traps will always be called exceptions. We call the function φ a process map or φ-map. The term process map is applied regardless of what form the φ-map takes. In future (fourth generation) systems, φ might actually represent the firmware implementation of processes, although this is not necessary. The important point about φ is that unlike f, which is an inter-level map, φ is a local or intra-level map and does not cross a level of resource mapping.

Running a virtual machine: f ∘ φ

Running a process on a virtual machine means running a process on a configuration with virtual resources. Thus, if a process P = {p0, p1, ..., pj} runs on the virtual machine V = {v0, v1, ..., vm}, then φ: P → V ∪ {e} as before, with virtual resource names, V, substituted for real ones in the resource range of the map. The virtual resource names, in turn, are mapped into their real equivalents by the map f: V → R. Thus, a process name x corresponds to a real resource f(φ(x)). In general, process names are mapped into real resource names under the (composed) map f ∘ φ: P → R ∪ {t} ∪ {e}. This (composed) map can fail to take a process name into a real resource name in one of two ways. In the event of a process name exception (Figure 2a), control is given, without VMM knowledge or intervention, to the privileged software of the operating system within the same level. A virtual name fault, however, causes control to pass to a process in a lower level virtual machine, without the operating system's knowledge or intervention (Figure 2b). While this fault handling software in the VMM is not subject to an f-map since it is running on the real machine, it is subject to its φ-map just as any other process on the machine.

The φ-map may be combined with the recursive f-map result to produce the "general" composed map f1 ∘ f2 ∘ ... ∘ fn ∘ φ.
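Viewed as partial functions, the two maps and their composition can be written out directly. The following sketch merely restates the definitions above in executable form, with Python dictionaries standing in for φ and f and with strings standing in for the exception value e and the fault value t; it is an illustration of the model, not of any particular machine.

# Sketch of the two maps in the model: phi takes process names to (virtual) resource
# names or the exception e; f takes virtual resource names to real resource names or
# the fault t.  Their composition corresponds to running a process on a virtual machine.
E, T = "exception", "vm-fault"          # handled at the same level and at the level below

def compose(f, phi):
    def run(x):
        y = phi.get(x, E)
        if y == E:
            return E                    # process exception: to the VM's own operating system
        return f.get(y, T)              # VM-fault: to the VMM, invisible to the VM
    return run

phi = {"p0": "v0", "p1": "v1"}          # local (intra-level) map, software visible
f   = {"v0": "r5"}                      # inter-level map, maintained by the VMM
run = compose(f, phi)
print(run("p0"), run("p1"), run("p9"))  # r5 vm-fault exception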
Thus, for virtual machines, regardless of the level of recursion, there is only one application of the φ-map followed by n applications of an f-map. This is an important result that comes out of the formalism of distinguishing the f and φ maps. Thus, in a system with a complex φ-map but with a simple f-map, n-level recursion may be easy and inexpensive to implement.

Figure 2-Process exception and VM-fault

In the model presented, f-maps map resources of level n+1 into resources of level n. It is equally possible to define an f-map in which resources of level n+1 are mapped into process names of level n (which are then mapped into resource names of level n). This new f-map is called a Type II f-map to distinguish it from the Type I f-map which is discussed in this paper.19,24

Interpretation of the model

The model is very important for illustrating the existence of two very different basic maps in virtual machines. Previous works have not clearly distinguished the difference or isolated the maps adequately. The key point is that f and φ are two totally different maps and serve different functions. There is no a priori requirement that f or φ be of a particular form or that there be a fixed relationship between them. The φ-map is the interface seen by an executing program whereas the f-map is the interface seen by the resources. In order to add virtual machines to an existing computer system, φ is already defined and only f must be added. The choice of whether the f-map is R-B, paging, etc., depends upon how the resources of the virtual machines are to be used. In any case, the f-map must be made recursive whereas φ need not be. If a new machine is being designed, then neither φ nor f is yet defined. φ may be chosen to idealize the structures seen by the programmer whereas f may be chosen to optimize the utilization of resources in the system. Such a "decoupled" view of system design might lead to systems with φ = segmentation and f = paging.

Another intrinsic distinction between the maps is that the f-map supports levels of resource allocation between virtual machines, while the φ-map establishes layers (rings, master/slave mode) of privilege within a single virtual machine.

The virtual machine model may be used to analyze and characterize different virtual machines and architectures.19 As can be seen from Table I, none of the existing or previously proposed systems provides direct support of completely general virtual machines. CP-67 has a non-trivial φ-map but no direct hardware support of the f-map; the approach of Lauer and Snow provides direct hardware support of the f-map but has a trivial φ-map, i.e., φ = identity. Therefore, CP-67 must utilize software plus the layer relationship of the φ-map to simulate levels, whereas Lauer and Snow must utilize software plus the level relationship of the f-map to simulate layers.* The Gagliardi-Goldberg "Venice Proposal" (VP)18 supports both the layer and level relationships explicitly. However, since the VP does not directly provide hardware support for f (it supports φ and f ∘ φ), certain software intervention is still required. In the next section, we shall discuss a design, called the Hardware Virtualizer (HV), which eliminates the weaknesses of the previous designs. As can be seen from Table I, the HV is based directly upon the virtual machine model which we have developed.
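The recursion property stated at the beginning of this section, one application of φ followed by n applications of f-maps, extends the previous sketch in the obvious way. Again, the encoding below is an illustrative assumption rather than a description of any implementation.

# Sketch of n-level recursion: a process name passes through its own level's phi-map
# once, then through the f-map of every level crossed on the way to real resources.
E, T = "exception", "vm-fault"

def run(process_name, phi, f_maps):
    """f_maps is ordered from the innermost (highest-level) virtual machine downward."""
    name = phi.get(process_name, E)
    if name == E:
        return E                          # fielded by the operating system at the same level
    for f in f_maps:                      # n applications of an f-map, one per level crossed
        name = f.get(name, T)
        if name == T:
            return T                      # fielded by the VMM at the level where the fault arose
    return name

phi = {"p0": "v0"}
f2, f1 = {"v0": "u3"}, {"u3": "r8"}       # level-2 resources -> level-1 resources -> real
print(run("p0", phi, [f2, f1]))           # r8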
HARDWARE VIRTUALIZER (HV)

Despite the value of the virtual machine model in providing insight into existing and proposed systems, perhaps its most important result is that it implies a natural means of implementing virtual machines in all conventional computer systems. Since the f-map and φ-map are distinct and (possibly) different in a virtual computer system, they should be represented by independent constructs. When a process running on a virtual machine references a resource via a process name, the required real resource name should be obtained by a dynamic composition of the f-map and φ-map at execution time. Furthermore, the result should hold regardless of recursion or the particular form of f and φ. We call a hardware-firmware device which implements the above functionality a Hardware Virtualizer (HV). The HV may be conceptually thought of as either an extension to an existing system or an integral part of the design of a new one.

HV design and requirements

The design of a Hardware Virtualizer must consider the following points:
(1) The database to store f
(2) A mechanism to invoke f
(3) The mechanics of map composition
(4) The action of a VM-fault.

In the discussion which follows, we shall develop the basis for a Hardware Virtualizer design somewhat independently of the particular form of the f-map or φ-map under consideration. We assume that the φ-map is given (it could be the identity map) and we discuss the additional structure associated with the f-map. Although we shall refer to certain particular f-map structures, such as the R-B or paging form of memory map, the actual detailed examples are postponed until later.

Database to represent f

The VMM at level n must create and maintain a database which represents the f-map relationship between two adjacent levels of virtual machine resources, namely level n+1 to level n. This database must be stored so that it is invisible to the virtual machine, i.e., level n+1, including the most privileged software. Let us assume that for economic reasons18 the database must be stored in main memory. Then f may not be in the (virtual) memory of level n+1, but it must be in the (virtual) memory of level n. The only requirement on where the f-map is stored in level n memory is that it be possible for the HV to locate it by applying a deterministic algorithm from the beginning (ROOT) of level n memory.
The f-maps corresponding to different virtual machines at the same level may be identified either implicitly16 or explicitly.18 For explicit identification, we assume a Virtual Machine Table (VMTAB), the ith entry of which points to the Virtual Machine Control Block (VMCB) of virtual machine i (supported at level n). See Figure 3. The VMCB provides the representation of the f-map for the virtual machine. It contains the memory map, processor map, and I/O map. In addition, there may be other status and/or accounting data for the virtual machine.* The specific form of the VMCB is dependent upon the f-map actually used, e.g., R-B, paging, etc. Additional information possibly kept in the VMCB includes capability information for the virtual processor indicating particular features and instructions, present or absent. These capability bits include, for example, scientific instruction set or virtual machine instruction set (recursion). If recursion is supported, then the VMCB must include sufficient information to automatically restart a higher level virtual machine on a lower level VM-fault (Figure 1c).

Mechanism to invoke f

In order to invoke the f-map, the HV requires an additional register and one instruction for manipulating it. The register is the virtual machine identifier register (VMID) which contains the "tree name" of the virtual machine currently executing. The VMID is a multisyllabic register, whose syllables identify all of the f-maps which must be composed together in order to yield a real resource name. The new instruction is LVMID (load VMID) which appends a new syllable to the VMID register. This instruction should more accurately be called append VMID, but LVMID is retained for historical reasons. For the hardware virtualizer design to be successful, the VMID register (and the LVMID instruction) must have four crucial properties:18,19

(1) The VMID register absolute contents may neither be read nor written by software.
(2) The VMID of the real machine is the null identifier.
(3) Only the LVMID instruction may append syllables to the VMID.
(4) Only a VM-fault (or an instruction which terminates the operation of a virtual machine) may remove syllables from the VMID.

[Figure 3-The VMTAB and VMCB's]

* As noted earlier, mapping of I/O and other resources may be treated as a special case of the mapping of memory. Under these circumstances, the VMCB reduces to the memory map component.
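To make the VMTAB/VMCB bookkeeping concrete, the following sketch (Python, invented for illustration; the class and field names are not from the paper) models the VMID as an append-only list of syllables, with the chain of VMCBs selected by those syllables naming the f-maps to be composed. The real machine's own control block is omitted for brevity.

    class VMCB:
        """Virtual Machine Control Block: one level's f-map plus restart information."""
        def __init__(self, f_map):
            self.f_map = f_map            # e.g., a page table or an R-B pair
            self.next_syllable = None     # non-null if a higher level VM was active here
            self.vmtab = {}               # this level's VMTAB: syllable -> VMCB

    class VMIDRegister:
        """Append-only register; its absolute contents are never read or written by software."""
        def __init__(self, root_vmtab):
            self._syllables = []          # empty list = null identifier = the real machine
            self._root_vmtab = root_vmtab

        def _vmcbs(self):
            """The chain of VMCBs named by the syllables: the f-maps to be composed."""
            chain, vmtab = [], self._root_vmtab
            for s in self._syllables:
                vmcb = vmtab[s]
                chain.append(vmcb)
                vmtab = vmcb.vmtab
            return chain

        def lvmid(self, syllable):
            """LVMID: append a syllable; if that VM was itself suspended, keep descending."""
            chain = self._vmcbs()
            if chain:                                  # remember where we went, for restart
                chain[-1].next_syllable = syllable
            self._syllables.append(syllable)
            new_vmcb = self._vmcbs()[-1]
            if new_vmcb.next_syllable is not None:
                self.lvmid(new_vmcb.next_syllable)

        def vm_fault(self, levels=1):
            """Only a VM-fault may remove syllables, returning control to a lower level."""
            del self._syllables[-levels:]

    root_vmtab = {1: VMCB({"page_table": {}})}
    vmid = VMIDRegister(root_vmtab)
    vmid.lvmid(1)          # activate virtual machine 1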
Map composer

A map composer is needed to provide the dynamic composition of the φ-map (possibly identity) and the active f-maps on each access to a resource. The φ-map is known and the active f-maps, i.e., the VMCB's, are determined from the VMID register. Figure 5 sketches the map composition mechanism while avoiding implementation details related to specific choice of maps. As can be seen, the composer accepts a process name P and develops a real resource name R or causes a VM-fault.

VM-fault

A VM-fault occurs when there does not exist a valid mapping between two adjacent levels of resources. As shown in Figure 5, a VM-fault causes control to be passed to the VMM superior to the level which caused the fault. This is done by removing the appropriate number of syllables from the VMID.

Figure 4 sketches the operation of the LVMID instruction while avoiding implementation details related to a specific choice of map. In the flowchart, we use the VMID as a subscript to indicate the current control block, VMCB[VMID]. Thus SYLLABLE, the operand of the LVMID instruction, is stored in the NEXT_SYLLABLE field of the current VMCB. SYLLABLE is appended to the VMID and this new virtual machine is activated. If the NEXT_SYLLABLE field of the new VMCB is NULL, indicating that this level of machine was not previously active, then the LVMID instruction completes and execution continues within this virtual machine. Otherwise, if it is not null, the lower level was previously active and was suspended due to a VM-fault at a still lower level. In this case, execution of the LVMID instruction continues by appending the NEXT_SYLLABLE field of the new VMCB to the VMID.

[Figure 4-LVMID instruction]
[Figure 5-Map composition and VM-fault]

Performance assumptions

The performance of the Hardware Virtualizer depends strongly upon the specific f-map, ... φ(2000) = 2000. The VMID is NULL. Therefore, the resource name 2000 is a real resource and we fetch the instruction at physical location 2000, LVMID 2800. We apply the R-B map to 2800 and eventually fetch 1, which is loaded into the VMID register. Virtual machine 1 is now activated and its IC and R-B registers are loaded from VMCB1. Thus, IC is now 2100 and R-B is 1-3. Even though the memory of virtual machine 1 is 5000 words (as can be seen from its page table), the R-B register limits this active process to addressing only 3000 words. This limit was presumably set by the operating system of virtual machine 1 because the active process is a standard (non-monitor) user. Now we are in Line 2 and the IC is 2100. To apply the φ-map, we add 1000, checking that 2100 is less than 3000, and obtain φ(2100) = 3100. Since the VMID is 1, we must apply f1 to map the virtual resource 3100 to its real equivalent. The page table, pointed at by VMCB1, indicates that virtual page 3 is at location 4000. Therefore, f1(3100) = 4100 and the LOAD 128 instruction is fetched. The other sequences may be evaluated in the same manner. Line 3 illustrates a process exception to the local exception handler of VM1, Line 5 illustrates activation of recursion, and Lines 4 and 6 illustrate VM-faults to the fault handler of their respective VMMs.

It should be noted that we have added a paged f-map which is invisible to software at level n. The pre-existing R-B φ-map remains visible at level n. Thus, operating systems which are aware of the R-B map but unaware of the page map may be run on the virtual machine without any alterations. Note that the addition of an R-B f-map instead of the paged f-map is possible. This new R-B f-map would be distinct from and an addition to the existing R-B φ-map; it would also have to satisfy the recursion properties of f-maps.19 Similarly, a paged f-map added to a machine such as the IBM 360/67 would be distinct from the existing paged φ-map.
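The two-stage translation in this walkthrough can be mirrored in a few lines of code. The sketch below (Python, written for this discussion; the function names and the page-table entries other than page 3 are invented, while the 1000-word relocation, the 3000-word bound, and the page-3-at-4000 mapping follow the text above) composes an R-B φ-map with a paged f-map and raises distinct errors for process exceptions and VM-faults.

    PAGE_SIZE = 1000

    class ProcessException(Exception):   # handled inside the same virtual machine
        pass

    class VMFault(Exception):            # handled by the VMM one level below
        pass

    def phi_rb(name, relocation, bound):
        """R-B phi-map: check a process name against the bound, then relocate it."""
        if name >= bound:
            raise ProcessException(name)
        return name + relocation

    def f_paged(virtual_name, page_table):
        """Paged f-map: translate one level's virtual resource name to the level below."""
        page, offset = divmod(virtual_name, PAGE_SIZE)
        frame = page_table.get(page)
        if frame is None:
            raise VMFault(virtual_name)
        return frame + offset

    def compose(process_name, relocation, bound, f_maps):
        """f1 o ... o fn o phi: one phi application, then one f per VMID syllable."""
        name = phi_rb(process_name, relocation, bound)
        for page_table in reversed(f_maps):   # innermost (highest) level first
            name = f_paged(name, page_table)
        return name

    # Line 2 of the walkthrough: IC = 2100, R-B = 1-3 (relocate by 1000, bound 3000),
    # and VM 1's page table places virtual page 3 at real location 4000.
    vm1_pages = {0: 0, 1: 1000, 2: 2000, 3: 4000, 4: 5000}
    assert compose(2100, 1000, 3000, [vm1_pages]) == 4100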
CONCLUSION

In this paper we have developed a model which represents the addressing of resources by processes executing on a virtual machine. The model distinguishes two maps: (1) the φ-map, which maps process names into resource names, and (2) the f-map, which maps virtual resource names into real resource names. The φ-map is an intra-level map, visible to (at least) the privileged software of a given virtual machine and expressing a relationship within a single level. The f-map is an inter-level map, invisible to all software of the virtual machine and establishing a relationship between the resources of two adjacent levels of virtual machines. Thus, running a process on a virtual machine consists of running it under the composed map f ∘ φ.

Application of the model provides a description and interpretation of previous virtual machine designs. However, the most important result is the Hardware Virtualizer, which emerges as the natural implementation of the virtual machine model. The Hardware Virtualizer design handles all process exceptions directly within the executing virtual machine without software intervention. All resource faults (VM-faults) generated by a virtual machine are directed to the appropriate virtual machine monitor without the knowledge of processes on the virtual machine (regardless of the level of recursion).

A number of virtual machine problems, both theoretical and practical, must still be solved. However, the virtual machine model and the Hardware Virtualizer should provide a firm foundation for subsequent work in the field.

ACKNOWLEDGMENTS

The author would like to thank his colleagues at both MIT and Harvard for the numerous discussions about virtual machines over the years. Special thanks is due to Dr. U. O. Gagliardi, who supervised the author's Ph.D. research. In particular, it was Dr. Gagliardi who first suggested the notion of a nested virtual machine fault structure and associated virtual machine identifier (VMID) register functionality.

REFERENCES

1. Buzen, J. P., Gagliardi, U. O., "The Evolution of Virtual Machine Architecture," Proceedings AFIPS National Computer Conference, 1973.
2. Berthaud, M., Jacolin, M., Potin, P., Savary, R., "Coupling Virtual Machines and System Construction," Proceedings ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973.
3. Meyer, R. A., Seawright, L. H., "A Virtual Machine Time-Sharing System," IBM Systems Journal, Vol. 9, No. 3, 1970.
4. Parmelee, R. P., "Virtual Machines-Some Unexpected Applications," Proceedings IEEE International Computer Society Conference, Boston, Massachusetts, 1971.
5. Winett, J. M., "Virtual Machines for Developing Systems Software," Proceedings IEEE International Computer Society Conference, Boston, Massachusetts, 1971.
6. Casarosa, V., Paoli, C., "VHM-A Virtual Hardware Monitor," Proceedings ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973.
7. Keefe, D. D., "Hierarchical Control Programs for Systems Evaluation," IBM Systems Journal, Vol. 7, No. 2, 1968.
8. Buzen, J. P., Chen, P. P., Goldberg, R. P., "Virtual Machine Techniques for Improving Software Reliability," Proceedings IEEE Symposium on Computer Software Reliability, New York, 1973.
9. Attanasio, C. R., "Virtual Machines and Data Security," Proceedings ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973.
10. Madnick, S. E., Donovan, J. J., "Virtual Machine Approach to Information System Security and Isolation," Proceedings ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973.
11. Adair, R., Bayles, R. U., Comeau, L. W., Creasy, R. J., "A Virtual Machine System for the 360/40," IBM Cambridge Scientific Center Report No. G320-2007, 1966.
12. Srodawa, R. J., Bates, L. A., "An Efficient Virtual Machine Implementation," Proceedings AFIPS National Computer Conference, 1973.
13. Fuchi, K., Tanaka, H., Namago, Y., Yuba, T., "A Program Simulator by Partial Interpretation," Proceedings ACM SIGOPS Second Symposium on Operating Systems Principles, Princeton, New Jersey, 1969.
14. IBM Virtual Machine Facility/370-Planning Guide, IBM Corporation, Publication Number GC20-1801-0, 1972.
15. Goldberg, R. P., "Hardware Requirements for Virtual Machine Systems," Proceedings Hawaii International Conference on System Sciences, Honolulu, Hawaii, 1971.
16. Lauer, H. C., Snow, C. R., "Is Supervisor-State Necessary?," Proceedings ACM AICA International Computing Symposium, Venice, Italy, 1972.
17. Lauer, H. C., Wyeth, D., "A Recursive Virtual Machine Architecture," Proceedings ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems, Cambridge, Massachusetts, 1973.
18. Gagliardi, U. O., Goldberg, R. P., "Virtualizable Architectures," Proceedings ACM AICA International Computing Symposium, Venice, Italy, 1972.
19. Goldberg, R. P., Architectural Principles for Virtual Computer Systems, Ph.D. Thesis, Division of Engineering and Applied Physics, Harvard University, Cambridge, Massachusetts, 1972.
20. Goldberg, R. P., Virtual Machine Systems, MIT Lincoln Laboratory Report No. MS-2687 (also 28L-0036), Lexington, Massachusetts, 1969.
21. Schroeder, M. D., Saltzer, J. H., "A Hardware Architecture for Implementing Protection Rings," Communications of the ACM, Vol. 15, No. 3, 1972.
22. The Fourth Generation, Infotech, Maidenhead, England, 1972.
23. Liskov, B. H., "The Design of the VENUS Operating System," Communications of the ACM, Vol. 15, No. 3, 1972.
24. Goldberg, R. P., "Virtual Machines-Semantics and Examples," Proceedings IEEE International Computer Society Conference, Boston, Massachusetts, 1971.
25. Madnick, S. E., Storage Hierarchy Systems, Ph.D. Thesis, Department of Electrical Engineering, MIT, Cambridge, Massachusetts, 1972.
26. Schroeder, M. D., "Performance of the GE-645 Associative Memory while Multics is in Operation," Proceedings ACM SIGOPS Workshop on System Performance Evaluation, Cambridge, Massachusetts, 1971.

The computer-aided design environment project (COMRADE)

by THOMAS R. RHODES*

Naval Ship Research and Development Center
Bethesda, Maryland

* The views and conclusions contained in this document are those of the author and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Department of the Navy.

BACKGROUND

Since 1965, the Naval Ship Engineering Center (NAVSEC) and the Naval Ship Research and Development Center (NSRDC), sponsored by the Naval Ship Systems Command, have been actively involved in developing and using computer facilities for the design and construction of naval ships. The overall goals of this effort, known as the Computer-Aided Ship Design and Construction (CASDAC) project, have been twofold: first, to achieve significant near-term improvements in the performance of ship design and construction tasks, and second, to develop a long-term integrated CASDAC system for all phases of the ship design and construction process.1 While pursuit of the first goal has achieved notable cost savings,1 it has also produced a situation tending to delay the attainment of the second goal, that of an integrated CASDAC system.
There soon were many individual batch-oriented computer programs, designed and operated independently of each other, involving relative simplicity, low cost, and short-term benefits, all of which contrasted against the complications, high cost, and projected benefits of an interactive integrated-application system. Yet it was considered that a quantum improvement in the time and cost of performing ship design could only be realized through a coordinated and integrated approach. The real question facing the Navy was whether such an approach was technically and economically feasible.

In an attempt to demonstrate the feasibility of an integrated approach, members of the Computer-Aided Design Division (CADD) and the Computer Sciences Division (CSD) of NSRDC joined with members of the CASDAC office of NAVSEC to investigate and develop a prototype system. The phase of ship design known as concept design was chosen to be modeled as an integrated system and to provide the macrocosm for studying system requirements. Some of the characteristics favoring choice of this phase were:

• that as an initial phase of ship design, wherein basic features such as ship type and size, weapons and electronics systems, propulsion machinery and major shipboard arrangements were determined, it represented a critical phase where optimization over many alternatives could result in improved ship performance and lower development costs;
• that as an activity with a high level of creativity and analysis, in which operational requirements were transformed into a feasible engineering reality, it could be enhanced through application of computer aids;
• that as an activity with extensive interaction and information exchange among a multiplicity of engineers representing different disciplines (e.g., naval architecture, marine, mechanical, electrical, etc., engineering), it produced a dynamic atmosphere that was considered a "natural" for an integrated solution; and,
• that with relatively few engineering tasks and data requirements compared to later ship development phases, it offered a tractable situation for analysis and application of existing computer technology.

SYSTEM REQUIREMENTS

The initial study effort led the engineers and application programmers of CADD and the systems programmers and analysts of CSD along different, but complementary, paths in viewing the system requirements: one view reflecting the engineering requirements of concept design and the other, the imposed computer requirements. The engineering analysis sought to identify the various tasks, their contribution to the overall design process, their relationships, dependencies, data input and output requirements, and the role of the engineer throughout the design process. Each task was further divided to reveal the major steps and to explore the computer implementation of them. While a "building-block" approach to problem solving, involving a strong interaction between the engineer and the system, was desired, questions were raised as to how much flexibility the system should provide.
Should the designer work at a low level with "atomic" engineering functions to incrementally describe and solve his problem, or should the designer work within a pre-established set of alternatives, where major decision points have been defined, and he has only to choose and sequence among a variety of algorithmic procedures in which much of the problem structure has been imbedded within the program logic? While the "atomic" approach appeared more flexible and was conceivably more adaptable to new design situations, the latter approach was favored for the initial effort since, under this approach, it was deemed that satisfactory design results could still be obtained for a broad set of problems, many existing application programs were amenable for use, and less sophistication was required to develop and use the system.

From the analysis of the overall design process, a good indication of program and designer data requirements was obtained. The required data was organized to reflect various logical associations, producing a large and complex data structure.2 To minimize data redundancy a distinction was made between data describing the characteristics of a particular ship and data that was common to many ships. This resulted in separate ship and catalog files and in having data associations both within and between files. This separation of data was favored also because the catalog files would be less subject to change during the design process than the relatively volatile ship file, and less queuing would be required during processing. The data base was considered to be the crucial link through which information would be shared among designers and programs, and hence it represented the key element in an integrated system. The demand for information required that data management capabilities be made available both to the application program during execution and to the designer working directly with the data base at the terminal. The large and complex data structure indicated that efficient and flexible techniques would be necessary to structure, store, access, and manipulate this data, and finally, some means of controlling access to the files would be required to preserve data integrity.

In addition to the analyses of the design process engineering requirements, consideration was also given to coordinating or controlling the overall design process. Although the designer would be responsible for performing design tasks, he would do so under the general direction of the project leader or design administrator. Task assignments and final acceptance of design results would normally be under the purview of this member of the design team, which implied that the system would need to be responsive to the administrator's role by providing controls over program and file access, and reports on the design status and system usage.

While the engineering analysis was directed toward identifying the elements and requirements of an integrated ship-concept design system, the computer science effort was directed toward providing general mechanisms that were adaptable to ship design and to similar situations where it was necessary to coordinate and integrate many users, programs, and data files. The effort to produce a prototype ship-concept design system was termed the Integrated Ship Design System (ISDS) project,* while the effort to develop a framework of capabilities for constructing integrated systems was termed the Computer-Aided Design Environment (COMRADE) project.

* "The Integrated Ship Design System, Model I-System Development Plan," internal document of Computation and Mathematics Department, NSRDC, February 1971.
SYSTEM DESCRIPTION

From the analysis done during 1970, design and development of a prototype system was scheduled to begin the following year using NSRDC's CDC-6700 computer with the SCOPE 3.3 Operating System. The NSRDC computer configuration, shown in Figure 1, provides remote access from interactive graphic, conversational keyboard, and batch stations to high-speed dual processors with extensive secondary storage. Conversational teletype (TTY) and medium speed batch (200 UT) capabilities are provided through the CDC INTERCOM Time-Sharing software, while high-speed remote batch and interactive graphic communications are serviced by the CDC EXPORT-IMPORT software. This configuration appeared satisfactory for a prototype effort; however, the relatively small main memory resource and the separate job schedulers for interactive graphics and conversational keyboards were considered major problems for program development and integrated operations. To minimize difficulties in a first level effort, exclusive attention was given to use of the more available conversational keyboard (TTY) as the principal designer interface to the system rather than to the more desirable interactive graphic terminal. However, some graphic applications were planned and these would interface with the data base for test purposes.

[Figure 1-The NSRDC computer system]

From a consideration of the ISDS requirements and an examination of related efforts, such as the Integrated Civil Engineering System (ICES),3 COMRADE proceeded to design and develop three major software components: an Executive system; a Data Management system; and a Design Administration system. References 4, 5, 6, and 7 describe these components in greater detail; however, the following summary gives an indication of their functions and capabilities:

• Executive System-An interactive supervisor program, functioning under the INTERCOM Time-Sharing system, that interprets and processes command procedures. Through supporting software, known as the Procedure Definition Language (PDL), command procedures are defined as the complex sequence of computer operations necessary to perform a corresponding design task.
Operations that can be automatically performed include: printing tutorials or data at the terminal; reading data from the terminal and passing it to programs through a System Common Communication area, and vice versa; setting default data values in system common; attaching, unloading, and purging files; initiating programs for time-shared or batch execution; executing most SCOPE control statements; and altering the sequence of operations through conditional or unconditional transfers. Capabilities are also provided to control command usage through command-locks and user-keys, and to define unique "subsystems" of related commands. Through the Executive capabilities, command procedures representing design tasks can be defined in such a way as to present a problem-oriented interface to the user and to hide distracting computer operations. Since computer actions can be dynamically altered during command processing, considerable flexibility for user decision-making can be provided. Finally, during execution of an application program step, the Executive is removed from main memory, permitting larger residency by the application module. Upon termination of the module execution, control is returned to the Executive. (In Figure 2, the command definition and execution phases are figuratively shown as steps 2 and 3.)

[Figure 2-COMRADE command definition and execution phases]

• Data Management System-A library of FORTRAN-callable subroutines and user-oriented command procedures that provide data management capabilities to both the programmer and terminal user. Users may store, update, and retrieve data by name, retrieve data via queries on data attributes, cross-link data in different files through pointers, and in general define and process large, complex file and data structures. The COMRADE Data Management System (CDMS) is hierarchically structured into three levels:

(1) The foundation, or interface with the SCOPE I/O operations, consists of the direct access technique and directory processing programs. Variable length logical records can be accessed by name, where each name is "hashed" to form an index into a directory that can reference up to 516,000 data records. Previously used disk space is reallocated, and a paged, circular buffer is used to store and process data records. This set of programs, called the COMRADE Data Storage Facility (CDSF), can be used by a programmer to store and retrieve data records or blocks by name; however, at this level, he would be required to do his own internal record structuring and processing.

(2) Built on this foundation are system procedures, called the Block-Type Manipulation Facility (BTMF), that enable the data record contents to be defined, stored, retrieved, and updated by name, thus enabling the programmer to logically define and process records without regard to the internal structure.
At this level, the format of each unique block-type is defined before the data file is generated, and then, subsequently, it is used to build and process corresponding data blocks. Each block-type can be logically defined as subblocks of named data elements. Each element can be of real, integer, character, or pointer data type, and can be single- or multi-valued (i.e., array). Single-valued elements can be "inverted" and used as keys for a query-language retrieval, and pointer elements are used to form relationships among data records within one or several files. Sets of elements can be grouped together under a group name, with the group repeated as needed (i.e., repeating groups). Using the BTMF programs, the user can then process data logically by name while the system identifies and resolves the internal record structure.

(3) While the second level capabilities were provided as subroutines for programmer use, the third level consists of user-oriented command procedures for terminal use. Utilizing the BTMF routines, these programs enable terminal users to define data records; to load and update files; to retrieve data by name or through a query language; and to obtain information on file characteristics, such as size, record names, block-types and inverted attribute names and ranges. In Figure 3, the various CDMS components for file definition and processing are shown.

[Figure 3-File definition and processing using COMRADE data management system (BTMF - Block-Type Manipulation Facility; CDSF - COMRADE Data Storage Facility)]

• Design Administration System-A set of command procedures and program capabilities to assist the design project leader or administrator to:
• identify valid subsystem users and assign appropriate command and file access keys;
• identify subsystem files and assign appropriate file locks and passwords;
• selectively monitor and receive reports on subsystem activities, such as names of subsystem users, dates and times, commands used, significant events, estimated cost of processing, etc.
Additional functions are provided to allow programs to dynamically attach and unload files during execution, and to prepare and cleanup necessary files during subsystem sign-on and sign-off.
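To give a rough picture of what the Data Management System's block-type definition and named retrieval buy the programmer, the sketch below is an illustrative Python model only; the interface is invented for this discussion and is not the actual BTMF subroutine library. A block type declares its named elements once, data blocks are then stored and retrieved by name, and single-valued elements serve as inverted keys for query-style retrieval.

    class BlockType:
        """Defines the named elements of one kind of data block (a 'block-type')."""
        def __init__(self, name, elements):
            self.name = name
            self.elements = elements        # element name -> 'real' | 'integer' | 'character' | 'pointer'

    class DataFile:
        """Stores data blocks by name and keeps inverted lists for query retrieval."""
        def __init__(self):
            self.blocks = {}                # block name -> {element: value}
            self.inverted = {}              # (element, value) -> set of block names

        def store(self, block_type, block_name, values):
            self.blocks[block_name] = values
            for element, value in values.items():
                if element in block_type.elements:
                    self.inverted.setdefault((element, value), set()).add(block_name)

        def retrieve(self, block_name, element):
            return self.blocks[block_name][element]

        def query(self, element, value):
            """Query-language style retrieval on an inverted attribute."""
            return self.inverted.get((element, value), set())

    generator_type = BlockType("GENERATOR", {"KW": "integer", "COMPARTMENT": "character"})
    f = DataFile()
    f.store(generator_type, "SSG-1", {"KW": 750, "COMPARTMENT": "3-78-2-E"})
    assert f.retrieve("SSG-1", "KW") == 750
    assert f.query("KW", 750) == {"SSG-1"}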
STATUS

In the Spring of 1972, testing of the described COMRADE software began, using a selected set of ship design programs and data necessary to verify operational capabilities and to demonstrate the formation of an ISDS. Various interactive design commands were implemented, together with ship and catalog data files of limited size, for evaluation purposes. Figure 4 illustrates the functional components involved in the system development effort. During testing, engineers and ship designers who saw and used these capabilities, and who were not involved with implementation, generally approved the system interface and the capabilities provided. Correspondingly, subsystem developers found the COMRADE capabilities to be convenient and necessary, but not always sufficient, thus providing feedback for corrections and further development.

While the current ISDS effort is directed toward constructing a set of design commands and data files sufficient to engage in actual ship concept design studies, COMRADE efforts have been concentrated on evaluating performance, "tuning" components for more efficient operation, documenting existing work, and planning major enhancements. For example, work planned for the coming year includes:

• developing an interface between the Executive and the interactive graphics terminal for a balanced system environment;
• developing a report generator facility to conveniently retrieve and format data file information;
• developing a PERT-like facility to conveniently define, schedule, and monitor a subsystem's activity; and,
• considering application of computer networks for a computer-aided design environment.

While enhancements and improvements are planned, application of COMRADE capabilities to other areas of ship design, engineering, logistics, etc., will also be investigated.8 For example, planning is under way to develop a Computer-Aided Piping Design and Construction (CAPDAC) system which will integrate shipyard planning, design, and fabrication activities related to piping systems.*

SUMMARY

In response to the Navy requirements for an integrated ship design system, the Computer-Aided Design Environment project has developed an initial set of general software capabilities, not merely limited to ship design, that provide a framework for assembling and coordinating programs, data, and their users into an integrated subsystem. The three major COMRADE components are: the Executive System, an interactive program operating under the INTERCOM time-sharing system of the CDC-6700 computer at NSRDC, which can process a complex and varying sequence of computer operations in response to user-defined commands; the Data Management System, a direct access capability to process large complex file and data structures via subroutines or terminal commands; and the Design Administration System, a set of subroutines and terminal commands used to control subsystem and file access, and to optionally monitor and report on selected information, such as user names, date and time, commands used, cost estimates, etc., during subsystem operations.

These capabilities have been applied to several prototype application systems, most notably the Integrated Ship Design System, and several other application systems are being planned. While the COMRADE mechanisms have been shown to work, they are merely "tools" in constructing integrated systems and therefore depend on careful system planning and judicious use by subsystem developers to achieve an effective man-machine system. Many other factors, such as the performance and capabilities of the computer system, the application of software engineering techniques to modular program construction, the organization of data files and data communication regions for programs and users, and the efficiency of program elements, are all particularly significant in determining the appearance and performance of an integrated system. The COMRADE capabilities, in conjunction with the ISDS project, have demonstrated the technical feasibility of constructing a convenient and effective computer tool that will provide guidelines and direction for continuing and similar efforts toward achieving a completely integrated CASDAC system.

* Sheridan, H., "Formulation of a Computerized Piping System for Naval Ships," internal technical note, Computation and Mathematics Department, NSRDC, June 1971.
REFERENCES

1. Naval Ship Systems Command Technical Development Plan-Computer-Aided Ship Design and Construction, Naval Ship Systems Command, Washington, D.C., February 1970.
2. Thomson, B., "Plex Data Structure for Integrated Ship Design," Presented at the 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies.
3. Roos, D., ICES System Design, second edition, M.I.T. Press, 1967.
4. Tinker, R., Avrunin, I., "The COMRADE Executive System," Presented at the 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies.
5. Willner, S., Bandurski, A., Gorham, W., and Wallace, M., "COMRADE Data Management System," Presented at the 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies.
6. Bandurski, A., Wallace, M., "COMRADE Data Management System-Storage and Retrieval Techniques," Presented at the 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies.
7. Chernick, M., "COMRADE Design Administration System," Presented at the 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies.
8. Brainin, J., "Use of COMRADE in Engineering Design," Presented at the 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies.

Use of COMRADE in engineering design

by JACK BRAININ*

Naval Ship Research and Development Center
Bethesda, Maryland

* The views and conclusions contained in this document are those of the author and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Department of the Navy.

INTRODUCTION

The Naval Ship Research and Development Center began formal work in computer-aided design in 1965. The initial tasks undertaken were the development of individual batch application programs which were little more than the computerization of manual design methods. The programs developed included those shown in Figure 1. These programs were used by those people acquainted with them, and many programs existed in various agencies in a number of versions. New users had to determine the version most suitable for their purpose by individual contact and then had to prepare the necessary data inputs, often in a rather laborious manner. Occasionally, users linked a number of batch programs together to form a suite of programs. Such suites could deal with modest segments of a technical problem. Existing or independently developed suites would deal with other segments of the problem. The interfaces between suites were very difficult, unwieldy, and sometimes impossible. The resulting system was inflexible, running time tended to be excessive, and no user-directed interaction was possible. Generally, computer-aided design lacked what might be called an overall strategy.

[Figure 1-Areas for which CAD computer programs have been developed, including feasibility studies for destroyers, submarines and auxiliary ships; structures; ship propulsion system design; steam power plant; piping systems; electrical; electronics; interactive graphics; and construction scheduling]

OVERVIEW OF INTEGRATED SYSTEMS
Integrated systems provide computer-aided design with an overall strategy and greatly reduce the time required to execute a design. Such systems permit data transfer between various functional design disciplines and provide the engineer with the means to use the computer as a productive tool without the requirement that he also become a programmer. The integrated systems approach facilitates the work of designers and managers by:

a. permitting communication between users via a preformatted design file;
b. providing tools for the project leader to maintain control over the programs to be used and the personnel to be employed, and by permitting him to set up target dates for tasks and generate performance reports via the computer;
c. permitting the exchange of data between programs via computer files which are automatically created for the user in any integrated suite of programs;
d. permitting tasks to be initiated by engineers using simple language statements at computer terminals;
e. providing program instructions and program descriptions to engineers directly from the teletype terminal;
f. permitting engineers to exercise their disciplines without having to acquire programmer expertise;
g. demanding programs to be standardized for inclusion in the system and hence inducing homogeneity into routine processes; and
h. aiding in the building of compatible suites of programs.

HISTORY OF INTEGRATED SYSTEMS/COMRADE

A number of integrated systems have been envisioned over the last five years. In the Navy, these systems included an Integrated Ship Design System (ISDS) for the preliminary design of naval ships and a Ship Integrated System (SIS) for detail ship design and ship construction. Associated with the SIS are a number of functional subsystems dealing with:

electrical/electronics
hull structure
ship arrangements
piping
heating, ventilation and air conditioning
stores and replenishment
document generation and control

The Integrated Ship Design System for preliminary design has been under development at the Naval Ship Research and Development Center since 1969 and it provided a focal point for COMRADE1 in the real world. The development of the Ship Integrated System is part of the engineering development plan under the Computer-Aided Design and Construction Project within the Department of the Navy.

In addition to the use of COMRADE for the integrated systems noted in the foregoing, the COMRADE ideas have provoked discussion and investigation for use by:

a. The National Aeronautics and Space Administration's Integrated Preliminary Aerospace Design System.2 The feasibility of this system is currently being investigated by two aerospace contractors.
b. The Navy's Hydrofoil Program Office. This office has determined to use COMRADE for its Hydrofoil Design and Analysis System3 now being initiated.

COMRADE is capable of application to the design and construction of more or less any complex manufactured product.

COMPONENTS OF AN INTEGRATED SYSTEM

The development of an integrated system may be broken down into three major areas of development: System Software Development; File Development; and Application Program Development. The System Software Development effort, in the case of COMRADE, has been subdivided into an executive system,4 a data management system,5 and a design administration system.6 The executive, among other things, provides an interface between the engineering user at a remote terminal and the computer system. The data management system supports the transfer of data between users, programs and files, and provides a common data handling capability. The design administration system logs system usage, generates usage reports and specifies and controls access to the engineering system, its commands and files.

The file development effort may be thought of as being broken into a catalog file and a design file. The catalog would contain standard engineering data which may be used in the construction of a number of products. In contrast, a design file will contain data pertaining specifically to the particular product being designed. While the System Software Development effort may be applied to a number of product lines, the application program development is product dependent.
For example, an application program to fair the lines on a ship would not be the same program that would fair the lines on an aerospace vehicle. Similarly, a ship design file would differ from an aerospace design file.

APPLICATION OF COMRADE

Figure 2 cites the application of the COMRADE System Development effort to the case of a building design. For greater generality a building was chosen, rather than a ship, to further illustrate the potential application of the system. In this instance, the design file is a BUILDING design file and contains building design data.

[Figure 2-Integrated building design]

An engineering user sitting at a teletype enters a simple English language command (e.g., BUILDING) to identify that he would like to design a building. A BUILDING subsystem of COMRADE is automatically initiated if the COMRADE design administration system recognizes the terminal user as a valid user of the subsystem BUILDING. The procedure definition language of COMRADE then issues a tutorial to the terminal user which queries: "WHAT BUILDING?" The engineer responds to the question with an English language statement specifying the building type and name (e.g., skyscraper 123, private dwelling 102, or bank 149). The design administration system compares a list of valid users of each building design with the name of the terminal user. As an illustration, assume the user enters the command BANK1. If the user is permitted to work on BANK1, there will be a match and the user will be permitted to proceed into the selection of a design procedure. A user may have access privileges to BANK1 but not to BANK2 or BANK3. If a user enters BANK2 and has not been authorized to operate on BANK2, a diagnostic message will be returned to the user informing him, "YOU ARE NOT AUTHORIZED TO USE BANK2." This provides another level of access control which prevents people who may be working on the system, but are not working on this particular bank, from gaining access to this bank's files.
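The layered checks just described (valid subsystem user, then valid user of the particular design) can be pictured with a small sketch. The Python below is an illustrative model only; the table of user privileges and the function name are invented and do not reflect the actual COMRADE design administration implementation.

    # Privileges issued by the design administrator (hypothetical data).
    users = {"SMITH": {"subsystems": {"BUILDING"}, "designs": {"BANK1"}}}

    def check_access(user, subsystem, design):
        """Return a diagnostic message, or None if the user may proceed."""
        u = users.get(user)
        if u is None or subsystem not in u["subsystems"]:
            return "YOU ARE NOT AUTHORIZED TO USE " + subsystem
        if design not in u["designs"]:
            return "YOU ARE NOT AUTHORIZED TO USE " + design
        return None

    assert check_access("SMITH", "BUILDING", "BANK1") is None
    assert check_access("SMITH", "BUILDING", "BANK2") == "YOU ARE NOT AUTHORIZED TO USE BANK2"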
Upon approval of the design administration system, the user is accepted as a valid user of the BUILDING subsystem and the BANK1 design. However, this still does not permit the user to arbitrarily use any command within the system (not all users of the BANK subsystem will be able to access the details of the alarm system), nor does it permit him to gain access to all of the files within the system, as each command and file has associated with it a command access key or a file access key which must be matched by the user input.

The tasks required to design a bank are stored in a library of application programs (structural details, security systems, power and lighting systems, ventilation, furniture and vault arrangements, etc.). The user selects and enters an English language command, from the library, which is the name of the design procedure he would like to execute. For illustrative purposes, he may enter ELEC PWR DIST to indicate that he is designing an electrical power distribution system. Upon approval of the design administration system, the user will be accepted by the system as a valid user of the named command procedure (ELEC PWR DIST). The user is offered a choice of prompting messages, either complete or abbreviated. The user's response is dependent on his familiarity with the program. A new user will normally select the complete tutorial option and an experienced user will normally select the abbreviated option. A convention has been established that the user's response is selected from the choices enclosed in parentheses.

The execution of the design procedure is an interactive, iterative procedure involving a dialogue between an engineer at the terminal and the tutorial directions which are typed step-by-step by the teletype to the user, who no longer needs to be a computer expert but rather a reasonable engineer with good engineering judgment. These tutorials, which are "human-readable," are produced by the Procedure Definition Language of the COMRADE executive. During the terminal session the user is given choices as to the source of the input data. Data may be input from a catalog file (which would contain data such as motors, generators, cables and their corresponding impedances), a design file (data pertaining to the BANK being designed, such as the physical space available to locate a vault), a card deck, an old file, or from the user at the terminal (who would enter data such as the path and length of cables). The user of a given design need not manually input all required data if this data is already within the system as a result of a previously executed design procedure.

Figure 3 illustrates a hypothetical user-system interaction for the design procedure ELEC PWR DIS. The terminal user merely responds to the tutorials by providing the underlined values. The user begins by entering the design procedure name (ELEC PWR DIS) and from then on the user and system interact as shown. The ELEC PWR DIS design procedure consists of four program modules (represented as A, B, C and D in Figure 4) which may be combined in a number of ways to perform an electrical power distribution analysis. Module A calculates the impedance matrix of the input circuit; B performs a short circuit analysis; C performs a load flow analysis; and D performs a transient stability analysis.
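The way the four modules can be combined under terminal control might look something like the following sketch (Python, invented for illustration; the module names follow the description above, the two-letter selection codes follow the terminal session of Figure 3, and the analysis routines are stand-in stubs).

    def impedance_matrix(circuit):      # Module A
        return {"Z": "impedance matrix for %d elements" % len(circuit)}

    def short_circuit(model):           # Module B
        return "short circuit analysis of " + model["Z"]

    def load_flow(model):               # Module C
        return "load flow analysis of " + model["Z"]

    def transient_stability(model):     # Module D
        return "transient stability analysis of " + model["Z"]

    MODULES = {"SC": short_circuit, "LF": load_flow, "TS": transient_stability}

    def elec_pwr_dis(circuit, selections):
        """Build the impedance matrix once (A), then run the analyses the user selects."""
        model = impedance_matrix(circuit)
        return [MODULES[s](model) for s in selections if s != "END"]

    circuit = [("GEN1", 0, 1), ("GEN2", 0, 2)]      # element, leading node, trailing node
    print(elec_pwr_dis(circuit, ["SC", "LF", "END"]))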
[Figure 3-Sample terminal session for design procedure ELEC PWR DIST: the user chooses complete or abbreviated tutorials, describes a new circuit element by element from the terminal (typing DONE when the circuit is complete), and after the impedance matrix calculation selects among the (SC) short circuit, (LF) load flow, (TS) transient stability, and (END) options.]

As more and more programs are executed the design file will grow, and when it is complete it will contain a complete digital description of the building being designed.

COMMENTS AND CONCLUSIONS

[...]

Plex Data Structure for Integrated Ship Design

[Figure 1-Partial overview of ship design file structure, showing equipment connections, catalog references, equipment in compartments, and surfaces bounding compartments]

... the CL pointer,* which is part of the equipment block. This scheme consolidates catalog data for ease of maintenance and eliminates the duplication which would occur when a particular item of equipment appears many times on a ship. Other pointers represent relationships with other blocks on the SDF: The CM pointer* indicates the block of the compartment in which the equipment is located. The SW pointer* relates to a block in the Ship Work Breakdown Structure (SWBS) Branch, used for cost and weight accounting. The PR pointer is the parent pointer to the equipment block immediately above in the Equipment Branch, and the element EQ is a repeating group of pointers to equipment blocks immediately below. The NOTEPTR and CONNPTR pointers are used for association with auxiliary blocks containing, respectively, alphanumeric notes respecting the equipment, and data defining the physical connections (piping, wiring, etc.) with other equipment components. Figure 3 shows a portion of the Ship Systems Branch in more detail, illustrating the relationship between equipment blocks, catalog blocks, and notes blocks.

[Figure 2-Equipment block format: an attribute subblock (DESCR, X, Y, Z, ALPHA name of equipment, etc.) and a pointer subblock (CM to the compartment arrangement block, SW to the Ship Work Breakdown Structure block, CL to the catalog block, PR to the parent equipment block, EQ to sub-equipment blocks, NOTEPTR to a notes block, CONNPTR to a connection data block).]

[Figure 3-Data structure in ship design branch]

* See Figure 2.
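The pointer structure just described can be made concrete with a small sketch. The Python below is illustrative only: the pointer names follow the text, but the block class, the sample attributes, and the cost figure are invented. Catalog data is stored once, and a question about any installed item is answered by following its CL pointer into the catalog.

    class Block:
        """Generic SDF block: named attributes plus named pointers to other blocks."""
        def __init__(self, name, attrs=None):
            self.name = name
            self.attrs = attrs or {}
            self.ptrs = {"CL": None, "CM": None, "SW": None, "PR": None,
                         "EQ": [], "NOTEPTR": None, "CONNPTR": None}

    # Catalog block shared by every installation of this pump model (invented data).
    catalog_pump = Block("CAT-PUMP-A12", {"COST": 4200, "WEIGHT": 850})

    # One equipment block on the ship, pointing at its catalog entry and compartment.
    compartment = Block("3-78-2-E")
    pump_1 = Block("FIRE-PUMP-1", {"X": 78.0, "Y": -12.5, "Z": 21.0})
    pump_1.ptrs["CL"] = catalog_pump
    pump_1.ptrs["CM"] = compartment
    compartment.ptrs["EQ"].append(pump_1)      # arrangement block points back at its equipment

    def unit_cost(equipment):
        """Follow CL to the catalog block; the cost is stored once, not per installation."""
        return equipment.ptrs["CL"].attrs["COST"]

    assert unit_cost(pump_1) == 4200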
Ship work breakdown structure branch

The total weight, center of gravity, and moments of inertia of the finished ship determine her static displacement, trim and heel, and the dynamic characteristics affecting her maneuverability and motions in a seaway. A traditionally basic aspect of naval architecture is the meticulous accounting of the weight and location of each piece of material which constitutes the ship. All weight items on a ship are classified according to the Ship Work Breakdown Structure (SWBS) for the purposes of weight budgeting and accounting. The same SWBS organization is used for cost accounting and for cost and man-hour estimating. Figure 4 is a portion of the SWBS classification, which extends for three and sometimes four levels of hierarchy.

[Figure 4-Portion of ship work breakdown structure:
300 Electric Plant
  310 Electric Power Generation
    311 Ships Service Power Generation
      311.1 Ship Service Generators
      311.2 Ship Service/Emergency (Diesel)
      311.3 Ship Service/Emergency (Gas Turbine)
    312 Emergency Generators
    314 Power Conversion Systems]

This standard hierarchy will automatically form the upper levels of the SWBS Branch of the SDF, and engineers will be able to expand the SWBS Branch through any of the terminal third or fourth level SWBS blocks by defining additional SWBS blocks if such would be useful. Each SWBS block may contain estimates of weight and cost which have been made at its level of detail, and summaries of estimated or computed weight/cost data from SWBS blocks below it. The lowest level SWBS blocks contain pointers to equipment blocks which fall into their respective accountability. These pointers indicate the equipment block highest on the Equipment Branch such that all equipment blocks under it belong to the same SWBS group; it is thus not necessary to have pointers to every component of a system which is entirely accounted for by one SWBS block. Other SWBS data elements include the longitudinal and vertical centers of gravity relative to ship coordinates, up-and-down pointers to form the hierarchy of the SWBS branch, and a pointer to a NOTES block containing relevant alphanumeric comments.
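Since each SWBS block can hold estimates made at its own level of detail plus summaries from the blocks below it, a weight summary amounts to a recursive walk of the branch. The sketch below is a hypothetical Python rendering of that idea; the block layout and the weights are invented for illustration.

    class SWBSBlock:
        """One node of the SWBS branch: own weight estimate plus child blocks."""
        def __init__(self, code, title, own_weight=0.0, children=None):
            self.code = code
            self.title = title
            self.own_weight = own_weight        # estimate made at this level of detail
            self.children = children or []      # down-pointers forming the hierarchy

    def summary_weight(block):
        """Roll estimated weights up the branch, as a SWBS summary report would."""
        return block.own_weight + sum(summary_weight(c) for c in block.children)

    electric_plant = SWBSBlock("300", "Electric Plant", children=[
        SWBSBlock("310", "Electric Power Generation", children=[
            SWBSBlock("311", "Ships Service Power Generation", own_weight=42.0),
            SWBSBlock("312", "Emergency Generators", own_weight=6.5),
            SWBSBlock("314", "Power Conversion Systems", own_weight=3.0),
        ]),
    ])

    assert abs(summary_weight(electric_plant) - 51.5) < 1e-9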
Data to be added in the future could include . i J r03 Level fIDll13-02 ;.e:vel BRD NO 7S Surfaces branch The various blocks of the Surfaces Branch contain the descriptions of the physical surfaces which bound and subdivide the ship-the hull envelope, the decks and platforms, the bulkheads and numerous small partitions which segment the ship into its many compartments. The ship hull, i.e., the bottom and sides of the external watertight skin, is a complex three-dimensional shape. It is defined by specifying the coordinates of points on the hull lying in regularly spaced transverse planes, or stations. Arbitrary points on the hull may be obtained by double interpolation. Decks are surfaces which are horizontal or nearly horizontal. Decks are further classified as "levels" (decks in the superstructure), "platforms" (decks in the main hull which are not longitudinally continuous), and continuous decks (See Figure 5a). In general, decks have curvature in two directions. Sheer is the longitudinal curvature (Figure 5a) and camber is the transverse curvature (Figure 5b). BHD NQ 74 The Data Structurp Below Defines The Four PartlU01l Bulkhead. Shown Above. L BHD N9 Orientation Location BOUnding} BRD'a Deck Above l>i!ck BeloW 13 TRANSV. ,>I 122. { 74. 75 )3 Level :>2 Level , _14 16.5 -J 75 i13 ;6 03 Level 02 Level 76 . LONG'L LONG'L TRANSV. -16.5 162. 73 74 76 75 -' 03 Level 03 Level 02 J.eve1 ·02 T;cvcl - - \. Figure 6-Data structure for partition bulkheads Piex Data Structure for Integrated Ship Design access openings, penetrations, structural details, and weight data. The bulkhead is thus an important physical entity, but to the problem of shipboard arrangements it is important primarily as a space-bounding surface. Arrangements branch The structure of the arrangements branch enables the designer to think of the ship as a deck-oriented tree structure of spaces. 4 One such tree structure is shown in Figure 7. The first subdivision below "SHIP" always represents the platforms, levels or complete decks of the ship. Each deck, level, and platform is then further subdivided, as directed by the designer, into progressively smaller units. The smallest subdivision is the actual compartment on the ship. Each subdivision, large or small, is represented by an arrang-ement block and corresponds to -oB-€ node -of Figure 7. An arrangement may comprise a deck, a level, a platform, a superstructure house, a segment of a platform, a space, a group of spaces, or any contiguous designer-designated portion of one deck (or level or platform) of a ship. Data elements for each arrangement block include: • deck area, volume, and the name of the arrangement snIP 351 • pointers to bulkheads or hull surfaces forming the perimeter of the arrangement • pointers to the decks above and below • pointers to its component arrangement blocks and a backpointer to its parent arrangement The "undefined attribute" capability of CDMS3 provides means for storing additional attributes for specific blocks as needed. Those low level arrangement blocks representing compartments on the ship could contain summaries of component weight and requirements for electrical power, ventilation, or cooling water. The reader will realize that there is a rigid logical interdependence between the subdividing surfaces and the subdivided spaces in a ship, building, or similar entity. The data structure chosen for the Ship Design File has been designed to minimize the chance of allowing contradictory surface/arrangement data to occur. 
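To make the block-and-pointer organization concrete, the sketch below models a few SDF block types in Python. It is illustrative only: the class and field names (Block, BulkheadBlock, ArrangementBlock, and so on) are inventions for this example, not actual CDMS block definitions, and the real system stores such blocks as named, variable-length word arrays on disk rather than language objects.

    class Block:
        """Every SDF block is a named record of attribute data."""
        def __init__(self, name):
            self.name = name
            self.notes = []              # alphanumeric comments; ISDS keeps these in auxiliary NOTES blocks

    class BulkheadBlock(Block):
        """A flat, vertical bulkhead located by a single dimension and bounded by pointers."""
        def __init__(self, name, orientation, location):
            super().__init__(name)
            self.orientation = orientation       # 'LONGITUDINAL' or 'TRANSVERSE'
            self.location = location             # the single locating dimension
            self.bounding_bulkheads = []         # pointers to two bulkheads or hull surfaces
            self.deck_above = None               # pointer to the bounding deck above
            self.deck_below = None               # pointer to the bounding deck below
            self.compartments = []               # pointers to all compartments bounded by this bulkhead

    class ArrangementBlock(Block):
        """One node of the deck-oriented arrangement tree: a deck, a space, or a compartment."""
        def __init__(self, name, area=0.0, volume=0.0):
            super().__init__(name)
            self.area = area
            self.volume = volume
            self.perimeter = []                  # pointers to bounding bulkheads or hull surfaces
            self.children = []                   # component arrangement blocks
            self.parent = None                   # backpointer to the parent arrangement

        def add_child(self, child):
            child.parent = self
            self.children.append(child)

        def compartments(self):
            """Leaves of the arrangement tree are the actual compartments of the ship."""
            if not self.children:
                yield self
            for child in self.children:
                yield from child.compartments()

Because a bulkhead records only pointers to the surfaces that bound it, relocating a bounding deck or bulkhead changes the bulkhead's extent implicitly, which is the updating advantage noted above, and it is this reliance on pointers rather than duplicated geometry that limits the chance of contradictory surface/arrangement data.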
Typical accesses of ship design file An attempt was made early in the development of ISDS to use COMRADE's inverted file capability to manage the physical ship data of the Ship Design File (SDF). The inverted file structure works well on the equipment catalog files, for which it was developed, but it was soon discovered that there is a basic difference between a typical query of a catalog file and the kind of accesses characteristic of the physical ship data. In a catalog query the user asks the program to identify components which possess certain prescribed attributes. For example, he may ask for the names and manufacturers of all pumps capable of a flow of 1500 gallons per minute. The query processor must locate unknown data blocks based upon known attributes. This requires that extensive cross-reference files be maintained on each attribute to be queried. The logic involved in a typical access to physical ship data is quite different from that of an inverted file query. In this case we start with a known block in the SDF and follow particular pointers to retrieve blocks akin to the known block by a relationship which is also known. Some examples of this type of query are: • What are the bounding bulkheads and components of a particular compartment? This question is common in graphics arrangement programming. 4 . 5 e In which compartment is this component located? • In which compartments are components of this system located? • What are the compartments on this deck? • What weight groups are represented by this system? • List the cost of components in all compartments on this deck. Figure 7-Typical portion of arrangement branch The reader will note that some of the above retrievals require the processor to follow several sets of relational pointers. The last example requires the following logic: 352 National Computer Conference, 1973 1. Start with the prescribed deck. Follow the CM pointer to the arrangement block for that deck. 2. From arrangement block, follow CM pointers to next level arrangement blocks. Repeat down to bottom of Arrangement Branch; lowest blocks represent compartments. 3. From compartment arrangement blocks, follow EQ pointers to all equipment blocks within each compartment. 4. From each equipment block, follow CL pointer to catalog block for that component. Retrieve cost of component and print with name of component. The COMRADE Data Management System 3 has developed a "pointer chasing" algorithm which processes this type of query. The user must specify the start block, the sequence of the pointers to be followed, and the data to be retrieved at the end of the search. The pointer-chasing retrieval routines can be called from Fortran programs as well as by a teletype user. It will be a straightforward step to build simple summary or computational algorithms around the pointer-chasing routines. Typical of this type of program will be one for performing a weight-moment summary for the whole SWBS Branch. CONCLUSION The development of the ISDS Ship Design File is principally notable for the following points, most of which will apply to other integrated design systems: 1. The basic problem in data structure for integrated design was defined as the requirement for multirelational access of common data. 2. A clear differentiation was made between equipment catalog data and ship dependent data. 3. A plex data structure was designed in response to the above requirements, which is modelled for convenience as a cross-connected tree structure. 
It features small blocks of attribute data connected by many relational pointers. 4. The data structure is a logical model of the physical ship, whose principal entities are surfaces, spaces bounded by those surfaces, and items of equipment in the spaces. This logical structure directly serves graphic arrangement routines, and preserves the arrangement data in digital form for use by numbercrunching analysis programs. Most of this model is directly applicable to architectural design of buildings, and part of it to the design of aircraft and other vehicles. 5. A "pointer-chasing" query capability was developed to facilitate use of a data base dominated by interblock relationships. REFERENCES 1. Brainin, J., "Use of COMRADE in Engineering Design," presented at 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies. 2. Tinker, R., Avrunin, L., "The COMRADE Executive System" presented at 1973 National Computer Conference, New York, June 1973, American Federation of Information Processing Societies. 3. Willner, S., Gorham, W., Wallace, M., Bandurski, A., "The COMRADE Data Management System," presented at 1973 National Computer Conference, New York, June 1973. American Federation of Information Processing Societies. 4. Chen, R., Skall, M., and Thomson, B., "Integrated Ship Design by Interactive Graphics (lSDIG)," Proceedings oi SHARE XXXVI. 1971. 5. Operators '/ [Tsers ' Manual for the Computer Oriented Graphics Arrangement Program (COGAP), prepared by Lockheed-Georgia Company for the ~aval Ship Engineering Center, Washington, D.C., 1972. COMRADE data management system storage and retrieval techniques by ANN ELLIS BANDURSKI and MICHAEL A. WALLACE * Naval Ship Research and Development Center Bethesda, Maryland INTRODUCTION The three parts of the CDMS package are built one upon the other. The lowest-level component deals with the interface between the data and the computer environment. The package at this level provides a compact and extremely efficient capability for the dynamic storage and retrieval of variable-length data blocks. The subroutines in this portion of the package may be used directly by the programmer who seeks economy above all else, but they also constitute the foundation for the higher-level components. The package at the second level 3 provides a mechanism whereby sets of data can be organized within storage blocks; block types can be defined and names can be given to data elements within the block types. This facility allows "pointer" as well as attribute data to be defined so that data values within blocks may contribute a logical structure to the data base. A sub-routine library is provided at this level through which these features are made available to the programmer. The third component of the system 3 is provided to make the system most usable to the designer who may not be a programmer. This level provides a set of interactive command procedures. Using the system at this level, the designer can sit at a remote terminal and interact with the data base directly as he develops a design. Also included at this level are programs to initiate as well as execute data management functions. Although a user may access the data base through any of these three levels, it is the low-level component which actually stores and retrieves data. This low-level component is known as the COMRADE Data Storage Facility (CDSF). 
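Before turning to the CDSF itself, the three-level layering just described can be pictured schematically as stacked interfaces. The Python fragment below is only a sketch of that layering; every identifier in it is an illustrative stand-in, not an actual COMRADE routine or command name.

    class StorageFacility:                    # level 1: CDSF-style primitives for named, variable-length blocks
        def put_block(self, name, version, words): ...
        def get_block(self, name, version): ...

    class SchemaLevel:                        # level 2: block types, named data elements, pointer elements
        def __init__(self, storage):
            self.storage = storage
        def define_block_type(self, type_name, element_names): ...
        def store(self, type_name, block_name, **elements): ...

    class CommandLevel:                       # level 3: interactive procedures for the non-programming designer
        def __init__(self, schema):
            self.schema = schema
        def execute(self, command_line): ...  # a terminal command that ultimately calls down to the storage level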
The CDSF handles all of the underlying data base management functions, including file usage and inverted list processing as well as data block storage and retrieval. The CDSF consists of a set of FORTRAN -callable subroutines, most of which are written in FORTRAN. This component, as part of the COMRADE system, is operational on a CDC-6700 computer under the SCOPE 3.3 Operating System. The configuration through which the CDSF utilizes computer resources ensures two things to the users of COMRADE data management at all levels. First, it ensures that there will be a minimum of restrictions on The design of a data management software system involves the consolidation of a balanced set of individual techniques. Today's third generation computers provide the resources for the storage of large, complex sets of data, and for the rapid retrieval of that data. Randomaccess mass storage devices will store millions of bits of information, and central processors have instruction execution times on the scale of tenths of microseconds. But central memory space is limited and mass storage access time is not negligible. The COMRADE Data Management System was designed as a tool under the Computer-Aided Design Environment (COMRADE) software system l to manage the large quantities of data involved in the design of a complex system. The computer-aided design environment has characteristics which place demands upon a data management system and the ways in which it utilizes computer resources. 2 The computer-aided design environment is characterized, first, by an immense and volatile set of data: data brought into the design process, data sorted, calculated, and communicated during the design process, and data which constitute the end product of the design process. Another characteristic of the computer-aided design environment is the wide range of disciplines represented by the individuals who are involved in the design process and make use of the design data. In response to these characteristics, the COMRADE Data Management System (CDMS) focuses not only upon furnishing efficient means for the storage of large quantities of data, but also upon making facilities available through which data are readily accessible to the nonprogramming designer. Rather than present an extensive package which attempts to be everything to everyone, CDMS provides a three-part data management capability. The flexibility allowed by this approach permits the individual using the system to make his own trade-off decisions between ease of use and efficiency. * The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Department of the Navy. 353 354 National Computer Conference, 1973 the size and uses of a data base. It provides for the dynamic allocation of variable-length data blocks up to 2 16 -1 words each, and will handle up to 516,000 data blocks in an unstructured format on each data base file. Secondly, it ensures the efficient operation of CDMS. It lays the foundation for the rapid response which is vital to the designer working at the teletype and accessing design data. It also minimizes both core and disk space requirements by strict management of these spaces. There are three basic elements of CDSF: • Data block handling routines furnish the operations necessary for the dynamic storage, retrieval, update and deletion of data blocks on disk files. 
These routines maintain a two-level directory on each file whereby any data block may be accessed directly through its block name. To conserve disk space, a reallocation scheme is included, and a circular buffer is managed which will allow more than one data block to be in main memory at a time. The mechanisms involved in data block handling will be described in more detail shortly.
• Inverted list processing routines maintain ordered lists of the values of certain data elements and the names of the data blocks in which those values occur. These lists allow data blocks to be referenced on the basis of data values they contain without requiring a linear search of all blocks on a file. The methods used for inverted list creation, update, and retrieval are presented later.
• File use routines allow multiple random-access disk files to be established for data base use under CDMS. With these routines the programmer may transfer between files within one program. They lay the foundation for processing the cross-file data pointers which are defined and used through higher-level CDMS components. When a file is established as part of a CDMS data base, the file use routines issue a request to the operating system for a file on a permanent disk file device. Space is allocated on this file for CDSF file management tables which are maintained on the file throughout its existence. The file use routines open a file to be used and prepare its I/O interface with the operating system. They also rewrite updated file management tables onto a file when it is closed.

Now, details of the techniques utilized in the handling of data blocks and the processing of inverted lists will be presented.

DATA BLOCK HANDLING

The storage and access of the multitude of data blocks which are needed in a design data base are managed by CDSF. When a data block is stored, it is given a block name. CDSF keeps a directory of all the names of data blocks on a file and the disk addresses where those blocks may be found on the file. This makes it possible for a symbolic name rather than a numerical index to be used to access a data block during its residence on the file.

CDSF provides all of the basic data management functions to handle variable-length data blocks, while allowing them to be referenced by name. A data block may be stored on any file which has been established for data base use. All or portions of a data block's contents may be retrieved. Modification of the contents of a data block is permitted, including modification which requires increasing or decreasing the size of a block. Finally, removal of a data block from a file may be accomplished.

The access of data blocks by name is provided through a two-level directory which is maintained on each file. The first level, or main directory, is brought into main memory when a file is opened and remains there throughout the time the file is in use. The second level of this directory consists of fixed-length subdirectories which are created on the file as they are needed. Only one subdirectory is brought into core at a time to be used. The main directory contains the addresses of the subdirectories on that file; it is in the subdirectories that the disk addresses of all data blocks on the file are stored. Through the use of this expandable two-level directory, up to 516,000 data blocks can be stored on a file. Since the main directory is brought into main memory when the first data block on a file is accessed, all blocks which are subsequently referenced on that file require only two disk retrievals (one to get the subdirectory and one to get the data block).

When access to a data block is requested, its block name (48 bits; 8 characters) and its version number (a 5-bit number from 0 to 31) are specified. This block name/version number pair is put through a scatter function which transforms such pairs into uniformly distributed values. Binary digits (bits) are extracted from the resultant value to form a "key". This key is used as an index into the main directory, where the address of the appropriate subdirectory is found. The key is then used to index the subdirectory to locate the address of the data block (see Figure 1).

Figure 1-Access of data blocks by name.

It is generally found that block names which are generated by one person have a high degree of correlation. To scatter the indexes into the directory, a multiplicative congruential transform was chosen to apply to the block name/version number pairs. The block name and version number are concatenated and the resultant 53-bit value is multiplied by a 53-bit transform. The result of the multiplication (modulo 2^53) has bits extracted from it to produce the key into the main directory.

When the first data blocks are stored on a file, one subdirectory contains the entries for all blocks. These entries are divided into two groups: each entry is placed in either the upper or the lower half of the subdirectory, according to the value of the first bit in its key. When there is only one subdirectory, there is only one address in the main directory and it is not necessary to use any bits from the key to find the address of the appropriate subdirectory. When one of the halves of the subdirectory becomes filled, a new subdirectory is created into which the entries from the filled half are moved. Each of these entries is placed in the new subdirectory's upper or lower half according to the value of the second bit in its key. Now two subdirectory addresses will appear in the main directory, and the first bit in a key must be used to determine which of these addresses refers to the appropriate subdirectory.

The length of the main directory is always a power of two, so that whenever it must expand to accommodate new subdirectory addresses it simply doubles in size. When the directory doubles, the addresses already in the directory are spread throughout its new length by placing each address in two contiguous locations; the address of the new subdirectory then replaces half of one of these address pairs. Each time the directory doubles, an additional bit must be used from the key to find the appropriate subdirectory. Correspondingly, each time half of a subdirectory splits out to form a new subdirectory, the half where an entry is placed in the new subdirectory is determined by the bit in the key following the one used to determine entry locations in the previous subdirectory. Figure 2 illustrates the process by which the directory expands.

Figure 2-Directory expansion.
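In modern terms, the scatter transform feeding this doubling directory is a multiplicative hash in front of an extendible-hashing style directory. The Python sketch below imitates the scheme. The multiplier is an arbitrary odd 53-bit constant (the paper does not give the actual transform), the 6-bit character packing mirrors the CDC character size only loosely, and, to stay short, the toy directory redistributes all entries when a subdirectory overflows rather than splitting only the filled half as CDSF does.

    M = 1 << 53                            # the transform arithmetic is modulo 2**53
    MULTIPLIER = 0x1D2B4F8E6A3C97          # illustrative odd 53-bit constant, not the real one

    def pack_name(name: str) -> int:
        """Pack an 8-character block name into 48 bits (6 bits per character)."""
        value = 0
        for ch in name.ljust(8)[:8]:
            value = (value << 6) | (ord(ch) & 0x3F)
        return value

    def scatter_key(name: str, version: int, bits: int) -> int:
        """Concatenate name and version into 53 bits, multiply modulo 2**53,
        and take the requested number of high-order bits as the directory key."""
        concatenated = (pack_name(name) << 5) | (version & 0x1F)
        hashed = (concatenated * MULTIPLIER) % M
        return hashed >> (53 - bits)

    class MainDirectory:
        """Toy expandable directory: doubles and rehashes everything when a subdirectory
        overflows.  CDSF instead splits only the filled subdirectory half and patches the
        duplicated address pairs in place, but the key-addressing idea is the same."""
        CAPACITY = 4                                   # entries per subdirectory in this toy

        def __init__(self):
            self.depth = 1                             # number of key bits in use
            self.subdirs = [[] for _ in range(2)]

        def _subdir(self, name, version):
            return self.subdirs[scatter_key(name, version, self.depth)]

        def lookup(self, name, version):
            for n, v, disk_addr in self._subdir(name, version):
                if (n, v) == (name, version):
                    return disk_addr
            return None

        def store(self, name, version, disk_addr):
            self._subdir(name, version).append((name, version, disk_addr))
            while any(len(s) > self.CAPACITY for s in self.subdirs):
                self.depth += 1                        # one more key bit: the directory doubles
                entries = [e for s in self.subdirs for e in s]
                self.subdirs = [[] for _ in range(2 ** self.depth)]
                for n, v, addr in entries:
                    self._subdir(n, v).append((n, v, addr))

    # Hypothetical usage: the block name and disk address are made up for the example.
    directory = MainDirectory()
    directory.store("PUMP0001", 0, disk_addr=4096)
    assert directory.lookup("PUMP0001", 0) == 4096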
The entries are placed in the subdirectories sequentially from each end towards the middle. Additionally, the entries in each half of a subdirectory are chained together to form four lists; the two bits of the key following the bit which determines the half of the subdirectory are used to determine which of these four lists an entry is chained onto. In order to quickly locate a data block entry within a subdirectory, each subdirectory has a header which gives the address of the first entry on each of the four lists in each of its halves. This header also indicates which bit in a key should be used to determine the subdirectory half for a particular entry, and it points to the next empty location in the upper and lower halves of the subdirectory in which an entry may be placed. Figure 3 shows the arrangement of entries in a subdirectory and the contents of the header.

Figure 3-Subdirectory format example. (The header records the next free location in the lower and upper halves, the bit number in the key used to determine the subdirectory half for a data block entry, and eight list pointers; each data block entry carries the disk address of the data block, the data length, the storage block length, and a pointer to the next entry on its list.)

An entry in a subdirectory for a data block contains the transformed block name/version number, the disk address where the block is stored, the length of the data block, and the amount of disk space in which the block is stored.

Disk space reallocation

When a data block is stored on a file, it is usually written directly at the end of information on the file, using the least amount of disk space which will contain the block. When a data block is removed from a file, the allocated disk space in which it was stored may be used for another data block. A disk space reallocation table is maintained on each file to keep track of blocks of disk space which are available for reuse. This table consists of an index and up to 72 lists, each of which holds the addresses of up to 503 reallocatable disk space blocks in a particular size range. The index for the reallocation lists is brought into main memory when the file is opened. An entry is made in one of the reallocation lists each time a data block is removed from the file, and the exact size of the reallocatable space is kept in the list along with its address. An entry is made in the index if the block to be reallocated is the first one in a particular size range, and a new list is created for it.

The reallocation table is referenced each time a new block is to be stored on the file and each time a larger block of disk space is needed to store a data block whose size is being increased. The entry in the name-access subdirectory for each data block retains the exact size of the disk block in which the data is placed, so that no disk space is lost if the block is again reallocated.
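The reallocation table is essentially a set of free lists segregated by size range. The minimal sketch below assumes invented range boundaries and limits; the paper does not state how the 72 size ranges are chosen, only that each list can hold up to 503 addresses and that the exact size travels with each address.

    import bisect

    class ReallocationTable:
        """Lists of reusable disk extents, one list per size range (boundaries are illustrative)."""

        def __init__(self, range_bounds):
            self.bounds = sorted(range_bounds)        # e.g. [16, 64, 256, 1024, 4096]
            self.lists = {}                           # range index -> [(size, disk_addr), ...]

        def _range(self, size):
            return bisect.bisect_left(self.bounds, size)

        def release(self, disk_addr, size):
            """Record the space freed when a data block is removed from the file."""
            self.lists.setdefault(self._range(size), []).append((size, disk_addr))

        def acquire(self, needed):
            """Return a reusable extent at least as large as needed, or None to write at end of file."""
            for r in range(self._range(needed), len(self.bounds) + 1):
                candidates = self.lists.get(r, [])
                for i, (size, disk_addr) in enumerate(candidates):
                    if size >= needed:
                        candidates.pop(i)
                        return disk_addr, size        # caller keeps the exact size in the directory entry
            return None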
Circular buffer

Although data blocks may be created that are up to 2^16-1 words in size, it is usually not desirable to use an enormous amount of main memory space to transmit a data block to the disk file. In order to be able to handle data blocks of almost any size, the CDSF uses a circular buffer for I/O whose size is defined by the programmer. When the retrieval of a large data block is requested, the circular buffer allows one portion of the block to be brought into main memory at a time. The circular buffer will also retain portions of data blocks, or entire data blocks, until the space which they occupy in the buffer is needed for other data; this multi-block capability operates on a first-in, first-out basis. Because of this feature, it may not be necessary to access a data block through the directory each time it is requested. The contents of the circular buffer are checked for the desired block, and if part of that block is already in main memory (in the buffer), the need to read and search a subdirectory, and possibly the need to read the data block itself, is obviated.
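The buffer therefore behaves like a small first-in, first-out cache placed in front of the directory. A rough Python equivalent is sketched below; caching whole blocks keyed by name is a simplification of the word-oriented circular buffer described above, which can also hold partial blocks, and the read_from_disk callable stands in for the directory search and disk read.

    from collections import OrderedDict

    class BlockBuffer:
        """FIFO cache of recently transmitted data blocks (a simplification of the CDSF circular buffer)."""

        def __init__(self, capacity_words, read_from_disk):
            self.capacity = capacity_words
            self.read_from_disk = read_from_disk     # fallback path: subdirectory search plus disk read
            self.cache = OrderedDict()               # (name, version) -> list of words
            self.used = 0

        def get(self, name, version):
            key = (name, version)
            if key in self.cache:                    # buffer hit: no subdirectory or block read needed
                return self.cache[key]
            words = self.read_from_disk(name, version)
            self.cache[key] = words
            self.used += len(words)
            while self.used > self.capacity and len(self.cache) > 1:
                _, evicted = self.cache.popitem(last=False)   # evict the oldest resident block
                self.used -= len(evicted)
            return words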
INVERTED LIST PROCESSING

When a data base file is created under COMRADE, retrieval by block name is the most efficient type of retrieval because the file is set up with a block name directory. If the user wishes to retrieve information from the file on the basis of the values of data elements within data blocks, the file may be set up to include inverted lists to make this type of retrieval efficient also. An inverted list is a list of all of the values for one data element which occur in data blocks throughout the file, with information for each value indicating where in a data block it may be found. Such a list is ordered on the basis of the data values so that the entries for one value or a range of values may be quickly located. When the higher-level CDMS components are used to store and request data, the inverted lists are automatically created and referenced as follows: when a data block is stored which contains a data value for an inverted element, the value is also placed in the inverted list for that element; when a query requests information on those data blocks in which particular values of inverted elements occur, the inverted lists are searched and the desired information retrieved. The CDSF is responsible for building and maintaining the inverted lists, and for searching them to retrieve information to satisfy query requests.

Inverted list storage

At the time a file is defined under CDMS, one of the tables for which disk space is allocated is an inverted element directory. This directory is brought into main memory each time the file is opened. Once values have been stored on the file for an inverted element, this directory will contain the name of the element and the disk address of the first block of its inverted list. As an inverted list grows, it may require many disk storage blocks. Each of these blocks may contain up to 340 value entries, which are ordered within that block. The inverted list storage block for an element whose values occur in only a few data blocks may be accessed directly from the inverted list index. When the first storage block for an element is filled, it becomes an index for up to 510 additional storage blocks. Each time another storage block becomes filled, the entries which it contains are divided equally between itself and a new block, and a new entry is made in the index to reflect the existence of the new block and where it occurs in the set of ordered blocks. Even though there may be many storage blocks for an element, only the directory and one storage block need be accessed to locate a particular value. Figure 4 shows the relationships of the different types of inverted list blocks; in Figure 5 the format for an inverted list index is shown, and in Figure 6 the inverted value storage block format is given. The addresses of inverted list storage blocks which have been emptied by the deletion of values are kept for reallocation within the inverted list index.

Figure 4-Inverted list structure. (The inverted element directory points to an inverted list index, which in turn points to the ordered value storage blocks.)

Figure 5-Inverted list index. (A header gives the number of spare blocks, the number of active blocks, and the smallest value in the list; entries for up to 510 inverted list storage blocks each record the greatest value in the block and the address of the block; the addresses of reallocatable spare blocks are kept at the end.)

Figure 6-Inverted list storage block. (One-word entries carry the block name, version number, location information, and data value; flags B and F indicate that the low value in the block equals the high value in the previous block and that the high value in the block equals the low value in the previous block, respectively.)

Inverted list update procedure

An entry must be placed in an inverted list so that its value falls between the immediately greater and smaller values. Inverted list processing time may be minimized by a procedure for accumulating inverted list insertions and deletions and making many updates at one time; the entries are presented so that it is necessary to reference each inverted list block only once. The CDSF provides a two-part procedure for making bulk updates. First, a sequential list of additions and deletions to be made to all inverted lists is kept in the order that the specifications are given. The entries in this list for each element are linked together in inverse order, from the last one entered to the first. As this list grows, portions of it may be placed on a special disk file created for this purpose. The list will continue to grow until one of two things takes place: the end of the computer run during which it is built, or a query which uses the inverted lists. Then, when one of these things occurs, the linked list of additions and deletions for one element is extracted from the total set of specifications. This list is sorted according to element value; then the value entries are integrated into the existing portions of the list. During this process it must be remembered that the sequence in which specifications are given is significant. When the specifications for one list are sorted into order on the basis of value, the original order must be maintained so that if one specification nullifies another, the correct result will remain. Figure 7 shows the two parts of the inverted list update procedure.

Figure 7-Inverted list update procedure. (Part 1: specifications are appended to a sequential file and linked per element in inverse order. Part 2: the specifications for one element are extracted, sorted by value with their original order preserved, and merged into the element's ordered storage blocks.)

Inverted list information retrieval

An inverted list is consulted in order to locate within the data base the occurrence (or occurrences) of a particular value or range of values for an inverted element. The information kept in the inverted list for each value includes not only the block name/version number of the data block in which the value occurs, but also where within the block the value is located. Since there may be many or few occurrences of the specified value(s) within an inverted list, a disk file is created to receive entries for up to 64,897 values. The names of the data blocks which contain these values may be picked up from this file. If further information from the data blocks is required, the individual blocks may be accessed through the directory.

CONCLUSION

We have seen some of the storage and retrieval techniques which have been developed to handle large quantities of blocked data using direct retrieval and limited core storage. Capabilities for the access of data blocks by name, for inverted list processing, and for multi-file usage provide an efficient and flexible data management system. The higher-level components of CDMS, together with this foundation, provide the user a variety of capabilities from which to manage his data efficiently and conveniently.

Figure 3-COMRADE design administration-Logging and reporting system.

With growth in the number of users, the number of CPFMS files can be expected to grow correspondingly. As the number of files increases, tighter file controls will be called for: daily auditing of files may be necessary, a generalized file usage reporting system will be needed, and automatic file purging with an off-line hierarchical storage system may be needed. Each of these areas is to be addressed in the future.

COMRADE LOGGING AND REPORTING SYSTEM

Early in the planning of COMRADE it was deemed essential that a comprehensive system for recording and reporting subsystem activity be developed. Reports were to be generated for subsystem managers, for developers, and for users, for the purposes of subsystem management, monitoring, and inter-user communication. The recording of subsystem activity was to be as transparent to the user as possible. The result of this planning is the implementation of the COMRADE Logging and Reporting System (LRS); see Figure 3. The system is designed to record and report activity within given subsystems rather than within COMRADE as a whole. Reports can be tailored to the needs of individual subsystems; LRS provides the basic modules and techniques that make such reports possible.

To illustrate the necessity of tailoring the reporting of subsystem activity to the needs of the subsystem, consider the example of the ISDS subsystem. Since ISDS is a ship design subsystem, the LRS for ISDS is closely concerned with providing reports based upon activities pertinent to ship designs. Thus an ISDS design manager can easily generate a report that monitors the activity within a given ship design rather than all ISDS activity. On the other hand, within a subsystem such as PERSONNEL,† reports are generated about the activity of the entire subsystem. In PERSONNEL, unlike ISDS, there is no natural boundary such as individual design projects to limit reports.

† The PERSONNEL subsystem of COMRADE processes personnel data at NSRDC.
362 National Computer Conference, 1973 In a subsystem that has implemented logging, basic questions that can be answered by the COMRADE LRS include: Who used the subsystem? When did they use the subsystem? What commands were used? Much of this information can be and is obtained from the COMRADE Executive 2 which calls a design administration subroutine. The name of each command to be executed is placed into a file called the log. (A temporary log is created for each user at subsystem sign-on which is merged into the subsystem's permanent log when the user signs-off.) It is essential that entries placed in the log be uniquely identified for proper processing by the REPORT command. For this reason each entry contains as its first word a header word. The header word includes the clock time that the entry was placed in the log, the number of words in the entry and two numbers that uniquely identify this entry. These numbers are called primary type and secondary type. The secondary type is used strictlv to distinguish between entries of the same primary type. However, the primary type is used for an additional purpose called "selective logging and reporting." Some events can be considered more important than others. The more important an event, the lower the primary type assigned to it. Selective logging causes only events with a primary type lower or equal to a preselected primary type (called High Type to Log or HTL) to be logged. Requests to place entries into the log with primary types greater than HTL will be ignored. The HTL value may be set by a system manager by executing the proper command. The selective logging ability can be used to advantage for subsystem debugging. While a subsystem is being debugged many events can be logged. Then the subsystem implementors can use the resultant reports to closely monitor subsystem operations. Later the value of HTL can be lowered, eliminating excess logging. Selective reporting is similar to selective logging. It is controlled by a parameter (called maximum level to report or MLR) that may be specified in requesting reports. The report generator (a COMRADE command appropriately titled REPORT) will include in reports only those events whose primary types are equal to or lower than MLR. Thus, reports may comprehensively reflect system usage or may be limited to only those events considered most important. Other parameters that may be specified to the REPORT command include the site where the output is to be printed; a user name to which the report is limited; maximum number of lines to be printed; the first and last dates to be covered by the report; and in the case of subsystems where design names are significant, the design name to which the report is limited. A sample report for the ISDS subsystem is shown in Figure 4. This report was generated on January 10, 1973. Parameters specified in generating this report include first and last dates of January 2, 1973 and January 10, 1973 respectively; maximum level to report of 30; and maximum lines to be printed of 25. 
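Before looking at the sample output, note that the header-word scheme amounts to a simple filter on two thresholds, HTL at logging time and MLR at reporting time. The Python sketch below illustrates the idea only; the field layout, names, and the assumption that the first word of an entry names the user are inventions for this example, and the actual log is a packed binary file merged from per-user temporary logs.

    import time

    HIGH_TYPE_TO_LOG = 30            # HTL: events with a higher (less important) primary type are ignored

    class SubsystemLog:
        """Append-only activity log; each entry carries a header with time, length, and event types."""

        def __init__(self):
            self.entries = []

        def log(self, primary_type, secondary_type, words):
            if primary_type > HIGH_TYPE_TO_LOG:                 # selective logging
                return
            header = {"time": time.strftime("%H:%M"), "length": len(words),
                      "primary": primary_type, "secondary": secondary_type}
            self.entries.append((header, words))

        def report(self, max_level_to_report, user=None, max_lines=None):
            """Selective reporting: include only events at or below MLR, optionally for one user."""
            lines = []
            for header, words in self.entries:
                if header["primary"] > max_level_to_report:
                    continue
                if user is not None and words and words[0] != user:   # sketch: first word names the user
                    continue
                lines.append(f'{header["time"]}  {" ".join(map(str, words))}')
                if max_lines is not None and len(lines) >= max_lines:
                    lines.append("LINE COUNT EXCEEDED")
                    break
            return "\n".join(lines)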
Figure 4-Sample report
REPORT 1/10/73 - FOR PERIOD FROM 01/02/73 THRU 01/10/73; MAXIMUM LEVEL REPORTED = 30
Session 1 (design DDG3, user CAJBBRAINI, 1/2/73): ISDS ENTERED 14:08; FUEL 14:33 (CP 1.03, PP 23.66); LOGOUT 15:07 (CP 6.78, PP 244.89); ESTIMATED COST $29.50.
Session 2 (design DDG3, user CAGXDAVIS, 1/3/73): ISDS ENTERED 10:37; FUEL 10:43 (CP .94, PP 29.41); LOGOUT 11:14 (CP 6.24, PP 254.65); ESTIMATED COST $18.50.
Session 3 (design DDG3, user CAJBBRAINI, 1/4/73): ISDS ENTERED 13:02; RETRIEVAL, UPFUEL, RETRIEVAL, and LOGOUT at 13:04, 13:16, 13:18, and 13:32, with running CP times of .97, 6.12, 11.40, and 15.76 seconds and PP times of 25.22, 82.05, 155.19, and 214.22 seconds; ESTIMATED COST $15.00.
A further entry for user CAWEWILLNE and design SYSSHIP is truncated by the notation LINE COUNT EXCEEDED.

Three complete ISDS sessions are reflected in this sample. The second session, beginning on line 10 of the report, will serve as an illustration. A user known to COMRADE as CAGXDAVIS entered the ISDS subsystem on January 3, 1973 at 10:37 with a design project name of DDG3 (a ship design name). At 10:43 she executed the FUEL command. (The running CPU and PPU (peripheral processing unit) times were .94 and 29.41 seconds, respectively, at the start of FUEL execution.) At 11:14 CAGXDAVIS logged out of COMRADE. The ISDS session was estimated to cost $18.50.

SUPPORT FUNCTIONS FOR COMRADE SUBSYSTEMS

The support functions for COMRADE subsystems consist of programs and modules needed for a broad class of COMRADE subsystems, both design and non-design systems. The components, while performing widely different tasks, may together be used to achieve workable subsystems. Presently the support functions include the following components: (1) a command procedure and program for maintaining a list (file) of valid subsystem users; (2) several programs for initializing subsystem operations (subsystem sign-on); and (3) a program to terminate subsystem operations (subsystem sign-off).

The command MODUSERS allows the file of valid users for a subsystem to be displayed and modified at a terminal. The file contains the names, file access keys (FAK's), and command access keys (CAK's)2 of the subsystem users, each of which may be requested at the terminal user's option. The use of MODUSERS is limited to those COMRADE users (usually project managers) who have the proper CAK. The file of valid users is processed by one of the subsystem sign-on routines, which must verify that the user name is in the file of valid users. These keys are then available for processing by the Executive and by the CPFMS routines.

Before an exit is made from a subsystem, the subsystem sign-off routine must be executed. The purposes of this routine include copying the temporary log to the permanent log, as required for logging, and miscellaneous subsystem cleanup.

The design support system is still under development. A significant project to develop a command procedure to initialize designs within design subsystems has recently been completed; however, a corresponding command procedure to remove designs has not yet been implemented. Future areas of research and further development include file and command access conventions, explicit inter-user communications, and active project monitoring and control.

REFERENCES

1. Rhodes, T. R., "The Computer-Aided Design Environment (COMRADE) Project," presented at the 1973 National Computer Conference, New York, June 1973. American Federation of Information Processing Societies.
2. Tinker, R. W. and Avrunin, I.
L., "The COMRADE Executive System," Presented at the 1973 National Computer Conference, New York, June 1973. American Federation of Information Processing Societies. 3. Willner, S. E., Bandurski, A. E., Gorham, W. C. Jr. and Wallace, M. A.. "The COMRADE Data Management System," Presented at the 1973 National Computer Conference, New York, June 1973. American Federation of Information Processing Societies. A business data processing curriculum for community colleges by DONALD A. DAVIDSON LaGuardia Community College Long Island City, );ew York may be working in banking, some in insurance, some in retailing, and some in manufacturing, etc. These work-study jobs are developed by a dedicated cadre of cooperative education placement personnel in conjunction with the members of the data processing faculty, serving as technical liaison. Since we know the types of jobs that our students will undertake, both in their cooperative internships and also upon their graduation, we are able to tailor our curriculum and course content to the needs of the business data processing community. Because of the diversity of the marketplace, we feel that our curriculum will prepare our students to find jobs in the EDP field in all areas throughout the country. Our curriculum, as it now stands, begins with an "Introduction to Data Processing" course taken during the student's first quarter in residence at the college. This course, which is business-oriented, includes such topics as: the history of EDP; a brief introduction to the punched-card and unit-record equipment; an introduction to electronic computer theory and numbering systems; analysis and flowcharting; and programming languages. In order to "turn the students on" to computers, we utilize the interactive BASIC language. The hardware which we utilize in the introductory course is the Data General Nova 1200 with six (6) ASR33 Teletype terminals. These six terminals support five sections of about thirty students each, or roughly 150 students in our "Intro" course. The second course that we introduce the students to is called "Basic COBOL Programming." We chose COBOL because most studies in the past two years (including our own) had shown that this language is used by at least 60 percent of the EDP installations in the greater metropolitan area of New York. We use behavioral objectives in teaching our EDP courses at LaGuardia. We set up goals for each student, so that they may ascertain their own mastery of the course. Students' grades are based on the number of programs that they complete. Evaluation of the levels of attainment aids both the faculty and the cooperative education coordinators in work internship placement. During the third study quarter, we offer a course in "Advanced COBOL Programming" which covers A business data processing curriculum must, of necessity, be both dynamic and flexible. We must constantly be seeking to fulfill the needs of industry in our environs. LaGuardia Community College is located in New York City, which is probably the largest marketplace for business applications programmers in the world. Because LaGuardia is situated in the center of commerce, it was decided, when setting up the college, to make cooperative education the key thrust of the institution. The Cooperative Education Plan offers the student the opportunity to combine classroom learning with practical work experience. 
It is designed to help students determine and explore their own individual goals and, in general, to help them develop increased knowledge and skills in their major field of study, explore different career possibilities, and obtain experiences which will promote educational as well as personal growth. Built into the structure of the college, cooperative education helps keep the college in touch with developments outside of it. Identifying work internships and placing students on various jobs attunes the college to changing needs in terms of career opportunities and related curricula. LaGuardia operates on a year-round quarter system. Full-time students spend their first two quarters studying on campus and then begin to alternate offcampus internship terms with on-campus study terms. In the course of the basic two-year program, a student will take five study quarters and three work internship quarters. The paid work internships in many ways are also academic experiences because they allow the student to practice in the "real world" what they have learned in the classroom. Since the students are alternating work with study, there is a constant feedback between the students out on the work internship and the instructors in the classroom. The feedback is largely in the area of modification of course content in the data processing area, so as to encompass all of the latest innovations in the industry. We find that the students are very perceptive and wish to share the knuwledge which they gain on the job with their fellow students and, of course, with their instructors. This knowledge is generally in the areas that are unique to the applications that they are working with. Some students 365 366 National Computer Conference, 1973 advanced applications of COBOL, such as nested loops, multi -dimensional table handling, and processing of disk and tape files. We examine various types of file management techniques and the student analyzes, flowcharts, codes, debugs, and documents many interesting programs. The advanced COBOL course is taken in conjunction with a course in "Systems Design and Analysis" that further advances the student toward the goal of becoming a constructive and useful member of the data processing community. When the student returns for the fourth study quarter, he or she may take a course in "Operating Systems" and either "RPG Programming" or "Basic Assembler Language" (BAL). During the final study quarter, the stu- dent may opt for either PL/1 or FORTRAN, depending on their prospective employer's recommendations. The sequence of courses during the last three quarters is generally influenced by the cooperative employer's needs. There is a constant series of contacts being made between students, instructors, and coop employers throughout the student's stay at LaGuardia. This team effort is the fulcrum around which everything revolves. We believe that the evolutionary business data processing curriculum at LaGuardia, which is constantly being reevaluated by the very nature of the cooperative education program, could become a model for other community colleges throughout the nation. Computing at the junior/community collegePrograms and problems by HAROLD JOSEPH HIGHLAND State University Agricultural and Technical College Farmingdale, New York INTRODUCTION plan to continue their education in this field at a four-year college. This and the following papers contain different views of the same subject-"Computing Education at the Junior/ Community College." 
It is a topic that has been long neglected, swept under the rug so to speak, in computing circles. It is about time that this topic was fully aired and its vital importance recognized by all engaged in computer education, business, industry and government. There are probably more students and more teachers involved in computer education at the junior/community colleges than at any other level of education. Before proceeding, I should like to thank the participants of this panel for their enthusiastic cooperation and valuable contribution. Although they represent different parts of the country and different programs, they are all involved with junior/community college computer education, Furthermore, I should like to define, or perhaps explain, the use of the term, "academic computing in the junior/community college." It was selected, not because we need to add to the myriad of terms we already have in computer education, but because there was no term broad enough to cover all aspects of computing education at this level of higher education. • In some institutions, the term, computer science, is used but many times the courses and the level at which they are taught bear no relationship to computer science taught at a four-year college, following the guidelines of Curriculum '68 which was developed under Dr. William F. Atchison. • In other institutions, the term, data processing, is used; but here again there are extremely wide variations. Not all such programs are solely and purely business-oriented. • The term, computer technology, is likewise encountered at the junior/community college. Some of these programs are designed to educate electronic technicians; others involve the training of computer opera. tors. Still others more closely resemble computer science at the four-year college or data processing in a college of business administration. • Finally, we are beginning to encounter the term, information processing, since curriculum titles are being used at times to show that one is keeping up with the state of the art. Oftentimes, the courses and their content are far different from the program proposed by the ACM Curriculum Committee on Computer Education for Management (C3CM) for undergraduate education under the leadership of Dr. J. Daniel Couger. • Dr. Alton Ashworth of Central Texas College (Killeen, Texas) has been in charge of developing a model program for the junior/community college under an Office of Education grant. • Mr. Robert G. Bise of Orange Coast College (Costa Mesa, California) developed the prime program in computing education at the junior/community college-level, which has served as the prototype for such programs in California. • Professor Donald Davidson of LaGuardia Community College of the City University of New York (Long Island City, New York) has been instrumental in developing a work-study program for underprivileged students in a metropolis. • Dr. John Maniotes of the Calumet Campus of Purdue University (Hammond, Indiana) has had extensive experience in developing and running an integrated two-year and four-year program in computing on his campus. • Professor Charles B. 
Thompson of New York State University Agricultural and Technical College at Farmingdale (Farmingdale, New York) has been very instrumental in the development of a dual program designed not only to meet the needs of careeroriented students but also one to serve students who JUNIOR/COMMUNITY COLLEGE PROGRAMS Having served as a director of a graduate business school as well as a dean of instruction at a four-year liberal arts college, I was startled when I joined a two-year 367 368 National Computer Conference, 1973 college to develop a program in computing. The lack of uniformity in course selection, course sequence, proportion of theory and application taught, were immediately evident. Programs were being developed all over the country and often were designed to meet immediate nearterm needs or merely to be "modern." Little or no concern was given to the impact of technology and the possibility of obsolescence of the graduates from these programs. One of my associates engaged at the graduate level in computing assured me that diversity meant freedom. Yet, I see this diversity as chaos with too many charletons and humbugs performing rituals in hexidecimal in the classroom without concern for the quality of education or the future of their students, or sincere educators who cannot upgrade the quality of their educational programs because they cannot provide their administration with "authoritative, professional guidelines." Now, many years later, I find that some of the extremes have died but there is still a lack of cohesion in computing programs at the junior/community college level. Let me just touch several of these areas. Department structure and affiliation Where is computing taught at the junior I community college level? In some schools there is a separate department of computer science, data processing, computer technology; a department which sets its own curriculum and guidelines. In other institutions, computing is part of the business department; it is one more 'major' in the class of marketing, accounting or management. Still at other colleges, computing is part of the mathematics department; here we most often find that curriculum which is closest to the four-year college in computer science. Yet, the emphasis is primarily a mathematical approach without concern for computing applications in other areas. Faculty education and experience Because of the rapid growth in the number of junior / community colleges over the past several years and the increased demand for computer faculty at four-year and graduate schools, the junior I community colleges have been low man on the totem pole. Except for a core of dedicated teachers, most of those with adequate education and experience have not, until recently, been attracted to the junior I community colleges. At a level where application is emphasized at the expense of theory, we find many teachers who have never had practical experience in computing in industry, business and/ or government. Too many teach from the textbook or repeat what they have learned from their professors at four-year and graduate schools, and many of them as well have spent all their years in the ivory tower. Programs and options Depending upon the individual school, the student interested in computing may be forced into some narrow area, such as business data processing. Many of the junior / community colleges are too small to offer a broad spectrum of computing courses. 
The areas in which they train students include: • keypunch operators • computer operators • computer programmers. In some schools, the computing programs are careeroriented, and except in few cases, these students find that their two years of education is discounted if they wish to continue their education at a four-year college. In other schools, computing is computer science oriented and the student wishing to work upon graduation does not possess the necessary skills to find and hold a job. The problem of training computer operators is a critical one at the junior/community college level. Too many of the schools have inadequate computing facilities to provide a proper program in this area. Some have turned to simulators to do the job. Any of you who have had experience with most of these simulators recognize their numerous shortcomings. (l should apologize to my colleagues in the simulation area for that comment since I am editor of the A eM SIGSIM Simuletter, but in this case, I feel that it is the truth.) Other schools have turned to work-study programs or cooperative training programs wherein the students study the theory of operations in school but obtain their experience at cooperating companies in industry and business. Computer courses and sequence In this stage of computing, can anyone imagine spendmg time in a one-year course in "electric accounting machines?" Yet, there are a number of two-year schools, both public and private, that train their students in unit record equipment and spend considerable time in wiring. At the other end of the spectrum are the schools which require career-oriented students to take courses in logic circuits, Boolean algebra, and hardware specifications. In fact, until recently, there was one school which spent one half of a one semester course teaching keypunching to students who supposedly were being trained to become junior programmers. Where does a course in systems analysis fit into the curriculum? Is this one which is taught in the first semester concurrent with an introduction to data processing, or is this the capstone in which students can utilize the information they learned in all their other courses? Similarly, should the students be required to take statistics with the mathematics department and do their work with pencil and paper or even a calculator, or should they use Computing at the Junior /Community College the computer and spend less time on the mechanics and more on the concepts? Credits in computing How many credits in a two-year program should be devoted to computing? Currently, there are schools that offer a data processing "major" with as little as 12 credits in computing (and six of these are in electric accounting machines) to schools which require almost 40 credits out of a total of 62 to 64. What is the minimum number of courses and/ or credits which should be taught? And which courses? Computing facilities Many of the junior/community colleges have some computing facilities available for student use. Yet there are some offering computing programs that do not have any computing facility available for student use. One cannot but question the value of such a program. Furthermore, what type of facilities are available for which programs? Do you need the same for computer science (in the four-year sense) as you do for a career-oriented program in the business area? 
It is possible to continue examining other areas of diversity, but it should be apparent that there is a wide spectrum of programs under the heading of computing in the junior/community college.

SOME PROBLEMS TO BE ANSWERED

The two-year college, no matter what it is called (junior or community or, as has become fashionable, with either term left out), is a unique institution in education. In some cases its students vary little from those who enter a four-year institution, but in other cases, these two-year schools receive those students who cannot be admitted to the four-year colleges.

Computing languages

The number and intensity of languages studied varies greatly among the junior/community colleges. There is also great variation in which language is taught first and in what sequence languages are studied by the student. Among the languages offered by the two-year colleges are: BASIC, COBOL, FORTRAN, RPG, PL/I, APL, AL, JCL. At some schools students are required to take only one language during their entire two years, while in a few three and even four languages are taught to all students as part of the basic program. At this level of instruction, which is the simplest language to introduce to the students? Some look upon BASIC as being too much like FORTRAN and therefore too scientific, unsuitable to students going into the business field. Many schools start with FORTRAN, but in one, a year of COBOL is the prerequisite for the study of FORTRAN. A few, like my own college, start with PL/I.

Since these schools are more often under local community control as compared with four-year colleges and universities, the programs should be designed to meet community needs. But a broader view is also necessary. It is about time that we recognized that the four-year colleges in computing are almost monolithic in their programs as compared with the two-year schools. The computing community has an obligation to see that some level of competency or adequacy is set for these institutions. I am not proposing a system of accreditation but the establishment of some guidelines, fundamental curriculums to meet several purposes. Attention should also be devoted to the high level of attrition in these programs. Is it really the student's fault that they have failed? Or is it the lack of a properly sequenced curriculum, adequate teaching aids, quality of the teachers, or what?

Many teachers and administrators at the junior/community college level require some professional ammunition in attempting to get college presidents, local appropriation committees, etc., to upgrade existing equipment and programs. It is here that ACM can play a strong role, but it must be done now. In addition, a greater exchange of information among the junior colleges is necessary. An exchange of curriculums, course outlines, and booklists, and an airing of problems (how much mathematics should a student be required to take, what techniques have been used to cut attrition, what arrangements have been made for transfer of students) is essential in this area. It appears that, in light of accelerated computer technology, the limited computing facilities at the junior/community college level, and the concomitant problems, many of the two-year programs offered today will not be viable within the near future. Computing is already making inroads at the secondary school level. In some parts of the country, and this we have locally, there are special vocational educational centers for intensive training of high school students.
If they develop adequate programs for training input clerks, control clerks and computer operators (and eventually they will), what will be taught at the two-year level? Finally, to do its job effectively, the junior/community college must have better information about industry's and government's needs in the computing field. Today, the Bureau of Labor Statistics either lumps all those engaged in computing into a single category or at most separates programmers from the rest. What can be done to obtain greater insight into these needs so that more effective programs can be developed and taught at the junior/community college level? The problems are many and those who are truly interested in doing are few. Those of us within ACM should seek some dynamic structure within which to operate; now is the time to start.

The two year and four year computer technology programs at Purdue University

by JOHN MANIOTES
Purdue University
Hammond, Indiana

INTRODUCTION

There are eight kinds of educational programs in the United States which presently supply industry with the majority of its computer-EDP oriented employees. For lack of better terminology, I would classify these programs as follows:

(1) The two year Data Processing (DP) programs* which are concentrated at vocational institutions, community colleges, junior colleges and at some senior colleges and universities.
(2) The four year academic programs offered by many senior colleges and universities in the areas of Business Data Processing and Computer Science.
(3) The graduate level programs in Information Systems and Computer Science offered at some major colleges and universities.
(4) The specialized programs offered by the various computer manufacturers' education centers.
(5) The company in-house training programs.
(6) The so-called private commercial schools, many of which have been established through franchising, and which usually offer training ranging from 3 to 12 months.
(7) The private home study schools.
(8) The various high schools which offer vocational oriented training programs in EDP.

* Sometimes these programs bear the name of Computer Programming Technology, or simply Computer Technology.

Purdue University offers extensive instruction in the areas of computing and information processing ranging from the freshman level to the Ph.D. degree. Purdue University currently offers B.S., M.S., and Ph.D. degrees in Computer Science as well as A.A.S. and B.S. degrees in Computer Technology. The main difference between these two areas is that the Computer Technology program is practical and application-oriented, while the Computer Science program is theoretical and research-oriented. The Computer Technology programs are offered at Purdue's three regional campuses at Hammond, Fort Wayne, and Westville and at Indiana University's Indianapolis regional campus. Table I describes the regional campuses with respect to their location, distance from the Lafayette campus, enrollment, and principal computing equipment. The Computer Science programs are offered at Purdue's Lafayette Campus.

TABLE I-Some Statistics Regarding Purdue University and Its Regional Campuses

Institution                      Location       Distance (Miles) to   Enrollment   Principal Computing
                                                Lafayette Campus      Fall 1972    Equipment
Purdue University                Lafayette             -                26,204     CDC 6500
Purdue Regional Campuses:
  Calumet                        Hammond              100                4,880     IBM 360/22
  Fort Wayne                     Fort Wayne           115                2,794     IBM 360/22
  North Central                  Westville             80                1,354     IBM 360/22
Indiana Univ. Regional Campus    Indianapolis          60               16,938     IBM 360/44*

* The IBM 360/44 at Indianapolis is linked to a CDC 6600 at Indiana University, Bloomington, Indiana. The other three IBM 360/22's are linked to the CDC 6500 at Purdue University, Lafayette, Indiana.

THE TWO YEAR COMPUTER TECHNOLOGY PROGRAM

Currently, the two year programs at the regional campuses are designed to produce graduates in the occupational group that begins with the computer programmer in either the commercial or technical areas of programming. The regional campuses offer two options in their two year programs. These options are referred to as the Commercial option and the Technical option, respectively. However, even with the dual options the enrollment is overwhelmingly business-oriented. For this reason this section will deal primarily with the business-oriented two year Computer Technology program.

The curriculum for the two year program is divided into five areas:

(1) Data processing and computer basics and equipment
(2) Assembler and compiler languages
(3) Organization of business
(4) Business applications
(5) Supporting sciences and electives

During the first year of the program, as indicated in Appendix A, the students acquire an introduction to data processing, programming, and computers. In addition, they study such academic courses as English composition, speech fundamentals, basic and intermediate accounting principles, and data processing mathematics. In the second year, the students concentrate heavily on computer programming, programming systems, operating systems, systems analysis, and systems applications. In addition, the students continue their related course study in areas such as technical report writing, economics, and statistics.

An important point to keep in mind is that the two year program emphasizes the practical, rather than the theoretical, aspects of EDP. In addition, the program emphasizes the solution of EDP problems using the "hands on" approach to operate computers and other peripheral devices and to debug their programs. Strong emphasis is placed on "real-life" laboratory exercises which are intended to reinforce the student's knowledge of data processing techniques by requiring the student to apply them in a broad spectrum of practical applications. In addition, the "hands on" approach exposes students to a wider variety of applications and techniques than most programmers would receive in a year of on-the-job training, since most of the latter training tends to focus on one language and a relatively narrow range of applications.

In the two year Computer Technology program, the curriculum is designed to prepare a person for the entry level position of a programmer and to perform the following functions:

(1) Analyze problems initially presented by a systems analyst with respect to the type and extent of data to be processed, the method of processing to be employed, and the format and the extent of the final results.
(2) Design detailed flowcharts, decision tables, and computer programs giving the computations involved and the sequences of computer and other machine operations necessary to edit and input the data, process it, and edit and output information.
(3) Utilize the programming languages of COBOL, RPG, FORTRAN, as well as a machine and assembly language, to construct the necessary program steps, correct program errors, and determine the cause of machine stoppage.
(4) Verify the accuracy and completeness of computer programs by preparing sample data and testing it on the computer.
(5) Evaluate and modify existing programs to take into account changed requirements.
(6) Confer with technical personnel with respect to planning new or altered programs.
(7) Prepare full documentation with respect to procedures on the computer and other machines and on the content of the computer programs and their full usage.
(8) Devise more efficient methods for the solution of commercial or scientific problems.
(9) Comprehend the major concepts, types of equipment, programming systems, and operating systems related to EDP.

After successfully completing the two year Computer Technology program, students are awarded an Associate Degree. A student graduating from the program not only has studied theories and principles but has had extensive practical experience in operating and applying data processing techniques on modern computing equipment. This combination enables the graduate to step into an entry level programming job and become productive in a short period of time.

The first Computer Technology programs were initiated in 1963 at the Calumet and Indianapolis regional campuses. Course revisions and curricular changes have taken place during the past ten years in order to keep pace with the current state of the art. In addition, changes have been designed to "fine-tune" the two year programs while providing flexibility in individual courses and regional variation to meet the special needs at each community where the curriculum is taught.

Appendix A illustrates the two year program at the Purdue Calumet Campus that has undergone "fine-tuning" in order to deal with third generation computers. The curriculum in Appendix A is offered on a semester basis, and it can be used for illustrative purposes. (The sequence of courses in the two year program varies between regional campuses, and it is not the intent of this paper to debate the merits of different course sequences.) The curriculum in Appendix A reflects the following changes that have taken place during the past ten years:

(1) Many of the original programming, business, and supporting courses in Appendix A have been assigned specific names so as to become readily identifiable and to reflect their status in the curriculum.
(2) The one-time importance of unit record equipment (tab equipment) has diminished. It is no longer necessary for a viable third generation program to concentrate mainly on "board wiring" and punched card applications. Hence, the priority of unit record equipment has been considerably reduced (not eliminated) in the curriculum.
(3) The importance of first and second generation computers has also diminished. Third generation computer hardware and software concepts are stressed by the curriculum in Appendix A.
(4) File organization techniques and disk/tape programming concepts are emphasized together with input/output control systems and the functions of an operating system.
(5) Two semesters of assembly language together with the compiler languages of COBOL, RPG, and FORTRAN are also stressed, since these are the common tools that the programmer utilizes on the job.

THE FOUR YEAR COMPUTER TECHNOLOGY PROGRAM

This baccalaureate program is a two year "add on" curriculum which is open to associate degree graduates of Computer Technology or the equivalent in data processing. The program builds on the students' knowledge of computer programming acquired in the first two years, and emphasizes the practical aspects of such areas as computer systems analysis and commercial systems design. The inclusion of many elective courses enables the students to pursue areas of special interest. Graduates from this third and fourth year of study are prepared to fill a variety of positions related to data processing, computer systems, systems analysis, systems programming, and computer programming.

The objectives of an additional third and fourth year of study leading to a baccalaureate degree are summarized below:

(1) With regard to the student, the objectives of the curriculum are:
  (a) General-To prepare a graduate who is:
    1. Proficient in computing, information processing, and data management techniques;
    2. Capable of developing computer programs in a wide variety of application areas and in a number of commonly used languages;
    3. Capable of productive effort for the employer shortly after graduation;
    4. Capable of remaining current with the changing technology.
  (b) Technical Competence-To prepare a person who is knowledgeable concerning:
    1. Mathematical concepts relating to computer programming;
    2. Techniques used in the definition and solution of commercial systems problems;
    3. Computer and peripheral equipment operations, computer operating systems, and data communications;
    4. Fundamentals in the subject matter areas most closely related to computer applications.
  (c) General Education-To broaden the individual through exposure to:
    1. Humanities and social sciences;
    2. Oral and written communications;
    3. Business, management, and supervisory concepts;
    4. Elective courses directed toward further individual development.
(2) With regard to the program, the objectives are to provide a curriculum which is:
  (a) Viable and responsive to the changing technology;
  (b) Based on a two year modular structure that encompasses both the commercial and technical options of an associate degree program in Computer Technology.

The format for identifying the requirements for the baccalaureate degree differs from that normally found in a standard college or university catalog. In addition to the usual semester-by-semester plan of study, the minimum requirements in the Computer Concentration Courses are specified as indicated in Appendix B. The baccalaureate program has been offered since 1968 at the Indianapolis regional campus and is scheduled to be officially offered at the Purdue Calumet Campus during the 1973 Fall Semester.

The computer course requirements provide the students with a flexibility allowing for varied implementations. Thus, as indicated in Appendix B, one student may take course sequences emphasizing Computer Systems Analysis while another emphasizes Systems Programming. This flexible structure also allows the curriculum to remain current in a rapidly changing industry without requiring constant revision.
THE ROLES OF COMPUTER SCIENCE AND COMPUTER TECHNOLOGY*

* The author wishes to thank the Computer Technology staff and the Computer Science staff at the Indianapolis regional campus for some of their ideas and thoughts as expressed in this section.

Computer Technology training currently is provided to students at three levels; all levels stress in-depth practical experience. The two year associate degree program develops computer practitioners whose competency lies primarily in programming and secondarily in systems. The learn-by-doing technique is heavily stressed. The baccalaureate program is broader-based and affords the students an opportunity to have a business, scientific, and communications exposure. The primary goal is the development of computer professionals well versed in the current state of the art. The technologist is provided with the perspective to apply his tools through an integrated systems approach to data processing problems. The third level concerns the direct charge of providing service courses. Certain offerings for the community at large are non-credit, while credit courses are offered for other departments of the University to fill the need for computer-oriented electives in various curriculums. Each course is designed to meet the needs of the department in question.

Computer Science as a discipline is also concerned with the integrated use of the computer as a component in the overall solution to problems. Students are trained in the formulation of computer-oriented solutions peculiar to design-oriented problems. These persons have a mathematical background suited to the needs of their discipline as a framework within which to arrive at a solution. A computer scientist functions in an environment analogous to that of the theoretical mathematician, while the technologist functions in an environment analogous to that of the applied mathematician. In cases where the problem is so new and/or complex as to require new solution techniques, or substantial modifications or new applications of existing techniques, a computer scientist acts in conjunction with the discipline-oriented professional and the computer technologist in developing the problem solution. The computer scientist has the depth of mathematical training and the breadth of theoretical knowledge to effectively contribute to the decision-making process and to the feasibility of developing new techniques, such as the development of new numerical methods, algorithms, optimization techniques, simulation models, new higher-level languages, operating systems, or management information systems. In carrying out the results of such planning, the creativity of the computer scientist is his contribution to the problem solution; effective use of established methods is the contribution of the computer technologist.

In general, the computer scientist is a theorist with a broad interdisciplinary overview, while the computer technologist is a specialist in the techniques of analysis and implementation. The scientist is sought as one who can synthesize diverse information into an integrated solution approach, while the technologist is sought as a professional who can produce computer-solution results efficiently. Accordingly, Computer Science and Computer Technology are each individually responsible for their respective degree programs, their implementation and development, as well as those service courses which meet particular needs.
Therefore, even in those courses and offerings which appear similar in content, there is a difference in emphasis and orientation, reflecting the different roles of the two disciplines. The A.A.S. and B.S. programs in Computer Technology and the B.S., M.S. and Ph.D. programs in Computer Science are part of a continuum called computing and information processing.

PROBLEMS FACED BY THE COMPUTER TECHNOLOGY PROGRAMS

Summarized below are some of the problems faced by the two year and four year Computer Technology programs. Although some of these problems may be pertinent to Purdue University, others are general enough to apply to other institutions which have similar academic programs.

Staffing

A problem that the Computer Technology staff faces is the constant updating required in their field as compared to their colleagues in such fields as liberal arts or the humanities. It has been said that one's computer-EDP knowledge has a half-life of about five years, becoming obsolete because of the many new developments that are occurring in the field. Recognizing this problem, the staff has been periodically infused with new computer-oriented knowledge through attendance at summer institutes, computer manufacturers' education centers, technical meetings sponsored by professional organizations, and through various consulting assignments in industry. In addition, the campus library's selection of computer-oriented books and journals has been expanded so as to enable the staff to remain abreast of the latest developments in their field.

Another problem that has been experienced over the years concerns the difficulty in hiring experienced instructors who possess up-to-date knowledge about computers and data processing applications. One of the problems contributing to this difficulty has been the low starting salaries and fringe benefits commonly found in the teaching profession. University administrators must be constantly made aware that computer-oriented staff members have unique problems, and additional resources must be made available to compensate for these deficiencies. The demand for competent computer-oriented instructors is high, and the supply has a long way to go to catch up.

Student transfer problems

No serious problems have been experienced by the Purdue graduates of the two year programs who have transferred to the baccalaureate Computer Technology program at the Indianapolis regional campus. This is due to the fact that at all regional campuses the computer equipment is from the same manufacturer and the courses are structured essentially in the same manner. Some problems have been experienced whenever two year graduates from other schools which did not have similar computer equipment transferred into the baccalaureate program. The problems in these cases stemmed from the lack of familiarity with the operating system and the assembly language of the computer utilized in the baccalaureate program. Problems have also been experienced whenever students from private commercial schools have tried to transfer into the two year program. These types of students have been found to be weak in EDP fundamentals, flowcharting, programming logic, and documentation. In some instances these students have had to retake some of the basic computer-oriented courses before they were fully admitted to the two year program.
Evaluation of students' computer programs

Currently various methods exist to evaluate students in their computer-oriented courses using such means as written and oral exams, quizzes, homework problems, and laboratory exercises. A problem faced by instructors in such courses involves the evaluation or grading of students' computer programs, especially those programs of some complexity. Questions most often asked in this area are: What do you grade on? What constitutes a good (or bad) program? What parameters do you consider important: execution time, amount of storage utilized, or the number of unsuccessful attempts tried before completion? These are areas which bear further study and thinking by instructors, since programming is an art and not an exact science.

Instructional materials

More than 1,000 books from more than 120 publishers have been published for the computer-EDP field. Currently good textbooks, student work manuals, and visual aids exist for introductory computer or data processing courses and for computer programming courses which appear at the freshman and sophomore level of the Computer Technology program. In fact one may state that there are too many books published for these courses at these levels. Good textbooks, student work manuals, and visual aids for the junior and senior levels of the Computer Technology program are practically non-existent. The courses which require good textbooks are: Management Information Systems, Operating Systems, Commercial Systems Applications, Computer Graphics, Hybrid Computing Systems, and Data Communications. The text material for these courses usually consists of reference manuals from computer manufacturers, notes from the instructor, or handbooks oriented for experienced professionals rather than for students. More effort needs to be exerted by textbook publishers in producing student-oriented textbooks for these courses.

Computer-oriented aptitude tests

There is a need for the development of good aptitude tests to predict whether an entering student will be successful in graduating from the Computer Technology programs or whether a graduate from these programs will be successful in a programming or systems analysis job position. Our A.A.S. and B.S. Computer Technology graduates have reported that they face in many instances an aptitude test when they apply for a programming or systems position. It seems that interviewers confronted with the problem of predicting job success among applicants for these positions have come to rely heavily on aptitude tests, especially the IBM PAT. It is unfortunate that in many instances the IBM PAT score is the sole factor used to determine whether an applicant qualifies for further consideration. The IBM PAT scores are not an accurate predictor of job success. It is apparent that the computer-EDP field needs to give our psychological test developers additional qualities on which to base their tests if the tests are to perform the task of predicting job success as many employers believe they now do. In addition, further work is necessary to develop "third generation aptitude tests" in order to keep pace with the computer hardware and software presently available. Hopefully some of these tests can also be utilized as entrance examinations for computer-oriented academic programs.

Funds

As far as funds are concerned, it seems there are two bottomless pits at academic institutions: libraries and computer centers.
Adequate funds to purchase or lease modern computers and their peripheral equipment, especially I/O terminals, are a problem faced by all academic institutions, including Purdue University. Funds from the National Defense Education Act are scarce, especially for institutions such as Purdue University that once were funded from this Act in the early 60's. In addition, computer manufacturers no longer offer large educational discounts for new computer equipment as they once did in the past.

Currently, the emphasis at Purdue University is to share computer resources at all levels. At each regional campus a common third generation computer is shared for all academic, administrative, and research oriented tasks. In addition, these computers are linked to a CDC 6500 at the Lafayette campus, thereby providing computing power and storage economically and efficiently to many users at one time. Typical COBOL or FORTRAN type problems submitted by Computer Technology students are processed at a cost of 20¢ to 30¢ a program on the CDC 6500 with an "average" turn-around time of approximately 5 to 15 minutes during non-peak times.

A question often asked is: "Where will additional funds come from?" I don't think there will be any significant outlays of funds from federal and state government sources. Nor will there be any sizeable student tuition increases. Rather I expect that academic institutions will have to increase the efficiency of their computer centers and start actively looking for ways of stretching their funds, such as third party leasing.

APPENDIX A-CURRICULUM OUTLINE FOR THE TWO YEAR COMPUTER TECHNOLOGY PROGRAM AT PURDUE UNIVERSITY*

                                          Hours per Week
                                        Class  Lab  Total  Credits
First Semester
  Introduction to Data Processing         4     2     6       5
  English Composition I                   3     0     3       3
  Introductory Accounting                 3     0     3       3
  Algebra                                 3     0     3       3
  Elective                                3     0     3       3
                                         16     2    18      17

Second Semester
  Data Processing Math                    3     0     3       3
  RPG Programming                         2     2     4       3
  FORTRAN Programming                     2     2     4       3
  Fundamentals of Speech Communication    3     0     3       3
  Cost Accounting                         3     0     3       3
                                         13     4    17      15

Third Semester
  Assembly Language Programming I         3     2     5       4
  Statistical Methods                     3     0     3       3
  Systems Analysis and Design             3     0     3       3
  COBOL Programming                       2     2     4       3
  Technical Report Writing                3     0     3       3
                                         14     4    18      16

Fourth Semester
  Assembly Language Programming II        3     2     5       4
  Commercial Systems Applications         2     2     4       3
  Computer Operating Systems I            2     2     4       3
  Computer Seminar                        2     0     2       1
  Principles of Economics                 3     0     3       3
  Elective                                3     0     3       3
                                         15     6    21      17

* For the description of the courses, consult the latest edition of the "School of Technology Bulletin", Purdue University, Lafayette, Indiana.

APPENDIX B-COMPUTER TECHNOLOGY CURRICULUM OUTLINE FOR A THIRD AND FOURTH YEAR OF STUDY AT PURDUE UNIVERSITY*

The General Requirements for the baccalaureate program are:

(1) Completion of an Associate Degree in Applied Science, in Computer Technology or the equivalent.
(2) Completion of the Core Requirements, plus additional courses as required to complete a minimum of 130 semester credit hours, which includes credits earned toward the Associate Degree. The additional courses are free electives, except that not more than 9 semester credit hours may be taken in the Computer Technology Department.
(3) A minimum of 40 semester credit hours must be 300 or higher level courses.

The Core Requirements for the baccalaureate program consist of 111 semester credit hours in the following areas:

(1) General Education (61 semester credit hours)
  (a) Communications (English, Speech, Report Writing)
  (b) Social Science (Economics, Political Science, Psychology, Sociology)
  (c) Humanities (Creative Arts, History, Literature, Philosophy)
  (d) Business (Industrial Management, Industrial Supervision)
  (e) Mathematics (Including Calculus, Finite Mathematics and Statistics)
  (f) Physical Science (Biology, Chemistry, Physics)
(2) Computing Principles (29 semester credit hours)
  (a) Data Processing Basics
  (b) Assembly Languages
  (c) Compiler Languages
  (d) Computer Systems
(3) Computer Concentration Courses (21 semester credit hours)

Fifth Semester (14 class hours, 6 lab hours, 20 total hours, 17 credits)
  Data Communications
  Computer Concentration Courses (2)
  Calculus I
  Communications Elective
  Elective

Sixth Semester (14 class hours, 6 lab hours, 20 total hours, 16 credits)
  PL/I Programming
  Computer Concentration Course
  Calculus II
  Physical Science Elective
  Elective

Seventh Semester (14 class hours, 6 lab hours, 20 total hours, 16 credits)
  Computer Concentration Courses (2)
  Physical Science Elective
  Social Science Elective
  Humanities Elective

Eighth Semester (14 class hours, 4 lab hours, 18 total hours, 16 credits)
  Computer Concentration Courses (2)
  Social Science Elective
  Humanities Elective
  Electives

The Computer Concentration Courses are defined as follows: any two of the following sequences plus one additional computer-oriented course.

(1) Commercial Systems sequence
  (a) Management Information Systems I
  (b) Management Information Systems II
  (c) Financial Accounting
(2) Computer Systems Analysis sequence
  (a) Systems Analysis of Computer Applications
  (b) Computer System Planning
  (c) Design of Data Processing Systems
(3) Systems Programming sequence
  (a) Introduction to Computer Systems
  (b) Computer Operating Systems II
  (c) Systems Programming
(4) Technical Systems sequence
  (a) Numerical Methods
  (b) Topics in FORTRAN
  (c) Hybrid Computing Systems

* For the description of the courses, consult the latest edition of the "School of Technology Bulletin", Purdue University, Lafayette, Indiana.

Computing studies at Farmingdale

by CHARLES B. THOMPSON
State University, Agricultural and Technical College
Farmingdale, New York
Farmingdale Agricultural and Technical College is part of the State University System of New York. The College is one of three public two-year colleges serving Nassau and Suffolk counties. The school is located 25 miles east of New York City on the boundary line dividing these two counties. The computing program at the school is an academic program in data processing. The program was started in 1967, under the direction of Dr. Harold J. Highland. The program has about 150 day and 300 evening students enrolled. Computing support for the program is an IBM 360/30, DOS system.

The objective of the program is to equip the student with the skills and knowledge to enter the data processing field as a junior programmer. A junior programmer is defined as one who has a strong command of one language, familiarity with two others, extensive experience programming structured problems, and a general knowledge of computing and data processing systems. The overall philosophy of instruction is application. Students are assigned programming problems as a primary vehicle of learning. The "hands on" approach rapidly develops command of programming and the confidence to program or solve problems. The day student has a choice of the scientific or the commercial option of the program. The first year is a core year for a second year of concentrated study in scientific programming, FORTRAN, or commercial programming, COBOL. Upon completing the program, the student is awarded an AAS degree in data processing.

The enrolling day student is generally a first generation college student. The academic backgrounds of the students vary widely, but can be grouped into those with three or more successful years of secondary mathematics, those without, and those with some knowledge of data processing. Students in all three groups have completed the program. They have entered the field as an operator or junior programmer, or continued their studies in Computer Science, Programming and Systems, or Business. The evening students have diverse backgrounds, but almost all have some knowledge of computing, varying from operations to systems programming. These students enter the program to advance their careers in the commercial aspects of data processing.

By and large, the data processing program has been a success; those who have completed the program can and have succeeded. Trends are developing, however, which threaten the success of this or like programs. The era of "Send me a warm body, I'll train him" is over. The recession, closing off entry jobs and causing a surplus of available and experienced personnel, has brought on a problem of locating meaningful junior programmer jobs for the graduates of the program. Although the predicted economic expansion will reduce this problem, the recession has brought to light the lack of professional recognition and unclear career patterns for the personnel in the information processing field. The present and future student is aware and skeptical of entering a program which may equip him for a nonexistent job. The publicity and the increased attention given to sociological/health careers have caused a significant reduction of potential students. The era produced a proliferation of two and four year programs in computing science, data processing, and programs with minors in these subjects. This levelled the enrollment at a lower figure than had been anticipated, endangering future programs.

Educational institutions, more than ever, must offer a variety of modern programs, supported with up-to-date hardware systems and faculty, and change these programs to meet the future, a challenge which is very costly and risk prone. To meet this challenge, information is needed: information which is authentic and available, and which can be used by students, educators, employees, and employers. Too many decisions are based on one's limited environment, not always objective or timely. A paradox, in that most computing programs are developing personnel who are to participate in supplying objective and timely information.

Information which will be considered authentic must come from a national organization which has as its purpose developing information processing personnel. This organization would publish statistics, local and national, about personnel needs and qualifications in greater depth and degree than is presently distributed. A natural outgrowth of such an organization's purpose would be to promote recognition of information processing personnel, to conduct research in information processing instructional systems, and to develop programs of studies.
The statistics would be used by students and their counselors in deciding about their choice of careers. Educators would use the data in providing needed programs. Employees would be able to select and choose alternative educational programs for advancement. Employers would be able to provide professional development programs to meet their future needs. Other functions the organization could serve are the promotion of professional recognition, seeking scholastic aid, and distributing programs, formal, in-house, or intensive, with recommended instructional systems which would provide effective and efficient education.

This organization could also serve another needed educational and development function: regional training centers. These centers would equip personnel with locally needed qualifications. Personnel attending the centers would be recent graduates of college programs and in-service personnel temporarily relieved from their assignments. These centers would conduct intensive, up-to-the-minute training. The hundreds of thousands of future positions which are forecast can only be filled by a national effort. If trends threatening this highly technical profession continue, the nation will face a shortage of qualified personnel and an oversupply of obsolete skilled personnel. Only a national organization can prevent another Appalachia.

Computer education at Orange Coast College-Problems and programs in the fourth phase

by ROBERT G. BISE
Orange Coast College
Costa Mesa, California

The business and industrial growth that has been associated with Orange County (California) speaks for itself. New and relocating firms representing the entire spectrum of technologies are moving into the cities of Costa Mesa and Newport Beach daily. The Coast Community College District, and especially the staff of the Business Division of Orange Coast College, have continually developed educational programs to support this environment. In the past period of shifting technologies, we have been supported by stable patterns of human behavior. As we plan for the next shift, we no longer find these stable patterns of human behavior. Instead we find only revolving fragmentations of the past and undefined forces that may be in the future. In 1973, we are undertaking the development of viable programs and curriculum in the areas of computing and management information systems. In this we will be continually trying to integrate the experience and intuitive judgment that we have gained during a decade of total submersion in the changing forces of computer technology. We understand that the total environment in which we will make our attempt has the potential for treachery of the senses.

Charles Poore of the New York Times has labeled these times the Karate Age, where with one quick and deadly assault a man, a university, a regime or a nation may be sent writhing in agony. A review of the technological changes that have taken place in computing quickly reveals that those who have been in the field of computing over the past decade had experienced "Future Shock" somewhat before Alvin Toffler coined the phrase. A brief history of computing at Orange Coast College will serve as a vehicle for reviewing these changes in computer technology. At the same time we may review the continuous process of curriculum redevelopment and the demands that were made of the instructional staff at Orange Coast College. The history may be seen as comprising three distinct phases.

PHASE I

In 1958 Orange Coast Community College entered into its first phase of data processing. At that time our equipment consisted solely of leased Electro-Mechanical Tabulating Equipment. To this, we added computing power in the form of a General Precision LGP-30 with 4K drum memory and paper tape/typewriter input-output devices in 1959. The curriculum was designed around the available hardware to include courses of instruction in Electro-Mechanical Wiring Principles, Basic Concepts of Data Processing, and the Electro-Mechanical Application of sorting, collating, reproducing, interpreting, tabulating and calculating. It also included programming in machine language on the LGP-30. In addition, students were required to study the principles of accounting, cost accounting, and accounting systems.

PHASE II

Phase II was initiated through the acquisition in 1963 of second-generation computing hardware systems in the form of IBM 1401 and 1620 computers with disk storage. The curriculum shifted to keep pace with the hardware. Although the principles of Electro-Mechanical Wiring and Tabulating equipment were retained, additional hands-on experiences were provided in machine language, SPS, and FORTRAN on both machines, and COBOL and AUTOCODER on the 1401. The principles of Accounting, Cost Accounting and Accounting Systems continued to be a part of the program, and a new emphasis was initiated in Management Information Systems. The objective of the two-year vocational program in data processing at this time was to develop qualified entrance-level tab operators and application programmers through hands-on experience. The California Department of Vocational Education in conjunction with the Federal government provided assistance to the program in the form of grants for the development of curriculum and training of the instructional staff.

With the rush by other institutions to provide programs in data processing and computer science, another dimension was added to the program in the summer of 1962. In conjunction with the State of California Department of Vocational Education, a summer institute program for the intensive training and retraining of instructors in data processing was initiated. This program was to become an on-going part of the total program. With the active participation of the instructional staff in the training of others (and also of cross-training themselves), a sense of mastery over conditions developed. The frantic rush to keep up with hardware and programming sophistication seemed likely to be a condition of the past. That sense of mastery was short-lived when in 1964 IBM changed the game from checkers to chess with their announcement of the System 360.

PHASE III

In 1966-67 the State of California underwrote a proposal to defray the costs of training two OCC instructors in third-generation programming and concepts. In return for this training, the staff agreed to the development of a detailed report containing all of the necessary educational ingredients to make the transition from second to third-generation computing. This report was made available to all institutions. The curriculum by the fall of 1968 presented the concepts of 360 programming through an understanding of the programming languages of RPG, FORTRAN, COBOL, PL/1 and ALC.
The concepts of operating systems, file design, file management, and job control were integrated into the programming classes. Cost Accounting became an elective in the program and a course in Management Information Systems Projects became a requirement for graduation. The latter class was designed to provide students with the background necessary to function in their fast-developing role as staff consultants to line management at all levels. Through the generous contribution by Hunts Foods of computing time on their 360, we were able to introduce a third-generation curriculum in the spring of 1967. Third-generation computing hardware was available at the college by November of 1968 (IBM System 360/40). In January of 1969 teleprocessing terminals were added using APL as the computer language.

There was one factor upon which we all agreed after the hectic year of 1969: one was only kidding oneself if he found security in technological expertise. The concepts of the third generation increased the need for summer institute programs for the retraining of educators in the field, and the college offered the first summer institute in third generation programming in the summer of 1969. Quickly we became aware of the fact that where in Phase II we were involved in a simple vocational program, with the sophistications of the third generation, higher aptitudes, wider perspective, and greater perseverance would be required of the student. We could no longer provide mere vocational education but had to be involved in providing some measure of professional education and training. The offers that our graduates were receiving from the labor market required them to possess a much keener insight into the realities of the business environment and demanded a strong understanding of the organization and the part the computer played in the organization.

In the summer of 1970 our new facility was completed, which doubled our capacity. We now had a designated room for our IBM 029 keypunches and IBM 2741 teleprocessing terminals. We attempted to maintain our philosophy of hands-on training through a student reader/printer and the addition to our curriculum of a hands-on course in computer operation. The program for the development of computer-assisted instruction initiated in 1969 necessitated the acquisition of an IBM 360/50 DOS System in the fall of 1970.

The college, having expanded to two colleges in 1965, changed the name of its district to the Coast Community College District in 1970. Through the foresight of the district administration, a policy of decentralizing computing power was implemented through the placement of teleprocessing terminals throughout both campuses. This included the use of dial-up teleprocessing terminals. Both the added use of computing throughout both colleges and the additional administrative requirements to implement program budgeting systems allowed the Business Information Systems instructional program to receive the benefit of more sophisticated hardware systems. The IBM 360/50 DOS system could not meet the demands for the additional computing requirements, and a change was made from DOS to OS with one megabyte of low-speed core in 1971. Through the use of CRT terminals a student file inquiry system became operational in 1972. This necessitated a further upgrading of the system to an IBM 370/155 OS MFT containing one megabyte of main memory.
With the two year program arriving at a somewhat stable position, new emphasis was placed upon developing courses of instruction to service the other disciplines of the college and to integrate all disciplines with a sense of the rapidly increasing rate of technological change. The ability to adapt was emphasized. Two courses were designed to meet this objective. A course of instruction using the languages of FORTRAN and APL was developed to integrate programming concepts and applications with the respective discipline of the prospective transfer student to the four year college. Another course was developed using the interactive teleprocessing language of APL to provide instruction to all students of the campus.

With the changing of emphasis in the computing field came requests from the computing community for additional courses in Computer Operations, Data Communications Systems, Management of the Computer Effort, Operating Systems, and most recently Advanced COBOL. In order to further meet the needs of the rapidly-growing business environment, two one-day seminars were held in the areas of Management and the Computer and Data Communications for Management. We also held a two-day seminar for a visiting Japanese top-management group. The title of this seminar was "The Use of Computing by American Managers."

Since September of this year we have been involved in the evaluation of our total curriculum and have attempted to make our program more flexible for the three basic student groups that we serve. The first group is comprised of an increasing number of students who are transferring to four year colleges to complete their education. Most of these four year colleges do not have as wide an offering of courses, and those that are offered are at the upper division level. Consequently, students must use much of their course work in Business Information Systems taken at our institution to fulfill elective lower-division courses. We have been able to obtain some relief from this problem through "one-to-one" articulation on an individual college basis, but this is a nagging problem causing a great deal of frustration to the student.

The second group we serve is that of the two year terminal student. These students can be classified into two general categories: those with a good aptitude for programming and systems work and those that have average aptitude and strong desire. We feel that the higher aptitude student would benefit by taking more advanced work in programming and systems. For the second group of students we see very fulfilling careers in the area of computer operations and possibly computer sales and allied fields. We encourage members of this group to take courses in computer operations and to broaden their general understanding of the field.

The third group is comprised of practicing professionals in the computer field, and managers and staff people from various fields of business. For this group we have added courses in Data Communications Systems, Managing the Computer Programming Effort, Advanced COBOL and Operating Systems. In our attempt to meet the needs of these three basic segments of our student population, we have devised what we feel to be the basic minimum core requirements for our students.
The core requirements are intended to develop the technical base necessary to compete in the dynamic information and computer industry and, in addition, to provide each student with a macro view of the environment in which the function of computing is performed. We attempt to accomplish this through nineteen units of required courses consisting of Introduction to the Concepts of Information Systems, COBOL and PL/1, Assembly Language Coding, Management Information Systems and a Management Information Systems Projects class. Eight additional units are required in Accounting or in Calculus, and nine additional units are required from a group consisting of: Advanced COBOL, Computer Operations, RPG, Data Communications Systems, Managing the Programming Effort, FORTRAN/APL, Computer Science, Operating Systems, APL, Cost Accounting and Managerial Mathematics.

FACTORS TO BE CONSIDERED IN THE IMPENDING PHASE IV

The manufacturers promised that they would never do anything to us like they did with the complete change in architecture in 1964, but somebody forgot to get it in writing from the usurpers of the industry, that forward and vital mini-computer industry. Will the Volkswagen of the computer industry, the mini, make the big one of the computing field alter its competitive path? We can only wait and see. One thing we are sure of is that the Mini-Computer, Data Communications, Teleprocessing and Data-Based Management Systems are upon us. We are told that the next major thrust of computing will be in manufacturing systems and that the language of computing is going to be eventually reduced to the level of the user through terminals and the CRT. This is the picture of the 70's, and we are told by John Diebold that the 80's will usher in the Cybernetic System of "intelligent machines," where Japan has every intention of dominating the market.

Before we attempt to define the problem of developing the curriculum for the late 1970's and 80's, we might benefit by reviewing our societal framework over the past ten years or so. The social upheaval over these recent years has shaken our institutions to the very mantle of our earth. The Civil Rights sit-ins in Greensboro, North Carolina, in 1960 were followed in 1963 by Martin Luther King's "I have a dream" speech to 200,000 civil rights demonstrators in Washington, D.C. Polarization of social and political values was thereafter punctuated by an infamous series of assassinations and attempted assassinations. The Free Speech Movement at Berkeley in 1964 was followed by the Viet Nam protest from 1967 to the inauguration of the President on January 20th of this year. The energy of dissatisfaction and discontent has been registered through the vast disenchantment with our industrial-military complex and the expenditure of great sums of money for space exploration. The result of all this has been that technology has been identified as one of the major sources of our society's problems. The War on Poverty Program in the early 60's and the concern for the environment and health of our citizens brought about a new sense of social consciousness nonexistent in previous periods. The dethroning of college president after college president because of a total inability to grasp what was taking place and make the required changes drove the point even deeper. Suddenly, in 1969 and 1970, a lionized profession of the 1950's and 1960's (engineering) found itself largely obsolete and unwanted.
Thus a profession found itself in the same position that the blue collar worker had been faced with for decades. Students following the path of success established by our society acquired the training and education supposedly designed to provide them with the "good life" of the future. The shock they received when they attempted to enter a labor market that could not utilize their skills, and an environment they did not understand, destroyed much of their confidence in the ability of our economic system to meet the needs of the people.

The computer industry leaped off the solid economic base established in 1958 and, with the other industries of our economy, grew rapidly during the early and mid-sixties. The constant pressure of supporting the war in Viet Nam and meeting the unprecedented demands at home finally forced our economy into a heated period of inflation and the eventual recession of 1969 and 1970. The fixed costs of computing were finally registered upon a management that had grown up in years of unparalleled growth. Hopefully the recent fight for survival experienced by management has provided the necessary insights into what courses of action management is to take if we are not to repeat the mistakes of the 1960's. Whether management has been able to work through the archetypes of management past and sense the new needs of its employees, only time will tell. One thing seems certain: organizational needs are not yet synchronized with human needs, and the pace of technology will only widen the gap.

It appears that management does not know how to reward its employees for productive efforts within the framework of the new social consciousness. To sense a real problem we have only to listen to personnel executives on one side lamenting the fact that they are unable to find employees who can fit quickly into the work organization and become productive. On the other side, these same personnel experts are admonishing educators for developing people for a work environment that cannot adequately utilize their skills, thus bringing about employee dissatisfaction and turnover. There appears to be a mutual fuzziness, both on the part of the executives defining the specifications of required skills for the near future and on the part of the educator attempting to educate with such specifications in mind. The atrophy that sets in as a misplaced individual exists in a state of unrelieved boredom only furthers the loss of identity and therefore raises frustration to a dangerous level. An impersonalization of organization that grows through a management strategy of merger and acquisition frequently spawns a hostile enemy behind an employee's mask of acceptance. Management will be using the computer to ever-increasing degrees to eliminate specific human procedures. However, it seems probable that for every problem solved in this too-obvious manner, there may be created a dozen more, for the approach ignores the basic root structure of men's needs.

All of the foregoing societal events that have transpired over the past decade have contributed two vital factors:

(1) There is a definite sense of social consciousness and a definite desire for real freedom. The Civil Rights Movement and the course of events that followed released untold amounts of human energy that is far from being coordinated in tandem.
(2) The power of our present and near future technology gives us unlimited capacity for the solution of high priority problems of our world.
Alone, this technical competence is useless unless it is interwoven with the tapestry of human understanding. Such a process undertakes what Warren Bennis has referred to as the process of human revitalization. He identified the following four basic points in the process:

(1) An ability to learn from experience and to codify, store and retrieve the resultant knowledge.
(2) An ability to learn how to learn, the ability to develop one's own methods for improving the learning process.
(3) An ability to acquire and use feedback mechanisms on performance, to become self-analytical.
(4) An ability to direct one's own destiny.

The programs and curricula of the late 70's and 80's must especially develop the students' ability to learn how to learn and to direct their own destinies. It is difficult to perceive how such programs and curricula can be successful without the practical and consistent involvement of the business community, in both the development and the implementation. Sharp distinctions between campus and business arenas are already dulling. Work experience programs, on-site educational programs, educational TV and programmed instruction technology and concepts have made significant advances, and have an undeniable future. All we seem to need is the sense of urgency that will cause us to allocate resources toward a realistic assessment of the situation. Effective definition of objectives will require mutual contributions of time and intellectual resources on the part of both business and educational leaders. Our problem today is one of breaking down our own archetypes and the archetypes of our institutions in order to develop those inner human qualities that men must integrate with future technologies.

Academic Computing at the Junior/Community College Session

Computing at Central Texas College

by ALTON W. ASHWORTH, JR.
Central Texas College
Killeen, Texas

ABSTRACT

Central Texas College has developed a post-secondary curriculum in data processing in conjunction with the United States Office of Education. The program has been developed around the career education guidelines established by the United States Office of Education. The following list of program advantages will be discussed in some detail at the June meeting:

1. A complete unit of learning has been provided for the student in his first year and in his second year. At the end of his first year he will have received useful skills that are saleable in the market place. During the first year he will have had a balance of data processing courses, mathematics, business practices and effective communications. These subjects, combined with the learning of a basic programming language and systems analysis, will qualify him for many of the collateral jobs that exist in a data processing environment. During the second year he will have learned some advanced programming languages. He will have had applications courses. He will have learned some of the internal workings of computers and programming. He will have been exposed to data management systems and transmission techniques, providing him with an insight into the future of data processing. He will have had an elective during his last semester that could be an industry co-op program.
2. The curriculum is flexible enough so that the student will be able to change his educational objectives to a four-year program without extensive loss of credit.
3. Through the new organization of courses, certain social and business objectives have been met as well as those of data processing.
4. At specific points during his education, well-rounded educational objectives have been met.
5. A balance of traditional courses and special computer-oriented courses exists between his two years of education. He will receive five data processing courses his first year and five data processing courses his second year, plus his elective co-op program with industry.
6. A balance of programming languages has been provided the student for his first and second year of education. He will learn two programming languages his first year, BASIC and COBOL, and two programming languages his second year, FORTRAN and Assembly.
7. The curriculum is designed to develop people to become working members of society. In addition to data processing capabilities, communications skills and social awareness development courses have been provided.
8. Sufficient math has been provided in the curriculum to allow the student to advance his own studies of data processing after leaving school. Considerable applications experience has been gained in both the educational and working environments.

The design of IBM OS/VS2 release 2

by A. L. SCHERR
International Business Machines Corporation
Poughkeepsie, New York

INTRODUCTION

The purpose of this paper is to give some insight into the design of IBM OS/VS2, rather than to cover individual features of the release. Included are the overall objectives for the design, some of the system's key architectural features, and how these relate to the environments that the system is intended to be used in. The major objective is to show how the design of the system fits together and to provide an insight into the rationale of the design.

OBJECTIVES

Release 2 represents a major revision of OS to provide a new base for future application areas. The key thrust is to provide a new SCP base with increased orientation toward DB/DC applications and the additional requirements placed on an operating system because of them. Another key goal of the system is to support multiple applications concurrently in a single complex. This complex may include multiple CPUs, loosely or tightly coupled. The system must dynamically adjust itself to the changing loads in the various environments that it supports, as well as provide increased security and greater insulation from errors. Maintaining a high level of compatibility continues to be a major objective for VS2. Extending the system, adding function, and changing its internal structure, while at the same time considering compatibility, represented a significant challenge to the designers of Release 2. Over the last few years we have learned a lot about the needs of our users, and during this time the state of the art in software design has moved forward. The system has been reoriented to incorporate both.

USE OF VIRTUAL STORAGE

The incorporation of virtual storage into OS has allowed the system to support programs whose size is larger than available real main storage. There are operational advantages; however, significant additional benefits can be realized. Using virtual storage to provide for an extremely large address space allows program structures to be simpler, intermediate data files to be eliminated, and real main storage to be used to hold data that in the past was resident on a direct access device. This latter use can result in a significant performance advantage, which will be discussed later.
MULTIPLE ADDRESS SPACES

Perhaps the most obvious new feature of Release 2 is the support of multiple address spaces. Each job step, TSO user, and operator-STARTed program in the system has a private address space of 16 million bytes, less the space taken by the operating system. Figure 1 is a comparison between the Release 1 and Release 2 storage maps. Both maps extend from 0 to 16 million bytes. Release 1 and MVT actually look alike, the only difference being that MVT's address space is limited to the size of real storage. The Release 1 map shows two TSO regions with several users in each. Users A and B, for example, cannot be in execution at the same time because only one of these users can occupy the region at a time; the others are swapped out. The transition from Release 1 to Release 2 can be understood very simply by considering the Release 1 system with a single TSO region the size of the total available virtual storage. What has been done in Release 2 is to remove the multiprogramming restriction between the users of the TSO region. On the other hand, Release 2 does not allow two jobs to share the same address space. One of the first implications of this design is that it is no longer necessary for the operator to get storage maps printed at the console so that he can manage main storage.

To show the effect of multiple address spaces on certain control program functions, TCAM will be used as an example. In Release 1, terminal input is read through a channel directly into the TCAM region. There it undergoes some processing and is then moved to the user's region or to the TSO control region. In Release 2, the input is read into a common system area buffer at the top of the map, and from there is transmitted under TCAM's control to the user. To programs that have done inter-region communication in previous systems, this new storage map represents a major difference.

Figure 1-Multiple address spaces

In Release 2, V=R jobs no longer affect the virtual address space of V=V jobs. Since each job is assigned a 16 million byte address range, V=R jobs only affect the amount of real storage available. (See Figure 2.)

Figure 2-V=R, V=V

STORAGE MAP

Figure 3 shows the storage map seen by a single job. This corresponds to an MVT or Release 1 region. At the top of the map, in areas which are commonly addressable by all of the address spaces, are the System Queue Area containing system control blocks, the pageable Link Pack Area, and the Common System Area for use in communicating between users. This area is used, for example, by TCAM and IMS for inter-region communication. At the bottom of the map are the Nucleus and that part of the Link Pack Area which is to remain permanently in main storage. The area in between is the private address space for each user. User requests for storage in all subpools are allocated from the bottom of this private address space. Requests for Local Supervisor Queue Area and Scheduler Work Area storage are satisfied from the top.

Figure 3-Storage map (System Queue Area, pageable LPA and Common System Area at the top; the private region in between; fixed LPA and Nucleus at the bottom)
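As a rough illustration of the storage map just described, the sketch below models a single 16-million-byte address space with commonly addressable areas at the top and bottom and a private area in between, allocated upward for ordinary subpool requests and downward for LSQA- and SWA-type requests. The class name, method names, and area sizes are assumptions made only for illustration; they are not taken from the actual system.

    # Illustrative sketch of a per-job virtual storage map (sizes are assumed).
    SPACE = 16 * 1024 * 1024           # 16-million-byte address space
    LOW_COMMON = 128 * 1024            # nucleus + fixed LPA (assumed size)
    HIGH_COMMON = 512 * 1024           # SQA + pageable LPA + CSA (assumed size)

    class AddressSpace:
        def __init__(self):
            self.bottom = LOW_COMMON            # next byte for user subpool requests
            self.top = SPACE - HIGH_COMMON      # boundary for LSQA/SWA-type requests

        def getmain_user(self, nbytes):
            """Allocate private storage from the bottom of the private area."""
            if self.bottom + nbytes > self.top:
                raise MemoryError("private area exhausted")
            addr, self.bottom = self.bottom, self.bottom + nbytes
            return addr

        def getmain_lsqa(self, nbytes):
            """Allocate LSQA/SWA-type storage from the top of the private area."""
            if self.top - nbytes < self.bottom:
                raise MemoryError("private area exhausted")
            self.top -= nbytes
            return self.top

    job = AddressSpace()
    print(hex(job.getmain_user(4096)), hex(job.getmain_lsqa(4096)))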
COMPATIBILITY

Compatibility is a major objective in Release 2. Object code and load modules from MVT and VS2 Release 1, not dependent on the internal structure of the system, will run with Release 2. JCL compatibility is maintained, and the data sets and access methods of previous releases apply, as well as EXCP. SMF is compatible as well. However, it must be recognized that in moving from a non-virtual to a virtual environment the usefulness of some of the measurements has changed, and in order to account completely for usage it may be necessary to make some use of the new measurements that are provided.

Internal interfaces are the area of greatest concern because, in some cases, such interfaces have been extensively used. Generally, our approach has been to evaluate every change of this type to see what the effect is on the user community as well as on our program products. Several proposed changes were not made because of their potential impact; but, on the other hand, some change is required to make progress, and thus we have had to consider many difficult trade-offs. The major differences that affect compatibility include the system catalog, which is now a VSAM-based data set and requires conversion from the catalogs of existing systems. Forward and backward conversion utilities have been provided, as well as compatibility interfaces allowing the use of the original OS catalog macros. As mentioned earlier, the new storage map will impact programs that have done inter-region communication. Also, IOS appendages run enabled in Release 2 and must use a new synchronization mechanism; therefore, there is likely to be impact to user-written IOS appendages.

PARALLELISM

One of our major design goals in the system was to provide for as much parallelism of operation as possible. The reduction of software bottlenecks that prevented efficient multiprogramming is the major technique that we used. Listed below are five of the main areas that we worked in; each of these areas will be discussed.

• Job Queue
• Allocation
• Catalog
• TSO Region
• MP65 Disable Lock

Experienced OS users will recognize these as areas with a high potential for improvement.

JOB QUEUE

We have eliminated the Job Queue data set that OS has used since its beginning. With HASP or ASP in an OS system, there were really two job queues: the one kept by the support system, relating primarily to information required to schedule jobs and the printing of output, and the OS job queue, which contains similar information as well as information pertaining only to a job in execution. One type of information is really for inter-region communication between various parts of the scheduling functions of the system; the other is for intra-region communication between the scheduling components and data management on behalf of the executing job. The inter-region information has now been placed entirely in the job queue maintained by the job entry subsystem, either JES2 or JES3. The intra-region information has been segmented and placed into the individual job's address space. In this way, the portions of the original OS job queue having the highest usage are now in each job's private address space. The less frequently used information, relating to communication between various components of the scheduling function, is now in the JES job queue. Thus, all of these elements of the job queue can be accessed in parallel. The JES job queue is also used to journal information required to restart jobs during warmstart or from a checkpoint. (See Figure 4.)

Figure 4-Job queue
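The split just described can be pictured with a small sketch: scheduling information that must be shared between jobs lives in a single JES-managed queue, while the execution-time information for each job is kept in that job's own address space, so different jobs can work with their own portions in parallel. The class and field names below are invented for illustration and do not reflect the actual control-block layout.

    # Illustrative split of the old OS job queue (names are invented).
    class JesJobQueue:
        """Inter-region information: scheduling, output printing, restart journal."""
        def __init__(self):
            self.entries = {}                    # job name -> scheduling information
        def add(self, jobname, sched_info):
            self.entries[jobname] = sched_info

    class SchedulerWorkArea:
        """Intra-region information, kept in the job's own private address space."""
        def __init__(self):
            self.data_set_info = {}              # used by allocation, OPEN/CLOSE, termination

    jes_queue = JesJobQueue()                    # one shared queue, journaled by JES2/JES3
    swa_per_job = {"PAYROLL": SchedulerWorkArea(),
                   "REPORT": SchedulerWorkArea()}  # one per address space, used in parallel
    jes_queue.add("PAYROLL", {"class": "A", "output": "remote"})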
ALLOCATION

The component of the system that does data set and device allocation has been completely redesigned. Both batch and dynamic allocation are now supported by the same code and provide essentially the same function. The design is oriented toward virtual storage: no overlays are used, and all work areas are in virtual storage. Allocation of data sets to storage or public volumes can be done completely in parallel, regardless of other allocation activity. This type of allocation is probably the most common form in most installations, and, in general, the design of the new allocation provides shorter paths for these generally simpler cases. When it is necessary to allocate a device and perform volume mounting, these requests are serialized by device group. Therefore, a request for a disk need not be held up because another job is waiting for a card reader. Other improvements in this area include the ability to prevent a job from holding devices until its entire requirement can be met, and the ability to cancel a job waiting for devices.

CATALOG

The catalog has been converted to an indexed VSAM data set, primarily to allow for faster access to a large catalog. The curves in Figure 5 give the general idea of how access time should relate to catalog size with this new structure.

Figure 5-Catalog: an indexed VSAM data set designed for fast access to a large catalog (access time versus catalog size, OS catalog compared with VSAM catalog)

TSO REGION

As previously stated, in MVT or Release 1, TSO users sharing the same region cannot be concurrently in execution. This restriction is eliminated in Release 2. Therefore, the situation shifts from one where each region serves a given set of users to one where the entire system serves all of the users. Thus, any potential imbalance between regions is eliminated. (See Figure 6.) Moreover, previous support placed a limit on the level of multiprogramming for TSO at the number of TSO regions. In Release 2, the level of multiprogramming can vary and is dependent upon the load placed on the system.

Figure 6-TSO region: eliminates the imbalance between regions and the limit on the level of TSO multiprogramming

LOCKS

In a tightly-coupled multiprocessing system, it is highly desirable from a performance point of view to allow the control program to be executed simultaneously on both CPU's. However, some means is then required to synchronize or serialize the use of control information used by the control program. System code in MVT disabled for interrupts prior to the updating or use of this type of control information; when the operation was completed, the system was enabled. The MVT technique used for Model 65 multiprocessing was to use one lock which prevented both CPU's from being disabled at the same time. In environments with heavy usage of control program services, this lock becomes a significant performance bottleneck. (See Figure 7.)

Figure 7-Locks: the MP65 technique used one lock, so that both CPU's could not be disabled at the same time

In the Release 2 support of MP, we have used instead a number of specific locks, each relating to a particular function. Generally, the program obtains the appropriate lock relating to the data that it is going to update or use, performs the operation, and then frees the lock. Whether or not the system is disabled during this operation depends on whether or not interrupts can be handled. The locks that are used include one per address space, a dispatcher lock, multiple IOS locks, a lock for real storage management, locks for global supervisor services, and locks for virtual storage management. This means that, for example, a GETMAIN can be performed in a user's private address space at the same time that another GETMAIN is being done in another user's space, or while an interrupt is handled by IOS. The net result is that the system is enabled for interrupts more often and more elements of the control program can execute in parallel. The primary advantages here are to a tightly-coupled multiprocessing system, but some of them carry over into other environments. (See Figure 8.)

Figure 8-VS2 Release 2 uses multiple locks: the lock obtained depends on the function being performed, and the system is not generally disabled; the locks include one per address space, the dispatcher, IOS (multiple), the real storage manager, global supervisor services, and the virtual storage manager
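The contrast between the single MP65 lock and the Release 2 approach can be sketched in a few lines: instead of one lock serializing every control-program update, each class of resource gets its own lock, so two CPUs can update unrelated control data at the same time. This is only a schematic of the idea in a modern idiom, with invented lock names; it is not the actual lock hierarchy or its disablement rules.

    import threading

    # One lock per function, rather than one global lock (illustrative only).
    locks = {
        "dispatcher": threading.Lock(),
        "real_storage": threading.Lock(),
        "virtual_storage": threading.Lock(),
        "ios": threading.Lock(),
    }
    address_space_locks = {}        # one local lock per address space

    def with_lock(name, operation):
        """Obtain the lock for the data being updated, do the work, then free it."""
        lock = locks.get(name) or address_space_locks.setdefault(name, threading.Lock())
        with lock:
            return operation()

    # Two GETMAINs in different address spaces can proceed concurrently, because
    # they take different local locks instead of one system-wide lock.
    with_lock("ASID-1", lambda: print("getmain in user 1's private area"))
    with_lock("ASID-2", lambda: print("getmain in user 2's private area"))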
MAIN STORAGE EXPLOITATION

Because of recent changes in the relative costs of various hardware components, the trade-off between main storage usage and other activity in the system has changed. In Release 2, our goal has been to exploit main storage by trading it for CPU and I/O activity wherever possible. In MVT and VS2 Release 1, data sets are generally placed on a device, and all access to this data must go to that device. Main storage content is limited to the images of programs. Certainly, in many environments there is data whose usage is high enough to warrant at least a part of it being resident on a higher speed device, or perhaps in main storage. In fact, there are environments where some blocks of data receive higher usage than some of the pages of the program, and ideally should receive preference for main storage occupancy. In Release 2, we have attempted to move in the direction of allowing data to be placed in the storage hierarchy dynamically, according to its usage. Therefore, certain data can be resident in main storage, or on a higher speed device, if it has a high enough frequency of usage. The whole idea is to allow more data, more information, to be resident in main storage. Thus, given a fixed amount of main storage, there is a better choice as to what to put there. More importantly, given more main storage, there is more useful information to put into it. In Release 2 there are three facilities for this type of exploitation of main storage: virtual I/O, Scheduler Work Areas, and the large address spaces.

VIRTUAL I/O

Virtual I/O provides for placing data sets in the paging hierarchy. The net result of this is that if a page of data is resident in main storage, there is a reduction in I/O and CPU time. The CPU time is reduced because of the elimination of I/O interrupt handling, channel scheduling, and task switching. Because blocking is done automatically at 4K, greater efficiency may result. When I/O is done, it is performed by the paging mechanism, generally with more efficiency than with the conventional techniques. An additional advantage of virtual I/O is that no direct access device space management is required, and therefore allocation time is faster. Because space is allocated in 4K blocks as needed, space utilization is also more efficient. In Release 2, temporary data sets are supported for virtual I/O in a compatible way. No JCL or program changes are required for SAM, PAM, DAM, XDAP, and the equivalent operations in EXCP. Any program dependencies on direct access device characteristics are handled in a transparent way.
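A toy model of the virtual I/O idea is given below: records written to a "virtual" data set are simply blocked into 4K pages of the paging hierarchy, so a later read that finds its page resident costs no channel program at all, and space is claimed one 4K block at a time as it is needed. The 4K block size is the only number taken from the text; the class and method names are assumptions made for illustration.

    # Toy model of a virtual I/O data set blocked into 4K pages (illustrative).
    PAGE = 4096

    class VirtualDataSet:
        def __init__(self):
            self.pages = {}                      # page number -> bytearray, allocated on demand

        def write(self, offset, data):
            for i, byte in enumerate(data):
                page, byte_off = divmod(offset + i, PAGE)
                frame = self.pages.setdefault(page, bytearray(PAGE))   # 4K block as needed
                frame[byte_off] = byte

        def read(self, offset, length):
            out = bytearray()
            for i in range(length):
                page, byte_off = divmod(offset + i, PAGE)
                out.append(self.pages.get(page, bytearray(PAGE))[byte_off])
            return bytes(out)

    temp = VirtualDataSet()
    temp.write(4090, b"SORT WORK RECORD")        # spans the 4K page boundary at 4096
    print(temp.read(4090, 16), len(temp.pages))  # two 4K pages allocated, no device space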
SCHEDULER WORK AREA

The Scheduler Work Area allows a job's job queue information to be contained in its own virtual storage. Thus access times are better for this information when it is required for allocation, termination, or OPEN/CLOSE/End-of-Volume processing. If usage is high enough, this information would be resident in main storage, with the same advantages as with virtual I/O.

LARGE ADDRESS SPACES

The use of large address spaces to achieve greater performance has been described exhaustively in other places; however, several techniques which have been incorporated into portions of the control program should be highlighted. Overlay structures have been eliminated, and the use of the Overlay Supervisor, LINK, and XCTL services has been removed, with a decrease in I/O activity as well as CPU time. Spill files have been eliminated; instead, large work areas in virtual storage have been used. The allocation redesign makes use of both of these techniques.

RESOURCE MANAGEMENT

In the resource management area, our goal has been to centralize all of the major resource control algorithms. The objective here is to achieve better coordination than is possible with decentralized algorithms. With a decentralized design, two uncoordinated algorithms can sometimes work at cross purposes; by having a centralized set of algorithms, more opportunity exists for optimization. The system resource manager in Release 2 replaces the TSO driver, the I/O load balancing algorithm of Release 1, and HASP's heuristic dispatching. Further, it provides a new algorithm to control paging and prevent thrashing by dynamically adjusting the level of multiprogramming. The rate at which users get service is controlled by the Workload Manager in accordance with installation-specified parameters.

WORKLOAD MANAGEMENT

Priorities for this Workload Manager are not absolute, but rather are expressed in terms of a rate of service for each job. This allows a departure from the usual situation, where a lower priority job gets only what is left over after the higher priority jobs have received all of the service they can get. In Release 2, two jobs can proceed at a relative rate of progress that can be set by the installation. These service rates are specified for different system loads, so that the relative rate of service received by two jobs can change as the overall system load shifts. Finally, service rates can be specified for a given set of users or jobs, where a set can include as few as one user. Figure 9 shows a sample of how this is done. There are five sets of users, A through E, and service rates varying from 0 to 1,000 service units per second. Service is expressed in terms of a linear combination of CPU time, I/O services, and main storage use. The number 1 curve, which might be considered for a light load, shows the users in groups A and B receiving high service rates, users in groups C and D slightly less service, and E even less. User sets A and B might be two types of TSO users; C and D, high-turnaround-requirement batch jobs; and E the rest of the batch jobs. As the load gets heavier, the installation has specified that it would like more of the degradation to apply to the users in sets D and E, and the least degradation to apply to sets A and B. Curve 4 represents the heaviest load, where users in set A get significantly better service than anyone else, and users in sets C through E receive only what is left. The system attempts to operate on the lowest numbered curve; however, as the load gets heavier, it degrades the service seen by each of the sets of users proportionally to the way shown by the curves. That is, in going from curve 1 to curve 2, it reduces the service seen by users in category C more than for category A. A set of reports is produced which the installation can use to determine the response time or turnaround time and the throughput that is being produced by the system for each user set. Should an adjustment be required, a higher rate of service specified for a set of users will yield better response time or turnaround time. Our objective here is to provide a relatively simple way to achieve discrimination between users and to provide the right level of service to each group of users.

Figure 9-Workload management: service rate (0 to 1,000 units per second) specified for user sets A through E at each of four load levels
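The installation-specified curves can be thought of as a small table: for each user set and each system-load level there is a target service rate, and accumulated service itself is a weighted sum of CPU time, I/O, and main storage use. The sketch below shows one way such a specification could be evaluated; the coefficients and rates are invented for illustration and are not the system's defaults.

    # Illustrative workload-management specification (numbers are invented).
    # Service units are a linear combination of CPU time, I/O, and storage use.
    CPU_WEIGHT, IO_WEIGHT, STORAGE_WEIGHT = 10.0, 5.0, 0.1

    def service_units(cpu_seconds, io_requests, page_seconds):
        return (CPU_WEIGHT * cpu_seconds + IO_WEIGHT * io_requests
                + STORAGE_WEIGHT * page_seconds)

    # Target service rates (units/second) for user sets A-E at load levels 1-4;
    # the heavier the load, the more the degradation falls on sets D and E.
    curves = {
        "A": [900, 800, 700, 600],
        "B": [900, 750, 550, 300],
        "C": [700, 500, 300, 100],
        "D": [700, 400, 150,  50],
        "E": [500, 250,  50,  10],
    }

    def target_rate(user_set, load_level):
        """Rate taken from the curve for the current system load."""
        return curves[user_set][load_level - 1]

    print(service_units(1.5, 40, 200))               # service consumed by one job interval
    print(target_rate("A", 4), target_rate("E", 4))  # set A keeps priority at heavy load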
RECOVERY

Recovery mechanisms in the system have also been overhauled in a major way, and a significant amount of work has been done in this area. Our goal is to contain errors to the lowest possible level, and either to recover from an error so that the system can proceed as if the error never occurred, or at least to clean up so that the effect of the error is not felt outside of the function in which it occurred. In this area we have recognized that it is not enough to have code with a minimum number of bugs; rather, we must have a system that minimizes the effect of the failures that do occur. The same approach used for minimizing software failures is used for hardware error recovery as well, especially in the multiprocessing environment. Generally, the method is to provide specialized recovery routines that operate as a part of the mainline functions and which receive control whenever an error is detected by the system. There are approximately 500 such recovery routines in Release 2.

INTEGRITY

In Release 2 we have closed all of the known integrity loopholes in VS2. This means that unauthorized access to, or use of, system facilities and data or user data is prevented, for accidental as well as intentional actions, and we will now accept APARs for integrity failures. Integrity is a prerequisite for adequate security, where security is defined as an authorization mechanism to distinguish between what various users can do. Moreover, integrity should also provide for an increased level of reliability.

SERVICE MANAGER

In Release 2, we have provided a new transaction-oriented dispatching mechanism which allows the efficient creation of new units of multiprogramming. Our goal here was to increase performance by trading off function. This new unit of multiprogramming differs from the OS task in that it is not a unit of resource ownership or recovery. The new facility, called the Service Manager, is used by the Release 2 supervisor, JES3, IOS, VTAM, and the version of IMS for use with VS2 Release 2. This mechanism can also be used by appropriately authorized user programs, for example a VTAM application.
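To make the idea of a lighter-weight unit of multiprogramming concrete, the sketch below queues small service requests to a dispatcher instead of creating a full task, with its own resource ownership and recovery, for each one. This is a loose analogy in a modern idiom and not the system's actual dispatching interface; the names are invented.

    import queue, threading

    # Illustrative "service manager": small units of work are queued and dispatched
    # without creating a new task per request.
    service_queue = queue.Queue()

    def dispatcher():
        while True:
            request = service_queue.get()
            if request is None:            # shutdown signal
                break
            request()                      # run the service request; it owns no resources

    worker = threading.Thread(target=dispatcher)
    worker.start()

    # An authorized component schedules lightweight requests instead of attaching tasks.
    for i in range(3):
        service_queue.put(lambda i=i: print("service request", i, "processed"))
    service_queue.put(None)
    worker.join()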
RELEASE 2 PERFORMANCE

Summarizing what has been done in Release 2 from a performance standpoint, the following points are noteworthy. Because of the reduction in serialization and the trade-offs that can be made between I/O activity and main storage, the system can better utilize the CPU. Figure 10 shows conceptually the CPU and I/O overlap for an MVT job. The wait state time is comprised of I/O wait plus other waits caused by serialization on system resources. These wait times are reduced as a result of virtual I/O, the Scheduler Work Area, the new catalog, allocation, and so on. However, this wait time may be extended due to paging; this is typically rather small, especially in a large main storage environment. On the other hand, CPU time generally will be reduced as a result of virtual I/O activity, since fewer interrupts are handled. Other overhead is also reduced because the reduction in I/O and wait time generally allows the CPU to be fully utilized at a lower level of multiprogramming. On the negative side is degradation due to the extra instructions required in Release 2 because of enhanced recovery, integrity, and new function. The overall effect is that individual jobs tend to look more CPU bound.

Figure 10-CPU and I/O overlap for an MVT job

The general performance characteristics of Release 2 are significantly different from previous OS systems. The system now is generally more responsive, in that there is better consistency, with fewer long responses caused by the processing requirements of other jobs and the operator. Because fewer initiators can be used, and because of the reduction in bottlenecks, batch turnaround time can be improved. And, with the System Resource Manager, the installation has more control over the service seen by an individual user or job.

VS2 RELEASE 2 ENVIRONMENTS

The following summary shows how the features of VS2 Release 2 apply to various environments: multiple large applications, data base/data communications, time sharing, batch, multiprocessing, and finally, operations.

MULTIPLE APPLICATIONS

One of our major goals is to allow multiple applications to operate effectively in a single system complex. This is theoretically highly desirable, but previous operating systems have had insufficient capabilities to shift resources dynamically from one application to another as the load changed. Perhaps even more important, failures in one application often brought down other applications, or even the entire system. There was also insufficient separation of applications from a security point of view. Release 2 provides both better isolation and integrity to address these problems. With virtual storage and other facilities in Release 2, more dynamic control and use of resources is also possible.

TELEPROCESSING

In the teleprocessing area, Release 2 is intended to be a base for high performance data base/data communications applications. VSAM, VTAM, the Service Manager, virtual I/O, large address spaces, and the new allocation all provide tools for such applications.

TIME SHARING (TSO)

For time sharing, a number of performance improvements have been made: the SWA, the catalog, and so on. Compatibility between TSO and other areas of the system is more complete, primarily because the rest of the system is now more like TSO. Dynamic device allocation with volume mounting represents a new facility for TSO users that are authorized by the installation. SYSOUT data can be routed through JES2 or JES3 to a remote high speed work station to provide bulk output for a TSO user. Finally, large address spaces have classically been considered a time sharing function.

BATCH PROGRAMS

In the batch area there are a number of performance improvements as well. Dynamic data set and device allocation is provided for the first time for the batch programs.
Among other things, this allows the ability to start printing SYSOUT data sets dynamically, prior to the end of the job. This can be done with a minimal change to the JCL and with no programming change. Remote job entry is provided through the JES2 and JES3 packages.

MULTIPROCESSING

Multiprocessing has traditionally placed a heavy emphasis on reliability and availability as well as on performance. In the reliability area, a number of hardware improvements have been made. Certainly the increased integrity, both between the operating system and the user and between the various parts of the control program, provides the potential for better reliability. Most important are the new recovery mechanisms in Release 2. In the performance area, the complexity of a multiprogramming system is generally increased in the MP environment; however, the facilities for increased multiprogramming efficiency in Release 2 go a long way toward achieving good performance on MP systems. The exploitation of main storage is also important, since most MP systems are configured with large amounts of main storage. The multiple locks of Release 2 are aimed directly at minimizing contention for control program usage of the CPU's in a tightly coupled multiprocessing system.

OPERATIONAL CHARACTERISTICS

On the operational side of the system, our goal has been to have less dependence on the operator for performance. Generally, the system is significantly less serialized on the operators and their activities. The system, we feel, is simpler to operate. Tuning should be significantly easier as well: there are fewer bottlenecks to balance, fewer parameters to specify, and the system is more self-tuning. Moreover, the system can allow for more control over its own operation with the new integrity facilities, the System Resource Manager, and so on.

CONCLUSION

The purpose of this paper has been to provide some insight into how we arrived at the design of Release 2. Our objective was to provide some very significant increases in function and availability, with improved performance characteristics and with a high degree of compatibility to previous OS systems. We think that the system has done a good job of meeting these often conflicting objectives. OS/VS2 Release 2 represents a major step forward, but it is only a first step, since it provides the base on which we will build total support for advanced applications in the 1970's.

IBM OS/VS1-An evolutionary growth system

by T. F. WHEELER, JR.
International Business Machines Corporation
Endicott, New York
INTRODUCTION

A brief study of IBM OS/VS1 (Operating System/Virtual Storage 1) will reveal a system providing many-faceted growth capabilities at all levels of user-system interaction. Additional meaningful function is provided on a stabilized base to assure this growth capability. It can further be seen that installation growth is achieved through new application work and not by a continual rework of existing programs. To assure the user's ability to move to new work almost immediately, OS/VS1 is built on an IBM OS/MFT (Operating System/Multiprogramming with a Fixed Number of Tasks) base. Compatibility is defined to extend to most object programs, source programs, data and libraries from OS/MFT to OS/VS1, thus assuring a normal movement of existing programs to the virtual environment. Figure 1 graphically represents the areas of change between MFT and VS1. In like manner, the transitional problem of education is greatly reduced for programmer and operator alike. VS1 uses the MFT languages in all areas of programmer/operator contact with the system; from the system generation procedure to the operator control language, VS1 incorporates, improves and extends the existing MFT language to support the virtual function. As an OS-compatible system, VS1 becomes a vital part of the IBM family of new virtual systems, which includes DOS/VS (Disk Operating System/Virtual Storage), OS/VS1, OS/VS2 and VM/370 (Virtual Machine/370). Each is based upon its predecessor system, but each expands the horizon of support with virtual memory.

The virtual storage facility is the single most important new characteristic of VS1. It offers significantly larger address space for both application partitions and system functions by providing, with adequate equipment, a 16 million-byte addressing capability. To provide this enhanced capability, OS/VS1 requires a System/370 central processing unit with the dynamic address translation facility. VS1 supports this facility on the System/370 Models 135, 145, 158, 168, and those 155's and 165's which have the DAT facility field-installed. In addition to the hardware facility, significant changes were made to the control program code, which I shall discuss later in this paper.

Significant enhancement was also made to the job scheduling algorithms. The single most important addition has been the incorporation of the Job Entry Subsystem and Remote Entry Services into the Release 2 scheduler. These functions provide efficient job entry from both local and remote users, providing a transparency of operation that enhances remote capabilities. I will also investigate these changes in detail at a later point in the paper.

Finally, VS1 will contain a new data management function, the Virtual Storage Access Method (VSAM). This function and its new data set organization have been added, as an ISAM (Indexed Sequential Access Method) replacement, to better support more sophisticated and online applications. Significant improvements in the exploitation of relocate, data integrity and recovery, device-independent addressing, and migration ability help to make VSAM an important base for data base development. Since VSAM is a topic in itself, it will not be discussed in this paper.

RELOCATE

General discussion

Virtual storage separates address space and real storage and then expands the address space to make it larger than real storage. In VS1, address space can be up to 16,777,216 bytes, containing the control program, data, and normal application jobs within partitions. Virtual storage addresses are not related to real storage addresses, but both are broken into 2048-byte sections, called pages in virtual storage and page frames in real storage. A great deal of study went into determining the optimal page size for a VS1 environment. Involved in this study was a determination of the effective CPU time for instructions and data within a page and the time taken to move the page to secondary storage from real storage. The page size of 2K balances these considerations for optimal performance. In like manner, the DASD (Direct Access Storage Device) mapping algorithm was considered critical in achieving both a medium entry level and performance at that entry level. The direct mapping of virtual to secondary space greatly simplifies the movement of data from real to secondary storage and reduces the logic size of the page input/output routines.
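With 2048-byte pages, the relationship between a virtual address, its page, and a real page frame is simple arithmetic, which the short sketch below spells out. The sketch assumes a toy page table; the frame assignments in it are invented, and only the page size and address-space size come from the text.

    # Virtual-to-real address arithmetic with 2,048-byte pages (frame numbers invented).
    PAGE_SIZE = 2048
    ADDRESS_SPACE = 16777216              # bytes, i.e., 8,192 pages of 2K each

    page_table = {0: 41, 1: 7, 3: 120}    # virtual page -> real page frame, if resident

    def translate(virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in page_table:
            raise LookupError("page exception: page %d not in real storage" % page)
        return page_table[page] * PAGE_SIZE + offset

    print(ADDRESS_SPACE // PAGE_SIZE)     # 8192 pages in the full address space
    print(translate(2049))                # page 1, offset 1 -> frame 7, offset 1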
Figure 1-Areas of change between OS/MFT and OS/VS1: compilers, object programs, service routines, applications, libraries, data, control language and procedures carry over; data management is slightly changed

Page management

The key component in the management of virtual storage is page management. Page management is accessed directly by the System/370 hardware when a page exception occurs. A page exception occurs when the address translation feature is unable to resolve a virtual address to a real storage location. At the point of the exception, page management assumes responsibility for ensuring the addressability of the initial storage contents.

OS/VS1 uses a number of pointer queues to manage its least-recently-used page replacement algorithm and to regulate the flow of pages to and from the external page storage. Some of these queues include:

1. In-Use Queues. The addresses in these queues point to the locations of currently active page frames. These frames contain the most recently executed code and the most recently used data and tables. The number of in-use queues is a variable dependent upon the number of active partitions and active tasks, including system tasks. Figure 2 shows four such in-use queues.
2. Available Page Queue. This queue contains the frames that are available for program servicing when a page fault occurs. At initial program load, all RSPTEs (real storage page table entries) representing real storage blocks above the fixed nucleus appear on this queue. As execution occurs, this queue is maintained at a minimum threshold to minimize both lockout and thrashing possibilities.
3. Page Input/Output Device Queues. These queues hold the addresses of frames that are being used for page I/O. The input queue represents the list of frame addresses that are currently being filled from external page storage (SYS1.PAGE). The output queue contains the addresses of the least referenced pages that are about to be stored on external page storage (SYS1.PAGE).
4. Logical Fix Queue. This queue contains the addresses of both short-term fixed page frames and long-term fixed page frames.

Keys to page frame arrangement are the change and reference bits. Both bits are set by hardware and reset in the process of paging activity by the page management routines. The change bit indicates whether the contents of a given page frame have been modified since the page was brought into real storage; this bit is reset only when the page is moved to the external page file. The reference bit is turned on when reference is made to the contents of a page frame. At periodic intervals (between 3 and 9 task switches in Release 1), the status of the page frames on the in-use queues is adjusted. This process involves the migration of all unreferenced frames to the next lower queue and of all referenced frames to the highest level queue. This migration enables frames with a low reference level to move to the lowest level queue and eventually permits their replacement.

As we have noted before, when a referenced page is not contained in real storage, the hardware facility turns control over to page management. Page management immediately looks to the available queue to satisfy the request. If an adequate number of frames is available, the request is immediately satisfied. If there is an inadequate number to satisfy the request, the page replacement routine is entered. The page-frame release request formula is applied as follows:

    Release Request Amount = A + HTV - APC

where A is the page allocation request, HTV is the high threshold on the available page queue, and APC is the available page frame count.

Figure 2-The available queue and in-use queues n-3 through n, showing the R (reference) and C (change) bits and the frame address for each entry
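A small worked example of the release-request formula may help; the threshold and counts below are invented values used only to show the arithmetic.

    # Page-frame release request: A + HTV - APC (values below are invented).
    def release_request(allocation_request, high_threshold, available_count):
        """Number of frames the replacement routine should try to free."""
        return allocation_request + high_threshold - available_count

    A, HTV, APC = 4, 20, 9                # request of 4 frames, threshold of 20, 9 available
    print(release_request(A, HTV, APC))   # 15 frames to be released from the in-use queues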
ACKNOWLEDGMENTS

The author would like to thank Dr. K. Noda for his support of the writing of this paper, and Dr. N. Ikeno and Dr. Nakamura for their valuable suggestions and many discussions.

The realization of symmetric switching functions using magnetic bubble technology

by H. CHANG,* T. C. CHEN and C. TUNG
IBM Research Laboratory
San Jose, California

INTRODUCTION

Since its debut in the late sixties, magnetic bubble technology has quickly evolved to become a promising alternative to semiconductor technology for computer construction.1-7 While emphasis has been placed upon the application of bubble technology to storage, its application to logic has thus far been largely limited to the implementation of simple basic operators, such as AND, OR, etc.5,7 An exception is the excellent work recently reported in References 12 and 13. This limitation to simple basic operators, however, is not in keeping with the high density and low connectivity requirements of LSI (Large-Scale Integration), and it has become increasingly important to find powerful multi-input switching functions. The symmetric function is a natural candidate. The realization of symmetric functions using magnetic bubble technology has been found to be very simple.

Several features of magnetic bubbles are noteworthy for device applications: (i) stable bubbles exist over a range of bias field strengths, thus exhibiting storage capability; (ii) a bubble can be deformed by lowering the bias field for further manipulation, e.g., bubble generation, replication, etc.; (iii) a bubble can be annihilated by raising the bias field; (iv) bubbles interact with one another like magnets when they get closer than about three diameters. These interactions limit storage density, but they are necessary for the implementation of logic circuits.

The second part of this paper provides some basic information for a qualitative understanding of the bubble technology. Part three briefly reviews symmetric functions and also introduces residue threshold functions. Part four describes the mechanism for realizing symmetric functions, and part five presents an implementation. Some concluding remarks are made in part six.
In the past, more attention has been given to the application of bubble technology to data storage than to data processing. The most popular configuration of bubble storage, by far, is the shift register with bubbles (representing OXE's) and voids (representing ZERO's) propagating along fixed tracks. Propagation The transmission and manipulation of information rely, directly or indirectly, on the propagation of bubbles. There are two basic methods of producing movement of bubbles in the plane. The first method employs the current in a conductor loop to produce a field for attracting an adjacent bubble. A sequence of bubble positions may be propagated by exciting a series of conductor loops wired to carry current pulses. This is referred to as "conductor propagation." The second method, "field access propagation," depends on the alternating magnetic poles in a patterned permalloy overlay; the poles arc induced by a rotating field in the plane. A permalloy bar is easily magnetized by this field along its long direction. When suitable permalloy patterns are subjected to this rotating field, the induced magnetism in the form of a moving train of poles pulls the attracted bubbles along. Since field access propagation is more suitable for the implementation discussed in this paper, it will be examined in more detaiL The integers 1, 2, 3 and 4 v.ill be used to denote the four phases of the rotating field, counterclockwise starting from the first quadrant, as shown in Figure 2. :.vIAGKETIC BUBBLES Fundamentals Basic to magnetic bubble devices is the existence of magnetic domains in uniaxial magnetic materials over a range of bias field. The magnetic domain is a cylindrical region in a garnet film or an orthoferrite platelet-hence the name bubble-with magnetization perpendicular to the plane of film or platelet and opposite to that in the surrounding region. This configuration, Figure 1, is achieved when the film or the platelet has uniaxial magnetic anisotropy to orient the magnetization perpendicular to the plane, has sufficiently low magnetization to prevent the demagnetizing field to force the magnetization into the plane, and has a bias field opposite to the bubble magnetization direction to prevent the bubble from expanding into serpentine domains-the natural demagnetized state. * With T. J. Watson Research Center, Yorktown Heights, New York. 413 414 National Computer Conference, 1973 .1 1 Easy axis of uniaxial anisotrophy Figure I-A bubble is a cylindrical magnetic domain with magnetization opposite to that of its surrounding. It exists in a thin film of uniaxial anisotropy, under proper bias field The permalloy pattern shown in Figure 3 ",ill guide the bubble propagation from left to right. As the field rotates from 1 to 2, for instance, the upper end of the vertical I -bar to the right of the current bubble position will be magnetized positively and thus be able to attract the negative end of the bubble toward the right. The entire bubble moves as a result. Information is carried by a stream of bubbles and voids (vacancies), conventionally designated to represent 1 and 0, respectively. As each bubble in the stream moves by a unit 2 4 (a) Rotating Field ,. 3 L DOOO Rotating field A bubble t Permalloy pattern Figure 3-Bubble propagation-A T-J bar permalloy pattern propagates bubbles by moving magnetic poles induced in a rotating field distance, the voids in between, if any, ",ill have an apparent movement also by a unit distance. 
Thus the entire stream flows at a regular rate in response to the periodic magnetization of the regular TI permalloy pattern. When the permalloy pattern like the one shown in Figure 3 is arranged in a loop, a shift register memory results. An idler is a cross-like permalloy pattern as shown in Figure 4. In the absence of extraneous influence, the bubble in an idler will circulate indefinitely; it is movable by, for example, a suitably positioned repelling bubble or magnetism induced by a wire loop nearby. Thus, without the external influence of magnetic force other than the rotating field, a vacant idler position serves as a bubble trap! and a filled 4 2 (b) Corresponding Positions of Induced Positive Magnetic Poles Figure 2-Labelling convention for the four phases of a rotating field and their corresponding positions of induced positive magnetic poles Figure 4-An idler circulates a bubble within a permalloy cross Symmetric Switching Functions idler position appears like a "short-circuit" insofar as bubble propagation is concerned; a stream of bubbles and voids in a tandem of idlers may thus be able to remain stationary in the presence of the rotating field. Other forms of permalloy patterns can be used, notably Y-patterns and "angelfish." 8 ~~ j2 Interact-ion The magnetostatic interaction between bubbles is essential to the generation of logic functions. Bubble domain media such as orthoferrite platelets and garnet films have very low wall motion threshold. Hence the flux lines emanated from a bubble are adequate to move an adjacent bubble as far as two or three bubble diameters away. Figure 5 shows how bubbles can interact with each other to generate two logic functions. s The device shown in Figure 4 has two streams of input variables, namely A and B. The presence or absence of a bubble (also respectively called bubble and void) represents the true or false value of a variable. The lower bubble, representing A, will propagate toward the output terminal labelled A VB, independent of bubble B. If bubble B is currently at the position shown, then one quarter cycle later it may take one of the two possible paths, depending on the presence of bubble A. With bubble A being where it is as shown, the interaction of these two bubbles will force bubble B to propagate to 4' rather than to 4, one quarter cycle later. The information that both A and B bubbles are present is conveyed by the propagation of bubble B toward the output terminal labelled A AB. With the absence of bubble A, bubble B will propagate downward to the output terminal labelled A V B. In addition to the interaction of bubbles, one can clearly see that the precise timing of bubble propagation is crucial to the proper generation of logic functions. C> D8 D ~8~ u U 1 3 / 415 0. . D 0··· 341 3.5 L OD& Figure 6-Bubble generation-A permalloy pattern generates new bubbles by stretching and then severing a mother bubble into halves The realization of other logic elements such as binary fuIi adder has been demonstrated to be feasible. 8 Generation and annihilation A permanent bubble associated with the generator disk, shown in Figure 6, is forced to stretch ·when one end becomes trapped during phase 2 of the planar rotating field. As the planar field keeps rotating, the bubble is further stretched. Between phases 3 and 4 its thinly stretched position ",·ill sever into two, leaving a newly formed bubble to the right of the generator disk. The annihilation of a bubble can be achieved by arranging the permalloy pattern as shown in Figure 7. 
During phases 3 and 4, the bubble remains essentially in the same position. During phase 1, the bubble is severely weakened because the attracting pole of the permalloy pattern is remote; yet the repelling one is near and strong, thus annihilating the bubble. SY~rYIETRIC AVB Figure 5-Bubble logic-A permalloy pattern brings bubbles together to interact magnetostatically and thereby generates logic functions SWITCHIKG FUXCTIOXS Symmetiic switching functions The function S(A I X) =S(al, a2, . .. , am I Xl, X2, . •. , Xn) 416 National Computer Conference, 1973 db ~ ~'-----------1 Xd\X2= S(O, 1 (b) (c) 8-------:-"111 ~ with Qi: an integer and O::::;ai::::;n Xj: 0 or 1 (switching variable) is said to be a symmetric switching function when and only when the sum over X equals one of the values in A, i.e., L Xi=one of the values in A = 0 otherwise. The values in A are commonly called a-numbers. It is clear that a permutation of the vector X does not change the value in S, hence the term "symmetric." 9-10 Symmetric switching functions can be used to synthesize other logical functions with a proper selection of a-numbers. In fact, the symmetric function is a universal logical connective since the commonly used universal set of logical connectives, AXD, OR, and XOT can be synthesized trivially: Xd\X2=S(21 Xl V X 2 = S(I, Xl X 2 = S(O 1 1 Xl, X2) Xl, X2). As examples of synthesizing practical circuits, the binary adder with X, y, z as its input operands and input carry can have its outputs, carry and sum, defined by: 31 X, y, z) sum= S(l, 31 X, y, z). carry=S(2, Residue threshold functions A subset of symmetric functions, called residue threshold functions, has been recently studied. ll Given n switching variables Xl, X2, ••. ,Xn , m a positive integer, and t a nonnegative integer, the residue threshold function is defined as R(t, m 1 Xl, ... ,xn ) =.R(t, miX) =.t::::; (L Xi) :\Iod m that is, R(t, miX) = 1 if and only if (L Xi) :\Iod m~t. i Here, (L Xi) ::\Iod m is defined to be the least positive remainder of (L Xi) 1m, which is a number between 0 and m-1 inclusively. The relationship between the symmetric switching function and the residue threshold function is very simple. R(t, miX) (d) Figure 7-Bubble annihilation-A permalloy pattern propagates a bubble to a trapping pO!'ition where the bubble is collapsed upon field reversal SeA 1 X) = 1 if Xl V (a) db~-----------. ~ J1~-----------. ~ db As further examples, the popular NAND and NOR functions, each of which is a universal logical connective in its own right, can be synthesized by symmetric functions, at no increase in complexity, as follo\vs: Xl, X2) 21 Xl, X2) =S(O'XI). is equivalent to SeA 1 X), \vith A containing all positive intf'gf'rs, a;'s (a;::::;n) such that t::::;a; :\Iod m. As noted before, the symmetric function derives its powerful capability of synthesizing any other logical function from the "personalization" of its a-numbers. In practice, the anumbers are usually much more structured than a mere enumeration would indicate, and a common structure is the cyclicity. An example here may help clarify this point. To find the parity of a sequence of bits, one needs only to know whether the number of ONE's in the sequence is even or odd. The exact number of ONE's is immaterial. Thus, instead of specifying S(l, 3, 5, 7, ... 1 X), one needs only to specify R(l, 21 X). The underlying structure permits a significantly simplified implementation as will be seen in a later section of this paper. 
STEPS TO\VARD REALIZING SY:\DIETRIC SWITCHING FUXCTIOXS Based on economical considerations, the progress of LSI of solid state devices is measured in terms of the ever in- Symmetric Switching Functions creasing device density and chip area, hence the number of devices per chip. One should note that as the area for devices is increased by a factor of m 2, the periphery for interconnections is increased only by a factor of m. Thus the merit of LSI can best be improved by increasing the versatility of devices to permit self-sufficiency within the chip and to reduce the number of external interconnections required for input/ output, control and power. Bubble domain devices with shift-register type of memory capability and symmetric switching function type of logic capability appear to be an 417 SIAiXi attractive candidate for LSI. Here we consider the mechanisms required for the realization of symmetric switching functions using magnetic bubble technology. Implementation based on these observations will be discussed in the next section. Given a sequence of bubbles and voids, X, consider: (i) Bubble sifting: since the symmetric function is invariant to permutation, one can sift the bubbles in X to result in a new sequence Y in which bubbles gravitate to one end (say, left end) of Y. For instance, if X is 0 1 0 1 then Y ,,,"ould be 1 1 0 o. (ii) Leading bubble detection: there exists a simple relationship between the position of the leading bubble (rightmost bubble) in Y and the number of bubbles in either X or Y. This relationship is m=n+1-p, or p=n-m+1, where m is the number of bubbles, n the length of X or Y, and p the position of the leading bubble (I-origin, left indexed) in Y. For the case m= 0, p will be considered n+ I as the above formula dictates. In practice, one can augment Y into Z by appending a I to the left of Y, in order to accommodate the m = 0 case. At the end of this leading bubble detection stage, one obtains a bubble stream W in which there is one and only one bubble. (iii) Interaction \vith the control bubble stream: a control stream of bubbles and voids can be constructed with the positions of bubbles representing the a-numbers. That is, A = aI, ... , am is represented by A-nt...mbers Figure 8-A block diagram showing required bubble mechanisms to realize symmetric switching functions Example 1: X= 0101 A=O, 2, 3 y= 1100 B=10110 Z=11100 W=00100 ANDing between Wand B W=OO~OO B=10~10 T hence S (A I X) = 1 Example 2: X= 0000 A=3 B=00010 Y= 0000 Z=10000 W=10000 ANDing between Wand B W=~OOOO B=~O 010 F hence SeA I X) =0 IMPLEMEXTATION Figure 9 shows the detailed permalloy pattern implementing the scheme depicted in Figure 8. It consists of four reI.\PUT (both data & flusher! such that = 0 otherwise. By proper timing, the information coming out of the stage of leading bubble detection (with bubble/void representing that the leading bubble has/has not been detected) is required to AXD with the components of B. At any time during AXDing a 1 output indicates that the number of OXE's in X agrees with one of the a-numbers_ Therefore, S (A ! X) = 1 if and only if AKDing of the control stream and the output from the leading bubble detection stage yields a true output. The mechanism described above is summarized in a block diagram shown in Figure 8. 
IMPLEMENTATION

Figure 9 shows the detailed permalloy pattern implementing the scheme depicted in Figure 8. It consists of four regions, with region I being the sifter, region II the leading bubble detector, region III the control bubble stream, and region IV the AND gate.

Figure 9-A permalloy pattern to implement the block diagram in Figure 8

The bubble sifter (region I) contains n idlers in tandem, with the leftmost one connected to the vertical input channel. The idlers within the sifter are slightly offset in such a way that it favors bubble movement toward the right. A bubble in the idler tandem, however, will not move to the right unless all the idler positions to its left are filled with bubbles, and there is a bubble entering at the input channel trying to push all the bubbles in the sifter. The net effect is that voids will skip all preceding bubbles. The input stream X as defined before will, after n cycles, become the n-bit stream Y residing in the sifter with, say, m bubbles juxtaposed with (n-m) voids to their right. Without further external influence, the Y stream will stay in the sifter indefinitely. The entering of an (n+1)-bubble flushing stream at the input channel at this time will drive the Y stream to the right for leading bubble detection. The first bubble in the flushing stream and the Y stream form the augmented stream designated as Z in the previous section.

Initially, the leading bubble detector (region II) contains a resident bubble in its right idler. As the very first bubble from the sifter arrives at the center idler, the resident bubble will be repelled to the right into the AND gate (region IV), conveying the information that the leading bubble of the Z stream has arrived. The bubble in the center idler is trapped there and, through its magnetostatic force, will divert all the following bubbles from the sifter upward to an annihilator.

A bubble generator (not shown in Figure 9) will issue bubbles, according to the a-numbers of the given symmetric function, into region III, which is a closed loop shift register. This circulating bubble stream is the personalized control bubble stream B discussed previously.

The description of detailed operations in terms of bubble interactions with the permalloy pattern is now in order. The sequence of events together with the time intervals is shown in Figure 10. For illustration, we assume that the input is 0 1 0 1, with the rightmost position being the leading position, followed by a sequence of five flushing bubbles. The bubbles in the input stream (X or Y), called data bubbles, are represented by solid dots, the flushing bubbles by circles, and the resident bubble by a circle with a cross in it.

The initial (t=0) status is shown in Figure 10-a; the corresponding field phase is 3 (see Figure 2). At t=1, the first data bubble has advanced to the leftmost idler, and at the input channel is the void.

Figure 10-Sequence of key events in the symmetric-function bubble device: (a)-t=0. The device is cleared. A resident bubble is loaded into the first bubble detector. The first data bit (a ONE, i.e., a bubble) of the input bubble stream (0101) is ready to enter the sifter.

Figure 10(b)-t=2. The first data bit is trapped at the first position (leftmost idler) while the second data bit (a void) has skipped by. This demonstrates the sifting function.
One cycle later, t = 2, the first data bubble still remains at the same position; thus it can be said that the void has skipped the bubble preceding it and now resides at the second idler from the left. At the same time, the second data bubble arrives at the input channel. At t = 2¼, the second data bubble moves downward and repels the first data bubble toward the right, putting it temporarily out of synchronization. At t = 2¾, the first data bubble is attracted to position 1 of the second idler from the left and is thus re-synchronized with the rotating field. The above sequence of operations in the last three field phases of a cycle is shown in Figure 10-c.

Figure 10(c)-t = 2, 2¼, 2¾. The third data bit (a bubble) pushes the first data bit (a bubble) to the second position.

Figure 10-d shows the status at t = 4; the two data bubbles are residing at the two idlers at the left, and the two voids can be thought to have skipped the bubbles and advanced to the two idlers at the right. The first flushing bubble is now at the input channel. For this example, it takes two more cycles to fill up the sifter and the input channel, and two additional cycles for the leading data bubble to be flushed out of the sifter and propagated to the center idler of the detector.

The interactions at t = 8 and t = 8¼ are shown in Figure 10-e. As time advances from t = 8 to t = 8¼, the presence of the first data bubble in the detector causes the resident bubble to move to position 4' in region IV, and the bubble following it to move upward to position 4' in region II, which leads to an annihilator. As the leading data bubble is trapped in the center idler, all following bubbles will be diverted upward to the annihilator. Consequently, during the whole operation one and only one bubble will leave region II and enter region IV. Half a cycle later, at t = 8¾, the resident bubble is at position 2 in region IV, as shown in Figure 10-f. If, at this instant, there is a bubble in region III at the position indicated by a small square, the resident bubble will be forced to move to position 3' in region IV at t = 9, giving an indication that the symmetric function is true. Otherwise, the resident bubble will move downward in region IV and be annihilated eventually. It is clear now that the upper portion of region IV behaves as an AND gate.

Figure 10(e)-t = 8, 8¼.
The leading data bubble has entered the center idler of the leading bubble detector, pushing the resident bubble into the AND gate. The leading data bubble is trapped in the center idler, diverting the following bubbles to an annihilator.

Figure 10(d)-t = 4. The flusher bubbles are ready to enter the sifter.

Figure 10(f)-t = 8¾. The resident bubble is at a position to interact with the control bubble streams. The presence of a control bubble at the square will force the resident bubble to move to 3', leading to the output, indicating that the symmetric function is true. The absence of a control bubble will permit the resident bubble to move to 3 and then the annihilator, indicating that the symmetric function is false.

In general, with an n-bit (bubble/void) input having m bubbles, the critical ANDing time ta (ta = 8¾ in the case discussed above) between the resident bubble and the control bubble stream is

ta = n + (n - m) + 2 + ¾ = (2n - m) + 2¾,

of which n cycles are required to load and sift the data bubbles and voids in the sifter (region I), n - m cycles are required to fill the sifter, 2 more cycles are required to propagate the rightmost bubble in the sifter to the center idler of region II, and finally ¾ cycle is required to move the resident bubble to position 2 of region IV. It can be easily deduced from the above formula that if the resident bubble cannot be found at position 3' in region IV before or at t = 2n + 3, then the symmetric function is false. In other words, the operation time, excluding initialization, of this device is 2n + 3 cycles.

We have shown a bit-serial implementation. If each idler in the sifter is directly connected with an input channel, the parallel input operation can be performed to gain speed. This is because the sifting can be performed during flushing time; no data bubble can leave the sifter until all idlers in it have been filled. Assuming that the flushing bubbles are allowed to enter at the leftmost input channel, we find

ta = 1 + (n - m) + 2 + ¾ = (n - m) + 3¾,

and the operation time for this parallel input is thus n + 4.

In many applications, the a-numbers are well structured and thus help simplify the control bubble stream significantly. As we discussed earlier, the parity check of a sequence of bits, X, can be expressed as R(1, 2 | X). To realize this, the control bubble stream needs only to consist of one bubble and one void in a tight loop, saving much space.

CONCLUSION

We have shown how to realize symmetric switching functions with magnetic bubble devices. The implementation is simple, yet with the easily achievable personalization of the control bubble stream it produces a versatile and powerful logic device. Our a-numbers stream is a simple example of personalization through a circulating shift register memory. The personalization persists as long as the vertical bias magnetic field is present.
Both bubble memory and bubble logic devices are implemented with very similar permalloy patterns, hence it is possible to have a mixture of memory and logic at a very local scale. Such a mixture is particularly attractive because of its low cost and low power dissipation. Note that traditionally memory and logic are in separate units. For example, the ferrite core memory and the semiconductor central processing unit are separate because of different technologies. In semiconductors, memory and logic are separate, partly because of the density contrast of repetitive cells in memory versus a great variety of cells in logic, and more importantly because read-write memory is volatile and logic must use a more nondestructive implementation. Thus circuit logic resembles read-only memory, and tends to be different from read-write memory in construction. The magnetic disks, drums, and tapes simply do not have any resident logic capability, and must rely on external logic circuits (control unit, channels, etc.) for data routing and data management. With the capability of an intimate mix of memory and logic, many of the previous demarcation lines can be removed. The design optimization should be greatly facilitated. In fact, the hardware capability may induce revolutionary changes in computer organization and architecture.

REFERENCES

1. Bobeck, A. H., Fischer, R. F., Perneski, A. J., Remeika, J. P., Van Uitert, L. G., "Application of Orthoferrites to Domain Wall Devices," IEEE Trans. Magnetics 5, 3, September 1969, pp. 544-553.
2. Perneski, A. J., "Propagation of Cylindrical Magnetic Domains in Orthoferrites," IEEE Trans. Magnetics 5, 3, September 1969, pp. 554-557.
3. Thiele, A. A., "Theory of Static Stability of Cylindrical Domains in Uniaxial Platelets," J. Appl. Phys. 41, 3, March 1970, pp. 1139-1145.
4. Bonyhard, P. I., Danylchuk, I., Kish, D. E., Smith, J. L., "Application of Bubble Devices," IEEE Trans. Magnetics 6, 3, September 1970, pp. 447-451.
5. Sandfort, R. M., Burke, E. R., "Logic Functions for Magnetic Bubble Devices," IEEE Trans. Magnetics 7, September 1971, pp. 358-360.
6. Ahamed, S. V., "The Design and Embodiment of Magnetic Domain Encoders and Single-Error Correcting Decoders for Cyclic Block Codes," B.S.T.J. 51, 2, February 1972, pp. 461-485.
7. Garey, M. R., "Resident-Bubble Cellular Logic Using Magnetic Domains," IEEE Trans. Computers C-21, 4, April 1972, pp. 392-396.
8. Bobeck, A. H., Scovil, H. E. D., "Magnetic Bubbles," Scientific American 224, 6, June 1971, pp. 78-91.
9. Harrison, M. A., Introduction to Switching and Automata Theory, McGraw-Hill, New York, 1965.
10. Kohavi, Z., Switching and Finite Automata Theory, McGraw-Hill, New York, 1970.
11. Ho, I. T., Chen, T. C., "Multiple Addition by Residue Threshold Functions," IEEE CompCon Proceedings, September 1972.
12. Minnick, R. C., Bailey, P. T., Sanfort, R. M., Semon, W. L., "Magnetic Bubble Computer Systems," AFIPS Proceedings, Vol. 41, December 1972, pp. 1279-1298.
13. Minnick, R. C., Bailey, P. T., Sanfort, R. M., Semon, W. L., "Magnetic Bubble Logic," WESCON Proceedings, 1972.

The Control Data STAR-100 paging station

by W. C. HOHN and P. D. JONES
Control Data Corporation
St. Paul, Minnesota
INTRODUCTION

The Control Data STAR-100 is a large capacity, high speed, virtual memory1,2 computer system whose input/output for storage and access is managed by "Stations"3 separate from the main computer. This modularizes the total computing function into independent, asynchronous tasks which operate in parallel with the central processor. The approach also simplifies the central processor design and provides for fan out to a large number of storage devices and terminals. A station consists of an SCU (station control unit) and an SBU (station buffer unit). An SCU is a mini-computer with a small drum and display console, existing with power supplies and cooling in its own cabinet. An SBU consists of 64K (K = 1024) bytes of high bandwidth core memory. A STAR-100 system is shown in Figure 1, with the performance goals noted on each storage station. The M/P station manages maintenance and performance analysis of the STAR-100 mainframe. The media, working and paging stations consist of tapes and disk packs, large disk, and drums respectively.

Figure 2 shows the layout of a user's virtual storage, which covers all the program's active files wherever they are stored. Each user has four keys, which reside in the program's control package and provide four levels of access protection in virtual memory. In STAR-100 there is one global page table for all users, with one entry for each core page. There are two page sizes, normal (4096 bytes) and large (128 times normal). The page table has 16 associative registers at the top; the rest of the table is in core memory (1008 entries in the case of the typical four million byte memory). The translate time in the 16 associative registers is one minor cycle (40 nanoseconds) and the search time is approximately 50 entries per microsecond. When a hit is made, that entry jumps to the top of the table; thus the most frequently referenced blocks have entries near the top of the table and, conversely, the best candidates for removal from memory are at the bottom of the table. This paging mechanism was basically chosen to give the machine a virtual memory capability without degrading performance (100 million results per second). The memory access mechanism is illustrated in Figure 3. When the virtual address is not found in the page table, an access interrupt occurs and control is switched to the monitor. What happens then is flow diagrammed in Figure 4. The paging station contains the overflow pages from main memory and as such is critical in the overall performance of the system.

Figure 1-STAR-100 system
Figure 2-STAR-100 virtual memory
Figure 3-Memory access mechanism
Figure 4-Access interrupt control
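The self-ordering search just described (a hit moves the entry to the top, so the least recently referenced pages settle to the bottom) can be modelled in a few lines. The sketch below is an added illustration under simplifying assumptions: the whole table is treated as a single ordered list, and the 16 associative registers, the two page sizes and the protection keys are ignored; it is not STAR-100 hardware or monitor code.

```python
# Illustrative model of the self-ordering page-table search described above.

class AccessInterrupt(Exception):
    """Raised when a virtual address is not found in the page table."""

class PageTable:
    def __init__(self, entries):
        self.entries = list(entries)          # (virtual_page, core_page) pairs,
                                              # most recently hit entry first

    def translate(self, virtual_page):
        for i, (vpage, core_page) in enumerate(self.entries):
            if vpage == virtual_page:
                # A hit moves the entry to the top, so frequently referenced
                # pages stay near the top and the best removal candidates
                # drift to the bottom of the table.
                self.entries.insert(0, self.entries.pop(i))
                return core_page
        raise AccessInterrupt(virtual_page)   # control passes to the monitor

table = PageTable([(100, 1), (205, 2), (317, 3)])
table.translate(317)                          # hit: this entry jumps to the top
assert table.entries[0] == (317, 3)
try:
    table.translate(999)                      # miss: the analogue of an
except AccessInterrupt:                       # access interrupt; the paging
    pass                                      # station would supply the page
```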
Having the SBU between central memory and the drum reduces channel design complexity in central by not having to interface to critical, real time, rotating devices. DRUM QUEUE AND DRIVER Requests for drum transfers are not made to the driver directly but instead to the queue program. This program translates the drum block address (from the comparator search) into a head and sector address. If the resulting sector position is free in the associative queue the request is placed in the queue, otherwise it is placed in retry mode when it is offered to the queue program periodically until accepted. As the number of requests increase, the probability of filling more associative queue slots increases (assuming random drum block addresses) thus raising drum throughput. 421 422 National Computer Conference, 1973 VIRTUAL ADDRESS VI RTUAL PAGE 33 BITS BIT ADDR 15 BITS + ,\OUND PHYSICAL ADDRESS PAGE ADDR BIT ADDR 11 BITS 15 BITS ACCESS Figure l-STAR-lOO system INTERRUPT Control of the drum is illustrated in Figure 7. The driver program scans the drum angular position and its associated queue slot and if a request exists for that angular position the necessary control information is sent to the drum hardware. The drum control hardware has the ability to stack functions thus making it possible to "stream" contiguous sectors from the drum. SYSTEM TASKS PRIVATE SHARING PUBLIC SHARING I I 32 TRILLION BYTES ACCESS INTERRUPT I \Access Protection I I LIBRARY KEY Figure 3-Memory access mechanism KEY I '-Access protection I I KEY I \Access Protection CREATE/KILL USER I I KEY I cAccess protection 256 REGISTERS Figure 2-STAR-IOO virtual memory Figure 4-Access interrupt control The Control Data STAR-100 Paging Station CENTRAL REQUEST 423 ASSOCIATIVE QUEUE STATION BUFFER UNIT {SBU} §f 18 19 20 21 COMPARATOR Figure 7-Drum queue and driver STATION CONTROL UNIT {SCU} Figure 5-Paging station hardware configuration COMPARATOR The virtual page table maps the drum(s) one entry per drum block. Each 64 bit entry contains a unique drum block address and is flagged as either free or attached to a virtual address. The entry format is II , 1 VIRTUAL BLOCK KEY 2 33 12 . ~ Free/ActIve '\. Usage Bits DRUM BLOCK I 16 bits This form of page table maximizes performance because the table is both compressed (all active entries at the top) and ordered by activity, two characteristics which minimize search time. The table scan rate is one entry every 1.1 microsecond or 910,000 entries per second. Two compares are simultaneously made against a table entry (if .only one virtual address request is active the second search is for a null). The average search time, if both virtual addresses exist in the table, is two thirds of the table length times memory speed. 8=2/3L ·M If either request is absent from the table then the average search time becomes table length times memory speed. 8=L·M The comparator is a hardware unit which compares selected virtual addresses against the page table entries. The hardware "ripples" the table as it searches for a compare. That is, all entries move down as the search passes by them unless a match is found. The entry which matches is placed in the now vacant slot at the top of the list, thereby generating in time a list topped by the most active entry and arranged thereafter in order of descending activity. In both cases table length, L, is the dynamic length of active table entries, (the table entries attached to a virtual address). 
If the ripple search were replaced by a straight linear search, the search time would be the reserved table length times the memory speed. Table I lists some search rates reflecting the dynamic table length characteristic of the rippled page table.

Figure 6-Paging station functional configuration

Table I-Comparator Search Time
(page table memory speed = 1.1 microseconds)

Active Table Length (Entries)   Search Time (Milliseconds)   Search Rate, 2/S (Per Second)
1000                            1.1                          1820
2000                            2.2                           910
3000                            3.3                           606
4000                            4.4                           455
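The entries in Table I follow from the scan rate and formulas above. The short check below (an illustration, not station code) recomputes them, assuming the tabulated search time is S = L * M and that the 2/S rate reflects the two compares made on each pass of the table.

```python
# Recompute Table I from the definitions in the text: search time S = L * M
# for the tabulated case, and rate = 2 / S because two virtual-address
# compares proceed against the table simultaneously.
M = 1.1e-6                                   # page table memory speed, seconds per entry

for L in (1000, 2000, 3000, 4000):           # active (dynamic) table length
    S_scan = L * M                           # full scan of the active entries
    S_both_hit = (2.0 / 3.0) * L * M         # average when both addresses are present
    rate = 2 / S_scan                        # two searches completed per scan
    print(f"{L:5d} entries: {S_scan*1e3:4.1f} ms "
          f"(avg {S_both_hit*1e3:4.2f} ms if both hit), ~{rate:4.0f} per second")

# Output matches the table: 1.1 / 2.2 / 3.3 / 4.4 ms and roughly
# 1820 / 910 / 606 / 455 searches per second.
```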
MESSAGES

The paging station is driven by messages from the central processor. The paging messages are listed in Figure 8. A brief discussion of the message switch mechanism is contained in the next section on performance. Essentially, the paging station polls central memory for messages; on finding a group of active messages, they are read to the SCU where they are processed.

Figure 8-Paging messages and formats. The functions are: 200 Read page; 201 Write page; 202 Rewrite page; 203 Delete N pages; 204 Delete key (N deleted); 205 Read most active block with given key, then delete (page name and usage bits returned); 206 Read least-active block with given key, then delete (page name and usage bits returned); 207 Read and delete page; 208 Read drum page table; 209 Swap page. The read, write, rewrite and read-and-delete functions use format 2A, whose parameters B, U, K (12-bit key) and P (33-bit virtual page address) are packed into one 64-bit word; B is a block address or a number of pages, and U holds the usage bits, which are stored in the drum page table on write and rewrite and returned in this position on read (bit 1 is 0/1 for unmodified/modified since initial access). Format 2B (read drum page table) carries the starting block for the drum page table, the number of blocks to be read, the word index (64-bit words) to the first active entry, and the index to the last + 1 active entry. In format 2C (swap), B, U, K and P are the same as in format 2A, and the subscripts R and W denote the pages to be read and written respectively. In the original figure, underlined parameters are those returned with the response. Every message carries a header of 16-bit fields: response code, length and priority, "to" and "from" zipcodes, and the function code.

The meaning of the messages is self-evident and the following sample message is typical:

xxxx 1300 0100              --Header
0200 0040 0000 0060 4800    --Message Parameters
(hexadecimal numbers)

The header tells us the message parameter length is one 64-bit word and that the message is being sent from station #0100 (central) to station #1300 (paging). The message is to read (function code 200) virtual block #4800, key #80 (#40 shifted one place left), to central memory block #60. The usage bits of this page, that is, whether or not it has been modified by the central processor, will be returned with the response.

MESSAGE PROCESSING

The steps involved in processing a typical paging message are shown in Figure 9. All code is reentrant and many messages can be processed simultaneously. The work space for each message is contained in its control package. The number of messages which can be processed simultaneously is a station parameter and is set to ensure that a deadlock situation cannot arise, that is, a situation where no message can proceed because of a lack of resources. The comparator search, drum transfer and channel transfer are asynchronous operations which operate in parallel with the SCU processor. For each function code there is one unique message process routine which makes heavy use of subroutines to fulfill its role. The message routines are not permanently resident in core memory but float in and out as required from the mini-drum.

Figure 9-Read page flow diagram
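Figure 9 itself is not reproduced here, so the following Python fragment paraphrases the read-page flow in software terms: comparator search, drum transfer into the SBU buffer, channel transfer to central memory, and a response carrying the usage bits. All names and data structures are invented for the sketch; it is not the SCU message routine.

```python
# Hypothetical, simplified paraphrase of the read-page flow of Figure 9.
# The dictionaries standing in for the comparator's page table, the drum,
# and central memory are illustrative only.

page_table = {(0x4800, 0x80): {"drum_block": 17, "usage_bits": 0b01}}
drum       = {17: b"\x00" * 4096}            # drum block -> 4096-byte page
central    = {}                              # central memory block -> page

def process_read_page(virtual_block, key, central_block):
    # 1. Comparator search: is the virtual address attached to a drum block?
    entry = page_table.get((virtual_block, key))
    if entry is None:
        return {"function": 0x200, "response": "not found"}

    # 2. Drum queue/driver: in the real station the block address is turned
    #    into a head and sector, queued by sector position, and streamed.
    page = drum[entry["drum_block"]]

    # 3. The page flows drum -> SBU buffer -> central memory over the channel.
    sbu_buffer = bytearray(page)             # rented only for the transfer time
    central[central_block] = bytes(sbu_buffer)

    # 4. The response returns the page's usage bits (modified/unmodified).
    return {"function": 0x200, "response": "ok", "usage_bits": entry["usage_bits"]}

print(process_read_page(0x4800, 0x80, 0x60))   # the sample message above
```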
PERFORMANCE PARAMETERS

There are a number of parameters which have a first order effect on station performance; these are now discussed.

Storage device performance

Large capacity, fast drums (or fixed head disks) are normally used at present for the paging devices. The goal in the STAR system is for a paging device with 10^9 bit capacity, a 40 megabit transfer rate, and a capability of delivering 1000 random page transfers per second. The initial paging station has twin Control Data 865 drums, each with 1408 pages (1 page = 4096 bytes), made up of 64 head groups of 22 pages per 34 millisecond revolution time. The maximum transfer rate is approximately two times 660 pages per second.

Processor speed

The average number of memory cycles to process a message such as read page is 3000. Table II lists the maximum throughput one can expect with varying memory speeds. Figures 10 and 11 show performance curves for two memory speeds, 1.1 and 0.2 microseconds. Experiments were conducted to obtain essentially a curve of station performance versus processor speed; the maximum throughput in Table II was not obtained, since when the processor time required to drive a block from drum to central memory became larger than the drum page transfer time (about 1.5 milliseconds) it was not possible to stream pages from the drum. Drum revolutions were lost and there was a sudden drop in performance.

Table II-Message Throughput Versus Cycle Time
(average number of cycles per transaction = 3000)

Memory Cycle Time        Maximum Throughput
0.1 microseconds         3333 messages/second
0.2                      1667
0.5                       666
1.0                       333
1.1                       300
2.0                       167

Figure 10-Experimental results
Figure 11-Experimental results
(Both figures plot messages per second against station queue length, for message mixes of 50% reads/50% rewrites and 50% reads/25% rewrites/25% not found; curve labels: O = one drum, T = two drums, S = slow memory (1.1 us), F = fast memory (0.2 us), M = modified software.)

Data buffer size

With the present SBU's a number of blocks are available to be rented as buffer space for drum transfers. Initially these were rented at the beginning of a message and released at the end (Figure 9). This resulted in a falling off in performance as the queue length became large. The software was modified to enable the drum driver to rent SBU space only for the drum and channel transfer time, thus making better use of this resource. The difference in performance is shown in Figures 10 and 11.

SBU bandwidth

As mentioned earlier, the paging station SBU operates as two independent phased memories with approximately 58 megabits of bandwidth in each half. Table III illustrates the channel bandwidth available when zero, one and two drums are active. Clearly, just as continuity of flow applies in aero- and hydro-dynamics, it also applies in data-dynamics. If channel transfers cannot keep up with drum transfers then drum revolutions are lost and the performance curves flatten off.

Table III-Channel Transfer Times
(maximum bandwidth = 58 megabits)

Drums Active     Channel Transfer Rate
0                40 megabits
1                24
2                10

Message switch

The message handling mechanism between central memory and the SCU vitally affects station performance and already has undergone three distinct developments. Initially, there was a circular queue with pointers to the various messages. This scheme involved too many channel transfers, 3+N in fact (read the queue pointers, read the message pointers, read N messages, reset the queue pointers), and was discarded. Apart from the channel transfer time itself, there is considerable processor overhead in driving a channel. In the second scheme, messages were collected in channel "boats" (Figure 12) and one transfer brought over N messages. The drawback with this scheme was that only one "boat" was brought over at a time, and it was returned only when all messages were complete. This resulted in poor performance and very poor response times. The third scheme was, in effect, a sophisticated boat scheme. Boats could be processed before they were full, multiple boats were handled, and responses were sent back in the first available boat. Figures 10 and 11 show performance curves obtained using this last scheme with a constant queue of messages maintained in the station. In this case the response time is given by the queue length divided by the number of messages serviced per second.

Figure 12-Message boat (a header carrying checksum, zipcode, next incoming boat address, message count and full flag, followed by messages 1 through N and a control area)

Other factors

SCU memory size for message and control package buffers is another performance limiting parameter, but 16K (K = 1024) bytes appears adequate for the present. The comparator search takes less time than drum latency and is not a limiting factor. Channel bandwidth to central is 40 megabits and is not as much of a limit as the SBU bandwidth. The total input/output bandwidth in STAR memory is 3,200 megabits, and a 400 megabit data path is available in the future for any new high speed paging devices.

CONCLUSION

As with other STAR stations,3 it is believed a successful attempt has been made to identify part of the general purpose computing function, in this case the paging function, and separate it from the central processor to operate as an independent, asynchronous, self-optimizing unit. Indications are that heavily timeshared, large capacity, high speed virtual memory computers will require paging rates of the order of 1000 pages per second, and it appears an upgrade of the paging station with faster drums and twin SBU's will meet this goal.

ACKNOWLEDGMENTS

This work was performed in the Advanced Design Laboratory of Control Data and thanks are due to all members of that laboratory but in particular, J. E. Thornton, G. S. Christensen, J. E. Janes, D. J. Humphrey, J. B. Morgan, R. W. Locke, C. L. Berkey, R. A. Sandness, W. D. Marcus and R. R. Fetter.

REFERENCES

1. Curtis, R. L., "Management of High Speed Memory on the STAR-100 Computer," IEEE International Computer Conference Digest, Boston, 1971.
2. Jones, P.
D., "Implicit Storage Management in the Control Data STAR-100," IEEE Compeon 72 Digest, 1972. 3. Christensen, G. S., Jones, P. D., "The Control Data STAR-IOO File Storage Station," Fall Joint Computer Conference Proceedings, Vol. 41, 1972. The linguistic string parser* by R. GRISHMAN, N. SAGER, C. RAZE, and B. BOOKCHIN J.A.lew York UniveiSity New York, New York The linguistic string parser is a system for the syntactic analysis of English scientific text. This system, now in its third version, has been developed over the past 8 years by the Linguistic String Project of New York University. The structure of the system can be traced to an algorithm for natural language parsing described in 1960. 1 This algorithm was designed to overcome certain limitations of the first parsing program for English, which ran on the UNIVAC 1 at the University of Pennsylvania in 1959. 2 The UNIVAC program obtained one "preferred" grammatical reading for each sentence; the parsing program and the grammar were not separate components in the overall system. The 1960 algorithm obtained all valid parses of a sentence; it was syntax-driven by a grammar consisting of elementary linguistic strings and restrictions on the strings (described below). Successive implementations were made in 1965,3 in 1967,4 and in 1971. 5 The system contains the largest-coverage grammar of English among implemented natural language p.lrsers. Implementation of a large grammar by several people over a period of years raises the same problems of complexity and scale which affect large programming projects. The main thrust in the development of the current version of the parser has been to use modern programming techniques, ranging from higher-level languages and subroutine structures to syntax -directed translation and non-deterministic programming, in order to structure and simplify the task of the grammar writer. In this paper we shall briefly review the linguistic basis of the parser and describe the principal features of the current implementation. We shall then consider one particularly thorny problem of computational linguistics, that of conjunctions, and indicate how various features of the parser have simplified our approach to the problem. Readers are referred to an earlier report 6 for descriptions of unusual aspects of the parser incorporated into earlier versions of the system. Our approach to the recognition of the structure of natural language sentences is based on linguistic string theory. This theory sets forth, in terms of particular syn- tactic categories (noun, tensed verb, etc.) a set of elementary strings and rules for combining the elementary strings to form sentence strings. The simplest sentences consist of just one elementary string, called a center string. Examples of center strings are noun tensed-verb, such as "Tapes stretch." and noun tensed-verb noun, such as "Users cause problems." Any sentence string may be made into a more complicated sentence string by inserting an adjunct string to the left or right of an element of some elementary string of the sentence. For example, "Programmers at our installation write useless code." is built up by adjoining "at our installation" to the right of "programmers" and "useless" to the left of "code" in the center string "programmers write code." Sentences may also be augmented by the insertion of a conjunct string, such as "and debug" in "Programmers at our installation write and debug useless code." 
Finally, string theory allows an element of a string to be replaced by a replacement string. One example of this is the replacement of noun by what noun tensed-verb to form the sentence "What linguists do is baffling." The status of string analysis in linguistic theory, its empirical basis and its relation to constituent analysis on the one hand and transformational analysis on the other, have been discussed by Harris.7 More recently, Joshi and his coworkers have developed a formal system of grammars, called string adjunct grammars, which show formally the relation between the linguistic string structure and the transformational structure of sentences.8

The string parser adds to linguistic string theory a computational form for the basic relations of string grammar. In terms of these relations the arguments of grammatical constraints (i.e., mutually constrained sentence words) can always be identified in the sentence regardless of the distance or the complexity of the relation which the words have to each other in that sentence.9 Each word of the language is assigned one or more word categories on the basis of its grammatical properties. For example, "stretches" would be classed as a tensed verb and a noun, while "tape" would be assigned the three categories tensed verb, untensed verb, and noun. Every sequence of words is thereby associated with one or more sequences of word categories. Linguistic string theory claims that each sentence of the language has at least one sequence of word categories which is a sentence string, i.e., which can be built up from a center string by adjunction, conjunction, and replacement. However, not every combination of words drawn from the appropriate categories and inserted into a sentence string forms a valid sentence. Sometimes only words with related grammatical properties are acceptable in the same string, or in adjoined strings. For example, one of the sequences of word categories associated with "Tape stretch." is noun tensed-verb, which is a sentence string; this sentence is ungrammatical, however, because a singular noun has been combined with a plural tensed-verb. To record these properties, we add the subcategory (or attribute) singular to the category noun in the definition of "tape" and the subcategory plural to the category tensed-verb in the definition of "stretch." We then incorporate into the grammar a restriction on the center string noun tensed-verb, to check for number agreement between noun and verb.

The number of such restrictions required for a grammar of English is quite large. (The current grammar has about 250 restrictions.) However, the structural relationship between the elements being compared by a restriction is almost always one of a few standard types. Either the restriction operates between two elements of an elementary string, or between an element of an elementary string and an element of an adjunct string adjoining the first string or a replacement string inserted into the first string, or (less often) between elements of two adjunct strings adjoined to the same elementary string.
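A toy rendering of the dictionary and of this number-agreement restriction may make the mechanism concrete. The Python sketch below is an added illustration, not the Linguistic String Project implementation; the categories of "tape" and "stretches" are taken from the text, while the entry for "tapes" and the number subcategories not mentioned in the text are plausible assumptions.

```python
# Toy word dictionary (categories mapped to sets of subcategories) and a
# number-agreement restriction on the center string "noun tensed-verb".

DICTIONARY = {
    "tape":      {"tensed-verb": {"plural"}, "untensed-verb": set(), "noun": {"singular"}},
    "tapes":     {"noun": {"plural"}},                        # assumed entry
    "stretch":   {"tensed-verb": {"plural"}, "untensed-verb": set()},
    "stretches": {"tensed-verb": {"singular"}, "noun": {"plural"}},
}

def number_agreement(noun_word, verb_word):
    """Restriction: succeed only if the noun and the tensed verb share a
    number subcategory (singular or plural)."""
    noun_numbers = DICTIONARY[noun_word].get("noun", set())
    verb_numbers = DICTIONARY[verb_word].get("tensed-verb", set())
    return bool(noun_numbers & verb_numbers)

assert not number_agreement("tape", "stretch")     # "Tape stretch."   rejected
assert number_agreement("tapes", "stretch")        # "Tapes stretch."  accepted
assert number_agreement("tape", "stretches")       # "Tape stretches." accepted
```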
This property (that the elements related by a restriction stand in one of a few standard structural configurations) is an important benefit of the use of linguistic string analysis; it simplifies the design of the restrictions and plays an important role in the organization of the grammar, as will be described later.

IMPLEMENTATION

As the preceding discussion indicates, the string grammar has three components: (1) a set of elementary strings together with rules for combining them to form sentence strings, (2) a set of restrictions on those strings, and (3) a word dictionary, listing the categories and subcategories of each word. Component 1 defines a context-free language and, for purposes of parsing, we have chosen to rewrite it as a BNF grammar. The approximately 200 BNF definitions in our grammar can be divided into three groups. About 100 of these are single-option string definitions; each of these corresponds to one (or occasionally several) strings. For example, ASSERTION ::= ...

adjunct set definitions: these definitions group sets of strings which may adjoin particular elements. The remaining definitions, including those for SUBJECT, VERB, and OBJECT, are collections of positional variants; these define the possible values of the elements of string definitions.

Once component 1 has been rewritten in this way, it is possible to use any context-free parser as the core of the analysis algorithm. We have employed a top-down serial parser with automatic backup which builds a parse tree of a sentence being analyzed and, if the sentence is ambiguous, generates the different parse trees sequentially. The parse tree for a very simple sentence is shown in Figure 1. A few things are worth noting about this parse tree. Most striking is the unusual appearance of the parse tree, as if it had grown up under a steady west wind. We have adopted the convention of connecting the first daughter node to its parent by a vertical line, and connecting the other daughter nodes to the first by a horizontal line. This is really the natural convention for a string grammar, since it emphasizes the connection between the elements of a string definition. More interesting is the regular appearance of "LXR" definitions: a ... below the subject, a <LTVR> below the verb, and a ... below the object. Each LXR has three elements: one for left adjuncts, one for right adjuncts, and one in the middle for the core word. The core of an element of a definition is the word category corresponding to this element in the associated elementary string in the sentence; e.g., the core of SUBJECT (and of LXR) in Figure 1 is the noun "trees"; it is the one terminal node below the element in question which is not an adjunct. In some cases the core of an element is itself a string. LXR definitions and linguistic string definitions play a distinguished role in conjoining, as will be described later.

Each restriction in the grammar is translated into a sequence of operations to move about the parse tree and test various properties, including the subcategories of words attached to the parse tree. When a portion of the parse tree containing some restrictions has been completed, the parser invokes a "restriction interpreter" to execute those restrictions. If the restriction interpreter returns a success signal, the parser proceeds as if nothing