Sort_Merge_Vers_4_1_Users_Guide_60482900A_May79 Sort Merge Vers 4 1 Users Guide 60482900A May79
Page Count: 56
60482900

CONTROL DATA CORPORATION

SORT/MERGE VERSIONS 4 AND 1 USER'S GUIDE

CDC® OPERATING SYSTEMS: NOS 1, NOS/BE 1, SCOPE 2

REVISION RECORD

  Revision        Description
  A (05-15-79)    Original release.

REVISION LETTERS I, O, Q, AND X ARE NOT USED

© COPYRIGHT CONTROL DATA CORPORATION 1979. All Rights Reserved. Printed in the United States of America.

Address comments concerning this manual to: CONTROL DATA CORPORATION, Publications and Graphics Division, P. O. BOX 3492, SUNNYVALE, CALIFORNIA 94088-3492, or use Comment Sheet in the back of this manual.

LIST OF EFFECTIVE PAGES

New features, as well as changes, deletions, and additions to information in this manual are indicated by bars in the margins or by a dot near the page number if the entire page is affected. A bar by the page number indicates pagination rather than content has changed.

  Pages: Cover, Inside Cover, Title Page, ii, iii/iv, v thru viii, 1-1 thru 1-4, 2-1 thru 2-4, 3-1 thru 3-4, 4-1 thru 4-10, 5-1 thru 5-7, 6-1 thru 6-3, A-1 thru A-4, B-1, B-2, C-1 thru C-3, Index-1, Comment Sheet, Mailer, Back Cover
  Revision: A

PREFACE

This user's guide provides an introduction to the high-speed record processing facilities of Sort/Merge. It is intended for students and others unfamiliar with Control Data's Sort/Merge.
Sort/Merge is available under the following operating systems:

  Sort/Merge Version 4 operates under NOS 1 for the CONTROL DATA® CYBER 170 Series, CYBER 70 Models 71, 72, 73, 74, and 6000 Series Computer Systems

  Sort/Merge Version 4 operates under NOS/BE 1 for the CDC® CYBER 170 Series, CYBER 70 Models 71, 72, 73, 74, and 6000 Series Computer Systems

  Sort/Merge Version 1 operates under SCOPE 2.1 for the CONTROL DATA CYBER 170 Model 176, CYBER 70 Model 76 and 7600 Computer Systems

This user's guide describes both Sort/Merge version 4 and version 1 with primary emphasis on the description of Sort/Merge 4 and the NOS operating system. Where Sort/Merge 1 differs from Sort/Merge 4, a reference is made to the Sort/Merge reference manual. The differences in specification of NOS/BE control statements are covered in appendix C.

If you are not an experienced programmer, you need not read section 5. The ability to write owncode routines, whether in COMPASS or FORTRAN Extended, is not required in order to use Sort/Merge.

Those readers who wish to find precise definitions of the various facets of Sort/Merge should refer to the Sort/Merge reference manual. This user's guide is not intended to precisely define the specific attributes of Sort/Merge, but rather to provide an introduction to its use and its application to problem solution.

There should be no conflicts between this user's guide and other CDC publications. However, you should note that this user's guide presents only part of the total overview presented in the reference manuals. If you follow the examples in this publication, you should be able to create and run simple sort programs. You will also be better prepared to use the information supplied in the reference manual.

If you are not familiar with your operating system, you should consider reading the applicable user's guides and reference manuals listed below.
  Publication                                                                          Publication Number

  Sort/Merge Versions 4 and 1 Reference Manual                                         60497500
  NOS Version 1 Reference Manual Volume 1                                              60435400
  NOS Version 1 Reference Manual Volume 2                                              60445300
  NOS Version 1 Applications Programmer's Instant Manual                               60436000
  NOS Version 1 Batch User's Guide                                                     60436300
  NOS Version 1 Terminal User's Instant Manual                                         60435800
  NOS Version 1 Time-Sharing User's Guide                                              60436400
  NOS Version 1 Time-Sharing User's Reference Manual                                   60435500
  NOS/BE Version 1 Reference Manual                                                    60493800
  NOS/BE Version 1 User's Guide                                                        60494000
  COBOL Version 5 Reference Manual                                                     60497100
  COBOL Version 5 User's Guide                                                         60497200
  COMPASS Version 3 Reference Manual                                                   60492600
  CDC CYBER Record Manager Advanced Access Methods Version 2 Reference Manual          60499300
  CDC CYBER Record Manager Advanced Access Methods Version 2 User's Guide              60499400
  CDC CYBER Record Manager Basic Access Methods Version 1.5 Reference Manual           60495700
  CYBER Record Manager Basic Access Methods Version 1 User's Guide                     60495800
  FORTRAN Extended Version 4 Reference Manual                                          60497800
  FORTRAN Extended Version 4 User's Guide                                              60499700
  CYBER Common Utilities Reference Manual                                              60495600
  FORM Version 1 Reference Manual                                                      60496200
  8-Bit Subroutines Version 1 Reference Manual                                         60495500

CDC manuals can be ordered from Control Data Corporation Literature and Distribution Services, 308 North Dale Street, St. Paul, Minnesota 55103.

CONTENTS

1. INTRODUCTION
     Computer Sorting
     Purpose of Sort/Merge
     Sort/Merge
     Merging
     Sort/Merge and the Operating System
     CYBER Record Manager
     Equipment Used for Data Entry
     Storage of Data
     Manipulation of Data
     Accuracy of Data

2. INPUT PREPARATION
     How It All Started
     Record Design
     Expanding Input
     Major and Minor Sort Keys
     Variable Length Records
     Records and Files
     How Sort Works
     Sorted Files
     What Fields Will I Sort On?
     Character Sets
     Display Code
     ASCII Code

3. SORTING CONCEPTS
     Sort Key Description
     Types of Data to be Sorted
     Logical Key
     Integer Key
     Display
     Float
     INTBCD
     Collating Sequence
     Selecting a Collating Sequence
     Importance of Blanks
     Alternate Specification of Key Types
     Sort Order
     Using Merge
     Merging During a Sort
     Merge Order

4. CONTROL STATEMENT SORTS
     SORTMRG Statement
     FILE Statement
     Sort/Merge Directives
     SORT
     MERGE
     FIELD
     BYTESIZE
     KEY
     SEQUENCE
     OPTIONS
     Order
     Dumps
     Optimization
     EQUATE
     OWNCODE
     FILE
     TAPE
     Job Examples
     Combining Dissimilar Files Using FORM

5. OWNCODE
     COMPASS Owncode
     OWNCODE Exits 1 through 4
     Example
     OWNCODE Exit 5
     OWNCODE Exit 6
     How OWNCODE Works
     FORTRAN Calls
     Unique Uses of OWNCODE
     Record Compaction

6. RUNNING SORT/MERGE
     Time-Saving Design
     COBOL and Sort/Merge
     FORTRAN Calls and Sort/Merge
     Checkpoint/Restart
     Tape Sorting
     Tag Sort
     Summary

APPENDIXES
     A. Character Sets
     B. Glossary
     C. Running Sort/Merge Under the NOS/BE Operating System

INDEX

FIGURES
     1-1   Simple Two-Way Tape Merge
     4-1   Record Format
     4-2   Input Records
     4-3   Sort Directives
     4-4   NOS Control Statements
     4-5   Sort Output by Name
     4-6   Creating a NOS Permanent File
     4-7   Sort by Department, Name, and Salary
     4-8   User Sequence Sort by Department, Salary, and Age
     4-9   Sort for Seniority List
     4-10  Sort Using INTBCD Collating Sequence
     4-11  Reformatting Records Using FORM
     5-1   COMPASS Owncode Example to Convert Leading Blanks to Zeros in Signed Numeric Data

TABLES
     3-1   Sign Overpunch Codes

INTRODUCTION

Sorting information is part of our everyday lives. Sorting is the process of arranging information into a predefined sequence so as to enhance its value. Sorted information is easier to search.
Imagine how useful a telephone directory or a dictionary would be if the information were not sorted in alphabetical order. Or imagine trying to make use of all of the raw data collected during a nationwide census. The tremendous volume of raw data collected represents little usable information before it is sorted, totaled, and compiled into meaningful statistics. Before the introduction of data processing equipment, the time required to complete the tabulation of these statistics was awesome. One of the first card sorting machines was devised by Herman Hollerith to help solve this problem. At the time Mr. Hollerith worked for the Census Bureau, cards containing all statistics were handwritten, hand sorted into various categories and counted, resorted into other categories and counted again, until all categories were compiled. Hollerith's basic change was to use a hand punch to punch the information into 240 separate areas of a standardized card. Each area had a specific meaning, such as age group, sex, and so on. His card reading machine had forty dial counters. Whenever a hole was encountered in the card in a specific area, the dial wired to that hole would be incremented by 1. An entire card could be read in only six passes. A card box with 26 separate compartments was attached to the card reader. Depending on which connections were made, one of the lids would open automatically to allow the reader operator to drop the card in and then close the lid. Approximately 100 of his machines were used to tabulate the 1890 U.S. census. These machines are considered the first of the data processing machines. Though slow and tedious, they reduced the 1890 census tabulation effort from an anticipated 7 years to less than 3 years. Because these machines were hand-fed, they reached average speeds of up to 20 cards per minute. As the years passed, the card sorting machine was improved by adding an automatic card feed. 
Later improvements added chutes, gates, and multiple pockets which received the cards. The importance of the card and its position within the stack grew in relation to the importance of the data contained in the card. The basic concept of sorting cards changed to ordering all of the cards based on the content of a single card column as opposed to the individual value of each punch as used previously. By defining a field, for example, as a numeric amount contained in card columns 65 through 70, if all cards were sorted on column 70 in the first pass, column 69 in the second pass, and so on, all of the cards would be in correct numerical sequence after column 65 had been sorted. Whether the amount field would be in ascending or descending numeric order depended on the order in which the card sorter operator had stacked them after each pass. Considering that card sorters processed about 300 cards per minute by 1930, and that a sort on a 6-column field required 6 passes, the time required to sort a box of 2000 cards was about one hour.

Later improvements to the electromechanical card sorting devices allowed them to reach speeds in excess of 1000 cards per minute, but still required that each column of a field be sorted. And only after the cards were sorted could the information they contained be totaled and printed. The amount of hand labor required to sort and print a few boxes of cards was staggering by today's standards, yet the card sorter was considered a labor-saving device at the time.

The computers of the early 1950s could store the information from cards on magnetic tape, sort this information into sequences, merge these sorted sequences, and write the completely sorted information to tape for subsequent tabulation and printing. Manual labor was no longer needed to handle card decks for more than the initial input pass.
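The column-by-column card sort described above, with the rightmost column of the field sorted first, is the technique now generally known as a least-significant-digit radix sort. The following Python sketch only illustrates the idea; the function names `card_sort` and `make_card`, the 80-column card images, and the sample amounts are assumptions made for this example and do not come from any card sorter or from Sort/Merge.

```python
def card_sort(cards, first_col, last_col):
    """Sort card images on a numeric field, one column per pass, rightmost column first."""
    deck = list(cards)
    for col in range(last_col, first_col - 1, -1):    # column 70, then 69, ... then 65
        pockets = [[] for _ in range(10)]             # one pocket per digit 0 through 9
        for card in deck:
            pockets[int(card[col - 1])].append(card)  # card columns are numbered from 1
        # Restack pocket 0 first; each pass preserves the order left by earlier passes.
        deck = [card for pocket in pockets for card in pocket]
    return deck


def make_card(amount):
    """Hypothetical 80-column card image with a 6-digit amount in columns 65 through 70."""
    return " " * 64 + f"{amount:06d}" + " " * 10


amounts = [1200, 37, 999999, 450, 37, 120000]
deck = card_sort([make_card(a) for a in amounts], 65, 70)
sorted_amounts = [int(card[64:70]) for card in deck]
print(sorted_amounts)
```

Restacking the pockets in the opposite order (pocket 9 first) after every pass would leave the deck in descending order, which corresponds to the operator's stacking choice mentioned in the text.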
Sorting information remains one of the major uses of computers in business applications, such as credit card processing. More importance is now attached to the information the card contains than the card itself, yet in numerous applications the cards themselves are still physically sorted and returned to the customer. The electromechanical card sorter still plays a role, though a diminishing one, in present day operations. It is now cheaper to read cards, sort the information, and punch new cards rather than physically sort the input card file.

COMPUTER SORTING

The use of computers for sorting information has changed the concepts originally applied to sorting. A record is no longer considered limited to the information that can be contained in a single card. A sort run is no longer measured in terms of how many boxes of cards can be sorted in one hour.

Much work has been done in the last 25 years to improve computer sorting techniques. Many books discuss the various techniques and their applications, and yet the use of computers has not altered sorting procedures nearly as much as it has emphasized the need for speed and the ability to handle a very large number of records in one sort. Computer sorting can still be compared, in concept, to the sorting of playing cards. The following example illustrates these concepts.

If a person is given one complete deck of playing cards and asked to put them in order, the procedure is a simple one. Most people will make an initial distribution by suit, creating four files of equal size. After that, each file can be sorted by holding the 13 cards of a suit in one hand while the cards are shifted about and placed in order. As soon as each of the four suits has been ordered, the four are stacked together and the job is completed. In sort terminology, this was accomplished by the following basic sorting methods: a distribution, an internal sort, and then a final merge of the four sorted files.
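The three steps named above (distribution, internal sort, final merge) can be sketched in a few lines of Python. This is an illustration of the playing card example only, not a description of how Sort/Merge is implemented; the suit order, the numeric ranks, and the function name `sort_deck` are assumptions chosen for the example.

```python
import heapq
import random

SUITS = ["CLUBS", "DIAMONDS", "HEARTS", "SPADES"]  # assumed suit order
RANKS = list(range(1, 14))                         # ace = 1 through king = 13


def sort_deck(deck):
    # Distribution: deal the deck into four files, one per suit.
    files = {suit: [] for suit in SUITS}
    for suit, rank in deck:
        files[suit].append((suit, rank))

    # Internal sort: each 13-card file is small enough to order "in one hand".
    for suit in SUITS:
        files[suit].sort(key=lambda card: card[1])

    # Final merge: combine the four sorted files into one ordered deck.
    def order(card):
        return (SUITS.index(card[0]), card[1])

    return list(heapq.merge(*(files[suit] for suit in SUITS), key=order))


deck = [(suit, rank) for suit in SUITS for rank in RANKS]
random.seed(1979)
random.shuffle(deck)
ordered = sort_deck(deck)
```

Because the suit is the major key, the merge here amounts to stacking the four files one after another. A merge on a key that is spread across all four files would interleave them instead.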
    A00070005257526MM
    A00071001117624MS
    A00062501157719MS
    A00079512237428MS
    A00070009117623MS
    A00068402017625MS
    A00087504047435MD
    A00062503157821MS
    A00163509016836FD
    A00062506017821MS
    A00072308157632MM
    A00061011227719FS
    A00071001107731MS
    A00067003077622MM
    A00092112157338MM
    A00087503017632MS
    A00061005037822FS
    A00071007017724MM
    A00062502217820MS
    A00062504017818MS
    A00081009187348MM
    A00075011227527FS
    A00210812076846MM
    A00062510117722FS
    A00172108167239MD
    A00079003157631MS
    B00073802297634MM
    B00075811037427FS
    B00091104017441MM
    C00063504217721FS
    C00147204036928FM
    C00075006307629MD
    D00068508277624FM
    D00058511277723FS
    E00197508307436MD
    E00107501037335MS
    E00141010207232MS
    E00066507097723MS
    E00113502157528FS
    E00091012057336MD

Figure 4-7. Sort by Department, Name, and Salary

the first on the year and the second on the month and day together as a four-digit field. Dates specified in the form mmddyy require specification of at least two fields, yy and mmdd. If the dates entered into the file were formatted as yymmdd, only one sort key field would be needed.

You might wish to note that another example, illustrated in figure 4-10, shows that specifying the FIELD as DISPLAY code and the KEY as INTBCD results in no significant change to the alphabetic order of employee names. This is because the alphabet runs in the same order in both character sets and collating sequences. Such specification on a sort key field which contained letters, digits, and special characters would emphasize the differences between the DISPLAY and INTBCD collating sequences. The collating sequences are given in appendix A.

    SORT
    FILE,INPUT=NEW,OUTPUT=NEW
    FIELD,DEPT(26,1,DISPLAY),SALARY(27,6,
    ,DISPLAY),AGE(39,2,DISPLAY)
    KEY,DEPT(A,OWN),SALARY(D,DISPLAY),
    ,AGE(D,DISPLAY)
    SEQUENCE,OWN(E,C,D,B,A)
    END

    CARLSON,JACK             E00197508307436MD
    JOHNSON,ARMAND           E00141010207232MS
    SMITH,MARGERY            E00113502157528FS
    DAVIS,ROBERT             E00107501037335MS
    WILSON,DOUGLAS           E00091012057336MD
    POPOV,IVAN               E00066507097723MS
    JOHNSON,ANNABELLE        C00147204036928FM
    PETROV,GEORGE            C00075006307629MD
    GOMEZ,LINDA              C00063504217721FS
    JONES,CHERYL             D00068508277624FM
    WANG,LISA                D00058511277723FS
    MULDER,HENK              B00091104017441MM
    JONES,FRANCES            B00075811037427FS
    GARCIA,ARTHUR            B00073802297634MM
    SOKOL,DONALD             A00210812076846MM
    WILLIAMS,BENEDICT        A00172108167239MD
    DURAND,HELEN             A00163509016836FD
    MARTIN,RICHARD           A00092112157338MM
    COHEN,JOSEPH             A00087504047435MD
    MEYER,WILLIAM            A00087503017632MS
    SMITH,JOHN               A00081009187348MM
    BOER,GEORGE              A00079512237428MS
    WILLIAMS,ROBERT          A00079003157631MS
    SMITH,ROBERTA            A00075011227527FS
    IVANOV,LEONARD           A00072308157632MM
    LI,WANG                  A00071001107731MS
    NEWMAN,ANDREW            A00071007017724MM
    BAKKER,JOACHIM           A00071001117624MS
    ANDERSON,TIMOTHY         A00070005257526MM
    BROWN,JAMES              A00070009117623MS
    CHANG,ROBERT             A00068402017625MS
    LOPEZ,COSME              A00067003077622MM
    TAYLOR,JENNIFER          A00062510117722FS
    DUBOIS,ANDRE             A00062503157821MS
    FISCHER,DAVID            A00062506017821MS
    PETIT,ARNOLD             A00062502217820MS
    BERNARD,JOHN             A00062501157719MS
    SCHULZ,CHARLES           A00062504017818MS
    MILLER,FLORENCE          A00061005037822FS
    KIM,LEE                  A00061011227719FS

Figure 4-8. User Sequence Sort by Department, Salary, and Age

    SORT
    FILE,INPUT=NEW,OUTPUT=NEW
    FIELD,START1(37,2,DISPLAY),
    ,START2(33,2,DISPLAY),
    ,START3(35,2,DISPLAY),
    ,AGE(39,2,DISPLAY)
    KEY,START1(D,DISPLAY),
    ,START2(D,DISPLAY),
    ,START3(D,DISPLAY),
    ,AGE(A,DISPLAY)
    END

    FISCHER,DAVID            A00062506017821MS
    MILLER,FLORENCE          A00061005037822FS
    SCHULZ,CHARLES           A00062504017818MS
    DUBOIS,ANDRE             A00062503157821MS
    PETIT,ARNOLD             A00062502217820MS
    WANG,LISA                D00058511277723FS
    KIM,LEE                  A00061011227719FS
    TAYLOR,JENNIFER          A00062510117722FS
    POPOV,IVAN               E00066507097723MS
    NEWMAN,ANDREW            A00071007017724MM
    GOMEZ,LINDA              C00063504217721FS
    BERNARD,JOHN             A00062501157719MS
    LI,WANG                  A00071001107731MS
    BROWN,JAMES              A00070009117623MS
    JONES,CHERYL             D00068508277624FM
    IVANOV,LEONARD           A00072308157632MM
    PETROV,GEORGE            C00075006307629MD
    WILLIAMS,ROBERT          A00079003157631MS
    LOPEZ,COSME              A00067003077622MM
    MEYER,WILLIAM            A00087503017632MS
    GARCIA,ARTHUR            B00073802297634MM
    CHANG,ROBERT             A00068402017625MS
    BAKKER,JOACHIM           A00071001117624MS
    SMITH,ROBERTA            A00075011227527FS
    ANDERSON,TIMOTHY         A00070005257526MM
    SMITH,MARGERY            E00113502157528FS
    BOER,GEORGE              A00079512237428MS
    JONES,FRANCES            B00075811037427FS
    CARLSON,JACK             E00197508307436MD
    COHEN,JOSEPH             A00087504047435MD
    MULDER,HENK              B00091104017441MM
    MARTIN,RICHARD           A00092112157338MM
    WILSON,DOUGLAS           E00091012057336MD
    SMITH,JOHN               A00081009187348MM
    DAVIS,ROBERT             E00107501037335MS
    JOHNSON,ARMAND           E00141010207232MS
    WILLIAMS,BENEDICT        A00172108167239MD
    JOHNSON,ANNABELLE        C00147204036928FM
    SOKOL,DONALD             A00210812076846MM
    DURAND,HELEN             A00163509016836FD

Figure 4-9. Sort for Seniority List

COMBINING DISSIMILAR FILES USING FORM

One basic premise of sort files is that the key fields be identical in length and starting position. A sort can only be specified on these parameters. In order to combine files, it is necessary that at least one record field in each file match.
If for example, you wish to combine two large sorted files, such as university student grade files, and different fields have been used for similar information, you can use FORM to modify the records in one of the files to match the records in the other file. The FORM reference manual gives an example of such use. 4-8 In the following example, figure 4-11, the output file which has been sorted on the name field only will be reformatted using FORM to illustrate the report formatting capabilities of FORM. The FORM directives used are shown with the formatted output. FORM can be used to expand the records by placing a row of blanks between the key fields as shown here, or it can be used to change the layout of the keyfields within each record so as to match the format of other records if you wish to merge files which are composed of identical key fields but in different formats. For some statistical reports only a small portion of a large record needs to be extracted for sorting in order to 60482900 A 0^^!\ INP(NEW) 1 S3?T 2 F L E , I N P J T = N £ W , OUTPUT= MINE 3 F I E L 0 , N A 1 E ( 1 , 2 3 , D I S P L AY ) 4 K E Y, N A M E I A , I N T B C D ) QLT (FORMAT, 3G0 = X) 5 £>n R E F ( F C R M AT, i < 2 - , X 4 = 2 4 X 3 , N < t - 2 7 N b , $ $ , NE-33K6,S S,X3=39X2,X2-41X1,X2 = 42X1] 0^\ ANOERSON.TUDTHY BAKKER, JOACHIM BERNARD, JOrM 80ER, GEORGE BROWN,JAMES CARLS ON, JAC< CHANG,ROBERT COHEN,JOSEPH OAV IS,ROBERT OUBOIS,ANDRE DURANO.HELEN F I S C H E R , O AV I D GARCIA,ARTHJ* GOMEZ ,LINOA I VA N O V, L E O N A R D JOHNSON,ANNABcLLE JOHNS ON, ARMANO JONES,CHERYL JONES,FRANCES KIM,LEE LI, HANG LOPEZ,COSME MARTIN, RICH4 FILE(OUT♦ 6T*C,RT=Z»FL*80) SORTMRS(OWN) COL LEN Oil • « » « » » » » » » o « » « » » ft « » « OitfNl T W c OtfNl] IDENT EQU EQU OWN IFLT ERR IFLT ERR IFGT ERR COL.1,1 COL MUST BE AT LEAST 1 LEN,1,1 LEN MUST BE AT LEAST 1 LENt10,l LEN MUST 8E LESS THAN 11 OWN1 - OWNCOOE EXIT 1 PROCESSOR CALLING SEQUENCEA2 = ADDRESS OF RECORD XO = 30/NWORDS, 30/NCHARACTERS 
OOESCONVERTS THE FIELD STARTING IN COLUMN OF LENGTH FROM FORTRAN FORMATTED INTEGERS (I.E. LEADING BLANKS, OPTIONAL NEGATIVE SIGN, DIGITS) TO A LOGICAL VALUE THAT PRESERVES NUMERIC ORDER XO « XI = X2 = X3 » X4 = X5 = X6 = X7 A2 = B5 = B6 = B7 = 30/NWORDS, 30/NCHARACTERS 77777777777777777700B RECORD TO RIGHT OF FIELO SIGN (ALL ZEROS MEANS ♦» ALL ONES MEANS -) WORD WITH CURRENT CHARACTER CURRENT CHARACTER CURRENT BINARY VALUE CURRENT BINARY DIGIT ADDRESS OF FIRST WORD OF RECORD CURRENT CHARACTER SCRATCH NUMBER OF CHARACTERS LEFT IN FIELD ENTRY OWN I BSS SET SET SET SA4 SAS MX3 BX4 BXS BX4 LX4 MX2 BX2 MXl BX6 MX3 S87 COL-1 T/10 COL-1-10<» W A2*V A2*W*1 6»C -X3»X4 X3»X5 X4*<5. 6*C 6»LEN -X2»X4 -6 XO-XO ZR LX4 BXS SB5 SB7 SB6 EQ SB6 NE MX3 B7.0WN1* LEN -X1*X4 X5 B7-1 1R B5,*6,0WN11 1R85**6,OWN13 60 COL = 1 2 ... 10 11 12 ... 0 1 . . . 9 1 0 11 . . . 0 0 ... 0 1 1 ... 0 1 ... 9 0 1 ... GET WORD WITH FIRST CHARACTER GET NEXT WORD ALIGN MASK AT START OF FIELD EXTRACT (FIRST PART OF) FIELD EXTRACT SECOND PART OF FIELD (MAYBE) COMBINE LEFT-JUSTIFY FIELD SAVE RECORD TO RIGHT OF FIELD 77777777777777777700B CLEAR RESULT REGISTER GUESS THAT SIGN IS POSITIVE SET NUMBER OF CHARACTERS LEFT IF NO MORE CHARACTERS IN FIELD RIGHT-JUSTIFY NEXT CHARACTER EXTRACT NEXT CHARACTER DECREMENT COUNT OF CHARACTERS LEFT IF BLANK, LOOP IF NOT -, SKIP NOTE NEGATIVE SIGN Figure 5-1. COMPASS Owncode Example to Convert Leading Blanks to Zeros in Signed Numeric Data (Sheet 1 of 3) 5-2 60482900 A 0WN12 0WN13 OWN 14 ERROR /$f?^^. 
«» ZR LX4 BX5 SB5 SB7 SB6 LT SB6 GT SX7 1X6 SX7 1X6 EQ BSS BX6 LX6 MX3 BX6 MX3 BX6 8X6 LX6 SA4 MX3 BX4 BX2 BX7 SA7 SAS BXS BX2 BX7 SA7 EQ SAl LXi SB7 JP EJECT 0WN3 - B7.0WN14 -X1*X4 XS B7-1 IF END OF FIELD, JUMP RIGHT-JUSTIFY NEXT CHARACTER EXTRACT NEXT CHARACTER DECREMENT COUNT OF CHARACTERS LEFT IRQ B5»86»ERROR IF CHARACTER LESS THAN +0+, ERROR 1R9 85,B6,ERROR IF CHARACTER GREATER THAN +9+, ERROR 10 X6«X7 85-IR0 X6*X7 OWN 12 X3-X6 60-6»LEN X3-X6 6»L£N X3«X6 X6*x2 -6*C A2*4 6*C X3»<4 -X3»X6 X4*X2 A<* A2*W*1 -X3«X5 X3*X6 X2*X5 RESULT * 10»RESULT ♦ DIGIT GO TRY FOR NEXT CHARACTER (HERE AT END OF FIELD) APPLY SIGN TO RESULT LEFT-JUSTIFY RESULT TOGGLE SIGN BIT STRIP EXCESS ONES (IF NEGATIVE) APPEND RECORD TO IMMEDIATE RIGHT OF FIELD ALIGN FIELD GET WORD FOR FIRST CHARACTER MASK ALIGNED AT START OF FIELD DELETE ORIGINAL FIELD INSERT CREATED RESULT COMBINE STORE BACK GET NEXT WORO. JUST IN CASE A5 OWN I OWN I GO BACK WITH RECORD 30 XI B7*l owncode EXIT 3 PROCESSOR • CALLING SEOUENCEo A2 = ADDRESS OF RECORD • XO = 30/NWORDS, 30/NCHARACTERS » DOESi» CONVERTS THE FIELD STARTING IN COLUMN OF LENGTH » FROM THE FORMAT CREATED BY OWN1 TU FORTRAN FORMATTED INTEGERS • ( I . E . LEADING BLANKS, OPTIONAL NEGATIVE SIGN, DIGITS). » » » »' » » • • » o XO = 30/NWORDS, 30/NCHARACTERS XI = 0.10000000001 X2 - RECORf) TO IMMEDIATE RIGHT OF FIELD X3 = SIGN X4 = NUMBER AT START QF LOOP XS = NEXT NUMBER X6 = CHARACTERS X7 = RIGHT-JUSTIFIED CHARACTER A2 = ADDRESS OF FIRST WORD OF RECORD B7 = number of characters left in field ENTRY 0WN3 T w C BSS SET SET SET SA4 SA5 Mx3 BX4 BX5 OWN 3 COL-1 T/10 C O L - l - 1 0 fl A2*W A2*w*l 6*C -X3«X4 X3*X5 GET WORD WITH FIRST CHARACTER GET NEXT WQrtD, JUST In CASE MASK STARTIN6 AT FIELD DELETE BEFORE FIELD DELETE AFTE* FIELD+9 Figure 5-1. 
COMPASS Owncode Example to Convert Leading Blanks to Zeros in Signed Numeric Data (Sheet 2 of 3) 60482900 A 5-3 BX4 LX4 MX2 BX2 MX3 BX3 AX3 LX4 AX4 BX4 SB7 MX6 SAl 0WN31 PXS NX5 FX5 UX5 LX5 1X7 LX5 1X7 AX5 1X7 SX7 BX6 LX6 SB7 BX4 NZ PL SX7 BX6 LXb SB7 0WN32 ZR SX7 BX6 LX6 SB7 EQ 0WN33 BX6 LX6 SA4 MX3 BX4 BX2 BX7 SA7 SAS BXS BX2 BX7 SA7 EQ X4*X5 6»C COMBINE LEFT-JUSTIFY FIELD 6»LEN -X2»X4 SAVE RECORD TO IMMEDIATE RIGHT OF FIELD MASK FOR SIGN BIAS -X4»X3 1 IFF NEGATIVE 59 777777777777777777776 IFF NEGATIVE SHIFT TO EXTENDABLE SIGN 61-6»LEN EXTEND TO RIGHT X4-X3 TAKE ABSOLUTE VALUE OF NUMBER LEN NUMBER OF CHARACTERS LEFT TO SET UP CLEAR RESULT REGISTER =0.10000000001 X4 X5*X1 X5,35 X5,*5 X5 + X5 X5*X7 X4-X7 X7*lR0 X6*<7 -6 87-1 X5 X4,0WN31 X3,QWN32 1R- X6*X7 (E.G.) 123456789.0 (E.G.) 12345678.9012 (E.G.) 12345678 (E.G.) 12345678*2 (E.G.) 12345678*8 (E.G.) 123456780 (E.G.) 12345*78 (E.G.) 9 CONVERT TO DIGIT CHARACTER INSERT INTO RESULT MAKE ROOM FOR NEXT CHARACTER DECREMENT NUMBER OF CHARACTERS LEFT TO SET (E.G.) 12345678 IF MORE DIGITS, LOOP IF POSITIVE* SKIP INSERT NEGATIVE SIGN CHARACTER -6 B7-1 B7,QWN33 1R X6*X7 if field already full* skip insert a blank character -6 B7-1 OWN 32 X6*X2 -6»C A2*rf 6*C X3°X4 -X3»X6 X4*X2 A4 A2*W*1 -X3»X5 X3»X6 X2 + XS GO TRY AGAIN COMBINE LEFT-JUSTIFIED CHARACTERS AND RECORD TO IMMEDIATE RIGHT OF FIELD ALIGN FIELD WITH RECORD GET WORD FOR FIRST CHARACTER MASK ALIGNEO AT START OF FIELD DELETE CREATED FIELD INSERT FORTRAN CHARACTERS COMBINE STORE GET NEXT WORD, JUST IN CASE A5 OWN 3 EXIT END SORT FILES»SORT=INPUT ♦OUT»UT=OUT FIELD,<(3,9,LOGICAL) KEYfK 0w>JCODE,l=OwNl,3 =OWN3 END {Input Records here! Figure 5-1. COMPASS Owncode Example to Convert Leading Blanks to Zeros in Signed Numeric Data (Sheet 3 of 3) 5-4 60482900 A After sorting all of the records on the dates (as changed for this sort run through use of owncode exit 1), an exit 3 table look-up owncode routine could be used to change the format of the output date to July 4, 1776. 
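The table look-up conversion discussed here can be sketched in Python instead of COMPASS. This is only an illustration of the idea behind an exit 1 routine (convert the date to a sortable form) and an exit 3 routine (convert it back for output). It assumes dates stored as a three-letter month, two-digit day, and four-digit year, for example JUL041776; the function names and tables are hypothetical and are not part of Sort/Merge.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
MONTH_NUMBER = {name: f"{i:02d}" for i, name in enumerate(MONTHS, start=1)}
MONTH_NAME = ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"]


def to_sort_format(date):
    """Input-side conversion (the role of exit 1): 'JUL041776' becomes '17760704'."""
    month, day, year = date[0:3], date[3:5], date[5:9]
    return year + MONTH_NUMBER[month] + day


def to_report_format(key):
    """Output-side conversion (the role of exit 3): '17760704' becomes 'July 4, 1776'."""
    year, month, day = key[0:4], int(key[4:6]), int(key[6:8])
    return f"{MONTH_NAME[month - 1]} {day}, {year}"


dates = ["DEC251776", "JUL041776", "JAN011777"]
keys = sorted(to_sort_format(d) for d in dates)   # a plain character sort now works
report = [to_report_format(k) for k in keys]
print(report)
```

Once the year, month, and day are stored most significant part first, a single character key sorts the dates chronologically, which is why the converted form is the one worth keeping in the file.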
Even though this use of owncode appears complicated, it requires far less processing time than a separate subroutine would require to do a similar translation.

In most cases, files containing records modified through use of owncode to simplify or speed the sort process are not modified back to their original state; they are usually left in their easier-to-sort format to avoid duplicate processing each time they are used. To illustrate, the date that was originally stored as JUL041776 was changed to 17760704 for sorting purposes and the file was stored in this format. When this file is updated, any new input should first be sorted using the same sort routine with its owncode table look-up code before merging the new input with the existing file. Output returned to its initial state through use of owncode exit 3 does not change the format of the records stored on the sort input file.

Another variation on this change is even easier to use. When the date is converted from the original format by means of a table look-up procedure and owncode exit 1, the new format is stored as part of the same record. In this manner, you can sort the record based on the easily sorted format and output the more easily recognized format. Your record would change from JUL041776 to JUL0417760704 as a result of the owncode exit 1. Thus, you would sort this field on the last eight characters and output the date as the first nine characters. Variations of this concept allow you to sort any type of information you might define because you can transform any data into a sortable field tag and append the tag to your original record.

Exits 1 and 2 owncode routines are not allowed in a merge-only run to ensure that the input records remain in sorted order. You can achieve the same purpose by using either exits 3 and 4 in a merge-only run or exits 1 and 2 in a sort run with supplementary merge files.
If you specify an exit 1 or 2 owncode routine in a merge-only run, the exit is ignored and a nonfatal diagnostic message is issued.

OWNCODE EXIT 5

One typical use for the owncode routines is to handle duplicate or equal keys. Equal keys can be duplicate keys, or a key comprised of characters treated the same (as the result of an EQUATE directive), or created equal because of a signed numeric overpunch.

When equal sort keys are encountered in two or more records, the method of handling such records often needs to be defined. Just because the sort keys are equal does not mean that the records are identical. In many cases, you will want to be able to control the order of records with equal keys. One of the controls available to you is to output the records containing equal keys in the order they were input. This control is available by specifying RETAIN on the Sort/Merge OPTIONS directive. Owncode is not required in such a case.

Using owncode exit 5 will allow you to stop and compare records with equal keys. If the records are exact duplicates, you might wish to delete one record. In the case where two identical orders were booked in one day from one company, you would most likely wish to flag the records to remind you to check the validity of both orders. Owncode exit 5 allows you to modify or replace either or both records or to retain both records without modification. It is more difficult to delete both equal records, but it can be done by having the owncode exit 5 routine signal an owncode exit 3 routine.

At times equal record keys are considered to indicate an error, such as in a file that should only contain one entry for each customer. In such files, it would be proper to delete any duplicate records. Exit 5 can be used to identify records with equal sort keys. If the record overlap is sufficient to ensure that a record is indeed a duplicate, then deletion of the duplicate record is quite simple.
You merely need to provide the address and length of the record you wish to keep in registers A2 and X0, and modify the return address.

Owncode routines can also be used to write all questionable records to an exception file for separate processing. When this is done, the questionable records are usually deleted from the file being processed. (The file should be closed in the owncode exit 4 routine.) When the exception file is corrected, it can be sorted and merged with the original file.

Another use of duplicate key processing allows you to count records. If, for example, you wanted to determine the number of items of each particular size article sold during a given period, you might specify sufficient sort keys to uniquely identify the articles and sizes you wish to tally, append a count field containing 1, and then use owncode exit 5 to add the counts for each duplicate record key found in the articles sold file. (The record containing the appended count field is retained while the other record is omitted.) The sort run will then list every item sold, in the order specified, with total sales of each item. In cases where only the items that sold more than a certain number are of interest, further owncode output specification could be used to delete all records whose numerical tally is below that level. Thus, in this manner, you could create a base for next year's reorder levels, a list of this season's most popular items, and so forth.

OWNCODE EXIT 6

Owncode exit 6 is used for checking or verifying nonstandard labels on files. Most files will not use nonstandard labels; owncode exit 6 is not needed for these files. If your CYBER Record Manager FILE control statement specifies LT=NS,ULP=NO, you will need to refer to the Basic Access Methods reference manual information on label processing. The use of the GETL and PUTL macros described there requires a knowledge of COMPASS assembly language.
Owncode exit 6 is only available to users of Sort/Merge 4; this capability is not supported for Sort/Merge 1.

HOW OWNCODE WORKS

When you specify an owncode exit in your Sort/Merge program, register A2 contains the address of the current data record and register X0 contains the record length. In addition, during entry into owncode exit 5, registers A3 and X4 are used for the address and length of the second record of a comparison involving equal sort key data.

Transfer from Sort/Merge to the owncode routines is accomplished with a return jump (RJ) instruction which fills the entry point of the owncode routine with a return to the Sort/Merge program. To return to Sort/Merge control (and leave the owncode routine), your code must return to the entry point of the owncode routine. This is the normal return address.

You can request specific processing action by altering the return address in the entry point of the owncode routine. You will usually do this by putting the normal return address in a B register, Bn, and jumping to Bn+1, Bn+2, or Bn+3. (This operation is often described in text but not in code as NR+1, NR+2, or NR+3.) When the function you have chosen is complete, a normal return address to the entry point of the owncode routine causes a jump to Sort/Merge to continue normal processing.

The specific owncode functions available to you are described in detail in the Sort/Merge reference manual. Further details of the COMPASS instructions are presented in the COMPASS reference manual.

FORTRAN CALLS

A set of library routines is provided for calling Sort/Merge from a FORTRAN program. The use of Sort/Merge in conjunction with FORTRAN provides a record and file handling capability often overlooked by the typical programmer using FORTRAN. Calls to Sort/Merge allow the FORTRAN programmer to use the Sort/Merge owncode exits to access records and files more easily than is possible directly from FORTRAN.
Moreover, calls to Sort/Merge owncode allow the FORTRAN programmer to write subroutines in FORTRAN instead of COMPASS, and use variables instead of constants as parameters. All conventions for using FORTRAN statements must be observed when using these calls.

The more commonly used FORTRAN calls are listed below with their corresponding Sort/Merge directives.

    FORTRAN Call        Sort/Merge Directive

    CALL SMSORT         SORT directive
    CALL SMMERGE        MERGE directive
    CALL SMFILE         FILE directive
    CALL SMKEY          KEY directive
    CALL SMSEQ          SEQUENCE directive
    CALL SMEQU          EQUATE directive
    CALL SMOPT          OPTIONS directive
    CALL SMEND          END directive
    CALL SMOWN          OWNCODE directive
    CALL SMRTN          No corresponding directive

As with the Sort/Merge directives, the first call should be to either SMSORT or SMMERGE; the last call must be to SMEND, which initiates processing using the information collected by the other calls. The SMSORT and SMMERGE calls require that you specify the maximum record length, in characters, of the records to be sorted as the first parameter. You can also specify the number of CM words to be used for working storage if you wish, though the default of 22 000 octal words is usually sufficient. The full list of FORTRAN calls and how to use them appears in the FORTRAN Extended reference manual and in the Sort/Merge reference manual.

The use of Sort/Merge owncode complements the capabilities of FORTRAN; their combined use provides both excellent record handling and computing power.

For example, consider a problem where a series of worldwide experiments over a number of years has resulted in the collection of voluminous amounts of data concerning the effects of temperature and pressure on radio waves. Though most of these readings were recorded in metric measurement (Celsius and millimeters of mercury), a great number were recorded in nonmetric values (Fahrenheit and inches of mercury).
One advantage exists in that all records within each file are consistent; all files are either metric or nonmetric. In order to combine all of the data for input to a program written in FORTRAN, the data must first be converted to all metric equivalents and sorted into order based on temperature as the major sort key and pressure as the minor sort key.

Convert the nonmetric information in each nonmetric file to metric equivalents before combining all of the files. The most straightforward method of combining and sorting all of the data is to sort all files into order and then merge them.

The FORTRAN call to Sort/Merge owncode exit 1 is one of the best methods available for solving the problem of conversion. Record manipulation is handled entirely by Sort/Merge. The owncode routine is written in FORTRAN; each temperature and pressure value is converted to metric, and the metric values are sorted and output to an intermediate file in one operation. When all of the files to be converted are in metric, and all of the metric files are sorted, all intermediate files are merged to create the single input file for the processing program. Such operations are not as easily managed if attempted only within FORTRAN.

The following problem illustrates that raw input is of little value before it has been organized into a more usable form. The problem is to determine which zones of a given real estate area are appreciating the fastest, the frequency of sales in each zone, and how much each zone has appreciated since last year. With thousands of real property sales each year spread over hundreds of zones, it becomes a task for the computer to determine such statistics.

To obtain a sorted comparison, all sales within each zone must be totaled and averaged. The averages for this year must be compared to the averages for last year in order to compute average appreciation. If averages are not available for last year, they too must be computed.
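The totaling, averaging, and year-to-year comparison described above can be pictured with a short sketch. It is written in Python purely for illustration and is not Sort/Merge or FORTRAN Extended code; the function names, the (zone, price) record layout, and the choice to skip zones that have no sales last year are all assumptions made for this example.

```python
from collections import defaultdict


def zone_averages(sales):
    """Return {zone: (average price, number of sales)} for (zone, price) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for zone, price in sales:
        totals[zone] += price
        counts[zone] += 1
    return {zone: (totals[zone] / counts[zone], counts[zone]) for zone in totals}


def appreciation_report(last_year_sales, this_year_sales):
    """List (zone, sales this year, percent appreciation), fastest first."""
    previous = zone_averages(last_year_sales)
    current = zone_averages(this_year_sales)
    rows = []
    for zone, (average, count) in current.items():
        if zone not in previous:
            continue  # assumption: no basis for comparison, so the zone is skipped
        old_average = previous[zone][0]
        percent = (average - old_average) / old_average * 100.0
        rows.append((zone, count, percent))
    # Descending order of appreciation, as a sort on the percentage field would give.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

In the manual's setting the final ordering step would be handed to Sort/Merge rather than done in the program itself; the sketch only shows what the records going into that sort might contain.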
Then the difference between this year's average and last year's average can be computed for each zone. The differences are sorted in descending order by zone, along with other desired information, to create a usable report.

One method of creating such a report is to use FORTRAN to create records that contain the desired information, sort these records based on the percentage of appreciation, add report headers through use of owncode, and output a formatted report. The sort specified would probably be made through a FORTRAN call to the sort routine, and the owncode exits 3 and 4 header additions would naturally follow.

UNIQUE USES OF OWNCODE

One unique use of the Sort/Merge owncode routines is to sort text entries for an index or for a glossary. For an index, it is possible to create a row of leading dots the length of the index line, and overlay the initial portion of the line with the entry and the final portion of the line with the page reference to create a typical entry that looks like this:

    Symbol generator ........................ 42

These lines can easily be sorted to create the index.

For a glossary, you might consider a standard entry which starts with the term to be defined, followed by a blank line and then the descriptive text. The final output might look like:

    DIRECTIVES

    Instructions that supplement processing defined by the SORTMRG control statement for execution of Sort/Merge record processing.

To create a glossary that can be sorted, you will need to create each glossary entry as a single continuous record.

Other uses which are often overlooked include preprocessing of input such as stripping away headings in a report file for subsequent sorting, pagination of reports, combination of entries, deletion of duplicate entries, reformatting reports, adding headings, creating footnotes, controlling page depth, blanking certain fields in reports, and so on.

As a final note, if you have part of your information already sorted, this can sometimes be made to work in your favor. For example, when creating an index, page number references will be in page number order in your intermediate file. Thus, on output of multiple page references, you can first check to delete duplicates, and the remaining references will remain in numerical order if you have specified the RETAIN option throughout the run.

RECORD COMPACTION

Many variations of compaction are possible. Using owncode routines or FORM, you can extract the key fields of interest from all the records in a file, and sort only the new records. For example, you might wish to extract only the age and salary fields from a payroll file to create an age/salary profile chart.

You would probably find a number of identical short records. Rather than keep each record separately, it could be worthwhile to keep only one copy of each identical record, with a count field appended to the record to keep track of the number of times an identical record is encountered. In this manner, the identical records can be deleted, thus shortening the total length of the file. Owncode exit 5 makes this process simple because the exit is available on every duplicate key encountered. On creation of the profile chart, the amplitude of the identical entries will have to be expanded by the number of identical occurrences recorded in the appended count field.

It might occur, for example, on a civil service or military payroll profile that the number of unique salaries is quite low but the number of identical records quite high. Thus, the profile input file would be much shorter than the original file. Not only would each record be shorter, but the compacting of identical entries would substantially reduce the number of records in the file.
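The record compaction idea discussed above (extract only the fields of interest, then keep one copy of each identical short record with an appended count) can be sketched as follows. This is a Python illustration only; the record layout and the function name are assumptions, and a dictionary-based counter stands in for the duplicate-key detection that owncode exit 5 provides within Sort/Merge.

```python
from collections import Counter


def compact(records, fields):
    """Reduce each record to the named fields and collapse identical results.

    Returns a sorted list of tuples: the extracted field values followed by
    a count of how many input records produced that combination.
    """
    reduced = Counter(tuple(record[name] for name in fields) for record in records)
    return sorted(values + (count,) for values, count in reduced.items())


# Hypothetical payroll records; only age and salary matter for the profile chart.
payroll = [
    {"name": "ADAMS", "age": 30, "salary": 1000},
    {"name": "BAKER", "age": 30, "salary": 1000},
    {"name": "CLARK", "age": 25, "salary": 900},
]
profile = compact(payroll, ("age", "salary"))
```

When the profile chart is drawn, each compacted entry is weighted by its count field, which is the expansion step the text refers to.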
When glossary entries are sorted in this way, owncode exit 3 is used after the sort to format the entries.

RUNNING SORT/MERGE

Time is one of the main considerations when sorting records. Sorting is generally an extremely time-consuming process. After the desired order of the records has been established and the present condition of the records is known, a procedure should be established to make the sort process as efficient as possible. Establishing an efficient procedure is very important because sorting can require large amounts of computer time; moreover, most sort jobs are repeated on a regular basis. A small saving on each run, therefore, can compound to a significant saving over the course of a year.

To establish an efficient procedure requires a great deal of planning and checking. It includes all of the steps involved in creating the records and files to be sorted, the order of the output desired, the specification of the Sort/Merge directives, and even the time of day the sort is to be run. For example, if the sort is to be run in a multiprogramming environment, it can be beneficial to run the sort program with certain types of jobs which will not result in much competition for the system resources. If jobs compete for resources, time can be lost through the swapping in and out of one or more of the jobs. If tremendously large sort jobs are to be run, you might consider scheduling them for the hours of low system activity, or try to balance the job mix, to avoid such conflict.

If you work with large files, you will be concerned about possible machine malfunction, power failures, and other potential problems. Checkpointing files being sorted is a good idea which can work to your advantage; however, dividing large files into smaller files makes them much more manageable, and is better insurance of successful completion. Not only is sort time reduced, but overhead is also reduced since no time is required for a checkpoint dump nor is a device required to receive the dump. Note that the checkpoint/restart option is not available to tape sort users.

At the point the sort run is initiated, the avenues available to you are limited in respect to time-saving opportunities. There is usually little choice as to the computer to be used, the sort program available, the form of the existing input files, and the desired order of output. The length of time required to sort any given file depends on the characteristics of the records and the characteristics of the computer to be used. Since these are usually fixed, only a few options such as available central memory and the possibility of additional hardware devices can be controlled at this point. Most time-saving methods must be considered before the records and files to be sorted are created.

TIME-SAVING DESIGN

The length of records affects the sort process. Long records require a larger amount of central memory, reducing the number of records that can be stored as a sorted string. Shorter records increase the number, thus requiring fewer passes during the sort phase. The length of the records can only be controlled during the process of designing and creating the records as noted in section 2.

The size of the files to be sorted can also be reduced if they are organized so that only active records are sorted; inactive records can be relegated to another file. File space is valuable so you should be careful that inactive or useless information is either stored on another file, archived, or purged when it is no longer of use. Such a process is known as file maintenance. File maintenance includes keeping the file up to date by adding records as well as deleting or modifying them. It is also important to monitor the use of each file to determine which files are active and which should be reorganized. If the use of one file often requires the use of another file, you should consider consolidating the two files to reduce the overhead associated with handling two files.

Extremely long records require a large amount of time and memory to sort. It might be advantageous to create a tag sort environment where only the key need be sorted, thus reducing the time and central memory requirements. Also, when a long record proves difficult to read, subsequent read attempts cost a great deal of time for a long record, whereas for a short key, an added saving accrues.

Efficient organization of records, excellent file management, working with smaller files, and a proven procedure will make the required sort jobs run faster and ensure optimum performance.

COBOL AND SORT/MERGE

The COBOL SORT verb calls the same Sort/Merge program available under the operating system. Identical input should result in identical output because there is no difference between the sort program available under COBOL and the sort program available under the operating system. The choice of which sort to specify, for COBOL programmers, depends on other factors.

For a small sort job, it probably will not be advantageous to leave COBOL to specify a sort under the operating system. You won't need to store registers, variables, and so forth. However, a COBOL SORT ties up the COBOL field length, which often reduces the amount of central memory available to the sort process. When large files are involved, you might well find an advantage in leaving COBOL to take advantage of the additional field length, which could improve the speed and efficiency of the sort program. This decision depends on the size of the file to be sorted and the size of the COBOL program competing for field length.
One point worthy of note is that a COBOL SORT is often specified because the COBOL programmer is not accustomed to specifying an operating system sort. Actual specification of the COBOL SORT is different from the specification of the operating system sort, though the concepts are the same. For a complete description of the COBOL Sort/Merge facility, see the COBOL reference manual.

As a general rule, a sort which is not extremely large will run just as fast under either COBOL SORT or the operating system SORTMRG control statement. If you have serious doubts whether there will be sufficient field length under COBOL, then by all means consider using the operating system Sort/Merge program. Another effective method of reducing field length conflict when using large files and a large COBOL program is to segment the COBOL program and put the sort into a short segment.

There are no owncode exits available to the COBOL programmer such as those available to the COMPASS and FORTRAN programmers. On the other hand, COBOL programmers can use procedures to achieve most of the same results. COBOL allows the use of INPUT PROCEDURE and OUTPUT PROCEDURE phrases to specify the procedure to be executed under system control either at the time the SORT statement is executed or after the records have been sorted. The INPUT PROCEDURE can include statements to select, create, or modify records before the sort process begins. Thus, the same functions of Sort/Merge owncode are available to the COBOL programmer though their specifications are totally different.

When using the INPUT or OUTPUT PROCEDURE, the COBOL programmer must be wary of the problems that can be created. Changing the length of the record can shift the position of the sort key fields so they no longer match the original specification or the rest of the record to be sorted. Extending the length of the records can cause some records to exceed the record length specified.
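The two hazards just mentioned, a shifted key field and a record that outgrows its specified length, can be made concrete with a small sketch. It is written in Python for illustration and is not COBOL or Sort/Merge code; the key position, key length, maximum record length, and sample records are all hypothetical values chosen for this example.

```python
KEY_START = 1            # 1-based character position of the sort key (assumed layout)
KEY_LENGTH = 4           # key length in characters (assumed)
MAX_RECORD_LENGTH = 10   # maximum record length given to the sort (assumed)


def extract_key(record):
    """Return the characters found at the fixed key position of a record."""
    begin = KEY_START - 1
    return record[begin:begin + KEY_LENGTH]


def problems(record, expected_key):
    """List the ways a modified record no longer fits the sort specification."""
    found = []
    if len(record) > MAX_RECORD_LENGTH:
        found.append("record longer than specified maximum")
    if extract_key(record) != expected_key:
        found.append("key field no longer at specified position")
    return found


original = "0042SMITH"
prefixed = "NY" + original   # a procedure inserted a field in front of the key
extended = original + "JR"   # a procedure appended a field after the data
```

Here `problems(original, "0042")` reports nothing, `problems(extended, "0042")` reports only the length violation, and `problems(prefixed, "0042")` reports both, because the characters at the key position are now `NY00`.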
Deleting records without being aware of a future need for them can prove catastrophic. Use of the COBOL SORT is described in the COBOL reference manual and the COBOL 5 user's guide.

FORTRAN CALLS AND SORT/MERGE

FORTRAN calls allow the use and specification of owncode routines written in FORTRAN. The FORTRAN calls are almost identical with the Sort/Merge directives. They are described fully in the Sort/Merge reference manual.

A number of the arguments concerning the advantages of the COBOL SORT verb versus the operating system sort hold true for the FORTRAN calls to Sort/Merge as well. It can be advantageous to sort without leaving the FORTRAN program; however, this convenience is not without its cost in field length that otherwise might be used by Sort/Merge for more efficient operation. One method of avoiding a field length conflict between a large FORTRAN program and large sort files is to use separate job steps with control statements. Another method is to use overlays and ensure that the sort occurs in a short overlay.

CHECKPOINT/RESTART

The use of checkpoint/restart is recommended in many applications to allow you to recover some of the work already done in case of a power failure or machine malfunction. In the case of Sort/Merge there are other options that you should consider.

Checkpointing a program requires that you specify a certain point or a certain number of records after which the operating system will take a dump of the work done to that point. In case of job failure, you can return to that point in processing rather than start anew. A checkpoint dump requires that system resources such as disk space or tape units be assigned for the dump. Such resource allocation often reduces the resources available to the Sort/Merge program.

As noted previously, reducing the size of the files you wish to sort, sorting them, and then merging them can be faster than sorting one large file.
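The divide, sort, and merge procedure referred to above can be sketched in a few lines. The example is in Python and only illustrates the logic; it says nothing about timings on a CYBER system, and the function name and the `piece_size` parameter are assumptions made for this illustration. Each piece plays the role of a smaller file sorted in its own run, and the final step corresponds to a merge of the sorted files.

```python
import heapq


def sort_in_pieces(records, piece_size, key=None):
    """Sort records by splitting them into pieces, sorting each, and merging.

    piece_size must be a positive integer.
    """
    pieces = [
        sorted(records[start:start + piece_size], key=key)
        for start in range(0, len(records), piece_size)
    ]
    # heapq.merge combines already-sorted inputs without re-sorting them.
    return list(heapq.merge(*pieces, key=key))
```

A practical consequence of this arrangement is that a failure during one small sort costs only that piece; the pieces already sorted remain available for the merge.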
If you compare this procedure with the checkpoint/restart procedure, you will find that time is saved in sorting, and that the same insurance against machine or system malfunction is afforded. Shorter files sort faster on a relative basis than do long files, and reducing the system overhead associated with checkpointing a file adds to the speed achieved by sorting smaller files. Refer to the Sort/Merge reference manual for details of checkpoint/restart usage.

TAPE SORTING

The tape variant of Sort/Merge provides two forms of processing, balanced and polyphase. A tape sort is not the same as a disk sort with tape input and output. A tape sort does not require disk units.

Sort/Merge is most efficient when the disk oriented version is used. Use of tape sorts is discouraged when disk facilities are available because tape is less reliable and is usually less efficient. In cases where the size of records and the size of files seems to require specification of a tape sort, you might consider dividing the large files into smaller files to be presorted and merged later rather than use the tape sort. Other reasons for not specifying a tape sort include the need for many tape units to contain all of the scratch tapes required, and the added possibility for operator error, in addition to the almost certain occurrence of tape parity errors in a multireel environment. Also, a tape sort cannot be checkpointed. Tape sorts are explained in the Sort/Merge reference manual appendix.

Note that backup tape files are usually maintained in step sequence, so that if any one operation is unsuccessful, you can back up to the previous step to recover. It is not uncommon to see up to four levels of such backup files being retained as insurance against loss of information on tape.

The polyphase sort is usually more efficient than the balanced sort.
In a situation where you have the opportunity to try both types of tape sort on the same or very similar files, the best way to determine the better sort is to try both and compare. The number of tape drives available best determines which tape sort is better; the balanced sort becomes preferable when you can commit eight or more tape drives to the sort.

TAG SORT

When records to be sorted are extremely large, it is sometimes quite time-consuming or impossible to sort them because few will fit into the memory available. It is often easier and better to sort such large records by creating a key which identifies the record and includes sufficient information to link the key to the record, and then sort only the keys. When they are in order, the records can then be retrieved in sorted order from their storage locations on disk in the order of the sorted keys. This is called a tag sort. Depending on the application, it is often advantageous to never order the records themselves on disk, but rather to order only the output.

Keys can be created in a number of ways depending on your needs. One common method of creating keys is to use an algorithm which extracts the sort key from the record along with disk address values which identify the location of the record and appends this value to the key field to be sorted. When the keys are all sorted, the records can be retrieved in sorted order from the disk containing them. This type of sort technique is usually only undertaken by systems programmers.

SUMMARY

The following points are offered for your consideration before starting any sort or merge operation:

• If you can avoid sorting information, you will probably save time.

• Sorting small files and merging sorted files is faster than sorting one large file.

• If possible, sort only the information you need, not the entire record.

• Always sort files before merging them with larger files.

• It is always better to make smaller sort runs than to checkpoint larger sort runs.

• Sorting half the records usually requires less than half the time needed to sort all the records.

• A shorter sort is usually more efficient. As the size of a sort grows, the number of records sorted per second decreases.

• Be very careful when using owncode to change record size or sort key field position before sorting the file.

• Always keep backup files. You can always go back if you keep previous information. Some installations keep up to four levels of backup files.

CHARACTER SETS

CONTROL DATA operating systems offer the following variations of a basic character set:

    CDC 64-character set
    CDC 63-character set
    ASCII 64-character set
    ASCII 63-character set

The set in use at a particular installation was specified when the operating system was installed. You cannot change it.

Depending on another installation option, the operating system assumes an input card deck has been punched either in 026 or in 029 mode (regardless of the character set in use). Under NOS, the alternate mode can be specified by a 26 or 29 punched in columns 79 and 80 of any 6/7/9 card. In addition, 026 mode can be specified by a card with 5/7/9 multipunched in column 1, and 029 mode can be specified by a card with 5/7/9 multipunched in column 1 and a 9 punched in column 2. Under NOS/BE, the alternate mode can be specified by a 26 or 29 punched in columns 79 and 80 of the job statement or any 7/8/9 card. The specified mode remains in effect through the end of the job unless it is reset.

Graphic character representation appearing at a terminal or printer depends on the installation character set and the terminal type. Characters shown in the CDC Graphic column of the standard character set table are applicable to BCD terminals; ASCII graphic characters are applicable to ASCII-CRT and ASCII-TTY terminals.

STANDARD COLLATING SEQUENCES

If the installation character set is the CDC character set, the collating sequence default is COBOL6. If the installation character set is ASCII, the collating sequence default is ASCII6.

COLLATION OF ARBITRARY CHARACTERS

Several graphics are not common for all codes. Where these differences in graphics occur, arbitrary assignment of collation positions and of translations between codes must be made. For example, display code data that is collated in the ASCII6 collating sequence requires assignment of specific graphics. One of these graphics is the identity character ≡ (60) in display code that is interpreted as the number character (#) in ASCII6 collating sequence in table A-2.

TABLE A-1. STANDARD CHARACTER SETS

                 ------------- CDC -------------    ------------- ASCII -------------
    Display                 Hollerith    External    Graphic             Punch            Code
    Code        Graphic     Punch        BCD         Subset              (029)            (octal)
    (octal)                 (026)        Code

    00†         : (colon)†† 8-2          00          : (colon)††         8-2              072
    01          A           12-1         61          A                   12-1             101
    02          B           12-2         62          B                   12-2             102
    03          C           12-3         63          C                   12-3             103
    04          D           12-4         64          D                   12-4             104
    05          E           12-5         65          E                   12-5             105
    06          F           12-6         66          F                   12-6             106
    07          G           12-7         67          G                   12-7             107
    10          H           12-8         70          H                   12-8             110
    11          I           12-9         71          I                   12-9             111
    12          J           11-1         41          J                   11-1             112
    13          K           11-2         42          K                   11-2             113
    14          L           11-3         43          L                   11-3             114
    15          M           11-4         44          M                   11-4             115
    16          N           11-5         45          N                   11-5             116
    17          O           11-6         46          O                   11-6             117
    20          P           11-7         47          P                   11-7             120
    21          Q           11-8         50          Q                   11-8             121
    22          R           11-9         51          R                   11-9             122
    23          S           0-2          22          S                   0-2              123
    24          T           0-3          23          T                   0-3              124
    25          U           0-4          24          U                   0-4              125
    26          V           0-5          25          V                   0-5              126
    27          W           0-6          26          W                   0-6              127
    30          X           0-7          27          X                   0-7              130
    31          Y           0-8          30          Y                   0-8              131
    32          Z           0-9          31          Z                   0-9              132
    33          0           0            12          0                   0                060
    34          1           1            01          1                   1                061
    35          2           2            02          2                   2                062
    36          3           3            03          3                   3                063
    37          4           4            04          4                   4                064
    40          5           5            05          5                   5                065
    41          6           6            06          6                   6                066
    42          7           7            07          7                   7                067
    43          8           8            10          8                   8                070
    44          9           9            11          9                   9                071
    45          +           12           60          +                   12-8-6           053
    46          -           11           40          -                   11               055
    47          *           11-8-4       54          *                   11-8-4           052
    50          /           0-1          21          /                   0-1              057
    51          (           0-8-4        34          (                   12-8-5           050
    52          )           12-8-4       74          )                   11-8-5           051
    53          $           11-8-3       53          $                   11-8-3           044
    54          =           8-3          13          =                   8-6              075
    55          blank       no punch     20          blank               no punch         040
    56          , (comma)   0-8-3        33          , (comma)           0-8-3            054
    57          . (period)  12-8-3       73          . (period)          12-8-3           056
    60          ≡           0-8-6        36          #                   8-3              043
    61          [           8-7          17          [                   12-8-2           133
    62          ]           0-8-2        32          ]                   11-8-2           135
    63          %††         8-6          16          %††                 0-8-4            045
    64          ≠           8-4          14          " (quote)           8-7              042
    65          →           0-8-5        35          _ (underline)       0-8-5            137
    66          ∨           11-0 or      52          !                   12-8-7 or        041
                            11-8-2†††                                    11-0†††
    67          ∧           0-8-7        37          &                   12               046
    70          ↑           11-8-5       55          ' (apostrophe)      8-5              047
    71          ↓           11-8-6       56          ?                   0-8-7            077
    72          <           12-0 or      72          <                   12-8-4 or        074
                            12-8-2†††                                    12-0†††
    73          >           11-8-7       57          >                   0-8-6            076
    74          ≤           8-5          15          @                   8-4              100
    75          ≥           12-8-5       75          \                   0-8-2            134
    76          ¬           12-8-6       76          ^ (circumflex)      11-8-7           136
    77          ; (semicolon) 12-8-7     77          ; (semicolon)       11-8-6           073

    †Twelve zero bits at the end of a 60-bit word in a zero byte record are an end of record mark rather than two colons.
    ††In installations using a 63-graphic set, display code 00 has no associated graphic or card code; display code 63 is the colon (8-2 punch). The % graphic and related card codes do not exist and translations yield a blank (55 octal).
    †††The alternate Hollerith (026) and ASCII (029) punches are accepted for input only.

TABLE A-2. 6-BIT CHARACTER CODE COLLATING SEQUENCES

    Collating      COBOL6†                  DISPLAY†                 ASCII6††
    Sequence       Graphic    Display       Graphic    Display       Graphic    Code
    (octal)                   Code                     Code

    00             blank      55            :          00            blank      00
    01             ≤          74            A          01            !          01
    02             %          63            B          02            "          02
    03             [          61            C          03            #          03
    04             →          65            D          04            $          04
    05             ≡          60            E          05            %          05
    06             ∧          67            F          06            &          06
    07             ↑          70            G          07            '          07
    10             ↓          71            H          10            (          10
    11             >          73            I          11            )          11
    12             ≥          75            J          12            *          12
    13             ¬          76            K          13            +          13
    14             .          57            L          14            ,          14
    15             )          52            M          15            -          15
    16             ;          77            N          16            .          16
    17             +          45            O          17            /          17
    20             $          53            P          20            0          20
    21             *          47            Q          21            1          21
    22             -          46            R          22            2          22
    23             /          50            S          23            3          23
    24             ,          56            T          24            4          24
    25             (          51            U          25            5          25
    26             =          54            V          26            6          26
    27             ≠          64            W          27            7          27
    30             <          72            X          30            8          30
    31             A          01            Y          31            9          31
    32             B          02            Z          32            :          32
    33             C          03            0          33            ;          33
    34             D          04            1          34            <          34
    35             E          05            2          35            =          35
    36             F          06            3          36            >          36
    37             G          07            4          37            ?          37
    40             H          10            5          40            @          40
    41             I          11            6          41            A          41
    42             ∨          66            7          42            B          42
    43             J          12            8          43            C          43
    44             K          13            9          44            D          44
    45             L          14            +          45            E          45
    46             M          15            -          46            F          46
    47             N          16            *          47            G          47
    50             O          17            /          50            H          50
    51             P          20            (          51            I          51
    52             Q          21            )          52            J          52
    53             R          22            $          53            K          53
    54             ]          62            =          54            L          54
    55             S          23            blank      55            M          55
    56             T          24            ,          56            N          56
    57             U          25            .          57            O          57
    60             V          26            ≡          60            P          60
    61             W          27            [          61            Q          61
    62             X          30            ]          62            R          62
    63             Y          31            %          63            S          63
    64             Z          32            ≠          64            T          64
    65             :          00            →          65            U          65
    66             0          33            ∨          66            V          66
    67             1          34            ∧          67            W          67
    70             2          35            ↑          70            X          70
    71             3          36            ↓          71            Y          71
    72             4          37            <          72            Z          72
    73             5          40            >          73            [          73
    74             6          41            ≤          74            \          74
    75             7          42            ≥          75            ]          75
    76             8          43            ¬          76            ^          76
    77             9          44            ;          77            _          77

    [The INTBCD graphics column of this table is not legible in the source copy and is omitted here.]

    †Under the CDC 63-character set, there is no percent graphic; the colon is display code 63. Display code 00 is not used.
    ††Under the ASCII 63-character set, there is no percent graphic; the colon collates in position 05, not position 32.

GLOSSARY

Advanced Access Methods (AAM)
    A file manager that processes indexed sequential, direct access, and actual key file organizations, and supports the Multiple-Index Processor. (See CYBER Record Manager.)

Balanced Tape Sort
    Sort that always keeps its intermediate tapes divided into the same two groups. Sorted strings are merged from one group to another as long as possible, then the direction is reversed.

Basic Access Methods (BAM)
    A file manager that processes sequential and word addressable file organizations. (See CYBER Record Manager.)

Buffer
    An intermediate storage area used to compensate for a difference in rates of data flow, or time of event occurrence, when transmitting data between central memory and an external device during input/output operations.

Collating Sequence
    Sequence that determines precedence given to character data for sorting, merging, and comparing.

CYBER Record Manager
    A generic term relating to the common products AAM and BAM that run under the NOS and NOS/BE operating systems and that allow a variety of record types, blocking types, and file organizations to be created and accessed. The execution time input/output of COBOL 4, COBOL 5, FORTRAN Extended 4, Sort/Merge 4, ALGOL 4, and the DMS-170 products is implemented through CYBER Record Manager. Neither the input/output of the NOS and NOS/BE operating systems themselves nor any of the system utilities such as COPY or SKIPF is implemented through CYBER Record Manager. All CYBER Record Manager file processing requests ultimately pass through the operating system input/output routines.

Direct Access File
    In the context of CYBER Record Manager, a direct access file is one of the five file organizations. It is characterized by the system hashing of the unique key within each file record to distribute records randomly in blocks called home blocks of the file. In the context of NOS permanent files, a direct access file is a file that is accessed and modified directly, as contrasted with an indirect access permanent file.

Directives
    Instructions that supplement processing defined by the SORTMRG control statement for execution of Sort/Merge record processing.

File
    A logically related set of information; the largest collection of information that can be addressed by a file name. Starts at beginning-of-information and ends at end-of-information.

FILE Control Statement
    A CYBER Record Manager control statement that contains parameters used to build the file information table for processing. Must be provided for every input or output file to be processed by a directive sort or merge. Not to be confused with the Sort/Merge FILE directive.

File Information Table (FIT)
    A table through which a user program communicates with CYBER Record Manager. All file processing executes on the basis of fields in the table. Some fields can be set by the Sort/Merge user in the FILE control statement.

Key Comparison
    Internal technique of comparing sort keys that usually requires less elapsed time and more central processing time than key extraction.

Key Extraction
    Internal technique of comparing sort keys that usually requires less central processing time and more elapsed time than key comparison.

Macro
    Sequence of source statements that are saved and then assembled when needed through a macro call. Used when Sort/Merge functions as a COMPASS subroutine for a COMPASS program or as a relocatable program generated for the COBOL SORT verb.

Merge Order
    Internal parameter governing the number of buffers used by Sort/Merge Version 4 in the intermediate merge phase.

Owncode Routine
    Closed COMPASS subroutine written by the user that provides the capability to insert, substitute, modify, or delete input and output records during Sort/Merge processing.

Polyphase Tape Sort
    Sort with only one intermediate output tape for each merge phase; however, the output tape is changed for each merge phase. A polyphase tape sort usually can sort more records than a balanced tape sort in the same amount of time and with the same number of intermediate tapes.

Random File
    In the context of CYBER Record Manager, a file with word addressable, indexed sequential, direct access, or actual key organization in which individual records can be accessed by the values of their keys.

Record
    CYBER Record Manager defines a record as a group of related characters. A record or a portion thereof is the smallest collection of information passed between CYBER Record Manager and a user program. Eight different record types exist, as defined by the RT field of the file information table.

Signed Numeric Data
    Integer data stored internally in display code. Sorts according to the magnitude and the sign of the integer the display code represents.

Sort Key
    Field of information within each record in a sort or merge input file used to determine the order in which records are written to the output file.

Sort Order
    Order for sorting keys, either ascending or descending.

Tape Sort
    Sort that has its intermediate scratch files residing on tape rather than disk. Original input file and/or final output file can reside on disk or tape.

RUNNING SORT/MERGE UNDER THE NOS/BE OPERATING SYSTEM

This appendix illustrates the basic differences between the NOS and the NOS/BE operating systems with respect to Sort/Merge and includes the NOS/BE control statements you will need to run the job examples given in section 4 if your installation is using the NOS/BE operating system. As noted previously, the Sort/Merge directives need not be changed because of a change of operating system.

Certain limitations apply if you are using Sort/Merge Version 1, which is required if your computer is a CYBER 170 Model 176, a CYBER 70 Model 76, or a 7600. Refer to the Sort/Merge reference manual for these limitations. Users of the remaining CYBER 170 and CYBER 70 models and the 6000 Series computers will use Sort/Merge Version 4, described in the same publication.

User programs can call Sort/Merge with COMPASS assembly language macros, the FORTRAN Extended interface routine calls, or through the COBOL language. These uses are described in detail in the Sort/Merge reference manual and in the respective language reference manuals.

CONTROL STATEMENT FORMATS

The major differences between the control statements required for the NOS and NOS/BE operating systems from the Sort/Merge user's perspective are in the areas of the job control statement including accounting information and in the use of permanent files. Other important differences are noted only where they apply to this user's guide.
You should refer to the user's guide and reference manuals applicable to your operating system for all details.

PERMANENT FILES

Users of NOS permanent files will encounter both direct and indirect access permanent files. Their use is quite different. For purposes of these examples, only indirect access permanent files are used. Large files might be better served by using direct access permanent files. The choice of type of NOS permanent file is described in the NOS Time-Sharing user's guide. There is no NOS/BE counterpart for direct and indirect access files.

If during a NOS job, you wish to save a file for future use, such as for input to another job, a simple SAVE,filename is the minimum control statement you can specify to create an indirect access permanent file. This file will be saved for the number of days your installation allows. When you wish to use this file again in a subsequent job, all you need enter to access the file is the control statement GET,filename.

NOS/BE permanent files can be used in a similar manner. NOS/BE requires that you enter CATALOG,filename,id to make an existing local file permanent and ATTACH,filename,id to access an existing permanent file. Before you can create the permanent file, however, you must first allocate file space for it through use of the REQUEST,*PF control statement. NOS/BE also allows you to keep up to 5 cycles of one permanent file under one file name; there is no NOS counterpart for this concept.

The length of time that a NOS/BE permanent file is kept depends on the time you specify, or on the operating system default. At the installation where these jobs were run, the default retention period is set at 5 days. By running a job using the permanent file more often than every 5 days, the 5-day period is renewed. When the file is no longer needed, it is automatically purged from the system 5 days later.
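The renewal rule described above is simple date arithmetic: each use of the file restarts the retention period. The Python sketch below is only an illustration and is not part of NOS/BE or Sort/Merge. The function name purge_date is hypothetical, the starting date is chosen only for the example, and reading RP=10 (figure C-3) as a retention period of 10 days is an assumption; the actual purge is carried out by the operating system under rules set by your installation.

```python
from datetime import date, timedelta


def purge_date(last_use: date, retention_days: int = 5) -> date:
    """Return the date on which an unused permanent file would be purged.

    Assumes the retention period simply restarts at the last use of the file.
    """
    return last_use + timedelta(days=retention_days)


# A file last used on 29 March 1979 with the 5-day default retention period.
print(purge_date(date(1979, 3, 29)))        # 1979-04-03

# The same file with a 10-day retention period (assumed meaning of RP=10).
print(purge_date(date(1979, 3, 29), 10))    # 1979-04-08
```

Using the file again before the computed date moves the purge date forward by a full retention period from the new date of use.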
If you specify a longer period, you should purge the permanent file when you no longer need it.

ACCOUNTING INFORMATION

As you have noted in the practice examples, NOS usually requires a USER and a CHARGE control statement following the job control statement. These are used for identification and accounting purposes. If these control statements are not given, or are incorrect, the run will terminate with a message indicating the error. Procedures vary from installation to installation depending on the accounting methods in use. Interactive users at some installations are limited in the number of attempts they are allowed when signing onto the system.

NOS/BE users are often required to include their accounting information on the job control statement following the terminator. Other installations require this information on a separate ACCOUNT control statement. Security procedures usually terminate any unauthorized job.

You might wish to note the accounting information required at your installation inside the front cover. You should be careful with this information because your account will be billed for all jobs run under this number.

JOB EXAMPLES

The following NOS/BE control statement examples will allow you to run the practice examples given. For more information on your operating system and the control statements available, you should consult the NOS/BE user's guide, the NOS Batch user's guide, or the NOS Time-Sharing user's guide.

The NOS control statements given in figure 4-4 can be replaced with the NOS/BE control statements shown in figure C-1. Each figure caption includes the figure number of the related NOS example.

Under NOS/BE, the T parameter on the job statement specifies a time limit for the job in octal seconds. The time limit also influences the priority given the job in the input queue. Too high a limit can reduce the job priority. Too low a limit can stop the job before it completes.
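Because the T parameter is read as an octal number, its digits do not give the limit in ordinary decimal seconds. The Python sketch below is not part of NOS/BE or Sort/Merge; it only illustrates the conversion. The function name octal_time_limit is hypothetical, and treating the T10 on the job statement in figure C-3 as a T parameter with the value 10 is an assumption.

```python
def octal_time_limit(t_value: str) -> int:
    """Convert the digits of a T parameter (octal seconds) to decimal seconds."""
    # int(..., 8) raises ValueError for the digits 8 and 9,
    # which cannot appear in an octal number.
    return int(t_value, 8)


print(octal_time_limit("10"))     # 8
print(octal_time_limit("100"))    # 64
```

Under this reading, a job statement carrying T10 allows 8 seconds, not 10.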
Refer to the NOS/BE user's guide for details.

Figure C-2 is the NOS/BE counterpart of figure 4-5. The explanations associated with figure 4-5 also apply here.

Figure C-3 illustrates how you can create a NOS/BE permanent file of the sorted output file. When creating a permanent file under NOS/BE, you must specify an ID for the file, and you can specify a retention period if you wish, as shown with the CATALOG statement in figure C-3. To subsequently access this permanent file requires a statement such as ATTACH,NEW,ID=ME. Both the CATALOG and the ATTACH statements allow a large number of parameters for various purposes. These parameters are described in the NOS/BE user's guide and the NOS/BE reference manual.

    jobcard. user and accounting information
    FILE(NEW,BT=C,RT=Z,FL=80)
    SORTMRG.
    REWIND,NEW.
    COPYSBF,NEW,OUTPUT.
    7/8/9 multipunched in column 1
    sort directives
    7/8/9 multipunched in column 1
    input records
    6/7/8/9 multipunched in column 1

Figure C-1. NOS/BE Control Statements (Figure 4-4)

[Listing: Sort/Merge output dated 03/29/79 showing the sort directives, the input records sorted by name, and the job dayfile.]

Figure C-2. NOS/BE Sort Output by Name (figure 4-5)

    EXRC1,T10. accounting information
    REQUEST,NEW,*PF.
    FILE(NEW,BT=C,RT=Z,FL=80)
    SORTMRG.
    CATALOG,NEW,ID=ME,RP=10.
    REWIND,NEW.
    COPYSBF,NEW,OUTPUT.
    7/8/9 multipunched in column 1
    sort directives
    7/8/9 multipunched in column 1
    input records
    6/7/8/9 multipunched in column 1

Figure C-3. Creating a NOS/BE Permanent File (Figure 4-6)

INDEX

ACCOUNT C-1
ASCII code 2-3, A-1
ATTACH C-1
Blanks, importance 3-3
Blanks, leading 5-1, 4-3
CATALOG C-1
Character sets 2-3, A-1
CHARGE statement 4-5
Checkpoint 6-2, 4-2
COBOL SORT 6-1
Collating sequence 3-2, A-3
COMPARE 4-3
COPYSBF 4-4
CYBER Record Manager 1-3, 4-5
Data input 1-4
Data storage 1-4
Directives
    BYTESIZE 4-1
    END 4-1
    EQUATE 4-3
    FIELD 4-1
    FILE 4-3
    KEY 4-2
    MERGE 4-1
    OPTIONS 4-2
    SEQUENCE 4-2
    SORT 4-1
    TAPE 4-3
    OWNCODE 4-3
Display code 2-3, A-1
DUMP 4-2
Dumps, checkpoint 4-2, 6-1
EBCDIC 2-4
Examples 4-3, C-1
EXTRACT 4-3
FILE statement 4-1, 4-5
FORM 1-4, 4-8
FORTRAN calls 5-6, 6-2
GET 4-7
Glossary B-1
Hollerith, Herman 1-1
Input preparation 2-1
INTBCD 3-2, 4-7
Merge order 3-4
NODUMP 4-2
NOS/BE C-1
OWN 4-3
OWNCODE 5-1
Permanent files
    NOS 4-5
    NOS/BE C-1
Record design 2-1, 6-1
RETAIN 4-2, 5-7
REQUEST C-1
SAVE 4-5
Sign overpunch codes 3-2
Signed numeric data 3-2
SORT directive example 4-4
SORT keys 3-1
SORT order 3-3
SORTMRG statement 4-1
Tag sort 6-2
Tape sort 6-2
USER statement 4-5
Variable length records 2-2
VERIFY 4-2
VOLDUMP 4-2